
Stochastic Processes in Finance

A Physics Perspective

Marakani Srikant

Supervisors
Assoc. Prof. Belal E. Baaquie
Asst. Prof. Edward Teo

In partial fulfillment of the requirement for the degree of Master of Science

Department of Physics
National University of Singapore
2002/03
To,

my mother and friends.


Acknowledgements

I would like to thank my supervisors, Assoc. Prof. Belal Baaquie and Asst. Prof. Edward Teo
for the great academic help and moral support they have given me while doing this project.
I would also like to thank Dr. Parwani, Prof. Warachka and Prof. Coriano for many
interesting conversations and help in understanding some of the concepts of algorithmics and
programming. I would also like to thank Prof. Bouchaud and his firm Science and Finance for
kindly providing the Eurodollar futures data used in this project.

ABSTRACT
We investigate the use of theoretical and computational methods from physics in finance,
particularly in the areas of contingent claim valuation and in generalizing existing financial
models. We apply these methods to simplify the analysis of complicated options on stocks
and to develop and investigate models of the term structure of interest rates. We find that the
language and techniques of theoretical physics are very useful in dealing with some of the
problems in theoretical finance.

Contents

Acknowledgements ii

Overview 1

1 A Brief Introduction to Finance 2


1.1 Securities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Common Stocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Bonds and Fixed Income Securities . . . . . . . . . . . . . . . . . . . 3
1.1.3 Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 The Practical Uses of Options . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.1 Speculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.2 Hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 The Principle of No Arbitrage . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.5 The Historic Connection between Physics and Finance . . . . . . . . . . . . . 7

2 Stochastic Processes 9
2.1 Events, Probability Measures and Random Variables . . . . . . . . . . . . . . 9
2.2 Generating functions and cumulants . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Equivalent Measures and the Radon-Nikodým Derivative . . . . . . . . . . . . 11
2.4 Stochastic Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Martingales and semi-martingales . . . . . . . . . . . . . . . . . . . . . . . . 13
2.6 Markov Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.7 The Kramers-Moyal Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.8 The Fokker-Planck and Langevin Equations . . . . . . . . . . . . . . . . . . . 18

2.9 Extension to Multivariate Distributions . . . . . . . . . . . . . . . . . . . . . . 19
2.10 Stochastic Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.11 Killing terms and Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.12 Cumulants and the Central Limit Theorem . . . . . . . . . . . . . . . . . . . . 23
2.13 Itô Stochastic Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.13.1 The Wiener process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.13.2 Stochastic Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.13.3 Stochastic Differential Equations . . . . . . . . . . . . . . . . . . . . 27
2.14 Girsanov’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3 The Fundamental Theorem of Asset Pricing 31


3.1 Finite Sample Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.1 Self-financing trading strategies, no arbitrage and price systems . . . . 32
3.1.2 Equivalence of martingale measures and price systems . . . . . . . . . 33
3.1.3 No arbitrage and the non-emptiness of Π . . . . . . . . . . . . . . . . 34
3.1.4 Completeness and the uniqueness of martingale measures . . . . . . . 34
3.2 Continuous Sample Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3 Models of the Market in the Language of Physics . . . . . . . . . . . . . . . . 36

4 Applications to Stocks 40
4.1 The most general risk-neutral Gaussian Hamiltonian . . . . . . . . . . . . . . 40
4.2 The Black-Scholes equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.3 Single barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.4 Double barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.5 More Path Dependent Options . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.5.1 Soft barrier options . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.5.2 Asian options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.5.3 Seasoned options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

5 Interest Rate Models 53


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.2 History of interest rate models . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.3 The HJM model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5.3.1 Definition of the model . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.3.2 The fundamental theorem of asset pricing and the action . . . . . . . . 55
5.3.3 Lattice field theory formulation . . . . . . . . . . . . . . . . . . . . . 59
5.3.4 Hamiltonian for HJM . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.3.5 Dynamics of the bond prices and the martingale measure . . . . . . . . 64
5.3.6 Futures pricing in the HJM model . . . . . . . . . . . . . . . . . . . . 65
5.3.7 Option pricing in the HJM model . . . . . . . . . . . . . . . . . . . . 66
5.4 The field theory model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.5 The Santa-Clara and Sornette model . . . . . . . . . . . . . . . . . . . . . . . 70

6 Comparison of the Models with Market Data 72


6.1 The Market Data used for the Study . . . . . . . . . . . . . . . . . . . . . . . 72
6.2 Assumptions behind the tests of the models . . . . . . . . . . . . . . . . . . . 72
6.3 The Correlation Structure of the Forward Rates . . . . . . . . . . . . . . . . . 73
6.4 Analysis of the Field Theory Model with Constant Rigidity . . . . . . . . . . . 75
6.5 Field Theory Model with Rigidity µ(θ) . . . . . . . . . . . . . . . . . . . . . 78
6.6 Field Theory Model with f (t, z(θ)) . . . . . . . . . . . . . . . . . . . . . . . . 80
6.7 Phenomenology of the Forward Rate Curve . . . . . . . . . . . . . . . . . . . 82

7 Hedging in Field Theory Models of the Term Structure 85


7.1 Hedging in General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.2 Instantaneous Hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.2.1 Hedging bonds with other bonds . . . . . . . . . . . . . . . . . . . . . 86
7.2.2 Semi-empirical results : Constant Rigidity model . . . . . . . . . . . . 90
7.2.3 Hedging with futures . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
7.2.4 Semi-empirical results for hedging with futures . . . . . . . . . . . . . 94
7.3 Finite time hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
7.4 Empirical Results for Finite Time Hedging . . . . . . . . . . . . . . . . . . . . 98

8 Non-Linear Field Theory Models 102


8.1 A general Hamiltonian for the bond prices . . . . . . . . . . . . . . . . . . . . 102
8.2 General Gaussian model for the bonds . . . . . . . . . . . . . . . . . . . . . . 103
8.3 Volatility dependent on forward rates . . . . . . . . . . . . . . . . . . . . . . . 104

8.3.1 Definition of the model . . . . . . . . . . . . . . . . . . . . . . . . . . 104
8.4 Stochastic volatility models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
8.4.1 The first model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
8.4.2 The second model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
8.5 Monte Carlo Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
8.5.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
8.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

A The Generic Program for Fitting the Parameters 120

B The Simulation Program for Volatility as a Function of Forward Rates 124

C The Simulation Program for Volatility as an Independent Field 129

Summary

In the first chapter, we will present a very brief introduction to basic finance for those physicists who are unfamiliar with the field. This introduction will be very elementary but should be sufficient for the purposes of this thesis. The interested reader should read any of the introductory
books in this field.
A very brief recollection of the theory of stochastic processes will be presented in the second
chapter. Most of the facts in this chapter should be familiar to physicists but the notation used
might be somewhat new.
We will provide a proof of the fundamental theorem of asset pricing in the case of finite
sample spaces in the third chapter and heuristically indicate how the proof is carried out in the
continuous case. This theorem forms the bedrock of theoretical finance and we need it before
we can apply any ideas from physics to the problem of contingent claim valuation. We will
also discuss how this theorem affects the market models that we can construct and how we can
use some of the language of physics for this purpose.
We will study the use of path integration in the valuation of contingent claims on stocks in
the fourth chapter. We will show that deriving the price of barrier and other path dependent
options is significantly simpler if techniques from quantum mechanics are used.
In the fifth chapter, we will introduce the famous Heath-Jarrow-Morton (HJM) model for
the valuation of contingent claims based on interest rates and their recent generalizations and
discuss how these generalizations are similar to the theory of free quantum fields (in other
words, quadratic theories). We will also introduce the common idea in finance of the change of
numeraire and look at its meaning in terms of the action.
In the sixth chapter, we analyze Eurodollar futures data to give us more information about
the market behaviour of interest rates and propose further generalization of the field theory
models using the insights developed from this analysis. We also present some interesting phe-
nomenological insights gained from an analysis of the data.
In the seventh chapter, we discuss hedging contingent claims in field theoretic models. We
will discuss both instantaneous and finite time hedging. The former is much more important
for theoretical purposes while the latter is probably more important for practical purposes. We
see that finite time hedging calculations are simplified using path integral techniques.
In the eighth chapter, we will look at nonlinear field models for forward rates and see how
we can derive an equivalent martingale measure for them in general. We will also see how
we can efficiently perform Monte Carlo simulations for these theories by showing that they
are equivalent to certain Langevin equations. We then present a few results of the simulations
which show the effect of the nonlinearity on the behaviour of bond prices.

CHAPTER 1

A BRIEF INTRODUCTION TO FINANCE

This chapter presents a brief introduction to the subject of finance. No attention is paid to
rigour and several simplifications have been made to aid exposition. The material is extremely
brief and elementary and largely follows Srikant[1]. However, it should be sufficient for a
person more interested in the mathematical aspects of this thesis. More detailed descriptions of
financial instruments can be found in Jacob and Pettit[2] and Hull[3].

1.1 Securities

All the financial instruments we see around us such as shares, bonds, futures, options, etc. are
considered securities. More precisely, a securities contract (security, for short) is a contract
issued by a government or company in order to acquire financial capital which is used to ac-
quire real capital, labour and management talent. Examples of securities are common stocks
which are issued by companies to acquire capital (which make the buyer a partial owner of
the company), bonds and fixed income securities which are issued by governments and compa-
nies which need to borrow money for both short and long-term use, debt/equity combinations
(which are basically combinations of stocks and bonds) and third-party financial contracts (or
derivatives, which are the most important from our point of view).

1.1.1 Common Stocks

The precise nature of the common stock contract is normally defined in accordance with the
laws of the country. However, almost all common stock contracts have the following features :

1. In case of liquidation, they give the owner the right to receive his share of the remaining
value of the company after all the outstanding loans, fines etc. have been cleared.
2. They give the owner the right to a share of the dividends (in proportion to the number of
shares owned by him) paid out by the firm.
3. The owner has the right to sell the stock to another investor in the secondary markets
(more commonly known as stock exchanges).

While most common stock contracts have the above features, there are several varieties of them
which are distinguished by other features of the contracts. For example they may be voting or


non-voting (i.e., the owners may or may not have the right to vote about major decisions that
affect the company), they may have restrictions on the proportion of the stock that can be held
by one investor and so on and so forth.

1.1.2 Bonds and Fixed Income Securities

Bonds and fixed income security contracts are issued by governments and companies essen-
tially for the purpose of borrowing money. They ordinarily specify the financial obligations
the issuer has to the owner. They generally require the issuer to pay interest and principal in
specified amounts at specified future dates. The number of such dates (excluding the maturity)
is called the number of coupons of the bond, the terminology arising from the fact that the
bond-holder has to tear off coupons from the contract to claim the payments before the final
date.
Debt/equity combinations are financial contracts which are combinations of debt and equity
securities (in other words, a combination of stocks and bonds). One common type of such a security is the warrant issued with some bonds. These warrants are securities entitling the bond-holder to purchase a specified number of shares of the firm at a fixed price. They are similar in nature to the options we will consider later. For a good introduction to bonds, see
Fabozzi[4].

1.1.3 Derivatives

A derivative is an instrument whose value is dependent on other securities (called the under-
lying securities). The derivative value is therefore a function of the value of the underlying
securities. Two popular traded derivatives are futures and options. A futures contract is an
agreement between two parties to buy or sell an asset (the underlying security) at a certain time
in the future for a certain price. They are either traded on exchanges or over the counter. The
over the counter (OTC) derivatives market is significantly larger than the exchange derivatives
market.
The first historical use of derivatives was probably by the mathematician and philosopher
Thales of Miletus in ancient Greece when he bought options on the use of olive presses after
predicting a bumper harvest of olives. He made a windfall after the prediction turned out to
be correct. The ancient Romans and Phoenicians used options on shipping. Options were also
extensively used during the tulip mania in Holland in the 1630s. The default of several
of the contract holders provided an early example of liquidity risk that these contracts could
create.1 A small options market has always existed in the industrialized world but the size of
the market was limited by the lack of a set of standards and a guarantor of the contract. These
obstacles were removed with the setting up of the Chicago Board of Trade (CBOT) and the
Chicago Board Options Exchange (CBOE). The seminal contribution of Black and Scholes
[5] to the theory of option pricing enabled accurate pricing of options to be easily done and
tremendously increased their popularity. Currently, nearly 30 million options contracts are
traded every month on the CBOT alone and the over the counter derivatives market is worth
1 The rise and collapse of the hedge fund Long Term Capital Management in 1998 which seriously endangered
the world financial system provides a more recent example.

over 100 trillion dollars (or nearly three times the world GDP). American banks alone hold
derivative contracts having a notional value of 55 trillion dollars.
Given the immense amount of derivatives that are traded and held by institutions around the
world, the importance of the derivatives (contingent claims) can hardly be overstated. When
used prudently, they can benefit the users tremendously by tailoring investments to their par-
ticular risk profiles. When used without proper understanding, they can lead to disasters such
as the bankruptcy of Orange County or Barings Bank. Further, their effect need not be limited
to the companies and financial institutions that use them as could be seen from the Long Term
Capital Management fiasco when the entire international payments system nearly collapsed.
Options are one of the most important forms of derivatives and it can be easily shown that
if one values all options, one can value any derivative whatsoever (the precise meaning of this
will be discussed in the next chapter). Hence we will look at them in slightly more detail.

1.2 Options

[Plot: Profit on a Call Option — profit (vertical axis) against the price of the underlying at maturity (horizontal axis).]

Figure 1.1: The profit of a call option at maturity as a function of the stock price at maturity.
The strike price is taken to be $100 and the original price of the option is assumed to be $10.

There are two basic types of options that are traded in the market. A call option gives the
holder the right to buy the underlying asset by a certain date for a certain price. A put option
gives the holder the right to sell the underlying asset by a certain date for a certain price. This
price is called the strike price and the date is called the exercise date or maturity of the contract.
There is a further classification of options according to when they can be exercised. A
European option can only be exercised at maturity while an American option can be exercised
at any time up to the maturity. The value of an Asian option is dependent on the average value
of the underlying security during the term of the contract while a Bermudan option2 can only
2 The name is derived from the fact that Bermuda is between America and Europe!

[Plot: Profit on a Put Option — profit (vertical axis) against the price of the underlying at maturity (horizontal axis).]

Figure 1.2: The profit of a put option at maturity as a function of the stock price at maturity.
The strike price is taken to be $100 and the original price of the option is assumed to be $10.

be exercised on certain days between the present time and the maturity of the contract.
From the definition of a call option, we can see that the value of a European call option at
maturity is given by
C = (S − K)θ(S − K) (1.1)
(if S < K then the option will not be exercised and if S > K, the payoff on the option will be
S − K) where C is the value of the call option at maturity, S is the value of the underlying
security at maturity and K is the strike price of the option. (θ represents the Heaviside function
defined by θ(x) = 0 if x ≤ 0, and θ(x) = 1 if x > 0).3
We see that the payoff of a call option at maturity is either positive or zero. Hence, the call
option must have a positive value before maturity, which of course is the price of the option. If
we also consider the original price of the option, the profit is then given by
Y = (S − K)θ(S − K) − C0 e^{rt}    (1.2)
where Y is the profit, C0 is the initial price of the call option and r is the risk-free interest rate (the initial premium is compounded to maturity). For the rest of this section,
we will assume that the present value of the underlying security is $100. This does not change
anything as all the prices can be rescaled by a constant factor without affecting the theory. The
profit of the call option (assuming a strike price of $100 and an initial option price of $10) as a
function of the value of the underlying security at maturity is shown in figure 1.1.
Similarly, the payoff of a put option at maturity is given by

Y = (K − S)θ(K − S)    (1.3)

(if K < S, the option will not be exercised and if K > S, the payoff is K − S); the profit is obtained by subtracting P0 e^{rt}, where P0 is the initial price of the put option. The profit (with the same assumed values as for the call option) as a function of the value of the underlying security is shown in figure 1.2.
3 Most physics books will define θ(0) = 0.5. However, we use θ(0) = 0 as we follow the Itô convention. Hence, the derivative of θ represents a right-sided δ function, usually denoted δ_R, rather than the centred δ function.

1.3 The Practical Uses of Options

1.3.1 Speculation

Options give a person using them very high leverage for speculation. For example, if one is
convinced that the price of Microsoft stock is going to increase over the next three months, one
can either buy the stock or call options on the stock. If we assume that the stock’s present price
is $100, that the price of a call option maturing in three months for a strike price of $100 is
$10 (we assume that one call option contract gives us the right to buy 1 share) and that we have
$10,000 to invest, we can either buy 100 shares or 1,000 call options on the stock. If the price
rises to $120 in three months, we would make a profit of $2,000 if we had bought the stock or
a profit of $10,000 if we had bought the options. Of course, if the price drops, the loss on the
options will be greater than the loss on the stock.
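The leverage in this example can be checked with a two-line calculation (an illustrative sketch using the numbers quoted above):

budget, S0, S_T, premium, K = 10_000.0, 100.0, 120.0, 10.0, 100.0

stock_profit = (budget / S0) * (S_T - S0)                        # buy 100 shares
option_profit = (budget / premium) * max(S_T - K, 0.0) - budget  # buy 1000 calls
print(stock_profit, option_profit)                               # 2000.0 10000.0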

1.3.2 Hedging

While speculators want to increase their exposure in the market, hedgers are trying to do exactly
the reverse. For example, if a fund manager wants to insure his holding of Microsoft stock, he
can sell call options on it. If the price decreases below the strike, he will make a profit equal
to the price of the options when he sold them while if the price increases, he incurs a loss
as the options are called. This compensates for the decrease and increase respectively in the
value of the Microsoft stock he holds. The hedger’s aim is to prevent price changes in the
price of the underlying security from affecting the value of his portfolio. Hedging can also be
performed (and is, in fact, usually done) using futures. In fact, the demand for futures contracts
on exchanges is largely determined by hedging considerations. Futures for single stocks (as
opposed to indices) were only recently introduced as the demand for them was low. In the case
of interest rate dependent contracts, however, the demand for hedging has always been high
and interest rate futures have a relatively long history.4

1.4 The Principle of No Arbitrage

The principle of no arbitrage effectively states that there is no such thing as a free lunch in the
financial markets. It is one of the most important and central principles of finance. The logic
behind the existence of this principle is that if a free lunch exists it will be used by everyone so
that it ceases to be free or that the lunch is exhausted.
More concretely, the principle of no arbitrage states that there exists no trading strategy
which guarantees a riskless profit with no initial investment. This statement is equivalent to
the statement that one cannot get a riskless return above the risk free interest rate in the market
provided that there are no transaction costs (in the presence of transaction costs, one can only say
that one cannot get a riskless return more than the risk free interest rate plus the transaction
costs). The main assumption behind this principle is that people prefer more money to less
money.5 This is, of course, a very mild condition but we will see that it leads to very powerful
4I thank Prof. Warachka for pointing this out to me.
5 This condition might not be as evidently true as it seems (for example, see Lamont and Thaler [6]).

conclusions due to the fundamental theorem of asset pricing which will be covered in chapter 3.

1.5 The Historic Connection between Physics and Finance

The principal connection between the fields of finance and physics is the subject of stochastic
processes (for a good and recent discussion of this connection, see Paul and Baschnagel [7]).
Stochastic processes form the bedrock of the theory of contingent claim (derivative) valuation
and, in fact, the theory of stochastic processes arose from the study of option pricing. Stochastic
processes have also been extensively used in physics for Brownian motion, diffusion, laser
physics and quantum field theory (for a good review of the use of stochastic processes in field
theory, see Namiki[8]).
This connection between physics and finance can be traced back over a hundred years.
The study of the mathematical theory of stochastic processes was initiated by Louis Bache-
lier [9] in his seminal PhD thesis on option pricing published under the supervision of Henri
Poincaré who apparently forgot about it later when he himself started studying Brownian mo-
tion. In one of the more remarkable doctoral theses of any era, he assumed that stock prices
followed what later came to be called a continuous homogeneous Markov process (Markov had
not yet started his study of Markov processes). He showed that the density function of such
a process must satisfy what later came to be called the Chapman-Kolmogorov equation and
that a Gaussian density of linearly increasing variance satisfied this equation. He also showed
that the continuous version of the random walk would also lead to the same density functions.
Further, he showed that the probability density function for this process solved the heat equa-
tion which is the Fokker-Planck equation for this process. Finally, assuming that the option
price is a martingale, he valued several kinds of options under these assumptions. Thus, he
laid the origin not only for the study of Brownian motion, but also for the study of Markov and
diffusion processes.6 As is well known, the mathematical theory of Brownian motion was inde-
pendently developed by Einstein[10] five years later when he calculated the average trajectory
of a microscopic particle subjected to random collisions by molecules in a fluid in one of his
five seminal papers of 1905. Bachelier also studied this problem and derived the distribution of
the Ornstein-Uhlenbeck process (see Jacobsen[11]).
While both fields have extensively studied and used stochastic processes for their own pur-
poses, they did not interact much until the late eighties when physicists started becoming
interested in the subject of option pricing.
Since the late eighties, there has been considerable interest among physicists in applying
their computational techniques for stochastic processes to problems in theoretical finance. The
early applications were in the use of path integral techniques to value stock options. This was
pioneered by Jan Dash (see Dash [12], [13] and [14]). For a good review of these applications,
see Linetsky [15]. While this continues to be a fertile area of research (for recent research in
this area, see Baaquie[16], Srikant[1], Kleinert[17] and the references therein), the ideas and
techniques of physics are now being applied to theoretical finance in many other ways.
For example, physicists have been trying to use ideas from condensed matter physics in
6 It is interesting to note that several of his ideas are still used extensively today in the field of option pricing.
The martingale property which was simply assumed by Bachelier turns out to be of profound importance in
theoretical finance as we will see later.

behavioural finance to explain the properties of stock price movements (for recent work in this
area, see Cavagna et al [18] and Savit et al [19]) and have analysed the properties of security
prices in various markets in great detail (for example, see Bouchaud [20] [21] and Plerou et
al [22]). In recent years, this subject has been given a formal name, econophysics (for a good
introduction, see Mantegna and Stanley [23]).
CHAPTER 2

STOCHASTIC PROCESSES

In this chapter, we will first introduce the standard terminology of probability theory since some
physicists might not be very familiar with it. We will then briefly recall some of the standard
theory of stochastic processes and look at a few examples of how they are used in physics. If
the reader finds any of the first five sections of this chapter unfamiliar, he should refer to any
basic text on stochastic processes such as Brzeźniak and Zastawniak [24] or Ross [25]. This
chapter is largely adapted from Rao [26], Paul and Baschnagel [7], Øksendal [27] and others.
It also uses some ideas from Kleinert [17] and Jacob [28].

2.1 Events, Probability Measures and Random Variables

A probability space is a triple (Ω, F, P ) where Ω is a point set representing all possible out-
comes of the random experiment, F is a σ-algebra of subsets of Ω (a σ-algebra of sets is a
collection of sets that contains the empty set, the complement of any set that is an element of
the algebra as well as all the countable unions of sets within the algebra) and P is a σ-additive1
non-negative function on F with P (Ω) = 1. The elements of F are called events and the func-
tion P : F → [0, 1] defines the probability associated with each event. A simple discrete example
would be the probability space to describe the random experiment of tossing a fair coin once.
In this case Ω = {H, T}, F = {∅, {H}, {T}, {H, T}} and P would be defined by P(∅) = 0, P({H}) = P({T}) = 1/2 and P({H, T}) = 1. Note that only one of the probabilities P({H})
or P ({T }) needs to be specified as the other values would be generated using the properties of
P . In general, there is a collection of events in the σ-algebra, which we will call simple events,
whose probability is sufficient to give the probability of any event in the σ-algebra. In other
words, the σ-algebra is generated by the simple events.
In many cases, we would like to map the probability space to another probability space
whose first two elements are the set of real numbers R and the Borel σ-algebra B (the Borel
σ-algebra is the σ-algebra generated by the open intervals of R). Consider a map X : Ω → R.
If X^{−1}(B) ⊂ F, we call X a random variable whose probability function is given by P′(A) = P(X^{−1}(A)) for A ∈ B. Hence (R, B, P′) would be the probability space generated by the random variable X from the original space. As an example, we consider the probability space in the previous example together with the map X : Ω → R so that X(H) = 1 and X(T) = −1. The probability function is then P′(A) = 0 if neither 1 nor −1 is in A, P′(A) = 1/2 if exactly one of −1 and 1 is in A and P′(A) = 1 if both 1 and −1 are in A. Note that the same process could be used to map to
1 This means that if A_1, A_2, ... are countably many pairwise disjoint sets in the σ-algebra, P(∪_i A_i) = ∑_i P(A_i)


a different point set Ω′ rather than R. In that case, X : Ω → Ω′ would be called an Ω′ -valued
random variable. We will need to use this concept when it is more convenient to obtain a
random variable on Ω′ = R^N.
It should be readily seen that the probability density function provides a measure so that we
can define expectations of random variables over this measure. The expectation of the random
variable X is given by

E[X] = ∫_Ω X(ω) dP(ω) = ∫_R x dP′    (2.1)
It can be easily shown that P ′ is in fact the distribution function of elementary probability
theory. Hence dP ′ = f (x)dx where f (x) is the density function (if it exists) of the random
variable. We can similarly define expectations of functions of the random variable. Hence, all
the familiar notions of elementary probability theory such as the mean, variance, skewness and
so on can be translated into this framework quite easily.
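As a concrete illustration of these definitions (my own sketch, not from the text), the fair-coin example can be written out directly, checking that E[X] = 0 and that the induced measure gives P′({1}) = 1/2:

# Discrete probability space for a single fair coin toss.
omega = ['H', 'T']                      # sample space
P = {'H': 0.5, 'T': 0.5}                # probability of each simple event
X = {'H': 1.0, 'T': -1.0}               # random variable X : Omega -> R

# Expectation E[X] = sum over omega of X(w) P(w)
EX = sum(X[w] * P[w] for w in omega)
# Induced probability P'({1}) = P(X^{-1}({1})) = P({'H'})
P_prime_1 = sum(P[w] for w in omega if X[w] == 1.0)
print(EX, P_prime_1)                    # 0.0 0.5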

2.2 Generating functions and cumulants

The moment generating function of a set of random variables Xi with joint probability distri-
bution f (X) is defined as
M(t) = E[e^{t·X}] = ∫ d^n X e^{t·X} f(X)    (2.2)

It is readily seen that this is the generating function for the correlation functions between powers
of Xi which are given by
E[∏_i X_i^{n_i}] = [ ∂^{∑_i n_i} / ∏_i ∂t_i^{n_i} ] M(t) |_{t=0}    (2.3)
These correlations are also called the moments of the distribution. The moment generating
function might be undefined for certain values of t as the expectation might diverge. In contrast,
the characteristic function always exists and is hence more commonly used.
The characteristic function is defined to be the inverse Fourier transform of the probability
distribution

G(p) = E[e^{ip·X}] = ∫ d^n X e^{ip·X} f(X)    (2.4)

The moments are now given by

E[∏_i X_i^{n_i}] = (1/i^{∑_i n_i}) [ ∂^{∑_i n_i} / ∏_i ∂p_i^{n_i} ] G(p) |_{p=0}    (2.5)

The characteristic function can be considered as the probability distribution in momentum space. From the fact that ∫ d^n X f(X) = 1, we see that G(0) = 1.

The cumulant generating function is defined to be the logarithm of the characteristic func-
tion. We denote this by A(p). The cumulants are defined to be

a_{n_i} = (1/i^{∑_i n_i}) [ ∂^{∑_i n_i} / ∏_i ∂p_i^{n_i} ] A(p) |_{p=0}    (2.6)

This definition can be extended to define non-integer order α cumulants as the absolute value
of the coefficient of pα in the cumulant generating function. This extended definition is of
fundamental importance when considering the full set of probability distributions.
Let us consider a simple example to illustrate these concepts. Consider the univariate stan-
dard normal distribution given by
f(x) = (1/√(2π)) e^{−x²/2}    (2.7)

Its moment generating function is given by
M(t) = e^{t²/2}    (2.8)

characteristic function by

G(p) = e^{−p²/2}    (2.9)

and cumulant generating function by

A(p) = −p²/2    (2.10)
It is seen that the standard normal distribution has an infinite number of non-zero even moments
(odd moments being zero due to symmetry) but only one non-zero cumulant a2 = 1.
Cumulants are also called n-point functions or Ursell functions in physics where n = ∑ ni .
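A quick numerical check of this statement (my own sketch, not from the thesis; it uses sample estimates rather than the exact generating functions, with numpy and scipy assumed available):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)

# Even moments E[x^n] are all non-zero (1, 3, 15 for n = 2, 4, 6)
print([round(float(np.mean(x**n)), 2) for n in (2, 4, 6)])
# Sample cumulants (k-statistics): only the second is (approximately) non-zero
print([round(float(stats.kstat(x, n)), 3) for n in (1, 2, 3, 4)])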

2.3 Equivalent Measures and the Radon-Nikodým Derivative

The idea of equivalent measures is important when one wants to transform the probability mea-
sure. When transforming the probability measure, it is obvious that one has to ensure that
events that are impossible in the initial measure are also impossible in the final measure. Mea-
sures that preserve this property are called equivalent measures. The function implementing
the change of measure is called the Radon-Nikodým derivative. The formal definitions are
presented below.
If we have two probability spaces (Ω, F, µ) and (Ω, F, ν) differing only in the measure, the
measures are called equivalent if
∀A ∈ F, µ(A) = 0 ⇔ ν(A) = 0 (2.11)
In other words, the two measures have the same null events. If two measures are equivalent, it
can be shown that there exists an F-measurable function p, such that for every A ∈ F,
µ(A) = ∫_A p dν = ∫_A dµ    (2.12)

This function p = dµ/dν is called the Radon-Nikodým derivative of µ with respect to ν. In the case of a single continuous random variable x, this has a simple interpretation. If f(x) and g(x) are probability density functions such that f(x) = 0 ⇔ g(x) = 0, then the Radon-Nikodým derivative of µ, where dµ = f(x)dx, with respect to ν, where dν = g(x)dx, is just the ratio f(x)/g(x).
The Radon-Nikodým derivative is non-trivial for more complicated systems as we will see
when we discuss interest rate models which have infinite variable probability measures.
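A one-variable sketch of this (my own example, not from the thesis, assuming µ and ν are the normal distributions N(1, 1) and N(0, 1)): the Radon-Nikodým derivative is the density ratio f(x)/g(x), and reweighting samples of ν by it reproduces expectations under µ.

import numpy as np

rng = np.random.default_rng(1)

# dmu = f(x)dx with f ~ N(1, 1); dnu = g(x)dx with g ~ N(0, 1)
f = lambda x: np.exp(-(x - 1.0)**2 / 2) / np.sqrt(2 * np.pi)
g = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
radon_nikodym = lambda x: f(x) / g(x)        # dmu/dnu

h = lambda x: x**2
x_nu = rng.normal(0.0, 1.0, 1_000_000)       # samples from nu
x_mu = rng.normal(1.0, 1.0, 1_000_000)       # samples from mu

print(np.mean(h(x_mu)))                          # E_mu[h]            ~ 2
print(np.mean(h(x_nu) * radon_nikodym(x_nu)))    # E_nu[h dmu/dnu]    ~ 2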

2.4 Stochastic Processes

A stochastic process is a family {Xt , t ∈ T } such that for each t ∈ T , Xt is a random variable. If
T ⊂ R, the process is usually called a random process while if T ⊂ Rn , n ≥ 2, it is called a ran-
dom field. Even more general stochastic processes can be defined since the only fundamental
restriction on T is that it should be a separable metric space. For example, stochastic processes
defined on Hilbert spaces form the subject of quantum stochastic calculus [29]. We will only
use stochastic processes and fields in this thesis. In most cases, t is taken to mean time. We can
also consider t as a function argument rather than an index. The mapping X(t, ·) is then called a
random function while the mapping X(·, ω ∈ Ω) is called a sample path. It should be noted that
we can consider the stochastic process as a probability space (T × Ω, F, P ). Random variables
over this probability space would be what are normally considered as functionals in physics
while the measure is now defined over a function space. Again P would be the distribution
function of the functional and the expectation of the functional is given by
E[Y] = ∫_{T×R} Y[x(t)] dP    (2.13)

In physics, we usually write dP = Dx(t) e^{S[x(t)]} / ∫_{T×R} Dx(t) e^{S[x(t)]}, where S is called the action functional. Using this, we get the usual statistical mechanical result for the expectation of a functional

E[Y] = ∫ Dx(t) Y[x(t)] e^{S[x(t)]} / ∫ Dx(t) e^{S[x(t)]}    (2.14)

To make this slightly more rigorous, we consider a finite number of random variables from
the Xt or start with a finite set T . We need to know the joint distribution function of the ran-
dom process to be able to calculate the distribution functions of any function of those random
variables. The joint distribution function of the process is defined as follows
F_{X_{t_1},...,X_{t_n}}(x_1, ..., x_n) = P( ⋂_{i=1}^n {ω : X_{t_i}(ω) ∈ (−∞, x_i)} )    (2.15)

In other words, it is the probability that the process at each point is less than the argument. The
Kolmogorov existence theorem states that stochastic processes exist for any reasonable set of
finite time distribution functions. In many cases of practical interest, a density function f exists
such that

F_{X_{t_1},...,X_{t_n}}(x_1, ..., x_n) = ∫_{−∞}^{x_1} dy_1 ... ∫_{−∞}^{x_n} dy_n f(y_1, ..., y_n)    (2.16)
The expectation of any function of these random variables is then given by
E[Z(X_{t_1}, ..., X_{t_n})] = ∫_{−∞}^{∞} dy_1 ... ∫_{−∞}^{∞} dy_n Z(y_1, ..., y_n) f(y_1, ..., y_n)    (2.17)

On taking the limit n → ∞, we get the functional integral (provided that the limit exists)
E[Z[X_t]] = ∫ Dx(t) f[x(t)] Z[x(t)]    (2.18)

It is, of course, much more usual to write the density in terms of the action functional.

In practice, the procedure is usually applied in the reverse direction. We start by proposing
an action functional (or density functional), discretize the functional integral required to calcu-
late the quantity of interest and then take the limit of number of points to infinity to get the final
result. However, to be rigorous, one must start with the distribution or density function for a
finite number of the random variables in the stochastic process as there are many cases where
the continuous time process might not exist.
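A sketch of this discretization procedure (my own illustration, not the author's code): for the Wiener action S[x] = −(1/2)∫ ẋ² dt with x(0) = 0, the discretized measure is just a Gaussian random walk, and the expectation of a functional such as Y[x] = (∫_0^1 x(t) dt)² can be estimated and compared with its exact value 1/3.

import numpy as np

rng = np.random.default_rng(2)
n_steps, n_paths = 400, 20_000
dt = 1.0 / n_steps

# Discretized paths drawn from the Gaussian (Wiener) measure, x(0) = 0:
# increments dx ~ N(0, dt), i.e. weight exp(-sum dx_i^2 / (2 dt)).
dx = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
x = np.cumsum(dx, axis=1)

# Functional Y[x] = (integral_0^1 x(t) dt)^2, discretized as a Riemann sum.
Y = (x.sum(axis=1) * dt)**2
print(Y.mean())    # ~ 1/3 as n_steps, n_paths -> infinity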
In many cases, we want to consider functionals indexed by time φt [x(t′ )] that depend only
on the history of the stochastic process. In other words, φt [x(t′ )] = φt [y(t′)] if x(t′ ) = y(t′ ), t′ ≤
t. To do this, we need to consider the σ-algebras generated by the processes only up to the time
t that will be denoted by Ft . It is easy to see that Ft ⊃ Ft′ if t > t′ . The set of expanding σ-
algebras Ft is called a filtration. A probability space endowed with a filtration is usually called
a filtered probability space. The functional φt is easily seen to be Ft measurable and can by
itself be considered a stochastic process. We say that the process φt is Ft adapted.

2.5 Martingales and semi-martingales

A stochastic process is said to be a martingale if it has no inbuilt drift. More precisely, a stochastic process is a martingale if

E[X_t | F_{t′}] = X_{t′}, t′ < t,  and  E[|X_t|] < ∞    (2.19)

In other words, the best estimator for the future value of the process is the current value. The
concept is particularly simple but very useful, especially in finance.
A stochastic process Xt is called a semi-martingale if it can be written as

Xt = Mt + At (2.20)

where M_t is a martingale and A_t is a process of total finite variation. A process of total finite variation is one that satisfies lim_{N→∞} ∑_{i=0}^{N−1} |A((i+1)T/N) − A(iT/N)| < ∞ almost surely for every T.

Historically, the concept of a martingale was developed from the idea of a fair game in
gambling. If the game is fair, the expected winnings should be zero and the amount of cash has
no inherent drift. The amount of money held by a person who bets on the toss of a fair coin
at every time period is an example of a martingale. Another example is that of the position of
a particle performing a random walk on a regular lattice so that the probability of the particle
moving to any of the neighbouring sites is equal at each time step.
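A small simulation (my own sketch, not from the thesis) of the coin-toss martingale just mentioned: conditioning on the value of the winnings at an earlier time, the expected value at a later time is (approximately) the earlier value.

import numpy as np

rng = np.random.default_rng(3)
n_paths, s, t = 200_000, 5, 20

steps = rng.choice([-1, 1], size=(n_paths, t))   # fair coin bets of +/- 1
X = steps.cumsum(axis=1)                         # winnings X_1, ..., X_t
X_s, X_t = X[:, s - 1], X[:, t - 1]

# E[X_t | X_s = k] should equal k for a martingale
for k in (-3, -1, 1, 3):
    mask = X_s == k
    print(k, X_t[mask].mean().round(2))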

2.6 Markov Processes

A Markov process is a stochastic process whose future time evolution is not dependent upon
the history of its trajectory though it might depend on its current position. In other words, a
Markov process has no memory. More formally, a stochastic process is called a Markov process
if it is true for all n and all t1 < t2 < · · · < tn that

f (xn , tn |x1 , t1 ; . . . ; xn−1 , tn−1 ) = f (xn , tn |xn−1 , tn−1 ) (2.21)



where f(A|B) denotes the conditional density function of A given B. The first argument refers to the random variable for which this is the density function, while the condition on the right shows that the density depends on the history only through the most recent point (x_{n−1}, t_{n−1}), which can be arbitrarily close to the point of interest. In other words, all the historical information about the Markov process does not enable us to better predict its future.
This is an enormous simplification and the joint density functional of the process can be
reduced to the product of a sequence of relatively simple conditional density functions. In
other words,
g(x_1, t_1; x_2, t_2; ...; x_n, t_n) = p(x_1, t_1) ∏_{i=2}^{n} f(x_i, t_i | x_{i−1}, t_{i−1})    (2.22)
where p(x1 , t1 ) is the initial density function of the process. We will denote the density function
for the value of the process at time t given only the initial conditions by p(x, t).
Since the process must take some value at intermediate times and the Markov property
holds at all times, the density functions must satisfy the Chapman-Kolmogorov equation
f(x_3, t_3 | x_1, t_1) = ∫ dx_2 f(x_3, t_3 | x_2, t_2) f(x_2, t_2 | x_1, t_1),  t_3 ≥ t_2 ≥ t_1    (2.23)

This is a consistency equation for the conditional probabilities of a Markov process. The
Markov process is completely specified by the initial and conditional probability density func-
tions. In general, the semigroup f has a one-to-one correspondence with a positive definite pseudo-differential operator [28] and can be studied in great generality using this approach.
Since this approach requires considerable formalism and the same results can be derived from
formal expressions such as the Kramers-Moyal expansion (which is not rigorous since even the
first term in the expansion need not exist), we present it heuristically and link it to the formal
expressions seen in many textbooks.
If the conditional density f(x_2, t_2 | x_1, t_1) only depends on the time difference t_2 − t_1, we say that the process is stationary and we see that the Chapman-Kolmogorov equation becomes

K(x_3, x_1, t + τ) = ∫ dx_2 K(x_3, x_2, τ) K(x_2, x_1, t)    (2.24)

where K(x_2, x_1, t_2 − t_1) = f(x_2, t_2 | x_1, t_1). We call K the transition probability. If K depends


only on the difference x2 − x1 , then we say that the process is homogeneous. It is important to
note that the kernel f specifies the Markov process completely.
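As a numerical sketch (my own, assuming the homogeneous Gaussian kernel of Brownian motion and an arbitrary grid), the Chapman-Kolmogorov equation (2.23) can be checked by carrying out the intermediate integral on a grid:

import numpy as np

def K(x2, x1, t):
    """Homogeneous Gaussian transition kernel (heat kernel), variance t."""
    return np.exp(-(x2 - x1)**2 / (2 * t)) / np.sqrt(2 * np.pi * t)

x1, x3, t, tau = 0.0, 1.2, 0.5, 0.3
x2 = np.linspace(-15, 15, 20001)
dx2 = x2[1] - x2[0]

lhs = K(x3, x1, t + tau)
rhs = np.sum(K(x3, x2, tau) * K(x2, x1, t)) * dx2   # integral over x2
print(lhs, rhs)   # both ~ 0.1813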

2.7 The Kramers-Moyal Expansion

For a Markov process, the probability density function satisfies


P(x, t + τ) = ∫ dx′ f(x, t + τ | x′, t) P(x′, t)    (2.25)

Assume that the moments of the conditional density function

M_n(x′, t, τ) = ∫ dx (x − x′)^n f(x, t + τ | x′, t)    (2.26)

are known. We can then write


f(x, t + τ | x′, t) = ∫ dy δ(y − x) f(y, t + τ | x′, t)
                    = ∫ dy f(y, t + τ | x′, t) δ(x′ − x + y − x′)
                    = ∫ dy f(y, t + τ | x′, t) ∑_{n=0}^∞ [(y − x′)^n / n!] (∂/∂x′)^n δ(x′ − x)
                    = ∑_{n=0}^∞ (1/n!) [ ∫ dy (y − x′)^n f(y, t + τ | x′, t) ] (∂/∂x′)^n δ(x′ − x)
                    = ( 1 + ∑_{n=1}^∞ (1/n!) M_n(x′, t, τ) (∂/∂x′)^n ) δ(x′ − x)    (2.27)

where δ(x) is the Dirac delta function.2 Putting this into (2.25), retaining only terms linear in τ
and performing some integrations by parts where the boundary terms are taken to be zero, we
get
τ ∂P(x, t)/∂t + O(τ²) = ∑_{n=1}^∞ (−∂/∂x)^n [ (M_n(x, t, τ)/n!) P(x, t) ]    (2.28)
Note carefully the (−1)^n factor that appears because of the n integrations by parts that have to be performed for each term. Expanding M_n(x, t, τ) in a Taylor series about τ = 0 and retaining only the linear term, we obtain the Kramers-Moyal expansion

∂P(x, t)/∂t = ∑_{n=1}^∞ (1/n!) (−∂/∂x)^n [ (d/dτ) M_n(x, t, τ) |_{τ=0} P(x, t) ]    (2.29)

For the following discussion, we denote (d/dτ) M_n(x, t, τ) |_{τ=0} by a_n. It should be noted that the
an are merely the scaled cumulants for the infinitesimal transition probability density function.
They are the cumulants rather than the moments as the expression above might seem to indicate
since on scaling, the cumulant becomes dominant. This will be seen more clearly when we
consider the Langevin equation. It is useful to note that defining the scaled cumulants in this
way is perfectly reasonable since they are additive.
Alternatively, we can define a time evolution operator for the density function by e^{ǫH_KM(t,p,x)} = K(t, p, x, ǫ) = ∫ dy e^{−ipy} K(t, y, x, ǫ), or H_KM = (1/ǫ) ln G(t, −p, x, ǫ) in the limit ǫ → 0, write this operator in terms of the momentum (of the y variable) and position (in other words, consider it as a pseudo-differential operator) and expand it in a power series to obtain the Kramers-Moyal expansion. This can be seen from the fact that the transformation to the momentum basis is just a Fourier transform that gives the complex conjugate of the characteristic function, G(t, −p, x, ǫ), whose logarithm is by definition a series expansion with the coefficients being the cumulants of K multiplied by (−ip)^n. That is

H_KM(t, p, x) = lim_{ǫ→0} (1/ǫ) ln K(t, p, x, ǫ) = lim_{ǫ→0} (1/ǫ) ln G(t, −p, x, ǫ)
             = lim_{ǫ→0} (1/ǫ) ∑_{n=1}^∞ [(−ip)^n/n!] a_n(x, t, ǫ) = ∑_{n=1}^∞ [(−ip)^n/n!] a_n(x, t)    (2.30)

2 The Dirac delta function is actually a distribution defined as δ(x) = 0 for x ≠ 0 and ∫_{−∞}^{∞} f(x)δ(x)dx = f(0) for all continuous functions f.

Note that the expansion might not necessarily be in terms of integer n as the characteristic
function may include non-integer powers of p, though we write it in this manner to retain the
similarity to the Kramers-Moyal expansion. Since the probability distribution at time t+ǫ is the
convolution of the kernel K and the distribution at time t, the Fourier transforms or momentum
space probability distribution at time t + ǫ is given by the product of the momentum space
versions of the kernel K and probability distribution P at time t. That is,
P(x, t + ǫ) = (1/2π) ∫ dp e^{ipx} e^{ǫH_KM(t,p,x)} P(p, t)    (2.31)

or since ǫ is small
P(x, t + ǫ) = P(x, t) + (ǫ/2π) ∫ dp e^{ipx} H_KM(t, p, x) P(p, t)    (2.32)

which gives, on straightforward simplification with the use of (2.30)

∂P(x, t)/∂t = (1/2π) ∫ dp e^{ipx} H_KM(t, p, x) P(p, t) = (1/2π) ∑_{n=1}^∞ ∫ dp e^{ipx} [(−ip)^n/n!] a_n(x, t) P(p, t)    (2.33)
which means that in position space, we have
Ĥ_KM(t, x) = (1/2π) ∫ dp e^{ipx} H_KM(t, p, x) = ∑_{n=1}^∞ [(−1)^n/n!] a_n(x, t) (∂/∂x)^n    (2.34)

where the first form is always valid. When writing ĤKM in this manner, we must be careful
about operator ordering. This can be resolved by using the fact that the probability distribution
has to be normalized or that P (p = 0) = 1. Borrowing from the notation of quantum mechanics
where such calculations are common, we see that
∂⟨p | P⟩/∂t = ⟨p | Ĥ_KM(t, x, p) | P⟩    (2.35)

which gives

⟨p | P(t + ǫ)⟩ = ⟨p | P(t)⟩ + ǫ ∫ dx ⟨p | Ĥ_KM(t, x, p) | x⟩⟨x | P(t)⟩    (2.36)

Since the value of | P⟩ evaluated at zero p remains constant, the second term in the equation above must be zero at zero p. This can only be so if p̂ appears before x̂ and if there are no terms in Ĥ_KM which are non-zero at zero p (for example, one cannot add potential terms like V(x̂)). If these conditions are satisfied and if K is normalized and nonnegative for all x, the Hamiltonian describes a Markov process (the validity of a Hamiltonian can also be directly checked from Bochner's theorem). If we want an expansion that goes backward in time, we have to use the adjoint of the above expansion, or in other words, put the x̂s before the p̂s (see Paul and Baschnagel [7], for example). The backwards expansion is also more important in finance due to final value problems and, as we will see, expectations evolve forward with the backward expansion. It is easy to see that the backward Kramers-Moyal expansion is given by

Ĥ†_KM(x) = ∑_{n=1}^∞ a_n(x, t) (∂/∂x)^n    (2.37)

The (−1)^n factor has been removed since (∂/∂x)† = −∂/∂x. One overall minus sign has also been removed as we adopt the convention of writing the backwards Kramers-Moyal expansion with time travelling backwards.
Let us now consider the evolution of the expectation of some function of x, h(x). We can denote the function h(x) by ⟨h | x⟩ so that its expectation at any time is given by ∫ dx h(x) P(x, t) = ⟨h | P(t)⟩. The time evolution of this expectation can then be written as

∂⟨h | P⟩/∂t = ⟨h | Ĥ_KM | P⟩ = ⟨Ĥ†_KM h | P⟩    (2.38)

Hence, the rate of change of the expectation of the function h is also the expectation of Ĥ†_KM h. In finite time, we have

⟨h | P⟩(t) = ⟨h | e^{tĤ_KM} | P⟩ = ⟨e^{tĤ†_KM} h | P⟩(0)    (2.39)

Hence the expectation of h at time t is the expectation of the function e^{tĤ†_KM} h at the current time. If we denote the function e^{tĤ†_KM} h by h(t, x), we see that it satisfies the partial differential equation

∂h(t, x)/∂t = Ĥ†_KM h(t, x),   h(0, x) = h(x)    (2.40)

Further, we can write (2.39) as

| h(t)⟩ = e^{tĤ†_KM} | h(0)⟩    (2.41)

where we now denote h as a ket vector as we would like to think of it as a “state”. This state then satisfies the evolution equation

∂| h⟩/∂t = Ĥ | h⟩    (2.42)

where Ĥ = Ĥ†_KM. Our reason for writing it this way is because we generally want to consider
the evolution of the expectation of a function of the stochastic process at some point in time.
Due to the above reasoning, we see that we can consider this state in a way very similar to a
quantum mechanical state vector with Ĥ as the Hamiltonian. The analogy should not be taken
too far as Ĥ need not be Hermitian and may not even have anything akin to a kinetic term
p̂2 /2m. However, as long as time evolution is the only concern, we can think of the above
as a quantum mechanical problem. We shall use this analogy extensively and call Ĥ the Hamiltonian from now on.
We can put the above evolution in another form to give us the following very useful result
E_0[h(x(t))] = E_0[h(x(0))] + E_0[ ∫_0^t dt′ Ĥ h(x(t′)) ]    (2.43)

which is known as Dynkin’s formula (see Øksendal [27]). In many books on stochastic pro-
cesses, Ĥ is called the generator of the diffusion described by the process.
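A quick numerical check of Dynkin's formula (my own sketch, not from the thesis; numpy and the parameter values are my choices): for standard Brownian motion the generator is Ĥ = (1/2)∂²/∂x², so taking h(x) = x⁴ gives Ĥh = 6x² and (2.43) predicts E_0[h(x(t))] = h(x(0)) + E_0[∫_0^t 6 x(s)² ds] = 3t².

import numpy as np

rng = np.random.default_rng(4)
t, n_steps, n_paths = 2.0, 1000, 50_000
dt = t / n_steps

x = np.zeros(n_paths)
integral_Hh = np.zeros(n_paths)     # accumulates int_0^t (H h)(x(s)) ds per path
for _ in range(n_steps):
    integral_Hh += 6.0 * x**2 * dt  # H h = (1/2) h'' = 6 x^2 for h(x) = x^4
    x += rng.normal(0.0, np.sqrt(dt), n_paths)

print(np.mean(x**4))           # E[h(x(t))]              ~ 3 t^2 = 12
print(np.mean(integral_Hh))    # h(0) + E[int H h ds]    ~ 12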
A major simplification occurs when the stochastic process is homogeneous which means
that ĤKM depends only on p. In this case, the process in momentum space becomes very
simple as the evolution of the probability in momentum space is trivial. In this case, we have
P(p, t) = e^{tH_KM(p)} P(p, 0)    (2.44)

For example, if we have a process which starts at 0 (that is, x(0) = 0 so that in momentum space we have P(p, 0) = 1) and whose Hamiltonian is −|p|^{3/2}/Γ(5/3) (so the process is a standard Lévy flight process), its distribution at time t in momentum space is given simply by

P(p, t) = e^{−t|p|^{3/2}/Γ(5/3)}    (2.45)

This allows one to do path integrals over such processes very simply numerically.
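A sketch of this numerical shortcut (my own illustration; the grid sizes are arbitrary choices and numpy/scipy are assumed available): the momentum-space distribution (2.45) is evaluated directly and a single FFT gives the position-space density P(x, t).

import numpy as np
from scipy.special import gamma

t = 1.0
N, L = 4096, 200.0                          # grid points and spatial extent
dx = L / N
x = (np.arange(N) - N // 2) * dx            # position grid centred at 0
p = 2 * np.pi * np.fft.fftfreq(N, d=dx)     # conjugate momentum grid

P_p = np.exp(-t * np.abs(p)**1.5 / gamma(5.0 / 3.0))     # equation (2.45)
P_x = np.fft.fftshift(np.fft.ifft(P_p)).real / dx        # back to position space

print(P_x.sum() * dx)    # ~ 1: the density is normalised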

2.8 The Fokker-Planck and Langevin Equations

If the conditional density function f (x, t + τ |x′ , t) for small τ varies slowly as a function of
x and the cumulants do not diverge, we can truncate the Kramers-Moyal expansion and still
obtain a good approximation. In fact, if the conditional density function is Gaussian, only
the first two terms in the expansion are non-zero since the Gaussian distribution has only two
non-zero cumulants. Hence, the expansion is usually truncated to two terms when describing
Gaussian processes3 yielding the Fokker-Planck equation

∂p(x, t)/∂t = −∂(a_1(x) p(x, t))/∂x + (1/2) ∂²(a_2(x) p(x, t))/∂x²    (2.46)
The backwards Fokker-Planck equation is then obviously given by

∂p(x, t)/∂t = a_1(x) ∂p(x, t)/∂x + (a_2(x)/2) ∂²p(x, t)/∂x²    (2.47)

While the infinite order PDE given by the Kramers-Moyal expansion and the Fokker-Planck
equation describe how the probability density function of the process p(x, t) behaves, they do
not describe directly how the sample paths behave. This is done by the Langevin equation
dx/dt = η(x, t, ω)    (2.48)
Usually, we drop the ω and write η(x, t) which is understood to be a stochastic quantity. Of course, we must now explicitly describe the properties of η to make this meaningful. The probability density function for η is given by lim_{dt→0} K(η dt, x, dt) (it is η dt rather than η because of the dt term which scales all changes in x by dt). Using the characteristic function of the conditional density function K for small τ, which is by definition e^{dt H_KM(x,p)} = e^{dt(i a_1(x) p − (a_2(x)/2) p²)}, where we take x to be its value at t, we can see that

⟨η(x, t)⟩ = (1/(i dt)) (d/dp) e^{dt H_KM(x,p)} |_{p=0} = a_1(x)    (2.49)

and

⟨η²(x, t)⟩ = −(1/dt²) (d²/dp²) e^{dt H_KM(x,p)} |_{p=0} = a_2(x)/dt + a_1²(x)    (2.50)
3 Note that by Gaussian processes, I mean processes that are Gaussian in the short time limit, that is the
random variable limǫ→0 x(t + ǫ) − x(t) follows a Gaussian distribution. The random variable x(t + T ) − x(t) for
finite T can have any distribution.

Since we are describing a Markov process the correlations for different times are zero. Hence,
we can write the above more generally as

⟨η(x, t_1)⟩ = a_1    (2.51)
⟨η(x, t_1) η(x, t_2)⟩ = a_2 δ(t_1 − t_2) + a_1²    (2.52)
⟨η(x, t_1) η(x, t_2) η(x, t_3)⟩ = a_3 δ(t_1 − t_2) δ(t_1 − t_3) + a_2 a_1 (δ(t_1 − t_2) + δ(t_2 − t_3) + δ(t_1 − t_3)) + a_1³    (2.53)

and so on and so forth. Note that these expectations are basically expressing the relation between the moments and cumulants since the a_n are the cumulants for η. We also see that the cumulants dominate the expansion for infinitesimal time as they are greater by at least one delta function factor. This is the reason why the scaled moments in the previous section became equal to the scaled cumulants. For processes where the cumulants do not exist, the Fourier transform of η can be specified.
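As a sketch of how the Langevin description is used in practice (my own example, not from the thesis, taking the Ornstein-Uhlenbeck cumulants a_1(x) = −x and a_2(x) = 2): a simple Euler discretization of the Langevin equation reproduces the stationary solution of the corresponding Fokker-Planck equation (2.46), which for this choice is a standard normal.

import numpy as np

rng = np.random.default_rng(5)
a1 = lambda x: -x          # drift: first scaled cumulant
a2 = lambda x: 2.0         # diffusion: second scaled cumulant

dt, n_steps, n_paths = 0.01, 2000, 50_000
x = np.zeros(n_paths)
for _ in range(n_steps):
    # Euler step for dx = a1(x) dt + sqrt(a2(x)) dW
    x += a1(x) * dt + np.sqrt(a2(x) * dt) * rng.standard_normal(n_paths)

# Stationary Fokker-Planck solution for these cumulants is N(0, 1)
print(x.mean().round(3), x.var().round(3))   # ~ 0.0, ~ 1.0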

2.9 Extension to Multivariate Distributions

The extension of the Kramers-Moyal expansion, the Fokker-Planck equation and the Langevin
equation to multivariate distributions is straightforward and the analysis proceeds in exactly the
same way. Rather than repeat all the calculations, we present the results.
The Kramers-Moyal expansion for multivariate distributions of N variables is given by
f(x, t + τ | x′, t) = ( 1 + ∑_{n_1=1}^∞ ··· ∑_{n_N=1}^∞ [1/(n_1! ··· n_N!)] M_{n_1,...,n_N}(x′, t, τ) (∂/∂x′_1)^{n_1} ··· (∂/∂x′_N)^{n_N} ) δ(x′ − x)    (2.54)
where M_{n_1,...,n_N} are the multivariate moments of order n_i in the ith variable. The alternative derivation remains exactly the same, with x, p and y now being vectors, since no assumption about dimensionality was made in any of the steps. The evolution equation for the probability density becomes
\frac{\partial p(x,t)}{\partial t} = \sum_{n_1=1}^{\infty}\cdots\sum_{n_N=1}^{\infty}\frac{1}{n_1!\ldots n_N!}\left(-\frac{\partial}{\partial x_1}\right)^{n_1}\!\!\ldots\left(-\frac{\partial}{\partial x_N}\right)^{n_N}\left[\left.\frac{d}{d\tau}M_{n_1,\ldots,n_N}(x,t,\tau)\right|_{\tau=0}\right]p(x,t) = \hat H_{KM}\,p(x,t)    (2.55)

where the a_{n_1,...,n_N} = \left.\frac{d}{d\tau}M_{n_1,\ldots,n_N}(x,t,\tau)\right|_{\tau=0} are now the scaled multivariate cumulants. The Fokker-Planck equation therefore becomes

\frac{\partial p(x,t)}{\partial t} = -\sum_{i=1}^{N}\frac{\partial\big(a_i(x)\,p(x,t)\big)}{\partial x_i} + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{\partial^2\big(a_{i,j}(x)\,p(x,t)\big)}{\partial x_i\,\partial x_j}    (2.56)

where a_i is the first cumulant of the ith variable and a_{i,j} is the second joint cumulant of the ith and jth variables. In what follows, we drop the summation signs, which are always implied, for compactness.


The evolution of the expectations takes place with the operator ĤKM or Ĥ as before (this
is shown in exactly the same manner as in the univariate case) with

\hat H = \hat H_{KM}^{\dagger} = a_{n_i}(x)\frac{\partial}{\partial x_i} + \frac{1}{2}a_{n_i,n_j}(x)\frac{\partial^2}{\partial x_i\,\partial x_j}    (2.57)
when truncated to second order. When written in integral form with the expectations explicitly
put in, it is called the multivariate Dynkin’s formula
E_t\big[h(x)\,|\,x(0)=a\big] = h(a) + E_t\left[\int_0^t \hat H h(x)\,dt\right]    (2.58)

The stochastic process with generator (2.57) is equivalent to the Itô stochastic differential equa-
tion
dx(t) = a1 (x)dt + a2 (x)dB(t) (2.59)
(see section 2.13 for the definition of the stochastic differential equation) where a_1 is the vector a_{n_i}, 1 ≤ i ≤ N, a_2 is an N × M matrix such that a_2 a_2^T is the matrix a_{n_i,n_j}, 1 ≤ i, j ≤ N, and B is an M-dimensional Wiener process. If the cumulants diverge, then the Hamiltonian must be expressed in the momentum basis as in the univariate case. Even when the cumulants do exist, the Hamiltonian looks simpler when expressed in terms of both x and p. This can be considered a phase space formulation (though not necessarily a canonical one, as this is not necessarily a Hamiltonian dynamical system). For example, (2.57) becomes
\hat H = i\,a_{n_i}(x)\,p_i - \frac{1}{2}a_{n_i,n_j}(x)\,p_i p_j    (2.60)

The multivariate Dynkin formula is a very powerful tool which is usually not seen in physics books dealing with stochastic processes. To show its usefulness, we will provide a very simple proof that simple Brownian random walks are recurrent in two dimensions but are not recurrent in higher dimensions.

Example

Consider an n-dimensional Brownian motion B = (B_1, ..., B_n) starting at a point a such that R < |a| < 2R for some R. Let α_k denote the first exit time of the Brownian motion from the annulus A_k = {x : R < |x| < 2^k R}, k ∈ N. For any smooth function f, we then have by Dynkin's formula (2.43)

E[f(B(\alpha_k))] = f(a) + E\left[\frac{1}{2}\int_0^{\alpha_k}\nabla^2 f(B(s))\,ds\right]    (2.61)
Here we have made use of the fact that the kernel for each component of the Brownian motion, which is N(0, dt), has only one non-zero cumulant, namely the second. Since this cumulant on scaling has value one, Ĥ has a contribution of \frac{1}{2}\frac{\partial^2}{\partial x_i^2} for each component. No other terms exist since there are no cross-correlations. Hence, Ĥ is given by \frac{1}{2}\nabla^2, leading to the formula above. Since we want to keep things simple, we choose the function f to be harmonic so that the second term above disappears. This means that

f(x) = \begin{cases} -\ln|x| & n = 2 \\ |x|^{2-n} & n > 2 \end{cases}    (2.62)

Then, Dynkin’s formula reduces to

E[f(B(\alpha_k))] = f(a)    (2.63)

irrespective of the value of k. Let p_k denote the probability that the particle first leaves the annulus at radius R, and q_k the probability that it first leaves the annulus at radius 2^k R. Obviously, q_k = 1 − p_k. Then, if n = 2, we get from (2.62) and (2.63) that

-p_k\ln R - q_k(\ln R + k\ln 2) = -\ln|a|    (2.64)

which implies that qk → 0 as k → ∞. This means that the probability that the Brownian motion
wanders away to infinity is zero or that the Brownian motion in two dimensions is recurrent.
For n > 2, (2.62) and (2.63) give

p_k R^{2-n} + q_k\,(2^k R)^{2-n} = |a|^{2-n}    (2.65)

From this, we get


\lim_{k\to\infty} p_k = \left(\frac{|a|}{R}\right)^{2-n}    (2.66)
which is between zero and one. Hence Brownian motion is transient in more than two dimen-
sions.
This is usually proved in many books by using generating functions. The proof above can
be seen to be much simpler.
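As a numerical sanity check of the exit probabilities derived above, the short Python sketch below simulates crudely discretized Brownian paths started at |a| = 1.5 with R = 1 and k = 2, and estimates p_k, the probability of hitting the inner sphere first. The time step, the path count and the choice k = 2 are illustrative assumptions, and the discretization slightly biases the boundary hits.

import numpy as np

def exit_inner_prob(n_dim, a=1.5, R=1.0, k=2, dt=1e-3, n_paths=2000, seed=1):
    """Estimate p_k = P(Brownian motion hits |x| = R before |x| = 2^k R)."""
    rng = np.random.default_rng(seed)
    outer = 2.0**k * R
    x = np.zeros((n_paths, n_dim))
    x[:, 0] = a                                   # start on the first axis at radius a
    hit_inner = np.zeros(n_paths, dtype=bool)
    alive = np.ones(n_paths, dtype=bool)
    while alive.any():
        x[alive] += np.sqrt(dt) * rng.standard_normal((alive.sum(), n_dim))
        r = np.linalg.norm(x, axis=1)
        hit_inner |= alive & (r <= R)
        alive &= (r > R) & (r < outer)
    return hit_inner.mean()

# Analytic values from (2.64) and (2.66) with R = 1, |a| = 1.5, k = 2
a, k = 1.5, 2
print("n=2: MC", exit_inner_prob(2), " analytic", 1 - np.log(a) / (k * np.log(2)))
print("n=3: MC", exit_inner_prob(3), " analytic", (1/a - 2.0**-k) / (1 - 2.0**-k))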

2.10 Stochastic Fields

We can further extend the analysis to stochastic fields. If we have an n-dimensional stochastic field f(x), the Kramers-Moyal Hamiltonian can be written as

\hat H_{KM} = \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\int_{-\infty}^{\infty}\prod_{i=1}^{n}dx_i\,a_n(t,\{x_i\},\{f(t,x_i)\})\,\frac{\delta^n}{\prod_{i=1}^{n}\delta f(t,x_i)}    (2.67)
where an are now scaled multivariate cumulants. The generator of the diffusions is usually a
much more useful quantity since one rarely wants to consider the joint probability distribution
of an infinite number of variables. It is given by
\hat H = \hat H_{KM}^{\dagger} = \sum_{n=1}^{\infty}\frac{1}{n!}\int_{-\infty}^{\infty}\prod_{i=1}^{n}dx_i\,a_n(t,\{x_i\},\{f(t,x_i)\})\,\frac{\delta^n}{\prod_{i=1}^{n}\delta f(t,x_i)}    (2.68)
For the case of locally Gaussian stochastic fields, we get
\hat H = \frac{1}{2}\int_{-\infty}^{\infty}dx\,dx'\,a_2(x,x',f(t,x),f(t,x'))\,\frac{\delta^2}{\delta f(t,x)\,\delta f(t,x')} + \int_{-\infty}^{\infty}dx\,a_1(x,f(t,x))\,\frac{\delta}{\delta f(t,x)}    (2.69)
or
\hat H = -\frac{1}{2}\int_{-\infty}^{\infty}dx\,dx'\,a_2(x,x',f(t,x),f(t,x'))\,p(x)p(x') + i\int_{-\infty}^{\infty}dx\,a_1(x,f(t,x))\,p(x)    (2.70)

where the p(x) represent the Fourier conjugate variables of f (x), a result which we exten-
sively use when analysing interest rate models. Note that since we are using the kernel and
the Hamiltonian to define the Markov process, the Hamiltonian above is not a derived quantity
but follows directly from the specification of the kernel of the Markov process which we have
specified using the cumulants. Normally, the stochastic fields would be specified by the path
integral which would be obtained from the finite dimensional distributions.
To illustrate the point, we write down the finite dimensional kernel of the process which is
given by the multivariate normal distribution
\frac{1}{(2\pi\epsilon)^{N/2}\sqrt{\det a_2(f_{i-1})}}\exp\left[-\frac{1}{2\epsilon}\big(f_{i,j}-f_{i-1,j}-\epsilon a_1(f_{i-1})\big)\,a_2^{-1}(f_{i-1})\,\big(f_{i,j}-f_{i-1,j}-\epsilon a_1(f_{i-1})\big)\right]    (2.71)
where the time has been discretized with index i and spacing ǫ while the space has been discretized with indices j (the number of indices or the spacing is irrelevant). f_{i-1} is the collection of variables f_{i-1,j}. On integration over f_i, the term \sqrt{\det a_2} will be cancelled. However, we should remember to include this term in the definition of the path integral as the path integral would otherwise be meaningless. Note that while the determinant for the previous time slice enters into the path integral, we will write the integration measure as
\prod_{x}\int_{-\infty}^{\infty}\frac{df(t,x)}{\sqrt{\det a_2(\{f\})}}    (2.72)

for simplicity. However, it should be recalled that the measure is actually


\prod_{j}\int_{-\infty}^{\infty}\frac{df_{i,j}}{\sqrt{\det a_2(\{f_{i-1}\})}}    (2.73)

In the continuous case, the two are equivalent. In continuous space and time, we write the path
integral as

S = -\frac{1}{2}\int dt\int_{-\infty}^{\infty}dx\,dx'\,(\dot f - a_1)(x)\,a_2^{-1}(x,x')\,(\dot f - a_1)(x')    (2.74)
with measure given by (2.72).
We now consider a relatively simple separable case to illustrate the measure term further.
Suppose the covariance a_2 can be expressed as \sigma(t,x,f(t,x))\,D(x,x')\,\sigma(t,x',f(t,x')), where D^{-1} is a local differential operator. In this case, a_2 = \sigma D\sigma and the term \sqrt{\det a_2^{-1}} = \prod_x\frac{1}{\sigma\sqrt{\det D}} appears after the integration is done. Hence, we re-scale the terms by σ and write the above action more simply as
S = -\frac{1}{2}\int dt\int_{-\infty}^{\infty}dx\,\left(\frac{\dot f - a_1}{\sigma}\right)D^{-1}\left(\frac{\dot f - a_1}{\sigma}\right)    (2.75)

and the remaining term det D will appear after the path integral is done. In the definition of
the path integral for this action, one must be careful to include the measure terms induced by
the rescaling with σ since they are f dependent in general. Hence, the path integral will be
given by

Z = \int Df\,\sigma^{-1}\,e^{S}    (2.76)

where

\int Df\,\sigma^{-1} = \prod_{x}\int_{-\infty}^{\infty}df(t,x)\,\sigma^{-1}(t,x,f(t,x))    (2.77)
We will see concrete examples of the application of this when we analyse interest rate models.

2.11 Killing terms and Potentials

One interesting property of the generator of diffusions Ĥ for the stochastic processes consid-
ered above is that it doesn’t contain any terms without differential operators such as potential
terms in physics. However, we can find other stochastic processes where the generator has such
terms which are generally called killing terms.
The generator Ĥ

\frac{1}{2}a_{n_i,n_j}(x)\frac{\partial^2}{\partial x_i\,\partial x_j} + a_{n_i}(x)\frac{\partial}{\partial x_i} - c    (2.78)
generates a stochastic process x̃ which is obtained by killing x at a certain killing time ζ pro-
vided c(x) > 0 . That is, there exists a random time ζ such that if we put
x̃(t) = x(t), t < ζ (2.79)
and leave x̃(t) undefined if t ≥ ζ, then x̃ is also a Markov process and
E[h(\tilde x(t))] = E\left[h(x(t))\,e^{-\int_0^t c(x(s))\,ds}\right]    (2.80)

Note carefully that the path dependent discounting factor is outside the function argument so
that all functions are discounted in the same way. This is the only property of x̃ that we will
need and so we will not provide an explicit construction of the stopping time ζ. The interested
reader can find this construction in Karlin and Taylor[30].
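For a constant killing rate c, the relation (2.80) can be checked directly by Monte Carlo: killing the path at an independent exponential time ζ with rate c gives the same expectation as weighting every path by e^{−ct}. The minimal Python sketch below, with x a standard Wiener process, h(x) = x² and c = 0.5 chosen purely for illustration, compares the two.

import numpy as np

rng = np.random.default_rng(2)
c, t, n = 0.5, 1.0, 500_000              # illustrative killing rate, horizon, paths

x_t = np.sqrt(t) * rng.standard_normal(n)   # Wiener process at time t, x(0) = 0
h = x_t**2

# (a) discount every path by exp(-c t)  (constant c, so the integral in (2.80) is c t)
discounted = np.exp(-c * t) * h

# (b) kill each path at an independent exponential time zeta with rate c and keep
#     h(x(t)) only if the path survived past t
zeta = rng.exponential(1.0 / c, n)
killed = np.where(zeta > t, h, 0.0)

print("weighted :", discounted.mean())
print("killed   :", killed.mean())
print("analytic :", np.exp(-c * t) * t)     # E[x(t)^2] = t for a Wiener process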

2.12 Cumulants and the Central Limit Theorem

We note some of the important properties of cumulants. Firstly, they are additive, i.e., if two
independent random variables X and Y have distributions with cumulants an and bn , the sum
X + Y has cumulants an + bn . Secondly, they are homogeneous, i.e., if a random variable X
has cumulants an then the random variable cX has cumulants cn an .
While these properties alone are sufficient to motivate the central limit theorem, there is one further property that deserves mention due to its ubiquitous use in physics and information theory. If a random variable X has n moments µ_r, then the relation between the moments and the cumulants is given by
\mu_{r+1} = \sum_{j=0}^{r}\binom{r}{j}\mu_j\,a_{r+1-j}, \qquad r = 0,\ldots,n-1    (2.81)

This can be written in an illuminating fashion as


\alpha_n = \sum_{\pi} c_{\pi}\,a_{\pi}    (2.82)

where π denotes the set of additive partitions of n. The additive partitions of a number are
the non-increasing lists of positive integers which add up to that number. aπ is defined to be
∏j∈π aj . cπ is given by
c_{\pi} = \frac{n!}{(j_1!)^{m_1}\ldots(j_k!)^{m_k}}\,\frac{1}{m_1!\ldots m_k!}    (2.83)
where there are assumed to be k distinct integers jk of multiplicity mk in the partition π. When
generalized to multivariate random variables Xi , we obtain the following relation
a_{n_1,\ldots,n_N}(X) = \sum_{G}(-1)^{m-1}(m-1)!\prod_{j=1}^{m}\mu_{G_j}(\{X_{G_j}\})    (2.84)

where G is a partition of the list of indices i repeated ni times each into m subgroups Gj and
XGj are the Xi with indices in Gj . The µk are now multivariate moments. The multivariate
cumulant functions are also called the Ursell functions or connected Green's functions. This partition property of the multivariate cumulants ensures that they are zero if any of the random variables X_i are independent. A commonly used example is the second multivariate cumulant, usually called the covariance. One important use of this property of the cumulants is in perturbative calculations in quantum field theory, where the cumulant generating function generates the connected Green's functions. For more information on this property of cumulants, see Wolf[31] or Weinberg[32].
We now use the homogeneity and additive properties of cumulants to see how the sums of
many independent identically distributed (iid) random variables behave. If the random variables
are not scaled when more of them are added, the cumulants will, of course, diverge. To ensure
that the cumulants don’t diverge, we will have to scale them according to the non-zero cumulant
with the smallest index. In other words, if this index is α, we have to consider the sum
\lim_{N\to\infty}\frac{1}{N^{1/\alpha}}\sum_{i=1}^{N}X_i    (2.85)

where Xi are the iid random variables. Due to the scaling property, we immediately see that
all the cumulants of index greater than α will become zero when the limit is taken. Hence,
only the cumulant of index α remains and the distribution of the sum has only one non-zero
cumulant. When applied to integer cumulants, we see that if a distribution has a non-zero well
defined mean µ, the sum of the random variables
S = \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}X_i    (2.86)

has the distribution δ(s − µ) since only the first cumulant survives. If the mean is zero but the
variance σ 2 is non-zero and finite, then the variance of the sum
S = \lim_{N\to\infty}\frac{1}{\sqrt{N}}\sum_{i=1}^{N}X_i    (2.87)

is σ 2 . Hence, the cumulant generating function of S is −p2 σ 2 /2 and taking the Fourier trans-
form of its exponential, we see that its probability distribution is N(0, σ 2 ). This gives us the
original form of the central limit theorem which states that if we take the limit of an infinite
sum (2.87) of iid random variables with finite variance, its probability distribution is normal.

Note that if the mean of the random variables is not zero, then one should subtract out this
mean before we can make this statement as the mean of the sum will otherwise diverge. There
is no need to consider higher indices as it is not possible to have distributions having zero vari-
ance and non-zero higher cumulants. However, we do need to consider non-integer indexed
cumulants for index between zero and two which we do below.
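A quick numerical illustration of the statement above: summing iid uniform variables (zero mean, variance 1/3) with the 1/√N scaling of (2.87) produces a sample whose variance matches σ² and whose kurtosis approaches the Gaussian value. The choice of the uniform distribution and of N in the Python sketch below is purely illustrative.

import numpy as np

rng = np.random.default_rng(3)
N, n_samples = 500, 20_000
sigma2 = 1.0 / 3.0                      # variance of Uniform(-1, 1)

# S = (1/sqrt(N)) * sum of N iid Uniform(-1, 1) variables
X = rng.uniform(-1.0, 1.0, size=(n_samples, N))
S = X.sum(axis=1) / np.sqrt(N)

print("sample variance :", S.var(), "(expect", sigma2, ")")
print("sample kurtosis :", ((S - S.mean())**4).mean() / S.var()**2,
      "(expect 3 for a Gaussian)")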
The original central limit theorem is limited in that it only considers distributions with well defined variance, and many distributions do not have a well defined variance. For example, the
random variable with cumulant generating function −c|p|α , 0 < α < 2, does not have a well
defined variance but it is not difficult to see that the limit of the sum (2.85) of iid random vari-
ables with this distribution has the same distribution as well (to see this, use the fact that the
cumulant generating function of the convolution of two probability density functions is just the
sum of their cumulant generating functions due to the properties of Fourier transforms). It is
also not difficult to see that one can add higher order terms to the cumulant generating function
without affecting the limiting distribution due to the scaling property. In a loose way, it can
be said that this distribution has a non-zero “cumulant” of lowest index α making its cumulant
generating function converge to −c|p|α . In other words, the distributions with cumulant gen-
erating function −c|p|α are attracting points in the space of probability distributions with the
attraction defined by (2.85). This means that if we have a discrete random walk with probabil-
ity distribution that converges to the Lévy distribution with index α, the scaling behaviour of
the walk will be given by ⟨x²⟩ ∼ t^{2/α}. Such random walks were considered by Weierstrass well before the Lévy distributions were known [7]. The full set of attractors and the basin of attraction for each attractor were found by Lévy and are now called Lévy distributions. The
cumulant generating functions of the attractors are given by
 
\ln L_{\alpha,\beta}(p) = -c|p|^{\alpha}\left[1 + i\beta\,\frac{p}{|p|}\,\omega(p,\alpha)\right]    (2.88)

where c ≥ 0, 0 < α ≤ 2, −1 ≤ β ≤ 1 and ω(p, α) is given by



\omega(p,\alpha) = \begin{cases} \tan(\pi\alpha/2) & \alpha \neq 1, 2 \\ (2/\pi)\ln|p| & \alpha = 1 \\ 0 & \alpha = 2 \end{cases}    (2.89)

The basin of attraction for each of these probability distributions is the set of probability dis-
tributions having the same asymptotic behaviour (i.e., the behaviour of p(x) as x → ±∞) as
the attracting distribution except for the Gaussian distribution where a finite variance is suf-
ficient. Hence, the Gaussian distribution has the “largest” basin of attraction, which explains its great importance.
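The Lévy distribution with α = 1 and β = 0 is the Cauchy distribution, which can be sampled without any special libraries, so the stability property is easy to check numerically: the average (1/N)ΣX_i of N iid Cauchy variables has exactly the same distribution as a single one, in line with the 1/N^{1/α} scaling of (2.85). The sketch below compares quartiles; the sample sizes are arbitrary.

import numpy as np

rng = np.random.default_rng(4)
N, n_samples = 50, 100_000

# Averages of N iid standard Cauchy variables (the alpha = 1 stable law)
X = rng.standard_cauchy((n_samples, N))
S = X.mean(axis=1)

# For alpha = 1 the 1/N scaling leaves the distribution unchanged, so the
# quartiles of S match those of a single standard Cauchy variable (-1 and +1).
print("quartiles of the average :", np.percentile(S, [25, 75]))
print("quartiles of one Cauchy  :", np.percentile(rng.standard_cauchy(n_samples), [25, 75]))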

2.13 Itô Stochastic Calculus

Itô’s stochastic calculus defines integrals and differential equations involving the Wiener pro-
cess that can be thought of as representing the continuous version of the random walk. We will
now describe the multi-dimensional Wiener process.

2.13.1 The Wiener process

The Wiener process was first discussed by Bachelier [9] in his thesis on option pricing and was
put into a rigorous form by Wiener [33].
The stochastic process Xt is said to be the Wiener process in m dimensions if the finite-
dimensional distributions

P (Xt1 ∈ A1 , Xt2 ∈ A2 , . . . , Xtn ∈ An )

are of the form


\int_{A_n}d^m x_n\ldots\int_{A_2}d^m x_2\int_{A_1}d^m x_1\,K(x_n-x_{n-1},t_n-t_{n-1})\ldots K(x_2-x_1,t_2-t_1)\,K(x_1,t_1)    (2.90)
with K(x,t) given by

K(x,t) = \frac{1}{\sqrt{2\pi t}}\exp\left(-\frac{x^2}{2t}\right)    (2.91)
and if the initial distribution is
P(X_0 \in A_0) = \begin{cases} 1 & \text{if } 0 \in A_0 \\ 0 & \text{otherwise} \end{cases}    (2.92)

In other words, the initial distribution is δ(x).


We can rewrite (2.90) as
P(X_{t_1}\in dx_1,\ldots,X_{t_n}\in dx_n) = \prod_{k=1}^{n}d^m x_k\,K(x_k-x_{k-1},t_k-t_{k-1})    (2.93)

(x0 = 0, t0 = 0) with density K(x, t) given by (2.91).


The fact that the right hand side of (2.93) is a product tells us that the Wiener process is a
Markov process. We denote the one-dimensional Wiener process by W (t).

2.13.2 Stochastic Integrals

The stochastic integral is defined in a way that is very similar to the Riemann integral but there
are some subtle differences that one must be aware of. The stochastic integral of a function
G(t) over the Wiener process is defined by
\int_{t_0}^{t_n}G(t')\,dW(t') = \lim_{n\to\infty}\sum_{i=1}^{n}G(\tau_i)\big(W(t_i)-W(t_{i-1})\big), \qquad \tau_i \in [t_{i-1},t_i)    (2.94)

The r.h.s is, of course, a random quantity and hence the limit must be taken in the mean square
sense rather than point-wise as is the case for the Riemann integral. That is, for this subsection,
we are using the following definition for the limit

\lim_{n\to\infty}X_n = X \iff \lim_{n\to\infty}\langle(X_n - X)^2\rangle = 0    (2.95)

Another difference from the Riemann integral is that the choice of the particular τ_i affects the value of the integral. Two choices are commonly used: τ_i = t_{i-1}, which results in the Itô stochastic integral, and τ_i = ½(t_{i-1} + t_i), which results in the Stratonovich stochastic integral. Physicists usually use the Stratonovich integral, but the Itô integral is much more useful in finance. The Itô integral is easily seen to be a martingale, a particularly useful property.
The following results are useful
\int_{t_0}^{t}G(t')\,[dW(t')]^2 = \int_{t_0}^{t}dt'\,G(t')    (2.96)

\int_{t_0}^{t}G(t')\,[dW(t')]^n = 0, \qquad n \geq 3    (2.97)

The results above are related to the fact that the Gaussian density has only two non-zero cumu-
lants. Putting G(t′ ) = 1 in (2.96), we see that (dW )2 = dt. Hence, a function of the Wiener
process has the following differential

df[W(t),t] = \left(\frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial W^2}\right)dt + \frac{\partial f}{\partial W}\,dW(t)    (2.98)
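The two facts used above, that Σ(ΔW)² converges to t and that the choice of evaluation point matters, are easy to see numerically. The Python sketch below discretizes one Wiener path and computes ∫W dW with the Itô and Stratonovich choices of τ_i; the Itô sum approaches (W(t)² − t)/2 while the Stratonovich sum approaches W(t)²/2. The path length and step count are arbitrary.

import numpy as np

rng = np.random.default_rng(5)
t, n = 1.0, 200_000
dW = np.sqrt(t / n) * rng.standard_normal(n)
W = np.concatenate(([0.0], np.cumsum(dW)))          # W(0) = 0

print("sum (dW)^2            :", np.sum(dW**2), "(expect t =", t, ")")

ito   = np.sum(W[:-1] * dW)                          # evaluate W at t_{i-1}
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)          # average of the endpoint values

print("Ito integral of W dW  :", ito,   "(expect (W(t)^2 - t)/2 =", (W[-1]**2 - t) / 2, ")")
print("Stratonovich integral :", strat, "(expect  W(t)^2/2      =",  W[-1]**2 / 2, ")")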

2.13.3 Stochastic Differential Equations

A stochastic process x(t) is said to obey an Itô stochastic differential equation

dx(t) = a(x(t), t)dt + b(x(t), t)dW (t) (2.99)

when for all t0 , t we have


x(t) = x(t_0) + \int_{t_0}^{t}a(x(t'),t')\,dt' + \int_{t_0}^{t}b(x(t'),t')\,dW(t')    (2.100)

A unique solution exists if the following conditions are satisfied

• ∃k > 0 : |a(x,t') − a(y,t')| + |b(x,t') − b(y,t')| ≤ k|x − y| for all x, y and t' ∈ (t_0, t)

• ∃k > 0 : |a(x,t')|² + |b(x,t')|² ≤ k(1 + |x|)² for all t' ∈ (t_0, t)

It may be recalled that these conditions are very similar to those required for the existence of
solutions to ordinary differential equations.
If the process x(t) satisfies the stochastic differential equation (2.99), we can use the fact
that (dW )2 = dt to show that the function f [x(t), t] has the differential

df = \left(\frac{\partial f}{\partial t} + a\frac{\partial f}{\partial x} + \frac{1}{2}b^2\frac{\partial^2 f}{\partial x^2}\right)dt + b\frac{\partial f}{\partial x}\,dW(t)    (2.101)
where the function arguments have been suppressed for brevity. This formula is called Itô’s
formula and is of fundamental importance in applications to option pricing. This formula again
reflects the fact that the Gaussian distribution has only two non-zero cumulants. For more
general processes, many more terms would be generated in the differential.
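A general SDE of the form (2.99) can be integrated numerically with the Euler-Maruyama scheme, which is simply the discretized form of (2.100). The following minimal Python sketch applies it to the Ornstein-Uhlenbeck choice a(x,t) = −x, b(x,t) = 1 (an illustrative assumption) and compares the long-time variance with the stationary value 1/2.

import numpy as np

def euler_maruyama(a, b, x0, t, n_steps, n_paths, seed=6):
    """Integrate dx = a(x, t) dt + b(x, t) dW with the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    x = np.full(n_paths, float(x0))
    for i in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal(n_paths)
        x += a(x, i * dt) * dt + b(x, i * dt) * dW
    return x

# Ornstein-Uhlenbeck process dx = -x dt + dW (illustrative choice of coefficients)
x_T = euler_maruyama(lambda x, t: -x, lambda x, t: np.ones_like(x),
                     x0=1.0, t=10.0, n_steps=2_000, n_paths=20_000)
print("variance at t = 10 :", x_T.var(), "(stationary value 1/2)")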

We now look at an application of stochastic differential equations in physics. Quantum


mechanics can be derived from classical mechanics under the hypothesis of universal Brownian
motion as postulated by Nelson in Nelson[34] (also see Nelson[35]). This fact can be used
to derive the path integral approach to quantum mechanics from the Chapman-Kolmogorov
equation of the stochastic process corresponding to the specific system. The stochastic process
of a quantum system has the general form
dx(t) = b_+(x,t)\,dt + \sqrt{\frac{\hbar}{M}}\,dW(t)    (2.102)
with the restriction that the process be conservative. Note that we get back to classical mechan-
ics as ~ → 0. We will consider three examples to illustrate this better.
For the first example, let us consider the simplest possible case, a free particle of mass M. Let us also assume that the initial probability density function |ψ(x,t)|² is a normal distribution with zero mean and variance σ². The stochastic differential equation for this system is the simple equation

dx = \sqrt{\frac{\hbar}{M}}\,dW(t)    (2.103)
whose solution is obviously

x = x_0 + \sqrt{\frac{\hbar}{M}}\,W(t)    (2.104)
The distribution of W (t) is given by K(x, t) ∼ N(0, t) (2.91) by definition since W (0) = 0.
Since the initial probability distribution was Gaussian, it can be easily seen (using the con-
volution property of Gaussian functions or the fact that the Gaussian distribution is a fixed
point in probability space) that the final probability distribution is also Gaussian with variance
σ 2 + ~Var(W (t))/M = σ 2 + ~t/M. Notice that in this particular case, the solution is much
simpler than the usual method of solving the Green’s function of the Schrödinger equation and
convolving that with the initial wave function.
The second example we use is the quantum mechanical simple harmonic oscillator. Its stochastic differential equation is given by

dx(t) = \left(\frac{p_{cl}(t)}{M} - \omega\big(x - x_{cl}(t)\big)\right)dt + \sigma\,dW(t)    (2.105)
or

dx_0 = -\omega x_0\,dt + \sqrt{\frac{\hbar}{M}}\,dW(t)    (2.106)
where x0 = x − xcl from which the complete family of solutions can be generated quite easily.
The third example we consider is the free Klein-Gordon field. If we write this field as
\phi(x,t) = \int d^3k\, e^{i k\cdot x}\, q_n(t)    (2.107)

the qn follow the stochastic process

dqn = −ωn qn dt + dWn (t) (2.108)



in the ground state, where \omega_n = \sqrt{k_n^2 + M^2}. This is, of course, the same as (2.106) since these coefficients behave like simple harmonic oscillators. Since (2.108) is just the equation for the stationary Ornstein-Uhlenbeck process, we immediately see that

E[q_n(t)q_{n'}(t')] = \delta_{nn'}\,\frac{1}{2\omega_n}\,e^{-\omega_n|t-t'|}    (2.109)
and hence the covariance functional of the stochastic field φ is given by

E[\phi(x,t)\phi^*(x',t')] = \frac{1}{(2\pi)^3}\int \frac{d^3k}{2\sqrt{k^2+M^2}}\, e^{i k\cdot(x-x')}\, e^{-\sqrt{k^2+M^2}\,|t-t'|}    (2.110)

To get more information on this formulation of quantum mechanics and field theory, see Namiki
[8].
We can also handle stochastic differential equations for stochastic fields. If we have a one
dimensional stochastic field f (x), we can write a stochastic differential equation
df(t,x) = a_1(t,x,f(t,x))\,dt + \int dx'\,\sigma(t,x,x',f(t,x),f(t,x'))\,dW(t,x')    (2.111)

where dW(t, x') represents a separate independent Wiener process for each x', so that we have ⟨dW(t,x)dW(t,x')⟩ = dt δ(x − x'). The scaled covariance between df(t,x) and df(t,x') is then given by
a_2(t,x,x',f(t,x),f(t,x')) = \int dx''\,\sigma(t,x,x'',f(t,x),f(t,x''))\,\sigma(t,x',x'',f(t,x'),f(t,x''))    (2.112)
This is a very complicated expression in general but simplifies in some cases, for instance if a_2 can be represented as σ_1(t,x,f(t,x)) D(x,x') σ_1(t,x',f(t,x')), where D is the Green's function of a local differential operator D^{-1}. This is the special case we will be analysing in some detail later when we consider interest rate models.
In finance, the stock price is usually modelled by the stochastic differential equation

dS(t) = µS(t)dt + σS(t)dW (t) (2.113)

which represents geometric Brownian motion. This contrasts with the process that Bachelier
initially used
dS(t) = µdt + σdW (t) (2.114)
which represents arithmetic Brownian motion. The former is preferable to the latter for theo-
retical purposes as the latter allows the possibility of negative stock prices which we know are
impossible. Geometric Brownian motion does have the flaw that it does not allow the stock
price to go to zero which we know is very much possible. However, this is a much more minor
flaw. In practice, (2.113) is usually transformed to

dx(t) = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW(t)    (2.115)
where x = ln S using Itô’s formula as this is much easier to analyse.
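The log transformation (2.115) also gives the standard way of simulating geometric Brownian motion exactly at discrete dates, since x(t) is just an arithmetic Brownian motion. The Python sketch below simulates both (2.113) (through its logarithm) and Bachelier's arithmetic model (2.114) (with the coefficients scaled by S(0) so that the two are comparable); the parameters are illustrative. The geometric paths stay positive while a sizable fraction of the arithmetic paths go negative.

import numpy as np

rng = np.random.default_rng(7)
mu, sigma, S0, T, n_steps, n_paths = 0.05, 0.4, 100.0, 5.0, 250, 10_000
dt = T / n_steps
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))

# Geometric Brownian motion via x = ln S: the per-step update of (2.115) is exact
x = np.log(S0) + np.cumsum((mu - 0.5 * sigma**2) * dt + sigma * dW, axis=1)
S_geom = np.exp(x)

# Arithmetic Brownian motion (Bachelier), driven by the same noise
S_arith = S0 + np.cumsum(mu * S0 * dt + sigma * S0 * dW, axis=1)

print("geometric paths that ever go negative :", (S_geom.min(axis=1) < 0).mean())
print("arithmetic paths that ever go negative:", (S_arith.min(axis=1) < 0).mean())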

2.14 Girsanov’s Theorem

The following theorem is due to Girsanov [36]. It shows how to change the measure of the standard Wiener process so as to introduce an arbitrary drift to it and gives us a way to
characterize how the Wiener process is transformed under this new measure. This turns out
to be very useful due to the fundamental theorem of asset pricing (which will be covered in
the next chapter) that states that the price of any attainable contingent claim is the expectation
of its value under a measure that is equivalent to the actual measure but under which all the
discounted asset prices are martingales (this measure is usually called the equivalent martingale
measure). The Girsanov theorem gives us a way of finding this measure for many cases.
The actual statement of the theorem is as follows. Let γ(t) : 0 ≤ t ≤ T be a measurable
process satisfying4

E\left[\exp\left(\frac{1}{2}\int_0^T\gamma^2(t)\,dt\right)\right] < \infty    (2.116)
If we then define two processes L(t) : 0 ≤ t ≤ T and W µ (t) : 0 ≤ t ≤ T by
L(t) = \exp\left(\int_0^t\gamma(t')\,dW(t') - \frac{1}{2}\int_0^t\gamma^2(t')\,dt'\right)    (2.117)

W^{\mu}(t) = W(t) - \int_0^t\gamma(t')\,dt'    (2.118)

where W(t) is a Wiener process with respect to the measure ν. Then, L(t) is a martingale and W^µ(t) is a Wiener process under the equivalent probability measure µ with Radon-Nikodým derivative dµ/dν = L(t). To see what this means more physically, we consider the action functional for the white noise process (see Parisi and Sourlas[37]),
S = -\frac{1}{2}\int\eta^2(t)\,dt    (2.119)
where η(t)dt = dW (t). If we now introduce a drift γ(t) to this simple white noise, we get the
action
S = -\frac{1}{2}\int\big(\eta(t)-\gamma(t)\big)^2 dt    (2.120)
and the ratio between these two gives the required Radon-Nikodým derivative. It is also easy to see intuitively that if a drift γ(t) has been added to the white noise process, then the new process \int_0^t(\eta(t')-\gamma(t'))\,dt' = W(t) - \int_0^t\gamma(t')\,dt' is a Wiener process in this new measure. In effect, that is all that the theorem is stating. In economics, the function γ(t) has the interpretation of being the market price of risk in the transformation between the market and risk-neutral measures. It is the excess return per unit of volatility that the risky asset must earn to satisfy the holders of the instrument.5 It is also called the Sharpe ratio if there is only one source of risk.

4 This condition is called Novikov's condition.


5I thank Prof. Warachka for pointing this out to me.
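Girsanov's theorem can also be illustrated numerically by importance sampling: simulate W under ν with no drift, weight each path by L(T) from (2.117) for a constant γ, and the weighted statistics of W(T) become those of a Brownian motion with drift γ. The constant γ and the horizon in the Python sketch below are illustrative choices.

import numpy as np

rng = np.random.default_rng(8)
gamma, T, n_paths = 0.7, 1.0, 500_000           # illustrative drift and horizon

W_T = np.sqrt(T) * rng.standard_normal(n_paths)  # W(T) under the measure nu

# Radon-Nikodym derivative (2.117) for a constant gamma
L = np.exp(gamma * W_T - 0.5 * gamma**2 * T)

mean_mu = (L * W_T).mean()
var_mu = (L * W_T**2).mean() - mean_mu**2
print("E_nu[L]      =", L.mean(), "(expect 1: L is a martingale)")
print("E_mu[W(T)]   =", mean_mu, "(expect gamma*T =", gamma * T, ")")
print("Var_mu[W(T)] =", var_mu, "(expect T =", T, ")")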
CHAPTER 3

THE FUNDAMENTAL THEOREM OF ASSET PRICING

The fundamental theorem of asset pricing is a remarkable theorem developed by Harrison and
Kreps [38] based on arguments presented in Cox and Ross [39] and was elaborated in Harrison
and Pliska [40]. The theorem has two parts to it. The first is that the absence of arbitrage in
the market implies the existence of a measure under which all the discounted asset prices are
martingales1. The second part of the theorem basically states that in a complete frictionless
market with no transaction costs or arbitrage opportunities, the price of all contingent claims
(satisfying some basic regularity conditions) are the expectations of the future payoff of the
claim under a unique measure in which all discounted asset prices are martingales. The theo-
rem is proved by showing that there exists a self-financing trading strategy that replicates the
contingent claim provided that such an unique equivalent martingale measure exists. If the
market is not complete, there will be many equivalent martingale measures and all attainable
contingent claims, attainable claims being those which can be replicated by a self-financing
trading strategy, can be priced as expectations of future payoffs under any of these measures
which will be consistent in the absence of arbitrage opportunities. These are standard results
in theoretical finance but are probably not known to most physicists, so we will quickly review them.
In this chapter, we will first define the terms we have introduced above in an economy with a
finite number of states and provide a proof of the theorem. We will then show only heuristically
how this carries over to continuous processes as the details are very technical and do not add
much insight. However, this heuristic description will explain why the continuous version is
necessary. This is an important point since it might be reasonably thought that a large finite
sample space would approximate any model as well as we would like. We will then see how the situation changes when the markets are incomplete. This is currently a highly active
research area in financial mathematics. Finally, we will see how this theorem affects how we
analyse models using techniques from physics.
In this chapter, we will assume that all assets have no dividends or payoffs. This is not a
major restriction as we can always assume that the dividends or payoffs are reinvested in the
same asset. The assets used in the analysis below will be such compounded assets.
The proof of the theorem for both the finite and continuous cases is adapted from Harrison
and Pliska [40] and Duffie [42].
1 The discounting can be with respect to any traded asset as pointed out in Geman et al[41]


3.1 Finite Sample Spaces

We assume that the economy consists of a finite probability space (Ω, F, P ) and that there are
K + 1 traded assets whose prices follow stochastic processes Stk that are strictly positive and
Ft adapted. Note that the finite sample space also implies a finite time horizon. The time is also
discrete and consists of T steps, i.e. t = 0, . . . , T . The zeroth security S 0 will be the numeraire
with respect to which the other securities will be valued. We assume S00 = 1 with no loss of
generality since we have not set any scale for the currency. We define βt = 1/St0 and call it the
discount process. A contingent claim is defined to be a nonnegative random variable X on this
probability space. We denote the set of all such contingent claims by X.

3.1.1 Self-financing trading strategies, no arbitrage and price systems

A trading strategy φt is defined to be a predictable K + 1 dimensional vector process (i.e.


φt ∈ Ft−1 so that φt is known at time t − 1). Since this means that φ0 is undefined, we set
φ0 = φ1 . The quantity φkt is the quantity of security S k held by the investor between times
t − 1 and t. In other words, it represents the portfolio of the investor. For convenience in the
following exposition, we represent φt as a row vector and the price process St as a column
vector and the inner product as φt St . The inner product Vt = φt St obviously represents the
value of the portfolio φt at time t.
A self-financing trading strategy is one for which there is no outside investment after the
initial one. In this case, it is easy to see that φt St = φt+1 St since the portfolio value before and
after rebalancing at instant t must be the same. For the purposes of proving this theorem, we
only admit strategies for which the value process V (φ) ≥ 0. This is not a particularly restrictive
claim as one can always add a constant value to the final contingent claim and a bond paying off
this constant amount at the time horizon of the contingent claim to the trading strategy. In other
words, the restriction only ensures that the value process of the trading strategy are bounded
from below. This is necessary due to the existence of pathological doubling strategies, at least in
the continuous case. We call the set of all admissible self-financing trading strategies Φ.
A contingent claim X is said to be attainable if there exists an admissible trading strategy
such that VT (φ) = X. The trading strategy φ is then said to generate X and π = V0 (φ) is called
the price of the contingent claim at time zero.
We are now in a position to apply the principle of no arbitrage to this model. We define an
arbitrage opportunity to be a trading strategy φ such that V0 (φ) = 0, VT (φ) ≥ 0 and E[VT (φ)] >
0. This means that there is a trading strategy that starts with nothing, can never end with a negative value (VT (φ) < 0 is not possible as φ is admissible), and nevertheless has a strictly positive expected final value.
We will now show that the absence of arbitrage opportunities in this economy implies the
existence of a unique price for each contingent claim π(X) such that

π(X) = 0 ⇐⇒ X = 0 (3.1)

and that π is linear on the contingent claim space, that is

π(aX + bX ′ ) = aπ(X) + bπ(X ′) ∀a, b ≥ 0 and X, X ′ ∈ X (3.2)



The prices must, of course, satisfy π(VT (φ)) = V0 (φ) for all φ ∈ Φ. If the price system π
satisfies this criterion, then it is said to be consistent with the market model. We denote the set
of all price systems consistent with the market model by Π.

3.1.2 Equivalence of martingale measures and price systems

We define a measure Q to be a martingale measure if it is equivalent to P and if the discounted


price process βS is a martingale under Q. We denote the set of all such Q by Q.
We will now establish the one to one correspondence between π ∈ Π and Q ∈ Q. The one
to one correspondence is that
π(X) = EQ (βT X) (3.3)
and
Q(A) = π(ST0 1A ), A ∈ F (3.4)
where 1A stands for the indicator random variable that is 1 when the event A occurs and zero
otherwise.
To show the first correspondence, we assume we have Q ∈ Q and define π using (3.3).
Clearly, (3.1) and (3.2) are satisfied, so we still have to show that π(VT (φ)) = V0 (φ) to prove
that π ∈ Π. We note that, for any φ ∈ Φ, we have
\beta_T V_T(\phi) = \sum_{i=2}^{T}\phi_i(\beta_i S_i - \beta_{i-1}S_{i-1}) + \phi_1\beta_1 S_1    (3.5)
due to the self-financing nature of the strategy. Therefore, by the definition of π, we have
π(VT (φ)) = EQ [βT VT (φ)] = EQ [φ1 β1 S1 ] = φ1 EQ [β1 S1 ] = φ1 β0 S0 = V0 (φ) (3.6)
with the second equality following from the fact that βS is a martingale under Q and that φ is
predictable. Hence, we have shown that π ∈ Π and the correspondence (3.3) is proved.
To show the converse, we let π ∈ Π and define Q by (3.4). To show the equivalence of Q to
P , we note that Q(ω) = π(ST0 1ω ) > 0 if P (ω) > 0 due to (3.1). Further, if we have a strategy
φ of holding one unit of S 0 throughout (this strategy is obviously self-financing and admissible
thus belonging in Φ), we see that
V0 (φ) = 1 = π(VT (φ)) = π(ST0 1Ω ) =⇒ Q(Ω) = 1 (3.7)
showing that Q is a probability measure. By decomposing any contingent claim to a linear com-
bination of indicator random variables, we see from the definition of Q that π(X) = EQ (βT X).
We still have to show that βS is a martingale under Q. To do so, we consider another simple
strategy φ which starts with holding one unit of security S k , k ≥ 1, holding it until some time
τ , using the proceeds of the sale at τ to buy the numeraire security S 0 and holding this until
maturity. More compactly, we have
\phi_t^k = 1_{t\leq\tau}, \qquad \phi_t^0 = \frac{S_\tau^k}{S_\tau^0}\,1_{t>\tau}    (3.8)
For this strategy, V0 (φ) = S0k and VT (φ) = ST0 βτ Sτk and consistency of π gives us
S_0^k = \pi(S_T^0\,\beta_\tau S_\tau^k) = E_Q[\beta_T S_T^0\,\beta_\tau S_\tau^k] = E_Q[\beta_\tau S_\tau^k]    (3.9)
Hence, we have shown that βS is a martingale under Q since k and τ are arbitrary. Therefore,
Q ∈ Q and the correspondence (3.4) is proved.

3.1.3 No arbitrage and the non-emptiness of Π

None of the preceding shows that a consistent viable price system or an equivalent martingale
measure exists for the market model. Unless we show this, the above equivalence will only
be a mathematical curiosity. We therefore now proceed to show that the absence of arbitrage
implies that Π and therefore Q are non-empty.
To do so, we will consider two different subsets of the set of random variables X on this
probability space. These subsets are

X+ = {X ∈ X : E[X] ≥ 1} (3.10)

and
X0 = {X ∈ X : X = VT (φ) for some φ ∈ Ξ with V0 (φ) = 0} (3.11)
where Ξ is the set of self-financing (not necessarily admissible) strategies. It should be obvious that, in the absence of arbitrage, X+ and X0 are disjoint. Since X+ is a closed and convex subset of RΩ and X0 is a linear subspace, the separating hyperplane theorem ensures that there is a linear functional L on RΩ such that L(X0 ) = 0 and L(X+ ) > 0. We now show that the price system π(X) = L(X)/L(ST0 ) is an element of Π. It is obvious that π satisfies (3.1) and (3.2). Hence, it only remains to show that V0 (φ) = π(VT (φ)) for all φ ∈ Φ. We define
\psi_t^k = \begin{cases} \phi_t^0 - V_0(\phi) & \text{if } k = 0 \\ \phi_t^k & \text{if } k = 1,\ldots,K \end{cases}    (3.12)

Then ψ is a self-financing strategy with V0 (ψ) = 0, so that VT (ψ) ∈ X0 . We note that since π(X) ∝ L(X), π(X0 ) = 0. Hence,

0 = \pi(V_T(\psi)) = \pi\big(V_T(\phi) - V_0(\phi)S_T^0\big) = \pi(V_T(\phi)) - V_0(\phi)    (3.13)

Hence, we see that π satisfies the consistency condition and therefore is an element of Π.
Therefore, Π and hence Q are nonempty if the no arbitrage condition is satisfied.
To show the converse is considerably simpler. If Π is nonempty, we choose a π ∈ Π and
a trading strategy φ ∈ Φ such that V0 (φ) = 0. Since π ∈ Π, π(VT (φ)) = 0 which implies that
VT (φ) = 0 due to (3.1).
We have therefore proved the statement that the market model contains no arbitrage oppor-
tunities if and only if Π (and hence Q) is nonempty.

3.1.4 Completeness and the uniqueness of martingale measures

There are two ways to characterize completeness in a market model with a finite sample space.
The first way is particularly simple and elegant but does not carry over to the continuous case
while the second one is much less intuitive but does carry over to the continuous case.
We will now consider the first way to show market completeness. We denote the partition
underlying the σ-algebra Ft by Pt . Without loss of generality, we will assume that the set of securities is nonredundant, i.e. there is no nontrivial vector α for which αSt+1 = 0 given A, for any A ∈ Pt . Let Kt (A) be the number of cells of Pt+1 contained in A ∈ Pt . The number of

dimensions spanned by the space of random variables increases by Kt (A) once A has been reached. The securities add at most K + 1 dimensions to the span in going from time t to t + 1. However, for completeness, both spaces must have the same dimension. Hence, the model is only complete if Kt (A) = K + 1 for all A ∈ Pt and for all t. This is very
intuitive but unfortunately does not carry over to the continuous case. To see this, consider a
simple case where all securities are uncorrelated to each other and each security, except the
numeraire security S 0 , can go up or down by a certain factor at each time step. It is easily seen
that Kt (A) is given by 2K . Hence, if K > 1, this simple model is incomplete. This, as we
shall see later, is not true in the continuous case. The K = 1 case is covered in great detail in a
classic paper by Cox, Ross and Rubinstein [43].
The second way to characterize completeness is by showing that the market is complete
if and only if there is exactly one equivalent martingale measure. This characterization of
completeness does in fact carry over to the continuous case. If we assume that the market
is complete, then we know that 1A is a contingent claim that can be attained for any A ∈ F.
Hence, it must be true that, for any Q, Q′ ∈ Q, EQ [1A ] = EQ′ [1A ] which means that Q(A) =
Q′ (A) for all A ∈ F or that Q = Q′ . Therefore, we have shown that if the market is complete,
the equivalent martingale measure is unique. To show the converse, we note that the set of
attainable contingent claims is a linear subspace of the space of all contingent claims. Hence,
if the two sets are not equal, there exists a vector which is orthogonal to the space of attainable
contingent claims which can be added to a price system π ∈ Π to generate another price system
which is consistent across all the attainable contingent claims. Hence, incompleteness implies
the nonuniqueness of the price systems and hence, the equivalent martingale measures. Since
we are assuming no arbitrage is satisfied, the set of equivalent martingale measures is not empty
and the previous statement shows what we set out to prove. Therefore, we see that a market model satisfying no arbitrage is complete if and only if there is a unique equivalent martingale
measure.
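The K = 1 binomial model mentioned above is the simplest setting in which the whole construction can be seen explicitly: the martingale condition fixes the measure Q, and any contingent claim is priced either as a discounted expectation under Q or by replication. The Python sketch below works through one period; the numbers (u, d, r and the strike) are illustrative assumptions.

# One-period binomial market: bond grows by (1 + r); stock S goes to u*S or d*S.
S, u, d, r, K_strike = 100.0, 1.2, 0.8, 0.05, 100.0

# Martingale (risk-neutral) probability q from E_Q[S_T / (1 + r)] = S
q = (1 + r - d) / (u - d)

# Price of a call as the discounted expectation under Q
payoff_up, payoff_dn = max(u * S - K_strike, 0.0), max(d * S - K_strike, 0.0)
price_Q = (q * payoff_up + (1 - q) * payoff_dn) / (1 + r)

# Replication: delta shares of stock plus b in the bond reproducing the payoff
delta = (payoff_up - payoff_dn) / ((u - d) * S)
b = (payoff_up - delta * u * S) / (1 + r)
price_repl = delta * S + b

print("risk-neutral probability q :", q)
print("price via E_Q              :", price_Q)
print("price via replication      :", price_repl)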

3.2 Continuous Sample Spaces

The proof for the continuous case is exceptionally difficult and the paper by Harrison and Kreps
[38] actually proves a weak form of the theorem. There have been many recent developments
in proving this theorem, such as the paper by Delbaen and Schachermayer [44], which have
extended the original result and made it somewhat stronger.
We will outline the main steps in the proof of the fundamental theorem in the continuous
case as presented in Harrison and Pliska [40] and comment on the similarities and differences
of the proof in this case and the finite case. The main difference with the finite case to note
is that Q is assumed to be nonempty without justification.2 A filtered probability space is set
up as in the finite case. The numeraire asset S 0 is now restricted to be continuous and have
finite variation. In other words, it is locally riskless. The predictable σ-algebra is then defined
as the σ-algebra generated by simple predictable processes. Simple predictable processes are
processes which are piecewise constant and which are predictable in the finite sense at the
points where they change. Predictable processes can then be defined in terms of this predictable
σ-algebra. Some technical restrictions are imposed on this set of predictable processes to get the
2 In Delbaen and Schachermayer [44], the non-emptiness is shown from the assumption that No Free Lunch with Vanishing Risk (NFLVR), a principle slightly more restrictive than no arbitrage, is satisfied.

set of trading strategies and the class of all admissible trading strategies Φ is defined exactly
as in the finite case. The set of attainable contingent claims and the price associated with
them is defined in almost exactly the same manner as in the finite case. The correspondence
between the martingale measures and the consistent price systems is shown in the same way
as well. The main difference with the finite case arises in the characterization of completeness
where the proof that the uniqueness of the equivalent martingale measure implies completeness
makes use of the martingale representation theorem. The martingale representation theorem
only works for Wiener processes, Poisson processes and binomial processes. Hence, it is only
for models involving these processes that we can be sure that this theorem is correct.
The martingale representation theorem does work for multidimensional Wiener processes
and hence, a market with K > 1 securities with the returns of each following independent Wiener
processes and a locally riskless security for the numeraire is indeed complete. This is a major
advantage of the continuous model over the finite model where the model with K securities
which followed geometric random walks is not complete as we saw in the last section. Hence,
we can only price contingent claims depending on more than one security in the continuous
framework. As we noted at the beginning of the chapter, we have explained why it is necessary
to use a continuous market model and why a finite system, however large, will not work.
When the market is not complete, there will be many equivalent martingale measures since
NFLVR shows that there is at least one such measure and that this measure cannot be unique
due to incompleteness. While the price of all attainable contingent claims will be the same
when priced using any of these martingale measures, this is not true for claims which are not
attainable. The question of which martingale measure is “best” in such cases is one of great
practical importance and is the subject of much current research (for some recent work on this
subject, see Fritelli [45], Goll and Rüschendorf [46] and Yuan [47]).

3.3 Models of the Market in the Language of Physics

We will now see how we can translate the above theorem into the language of physics. We can
specify a Markovian market model by defining its Hamiltonian Ĥ(x) in terms of some state
variables x where the x might be infinite dimensional3. We also use the notation introduced in
the previous chapter of denoting expectations by state vectors since the analogy is useful and
has been used in the literature [16].
The concept of equivalent Hamiltonian in this context is crucial. Two Hamiltonians are said
to be equivalent if they only differ in the term linear in p̂. Hence the Hamiltonian iµSp − ½S²p² is equivalent to the Hamiltonian −½S²p². Since the term linear in p refers only to the non-stochastic drift of the process, the equivalent Hamiltonians describe similar stochastics.
The prices of assets traded in this model will be expressible as functions of these fundamental variables, |S⟩ = ⟨x|S⟩, where |S⟩ might again be infinite dimensional. We assume that the fundamental assets are time independent. The fundamental theorem of asset pricing can then be shown to state that we must find an equivalent Hamiltonian (called the risk-neutral Hamiltonian) that must annihilate all the discounted traded assets (that is, Ĥ|Z⟩ = 0), where we have
3 Here we are using the term Hamiltonian in the sense given in the last chapter. It is very similar to the
Hamiltonian in quantum mechanics in that it describes time evolution but is otherwise quite different (for example,
it is not necessarily Hermitian).

denoted the discounted asset prices as |Z⟩, in order to price the contingent claims. The idea originated in Baaquie [48], where it was used and an alternative proof was given, in which the discounting was done with respect to a money market account and applied to the basic set of underlying assets.
We note that the vector |Z⟩ always has one trivial component, the numeraire asset expressed with respect to itself, which we will call Z1 if there are a countable number of traded assets.
To prove the above statements, we have to make use of the time evolution equation (2.39)

\frac{d|h\rangle}{dt} = \hat H|h\rangle    (3.14)
The Hamiltonian in the z basis can be written as
 
\hat H = a_{n_i}\frac{\partial}{\partial z_i} + a_{n_i,n_j}\frac{\partial^2}{\partial z_i\,\partial z_j} + \ldots    (3.15)

where the a stand for the joint scaled cumulants of the transition process as noted in the pre-
vious chapter.4 In general these can depend on zi . Since a martingale process has no drift by
definition, its evolution kernel K(t, y, x, ǫ) cannot have a mean or first cumulant a1 for any x or
t. Hence, we see that the first term above is zero. All the other terms obviously annihilate the
vector |Z⟩. Hence, we have

\hat H|Z\rangle = 0    (3.16)
We know that cumulants for many probability distributions do not exist, so the above proof is not valid for them. However, we can generalize the above proof by considering the Hamiltonian in the momentum basis conjugate to the particular traded asset. In that case, the drift is \left.\frac{\partial\hat H}{\partial p}\right|_{p=0}. This is the coefficient a_{n_i} in the above analysis and must be zero for a martingale process. It does not matter if the higher cumulants exist in this basis as the characteristic function is always well defined. However, \left.\frac{\partial\hat H}{\partial p}\right|_{p=0} itself must still be well defined and be zero for a martingale process. Hence, we cannot, for example, use the Lévy distributions with α ≤ 1.
Such an equivalent Hamiltonian will exist if the condition of no arbitrage is satisfied. For
example, if we have two traded assets in the economy which depend only on one noise variable (so their behaviour is perfectly correlated in time) but which have different drifts, we will not be able to find such an equivalent Hamiltonian.
The proof in Baaquie [48] is as follows : the future price of the security evolved back
towards the present is the current price times a discounting factor due to the money market
account. That is
|S\rangle = e^{-r(t_*-t)}\,e^{\hat H(t_*-t)}\,|S\rangle    (3.17)

giving (\hat H + r)|S\rangle = 0.
The above reasoning works only if the discounted traded assets are only functions of the
state variables x and do not depend explicitly on time. If they do depend explicitly on time, we
must then have

\frac{d|Z\rangle}{dt} = \left(\hat H + \frac{\partial}{\partial t}\right)|Z\rangle = 0    (3.18)
4 If the assets form a continuous basis as in Baaquie [48], we need to use functional differential operators rather than plain differential operators.

This is not a vacuous point as we will see that there are many asset prices which explicitly
depend on time. In fact, the most common numeraire asset, the money market account, usually
explicitly depends on time. However, if we use this numeraire, then the discounted assets do
not depend on time explicitly. Alternatively, the way to handle this is to notice that the function
whose expectation we want h(x, t) has a time dependent first argument when written as h(Z, t).
The time dependence can be removed by applying the translation operator to h so as to express
it with the time dependence removed. To make this clear, let us consider a simple example.
Consider the stochastic differential equation

dS(t) = rS(t)dt + σS(t)dW (t) (3.19)

with Hamiltonian
\hat H = \frac{1}{2}\sigma^2 S^2\frac{\partial^2}{\partial S^2} + rS\frac{\partial}{\partial S}    (3.20)
If we want to express the Hamiltonian for Z = e^{-rt}S, then we must note the fact that \frac{Z(t+dt)}{S(t+dt)} = e^{-r\,dt}\,\frac{Z(t)}{S(t)}. Without loss of generality, we can assume that t = 0. In terms of Z(0) = S, the value at time dt of any smooth function h of S can be written as h(Se^{-r\,dt}, dt) = h(S,0) + dt\,\frac{\partial h(S,0)}{\partial t} - r\,dt\,S\,\frac{\partial h(S,0)}{\partial S}, which means that

\frac{h(Z(dt),dt) - h(Z(0),0)}{dt} = \frac{\partial h(Z,0)}{\partial t} = \frac{\partial h(S,t)}{\partial t} - rS\frac{\partial h(S,t)}{\partial S}    (3.21)
Hence, the evolution of the expectation of h(Z, t) is governed by the equation
 
\frac{\partial E[h(Z,t)]}{\partial t} = \left(\hat H - rS\frac{\partial}{\partial S}\right)E[h]    (3.22)
so that the Hamiltonian for Z is given by

\hat H(Z) = \hat H(S) - rS\frac{\partial}{\partial S} = \frac{1}{2}\sigma^2 S^2\frac{\partial^2}{\partial S^2} = \frac{1}{2}\sigma^2 Z^2\frac{\partial^2}{\partial Z^2}    (3.23)
This result becomes trivial if we instead start with the stochastic differential equation, change
variables to Z there and then write the Hamiltonian for the transformed stochastic differential
equation.
In all the cases we consider, we start with a set of traded assets which do not explicitly
depend on time (except for the numeraire which can be handled as above) but the set of contin-
gent claims will mostly not have this property. In fact, we will use (3.18) to calculate the value
of these claims.
It might seem that we can always include the contingent claims in the model and get a richer
structure for the set of martingales (which, in a loose sense, can be considered as the vacuum),
but this is not the case because the attainable contingent claims can be recreated using linear
combinations of the original set of assets. If the market is not complete, it means that the traded
assets do not span all the states, or, in other words, do not form a complete basis. In that case,
adding contingent claims might aid in getting a complete basis.
For discounted attainable contingent claims, we see that we can write

|f(x,t)\rangle = \sum_i \phi_i(x,t,\omega)\,|Z_i\rangle    (3.24)

with φ representing the trading strategy, ω the history of the asset prices since the trading
strategy can depend on this and where we must have
 
\sum_i\left(\frac{\partial\phi_i(x,t,\omega)}{\partial t} + \hat H\phi_i(x,t,\omega)\right)|Z_i\rangle = 0    (3.25)

to ensure that the contingent claim is a martingale (which is equivalent to ensuring that the
strategy is self-financing). There is a slight abuse of the notation here as, in general, the option
is not part of the wavefunction space as the φ are not purely functions of x and t. For path
independent options, this is not a problem as the φ do not then depend on the history of the asset
prices. Further, if the history is taken to be known, then obviously the φ can be considered as
functions of only x and t thus ensuring that (3.18) is a valid evolution equation for the option.
An important point to note is that the Hamiltonian will depend on the choice of numeraire
asset due to the different transformations that must be applied as we go from the state variables
to the discounted assets. This point was made in Amin and Jarrow [49] and Geman et al
[41] where the statement was made in terms of equivalent martingale measures rather than
Hamiltonians. The Hamiltonians are, of course, then closely related to each other.
In simple cases, we can start with a Hamiltonian for the traded assets that we believe is
correct and find the equivalent Hamiltonian which annihilates the discounted traded assets and
use it to price the contingent claims. We will show how this is done in subsequent chapters.
In general, however, it is only feasible to find one or more Hamiltonians which annihilate
the state vector of discounted traded assets and which we believe are equivalent to the actual
Hamiltonian. We can then check the price of some contingent claims to see if they agree with
the Hamiltonian we are using. If they do, we can then use this Hamiltonian to price other
contingent claims.
If we think about this in terms of physics, the equivalent Hamiltonian corresponds to the
energy in the equivalent risk-neutral world and the state vector of discounted traded assets
constitutes the vacuum. If, as is usually the case, we take a money market account which has
a guaranteed positive return to exist, we see that all assets are expected to increase over time
in the same way in the equivalent world and thus have positive and equal energy. Since only
energy differences are physically relevant, we subtract this energy to get a redefined energy
which is zero in the equivalent risk-neutral world. Notice that this is equivalent to considering
only the ratios of the traded assets as only these have financial relevance. Of course, as with all
analogies, this should not be taken too far as in physics we do have non-zero energy states and
there is no equivalent world where everything is a vacuum. However, we see that the system
does have many analogies to systems considered in physics.
CHAPTER 4

APPLICATIONS TO STOCKS

We will now discuss how we can use the results of the previous chapters in analysing a very
simple market model with just one risky asset and one risk-less asset. We call the risky asset S
and the risk-less asset B. If we call the locally risk-less return on B at each instant r(t), we see
that

B(t) = B(0)\,e^{\int_0^t r(t')\,dt'}    (4.1)
Without loss of generality, we assume that B(0) = 1 for the rest of the discussion. We choose
B as the numeraire. We could have chosen S as the numeraire, but in this case, it just makes
the non-trivial element Z2 , which we denote by Z from now on, the inverse of what it would be
otherwise making the change of dynamics trivial. We still have to provide a stochastic process,
or equivalently, the Hamiltonian for the discounted risky asset to completely specify the model.
We have only one restriction on the market Hamiltonian, which is that \left.\frac{\partial\hat H}{\partial p}\right|_{p=0} must exist. The equivalent Hamiltonian that generates a martingale measure is then given by \hat H - \left.\frac{\partial\hat H}{\partial p}\right|_{p=0}.
For the rest of this discussion, we only consider Hamiltonians which are quadratic in p, or in
other words, locally Gaussian processes which can be specified by an Itô stochastic differential
equation or a quadratic Hamiltonian.

4.1 The most general risk-neutral Gaussian Hamiltonian

Let us first write down the most general quadratic equivalent risk-neutral Hamiltonian for the
discounted risky asset Z. This is
\frac{\sigma^2(Z,t)}{2}\frac{\partial^2}{\partial Z^2}    (4.2)
since the first moment is zero which means that the dynamics of Z is given by the stochastic
differential equation
dZ(t) = σ(Z(t), t)dW (t) (4.3)
The discounted asset Z(t) is given by Z(t) = S(t)\exp\left(-\int_0^t r(t')\,dt'\right). Hence, the risk-neutral stochastic differential equation followed by S is given by

dS(t) = d(Z(t)B(t)) = r(t)B(t)Z(t)\,dt + B(t)\,dZ(t) = r(t)S(t)\,dt + \sigma_S(S(t),t)\,dW(t)    (4.4)

where σS (S(t), t) = σ(S(t)/B(t), t)B(t). Since there will be no possibility of confusion, we


will denote σS by σ in the rest of this chapter. Hence, the equivalent Hamiltonian in the S basis


is given by

\frac{\sigma^2(S(t),t)}{2}\frac{\partial^2}{\partial S^2} + r(t)S(t)\frac{\partial}{\partial S}    (4.5)

We therefore see that the coefficient of ∂/∂S is fixed purely by risk-neutrality, so that the only freedom that exists is in choosing the form of σ. Note that in Baaquie[16] and Srikant[1], there was an extra r(t) term in the Hamiltonian as it was transformed back to non-discounted assets. If one goes back to non-discounted assets, one must add a term r(t) to the Hamiltonian, called the killing term, as discussed in section 2.11.

4.2 The Black-Scholes equation

In the seminal paper of Black and Scholes[5], the process assumed for the risky asset was

dS(t) = µS(t)dt + σS(t)dW (t) (4.6)

to model the price of the risky asset as is usually done. It was also assumed that the interest
rate r was fixed in time. We can immediately write down the Hamiltonian for this process

∂ σ2S 2 ∂2
Ĥ = µS + (4.7)
∂S 2 ∂S 2
Since Z = Se−rt , its dynamics are given by

dZ(t) = (µ − r)Z(t)dt + σZ(t)dW (t) (4.8)

which does not describe a martingale process unless µ = r. This is because we have specified
a Hamiltonian we believe describes the system but have not yet transformed it into one which
gives an equivalent martingale measure. From the above analysis, we see that replacing µ in
the Hamiltonian by r will give such an equivalent martingale measure. In terms of measures,
we have transformed the measure by the use of Girsanov’s theorem with γ = µ−r σ , a quantity
usually called the Sharpe ratio or the market price of risk. Hence, the equivalent Hamiltonian
that we should use to price discounted contingent claims is

∂ σ2 S 2 ∂2
ĤBS = rS + (4.9)
∂S 2 ∂S 2
which is of the form found in the previous section with σ(S(t), t) given by σS.
We can now value contingent claims in this model. Since we have the risk-neutral Hamil-
tonian, we just apply (3.18) to obtain an equation for the discounted contingent claim f˜

∂ f˜ ∂ f˜ σ 2 S 2 ∂ 2 f˜
+ rS + =0 (4.10)
∂t ∂S 2 ∂S 2
To obtain the equation for the actual contingent claim, we need to handle the discounting by
including a killing term equal to the discounting factor r in the Hamiltonian to obtain

∂f ∂f σ 2 S 2 ∂ 2 f
+ rS + − rf = 0 (4.11)
∂t ∂S 2 ∂S 2
Sec. 4.2 The Black-Scholes equation 42

which is the famous Black-Scholes equation. In many cases, it is written in terms of the variable
x = ln S as the equation looks considerably simpler in this variable. In this variable, the Black-
Scholes equation is
∂f σ2 ∂ σ2 ∂2
+ (r − ) + − rf = 0 (4.12)
∂t 2 ∂x 2 ∂x2
Note that all the coefficients are now constants.
The value of the contingent claim as a function of the sample path will provide the boundary
conditions for the equation. The value of the contingent claim is then the solution of the Black-
Scholes equation given the boundary conditions appropriate for that contingent claim. For a
European call option expiring at time T , the boundary condition is that f (T ) = max(S − K, 0).
This leads to the famous Black-Scholes option pricing formula
c = SN(d1 ) − Ke−r(T −t)N(d2 ) (4.13)
where
 2
  2

S
+ r + σ2 (T − t) S
+ r − σ2 (T − t)
 
ln K ln K √
d1 = √ , d2 = √ = d1 − σ T − t (4.14)
σ T −t σ T −t
and N(x) is the cumulative standard normal distribution.
In the case of other options such as American options, the boundary conditions are consider-
ably more complicated. For path dependent options, it might not even be possible to specify the
boundary conditions meaningfully. In such cases, the option price will have to be determined
more directly using the martingale measure.
An alternative way to proceed is to use path integrals to solve the problem. Using (2.119),
we see that the action functional for the Black-Scholes process is given by
2
σ2

1
Z
SBS = − 2 dt ẋ − r + (4.15)
2σ 2
The propagator is hence given by the path integral
Z
DxeSBS [x(t)] (4.16)

with boundary conditions x(t) = x and x(T ) = x′ . The propagator can be evaluated by per-
forming the Gaussian integrations. Another alternative is to integrate the stochastic differential
equation and write the answer in terms of the measure for W (t). Finally, we can consider the
evolution with the Hamiltonian in the momentum basis which gives
Z
′ τ ĤBS
pBS (x, τ ; x ) = hx | e | x i = dpdp′ hx | pihp | eτ ĤBS | p′ ihp′ | x′ i

(4.17)
 
1 1 ′ 2
2
= √ exp − x − x + τ (r − σ /2)
2πτ σ 2 2τ σ 2
where τ = T − t is the time between the present and the maturity of the option. In Baaquie [50],
there is an extra term e−rτ in the propagator as the killing term r was explicitly included in the
Hamiltonian. From now on, we include the killing term in the Black-Scholes Hamiltonian so
that it is given by
σ2 ∂2 σ2 ∂
 
ĤBS = + r− −r (4.18)
2 ∂x2 2 ∂x
Sec. 4.3 Single barrier options 43

(there is a further difference of a negative sign as the Hamiltonians are defined with the opposite
sign in Baaquie [50]) as this makes the notation more compact. This means we have to add r
to the Lagrangian as well so that the Black-Scholes action is now
2 !
σ2

1
Z
SBS = − 2 dt ẋ − r + +r (4.19)
2σ 2

In effect this is the Black-Scholes Hamiltonian for pricing undiscounted contingent claims.
In the next two sections, we will see how some option prices with slightly more complicated
conditions than European options can be valued much more simply using ideas from basic
quantum mechanics. The pricing formulae for these options are well known in the finance
literature and our aim is only to show that their derivation can be much simpler than the usual
solution procedures employed.

4.3 Single barrier options

This section is adapted from Baaquie[50]. A single barrier option is one which either attains
or loses value when the underlying asset price hits a pre-specified barrier. For concreteness,
we consider an option which loses all its value when the price goes below the barrier. Such an
option is called a down and out barrier.
The pricing of single barrier options has a long history. The down and out call option price
was first solved by Merton [51] in 1973. The solution to all eight types (combinations of the
choices of calls and puts, up and down, knock-in and knock-out) of single barrier options was
found by Rubinstein and Reiner [52] in 1991.
Since, in the risk-neutral measure, all discounted asset prices are martingales, we see that
the down and out barrier option price at time t0 is given by

e−r(T −t0 ) Et [(ex(T ) − K)+ 1x(t)>b,t0<t′ <T ] (4.20)

where 1 stands for the indicator function, T is the time of expiry of the option, the barrier is at
B = eb and the expectation is taken under the risk neutral measure. We can write the above as
a path integral to obtain
Z
−r(T −t0 )
e DxΘ(b − x(t))eSBS (x(t))(ex(T ) − K)+ (4.21)

where SBS is the Black-Scholes action


While the Heaviside step function looks complicated in the path integral, it can be seen to
be having the effect of an infinite potential barrier since this effectively prohibits the path from
entering the forbidden region outside the barrier. Hence, the problem might be better solved
using the Hamiltonian and this is indeed the case.
In the Schrödinger formulation, the above problem is to find the propagator for a system
with the Hamiltonian
Ĥ = ĤBS − V (x) (4.22)
Sec. 4.3 Single barrier options 44

b x

Figure 4.1: The potential for the single barrier option

where the Black-Scholes Hamiltonian is given by

σ2 ∂2 σ2 ∂ σ2 2 σ2
   
ĤBS = + r− − r = − p + ip r − −r (4.23)
2 ∂x2 2 ∂x 2 2

and the potential V (x) is given by


(
∞ x<b
V (x) = (4.24)
0 x>b

This is very similar to the well known problem of a particle in an infinite potential well except

that the Hamiltonian has an extra term involving ∂x which makes it non-Hermitian.
Approaching this problem differently from Baaquie[50], we note that it can be solved by
transforming the underlying wave functions. Defining

σ 2 /2 − r (σ 2 /2 + r)2
α= and β= (4.25)
σ2 σ4

and making the transformation hx | φi = e−α(x−a) hx | ψi and hφ | xi = eα(x−a)hψ | xi, where


|φi are the vectors in the new (Hilbert) space, |ψi and hψ̃| are the original vectors and their duals
respectively.
 2 In this new space, the Black-Scholes Hamiltonian takes the simple Hermitian form
σ2 ∂
2 ∂x2 − β .

The problem is now identical to that of a quantum mechanical particle of mass σ12 (in units
where ~ = 1) in a system with an infinite potential barrier at b. The eigenfunctions are hence
given by

hx | ψp i = eα(x−a) hx | φp i = 2ieα(x−b) sin p(x − b) (4.26)


hψ̃p | xi = e−α(x−b) hφp | xi = 2ie−α(x−b) sin p(x − b) (4.27)

and where hx | φn i are the eigenfunctions of the quantum mechanical particle in a system with
one infinite potential barrier.
Sec. 4.4 Double barrier options 45

The propagator is then given by

pDO (xτ ; x′ ) = hx | eτ Ĥ | x′ i
τ βσ2 ′
Z ∞ dp 1 2 2
= e− 2 +α(x−x ) e− 2 τ σ p ×
0 2π
 ip(x−x′ ) ′ ′ ′
+ e−ip(x−x ) − eip(x+x −2B) − e−ip(x+x −2B)

e
1 τ βσ2 ′ 1
= pBS (x, τ ; x′ ) − √ e− 2 −α(x−x ) exp − ′ 2
 
(x + x − 2B)
2πτ σ 2 2τ σ 2
 ex 2α
= pBS (x, τ ; x′ ) − B pBS (2B − x, τ ; x′ ) x, x′ > b (4.28)
e
To get the option price, we just fold its final value (for eg., max(S − K, 0) for a call option) into
the propagator.
We note that in the entire derivation above, we only assumed that the option becomes worth-
less when the underlying hits the barrier at B but didn’t specify from which side the barrier
should be hit. If we redefined the potential as
(
0 x<b
V (x) = (4.29)
∞ x>b

the calculation would proceed in exactly the same way. The only modification to the result
would be that the condition x, x′ > b would be changed to x, x′ < b. Hence, we have also
solved the problem of the up and out barrier. We can also value options which knock in (that
is attain non-zero value) when the underlying hits the barrier by noting that a knock-in barrier
together with a down and out barrier combine to form a simple option. Since the two are
together equivalent to the simple option, the sum of their values must equal that of the simple
option.

4.4 Double barrier options

The above does not still demonstrate the efficiency of the quantum mechanical method since
single barrier options can be very efficiently priced using the method of images. However, we
will now show that it is fairly easy to price double barrier options as well in this way. This is
quite difficult to do using standard stochastic calculus methods.
A double barrier knock out option is one for which the option becomes worthless when the
underlying reaches either the values A = ea or B = eb > ea . In this case, the potential is given
by 
∞ x < a

V (x) = 0 a < x < b (4.30)

∞ x>b

It is well known that the momenta for the infinite potential well are discrete and given by
Sec. 4.4 Double barrier options 46

a b x

Figure 4.2: The potential for the double barrier option


pn = b−a . The eigenfunctions are hence given by
r
2
hx | ψn i = eα(x−a) hx | φn i = ieα(x−a) sin pn (x − a) (4.31)
b−a
r
2
hψ̃n | xi = e−α(x−a) hφn | xi = − ie−α(x−a) sin pn (x − a) (4.32)
b−a
where hx | φn i are the eigenfunctions of the quantum mechanical particle in an infinite potential
well.
The eigenfunctions are orthonormal since
Z b
2
hψ̃n | ψ i =
n′ sin pn (x − a) sin pn′ (x − a)dx = δnn′ (4.33)
b−a a

and form a complete basis since



2 α(x−x′ ) ∞
∑ hx | ψn ihψ̃n | x′ i =e ∑ sin pn (x − a) sin pn (x′ − a)
n=1 b−a n=1
∞  
1 ′
α(x−x ) inπ ′ inπ ′
=
2(b − a)
e ∑ exp b − a (x − x ) − exp b − a (x + x − 2a) (4.34)
n=−∞
π(x − x′ ) π(x + x′ − 2a)
    
π α(x−x′ )
= e δ −δ
b−a b−a b−a

= δ(x − x )
Sec. 4.4 Double barrier options 47

since a < x < b and a < x′ < b. The propagator is hence given by
∞ ∞
−τ Ĥ ′
hx | e |xi= ∑ ∑ hx | ψn ihψ̃n | e−τ Ĥ | ψn′ ihψ̃n′ | x′ i

n=1 n =1

= ∑ hx | ψn ihψ̃n | x′ ie−τ En
n=1
τ σ2 β
 
1 ′
= exp − + α(x − x )
2(b − a) 2

τ σ 2 p2n
 
′ ′
∑ exp − 2 (eipn(x−x ) − eipn(x+x −2a))
n=−∞
 ∞ Z
τ σ2 β
  2 2 2
1 ′ y π τσ
= exp − + α(x − x ) ∑ dyδ(y − n) exp −
2(b − a) 2 n=−∞ 2(b − a)2
iyπ(x − x′ ) iyπ(x + x′ − 2a)
 
exp − exp
b−a b−a
r
1

τσ β2 

= exp − + α(x − x )
2πτ σ 2 2
∞ 
(x − x′ + 2n(b − a))2 (x + x′ − 2a − 2n(b − a))2

∑ exp − 2τ σ 2
− exp −
2τ σ 2
n=−∞
(4.35)

where the identity


∞ ∞
∑ δ(y − n) = ∑ e2πiny (4.36)
n=−∞ n=−∞
has been used.
Hence, we see that the propagator (apart from the drift terms) is given by an infinite sum of
Gaussians. To check its reasonableness, we check the value in the limits b → ∞ and a → −∞. In
the former case, only the n = 0 term contributes and in the latter, only the n = 0 and n = 1 terms
contribute. It is easy to see that, in both cases, the result reduces to the solution for the single
knockout barrier propagator. When both limits are simultaneously active, only the first term
in the n = 0 term exists and it is easily seen that gives rise to the well known Black-Scholes
propagator.
We can now evaluate the price of a double barrier European call option using the propagator
from (4.35) by folding in the final value max(ex(T ) − K, 0). The calculation is straightforward
and the result is seen to be
∞   
−2nα(b−a) 2n(b−a) −rτ
f= ∑ e e SN(dn1 ) − Ke N(dn2 )
n=−∞
 2a  (4.37)
e
− S 2α e−2α(n(b−a)−a) e2n(b−a) N(dn3 ) − Ke−rτ N(dn4 )
S
Sec. 4.5 More Path Dependent Options 48

where
 2

S
ln( K ) + 2n(b − a) + τ r + σ2
dn1 = √ (4.38)
σ τ
 2

S
ln( K ) + 2n(b − a) + τ r − σ2 √
dn2 = √ = dn1 − σ τ (4.39)
σ τ
2a
 
e σ2
ln( SK ) + 2n(b − a) + τ r + 2
dn3 = √ (4.40)
σ τ
 
e2a 2
ln( SK ) + 2n(b − a) + τ r − σ2 √
dn4 = √ = dn3 − σ τ (4.41)
σ τ
which is the well-known formula for the double barrier price found by Kunitomo and Ikeda[53].

4.5 More Path Dependent Options

We can have considerably more complicated path dependent options since an option is an
arbitrary random variable on the underlying sample space, or in other words, a completely
arbitrary functional of the history of asset prices. For many but not all kinds of path dependent
options, we can extend the above technique of obtaining from the path integral a Hamiltonian
for a quantity related to the option which is path independent and can therefore be represented
as a wave function. The solution for the propagator of this quantity then gives us the solution
for the path dependent option. It must be noted that this quantity cannot be a traded asset as
all traded assets evolve with the Black-Scholes Hamiltonian (4.9) according to (3.18) in the
Black-Scholes model. Let us first look at how this is done for some relatively simple (but more
complicated than simple barrier options) path dependent options.

4.5.1 Soft barrier options

These options have been considered in detail in Linetsky [54]. They are similar to the barrier
options considered above but do not knock out the option completely when the barrier is hit.
Instead, they discount the final payoff by the exponential of a constant times the amount of time
spent inside the barrier. For example, for a down and discounted step option whose barrier is at
B and strike price at K, the payoff at expiry is
e−V τB− (ST − K)+ (4.42)
where τB− is the time spent below the barrier B and V is the discounting factor. Considered as
a path integral, the current price of the option is given by
Z Z x(T )=x′

dx′ (ex − K)+ DxeSBS e−V τB− (4.43)
x(0)=ln S

Defining a potential (
−V x < ln B
V (x) = (4.44)
0 x ≥ ln B
Sec. 4.5 More Path Dependent Options 49

we see that that the path integral is equivalent to


Z Z x(T )=x′
′ x′
dx (e − K)+ DxeS (4.45)
x(0)=ln S

with the action S now being given by


2 !
σ2

1
Z Z
S =− dtL(x, ẋ) = − 2 dt ẋ − r + + r + V (x) (4.46)
2σ 2

In other words, we have just introduced a potential into the problem. The Lagrangian is now

L = LBS + V (x) (4.47)

and the Hamiltonian now has an extra term −V (x). The new Hamiltonian is therefore

σ2 ∂2 σ2 ∂
 
Ĥ = + r− − r − V (x) (4.48)
2 ∂x2 2 ∂x

Hence, the solution of the step option price in the Black-Scholes model is equivalent to the
solution of the plain vanilla option in a model with the above Hamiltonian.
More generally, the pricing of an option whose final payoff is
R
e− V (x(t))dt
(ST − K)+ (4.49)

is equivalent to the pricing of a plain vanilla call option in a model where the Hamiltonian is

Ĥ = ĤBS − V (x) (4.50)

If we can find the eigenvalues and eigenfunctions of the operator Ĥ, we can write down the
propagator using the decomposition

hx | eτ Ĥ | x′ i = ∑ hx | nihn | eτ Ĥ | nihn | x′ i = ∑ e−τ En ψn∗ (x′ )ψn (x) (4.51)


n n

where the eignevalues of Ĥ are −En and the eigenfunctions corresponding to these eigenvalues
are ψn . The eigenvalues are denoted with a negative sign as they are negative definite when
V = 0. When the eignenvalues are continuous, the sum becomes an integral as was the case in
the calculations for the barrier options. In this case, we can consider the Laplace transform of
the propagator Z ∞
dτ e−sτ hx | eτ Ĥ | x′ i (4.52)
0

which is seen to be the Green’s function of the operator s − Ĥ (this can also be directly seen
from the Feynman-Kac theorem). Once we find the Green’s function, we can perform the
inverse Laplace transform to get the propagator and hence the solution to the problem.
To see how this works, let us take the example of the simple barrier option. We modify the
state space using the terms α and β and rescaling x by σ so that we only deal with a standard
Brownian motion (the rescaling of x makes the obvious change α → σα and β → σ 2 β for
Sec. 4.5 More Path Dependent Options 50

the parameters so that we obtain a Hermitian Hamiltonian). In that case, the potential only
influences the boundary conditions, so we have to find the solution to the equation

1 d2
 
2
− s G(x, x′ ; s) = −δ(x − x′ ) (4.53)
2 dx

with the boundary conditions G(x, x′ ; s) = 0, x, x′ = b = ln B and limx→∞ G(x, x′ ; s) = 0. The


Green’s functions can be easily found using standard methods and the result is given by
√ √ ′
′ 2 sinh 2s(x − b)e− 2s(x −b)
G(x, x ; s) = √ Θ(x′ − x)
2s
√ √ (4.54)
2 sinh 2s(x′ − b)e− 2s(x−b)
+ √ Θ(x − x′ )
2s
whose inverse Laplace transform is the propagator found in (4.28) (after adjusting for the trans-
formation). As can be seen, both methods are very closely related (effectively, one uses Laplace
transforms while the other is using Fourier transforms).
This technique is applied to find a closed form solution for the step option in Linetsky [54].
If we again only deal with the standard Brownian motion, we find that the Green’s function is
given by
  √ √ √ √ 
1 ′| s+V − s ′ −2b)
 √ e 2s|x−x − s+V +√s e
√ 2s(x+x x, x′ > b

 2s
√ √
2(s+V )(x−b)− 2s(x′ −b)

e √ x ≤ b, x′ ≥ b




2(s+V )+ 2s
G(x, x′ ; s) = √ √
)(x′ −b)− 2s(x−b)
e 2(s+V

 √ √ x ≥ b, x′ ≤ b
2(s+V )+
√ 2s
√ √


  √ 
′ ′
√ 1 e 2(s+V )|x−x | − √s+V −√s e 2(s+V )(x+x −2b) x, x′ < b


2(s+V ) s+V + s
(4.55)
whose inverse Laplace transform gives us the propagator.

4.5.2 Asian options

There are several options which cannot be put into a simple form by discounting alone. One
such option which is also fairly popular in the market is the Asian option which has been
considered in detail in the literature. The Laplace transform of an out of the money Asian
option was found in Geman and Yor [55]. The payoff of the Asian option is defined to be

1 T
 Z 
max 0, S(t)dt − K (4.56)
T 0

We can write the option price as a path integral


1
Z Z Z
1 RT
dtex(t) −ν)
dp dν DxeSBS eip( T 0 (ν − K)+ (4.57)

using a standard expression for the Dirac delta function. We see that we can consider this
expression as a plain vanilla call option with a Lagrangian modified by − ip x
Te .
Sec. 4.5 More Path Dependent Options 51

One option which is a fairly good approximation for the Asian option but which is easily
solvable is the geometric Asian option. Its final payoff is defined to be
1 RT
 
max 0, e T 0 x(t)dt − K (4.58)

To solve this, let us write the Black-Scholes evolution as

σ2
 
dx
= r− + ση(t) (4.59)
dt 2

where η(t) is white noise. Hence,

σ2
  Z t
x(t) = x(0) + r − t + dt′ η(t′ ) (4.60)
2 0

and
σ2
Z T
σ T
 
1 1
Z
dtx(t) = x(0) + r− + (T − t)η(t)dt (4.61)
T 0 2 2 T 0
To find the distribution of the last term, we make use of the generating function for white noise
(2.119) to give
1
Z
1 RT σ RT
Dηe− 2 0 dtη (t) eip( T 0 dt(T −t)η(t)−ν )
2
(4.62)

which evaluates to r
2
3 − 3ν2
e 2σ T (4.63)
2πσ 2 T
1 RT 2
In other words, the distribution of T 0 x(t)dt is N(0, σ 3T ). Therefore, the price of the geo-
metric Asian option is given by

c = SN(d1 ) − Ke−r(T −t)N(d2 ) (4.64)

where
√  S
 2
  √   2
 
+ r2 + σ12 (T − t) S
+ 2r − σ4 (T − t)
 
3 ln K 3 ln K
d1 = √ , d2 = √ (4.65)
σ T −t σ T −t

4.5.3 Seasoned options

In the above discussion, we have only discussed how to value path dependent options where
the path dependence starts at the present. In practice, the problem of solving for the price of
path dependent options after they have been initiated is very important. Such options are called
seasoned options. In many cases, the valuation of seasoned options proceeds very similarly to
that of new options.
Let us consider a seasoned soft barrier option with discounting by a potential V (x) at time
t > 0 where the path dependence has started at zero time. We denote the maturity time be T .
The value of this option is given by
Rt Z RT
dt′ V (x(t′ )) dt′ V (x(t′ ))
e− 0 DxeSBS e− t (ex(T ) − K)+ (4.66)
Sec. 4.5 More Path Dependent Options 52

where we should take into account the discounting until time t separately since the history is
already known. We see that apart from this factor, there is no substantial difference between
the valuation of the new and seasoned options.
For Asian options, we see that we can value the seasoned option provided we can value the
new option since the probability distribution of the remaining part of the average will determine
the option price at the time when combined with information about the contribution to the
average of the revealed historical price. Hence, we see that we can price seasoned Asian options
if we can price new Asian options.
C HAPTER 5

I NTEREST R ATE M ODELS

5.1 Introduction

We introduced bonds in chapter 1. They are promises by the issuer to pay certain sums of
money in the future. For the purposes of this thesis, we only consider default-free bonds, that
is, those for which there is no repayment or credit risk. Hence, the main source of uncertainty
in the valuation of these bonds and options on them is the uncertainty of future interest rates.
We shall also be largely concerned only with zero coupon bonds, that is, bonds which have
only one payment date in the future. This is because all bonds can be composed out of these
elementary zero coupon bonds.
We now introduce some of the terminology used in this and following chapters. We denote
the price at time t of a unit zero coupon bond maturing (ie, making its payment) at time T by
P (t, T ). The yield on this zero coupon bond is defined to be y(t, T ) = ln(P (t, T ))/(T − t). We
call the interest rate for an instantaneous loan at time x contracted at time t (obviously, x ≥ t)
the forward rate for x at time t and denote it by f (t, x). It should be easy to see that we then
have the relationship RT
P (t, T ) = e− t dxf (t,x) = e−y(t,T )(T −t) (5.1)
It should also be easy to see that

∂y(t, T )
f (t, T ) = y(t, T ) + (T − t) (5.2)
∂T
The shape of the domain for the forward rates is shown in figure 5.1. In the figure, it has been
assumed that the forward rates are defined only up to a time TF R into the future. Theoretically,
forward rates can exist for all future time, so in most cases we will be taking the limit TF R → ∞.
The forward rate for the current time f (t, t) is usually denoted by r(t) and is called the spot rate.
For a long time, it was thought that the spot rate alone determined the dynamics of all the bond
prices but modern models tend to introduce dynamics to the entire forward rate curve. Note R
that
all the models assume that the zero coupon bonds and the money market account exp( r(t)dt)
are the only traded assets.1 Hence, we will be very much interested in their dynamics.
1 Even the money market account can be thought of as an infinite series of infinitesimal duration bonds.

53
Sec. 5.3 The HJM model 54

t0 (t0 , t0 + TF R )
(t0 , t0 )

0 t0 t 0 + TF R x

Figure 5.1: The domain for the forward rates

5.2 History of interest rate models

The field of interest rate modelling was started by Vasicek[56] in his seminal paper show-
ing how to price bonds and derive the market price of risk based on diffusion models of the
spot rate. He also introduced his famous Vasicek model in that paper. Since then, there have
been many other diffusion processes proposed for the short rate such as those by Brennan
and Schwartz[57], Cox, Ingersoll and Ross[58], Hull and White[59], Jamshidian [60, 61, 62],
Black, Derman and Toy[63] and Black and Karasinski[64]. Each of these has its unique fea-
tures, advantages and disadvantages. However, as noted in Heath, Jarrow and Morton[65],
they all have one serious problem. Since all of them only model the spot rate, they make very
specific predictions for the forward rate structure. These predictions are usually not satisfied
in reality and this leads to model specification problems. The specification of arbitrary mar-
ket prices of risk in these models tends to alleviate this problem but introduces the even more
severe problem of introducing arbitrage opportunities as noted in Cox, Ingersoll and Ross[58].
This led Heath, Jarrow and Morton[65] to develop their famous model where all the forward
rates are modelled together. This model, usually called the HJM model is, together with its vari-
ants, now the industry standard interest rate model. This model is still restricted by the fact that
it has only a finite number of factors which each influence the entire forward rate curve. This
restricts the possible correlation structure of the forward rates. This restriction can be removed
by taking the number of factors to infinity as pointed out in Cohen and Jarrow[66]. This is how-
ever unrealistic from a specification point of view as an infinite number of parameters cannot, of
course, be estimated. Hence, models where a rich correlation structure could be imposed with
a small number of parameters were developed. The earliest such model was by Kennedy[67]
and was followed by Goldstein[68], Baaquie[69] and Santa-Clara and Sornette[70]. We shall
be largely interested in the HJM model and its field theory generalisation by Baaquie. We shall
also present the Santa-Clara and Sornette model as it is also fairly general and has the models
by Kennedy and Goldstein as special cases. We will also see in chapter 8 that the other models
can all be incorporated in the generalization of Baaquie’s framework presented in Baaquie [48].
Sec. 5.3 The HJM model 55

5.3 The HJM model

5.3.1 Definition of the model

The HJM model models the forward rates as


Z t K Z t
′ ′
f (t, x) = f (t0 , x) + dt α(t , x) + ∑ dt′ σi (t′ , x)dWi (t′ ) (5.3)
t0 i=1 t0

where Wi are independent Wiener processes. We can also write this as


K
∂f (t, x)
= α(t, x) + ∑ σi (t, x)ηi (t) (5.4)
∂t i=1

where ηi represent independent white noises. The action functional, following (2.119) is just

1 K
Z
S[W ] = − ∑ dtηi2 (t) (5.5)
2 i=1

We can use this action to calculate the generating functional which is


K R t2 dtj (t)W (t)
Z
Z[j, t1 , t2 ] = DW e∑i=1 t1 i i
eS0 [W,t1,t2 ]
1 R t2
∑K dtji2 (t)
= e 2 i=1 t1
(5.6)

5.3.2 The fundamental theorem of asset pricing and the action

We have already seen that the fundamental theorem of finance can be put in the form that
all contingent claims have a unique well defined price if there exists a unique Hamiltonian
equivalent to the actual Hamiltonian (that is, differing only in terms of first order in p) such that

Ĥ| Zi = 0 (5.7)

where the Z are the discounted traded assets. However, we have not yet discussed the interpre-
tation in terms of the action. We proceed to do so now and derive new results which are useful
in the analysis of martingale measures.
In the case of stocks, we now consider the meaning of this in terms of the action. In an
economy of one stock and the money market account, we can write the expectation of the
future stock price ex as
 Rt  Z Rt
− t 1 dt′ r(t′ ) x(t1 ) x(t0 )
′ ′ ′
S[x] t01 dt (ẋ(t )−r(t ))
Et0 e 0 e =e Dxe e (5.8)

If we now introduce the partition function for ẋ which we denote by U[j] and which is defined
by Z R t1
dt′ j(t′ )(ẋ(t′ )−r(t′ ))
U[j(t)] = DxeS[x] e t0
(5.9)
Sec. 5.3 The HJM model 56

we see that that the martingale condition


R t1
− dt′ r(t′ ) x(t1 )
Et0 [e t0
e ] = ex(t0) (5.10)

leads to the conclusion that


U[1] = 1 (5.11)
since Z t1 Z t1
′ ′ ′
dt (ẋ(t ) − r(t )) = x(t1 ) − x(t0 ) − r(t′ )dt′ (5.12)
t0 t0
Firstly, we note that the term r is not of particular importance as it would disappear if we were
modelling the logarithm of the discounted asset price z = x−rt. In other words, we see that the
partition function related to the change in the logarithm of the discounted traded asset should
be one when evaluated at one for a risk-neutral action. This conclusion should hold true for all
times t1 > t0 .
More generally, if we choose the numeraire the traded asset X = ex in an economy which
N other traded assets Yi = eyi , the action must satisfy the condition
Z R t1
dt′ (ẏk −ẋ)
Dx ∏ Dyi eSef f e t0
=1 (5.13)
i

for all k where Sef f is a new effective action related to the previous one by
Z t1
Sef f = S + (ẋ − r)dt (5.14)
t0

The partition function for ẏi defined by


Z R t1
dt′ ∑N ′
k=1 jk (t )(ẏk −ẋ)
U[j1 (t), . . . , jN (t)] = Dx ∏ Dyi eSef f e t0
(5.15)
i

then satisfies
U[ji (t) = 1, t0 < t < t1 ] = 1, 1 ≤ i ≤ N (5.16)
for all times t1 > t0 . It should be noted that different choices of numeraire will lead to different
actions but all of them are related in the manner mentioned above. In the above discussion, we
were restricted to a countable number of assets (as implied by the summation) but this is not
required. In the analysis of the economy of bonds below, we will see that a continuous set of
assets can be treated equally easily by extending the set of assets to a field with infinite degrees
of freedom.
Let us now apply this result to the economy of bonds that we are considering. We will
use the money market account Ras the numeraire. The discounted traded assets are now the
t T
dt′ f (t′ ,t′ )−
R
− dxf (t,x)
discounted bondsRe t0 t
where T forms a continuous index. Their logarithms
t ′ ′ ′
RT
take the form − t0 dt f (t , t ) − t dxf (t, x) and their rate of change with time is given by
− tT dxf˙(t, x) which changes with t. Using this fact and just integrating this over time to
R

get the exponent, we immediately obtain (5.20) as the partition function to consider for the
martingale measure. Alternatively, we see that over a finite time interval we are considering the
Sec. 5.3 The HJM model 57

t
( t* ,t* ) ( t* ,T)
t*
1111111
0000000
0000000
1111111
0000000
1111111
0000000
1111111
0000000
1111111
0000000
1111111
t0 0000000
1111111
( t 0 , t 0) ( t 0,T)

0 t0 t* T t 0 + TFR x

Figure 5.2: Domain R is shaded above

expression
R t∗
− r(t)dt
Et0 [e t0
P (t∗ , T )] RT R t∗
Z RT
dxf (t0 ,x) − dtr(t) −
=e t0 Df eS[f ] e t0
e t∗ dxf (t∗ ,x)
P (t0 , T )
Z R t∗ R t∗ RT RT
−( f (t,t)dt− dtf (t0 ,t)) −(
= Df eS[f ] e t0 t0
e t∗ dxf (t∗ ,x)dx− t∗ dxf (t0 ,x))

=1
(5.17)
and since Z t∗ Z t∗ Z t∗ Z t∗
− dtf (t, t) + dtf (t0 , t) = − dt dxf˙(t, x) (5.18)
t0 t0 t0 t
we get
Z t∗ Z T
− dt dxf˙(t, x) (5.19)
t0 t
as the exponent of the partition function to use for the martingale measure. Hence, either way,
we see that the partition function
Z R∞ R∞
− ˙
t0 dt t dxj(t,x)f (t,x)
U[j(t, x)] = Df eS[f ] e (5.20)

should evaluate to one if j(t, x) = 1, t0 < t < t∗ , t < x < T and zero otherwise2 . The domain
that this describes is the trapezoidal domain consisting of both the unshaded triangle and shaded
rectangle in figure 5.2.
Another way to look at the change of numeraire is to consider it as an introduction of unity
into the path integral which changes the action. By the martingale principle,
Z R t∗
− r(t)dt
P (t0 , T ) = Df e t0
P (t∗ , T )eS (5.21)

2 Note that T > t∗ for this to make sense. Otherwise, the values of t∗ and T are arbitrary.
Sec. 5.3 The HJM model 58

which we can write as (see above discussion)


Z R t∗ RT
− dt ˙
dxf(t,x) S
P (t0 , T ) = P (t0 , T ) Df e t0 t
e (5.22)

We now introduce unity in the form of


 R t∗ R t2 
dt dxf˙(t,x)
 eR
t0 t

t∗ R t2
 (5.23)
˙
t0 dt t dxf (t,x)
e
into the path integral which will give
R t2 Z R t∗ RT
− dxf (t0 ,x) Sef f − dt f˙(t,x)
P (t0, T ) = P (t0 , T )e t0
Df e e t0 t2
(5.24)

where Z t∗ Z t2
Sef f = S − dt dxf˙(t, x) (5.25)
t0 t
since
Z t∗ Z t2 Z t∗ Z t∗ Z t2 Z t2
dt dxf˙(t, x) = dtf (t, t)dt − dxf (t0 , x) + dxf (t∗ , x) − dxf (t0 , x)
t0 t t0 t0 t∗ t∗
(5.26)
In terms of probability measures, the ratio between the two probability measures or the Radon-
Nikodým derivative is given by eSef f −S which is given by (5.25) . For the case of the transfor-
mation of the action by changing the numeraire to P (t, t∗ ), the Radon-Nikodým derivative is
seen to be given by R t∗
− r(t)dt
e t0
(5.27)
P (t0, t∗ )
as shown in Geman et al [41] using probabilistic methods.
In practice, when we deal with locally Gaussian processes, we can get an effective action
of the form (5.25) by changing the drift of the process and applying Girsanov’s theorem. The
Jacobian of the transformation from the variables without the drift to the variables with the drift
gives the extra current term to the action.
(5.24) can be rewritten in a more transparent manner as
P (t0 , T ) P (t∗, T ) Sef f
Z
= Df e (5.28)
P (t0 , t2 ) P (t∗, t2 )
P (t,T )
showing that P (t,t ) is a martingale with this new action. For actions quadratic in f˙ describing f
2
being driven by Gaussian random fields, the change of action only implies a changed drift which
can be calculated using Girsanov’s theorem or even more straightforwardly by considering the
Hamiltonian for the random field and the changed form of the discounted assets. We will soon
see how to carry out this calculation when we consider the field theory model of forward rates.
Let us now discuss more physically how the integration domain changes when we use
different numeraires. In other words, we want to look at the different form of the currents for
which the partition function is one for each of the different effective actions. The partition
function for the effective action can be generally written as
Z R∞ R∞
− ˙
t0 dt t dxj(t,x)f (t,x)
U ′ [j(t, x)] = Df eSef f [f ] e (5.29)
Sec. 5.3 The HJM model 59

should evaluate to one if j(t, x) = 1, t0 < t < t∗ , t2 < x < T . The domain that this describes is
a rectangle which is the shaded rectangle in figure 5.2 if t2 = t∗ . We have effectively removed
the triangle from the domain if we are only working with bonds expiring after time t∗ and this
simplifies calculations significantly as we will see later. When t > t2 , the discounted assets are
not defined as the bond P (t, t2 ) has expired and no longer exists, hence we should not choose
a bond which expires before the final time in the domain as the numeraire. When t2 > T , we
see that j should have the value −1 in the domain t0 < t < t1 , T < x < t2 for U ′ [j] to equal
one. Whenever we change the numeraire, we have a different action Sef f so that the different
partition functions for each numeraire evaluate to one.

5.3.3 Lattice field theory formulation

While the Hamiltonian for the HJM model can be written down directly from the definition,
there are several subtle issues due to the odd structure of the domain. Hence, we will first
present a lattice formulation which can be easily extended to more complicated theories. This
section borrows very heavily from Baaquie[50].
The state space of a field theory is a linear vector space – denoted by V – that consists of
functionals of the field configurations at some fixed time t. The dual space of V – denoted by
VDual – consists of all linear mappings from V to the complex numbers, and is also a linear
vector space. The Hamiltonian H is an operator – the quantum analog of energy – that is an
element of the tensor product space V ⊗ VDual .
In this subsection, we study the specific features of the state space and Hamiltonian for the
quantum field theory of forward rates. Since the HJM Lagrangian for the forward rates given
in (5.6) has only first order derivatives in time (the ηi depend can be defined linearly in terms
of first order derivatives in time of the forward rates), an infinitesimal generator, namely the
Hamiltonian H exists for it. Obtaining the Hamiltonian for the forward rates is a complicated
exercise due to the non-trivial structure of the underlying domain P. In particular, the forward
rates quantum field will be seen to have a distinct state space Vt for every instant t.
For greater clarity, we discretize both time and maturity time into a finite lattice, with lattice
spacing in both directions taken to be ǫ. The points comprising the discrete domain P̃ are shown
in Figure 5.3.

Figure 5.3: Lattice in Time and Maturity Directions


Sec. 5.3 The HJM model 60

The discrete domain P̃ is given by

(t, x) → ǫ(n, l) ; n, l : integers (5.30)


(Ti , Tf , TF R ) → ǫ(Ni , Nf , NF R ) (5.31)
Lattice P̃ = {(n, l)|Ni ≤ n ≤ Nf ; n ≤ l ≤ (n + NF R )} (5.32)
f (t, x) → fn,l (5.33)
∂f (t, x) fn+1,l − fn,l ∂f (t, x) fn,l+1 − fn,l
≃ ; ≃ (5.34)
∂t ǫ ∂x ǫ

The partition function is now given by a finite multiple integral, namely


Z
Z = ∏ dfn,l eS[f ] (5.35)
(n,l)ǫP̃
S = ∑ S(n) (5.36)
n

Consider two adjacent time slices labelled by n and n + 1, as shown in Figure 5.4. S(n) is
the action connecting the forward rates of these two time slices.

n+1

Figure 5.4: Two Consecutive Time Slices for t = nǫ and t = (n + 1)ǫ

As can be seen from figure 5.4, for the two time slices there is a mismatch of the 2-lattice
sites on the edges, namely, lattice sites (n, n) at time n and (n + 1, n + 1 + NF R ) at time n + 1
are not in common. We isolate the un-matched variables and have the following

Variables at time n :
{fn,n , F̃n } ; F̃n ≡ {fn,l |n ≤ l ≤ n + NF R } (5.37)
Variables at time (n + 1) :
{Fn , fn+1,n+1+NF R } ; Fn ≡ {fn+1,l |n + 1 ≤ l ≤ n + 1 + NF R } (5.38)

Note that although the variables Fn refer to time n + 1, we label it with earlier time n for later
convenience. From figure 5.4 we see that both sets of variables Fn and F̃n cover the same
lattice sites in the maturity direction, namely n + 1 ≤ l ≤ n + NF R , and hence have the same
number of forward rates, namely NF R − 1. The Hamiltonian will be expressible solely in terms
of these variables.
From the discretized time derivatives defined in (5.34) the discretized action S(n) contains
terms that couple only the common points in the lattice for the two time slices, namely the
Sec. 5.3 The HJM model 61

variables belonging to the sets F̃n ; Fn . We hence have for the action
S(n) = ǫ ∑ Ln [fn,l , fn+1,l ] (5.39)
{l}
= ǫ ∑ Ln [F̃n ; Fn ] (5.40)
{l}

As shown is in figure 5.5, the action for the entire domain P̃ shown in figure 5.3 can be
constructed by repeating the construction given in figure 5.4 and summing over the action S(n)
over all time Ni ≤ n ≤ Nf .

Figure 5.5: Reconstructing the Lattice from the Two Time Slices

The Hamiltonian of the forward rates is an operator that acts on the state space of states of
the forward rates. Hence, we need to determine the co-ordinates of its state space.
Consider again the two consecutive time slices n and n + 1 given in figure 5.4. We interpret
the forward rates for two adjacent instants, namely {fn,n , F̃n } and {Fn , fn+1,n+1+NF R } given
in (5.37) – and which appear in the action (5.39) – as the co-ordinates of the state spaces Vn
and Vn+1 respectively.
For every instant of time n there is a distinct state space Vn , and its dual VDual,n . The co-
ordinates of the state spaces Vn and Vn+1 are given by the tensor product of the space of state
for every maturity point l, namely
< f˜n | =
O
< fn,l | ≡< fn,n | < F̃n | (5.41)
n≤l≤n+NF R
: co − ordinate state of VDual,n
O
|fn+1 > = |fn+1,l >≡ |Fn > |fn+1,n+1+NF R > (5.42)
(n+1)≤l≤n+1+NF R
: co − ordinate state of Vn+1
The state vector |Fn > belongs to the space space Vn+1 , but we reinterpret it as correspond-
ing to the state space Fn at earlier time n. This interpretation allows us to study the system
instantaneously using the Hamiltonian formalism.
The state space Vn consists of all possible functions of NF R forward rates {fn,n , F̃n }. The
state spaces Vn differ for different n by the fact that a different set of forward rates comprise
its set of independent variables.
Although the state spaces Vn and Vn+1 are not identical, there is an intersection of these
two spaces, namely Vn ∩ Vn+1 that covers the same interval in the maturity direction, and is
Sec. 5.3 The HJM model 62

coupled by the action S(n). The intersection yields a state space, namely Fn , on which the
Hamiltonian evolution of the forward rates takes place. In symbols, we have

Vn+1 = Fn ⊗ |fn+1,n+1+NF R > (5.43)


VDual,n = < fn,n | ⊗ FDual,n (5.44)
Hn : Fn → Fn ⇒ Hn ∈ VDual,n ⊗ Vn+1 (5.45)

The Hamiltonian Hn is an element on the tensor product space spanned by the operators
|Fn >< F̃n |, namely the space of operators given by Fn ⊗ FDual,n .
The vector spaces Vn and the Hamiltonian Hn acting on these spaces is shown in figure
5.5.

Vn+2
Hn+1
Vn+1

Hn
Vn

Figure 5.6: Hamiltonians Hn propagating the space of Forward Rates Vn

Note that both the states |Fn > and < F̃n | belong to the same state space Fn , and we use
twiddle to indicate that the two states are different. This is in contrast to the two states |f >
and < f | indicate that one state is the dual of the other.
As one scans through all possible values for the forward rates f and f, ˜ one obtains a com-
plete basis for the state space Vn . In particular, the resolution of the identity operator for Vn –
denoted by In – is a reflection that the basis states are complete, and is given by [16]
Z
In = ∏ dfn,l |fn >< fn |
n≤l≤n+NF R
Z (5.46)
≡ dfn,n dF̃n |fn,n ; F̃n >< fn,n ; F̃n |

The Hamiltonian of the system H is defined by the Feynman formula (up to a normaliza-
tion), from (5.39), as

ρn eǫ ∑{l} Ln[fn,l ,fn+1,l ] =< fn,n , F̃n |e−ǫHn |Fn , fn+1,n+1+NF R > (5.47)

where in general ρn is a field-dependent measure term. Using the property of the discrete action
given in (5.40), we have

ρn eǫ ∑{l} Ln [Fn ,F̃n ] = < fn,n , F̃n |e−ǫHn |Fn , fn+1,n+1+NF R >
= < F̃n |e−ǫHn |Fn > (5.48)

In going from (5.47) to (5.48) we have used the fact that the action connecting time slices
n and n + 1 does not contain the variables fn,n and fn+1,n+1+NF R respectively. This leads to
the result that the Hamiltonian Hn consequently does not depend on these variables.
Sec. 5.3 The HJM model 63

The interpretation of (5.48) is that the Hamiltonian Hn propagates the initial state < F̃n | in
time ǫ to the final state |Fn >. Note the relation

< fn,n , F̃n |e−ǫHn |Fn , fn+1,n+1+NF R >=< F̃n |e−ǫHn |Fn > (5.49)

shows that there is an asymmetry in the time direction, with the Hamiltonian being independent
of the earliest forward rate fn,n of the initial state and of the latest forward rate fn+1,n+1+NF R
of the final state. It is this asymmetry in the propagation of the forward rates which yields the
parallelogram domain P given in figure 5.3, and reflects the asymmetry that the forward rates
f (t, x) exist only for x > t.
For notational simplicity, we will henceforth use continuum notation. In particular, the state
space is labelled by Vt , and state vector by |ft >. The elements of the state space of the forward
rates Vt includes all the basic financial instruments that are traded in the market at time t. In
continuum notation, from (5.42), we have that
O
|ft > = |f (t, x) > (5.50)
t≤x≤t+TF R
O
|Ft > = |f (t, x) > (5.51)
t<x≤t+TF R

In continuum notation, the only difference between state vectors |ft > and |Ft > is that, in
(5.51), the point x = t is excluded in the continuous tensor product.
The partition function Z given in (5.35) can be reconstructed from the Hamiltonian by re-
cursively applying the procedure discussed for the two time slices. We then have, in continuum
notation, that
Z
Z= Df eS[f ]
n Z Tf o (5.52)
=< finitial |T exp(− H(t) dt) |ffinal >
Ti

where the symbol T in the equation above stands for time ordering the (non-commuting) oper-
ators in the argument, with the earliest time being placed to the left.

5.3.4 Hamiltonian for HJM

In the case of HJM, it is easier to derive the Hamiltonian by interpreting it as the generator
for the infinite variable diffusion (2.69). The scaled cumulants can be easily found from (5.4).
Since the model is Gaussian, we have only two cumulants, the first of which is < df (t, x) >=
α(t, x) and < df (t, xi )df (t, xj ) > − < df (t, xi ) >< df (t, xj ) >= σ(t, xi )σ(t, xj ) Hence, the
Hamiltonian for the HJM model of forward rates is given by
!
N
1 t+TF R δ2
Z
HHJM (t) = dxdx′ ∑ σi (t, x)σi (t, x′ )
2 t i=1 δf (t, x)δf (t, x′ )
Z t+TF R (5.53)
δ
+ dxα(t, x)
t δf (t, x)
Sec. 5.3 The HJM model 64

Note carefully that the Hamiltonian is explicitly dependent on time due to the changing limits
of integration. When expressed in terms of both position and the Fourier transform of f which
we denote by p, the Hamiltonian operator becomes
!
N
1 t+TF R
Z
HHJM (t) = − dxdx ∑ σi (t, x)σi (t, x ) p(x)p(x′ )
′ ′
2 t i=1 (5.54)
Z t+TF R
+i dxα(t, x)p(x)
t

where p(x) is the Fourier variableRconjugate


R to f (x). Under this transformation, the state vector
| F [f ]i is transformed to G[p] = Df ei dxp(x)f (x)F [f ].

5.3.5 Dynamics of the bond prices and the martingale measure

We now look at the dynamics of the bond RT


prices in the HJM model. Since the price of the zero
coupon bond is given by P (t, T ) = e− t dxf (t,x) , we see that its dynamics is given by

2 !
1 N
Z T Z T
dP (t, T ) = P (t, T ) r(t) − dxα(t, x) + ∑ dxσi (t, x) dt
t 2 i=1 t
! (5.55)
N Z T 
−∑ dxσi (t, x) dWi (t)
i=1 t

where the r(t) term enters because of the change in the integration domain of f and the third
term appears because of Itô’s lemma. If we choose the money market account as the numeraire,
we see that the discounted asset Z(t, T ) follows the same dynamics except that the r(t) term is
removed. Hence the condition for the martingale measure is

1 N
Z T Z T Z T
dxα(t, x) = ∑ dx dx′ σi (t, x)σi (t, x′ ) (5.56)
t 2 i=1 t t

and, differentiating with respect to T , we finally get the risk-neutral condition


N Z T
α(t, T ) = ∑ σi (t, T ) t
dxσi (t, x) (5.57)
i=1

Putting this value into (5.53), we see that the Hamiltonian does annihilate the discounted bond
as expected. Hence, we now have the equivalent Hamiltonian where all the discounted assets
are martingales. Another way to derive the equivalent Hamiltonian which gives more insight
is to consider the form of the discounted bond state vectors in the Fourier RT transformed state
space. In the f basis the bond state vectors are given by P (t, T ) = | e − t f (x)dx i In the Fourier
R RT
i p(x)f (x)dx − f (x)dx
R
transformed basis, this state vector is given by Df e e t which is the sim-
ple vector | iΘ(x − t)Θ(T − x)i which means that p(x) = i, t ≤ x ≤ T . Putting this into (5.54),
we get
Z T Z T N Z T
1 ′ ′
HHJM | P (t, T )i =
2 t
dx
t
dx ∑ σi (t, x)σi (t, x ) − t
dxα(t, x) (5.58)
i=1
Sec. 5.3 The HJM model 65

and setting this equal to zero gives us the expression for α for the equivalent Hamiltonian.
If we use a numeraire different from the money market, α will be different as we are the
discounted assets will be different. If we choose P (t, t1 ) as the numeraire, we can use the
condition of the Hamiltonian annihilating the discounted bonds to give us
N Z T
α(t, T ) = ∑ σi (t, T ) t1
dxσi (t, x) (5.59)
i=1

which will give us a different equivalent Hamiltonian. Note that if t1 > T , α will be negative
for this Hamiltonian. The fact that this change in α gives the risk-neutral measures for the
different numeraires can also be shown (in a more tedious way) using Girsanov’s theorem.
Note that in the market, α can take almost any value depending on the market prices of risk
but in the risk-neutral measure with the money market account as the numeraire, α has to take
the value above. Now that we have the risk-neutral measure, we can see how contingent claims
for bonds can be calculated in this model.

5.3.6 Futures pricing in the HJM model

A forward contract on a zero-coupon bond expiring at time T is a contract signed at time t0


to buy the bond at time t∗ in the future for a fixed price. We denote this price by F (t0 , t∗ , T ).
This price is easily fixed by a simple no arbitrage argument. The argument is that if one buys
F (t0 , t∗ , T ) zero coupon bonds expiring at t∗ today (today being t0 ) and uses the proceeds to
buy the bond expiring at T , then one should obtain the same result by just buying the zero
coupon bond expiring at T initially. Otherwise, one can sell the more expensive of F (t0 , t∗ , T )
bonds maturing at t∗ and the bond expiring at T , buy the other and keep the difference as the
two different assets will cancel each other out at time T . Hence,
P (t0 , T )
F (t0 , t∗ , T ) = (5.60)
P (t0 , t∗ )

The futures contract is similar to the forward contract with the difference being that, in the
forward contract, there is only one cash flow at time t∗ while for a futures contract there is
a continuous cash flow from time t0 to t∗ such that all variations in the price of P (t + dt, T )
away from P (t, T ) are settled continuously between the buyer and seller with a final payment
of P (t∗ , T ) at t∗ . The futures price can be shown to be given by [71]

F(t0 , t∗ , T ) = Et0 [P (t∗ , T )] (5.61)

since the value of the futures contract at any time is exactly zero. Using (5.5) and (5.6), we can
evaluate this to get
Z RT
F(t0, t∗ , T ) = Dηe− t∗ f (t∗ ,x) S
e
Z RT N R
(5.62)
R
= Dηe− t∗ dxf (t0 ,x) −
e R α−∑i=1 R σi ηi (t) eS
= F (t0 , t∗ , T ) exp ΩF
Sec. 5.3 The HJM model 66

where
N Z t∗ Z t∗ Z T
ΩF (t0 , t∗ , T ) = − ∑ dt dxσi (t, x) dx′ σi (t, x′ ) (5.63)
i=1 t0 t t∗
and R represents the rectangular domain in figure 5.2.
A somewhat simpler way to perform the calculation is to write the expectation as
Et0 [P (t∗, T )] (5.64)
which becomes R t∗
r(t)dt
P (t0 , t∗ )EtQ0 [e t0
P (t∗ , T )] (5.65)
in the risk-neutral measure with numeraire P (t, t∗ ) which we denote by Q. The reason for this
change of numeraire is so that we are dealing with martingales which are easier to analyse. The
futures price is not a martingale (due to the cash flows) in any risk-neutral measure but we see
that it can be considered as the ratio of two log-normal martingales. We note that all the vari-
ables above are log-normal random variables since the forward rates are normally distributed.
The two martingales in this case are Pe(t,t )
R ∗ . and P (t , T ). The innovations in the martingales
rdt ∗
are log-normal random variables. Hence, we now make use of the following properties R t∗
of
log-normal random variables : E[AB] = E[A]E[B]e Cov(log A,log B) ˙
where log A = t0 dxf(t, x)
RT
and log B = − t∗ f˙(t, x) are the innovations of the logarithms of the variables. Since the in-
novations are for martingales, we see that E[A] = E[B] = 1. The covariance between log A
and log B can be seen Rdirectly from the two point function which is the coefficient of the
t∗ RT ′
Hamiltonian to be −dt t dx t dx σ(t, x)σ(t, x′ ) Hence, the covariance over a finite time is
− tt0∗ dt tt∗ dx tT∗ dx′ σ(t, x)σ(t, x′ ). Hence, the expression reduces to
R R R

F (t0 , t∗ , T )eCov(log A,log B) = F (t0 , t∗ , T )eΩF (5.66)


the first term coming from evaluating the expectation assuming no noise. Note the crucial
change in numeraire without which the calculation could not have been carried out in such a
simple manner (the explanation has been long but with a bit of experience the above reasoning
should be quite rapid).

5.3.7 Option pricing in the HJM model

We will now show how to price an European option on a zero coupon bond in the HJM model.
We first present the solution in [50] and show a simpler way to arrive at the same result.
Let us assume that the terminal value of the option is given by g(P (t∗, T )) and let us denote
the value of the option at time t as o(t). Since the discounted price of the option should be a
martingale in the risk neutral measure, we see that the option price is given by
h R t∗ i
− dtf (t,t)
o(t0 ) = Et0 e t0 g(P (t∗, T )) (5.67)
To evaluate this, we write it as
 R  Z T 
− tt∗ dtf (t,t)

G
o(t0 ) = Et0 e 0 g F (t0 , t∗ , T )e δ dx(f (t∗ , x) − f (t0 , x)) − G
t∗
Z ∞  R Z T 
− tt∗ dtf (t,t)
 
G
= dGg F (t0 , t∗ , T )e Et0 e 0 δ dx(f (t∗ , x) − f (t0 , x)) − G
0 t∗
(5.68)
Sec. 5.4 The field theory model 67

Hence, it is sufficient if we find the expectation of the discounted indicator function (as is
always the case).3 Using the identity
Z ∞
1
δ(z) = dpeipx (5.69)
2π −∞
and (5.4), we obtain the following expression for this expectation
Z ∞
1 R R Z R KR
− ∆0 α(t,x)+ip R α(t,x) − ∆0 σi (t,x)Wi (t)+ip ∑i R σi (t,x)ηi (t)
dpe Dηe eS0 (5.70)
2π −∞
where the domains ∆0 and R are the unshaded triangle and the shaded rectangle in figure 5.2
respectively. This can be evaluated using the generating functional and performing the Gaussian
integration to give s 2
q2

1 1
exp − 2 G − (5.71)
2πq 2 2q 2
where 2
K Z t∗ Z T
2
q = ∑ dt dxσi (t, x) (5.72)
i=1 t0 t∗

We see that this derivation is again much simpler when done using the risk neutral measure
with P (t, t∗ ) as the numeraire. In this case, the option price is given by
Q
P (t0 , t∗ )Et0 [f (P (t∗, T ))] (5.73)
where f (x) = max(0, S − K). To evaluate this, we need to know the probability distribu-
tion of P (t∗ , T ) in this measure. We note that the distribution is log-normal. The inno-
vation in the logarithm of the discounted bond price − tT∗ f (t, x) has instantaneous variance
R
RT
dt ∑K ′ ′ 2
i=1 t∗ dxdx σi (t, x)σi (t, x ) giving a total variance of q . Further, the mean of the value of
the discounted bond P (t∗ , T ) is F (t0 , t∗ , T ) due to the martingale condition
P (t0 , T )
F (t0 , t∗ , T ) = = E Q [P (t∗, T )] (5.74)
P (t0 , t∗ )
2
Therefore, the mean of the logarithm of the bond price is given by ln F (t0 , t∗ , T ) − q2 as the
mean of the exponential of a normally distributed random variable X ∼ N(0, σ 2 ) is given by
2
eσ /2 . Hence, we now have the full distribution of the bond price with which the expectation
can be evaluated to give the option price
P (t0 , T )N(d1 ) − KP (t0 , t1 )N(d2 ) (5.75)
where
  2
F (t0 ,t1 ,T )
ln K + q2
d1 = (5.76)
q
d2 = d1 − q (5.77)
It is useful to realise that only the distribution of the forward price of the bond enters into the
calculation due to the change in numeraire. The distribution of the money market account is
not relevant in this measure.
3 An indicator function is one whose value is one if some condition is satisfied and zero otherwise. In the con-
tinuum case, this becomes a delta function. Hence, the indicator function we are talking about is the expectation
in the integral.
Sec. 5.4 The field theory model 68

5.4 The field theory model

We now review the field theory model presented in [69]. Baaquie proposed that the white noise
processes in (5.4) be replaced by a field A(t, x). This makes the fundamental definition
Z t K Z t
f (t, x) = f (t0 , x) + dt′ α(t′ , x) + ∑ dt′ σi (t′ , x)Ai (t′ , x) (5.78)
t0 i=1 t0

or
K
∂f (t, x)
= α(t, x) + ∑ σi (t, x)A(t, x) (5.79)
∂t i=1
The main extension to HJM is that A depends on x as well as t unlike W which only depends
on t.
While we can put in many fields Ai , we will see that the extra generality brought into the
process due to the extra argument x will make one field sufficient. Hence, in future, we will
drop the subscript for A.
Baaquie further proposed that the field A has the action functional
2 !
1 ∞
Z t+TF R 
1 ∂A
Z
S=− dt dx A2 + 2 (5.80)
2 t0 t µ ∂x

with Neumann boundary conditions imposed at x = t and x = t + TF R . This makes the action
equivalent (after an integration by parts where the surface term vanishes) to

1 ∞ 1 ∂2
Z Z t+TF R  
S =− dt dxA(t, x) 1 − 2 2 A(t, x)
2 t0 t µ ∂x
Z ∞ Z t+TF R (5.81)
1 −1 ′ ′
=− dt dxA(t, x)D (x − t, x − t)A(t, x )
2 t0 t

This action has the partition function


 Z t Z t+T 
1 1 FR
′ ′ ′
Z[j] = exp dt dxdx j(t, x)D(x − t, x − t)j(t, x ) (5.82)
2 0 t

from which we can calculate expectations and correlations. Note that due to the boundary
conditions imposed, the differential operator D −1 actually depends only the difference x − t.
The above action represents a Gaussian random field with covariance structure D. We also note
that the differential operator used is incidental to the formulation and can be changed to suit
empirical results. For the particular differential operator considered in [69], the inverse with
the Neumann boundary conditions is given by
cosh µ (TF R − |θ − θ′ |) + cosh µ (TF R − (θ + θ′ ))
D(θ, θ′ ; TF R ) = µ
2 sinh µTF R (5.83)
= D(θ , θ; TF R ) : Symmetric Function of θ, θ′

where θ = x − t and θ′ = x′ − t. In [69], a different form was found as the boundary conditions
used were Dirichlet with the endpoints integrated over. This boundary condition is in fact equiv-
alent to the Neumann condition which leads to the much simpler propagator above. In the limit
Sec. 5.4 The field theory model 69

TF R → ∞ which we will usually take, the propagator takes the simple form µe−µθ> cosh µθ<
where θ> and θ< stand for max(θ, θ′ ) and min(θ, θ′ ) respectively.
When µ → 0, this model should converge to the HJM model. This is indeed seen to be the case as lim_{µ→0} D(θ, θ′; T_FR) = 1/T_FR. The extra factor of T_FR is irrelevant as it is due to the freedom we have in scaling σ and D; the σ we use for the different models are only comparable after D is normalized. On normalization, the propagator for both the HJM model and the field theory model in the limit µ → 0 is one, showing that the two models are equivalent in this limit.
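The following minimal Python sketch (not part of the thesis) evaluates the T_FR → ∞ propagator µe^{−µθ>} cosh µθ< and its normalized correlation, and checks numerically that the correlation tends to one (the HJM limit) as µ → 0; the maturities used are arbitrary.

import numpy as np

def propagator(theta, theta_p, mu):
    lo, hi = min(theta, theta_p), max(theta, theta_p)
    return mu * np.exp(-mu * hi) * np.cosh(mu * lo)

def correlation(theta, theta_p, mu):
    d = propagator(theta, theta_p, mu)
    return d / np.sqrt(propagator(theta, theta, mu) * propagator(theta_p, theta_p, mu))

for mu in (1.0, 0.1, 0.001):
    # the correlation approaches 1 as mu -> 0, recovering the one factor HJM limit
    print(mu, correlation(2.0, 5.0, mu))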
It is sufficient for D to be continuous in θ and θ′ and σ to be continuous in θ for the continuity of the forward rate curve to be preserved. To see this, let us first consider two Gaussian white noise random variables A and B having correlation 1 − O(ε). We can then represent B = ρA + √(1 − ρ²) B₁, where B₁ is independent of A. Putting in ρ = 1 − O(ε) and noting that A, B ∼ O(√δ) where δ is the time step, we see that A − B is of order O(√(εδ)). We now apply this to two neighbouring points f(t, θ) and f(t, θ + ε). If the initial curve is continuous, their difference goes to zero as ε goes to zero. Over a time step δ, the changes of these two neighbouring forward rates differ only by O(√(εδ)), which still goes to zero as ε goes to zero. Hence, we see that if the forward rate curve is continuous, it will remain continuous (but not differentiable).
We now analyse this model from our viewpoint presented in the first three chapters. Since
the covariance structure of the innovations in the field is given by D, we see that the Hamilto-
nian is given by

$$H(t) = \frac{1}{2}\int_0^{\infty} d\theta\, d\theta'\,\sigma(t,\theta)D(\theta,\theta')\sigma(t,\theta')\,\frac{\delta^2}{\delta f(t,\theta)\,\delta f(t,\theta')} + \int_0^{\infty} d\theta\,\alpha(t,\theta)\,\frac{\delta}{\delta f(t,\theta)} \qquad (5.84)$$
This can also be derived using the Lagrangian and (5.48). It must be kept in mind that this Hamiltonian propagates the states in the ∂/∂t − ∂/∂θ direction (that is, the ∂/∂t direction in the variables t, x).⁴

In the Fourier transformed basis, the Hamiltonian is given by


$$H(t) = -\frac{1}{2}\int_0^{\infty}\!\int_0^{\infty} d\theta\, d\theta'\,\sigma(t,\theta)D(\theta,\theta')\sigma(t,\theta')\,p(\theta)p(\theta') + i\int_0^{\infty} d\theta\,\alpha(t,\theta)\,p(\theta) \qquad (5.85)$$

From this point on, we assume that σ is only a function of θ and express all the variables in
terms of θ. This is true for all the functions σ usually used for HJM or for the field theory. In
this variable, the fundamental definition of the model becomes
$$\frac{\partial f(t,\theta)}{\partial t} = \frac{\partial f(t,\theta)}{\partial \theta} + \alpha(t,\theta) + \sigma(\theta)\,A(t,\theta) \qquad (5.86)$$
Note that in [70], the first term is absorbed into the definition of α.
Using the fact that the discounted bonds must be annihilated by the above Hamiltonian,
we see that the condition for the equivalent martingale measure with the money market as
4
I am using standard notation from differential geometry where the vectors are expressed in terms of differen-
tial operators. This notation is particularly useful for this kind of problem. In more usual notation, this Hamiltonian
is evolving states in the (e_t, −e_θ) direction, which is the same as the e_t direction when the basis vectors are e_t and e_x. Further note that this change of variables does not change the Lagrangian as the transformed volume form
does not change since dt ∧ dθ = dt ∧ (dx − dt) = dt ∧ dx.

numeraire is
$$\int_0^{T-t} d\theta\,\alpha(t,\theta) = \frac{1}{2}\int_0^{T-t} d\theta\int_0^{T-t} d\theta'\,\sigma(\theta)D(\theta,\theta')\sigma(\theta') \qquad (5.87)$$
Again this can be derived in both the f basis and the p basis as in the HJM case. Differentiating
with respect to T , we get the martingale condition
$$\alpha(t,T-t) = \sigma(T-t)\int_0^{T-t} d\theta\, D(T-t,\theta)\,\sigma(\theta) \qquad (5.88)$$

showing that in the risk-neutral measure, α depends only on θ. Hence, we can write the above
as
$$\alpha(\theta) = \sigma(\theta)\int_0^{\theta} d\theta'\, D(\theta,\theta')\,\sigma(\theta') \qquad (5.89)$$
We note that this is an appealing structure for α as it allows a σ which does not fall to zero as θ → 0, provided the covariance falls off fast enough. This is not possible in HJM (see [72] or [73]), which is a bit of a problem for that theory since there is no empirical evidence for such a falloff.
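As an illustration of the martingale condition (5.89), the sketch below computes the risk-neutral drift α(θ) by simple quadrature. The volatility function σ(θ) chosen here is purely hypothetical and the constant rigidity propagator is used; neither is the fit obtained later from market data.

import numpy as np

def D(theta, theta_p, mu=0.06):
    # constant rigidity propagator in the T_FR -> infinity limit
    lo, hi = np.minimum(theta, theta_p), np.maximum(theta, theta_p)
    return mu * np.exp(-mu * hi) * np.cosh(mu * lo)

def sigma(theta):
    return 0.01 * np.exp(-0.1 * np.asarray(theta))   # illustrative volatility function

def alpha(theta, n=200):
    grid = np.linspace(0.0, theta, n)
    return sigma(theta) * np.trapz(D(theta, grid) * sigma(grid), grid)   # eq. (5.89)

print(alpha(5.0))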
If we use the bond P (t, t1 ) as the numeraire, the martingale condition will give the result
$$\alpha(\theta) = \sigma(\theta)\int_{t_1-t}^{\theta} d\theta'\, D(\theta,\theta')\,\sigma(\theta') \qquad (5.90)$$

for the equivalent Hamiltonian with that numeraire.


Using the same arguments as in the previous section for the HJM model, we can easily see
that the futures price for the field theory model is given by

$$\mathcal{F}(t_0,t_*,T) = F(t_0,t_*,T)\,e^{\Omega_F} \qquad (5.91)$$

where

$$\Omega_F = \int_{t_0}^{t_*} dt \int_0^{t_*-t} d\theta\,\sigma(\theta)\int_{t_*-t}^{T-t} d\theta'\, D(\theta,\theta')\,\sigma(\theta') \qquad (5.92)$$

is the covariance between $\int_{t_0}^{t_*} dt\int_{t}^{t_*} dx\,\dot f(t,x)$ and $\int_{t_0}^{t_*} dt\int_{t_*}^{T} dx\,\dot f(t,x)$ (compare with (5.62)), and the probability density function for the logarithm of the discounted zero coupon bond price is given by (5.71) with

$$q^2 = \int_{t_0}^{t_*} dt \int_{t_*-t}^{T-t} d\theta\, d\theta'\,\sigma(\theta)D(\theta,\theta')\sigma(\theta') \qquad (5.93)$$

where $q^2$ is the variance of the innovations of the logarithm of the discounted bond, $\int_{t_0}^{t_*} dt\int_{t_*}^{T} dx\,\dot f(t,x)$, and hence the option price is given by (5.75) with the above value for $q^2$.
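The quantities Ω_F in (5.92) and q² in (5.93) are multiple integrals of σ and D and are easily evaluated numerically. The sketch below does this with trapezoidal quadrature for user-supplied callables sigma and D (for instance those of the previous sketch); it illustrates the structure of the integrals and is not the thesis's own code.

import numpy as np

def q_squared(t0, t_star, T, sigma, D, nt=40, nth=60):
    # q^2 = int_{t0}^{t*} dt int_{t*-t}^{T-t} dtheta dtheta' sigma D sigma, eq. (5.93)
    ts = np.linspace(t0, t_star, nt)
    vals = []
    for t in ts:
        th = np.linspace(t_star - t, T - t, nth)
        s = sigma(th)
        inner = s[:, None] * D(th[:, None], th[None, :]) * s[None, :]
        vals.append(np.trapz(np.trapz(inner, th, axis=1), th))
    return np.trapz(vals, ts)

def omega_F(t0, t_star, T, sigma, D, nt=40, nth=60):
    # Omega_F couples the money market leg [0, t*-t] to the bond leg [t*-t, T-t], eq. (5.92)
    ts = np.linspace(t0, t_star, nt)
    vals = []
    for t in ts:
        th1 = np.linspace(0.0, t_star - t, nth)
        th2 = np.linspace(t_star - t, T - t, nth)
        inner = sigma(th1)[:, None] * D(th1[:, None], th2[None, :]) * sigma(th2)[None, :]
        vals.append(np.trapz(np.trapz(inner, th2, axis=1), th1))
    return np.trapz(vals, ts)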

5.5 The Santa-Clara and Sornette model

We now quickly review the Santa-Clara and Sornette model presented in [70]. In this paper,
the stochastic discount factor is not assumed to just be the inverse of the money market account
but includes a market price of risk which cannot be ruled out. This modification is trivial to
incorporate into the models and we will not do so explicitly.

The Santa-Clara and Sornette model models the forward rates as


$$f(t,\theta) = f(t_0,\theta) + \int_{t_0}^{t} dt'\,\alpha(t',\theta) + \int_{t_0}^{t}\sigma(t',\theta)\,d_{t'}Z(t',\theta) \qquad (5.94)$$

where dt′ means that only the difference in the t direction is taken for the random field Z.
Several conditions are then imposed on Z to ensure that dt Z(t, θ) describes Gaussian white
noise with some positive definite symmetric covariance structure in x and decorrelated in t.
This covariance structure is determined by a stochastic partial differential equation with at least
one partial derivative in t and θ.
The condition of decorrelation in t effectively means that the random field Z can be modelled as
$$Z(t,\theta) = Z(t_0,\theta) + \int_{t_0}^{t} dt' \int_0^{\infty} d\theta'\,\sqrt{D}(\theta,\theta')\,\eta(t',\theta') \qquad (5.95)$$

where √D(θ, θ′) is the Green's function of the partial differential equation used to model Z. We will see in a moment why we have decided to call the Green's function √D.
Note that (5.94) is very similar to (5.78) if d_{t′}Z(t′, x) is replaced by A(t′, x) and if α includes the ∂f/∂θ term, which takes into account the fact that the Baaquie process links forward rates at constant x while the Santa-Clara and Sornette model links forward rates at constant θ. Because of this modification of α, the difference is only superficial.

Now, from (5.95), we see that d_t Z(t, θ) is given by ∫₀^∞ dθ″ √D(θ, θ″) η(t, θ″). Hence, the covariance between d_t Z(t, θ) and d_t Z(t, θ′) is given by ∫ dθ″ √D(θ, θ″) √D(θ′, θ″) = D(θ, θ′).

It should be clear now why we used the expression √D for the Green's function since it is, in some sense, the square root of the covariance structure. It is quite well known from functional analysis that such a function can be found given positive definite and symmetric D if the domain of integration above is restricted to a finite limit or if D dies off fast enough as θ, θ′ → ∞. Since the model only considers these cases, we see that Baaquie's model and the Santa-Clara and
Sornette model are equivalent once we consider the covariance structure D. They both describe
random Gaussian fields with the given covariance structure and are just two different ways of
specifying the same mathematics.
Hence, we can use the results for futures, option and other derivative pricing in Baaquie’s
model in the Santa-Clara and Sornette model as well. This is very useful as no analytical results
for derivative pricing are presented in Santa-Clara and Sornette[70] which only recommends
simulation as the preferred method for solving for the price of contingent claims.
Chapter 6

Comparison of the Models with Market Data

6.1 The Market Data used for the Study

We used the Eurodollar futures data for the following study. A Eurodollar futures contract
represents a deposit of US$1,000,000 for three months at some time in the future. Currently,
futures contracts for deposits up to ten years into the future are actively traded. Significant
historical data for contracts on deposits up to seven years into the future are available. If one
makes the reasonable approximation that f (t, θ) is linear for θ between contract times, one can
use this data as a direct measure of the forward rates. Further, the straightforward simplification
that the Eurodollar futures prices directly reflect the forward rate was made, an assumption
previously used in the literature [21]. We also attempted to analyse Treasury bond tick data
from the GovPx database but we found it impossible to obtain forward rates accurate enough
for our purposes. The main reason for this is that while we were able to obtain reasonably
accurate yields for a few maturities, the differentiation required to get the forward rates from
the yields introduced too many inaccuracies. This is somewhat unfortunate since Treasury
bonds represent risk free instruments while a small credit risk exists for Eurodollar deposits.
For the following analysis, we used the closing prices for the Eurodollar futures contracts
in the 1990s. This is exactly the same data as used by Bouchaud [21] as well as [74] and we
thank Science and Finance for kindly providing us with the data. In Bouchaud [21], the spread
of the forward rates and the eigenfunctions of its changes in time are analyzed. For our purposes,
we found it more useful to look at the scaled multivariate cumulants of the changes in forward
rates for different maturity times.

6.2 Assumptions behind the tests of the models

The main assumption that has to be made for all the tests of the models is that of time translation
invariance. In other words, we have to assume that σ(t, θ) is actually only dependent on θ and
not explicitly on t. We also assume that the propagator D(θ, θ′ ) has no explicit time dependence
which is possible in principle. It is reasonable and conceptually economical to assume that
different times in the future are equivalent. Further, carrying out any meaningful analysis while
these quantities are subject to changes in time is impossible.


Figure 6.1: The lines of constant θ for which we have obtained the forward rates by linear
interpolation from the actual forward rates which are specified at constant x.

Another important assumption that has to be made is that the forward rate curve is reason-
ably smooth at small intervals at any given point in time. This assumption is very difficult to test
in any meaningful sense given the relative paucity of data as forward rate data is available only
at 3 month intervals (which is what necessitates this assumption in the first place). However, the
assumption is a reasonable one to make as one would intuitively expect that the forward rate,
say three years into the future would not be too different from that three years and one month
into the future. In fact, we will show later that there seems to be strong evidence of very long
term correlations in the movements of the forward rate. This seems to make the smoothness
assumption reasonable as nearby forward rates tend to move together (except possibly at points
very close to the current time). This assumption is required as the forward rate data is provided
for constant maturity which we have been denoting by x while we want data for constant θ, as
shown in figure 6.1. With this assumption, we can get the data by simple linear interpolation.
The loss in accuracy due to this linear interpolation is not all that serious if ǫ, the time interval
of t between specifications of the forward rates is small as the random changes which we are
interested in will be much larger than the introduced errors. This same procedure was used in
Matacz and Bouchaud [21] as well as Baaquie and Srikant [74].
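A sketch of the interpolation just described is given below: the futures data provide f(t, x) at a fixed set of contract maturities x_j, and we read off f(t, θ) along lines of constant θ = x − t by linear interpolation. The array names are illustrative and do not correspond to the actual data files used.

import numpy as np

def to_constant_theta(dates, contract_maturities, f_grid, thetas):
    # dates: (T,) observation times t_i; contract_maturities: (M,) fixed x_j;
    # f_grid: (T, M) forward rates f(t_i, x_j); thetas: grid of constant theta values.
    out = np.empty((len(dates), len(thetas)))
    for i, t in enumerate(dates):
        x_wanted = t + np.asarray(thetas)          # points on the lines x = t + theta
        out[i] = np.interp(x_wanted, contract_maturities, f_grid[i])
    return out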

6.3 The Correlation Structure of the Forward Rates

A very interesting quantity to look at in the analysis of forward rates f (t, θ) is the correlation
(or scaled covariance) among their changes for different θ. Specifically we are interested in
the correlation between δf (t, θ) and δf (t, θ′ ), where δf (t, θ) = f (t + ǫ, θ) − f (t, θ). Using a
random field model, this quantity should be equal to
$$C(\theta,\theta') = \frac{\langle\delta f(t,\theta)\,\delta f(t,\theta')\rangle - \langle\delta f(t,\theta)\rangle\langle\delta f(t,\theta')\rangle}{\sqrt{\langle\delta f^2(t,\theta)\rangle - \langle\delta f(t,\theta)\rangle^2}\,\sqrt{\langle\delta f^2(t,\theta')\rangle - \langle\delta f(t,\theta')\rangle^2}} = \frac{D(\theta,\theta')}{\sqrt{D(\theta,\theta)\,D(\theta',\theta')}} \qquad (6.1)$$

To a reasonable degree of accuracy, we can ignore the first order expectations such as hδf (t, θ)i
as they are much smaller than the second order expectations if ǫ is small. For an ǫ of one day, the
error is completely negligible especially given the other approximations. We will do so for the
rest of the chapter. If we have a model for the propagator D(θ, θ′ ), we have a prediction for this


Figure 6.2: The empirically determined function σ(θ).

correlation structure. Alternatively, we can use the correlation structure to fit free parameters
in D(θ, θ′ ). It must be noted that the correlation is independent of σ(θ), so no assumption of
its form has to be made. This is the reason why we used the scaled covariance rather than the
covariance itself to perform the study. It is equivalent to fixing the inherent freedom in the
quantities σ and D¹ to make D(θ, θ) = 1. The reduction in the freedom of σ also allows us to directly estimate it from data since we have σ(θ) = √⟨δf²(t, θ)⟩ if D(θ, θ) = 1. This is
shown in figure 6.2. Further, the correlation between innovations in the forward curve is given
exactly by D. The correlation structure in the market estimated from the Eurodollar futures data
is shown in figure 6.3. The structure is fairly stable in the sense that the correlation structure
for different sections of the data are reasonably similar.
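The estimators used here are straightforward to compute. The sketch below takes the interpolated constant-θ data (days × maturities), forms the daily innovations δf, and returns σ(θ) = √⟨δf²⟩ together with the scaled covariance (6.1); it is only an illustration of the procedure, not the original analysis code.

import numpy as np

def empirical_structure(f_theta):
    df = np.diff(f_theta, axis=0)               # innovations delta f(t, theta)
    sigma = np.sqrt((df ** 2).mean(axis=0))     # sigma(theta), using D(theta, theta) = 1
    C = np.corrcoef(df, rowvar=False)           # correlation matrix, eq. (6.1)
    return sigma, C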
Since the propagator is always symmetric, it will be convenient to calculate only D(θ< , θ> )
for the different models where θ< = min(θ, θ′ ) and θ> = max(θ, θ′ ) for purposes of comparison.
For the one factor HJM model, this correlation structure is constant as all the changes in the
forward rates are perfectly correlated. In other words, D(θ, θ′ ) = 1. For the two factor HJM
model, the predicted correlation structure is given by

$$C(\theta,\theta') = \frac{\sigma_1(\theta)\sigma_1(\theta') + \sigma_2(\theta)\sigma_2(\theta')}{\sqrt{\sigma_1^2(\theta)+\sigma_2^2(\theta)}\,\sqrt{\sigma_1^2(\theta')+\sigma_2^2(\theta')}} = \frac{1 + g(\theta)g(\theta')}{\sqrt{1+g^2(\theta)}\,\sqrt{1+g^2(\theta')}} \qquad (6.2)$$

We see that this correlation structure depends on the function g(θ) = σ₁(θ)/σ₂(θ). Hence, a whole
function has to be fitted from the correlation structure, something which is quite infeasible. The
covariance might be a better quantity to test the two factor HJM model as the prediction of the
covariance has a simpler form

C(θ, θ′ ) = σ1 (θ)σ1 (θ′ ) + σ2 (θ)σ2 (θ′ ) (6.3)

We still need to specify a functional form for σ1 and σ2 as it is not possible to estimate entire
functions from data. The usual specification of σ1 (θ) = σ0 and σ2 (θ) = σ1 e−λθ inspired by the
assumption that the spot rate follows a Markov process is easily seen to be unable to explain
1 since we can always make the transformation σ(θ) ∼ η(θ)σ(θ) and D(θ, θ′ ) ∼ D(θ, θ′ )/(η(θ)η(θ′ ))


Figure 6.3: The correlation structure observed in the market.

many features of the covariance in figure 6.4 such as the peak at one year or the sharp reduction
in the covariance as the maturity goes to zero. We can straightaway conclude that the one factor
HJM model is insufficient to characterize the data while the two factor HJM model provides us
with too much freedom as we can put in an entire arbitrary function to explain the correlation
structure. If we try to reduce the freedom by theoretical considerations, we are again unable to
explain the data.
In the case of the field theory model, we have explicit predictions of the correlation structure
for the innovations in forward rates. We will see that the field theory model, while explaining
some features of the correlation, does not predict the correlation very well. Hence, we consider
generalizations to the model, one of them proposed by Baaquie [50] and the others proposed
by us in this chapter based on the empirical data.

6.4 Analysis of the Field Theory Model with Constant Rigidity

We have analysed this model in detail in the previous chapter. We have seen that the model
describes the innovations in the forward rates in terms of a Gaussian random field A whose
structure is defined by the action in (5.80). For convenience, we repeat the action below in
terms of the variables t and θ = x − t
$$S = -\frac{1}{2}\int_{t_0}^{t_1} dt \int_0^{\infty} d\theta\left[A^2 + \frac{1}{\mu^2}\left(\frac{\partial A}{\partial\theta}\right)^2\right] \qquad (6.4)$$

To obtain the predicted correlation structure from the propagator (5.83), we have to take the
limit TF R → ∞. In this limit, the propagator becomes
$$D(\theta,\theta') = \mu\, e^{-\mu\theta_>}\cosh\mu\theta_< = \frac{\mu}{2}\left(e^{-\mu|\theta-\theta'|} + e^{-\mu(\theta+\theta')}\right) \qquad (6.5)$$


Figure 6.4: The covariance of innovations of forward rates observed in the market

The predicted correlation structure for this model can be found from this form of the propagator
by normalization and is given by
$$C(\theta,\theta') = \frac{D(\theta,\theta')}{\sqrt{D(\theta,\theta)\,D(\theta',\theta')}} = \sqrt{\frac{e^{-\mu\theta_>}\cosh\mu\theta_<}{e^{-\mu\theta_<}\cosh\mu\theta_>}} \qquad (6.6)$$

when the limit TF R → ∞ is taken. To estimate the parameter µ from market data, we use
the Levenberg-Marquardt method from Press et al [75] to fit the parameters to the observed
correlation structure graphed in figure 6.3. The fitting was done by minimizing the square of
the error. The overall correlation was fitted by µ = 0.061/year. To obtain the error bounds, the
data was split into 346 data sets of 500 contiguous days of data each and the estimation done
for each of the sets. The 90% confidence interval for this data set is (0.057, 0.075). Note that
the confidence interval is asymmetric about the overall best fit due to the nonlinear dependence of the correlation (6.6) on µ. The root mean square error in the correlation for the best fit value is 4.23%, which shows that the model's prediction for the correlation structure is not very good.
The main problem as can be seen from a comparison between the prediction for the best fit µ
in figure 6.5 and the actual correlation structure in figure 6.3 is that the prediction is largely
independent of the actual value of θ and largely determined by |θ − θ′ | which is not the case in
reality. The correlation rapidly increases as θ increases in reality.
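A sketch of such a fit is shown below, using scipy's least_squares in place of the Levenberg-Marquardt routine of Press et al; C_emp is the empirical correlation matrix on a grid of θ values, as estimated above. This illustrates the fitting procedure rather than reproducing the code actually used.

import numpy as np
from scipy.optimize import least_squares

def model_corr(thetas, mu):
    # constant rigidity correlation, eq. (6.6), evaluated on a grid
    lo = np.minimum.outer(thetas, thetas)
    hi = np.maximum.outer(thetas, thetas)
    return np.sqrt(np.exp(-mu * hi) * np.cosh(mu * lo) /
                   (np.exp(-mu * lo) * np.cosh(mu * hi)))

def fit_mu(thetas, C_emp, mu0=0.1):
    resid = lambda p: (model_corr(thetas, p[0]) - C_emp).ravel()
    return least_squares(resid, x0=[mu0]).x[0]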
One clear fact we notice from the covariance of the innovations in the forward rates in figure
6.4 is that the covariance falls rapidly as θ → 0. This observation inspired Baaquie to consider a
model where A(t, 0) is constrained in [50]. A(t, 0) is constrained to follow a normal distribution
with variance a. The mean of A(t, 0) can be fixed at any value but will cause a corresponding
change in α(0) which makes the mean value irrelevant. For calculational purposes it is easiest
to assume that it remains at zero. This constraint can be implemented by modification of the
action to
$$S_{\text{constrained}} = \int_{-\infty}^{\infty} d\xi\, S\, e^{i\xi A(t,0)}\, e^{-a^2\xi^2/2} \qquad (6.7)$$


Figure 6.5: Fitted correlation for constant rigidity field theory model

Figure 6.6: Fitted correlation for constrained field theory model

where S is the action specified in (6.4). The propagator D(θ, θ′ ) for this model is given by
$$D(\theta,\theta') = \mu\, e^{-\mu\theta_>}\left(\cosh\mu\theta_< - \frac{\mu\, e^{-\mu\theta_<}}{\mu+a}\right) \qquad (6.8)$$
After normalizing, we see that the prediction for the correlation structure is given by
$$C(\theta,\theta') = \sqrt{\frac{e^{-\mu\theta_>}\left(\cosh\mu\theta_< - \dfrac{\mu\, e^{-\mu\theta_<}}{\mu+a}\right)}{e^{-\mu\theta_<}\left(\cosh\mu\theta_> - \dfrac{\mu\, e^{-\mu\theta_>}}{\mu+a}\right)}} \qquad (6.9)$$

We can see that the free parameters are µ and a. Further, it will be seen that it is easier to consider the ratio a/µ², as it is dimensionless. The results of the Levenberg-Marquardt method showed that the fitted values of µ and a were very small, of the order of 10⁻⁷/year for µ and 10⁻¹³/year² for a, both being very unstable, but the ratio a/µ² was stable, with a value in the range (6.7, 10.7) and an overall best fit of 9.4. The most reasonable explanation for this behaviour is that the ratio a/µ² determines the behaviour of (6.9) for small µ, and it is this region of parameter space that gives a correlation structure closest to the empirically observed
one. We see from the fitted propagator in figure 6.6 that the behaviour at large θ is slightly better
when the constraint is put in. The root mean square error was 3.35% which again means the
fit was not very good though significantly better than if the constraint was not applied. It must
be recognized that the constraint introduces one extra free parameter which should improve
the best fit. Hence, we see that this model, while again performing better than HJM, is still
not very accurate. While the results are not very good, they do represent a reasonable first
approximation and are still significantly better than the one factor HJM model.

6.5 Field Theory Model with Rigidity µ(θ)

Another way to get a correlation structure that depends directly on the values of θ and θ′ in
a significant way and not only on their difference is to make µ a function of θ. This has a
direct physical meaning as it means that if we imagine the forward rate curve as a string, its
rigidity is increasing as maturity increases, making the A for larger θ more strongly correlated if µ decreases as a function of θ. We choose the function µ(θ) = µ₀/(1 + λθ) as it declines to zero as θ becomes large (as is expected from the observed covariance in figure 6.4), contains the constant µ case as a limit, and is solvable. The action is given by
$$S = -\frac{1}{2}\int_{t_0}^{t_1} dt \int_0^{\infty} d\theta\left[A^2 + \left(\frac{1+\lambda\theta}{\mu_0}\,\frac{\partial A}{\partial\theta}\right)^2\right] \qquad (6.10)$$

This is still a quadratic action and can be put into the form of (5.81) by performing an integration by parts and setting the boundary term to zero, since we are assuming Neumann boundary conditions. The inverse (Green's function) of the quadratic operator, that is, the propagator for this
action is found to be
$$D(\theta,\theta';T_{FR}) = \frac{\mu_0^2\,(\alpha-1/2)}{2\lambda\alpha(\alpha+1/2)\left(1-(1+\lambda T_{FR})^{-2\alpha}\right)}\left[\frac{\alpha+1/2}{\alpha-1/2}(1+\lambda T_{FR})^{-2\alpha}(1+\lambda\theta_>)^{\alpha-1/2} + (1+\lambda\theta_>)^{-\alpha-1/2}\right]\left[\frac{\alpha+1/2}{\alpha-1/2}(1+\lambda\theta_<)^{\alpha-1/2} + (1+\lambda\theta_<)^{-\alpha-1/2}\right] \qquad (6.11)$$

where $\alpha = \sqrt{\tfrac{1}{4} + \tfrac{\mu_0^2}{\lambda^2}}$ and where we have put the bound on the θ variable, T_FR, explicitly. The
reason for this is that the limits have to be taken carefully in order to compare this model to the
HJM in the limit µ0 → 0 and to the constant rigidity field theory model when λ → 0.
Let us first consider the limit λ → 0. First, we note
$$\alpha = \sqrt{\frac{1}{4} + \frac{\mu_0^2}{\lambda^2}} \sim \frac{\mu_0}{\lambda}\sqrt{1 + \frac{\lambda^2}{4\mu_0^2}} \sim \frac{\mu_0}{\lambda} \qquad (6.12)$$

Therefore, we have
$$(1+\lambda\theta)^{-\alpha-1/2} = \left((1+\lambda\theta)^{1/\lambda}\right)^{-\mu_0}(1+\lambda\theta)^{-1/2} \sim e^{-\mu_0\theta} \qquad (6.13)$$

Similarly $(1+\lambda\theta)^{\alpha-1/2} \sim e^{\mu_0\theta}$, $(1+\lambda\theta)^{-\alpha-1/2} \sim e^{-\mu_0\theta}$ and $(1+\lambda T_{FR})^{-2\alpha} \sim e^{-2\mu_0 T_{FR}}$.


Putting all these limits into (6.11) and performing some straightforward simplifications, we
see that (6.11) becomes equal (5.83) in the limit λ → 0. In the taking of this limit, we did not
have any trouble with TF R . However, for the HJM limit, we will see that the limit TF R → ∞
has to be taken only after the limit µ0 → 0 has been taken.
Let us now consider the limit µ₀ → 0. In this limit α ∼ 1/2 + µ₀²/λ². Hence, only one term in

(6.11) survives as all the others are multiplied by α − 1/2. This surviving term can be evaluated as

$$\frac{\mu_0^2}{2\lambda}\,\frac{(\alpha+1/2)^2}{\alpha-1/2}\,\frac{(1+\lambda T_{FR})^{-1}}{1-(1+\lambda T_{FR})^{-1}} = \frac{\mu_0^2}{2\lambda}\times\frac{2\lambda^2}{\mu_0^2}\times\frac{1+\lambda T_{FR}}{\lambda T_{FR}}\times\frac{1}{1+\lambda T_{FR}} = \frac{1}{T_{FR}} \qquad (6.14)$$

The terms (1 + λθ> )α−1/2 and (1 + λθ< )α−1/2 obviously go to one in this limit and so were
not included in the calculation above. This result can be seen to be equivalent to the HJM
propagator after normalization. If the limit TF R → ∞ is taken first, then the propagator becomes

$$D(\theta,\theta') = \frac{\mu_0^2(\alpha-1/2)}{2\lambda\alpha(\alpha+1/2)}\,(1+\lambda\theta_>)^{-\alpha-1/2}\left[\frac{\alpha+1/2}{\alpha-1/2}(1+\lambda\theta_<)^{\alpha-1/2} + (1+\lambda\theta_<)^{-\alpha-1/2}\right] \qquad (6.15)$$
which exhibits a θ dependence in the limit µ0 → 0. Hence, this cannot be made equivalent to
HJM if the limits are taken in the wrong order. This problem is not present in the constant
rigidity model.
For comparison with market data, we still take the limit TF R → ∞ as the model is then still
directly related to the field theory model. The predicted correlation structure for this model is
then given by
$$C(\theta,\theta') = \left(\frac{(\alpha+1/2)(1+\lambda\theta_<)^{2\alpha} + \alpha - 1/2}{(\alpha+1/2)(1+\lambda\theta_>)^{2\alpha} + \alpha - 1/2}\right)^{1/2} \qquad (6.16)$$


Figure 6.7: Fitted correlation structure for the non-constant rigidity model.

We fitted the parameters µ0 and λ to the correlation structure observed in the market in a
similar manner as for the field theory model and obtained the results µ0 = 1.2 × 10−5 /year and
λ = 0.108/year. The root mean square error in the correlation was 3.35%. On performing the
error analysis for the parameters, it is found that µ0 is very unstable but always very small (less
than 10−2 /year) while the 90% confidence interval for λ is (0.099, 0.149). The relatively high
value for λ seems to show that the falloff of the rigidity parameter µ(θ) = µ₀/(1 + λθ) is fairly rapid.

The error is reduced from 4.23% to 3.35% but an extra parameter has had to be added and
the model has become considerably more complicated due to the freedom of the form of the
rigidity parameter µ. Further, we seem to be in the region of very small µ0 which does not
behave well in the HJM limit. In fact, the correlation structure in this limit is given by
$$C(\theta,\theta') = \sqrt{\frac{1+\lambda\theta_<}{1+\lambda\theta_>}} \qquad (6.17)$$

Due to the very small value of µ0 for the fitted function, this is a very good approximation for
the fit. The obtained fit for the correlation function can be seen in figure 6.7.
The limited improvement, the relatively complicated form of the correlation and the near
zero µ0 problem prompted us to consider a different way of approaching the problem which
presented a much more satisfactory solution. This model is described in the next section.

6.6 Field Theory Model with f (t, z(θ))

To see where we might make an improvement, we notice that the predicted correlation structure
with the field theory model is largely defined by the e−µ|θ> −θ< | term which means that the
correlation does not depend explicitly on the times θ> and θ< 2 . However, we see immediately
from figure 6.3 that the correlation increases significantly as we increase θ> and θ< . This is
intuitively reasonable as market participants are likely to treat the difference between ten and
fifteen years into the future quite differently from the difference between now and five years.
Far out into the future, we would expect all times to be equivalent. In other words, there is
good reason to expect lim_{θ< → ∞} D(θ>, θ<) = 1.³ This is not satisfied by the constant rigidity
models or by the varying rigidity model (if the limit TF R → ∞ is taken). For the latter model,
this is slightly surprising since µ → 0 as θ → ∞ and we might expect that for large θ the varying
rigidity model should go into the HJM model limit (D = 1). However, this does not happen as
previously discussed since we have taken the limit TF R → ∞.
Further, the relatively marginal reduction of the error shows that varying the rigidity pa-
rameter does not quite reflect the data. An alternative way to consider the problem would be to
use the observed correlation structure to induce a metric onto the θ direction. In some sense,
this metric would be measuring the “psychological distance” in the investor’s minds which
corresponds to a certain separation in maturity time. To make this concrete, let us write the

observed correlation as D(θ, θ′) = e^{−s(θ,θ′)}. Since D(θ, θ) = 1, s(θ, θ) = 0 and s is symmetric
as well. If we can show the triangle property (which in the case of one dimension reduces to
the straightforward condition that s(θ1 , θ3 ) = s(θ1 , θ2 ) + s(θ2, θ3 )), we can see that s makes a
good definition of distance in θ. From the market data, it can be shown that this rule is approximately satisfied and we can use it as an approximate way to induce a metric onto the θ
direction from the observed market data.
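The construction of the distance and the check of the additivity rule can be sketched as follows; C_emp is again the empirical correlation matrix on a θ grid and the code is purely illustrative.

import numpy as np

def distance_matrix(C_emp):
    # psychological distance s(theta, theta') = -ln C(theta, theta')
    return -np.log(np.clip(C_emp, 1e-12, None))

def additivity_error(s, i, j, k):
    # relative violation of s[i,k] = s[i,j] + s[j,k] for grid indices i < j < k
    return abs(s[i, k] - (s[i, j] + s[j, k])) / max(s[i, k], 1e-12)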
It should also be noted that introducing the metric is different from changing the form of
2 There is another term of the form e−µ(θ> +θ< ) but this has only a small effect on the correlation structure
3 Obviously, θ< → ∞ automatically implies θ> → ∞.


Figure 6.8: Observed correlation between forward rates

the rigidity function µ(θ). To see this, we write the action with the rigidity function µ(θ) as
$$S_{\text{old}} = -\frac{1}{2}\int_{t_0}^{t_1} dt \int_0^{\infty} d\theta\left[A^2 + \frac{1}{\mu^2}\left(\frac{\partial A}{\partial z}\right)^2\right] \qquad (6.18)$$

where the functional variation of µ with θ has been absorbed into the variable z = g(θ) (where
g is invertible) so that the µ above is a constant. With a change of variables we get the action
(again Sold since it is only re-expressing the same action in new variables) as
$$S_{\text{old}} = -\frac{1}{2}\int_{t_0}^{t_1} dt \int_{g(0)}^{g(\infty)} dz\, h'(z)\left[A^2 + \frac{1}{\mu^2}\left(\frac{\partial A}{\partial z}\right)^2\right] \qquad (6.19)$$

where h = g⁻¹. We see that this is not a convenient action due to the presence of h′(z) which comes from the transformation of the volume form. With the introduction of the metric, we obtain the action
$$S_{\text{new}} = -\frac{1}{2}\int_{t_0}^{t_1} dt \int_{g(0)}^{g(\infty)} dz\left[A^2 + \frac{1}{\mu^2}\left(\frac{\partial A}{\partial z}\right)^2\right] \qquad (6.20)$$
Note that this action gets rid of the inconvenient term h′ (z). Re-expressing this back in terms
of the θ variable, we see that the new action in the old variables is of the form
$$S_{\text{new}} = -\frac{1}{2}\int_{t_0}^{t_1} dt \int_0^{\infty} d\theta\, g'(\theta)\left[A^2 + \frac{1}{\mu^2 g'^2(\theta)}\left(\frac{\partial A}{\partial\theta}\right)^2\right] \neq S_{\text{old}} \qquad (6.21)$$

We thus see that this action cannot be derived from modifying the form of the rigidity function
µ. Needless to say, the Green’s functions for this operator should be solved using the z variables
for which the solution is trivial (namely D(z, z′) = ½(e^{−µ|z−z′|} + e^{−µ(z+z′)})).
Bearing in mind the condition that, at large θ, the correlations should be close to 1, or
equivalently that the distance should be small, we choose a metric that satisfies this property: g(θ) =
tanh βθ. We use this form of the metric to fit the correlation structure and obtain the result that
µ = 0.48/year and β = 0.32/year with a root mean square error of only 2.46%. Both parameters
are also stable when the error analysis for the parameters is carried out. The 90% confidence
interval for µ is (0.45, 0.58) and that for β is (0.22, 0.33). Hence, we see that even the parameter
estimation for this model is more robust as the parameters are at least stable. Further, the shape
of the fitted function is clearly closer to the observed one as can be seen from figures 6.8,


Figure 6.9: Fitted correlation for the model with metric g(θ) = tanh βθ

6.5, 6.6, 6.7 and 6.9. The error that remains is largely confined to the correlation between the
spot rate and other forward rates which is not too surprising since the spot rate behaves very
differently from the other forward rates.
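For reference, the metric-model correlation used in this fit can be evaluated as in the sketch below: the constant rigidity correlation is simply applied in the transformed variable z = tanh βθ, with the fitted values µ ≈ 0.48/year and β ≈ 0.32/year quoted above; the maturities in the example are arbitrary.

import numpy as np

def metric_corr(theta, theta_p, mu=0.48, beta=0.32):
    z, zp = np.tanh(beta * theta), np.tanh(beta * theta_p)
    d = lambda a, b: np.exp(-mu * max(a, b)) * np.cosh(mu * min(a, b))
    return d(z, zp) / np.sqrt(d(z, z) * d(zp, zp))

# correlations remain high at large maturities, as required by the data
print(metric_corr(2.0, 7.0), metric_corr(10.0, 20.0))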
We emphasize here that this involves a fundamentally new way of thinking of the interest
rate models. So far, we have made models which generalized HJM so as to achieve a theory
without too little freedom as in the one factor HJM model or too much freedom as in the two
factor HJM model. While retaining this framework, we now use empirical data to guide us
in refining the model to give us an insight into market psychology which will result from the
induced metric.

6.7 Phenomenology of the Forward Rate Curve

We further investigate some other aspects of the phenomenology of the forward rate curve
which require further examination. Some of these phenomena are similar to those found by
physicists in the analysis of stock prices and we will point out these similarities as we proceed.
One very interesting aspect concerns the distribution of differences in the rates over different time scales. For stock prices, it is known that the short scale returns are non-Gaussian while long term returns are close to Gaussian (see, e.g., Mantegna and Stanley [23][76]
or Matacz [77]). This behaviour is reasonably modelled by the truncated Lévy distribution
introduced by Mantegna and Stanley [78] and improved to a nice analytic form by Koponen
[79]. The truncated Lévy distribution is described by its Hamiltonian (in the meaning of chapter
2 but now for a small but finite time interval) [79]
$$H = -\frac{c_\alpha}{\cos(\pi\alpha/2)}\left[(\lambda^2 + p^2)^{\alpha/2}\cos\left(\alpha\arctan(p/\lambda)\right) - \lambda^{\alpha}\right] \qquad (6.22)$$
and its variance and kurtosis can be easily shown to be [77]⁴
$$\sigma^2 = \frac{\alpha(1-\alpha)}{\cos(\pi\alpha/2)}\,c_\alpha\,\lambda^{\alpha-2}, \qquad \kappa = \frac{\alpha(1-\alpha)\,c_\alpha\,\lambda^{\alpha}}{\cos(\pi\alpha/2)\,(\alpha-2)(\alpha-3)} \qquad (6.23)$$
The exact form of this variance and kurtosis is unimportant except for the fact that the crossover
time to the Gaussian for these distributions is very long enabling the decay of kurtosis to be
4 The formula for the kurtosis in Matacz [77] is the inverse of what we obtain probably due to a typographical
error

easily observable. A straightforward perusal of the meaning of the central limit theorem as presented by us in chapter 2, for a finite number of terms, shows that if the individual random variables are scaled so that the variance increases linearly with the time scale, the kurtosis must decrease inversely with the time scale. This is because when one is adding distributions over time, one must add them so that the lowest cumulant does not diverge (and therefore increases linearly, which is why, for example, the scaling in the continuous version of the normal random walk keeps ∆x²/∆t constant). Hence, each further cumulant will fall off one power faster than the previous one. Since the kurtosis is two cumulants away from the variance, we see that it goes down two powers faster, that is, as 1/∆t. Note that for this to be the case, the process has to be a Markov process. Empirically it is found that the price process for stocks is not fully Markovian and the kurtosis decreases as 1/√∆t rather than as 1/∆t, as found in [80].


Figure 6.10: The kurtosis over different time scales

For the forward rates, we find the following interesting facts. The scaling law κ_∆t ∼ 1/(∆t)^{a(θ)}
seems to hold approximately but consistently for almost the entire forward rate curve for ∆t up
to about five to ten days. This cannot, of course, be said with any significance for any single
maturity but the consistency over maturities and the simple structure of a(θ) over the different
maturities as seen in figure 6.11 which plots the slope of the best fit line on a log-log graph
over a time scale of one to nine days seems to point to some significance. To make an accu-
rate assessment, one would need high frequency data. It is interesting to note that the falloff
of kurtosis is very approximately around 1/∆t for a large part of the forward rate curve and
the falloff is faster than the square root law scaling observed for the stock market. When we
increase the time scale to over ten days, the kurtosis becomes negative which is a feature that
does not seem to be seen in other markets and which limited the above analysis to a maximum
time scale of nine days (which is already too long to obey the decay of kurtosis even in the
stock market). Further, the falloff in kurtosis seems to be faster for longer maturities as seen
from figure 6.11. We however, lack higher frequency data to analyse this further. We believe
that such an analysis will provide more insight into the structure of forward rates.
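The timescale analysis described above can be sketched as follows: daily forward rate changes are aggregated over windows of ∆t days, the excess kurtosis is computed for each maturity, and the exponent a(θ) is obtained from a log-log fit. The array f_theta is the constant-θ data used earlier and the fit is only meaningful while the kurtosis stays positive (up to roughly nine days); this is an illustrative sketch, not the analysis code itself.

import numpy as np
from scipy.stats import kurtosis

def kurtosis_vs_timescale(f_theta, max_days=9):
    scales = np.arange(1, max_days + 1)
    kappa = np.array([kurtosis(f_theta[dt:] - f_theta[:-dt], axis=0, fisher=True)
                      for dt in scales])                     # shape (scales, maturities)
    # slope of log kappa versus log Delta t gives -a(theta) for each maturity
    slopes = np.polyfit(np.log(scales), np.log(np.clip(kappa, 1e-12, None)), 1)[0]
    return scales, kappa, slopes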
Another interesting result that we have regards the volatility process. We performed an
analysis of the volatility averaged over one hundred days (being defined as ⟨δf²(t, θ)⟩). The
correlation structure found for the innovations in averaged volatility has the form seen in figure


Figure 6.11: The exponent fitted for the decay of the kurtosis over 9 days. While the time period
is very short and the analysis cannot be extended for longer times as the kurtosis becomes nega-
tive, we think that it has some meaning when seen in connection with other financial data where
the same behaviour is seen in short time scales. Lack of high frequency data unfortunately does
not allow us to check this more closely.

6.12. This can be seen to be very similar indeed to the correlation structure of innovations
in the forward rates in figure 6.8 which seems to point out that this correlation structure is
deeply embedded in the market. The explanation of the correlation structure in terms of a
psychological metric would predict this since the psychological metric would not depend on
the underlying set of instruments. Hence, we seem to have some additional support for this
hypothesis.


Figure 6.12: The correlation structure for innovations in the volatility


Chapter 7

Hedging in Field Theory Models of the Term Structure

7.1 Hedging in General

The main aim of hedging is to reduce one’s exposure to risk. There are many ways to define
risk [80]. For bonds, the main risks are changes in interest rates and the risk of default. In this
thesis, we do not deal with the latter as we are only considering default-free bonds.
The results in the first two sections of this chapter form the basis of Baaquie, Srikant and
Warachka [81].
For the purposes of this chapter, we define risk to be the standard deviation or variance of
final value. More sophisticated definitions exist for non-Gaussian outcomes but we are only
dealing with hedging in Gaussian models in this thesis. Hence, when we hedge a certain in-
strument, we are trying to create a portfolio of the hedged and hedging instruments which
minimizes the overall variance of the portfolio. In the case of a N-factor HJM model, perfect
hedging (i.e., a zero portfolio variance) is achievable once any N independent hedging instru-
ments are used. However, it is easy to see that perfect hedging is not attainable in field theory
models as there are infinite number of independent random factors to hedge against. This is
much more realistic than the HJM model which in the two factor case would seem to imply that
a 3 month and 6 month treasury bill could be used to perfectly hedge a 30 year bond. However,
the difficulties introduced by the infinite number of factors have resulted in there being very lit-
tle literature on this important subject, a notable exception being the measure valued trading
strategy developed in Björk, Kabanov and Runggaldier[82].
In the second section, we will consider instantaneous hedging which is important for the-
oretical purposes. We will calculate the maximum reduction in variance for a finite number
of hedging instruments and the hedge ratios (the amount of each hedging instrument that needs to be used) that result in this maximization. This will show us how well the model can be
approximated by a finite number of factors. We will then use the constant rigidity model fitted
with empirical data to estimate the reduction in the variance of an optimally hedged portfolio
as the number of hedging instruments are increased assuming that the Gaussian approxima-
tion is correct. We will see that a relatively small number of hedging instruments gives good
results. We will also show that the results reduce to well-known textbook ones, as in Jarrow and Turnbull [83], when we go to the degenerate case of the one-factor HJM model where all the
forward rate innovations are perfectly correlated. We will also perform the same calculations


using the propagator estimated from empirical data.


In the third section, we will consider finite time hedging which is important in practice. This
is because continuous hedging cannot be done in practice due to the presence of transaction
costs. We will see how the hedging performance found in the second section changes as the
time between rebalancings is increased. The entire analysis here is to investigate how portfolios
of bonds behave in such models.

7.2 Instantaneous Hedging

In instantaneous hedging, we are considering a hedging portfolio which is rebalanced continu-


ously in time. Hence, we are only concerned with the instantaneous variance of the portfolio.
This can be calculated for an arbitrary portfolio by using the fact that the covariance of the
innovations in the forward rates is given by

σ(θ)D(θ, θ′ )σ(θ′ ) (7.1)

as can be seen from the expression for the Hamiltonian in (5.84). We will only present the
hedging of zero coupon bonds in this section though it will be seen that the results can be easily
extended to other instruments. In the first subsection, we will present the theoretical derivation
of the hedge ratios and reduced variance for the hedging of a zero coupon bond with other zero
coupon bonds. In the second subsection, we use the empirically fitted σ and D(µ) (from (5.83))
for the constant rigidity action as well as the non-parametric estimate for σ and D to calculate
the semi-empirical reduction in variance. In the third and fourth subsections, we will carry out
similar calculations when hedging zero coupon bonds with futures on zero coupon bonds. This
is much more realistic in practice as hedging with futures is relatively cheap as mentioned in
chapter 1.
In the following, we will use the term bond to mean zero coupon bonds as these are the
only bonds we will be considering.

7.2.1 Hedging bonds with other bonds

We now consider the hedging of one bond maturing at T with N other bonds maturing at
Ti , 1 ≤ i ≤ N. If one of the Ti = T , then the solution is trivial since it is the same bond.
The hedge is then just to short the same bond giving us a zero portfolio with obviously zero
variance. Since this solution is uninteresting, we assume that Ti ≠ T ∀i. The hedged portfolio Π(t) can then be represented as
$$\Pi(t) = P(t,T) + \sum_{i=1}^{N}\Delta_i P(t,T_i)$$

where ∆i denotes the number of ith bonds P (t, Ti ) included in the hedged portfolio. Note the
value of bonds P (t, T ) and P (t, Ti ) are determined by observing their market values at time t.
It is the instantaneous change in the portfolio value that is stochastic. Therefore, the variance
of this change is computed to ascertain the efficacy of the hedge portfolio.

We first consider the variance of the value of an individual bond in the field theory model. The definition $P(t,T) = \exp\left(-\int_t^T dx\, f(t,x)\right)$ for zero coupon bond prices implies that
$$\frac{dP(t,T)}{P(t,T)} = f(t,t)\,dt - \int_0^{T-t} d\theta\, df(t,\theta) = \left(r(t) - \int_0^{T-t} d\theta\,\alpha(\theta) - \int_0^{T-t} d\theta\,\sigma(\theta)A(t,\theta)\right)dt$$
and $E\left[\frac{dP(t,T)}{P(t,T)}\right] = \left[r(t) - \int_0^{T-t} d\theta\,\alpha(\theta)\right]dt$ since $E[A(t,\theta)] = 0$. Therefore
$$\frac{dP(t,T)}{P(t,T)} - E\left[\frac{dP(t,T)}{P(t,T)}\right] = -dt\int_0^{T-t} d\theta\,\sigma(\theta)A(t,\theta) \qquad (7.2)$$
Squaring this expression and invoking the result that $E[A(t,\theta)A(t,\theta')] = \delta(0)D(\theta,\theta';T_{FR}) = D(\theta,\theta';T_{FR})/dt$ results in the instantaneous bond price variance
$$Var[dP(t,T)] = dt\, P^2(t,T)\int_0^{T-t} d\theta\int_0^{T-t} d\theta'\,\sigma(\theta)D(\theta,\theta';T_{FR})\sigma(\theta') \qquad (7.3)$$

As an intermediate step, the instantaneous variance of a bond portfolio is considered. For a portfolio of bonds, $\hat\Pi(t) = \sum_{i=1}^{N}\Delta_i P(t,T_i)$, the following results follow directly:
$$d\hat\Pi(t) - E[d\hat\Pi(t)] = -dt\sum_{i=1}^{N}\Delta_i P(t,T_i)\int_0^{T_i-t} d\theta\,\sigma(\theta)A(t,\theta) \qquad (7.4)$$
and
$$Var[d\hat\Pi(t)] = dt\sum_{i=1}^{N}\sum_{j=1}^{N}\Delta_i\Delta_j P(t,T_i)P(t,T_j)\int_0^{T_i-t} d\theta\int_0^{T_j-t} d\theta'\,\sigma(\theta)D(\theta,\theta';T_{FR})\sigma(\theta') \qquad (7.5)$$
The (residual) variance of the hedged portfolio
$$\Pi(t) = P(t,T) + \sum_{i=1}^{N}\Delta_i P(t,T_i) \qquad (7.6)$$

may now be computed in a straightforward manner. For notational simplicity, the bonds
P (t, Ti ) (being used to hedge the original bond) and P (t, T ) are denoted Pi and P respectively.
Equation (7.5) implies the hedged portfolio’s variance equals the final result shown below
$$\begin{aligned} &P^2\int_0^{T-t} d\theta\int_0^{T-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR}) \\ &\quad + 2P\sum_{i=1}^{N}\Delta_i P_i\int_0^{T-t} d\theta\int_0^{T_i-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR}) \\ &\quad + \sum_{i=1}^{N}\sum_{j=1}^{N}\Delta_i\Delta_j P_i P_j\int_0^{T_i-t} d\theta\int_0^{T_j-t} d\theta'\,\sigma(\theta)D(\theta,\theta';T_{FR})\sigma(\theta') \end{aligned} \qquad (7.7)$$

Note that the residual variance depends on the correlation structure of the innovation in forward
rates described by the propagator D. Ultimately, the effectiveness of the hedged portfolio is an

empirical question since perfect hedging is not possible without shorting the original bond. This
empirical question is addressed in the next subsection where the propagator calibrated to market
data is used to calculate the effectiveness. Minimizing the residual variance in equation (7.7)
with respect to the hedge parameters ∆i is an application of standard calculus. We introduce
the following notation for simplicity.
$$L_i = P\,P_i\int_0^{T-t} d\theta\int_0^{T_i-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR})$$
$$M_{ij} = P_i\,P_j\int_0^{T_i-t} d\theta\int_0^{T_j-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR})$$

Li is the covariance between the innovations in the hedged bond and the ith hedging bond and
Mij is the covariance between the innovations of the ith and jth hedging bond.
The above definitions allow the residual variance in equation (7.7) to be succinctly ex-
pressed as
$$P^2\int_0^{T-t} d\theta\int_0^{T-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR}) + 2\sum_{i=1}^{N}\Delta_i L_i + \sum_{i=1}^{N}\sum_{j=1}^{N}\Delta_i\Delta_j M_{ij} \qquad (7.8)$$

The hedge parameters in the field theory model can now be evaluated using basic calculus and
linear algebra to obtain
$$\Delta_i = -\sum_{j=1}^{N} L_j M_{ij}^{-1} \qquad (7.9)$$

and represent the optimal amounts of P (t, Ti ) to include in the hedge portfolio when hedging
P (t, T ).
Putting the result into (7.7), we see that the variance of the hedged portfolio equals
$$V = P^2\int_0^{T-t} d\theta\int_0^{T-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR}) - \sum_{i=1}^{N}\sum_{j=1}^{N} L_i M_{ij}^{-1} L_j \qquad (7.10)$$

which declines monotonically as N increases.
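The optimal hedge of (7.9)-(7.10) is easy to evaluate numerically. The sketch below builds L_i and M_ij by trapezoidal quadrature from user-supplied σ and D (for instance the fitted functions of chapter 6), computes bond prices from a flat 5% forward curve as in the text, and returns the hedge ratios and residual variance; it is an illustration, not the code used to produce the figures.

import numpy as np

def bond_price(T_minus_t, r=0.05):
    return np.exp(-r * T_minus_t)

def cov_block(Ta, Tb, sigma, D, n=100):
    tha, thb = np.linspace(0.0, Ta, n), np.linspace(0.0, Tb, n)
    inner = sigma(tha)[:, None] * D(tha[:, None], thb[None, :]) * sigma(thb)[None, :]
    return np.trapz(np.trapz(inner, thb, axis=1), tha)

def optimal_hedge(T, hedge_maturities, sigma, D):
    P = bond_price(T)
    Pi = bond_price(np.asarray(hedge_maturities))
    L = np.array([P * Pi[i] * cov_block(T, Ti, sigma, D)
                  for i, Ti in enumerate(hedge_maturities)])
    M = np.array([[Pi[i] * Pi[j] * cov_block(Ti, Tj, sigma, D)
                   for j, Tj in enumerate(hedge_maturities)]
                  for i, Ti in enumerate(hedge_maturities)])
    delta = -np.linalg.solve(M, L)                                       # hedge ratios, eq. (7.9)
    V = P ** 2 * cov_block(T, T, sigma, D) - L @ np.linalg.solve(M, L)   # residual variance, eq. (7.10)
    return delta, V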


The residual variance enables the effectiveness of the hedged portfolio to be evaluated.
Therefore, this result is the basis for studying the impact of including different bonds in the
hedged portfolio as illustrated in the next subsection. For N = 1, the hedge parameter reduces
to
$$\Delta_1 = -\frac{P}{P_1}\left(\frac{\int_0^{T-t} d\theta\int_0^{T_1-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR})}{\int_0^{T_1-t} d\theta\int_0^{T_1-t} d\theta'\,\sigma(\theta)\sigma(\theta')D(\theta,\theta';T_{FR})}\right) \qquad (7.11)$$

To obtain the HJM limit, we let the propagator equal one. The hedge parameter in equation
(7.11) then reduces to
 
$$\Delta_1 = -\frac{P}{P_1}\left(\frac{\int_0^{T-t} d\theta\int_0^{T_1-t} d\theta'\,\sigma(\theta)\sigma(\theta')}{\left(\int_0^{T_1-t} d\theta\,\sigma(\theta)\right)^2}\right) = -\frac{P}{P_1}\left(\frac{\int_0^{T-t} d\theta\,\sigma(\theta)}{\int_0^{T_1-t} d\theta\,\sigma(\theta)}\right) \qquad (7.12)$$

The popular exponential volatility function1 σ(t, T ) = σe−λ(T −t) allows a comparison between
our field theory solutions and previous research. Under the assumption of exponential volatility,
equation (7.12) becomes
$$\Delta_1 = -\frac{P}{P_1}\left(\frac{1 - e^{-\lambda(T-t)}}{1 - e^{-\lambda(T_1-t)}}\right) \qquad (7.13)$$

Equation (7.13) coincides with the ratio of hedge parameters found as equation 16.13 of Jarrow
and Turnbull [83]. In terms of their notation
 
$$\Delta_1 = -\frac{P(t,T)}{P(t,T_1)}\,\frac{X(t,T)}{X(t,T_1)} \qquad (7.14)$$

For emphasis, the following equation holds in a one factor HJM model2

$$\frac{\partial\left[P(t,T) + \Delta_1 P(t,T_1)\right]}{\partial r(t)} = 0 \qquad (7.15)$$
which is verified using equation (7.14) and results found on pages 494-495 of Jarrow and Turnbull [83]:
$$\frac{\partial\left[P(t,T) + \Delta_1 P(t,T_1)\right]}{\partial r(t)} = -P(t,T)X(t,T) - \Delta_1 P(t,T_1)X(t,T_1) = -P(t,T)X(t,T) + P(t,T)X(t,T) = 0$$

When T1 = T , the hedge parameter equals minus one. Economically, this fact states that the
best strategy to hedge a bond is to short a bond of the same maturity. This trivial approach
reduces the residual variance in equation (7.8) to zero as ∆1 = −1 and P = P1 implies L1 =
M11 . Empirical results for nontrivial hedging strategies are found in the next subsection where
the calibrated propagator is used.


Figure 7.1: Implied volatility function (unnormalized) for constant rigidity model using market
data


Figure 7.2: Propagator Implied by the constant rigidity field theory model with µ = 0.06/year

Figure 7.3: Residual variance for five year bond versus bond maturity used to hedge

7.2.2 Semi-empirical results : Constant Rigidity model

The empirical estimation of parameters for the field theory model was explained in detail in
chapter 6. For this subsection, we use the function σ estimated for the constant rigidity model
from market data. For the convenience of the reader, we present the function σ again in figure
7.1.
It should be noted that the propagator may also be estimated non parametrically from the
correlation found in market data without any specified functional form, as the volatility function
was estimated. This approach preserves the closed form solutions for hedge parameters and
futures contracts illustrated in the previous subsection. However, the original finite factor HJM
model cannot accommodate an empirically determined propagator since it is automatically
fixed once the HJM volatility functions are specified. Later in this subsection, we will see how
the empirical propagator modifies the results of this subsection. The implied propagator for the
empirically fitted value of 0.06/yr for µ is shown in figure 7.2.
The reduction in variance achievable by hedging a five year bond with other bonds is the
focus of this subsection. We take the current forward rate curve to be flat and equal to 5%
1 This volatility function is commonly used as it lets the spot rate r(t) follow a Markov process. See [84].
2 Note that this result depends on the fact that the spot rate r(t) is Markovian and therefore only applies to
either a constant or exponential volatility function.


Figure 7.4: Residual variance for five year bond versus two bond maturities used to hedge

Figure 7.5: Hedge ratios for five year bond

throughout. The initial forward rate curve does not affect any of the qualitative results. The
results can also be easily extended to other bonds. The residual variances for one and two bond
hedged portfolios are shown in figures 7.3 and 7.4. The calculation of the integrals involved
was done using simple trapezoidal integration as the data is not exceptionally accurate in the
first place. Secondly and more importantly, the errors involved will largely cancel themselves
out, hence the difference in the variances is still quite accurate. For example, in figures 7.3 and
7.4, we can see that in the case of perfect hedging, we get exactly zero residual variance which
shows that the errors tend to cancel. The parabolic nature of the residual variance is because µ
is constant. A more complicated function would produce residual variances that do not deviate
monotonically as the maturity of the underlying and the hedge portfolio increases although
the graphs appeal to our economic intuition which suggests that correlation between forward
rates decreases monotonically as the distance between them increases as shown in figure 7.2.
Observe that the residual variance drops to zero when the same bond is used to hedge itself,
eliminating the original position in the process. The corresponding hedge ratios are shown in
figure 7.5.
It is also interesting to note that hedging by two bonds, even very closely spaced ones,
seems to bring significant additional benefits. This can be seen in figure 7.4 where the diagonal
θ = θ′ represents hedging by one bond. The residual variance there is higher than the nearby
points in a discontinuous manner.


Figure 7.6: The propagator implied by the market data



Figure 7.7: The volatility implied by the market data when using the empirical propagator

We now present the results for the actual propagator found from the data which is graphed
in figure 7.6. The residual variance when a five year bond is hedged with one and with two bonds is shown in figures 7.8 and 7.9. We can see from figure 7.9 that, when the market propagator is used, the advantage of using more than one bond to hedge is significantly higher. This is because of the nature of the correlation structure in figure 7.6. We see that the correlations of innovations of nearby forward rates of higher maturity are significantly higher in the market
propagator, making hedging with more than one bond more useful. This is even more pro-
nounced when hedging a short maturity bond with longer maturity ones. We can see this from
figure 7.10 which shows the residual variance when a one year bond is hedged with two bonds
where the calculation is done using the empirical propagator. The effect of this higher correla-
tion among forward rates of higher maturity can also be seen in figure 7.9 where the residual
variance rises much more slowly when the hedging bonds are chosen to be of higher maturity
than the hedged bond.

7.2.3 Hedging with futures

We can carry out an analysis very similar to subsection 7.2.1 to find the optimal hedge ra-
tios when hedging a bond with futures contracts on the same or other bonds. In this case,
there is no trivial solution to the hedging problem as when bonds were hedged with other
bonds. Further, since this method of hedging is much more practical in reality, the results


Figure 7.8: The residual variance when a five year bond is hedged with one bond

will be more interesting. Proceeding as in subsection 7.2.1, we compute the appropriate


hedge parameters for futures contracts. The futures price $\mathcal{F}(t,t_*,T)$ in terms of the forward price $\frac{P(t,T)}{P(t,t_*)} = e^{-\int_{t_*-t}^{T-t} d\theta\, f(t,\theta)}$ and the deterministic quantity $\Omega_F(t,t_*,T)$ was found in equation (5.62). The dynamics of the futures price $d\mathcal{F}(t,t_*,T)$ is thus given by
$$\frac{d\mathcal{F}(t,t_*,T)}{\mathcal{F}(t,t_*,T)} = d\Omega_F(t,t_*,T) - \int_{t_*-t}^{T-t} d\theta\, df(t,\theta) \qquad (7.16)$$

which implies
$$\frac{d\mathcal{F}(t,t_*,T) - E[d\mathcal{F}(t,t_*,T)]}{\mathcal{F}(t,t_*,T)} = -dt\int_{t_*-t}^{T-t} d\theta\,\sigma(\theta)A(t,\theta) \qquad (7.17)$$

Squaring both sides as before leads to the instantaneous variance of the futures price
$$Var[d\mathcal{F}(t,t_*,T)] = dt\,\mathcal{F}^2(t,t_*,T)\int_{t_*-t}^{T-t} d\theta\int_{t_*-t}^{T-t} d\theta'\,\sigma(\theta)D(\theta,\theta')\sigma(\theta') \qquad (7.18)$$

Let Fi denote the futures price F(t, t∗ , Ti ) of a contract expiring at time t∗ on a zero coupon
bond maturing at time Ti . The hedged portfolio in terms of the futures contract is given by
$$\Pi(t) = P + \sum_{i=1}^{N}\Delta_i\,\mathcal{F}_i \qquad (7.19)$$


Figure 7.9: The residual variance when a five year bond is hedged with two bonds
[Surface plot: residual variance vs. the maturities T and T′ (years) of the two hedging bonds (empirical propagator).]

Figure 7.10: The residual variance when a one year bond is hedged with two bonds

where the F_i represent observed market prices. For notational simplicity, define the following
terms:

L_i = P F_i ∫_{t_*−t}^{T_i−t} dθ ∫_{0}^{T−t} dθ′ σ(θ) D(θ,θ′; T_FR) σ(θ′)

M_ij = F_i F_j ∫_{t_*−t}^{T_i−t} dθ ∫_{t_*−t}^{T_j−t} dθ′ σ(θ) D(θ,θ′; T_FR) σ(θ′)

The hedge parameters and the residual variance when futures contracts are used as the
underlying hedging instruments have identical expressions to those in (7.9) and (7.10) but are
based on the new definitions of Li and Mij above. Computations parallel those in section 7.2.1.
To state the results explicitly, the hedge parameter for a futures contract that expires at
time t_* on a zero coupon bond that matures at time T_i equals

∆_i = −Σ_{j=1}^{N} L_j M_ij^{-1}

while the variance of the hedged portfolio equals

V = P² ∫_{0}^{T−t} dθ ∫_{0}^{T−t} dθ′ σ(θ)σ(θ′) D(θ,θ′; T_FR) − Σ_{i=1}^{N} Σ_{j=1}^{N} L_i M_ij^{-1} L_j

for L_i and M_ij as defined above.
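
To make the linear algebra concrete, the following is a minimal sketch (written in the style of the C programs of the appendices, but not the code actually used for the thesis results) of how the hedge ratios and the residual variance can be computed once the L_i, M_ij and the unhedged variance have been evaluated by trapezoidal integration; all numerical values below are hypothetical placeholders.

#include <math.h>
#include <stdio.h>

#define N 2  /* number of hedging futures contracts (illustrative) */

/* Solve M x = L by Gaussian elimination with partial pivoting (small N). */
static void solve (double M[N][N], double L[N], double x[N])
{
    double a[N][N+1], t, factor;
    int i, j, k, piv;

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) a[i][j] = M[i][j];
        a[i][N] = L[i];
    }
    for (k = 0; k < N; k++) {
        piv = k;
        for (i = k + 1; i < N; i++)
            if (fabs (a[i][k]) > fabs (a[piv][k])) piv = i;
        for (j = k; j <= N; j++) { t = a[k][j]; a[k][j] = a[piv][j]; a[piv][j] = t; }
        for (i = k + 1; i < N; i++) {
            factor = a[i][k]/a[k][k];
            for (j = k; j <= N; j++) a[i][j] -= factor*a[k][j];
        }
    }
    for (i = N - 1; i >= 0; i--) {
        x[i] = a[i][N];
        for (j = i + 1; j < N; j++) x[i] -= a[i][j]*x[j];
        x[i] /= a[i][i];
    }
}

int main (void)
{
    /* hypothetical stand-ins for L_i, M_ij and P^2 * (double integral of sigma D sigma) */
    double L[N] = {4.1e-4, 3.3e-4};
    double M[N][N] = {{5.0e-4, 3.6e-4}, {3.6e-4, 4.2e-4}};
    double var_unhedged = 1.8e-3;
    double x[N], V = var_unhedged;
    int i;

    solve (M, L, x);                              /* x = M^{-1} L                      */
    for (i = 0; i < N; i++) {
        printf ("Delta_%d = %g\n", i + 1, -x[i]); /* Delta_i = -sum_j M^{-1}_{ij} L_j  */
        V -= L[i]*x[i];                           /* subtract L^T M^{-1} L             */
    }
    printf ("residual variance = %g\n", V);
    return 0;
}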

7.2.4 Semi-empirical results for hedging with futures

We first present results for the propagator fitted to the constant rigidity model, as was done
for the bonds. The initial forward rate curve is again taken to be flat and equal to 5%. We also
fix the expiry of the futures contracts to be one year from the present. This is long enough to
clearly show the effect of the expiry time, and short enough to make practical sense, since long
term futures contracts are illiquid and unsuitable for hedging purposes.
The calculations were done using simple trapezoidal integration as explained previously.
This is sufficient for our purposes as the fitted values for σ and D shown in figures 7.7 and
[Plot: residual variance when hedging a 5 year ZCP with one futures contract, vs. time to maturity (years) of the underlying bond.]

Figure 7.11: Residual variance for a five year bond hedged with a one year futures contract on
a T maturity bond

Number   Futures Contracts (Hedge Ratio)                                    Residual Variance
0        none                                                               1.82 × 10^−3
1        4.5 years (−1.288)                                                 5.29 × 10^−6
2        5 years (−0.9347), 1.25 years (−2.72497)                           1.58 × 10^−6
3        5 years (−0.95875), 1.5 years (1.45535), 1.25 years (−5.35547)     1.44 × 10^−6

Table 7.1: Residual variance and hedge ratios for a five year bond hedged with one year futures
contracts.

7.6 are reasonably but not exceptionally accurate and we are more interested in the qualitative
behaviour of the residual variance and hedge parameters.
The residual variance achieved when hedging a five year bond with one futures contract is
shown in figure 7.11. The optimal hedge ratios and the resulting residual variances when hedg-
ing with two and three futures are shown in table 7.1. These were obtained by systematically
enumerating all possible combinations of bonds at three month intervals in the maturity
direction, tabulating the residual variance for each and selecting the best combination (a sketch
of such a search is given below).
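
A minimal sketch of this kind of exhaustive search (again in the style of the appendix programs, not the code actually used) follows; residual_variance() below is a hypothetical stand-in for a routine that evaluates the hedged-portfolio variance from the L_i and M_ij defined above.

#include <stdio.h>

/* hypothetical placeholder: in practice this would build L_i and M_ij for the
   two candidate futures and return V as in the expressions above */
static double residual_variance (double T1, double T2)
{
    return 1e-6*((T1 - 1.25)*(T1 - 1.25) + (T2 - 5.0)*(T2 - 5.0)) + 1.5e-6;
}

int main (void)
{
    double T1, T2, V, bestT1 = 0, bestT2 = 0, bestV = 1e30;

    for (T1 = 0.25; T1 <= 8.0; T1 += 0.25)            /* quarter-year grid */
        for (T2 = T1 + 0.25; T2 <= 8.0; T2 += 0.25)   /* unordered pairs   */
        {
            V = residual_variance (T1, T2);
            if (V < bestV) { bestV = V; bestT1 = T1; bestT2 = T2; }
        }
    printf ("best maturities: %g yr and %g yr, residual variance %g\n",
            bestT1, bestT2, bestV);
    return 0;
}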
Firstly, we note that the hedging is very effective even when only one futures contract is used,
reducing the variance by a factor of over three hundred. Secondly, we note that the most
effective hedge is obtained not by shorting the futures corresponding to the same bond but one
on a bond of slightly lower maturity. This is due to the correlation structure of the forward rates.
However, when two futures contracts are used, one of the optimal contracts is the future on the
same bond, together with a very short maturity futures. This is probably due to the short end
of the forward rate curve, which influences the bond but not the futures; since the shortest
maturity futures contract is likely to have the highest correlation with this part of the forward
rate curve, it is reasonable to select it to balance the effect of this part of the curve on the
bond. This is indeed the case, as seen in table 7.1. We also note that there is very little extra
improvement when more than two futures are used.
We now present the same results using the empirical propagator directly. The residual
variance when one futures contract is used for hedging is shown in figure 7.12. The optimal
[Plot: residual variance when a 5 yr ZCP is optimally hedged with a futures contract on a T yr ZCP, vs. T/yr.]

Figure 7.12: Residual variance when a five year bond is hedged with a one year futures contract
on a T maturity bond

Number   Futures Contracts (Hedge Ratio)                                      Residual Variance
0        none                                                                 1.74 × 10^−3
1        4.25 years (−0.984)                                                  6.34 × 10^−6
2        1.25 years (−3.84577), 5.5 years (−0.76005)                          2.26 × 10^−6
3        1.25 years (−8.60248), 1.5 years (2.84177), 5.25 years (−0.85915)    1.95 × 10^−6

Table 7.2: Residual variance and hedge ratios for a five year bond hedged with one year futures
contracts.

hedging futures, hedge ratios and residual variances are shown in table 7.2. We see that for the
actual propagator, the optimal hedging futures are even farther from the underlying bond than
the optimal choices obtained using the fitted propagator.

7.3 Finite time hedging

The case of finite time hedging is considerably more complicated. We will only treat the hedging
of bonds with other bonds, as the variance-minimizing calculations can be done exactly.
We will not treat the hedging of bonds with futures, even though this can also be solved exactly,
as it does not add much extra insight in the finite time case. To see this, consider hedging with
a futures contract, expiring at the hedging horizon, on a zero coupon bond of maturity T. This
gives exactly the same result as hedging with a bond of the same maturity T. Therefore, we
gain nothing by carrying out that calculation.
The following calculation proceeds efficiently because of the use of path integral techniques,
which are very useful for such problems. To be able to optimally hedge bonds with other bonds
in the sense of having a minimal residual variance, we need the covariance between the final
values of bonds of different maturities. To calculate this covariance, we will first find the joint
probability density function for N bonds at the hedging horizon. Let us denote the initial time
by 0, the hedging horizon by t_1 and the maturities of the bonds by T_i. Making use of (5.82) we
obtain the joint distribution of the quantities G_i = ∫_{t_1}^{T_i} dx ( f(t_1,x) − f(0,x) ), which represent

the logarithms of the ratios of the final values of the bonds to their forward prices, set at the
initial time for delivery at the final time. In other words,

G_i = ln[ P(t_1,T) P(t_0,t_1) / P(t_0,T) ] = ln[ P(t_1,T) / F(t_0,t_1,T) ]    (7.20)
The calculation proceeds as follows:

⟨ ∏_{j=1}^{N} δ( ∫_{t_1}^{T_j} dx ( f(t_1,x) − f(0,x) ) − G_j ) ⟩

= ∫ ∏_j dp_j ∫ DA exp( i Σ_{j=1}^{N} p_j [ ∫_0^{t_1} dt ∫_{t_1}^{T_j} dx α(t,x) + ∫_0^{t_1} dt ∫_{t_1}^{T_j} dx σ(t,x)A(t,x) − G_j ] )    (7.21)

which, on applying (5.82), becomes

∫ ∏_j dp_j exp( −(1/2) Σ_{j=1}^{N} Σ_{k=1}^{N} p_j p_k ∫_0^{t_1} dt ∫_{t_1}^{T_j} dx ∫_{t_1}^{T_k} dx′ σ(t,x)D(x−t,x′−t)σ(t,x′)
    + i Σ_{j=1}^{N} p_j [ ∫_0^{t_1} dt ∫_{t_1}^{T_j} dx α(t,x) − G_j ] )    (7.22)

Performing the Gaussian integrations, we obtain the joint probability distribution

(2π)^{−N/2} (det B)^{−1/2} exp( −(1/2) Σ_{j=1}^{N} Σ_{k=1}^{N} (G_j − m_j) B_jk^{-1} (G_k − m_k) )    (7.23)

where B is the matrix whose elements Bij are given by


B_ij = ∫_0^{t_1} dt ∫_{t_1}^{T_i} dx ∫_{t_1}^{T_j} dx′ σ(t,x)D(x−t,x′−t)σ(t,x′)    (7.24)

and mi is given by
m_i = ∫_0^{t_1} dt ∫_{t_1}^{T_i} dx α(t,x)    (7.25)
Hence, the quantities Gi follow a multivariate Gaussian distribution with covariance matrix Bij
and mean mi .
Having found the joint distribution of Gi , we can find the covariance of the final bond prices
by tabulating the expectations of each of the bonds and the expectation of their products. The
final bond price is given by P (t1 , Ti ) = F (0, t1 , Ti )eGi in terms of Gi . Hence, the expectation
of this quantity is given by

(2π)^{−N/2} (det B)^{−1/2} ∫ ∏_j dG_j F(0,t_1,T_i) e^{G_i} exp( −(1/2) (G − m)^T B^{-1} (G − m) )    (7.26)
which gives F(0,t_1,T_i), whose value is given in (5.62), as it must, since the expectation of the
future bond price is the futures price. The expectation of the product of the prices of two bonds,
⟨P(t_1,T_i) P(t_1,T_j)⟩, is given by

(2π)^{−N/2} (det B)^{−1/2} ∫ ∏_j dG_j F(0,t_1,T_i) F(0,t_1,T_j) e^{G_i+G_j} exp( −(1/2) H^T B^{-1} H )    (7.27)

where H stands for the vector G − m. On evaluation, this gives the result
F(0,t_1,T_i) F(0,t_1,T_j) exp( ∫_0^{t_1} dt ∫_{t_1}^{T_i} dx ∫_{t_1}^{T_j} dx′ σ(t,x)D(x−t,x′−t)σ(t,x′) )    (7.28)

We now consider the behaviour of the portfolio


P(t,T) + Σ_{i=1}^{N} ∆_i P(t,T_i)    (7.29)

The covariance between the prices P(t_1,T_i) and P(t_1,T_j) is given by

M_ij = F(0,t_1,T_i) F(0,t_1,T_j) [ exp( ∫_0^{t_1} dt ∫_{t_1}^{T_i} dx ∫_{t_1}^{T_j} dx′ σ(t,x)D(x−t,x′−t)σ(t,x′) ) − 1 ]    (7.30)

and the covariance between the hedged bond of maturity T and the hedging bonds is given by

L_i = F(0,t_1,T) F(0,t_1,T_i) [ exp( ∫_0^{t_1} dt ∫_{t_1}^{T} dx ∫_{t_1}^{T_i} dx′ σ(t,x)D(x−t,x′−t)σ(t,x′) ) − 1 ]    (7.31)

and the minimization of the residual variance of the hedged portfolio proceeds exactly as in the
instantaneous case of section 7.2. The hedge ratios are found to be given by

∆ = LT M −1 (7.32)

and the minimized variance is again

Var[P (t, T )] − LT M −1 L (7.33)

It is not too difficult to see that both M and L reduce to the instantaneous results of section 7.2
as t_1 → 0 (with the covariances being scaled by t_1, of course).
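As a quick check of this limit (a sketch only, comparing with the instantaneous expressions of section 7.2), expand the exponential in (7.30) for small t_1:

M_ij ≈ F(0,t_1,T_i) F(0,t_1,T_j) ∫_0^{t_1} dt ∫_{t_1}^{T_i} dx ∫_{t_1}^{T_j} dx′ σ(t,x)D(x−t,x′−t)σ(t,x′) ≈ t_1 P(0,T_i) P(0,T_j) ∫_0^{T_i} dθ ∫_0^{T_j} dθ′ σ(θ)D(θ,θ′)σ(θ′)

since F(0,t_1,T_i) → P(0,T_i) as t_1 → 0; this is t_1 times the instantaneous covariance used in section 7.2, and the same expansion applied to (7.31) recovers L_i in the same way.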
One very interesting difference between instantaneous hedging and finite time hedging is that
the finite time result depends on the value of α. In the calculation above, we used the risk-neutral
α obtained for the money market numeraire. However, the market does not follow the risk-neutral
measure, and it would be better to use a value of α estimated from the market for any practical
use of this method. This difference is expected since, in the very short term, the stochastic
term dominates and the drift is inconsequential. This is not the case over a finite time, where
the drift becomes important (it is not difficult to see that the importance of the drift grows
with the time horizon).

7.4 Empirical Results for Finite Time Hedging

We now present the empirical results for hedging a bond with other bonds, both for the best fit
of the constant rigidity field theory model and for the fully empirical propagator. The calculation
of L and M was again carried out using simple trapezoidal integration, and σ was assumed to
[Plot: residual variance for hedging a five year ZCP with one other bond over a one year interval, vs. maturity of the hedging bond (years).]

Figure 7.13: Residual variance when a five year bond is hedged with one other bond (best fit of
the constant rigidity field theory model) with a time horizon of one year
[Plot: hedge ratio for hedging a five year ZCP with one other bond over a one year interval, vs. maturity of the hedging bond (years).]

Figure 7.14: Hedge ratio when a five year bond is hedged with one other bond (best fit of the
constant rigidity field theory model) with a time horizon of one year

be purely a function of θ = x − t so that all the integrals over x were replaced by integrals over
θ. The bond to be hedged was chosen to be the five year zero coupon bond and the time horizon
t1 was chosen to be one year.
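
The following is a minimal sketch of such a trapezoidal evaluation of B_ij in (7.24), and hence of M_ij and L_i through (7.30) and (7.31). It is written in the style of the appendix programs but is not the code actually used; sigma() and D() below are hypothetical placeholders for the fitted volatility and propagator, which are taken to depend only on θ = x − t as stated above.

#include <math.h>
#include <stdio.h>

static double sigma (double theta) { return 0.01; }                          /* placeholder */
static double D (double th1, double th2) { return exp (-0.06*fabs (th1 - th2)); } /* placeholder */

/* weight for the trapezoidal rule on n+1 equally spaced points */
static double w (int k, int n) { return (k == 0 || k == n) ? 0.5 : 1.0; }

double Bij (double t1, double Ti, double Tj, double eps)
{
    int nt = (int)(t1/eps), ni, nj, a, b, c;
    double t, x, xp, sum = 0.0;

    for (a = 0; a <= nt; a++) {
        t = a*eps;
        ni = (int)((Ti - t1)/eps);
        nj = (int)((Tj - t1)/eps);
        for (b = 0; b <= ni; b++) {
            x = t1 + b*eps;
            for (c = 0; c <= nj; c++) {
                xp = t1 + c*eps;
                sum += w (a, nt)*w (b, ni)*w (c, nj)*eps*eps*eps
                     * sigma (x - t)*D (x - t, xp - t)*sigma (xp - t);
            }
        }
    }
    return sum;
}

int main (void)
{
    double B = Bij (1.0, 5.0, 5.0, 0.25);
    printf ("B_55 = %g, M_55/(F_5 F_5) = %g\n", B, exp (B) - 1.0);
    return 0;
}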
The results for the best fit of the constant rigidity field theory model (see figures 7.1 and
7.2) for the residual variance and hedge ratio for hedging with one bond are shown in figures
7.13 and 7.14. The residual variance for hedging with two bonds is shown in figure 7.15.
The results for the fully empirical quadratic fit (see figures 7.6 and 7.7) for the residual
variance and hedge ratio for hedging with one bond are shown in figures 7.16 and 7.17. The
residual variance for hedging with two bonds is shown in figure 7.18.
One interesting result is that the actual residual variance after hedging over a finite time
horizon is lower than a naive extrapolation of the infinitesimal hedging result would suggest. This
seems to be due to the shrinking of the integration domain: the contribution of each bond to the
variance falls as the time horizon increases. This is very clear when the maturity of a bond is
close to the hedging horizon, as the volatility of a bond falls quickly as it approaches maturity.
Apart from this reduction, the results look very similar to the infinitesimal case. This is
probably because the volatility is quite small, so the nonlinear effects in the covariance matrix
(7.30) are not apparent. If very long time horizons (ten years or more) and long
[Surface plot: residual variance for hedging a five year ZCP with two other bonds over a one year interval (constant rigidity fit), vs. the maturities of the two hedging bonds (years).]

Figure 7.15: Residual variance when a five year bond is hedged with two other bonds (best fit
of the constant rigidity field theory model) with a time horizon of one year
[Plot: residual variance for hedging a five year ZCP with one other bond over a one year interval (empirical fit), vs. maturity of the hedging bond (years).]

Figure 7.16: Residual variance when a five year bond is hedged with one other bond (best
empirical fit) with a time horizon of one year

term bonds are considered, the results will probably be quite different. Comparing figures 7.15
and 7.18, we see that the greater improvement obtained by hedging with more than one bond when
the empirical propagator, rather than the field theory model propagator, is used carries over to
the finite time case as well.
[Plot: hedge ratio for optimal hedging of a five year ZCP with one other bond over a one year interval (empirical fit), vs. maturity of the hedging bond (years).]

Figure 7.17: Hedge ratio when a five year bond is hedged with one other bond (best empirical
fit) with a time horizon of one year

[Surface plot: residual variance for hedging a five year ZCP with two other bonds over a one year interval (empirical fit), vs. the maturities of the two hedging bonds (years).]

Figure 7.18: Residual variance when a five year bond is hedged with two other bonds (best
empirical fit) with a time horizon of one year
Chapter 8

Non-Linear Field Theory Models

In the previous chapters, we have considered field theory models where the volatility σ did not
depend on the forward rates. In principle, there is no reason for this restriction. Further, it is
also possible that the volatility is itself stochastic. This idea has been used to
extend the Black-Scholes model and has been extensively studied in this context [85, 86, 87,
extend the Black-Scholes model and has been extensively studied in this context [85, 86, 87,
59, 88, 89, 16, 1]. The main motivation for this idea was the systematic strike price and time-
to-maturity anomalies in stock option prices found in Rubinstein [90] and Sheikh [91]. For
stocks, the idea also has considerable direct empirical support [92]. For interest rate models,
the evidence is not quite as clear as in stocks but Amin and Morton [93] found systematic strike
price and time-to-maturity biases for six common volatility specifications in the HJM model
and suggested that stochastic volatility might resolve these anomalies. Heston [89] introduced
stochastic volatility directly into the evolution of bond prices but this was not in the HJM
framework. More recently, Warachka [94] has developed an extension to the HJM model with
stochastic volatility. Baaquie [48] has generalized the field theory model we have previously
discussed to include stochastic volatility and volatility dependent on forward rates. This model
includes all previous models as special cases. We now review this model and analyse it from
a slightly different point of view. We also present some numerical results derived from Monte
Carlo simulations of these models.

8.1 A general Hamiltonian for the bond prices

We first consider the Hamiltonian described by the stochastic process for the bond prices dis-
counted by the money market account. We will then consider the generalization to the Hamil-
tonians with other numeraires later. We only consider locally Gaussian processes though it is
straightforward to extend this to more general Markov processes. Non-Markovian processes
can be generated by introducing additional fields such that the enlarged process is Markovian, as,
for example, in the formulation of stochastic volatility. The effective Hamiltonian for the bonds
themselves would then describe a non-Markovian process, since effective Hamiltonians do not
necessarily satisfy the semi-group property (or Chapman-Kolmogorov equation in probabilistic
language). We will see how this is done when we consider stochastic volatility models (for simple
one-dimensional systems, see Srikant [1] and the references therein).


8.2 General Gaussian model for the bonds

In the first three chapters, we presented a way of developing models for the pricing of contingent
claims. We assumed that we should start with a Hamiltonian for the discounted traded assets
equivalent to the market Hamiltonian and then use that Hamiltonian to price contingent claims.
In the case of bonds, we did not follow this approach but rather reviewed the modelling of
the forward rates which are not traded assets. This was mainly because the forward rates are
intuitively more reasonable quantities to work with and since most of the interest rate literature
concerns the modelling of the spot rate or forward rates. In this section, we very briefly present
that approach and show how it gives the same results. It also has the advantage of making the
identification of the risk neutral measure in terms of the forward rates very simple.

For locally Gaussian models, we see that the Hamiltonian (evolving in the ∂t direction) for
the bond prices is given by
H(t) = (1/2) ∫_t^∞ dT dT′ C_P(t,T,T′,P) P(t,T) P(t,T′) δ²/[δP(t,T) δP(t,T′)]
       + ∫_t^∞ dT f(t,t) P(t,T) δ/δP(t,T)    (8.1)
where the covariance structure of the innovations in the bond prices CP has been written in
the above manner to simplify the transformation to the forward rates. Since CP can depend
(almost) arbitrarily on P , this has no real effect except for the terminology. In general, we
can put in an arbitrary positive definite pseudo-differential operator for the Hamiltonian (re-
call that positive definite pseudo-differential operators have a one to one correspondence with
generators for Markov processes). The drift process must have the structure above since the
bonds discounted by the money market are martingales (that is, exactly the same reason why
the risk-neutral process for the stocks had drift rS). If we considered the discounted assets
(that is, bonds discounted by the money market) the drift term would disappear and the Hamil-
tonian would then annihilate all the discounted traded assets. The extra drift term here is of no
consequence when we model forward rates as we see below.
We now transform the Hamiltonian to the forward rates. We see from the definition of the
forward rates that the transformation is given by
δ/δP(T) = −(1/P(T)) ∂/∂θ [ δ/δf(θ) ],   θ = T − t    (8.2)

δ/δf(θ) = −∫_{t+θ}^∞ dT P(T) δ/δP(T)    (8.3)
Applying this to (8.1) we obtain the Hamiltonian in terms of the forward rates for the risk-
neutral process
(1/2) ∫_0^∞ dθ dθ′ [∂²C(t,θ,θ′,f)/∂θ∂θ′] δ²/[δf(θ) δf(θ′)] + ∫_0^∞ dθ ∫_0^θ dθ′ [∂²C(t,θ,θ′,f)/∂θ∂θ′] δ/δf(t,θ)    (8.4)
where the drift term for the bonds has been completely eliminated since ∂f(t,t)/∂θ = 0. Hence,
we see that the correlation structure of the innovations in the forward rates is related to the
scaled innovations in the bond prices by

C_f(t,θ,θ′,f) = [∂²/∂T∂T′] C_P(t,T,T′,P),   θ = T − t, θ′ = T′ − t    (8.5)

where the relation P(T) = exp( −∫_0^{T−t} dθ f(t,θ) ) has to be put in to express the covariance
purely in terms of f. Conversely, we have

C_P(t,T,T′,P) = ∫_0^{T−t} dθ ∫_0^{T′−t} dθ′ C_f(t,θ,θ′,f)    (8.6)

where the converse relation f(t,θ) = −(1/P(t,T)) ∂P(t,T)/∂T, T = t + θ, has to be used to express
C_P purely in terms of P. We also note that the drift, when expressed in terms of C_f, has the form

D_f(t,θ,f) = ∫_0^θ dθ′ C_f(t,θ,θ′,f)    (8.7)

This relation can be seen to be correct in the case of the simple field theory model where
the correlation structure of the innovations in the forward rates was given by

Cf (θ, θ′ ) = σ(θ)D(θ, θ′ )σ(θ′ ) (8.8)

and the covariance structure of the scaled innovations of the bond prices is given by
C_P(T,T′) = ∫_0^{T−t} dθ ∫_0^{T′−t} dθ′ σ(θ)D(θ,θ′)σ(θ′)    (8.9)

We note that the drift term in (8.4) comes directly from the transformation and holds irrespective
of the form of the covariance structure involved, which can even be stochastic in nature (since
we are using the Itô formalism, this does mean that the form of the covariance is known the
instant before the innovation, which must be the case for this to be reasonable).

8.3 Volatility dependent on forward rates

8.3.1 Definition of the model

One flaw in the HJM model with non-forward rate dependent volatility and the simple field
theory model we reviewed earlier is that they allow negative forward rates which are impossible
in reality. This is because the forward rates follow a Gaussian distribution which has a support
over the entire set of real numbers. It is more realistic to use a model where the forward rates
are constrained to be positive. We must therefore use a process which has a support only over
the positive numbers. This can be done by considering processes that are known to have this
behaviour such as the Cox-Ingersoll-Ross process. We will now present the theory for the
general process where σ(t, θ) depends on f (t, θ). This section was inspired by Baaquie [48]
and follows it closely in many respects but the approach is different and is more suited for
numerical simulation.
We noted when showing the equivalence between the Santa-Clara and the field theory model
that the simple field theory model was equivalent to the Itô stochastic differential equation
ḟ(t,x) = α(t,x) + σ(t,x) ∫_t^x dx′ √D(x−t, x′−t) η(t,x′)    (8.10)

or, in the θ variable,

∂f(t,θ)/∂t − ∂f(t,θ)/∂θ = α(θ) + σ(θ) ∫_t^x dx′ √D(θ,θ′) η(t,θ′)    (8.11)

where √D(x,x′) is a function such that

∫ dx″ √D(x,x″) √D(x′,x″) = D(x,x′)    (8.12)

where D is a positive definite function which can be interpreted as the covariance structure of
(ḟ − α)/σ, and η(t,x) is white noise with the correlation function

⟨η(t,x) η(t′,x′)⟩ = δ(t−t′) δ(x−x′)    (8.13)

Since we now want to make σ a function of f , we can model it with the stochastic differential
equation

∂f(t,x)/∂t = α(t,x,f) + σ(t,x,f(t,x)) ∫_t^x dx′ √D(x−t, x′−t) η(t,x′)    (8.14)

We have to carefully see what this means by discretizing the system. We perform the discretiza-
tion as explained in chapter 5 to obtain

f_{i+1,j} − f_{i,j} = ε α_{i,j}(f_i) + σ_{i,j}(f_{i,j}) √ε Σ_{k=1}^{j} (√D)_{j,k} Z_{i,k}    (8.15)

where each Z_{i,k} is a standard normal variable (note that we are using the Itô discretization) and
f_i stands for the entire row of variables f_{i,j}. If we now multiply throughout by the matrix
(√D)^{-1}_{l,j} and sum over j, we get

Σ_j (√D)^{-1}_{l,j} ( f_{i+1,j} − f_{i,j} − ε α_{i,j}(f_i) ) / σ_{i,j}(f_{i,j}) = √ε Z_{i,l}    (8.16)

Since the path integral is the transition amplitude between two successive time slices, we can
write down the transition amplitude for each time slice using the fact that each Z_{i,j} follows the
standard normal distribution. The result is

(2πε)^{−N/2} exp( −(1/2ε) Σ_{j,k} [ (f_{i+1,j} − f_{i,j} − ε α_{i,j}(f_i)) / σ_{i,j}(f_{i,j}) ] D_{j,k}^{-1} [ (f_{i+1,k} − f_{i,k} − ε α_{i,k}(f_i)) / σ_{i,k}(f_{i,k}) ] )    (8.17)

or

(2πε)^{−N/2} exp( −(ε/2) Σ_{j,k} [ ((f_{i+1,j} − f_{i,j})/ε − α_{i,j}(f_i)) / σ_{i,j}(f_{i,j}) ] D_{j,k}^{-1} [ ((f_{i+1,k} − f_{i,k})/ε − α_{i,k}(f_i)) / σ_{i,k}(f_{i,k}) ] )    (8.18)

If D −1 in the continuum limit is a local differential operator, we see that the action is given by
(after taking the product over the time slices)
S = −(1/2) ∫ dt ∫ dx [ (ḟ(t,x) − α(t,x,f)) / σ(t,x,f(t,x)) ] D^{-1} [ (ḟ(t,x) − α(t,x,f)) / σ(t,x,f(t,x)) ]    (8.19)

Note that the path integral is over the variables Z. If we want to write the integrals in terms of
f , we have to consider the Jacobian of the transformation (8.16) which gives
1 / ( √(det D) ∏_j σ_{i,j}(f_{i,j}) )    (8.20)
at each time slice. This measure term has to be included in the path integral if defined in
terms of f . The determinant of D is irrelevant since it gives a constant but the term σ must be
accounted for separately. Hence, the path integral for the stochastic differential equation (8.14)
is given by

∫ Df σ^{-1} e^S    (8.21)

where

Df σ^{-1} = ∏_x ∫_{−∞}^{∞} df(t,x) / σ(t,x,f(t,x))    (8.22)

We can change variables in (8.14) easily using Itô's formula. For a transformation g(t,x) = g(f(t,x)), we get

ġ(t,x) = α(t,x,f) ∂g/∂f + (1/2) σ²(t,x,f(t,x)) D(x−t,x−t) ∂²g/∂f² + σ(t,x,f(t,x)) (∂g/∂f) ∫_t^x dx′ √D(x−t, x′−t) η(t,x′)    (8.23)

We can then denote the drift in the new variable,

α(t,x,f) ∂g/∂f + (1/2) σ²(t,x,f(t,x)) D(x−t,x−t) ∂²g/∂f²    (8.24)

by α_g, and the new standard deviation,

σ(t,x,f(t,x)) ∂g/∂f    (8.25)

by σ_g. We normally make this transformation so that σ_g no longer depends on g, which makes
simulations considerably simpler.
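
As a simple illustration of such a transformation (a sketch, assuming a CIR-type volatility σ(t,x,f) = σ_0 √f of the kind used in the simulations of section 8.5): choosing g = 2√f gives ∂g/∂f = f^{−1/2} and ∂²g/∂f² = −(1/2) f^{−3/2}, so that

σ_g = σ_0 √f · f^{−1/2} = σ_0,    α_g = [ α(t,x,f) − (σ_0²/4) D(x−t,x−t) ] f^{−1/2}

i.e. the noise term becomes additive (independent of g) and only the drift has to be recomputed at each step.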
The term α, which we have so far left arbitrary, is fixed by the martingale property. Since the
covariance structure for the innovations in the forward rates is given by

C_f(t,θ,θ′,f) = σ(θ, f(t,θ)) D(θ,θ′) σ(θ′, f(t,θ′))    (8.26)

we see from (8.7) that

α(t,x,f) = σ(t,x,f(t,x)) ∫_t^x dx′ D(x−t, x′−t) σ(t,x′,f(t,x′))    (8.27)

We note two things. First, the function α is nonlocal in f, which makes analysis using standard
stochastic field theory, as in Zinn-Justin [95], inapplicable; hence, we are largely restricted to
numerical simulation. Second, α has to be recomputed at every time step of a simulation, which
unfortunately makes simulations very heavy on CPU time, but this cannot be avoided. We can also
deduce α by writing down the Hamiltonian from the covariance structure of the innovations and
using the fact that the Hamiltonian must annihilate all the traded assets.
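
As a condensed sketch of how this works on the lattice (it parallels the MakeAlpha and GenNextf routines of Appendix B but is not that code; sigma_of_f() is a hypothetical placeholder, and the discretized propagator covD, its Cholesky factor sqrtD and the standard normal draws z are assumed to be supplied), one Itô time step of (8.15), with the drift (8.27) recomputed from the current forward rate curve, looks as follows.

#include <math.h>
#include <stdlib.h>

/* hypothetical placeholder for a forward-rate dependent volatility */
static double sigma_of_f (double f) { return 0.1*sqrt (f); }

/* One Ito step of (8.15); covD[j][k] discretizes D(theta_j, theta_k),
   sqrtD is its Cholesky factor and z[] holds independent N(0,1) draws. */
void ito_step (double *f, double **covD, double **sqrtD, double *z,
               int n, double eps)
{
    double *alpha = malloc (n*sizeof (double));
    double *fnew  = malloc (n*sizeof (double));
    int j, k;

    /* drift (8.27): trapezoidal rule over theta' from 0 up to theta_j */
    for (j = 0; j < n; j++) {
        double sum = 0.0;
        for (k = 0; k <= j; k++) {
            double wgt = (k == 0 || k == j) ? 0.5 : 1.0;
            sum += wgt*eps*covD[j][k]*sigma_of_f (f[k]);
        }
        alpha[j] = sigma_of_f (f[j])*sum;
    }
    /* Euler (Ito) update (8.15); correlated noise from the Cholesky factor */
    for (j = 0; j < n; j++) {
        double noise = 0.0;
        for (k = 0; k <= j; k++) noise += sqrtD[j][k]*z[k];
        fnew[j] = f[j] + eps*alpha[j] + sigma_of_f (f[j])*sqrt (eps)*noise;
    }
    for (j = 0; j < n; j++) f[j] = fnew[j];
    free (alpha);
    free (fnew);
}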

We can now draw the link between this model and the model presented in Baaquie [48], which is
equivalent. In Baaquie [48], the variable modelled was

φ(t,x) = ln( f(t,x) / f_0 )    (8.28)

where f_0 is a positive constant. σ_φ is assumed to be given by

σ_φ(t,x,φ(t,x)) = σ_0(x−t) e^{νφ(t,x)}    (8.29)

where σ_0 is a non time-varying function, so that

σ = σ_0(x−t) f^{ν+1}(t,x) / f_0^{ν+1}    (8.30)

Since

∂φ/∂f = 1/f = e^{−φ}    (8.31)

we see that α_φ is given from an application of (8.24) and (8.7) by [48]¹

α_φ(t,x,f) = −σ_0²(x−t) D(x−t,x−t) e^{2νφ(t,x)} / (2f_0)
             + σ_0(x−t) e^{νφ(t,x)} ∫_t^x dx′ D(x−t, x′−t) σ_0(x′−t) e^{(ν+1)φ(t,x′)}    (8.32)
The action in this variable can be calculated from the new stochastic differential equation (8.23),
in the same manner as we did for the f variable, to give [48]

S = −(1/2) ∫_0^∞ dt ∫_t^{t+T_FR} dx [ ( (f_0 ∂φ(t,x)/∂t − α(t,x)) / (σ_0(x−t) e^{νφ(t,x)}) )²
      + (1/µ²) ( ∂/∂x [ (f_0 ∂φ(t,x)/∂t − α(t,x)) / (σ_0(x−t) e^{νφ(t,x)}) ] )² ]    (8.33)

where we see that the local differential operator being modelled is (after a straightforward
integration by parts)

D^{-1} = 1 − (1/µ²) ∂²/∂x²    (8.34)
µ ∂x
The path integral must include a measure dependent term, as discussed earlier, and is defined by [48]

Z = ∫ Df σ^{-1} e^{S[f]} = ∫ Dφ σ_φ^{-1} e^{S[φ]} = ∫ Dφ f^{−ν} e^{S[φ]}    (8.35)
The Hamiltonian can be written down from the covariance and mean of the innovations in φ, or
calculated from its definition in physics, to give [48]

H(t) = (1/2) ∫_t^{t+T_FR} dx dx′ [σ_0(x−t) e^{νφ(t,x)} / f_0] D(x−t, x′−t) [σ_0(x′−t) e^{νφ(t,x′)} / f_0] δ²/[δφ(t,x) δφ(t,x′)]
       + ∫_t^{t+T_FR} dx [α(t,x)/f_0] δ/δφ(t,x)    (8.36)

¹ The first term of (8.32) is missing in Baaquie [48] due to a typographical error.

We finally note one very important point. While we can easily write down the stochastic
differential equations, we have to check carefully whether they have any solution. From the
conditions for the existence of solutions to stochastic differential equations given in chapter 2,
we see that there might not be a solution if α grows faster than linearly in f. In the model
considered above, this means that the process is only well-behaved if ν < −1/2, since α ∼ f^{2ν+2}.
The reason for this is clear if we ignore the stochastic part of the evolution and look at the
drift alone: the drift is of the order of f^{2ν+2}. If we then consider the differential equation

df/dt = f^{2ν+2}/(2ν+1)    (8.37)

whose solution is given by

f(t) = ( f(0)^{−1−2ν} − t )^{−1/(1+2ν)}    (8.38)

we see that it blows up in finite time if ν > −1/2. However, the condition is not too restrictive:
the volatility of the forward rates actually goes as f^{ν+1}, so that in terms of the volatility we
are only restricted to powers less than or equal to 1/2. This includes many models of interest in
the limit µ → 0, which are detailed in table 8.1, but does not include the Courtadon and linear
proportional HJM models, which can become ill-defined over a finite length of time. These models
are discussed in Amin and Ng [96] from an empirical point of view.

Model                        σ(t, θ)
Ho and Lee                   σ_0
CIR                          σ_0 f^{1/2}(t, θ)
Courtadon                    σ_0 f(t, θ)
Vasicek                      σ_0 e^{−λθ}
Linear Proportional HJM      (σ_0 + σ_1 θ) f(t, θ)

Table 8.1: The models described by (8.33) in the limit µ → 0. Note carefully that σ(t,θ) in this
table means σ_0(θ) f^{ν+1}(t,θ) / f_0^{ν+1}, due to the form of (8.30).

8.4 Stochastic volatility models

Another way of generalizing the simple field theory model is to make the volatility itself a
stochastic field as discussed in Baaquie [48]. The first part of this section is a review of this
model while the second is one that we propose based on ease of simulation. Since the volatility
is positive, we can model it as a field of form σ0 (x − t)e−h(t,x) where σ0 is a positive function.
It will be convenient to assume that σ0 is purely a function of x − t which is reasonable as we
would like the model to be time translation invariant.
Empirically, the behaviour of the volatility is complex, as can be seen from figures 8.1 and 8.2.
There seems to be a base volatility with large scale increases over certain periods (these seem to
follow a smooth wave-like structure, but this is probably an artifact of the averaging). The
shorter end of the forward rate curve clearly has greater volatility than the longer end. The
volatility of volatility for thirty day moving average volatilities is shown in figure 8.3.
[Surface plot: hundred day moving average of the volatility vs. days and theta/quarter.]

Figure 8.1: A plot of the hundred day moving average of the volatility (defined as hδf 2 (t, θ)i
over one hundred days after the starting day)

[Surface plot: logarithm of the hundred day moving average of the volatility vs. days and theta/quarter.]

Figure 8.2: A plot of the logarithm of the hundred day moving average of the volatility (defined
as hδf 2 (t, θ)i over one hundred days after the starting day)

8.4.1 The first model

From the discussion in chapter 2 about the Hamiltonian, we see that if we have the instantaneous
covariances A(t,x,x′) between δf(t,x) and δf(t,x′), B(t,x,x′) between δf(t,x) and δh(t,x′), and
C(t,x,x′) between δh(t,x) and δh(t,x′), and the instantaneous drifts D(t,x) for δf(t,x) and E(t,x)
for δh(t,x) (δ representing the change in the evolution direction), we can immediately write down
the Hamiltonian as

H(t) = (1/2) ∫_t^{t+T_FR} dx dx′ [ A(t,x,x′) δ²/[δf(t,x)δf(t,x′)] + 2B(t,x,x′) δ²/[δf(t,x)δh(t,x′)] + C(t,x,x′) δ²/[δh(t,x)δh(t,x′)] ]
       + ∫_t^{t+T_FR} dx [ D(t,x) δ/δf(t,x) + E(t,x) δ/δh(t,x) ]    (8.39)

Hence, if we specify the model in terms of these four functions (D(t,x) is specified by the
martingale condition to be ∫_t^x dx′ A(t,x,x′), as discussed earlier), we can immediately analyse
it using the above general Hamiltonian.
Alternatively, we can specify the model in terms of a stochastic differential equation of the
[Plot: volatility of volatility, std(30 day std(f)) vs. time/year.]

Figure 8.3: The volatility of volatility

form

( (ḟ(t,x) − α) / (σ_0(x−t) e^{−h(t,x)}) , (ḣ(t,x) − β(x−t)) / ξ(x−t) )^T = √D ( η_1(t,x) , η_2(t,x) )^T    (8.40)

where √D is a matrix in the finite dimensional case, or a generalization of the √D of the one
field case, that satisfies √D^T √D = D, where the matrix D represents the entire covariance
structure between the innovations in the two fields f and h.
One more alternative is to specify the model in terms of the Lagrangian. In this case, we
have a matrix differential operator which has to be inverted to give us the necessary correlation
functions which are required to specify the Hamiltonian or stochastic differential equation.
The martingale condition can then be applied to find the relation between the parameters in the
Lagrangian in the risk-neutral measure.
We first review this approach. In Baaquie [48], the following specific action was considered:

S = −∫_0^∞ dt ∫_t^{t+T_FR} dx [ (A − ρB)²/(2(1−ρ²)) + B²/2 + (1/(2µ²))(∂A/∂x)² + (1/(2κ²))(∂B/∂x)² ]    (8.41)

where

A = (∂f/∂t − α(t,x)) / (σ_0(x−t) e^{−h(t,x)})    (8.42)

B = (∂h/∂t − β(x−t)) / ξ(x−t)    (8.43)
We have also assumed that β and ξ are purely functions of x − t to preserve time translation
invariance. We can always make σ0 , β and ξ dependent on t if required with no change in the
analysis.
As usual, the path integral measure is given by the variable f scaled by the volatility e^{−h}
(see the earlier discussion of the model where the volatility is a function of the forward rates)
and the variable h, which does not have to be scaled as it does not have a non-trivial volatility.
More specifically, the measure for the path integral

Z = ∫ Df Dσ^{-1} e^S    (8.44)

is given by

Df Dσ^{-1} = ∏_{(t,x)∈P} ∫_{−∞}^{∞} df(t,x) dσ^{-1}(t,x) = ∏_{(t,x)∈P} ∫_{−∞}^{∞} df(t,x) dh(t,x) e^{h(t,x)}    (8.45)

where P is the domain 0 < t < ∞, t < x < t + T_FR.


The motivation for the form of the action in (8.41) was that in the limit µ → 0, κ → 0, we get
a one-factor HJM stochastic volatility model with the correlation between f (t, x) and h(t, x)
given by ρ. The model we present later also has this property.
The Hamiltonian can be worked out using the Fourier conjugate variables for f and h as
in Baaquie [48]. However, since this action is also quadratic, we see that inverting the matrix
differential operator will directly give us the covariances which we can plug into (8.39) to give
us the Hamiltonian.
The matrix differential operator for this action is

D^{-1} = ( 1/(1−ρ²) − (1/µ²)∂²/∂x²        −ρ/(1−ρ²)
           −ρ/(1−ρ²)                      1/(1−ρ²) − (1/κ²)∂²/∂x² )    (8.46)

This can be inverted using

1/det D^{-1} = [ 1/(1−ρ²) − (1/µ² + 1/κ²)∂² + (1/(µ²κ²))∂⁴ ]^{-1} = (1/(r_− − r_+)) [ (∂² − r_+)^{-1} − (∂² − r_−)^{-1} ]    (8.47)

where r_+ and r_− are the roots of the quadratic equation

1/(1−ρ²) − (1/µ² + 1/κ²) z + (1/(µ²κ²)) z² = 0    (8.48)
The inverse is then shown to be given by [48]

D = c ( D_− − D_+ + ((1−ρ²)/κ²)(r_+ D_+ − r_− D_−)        ρ(D_− − D_+)
        ρ(D_− − D_+)                                      D_− − D_+ + ((1−ρ²)/µ²)(r_+ D_+ − r_− D_−) )    (8.49)

where

D_+ = (∂² − r_+)^{-1}    (8.50)

D_− = (∂² − r_−)^{-1}    (8.51)

r_± = (1/(2(1−ρ²))) [ µ² + κ² ± √((κ² − µ²)² + 4ρ²µ²κ²) ]    (8.52)

c = µ²κ² / √((κ² − µ²)² + 4ρ²µ²κ²)    (8.53)

We now have the required correlation and drift functions for the Hamiltonian. They are
given by

A(t,x,x′) = c σ_0(x−t) e^{−h(t,x)} D_11(x−t, x′−t) σ_0(x′−t) e^{−h(t,x′)}    (8.54)

B(t,x,x′) = c σ_0(x−t) e^{−h(t,x)} D_12(x−t, x′−t) ξ(x′−t)    (8.55)

C(t,x,x′) = c ξ(x−t) D_22(x−t, x′−t) ξ(x′−t)    (8.56)

D(t,x) = ∫_t^x dx′ A(t,x,x′)    (8.57)

E(t,x) = β(x−t)    (8.58)

Putting the above values into (8.39), we get the Hamiltonian for this action functional. Simi-
larly, we also now have the stochastic differential equation.

8.4.2 The second model

This model is based on the stochastic differential equation approach. In this approach, we
model the evolution of the two fields in the following way:

ḟ(t,x) = α(t,x) + σ_0(x−t) e^{−h(t,x)} [ ρ ∫_t^x dx′ √D_1(x,x′) η_1(t,x′) + √(1−ρ²) ∫_t^x dx′ √D_2(x,x′) η_2(t,x′) ]    (8.59)

ḣ(t,x) = β(t,x) + ξ(t,x) ∫_t^x dx′ √D_1(x,x′) η_1(t,x′)    (8.60)

where η_1 and η_2 are uncorrelated white noises:

⟨η_1(t,x) η_1(t′,x′)⟩ = δ(t−t′) δ(x−x′)    (8.61)

⟨η_2(t,x) η_2(t′,x′)⟩ = δ(t−t′) δ(x−x′)    (8.62)

⟨η_1(t,x) η_2(t′,x′)⟩ = 0    (8.63)

We assume that D1−1 and D2−1 are given by two differential operators of the form

D_1^{-1} = 1 − (1/κ²) ∂²/∂x²    (8.64)

D_2^{-1} = 1 − ((1−ρ²)/µ²) ∂²/∂x²    (8.65)
to draw a parallel with the other model and so that we can write a Lagrangian in terms of local
operators (the reason for the inclusion of the term 1 − ρ2 is to make the form of the Lagrangian
simpler). The action can be written in a straightforward way using the decomposition

∫ dx′ √D_1(x,x′) η_1(t,x′) = (ḣ − β)/ξ    (8.66)

∫ dx′ √D_2(x,x′) η_2(t,x′) = (1/√(1−ρ²)) [ (ḟ − α)/σ − ρ (ḣ − β)/ξ ]    (8.67)

to give the action

S = −(1/2) ∫_0^∞ dt ∫_t^{t+T_FR} dx ( A  B ) D^{-1} ( A  B )^T    (8.68)

where

D^{-1} = ( 1/(1−ρ²) − (1/µ²)∂²/∂x²           −ρ/(1−ρ²) + (ρ/µ²)∂²/∂x²
           −ρ/(1−ρ²) + (ρ/µ²)∂²/∂x²          1/(1−ρ²) − (1/κ² + ρ²/µ²)∂²/∂x² )
where A and B are defined in (8.42) and (8.43) respectively. The path integral described by
this action has the same measure terms as for the previous model and is described in (8.45).
The correlation structure implied by this model is more easily worked out using the stochastic
differential equations (8.59) and (8.60) than the action above. Using the correlation properties
of the white noises, we immediately obtain the following correlation structure, which is the
inverse of the matrix differential operator in the action:

( ρ²D_1 + (1−ρ²)D_2      ρD_1
  ρD_1                   D_1 )    (8.69)

where D_1 and D_2 are the inverses (Green's functions) of the operators D_1^{-1} and D_2^{-1} with
the boundary conditions we have decided to impose. Due to this correlation structure, we see that
the term α is now fixed to be

α(t,x) = σ_0(x−t) e^{−h(t,x)} ∫_t^x dx′ [ ρ²D_1(x,x′) + (1−ρ²)D_2(x,x′) ] σ_0(x′−t) e^{−h(t,x′)}    (8.70)

in the martingale measure. The same correlation structure is obtained by inverting the matrix
differential operator in the action (8.68), the roots of the quadratic equation corresponding to
(8.48) being given by κ² and µ²/(1−ρ²).

8.5 Monte Carlo Simulation

The Monte Carlo simulation of the actions presented above is best carried out using the equiv-
alent Langevin equations, as consecutive configurations are then completely uncorrelated. This
makes the simulation much more efficient.
We note some of the practical details of the simulation. The correlation matrices were found
analytically by inverting the corresponding differential operators. These matrices were then
discretized and the "square root" of the correlation matrix was found by Cholesky decomposition.
This technique enables us to find, given a positive definite symmetric matrix D, a lower
triangular matrix L with the property that L L^T = D (see, for example, Press et al [75] or
Artemiev and Averina [97]), which is the discrete analog of the relation (8.12). The Cholesky
decomposition is achieved by the recursive formula

l_ij = ( D_ij − Σ_{k=1}^{j−1} l_ik l_jk ) / √( D_jj − Σ_{k=1}^{j−1} l_jk² ),   i ≥ j    (8.71)

The Gaussian random variables η were produced by the standard algorithm in pairs
p p
η1 = −2 ln α1 cos 2πα2 , η2 = −2 ln α1 sin 2πα2 (8.72)

where α_1 and α_2 are uniformly distributed random numbers in the range (0,1), produced by the
C library's drand48 function, which generates pseudo-random numbers in this range with the long
period 2^48, practically infinite for our purposes.

The simulation then proceeds with either (8.15) or the discrete version of (8.66) on a discrete
lattice as defined in chapter 5. In practice, we actually set up the lattice in the θ variable for
convenience, since the covariance matrices are functions of this variable; this, of course, makes
no difference to the results. The simulations were carried out at two lattice spacings, one half
the other, to ensure that no serious errors resulted from the finite spacing. The risk neutral
condition for both the bond (that is, E[e^{−∫_{t_0}^{t_1} r(t)dt} P(t_1,T)] = P(t_0,T)) and the
discount factor, which is the inverse of the money market account (that is,
E[e^{−∫_{t_0}^{t_1} r(t)dt}] = P(t_0,t_1)), was checked for each case and found to be correct.
Simple values were used for all the initial conditions and parameters, as the aim is to
understand how the change in model affects bond and option prices. The initial forward rate
curve was fixed at 5% for all the simulations. Further, the functions σ_0 were taken to be
constant. The correlation function assumed was that given by (5.83) with µ = 0.06/year, the value
fitted from the empirical data as explained in chapter 6. The simulations were done over a time
period of one year for the distribution and option pricing of a two year zero coupon bond.
The non-Gaussian nature of the process is best described by the implied volatility graph of
option prices, which shows the value of σ for the bond that recreates each option price assuming
a Gaussian process. The σ value for different strike prices would be constant for a Gaussian
process (showing that a unique σ can characterize the process) but differs across strikes for a
non-Gaussian one. More specifically, it is the value of q that recreates the option price, which
is given in the Gaussian models by

P(t_0,T) N(d_1) − K P(t_0,t_1) N(d_2)    (8.73)

with

d_1 = [ ln( F(t_0,t_1,T)/K ) + q²/2 ] / q    (8.74)

d_2 = d_1 − q    (8.75)

We can see how strongly non-Gaussian the processes described above are by looking at how much
this implied σ varies with strike price. We can also directly look at the density functions of
the bond prices, which we also do.
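
The implied q can be backed out numerically from the simulated option price; the following is a minimal sketch of a bisection on (8.73) (not the program actually used; the Monte Carlo price and the bond prices below are hypothetical inputs, and the normal distribution function is written in terms of erfc).

#include <math.h>
#include <stdio.h>

static double N (double x) { return 0.5*erfc (-x/sqrt (2.0)); }  /* normal CDF */

/* Gaussian-model call price (8.73) for a given q */
static double call_price (double q, double P0T, double P0t1, double F, double K)
{
    double d1 = (log (F/K) + 0.5*q*q)/q;
    double d2 = d1 - q;
    return P0T*N (d1) - K*P0t1*N (d2);
}

/* bisection for the q that reproduces the simulated price */
static double implied_q (double price, double P0T, double P0t1, double F, double K)
{
    double lo = 1e-8, hi = 1.0, mid;
    int it;
    for (it = 0; it < 100; it++) {
        mid = 0.5*(lo + hi);
        if (call_price (mid, P0T, P0t1, F, K) < price) lo = mid; else hi = mid;
    }
    return 0.5*(lo + hi);
}

int main (void)
{
    /* hypothetical numbers: two year bond, one year option, flat 5% curve */
    double P0T = exp (-0.05*2.0), P0t1 = exp (-0.05*1.0);
    double F = P0T/P0t1, K = 0.95, mc_price = 0.004;
    printf ("implied q = %g\n", implied_q (mc_price, P0T, P0t1, F, K));
    return 0;
}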

8.5.1 Results

Volatility Dependent on Forward Rates



For this simulation, we used the function σ(t, x, f (t, x)) = σ0 f . For simplicity, we keep σ0
constant. This function is chosen since it does not lead to problems with divergences due to
α and also because it ensures that the forward rates are always positive, a main failing of the
simpler Gaussian models. We chose σ0 = 0.1 which is significantly larger than the market value
of around 0.04 so as to observe the effects due to the nonlinearity more clearly (note that σ0 is
[Plot: probability density estimate of −log(bond) compared with a normal distribution of equal mean and variance; density vs. −log(bond).]

Figure 8.4: The probability density function estimate for the negative logarithm of the initial
two year zero coupon bond (one year at final time)


dimensionless since the standard deviation has dimensions 1/√year, which is the same as the
dimensions of √f).
We first present the estimate for the probability density function of the negative logarithm
of the bond price (that is, ∫ f dx), which would be Gaussian in the simpler models. This estimate
is shown in figure 8.4 in comparison with a normal distribution of the same mean and variance.
We can see that the probability density function is strongly positively skewed (the skewness is
found to be 0.65). It also has slightly fat tails with the kurtosis being given by 0.57.
The strong positive skewness of the density function is not too surprising, since the solution
of a one dimensional stochastic differential equation with σ ∼ √x is the squared Bessel
process, whose probability densities have large positive skewness.
means that the bond prices are strongly negatively skewed as can be seen in figure 8.5. The
bond prices are negatively skewed even in Gaussian models due to their log-normal behaviour.
Here, we see that this is even more the case (the skewness is given by -0.58).
Since the bond prices are strongly negatively skewed, we would expect the implied volatility
for the European call option to decrease with strike price as the probability of high bond prices
is low. This is the result that we find which is shown in figure 8.6.
Hence, we see that such a process would give rise to a distinctly non-Gaussian process
with strong positive skewness for the distribution of the forward rates which gives rise to a
decreasing implied volatility with increasing strike price.

Stochastic Volatility

For this simulation, we set the volatility function σ0 to be one since all the dependence can be
put directly into h. The drift of the volatility β was chosen to be zero since it is unlikely that
there is a consistent drift though mean-reverting terms are quite plausible. We set h = −3.2
[Plot: probability density estimate of the bond price; density vs. bond price.]

Figure 8.5: The probability density function estimate for the initial two year zero coupon bond
(one year at final time)

and ξ = 1 to approximate the conditions in the market with simple initial conditions. The value
of κ was chosen to be 0.05/year.
The estimate for the probability density of −ln P is presented in figure 8.7. It can be seen
that the sign of the correlation largely determines the skewness of the distribution (the estimated
values being 0.27 for the positively correlated simulation, 0.02 for the uncorrelated simulation
and −0.3 for the negatively correlated simulation). All the distributions have a positive kurtosis
of around 0.3 and show fat tails.
The probability densities for the bond prices show the opposite behaviour to the densities
for the forward rates. These densities are shown in figure 8.8. Since the positive correlation
simulation shows negatively skewed bond prices, we expect to see an implied volatility that
declines with strike price, which is the case as can be seen from figure 8.10. The opposite is true
for the negative correlation simulation, as can be seen from figure 8.11. For zero correlation,
the fat tails result in a volatility "smile", which can be seen in figure 8.9. This behaviour is
somewhat similar to what one finds for stock option pricing with stochastic volatility.

8.6 Conclusion

From the simulations, we see that the nonlinear models are capable of producing significant
non-Gaussian behaviour similar to that observed in the market as, for example, described in
Amin and Ng [96]. Lack of market data, however, prevents us from testing this further and
provides a direction for further work in this area. Another direction of research would be
the theoretical analysis of the model, though the nonlocal nature of the model makes this very
difficult.
[Plot: implied volatility for the simulation with σ = σ_0 √f, using 10,000 configurations; implied volatility vs. strike price.]

Figure 8.6: The implied volatility curve for a European call option of maturity one year on a
two year zero coupon bond (final maturity one year)

[Plot: probability densities of −log(bond) for correlations −0.5, 0 and 0.5; density vs. −log(bond).]

Figure 8.7: The pdfs for different correlations (-0.5, 0, 0.5)


[Plot: probability densities of the bond price for correlations −0.5, 0 and 0.5; density vs. bond price.]

Figure 8.8: The pdfs of the bonds for the correlations (-0.5, 0, 0.5)

[Plot: implied volatility for the stochastic volatility model with zero correlation, using 5,000 configurations; implied volatility vs. strike price.]

Figure 8.9: The implied volatility for the European call option for zero correlation

[Plot: implied volatility for the stochastic volatility model with correlation 0.5, using 5,000 configurations; implied volatility vs. strike price.]

Figure 8.10: The implied volatility for the European call option for 0.5 correlation
[Plot: implied volatility for the stochastic volatility model with correlation −0.5, using 5,000 configurations; implied volatility vs. strike price.]

Figure 8.11: The implied volatility for the European call option for -0.5 correlation
Appendix A

The Generic Program for Fitting the Parameters

/*
 * File : fitmulambda_linear.cc   Version : 1.0.0
 * Revision : 1.0   Date : 18/2/2001
 *
 * Description : This program fits the correlation structure of the
 * forward rate curve to the non-constant rigidity model
 *
 * Created : 16/5/2001   Author : Marakani Srikant
 * Copyright (C) Marakani Srikant 2001
 */

#include <general-1.2.1.h>
/* Numerical Recipes code included for the actual minimization */
#include "gaussj.c"
#include "mrqcof.c"
#include "covsrt.c"
#include "mrqmin.c"
#define NUM_COL 29
#define NUM_PARAM 2

/* The function to be fitted together with its derivatives with respect
   to the parameters */
void fitfunc (float x, float a[ ], float *yfit, float dyda[ ], int ma)
{
    float theta1, theta2, alpha;

    alpha = sqrt(0.25 + a[2]*a[2]/(a[1]*a[1]));
    theta2 = floor((2*NUM_COL - 1 - sqrt(DSQR(2*NUM_COL - 1) - 8*x + 8))/2);
    theta1 = theta2 + (x - NUM_COL*theta2 + theta2*(theta2 + 1)/2);
    theta1++;
    theta2++;
    assert(theta2 > 0);
    assert(theta1 > theta2);
    theta1 = theta1/4;
    theta2 = theta2/4;
    *yfit = sqrt((-0.5 + alpha + (0.5 + alpha)*pow(1 + a[1]*theta2, 2*alpha))/
                 (-0.5 + alpha + (0.5 + alpha)*pow(1 + a[1]*theta1, 2*alpha)));


dyda[1] = −(alpha*(0.5 + alpha)*


(−(theta2*pow(1 + theta2*a[1],
−1 + 2*alpha)*
(−0.5 + alpha +
(0.5 + alpha)*
pow(1 + theta1*a[1],2*alpha)))
+ theta1*pow(1 + theta1*a[1], −1 + 2*alpha)*
(−0.5 + alpha +
(0.5 + alpha)*
pow(1 + theta2*a[1],2*alpha))))/ 50
(sqrt((−0.5 + alpha +
(0.5 + alpha)*
pow(1 + theta1*a[1],2*alpha))/
(−0.5 + alpha +
(0.5 + alpha)*
pow(1 + theta2*a[1],2*alpha)))*
pow(−0.5 + alpha +
(0.5 + alpha)*
pow(1 + theta2*a[1],2*alpha),2));
dyda[2] = −(pow(1 + theta1*a[1],2*alpha)* 60
(−2 + (−1 + 4*pow(alpha,2))*
log(1 + theta1*a[1])) +
pow(1 + theta2*a[1],2*alpha)*
(2 + pow(1 + 2*alpha,2)*
pow(1 + theta1*a[1],2*alpha)*
(log(1 + theta1*a[1]) −
log(1 + theta2*a[1])) +
log(1 + theta2*a[1]) −
4*pow(alpha,2)*log(1 + theta2*a[1])))/
(sqrt((−0.5 + alpha + 70
(0.5 + alpha)*
pow(1 + theta1*a[1],2*alpha))/
(−0.5 + alpha +
(0.5 + alpha)*
pow(1 + theta2*a[1],2*alpha)))*
pow(−1 + 2*alpha +
(1 + 2*alpha)*pow(1 + theta2*a[1],2*alpha)
,2));
dyda[1] += −a[2]*a[2]/(a[1]*a[1]*a[1]*alpha) * dyda[2];
dyda[2] = a[2]/(a[1]*a[1]*alpha) * dyda[2]; 80
}

/*Function to print out fitted values. Used for debugging*/


void outfittedfunc(float a[ ])
{
float theta1, theta2, alpha;

alpha = sqrt(0.25+a[2]*a[2]/(a[1]*a[1]));
for (theta1 = 0.25; theta1<7.5; theta1 += 0.25) 90
{
for (theta2 = 0.25; theta2<7.5; theta2 += 0.25)
{
if (theta1 > theta2)
{
printf ("%g ", sqrt((−0.5 + alpha + (0.5 + alpha)*pow(1 + a[1]*theta2,2*alpha))/
(−0.5 + alpha + (0.5 + alpha)*pow(1 + a[1]*theta1,2*alpha))));
}

else
{ 100
printf ("%g ", sqrt((−0.5 + alpha + (0.5 + alpha)*pow(1 + a[1]*theta1,2*alpha))/
(−0.5 + alpha + (0.5 + alpha)*pow(1 + a[1]*theta2,2*alpha))));
}
}
printf ("\n");
}
}

int main (int argc, char *argv[ ])


{ 110
int i, nrow, *ia, counter = 0;
FILE *input;
float *a, *x, *y, *sig, **covar, **alpha, chisq, alambda;

if (argc != 4)
{
printf ("Usage : %s input a1 a2\n", argv[0]);
exit (1);
}
nrow = lc (argv[1]); 120
x = vector (1, nrow);
y = vector (1, nrow);
sig = vector (1, nrow);
a = vector (1, NUM PARAM);
ia = ivector (1, NUM PARAM);
covar = matrix (1, NUM PARAM, 1, NUM PARAM);
alpha = matrix (1, NUM PARAM, 1, NUM PARAM);
/*Setting up parameters to be passed to minimization algorithm*/
for (i = 1; i <= nrow; i++)
{ 130
sig[i] = 1;
}
a[1] = atof (argv[2]); a[2] = atof (argv[3]);
for (i = 1; i <= NUM PARAM; i++)
{
ia[i] = 1;
}
alambda = −1;
/*Input the correlation matrix*/
if (!(input = fopen (argv[1], "r"))) 140
{
printf ("Error opening file %s for input\n", argv[1]);
exit (1);
}
else
{
for (i = 1; i <= nrow; i++)
{
fscanf (input, "%f %f ", &x[i], &y[i]);
} 150
}
fclose (input);
/*Perform the minimization of (y-y i)ˆ2*/
mrqmin (x, y, sig, nrow, a, ia, NUM PARAM, covar, alpha, &chisq, fitfunc, &alambda);
do
{

mrqmin (x, y, sig, nrow, a, ia, NUM PARAM, covar, alpha, &chisq, fitfunc, &alambda);
counter++;
}
while ((chisq > 1e−2)&&(counter<200)); 160
alambda = 0;
mrqmin (x, y, sig, nrow, a, ia, NUM PARAM, covar, alpha, &chisq, fitfunc, &alambda);
for (i=1; i <= NUM PARAM; i++)
{
printf ("%g ", a[i]);
}
printf ("%g\n", chisq);
free_matrix (alpha, 1, NUM_PARAM, 1, NUM_PARAM);
free_matrix (covar, 1, NUM_PARAM, 1, NUM_PARAM);
free_ivector (ia, 1, NUM_PARAM);
free_vector (a, 1, NUM_PARAM);
free_vector (sig, 1, nrow);
free_vector (x, 1, nrow);
free_vector (y, 1, nrow);
}
Appendix B

The Simulation Program for Volatility as a Function of Forward Rates

/*
 * File : simulationf.cc   Version : 1.0.0
 * Revision : 1.0   Date : 16/5/2002
 *
 * Description : This program simulates the forward rate curve as a
 * random field.
 *
 * Created : 16/10/2001   Author : Marakani Srikant
 * Copyright (C) Marakani Srikant 2001
 */

/* Basic plan : Calculate the correlation matrix, perform a Cholesky
   decomposition to get the square root of this matrix. Use the square
   root to generate random variables with the correct correlation. */

#include <general-1.2.1.h>

/* Code from Numerical Recipes for Cholesky decomposition */


void choldc(double **a, int n, double p[ ])
{
void nrerror(char error text[ ]);
int i,j,k;
float sum;

for (i=1;i<=n;i++) {
for (j=i;j<=n;j++) {
for (sum=a[i][j],k=i−1;k>=1;k−−) sum −= a[i][k]*a[j][k];
if (i == j) { 30
if (sum <= 0.0)
nrerror("choldc failed");
p[i]=sqrt(sum);
} else a[j][i]=sum/p[i];
}
}
}


/*Function to read in parameters and initial data*/


void ReadInput (char *fileName, double ***f, double *sigma0, double *eps, 40
int *t1, int *t2, double *mu)
{
FILE *input;
int i;

assert (input = fopen (fileName, "r"));


fscanf (input, "%lf %lf %d %d %lf", sigma0, eps, t1, t2, mu);
assert (*f = dmatrix (1, *t1+1, 1, *t2+1));
assert ((*t2+1)%2 == 0);
assert (t2>t1); 50
for (i=1; i<= *t2+1; i++)
{
fscanf (input, " %lf", &((*f)[1][i]));
}
fclose (input);
}

/*This function calculates the correlation matrix*/


void MakeCovar (double **covar, double eps, int size, double mu)
{ 60
int i, j;

for (i=1; i<=size; i++)


{
for (j=1; j<=size; j++)
{
if (j>=i)
{
covar[i][j] = sqrt((exp(−mu*eps*(j−1))*cosh(mu*eps*(i−1)))/(exp(−mu*eps*(i−1))*cosh(mu*eps*(j−1))));
} 70
else
{
covar[i][j] = sqrt((exp(−mu*eps*(i−1))*cosh(mu*eps*(j−1)))/(exp(−mu*eps*(j−1))*cosh(mu*eps*(i−1))));
}
}
}
}

/*This function calculates the vector alpha from the correlation matrix*/
void MakeAlpha (double **covar, double *f, double sigma0, double *alpha, int size, double eps) 80
{
int i, j;
double result;

alpha[1] = 0;
for (i=2; i<=size; i++)
{
for (result=0, j=1; j<=i; j++)
{
/*Perform trapezoidal integration to get alpha[i]*/ 90
if (j==1 || j==i)
{
result += 0.5*eps*covar[j][i]*sigma0*sqrt(f[j]);
}
else
{

result += eps*covar[j][i]*sigma0*sqrt(f[j]);
}
}
alpha[i] = sigma0*sqrt(f[i]) * result; 100
}
}

/*This function makes a Gaussian random vector*/


void MakeRand (double *rand, int size)
{
double r1, r2;
int i;

for (i=1; i<=size/2; i++) 110


{
r1 = drand48();
r2 = drand48();
rand[2*i-1] = sqrt(-2*log(r1))*cos(2*M_PI*r2);
rand[2*i] = sqrt(-2*log(r1))*sin(2*M_PI*r2);
}
}

/*This function transforms the simple random vector to one with the wanted correlation structure*/
void MakeCovRand (double **transform, double *rand, double *covRand, int size)
{
int i, j;

for (i=1; i<=size; i++)


{
covRand[i] = 0;
for (j=1; j<=size; j++)
{
covRand[i] += transform[i][j]*rand[j];
}
}
}

/*This procedure generates the next configuration for the Monte Carlo calculation*/
void GenNextf (double **f, double **covar, double *alpha, double sigma0, double **transform,
int size1, int size, double eps)
{
int i, j;
static double *rand, *covRand, sqrteps;
static int first = 1;

if (first)
{
rand = dvector (1, size);
covRand = dvector (1, size);
first = 0;
sqrteps = sqrt(eps);
}
for (i=2; i<=size1; i++)
{
MakeAlpha (covar, f[i−1], sigma0, alpha, size, eps);
MakeRand (rand, size);
MakeCovRand (transform, rand, covRand, size);
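/* (Euler step of the forward-rate SDE on the shifted grid, where column j at
time step i corresponds to column j+1 at step i-1 so that a fixed maturity date
is tracked: f <- f + eps*alpha + sqrt(eps)*sigma0*sqrt(f)*dW, with dW the
correlated Gaussian increment generated above.) */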
for (j=1; j<=size-i+1; j++)
{
f[i][j] =
f[i-1][j+1]+eps*alpha[j+1]+sqrteps*sigma0*sqrt(f[i-1][j+1])*covRand[j+1];
}
}
}

/*This procedure calculates the quantities of interest*/
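/* (bprice is the zero-coupon bond price at the final simulated time,
exp(-integral of the forward-rate curve over its remaining maturities), and dcf is
the stochastic discount factor exp(-integral of the spot rate f(t,t) over the
simulated period); averaging bprice*dcf over paths should recover the initial
bond price if the drift alpha enforces the martingale condition.) */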


void CalculateQuantity (double **f, int size, int size1, double eps, double *bprice, double *dcf)
{
int i, j, k;

/*Perform trapezoidal integration to get discounted bond value*/


for (i=1, *bprice = 0; i<=size-size1; i++)
{
if ((i==1) || (i==size-size1))
{
*bprice += 0.5*eps*f[size1][i];
}
else
{
*bprice += eps*f[size1][i];
}
}
*bprice = exp(-*bprice);
/*Perform trapezoidal integration to get discount factor*/
for (i=1, *dcf = 0; i<=size1; i++)
{
if ((i==1) || (i==size1))
{
*dcf += 0.5*eps*f[i][1];
}
else
{
*dcf += eps*f[i][1];
}
}
*dcf = exp(-*dcf);
}

int main (int argc, char *argv[ ])


{
/*f is the forward rate field, eps is the lattice spacing, rand is
the uncorrelated random vector while covRand is the correlated
random vector. p is the diagonal of the transformation matrix */
double **f, **covar, **transform, eps, *p, sigma0, mu,
*alpha, quantity, est_quantity, var_quantity, bprice, dcf;
int i, j, t1, t2, nstep;

if (argc != 3)
{
printf ("Usage : %s input file nstep\n", argv[0]);
exit (1);
}
srand48(time(NULL));
ReadInput(argv[1], &f, &sigma0, &eps, &t1, &t2, &mu);
nstep = atoi (argv[2]);
p = dvector (1, t2+1);
alpha = dvector (1, t2+1);


covar = dmatrix (1, t2+1, 1, t2+1);
transform = dmatrix (1, t2+1, 1, t2+1);
MakeCovar (covar, eps, t2+1, mu);
choldc (covar, t2+1, p);
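/* Assemble the full lower-triangular Cholesky factor L (L*L^T = covar) from
choldc's output: off-diagonal elements from the lower triangle of covar,
diagonal elements from p. */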
for (i=1; i<=t2+1; i++)
{
for (j=1; j<=t2+1; j++)
{
transform[i][j] = (j<i) ? covar[i][j] : 0;
}
transform[i][i] = p[i];
}
for (est_quantity=var_quantity=0, i=1; i<=nstep; i++)
{
GenNextf (f, covar, alpha, sigma0, transform, t1+1, t2+1, eps);
CalculateQuantity(f, t2+1, t1+1, eps, &bprice, &dcf);
printf("%g %g\n", bprice, dcf); 230
}
free_dmatrix (f, 1, t1+1, 1, t2+1);
free_dvector (p, 1, t2+1);
free_dmatrix (covar, 1, t2+1, 1, t2+1);
free_dmatrix (transform, 1, t2+1, 1, t2+1);
}
APPENDIX C

THE SIMULATION PROGRAM FOR VOLATILITY AS AN INDEPENDENT FIELD

/*
* File : simulation_stochastic.cc     Version : 1.0.0
* Revision : 1.0                      Date : 30/5/2002
*
* Description : This program simulates the forward rate curve as
* well as the volatility as a random field.
*
* Created : 14/5/2001                 Author : Marakani Srikant
* Copyright (C) Marakani Srikant 2001
*/

/* Basic plan : Calculate the correlation matrix, perform a Cholesky
decomposition to get the square root of this matrix. Use the square
root to generate random variables with the correct correlation.
*/

#include <general-1.2.1.h>

/*Cholesky decomposition routine from Numerical Recipes in C*/


void choldc(double **a, int n, double p[ ])
{
void nrerror(char error_text[ ]);
int i,j,k;
double sum;

for (i=1;i<=n;i++) {
for (j=i;j<=n;j++) {
for (sum=a[i][j],k=i-1;k>=1;k--) sum -= a[i][k]*a[j][k];
if (i == j) {
if (sum <= 0.0)
nrerror("choldc failed");
p[i]=sqrt(sum);
} else a[j][i]=sum/p[i];
}
}
}

/*Procedure to read input of initial values and size of grid*/


void ReadInput (char *fileName, double ***f, double ***h, double **beta,
double **xi, double *eps, int *t1, int *t2, double *mu1,
double *mu2, double *rho)
{
FILE *input;
int i;

assert (input = fopen (fileName, "r"));


fscanf (input, "%lf %d %d %lf %lf %lf", eps, t1, t2, mu1, mu2, rho);
*mu1 = (*mu1)/sqrt(1-(*rho)*(*rho));
assert (*f = dmatrix (1, *t1+1, 1, *t2+1));
assert (*h = dmatrix (1, *t1+1, 1, *t2+1));
assert (*beta = dvector (1, *t2+1));
assert (*xi = dvector (1, *t2+1));
assert ((*t2+1)%2 == 0);
assert (*t2 > *t1);
for (i=1; i<= *t2+1; i++)
{
fscanf (input, " %lf", &((*f)[1][i]));
}
for (i=1; i<= *t2+1; i++)
{
fscanf (input, " %lf", &((*h)[1][i]));
}
for (i=1; i<= *t2+1; i++)
{
fscanf (input, " %lf", &((*beta)[i]));
}
for (i=1; i<= *t2+1; i++)
{
fscanf (input, " %lf", &((*xi)[i])); 70
}
fclose (input);
}

/*Procedure to make the required covariance matrices*/


void MakeCovar (double **covar, double eps, int size, double mu)
{
int i, j;

for (i=1; i<=size; i++)


{
for (j=1; j<=size; j++)
{
if (j>=i)
{
covar[i][j] = mu*exp(-mu*eps*(j))*cosh(mu*eps*(i));
}
else
{
covar[i][j] = mu*exp(-mu*eps*(i))*cosh(mu*eps*(j));
}
}
}
}

/*Procedure to calculate the required matrix for alpha*/
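/* (The covariance used for the drift combines the two correlators with weights
rho^2 (volatility field, parameter kappa) and 1-rho^2 (forward-rate field,
rescaled parameter mu1 = mu/sqrt(1-rho^2)), matching the way the forward-rate
noise is built from the two random fields in GenNextfh below.) */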


void MakeCovarAlpha (double **covar, double eps, int size, double mu, double kappa, double rho)
{
int i, j;
double mu1 = mu/sqrt(1-rho*rho);

for (i = 1; i<=size; i++)


{
for (j=1; j<=size; j++)
{
if (j>=i)
{
covar[i][j] = rho*rho*kappa*exp(-kappa*eps*(j-1))*cosh(kappa*eps*(i-1))
+ (1-rho*rho)*mu1*exp(-mu1*eps*(j-1))*cosh(mu1*eps*(i-1));
}
else
{
covar[i][j] = rho*rho*kappa*exp(-kappa*eps*(i-1))*cosh(kappa*eps*(j-1))
+ (1-rho*rho)*mu1*exp(-mu1*eps*(i-1))*cosh(mu1*eps*(j-1));
}
}
}
}

/*Procedure to calculate alpha at each time step*/
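/* (Same trapezoidal construction as in Appendix B, but here the volatility at
maturity x_j is exp(h[j]), taken from the simulated field h.) */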


void MakeAlpha (double **covar, double *h, double *alpha, int size, double eps)
{
int i, j;
double result;

alpha[1] = 0;
for (i=2; i<=size; i++)
{
for (result=0, j=1; j<=i; j++)
{
/*Perform trapezoidal integration to get alpha[i]*/
if (j==1 || j==i)
{
result += 0.5*eps*covar[j][i]*exp(h[j]);
}
else
{
result += eps*covar[j][i]*exp(h[j]);
}
}
alpha[i] = exp(h[i]) * result;
}
}

/*Procedure to make uncorrelated random vector*/


void MakeRand (double *rand, int size)
{
double r1, r2;
int i;

for (i=1; i<=size/2; i++)
{
r1 = drand48();
r2 = drand48();
rand[2*i-1] = sqrt(-2*log(r1))*cos(2*M_PI*r2);
rand[2*i] = sqrt(-2*log(r1))*sin(2*M_PI*r2);
}
}

/*Procedure to make correlated random vector*/


void MakeCovRand (double **transform, double *rand, double *covRand, int size)
{
int i, j;

for (i=1; i<=size; i++)


{
covRand[i] = 0;
for (j=1; j<=size; j++)
{
covRand[i] += transform[i][j]*rand[j];
}
}
}

/*Procedure to generate configurations for the Monte Carlo simulation*/


void GenNextfh (double **f, double **h, double **covaralpha, double *alpha,
double *beta, double *xi, double **transformf, double **transformh,
int size1, int size, double eps, double rho, double *randf,
double *randh, double *covRandf, double *covRandh)
{
int i, j;
static double sqrteps, k;
static int first = 1;

if (first)
{
first = 0;
sqrteps = sqrt(eps);
k = sqrt(1-rho*rho);
}
for (i=2; i<=size1; i++)
{
MakeRand (randf, size-1);
MakeCovRand (transformf, randf, covRandf, size-1);
MakeRand (randh, size-1);
MakeCovRand (transformh, randh, covRandh, size-1);
MakeAlpha (covaralpha, h[i-1], alpha, size, eps);
/*Implementing the stochastic differential equation*/
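/* (Coupled Euler step on the shifted grid: the forward-rate increment uses the
noise mixture sqrt(1-rho^2)*covRandf + rho*covRandh with amplitude exp(h), so it
shares a common component with the noise driving h, whose own amplitude is xi.) */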
for (j=1; j<=size-i+1; j++)
{
f[i][j] = f[i-1][j+1]+eps*alpha[j+1] +
sqrteps*exp(h[i-1][j+1])*(k*covRandf[j]+rho*covRandh[j]);
}
for (j=1; j<=size-i+1; j++)
{
h[i][j] = h[i-1][j+1]+eps*beta[j+1] +
sqrteps*xi[j+1]*covRandh[j];
}
}
}

/*Procedure to calculate quantities of interest*/
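/* (quantity1 is the zero-coupon bond price at the final simulated time and
quantity2 the stochastic discount factor up to that time, computed by the same
trapezoidal integrations as in Appendix B.) */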


void CalculateQuantities (double **f, int size, int size1, double eps,
double *quantity1, double *quantity2)
{
int i;
double result = 0;

/*Perform trapezoidal integration to get bond value*/


for (i=1; i<=size-size1; i++)
{
if ((i==1) || (i==size-size1))
{
result += 0.5*eps*f[size1][i];
}
else
{
result += eps*f[size1][i];
}
}
*quantity1 = exp(-result);
result = 0;
for (i=1; i<=size1; i++)
{
if ((i==1) || (i==size1))
{
result += 0.5*eps*f[i][1];
}
else
{
result += eps*f[i][1];
}
}
*quantity2 = exp(-result);
}

int main (int argc, char *argv[ ])


{
/*
*f is the forward rate field, h is the variance field, eps is the
*lattice spacing, randf and randh are the uncorrelated random
*vectors. p1 and p2 are the diagonals of the transformation
*matrices
*/
double **f, **covarf, **covarh, **covaralpha, **transformf, **transformh,
eps, *p1, *p2, **h, *beta, *xi, muf, muh, *alpha, quantity1, quantity2,
est_quantity1, est_quantity2, var_quantity1, var_quantity2, rho,
*randf, *randh, *covRandh, *covRandf;
int i, j, t1, t2, nstep, size, size1;
if (argc != 3)
{
printf ("Usage : %s input file nstep\n", argv[0]);
exit (1);
}
srand48(time(NULL));
ReadInput(argv[1], &f, &h, &beta, &xi, &eps, &t1, &t2, &muf, &muh, &rho);
nstep = atoi (argv[2]);
p1 = dvector (1, t2+1);
p2 = dvector (1, t2+1);
alpha = dvector (1, t2+1);
randf = dvector (1, t2);
covRandf = dvector (1, t2);


randh = dvector (1, t2);
covRandh = dvector (1, t2);
covarf = dmatrix (1, t2, 1, t2);
covarh = dmatrix (1, t2, 1, t2);
covaralpha = dmatrix (1, t2+1, 1, t2+1);
transformf = dmatrix (1, t2, 1, t2);
transformh = dmatrix (1, t2, 1, t2);
MakeCovar (covarf, eps, t2, muf/sqrt(1-rho*rho));
MakeCovar (covarh, eps, t2, muh);
MakeCovarAlpha (covaralpha, eps, t2+1, muf, muh, rho);
MakeAlpha (covaralpha, h[1], alpha, t2+1, eps);
choldc (covarf, t2, p1);
choldc (covarh, t2, p2);
for (i=1; i<=t2; i++)
{
for (j=1; j<=t2; j++)
{
transformf[i][j] = (j<i) ? covarf[i][j] : 0;
}
transformf[i][i] = p1[i];
}
for (i=1; i<=t2; i++)
{
for (j=1; j<=t2; j++)
{
transformh[i][j] = (j<i) ? covarh[i][j] : 0;
}
transformh[i][i] = p2[i];
}
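/* (Monte Carlo loop: accumulate the sample mean and raw second moment of each
quantity; var_quantity then holds the sample variance and the reported error is
the standard error sqrt(variance/nstep).) */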
for (est_quantity1=var_quantity1=est_quantity2=var_quantity2=0, i=1; i<=nstep; i++)
{
GenNextfh (f, h, covaralpha, alpha, beta, xi, transformf, transformh, t1+1,
t2+1, eps, rho, randf, randh, covRandf, covRandh);
CalculateQuantities(f, t2+1, t1+1, eps, &quantity1, &quantity2);
printf("%g %g\n", quantity1, quantity2);
est_quantity1 += quantity1/nstep;
est_quantity2 += quantity2/nstep;
var_quantity1 += quantity1*quantity1/nstep;
var_quantity2 += quantity2*quantity2/nstep;
}
var_quantity1 -= est_quantity1*est_quantity1;
var_quantity2 -= est_quantity2*est_quantity2;
printf ("%g %g %g %g\n", est_quantity1, (var_quantity1 > 0) ?
sqrt(var_quantity1/nstep):0, est_quantity2, (var_quantity2 > 0)
? sqrt(var_quantity2/nstep):0);
free_dmatrix (f, 1, t1+1, 1, t2+1);
free_dmatrix (h, 1, t1+1, 1, t2+1);
free_dvector (beta, 1, t2+1);
free_dvector (xi, 1, t2+1);
free_dvector (p1, 1, t2+1);
free_dvector (p2, 1, t2+1);
free_dvector (randf, 1, t2);
free_dvector (covRandf, 1, t2);
free_dvector (randh, 1, t2);
free_dvector (covRandh, 1, t2);
free_dmatrix (covarf, 1, t2, 1, t2);
free_dmatrix (covarh, 1, t2, 1, t2);
free_dmatrix (covaralpha, 1, t2+1, 1, t2+1);
free_dmatrix (transformf, 1, t2, 1, t2);
free_dmatrix (transformh, 1, t2, 1, t2);
}
References

[1] M. Srikant, Option Pricing with Stochastic Volatility, Honours thesis, National University
of Singapore, 1998.

[2] N. L. Jacob and R. R. Pettit, Investments, Irwin, 1989.

[3] J. C. Hull, Options, Futures, and other Derivatives, Prentice Hall International, 1997.

[4] F. J. Fabozzi, Bond Markets, Analysis and Strategies, Prentice Hall International, 2000.

[5] F. Black and M. Scholes, Journal of Political Economy 81, 637 (1973).

[6] O. A. Lamont and R. H. Thaler, Can the market add and subtract? Mis-
pricing in tech stock carve-outs, Internal paper, University of Chicago.
http://gsbwww.uchicago.edu/fac/owen.lamont, 2000.

[7] W. Paul and J. Baschnagel, Stochastic Processes : From Physics to Finance, Springer,
1999.

[8] M. Namiki, Stochastic Quantization, Springer-Verlag, 1992.

[9] L. Bachelier, Annales Scientifiques de l'École Normale Supérieure (Paris) III-17, 21
(1900).

[10] A. Einstein, Annalen der Physik 17, 549 (1905).

[11] M. Jacobsen, Bernoulli 2, 271 (1996).

[12] J. Dash, Path Integrals and Options : Part I, CNRS Preprint CPT-88/PE.2206, 1988.

[13] J. Dash, Path Integrals and Options : Part II, CNRS Preprint CPT-89/PE.2333, 1989.

[14] J. Dash, Path Integrals and Options, Invited Talk, SIAM Annual Conference, July, 1993.

[15] V. Linetsky, Computational Economics 11, 129 (1998).

[16] B. E. Baaquie, Journal de Physique I 7, 1733 (1997), cond-mat/9708178.

[17] H. Kleinert, Option Pricing from Path Integral for Non-Gaussian Fluctuations. Natural
Martingale and Application to Truncated Lévy Distributions, http://xxx.lanl.gov/cond-
mat/0202311, 2002.

[18] A. Cavagna, J. P. Garrahan, I. Giardina, and D. Sherrington, Physical Review Letters 83,
4429 (1999).

[19] R. Savit, R. Manuca, and R. Riolo, Physical Review Letters 82, 2203 (1999).

[20] J. Bouchaud, N. Sagna, R. Cont, N. El-Karoui, and M. Potters, Phenomenology of the
Interest Rate Curve, Working Paper http://xxx.lanl.gov/cond-mat/9712164, 1997.

[21] A. Matacz and J.-P. Bouchaud, International Journal of Theoretical and Applied Finance
3, 703 (2000).

[22] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, and H. E. Stanley, Physical
Review Letters 83, 1471 (1999).

[23] R. N. Mantegna and H. E. Stanley, An Introduction to Econophysics, Cambridge Univer-
sity Press, 2000.

[24] Z. Brzeźniak and T. Zastawniak, Basic Stochastic Processes, Springer, 1999.

[25] S. M. Ross, Stochastic Processes, Wiley, 1996.

[26] M. M. Rao, Stochastic Processes : General Theory, Kluwer Academic Publishers, 1995.

[27] B. Øksendal, Stochastic Differential Equations : An Introduction with Applications,
Springer-Verlag, 1989.

[28] N. Jacob, Pseudo-Differential Operators and Markov Processes, Akademie Verlag, 1996.

[29] D. Applebaum, Journal of Mathematical Physics 39, 3019 (1998).

[30] S. Karlin and H. Taylor, A First Course in Stochastic Processes, Academic Press, 1975.

[31] D. R. Wolf, Information and Correlation in Statistical Mechanical Systems, PhD thesis,
University of Texas at Austin, 1996.

[32] S. Weinberg, The Quantum Theory of Fields, Cambridge University Press, 1994.

[33] N. Wiener, Journal of Mathematical and Physical Sciences 2, 132 (1923).

[34] E. Nelson, Physical Review 150, 1079 (1965).

[35] E. Nelson, Quantum Fluctuations, Princeton University Press, 1985.

[36] I. V. Girsanov, Theory of Probability and its Applications 5, 285 (1960).

[37] G. Parisi and N. Sourlas, Physical Review Letters 43, 744 (1979).

[38] D. Kreps and J. Harrison, Journal of Economic Theory 20, 381 (1979).

[39] J. Cox and S. Ross, Journal of Financial Economics 3, 145 (1976).

[40] J. Harrison and S. Pliska, Stochastic Processes and Their Applications 11, 215 (1981).

[41] H. Geman, N. E. Karoui, and J.-C. Rochet, Journal of Applied Probability 32, 443 (1995).

[42] D. Duffie, Dynamic Asset Pricing Theory, Princeton University Press, 2001.

[43] J. Cox, S. A. Ross, and M. Rubinstein, Journal of Financial Economics 7, 229 (1979).

[44] F. Delbaen and W. Schachermayer, Mathematische Annalen 300, 463 (1994).

[45] M. Fritelli, Mathematical Finance 10, 39 (2000).

[46] T. Goll and L. Rüschendorf, Finance and Stochastics 5, 557 (2001).

[47] G. Yuan, Application of Differential Geometry to the Mathematical Finance of Incomplete
Markets, PhD thesis, National University of Singapore, 2002.

[48] B. E. Baaquie, Physical Review E 65, 056122 (2002), cond-mat/0110506.

[49] R. A. Jarrow and A. I. Kaushik, Journal of International Money and Finance 10, 310
(1991).

[50] B. E. Baaquie, Quantum finance, In preparation, 2002.

[51] R. C. Merton, Bell Journal of Economics and Management Science 2, 275 (1973).

[52] M. Rubinstein and E. Reiner, RISK 4, 28 (1991).

[53] N. Kunitomo and M. Ikeda, Mathematical Finance 2, 275 (1992).

[54] V. Linetsky, Mathematical Finance 9, 55 (1999).

[55] H. Geman and M. Yor, Mathematical Finance 3, 349 (1993).

[56] O. Vasicek, Journal of Financial Economics 5, 177 (1977).

[57] M. Brennan and E. Schwartz, Journal of Banking and Finance 3, 133 (1979).

[58] J. C. Cox, J. Ingersoll, and S. A. Ross, Econometrica 53, 385 (1985).

[59] J. C. Hull and A. White, Review of Financial Studies 3, 573 (1990).

[60] F. Jamshidian, Pricing of Contingent Claims in the One Factor Term Structure Model,
1987, Working paper, Merrill Lynch.

[61] F. Jamshidian, Journal of Finance 44, 205 (1989).

[62] F. Jamshidian, Research in Finance 9, 131 (1991).

[63] F. Black, E. Derman, and W. Toy, Financial Analysts Journal Jan-Feb. 1990, 33 (1990).

[64] F. Black and P. Karasinski, Financial Analysts Journal Jul-Aug. 1991, 52 (1991).

[65] R. Jarrow, D. Heath, and A. Morton, Econometrica 60, 77 (1992).

[66] J. Cohen and R. Jarrow, Markov Modeling in the Heath, Jarrow, and Morton Term Structure
Framework, Cornell University, 2000.

[67] D. Kennedy, Mathematical Finance 4, 247 (1994).

[68] R. Goldstein, Review of Financial Studies 13, 365 (2000).

[69] B. E. Baaquie, Physical Review E 64, 1 (2001).

[70] P. Santa-Clara and D. Sornette, Review of Financial Studies 14, 149 (2001), cond-
mat/9801321.

[71] R. A. Jarrow, Modelling Fixed Income Securities and Interest Rate Options, McGraw-
Hill, 1995.

[72] P. Dybvig, J. Ingersoll, and S. Ross, Journal of Business 69, 1 (1996).

[73] A. Jeffrey, Asymptotic Maturity Behaviour of Single Factor Heath-Jarrow-Morton Term
Structure Models : A Note, working paper, Yale University, 1997.

[74] B. E. Baaquie and M. Srikant, Empirical Investigation of a Quantum Field of Forward
Rates, National University of Singapore http://xxx.lanl.gov/abs/cond-mat/0106317, 2002.

[75] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in
C : The Art of Scientific Computing, Cambridge University Press, 1995.

[76] R. N. Mantegna and H. E. Stanley, Physica A 239, 255 (1997).

[77] A. Matacz, International Journal of Theoretical and Applied Finance 3, 143 (2000).

[78] R. N. Mantegna and H. E. Stanley, Physical Review Letters 73, 2946 (1994).

[79] I. Koponen, Physical Review E 52, 1197 (1995).

[80] J.-P. Bouchaud and M. Potters, Theory of Financial Risks : From Statistical Physics to
Risk Management, Cambridge University Press, 2000.

[81] B. E. Baaquie, M. Srikant, and M. C. Warachka, A Quantum Field Theory Term Structure
Model Applied to Hedging, cond-mat/0206457, 2002.

[82] T. Bjork, Y. Kabanov, and W. Runggaldier, Mathematical Finance 7, 211 (1997).

[83] R. Jarrow and S. Turnbull, Derivative Securities, Second Edition, South Western College
Publishing, 2000.

[84] E. Eberlein and S. Raible, Mathematical Finance 9, 31 (1999).

[85] L. O. Scott, Journal of Financial and Quantitative Analysis 22, 419 (1987).

[86] H. Johnson and D. Shanno, Journal of Financial and Quantitative Analysis 22, 143 (1987).

[87] J. B. Wiggins, Journal of Financial Economics 19, 351 (1987).

[88] J. Stein and E. Stein, The Review of Financial Studies 4, 727 (1991).

[89] S. L. Heston, Review of Financial Studies 6, 327 (1993).

[90] M. Rubinstein, Journal of Finance 33, 455 (1978).

[91] A. Sheikh, Journal of Financial and Quantitative Analysis 22, 419 (1991).

[92] Y. Liu et al., Physical Review E 60, 1390 (1999).

[93] A. Morton and K. Amin, Journal of Financial Economics 35, 141 (1994).

[94] M. C. Warachka, A Note on Stochastic Volatility in Term Structure Models, Working
paper, 2001.

[95] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, Cambridge University
Press, 1992.

[96] K. I. Amin and V. K. Ng, Review of Financial Studies 10, 333 (1997).

[97] S. S. Artemiev and T. Averina, Numerical Analysis of Ordinary and Stochastic Differential
Equations, 1997.
