
Using Nearest Neighbor Method in Approximate

Dynamic Programming to Price American Options


by Ankush Agarwal
February 24, 2012
Contents
1 Introduction
2 Literature review
2.1 Nested simulation
2.1.1 High estimator
2.1.2 Low estimator
2.2 Least squares regression
2.3 Neural networks
3 Kernel based method
3.1 Two time-period problem
3.2 Multiple time-period problem
4 Proofs of the main results
5 Numerical experiments
6 Conclusion
Bibliography
List of Figures
2.1 Stock price tree
2.2 Θ (high) estimate
2.3 Simple low estimate
2.4 θ (low) estimate
Chapter 1
Introduction
Options as trading instruments find mention as early as the ancient Greek period, when Thales of Miletus traded on the expectation of a bumper olive harvest and made money when that scenario indeed materialized: he rented out olive presses at a much higher price than what he had paid for his option. Another notable mention of option usage comes from the real estate market, where the buyer of an option has the right, which he may or may not exercise, to purchase a parcel of land for development. More commonplace occurrences of options are the right of a mortgage owner to repay the loan early and the right of a company's investors to convert purchased debt into common stock. Thus options had been in use for risk management purposes in over-the-counter markets for quite some time, but their popularity was limited by the lack of a pricing formula. It was only in 1973 that Black and Scholes [3] in their seminal paper provided the financial industry with a theoretical framework to price financial options analytically. The publication also coincided with the start of option trading on the Chicago Board Options Exchange. Together these led to the widespread popularity of these instruments and a significant increase in the notional value of trades in the financial markets.
In their paper Black and Scholes derived the pricing formula for European options, which may be exercised only at the fixed expiry date, i.e. at a single pre-defined point in time. The authors argued that under no-arbitrage and other regularity conditions, an option can be replicated using a self-financing portfolio whose price satisfies a partial differential equation (PDE) with a certain boundary condition. The price of the option can also be expressed as the expectation of a discounted payoff under an appropriate probability measure called the risk-neutral measure. The solution to either representation gives an analytical formula for the price of a European option. A lot of work has been done since to relax the key constant-volatility assumption of the Black-Scholes (BS) model. To better explain the real-world behavior of the volatility parameter in the BS model, Derman et al. [9] [10] proposed local volatility based models. Stochastic volatility models have also been proposed by Heston [15], Derman et al. [8] and, more recently, Fouque et al. [12] to better capture options market dynamics.
As the understanding of the underlying mathematics of option pricing has improved, the markets have witnessed the introduction of new and complex options. The most commonly traded category among them is American-style options. An American option may be exercised at any time before the expiry date, and hence pricing an American option entails solving an optimization problem: we need to find the optimal exercise time such that the payoff from exercising the option is maximized. Despite recent advances, solving the optimal stopping time problem remains one of the most challenging problems in applied probability. The earliest attempt to give a reasonable estimate of the price of an American option was by Barone-Adesi and Whaley [1], who provided an approximate solution to the Black-Scholes PDE without the initial condition. Haugh and Kogan [14] and Chen and Glasserman [7] provide upper and lower bounds for the price of American options, which can also be used to calculate the price approximately. With the advent of technology and faster computers, another more popular approach to the American option pricing problem has been the use of simulation methods. Simulation methods are flexible in the sense that they allow the state variables to follow general stochastic processes and to be multi-dimensional. In general, simulation techniques for pricing American options restrict the exercise dates to a finite set of fixed exercise opportunities. Such an option with a fixed set of exercise dates is called a Bermudan option, and its price gives an approximation to the price of an American option. If we keep increasing the number of exercise opportunities, we obtain the true price of the American option asymptotically. In this article, we discuss a simulation based algorithm to price Bermudan options using a kernel based estimation method, which in itself is a challenging problem.
There have been several key contributions addressing the pricing of American options by simulation. Important examples include Barraquand and Martineau [2], Carriere [6], Broadie and Glasserman [5], Tsitsiklis and Van Roy [20], Longstaff and Schwartz [18] and Kohler et al. [17].
Broadie and Glasserman in their paper restrict the exercise opportunities to a finite set of times. The authors then use a nested Monte Carlo simulation to generate two estimates, one biased low and the other biased high, for the price of the American option in a discrete time setting. A point estimate can be calculated from the low and high biased estimates, along with a confidence interval for the true price of the option. The proposed algorithm performs well for a small number of exercise opportunities even when the dimensionality of the state variables is high. The main drawback of the method is that it estimates the continuation value of the option at every exercise opportunity using nested simulation, and the computational cost blows up exponentially as the number of exercise opportunities increases.
The problem of the exponential increase in computational cost for nested simulation led to a different approach which uses regression analysis to estimate the continuation value function. Tsitsiklis and Van Roy in their paper suggested that the conditional expectation can be estimated from the cross-sectional information in the simulation by using linear regression. The discounted ex post realized continuation payoffs are regressed on basis functions of the state variables. The estimated conditional expectation function is then used to decide whether to exercise immediately or to continue, by comparing the payoffs from immediate exercise and continuation. Longstaff and Schwartz proposed that considering only in-the-money paths (paths where immediate exercise yields a positive payoff) improves the accuracy of the algorithm and allows the use of fewer basis functions to estimate the continuation value function.
The suitable choice and number of basis functions can vary with the problem, and there is no theoretical approach to making this choice, which makes the idea of using a non-parametric estimator of the continuation value function attractive. Another factor in support of a non-parametric estimator is that regression analysis approximates the continuation value function using continuous basis functions, which may not give a good approximation in many practical cases (when the continuation region is separable). Kohler et al. in their paper use a neural network based estimator of the continuation value function and show that it performs better than the regression based estimators of Tsitsiklis and Van Roy and of Longstaff and Schwartz in specific cases, at the expense of considerably higher computational cost. A kernel based non-parametric estimator is used by Hong and Juneja [16] in a risk management setting to estimate the conditional expectation of a non-linear payoff function. When compared to nested simulation, they show that the kernel approach has a faster rate of convergence when the dimension is less than or equal to three, a slower rate of convergence when the dimension is greater than or equal to five, and the same rate when the dimension is equal to four. In this article, we employ this kernel based estimator to approximate the continuation value function and solve the American option pricing problem in the finite exercise opportunities setting. We first discuss the properties of the estimator in the two time-period (two exercise opportunities) setting and provide the asymptotic bias and variance terms. We analyze the mean square error (MSE) of the estimator and provide the optimal rate of convergence of the MSE under the choice of optimal kernel parameter. We argue that the rate of convergence of the estimator remains the same even when the number of exercise opportunities is large. We investigate the performance of our method in a single dimension and show that it is comparable to nested simulation when the number of exercise opportunities is small. Some of the proofs developed in this article have been worked out after making simplifying assumptions which could be relaxed by a more rigorous analysis.
The remainder of the report is organized as follows. In chapter 2 we set up the American option pricing framework and provide a more elaborate review of the popular simulation methods. We propose our kernel based pricing method in chapter 3, where we also discuss the results for the asymptotic bias, variance and mean square error of the estimator. The theoretical proofs are included in chapter 4, and the experimental results in support of the method are provided in chapter 5. We summarize the work done in the project and discuss the scope of future work in the conclusion.
Chapter 2
Literature review
In this chapter we discuss different simulation methods generally employed to price American options. As mentioned earlier, these methods solve the pricing problem in the finite exercise opportunities setting, where it reduces to a dynamic programming problem. To understand these methods we first develop a dynamic programming framework for pricing American options under the constraint of finite exercise opportunities.
Let $W(t)$ be a $d$-dimensional standard Brownian motion on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ where the generated filtration $\{\mathcal{F}_t\}_{t \ge 0}$ satisfies the usual conditions. Let $\{X_t\}_{t \ge 0}$ be an $\mathbb{R}^d$-valued Markov process of the underlying asset price which is adapted to $\{\mathcal{F}_t\}_{t \ge 0}$. The payoff to the option holder from exercise at time $t$ is $g(X(t))$ for some nonnegative payoff function $g : \mathbb{R}^d \to \mathbb{R}_+$. If we further assume the existence of an instantaneous short rate process $r(t)$ and an equivalent risk-neutral measure $\mathbb{Q}$, then the price of a European option of strike price $K$ and maturity $T$ is given as
\[
\mathbb{E}^{\mathbb{Q}}\left[ e^{-\int_0^T r(s)\,ds}\, g(X(T)) \right].
\]
Under the assumptions of Black and Scholes, the underlying process is modeled as a geometric Brownian motion $\mathrm{GBM}(r, \sigma^2)$, and the price of a European put option of maturity $T$ with payoff function $g(X(t)) = (K - X(t))^+$ is given as
\[
p(t, X(t)) = K e^{-r(T-t)}\, N\left(-d_-(T-t, X(t))\right) - X(t)\, N\left(-d_+(T-t, X(t))\right), \quad 0 \le t \le T,\ X(t) > 0,
\]
where
\[
d_\pm(\tau, x) = \frac{1}{\sigma \sqrt{\tau}} \left[ \log\frac{x}{K} + \left( r \pm \frac{\sigma^2}{2} \right) \tau \right],
\]
$N$ is the cumulative standard normal distribution
\[
N(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-z^2/2}\, dz = \frac{1}{\sqrt{2\pi}} \int_{-y}^{\infty} e^{-z^2/2}\, dz,
\]
$K$ is the strike, $r$ is the constant risk-free interest rate and $\sigma$ is the constant volatility parameter.
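As a quick illustration, the put formula above is straightforward to evaluate numerically. The sketch below (plain Python, standard library only, with $N$ computed via the error function) is our own illustration rather than part of the original text:

```python
from math import log, sqrt, exp, erf

def norm_cdf(y):
    # cumulative standard normal distribution N(y)
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def d_plus_minus(tau, x, K, r, sigma):
    # d_+/- from the text; d_- = d_+ - sigma * sqrt(tau)
    d_plus = (log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return d_plus, d_plus - sigma * sqrt(tau)

def bs_put(t, x, K, r, sigma, T):
    """Black-Scholes European put price p(t, x)."""
    tau = T - t
    dp, dm = d_plus_minus(tau, x, K, r, sigma)
    return K * exp(-r * tau) * norm_cdf(-dm) - x * norm_cdf(-dp)

def bs_call(t, x, K, r, sigma, T):
    """Black-Scholes European call price, used here only as a parity check."""
    tau = T - t
    dp, dm = d_plus_minus(tau, x, K, r, sigma)
    return x * norm_cdf(dp) - K * exp(-r * tau) * norm_cdf(dm)
```

Put-call parity, $c - p = x - K e^{-r(T-t)}$, provides a convenient sanity check on such an implementation.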
In the case of an American option, we first consider a class $\mathcal{T}$ of admissible $\mathcal{F}_t$-adapted stopping times with values in $[0, T]$. The pricing problem then becomes the calculation of
\[
\sup_{\tau \in \mathcal{T}} \mathbb{E}^{\mathbb{Q}}\left[ e^{-r\tau} g(X(\tau)) \right].
\]
This supremum is achieved by an optimal stopping time $\tau^*$ that has the form
\[
\tau^* = \inf\{ t \ge 0 : X(t) \le b^*(t) \},
\]
for some optimal exercise boundary $b^*$ (written here for a put).
Consider an American option which restricts the exercise dates to a finite set of fixed exercise opportunities $0 < t_1 < t_2 < \dots < t_m = T$. Such a restriction to a finite set of exercise dates gives an approximation to the true price of the option, which itself could be obtained by letting $m \to \infty$. We will discuss the case of finite exercise opportunities, as it provides a reliable approximation to the true option price and is in itself a challenging problem to solve. The calculation of the option price in this finite-exercise setting can be characterized as a dynamic programming problem.¹ Let $g$ denote the payoff function and $\tilde{V}_i(x)$ denote the option value at $t_i$ given $X_i = x$, assuming the option has not been exercised previously. As we are interested in $\tilde{V}_0(x_0)$, this value can be determined recursively as follows:
\[
\tilde{V}_m(x) = g(x),
\]
\[
\tilde{V}_{i-1}(x) = \max\left\{ g(x),\ \mathbb{E}\left[ D_{i-1,i}(X_i)\, \tilde{V}_i(X_i) \,\middle|\, X_{i-1} = x \right] \right\}, \quad i = 1, \dots, m,
\]
where $D_{i-1,i}(X_i)$ is the discount factor from $t_{i-1}$ to $t_i$. We could alternatively formulate the problem so that the option price is always recorded in time-0 value. We can rewrite the problem as
\[
V_m(x) = g_m(x),
\]
\[
V_{i-1}(x) = \max\left\{ g_{i-1}(x),\ \mathbb{E}\left[ V_i(X_i) \,\middle|\, X_{i-1} = x \right] \right\}, \quad i = 1, \dots, m,
\]
where $V_i$ and $g_i$ satisfy
\[
g_i(x) = D_{0,i}\, g(x), \quad i = 1, \dots, m, \qquad V_i(x) = D_{0,i}\, \tilde{V}_i(x), \quad i = 0, \dots, m.
\]
The continuation value of an American option with a finite number of exercise opportunities is the value of holding rather than exercising the option. The continuation value in state $x$ at date $t_i$ (measured in time-0 value) is
\[
C_i(x) = \mathbb{E}\left[ V_{i+1}(X_{i+1}) \,\middle|\, X_i = x \right], \quad i = 0, \dots, m-1.
\]
These also satisfy a dynamic programming recursion: $C_m \equiv 0$ and
\[
C_i(x) = \mathbb{E}\left[ \max\{ g_{i+1}(X_{i+1}),\ C_{i+1}(X_{i+1}) \} \,\middle|\, X_i = x \right], \quad i = 0, \dots, m-1.
\]
The option value is $C_0(x_0)$, the continuation value at time 0. The value functions $V_i$ can then be determined through the continuation values:
\[
V_i(x) = \max\{ g_i(x),\ C_i(x) \}, \quad i = 1, \dots, m.
\]
An approximation $\hat{C}_i$ to the continuation values determines a stopping rule through
\[
\hat{\tau} = \min\left\{ i \in \{1, \dots, m\} : g_i(X_i) \ge \hat{C}_i(X_i) \right\}.
\]
Hence, we can estimate the price of an American option using a reasonable approximation to the continuation value function.

¹ The DP setup of the American option pricing problem is similar to the setup used in [13].
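To make the recursion concrete, here is a small illustrative sketch that carries out $V_i(x) = \max\{g_i(x), C_i(x)\}$ exactly on a binomial lattice, where the one-step conditional expectation reduces to a weighted two-point average. The Cox-Ross-Rubinstein discretization and all parameters are our own illustrative choices, not part of the framework above:

```python
from math import exp, sqrt

def bermudan_put_binomial(x0, K, r, sigma, T, m, steps_per_period=25):
    """Price a Bermudan put with m exercise dates t_1 < ... < t_m = T on a CRR lattice."""
    n = m * steps_per_period
    dt = T / n
    u = exp(sigma * sqrt(dt))
    d = 1.0 / u
    q = (exp(r * dt) - d) / (u - d)      # risk-neutral up-probability
    disc = exp(-r * dt)
    # option values at the terminal date t_m = T (the exercise value g)
    values = [max(K - x0 * u ** j * d ** (n - j), 0.0) for j in range(n + 1)]
    for step in range(n - 1, -1, -1):
        # one-step conditional expectation: C_i(x) = E[V_{i+1}(X_{i+1}) | X_i = x]
        values = [disc * (q * values[j + 1] + (1.0 - q) * values[j])
                  for j in range(step + 1)]
        if step > 0 and step % steps_per_period == 0:
            # an interior exercise date t_i: V_i(x) = max{g_i(x), C_i(x)}
            values = [max(max(K - x0 * u ** j * d ** (step - j), 0.0), v)
                      for j, v in enumerate(values)]
    return values[0]   # C_0(x_0), the option value at time 0
```

With $m = 1$ the option can be exercised only at $T$, so the same routine prices the European put; increasing $m$ approaches the American price, as discussed above.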
2.1 Nested simulation
The American option pricing problem is to find
\[
V = \sup_{\tau} \mathbb{E}^{\mathbb{Q}}\left[ e^{-r\tau} g(X(\tau)) \right]
\]
over all stopping times $\tau \in \mathcal{T}$. In the setting of a finite set of fixed exercise opportunities $0 < t_1 < t_2 < \dots < t_m = T$, we simulate paths of the asset price $X_0, X_1, \dots, X_m$ at the corresponding times $0 = t_0 < t_1 < \dots < t_m = T$. We then compute a discounted option value for each path and take the average. If the optimal exercise policy were known, then for every simulated path the estimate would be $e^{-r\tau^*} g(X(\tau^*))$. But the optimal policy is not known beforehand and must be determined via the simulation. In [5], Broadie and Glasserman provide two estimators, one biased high and the other biased low, which can be used to estimate the true option value.
2.1.1 High estimator
Let $V$ denote the price of an American call option with $m+1$ exercise opportunities at times $t_i$, $i = 0, \dots, m$. Denote the high-biased estimator by $\Theta$. It is defined as the option value estimate obtained by a dynamic programming (DP) algorithm applied to the simulated tree.
[Tree diagram omitted. The simulated tree starts at $X_0 = 101$ at $t_0 = 0$ and branches three ways per node: the $t_1$ nodes are 119, 90 and 117, with terminal ($t_2 = T$) values 102, 143, 144 under 119; 61, 101, 56 under 90; and 95, 136, 94 under 117.]
Figure 2.1: Stock price tree
[Tree diagram omitted. The DP recursion applied to the tree of Figure 2.1 with $K = 100$ gives terminal payoffs 2, 43, 44; 0, 1, 0; and 0, 36, 0; time-$t_1$ values 29.7, 0.3 and 17; and a root value of 15.7.]
Figure 2.2: Θ (high) estimate
At the terminal date, the option value is known. At each prior date, the option value is defined as the maximum of the immediate exercise value and the expectation of the succeeding discounted option values. Finally, $\Theta$ is the estimated option value at the initial node. A numerical illustration for an American call option as provided in [5] is given in Figs. 2.1 and 2.2. The parameters are $X_0 = 101$, $K = 100$, $T = 1$, $\sigma = 0.4$ and $r = 0$. For this particular tree, the high-biased estimate of the American option price is $\Theta = 15.7$.
While calculating $\Theta$ we look forward into the future to maximize the estimated price, which induces an upward bias in the method. If at some node future stock prices are too high, then the dynamic programming algorithm may choose not to exercise and receive a value higher than that of the optimal decision to exercise. Similarly, if future stock prices are too low, the dynamic programming algorithm may choose to exercise even when the optimal decision is not to exercise. In each case, the DP algorithm peeks into the future and overestimates the option value. The argument can be made precise by using Jensen's inequality.
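A minimal sketch of the high estimator on a simulated tree might look as follows. The geometric Brownian motion dynamics, the payoff and the branching parameters are illustrative assumptions on our part, not prescribed by [5]:

```python
import random
from math import exp, sqrt

def high_estimator(x, g, r, sigma, dt, b, m, i=0, rng=None):
    """Broadie-Glasserman high-biased estimate at a depth-i node of a simulated tree."""
    if rng is None:
        rng = random.Random(0)
    if i == m:
        return g(x)                               # terminal date: value is the payoff
    # b independent GBM successors of the current state over one period
    children = [x * exp((r - 0.5 * sigma ** 2) * dt
                        + sigma * sqrt(dt) * rng.gauss(0.0, 1.0)) for _ in range(b)]
    cont = exp(-r * dt) * sum(high_estimator(c, g, r, sigma, dt, b, m, i + 1, rng)
                              for c in children) / b
    return max(g(x), cont)                        # DP step: exercise vs. continuation
```

The same set of branches is used both to estimate the continuation value and to take the maximum, which is exactly the source of the upward bias discussed above.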
2.1.2 Low estimator
The high bias in the high estimator results from using the same set of paths both to estimate the continuation value and to calculate the price of the option; this is implicit in the dynamic programming recursion, as argued above. To remove this source of bias, the calculation of the continuation value should be separated from the exercise decision. To calculate a low estimator, the branches at each node are separated into two sets: the first set is used to estimate the continuation value, and the second set is used to decide whether or not to exercise. This separation overcomes the problem of looking into the future to make the exercise decision and gets rid of the upward bias. The idea is once again explained using the example in [5], illustrated in Fig. 2.3; the numerical values are based on the stock price tree in Fig. 2.1. At each node the second two branches are used to determine the exercise decision and the first branch is used to determine the continuation value, if necessary. For example, at the top node at time $t_1$ the decision is made by comparing the immediate exercise value (19) with the discounted expected value of not exercising ($43.5 = 0.5 \times 43 + 0.5 \times 44$). Hence the decision is made to continue, but the value assigned to this action is 2 (based on the first branch, which leads to a terminal stock price of 102). To understand why this method gives a low-biased estimator, first note that the decision is based on unbiased information from the maturity date. If the correct decision is inferred from this information, the estimator is unbiased. But with a finite sample, there is a positive probability of inferring a suboptimal decision, in which case the value assigned to a node prior to expiration is an unbiased estimate of the lower value associated with the incorrect decision. The expected node value is thus a weighted average of an unbiased estimate (based on the correct decision) and an estimate which is biased low (based on the incorrect decision); the net effect is an estimate which is biased low. To improve the accuracy of the method, at each node we can use branch
[Tree diagram omitted. At each node, two branches determine the exercise decision and the remaining branch supplies the value.]
Figure 2.3: Simple low estimate
[Tree diagram omitted. Averaging the $b$ branch-rotation estimates at every node of the tree of Figure 2.1 gives a root estimate of 11.9.]
Figure 2.4: θ (low) estimate
1 to estimate the continuation value and the other $b - 1$ branches to determine the exercise decision. This process is repeated $b - 1$ more times, using branch 2 to estimate the continuation value, then branch 3, and so on. The $b$ values obtained are averaged to determine the option value estimate at the node. The idea is illustrated in Fig. 2.4 using the same example as in [5]. We observe that the computational burden of the method grows exponentially as $b^m$, with $m$ the number of exercise opportunities and $b$ the number of branches. The method works for Bermudan options, but when approximating the price of an American option by increasing the number of exercise dates, the computational cost becomes prohibitive.
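For comparison with the high estimator, a sketch of the $b$-fold branch-rotation low estimator on a simulated tree follows. As before, the GBM dynamics, payoff and parameters are illustrative assumptions on our part:

```python
import random
from math import exp, sqrt

def low_estimator(x, g, r, sigma, dt, b, m, i=0, rng=None):
    """Broadie-Glasserman low-biased estimate at a depth-i node of a simulated tree."""
    if rng is None:
        rng = random.Random(0)
    if i == m:
        return g(x)
    # b independent GBM successors of the current state over one period
    children = [x * exp((r - 0.5 * sigma ** 2) * dt
                        + sigma * sqrt(dt) * rng.gauss(0.0, 1.0)) for _ in range(b)]
    vals = [exp(-r * dt) * low_estimator(c, g, r, sigma, dt, b, m, i + 1, rng)
            for c in children]
    rotations = []
    for j in range(b):
        # the b - 1 branches other than j make the exercise decision ...
        decide_cont = sum(v for k, v in enumerate(vals) if k != j) / (b - 1)
        # ... and branch j alone supplies the value for the chosen action
        rotations.append(vals[j] if decide_cont > g(x) else g(x))
    return sum(rotations) / b                     # average the b rotation estimates
```

Because the branch that supplies the value never takes part in the decision, the look-ahead bias of the high estimator is avoided, at the cost of a downward bias from occasionally suboptimal decisions.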
2.2 Least squares regression
To bypass the problem of exponential growth in computational cost with the number of exercise opportunities, Longstaff and Schwartz [18] proposed a regression based estimator of the continuation value function. At the final exercise date, the optimal exercise strategy for an American option is to exercise the option if it is in the money. Before the final date, however, the optimal strategy is to compare the immediate exercise value with the expected cash flows from continuing, and then exercise if immediate exercise is more valuable. The proposed method uses the cross-sectional information in the simulation to estimate the continuation value function by least squares. The method regresses the ex post realized payoffs from continuation on a set of basis functions of the values of the relevant state variables. The fitted value from this least squares regression then provides a direct estimate of the conditional expectation function. A numerical example from [18] is used to explain the idea further.
Consider an American put option on a share of non-dividend-paying stock. The put option is exercisable at a strike price of 1.10 at times 1, 2 and 3, where time 3 is the expiry date of the option. The riskless rate $r$ is assumed to be 6%. For simplicity, the algorithm is illustrated using only eight sample paths for the price of the stock. These sample paths are generated under the risk-neutral measure and are shown in the following matrix.
Stock price paths

Path   t = 0   t = 1   t = 2   t = 3
1      1.00    1.34    1.08    1.09
2      1.00    1.25    1.14    1.41
3      1.00    0.90    0.83    1.06
4      1.00    1.03    1.13    1.30
5      1.00    0.92    0.99    1.04
6      1.00    0.95    0.75    0.77
7      1.00    1.02    1.04    0.79
8      1.00    1.00    1.15    1.52
Conditional on not exercising the option before the final expiration date at time 3, the cash flows realized by the option-holder from following the optimal strategy at time 3 are given below.
Cash-flow matrix at time 3

Path   t = 1   t = 2   t = 3
1      —       —       .01
2      —       —       .00
3      —       —       .04
4      —       —       .00
5      —       —       .06
6      —       —       .33
7      —       —       .31
8      —       —       .00
If the put is in the money at time 2, the option-holder must decide whether to exercise the option immediately or continue the option's life until the final expiration date at time 3. From the stock-price matrix, let $X$ denote the stock prices at time 2 for the five in-the-money paths and $Y$ the corresponding discounted cash flows received at time 3 if the put is not exercised at time 2. Only in-the-money paths are used since this allows a better estimate of the conditional expectation function in the region where exercise is relevant and significantly improves the efficiency of the algorithm. The vectors $X$ and $Y$ are given by the non-dashed entries below.
Regression at time 2

Path   Y               X
1      .01 × .94176    1.08
2      —               —
3      .04 × .94176    .83
4      —               —
5      .06 × .94176    .99
6      .33 × .94176    .75
7      .31 × .94176    1.04
8      —               —
To estimate the expected cash flow from continuing the option's life conditional on the stock price at time 2, $Y$ is regressed on a constant, $X$ and $X^2$. This specification is one of the simplest possible; more general specifications could be considered according to the problem. The resulting conditional expectation function is $\mathbb{E}[Y \mid X] = 1.603 - 2.823X + 1.321X^2$. With this conditional expectation function, the value of immediate exercise at time 2, given in the first column below, is compared with the value from continuation, given in the second column.
Optimal early exercise decision at time 2

Path   Exercise   Continuation
1      .02        .0983
2      —          —
3      .27        .1723
4      —          —
5      .11        .1021
6      .35        .2296
7      .06        .0961
8      —          —
The value of immediate exercise equals the intrinsic value $1.10 - X$ for in-the-money paths, while the value from continuation is given by substituting $X$ into the conditional expectation function. This comparison implies that it is optimal to exercise the option at time 2 on the third, fifth and sixth paths. This leads to the following matrix, which shows the cash flows received by the option-holder conditional on not exercising prior to time 2.
Cash-flow matrix at time 2

Path   t = 1   t = 2   t = 3
1      —       .00     .01
2      —       .00     .00
3      —       .27     .00
4      —       .00     .00
5      —       .11     .00
6      —       .35     .00
7      —       .00     .31
8      —       .00     .00
Observe that when the option is exercised at time 2, the cash flow in the final column becomes zero. This is because once the option is exercised there are no further cash flows, since the option can be exercised only once. Proceeding recursively, we can examine whether the option should be exercised at time 1. The same procedure is repeated to arrive at the following matrix, in which the value of immediate exercise at time 1, given in the first column below, is compared with the value from continuation, given in the second column.
Optimal early exercise decision at time 1

Path   Exercise   Continuation
1      .00        .0058
2      —          —
3      .20        .2010
4      —          —
5      .18        .2242
6      .15        .2503
7      .08        .2871
8      —          —
Having identified the exercise strategy at times 1, 2 and 3, the stopping rule can now be represented by the following matrix, where a one denotes the exercise date at which the option is exercised.
Stopping rule
Path t = 1 t = 2 t = 3
1 0 0 1
2 0 0 0
3 0 1 0
4 0 0 0
5 0 1 0
6 0 1 0
7 0 0 1
8 0 0 0
With this specification of the stopping rule, it is now straightforward to determine the cash flows realized by following it: the option is simply exercised at the exercise dates where there is a one in the above matrix. This leads to the following option cash flow matrix.
Option cash ow matrix
Path t = 1 t = 2 t = 3
1 0.00 0.00 0.01
2 0.00 0.00 0.00
3 0.00 0.27 0.00
4 0.00 0.00 0.00
5 0.00 0.11 0.00
6 0.00 0.35 0.00
7 0.00 0.00 0.31
8 0.00 0.00 0.00
Having identified the cash flows generated by the American put at each date along each path, the option can now be valued by discounting each cash flow in the option cash flow matrix back to time zero and averaging over all paths.
Next we provide a precise description of the Longstaff-Schwartz algorithm. Denote by $C(\omega, s; t, T)$ the path of cash flows generated by the option, conditional on the option not being exercised at or prior to time $t$ and on the option-holder following the optimal stopping strategy for all $s$, $t < s \le T$. We restrict the exercise opportunities to $K$ discrete times $0 < t_1 \le t_2 \le \dots \le t_K = T$, and consider the optimal stopping policy at each exercise date. At the final expiration date of the option, the investor exercises the option if it is in the money, or allows it to expire if it is out of the money. At an exercise time $t_k$ prior to the final expiration date, however, the option-holder requires knowledge of the continuation value in order to decide whether or not to exercise. No-arbitrage valuation theory implies that the continuation value can be calculated by taking the expectation of the remaining discounted cash flows $C(\omega, s; t_k, T)$ with respect to
the risk-neutral pricing measure $\mathbb{Q}$. Specifically, at time $t_k$ the value of continuation $F(\omega; t_k)$ can be expressed as
\[
F(\omega; t_k) = \mathbb{E}^{\mathbb{Q}}\left[ \sum_{j=k+1}^{K} \exp\left( -\int_{t_k}^{t_j} r(\omega, s)\, ds \right) C(\omega, t_j; t_k, T) \,\middle|\, \mathcal{F}_{t_k} \right],
\]
where $r(\omega, t)$ is the riskless discount rate and the expectation is taken conditional on the information set $\mathcal{F}_{t_k}$ at time $t_k$
. The Longstaff-Schwartz least squares Monte Carlo (LSM) method approximates this conditional expectation at $t_{K-1}, t_{K-2}, \dots, t_1$. Working backwards recursively, at time $t_{K-1}$ it is assumed in the LSM algorithm that the unknown functional form of $F(\omega; t_{K-1})$ can be represented as a linear combination of a countable set of $\mathcal{F}_{t_{K-1}}$-measurable basis functions. The assumption can be justified when the conditional expectation is an element of the space $L^2$ of square-integrable functions relative to some measure. Since $L^2$ is a Hilbert space, it has a countable orthonormal basis and the conditional expectation can be represented as a linear combination of the elements of the basis. A suitable set of basis functions, including the Laguerre, Hermite, Legendre, Chebyshev, Gegenbauer and Jacobi polynomials, can be used to approximate the conditional expectation.
Once this subset of $M$ basis functions has been specified, $F_M(\omega; t_{K-1})$, the approximation of $F(\omega; t_{K-1})$ using the first $M$ basis functions, is estimated by projecting, or regressing, the discounted values of $C(\omega, s; t_{K-1}, T)$ onto the basis functions for the paths where the option is in the money. It can be shown that the fitted value of this regression, $\hat{F}(\omega; t_{K-1})$, converges in mean square and in probability to $F_M(\omega; t_{K-1})$ as the number $N$ of (in-the-money) paths in the simulation goes to infinity. Furthermore, it can be shown that $\hat{F}(\omega; t_{K-1})$ is the best linear unbiased estimator of $F_M(\omega; t_{K-1})$ with respect to a mean-squared metric. Once the conditional expectation function at time $t_{K-1}$ is estimated, we can determine whether early exercise at time $t_{K-1}$ is optimal for an in-the-money path by comparing the immediate exercise value with $\hat{F}(\omega; t_{K-1})$, repeating for each in-the-money path. Once the exercise decisions are identified, the option cash flow paths $C(\omega, s; t_{K-2}, T)$ can be approximated. The recursion proceeds by rolling back to time $t_{K-2}$ and repeating the procedure until the exercise decisions at each exercise time along each path have been determined. The American option is then valued by starting at time zero, moving forward along each path until the first stopping time occurs, discounting the resulting cash flow from exercise back to time zero, and then taking the average over all paths.
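The backward recursion just described can be sketched in code. The GBM path model, the quadratic basis $(1, X, X^2)$ from the worked example above, and the plain normal-equations solver are our illustrative choices; a production implementation would use a more robust least squares routine:

```python
import random
from math import exp, sqrt

def gbm_paths(x0, r, sigma, T, m, n, rng):
    """Simulate n GBM(r, sigma^2) paths at the m+1 dates 0 = t_0 < ... < t_m = T."""
    dt = T / m
    paths = []
    for _ in range(n):
        x, path = x0, [x0]
        for _ in range(m):
            x *= exp((r - 0.5 * sigma ** 2) * dt + sigma * sqrt(dt) * rng.gauss(0.0, 1.0))
            path.append(x)
        paths.append(path)
    return paths

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r_: abs(M[r_][col]))
        M[col], M[piv] = M[piv], M[col]
        for r_ in range(col + 1, 3):
            f = M[r_][col] / M[col][col]
            for c in range(col, 4):
                M[r_][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r_ in range(2, -1, -1):
        x[r_] = (M[r_][3] - sum(M[r_][c] * x[c] for c in range(r_ + 1, 3))) / M[r_][r_]
    return x

def lsm_put(x0, K, r, sigma, T, m, n, seed=0):
    """LSM estimate of a Bermudan put with exercise dates t_1, ..., t_m = T."""
    rng = random.Random(seed)
    disc = exp(-r * T / m)
    paths = gbm_paths(x0, r, sigma, T, m, n, rng)
    # cash flows under the (partially constructed) policy, in time-t_m values
    cash = [max(K - p[m], 0.0) for p in paths]
    for i in range(m - 1, 0, -1):
        cash = [c * disc for c in cash]                 # bring flows to time t_i
        itm = [k for k, p in enumerate(paths) if K - p[i] > 0.0]
        if len(itm) < 3:
            continue
        # regress discounted continuation cash flows on the basis (1, X, X^2)
        phi = [[1.0, paths[k][i], paths[k][i] ** 2] for k in itm]
        y = [cash[k] for k in itm]
        A = [[sum(p_[u] * p_[v] for p_ in phi) for v in range(3)] for u in range(3)]
        rhs = [sum(p_[u] * yy for p_, yy in zip(phi, y)) for u in range(3)]
        beta = solve3(A, rhs)
        for p_, k in zip(phi, itm):
            cont = sum(bu * pu for bu, pu in zip(beta, p_))
            exercise = K - paths[k][i]
            if exercise > cont:
                cash[k] = exercise                      # exercise now; later flows vanish
    return disc * sum(cash) / n                         # discount t_1 -> 0 and average
```

Note that, as in the worked example, the regression at each date uses only the in-the-money paths, and exercising a path replaces all of its later cash flows.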
The proposed algorithm provides a high-biased estimator, as it uses the same set of paths to estimate the price of the American option as are used to evaluate the exercise policy; it essentially gives the high-biased estimator of the nested simulation. But Longstaff and Schwartz argue empirically that even if one set of paths is used to estimate the conditional expectation regressions and another set of paths is used to apply the regression functions, the prices obtained are virtually identical. Given this empirical evidence, the authors recommend that the LSM algorithm be used directly to price the American option in order to minimize computational time. The central drawback of the LSM algorithm is the ad hoc choice of the subset of basis functions. In the absence of a standard procedure to choose the subset, basis functions are chosen based on their empirical suitability for the problem. Also, by using a linear combination of a set of continuous basis functions we can only approximate the continuation value function by a continuous function, which further supports the case for non-parametric estimators of the conditional expectation.
2.3 Neural networks
In their paper, Kohler et al. [17] estimate the continuation value function using least squares neural network regression, where all parameters of the estimates are selected using the given data only. To begin with, consider a sigmoid function $\sigma : \mathbb{R} \to [0, 1]$, i.e., a monotonically increasing function satisfying
\[ \sigma(x) \to 0 \ \text{as} \ x \to -\infty \quad \text{and} \quad \sigma(x) \to 1 \ \text{as} \ x \to \infty. \]
An example of such a sigmoid function is the logistic squasher defined by $\sigma(x) = \frac{1}{1 + e^{-x}}$, $x \in \mathbb{R}$.
A neural network with $k \in \mathbb{N}$ hidden neurons and the chosen sigmoid function is used to estimate the continuation values. The method of least squares is used to fit the neural network function to the data, and for technical reasons the sum of the absolute values of the output weights is bounded. The selection of the optimal number of hidden neurons is data-driven and is achieved using sample splitting.
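The function class just described can be sketched as follows; the variable names and the bound check are illustrative assumptions, not code from the paper:

```python
import numpy as np

# Hedged sketch of a one-hidden-layer network with k sigmoidal neurons whose
# output weights obey the bound sum_i |c_i| <= beta_n.
def sigmoid(x):
    """Logistic squasher: monotone increasing, -> 0 at -inf and -> 1 at +inf."""
    return 1.0 / (1.0 + np.exp(-x))

def network(x, a, b, c, c0, beta_n):
    """Evaluate f(x) = sum_i c_i * sigmoid(a_i^T x + b_i) + c_0 for x in R^d."""
    if np.abs(c).sum() + abs(c0) > beta_n:
        raise ValueError("output-weight bound sum_i |c_i| <= beta_n violated")
    return c0 + c @ sigmoid(a @ x + b)

# Example: k = 2 hidden neurons in dimension d = 1.
a = np.array([[1.0], [-1.0]])
b = np.array([0.0, 0.0])
c = np.array([0.5, -0.5])
f0 = network(np.array([0.0]), a, b, c, c0=0.1, beta_n=2.0)
```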
Let $\beta_n > 0$ be chosen subject to the constraint $\beta_n \to \infty$ as $n \to \infty$, such that some regularity constraints are satisfied, and let $\mathcal{F}_k(\beta_n)$ be the class of neural networks defined by
\[ \mathcal{F}_k(\beta_n) = \left\{ \sum_{i=1}^{k} c_i \, \sigma(a_i^{\mathsf T} x + b_i) + c_0 \;:\; a_i \in \mathbb{R}^d,\; b_i \in \mathbb{R},\; \sum_{i=0}^{k} |c_i| \le \beta_n \right\}. \]
Now, in order to estimate the continuation values, first generate sample paths $\{X^{(l)}_{i,t}\}_{t=0,\ldots,T}$ ($l = 0, 1, \ldots, T-1$, $i = 1, 2, \ldots, n$) of the underlying process $\{X_t\}_{t=0,\ldots,T}$. With the simulated data and the dynamic programming (DP) representation, one can generate samples of the continuation value recursively for each path $i$ and each time instant $t$. These samples are then used to fit a neural network using the principle of least squares.
Start with
\[ \hat{C}_{n,T}(x) = 0, \quad x \in \mathbb{R}^d, \]
where $\hat{C}_{n,t}(x)$ denotes an approximation to the actual continuation value function $C_t(x)$. Fix $t \in \{0, \ldots, T-1\}$. Given an estimate $\hat{C}_{n,t+1}(x)$ of $C_{t+1}$, one can compute an approximate sample of the continuation value on each sample path $i$ at time instant $t$, denoted by
\[ Y^{(t)}_{i,t} = \max\left\{ h\big(X^{(t)}_{i,t+1}\big),\; \hat{C}_{n,t+1}\big(X^{(t)}_{i,t+1}\big) \right\}. \]
So the final approximate data set at time instant $t$ is given by
\[ \left\{ \big(X^{(t)}_{i,t},\, Y^{(t)}_{i,t}\big) : i = 1, \ldots, n \right\}. \tag{2.1} \]
We note that this sample depends on the $t$th sample of $\{X_s\}_{s=0,\ldots,T}$ and on $\hat{C}_{n,t+1}$, i.e., for each time step $t$ a new sample of the stochastic process $\{X_s\}_{s=0,\ldots,T}$ is used to define the data in (2.1). This ensures that the algorithm evaluates a lower bound to the option price.
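One backward step of this recursion can be sketched as follows, here at the terminal step where $\hat C_{n,T} = 0$; the path model and parameters are illustrative assumptions:

```python
import numpy as np

# Hedged sketch of one backward step: fresh paths are drawn for step t and
# regression targets Y = max(payoff at t+1, next-step estimate at t+1) are
# formed. c_next stands in for \hat C_{n,t+1} (here the terminal case, = 0).
rng = np.random.default_rng(1)
n, strike = 1_000, 1.0

x_t = rng.lognormal(mean=0.0, sigma=0.2, size=n)           # X_{i,t}
x_t1 = x_t * rng.lognormal(mean=0.0, sigma=0.2, size=n)    # X_{i,t+1}

payoff = lambda x: np.maximum(strike - x, 0.0)             # put payoff h
c_next = lambda x: np.zeros_like(x)                        # \hat C_{n,t+1} = 0

y = np.maximum(payoff(x_t1), c_next(x_t1))                 # targets Y_{i,t}
```

The pairs `(x_t, y)` are the data set (2.1) for the time-$t$ regression fit.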
The parameter $k$ of the neural network regression estimate is chosen automatically by sample splitting. The data in (2.1) is subdivided into a learning sample of size $n_l = \lceil n/2 \rceil$ and a testing sample of size $n_t = n - n_l$. Define, for a given $k \in \mathcal{P}_n = \{1, \ldots, n\}$, a regression estimate of $C_t$ by
\[ \hat{C}^{k}_{n_l,t}(\cdot) = \arg\min_{f \in \mathcal{F}_k(\beta_n)} \left\{ \frac{1}{n_l} \sum_{i=1}^{n_l} \big| f\big(X^{(t)}_{i,t}\big) - Y^{(t)}_{i,t} \big|^2 \right\}, \]
where the underlying assumption is that the minimum exists; it need not, however, be unique. The optimal number of hidden neurons $k$ is chosen by minimizing the empirical $L_2$ risk on the testing sample, so $k$ is chosen as
\[ \hat{k} = \arg\min_{k \in \mathcal{P}_n} \frac{1}{n_t} \sum_{i=n_l+1}^{n} \big| \hat{C}^{k}_{n_l,t}\big(X^{(t)}_{i,t}\big) - Y^{(t)}_{i,t} \big|^2, \]
and the final neural network regression estimate of $C_t$ is defined as
\[ \hat{C}_{n,t}(x) = \hat{C}^{\hat{k}}_{n_l,t}(x), \quad x \in \mathbb{R}^d. \]
This neural network regression estimate of the continuation value is then used to make the
exercise decision at every time instant t to calculate the estimate of the American option price
using simulated sample paths.
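The sample-splitting selection of $k$ can be sketched generically; the polynomial stand-in for the neural network fit and all names below are illustrative assumptions:

```python
import numpy as np

# Hedged sketch of the data-driven choice of k: fit on the first half of the
# data, pick the candidate with smallest empirical L2 risk on the second half.
# `fit` is a user-supplied routine returning a callable predictor.
def select_k_by_split(x, y, fit, candidate_ks):
    n_l = (len(x) + 1) // 2                    # learning sample, ceil(n/2)
    best_k, best_risk = None, np.inf
    for k in candidate_ks:
        model = fit(x[:n_l], y[:n_l], k)
        risk = np.mean((model(x[n_l:]) - y[n_l:]) ** 2)
        if risk < best_risk:
            best_k, best_risk = k, risk
    return best_k, best_risk

# Example with polynomial fits standing in for neural networks.
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 200)
y = x**2 + 0.05 * rng.standard_normal(200)

def poly_fit(xs, ys, k):
    coeffs = np.polyfit(xs, ys, deg=k)
    return lambda t: np.polyval(coeffs, t)

k_hat, risk = select_k_by_split(x, y, poly_fit, candidate_ks=[1, 2, 4, 8])
```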
Chapter 3
Kernel based method
The proposed technique for American option pricing is based on using a kernel-based estimator to approximate the continuation value function. The non-parametric kernel estimator allows us to approximate an arbitrary continuation value function at a lower computational cost than the method employing neural networks. Given $N$ i.i.d. samples of $(X, Y)$, denoted by $(X_i, Y_i)$, $1 \le i \le N$, we define the Nadaraya-Watson estimator for estimating $C(x) := \mathbb{E}[Y \mid X = x]$ as follows:
\[ \hat{C}_N(x) = \frac{\sum_{i=1}^{N} Y_i \, K_h(x - X_i)}{\sum_{i=1}^{N} K_h(x - X_i)}, \]
where $K_h(x) = (1/h^d)\, K(x/h)$, $K$ is the kernel function, which is essentially a symmetric density function on $\mathbb{R}^d$, and $h$ is the bandwidth parameter satisfying $h \to 0$ and $N h^d \to \infty$ as $N \to \infty$.
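A minimal sketch of the estimator in $d = 1$ follows; the Gaussian kernel and all parameters are illustrative (the text's $K$ can be any symmetric density, and the common normalization of $K_h$ cancels in the ratio):

```python
import numpy as np

# Hedged sketch of the Nadaraya-Watson estimator in d = 1.
def nadaraya_watson(x0, X, Y, h):
    w = np.exp(-0.5 * ((x0 - X) / h) ** 2)   # kernel weights K((x0 - X_i)/h)
    return np.sum(w * Y) / np.sum(w)

rng = np.random.default_rng(5)
X = rng.uniform(0, 1, 2000)
Y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(2000)
est = nadaraya_watson(0.25, X, Y, h=0.05)    # E[Y | X = 0.25] = sin(pi/2) = 1
```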
In our proposed method we use a fixed-ball estimator, which is a special case of the Nadaraya-Watson estimator with $K_h(x) = \mathbf{1}(|x| < h)$. The algorithm provides a low-biased estimator for the price of an American option with a fixed set of $m+1$ exercise opportunities $0 = t_0 < t_1 < \ldots < t_m = T$. The complete algorithm is summarized as follows:
1. Generate a set of $N$ sample paths $\{X_{j t_i},\, i = 0, \ldots, m\}_{j=1}^{N}$ of the underlying process $\{X_t,\, 0 \le t \le T\}$ and refer to them as Stage 1 paths. For every Stage 1 sample path, obtain a noisy sample of the continuation value at every exercise opportunity $t_i$ using the dynamic programming (DP) setup as illustrated in Chapter 2;
2. Generate another set of $N$ sample paths $\{X_{j t_i},\, i = 0, \ldots, m\}_{j=1}^{N}$ of the underlying process, referred to as Stage 2 paths;
3. For each Stage 2 sample path, starting from $t = t_1$, find neighbors among the Stage 1 paths and estimate the continuation value as the average of the noisy continuation values of those neighbors;
4. Make the exercise decision using the estimated continuation value and calculate the option price estimator as the average of the discounted payoffs of the Stage 2 sample paths.
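The four steps above can be sketched in the two-period case; the geometric Brownian motion parameters, bandwidth, and path counts are illustrative assumptions:

```python
import numpy as np

# Hedged two-period sketch of the two-stage algorithm for a put on geometric
# Brownian motion (zero interest rate; parameters are illustrative).
rng = np.random.default_rng(2)
N, h, strike = 5_000, 0.05, 1.0

def simulate(n):
    z1, z2 = rng.standard_normal((2, n))
    x1 = np.exp(-0.02 + 0.2 * z1)          # X at t1
    x2 = x1 * np.exp(-0.02 + 0.2 * z2)     # X at t2 = T
    return x1, x2

g = lambda x: np.maximum(strike - x, 0.0)  # put payoff

# Stage 1: noisy continuation samples g(X_2) attached to X_1.
s1_x1, s1_x2 = simulate(N)
s1_y = g(s1_x2)

# Stage 2: fixed-ball estimate of C(X_1), then exercise decision per path.
s2_x1, s2_x2 = simulate(N)
payoff = np.empty(N)
for j in range(N):
    ball = np.abs(s1_x1 - s2_x1[j]) < h
    c_hat = s1_y[ball].mean() if ball.any() else 0.0
    payoff[j] = g(s2_x1[j]) if c_hat <= g(s2_x1[j]) else g(s2_x2[j])

v_hat = payoff.mean()                      # low-biased price estimate
```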
With a finite sample there is a positive probability that the policy evaluated using the Stage 1 paths is suboptimal. In that case, since we use a sub-optimal policy to exercise the option, we obtain an estimate of the true option value that is biased low. We can also split the computational budget so as to generate different numbers of sample paths in Stage 1 and Stage 2. We restrict the discussion in this article to the case where the underlying price process $X$ takes values in $\mathbb{R}$ and leave the analysis in higher dimensions for future work.
3.1 Two time-period problem
To understand the properties of the fixed-ball estimator, we first consider the two time-period problem, where we fix the exercise opportunities at $0 = t_0 < t_1 < t_2 = T$. For any Stage 2 sample path $(X_0, X_1, X_2)$ of the underlying process, the continuation value at $t_1$ can be approximated using the paths simulated in Stage 1 with a fixed-ball estimator. Assuming it is always possible to find the neighbors, it can be expressed as
\[ \hat{C}_N(X_1) = \frac{\sum_{j=1}^{N} \big(C(X_{j1}) + W_j\big)\, \mathbf{1}_{\{|X_{j1} - X_1| < h\}}}{\sum_{j=1}^{N} \mathbf{1}_{\{|X_{j1} - X_1| < h\}}}, \]
where $C(\cdot)$ is the continuation value function, $\{X_{j1}\}_{j=1}^{N}$ are the Stage 1 sample path values at $t_1$, and $W_j$ is distributed independently and normally with $\mathbb{E}[W_j \mid X_{j1}] = 0$ and $\mathbb{E}[W_j^2 \mid X_{j1} = x] = \sigma^2(x)$. Note that in the two time-period setting $C(X_1) = \mathbb{E}[g(X_2) \mid X_1]$. We can estimate the price of the option by exercising the approximate policy evaluated using the Stage 1 sample paths. The estimated price is then given by
\[ \hat{V} = \frac{1}{N} \sum_{j=1}^{N} \Big[ g(X_{j1})\, \mathbf{1}_{\{\hat{C}_N(X_{j1}) \le g(X_{j1})\}} + g(X_{j2})\, \mathbf{1}_{\{\hat{C}_N(X_{j1}) > g(X_{j1})\}} \Big], \]
and the actual price by
\[ V = \mathbb{E}\big[ g(X_1)\, \mathbf{1}_{\{\mathbb{E}[g(X_2) \mid X_1] \le g(X_1)\}} \big] + \mathbb{E}\big[ g(X_2)\, \mathbf{1}_{\{\mathbb{E}[g(X_2) \mid X_1] > g(X_1)\}} \big]. \]
To calculate the bias $\mathbb{E}[\hat{V}] - V$, we can easily verify that the difference has a non-zero contribution only from the sets $\big\{ X_1 : \hat{C}_N(X_1) \le g(X_1) < \mathbb{E}[g(X_2) \mid X_1] \big\}$ and $\big\{ X_1 : \mathbb{E}[g(X_2) \mid X_1] \le g(X_1) < \hat{C}_N(X_1) \big\}$. Hence,
\[ \mathbb{E}[\hat{V}] - V = \mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < \mathbb{E}[g(X_2) \mid X_1]\}} \Big] + \mathbb{E}\Big[ \big(g(X_2) - g(X_1)\big)\, \mathbf{1}_{\{\mathbb{E}[g(X_2) \mid X_1] \le g(X_1) < \hat{C}_N(X_1)\}} \Big]. \tag{3.1} \]
Claim. $\hat{V}$ is biased low, i.e., $\mathbb{E}[\hat{V}] - V \le 0$.
Proof. We start with the first term in the bias (3.1):
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < \mathbb{E}[g(X_2) \mid X_1]\}} \Big] \\
&= \mathbb{E}\Big[ \mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < \mathbb{E}[g(X_2) \mid X_1]\}} \,\Big|\, X_1, X_{11}, \ldots, X_{N1} \Big] \Big] \\
&= \mathbb{E}\Big[ \big(g(X_1) - \mathbb{E}[g(X_2) \mid X_1]\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < \mathbb{E}[g(X_2) \mid X_1]\}} \Big] \\
&\le 0,
\end{align*}
since $X_2$ depends only on $X_1$ and is independent of $X_{j1}$ for all $j$ in Stage 1, and the indicator is non-zero only when $g(X_1) < \mathbb{E}[g(X_2) \mid X_1]$; similarly, $\mathbb{E}\big[ \big(g(X_2) - g(X_1)\big)\, \mathbf{1}_{\{\mathbb{E}[g(X_2) \mid X_1] \le g(X_1) < \hat{C}_N(X_1)\}} \big] \le 0$.
Intuitively, too, it is clear that since we use a sub-optimal policy to exercise the option, the estimate of the true option value will be biased low. The bias and variance of the estimated price of an American option in the two time-period setting can be calculated using standard results for the Nadaraya-Watson estimator. For a given value of $X_1 = x$, the bias and variance of $\hat{C}_N(x)$ are denoted by $B_N(x)$ and $V_N(x)$ respectively. The following existence and continuity conditions are assumed to hold while deriving $B_N(x)$ and $V_N(x)$:
Assumption 1. $X$ has a probability density function $f(x)$ and $f(x) > 0$;
Assumption 2. $C(x)$ is twice continuously differentiable and has bounded first- and second-order derivatives;
Assumption 3. $f(x)$ is twice continuously differentiable and has a bounded second-order derivative;
Assumption 4. $h \to 0$, $N h^d \to \infty$ as $N \to \infty$.
Lemma 3.1. Let $\sigma^2(x) = \mathbb{E}[Y^2 \mid X = x] - C^2(x) > 0$, $\mu_K = \int_{\mathbb{R}^d} u^{\mathsf T} u \, K(u)\, du$ and $c_K = \int_{\mathbb{R}^d} K^2(u)\, du$. With assumptions 1-4 and under some regularity conditions,
\[ B_N(x) = B(x)\, h^2 + o(h^2), \qquad V_N(x) = \frac{V(x)}{N h^d} + \frac{o(1)}{N h^d}, \]
where the notation $A^{\mathsf T}$ and $\operatorname{tr}(A)$ denotes the transpose and trace of a matrix $A$ respectively, and
\[ B(x) = \frac{\mu_K}{2 f(x)} \left\{ \operatorname{tr}\left( \frac{\partial}{\partial x} \frac{\partial}{\partial x^{\mathsf T}} \big[C(x) f(x)\big] \right) - C(x)\, \operatorname{tr}\left( \frac{\partial}{\partial x} \frac{\partial}{\partial x^{\mathsf T}} f(x) \right) \right\}, \qquad V(x) = \frac{c_K\, \sigma^2(x)}{f(x)}. \]
A detailed discussion of this lemma can be found in Bosq [4] and Pagan and Ullah [19]. In our setting $d = 1$, $Y_i = C(X_i) + W_i$ and $c_K = 2$. To prove our result for the bias and variance of the fixed-ball estimator we need to make further assumptions, which are discussed as follows.
Assumption 5. For $N h^5 \to C$,
\[ C(X_1) - \hat{C}_N(X_1) = Z^h_N(X_1) = \frac{\hat{Z}}{\sqrt{N h}} \sqrt{\frac{2 \sigma^2(X_1)}{f(X_1)}} - B(X_1)\, h^2, \]
where $\mathbb{E}[\hat{Z} \mid X_1] = 0$, $\mathbb{E}[\hat{Z}^2 \mid X_1] = 1$ and $\mathbb{E}[|\hat{Z}|^3 \mid X_1] < \infty$.
The above assumption can be justified with Lemma 3.1. We note that for a sufficiently large value of $N$, $\hat{C}_N(X)$ is close to $C(X)$, so exercise errors are committed only close to the exercise boundary, which is the region $\mathcal{E} = \{x \in \mathbb{R} : C(x) = g(x)\}$. Under the geometric Brownian motion model for an American put option, $\mathcal{E}$ contains a single point $x_0$.
In order to state our result for the asymptotic bias and variance of the option price estimator, we first need to discuss the properties of the terms
\[ \tilde{B}(X_1) := \frac{B(X_1)}{G(X_1)} \quad \text{and} \quad H(X_1) := \frac{C(X_1) - g(X_1)}{G(X_1)}, \quad \text{where} \quad G(X_1) := \sqrt{\frac{2 \sigma^2(X_1)}{f(X_1)}}. \]
Assumption 6. $\dfrac{f^{(1)}}{f}$ is bounded.
The above assumption holds for $x > \epsilon > 0$ when $X_1$ has a lognormal distribution. It also holds for the Pareto distribution on a support $[x_m, \infty)$ with $x_m > 0$. For $d = 1$,
\[ B = \frac{\mu_K}{2} \left( C^{(2)} + 2\, C^{(1)} \frac{f^{(1)}}{f} \right), \]
and with assumptions 2 and 3 we can check that $B(x)$ is bounded.
Option payoff functions often fail to be differentiable everywhere, e.g., the American put payoff $h(x) = \max\{K - x, 0\}$. But the points of non-differentiability can be ignored because the probability that the underlying process $X_t$ hits them is 0. To make this precise, let
\[ \mathcal{T}_g = \{ x \in \mathbb{R}_+ : g(\cdot) \text{ is differentiable at } x \}, \]
and we need
Assumption 7. $\mathbb{P}(X \in \mathcal{T}_g) = 1$.
The example of the standard American put option satisfies assumption 7 if we assume that the underlying asset follows geometric Brownian motion: the put option payoff is not differentiable only at the strike price $K$, and the probability that the underlying process is exactly equal to $K$ is zero.
Assumption 8. $\sigma^2(x) > 0$, bounded and continuously differentiable.
With assumptions 7 and 8 we can check that $H(\cdot)$ and $\tilde{B}(\cdot)$ are both differentiable. Furthermore, in conjunction with assumption 6, $\tilde{B}(\cdot)$ is bounded.
Proposition 3.2. Let $N h^5 \to C$ and suppose that assumptions 1-8 hold. Furthermore, assume that $X_1$ and $\hat{Z}$ have a joint density $f_{X_1, \hat{Z}}(\cdot, \cdot)$ which has a partial derivative w.r.t. $x$ and
\[ \left| \frac{\partial f_{X_1, \hat{Z}}}{\partial x_1}(\cdot, z) \Big/ f_{X_1, \hat{Z}}(\cdot, z) \right| \le M \quad \forall z \in \mathbb{R}. \]
Then,
\[ V - \mathbb{E}[\hat{V}] = \frac{G(x_0)\, f(x_0)}{2\, H'(x_0)} \left( \tilde{B}^2(x_0)\, h^4 + \frac{1}{N h} \right) + o\left( \frac{1}{N h} \right). \]
To analyze the variance of $\hat{V}$, we first note that
\begin{align*}
\operatorname{Var}(\hat{V}) &= \frac{1}{N} \operatorname{Var}\Big( g(X_1)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1)\}} + g(X_2)\, \mathbf{1}_{\{\hat{C}_N(X_1) > g(X_1)\}} \Big) \\
&\quad + \left(1 - \frac{1}{N}\right) \operatorname{Cov}\Big( g(X_{11})\, \mathbf{1}_{\{\hat{C}_N(X_{11}) \le g(X_{11})\}} + g(X_{12})\, \mathbf{1}_{\{\hat{C}_N(X_{11}) > g(X_{11})\}}, \\
&\qquad\qquad\qquad\qquad\quad g(X_{21})\, \mathbf{1}_{\{\hat{C}_N(X_{21}) \le g(X_{21})\}} + g(X_{22})\, \mathbf{1}_{\{\hat{C}_N(X_{21}) > g(X_{21})\}} \Big). \tag{3.2}
\end{align*}
The variance term and the covariance term on the RHS of equation (3.2) will be analyzed separately. To analyze the variance term, we use the following lemma from [16].
Lemma 3.3. Suppose that for the function $g(\cdot)$ the discontinuity set, denoted by $\mathcal{D}_g$, satisfies $\mathbb{P}(C(X) \in \mathcal{D}_g) = 0$, and there exist a constant $M$ and an integer $p \ge 0$ such that $|g(t)| \le M |t|^p$. Assume that $\hat{C}_N(X)$ converges to $C(X)$ in probability as $N \to \infty$, and that there exists some $\delta > 0$ such that
\[ \sup_N \mathbb{E}\Big[ |\hat{C}_N(X)|^{2p + \delta} \Big] < \infty. \]
Then,
\[ \operatorname{Var}\big( g(\hat{C}_N(X)) \big) = \operatorname{Var}\big( g(C(X)) \big) + o(1). \]
Proposition 3.4. Let $N h^5 \to C$ and suppose that $f(x) \le C_1$, $|f'(x)| \le C_1$, and that the assumptions of Lemma 3.3 and assumptions 1-8 hold. Then,
\[ \operatorname{Var}(\hat{V}) = O\left( \frac{1}{N} \right). \]
We can analyze the mean square error (MSE) of $\hat{V}$ in order to compute the optimal value of $h$ by combining the bias and variance results from Propositions 3.2 and 3.4:
\[ \operatorname{MSE}(\hat{V}) = \left[ \frac{G(x_0)\, f(x_0)}{2\, H'(x_0)} \left( \tilde{B}^2(x_0)\, h^4 + \frac{1}{N h} \right) + o\left( \frac{1}{N h} \right) \right]^2 + O\left( \frac{1}{N} \right). \tag{3.3} \]
Expanding the RHS of equation (3.3), we see that the MSE of $\hat{V}$ has dominant terms
\[ a^2 h^8 + \frac{2 a b\, h^3}{N} + \frac{b^2}{N^2 h^2} + \frac{c}{N} \]
for some constants $a$, $b$ and $c$.
We assume that, on average, the computational effort required to generate a sample $(X_1, X_2)$ is $\gamma$ and that the effort required to compute the estimator given the sample is negligible. Then approximately we have $\Gamma = \gamma N$, where $\Gamma$ denotes the total computational budget. We let $\Gamma \to \infty$ to analyze the asymptotic convergence rate of the estimator $\hat{V}$ with the computational budget, and we can set $\gamma = 1$ without loss of generality. Then the MSE of $\hat{V}$ has dominant terms
\[ a^2 h^8 + \frac{2 a b\, h^3}{\Gamma} + \frac{b^2}{\Gamma^2 h^2} + \frac{c}{\Gamma}. \]
We can easily verify that the optimal rate of convergence of the MSE equals $\Gamma^{-8/5}$ and that the optimal $h$ is of order $\Gamma^{-1/5}$.
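This bandwidth trade-off can be checked numerically (a hedged sketch with $a = b = 1$ for illustration): balancing the $h^8$ term against the $1/(\Gamma^2 h^2)$ term makes the grid minimizer track $\Gamma^{-1/5}$ up to a constant.

```python
import numpy as np

# Hedged numerical check: minimize a^2 h^8 + 2ab h^3/Gamma + b^2/(Gamma^2 h^2)
# over a grid of h for two budgets; h* * Gamma^(1/5) should be roughly
# constant across budgets if h* is of order Gamma^(-1/5). a = b = 1 here.
ratios = []
for budget in (1e4, 1e6):
    hs = np.logspace(-4, 0, 4001)
    mse = hs**8 + 2 * hs**3 / budget + 1.0 / (budget**2 * hs**2)
    h_star = hs[np.argmin(mse)]
    ratios.append(h_star * budget**0.2)
```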
3.2 Multiple time-period problem
We can use the results from the previous section to extend the two time-period analysis to multiple periods. We start with the three time-period case and fix the dates of the exercise opportunities at $0 < t_1 < t_2 < t_3$. The risk-free rate of interest is assumed to be zero to ease the notation. In this setting the actual option price is given by
\begin{align*}
V &= \mathbb{E}\big[ g(X_1)\, \mathbf{1}_{\{C(X_1) \le g(X_1)\}} \big] + \mathbb{E}\big[ g(X_2)\, \mathbf{1}_{\{C(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \big] \\
&\quad + \mathbb{E}\big[ g(X_3)\, \mathbf{1}_{\{C(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \big]
\end{align*}
and the estimated option price by
\begin{align*}
\hat{V} &= \frac{1}{N} \sum_{j=1}^{N} \Big[ g(X_{j1})\, \mathbf{1}_{\{\hat{C}_N(X_{j1}) \le g(X_{j1})\}} + g(X_{j2})\, \mathbf{1}_{\{\hat{C}_N(X_{j1}) > g(X_{j1})\}} \mathbf{1}_{\{\hat{C}_N(X_{j2}) \le g(X_{j2})\}} \\
&\qquad\qquad + g(X_{j3})\, \mathbf{1}_{\{\hat{C}_N(X_{j1}) > g(X_{j1})\}} \mathbf{1}_{\{\hat{C}_N(X_{j2}) > g(X_{j2})\}} \Big].
\end{align*}
We study the following difference to calculate the bias of the estimator in the three-period setting:
\begin{align*}
V - \mathbb{E}[\hat{V}] &= \mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_1) - g(X_3)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) > g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_2) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_3) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_3) - g(X_2)\big)\, \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2) < C(X_2)\}} \mathbf{1}_{\{\hat{C}_N(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_1) > g(X_1)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_2) - g(X_3)\big)\, \mathbf{1}_{\{C(X_2) \le g(X_2) < \hat{C}_N(X_2)\}} \mathbf{1}_{\{\hat{C}_N(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_1) > g(X_1)\}} \Big].
\end{align*}
Consider the term
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_3) - g(X_2)\big)\, \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2) < C(X_2)\}} \mathbf{1}_{\{\hat{C}_N(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_1) > g(X_1)\}} \Big] \\
&= \mathbb{E}\Big[ \big(C(X_2) - g(X_2)\big)\, \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2) < C(X_2)\}} \mathbf{1}_{\{\hat{C}_N(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_1) > g(X_1)\}} \Big] \tag{3.4}
\end{align*}
and
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_2) - g(X_3)\big)\, \mathbf{1}_{\{C(X_2) \le g(X_2) < \hat{C}_N(X_2)\}} \mathbf{1}_{\{\hat{C}_N(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_1) > g(X_1)\}} \Big] \\
&= \mathbb{E}\Big[ \big(g(X_2) - C(X_2)\big)\, \mathbf{1}_{\{C(X_2) \le g(X_2) < \hat{C}_N(X_2)\}} \mathbf{1}_{\{\hat{C}_N(X_1) > g(X_1)\}} \mathbf{1}_{\{C(X_1) > g(X_1)\}} \Big]. \tag{3.5}
\end{align*}
We can easily check that these two terms are bounded above by
\[ \mathbb{E}\Big[ \big(C(X_2) - g(X_2)\big)\, \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2) < C(X_2)\}} \Big] \quad \text{and} \quad \mathbb{E}\Big[ \big(g(X_2) - C(X_2)\big)\, \mathbf{1}_{\{C(X_2) \le g(X_2) < \hat{C}_N(X_2)\}} \Big] \]
respectively. Adding these terms gives a term of the same order as in the bias of the single-period problem. Next, the term
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(g(X_1) - \max(C(X_2), g(X_2))\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(g(X_1) - \max(C(X_2), g(X_2))\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big] \\
&\quad - \mathbb{E}\Big[ \big(g(X_1) - \max(C(X_2), g(X_2))\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) > g(X_2)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \tag{3.6}
\end{align*}
and
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_1) - g(X_3)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) > g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(g(X_1) - C(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) > g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(g(X_1) - \max(C(X_2), g(X_2))\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) > g(X_2)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_1) - C(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) > g(X_2)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(g(X_1) - \max(C(X_2), g(X_2))\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&\quad - \mathbb{E}\Big[ \big(g(X_1) - \max(C(X_2), g(X_2))\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_1) - C(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) > g(X_2)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big] \tag{3.7}
\end{align*}
add up to give
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_1) - C(X_1)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(C(X_2) - g(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{\hat{C}_N(X_2) \le g(X_2) < C(X_2)\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big(g(X_2) - C(X_2)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}} \mathbf{1}_{\{C(X_2) \le g(X_2) < \hat{C}_N(X_2)\}} \Big]. \tag{3.8}
\end{align*}
Finally, the terms
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_3) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(\mathbb{E}[g(X_3) \mid X_1, X_2] - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(C(X_2) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(\max(C(X_2), g(X_2)) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) > g(X_2)\}} \Big] \tag{3.9}
\end{align*}
and
\begin{align*}
& \mathbb{E}\Big[ \big(g(X_2) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big] \\
&= \mathbb{E}\Big[ \big(\max(C(X_2), g(X_2)) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \mathbf{1}_{\{C(X_2) \le g(X_2)\}} \Big]. \tag{3.10}
\end{align*}
Again, when added, the above two terms give
\[ \mathbb{E}\Big[ \big(C(X_1) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \Big], \]
which can be analyzed using the bias term in the single-period problem.
In the multiple time-period setting, bias in the estimator arises when an incorrect or sub-optimal decision is made. If the option is continued at an exercise opportunity $t_i$ even though $g(X_i) \ge C(X_i)$, we refer to the instance as a mistake. An incorrect decision can result from committing at most $m$ mistakes. Let $\tau$ denote the optimal exercise opportunity and $\hat{\tau}$ the eventual exercise time based on the decision derived using the continuation value estimator. We consider the following cases.
Case $\tau = 1$, one mistake:
\[ \mathbb{E}\Big[ \big(g(X_1) - g(X_{\hat{\tau}_1})\big)\, \mathbf{1}_{\{g(X_1) \ge C(X_1)\}}\, \mathbf{1}_{\{\hat{\tau} = \hat{\tau}_1\}} \Big], \]
where $\hat{\tau}_1$ is the next optimal time after the option is continued at $t_1$. In this case $\mathbb{E}[g(X_{\hat{\tau}_1}) \mid X_1] = C(X_1)$, and one can reduce the expression to
\[ \mathbb{E}\Big[ \big(g(X_1) - C(X_1)\big)\, \mathbf{1}_{\{C(X_1) \le g(X_1) < \hat{C}_N(X_1)\}}\, \mathbf{1}_{\{\hat{\tau} = \hat{\tau}_1\}} \Big], \]
which, from our previous analysis, is of the same order as the bias term in the single-period problem.
Case $\tau = 1$, two mistakes:
\[ \sum_{i=2}^{m-1} \mathbb{E}\Big[ \big(g(X_1) - g(X_{\hat{\tau}_2})\big)\, \mathbf{1}_{\{g(X_1) \ge C(X_1)\}}\, \mathbf{1}_{\{\hat{\tau}_1 = i\}}\, \mathbf{1}_{\{\hat{\tau} = \hat{\tau}_2\}} \Big], \]
where $\hat{\tau}_2$ is the next optimal time after $\hat{\tau}_1$. Similarly, one can formalize the bias terms when $\tau = 1$ and there are more than two mistakes using a sequence of stopping times $\hat{\tau}_i$, $2 \le i \le m-1$.
Chapter 4
Proofs of the main results
Proof of Proposition 3.2. From the bias expression in equation (3.1), the first term of $V - \mathbb{E}[\hat{V}]$ is given by
\begin{align*}
\mathbb{E}\Big[ \big(g(X_2) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) \le g(X_1) < C(X_1)\}} \Big]
&= \mathbb{E}\Big[ \big(C(X_1) - g(X_1)\big)\, \mathbf{1}_{\{\hat{C}_N(X_1) - C(X_1) \le g(X_1) - C(X_1) < 0\}} \Big] \\
&= \mathbb{E}\Big[ Y\, \mathbf{1}_{\{0 < Y \le Z^h_N(X_1)\}} \Big],
\end{align*}
where $Y = C(X_1) - g(X_1)$ and $Z^h_N(X_1) = C(X_1) - \hat{C}_N(X_1)$. Using assumption 5 we can write
\begin{align*}
\mathbb{E}\Big[ Y\, \mathbf{1}_{\{0 < Y \le Z^h_N(X_1)\}} \Big]
&= \mathbb{E}\bigg[ Y\, \mathbf{1}_{\Big\{0 < Y \le \frac{\hat{Z}}{\sqrt{N h}} \sqrt{\frac{2\sigma^2(X_1)}{f(X_1)}} - B(X_1) h^2\Big\}} \bigg] \\
&= \mathbb{E}\bigg[ Y\, \mathbf{1}_{\Big\{0 < \frac{Y}{\sqrt{2\sigma^2(X_1)/f(X_1)}} \le \frac{\hat{Z}}{\sqrt{N h}} - \frac{B(X_1)}{\sqrt{2\sigma^2(X_1)/f(X_1)}}\, h^2\Big\}} \bigg] \\
&= \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \frac{\hat{Z}}{\sqrt{N h}} - \tilde{B}(X_1) h^2\big\}} \Big],
\end{align*}
where $G(X_1) = \sqrt{\frac{2\sigma^2(X_1)}{f(X_1)}}$, $H(X_1) = \frac{Y}{G(X_1)}$ and $\tilde{B}(X_1) = \frac{B(X_1)}{G(X_1)}$.
Performing similar calculations, the other term in the bias expression of equation (3.1) reduces to
\begin{align*}
\mathbb{E}\Big[ \big(g(X_1) - g(X_2)\big)\, \mathbf{1}_{\{\mathbb{E}[g(X_2) \mid X_1] \le g(X_1) < \hat{C}_N(X_1)\}} \Big]
&= \mathbb{E}\Big[ -Y\, \mathbf{1}_{\{Z^h_N(X_1) < Y \le 0\}} \Big] \\
&= -\,\mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{\frac{\hat{Z}}{\sqrt{N h}} - \tilde{B}(X_1) h^2 < H(X_1) \le 0\big\}} \Big]. \tag{4.1}
\end{align*}
We define $\mathcal{A} = \{X_1 : |X_1 - x_0| < c h^{\gamma}\}$, where $0 < \gamma < 2$ is chosen such that $|\tilde{B}(X_1) - \tilde{B}(x_0)| \ll \tilde{B}(x_0)$ for all $X_1 \in \mathcal{A}$, which is possible as $\tilde{B}(x)$ is differentiable. Then we can write
\begin{align*}
& \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \frac{\hat{Z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2 - (\tilde{B}(X_1) - \tilde{B}(x_0)) h^2,\; X_1 \in \mathcal{A}\big\}} \Big] \\
&= \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \frac{\hat{Z}}{\sqrt{N h}} - (\tilde{B}(x_0) + r(h^{\gamma})) h^2,\; X_1 \in \mathcal{A}\big\}} \Big] \\
&= \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \big(\frac{\hat{Z}}{\sqrt{N h^5}} - \tilde{B}(x_0) - r(h^{\gamma})\big) h^2,\; X_1 \in \mathcal{A}\big\}} \Big]. \tag{4.2}
\end{align*}
Now, since $H(X_1)$ is differentiable, the term in equation (4.2) can be seen to be dominated by
\[ \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \big(\frac{\hat{Z}}{\sqrt{N h^5}} - \tilde{B}(x_0)\big) h^2\big\}} \Big]. \]
By the inverse function theorem, the inverse $Q(\cdot)$ of $H(\cdot)$ exists and is differentiable in a neighborhood of $x_0$. We can evaluate the above expression as follows:
\begin{align*}
&= \int_{\hat{z} \ge \sqrt{N h^5}\, \tilde{B}(x_0)} \int_{x_0}^{Q\big(\frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2\big)} G(x_1) H(x_1)\, f_{X_1, \hat{Z}}(x_1, \hat{z})\, dx_1\, d\hat{z} \\
&= \int_{\hat{z} \ge \sqrt{N h^5}\, \tilde{B}(x_0)} \int_{0}^{Q\big(\frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2\big) - x_0} G(x_0 + \nu)\, H(x_0 + \nu)\, f_{\nu, \hat{Z}}(\nu, \hat{z})\, d\nu\, d\hat{z} \\
&= \int_{\hat{z} \ge \sqrt{N h^5}\, \tilde{B}(x_0)} \int_{0}^{Q\big(\frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2\big) - x_0} \big( G(x_0) + G'(\xi)\, \nu \big) \left( H'(x_0)\, \nu + \frac{\nu^2}{2} H''(\xi) \right) \\
&\qquad\qquad\qquad \times \left( f_{\nu, \hat{Z}}(0, \hat{z}) + \nu\, \frac{\partial f_{\nu, \hat{Z}}}{\partial \nu}(\xi, \hat{z}) \right) d\nu\, d\hat{z}, \tag{4.3}
\end{align*}
where $0 < \xi < Q\big(\frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2\big) - x_0$ and we have used $H(x_0) = 0$. Retaining terms up to $o(N^{-1} h^{-1})$ we get
\[ \frac{1}{2} \int_{\hat{z} \ge \sqrt{N h^5}\, \tilde{B}(x_0)} G(x_0)\, H'(x_0)\, f_{\nu, \hat{Z}}(0, \hat{z}) \left[ Q\left(\frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2\right) - x_0 \right]^2 d\hat{z} + o\left( \frac{1}{N h} \right). \]
Calculating the second term in the bias expression, equation (4.1), we obtain a similar term. Therefore, when $X_1 \in \mathcal{A}$, the dominant term in the bias expression is
\begin{align*}
& \frac{1}{2} \int G(x_0)\, H'(x_0)\, f_{\nu, \hat{Z}}(0, \hat{z}) \left[ Q\left(\frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2\right) - x_0 \right]^2 d\hat{z} \\
&= \frac{1}{2} \int G(x_0)\, H'(x_0)\, \frac{f_{\nu, \hat{Z}}(0, \hat{z})}{f_{\nu}(0)} \left[ Q'(0) \left( \frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2 \right) + \frac{1}{2} Q''(\tilde{y}) \left( \frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2 \right)^2 \right]^2 f_{\nu}(0)\, d\hat{z} \\
&= \frac{1}{2} \int G(x_0)\, Q'(0)\, \frac{f_{\nu, \hat{Z}}(0, \hat{z})}{f_{\nu}(0)} \left( \frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2 \right)^2 f_{\nu}(0)\, d\hat{z} + o\left( \frac{1}{N h} \right) \\
&= \frac{1}{2}\, G(x_0)\, Q'(0)\, f_{\nu}(0) \left( \tilde{B}^2(x_0)\, h^4 + \frac{1}{N h} \right) + o\left( \frac{1}{N h} \right),
\end{align*}
where $0 < \tilde{y} < \frac{\hat{z}}{\sqrt{N h}} - \tilde{B}(x_0) h^2$, we have used $Q(0) = x_0$ and $H'(x_0)\, Q'(0)^2 = Q'(0)$, and the rest follows from assumption 5 (via $\mathbb{E}[\hat{Z} \mid X_1] = 0$ and $\mathbb{E}[\hat{Z}^2 \mid X_1] = 1$). The remainder terms can again be shown to be of smaller order by the assumption on the joint density function and assumption 5.
Next, consider the region where $X_1$ is away from $x_0$, $\mathcal{A}' = \{X_1 : |X_1 - x_0| > c h^{\gamma}\}$. For $X_1 \in \mathcal{A}'$, consider first the case $|\tilde{B}(X_1)| \le h^{-\alpha}$. If we choose $\alpha < 2 - \gamma$, then
\begin{align*}
& \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \frac{\hat{Z}}{\sqrt{N h}} - \tilde{B}(X_1) h^2,\; X_1 \in \mathcal{A}',\; |\tilde{B}(X_1)| \le h^{-\alpha}\big\}} \Big] \\
&\le \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \frac{\hat{Z}}{\sqrt{N h^5}}\, h^2 + h^{2-\alpha},\; X_1 \in \mathcal{A}',\; |\tilde{B}(X_1)| \le h^{-\alpha}\big\}} \Big]. \tag{4.4}
\end{align*}
For some $X_1^* \in \mathcal{A}'$, assume $H'(X_1^*) \ne 0$ so that $|H(X_1)| = |H'(X_1^*)(X_1 - x_0)| > K h^{\gamma}$. Then equation (4.4) does not contribute to the bias. Next, for sufficiently small $h$, as $\tilde{B}(X_1)$ is bounded by the previous assumptions, $\mathbb{P}(|\tilde{B}(X_1)| > h^{-\alpha}) = O(h^{\beta})$ for some $\beta > 0$. So the term
\[ \mathbb{E}\Big[ G(X_1) H(X_1)\, \mathbf{1}_{\big\{0 < H(X_1) \le \frac{\hat{Z}}{\sqrt{N h}} - \tilde{B}(X_1) h^2,\; X_1 \in \mathcal{A}',\; |\tilde{B}(X_1)| > h^{-\alpha}\big\}} \Big] \tag{4.5} \]
will also not contribute to the bias. The same assumptions and calculations apply to the expression in equation (4.1), and hence the absolute bias is given by
\[ \big| V - \mathbb{E}[\hat{V}] \big| = \frac{1}{2}\, G(x_0)\, Q'(0)\, f_{\nu}(0) \left( \tilde{B}^2(x_0)\, h^4 + \frac{1}{N h} \right) + o\left( \frac{1}{N h} \right). \]
Proof of Proposition 3.4. To ease the notation we re-write the option price estimator $\hat{V}$ as
\[ \hat{V} = \frac{1}{N} \sum_{i=1}^{N} g(X_{\hat{\tau}_i}), \]
where $\hat{\tau}_i$ is the stopping time for the $i$th Stage 2 sample path, calculated using the continuation value approximation from the Stage 1 sample paths. Then
\[ \operatorname{Var}(\hat{V}) = \frac{1}{N}\, \mathbb{E}\big[ g^2(X_{\hat{\tau}_i}) \big] + \frac{N(N-1)}{N^2} \Big( \mathbb{E}\big[ g(X_{\hat{\tau}_i})\, g(X_{\hat{\tau}_j}) \big] - \mathbb{E}\big[ g(X_{\hat{\tau}_i}) \big]\, \mathbb{E}\big[ g(X_{\hat{\tau}_j}) \big] \Big). \]
We can show from the assumptions that
\[ \frac{1}{N}\, \mathbb{E}\big[ g^2(X_{\hat{\tau}_i}) \big] \approx \frac{1}{N}\, \mathbb{E}\big[ g^2(X_{\tau_i}) \big]. \]
The main contribution to the variance of the estimator comes from the covariance term, which can
be written approximately as
\[ \mathbb{E}\big[ g(X_{\hat{\tau}_i}) g(X_{\hat{\tau}_j}) \big] - \mathbb{E}\big[ g(X_{\hat{\tau}_i}) \big] \mathbb{E}\big[ g(X_{\hat{\tau}_j}) \big] = \mathbb{E}\big[ g(X_{\hat{\tau}_i}) g(X_{\hat{\tau}_j}) \big] - \mathbb{E}\big[ g(X_{\tau_i}) g(X_{\tau_j}) \big] + \mathbb{E}\big[ g(X_{\tau_i}) \big] \mathbb{E}\big[ g(X_{\tau_j}) \big] - \mathbb{E}\big[ g(X_{\hat{\tau}_i}) \big] \mathbb{E}\big[ g(X_{\hat{\tau}_j}) \big], \]
where $\tau_i$ and $\tau_j$ are the optimal stopping times on the independent Stage 2 sample paths $i$ and $j$. To simplify the analysis, we consider the two time-period setting, where an error happens in the scenarios $\hat{\tau}_i = 1, \tau_i = 2$ or $\hat{\tau}_i = 2, \tau_i = 1$. Collecting the corresponding terms we get
\begin{align*}
& \mathbb{E}\Big[ \big( g(X_{i1})\, g(X_{\hat{\tau}_j}) - C(X_{i1})\, g(X_{\hat{\tau}_j}) \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big( C(X_{i1})\, g(X_{\hat{\tau}_j}) - g(X_{i1})\, g(X_{\hat{\tau}_j}) \big)\, \mathbf{1}_{\{C(X_{i1}) \le g(X_{i1}) < \hat{C}_N(X_{i1})\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big( C(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] - g(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \Big] \\
&\quad + \mathbb{E}\Big[ \big( g(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] - C(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] \big)\, \mathbf{1}_{\{C(X_{i1}) \le g(X_{i1}) < \hat{C}_N(X_{i1})\}} \Big]. \tag{4.6}
\end{align*}
(4.6)
Picking the rst term in Equation (4.6) we condition it on the two events A
ij
(h) =
_
[X
i1

X
j1
[ < h
_
and A
c
ij
(h) =
_
[X
i1
X
j1
[ > h
_
.
E
_
g(X
i1
)g(X

j
) C(X
i1
)g(X

j
)1
{

C
N
(X
i1
)g(X
i1
)<C(X
i1
)}

A
ij
(h)

P(A
ij
(h))
+E
_
g(X
i1
)g(X

j
) C(X
i1
)g(X

j
)1
{

C
N
(X
i1
)g(X
i1
)<C(X
i1
)}

A
c
ij
(h)

P(A
c
ij
(h))
= E
_
g(X
i1
)g(X

j
) C(X
i1
)g(X

j
)1
{

C
N
(X
i1
)g(X
i1
)<C(X
i1
)}

A
ij
(h)

P(A
ij
(h))
+E
_
g(X
i1
)1
{

C
N
(X
i1
)g(X
i1
)<C(X
i1
)}

A
c
ij
(h)

E
_
g(X

j
)[A
c
ij
(h)

P(A
c
ij
(h))
E
_
C(X
i1
)1
{

C
N
(X
i1
)g(X
i1
)<C(X
i1
)}

A
c
ij
(h)

E
_
g(X

j
)[A
c
ij
(h)

P(A
c
ij
(h))
(4.7)
where we make the simplifying assumption that, conditioned on the set $A^c_{ij}(h)$, $g(X_{\hat{\tau}_j})$ is independent of $g(X_{i1})$. Next, from our bias calculations, we assume
\[ \mathbb{E}\big[ g(X_{\hat{\tau}_j}) \,\big|\, A^c_{ij}(h) \big] = \mathbb{E}\big[ g(X_{\tau_j}) \,\big|\, A^c_{ij}(h) \big] + B(x_0)\, h^4 + o(h^4); \]
then the sum of the last two terms in equation (4.7) gives
\begin{align*}
& \mathbb{E}\Big[ \big( g(X_{i1}) - C(X_{i1}) \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \,\Big|\, A^c_{ij}(h) \Big]\, \mathbb{E}\big[ g(X_{\tau_j}) \,\big|\, A^c_{ij}(h) \big]\, \mathbb{P}(A^c_{ij}(h)) \\
&\quad + \mathbb{E}\Big[ \big( g(X_{i1}) - C(X_{i1}) \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \,\Big|\, A^c_{ij}(h) \Big]\, \mathbb{P}(A^c_{ij}(h))\, \big( B(x_0)\, h^4 + o(h^4) \big).
\end{align*}
Similarly, we look at the third term in equation (4.6):
\begin{align*}
& \mathbb{E}\Big[ \big( C(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] - g(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \Big] \\
&= \mathbb{E}\Big[ \big( C(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] - g(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \,\Big|\, A_{ij}(h) \Big]\, \mathbb{P}(A_{ij}(h)) \\
&\quad + \mathbb{E}\Big[ \big( C(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] - g(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \,\Big|\, A^c_{ij}(h) \Big]\, \mathbb{P}(A^c_{ij}(h)) \\
&= \mathbb{E}\Big[ \big( C(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] - g(X_{i1})\, \mathbb{E}[g(X_{\hat{\tau}_j})] \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \,\Big|\, A_{ij}(h) \Big]\, \mathbb{P}(A_{ij}(h)) \\
&\quad + \mathbb{E}\Big[ \big( C(X_{i1}) - g(X_{i1}) \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \,\Big|\, A^c_{ij}(h) \Big]\, \mathbb{E}\big[ g(X_{\tau_j}) \big]\, \mathbb{P}(A^c_{ij}(h)) \\
&\quad + \mathbb{E}\Big[ \big( C(X_{i1}) - g(X_{i1}) \big)\, \mathbf{1}_{\{\hat{C}_N(X_{i1}) \le g(X_{i1}) < C(X_{i1})\}} \,\Big|\, A^c_{ij}(h) \Big]\, \mathbb{P}(A^c_{ij}(h))\, \big( B(x_0)\, h^4 + o(h^4) \big).
\end{align*}
Chapter 5
Numerical experiments
Chapter 6
Conclusion
Bibliography
[1] G. Barone-Adesi and R.E. Whaley. Efficient analytic approximation of American option values. Journal of Finance, pages 301-320, 1987.
[2] J. Barraquand and D. Martineau. Numerical valuation of high dimensional multivariate American securities. Journal of Financial and Quantitative Analysis, 30(3), 1995.
[3] F. Black and M. Scholes. The pricing of options and corporate liabilities. The Journal of Political Economy, pages 637-654, 1973.
[4] D. Bosq. Nonparametric Statistics for Stochastic Processes: Estimation and Prediction, volume 110. Springer Verlag, 1998.
[5] M. Broadie and P. Glasserman. Pricing American-style securities using simulation. Journal of Economic Dynamics and Control, 21(8-9):1323-1352, 1997.
[6] J.F. Carriere. Valuation of the early-exercise price for options using simulations and nonparametric regression. Insurance: Mathematics and Economics, 19(1):19-30, 1996.
[7] N. Chen and P. Glasserman. Additive and multiplicative duals for American option pricing. Finance and Stochastics, 11(2):153-179, 2007.
[8] E. Derman and I. Kani. Stochastic implied trees: Arbitrage pricing with stochastic term and strike structure of volatility. International Journal of Theoretical and Applied Finance, 1(1):61-110, 1998.
[9] E. Derman, I. Kani, and N. Chriss. Implied trinomial trees of the volatility smile. The Journal of Derivatives, 3(4):7-22, 1996.
[10] E. Derman, I. Kani, and J.Z. Zou. The local volatility surface: Unlocking the information in index option prices. Financial Analysts Journal, pages 25-36, 1996.
[11] R. Durrett. Probability: Theory and Examples. Wadsworth Publishing Company, 1996.
[12] J.P. Fouque, G. Papanicolaou, and K.R. Sircar. Mean-reverting stochastic volatility. International Journal of Theoretical and Applied Finance, 3(1):101-142, 2000.
[13] P. Glasserman. Monte Carlo Methods in Financial Engineering, volume 53. Springer Verlag, 2004.
[14] M.B. Haugh and L. Kogan. Pricing American options: a duality approach. Operations Research, pages 258-270, 2004.
[15] S.L. Heston. A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of Financial Studies, 6(2):327-343, 1993.
[16] L.J. Hong, S. Juneja, and G. Liu. Nested estimation without nested simulation: The impact of dimensionality.
[17] M. Kohler, A. Krzyżak, and N. Todorovic. Pricing of high-dimensional American options by neural networks. Mathematical Finance, 20(3):383-410, 2010.
[18] F.A. Longstaff and E.S. Schwartz. Valuing American options by simulation: A simple least-squares approach. Review of Financial Studies, 14(1):113-147, 2001.
[19] A. Pagan and A. Ullah. Nonparametric Econometrics. Cambridge University Press, 1999.
[20] J.N. Tsitsiklis and B. Van Roy. Regression methods for pricing complex American-style options. IEEE Transactions on Neural Networks, 12(4):694-703, 2001.
