ORFE 522 - Linear Optimization

Problem Set 4
Theo Gutman-Solo

Dynamics of Binomial Stock Options

To model the dynamics of a simplified call option, assume a strike price K and an option duration of T = 100 days. The evolution of the price can then be modeled as a multiplicative random walk:

\[
S_{t+1} =
\begin{cases}
u S_t & \text{with probability } p \\
d S_t & \text{with probability } 1 - p
\end{cases}
\]

where d ∈ (0, 1) and u ∈ (1, ∞), with d = 1/u. The Bellman equation for this simple model is

\[
J_t(S_t) = \max\{\, S_t - K,\; p\, J_{t+1}(u S_t) + (1 - p)\, J_{t+1}(d S_t) \,\}
\]
\[
J_T(S_T) = \max\{\, S_T - K,\; 0 \,\}
\]
To solve this equation I used policy iteration and value iteration. Due to computational limitations I used 30 iterations. For simplicity I used a strike price of K = 0.98. The underlying stock has an initial price of 1; it increases by a factor of u = 1.02 with probability 0.5 and decreases by a factor of d = 1/1.02 with the same probability.
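The MATLAB code is not reproduced here, but the computation can be sketched as follows. This is a minimal Python version, assuming a recombining lattice representation of the price (for this finite-horizon problem, value iteration reduces to a single backward pass over the lattice; the indexing is illustrative, not the author's):

```python
import numpy as np

# Parameters from the text: strike K, horizon T, up/down factors u and d = 1/u.
K, T = 0.98, 100
u, d, p = 1.02, 1 / 1.02, 0.5
S0 = 1.0

def price(t, j):
    # Node (t, j) of the recombining lattice: j up-moves out of t steps.
    return S0 * u**j * d**(t - j)

# Terminal condition: J_T(S_T) = max(S_T - K, 0).
J = [np.zeros(t + 1) for t in range(T + 1)]
J[T] = np.maximum([price(T, j) - K for j in range(T + 1)], 0.0)

# Backward induction on J_t(S) = max(S - K, p J_{t+1}(uS) + (1-p) J_{t+1}(dS)).
hold = [np.zeros(t + 1, dtype=bool) for t in range(T)]
for t in range(T - 1, -1, -1):
    for j in range(t + 1):
        cont = p * J[t + 1][j + 1] + (1 - p) * J[t + 1][j]
        exercise = price(t, j) - K
        J[t][j] = max(exercise, cont)
        hold[t][j] = cont >= exercise   # optimal action at node (t, j)

print(f"Option value at t = 0: {J[0][0]:.4f}")
```

The boolean hold array is the policy visualized in the contour plots below.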

The graph is fairly self-explanatory. Although the scale makes it difficult to tell, the value increases gradually with time.

Above we have a contour plot showing the optimal policy. Blue means hold and yellow means exercise the option (the third color should be ignored; it is an artifact of MATLAB's contourf function). At first it seems odd that we never exercise the option early, but this can be explained with a few comments. First, in this simplified model we have ignored interest rates. This tacitly assumes that the cost of funding a position is zero, which removes one incentive to exercise early. Next, note that the dynamics of the stock evolution are essentially those of a random walk. This means that as time goes to infinity we expect the length of runs to become arbitrarily large.
This is a consequence of the zero-one law: the limsup of the walk is almost surely infinite. As a result, as time increases the option becomes more valuable. Next, note that in this model the price process is a submartingale. A simple calculation shows that for u > 0 we have the inequality
\[
u + \frac{1}{u} \;\geq\; 2 .
\]
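Concretely, with p = 1/2 and d = 1/u this gives

\[
\mathbb{E}[S_{t+1} \mid S_t] = \tfrac{1}{2}\, u S_t + \tfrac{1}{2}\, \tfrac{1}{u}\, S_t = \tfrac{1}{2}\Big(u + \tfrac{1}{u}\Big) S_t \;\geq\; S_t ,
\]

which is exactly the submartingale property.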

As a result, if we are in the money we will always hold. Similarly, if we are out of the money we will hold, since there is no value in exercising. In fact, there are only a few corner cases in this model where you would exercise early. Finally, consider the following contour plot.

Here yellow indicates that the option is worthless. With the simplified dynamics used in this model, it is impossible for the option to end in the money from these states, so exercising and holding make no difference in this region.

Q-learning

I used the same model and parameters to generate the samples for Q-learning, so while the code is different the results are essentially the same. For Q-learning, I assumed the underlying randomness was binomial with p = 0.5.
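Again as a sketch (the actual implementation was in MATLAB), a minimal tabular Q-learning version consistent with this setup might look as follows; the learning rate, exploration rate, and episode count are illustrative assumptions, not the values used for the reported results:

```python
import numpy as np

rng = np.random.default_rng(0)
K, T = 0.98, 100
u, d, p = 1.02, 1 / 1.02, 0.5
S0 = 1.0
alpha, eps, episodes = 0.1, 0.1, 50_000   # assumed hyperparameters

def price(t, j):
    return S0 * u**j * d**(t - j)

# Q[t][j, a]: lattice state (t, j), action a (0 = hold, 1 = exercise).
Q = [np.zeros((t + 1, 2)) for t in range(T + 1)]

for _ in range(episodes):
    t, j = 0, 0
    while t < T:
        # Epsilon-greedy choice between holding and exercising.
        a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[t][j]))
        if a == 1:
            # Exercising ends the episode with payoff S_t - K.
            Q[t][j, 1] += alpha * ((price(t, j) - K) - Q[t][j, 1])
            break
        # Holding: sample the binomial up/down move; no intermediate reward.
        j_next = j + 1 if rng.random() < p else j
        if t + 1 == T:
            target = max(price(T, j_next) - K, 0.0)   # terminal payoff
        else:
            target = float(np.max(Q[t + 1][j_next]))
        Q[t][j, 0] += alpha * (target - Q[t][j, 0])
        t, j = t + 1, j_next

print(f"Q-learning estimate of the t = 0 option value: {max(Q[0][0]):.4f}")
```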

Finally, running Q-learning gave the following option prices.

The surface is very similar to that produced by the deterministic algorithms above. Note that the surface is not smooth and contains many small variations, reflecting the randomness in the generating process.
