
Confidence Intervals for the Kelly Criterion
Euan C. Sinclair

Head of Global Risk – Bluefin Trading

The author thanks many members of www.nuclearphynance.com for comments on earlier
versions of this paper. Particular thanks are due to Benjamin Portheault and Kent Osband.

Electronic copy available at: http://ssrn.com/abstract=2457368


Abstract

Investing according to the Kelly criterion will theoretically outperform any other sizing
strategy. However, the value of the optimal fraction will generally need to be estimated
from empirical data. This means that our estimate will invariably have a degree of
uncertainty attached to it. In this note I show how to calculate the variance of the
estimated Kelly criterion ratio.

1. Introduction

Kelly (1956) considered the question of how a gambler with an edge should act in order
to maximize his bankroll growth. He derived a trade sizing scheme (the Kelly criterion)
which showed the optimal fraction of the bankroll to be allocated to each opportunity.

The Kelly criterion has a number of desirable properties.

• As we only ever invest a fraction of our wealth, we can never go bankrupt.
• The strategy is guaranteed to asymptotically outperform any essentially different
strategy.
• The time for the bankroll to reach any fixed amount is asymptotically smallest
with this strategy.

The proofs for these properties can be found in Breiman (1961).

However, the scheme also possesses some undesirable properties. There is a critical level
of the invested fraction (approximately equal to twice the optimal proportion) where the
growth rate of the portfolio becomes negative. Breiman (1961) proved that in this case
the bankroll asymptotically tends toward zero. Also, when investing at the optimal
fraction, the evolution of the bankroll is very volatile.

In the case of a game such as blackjack the Kelly ratio can be calculated exactly. But in
most cases this is not possible. When applying the Kelly criterion to sports gambling or
trading financial instruments, the optimal ratio needs to be estimated by analyzing
historical data. As we will only ever have a finite amount of data, this is a case where we
are attempting to estimate a population parameter from a sample. Inevitably this means
our estimate will have some degree of uncertainty.

This is well known by practitioners. To mitigate the risk of over-betting, those
following such a sizing scheme often modify the Kelly criterion by investing only a
fraction of the optimal amount. These schemes are known as fractional Kelly schemes.
By doing this, traders accept a lower growth rate in exchange for a more drastic
reduction in variance.

However, simply scaling the investment fraction doesn’t protect against a bigger
problem: the case where the investment fraction is estimated to be positive but the true
value is actually negative. In this case, investing any positive fraction of the bankroll will
be over-betting.

Our goal in this paper is to address this concern by deriving the distribution of the
investment fraction. This will allow us to derive confidence intervals around our point
estimate and in particular let us estimate a scaling factor that gives us only a given chance
of overbetting. This connects the fractional Kelly heuristic to statistical sampling.

This will also address the common argument against applying the Kelly criterion: that
because we can never precisely know the true Kelly ratio we should not trust the idea at
all.

The layout of the paper is as follows. In Section Two we derive the Kelly criterion for the
case of a game with two outcomes with a fixed probability of winning. We then show the
relationship of the variance of the Kelly ratio to the variance of the estimate of the win
probability. In Section Three we extend this analysis to the case where the returns of the
bets are drawn from a continuous distribution. Section Four mentions related work and
Section Five concludes.

2. The Kelly Criterion for Binomial Bets

Consider the case where the outcomes of the bets are given by independent random
variables X_n which pay W for a win (with probability p) and lose L for a loss (with
probability q=1-p). So the expected value of a one unit bet is

EV=pW-(1-p)L (1)

In what follows we assume that we are analyzing a favorable game so that EV>0. In a
two person game this will usually be the case so this is not a restriction. Sometimes it is
possible for a gambler to choose what side of a bet she wants to take, so she can in
principle pick the side with positive EV. This would generally be the case with sports
betting but would not be the case for most casino games. In the degenerate case where
EV=0, we will see that the optimal betting fraction is zero, so this game would be
pointless for either party to play.

The expected total return will be maximized by wagering the entire bankroll on each bet
in the sequence. However, if we follow this strategy, the first losing bet will bankrupt us.
As the number of bets approaches infinity, the probability of bankruptcy tends to one.
Ideally, we want a strategy that makes as much money as possible while also avoiding the
risk of bankruptcy.

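Let B_n be the value of the bankroll after n bets and B_0 the initial bankroll. Assume we bet
a fraction, f, of this bankroll at each time. Then after a win

$B_n = B_{n-1}(1 + fW)$ (2)

And after a loss

$B_n = B_{n-1}(1 - fL)$ (3)

Alternatively, after n plays, of which w are wins, we will have

$B_n = B_0 (1 + fW)^{w} (1 - fL)^{n-w}$ (4)

Since

$\frac{B_n}{B_0} = \exp\left[ n \ln\left( \frac{B_n}{B_0} \right)^{1/n} \right]$ (5)

The exponential growth per bet is given by

$G(f) = \lim_{n \to \infty} \ln\left( \frac{B_n}{B_0} \right)^{1/n} = p \ln(1 + fW) + q \ln(1 - fL)$ (6)

The criterion proposed by Kelly is that the gambler should maximize the exponential rate
of growth of capital, G.

Maximizing G with respect to f gives

$f_{max} = \frac{pW - qL}{WL}$ (7)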

(Note that from equations 1 and 7 we can write

$f_{max} = \frac{EV}{WL}$

So a positive (negative) EV implies that we would wager a positive (negative) fraction of
our bankroll. And also that EV=0 implies f_max=0.)

The growth rate's dependence on f is shown in Figure One. This shows that overbetting,
investing more than f_max, decreases the growth rate and if we bet greater than
approximately 2 f_max the growth rate becomes negative.

[Plot: G(f) on the vertical axis (-0.008 to 0.006) against f on the horizontal axis (0 to 0.24).]

Figure One: The growth rate (G(f)) of the bankroll for the case p=0.55, W=L=1.
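
The shape of the curve in Figure One can be checked numerically. The following Python sketch evaluates the growth rate G(f) = p ln(1 + fW) + q ln(1 - fL) and the maximizing fraction (pW - qL)/(WL) for the parameters of the figure. The function names are illustrative and are not part of the original paper.

```python
import math


def growth_rate(f, p, W, L):
    """Expected log growth per bet when a fraction f of the bankroll is wagered."""
    q = 1.0 - p
    return p * math.log(1.0 + f * W) + q * math.log(1.0 - f * L)


def kelly_fraction(p, W, L):
    """Fraction that maximizes growth_rate for a two-outcome bet."""
    q = 1.0 - p
    return (p * W - q * L) / (W * L)


if __name__ == "__main__":
    p, W, L = 0.55, 1.0, 1.0  # the case plotted in Figure One
    f_max = kelly_fraction(p, W, L)
    # Evaluate at half Kelly, full Kelly, and two overbetting levels.
    for f in (0.5 * f_max, f_max, 2.0 * f_max, 2.2 * f_max):
        print(f"f = {f:.3f}  G(f) = {growth_rate(f, p, W, L):+.5f}")
```

Running it shows the growth rate peaking at f_max and turning negative at roughly twice that fraction.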

In any scheme where we bet a fraction of our current bankroll it will be impossible to
ever go bankrupt. However, by overbetting (betting a fraction greater than that given by
the Kelly ratio) we will reduce our growth rate and possibly make it negative. This will
asymptotically reduce our bankroll below any given level, which in practical situations is
as good as being bankrupt.

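From equation 7, we can directly evaluate the variance.

$\mathrm{Var}(\hat{f}_{max}) = \mathrm{Var}\left( \frac{\hat{p}}{L} - \frac{1 - \hat{p}}{W} \right)$ (8)

$= \mathrm{Var}\left( \hat{p} \left( \frac{1}{L} + \frac{1}{W} \right) - \frac{1}{W} \right)$ (9)

$= \left( \frac{1}{L} + \frac{1}{W} \right)^2 \mathrm{Var}(\hat{p})$ (10)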
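
As a rough illustration of how sampling error in the win probability feeds through to the Kelly ratio, the sketch below uses the fact that f_max = (pW - qL)/(WL) is linear in p with slope (1/W + 1/L). It additionally assumes that the estimate of p is the observed win frequency over n independent bets, so that its variance is p(1-p)/n. That binomial assumption, the function names, and the sample size are illustrative and are not taken from the paper.

```python
import math


def kelly_fraction(p, W, L):
    """Kelly fraction for a two-outcome bet."""
    return (p * W - (1.0 - p) * L) / (W * L)


def kelly_std_error(p_hat, n, W, L):
    """Approximate standard error of the estimated Kelly fraction.

    Assumes p_hat is a binomial proportion observed over n independent bets.
    """
    var_p = p_hat * (1.0 - p_hat) / n   # variance of the estimated win probability
    slope = 1.0 / W + 1.0 / L           # derivative of the Kelly fraction w.r.t. p
    return slope * math.sqrt(var_p)


if __name__ == "__main__":
    p_hat, n, W, L = 0.55, 1000, 1.0, 1.0  # n is a hypothetical sample size
    print("estimated Kelly fraction:", kelly_fraction(p_hat, W, L))
    print("approximate standard error:", kelly_std_error(p_hat, n, W, L))
```

Because the standard error shrinks only with the square root of n, quadrupling the number of observed bets halves the uncertainty in the estimated fraction.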

3. The Kelly Criterion for a Continuous Return Distribution


The more relevant situation in finance is where the outcome of a trade is known to have a
certain continuous distribution. Again we bet a fraction, f, of our wealth at the start of
each period so that

Bn  Bn1  fBn1 g  X n  (11)


Where Bn is the random variable giving the result of the nth trade and it has the payoff
g(Xn). After a sequence of n trades our bankroll will be
n
Bn  B0  1  fg  X i  (12)
i 1

Now we take logarithms as before:

$\ln\left( \frac{B_n}{B_0} \right) = \sum_{i=1}^{n} \ln\left( 1 + f g(X_i) \right)$ (13)

So

$E\left[ \ln\left( \frac{B_n}{B_0} \right) \right] = n E\left[ \ln\left( 1 + f g(X_n) \right) \right]$ (14)

$= n \int \ln\left( 1 + f g(x) \right) \phi(x)\,dx$ (15)

Where (x) is the distribution function that describes the results of the trades. If we
maximize over the bankroll fraction, f, we find that the optimal value is the one that
satisfies

g ( x)( x)dx  g x  
 1  fg ( x)
 E   0
 1  fg ( x) 
(16)

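Applying a Taylor expansion to this equation gives

$\int g(x) \left[ 1 - f g(x) + f^2 g(x)^2 - \cdots \right] \phi(x)\,dx = 0$ (17)

$\int \left[ g(x) - f g(x)^2 + f^2 g(x)^3 - \cdots \right] \phi(x)\,dx = 0$ (18)

$\int g(x)\phi(x)\,dx - f \int g(x)^2 \phi(x)\,dx + f^2 \int g(x)^3 \phi(x)\,dx - \cdots = 0$ (19)

This can be further simplified if we note that g(x) = x is the payoff to a unit bet, so that

$\int g(x)\phi(x)\,dx = \int x\,\phi(x)\,dx = \mu$ (20)

Further

$\int x^2 \phi(x)\,dx = \mu^2 + \sigma^2$ (21)

$\int x^3 \phi(x)\,dx = \mu_3$ (22)

$\int x^4 \phi(x)\,dx = \mu_4$ (23)

where μ_3 and μ_4 are the third and fourth raw moments of φ.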

So if f is small we can truncate the series after the first term to get

$f_{max} = \frac{\mu}{\mu^2 + \sigma^2}$ (24)

And further if μ is small we can further approximate by

$f_{max} = \frac{\mu}{\sigma^2}$ (25)
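
To see how the truncated expressions compare with the exact optimum, the following sketch solves the sample analogue of equation 16, mean(x / (1 + f x)) = 0, by bisection and sets it beside the two moment approximations mu/(mu^2 + sigma^2) and mu/sigma^2. It takes the payoff to be g(x) = x. The function names and the trade results are hypothetical, and bisection is simply one convenient root finder, not a method prescribed by the paper.

```python
def exact_kelly(sample, tol=1e-10):
    """Solve mean(x / (1 + f*x)) = 0 for f by bisection.

    Assumes the sample mean is positive and at least one result is a loss,
    so the root lies between 0 and 1/|worst loss|.
    """
    n = len(sample)

    def h(f):
        return sum(x / (1.0 + f * x) for x in sample) / n

    lo, hi = 0.0, 0.999 / -min(sample)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:  # h is decreasing in f, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def moment_kelly(sample):
    """Return the approximations mu/(mu^2 + sigma^2) and mu/sigma^2."""
    n = len(sample)
    mu = sum(sample) / n
    second_raw = sum(x * x for x in sample) / n  # equals mu^2 + sigma^2
    sigma2 = second_raw - mu * mu                # population variance
    return mu / second_raw, mu / sigma2


if __name__ == "__main__":
    trades = [0.4, -0.3, 0.1, 0.25, -0.5, 0.3, -0.1, 0.2]  # hypothetical results
    print("exact root:        ", exact_kelly(trades))
    print("moment approximations:", moment_kelly(trades))
```

The gap between the three numbers gives a feel for when the condition "f is small" holds well enough for the truncation to be trusted.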

As in the binary game case, fmax is an estimator and has a probability distribution.

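First consider the case of normal results. Here the central limit theorem says that the
estimators of the mean, $\hat{\mu}$, and variance, $\hat{\sigma}^2$, asymptotically have the following normal
distributions, where μ and σ² are the population mean and variance respectively.

$\sqrt{n}(\hat{\mu} - \mu) \sim N(0, \sigma^2)$ (26)

$\sqrt{n}(\hat{\sigma}^2 - \sigma^2) \sim N(0, 2\sigma^4)$ (27)

Alternatively, the estimation errors of the mean, $\hat{\mu}$, and variance, $\hat{\sigma}^2$, can be approximated by

$\mathrm{Var}(\hat{\mu}) \approx \frac{\sigma^2}{n}$ (28)

$\mathrm{Var}(\hat{\sigma}^2) \approx \frac{2\sigma^4}{n}$ (29)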

Denote by f(μ, σ²) the Kelly ratio defined by equation 24. So the estimator is just $\hat{f} = f(\hat{\mu}, \hat{\sigma}^2)$.
The estimation errors in the mean and variance will lead to estimation errors in f.
If we define θ to be the column vector of the normal distribution's parameters, $\theta = (\mu, \sigma^2)'$, this has
an estimate of $\hat{\theta} = (\hat{\mu}, \hat{\sigma}^2)'$. For IID returns, $\sqrt{n}(\hat{\theta} - \theta) \sim N(0, V_\theta)$ where $V_\theta$ is the
variance of the estimation error of $\hat{\theta}$.

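Denoting the estimator of the Kelly ratio to be $\hat{f} = f(\hat{\theta})$, where f(θ) is the function that
gives the Kelly ratio, we next apply the delta method (see for example Oehlert,
1992).

This states that $\sqrt{n}(\hat{f} - f) \sim N(0, V_f)$, where the variance of the function is

$V_f = \frac{\partial f}{\partial \theta'} V_\theta \frac{\partial f}{\partial \theta}$ (30)

But

$V_\theta = \begin{bmatrix} \sigma^2 & 0 \\ 0 & 2\sigma^4 \end{bmatrix}$ (31)

and, taking the small-μ form f = μ/σ²,

$\frac{\partial f}{\partial \theta'} = \left( \frac{1}{\sigma^2}, \; -\frac{\mu}{\sigma^4} \right)$ (32)

So evaluating equation 30 gives the asymptotic variance of our estimate of the Kelly ratio
as

$V_f = \frac{1}{\sigma^2} + \frac{2\mu^2}{\sigma^4}$ (33)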
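
The delta-method calculation for normally distributed results can be sketched in a few lines of Python. The code assumes the Kelly ratio is taken as mu/sigma^2, that the parameter estimates have asymptotic covariance diag(sigma^2, 2 sigma^4), and that the variance of the estimate itself is the asymptotic variance divided by n. The function name and the input values are hypothetical.

```python
import math


def kelly_std_error_normal(mu, sigma2, n):
    """Delta-method standard error of mu/sigma2 for IID normal returns."""
    # Gradient of f = mu / sigma2 with respect to (mu, sigma2).
    grad = (1.0 / sigma2, -mu / sigma2 ** 2)
    # Asymptotic covariance of sqrt(n) * (estimate - truth) under normality.
    v_theta = ((sigma2, 0.0), (0.0, 2.0 * sigma2 ** 2))
    # Quadratic form grad' * v_theta * grad.
    v_f = sum(grad[i] * v_theta[i][j] * grad[j]
              for i in range(2) for j in range(2))
    return math.sqrt(v_f / n)


if __name__ == "__main__":
    mu, sigma2, n = 0.05, 1.0, 1000  # hypothetical inputs
    print("Kelly ratio:", mu / sigma2)
    print("standard error:", kelly_std_error_normal(mu, sigma2, n))
```

Writing the quadratic form out explicitly, rather than hard-coding the closed form, makes it easy to swap in a different covariance matrix when the normality assumption is dropped.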

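In the case of a general distribution of trade results we need to make use of the result
(Zhang, 2007) that

$\mathrm{Cov}(\hat{\mu}, \hat{\sigma}^2) = \frac{\mu_3}{n}$ (34)

where μ_3 here denotes the third central moment of the population distribution (rather
than the raw moment used earlier). Now equation 30 reads

$V_f = \left( \frac{1}{\sigma^2}, \; -\frac{\mu}{\sigma^4} \right) \begin{bmatrix} \sigma^2 & \mu_3 \\ \mu_3 & 2\sigma^4 \end{bmatrix} \left( \frac{1}{\sigma^2}, \; -\frac{\mu}{\sigma^4} \right)'$ (35)

$= \frac{1}{\sigma^2} + \frac{2\mu^2}{\sigma^4} - \frac{2\mu\mu_3}{\sigma^6}$ (36)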

We now use an example of real trade results to show the importance of including
estimation error in trade sizing. The trade results are from a proprietary option trading
strategy. It is somewhat typical of many such strategies in that it has positive expected
value but large negative skewness. The summary statistics for these trade results are given
in Table One and the distribution of results is shown in Figure Two.

Statistic             Value
Sample size           1000
Mean                  $0.059
Standard Deviation    $1.137
Skewness              -6.199

Table One: Summary statistics for the option trade.

[Histogram: Instances on the vertical axis (0 to 120) against Profit/Trade on the horizontal axis (-5 to 3.8).]

Figure Two: The distribution of the option trade results.

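We can rearrange (and slightly modify) equation 36 to give an explicit expression for the
estimated standard deviation of the Kelly ratio.

$\hat{\sigma}_f = \sqrt{ \frac{1}{n-1} \left( \frac{1}{\hat{\sigma}^2} + \frac{2\hat{\mu}^2}{\hat{\sigma}^4} - \frac{2\hat{\mu}\hat{\mu}_3}{\hat{\sigma}^6} \right) }$ (37)

where the denominator of n-1 is due to applying Bessel's correction.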
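
A minimal sketch of this calculation, applied to the summary statistics of Table One, is given below. It rests on several assumptions that are not spelled out in the text: the Kelly ratio is taken as mu/sigma^2, the covariance between the sample mean and sample variance is the third central moment divided by n, the variance of the sample variance is kept at its normal-case value, and the third central moment is recovered from the reported skewness as skewness times sigma^3. The function name is illustrative.

```python
import math


def kelly_std_error(mu, sigma, skew, n):
    """Estimated standard deviation of the Kelly ratio mu/sigma^2.

    Includes the skewness correction and divides by n - 1 (Bessel's correction).
    """
    sigma2 = sigma ** 2
    mu3 = skew * sigma ** 3  # third central moment implied by the skewness
    v_f = (1.0 / sigma2
           + 2.0 * mu ** 2 / sigma2 ** 2
           - 2.0 * mu * mu3 / sigma2 ** 3)
    return math.sqrt(v_f / (n - 1))


if __name__ == "__main__":
    # Summary statistics from Table One.
    n, mu, sigma, skew = 1000, 0.059, 1.137, -6.199
    print("with skewness:   ", kelly_std_error(mu, sigma, skew, n))
    print("skewness set to 0:", kelly_std_error(mu, sigma, 0.0, n))
```

With a positive mean, negative skewness makes the cross term positive, so it widens the standard deviation relative to the symmetric case.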

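Because of the Central Limit Theorem we know that the sampling distribution of the
estimated Kelly ratio is asymptotically normal, so we can calculate the probability that
the true ratio f is actually below any critical value f*.

$P(f < f^*) = Z\left( \frac{f^* - \hat{f}}{\sqrt{\mathrm{Var}(\hat{f})}} \right)$ (38)

where Z is the cumulative distribution function of the standard normal distribution.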

Equation 24 gives the Kelly ratio as 0.045, but equation 37 tells us that the standard
deviation of this point estimate is 0.036, so our point estimate is only 1.25 standard
deviations above zero. So there is an 8.9% chance that the true Kelly ratio of the
population is less than zero.

Having an expression for the sampling distribution also allows us to estimate the chance
that we are overbetting so much that our growth rate is negative. This case corresponds to
the true value of f being less than half the estimated value. Equation 38 tells us this is
25%.
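
The probability calculation can be sketched with the standard library's normal distribution. The code assumes the normal approximation described above, with the point estimate as the mean and its estimated standard deviation as the scale. The function name and the numerical inputs are hypothetical and are not the paper's data.

```python
from statistics import NormalDist


def prob_true_ratio_below(f_star, f_hat, std_error):
    """Probability that the true Kelly ratio is below f_star.

    Uses a normal approximation centred on the point estimate f_hat.
    """
    z = (f_star - f_hat) / std_error
    return NormalDist().cdf(z)


if __name__ == "__main__":
    f_hat, std_error = 0.10, 0.05  # hypothetical estimate and standard deviation
    # Chance that the true ratio is negative, so that any positive bet overbets.
    print("P(f < 0)       =", prob_true_ratio_below(0.0, f_hat, std_error))
    # Chance that betting the full estimate gives a negative growth rate.
    print("P(f < f_hat/2) =", prob_true_ratio_below(0.5 * f_hat, f_hat, std_error))
```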

This leads us to a complementary way to use the information. We can use equation 38 to
solve for a benchmark given that we want a certain chance of overbetting. For example,
we have just seen that using a benchmark of half the measured Kelly fraction (i.e. betting
at "half-Kelly") still implies a 25% chance that we will be overbetting. Table Two shows
the probabilities of overbetting for various fractional Kelly schemes.

Chance of Overbetting    Corresponding Benchmark    Kelly Scale Factor
0.1                      0.0022                     0.0480
0.15                     0.0104                     0.2301
0.2                      0.0169                     0.3748

Table Two: Fractional schemes corresponding to various probabilities of overbetting.
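
Inverting the normal approximation gives the benchmark fraction for a chosen chance of overbetting: the benchmark is the point estimate plus the standard normal quantile of that chance times the standard deviation of the estimate. The sketch below shows the inversion. The function names and inputs are hypothetical, and the numbers it prints are not meant to reproduce the table.

```python
from statistics import NormalDist


def benchmark_fraction(chance, f_hat, std_error):
    """Bet fraction that exceeds the true Kelly ratio with the given probability."""
    return f_hat + NormalDist().inv_cdf(chance) * std_error


def kelly_scale_factor(chance, f_hat, std_error):
    """Benchmark expressed as a multiple of the point estimate."""
    return benchmark_fraction(chance, f_hat, std_error) / f_hat


if __name__ == "__main__":
    f_hat, std_error = 0.10, 0.05  # hypothetical estimate and standard deviation
    for chance in (0.10, 0.15, 0.20):
        b = benchmark_fraction(chance, f_hat, std_error)
        s = kelly_scale_factor(chance, f_hat, std_error)
        print(f"chance = {chance:.2f}  benchmark = {b:.4f}  scale factor = {s:.4f}")
```

A benchmark that comes out negative indicates that no positive bet size meets the requested chance of overbetting for that estimate.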

So in order to introduce a margin of safety we would need to scale the measured Kelly
ratio by a considerable amount. This is in line with the practice of professional gamblers.
Much of this need for scaling is due to the presence of negative skewness. If the returns
were normally distributed the scaling could be reduced. This is shown in Table Three.

Chance of Overbetting    Corresponding Benchmark    Kelly Scale Factor
0.1                      0.0092                     0.2054
0.15                     0.0161                     0.3574
0.2                      0.0215                     0.4782

Table Three: Fractional schemes corresponding to various probabilities of overbetting
when setting skewness of the trading results to zero.

4. Related Work

Somewhat related work has examined the simple binomial Kelly criterion from a
Bayesian perspective (Medo et al, 2008) and derived the uncertainty in the win
probability assuming it is binomially distributed. Also similar in aim is research showing
that, under parameter uncertainty, utility is maximized when the optimal trading
fraction is reduced from the case where there is no uncertainty (Baker and McHale,
2013).

Other relevant work is that which has been done on the sampling properties of the
Sharpe ratio. While the reasons for the work are different, the mathematics are very
similar. Interested readers should consult Bailey and Lopez de Prado (2012) and the
references therein.

5. Conclusion

We have derived the estimation error in the Kelly ratio that stems from errors in
estimation of parameters of the underlying trade. This was done for both the case of
binary games (bets) and the more common situation where the trade results are drawn
from a continuous distribution. This allows us to assign confidence intervals around our
point estimate of the optimal trading ratio.

This work could be extended in several ways. One would be to apply the same method to
multivariate situations where the Kelly criterion defines a vector of portfolio weights.
Another would be to look at the cases where the return distribution cannot be described
by an expansion in the moments (equation 18). While some distributions could be
adequately handled by adding higher terms to equation 18 many return distributions in
trading are so “fat tailed” that the higher moments are undefined and this approach will
break down.

6. References
Bailey, D.H., and M.M. Lopez de Prado, 2012, The Sharpe Ratio Efficient Frontier.
Journal of Risk, 15, 3-44.

Baker, R.D., and I.G. McHale, 2013, Optimal Betting under Parameter Uncertainty:
Improving the Kelly Criterion. Decision Analysis, 10, 189-199.

Breiman, L., 1961, Optimal Gambling Systems for Favorable Games. Fourth Berkeley
Symposium on Probability and Statistics, 65-78.

Kelly, J.L., 1956, A New Interpretation of Information Rate. Bell System Technical
Journal, 35, 917-926.

Medo, M., Y.M. Pis'mak, and Y. Zhang, 2008, Diversification and Limited Information in
the Kelly Game. Physica A, 387, 6151-6158.

Oehlert, G., 1992, A Note on the Delta Method. American Statistician, 46, 27-29.

Zhang, L., 2007, Sample Mean and Sample Variance: Their Covariance and Their
(In)Dependence. American Statistician, 61, 159-160.
