Probabilistic Programming in Quantitative Finance

bayesian_risk_perf_v3 slides
http://twiecki.github.io/bayesian_risk_perf_v3.slides.html?print-pdf#/
Probabilistic Programming in Quantitative Finance

Thomas Wiecki
@twiecki
1 of 86
03/17/2015 08:47 PM
About me
Lead Data Scientist at Quantopian Inc (https://www.quantopian.com): building a
crowd-sourced hedge fund.
PhD from Brown University -- research on computational neuroscience and machine
learning using Bayesian modeling.
The problem we're gonna solve

Two real-money strategies:
In [76]:
plot_strats()
Types of risk
Systematic and Unsystematic Risk
[Figure labels: drawdown, beta, tail risk, volatility]
Sharpe Ratio
$\text{Sharpe} = \frac{\text{mean returns}}{\text{volatility}}$
In [24]:
# Annualize the daily ratio with sqrt(252) trading days
print("Sharpe ratio strategy etrade =", data_0.mean() / data_0.std() * np.sqrt(252))
print("Sharpe ratio strategy IB =", data_1.mean() / data_1.std() * np.sqrt(252))
Sharpe ratio strategy etrade = 0.627893606355
Sharpe ratio strategy IB = 1.43720181575
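The annualized Sharpe ratio computed in the cell above can be sketched without pandas. This is a minimal sketch, assuming daily returns and 252 trading days per year (the factor used above). The function name `annualized_sharpe` and the sample returns are illustrative and not from the slides.

```python
import math
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over sample standard deviation of returns, scaled by sqrt(periods).

    Hypothetical helper, not from the slides. statistics.stdev uses the
    sample (n - 1) estimator, which is also the pandas default for .std().
    """
    mean = statistics.mean(daily_returns)
    vol = statistics.stdev(daily_returns)
    return mean / vol * math.sqrt(periods_per_year)


# Made-up daily returns, for illustration only.
example_returns = [0.01, -0.02, 0.015, 0.003, -0.004]
example_sharpe = annualized_sharpe(example_returns)
print("Annualized Sharpe ratio =", example_sharpe)
```

The result is a single number, so on its own it says nothing about how uncertain the estimate is for a short return history.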
Types of risk
Model Risk
Estimation Uncertainty
Systematic and Unsystematic Risk
[Figure labels: model misspecification, data issues, programming errors, drawdown, beta, tail risk, volatility]
Short primer on random variables

Represents our beliefs about an unknown state.
Probability distribution assigns a probability to each possible state.
Not a single number (e.g. most likely state).
"When I bet on horses, I never lose. Why? I bet on all the horses." -- Tom Haverford
You already know what a variable is...

In [8]:
coin = 0 # 0 for tails

coin = 1 # 1 for heads
A random variable assigns all possible values a certain probability
In [ ]:
coin = {0: 0.5,  # tails with probability 50%
        1: 0.5}  # heads with probability 50%
Alternatively:
coin ~ Bernoulli(p=0.5)
coin is a random variable
Bernoulli is a probability distribution
~ reads as "is distributed as"
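To make the notation `coin ~ Bernoulli(p=0.5)` concrete, the following minimal sketch draws samples from such a random variable using only the standard library. The helper name `sample_bernoulli` and the fixed seed are assumptions made for the example, not part of the slides.

```python
import random


def sample_bernoulli(p, rng):
    """Return 1 (heads) with probability p, otherwise 0 (tails)."""
    return 1 if rng.random() < p else 0


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_bernoulli(0.5, rng) for _ in range(10000)]
heads_fraction = sum(draws) / len(draws)
print("Fraction of heads:", heads_fraction)
```

Each draw is a single value (0 or 1), while the random variable itself is described by the whole distribution over those values.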
This was discrete (binary), what about the continuous case?

$\text{returns} \sim \text{Normal}(\mu, \sigma^2)$
In [77]:
from scipy import stats

sns.distplot(data_0, kde=False, fit=stats.norm)
plt.xlabel('returns')
How to estimate $\mu$ and $\sigma$?

Naive: point estimate
Set mu = mean(data) and sigma = std(data)
Maximum Likelihood Estimate
Correct answer as $n \to \infty$
Bayesian analysis
Most of the time $n \neq \infty$...
Uncertainty about $\mu$ and $\sigma$
Turn $\mu$ and $\sigma$ into random variables
How to estimate?
Bayes Formula!
[Diagram: Prior + Data -> Bayes -> Posterior]
Use prior knowledge and data to update our beliefs.
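As a worked illustration of updating beliefs with data, the sketch below uses the standard conjugate result for a Normal prior on the mean when the volatility is treated as known. That simplification, the function name `update_normal_mean`, and the sample returns are assumptions made for the example. The slides do not use this shortcut.

```python
import math


def update_normal_mean(prior_mean, prior_sd, data, known_sd):
    """Posterior of mu for data ~ Normal(mu, known_sd^2), prior mu ~ Normal.

    Standard conjugate result: precisions (1 / variance) add up, and the
    posterior mean is the precision-weighted average of prior and data.
    """
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(data) / known_sd ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + sum(data) / known_sd ** 2) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)


# Made-up daily returns; the volatility is treated as known for simplicity.
data = [0.002, -0.028, -0.002, -0.001, 0.011]
post_mean, post_sd = update_normal_mean(0.0, 0.1, data, known_sd=0.02)
print("posterior mean = %.5f, posterior sd = %.5f" % (post_mean, post_sd))
```

The posterior standard deviation shrinks as more observations arrive, which is the narrowing of beliefs the formula describes.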
In [78]:
interactive(gen_plot, n=(0, 600), bayes=True)
Out[78]:
Probabilistic Programming
Model unknown causes (e.g. ) of a phenomenon as random variables.
Write a programmatic story of how unknown causes result in observable data.
Use Bayes formula to invert generative model to infer unknown causes.
Approximating the posterior with MCMC sampling

In [81]:
plot_want_get()
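To show what approximating a posterior by sampling can look like, here is a minimal random-walk Metropolis sketch for the mean of Normal data. It is a deliberately simple MCMC algorithm, not the sampler PyMC3 uses. The known volatility, the step size, the simulated data, and all function names are assumptions made for the example.

```python
import math
import random


def log_posterior(mu, data, sigma=0.01, prior_sd=0.1):
    """Unnormalized log posterior: Normal(0, prior_sd^2) prior on mu plus
    a Normal(mu, sigma^2) likelihood with sigma assumed known."""
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    return log_prior + log_lik


def metropolis(data, n_samples=5000, step_sd=0.002, seed=0):
    """Random-walk Metropolis over mu; returns the list of sampled values."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio);
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


# Simulated data with a true mean of 0.002 (made up for the example).
data_rng = random.Random(1)
data = [data_rng.gauss(0.002, 0.01) for _ in range(100)]
samples = metropolis(data)[1000:]  # drop the first 1000 draws as burn-in
posterior_mean = sum(samples) / len(samples)
print("Posterior mean of mu ~", posterior_mean)
```

The list of samples stands in for the posterior: histograms, means, and probabilities are all computed from it rather than from a closed-form density.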
PyMC3
Probabilistic Programming framework written in Python.
Allows for construction of probabilistic models using intuitive syntax.
Features advanced MCMC samplers.
Fast: Just-in-time compiled by Theano.
Extensible: easily incorporates custom MCMC algorithms and unusual probability
distributions.
Authors: John Salvatier, Chris Fonnesbeck, Thomas Wiecki
Upcoming beta release!
Model returns distribution: Specifying our priors
In [82]:
x = np.linspace(-.3, .3, 500)
plt.plot(x, T.exp(pm.Normal.dist(mu=0, sd=.1).logp(x)).eval())
plt.title(u'Prior: mu ~ Normal(0, $.1^2$)'); plt.xlabel('mu'); plt.ylabel('Probability Density'); plt.xlim((-.3, .3));
In [83]:
x = np.linspace(-.1, .5, 500)
plt.plot(x, T.exp(pm.HalfNormal.dist(sd=.1).logp(x)).eval())
plt.title(u'Prior: sigma ~ HalfNormal($.1^2$)'); plt.xlabel('sigma'); plt.ylabel('Probability Density');
Bayesian Sharpe ratio

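$\mu \sim \text{Normal}(0, .1^2)$ (Prior)
$\sigma \sim \text{HalfNormal}(.1^2)$ (Prior)
$\text{returns} \sim \text{Normal}(\mu, \sigma^2)$ (Observed!)
$\text{Sharpe} = \frac{\mu}{\sigma}$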
Graphical model of returns

[Diagram: Priors + Data -> Bayes -> Posteriors]
This is what the data looks like

In [9]:
print(data_0.head())
2013-12-31 21:00:00    0.002143
2014-01-02 21:00:00   -0.028532
2014-01-03 21:00:00   -0.001577
2014-01-06 21:00:00   -0.000531
2014-01-07 21:00:00    0.011310
Name: 0, dtype: float64
In [14]:
import pymc3 as pm  # PyMC3 releases are distributed as the `pymc3` package
with pm.Model() as model:
    # Priors on Random Variables
    mean_return = pm.Normal('mean return', mu=0, sd=.1)
    volatility = pm.HalfNormal('volatility', sd=.1)
    # Model returns as Normal
    obs = pm.Normal('returns',
                    mu=mean_return,
                    sd=volatility,
                    observed=data_0)
    # Annualized Sharpe ratio, tracked as a deterministic function of the parameters
    sharpe = pm.Deterministic('sharpe ratio',
                              mean_return / volatility * np.sqrt(252))
In [15]:
with model:
    # Instantiate MCMC sampler
    step = pm.NUTS()
    # Draw 500 samples from the posterior
    trace = pm.sample(500, step)
[-----------------100%-----------------] 500 of 500 complete in 0.4 sec
Analyzing the posterior
In [84]:
sns.distplot(results_normal[0][0]['mean returns'], hist=False, label='etrade')

sns.distplot(results_normal[1][0]['mean returns'], hist=False, label='IB')
plt.title('Posterior of the mean'); plt.xlabel('mean returns')
Out[84]:
<matplotlib.text.Text at 0x7fde80cb5850>
In [85]:
sns.distplot(results_normal[0][0]['volatility'], hist=False, label='etrade')

sns.distplot(results_normal[1][0]['volatility'], hist=False, label='IB')
plt.title('Posterior of the volatility')
plt.xlabel('volatility')
Out[85]:
<matplotlib.text.Text at 0x7fde80e58310>
In [86]:
sns.distplot(results_normal[0][0]['sharpe'], hist=False, label='etrade')

sns.distplot(results_normal[1][0]['sharpe'], hist=False, label='IB')
plt.title('Bayesian Sharpe ratio'); plt.xlabel('Sharpe ratio');
In [28]:
# Fraction of posterior samples with a positive Sharpe ratio
print('P(Sharpe ratio IB > 0) = %.2f%%' %
      (np.mean(results_normal[1][0]['sharpe'] > 0) * 100))
P(Sharpe ratio IB > 0) = 96.48%
In [29]:
# Fraction of paired posterior samples in which IB beats etrade
print('P(Sharpe ratio IB > Sharpe ratio etrade) = %.2f%%' %
      (np.mean(results_normal[1][0]['sharpe'] > results_normal[0][0]['sharpe']) * 100))
P(Sharpe ratio IB > Sharpe ratio etrade) = 80.06%
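The two probabilities above are simply fractions of posterior samples that satisfy a condition. The sketch below shows the same calculation with plain Python lists. The two sample sets are made-up stand-ins drawn from Normal distributions, not the posteriors from the slides.

```python
import random

rng = random.Random(0)
# Stand-ins for two sets of posterior Sharpe ratio samples (made-up numbers).
sharpe_a = [rng.gauss(0.6, 0.9) for _ in range(5000)]
sharpe_b = [rng.gauss(1.4, 0.9) for _ in range(5000)]

# The fraction of draws satisfying a condition estimates its posterior probability.
p_b_positive = sum(s > 0 for s in sharpe_b) / len(sharpe_b)
p_b_beats_a = sum(b > a for a, b in zip(sharpe_a, sharpe_b)) / len(sharpe_b)
print("P(B > 0) = %.2f%%" % (p_b_positive * 100))
print("P(B > A) = %.2f%%" % (p_b_beats_a * 100))
```

Because any question about the parameters reduces to counting samples, the same pattern works for other comparisons, such as the probability that one strategy's volatility exceeds another's.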
Value at Risk with uncertainty
In [88]:
ppc_etrade = post_pred(var_cov_var_normal, results_normal[0][0], 1e6, .05, samples=800)
ppc_ib = post_pred(var_cov_var_normal, results_normal[1][0], 1e6, .05, samples=800)
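The helpers `post_pred` and `var_cov_var_normal` are not shown in the slides. As a rough sketch of the idea, a parametric Value at Risk can be computed once per posterior sample, which turns a single VaR number into a distribution. The function `parametric_var_normal`, its sign convention, and the stand-in posterior samples below are assumptions and may differ from the author's implementation.

```python
import random
from statistics import NormalDist


def parametric_var_normal(capital, level, mu, sigma):
    """Loss not exceeded with probability 1 - level, for Normal(mu, sigma^2) returns.

    Hypothetical helper; the `var_cov_var_normal` function used in the slides
    is not shown, so this may differ from it in sign convention or details.
    """
    worst_return = NormalDist(mu, sigma).inv_cdf(level)  # e.g. the 5% quantile
    return -capital * worst_return


rng = random.Random(0)
# Stand-ins for posterior samples of (mean return, volatility); made-up numbers.
posterior = [(rng.gauss(0.001, 0.0005), abs(rng.gauss(0.01, 0.001)))
             for _ in range(800)]
# One VaR value per posterior sample gives a distribution over VaR.
var_samples = [parametric_var_normal(1e6, 0.05, mu, sigma) for mu, sigma in posterior]
mean_var = sum(var_samples) / len(var_samples)
print("Mean 5%% daily VaR: $%.0f" % mean_var)
```

The spread of `var_samples` reflects how uncertain the VaR estimate is given the uncertainty in the mean and volatility.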
Interim summary
Bayesian stats allows us to reformulate common risk metrics, use priors and
quantify uncertainty.
IB strategy seems better in almost every regard. Is it though?
So far, only added confidence

In [89]:
sns.distplot(results_normal[0][0]['sharpe'], hist=False, label='etrade')

sns.distplot(results_normal[1][0]['sharpe'], hist=False, label='IB')
plt.title('Bayesian Sharpe ratio'); plt.xlabel('Sharpe ratio');
plt.axvline(data_0.mean() / data_0.std() * np.sqrt(252), color='b');
plt.axvline(data_1.mean() / data_1.std() * np.sqrt(252), color='g');
Is this a good model?

In [93]:
sns.distplot(data_1, label='data IB', kde=False, norm_hist=True, color='.5')
for p in ppc_dist_normal:
    plt.plot(x, p, c='r', alpha=.1)
# Re-plot the last curve once so the legend gets a single entry
plt.plot(x, p, c='r', alpha=.5, label='Normal model')
plt.xlabel('Daily returns')
plt.legend();
Can it be improved? Yes!

Identical model as before, but instead, use a heavy-tailed T distribution:
$\text{returns} \sim \text{T}(\nu, \mu, \sigma^2)$
In [94]:
sns.distplot(data_1, label='data IB', kde=False, norm_hist=True, color='.5')
for p in ppc_dist_t:
    plt.plot(x, p, c='y', alpha=.1)
Let's compare posteriors of the normal and T model
Mean returns
In [96]:
sns.distplot(results_normal[1][0]['mean returns'], hist=False, color='r', label='normal model')
sns.distplot(results_t[1][0]['mean returns'], hist=False, color='y', label='T model')
plt.xlabel('Posterior of the mean returns'); plt.ylabel('Probability Density');
Bayesian T-Sharpe ratio

In [97]:
sns.distplot(results_normal[1][0]['sharpe'], hist=False, color='r', label='normal model')
sns.distplot(results_t[1][0]['sharpe'], hist=False, color='y', label='T model')
plt.xlabel('Bayesian Sharpe ratio'); plt.ylabel('Probability Density');
But why? T distribution is more robust!

In [98]:
# 75 well-behaved simulated returns plus a single -20% outlier
sim_data = list(np.random.randn(75)*.01)
sim_data.append(-.2)
sns.distplot(sim_data, label='data', kde=False, norm_hist=True, color='.5')
sns.distplot(sim_data, label='Normal', fit=stats.norm, kde=False, hist=False, fit_kws={'color': 'r', 'label': 'Normal'})
sns.distplot(sim_data, fit=stats.t, kde=False, hist=False, fit_kws={'color': 'y', 'label': 'T'})
plt.xlabel('Daily returns'); plt.legend();
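One way to see why a Normal fit is fragile: its scale parameter is the standard deviation, which a single extreme observation inflates. The sketch below repeats the simulation with the standard library and compares the standard deviation with the median, used here only as a simple example of a robust summary. It does not fit a T distribution, and the seed is an arbitrary choice.

```python
import random
import statistics

rng = random.Random(0)
# 75 well-behaved simulated daily returns, then one crash day (as in the cell above).
sim_data = [rng.gauss(0.0, 0.01) for _ in range(75)]
sd_before = statistics.stdev(sim_data)
median_before = statistics.median(sim_data)

sim_data.append(-0.2)
sd_after = statistics.stdev(sim_data)
median_after = statistics.median(sim_data)

print("std dev without / with outlier:", sd_before, sd_after)
print("median  without / with outlier:", median_before, median_after)
```

A heavy-tailed likelihood plays a similar role inside the model: it can attribute the crash day to the tails instead of widening the whole distribution.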
Estimating tail risk using VaR

In [99]:
ppc_normal = post_pred(var_cov_var_normal, trace_normal, 1e6, .05, samples=800)
ppc_t = post_pred(var_cov_var_t, trace_t, 1e6, .05, samples=800)
sns.distplot(ppc_normal, label='Normal', norm_hist=True, hist=False, color='r')
sns.distplot(ppc_t, label='T', norm_hist=True, hist=False, color='y')
plt.legend(loc=0); plt.xlabel('5% daily Value at Risk (VaR) with \$1MM capital (in \$)'); plt.ylabel('Probability density'); plt.xticks(rotation=15);
Comparing the Bayesian T-Sharpe ratios

In [101]:
sns.distplot(results_t[0][0]['sharpe'], hist=False, label='etrade')
sns.distplot(results_t[1][0]['sharpe'], hist=False, label='IB')
plt.xlabel('Bayesian Sharpe ratio'); plt.ylabel('Probability Density');
In [42]:
print('P(Sharpe ratio IB > Sharpe ratio etrade) = %.2f%%' %
      (np.mean(results_t[1][0]['sharpe'] > results_t[0][0]['sharpe']) * 100))
Conclusions
Bayesian statistics allows us to quantify uncertainty -- measure orthogonal sources
of risk.
Rich statistical framework to compare different models against each other.
Blackbox inference algorithms allow estimation of complex models.
PyMC3 puts advanced samplers at your fingertips.
Further reading
Quantopian (https://www.quantopian.com) -- Develop trading algorithms like this in
your browser.
My blog for Bayesian linear regression (financial alpha and beta)
(https://twiecki.github.io)
Probabilistic Programming for Hackers (http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/) -- IPython
Notebook book on Bayesian stats using PyMC2.
Doing Bayesian Data Analysis (http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/) -- Great book by Kruschke.
PyMC3 repository (https://github.com/pymc-devs/pymc3)
Twitter: @twiecki (https://twitter.com/twiecki)
