Modeling securities distribution for a better measure of VaR and Expected Shortfall


Saket Anand - 61010063









Abstract

Value at Risk (VaR) and Expected Shortfall (ES) are widely used risk measures for
portfolios. When computing VaR and ES, it is often assumed that historical returns are
normally distributed. This study shows that the normality assumption for historical
returns does not hold even for highly diversified indices such as the BSE-Sensex and the
S&P 500. The paper presents a two-step method for computing more precise measures of VaR
and Expected Shortfall. First, the entire historical distribution is modeled to obtain a
precise measure of VaR. Then the tail is modeled separately to obtain an accurate
measure of Expected Shortfall.






















I. Introduction

It is well known that security returns are not normally distributed.
However, one might expect the returns of a diversified index to be normally distributed
because of the central limit theorem. A quick look at the daily returns of the BSE-Sensex
over a ten-year period reveals that the normal distribution not only
underestimates the probability of extreme events (Figure 1 and Figure 2) but also fails to
capture the shape of the empirically observed distribution around the core (i.e. around
the mean).
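
The claim about extreme events can be checked by counting how often returns fall more than k standard deviations from the mean and comparing that with the normal prediction. The following is a minimal sketch of that check. It runs on synthetic data, not on BSE-Sensex returns, and the function names and the calm/turbulent data generator are assumptions made for illustration.

```python
import math
import random
import statistics


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_tail_probability(k):
    """Probability that a normal variable lies more than k sigma from its mean."""
    return 2.0 * (1.0 - normal_cdf(k))


def empirical_tail_frequency(returns, k):
    """Share of observations lying more than k sample sigmas from the sample mean."""
    mu = statistics.fmean(returns)
    sigma = statistics.pstdev(returns)
    return sum(1 for r in returns if abs(r - mu) > k * sigma) / len(returns)


# Synthetic stand-in for daily returns (in percent): mostly calm days plus
# occasional turbulent days. These numbers are NOT BSE-Sensex data.
random.seed(0)
synthetic_returns = [
    random.gauss(0.1, 1.0) if random.random() < 0.8 else random.gauss(-0.2, 3.0)
    for _ in range(5000)
]

for k in (2.0, 3.0):
    print(k, empirical_tail_frequency(synthetic_returns, k), normal_tail_probability(k))
```

If the empirical frequency at 3 sigma is well above the normal probability, the sample has fatter tails than a normal distribution with the same mean and variance.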




Figure 1: Normal distribution underestimates the probability of extreme events.


[Plot: Returns time series; x-axis: date (3/11/1997 to 7/6/2009); y-axis: returns; series: Returns, mu + 2 sigma, mu - 2 sigma]

Figure 2: Normal Distribution underestimates the probability of extreme events.






[Plot: Empirical CDF vs Normal CDF; x-axis: returns; y-axis: CDF; series: empirical, normal]


Figure 3: Normal distribution does not fit the core well


In order to compute VaR and Expected Shortfall it is important to match the core, because only
if we model the core precisely will we be able to model the tail. For example, to compute
Expected Shortfall (ES) we first need an accurate measure of VaR. Clearly, if we use the
normal distribution to estimate VaR, our estimates will not match the VaR implied by
the empirical data. Moreover, it is much easier to match the core than the tails.

Section III presents the reasons why the normal distribution fails to capture the entire empirical
distribution. However, the normal distribution has certain desirable properties because of
which it is still used (Section IV). From Section V to Section VII, the new method of
computing VaR and ES is developed. In these sections, the method is applied to 10 years of daily
returns of the BSE-Sensex as a proof of concept. Finally, in Section VIII, the new approach to
computing VaR and ES is applied to 10 years of daily returns of the S&P 500.

Let us begin by listing some of the properties of a probability density function (pdf) (Section II).


II. Properties of Probability Density Functions f(x)

Property 1: f(x) should be bounded for all values of x:
$$ f(x) \le 1 $$

Property 2: f(x) should be non-negative for all values of x:
$$ f(x) \ge 0 $$

Property 3: f(x) should be continuous for all values of x:
$$ \lim_{h \to 0} f(x + h) = \lim_{h \to 0} f(x - h) = f(x) $$

Property 4: The area under the curve f(x) from $-\infty$ to $+\infty$ should be 1:
$$ \int_{-\infty}^{+\infty} f(x)\,dx = 1 $$

Property 5: f(x) should tend to zero at $-\infty$ and $+\infty$:
$$ \lim_{x \to -\infty} f(x) = \lim_{x \to +\infty} f(x) = 0 $$

III. Why does the normal fail to capture the empirical distribution?

The normal distribution has the following functional form:

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} $$

Although the normal distribution is the most widely used distribution, it has certain built-in
features that create difficulty in fitting the normal distribution to the real data.

Firstly, the normal distribution is symmetric around the mean, whereas most empirical distributions are
skewed to the right or to the left. Therefore, to capture the empirical distribution we need a
pdf whose functional form allows asymmetry.

Secondly, there are only two degrees of freedom in a normal distribution i.e. the mean and the
standard deviation. With only two degrees of freedom it becomes difficult to capture the core
distribution (data points around the mean) and the tails simultaneously. We need to have more
knobs for fitting the distribution.

IV. Why is the normal distribution still preferred?

In spite of its disadvantages, the normal distribution has certain properties that are still
desirable. For example, any p.d.f. should approach zero at $+\infty$ and $-\infty$ (Property 5).

In literature, power law functions are used to handle such scenarios. Power law function in
general can be described by the following functional form:

$$ f(x) = a_0 + \sum_{i=1}^{N} \frac{a_i}{(x - b_i)^{i}} $$

A power law function, although tractable and flexible, suffers from singularities
at the location parameters $\{b_i\}$. Therefore, it cannot be used to fit the whole distribution, but it can
be used to model the tail. In fact, the Generalized Pareto Distribution (GPD), which is used to
model the tail in this paper, is a special form of power law distribution.

There are other distributions, such as the Cauchy distribution, that are polynomial in nature and do
not have the problem of singularity:

$$ f(x) = \frac{c}{a^{2} + (x - b)^{2}}; \qquad a > 0,\quad c = \frac{a}{\pi} $$

However, the Cauchy distribution also has a built-in symmetry assumption around the location
parameter b. Moreover, the Cauchy distribution has only two degrees of freedom, just like the
normal distribution. Therefore, it is not clear that it will provide a better fit to the empirical
distribution than the fit provided by the normal distribution.

It is difficult to think of a function that satisfies all five properties of a pdf, is flexible and
is yet tractable. For example, let us consider a highly flexible pdf which has the
following form:

$$ f(x) = c\, e^{-\left[a_0 + a_1 (x - b_1)^{2} + a_3 (x - b_2)^{4} + a_4 (x - b_3)^{6}\right]}, \qquad \text{all } a_i > 0 $$

It is easy to see that the above functional form satisfies Property 1 (since $e^{-g(x)}$ is always less
than 1 if g(x) > 0 for all x). Property 2 is also satisfied if c > 0. Property 3, i.e. the continuity property,
is also satisfied (since $e^{-g(x)}$ is continuous for all x if g(x) is continuous). Property 5 is satisfied if
$a_4 > 0$. To satisfy Property 4 we have to adjust c so that the area under the p.d.f. sums up to
one.

V. New Approach

As discussed in Section IV, there are certain properties of the normal distribution, such as
continuity and tractability, which are desirable. However, there are some implicit assumptions,
such as symmetry, which create problems while fitting the whole distribution. We want to
retain the tractability and continuity of the normal distribution but at the same time we want to
do away with the symmetry. Moreover, we need to increase the degrees of freedom, i.e. we need to
have more than two knobs for fitting the p.d.f.

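I propose to use a functional form which is a linear combination of two normal pdfs:

$$ f_1(x) = \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^{2}}; \qquad f_2(x) = \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_2}{\sigma_2}\right)^{2}} $$

$$ f(x) = w f_1(x) + (1 - w) f_2(x), \qquad 0 \le w \le 1 $$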

Let us now examine whether all the properties of p.d.f. are satisfied by the proposed
distribution.

Property 1: f(x) should be bounded for all values of x: $f(x) \le 1$.

Since $f_1(x) \le 1$ and $f_2(x) \le 1$, it follows that $f(x) \le 1$, because $f(x)$ is a convex
combination of $f_1(x)$ and $f_2(x)$ and therefore lies between them.

Property 2: f(x) should be non-negative for all values of x: $f(x) \ge 0$.

Since $f_1(x) \ge 0$ and $f_2(x) \ge 0$, it follows that $f(x) \ge 0$, because $f(x)$ is a convex
combination of $f_1(x)$ and $f_2(x)$ and therefore lies between them.

Property 3: f(x) should be continuous for all values of x.

Since $f_1(x)$ and $f_2(x)$ are both continuous, $f(x)$ is continuous because it is
a linear combination of $f_1(x)$ and $f_2(x)$.

Property 4: The area under the curve f(x) from $-\infty$ to $+\infty$ should be 1.

$$ \int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^{+\infty} \left( w f_1(x) + (1 - w) f_2(x) \right) dx = w \int_{-\infty}^{+\infty} f_1(x)\,dx + (1 - w) \int_{-\infty}^{+\infty} f_2(x)\,dx $$
$$ = w(1) + (1 - w)(1) = 1 \qquad \left( \because \int_{-\infty}^{+\infty} f_1(x)\,dx = 1 \text{ and } \int_{-\infty}^{+\infty} f_2(x)\,dx = 1 \right) $$

Property 5: The pdf should tend to zero at $-\infty$ and $+\infty$.

$$ \lim_{x \to +\infty} f(x) = \lim_{x \to +\infty} \left[ w f_1(x) + (1 - w) f_2(x) \right] = w \lim_{x \to +\infty} f_1(x) + (1 - w) \lim_{x \to +\infty} f_2(x) = w(0) + (1 - w)(0) = 0 $$

Similarly,

$$ \lim_{x \to -\infty} f(x) = 0 $$
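
Properties 4 and 5 can also be checked numerically for any particular choice of parameters. The following is a minimal sketch under stated assumptions: the parameter values are hypothetical (not the fitted BSE-Sensex values), and the integral is approximated with a plain trapezoidal rule on a wide finite interval. The third central moment is included to show that the mixture need not be symmetric.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, w, mu1, sigma1, mu2, sigma2):
    """f(x) = w f1(x) + (1 - w) f2(x) with two normal components."""
    return w * normal_pdf(x, mu1, sigma1) + (1.0 - w) * normal_pdf(x, mu2, sigma2)


def integrate(func, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


# Hypothetical parameters chosen only for illustration.
PARAMS = dict(w=0.7, mu1=0.2, sigma1=1.0, mu2=-0.5, sigma2=3.0)


def f(x):
    return mixture_pdf(x, **PARAMS)


area = integrate(f, -40.0, 40.0)                       # Property 4
mean = integrate(lambda x: x * f(x), -40.0, 40.0)
third_central_moment = integrate(lambda x: (x - mean) ** 3 * f(x), -40.0, 40.0)

print("area:", area)
print("f(-40), f(40):", f(-40.0), f(40.0))             # Property 5
print("third central moment:", third_central_moment)   # non-zero means asymmetric
```

A non-zero third central moment confirms that giving the two components different means and variances breaks the symmetry of the single normal.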

We now argue that the proposed functional form will provide a fit that is at least as good as the
normal distribution. Let us take a look at the proposed functional form of the p.d.f.

$$ f(x) = w f_1(x) + (1 - w) f_2(x), \qquad 0 \le w \le 1 $$

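If we set w = 1, we get a standalone normal distribution. Hence, if we use the maximum
likelihood method to estimate the parameters, we know for sure that there is at least
one feasible solution, with the following parameters:

$$ w = 1; \quad \mu_1 = \frac{\sum_i x_i}{n}; \quad \sigma_1^{2} = \frac{\sum_i (x_i - \mu_1)^{2}}{n}; \quad \mu_2 \in \mathbb{R},\ \sigma_2 \in \mathbb{R} $$

The maximized likelihood of the proposed distribution therefore cannot be lower than that of
the fitted normal distribution.

It is worth noting that with this simple arrangement we have overcome the symmetry
assumption. In addition, the degrees of freedom, or the knobs, have gone up from 2 to 5. The
additional parameters are $\mu_2$, $\sigma_2$ and $w$.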

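The Maximum Likelihood (MLE) method is used to estimate the parameters of the
proposed distribution:

$$ \max_{w,\,\mu_1,\,\sigma_1,\,\mu_2,\,\sigma_2} \; \sum_i \ln\left[ \frac{w}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_i-\mu_1}{\sigma_1}\right)^{2}} + \frac{1 - w}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_i-\mu_2}{\sigma_2}\right)^{2}} \right] $$

$$ \text{such that} \quad 0 \le w \le 1; \quad \sigma_1 > 0; \quad \sigma_2 > 0 $$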
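
The paper does not say which optimizer was used for the maximization. One standard way to climb this log-likelihood is the expectation-maximization (EM) algorithm, sketched below. EM is a technique swapped in here for illustration, not the author's stated method. The data are synthetic, and the starting values and function names are assumptions.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(data, w, mu1, s1, mu2, s2):
    """Sum over observations of ln[w f1(x_i) + (1 - w) f2(x_i)]."""
    return sum(
        math.log(w * normal_pdf(x, mu1, s1) + (1.0 - w) * normal_pdf(x, mu2, s2))
        for x in data
    )


def fit_mixture_em(data, iterations=100):
    """EM iterations for a two-component normal mixture; returns (w, mu1, s1, mu2, s2)."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    # Assumed starting point: one narrow and one wide component at the sample mean.
    w, mu1, s1, mu2, s2 = 0.5, mean, 0.5 * sd, mean, 1.5 * sd
    for _ in range(iterations):
        # E-step: probability that each observation came from component 1.
        resp = []
        for x in data:
            p1 = w * normal_pdf(x, mu1, s1)
            p2 = (1.0 - w) * normal_pdf(x, mu2, s2)
            resp.append(p1 / (p1 + p2))
        # M-step: weighted updates of the five parameters.
        n1 = sum(resp)
        n2 = n - n1
        w = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        s1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1)
        s2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2)
    return w, mu1, s1, mu2, s2


# Synthetic fat-tailed returns, NOT market data.
random.seed(1)
data = [
    random.gauss(0.1, 1.0) if random.random() < 0.8 else random.gauss(-0.2, 3.0)
    for _ in range(2000)
]

fit = fit_mixture_em(data)
ll_mixture = log_likelihood(data, *fit)
# Single normal: the special case w = 1 with the sample mean and standard deviation.
m, s = statistics.fmean(data), statistics.pstdev(data)
ll_normal = log_likelihood(data, 1.0, m, s, m, s)
print("fitted (w, mu1, sigma1, mu2, sigma2):", fit)
print("log-likelihood mixture vs normal:", ll_mixture, ll_normal)
```

Comparing the two log-likelihoods on the same data is a direct check of the argument that the mixture fits at least as well as the single normal.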

We now test our solution on 10 years of daily returns data of the BSE-Sensex. Using the method
described, we estimate the parameters of the proposed distribution (Figure 4).



Figure 4: Parameter values of proposed distribution for 10 year daily-return data of BSE-sensex

Figure 5 shows the result of fitting the proposed distribution, on ten year returns data of BSE-
Sensex. The proposed distribution almost overlaps the empirical distribution and is definitely
better than the normal distribution.

A similar analysis was done for the Infosys stock (a constituent of the BSE-Sensex)
and the Praj Industries stock (which is not part of the BSE-Sensex). The results are
presented in Figure 6 and Figure 7 respectively. In all cases, the proposed distribution does
much better than the normal distribution. Having fitted the distribution to the index returns, I
now turn my attention to the distribution of the returns on options, which is believed to be
highly skewed.

A hypothetical call option on BSE-Sensex with expiry of 1/1/2011 and Strike of 15000 is
considered. In addition, the risk free rate is assumed to be fixed at 5% and the volatility at 35%.
The option return in this case depends only on the movement of the underlying and the time
decay. Even in this case, the proposed distribution matches well with the empirical distribution
(Figure 8).
[Plot: BSE-Sensex; x-axis: x; y-axis: Probability; series: Empirical CDF, Proposed CDF, Normal CDF]


Figure 5: Results of fitting the proposed pdf to BSE-Sensex returns

[Plot: Infosys; series: empirical cdf, fitted_cdf, normalcdf]


Figure 6: Results of fitting the proposed distribution to Infosys returns

[Plot: Praj Industries; x-axis: returns; y-axis: cdf; series: empirical, Fitted, Normal]


Figure 7: Results of fitting the proposed distribution to Praj Industries returns.

[Plot: Call Option Analysis; x-axis: x; y-axis: cdf; series: empirical, Fitted]


Figure 8: Results of fitting the proposed distribution to call option returns: Strike=15000, Expiry=1/1/2011, Vol=35%, r=5%.

VI. VaR

In section V, we saw that the proposed p.d.f. fits empirical distribution much better than the
normal distribution. Now we need to see the impact of the proposed distribution on the value of
VaR.

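We arrive at the VaR numbers as follows.

For the proposed distribution:

$$ f_1(x) = \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_1}{\sigma_1}\right)^{2}}; \qquad f_2(x) = \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu_2}{\sigma_2}\right)^{2}} $$

$$ f(x) = w f_1(x) + (1 - w) f_2(x) $$

$$ \text{VaR percentile} = \int_{-\infty}^{VaR} f(x)\,dx $$

For the normal distribution:

$$ \text{VaR percentile} = \int_{-\infty}^{VaR} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} dx $$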



Figure 7: VaR Numbers for BSE-Sensex


Overall, the VaR numbers are closer to the historical VaR numbers when the
proposed distribution is used. This is the VaR number we will use for computing Expected
Shortfall.
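
The VaR percentile integral for the mixture has no closed-form inverse, so VaR has to be found numerically. The following is a minimal sketch that solves F(VaR) = p by bisection on the mixture CDF. The parameter values are hypothetical, and the comparison normal is matched to the mixture's own mean and variance rather than fitted to data.

```python
import math


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, w, mu1, s1, mu2, s2):
    """CDF of f(x) = w f1(x) + (1 - w) f2(x)."""
    return w * normal_cdf(x, mu1, s1) + (1.0 - w) * normal_cdf(x, mu2, s2)


def quantile(cdf, p, lo=-100.0, hi=100.0, tol=1e-10):
    """Solve cdf(x) = p by bisection; cdf must be increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical mixture parameters (returns in percent), for illustration only.
W, MU1, S1, MU2, S2 = 0.8, 0.1, 1.0, -0.2, 3.0

# Normal distribution with the same mean and variance as the mixture.
mix_mean = W * MU1 + (1.0 - W) * MU2
mix_var = W * (S1 ** 2 + MU1 ** 2) + (1.0 - W) * (S2 ** 2 + MU2 ** 2) - mix_mean ** 2
mix_sd = math.sqrt(mix_var)


def fitted_cdf(x):
    return mixture_cdf(x, W, MU1, S1, MU2, S2)


def matched_normal_cdf(x):
    return normal_cdf(x, mix_mean, mix_sd)


for p in (0.01, 0.05):
    print(p, quantile(fitted_cdf, p), quantile(matched_normal_cdf, p))
```

With these hypothetical parameters the 1% quantile of the mixture lies further into the left tail than that of the matched normal, which is the kind of gap the VaR comparison is meant to expose.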

VII. Expected Shortfall

Although the proposed distribution gives a much better measure of VaR, we need to see
whether this distribution can be used for extreme values as well. The conditional cumulative
probabilities given by empirical distribution, normal distribution and proposed distribution are
plotted on the same graph to check the fit at the tails (Figure 9). The conditional cumulative
probabilities are defined as:

$$ F_{VaR}(y) = \Pr(VaR < x < VaR + y \mid x > VaR) = \frac{F(VaR + y) - F(VaR)}{1 - F(VaR)} $$

In the expression above, F is the cumulative probability distribution of negative of the returns.
Since we are concerned with VaR or the loss, it is okay to multiply the whole time-series by -1
and focus on the right tail instead of left tail. Therefore, at 95% probability level F(VaR)=95%.
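
The conditional cumulative probability can be evaluated both for a model CDF and directly from a sample of losses. The following is a minimal sketch, assuming losses are already expressed as the negative of returns. The function names and the small loss sample are made up for illustration.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def conditional_excess_cdf(cdf, var, y):
    """Model version: F_VaR(y) = (F(VaR + y) - F(VaR)) / (1 - F(VaR)), for y >= 0."""
    f_var = cdf(var)
    return (cdf(var + y) - f_var) / (1.0 - f_var)


def empirical_conditional_excess_cdf(losses, var, y):
    """Sample version: share of losses above VaR that do not exceed VaR + y."""
    exceedances = [x for x in losses if x > var]
    return sum(1 for x in exceedances if x <= var + y) / len(exceedances)


# Standard normal losses with VaR at the 95% level, so that F(VaR) = 95%.
VAR_95 = 1.6448536
for y in (0.0, 0.5, 1.0, 2.0):
    print(y, conditional_excess_cdf(normal_cdf, VAR_95, y))

# Tiny made-up loss sample, for illustration only.
sample_losses = [0.5, 1.0, 2.0, 3.0, 5.0]
print(empirical_conditional_excess_cdf(sample_losses, 1.5, 1.0))
```

Plotting these two functions against y for the same VaR gives the kind of tail comparison described in this section.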

[Plot: tails; x-axis: u; y-axis: Conditional Probabilities; series: Cond/Emp, Cond/Fitted, Cond/Norm]

Figure 9: Conditional probability distribution for tails



Clearly, neither the normal distribution nor the proposed distribution explains the tail
satisfactorily. Therefore, we need to fit the tail separately to estimate the Expected
Shortfall number.

To do that, I fit a Generalized Pareto Distribution (GPD) to the tail. The GPD is defined as:

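$$ G(y) = \begin{cases} 1 - \left(1 + \dfrac{\xi y}{\sigma}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\ 1 - e^{-y/\sigma} & \text{if } \xi = 0 \end{cases} $$

where y is the excess over VaR, $\xi$ is the shape parameter and $\sigma$ is the scale parameter.

We again use the maximum likelihood method to estimate the shape and scale parameters.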

There are a few points worth mentioning about the GPD:

1. The GPD is a conditional distribution.
2. The GPD is a cumulative distribution.
3. y is the excess value over the threshold, and y > 0.
4. $\sigma$ can never be negative, because if it were, the conditional
distribution would diverge as y tends to infinity.
5. $\xi$ can never be negative, because $\sigma$ is non-negative and
$1 + \xi y / \sigma > 0$ must hold for all y > 0.
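
The points above translate into a short likelihood routine. The following is a minimal sketch of the GPD cumulative distribution and its log-likelihood, with a crude grid search standing in for a proper maximum likelihood optimizer. The grid search is a simplification chosen for illustration, the grids restrict the shape to non-negative values in line with point 5, and the excesses are synthetic exponential draws (the shape-zero case), not BSE-Sensex data.

```python
import math
import random


def gpd_cdf(y, xi, sigma):
    """G(y) for excess y >= 0, shape xi >= 0 and scale sigma > 0."""
    if xi == 0.0:
        return 1.0 - math.exp(-y / sigma)
    return 1.0 - (1.0 + xi * y / sigma) ** (-1.0 / xi)


def gpd_log_likelihood(excesses, xi, sigma):
    """Log-likelihood of the GPD density g(y) = dG/dy over the observed excesses."""
    n = len(excesses)
    if xi == 0.0:
        return -n * math.log(sigma) - sum(excesses) / sigma
    log_terms = sum(math.log(1.0 + xi * y / sigma) for y in excesses)
    return -n * math.log(sigma) - (1.0 + 1.0 / xi) * log_terms


def fit_gpd_grid(excesses, xi_grid, sigma_grid):
    """Return the (xi, sigma) grid point with the highest log-likelihood."""
    best = None
    for xi in xi_grid:
        for sigma in sigma_grid:
            ll = gpd_log_likelihood(excesses, xi, sigma)
            if best is None or ll > best[0]:
                best = (ll, xi, sigma)
    return best[1], best[2]


# Synthetic excesses over the threshold: exponential with mean 1.25.
random.seed(2)
excesses = [random.expovariate(1.0 / 1.25) for _ in range(1000)]

xi_grid = [0.05 * i for i in range(11)]           # 0.00, 0.05, ..., 0.50
sigma_grid = [0.5 + 0.05 * i for i in range(41)]  # 0.50, 0.55, ..., 2.50
xi_hat, sigma_hat = fit_gpd_grid(excesses, xi_grid, sigma_grid)
print("estimated shape and scale:", xi_hat, sigma_hat)
```

A numerical optimizer would give finer estimates than the grid, but the objective being maximized is the same.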

The result of fitting GPD on BSE-sensex tail is shown in Figure 10:

[Plot: Tail Distribution; x-axis: u; y-axis: Conditional Probabilities; series: Cond/Emp, Cond/Fitted, Cond/Norm, Cond/Pareto]

Figure 10: The Generalized Pareto Distribution fits the tail much better than either the normal or the
proposed distribution.


Expected Shortfall (ES) is defined as

$$ ES = \frac{\int_{VaR}^{\infty} x f(x)\,dx}{\int_{VaR}^{\infty} f(x)\,dx} $$

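For the GPD it is possible to obtain a closed-form solution for ES:

$$ ES = \begin{cases} VaR + \dfrac{\sigma}{1 - \xi} & \text{if } \xi \neq 0 \\ VaR + \sigma & \text{if } \xi = 0 \end{cases} $$

For the BSE-Sensex data the estimated values for the shape and scale (sigma) parameters are $\xi = 0$ and
$\sigma = 1.243706$.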
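
The closed form is easy to evaluate and to cross-check. The following is a minimal sketch: the first function applies the closed form, and the second integrates the GPD survival function numerically, which should reproduce the mean excess over VaR. The shape and scale values are the ones reported above, while the VaR level of 2.5 is a hypothetical placeholder because the VaR table itself is not reproduced in the text.

```python
import math


def gpd_expected_shortfall(var, xi, sigma):
    """Closed-form ES for a GPD tail with 0 <= xi < 1 and sigma > 0."""
    if sigma <= 0.0 or not 0.0 <= xi < 1.0:
        raise ValueError("requires sigma > 0 and 0 <= xi < 1")
    if xi == 0.0:
        return var + sigma
    return var + sigma / (1.0 - xi)


def gpd_mean_excess_numeric(xi, sigma, upper=1000.0, n=200000):
    """Trapezoidal integral of the survival function 1 - G(y) on [0, upper]."""
    def survival(y):
        if xi == 0.0:
            return math.exp(-y / sigma)
        return (1.0 + xi * y / sigma) ** (-1.0 / xi)

    h = upper / n
    total = 0.5 * (survival(0.0) + survival(upper))
    for i in range(1, n):
        total += survival(i * h)
    return total * h


XI, SIGMA = 0.0, 1.243706   # estimates reported in the text
HYPOTHETICAL_VAR = 2.5      # placeholder VaR level, not taken from the paper

print("ES:", gpd_expected_shortfall(HYPOTHETICAL_VAR, XI, SIGMA))
print("mean excess (numeric):", gpd_mean_excess_numeric(XI, SIGMA))
```

With a shape of zero the tail is exponential, so ES exceeds VaR by exactly the scale parameter whatever the VaR level.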



Figure 11: The Expected Shortfall measure at the 95% VaR level.

The ES number for the Generalized Pareto Distribution lies between the ES predicted by the normal
distribution and the ES predicted by the proposed distribution. This is to be expected from
what we see in Figure 9: the normal distribution underestimates the values of
extreme events, whereas the proposed distribution overestimates the values of
extreme events.

VIII. Analysis for S&P 500

First we fit the proposed mixture distribution and the normal distribution to 10 years of
daily returns data for the S&P 500.










[Plot: Distributions - SNP; x-axis: returns; y-axis: cdf; series: Empirical, Normal, Fitted]


Figure 12: For the S&P 500, the proposed distribution does a much better job

The parameter estimates for the proposed distribution are:



Figure 13: Parameters of the Proposed Distribution for S&P

Then we calculate VaR Estimates for various percentiles:



Figure 14: The VaR numbers are closer to the empirical distribution if we use the proposed
distribution.


Finally, we fit a Generalized Pareto Distribution to the tail of the daily returns of the S&P 500 (Figure
14).

[Plot: Tail-S&P; x-axis: excess return over VaR; y-axis: Conditional Probabilities; series: Conditional Empirical, Conditional Normal, Conditional fitted, Pareto]

Figure 14: The Pareto distribution fits the tail much better for the S&P 500


Figure 15: The Expected Shortfall numbers from fitting the Pareto, normal and proposed
distributions


IX. Conclusion

The method of taking a weighted average of pdfs fits the entire empirical distribution much
better than the normal distribution. Although I have taken the pdf to be a mixture of two
normal pdfs, there is no restriction on the kind or number of pdfs that can be
used. We could have used three normal distributions, or two normal and one Cauchy
distribution. For the purpose of this study the weighted sum of two normal distributions
was sufficient. We used this distribution to approximate the VaR number and showed that
the VaR computed using the proposed distribution was much closer to the VaR implied
by the empirical data. Next we examined the behavior at the tail and found that the
proposed distribution does not satisfactorily explain it. This is going
to be the case irrespective of the distribution we use: if we try to fit a single curve to the
whole distribution, it will fit the core better than the tails. Therefore, we need to model the
tail separately. To model the tail, we fitted a Generalized Pareto
Distribution (GPD) to the tail alone. From the GPD, we extract the value of the Expected
Shortfall (ES).






















References:

1. Wo-Chiang Lee, Applying Generalized Pareto Distribution to the Risk Management of Commerce Fire Insurance.
2. http://www.autonlab.org/tutorials, Maximum likelihood.
3. Ramazan Gençay, Faruk Selçuk, Abdurrahman Ulugülyağci, High volatility, thick tails and extreme value theory in value-at-risk estimation.
4. Alexander J. McNeil, Rüdiger Frey, Quantitative Risk Management.
