
Monte Carlo Simulation of Correlated Random Variables

Stefan Förster (Augsburg)


08.11.1997

Abstract
This paper describes a method for the Monte Carlo simulation of two correlated random variables. The author analyses linear combinations of stochastically independent random variables that are uniformly distributed over the interval $(0,1)$ ("random numbers") and also examines their distribution. If a suitable matrix of coefficients is chosen, the subsequent transformation results in random variables with the desired distribution properties and the given covariance. The method is carried out for a series of covariances using two exponentially distributed random variables.

1. Introduction
Thanks to their almost universal applicability, Monte Carlo simulations are becoming increasingly popular in many academic disciplines, a trend that has been further enhanced by the widespread use of ever faster computers with ever more RAM. Monte Carlo simulations are suitable, for instance, for the analysis of the stochastic difference equations that occur when modelling discrete processes. In the domain of ruin theory, for example, so-called risk reserve processes can be analysed: if not only the total claims expenditure of a particular business year but also the investment result is factored into the calculation as a random variable, both these variables are usually assumed to be stochastically independent. In this case, Monte Carlo simulations lend themselves to solving the problem. In reality, however, the posited stochastic independence may not necessarily exist: with reinsurers, for instance, underwriting results often correlate with investment results simply because reinsurers also hold stakes in their ceding companies. Nevertheless, as the dependencies between the two random variables cannot generally be expressed in terms of formulae, it is impossible to use the basic version of the Monte Carlo simulation in such cases. This paper will show how such correlations between two random variables can be taken into account in a Monte Carlo simulation.

 E-mail: Foerster@a-city.de

2. Derivation of the method
Let $(\Omega, \mathcal{A}, P)$ be a probability space on which all random variables occurring in this paper are defined. Let us also assume that the two random variables $X_i : (\Omega, \mathcal{A}) \to (\mathbb{R}, \mathcal{B})$ to be simulated using the Monte Carlo method are square-integrable with respect to $P$ and have the distribution functions $F_{X_i}$, $i = 1, 2$. Further, let $U_i : (\Omega, \mathcal{A}) \to \bigl( (0,1), \mathcal{B}_{(0,1)} \bigr)$, $i = 1, 2$, be the underlying random variables, uniformly distributed over the interval $(0,1)$ ("random numbers"). Let

$$F_{X_i}^{-1}(x) := \min \{\, u \in \mathbb{R} : F_{X_i}(u) \ge x \,\}, \qquad i = 1, 2. \tag{1}$$
In the standard version of the Monte Carlo simulation these generalised inverses of the distribution functions $F_{X_i}$ of $X_i$, $i = 1, 2$, are used to generate stochastically independent random variables

$$\hat{X}_i := F_{X_i}^{-1} \circ U_i, \qquad i = 1, 2, \tag{2}$$
whose distributions $P_{\hat{X}_i}$ are identical with the distributions $P_{X_i}$ of $X_i$, $i = 1, 2$. In order to generate stochastically dependent random variables, stochastically dependent $U_i$ would have to be used in (2); however, the random number generator produces only (nearly) uncorrelated pseudo-random numbers. These can, for instance, be linked to each other via a linear combination. Let $A = (a_{ij})_{i,j=1,2} \in \mathbb{R}^{2 \times 2}$ be a matrix with $A_{i\cdot} \ne 0$, $i = 1, 2$, where $A_{i\cdot}$ denotes the $i$th row of the matrix. The random variable $U := (U_1, U_2)^t$ is then converted by linear transformation into the random variable

$$V = (V_1, V_2)^t := AU \tag{3}$$

with mutually dependent components $V_1$ and $V_2$. When these components are substituted into their distribution functions $F_{V_i}$, $i = 1, 2$, random variables

$$\tilde{U}_i := F_{V_i} \circ V_i, \qquad i = 1, 2, \tag{4}$$

are generated that are uniformly distributed over the interval $(0,1)$. By analogy with (2) we thus obtain the random variables

$$\tilde{X}_i := F_{X_i}^{-1} \circ \tilde{U}_i = F_{X_i}^{-1} \circ F_{V_i} \circ V_i = F_{X_i}^{-1} \circ F_{A_{i\cdot} U} \circ (A_{i\cdot} U), \qquad i = 1, 2, \tag{5}$$
which are distributed in accordance with $P_{\tilde{X}_i} = P_{X_i}$, $i = 1, 2$, and which, in contrast to (2), are correlated. Let $\mathcal{U}_{(0,1)}$ denote the uniform distribution over the interval $(0,1)$. Then, owing to

$$P_U = P_{(U_1, U_2)} = P_{U_1} \otimes P_{U_2} = \mathcal{U}_{(0,1)} \otimes \mathcal{U}_{(0,1)} \tag{6}$$

and using Fubini's theorem,

$$\begin{aligned}
E(\tilde{X}_1 \tilde{X}_2) &= \int_{\Omega} \tilde{X}_1 \tilde{X}_2 \, dP \\
&= \int_{\Omega} \prod_{i=1}^{2} \bigl( F_{X_i}^{-1} \circ F_{A_{i\cdot} U} \bigr)(A_{i\cdot} U) \, dP \\
&= \int_{\mathbb{R}^2} \prod_{i=1}^{2} \bigl( F_{X_i}^{-1} \circ F_{A_{i\cdot} U} \bigr)(A_{i\cdot} u) \, dP_U(u) \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} \prod_{i=1}^{2} \bigl( F_{X_i}^{-1} \circ F_{A_{i\cdot} U} \bigr)(a_{i1} u_1 + a_{i2} u_2) \, d\mathcal{U}_{(0,1)}(u_1) \, d\mathcal{U}_{(0,1)}(u_2) \\
&= \int_{0}^{1} \int_{0}^{1} \prod_{i=1}^{2} \bigl( F_{X_i}^{-1} \circ F_{A_{i\cdot} U} \bigr)(a_{i1} u_1 + a_{i2} u_2) \, du_1 \, du_2. \tag{7}
\end{aligned}$$
The covariance of $\tilde{X}_1$ and $\tilde{X}_2$ can then be calculated using

$$\operatorname{Cov}(\tilde{X}_1, \tilde{X}_2) = E(\tilde{X}_1 \tilde{X}_2) - E(\tilde{X}_1) \, E(\tilde{X}_2) = E(\tilde{X}_1 \tilde{X}_2) - E(X_1) \, E(X_2). \tag{8}$$
The double integral occurring in (7) can, as a rule, only be solved numerically. An introduction to the numerics of multiple integrals can be found, for example, in [HH92, p. 342], while [Str71] provides a more in-depth treatment of the same topic. The integrand of this double integral contains the distribution function of the linear combination of the two uniformly distributed random variables $U_1$ and $U_2$, which we shall examine in the following section. It is then possible to induce a given covariance between $\tilde{X}_1$ and $\tilde{X}_2$ by selecting an appropriate parameter matrix $A$ via a numerical zero search and/or a method for solving (overidentified) non-linear systems of equations. Section 4 of this paper discusses how this method can be applied.
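As a minimal sketch of this numerical step (not the author's original code), the cubature of (7) can be written in C++ as follows; the callables invFX (standing in for $F_{X_i}^{-1}$) and FV (standing in for $F_{A_{i\cdot} U}$) and all other names are hypothetical and assumed to be supplied by the caller:

```cpp
#include <functional>
#include <random>

// Monte Carlo cubature of the double integral in (7):
// E(X~1 X~2) ~ (1/n) * sum_k prod_i invFX[i]( FV[i](a[i][0]*u1 + a[i][1]*u2) ),
// where invFX[i] plays the role of F_{X_i}^{-1} and FV[i] that of F_{A_i.U}.
double expectationProduct(const std::function<double(double)> invFX[2],
                          const std::function<double(double)> FV[2],
                          const double a[2][2], long n)
{
    std::mt19937 gen(42);                                 // pseudo random numbers
    std::uniform_real_distribution<double> U(0.0, 1.0);
    double sum = 0.0;
    for (long k = 0; k < n; ++k) {
        const double u1 = U(gen), u2 = U(gen);
        double prod = 1.0;
        for (int i = 0; i < 2; ++i)
            prod *= invFX[i](FV[i](a[i][0] * u1 + a[i][1] * u2));
        sum += prod;
    }
    return sum / static_cast<double>(n);                  // estimate of E(X~1 X~2)
}
```

Subtracting $E(X_1) \, E(X_2)$ as in (8) then yields the covariance estimate.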

3. Linear combinations of random numbers


Let us now examine the linear combination

$$V := a_1 U_1 + a_2 U_2, \qquad a_i \in \mathbb{R} \setminus \{0\}, \; i = 1, 2 \tag{9}$$

of two random variables uniformly distributed over the interval $(0,1)$. It is easy to see that the distributions of these random variables $a_i U_i$, $i = 1, 2$, have the density functions¹

$$f_{a_i U_i}: \mathbb{R} \to \{\, 0, 1/|a_i| \,\}, \qquad x \mapsto \frac{1}{|a_i|} \, \mathbf{1}_{(a_i \wedge 0, \; a_i \vee 0)}(x), \qquad i = 1, 2 \tag{10}$$

with respect to the Lebesgue measure. Thanks to the stochastic independence of $a_1 U_1$ and $a_2 U_2$, we obtain by convolution of these two density functions the density function

$$f_V(v) := \int_{\mathbb{R}} f_{a_1 U_1}(v - x) \, f_{a_2 U_2}(x) \, dx, \qquad v \in \mathbb{R} \tag{11}$$

of the distribution of $V$. For the case $0 < a_1 \le a_2$, it is easy to calculate

$$f_V(v) = \frac{1}{a_1 a_2} \cdot \begin{cases} 0 & v \le 0 \\ v & 0 < v \le a_1 \\ a_1 & a_1 < v \le a_2 \\ (a_1 + a_2) - v & a_2 < v \le a_1 + a_2 \\ 0 & a_1 + a_2 < v. \end{cases} \tag{12}$$
¹ Here $a \wedge b := \min\{a, b\}$ and $a \vee b := \max\{a, b\}$; $\mathbf{1}_M$ denotes, as is customary, the indicator function of a set $M \subseteq \mathbb{R}$, where $\mathbf{1}_M(x) = 1$ if $x \in M$ and $0$ otherwise.

All in all, it is necessary to examine eight different cases. If the set $M := \{\, 0, a_1, a_2, a_1 + a_2 \,\}$ is sorted in ascending order using the isotonic function $r: \{\, 1, \ldots, 4 \,\} \to M$ (abbreviated to $r_i := r(i)$, $i = 1, \ldots, 4$), $f_V$ can be generally formulated as follows:

$$f_V(v) = \frac{1}{|a_1 a_2|} \cdot \begin{cases} 0 & v \le r_1 \\ v - r_1 & r_1 < v \le r_2 \\ r_2 - r_1 = r_4 - r_3 & r_2 < v \le r_3 \\ r_4 - v & r_3 < v \le r_4 \\ 0 & r_4 < v. \end{cases} \tag{13}$$
Figure 1 shows the graph of the function $f_V$ for the parameters $a_1 = 1$ and $a_2 = 2$.

[Figure 1: Density function $f_V$ ($a_1 = 1$ and $a_2 = 2$)]

By integrating $f_V$, we obtain the desired distribution function $F_V$ of $V$:

$$F_V(v) = \frac{1}{2 |a_1 a_2|} \cdot \begin{cases} 0 & v \le r_1 \\ (v - r_1)^2 & r_1 < v \le r_2 \\ (v - r_1)^2 - (v - r_2)^2 & r_2 < v \le r_3 \\ 2 |a_1 a_2| - (r_4 - v)^2 & r_3 < v \le r_4 \\ 2 |a_1 a_2| & r_4 < v. \end{cases} \tag{14}$$

Using the identity $|a_1 a_2| = (r_2 - r_1)(r_3 - r_1)$, (14) can be written so that it depends on $r_1, \ldots, r_4$ alone; this is particularly helpful when programming this function, as sketched below. Figure 2 shows the corresponding distribution function for the density function shown in Figure 1.
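A possible C++ rendering of (14) in this $r_1, \ldots, r_4$ form, as a sketch (the function name FV and the brute-force std::sort are illustrative choices, not taken from the paper):

```cpp
#include <algorithm>

// Distribution function F_V of V = a1*U1 + a2*U2, following (14),
// expressed through the sorted breakpoints r1 <= r2 <= r3 <= r4 of
// M = {0, a1, a2, a1 + a2} and the identity |a1*a2| = (r2 - r1)(r3 - r1).
double FV(double v, double a1, double a2)
{
    double r[4] = {0.0, a1, a2, a1 + a2};
    std::sort(r, r + 4);                              // r[0] = r_1, ..., r[3] = r_4
    const double twoA = 2.0 * (r[1] - r[0]) * (r[2] - r[0]);   // = 2|a1*a2|
    if (v <= r[0]) return 0.0;
    if (v <= r[1]) return (v - r[0]) * (v - r[0]) / twoA;
    if (v <= r[2]) return ((v - r[0]) * (v - r[0])
                         - (v - r[1]) * (v - r[1])) / twoA;
    if (v <= r[3]) return 1.0 - (r[3] - v) * (r[3] - v) / twoA;
    return 1.0;
}
```

For $a_1 = 1$ and $a_2 = 2$ this reproduces Figure 2; for example, FV(2.0, 1.0, 2.0) returns $0.75$.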

[Figure 2: Distribution function $F_V$ ($a_1 = 1$ and $a_2 = 2$)]

4. Example
The method described above will now be applied in the following example. Let us assume that $X = (X_1, X_2)$ is as given in Section 2, where the two components $X_i \sim \operatorname{Exp}(\lambda_i)$, $i = 1, 2$, are exponentially distributed with the parameters $\lambda_1 = 1$ and $\lambda_2 = 2$, respectively. For the mean value and the variance of $X$ it thus follows that

$$E(X) = \begin{pmatrix} 1 \\ 0.5 \end{pmatrix} \quad \text{and} \quad \operatorname{Var}(X) = \begin{pmatrix} 1 & \operatorname{Cov}(X_1, X_2) \\ \operatorname{Cov}(X_1, X_2) & 0.25 \end{pmatrix}. \tag{15}$$
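For an $\operatorname{Exp}(\lambda)$-distributed variable, $F_X(x) = 1 - e^{-\lambda x}$ for $x \ge 0$, so the generalised inverse needed in (5) is simply $F_X^{-1}(u) = -\ln(1 - u)/\lambda$.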
The matrix

$$A := \begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix} \tag{16}$$

is chosen as the parameter matrix for the method. The restriction to a single variable parameter $a_{21} = a \in \mathbb{R}$ means that, for a given covariance $\operatorname{Cov}(X_1, X_2)$, the latter can be approximated using one of the usual numerical zero search procedures. As the target function from equation (8) is continuous in $a$, the so-called binary search (bisection) lends itself to this purpose. The approximation of the double integral occurring in (7) is carried out via a Monte Carlo simulation. The accuracy offered by this method can also be achieved, for instance, using the compound 6th-order Newton-Cotes formula (Weddle's rule) with roughly the same number of nodes as there are steps in the Monte Carlo simulation. Figure 3 below shows the graph thus generated of $a = a(\operatorname{Cov}(X_1, X_2))$ for the covariance range from $-0.30$ to $+0.45$.
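A minimal sketch of this binary search, under the assumption that a callable covOfA evaluates $\operatorname{Cov}(\tilde{X}_1, \tilde{X}_2)$ for a given $a$ via the cubature of (7) and formula (8) (all names hypothetical):

```cpp
// Bisection ("binary search") for the parameter a in (16): solve
// g(a) := covOfA(a) - targetCov = 0 on a bracketing interval [lo, hi].
// covOfA must change sign across [lo, hi] for the bracket to be valid.
double find_a(double (*covOfA)(double), double targetCov,
              double lo, double hi, double tol)
{
    double gLo = covOfA(lo) - targetCov;
    while (hi - lo > tol) {
        const double mid  = 0.5 * (lo + hi);
        const double gMid = covOfA(mid) - targetCov;
        if (gLo * gMid > 0.0) { lo = mid; gLo = gMid; }   // root in upper half
        else                  { hi = mid; }               // root in lower half
    }
    return 0.5 * (lo + hi);
}
```

Since covOfA is itself a Monte Carlo estimate, its evaluations are noisy; in practice the number of cubature steps must be large enough that the bisection bracket remains reliable.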
The theoretically greatest possible covariance range from $-0.5$ to $+0.5$ cannot be fully reached in this example using the selected approach (16); the attainable range lies roughly between $-0.32$ and $+0.47$. However, it can be extended even further by varying the other matrix components of $A$.
[Figure 3: Parameter $a$ as a function of the covariance $\operatorname{Cov}(\tilde{X}_1, \tilde{X}_2)$]
Using Monte Carlo simulations in accordance with (5), the values $\bigl( \tilde{X}_1(\omega_k), \tilde{X}_2(\omega_k) \bigr)$, $k = 1, \ldots, 4{,}000{,}000$, were simulated for each of a series of covariances. Table 1 shows the resulting estimates (calculated using the usual unbiased estimators, as sketched below) for both the mean value $\hat{E}(X)$ and the variance $\widehat{\operatorname{Var}}(X)$. Incidentally, the case $\operatorname{Cov}(X_1, X_2) = 0$ was simulated using the standard Monte Carlo procedure (2) for uncorrelated variables and is given here simply for the purpose of comparison.
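These estimators might look as follows in C++ (a sketch; the function and variable names are illustrative, not from the paper):

```cpp
#include <cstddef>
#include <vector>

// Usual unbiased estimators for the means and the covariance from the
// simulated pairs (x1[k], x2[k]), k = 0, ..., n-1.
void estimateMoments(const std::vector<double>& x1, const std::vector<double>& x2,
                     double& mean1, double& mean2, double& cov12)
{
    const double n = static_cast<double>(x1.size());
    mean1 = mean2 = 0.0;
    for (std::size_t k = 0; k < x1.size(); ++k) { mean1 += x1[k]; mean2 += x2[k]; }
    mean1 /= n;
    mean2 /= n;
    cov12 = 0.0;
    for (std::size_t k = 0; k < x1.size(); ++k)
        cov12 += (x1[k] - mean1) * (x2[k] - mean2);
    cov12 /= (n - 1.0);                       // unbiased: division by n - 1
}
```

The variance estimates in Table 1 follow the same pattern with $x_1 = x_2$.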
This numerical example was implemented in the C++ programming language. On a normal PC, the binary search for the parameter $a$ (comprising 1 million simulation steps for each cubature of the double integral, with a procedural accuracy of $10^{-9}$ percent) takes about 1-2 minutes. It takes only 15 seconds to simulate the 4 million values $\bigl( \tilde{X}_1(\omega_k), \tilde{X}_2(\omega_k) \bigr)$ and to calculate the corresponding estimators for the mean value and the covariance matrix. It is worth mentioning in this context that two independently operating random number generators (multiplicative congruential method, see e.g. [HH67, p. 28]) were used to generate the pseudo-random numbers $U_1(\omega_k)$ and $U_2(\omega_k)$.
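A generator of this type can be sketched in a few lines. The paper does not state the constants it used, so the "minimal standard" parameters $a = 16807$, $m = 2^{31} - 1$ are assumed here purely for illustration:

```cpp
// Multiplicative congruential generator x_{k+1} = a * x_k mod m.
// Constants a = 16807, m = 2^31 - 1 are an assumption (not from the
// paper). The seed must lie in [1, m - 1].
class MCG {
    unsigned long long state_;
public:
    explicit MCG(unsigned long long seed) : state_(seed) {}
    double next() {                       // pseudo random number in (0, 1)
        state_ = (16807ULL * state_) % 2147483647ULL;
        return static_cast<double>(state_) / 2147483647.0;
    }
};

// Two independently seeded instances, e.g. MCG g1(1), g2(123457),
// can supply the streams U1(w_k) and U2(w_k).
```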

  target Cov        a          E^(X1)      E^(X2)     Var^(X1)   Cov^(X1,X2)    Var^(X2)
    -0.3       -2.25038669  1.00025575  0.49994647  1.00230680  -0.30007563  0.24982405
    -0.2       -0.65309271  1.00037665  0.49985152  0.99969808  -0.19988382  0.24996537
    -0.1       -0.22334035  1.00003878  0.49985804  0.99983236  -0.09987111  0.24984293
     0.0            -       0.99995994  0.50031120  0.99927780   0.00026363  0.25065330
     0.1        0.18676798  0.99994481  0.49946745  1.00033920   0.10081312  0.24925708
     0.2        0.47654724  1.00011061  0.49996047  0.99785308   0.19859512  0.24948611
     0.3        0.93628740  1.00025179  0.50010758  0.99996084   0.30018929  0.25003923
     0.4        2.04820447  0.99995495  0.50003622  0.99856216   0.39945971  0.24991279

Table 1: Results of the different simulation runs. The first column gives the prescribed covariance $\operatorname{Cov}(X_1, X_2)$, the second the parameter $a$ found for it, and the remaining columns the estimated means $\hat{E}(X_i)$ and the entries $\widehat{\operatorname{Var}}(X_1)$, $\widehat{\operatorname{Cov}}(X_1, X_2)$, $\widehat{\operatorname{Var}}(X_2)$ of the estimated covariance matrix.

5. Concluding remarks
As the example in Section 4 shows, today's high-performance computers mean that the procedure described in this paper is capable of delivering an accuracy adequate for practical purposes within a reasonable time.
In principle, this procedure is also suitable for simulating any number of correlated random variables. To do this, it is necessary to formulate a generalisation of (14) for linear combinations of $n = 3, 4, \ldots$ random variables that are uniformly distributed over the interval $(0,1)$. The multiple integral occurring in analogy with (7) can then only be approximated by means of Monte Carlo procedures; however, as the number of dimensions increases, so must the number of simulation runs needed to achieve the given accuracy, thus lengthening the processing time.

References
[HH67] J. M. Hammersley and D. C. Handscomb. Monte Carlo Methods. Methuen & Co Ltd., London, 3rd edition, 1967.
[HH92] Günther Hämmerlin and Karl-Heinz Hoffmann. Numerische Mathematik. Springer-Verlag, Berlin, Heidelberg, New York, 3rd edition, 1992.
[Str71] A. H. Stroud. Approximate Calculation of Multiple Integrals. Prentice-Hall, Inc., Englewood Cliffs, N.J., 1971.
