
SOA Exam P Notes

Dargscisyhp
July 5, 2016

Mathematical identities
Series

$\sum_{i=0}^{n-1} ar^i = a\,\frac{r^n - 1}{r - 1} = a\,\frac{1 - r^n}{1 - r}$

If $|r| < 1$ then $\sum_{i=0}^{\infty} ar^i = \frac{a}{1 - r}$

$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$

$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$

$\sum_{i=1}^{n} i^3 = \frac{n^2(n+1)^2}{4}$

$\sum_{x=0}^{\infty} \frac{a^x}{x!} = e^a$

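A quick numerical spot-check of these identities (added here, not part of the original notes):

    import math

    a, n = 2.0, 100
    print(round(sum(a**x / math.factorial(x) for x in range(30)), 6))          # 7.389056 = e^2
    print(sum(range(1, n + 1)) == n * (n + 1) // 2)                            # True
    print(sum(i**2 for i in range(1, n + 1)) == n * (n + 1) * (2*n + 1) // 6)  # True
    print(sum(i**3 for i in range(1, n + 1)) == n**2 * (n + 1)**2 // 4)        # True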
Integrals
$\int xe^{ax}\,dx = \frac{xe^{ax}}{a} - \frac{e^{ax}}{a^2} + C$

Proof: Consider $\frac{d}{da}\int e^{ax}\,dx = \int xe^{ax}\,dx$. But we also have
$\frac{d}{da}\int e^{ax}\,dx = \frac{d}{da}\,\frac{e^{ax}}{a} = \frac{axe^{ax} - e^{ax}}{a^2}$.
Equating the last expressions of both equations proves the identity.

Gamma Function
$\Gamma(\alpha) = \int_0^{\infty} y^{\alpha - 1} e^{-y}\,dy$
If n is a positive integer, $\Gamma(n) = (n - 1)!$.

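Worked example (added here for illustration, not part of the original notes): $\int_0^{\infty} y^3 e^{-y}\,dy = \Gamma(4) = 3! = 6$, and applying the antiderivative above with $a = -\lambda$ gives $\int_0^{\infty} xe^{-\lambda x}\,dx = \frac{1}{\lambda^2}$.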
Basic Probability
De Morgan's laws
$(A \cup B)' = A' \cap B'$ and more generally $\left(\bigcup_{i=1}^{n} A_i\right)' = \bigcap_{i=1}^{n} A_i'$

$(A \cap B)' = A' \cup B'$ and more generally $\left(\bigcap_{i=1}^{n} A_i\right)' = \bigcup_{i=1}^{n} A_i'$

$A \cap \left(\bigcup_i B_i\right) = \bigcup_i (A \cap B_i)$

$A \cup \left(\bigcap_i B_i\right) = \bigcap_i (A \cup B_i)$

If the $B_i$ are mutually exclusive and exhaustive then $A = A \cap \left(\bigcup_i B_i\right)$
$A = A \cap (B \cup B') = (A \cap B) \cup (A \cap B')$
$P[B'] = 1 - P[B]$
$P[A \cup B] = P[A] + P[B] - P[A \cap B]$
$P[A \cup B \cup C] = P[A] + P[B] + P[C] - P[A \cap B] - P[A \cap C] - P[B \cap C] + P[A \cap B \cap C]$
For mutually exclusive $A_i$ we have $P\left[\bigcup_{i=1}^{n} A_i\right] = \sum_{i=1}^{n} P[A_i]$

If the $B_i$ form a partition then $P[A] = \sum_{i=1}^{n} P[A \cap B_i]$

Conditional probability
$P[B|A]$ is read as the probability of B given A.
$P[B|A] = \frac{P[B \cap A]}{P[A]}$

$P[B] = P[B \cap A] + P[B \cap A'] = P[B|A]P[A] + P[B|A']P[A']$

$P[A'|B] = 1 - P[A|B]$
$P[A \cup B|C] = P[A|C] + P[B|C] - P[A \cap B|C]$
Bayes' rule: reversing conditionality
Simple version of Bayes' rule:
$P[A|B] = \frac{P[B|A]P[A]}{P[B|A]P[A] + P[B|A']P[A']}$

The denominator is just $P[B \cap A] + P[B \cap A'] = P[B]$. The numerator is $P[B \cap A]$.

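A minimal numerical illustration of the simple version (added here, not from the original notes; the numbers are made up for a hypothetical diagnostic test with A = has condition, B = tests positive):

    # Hypothetical inputs: P[A], P[B|A], P[B|A']
    p_a = 0.01
    p_b_given_a = 0.95
    p_b_given_not_a = 0.05

    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)   # total probability: P[B]
    p_a_given_b = p_b_given_a * p_a / p_b                   # Bayes' rule: P[A|B]

    print(round(p_a_given_b, 4))  # 0.161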

Bayes' rule can be extended using the same justification as above:

$P[A_j|B] = \frac{P[B \cap A_j]}{P[B]} = \frac{P[B|A_j]\,P[A_j]}{\sum_{i=1}^{n} P[B|A_i]\,P[A_i]}$

If $P\left[\bigcap_{i=1}^{n-1} A_i\right] > 0$ then

$P\left[\bigcap_{i=1}^{n} A_i\right] = P[A_1]\,P[A_2|A_1]\,P[A_3|A_1 \cap A_2] \cdots P[A_n|A_1 \cap A_2 \cap \cdots \cap A_{n-1}]$

A and B are considered independent events if any of the following equivalent conditions hold:
$P[A \cap B] = P[A]\,P[B]$
$P[A|B] = P[A]$
$P[B|A] = P[B]$
Keep in mind that independent events are not the same thing as disjoint events.

Combinatorics
Permutations
The number of ways n distinct objects can be arranged is n!.
The number of ways we can choose k objects from n in an ordered manner is
$P(n, k) = \frac{n!}{(n-k)!}$

If you have $n_1$ objects of type 1, $n_2$ objects of type 2, etc., such that $\sum_i n_i = n$, then the
number of ways of arranging these objects in an ordered manner is

$\frac{n!}{\prod_i n_i!}$

Combinations
Combinations give you the number of ways you can choose k objects from a set of n
where the order is irrelevant. This number is
 
$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$

If you have $n_1$ objects of type 1, $n_2$ objects of type 2, etc., then the number of ways we
can combine $k_1$ objects of type 1, $k_2$ objects of type 2, etc. is

$\prod_i \binom{n_i}{k_i}$

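A quick numerical illustration of these counts (added here, not part of the original notes), using Python's standard library:

    import math

    print(math.perm(10, 3))   # 720  ordered choices of 3 from 10: 10!/7!
    print(math.comb(10, 3))   # 120  unordered choices of 3 from 10

    # Arrangements of the letters of "MISSISSIPPI": 11!/(4! 4! 2! 1!)
    print(math.factorial(11) // (math.factorial(4) * math.factorial(4) * math.factorial(2)))  # 34650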
Random Variables
Cumulative distribution function:
$F(x) = \sum_{t \le x} p(t)$ for a discrete variable
$F(x) = \int_{-\infty}^{x} f(t)\,dt$ for a continuous variable.

Survival function: $S(x) = 1 - F(x)$

$\frac{d}{dx}[F(x)] = F'(x) = -S'(x) = f(x)$

$P[a < X \le b] = F(b) - F(a)$
Expectation value (X is a random variable, p(x) and f(x) are its distribution functions, and h(x)
is a function of a random variable)
Discrete: $E[X] = \sum x\,p(x)$
Continuous: $E[X] = \int x f(x)\,dx$

$E[h(X)] = \sum h(x)p(x)$ (discrete) $= \int h(x)f(x)\,dx$ (continuous)

If X is defined on $[a, \infty)$ (f(x) = 0 for x < a) then $E[X] = a + \int_{a}^{\infty} [1 - F(x)]\,dx$

If X is defined on $[a, b]$ then $E[X] = a + \int_{a}^{b} [1 - F(x)]\,dx$

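For example (added illustration, not in the original notes), for a uniform variable on $[a, b]$ with $F(x) = \frac{x-a}{b-a}$:
$E[X] = a + \int_a^b \left[1 - \frac{x-a}{b-a}\right] dx = a + \int_a^b \frac{b-x}{b-a}\,dx = a + \frac{b-a}{2} = \frac{a+b}{2}$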
Jensen's inequality states that if $h''(x) \ge 0$ (h is convex), where X is a random variable, then
$E[h(X)] \ge h(E[X])$. If $h''(x) > 0$ then $E[h(X)] > h(E[X])$. The inequality reverses
for $h'' < 0$.
The nth moment of X is $E[X^n]$.
The nth central moment of X, where $E[X] = \mu$, is $E[(X - \mu)^n]$.
Variance
Variance measures dispersion about the mean.
$Var[X] = \sigma^2 = E[(X - \mu)^2] = E[X^2] - (E[X])^2$.
$\sigma$ is the standard deviation.
If $a, b \in \mathbb{R}$ then $Var[aX + b] = a^2\,Var[X]$
Coefficient of variation: $\sigma/\mu$
$Skew[X] = \frac{1}{\sigma^3}\,E[(X - \mu)^3]$

Moment generating function


$M_X(t) = E[e^{tX}]$
$M_X(t) = \int f(x)e^{tx}\,dx$ (continuous)
$M_X(t) = \sum e^{tx} p(x)$ (discrete)
$M_X(0) = 1$ because $M_X(0) = \int e^{0 \cdot x} f(x)\,dx = 1$

Moments of X can be found by successive derivatives of the moment generating function,
hence the name.
$M_X'(0) = E[X]$
$M_X''(0) = E[X^2]$
$M_X^{(n)}(0) = E[X^n]$

$\frac{d^2}{dt^2}\left[\ln M_X(t)\right]_{t=0} = Var(X)$

$M_X(t) = \sum_{k=0}^{\infty} \frac{t^k}{k!}\,E[X^k]$

If X has a discrete distribution on $x_1, x_2, \ldots, x_n$ with probabilities $p_1, p_2, \ldots, p_n$, the moment
generating function is $M_X(t) = \sum p_i e^{t x_i}$.
If Z = X + Y with X and Y independent then $M_Z(t) = M_X(t)M_Y(t)$
Proof: $M_Z(t) = E[e^{tZ}] = E[e^{t(X+Y)}] = E[e^{tX}e^{tY}] = E[e^{tX}]E[e^{tY}] = M_X(t)M_Y(t)$
Chebyshev's inequality: for r > 0 and X a random variable, $P[|X - \mu| > r\sigma] \le \frac{1}{r^2}$.

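As an added symbolic check (not part of the original notes; it assumes the sympy package is available), the moment relationships above can be verified for the MGF of a fair six-sided die, $M_X(t) = \frac{1}{6}\sum_{j=1}^{6} e^{jt}$:

    import sympy as sp

    t = sp.symbols('t')
    M = sp.Rational(1, 6) * sum(sp.exp(j * t) for j in range(1, 7))   # MGF of a fair die

    print(sp.diff(M, t).subs(t, 0))                          # 7/2   = E[X]
    print(sp.diff(M, t, 2).subs(t, 0))                       # 91/6  = E[X^2]
    print(sp.simplify(sp.diff(sp.log(M), t, 2).subs(t, 0)))  # 35/12 = Var[X]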
If $f_X(x)$ is the probability function of X, then the conditional probability distribution of X given
the occurrence of event A is $\frac{f_X(x)}{P(A)}$ if $x \in A$ and 0 otherwise.
Simple definition of independent variables:
$P[(a < X < b) \cap (c < Y < d)] = P[a < X < b] \cdot P[c < Y < d]$

If X and Y are independent variables, and A is an event involving only X while B is an event
involving only Y, then A and B are independent events.
Hazard rate of failure for a continuous random variable (usual notation: f(x) = probability
density, F(x) = CDF):

$h(x) = \frac{f(x)}{1 - F(x)} = -\frac{d}{dx}\ln[1 - F(x)]$

Mixture of distributions: given random variables $\{X_i\}_{i=1}^{k}$ with density functions $\{f_i(x)\}$
and weights $0 < \alpha_i < 1$ such that $\sum_{i=1}^{k} \alpha_i = 1$, we can construct a random variable X with the
following density function:
$f(x) = \sum_{i=1}^{k} \alpha_i f_i(x)$
$E[X^n] = \sum_{i=1}^{k} \alpha_i E[X_i^n]$
$M_X(t) = \sum_{i=1}^{k} \alpha_i M_{X_i}(t)$

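For example (added illustration): an equal-weight mixture of two exponentials with means 1 and 2 has $E[X] = \frac{1}{2}(1) + \frac{1}{2}(2) = 1.5$, $E[X^2] = \frac{1}{2}(2) + \frac{1}{2}(8) = 5$, so $Var[X] = 5 - 1.5^2 = 2.75$; note this is not the weighted average of the individual variances.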
Discrete Distributions
Uniform Distribution
Uniform distribution of N points
$p(x) = \frac{1}{N}$
$E[X] = \frac{N+1}{2}$

$Var[X] = \frac{N^2 - 1}{12}$

$M_X(t) = \frac{1}{N}\sum_{j=1}^{N} e^{jt} = \frac{e^t(e^{Nt} - 1)}{N(e^t - 1)}$

Binomial Distribution
If a single trial of an experiment has success probability p and failure probability
1 - p, and if n is the number of trials and X the number of successes, then X is binomially
distributed.
$P[X = x] = \binom{n}{x} p^x (1-p)^{n-x}$
$E[X] = np$
$Var[X] = np(1-p)$
$M_X(t) = (1 - p + pe^t)^n$

$\frac{p_k}{p_{k-1}} = -\frac{p}{1-p} + \frac{(n+1)p}{k(1-p)} = \frac{(n+1)p - kp}{k(1-p)}$

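A quick numerical check of the mean and variance formulas (added here, not part of the original notes), using only the Python standard library:

    from math import comb

    n, p = 10, 0.3
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

    mean = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

    print(round(sum(pmf), 6), round(mean, 6), round(var, 6))  # 1.0 3.0 2.1  (np = 3, np(1-p) = 2.1)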
Poisson distribution
Often used as a model for counting the number of events in a certain period of time.
Example: the number of customers arriving for service at a bank over a 1-hour period is X.
The Poisson parameter is $\lambda > 0$.

$p(x) = \frac{e^{-\lambda}\lambda^x}{x!}$

$E[X] = Var[X] = \lambda$

$M_X(t) = e^{\lambda(e^t - 1)}$

$\frac{p_k}{p_{k-1}} = \frac{\lambda}{k}$

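For instance (illustrative example added here): if claims arrive at a Poisson rate of $\lambda = 2$ per hour, then $P[X = 3] = \frac{e^{-2}2^3}{3!} = \frac{4}{3}e^{-2} \approx 0.180$.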
Geometric distribution
In a series of independent experiments with success probability p and failure probability
$q = 1 - p$, if X represents the number of failures until the first success, then X has a
geometric distribution with parameter p.
$p(x) = (1-p)^x\,p$ for x = 0, 1, 2, 3, ...
$E[X] = \frac{1-p}{p} = \frac{q}{p}$
$Var[X] = \frac{1-p}{p^2} = \frac{q}{p^2}$
$M_X(t) = \frac{p}{1 - (1-p)e^t}$

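For example (added illustration): with $p = 0.2$, $P[X = 3] = (0.8)^3(0.2) = 0.1024$ and on average $E[X] = \frac{0.8}{0.2} = 4$ failures occur before the first success.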
Negative binomial distribution with parameters r and p


If X is the number of failures until the rth success occurs, where each trial has a success
probability p, then X has a negative binomial distribution.
$p(x) = \binom{r+x-1}{x} p^r (1-p)^x$ for x = 0, 1, 2, 3, ...
$\binom{r+x-1}{x} = \binom{r+x-1}{r-1} = \frac{(r+x-1)(r+x-2)\cdots(r+1)(r)}{x!}$
$E[X] = \frac{r(1-p)}{p}$

$Var[X] = \frac{r(1-p)}{p^2}$

$M_X(t) = \left(\frac{p}{1 - (1-p)e^t}\right)^r$

$\frac{p_k}{p_{k-1}} = 1 - p + \frac{(r-1)(1-p)}{k}$

Multinomial Distribution
Parameters: n, $p_1, p_2, \ldots, p_k$ (n > 0, $0 \le p_i \le 1$, $\sum p_i = 1$)

Suppose an experiment has k possible outcomes with probabilities $p_i$. Let $X_i$ denote
the number of experiments with outcome i, so that $\sum_i X_i = n$.

$P[X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k] = p(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!}\,p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$

$E[X_i] = np_i$
$Var[X_i] = np_i(1 - p_i)$
$Cov[X_i, X_j] = -np_i p_j$
Hypergeometric distribution
Rarely used.
If there are M total objects, with k objects of type I and M - k objects of type II, and if n
objects are chosen at random, then if X denotes the number of type I objects chosen, X has
a hypergeometric distribution.
$X \le n$, $X \le k$, $0 \le X$, $n - (M - k) \le X$

$p(x) = \frac{\binom{k}{x}\binom{M-k}{n-x}}{\binom{M}{n}}$

$E[X] = \frac{nk}{M}$
$Var[X] = \frac{nk(M-k)(M-n)}{M^2(M-1)}$

Continuous Distributions
The probabilities for the following boundary conditions are equivalent for continuous distributions: $P[a < X < b] = P[a < X \le b] = P[a \le X < b] = P[a \le X \le b]$.
Uniform distribution
$f(x) = \frac{1}{b-a}$ for a < x < b.

$E[X] = \frac{a+b}{2}$ = median

$Var[X] = \frac{(b-a)^2}{12}$

$M_X(t) = \frac{e^{bt} - e^{at}}{(b-a)\,t}$

Symmetric about the mean.

$P[c < X < d] = \frac{d-c}{b-a}$

Since $\int_a^x f(t)\,dt = \frac{x-a}{b-a}$, the cumulative distribution function is

$F(x) = \begin{cases} 0 & x < a \\ \frac{x-a}{b-a} & a \le x \le b \\ 1 & x > b \end{cases}$

Normal distribution N(0,1)


Mean of 0, Variance of 1.
Density function
$\phi(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$

$M_Z(t) = \exp\left[\frac{t^2}{2}\right]$

z-tables
A z-table gives $P[Z < z] = \Phi(z)$ for the normal distribution N(0,1).
When using the table, use the symmetry of the distribution for negative values, i.e.
$\Phi(-1) = P[Z \le -1] = P[Z \ge 1] = 1 - \Phi(1)$.
General normal distribution $N(\mu, \sigma^2)$
Mean and mode: $\mu$, Variance: $\sigma^2$

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$

$M_X(t) = \exp\left[\mu t + \frac{\sigma^2 t^2}{2}\right]$

To find $P[r < X < s]$, first standardize the distribution by putting things in terms of
$Z = \frac{X - \mu}{\sigma}$ and then use the z-table as follows:
$P[r < X < s] = P\left[\frac{r - \mu}{\sigma} < Z < \frac{s - \mu}{\sigma}\right] = \Phi\left(\frac{s - \mu}{\sigma}\right) - \Phi\left(\frac{r - \mu}{\sigma}\right)$

If $X = X_1 + X_2$ where $E[X_1] = \mu_1$, $Var[X_1] = \sigma_1^2$, $E[X_2] = \mu_2$, $Var[X_2] = \sigma_2^2$, and
$X_1$ and $X_2$ are independent normal random variables, then X is normal with $E[X] = \mu_1 + \mu_2$
and $Var[X] = \sigma_1^2 + \sigma_2^2$.

Given a random variable X with mean $\mu$ and variance $\sigma^2$, the distribution can be approximated
by the normal distribution $Y \sim N(\mu, \sigma^2)$. If X takes discrete integer values then you can improve
on the normal approximation by using the integer correction:
$P[n \le X \le m] \approx P[n - \tfrac{1}{2} \le Y \le m + \tfrac{1}{2}]$.
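Worked illustration (added here, not in the original notes): if X is binomial with n = 100 and p = 0.5, so $\mu = 50$ and $\sigma = 5$, then $P[45 \le X \le 55] \approx P[44.5 \le Y \le 55.5] = \Phi(1.1) - \Phi(-1.1) = 2\Phi(1.1) - 1 \approx 2(0.8643) - 1 = 0.7286$.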
Exponential distribution
Used as a model for the time until some specific event occurs.
$f(x) = \lambda e^{-\lambda x}$ for x > 0, f(x) = 0 otherwise.
$F(x) = 1 - e^{-\lambda x}$
$S(x) = 1 - F(x) = P[X > x] = e^{-\lambda x}$
$E[X] = \frac{1}{\lambda}$
$Var[X] = \frac{1}{\lambda^2}$
$M_X(t) = \frac{\lambda}{\lambda - t}$ for $t < \lambda$.
$E[X^k] = \int_0^{\infty} x^k \lambda e^{-\lambda x}\,dx = \frac{k!}{\lambda^k}$ for k = 1, 2, 3, ...

$P[X > x + y\,|\,X > x] = \frac{P[(X > x + y) \cap (X > x)]}{P[X > x]} = \frac{P[X > x + y]}{P[X > x]} = \frac{e^{-\lambda(x+y)}}{e^{-\lambda x}} = e^{-\lambda y}$. In other words,
based on the definition of the survival function S(x), $P[X > x + y\,|\,X > x] = P[X > y]$.
This result is interpreted as showing that an exponential process is memoryless. (See
page 205 in the manual for details.)

If X is the time between successive events and is exponential with mean $\frac{1}{\lambda}$,
and N is the number of events per unit time, then N is a Poisson random variable with
mean $\lambda$.

If we have a set of independent exponential random variables $\{Y_i\}$ with means $\frac{1}{\lambda_i}$,
and $Y = \min_i \{Y_i\}$, then Y is exponential with mean $\frac{1}{\sum_i \lambda_i}$.

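For example (added illustration): if two independent component lifetimes are exponential with means 10 and 5 (so $\lambda_1 = 0.1$ and $\lambda_2 = 0.2$), the time until the first failure is exponential with mean $\frac{1}{0.1 + 0.2} = \frac{10}{3} \approx 3.33$.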
Gamma distribution
Parameters $\alpha > 0$, $\beta > 0$.
For x > 0:

$f(x) = \frac{\beta^{\alpha} x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}$

$E[X] = \frac{\alpha}{\beta}$
$Var[X] = \frac{\alpha}{\beta^2}$

$M_X(t) = \left(\frac{\beta}{\beta - t}\right)^{\alpha}$ for $t < \beta$

If this shows up on the exam then most likely $\alpha = n$ where n is an integer, so the density
function becomes

$f(x) = \frac{\beta^n x^{n-1} e^{-\beta x}}{(n-1)!}$

Joint, Marginal, Conditional distributions and Independence


Joint distribution
A joint distribution is described by a function f(x, y), which must be nonnegative and must sum
(discrete case) or integrate (continuous case) to 1 over the probability space; in the discrete
case each individual probability is also at most 1.

If f(x, y) is appropriately defined as specified above, $P[(x, y) \in A]$ is the summation or
double integral of the density function over A.
The cumulative distribution function is defined as

$F(x, y) = P[(X \le x) \cap (Y \le y)] = \begin{cases} \int_{-\infty}^{x}\int_{-\infty}^{y} f(s, t)\,dt\,ds & \text{continuous} \\ \sum_{s \le x}\sum_{t \le y} f(s, t) & \text{discrete} \end{cases}$

$E[h(X, Y)] = \begin{cases} \sum_x \sum_y h(x, y)\,f(x, y) & \text{discrete} \\ \iint_{\mathbb{R}^2} h(x, y)\,f(x, y)\,dy\,dx & \text{continuous} \end{cases}$

If X and Y are jointly distributed with a uniform density in a region R and 0 outside, then the
pdf is $\frac{1}{M(R)}$, where M(R) represents the area of R. The probability of event A is then $\frac{M(A)}{M(R)}$.

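For example (added illustration): if (X, Y) is uniform on the triangle with vertices (0,0), (1,0), (0,1), then $f(x, y) = 2$ on the triangle and $P[X > \tfrac{1}{2}]$ is the area of the part of the triangle with $x > \tfrac{1}{2}$ divided by the total area, i.e. $\frac{1/8}{1/2} = \frac{1}{4}$.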
$E[h_1(X, Y) + h_2(X, Y)] = E[h_1(X, Y)] + E[h_2(X, Y)]$

$E\left[\sum_i X_i\right] = \sum_i E[X_i]$

$\lim_{x \to -\infty} F(x, y) = \lim_{y \to -\infty} F(x, y) = 0$

$P[(x_1 < X \le x_2) \cap (y_1 < Y \le y_2)] = F(x_2, y_2) - F(x_2, y_1) - F(x_1, y_2) + F(x_1, y_1)$


Marginal distribution
Derived for a single variable from a joint distribution.
The marginal distribution of X is denoted $f_X(x)$ and is $f_X(x) = \sum_y f(x, y)$ in the discrete
case and $f_X(x) = \int_{\mathbb{R}} f(x, y)\,dy$ in the continuous case.

The cumulative distribution function is $F_X(x) = \lim_{y \to \infty} F(x, y)$.

Since X is a dummy variable, all of this discussion carries over to other variables in the
joint distribution.
Marginal probability and density functions must satisfy all requirements of probability
and density functions.
Conditional distribution
Gives the distribution of one random variable with a condition imposed on another
random variable.
Must satisfy conditions of a distribution.
Conditional mean, variance, etc can all be found using the usual methods.
The conditional distribution of X given Y = y is $f_{X|Y}(x|Y = y) = \frac{f(x, y)}{f_Y(y)}$.

If X and Y are independent then $f_{Y|X}(y|X = x) = f_Y(y)$ and $f_{X|Y}(x|Y = y) = f_X(x)$.
If the marginal and conditional distributions are known then the joint distribution can be found:
$f(x, y) = f_{Y|X}(y|X = x) \cdot f_X(x)$.
E[E[X|Y ]] = E[X]
Var[X] = E[Var[X|Y ]] + Var[E[X|Y ]]
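A standard illustration of these two identities (added here, not from the original notes): if X|Y is Poisson with mean Y and Y is exponential with mean 1, then $E[X] = E[E[X|Y]] = E[Y] = 1$ and $Var[X] = E[Var[X|Y]] + Var[E[X|Y]] = E[Y] + Var[Y] = 1 + 1 = 2$.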
Independence of random variables
X and Y are independent if the probability space is rectangular and $f(x, y) = f_X(x)\,f_Y(y)$.

Independence is also equivalent to the factorization of the cumulative distribution function:
$F(x, y) = F_X(x)\,F_Y(y)$ for all (x, y).
If X and Y are independent then
$E[g(X)h(Y)] = E[g(X)]\,E[h(Y)]$
$Var[X + Y] = Var[X] + Var[Y]$
Covariance
Positive covariance indicates that larger values of Y tend to occur when larger values of X occur.
Negative covariance indicates that smaller values of Y tend to occur when larger values of X occur.
Zero covariance indicates that X is not (linearly) related to the Y values it is paired with.
$Cov[X, Y] = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]\,E[Y]$
$Cov[X, X] = Var[X]$
If $a, b, c \in \mathbb{R}$ then $Var[aX + bY + c] = a^2\,Var[X] + b^2\,Var[Y] + 2ab\,Cov[X, Y]$
$Cov[X, Y] = Cov[Y, X]$.
If $a, b, c, d, e, f \in \mathbb{R}$ and X, Y, Z, W are random variables then
$Cov[aX + bY + c, dZ + eW + f] = ad\,Cov[X, Z] + ae\,Cov[X, W] + bd\,Cov[Y, Z] + be\,Cov[Y, W]$.
Coefficient of correlation
$\rho(X, Y) = \rho_{X,Y} = \frac{Cov[X, Y]}{\sigma_X \sigma_Y}$

$-1 \le \rho_{X,Y} \le 1$.
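A small numerical illustration (added here; the joint distribution below is made up, not from the notes):

    # Discrete joint distribution P[X=x, Y=y] on four points.
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

    ex  = sum(p * x for (x, y), p in joint.items())
    ey  = sum(p * y for (x, y), p in joint.items())
    exy = sum(p * x * y for (x, y), p in joint.items())
    vx  = sum(p * (x - ex) ** 2 for (x, y), p in joint.items())
    vy  = sum(p * (y - ey) ** 2 for (x, y), p in joint.items())

    cov = exy - ex * ey                   # Cov[X,Y] = E[XY] - E[X]E[Y]
    rho = cov / (vx ** 0.5 * vy ** 0.5)   # correlation coefficient

    print(round(cov, 4), round(rho, 4))   # 0.15 0.6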
Moment generating function for jointly distributed random variables.
$M_{XY}(t_1, t_2) = E[e^{t_1 X + t_2 Y}]$.

$E[X^n Y^m] = \left.\frac{\partial^{n+m}}{\partial t_1^n\,\partial t_2^m} M_{XY}(t_1, t_2)\right|_{t_1 = t_2 = 0}$

$M_{XY}(t_1, 0) = E[e^{t_1 X}] = M_X(t_1)$, and likewise for Y.
If $M(t_1, t_2) = M(t_1, 0) \cdot M(0, t_2)$ in a region about $t_1 = t_2 = 0$ then X and Y are
independent.
If $Y = aX + b$ then $M_Y(t) = e^{bt} M_X(at)$.
Bivariate Normal Distribution
Occurs infrequently on exams.
If X and Y are normal random variables with $E[X] = \mu_X$, $Var[X] = \sigma_X^2$, $E[Y] = \mu_Y$,
and $Var[Y] = \sigma_Y^2$, with correlation coefficient $\rho_{XY}$, then X and Y are said to have a
bivariate normal distribution.
Conditional mean of Y given X = x: $E[Y|X = x] = \mu_Y + \rho_{XY}\,\frac{\sigma_Y}{\sigma_X}(x - \mu_X) = \mu_Y + \frac{Cov(X, Y)}{\sigma_X^2}(x - \mu_X)$.

Conditional variance of Y given X = x: $Var[Y|X = x] = \sigma_Y^2\,(1 - \rho_{XY}^2)$

If X and Y are independent, $\rho_{XY} = 0$.

$f(x, y) = \frac{1}{2\pi\,\sigma_X \sigma_Y \sqrt{1 - \rho^2}}\,\exp\left[-\frac{1}{2(1 - \rho^2)}\left[\left(\frac{x - \mu_X}{\sigma_X}\right)^2 + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2 - 2\rho\left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right)\right]\right]$