
C Formula Sheet

Probability

f(x) = F'(x)
F(x) = ∫ f(x) dx
F(x) = 1 − S(x)
S(x) = 1 − F(x)
S(x) = e^(−H(x))
H(x) = −ln S(x)
h(x) = H'(x)
H(x) = ∫ h(x) dx
f(x) = −S'(x)
h(x) = f(x)/S(x)

E[X^k] = Σ_x x^k p(x)  or  ∫_{−∞}^{∞} x^k f(x) dx

μ_n' = E[X^n]  (nth raw moment)
μ_n = E[(X − μ)^n]  (nth central moment)
μ = μ_1' is the mean
μ_2 = μ_2' − μ^2 is the variance
μ_3 = μ_3' − 3μ_2'μ + 2μ^3
μ_4 = μ_4' − 4μ_3'μ + 6μ_2'μ^2 − 3μ^4
Var(X) = μ_2 = σ^2
Standard deviation = σ
Skewness: γ_1 = μ_3/σ^3;  γ_1 = 0 if X is symmetric;  Skew(X) = Skew(cX)
Kurtosis: γ_2 = μ_4/σ^4
Coefficient of variation = σ/μ
Combinations: C(n,k) = n!/(k!(n − k)!)

Percentiles
A 100p-th percentile of a random variable X is a number π_p satisfying:
Pr(X < π_p) ≤ p
Pr(X ≤ π_p) ≥ p

Conditional Probability
Pr(A | B) = Pr(A ∩ B)/Pr(B)
Pr(A | B) = Pr(B | A)·Pr(A)/Pr(B)

Scaling
To scale a lognormal:
X ~ lognormal(μ, σ)  ⟹  cX ~ lognormal(μ + ln c, σ)
Let Y = cX. Then F_Y(y) = Pr(Y ≤ y) = Pr(cX ≤ y) = Pr(X ≤ y/c) = F_X(y/c)
Variance
E[XY] = E[X]·E[Y] if independent
E[aX + bY] = aE[X] + bE[Y]
Var(aX + b) = a^2·Var(X)
Var(aX + bY) = a^2·Var(X) + 2ab·Cov(X,Y) + b^2·Var(Y)
Var(X) = E[X^2] − E[X]^2
Var(X) = Var(1 − X)
Cov(X,Y) = E[XY] − E[X]·E[Y]
Cov(A + B, C + D) = Cov(A,C) + Cov(A,D) + Cov(B,C) + Cov(B,D)
Cov(A,A) = Var(A)
Cov(A,B) = 0 if independent
ρ_XY = Cov(X,Y)/(σ_X·σ_Y)  (correlation coefficient)
Bernoulli shortcut: for any RV taking only 2 values a and b, with probabilities p and q = 1 − p:
Var(X) = (a − b)^2·pq
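The Bernoulli shortcut can be checked numerically against Var(X) = E[X^2] − E[X]^2; a minimal sketch (the values a, b, p below are illustrative):

```python
# Check the Bernoulli shortcut: for an RV taking only two values
# a (with prob p) and b (with prob q = 1 - p), Var(X) = (a - b)^2 * p * q.
a, b, p = 100.0, 20.0, 0.3          # illustrative values
q = 1 - p

mean = a * p + b * q
second_moment = a**2 * p + b**2 * q
var_direct = second_moment - mean**2       # Var(X) = E[X^2] - E[X]^2
var_shortcut = (a - b)**2 * p * q          # Bernoulli shortcut

assert abs(var_direct - var_shortcut) < 1e-9
```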
Parametric Distributions

Distribution — f(x) — E[X] — E[X^2] — Var(X)  (c denotes each density's normalizing constant)

Binomial (n trials): f(k) = C(n,k)·p^k·(1−p)^(n−k);  E = np;  Var = npq
Bernoulli (1 trial): f(k) = p^k·(1−p)^(1−k);  E = p;  Var = pq
Uniform, continuous on [d, u]: f(x) = 1/(u−d);  F(x) = (x−d)/(u−d);  E = (d+u)/2;  E[X^2] = u^2/3 if d = 0;  Var = (u−d)^2/12
Beta: f(x) = c·x^(a−1)·(θ−x)^(b−1);  E = θa/(a+b);  E[X^2] = θ^2·a(a+1)/[(a+b)(a+b+1)]
Exponential (memoryless): f(x) = c·e^(−x/θ);  E = θ;  E[X^2] = 2θ^2;  Var = θ^2
Weibull: f(x) = c·x^(τ−1)·e^(−(x/θ)^τ);  E = θ·Γ(1 + 1/τ);  E[X^2] = θ^2·Γ(1 + 2/τ)
Gamma: f(x) = c·x^(α−1)·e^(−x/θ);  E = αθ;  E[X^2] = α(α+1)θ^2;  Var = αθ^2
Single-parameter Pareto: f(x) = c/x^(α+1), x > θ;  E = αθ/(α−1);  E[X^2] = αθ^2/(α−2)
Two-parameter Pareto: f(x) = c/(x+θ)^(α+1);  E = θ/(α−1);  E[X^2] = 2θ^2/[(α−1)(α−2)]
Lognormal: f(x) = (c/x)·e^(−(ln x − μ)^2/(2σ^2));  E = e^(μ+0.5σ^2);  E[X^2] = e^(2μ+2σ^2);  Median = e^μ
Frailty

Let h(x | Λ) = Λ·a(x)
If a(x) is a constant, the conditional hazard rate is exponential; otherwise Weibull
Λ can be Gamma or Inverse Gaussian
A(x) = ∫_0^x a(t) dt
H(x | Λ) = Λ·A(x)
S(x | Λ) = e^(−Λ·A(x))
S(x) = M_Λ(−A(x))  (use the MGF of Λ)

Exponential a(x) (with Gamma Λ) leads to a Pareto distribution
Weibull a(x) (with Gamma Λ) leads to a Burr distribution

Conditional Variance

Var(X) = E[Var(X | I)] + Var(E[X | I])
E[X] = E[E[X | I]] = Σ_i P(A_i)·E[X | A_i]
E[X^2] = E[E[X^2 | I]]
Var(X) ≥ E[Var(X | I)]

Splices

1) Sum of functions must integrate to 1


2) To be continuous, functions must be equal at break point

Shifting

If f(x) = 3e^(−3(x−1)) for x > 1,
the mean = 1/3 + 1 (the mean of the unshifted exponential plus the shift)
Policy Limits

X ∧ d is the LIMITED EXPECTED VALUE

X ∧ d = min(X, d) = { X if X < d;  d if X ≥ d }  (cost to customer under an ordinary deductible d)

Definition: E[X^k] = ∫_0^∞ x^k f(x) dx = ∫_0^∞ k·x^(k−1)·S(x) dx
E[X] = ∫_0^∞ x·f(x) dx = ∫_0^∞ S(x) dx
E[X ∧ d] = ∫_0^d x·f(x) dx + d·S(d) = ∫_0^d S(x) dx
E[(X ∧ d)^k] = ∫_0^d x^k·f(x) dx + d^k·S(d) = ∫_0^d k·x^(k−1)·S(x) dx

If Y = (1 + r)X, then E[Y ∧ d] = (1 + r)·E[X ∧ d/(1 + r)]

Deductibles

Ordinary deductible of d – pays max(0, X − d)

For d = 500: a loss ≤ 500 pays nothing; a loss of 700 pays 200

Franchise deductible of d – pays nothing if the loss is ≤ d, and the full amount if the loss > d

Payment per Loss with Deductible

Ordinary Deductible
Payment from ins. co. = (X − d)_+ = { 0 if X ≤ d;  X − d if X > d }

E[(X − d)_+] = ∫_d^∞ (x − d)·f(x) dx = ∫_d^∞ S(x) dx

E[(X − d)_+^k] = ∫_d^∞ (x − d)^k·f(x) dx;  for k = 1, E[(X − d)_+] = E[X] − E[X ∧ d]
Payment per Payment with Deductible

Ordinary Deductible
Y^P = (X − d) | X > d
F_{Y^P}(x) = [F_X(x + d) − F_X(d)] / [1 − F_X(d)]
S_{Y^P}(x) = S_X(x + d) / S_X(d)

e(d) = E[X − d | X > d]
e(d) = E[Y^P] = (E[X] − E[X ∧ d]) / S(d)
e(d) = ∫_d^∞ (x − d)·f(x) dx / S(d) = ∫_d^∞ S(x) dx / S(d)
e(d) = mean excess loss

E[X] = E[X ∧ d] + E[(X − d)_+]
       (pmt from customer)  (pmt from ins. co.)

E[X ∧ d] = E[X ∧ d | X ≤ d]·Pr(X ≤ d) + E[X ∧ d | X > d]·Pr(X > d)
         = (average loss below d)·Pr(X ≤ d) + d·Pr(X > d)

Franchise Deductibles

Expected payment per loss = E[(X − d)_+] + d·S(d)
Expected payment per payment = e(d) + d
Special Cases for e(d)

Distribution — e(d)
Exponential: θ
Uniform(0, θ): (θ − d)/2
2-parameter Pareto: (θ + d)/(α − 1)
1-parameter Pareto: [α(θ − d) + d]/(α − 1) for d ≤ θ;  d/(α − 1) for d ≥ θ

If X ~ Uniform(0, θ), then (X − d) | X > d ~ Uniform(0, θ − d)
If X ~ Pareto(α, θ), then (X − d) | X > d ~ Pareto(α, θ + d)
If X ~ 1-parameter Pareto(α, θ) and d ≥ θ, then (X − d) | X > d ~ Pareto(α, d)

Loss Elimination Ratio

E X ^ d 
LER(d ) 
E X 
LER(d )  Expected % of loss not included in payment
Special Cases for LER(d)

Distribution LER(d)
d
Exponential 1 e 
 1
  
2 Parameter Pareto 1  
 d  
 
 1

1 Parameter Pareto 1 d
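As a numerical check of the two-parameter Pareto entry above, E[X ∧ d] can be computed by integrating the survival function (α, θ, d below are illustrative):

```python
import math

# Check LER(d) = E[X ^ d]/E[X] for a two-parameter Pareto(alpha, theta)
# against the table formula 1 - (theta/(theta + d))^(alpha - 1).
alpha, theta, d = 3.0, 2000.0, 500.0       # illustrative values
S = lambda x: (theta / (theta + x)) ** alpha

# E[X ^ d] = integral_0^d S(x) dx, midpoint rule
n = 100_000
h = d / n
e_x_d = sum(S((i + 0.5) * h) for i in range(n)) * h
e_x = theta / (alpha - 1)                  # Pareto mean

ler = e_x_d / e_x
ler_formula = 1 - (theta / (theta + d)) ** (alpha - 1)
assert abs(ler - ler_formula) < 1e-6
```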

Properties of Risk Measures

Translation invariance: ρ(X + c) = ρ(X) + c
Positive homogeneity: ρ(cX) = c·ρ(X) for c > 0
Subadditivity: ρ(X + Y) ≤ ρ(X) + ρ(Y)
Monotonicity: ρ(X) ≤ ρ(Y) if X ≤ Y

A coherent risk measure satisfies all 4 properties.

VaR_p fails subadditivity.
TVaR_p is coherent.
E[X] is coherent.

VaR_p(X) = π_p = F_X^{−1}(p)  (Value-at-Risk)
VaR_0.99 is the 99th percentile.
TVaR_p(X) = E[X | X > VaR_p(X)]  (Tail-Value-at-Risk)
 = ∫_{F^{−1}(p)}^∞ x·f(x) dx / (1 − p)
 = (1/(1 − p))·∫_p^1 VaR_y(X) dy
 = VaR_p(X) + e(VaR_p(X))

Distribution — VaR_p(X) — TVaR_p(X)
Normal: μ + z_p·σ;  μ + σ·φ(z_p)/(1 − p), where φ(x) = e^(−0.5x^2)/√(2π)
Lognormal: e^(μ + z_p·σ);  E[X]·Φ(σ − z_p)/(1 − p)

If given a mixture, use the survival function and solve for x.
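A sketch of the Normal row above, using only the standard library (μ, σ, p are illustrative):

```python
from statistics import NormalDist
import math

# VaR and TVaR for a Normal(mu, sigma) loss:
# VaR_p = mu + z_p*sigma,  TVaR_p = mu + sigma*phi(z_p)/(1 - p).
mu, sigma, p = 1000.0, 200.0, 0.99         # illustrative values
z = NormalDist().inv_cdf(p)                # z_p
phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)   # std normal density

var_p = mu + z * sigma
tvar_p = mu + sigma * phi / (1 - p)

assert abs(var_p - NormalDist(mu, sigma).inv_cdf(p)) < 1e-9
assert tvar_p > var_p        # TVaR always exceeds VaR for a continuous loss
```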


Maximum Covered Loss

For deductible 'd' and maximum covered loss 'u'


0 X d

EX   X  d d  X u
u  d X u

E  Payment per Loss   E  X ^ u   E  X ^ d 


u


 S ( x)dx
d

 E  X  d    E  X  u  

Policy Limit: the maximum amount the coverage will pay


- If Policy Limit = 10,000 and d = 500
- Pays 10,000 for loss of 10,500 or higher
- Pays Loss – 500 for losses b/w 500 and 10,500

Maximum Covered Loss: the maximum loss amount that is covered


- If MCL = 10,000 and d = 500
- Pays 9,500 for loss of 10,000 or higher
- Pays Loss – 500 for losses b/w 500 and 10,000
Coinsurance
E[payment per loss] = c·(E[X ∧ u] − E[X ∧ d]), where c = coinsurance, u = MCL, d = deductible

Coinsurance of 80% means that the insurance pays 80% of the costs.

Inflation
E[payment per loss] = (1 + r)·[E[X ∧ u/(1 + r)] − E[X ∧ d/(1 + r)]]

Variance of Payment per Loss with a deductible

X  Loss RV
Y L  payment per loss RV

E Y L   0  Pr  X  d   E Y P  Pr  X  d 

    
Var Y L  E Var Y L | Case   Var E Y L | Case   
     
 0  Pr  X  d   Var Y P Pr  X  d    E Y P   0  Pr  X  d   Pr  X  d  
2

 
Bonus

Pay a bonus of 50% of (500 − X) if X < 500:

B = 0.5·max(0, 500 − X)
 = 0.5·(500 − min(500, X))
 = 0.5·(500 − X ∧ 500)
E[B] = 250 − 0.5·E[X ∧ 500]

Discrete Distributions

The (a,b,0) class

Distribution — a — Variance vs. Mean
Poisson: a = 0;  Variance = Mean
Negative Binomial: a > 0;  Variance > Mean  (Geometric is NB with r = 1)
Binomial: a < 0;  Variance < Mean

Pr(N ≥ n) = (β/(1 + β))^n  Geometric distribution (memoryless)

A sum of n independent Negative Binomial random variables having the same β and parameters r_1, …, r_n has a Negative Binomial distribution with parameters β and Σ_{i=1}^n r_i.

A sum of n independent Binomial random variables having the same q and parameters m_1, …, m_n has a Binomial distribution with parameters q and Σ_{i=1}^n m_i.

p_k / p_{k−1} = a + b/k,  k = 1, 2, …
E[N] = (a + b)/(1 − a)
Var(N) = (a + b)/(1 − a)^2

C(x, n) = x(x − 1)⋯(x − n + 1)/n!
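The (a,b,0) recursion can be sanity-checked against the Poisson pmf, which is the a = 0, b = λ member of the class (λ below is illustrative):

```python
import math

# The (a,b,0) recursion p_k = (a + b/k) * p_{k-1} reproduces the Poisson
# pmf when a = 0 and b = lambda.
lam = 2.5                      # illustrative Poisson mean
a, b = 0.0, lam

p = [math.exp(-lam)]           # p_0 for Poisson
for k in range(1, 10):
    p.append((a + b / k) * p[-1])

for k in range(10):
    direct = math.exp(-lam) * lam**k / math.factorial(k)
    assert abs(p[k] - direct) < 1e-12

# (a,b,0) moment formula: E[N] = (a + b)/(1 - a) = lambda for Poisson
assert abs((a + b) / (1 - a) - lam) < 1e-12
```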

Probability Generating Functions

p_n = P^(n)(0) / n!
p_0 = P(0)
μ_(n) = P^(n)(1)  (nth factorial moment)
P'(1) = E[X]
P''(1) = E[X(X − 1)]
P'''(1) = E[X(X − 1)(X − 2)]

If given a primary and a secondary pgf, substitute the secondary pgf for z in the primary pgf.
The (a,b,1) class

p_k / p_{k−1} = a + b/k,  k = 2, 3, 4, …

Zero-Truncated Distributions

p_0^T = 0
p_k^T = p_k / (1 − p_0)

Zero-Modified Distributions

p_0^M ≥ 0
p_k^M = (1 − p_0^M)·p_k / (1 − p_0)
p_k^M = (1 − p_0^M)·p_k^T

E[Modified] = [(1 − p_0^M)/(1 − p_0)]·E[Original]
E[Modified] = (1 − p_0^M)·E[Zero-Truncated]

E[N] = cm
Var(N) = c(1 − c)m^2 + cv
c = 1 − p_0^M
m is the mean of the corresponding zero-truncated distribution
v is the variance of the corresponding zero-truncated distribution

Sibuya
ETNB with −1 < r < 0, taking the limit as β → ∞
a = 1
b = r − 1
p_1^T = −r
Poisson/Gamma

The Negative Binomial is a Gamma mixture of Poissons


N ~ Poisson(λ)
λ ~ Gamma(α, θ)

Negative Binomial r = Gamma α
Negative Binomial β = Gamma θ

Gamma
Mean = αθ
Variance = αθ^2

Negative Binomial
Mean = rβ
Variance = rβ(1 + β)

Negative Binomial with r = 1 is Geometric
Gamma with α = 1 is Exponential
Weibull with τ = 1 is Exponential

Var(Λ) = Var(N) − E[N], where Λ ~ Gamma

Coverage Modifications

Frequency Model — Original Parameters (exposure n_1, Pr(X > 0) = 1) — Exposure Modification (exposure n_2) — Coverage Modification (exposure n_1, Pr(X > 0) = v)

Poisson: λ  →  (n_2/n_1)·λ  →  v·λ
Binomial: m, q  →  (n_2/n_1)·m, q  →  m, v·q
Negative Binomial: r, β  →  (n_2/n_1)·r, β  →  r, v·β

(a,b,0) and (a,b,1) adjustments

p_0^{M*} = 1 − (1 − p_0^M)·(1 − p_0^*)/(1 − p_0), where * indicates revised parameters
Aggregate Loss Models
Compound Variance

S = aggregate losses
N = frequency RV
X = severity RV

E[S] = E[N]·E[X]
Var(S) = E[N]·Var(X) + Var(N)·E[X]^2

Can only be used when N and X are independent.

Var(S) = λ·E[X^2] if the primary distribution (# claims) is Poisson(λ)
Collective Risk vs Individual Risk

Convolution Method

pn  Pr( N  n)  f N (n)
f n  Pr( X  n)  f X (n)
g n  Pr( S  n)  f S (n)

FS ( x)  g 
n x
n
i1 ...ik  n
fi1 fi2  fik

When given severity distributions with Pr( X  0)  0


1) Modify the frequency to eliminate 0
2) Adjust the severity probabilities after removing 0
Aggregate Deductibles

Assume severity is discrete


d = stop-loss or reinsurance deductible
Assume Premium = E[S ∧ d]
E[(S − d)_+] = E[S] − E[S ∧ d] = net stop-loss premium

Method 1 – definition of E[S ∧ d] (severity on multiples of h):

E[S ∧ d] = Σ_{j=0}^{u} (hj)·g_{hj} + d·Pr(S > hu)
u = ⌈d/h⌉ − 1  (hu is the largest multiple of h less than d)

Method 2 – integrate the survival function:

E[S ∧ d] = Σ_{j=0}^{u−1} h·S(hj) + (d − hu)·S(hu)
u = ⌈d/h⌉ − 1  (hu is the largest multiple of h less than d)
To find E[S ∧ 2.8] where S takes values 0, 2, 4, 6, 8, …:

Method 1
E[S ∧ 2.8] = Pr(S = 0)·0 + Pr(S = 2)·2 + Pr(S > 2)·2.8, where Pr(S > 2) = 1 − g(0) − g(2)
E[S ∧ 4] = Pr(S = 0)·0 + Pr(S = 2)·2 + Pr(S ≥ 4)·4, where Pr(S ≥ 4) = 1 − g(0) − g(2)

Method 2
E[S ∧ 2.8] = 2·Pr(S > 0) + 0.8·Pr(S > 2)
(2 = distance between values of S; 0.8 = distance between d and the highest value of S below d)
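A sketch of the net stop-loss premium using Method 2, with an illustrative aggregate pmf on multiples of h = 2:

```python
# Net stop-loss premium E[(S - d)_+] = E[S] - E[S ^ d], Method 2,
# for an aggregate S on multiples of h = 2 (pmf below is illustrative).
ps = {0: 0.4, 2: 0.3, 4: 0.2, 6: 0.1}      # illustrative pmf of S
d = 2.8

ES = sum(s * p for s, p in ps.items())
surv = lambda x: sum(p for s, p in ps.items() if s > x)   # Pr(S > x)

# Method 2: E[S ^ 2.8] = 2*Pr(S > 0) + 0.8*Pr(S > 2)
e_s_d = 2 * surv(0) + 0.8 * surv(2)

# Direct check: E[S ^ d] = sum_s min(s, d) * Pr(S = s)
direct = sum(min(s, d) * p for s, p in ps.items())
assert abs(e_s_d - direct) < 1e-12

net_premium = ES - e_s_d
assert net_premium >= 0
```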

Aggregate Coverage Modifications

If there is a per-policy deductible and you want aggregate payments:

1) Expected payment per loss × expected number of losses per year
E[S] = E[N]·E[(X − d)_+]

OR

2) Expected payment per payment × expected number of payments per year
E[S] = E[N']·E[X − d | X > d]
where N' is the number of positive payments (frequency × Pr(X > d))

1) is better for discrete severity distributions
2) is better if severity is Exponential, Pareto, or Uniform

Exact Calculation of Aggregate Loss Distribution

Two cases for which the sum of independent random variables has a simple distribution:
1) Normal distribution. If the X_i are Normal with mean μ and variance σ^2, their sum is normal.
2) Exponential or Gamma distribution. If the X_i are exponential or gamma, their sum has a gamma distribution.
Normal Distributions

If n random variables X_i are independent and normally distributed with parameters μ and σ^2, their sum is normally distributed with parameters nμ and nσ^2.

Calculate Pr(S ≤ c | n = 1) using μ and σ^2.
Calculate Pr(S ≤ c | n = 2) using 2μ and 2σ^2, etc.

Then multiply each of these probabilities by its respective p_1, p_2, etc.

Exponential and Gamma Distributions

The sum of n exponential random variables with common mean θ is a Gamma distribution with parameters α = n and θ. When a gamma distribution's α parameter is an integer, the gamma distribution is called an Erlang distribution.

In a Poisson process with parameter λ, the number of events occurring by time t has a Poisson distribution with mean λt.

In a Poisson process with parameter λ, the time between events follows an exponential distribution with mean 1/λ.

In a Poisson process with parameter 1/θ, the time between events is exponential with mean θ. Therefore, the time until the nth event occurs is Erlang with parameters n and θ.

The probability of exactly n events occurring before time x is e^(−x/θ)·(x/θ)^n / n!

Gamma CDF (α = n an integer)

F_X(x) = 1 − Σ_{j=0}^{n−1} e^(−x/θ)·(x/θ)^j / j!

If α = 1: F(x) = 1 − e^(−x/θ)
If α = 2: F(x) = 1 − e^(−x/θ) − (x/θ)·e^(−x/θ)
 
Empirical Models

Bias

bias(θ̂) = E[θ̂ | θ] − θ  (θ̂ is the estimator, θ is the parameter being estimated)

Bias is the expected value of the estimator minus the true value.
An estimator is unbiased if bias(θ̂) = 0 for all θ.
An estimator is asymptotically unbiased if lim_{n→∞} bias(θ̂) = 0.

The sample variance (with division by n − 1) is an unbiased estimator of the variance.
The sample mean is an unbiased estimator of the true mean.

Consistency (weak consistency)

Definition: an estimator is consistent if lim_{n→∞} Pr(|θ̂_n − θ| < ε) = 1 for all ε > 0

1) An estimator is consistent if it is asymptotically unbiased and Var(θ̂_n) → 0 as n → ∞
2) The MLE is consistent (under the usual regularity conditions)
3) If MSE(θ̂) → 0, then θ̂ is consistent

Mean Square Error

MSE(θ̂) = E[(θ̂ − θ)^2 | θ]
MSE(θ̂) = Var(θ̂) + bias(θ̂)^2

MSE is a function of the true value of the parameter.


Complete Data

Grouped Data

f_n(x) = n_j / [n·(c_j − c_{j−1})], where x is in (c_{j−1}, c_j]
n_j = # points in the interval
n = total points

f_n(x) → histogram
F_n(x) → ogive

To find the 2nd raw moment, calculate ∫ f_n(x)·x^2 dx

If there is a policy limit (say 8000) and the top interval is (5000, 10000], then E[(X ∧ 8000)^2] would have as its last two terms:

∫_5000^8000 f_n(x)·x^2 dx + ∫_8000^10000 f_n(x)·8000^2 dx

Variance of Empirical Estimators

Binomial:
Variance = mq(1 − q)
If Y = X/m (binomial proportion), Variance = q(1 − q)/m

Multinomial:
Variance = mq_i(1 − q_i)  (i = category)
Covariance = −mq_i·q_j
If Y = X/m: Variance = q_i(1 − q_i)/m;  Covariance = −q_i·q_j/m
Individual Data

S  x  1  S  x  
Var  Sn  x    if S is known
n
Sn  x  1  Sn  x  
Var  Sn  x   
n
nx  n  nx 
Var  Sn  x    3
where nx is the # of survivors past time x
n
 nx  n y  n y
  
Var y  x p x | nx  Var y  x q x | nx  
nx3
The empirical estimators of S ( x) and f ( x) are unbiased

Individual Data

1) Determine the estimator


2) Determine what's random and what's not
3) Write an expression for the estimator with symbols for random variables
4) Calculate the variance of the random variables
5) Calculate the variance for the whole expression
(i.e. Var ( aX  bY ), Var ( aX ), etc.)
Kaplan-Meier Product Limit

S_n(t) = Π_{i=1}^{j} (1 − s_i/r_i) for t in [y_j, y_{j+1})

Exponential extrapolation: S(t) = S(t_0)^(t/t_0) for t > t_0, where t_0 is the end of the study

r_i = risk set
s_i = deaths
d_i = entry time
u_i = withdrawal time
x_i = death time

S(x⁻) = Pr(X ≥ x)
Shortcut: 1/n + 1/(n + 1) ≈ 2/(n + 0.5)

Nelson-Aalen Estimator

Ĥ(t) = Σ_{i=1}^{j} s_i/r_i for t in [y_j, y_{j+1})

Ŝ(x) = e^(−Ĥ(x))

Lives that leave at the same time as a death are in the risk set.
Lives that arrive at the same time as a death are not in the risk set.
Censored lives are in the risk set but are not counted as deaths

k = # distinct data points
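The two estimators can be sketched side by side; the (death time, deaths, risk set) triples below are illustrative:

```python
import math

# Kaplan-Meier and Nelson-Aalen estimates from a tiny mortality study.
# Each tuple is (death time y_j, deaths s_j, risk set r_j) — illustrative.
points = [(1.0, 1, 10), (2.0, 2, 8), (5.0, 1, 5)]

S_km = 1.0
H_na = 0.0
for y, s, r in points:
    S_km *= 1 - s / r          # product-limit factor
    H_na += s / r              # cumulative hazard increment

S_na = math.exp(-H_na)         # S-hat implied by Nelson-Aalen

# Nelson-Aalen survival is always >= Kaplan-Meier at the same time
assert S_na >= S_km
```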

Confidence Intervals

For S(x), the boundaries must be between 0 and 1.


For H(x), the boundaries can be anything.
Calculator Shortcuts

1) Enter s_i/r_i in column 1
2) Enter the formula ln(1 − L1) for column 2
3) Select L1 as the first variable and L2 as the second
4) Calculate e^(−Σx) for Nelson-Aalen and e^(Σy) for Kaplan-Meier

Estimation of Related Quantities

If using Kaplan-Meier or Nelson-Aalen methods, E(X^d) = area under the curve of S(x). Multiply
the base x height of each of the rectangles.

Bayes Theorem

P(A_1 | E) = P(E | A_1)·P(A_1) / [P(E | A_1)·P(A_1) + P(E | A_2)·P(A_2) + … + P(E | A_n)·P(A_n)]
Greenwood's Approximation of Variance (Kaplan-Meier)

Var(S_n(t)) ≈ S_n(t)^2 · Σ_{y_j ≤ t} s_j / [r_j·(r_j − s_j)]

Var(S_n(t)) ≈ S_n(x)·(1 − S_n(x))/n
if the data are complete (no censoring or truncation)

Variance of Nelson-Aalen Estimator

Var(Ĥ(t)) ≈ Σ_{y_j ≤ t} s_j / r_j^2

Linear Confidence Intervals

( S_n(t) − z_{(1+p)/2}·√Var(S_n(t)),  S_n(t) + z_{(1+p)/2}·√Var(S_n(t)) )

Log-transformed confidence interval for S(t)

( S_n(t)^(1/U), S_n(t)^U ), where U = exp( z_{(1+p)/2}·√Var(S_n(t)) / [S_n(t)·ln S_n(t)] )

Log-transformed confidence interval for H(t)

( Ĥ(t)/U, Ĥ(t)·U ), where U = exp( z_{(1+p)/2}·√Var(Ĥ(t)) / Ĥ(t) )
 
Kernel Smoothing

Uniform
k_{x_i}(x) = 1/(2b) for x_i − b ≤ x ≤ x_i + b;  0 otherwise
K_{x_i}(x) = 0 for x < x_i − b
K_{x_i}(x) = (x − (x_i − b))/(2b) for x_i − b ≤ x ≤ x_i + b
K_{x_i}(x) = 1 for x > x_i + b

f̂(x) = Σ_{i=1}^n f_n(x_i)·k_{x_i}(x)
F̂(x) = Σ_{i=1}^n f_n(x_i)·K_{x_i}(x)
f_n(x_i) = empirical probability of sample point x_i (1/n for individual data)

x_i is a sample point; x is the estimation point.
The kernel distribution K_{x_i}(x) is 1 for observation points more than one bandwidth to the left of x, and 0 for observation points more than one bandwidth to the right.
ex) With b = 5: K_6(13) = 1 and K_25(13) = 0
ex) To find K_12(11), linearly interpolate between K_12(7) = 0 and K_12(17) = 1

Triangular
Height of triangle is 1/b
Base of triangle is 2b

Expected Values
E[X | Y] = Y  (Y is the original random variable)
The mean of the smoothed distribution is the same as the original mean: E[X] = E[Y]

Uniform kernel: Var(X) = Var(Y) + b^2/3
Triangular kernel: Var(X) = Var(Y) + b^2/6
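A minimal sketch of the uniform-kernel density estimate and the variance result above (the sample and bandwidth are illustrative):

```python
# Uniform-kernel density estimate and the variance inflation
# Var(X) = Var(Y) + b^2/3.  Sample and bandwidth are illustrative.
data = [2.0, 5.0, 5.0, 9.0]
b = 1.5                                   # bandwidth
n = len(data)

def f_hat(x):
    # each sample point contributes 1/n times a Uniform(x_i - b, x_i + b) density
    return sum(1 / (2 * b) for xi in data if xi - b <= x <= xi + b) / n

mean_y = sum(data) / n
var_y = sum((xi - mean_y) ** 2 for xi in data) / n
var_smoothed = var_y + b**2 / 3           # uniform-kernel result

assert abs(f_hat(5.0) - 2 / (n * 2 * b)) < 1e-12   # two points cover x = 5
assert var_smoothed > var_y                        # smoothing adds variance
```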
Approximations for Large Data Sets

d_j is the # of left-truncated observations (new entrants) in [c_j, c_{j+1})
u_j is the # of right-censored observations in (c_j, c_{j+1}]
x_j is the # of losses in (c_j, c_{j+1}]
r_j is the risk set in (c_j, c_{j+1}]
q'_j is the decrement rate in (c_j, c_{j+1}]:
q'_j = x_j / r_j

P_{j+1} = P_j + d_j − u_j − x_j

v_j = # withdrawals
w_j = # survivors
u_j = v_j + w_j

All entries/withdrawals at endpoints:
r_j = P_j + d_j

All entries/withdrawals uniformly distributed:
r_j = P_j + 0.5·(d_j − v_j)
Multiple Decrements

p^(τ) = p'^(1) · p'^(2) · …

₃p_t'^(x) = (1 − x_t/r_t)·(1 − x_{t+1}/r_{t+1})·(1 − x_{t+2}/r_{t+2})
₃q_t'^(x) = 1 − ₃p_t'^(x)
Parametric Models
Method of Moments

m = Σ_{i=1}^n x_i / n  (first sample moment)
t = Σ_{i=1}^n x_i^2 / n  (second sample moment)

Distribution — Formulas
Exponential: θ̂ = m
Gamma: θ̂ = (t − m^2)/m;  α̂ = m^2/(t − m^2)
Pareto: α̂ = (2t − 2m^2)/(t − 2m^2);  θ̂ = mt/(t − 2m^2)
Lognormal: μ̂ = 2·ln(m) − 0.5·ln(t);  σ̂^2 = ln(t) − 2·ln(m)
Uniform on (0, θ): θ̂ = 2m

When they don't specify which moments to use, use the first k moments, where k is the number of parameters you're fitting.

For an inverse exponential, work with the reciprocals: 1/X is exponential, so match the sample mean of the 1/x_i.
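A sketch of the gamma method-of-moments fit, checking that the fitted distribution reproduces both sample moments (the sample is illustrative):

```python
# Method-of-moments fit of a gamma: theta = (t - m^2)/m, alpha = m^2/(t - m^2),
# where m and t are the first two sample raw moments.  Sample is illustrative.
data = [2.0, 3.0, 5.0, 8.0, 12.0]
n = len(data)
m = sum(data) / n
t = sum(x * x for x in data) / n

theta_hat = (t - m * m) / m
alpha_hat = m * m / (t - m * m)

# fitted gamma reproduces the sample mean and (biased) sample variance
assert abs(alpha_hat * theta_hat - m) < 1e-12              # mean = alpha*theta
assert abs(alpha_hat * theta_hat**2 - (t - m * m)) < 1e-12 # var = alpha*theta^2
```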

Percentile Matching

π̂_p = x_((n+1)p) if (n + 1)p is an integer.
Otherwise compute (n + 1)·p and interpolate between the two surrounding order statistics.

The smoothed empirical percentile is not defined if (n + 1)p is less than 1 or greater than n.
Maximum Likelihood

Type of Data — Likelihood Factor
Discrete distribution, individual data: p_x
Continuous distribution, individual data: f(x)
Grouped data: F(c_j) − F(c_{j−1})
Individual data censored from above at u: 1 − F(u) for censored observations
Individual data censored from below at d: F(d) for censored observations
Individual data truncated from above at u: f(x)/F(u)
Individual data truncated from below at d: f(x)/(1 − F(d))

Cases where MLE = Method of Moments Estimator


(if no censored or truncated data)

Distribution Result
Exponential MLE = MoM
Gamma MLE = MoM if fixed 
Normal MLE = MoM
Poisson MLE = MoM
Negative Binomial MLE = MoM if r is known
Binomial MLE = MoM if m is known

If the MLE is the sample mean, the variance of the MLE is the variance of the distribution divided by n: Var(X̄) = Var(X)/n.

Common Likelihood Functions

Likelihood Function — MLE
L(θ) = θ^a·e^(−bθ)  →  θ̂ = a/b
L(θ) = θ^(−a)·e^(−b/θ)  →  θ̂ = b/a
L(θ) = θ^a·(1 − θ)^b  →  θ̂ = a/(a + b)
ab
MLE Formulas

Distribution — Formula — CT?

Exponential: θ̂ = Σ_{i=1}^{n+c} (x_i − d_i) / n — Yes

Lognormal: μ̂ = Σ ln x_i / n;  σ̂^2 = Σ (ln x_i − μ̂)^2 / n — No

Inverse Exponential: θ̂ = n / Σ (1/x_i) — No

Weibull, fixed τ: θ̂ = [ (Σ_{i=1}^{n+c} x_i^τ − Σ_{i=1}^{n+c} d_i^τ) / n ]^(1/τ) — Yes

Uniform on (0, θ), individual data: θ̂ = max(x_i) — No

Uniform on (0, θ), grouped data, some observations censored at a single point:
θ̂ = c_j·(n/n_j), where c_j = upper bound of the highest finite interval and n_j = number of observations below c_j; there must be at least one observation above c_j — No

Uniform on (0, θ), grouped data, all groups bounded:
θ̂ = min of 1) UB of the highest interval with data, and 2) LB of the highest interval with data × (n/n_j) — No

Two-parameter Pareto, fixed θ: α̂ = n/K,
K = Σ_{i=1}^{n+c} ln(θ + x_i) − Σ_{i=1}^{n+c} ln(θ + d_i) — Yes

One-parameter Pareto, fixed θ: α̂ = n/K,
K = Σ_{i=1}^{n+c} ln x_i − Σ_{i=1}^{n+c} ln max(θ, d_i) — Yes

Beta, fixed θ, b = 1: â = −n/K,  K = Σ_{i=1}^n ln x_i − n·ln θ — No

Beta, fixed θ, a = 1: b̂ = −n/K,  K = Σ_{i=1}^n ln(θ − x_i) − n·ln θ — No

n = # of uncensored observations
c = # of censored observations
d = truncation point
x = observation if uncensored, or the censoring point if censored
CT = formula can be used for left-truncated or right-censored data
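The exponential row of the table can be checked against a numerical maximization of the loglikelihood; the data tuples below (truncation point, observation, censoring flag) are illustrative:

```python
import math

# Exponential MLE with left truncation and right censoring:
# theta-hat = sum(x_i - d_i) / n, checked by a grid search over the
# loglikelihood.  Each tuple is (d, x, censored?) — illustrative data.
obs = [
    (0.0, 3.0, False),
    (0.0, 7.0, False),
    (2.0, 6.0, False),     # left-truncated at 2
    (0.0, 10.0, True),     # right-censored at 10
]
n = sum(1 for _, _, c in obs if not c)           # uncensored count
theta_hat = sum(x - d for d, x, _ in obs) / n    # table formula

def loglik(theta):
    ll = 0.0
    for d, x, censored in obs:
        if censored:
            ll += -(x - d) / theta                       # ln[S(x)/S(d)]
        else:
            ll += -math.log(theta) - (x - d) / theta     # ln[f(x)/S(d)]
    return ll

# grid search around theta_hat; the maximizer should agree with the formula
grid = [theta_hat * (0.5 + 0.01 * i) for i in range(101)]
best = max(grid, key=loglik)
assert abs(best - theta_hat) < 0.01 * theta_hat + 1e-9
```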

Bernoulli Technique

Whenever there is one parameter and only 2 classes of observations, maximum likelihood will
assign each class the observed frequency, and you can then solve for the parameter.

If X can take only 2 values (a or b):

P̂(X = a) = (# data points equal to a) / (# data points)
Reasons to Use Maximum Likelihood

1) Method of Moments and Percentile Matching only use a limited number of features from the
sample.

2) Method of moments are hard to use with combined data.

3) Method of moments and percentile matching cannot always handle truncation and censoring.

4) Method of moments and percentile matching require arbitrary decisions on which moments
or percentiles to use.

Reasons NOT to Use Maximum Likelihood

1) There’s no guarantee that the likelihood can be maximized – it can go to infinity.

2) There may be more than one maximum.

3) There may be local maxima in addition to the global maximum; these must be avoided.

4) It may not be possible to find the maximum by setting the partial derivatives to zero; a
numerical algorithm may be necessary.
Fisher's Information

1 Parameter

I(θ) = −E[ d²l(θ)/dθ² ]  (Fisher's information)

Var(θ̂) = 1 / I(θ)

2 Parameters

I(α, θ) = −E[ [ ∂²l/∂α²   ∂²l/∂α∂θ ;
                ∂²l/∂θ∂α  ∂²l/∂θ²  ] ]

Var(α̂, θ̂) = I(α, θ)^(−1) = [ Var(α̂)      Cov(α̂, θ̂) ;
                               Cov(α̂, θ̂)  Var(θ̂)     ]

Inverting a Matrix

[ a b ; c d ]^(−1) = 1/(ad − bc) · [ d −b ; −c a ]

Delta Method

Var(g(X)) ≈ (dg/dx)²·Var(X)  (1 variable)

Var(g(X, Y)) ≈ (∂g/∂x)²·Var(X) + 2·(∂g/∂x)(∂g/∂y)·Cov(X, Y) + (∂g/∂y)²·Var(Y)  (2 variables)

Var(g(X)) ≈ (∂g)ᵀ·Σ·(∂g)  (general)
where ∂g = (∂g/∂x_1, …, ∂g/∂x_k)ᵀ and Σ is the covariance matrix

Take the derivative with respect to the unknown variable and evaluate at the estimate.
Fitting Discrete Distributions

Distribution — Method of Moments — MLE
Poisson: λ̂ = x̄;  λ̂ = x̄
Negative Binomial: r̂ = x̄²/(σ̂² − x̄);  β̂ = (σ̂² − x̄)/x̄  (so r̂·β̂ = x̄)
σ̂² = sample variance (divide by n)
Binomial: q̂ = x̄/m

Choosing between (a,b,0) distributions to fit the data:

1) Compare the sample variance σ̂² to the sample mean x̄:
Poisson: Variance = Mean
Negative Binomial: Variance > Mean
Binomial: Variance < Mean

2) Calculate k·n_k/n_{k−1} and observe the slope as a function of k:
If the ratios are increasing, then a > 0
Poisson → zero slope
Negative Binomial → positive slope
Binomial → negative slope
n_k is the # of policies/observations with k claims

The variance of a mixture is always at least as large as the weighted average of the variances of the components, and usually greater, due to:

Var(X) = E[Var(X | I)] + Var(E[X | I])
(variance of mixture) ≥ (weighted average of variances)
Asymptotic Variance of MLEs

Distribution — Formula
Exponential: Var(θ̂) = θ²/n
Uniform(0, θ): Var(θ̂) = nθ² / [(n + 1)²(n + 2)]
Weibull, fixed τ: Var(θ̂) = θ²/(nτ²)
Pareto, fixed θ: Var(α̂) = α²/n
Pareto, fixed α: Var(θ̂) = θ²(α + 2)/(nα)
Lognormal: Var(μ̂) = σ²/n;  Cov(μ̂, σ̂) = 0;  Var(σ̂) = σ²/(2n)
Poisson: Var(λ̂) = λ/n

Var(X̄) = Var(X)/n
Hypothesis Tests – Graphic Comparison

F*(x) = [F(x) − F(d)] / [1 − F(d)]
f*(x) = f(x) / [1 − F(d)]

D(x) plots
D(x) = F_n(x) − F*(x)
 = empirical − fitted
The empirical calculation uses a denominator of n.
If D(x) > 0, then F_n(x) > F*(x): more data below x than predicted by the model.
If D(x) < 0, then F_n(x) < F*(x): less data below x than predicted by the model.
If the data are truncated at d, D(d) = 0.
Every vertical jump has distance 1/n.

p–p plots
On the horizontal axis, one point at every multiple of 1/(n + 1).
Domain and range of the graph are [0, 1].
Points are (F_n(x_j), F*(x_j)).
The first data point corresponds to the smallest sample value.
If the slope is less than 45°, the fitted distribution is putting too little weight in that region.
If the slope is more than 45°, the fitted distribution is putting too much weight in that region.

Don't plot censored values

Kolmogorov-Smirnov Test

D = max_{d ≤ x ≤ u} |F_n(x) − F*(x)|
The max occurs right before or right after a jump.

If D < critical value, do not reject H_0 (null hypothesis).

Having censored data lowers D, and the critical value should also be lowered.
Critical values decrease toward 0 as n → ∞.

If the data are X = 2000, 4000, 5000, 5000 and the 5000 values are right-censored,
then F_4(5000⁻) = F_4(5000) = 0.5.
Anderson-Darling Test

A² = −n·F*(u) + n·Σ_{j=0}^{k} S_n(y_j)²·[ln S*(y_j) − ln S*(y_{j+1})]
 + n·Σ_{j=1}^{k} F_n(y_j)²·[ln F*(y_{j+1}) − ln F*(y_j)]

n includes censored observations.

Chi-Square

Q = Σ_{j=1}^{k} (O_j − E_j)² / E_j

Q = Σ_{j=1}^{k} O_j² / E_j − n

O_j = # of observations in each group
E_j = n·p_j (expected # of observations in each group)
n = total # of observations

Each group should have at least 5 expected observations. If not, you have to combine some groups.

Degrees of Freedom
k − 1 DoF: distribution and parameters are given, or parameters were estimated from different data
k − 1 − r DoF: parameters are fitted from the data
r = # of fitted parameters
k = # of groups

Independent Periods

Q = Σ_{j=1}^{k} (O_j − E_j)² / V_j, where V_j is the variance
Degrees of freedom = k − p  (p is the number of estimated parameters)
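The two algebraic forms of Q above agree; a minimal sketch with illustrative counts and fitted probabilities:

```python
# Chi-square statistic in both forms:
# Q = sum (O_j - E_j)^2 / E_j = sum O_j^2 / E_j - n.
observed = [30, 25, 25, 20]            # illustrative group counts
probs = [0.3, 0.3, 0.2, 0.2]           # illustrative fitted probabilities
n = sum(observed)
expected = [n * p for p in probs]      # E_j = n * p_j

q1 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
q2 = sum(o * o / e for o, e in zip(observed, expected)) - n
assert abs(q1 - q2) < 1e-9

# every expected count should be at least 5 before applying the test
assert all(e >= 5 for e in expected)
```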
Comparison of the tests:

Kolmogorov-Smirnov
- Individual data; continuous fits
- If there is censored data, the critical value should be lowered
- If parameters are fitted, the critical value should be lowered
- Larger sample size makes the critical value decline
- No discretion in grouping of data
- Uniform weight on all parts of the distribution

Anderson-Darling
- Individual data; continuous fits
- If there is censored data, the critical value should be lowered
- If parameters are fitted, the critical value should be lowered
- Critical value independent of sample size
- No discretion in grouping of data
- Higher weight on the tails of the distribution

Chi-square
- Individual or grouped data; continuous or discrete fits
- If there is censored data, no adjustment of the critical value
- If parameters are fitted, the critical value adjusts automatically
- Critical value independent of sample size
- Discretion in grouping of data
- Higher weight on intervals with low fitted probability

Loglikelihood
- Individual or grouped data; continuous or discrete fits
- If there is censored data, no adjustment of the critical value
- If parameters are fitted, the critical value adjusts automatically
- Critical value independent of sample size

Type I error: rejecting H_0 when it is true.

Type II error: failing to reject H_0 when the alternative H_1 is true.
Likelihood Ratio Algorithm

If parameters are added to the model, the new model will have a loglikelihood at least as great.

The # DoF for the likelihood ratio test is the number of free parameters in the alternative model
minus the number of free parameters in the base model (null hypothesis).

Compare 2(LogL_1 − LogL_2) to the critical value at the selected chi-square percentile and DoF.
LogL_1 = alternative-model loglikelihood (which will be higher)
LogL_2 = base-model loglikelihood
If 2(LogL_1 − LogL_2) > critical value, accept the alternative hypothesis.

Start by comparing the best 2-parameter model to the best 1-parameter model.

If 2(LogL_1 − LogL_2) < critical value (it fails), compare the best 3-parameter distributions to the best 1-parameter.
If 2(LogL_1 − LogL_2) > critical value (it passes), compare 3-parameter distributions to the best 2-parameter.

Schwarz-Bayesian Criterion

LogL − (r/2)·ln n
where r is the # of parameters
and n is the # of data points
The distribution with the highest penalized loglikelihood is selected.
Credibility

e_F = n_0·CV²

n_0 = (y_p / k)²

y_p = coefficient from the standard normal = Φ^(−1)((1 + p)/2)

Given y_p, P% = 100·(2Φ(y_p) − 1) is the percent corresponding to y_p.
k = maximum fluctuation you will accept (e.g., within 5%)

Limited Fluctuation Credibility: Poisson Frequency

All λ must be the same!

1 + CV_s² = E[X²] / E[X]²

Credibility standard, by what the experience is expressed in
(columns: Number of Claims | Claim Size (Severity) | Aggregate Losses/Pure Premium):

Exposure units (e_F):  n_0/λ  |  n_0·CV_s²/λ  |  n_0·(1 + CV_s²)/λ
Number of claims (n_F):  n_0  |  n_0·CV_s²  |  n_0·(1 + CV_s²)
Aggregate losses (s_F):  n_0·μ_s  |  n_0·μ_s·CV_s²  |  n_0·μ_s·(1 + CV_s²)

Pure premium is the expected aggregate loss per policyholder per time period.
Limited Fluctuation Credibility: Non-Poisson Frequency

e_F = n_0·CV_s²
n_F = e_F·μ_f

Credibility standard, by what the experience is expressed in
(columns: Number of Claims | Claim Size (Severity) | Aggregate Losses/Pure Premium):

Exposure units (e_F):  n_0·σ_f²/μ_f²  |  n_0·σ_s²/(μ_f·μ_s²)  |  n_0·(σ_f²/μ_f² + σ_s²/(μ_f·μ_s²))
Number of claims (n_F):  n_0·σ_f²/μ_f  |  n_0·σ_s²/μ_s²  |  n_0·(σ_f²/μ_f + σ_s²/μ_s²)
Aggregate losses (s_F):  n_0·μ_s·σ_f²/μ_f  |  n_0·σ_s²/μ_s  |  n_0·μ_s·(σ_f²/μ_f + σ_s²/μ_s²)

# of insureds is exposure.

Partial Credibility

P_C = Z·X̄ + (1 − Z)·M
P_C = M + Z·(X̄ − M)
P_C = credibility premium
M = manual premium
Z = credibility
X̄ = observed mean

Z = √(n / n_F)
n = expected claims
n_F = number of expected claims needed for full credibility
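A sketch of the full- and partial-credibility calculations for the classic P = 90%, k = 5% standard (manual premium and observed mean are illustrative):

```python
from statistics import NormalDist

# Full-credibility standard n_0 = (y_p/k)^2 assuming Poisson frequency,
# then partial credibility Z = sqrt(n/n_F).
p, k = 0.90, 0.05
y_p = NormalDist().inv_cdf((1 + p) / 2)        # approx 1.645 for 90%
n0 = (y_p / k) ** 2                            # expected claims for full cred.

assert abs(n0 - 1082.41) < 0.5                 # the classic ~1082 standard

# partial credibility with n = 500 expected claims (illustrative)
n = 500
Z = min(1.0, (n / n0) ** 0.5)
M, xbar = 1000.0, 1200.0                       # manual premium, observed mean
Pc = Z * xbar + (1 - Z) * M
assert M < Pc < xbar                           # premium is a weighted average
```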
Bayesian Credibility

Bayesian Methods – Discrete Prior

Row — Class 1 — Class 2
1) Prior probabilities
2) Likelihood of experience
3) Joint probabilities: product of rows 1 and 2, per class
4) Posterior probabilities: row 3 divided by the sum of row 3
5) Hypothetical means
6) Bayesian premium: products of rows 4 and 5, summed across classes

Bayesian premium is the predicted expected value of the next trial.

"# claims is Bernoulli" means at most 1 claim can occur.

Bayesian Methods – Continuous Prior

π(θ | x1, …, xn) = π(θ) f(x1, …, xn | θ) / ∫ π(θ) f(x1, …, xn | θ) dθ    (Posterior Density)

Limits of integration are according to the prior distribution.

f(x_{n+1} | x1, …, xn) = ∫ f(x_{n+1} | θ) π(θ | x1, …, xn) dθ    (Predictive Density)

π(θ) is the prior density
π(θ | x1, …, xn) is the posterior density
f(x1, …, xn | θ) is the likelihood function of the data given θ (conditional)

f(x1, …, xn | θ) = ∏_{i=1}^{n} f(xi | θ)

f(x1, …, xn) is the unconditional joint density function

f(x1, …, xn) = ∫ f(x1, …, xn | θ) π(θ) dθ



∫₀^∞ t^n e^{−t/θ} dt = n! θ^{n+1}
Bayesian Credibility: Poisson/Gamma

N ~ Poisson(λ)
λ ~ Gamma(α, θ), with γ = 1/θ

α* = α + number of claims
γ* = γ + number of exposures

P_C = α* / γ*

Posterior: Gamma(α*, θ* = 1/γ*)
Posterior mean is the avg. # claims/policy

Predictive: Negative Binomial (r = α*, β = θ*)
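A minimal sketch of the Poisson/Gamma update, with made-up prior parameters and experience:

```python
# Poisson/Gamma conjugate update (all numbers hypothetical).
alpha, theta = 3.0, 0.5        # prior Gamma(α, θ); γ = 1/θ
claims, exposures = 10, 20

gamma_star = 1 / theta + exposures   # γ* = γ + exposures
alpha_star = alpha + claims          # α* = α + claims
theta_star = 1 / gamma_star          # posterior is Gamma(α*, θ*)

P_C = alpha_star * theta_star        # posterior mean = credibility premium
```

Here γ* = 22 and α* = 13, so the premium is 13/22 ≈ 0.591 claims per policy.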

Bayesian Credibility: Normal/Normal

X ~ Normal(θ, v)
θ ~ Normal(μ, a)

x̄ = Observed Average
n = Exposure

μ* = (vμ + anx̄) / (v + an)    (Posterior Mean)
a* = va / (v + an)            (Posterior Variance)

Predictive Mean = μ*
Predictive Variance = v + a*
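The Normal/Normal update in code, with hypothetical values for v, μ, a, and the data:

```python
# Normal/Normal conjugate update (all numbers hypothetical).
v, mu, a = 100.0, 50.0, 25.0   # process variance v, prior Normal(μ, a)
n, x_bar = 8, 56.0             # exposures and observed average

mu_star = (v * mu + a * n * x_bar) / (v + a * n)   # posterior mean
a_star = v * a / (v + a * n)                       # posterior variance
pred_var = v + a_star                              # predictive variance
```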
Bayesian Credibility: Lognormal/Normal

X ~ Lognormal(θ, v)
θ ~ Normal(μ, a)

Find x̄ = (Σ ln xi) / n    (average of the logged data)

μ* = (vμ + anx̄) / (v + an)    (Posterior Mean)
a* = va / (v + an)            (Posterior Variance)

E[X | θ] = e^{θ + 0.5v}, so E[X] = E[e^θ] e^{0.5v}

Predictive mean = e^{μ* + 0.5a* + 0.5v}

Bayesian Credibility: Bernoulli/Beta

Probability of a claim = q
q ~ Uniform(0, 1)

Uniform is a special case of the Beta distribution with a = b = 1 and θ = 1

k = # claims
n = exposure
Beta density: f(x) = C x^{a−1} (1 − x)^{b−1}

a* = a + claims
b* = b + (exposures − claims)     → Plug into Posterior Distribution

E[q | x] = a* / (a* + b*)

If m > 1 (Binomial with m trials), treat as a series of m Bernoullis.

If exposure is 2 years, treat as 2m Bernoullis:
1) n = 2m
2) Predicted claims = m · a* / (a* + b*)

Γ(x + 1) = x Γ(x)
Bayesian Credibility: Exponential/Inverse Gamma

f(x | θ) = (1/θ) e^{−x/θ}    (Exponential)

π(θ) = β^α e^{−β/θ} / (Γ(α) θ^{α+1})    (Inverse Gamma)

α* = α + n
β* = β + nx̄

E(next loss) = β* / (α* − 1)

If f(x) is Gamma (shape k, scale θ) instead of Exponential:

f(x | θ) = x^{k−1} e^{−x/θ} / (Γ(k) θ^k)

α* = α + nk
β* = β + nx̄

Loss Functions

For the squared-error loss function (minimizing MSE):

l(θ̂, θ) = (θ̂ − θ)²

Bayesian Point Estimate is the mean of the posterior distribution

For the loss function minimizing the Absolute Value of the Error:

l(θ̂, θ) = |θ̂ − θ|

Bayesian Point Estimate is the median of the posterior distribution

For the zero-one loss function:

Bayesian Point Estimate is the mode of the posterior distribution
Buhlmann Credibility

Buhlmann Credibility: Basics

μ = E[μ(Θ)]      (Overall Mean = Expected Value of the Hypothetical Means)
v = E[v(Θ)]      (Expected Value of the Process Variances)
a = Var(μ(Θ))    (Variance of the Hypothetical Means)
a + v = Overall Variance

For Poisson frequency, HM = PV

Buhlmann's k:
k = v / a

Buhlmann's Credibility Z:
Z = n / (n + k) = na / (na + v)

n = # periods when studying frequency or aggregate losses
n = # claims when studying severity

If given 2 classes and there are multiple groups within each class,
you must find the mean and variance of each group separately.
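The Bühlmann structural parameters and Z can be computed for a small discrete example (the two classes below are hypothetical, with Poisson frequency so HM = PV):

```python
# Buhlmann credibility for a two-class example (numbers made up).
prior = [0.5, 0.5]
hyp_means = [1.0, 3.0]       # μ(θ) per class
proc_vars = [1.0, 3.0]       # v(θ) per class (Poisson: HM = PV)

mu = sum(p * m for p, m in zip(prior, hyp_means))            # overall mean
v = sum(p * pv for p, pv in zip(prior, proc_vars))           # EPV
a = sum(p * m**2 for p, m in zip(prior, hyp_means)) - mu**2  # VHM

n, x_bar = 4, 2.5            # periods observed and their mean
Z = n / (n + v / a)          # Z = n / (n + k), k = v/a
estimate = Z * x_bar + (1 - Z) * mu
```

Here μ = 2, v = 2, a = 1, so k = 2, Z = 4/6, and the estimate is 7/3 ≈ 2.33 — between the overall mean and the observation, as it must be.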

Buhlmann Credibility: Continuous

Given a distribution function and the prior function:

1) Use the distribution function to get the HM and PV
2) Find v and a using the prior distribution

Model: Poisson(λ)    Prior: Gamma(α, θ)
  Posterior: Gamma with α* = α + claims, 1/θ* = 1/θ + exposures
  Predictive: Negative Binomial with r = α*, β = θ*
  Buhlmann v = αθ    Buhlmann a = αθ²

Model: Bernoulli(q)    Prior: Beta(a, b)
  Posterior: Beta with a* = a + claims, b* = b + exposures − claims
  Predictive: Bernoulli with q = E[q | x] = a*/(a* + b*)
  Buhlmann v = ab / [(a + b)(a + b + 1)]    Buhlmann a = ab / [(a + b)²(a + b + 1)]

Model: Normal(θ, v)    Prior: Normal(μ, a)
  Posterior: Normal with μ* = (vμ + anx̄)/(v + an), a* = va/(v + an)
  Predictive: Normal with mean μ* and variance v + a*
  Buhlmann v = v    Buhlmann a = a

Model: Exponential(θ)    Prior: Inverse Gamma(α, β)
  Posterior: Inverse Gamma with α* = α + n, β* = β + nx̄
  Predictive: Pareto(α*, β*)
  Buhlmann v = β² / [(α − 1)(α − 2)]    Buhlmann a = β² / [(α − 1)²(α − 2)]
Exact Credibility

If you have a conjugate pair and are asked for a Buhlmann estimate, use the Bayesian estimate — for these pairs the two are equal.

Buhlmann as Least Squares Estimate of Bayes

E[Initial probabilities × Outcomes] = E[Initial probabilities × Bayesian Estimates]

Ŷi = α + β Xi

β = Cov(X, Y) / Var(X) = Z
α = (1 − Z) E[X]

Var(X) = Σ pi Xi² − E[X]²
Cov(X, Y) = Σ pi Xi Yi − E[X] E[Y]

where X are the initial outcomes and Y are the Bayesian estimates

Buhlmann Predictions

P_C(0) = (1 − Z) E[X]    (first observation = 0)
P_C(2) = (1 − Z) E[X] + 2Z
P_C(8) = (1 − Z) E[X] + 8Z

Graphics Questions
1) The Bayesian predictions must be within the range of the hypothetical means
   (within the range of the prior distribution)
2) The Buhlmann predictions must lie on a straight line
3) There should be Bayesian predictions both above and below the Buhlmann line
4) The Buhlmann prediction must be between the overall mean and the observation

Cov  X i X j 

Cov  X i X j   a
Var  X i   v  a
Empirical Bayes Non-Parametric Methods

r = # groups
n = # years of experience
m = # exposures

Uniform Exposures:

μ̂ = x̄    (mean of all data)

v̂ = [ Σ_{i=1}^r Σ_{j=1}^n (x_ij − x̄_i)² ] / [ r(n − 1) ]
   (the mean of the sample variances of the rows)

â = [ Σ_{i=1}^r (x̄_i − x̄)² ] / (r − 1) − v̂/n

Non-Uniform Exposures:

μ̂ = x̄    (mean of all data)

v̂ = [ Σ_{i=1}^r Σ_{j=1}^{n_i} m_ij (x_ij − x̄_i)² ] / [ Σ_{i=1}^r (n_i − 1) ]

â = [ Σ_{i=1}^r m_i (x̄_i − x̄)² − v̂(r − 1) ] / [ m − (Σ_{i=1}^r m_i²) / m ]

To calculate individual variances:

v̂_i = Σ (Xi − X̄)² / (n − 1)

In the P_C formula, M = the average of ALL claims.
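The uniform-exposure estimators above can be sketched for a toy data set (two groups, three years each, all values invented):

```python
# Empirical Bayes non-parametric estimators, uniform exposures.
data = [[2.0, 3.0, 4.0],      # group 1: three years of experience
        [6.0, 8.0, 10.0]]     # group 2

r, n = len(data), len(data[0])
group_means = [sum(row) / n for row in data]
mu_hat = sum(sum(row) for row in data) / (r * n)

# v-hat: mean of the sample variances of the rows.
v_hat = sum((x - gm) ** 2
            for row, gm in zip(data, group_means)
            for x in row) / (r * (n - 1))

# a-hat: between-group variance less v-hat / n.
a_hat = sum((gm - mu_hat) ** 2 for gm in group_means) / (r - 1) - v_hat / n

Z = n / (n + v_hat / a_hat)
```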


Empirical Bayes Semi-Parametric Methods

Poisson Model:
μ̂ = x̄
v̂ = x̄
â = s² − v̂

s² = Σ (Xi − X̄)² / (r − 1)
r = # Policyholders

Z = â / (â + v̂) regardless of # of years (but if non-uniform exposures, use n = # exposures
for the group you are looking at)
If exposures are non-uniform, â must be calculated using the Non-Parametric formula.

For P_C:
1) x̄ = total # observed claims (but if non-uniform exposures, use the average)
2) If exposure is 5 years, divide P_C by 5 to get the estimate for 1 year (next year)

Non-Poisson Models:
1) Negative Binomial with fixed β:
E[N | r] = rβ
Var[N | r] = rβ(1 + β)
μ̂ = x̄
v̂ = x̄(1 + β)
â = s² − v̂

2) Gamma with fixed θ:
E[X | α] = αθ
Var[X | α] = αθ²
μ̂ = x̄
v̂ = x̄θ
â = s² − v̂
Simulation
Inversion Method

1) Set u = F(x)
2) Solve for x in terms of u
3) Plug in u to get the simulated value

If F(2⁻) = 0.25 and F(2) = 0.75,
then 0.25 ≤ u < 0.75 is mapped to x = 2

If F(x) = a (a constant) on the range (x1, x2), then map u = a to x = x2

If given a graph with (x, F(x))


1) Start on the y-axis with the u values
2) Move right until you hit the line
If the line is horizontal, keep going right until it starts going up
3) Go vertically down to x
4) x is the simulated value
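A minimal sketch of the inversion method, assuming an Exponential(θ = 2) model (the distribution choice is for illustration only):

```python
import math
import random

# Inversion for Exponential(θ): u = F(x) = 1 − e^{−x/θ}  ⇒  x = −θ ln(1 − u)
def simulate_exponential(u, theta=2.0):
    return -theta * math.log(1 - u)

random.seed(42)
u = random.random()               # uniform(0,1) draw
x = simulate_exponential(u)       # simulated loss

# Sanity check: plugging x back into F recovers u.
assert abs((1 - math.exp(-x / 2.0)) - u) < 1e-12
```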
Number of Data Values to Generate

Var(F̂(x)) = F(x)(1 − F(x)) / n

e_F = n0 CV²

Estimated Item: Mean

Confidence Interval:

x̄ ± z (s_n / √n)

s_n is the square root of the unbiased sample variance after n runs

Number of Runs:

Calculates the number of runs needed for the sample mean to be within 100k% of the true mean.

n = n0 CV²    (must use the unbiased variance in CV²)

Remember that Var(X̄) = Var(X) / n

Estimated Item: F(x)

Confidence Interval:

F_n(x) ± z √Var(F_n(x)) = F_n(x) ± z √[ F_n(x)(1 − F_n(x)) / n ]

Number of Runs:

n = n0 (1 − P_n/n) / (P_n/n)

P_n = # runs below x
Percentiles q
Confidence Interval:

Y , Y 
a b

 
a   nq  0.5  z1 p nq 1  q  
 2 
 
b   nq  0.5  z1 p nq 1  q  
 2 
Risk Measures

TVaR_p(X) = E[L | L > VaR_p]

TVaR_p(X) = VaR_p(X) + (E[L] − E[L ∧ π_p]) / (1 − p) = VaR_p(X) + e_X(VaR_p(X)),  where π_p = VaR_p(X)

TVaR_p(X) = ∫_{F⁻¹(p)}^∞ x f(x) dx / (1 − p)

TVaR_p(X) = ∫_p^1 VaR_y(X) dy / (1 − p)

TVaR_q(X) is the mean of the upper tail of the distribution
TVaR_q(X) = Conditional Tail Expectation

Simulation estimate, from order statistics Y_(1) ≤ … ≤ Y_(n):

TVaR̂_q(X) = [ Σ_{j=k}^n Y_(j) ] / (n − k + 1),  where k = ⌊nq⌋ + 1

s_q² = [ Σ_{j=k}^n (Y_(j) − TVaR̂_q(X))² ] / (n − k)    (sample variance of the upper tail)

Var(TVaR̂_q(X)) = [ s_q² + q (TVaR̂_q(X) − VaR̂_q(X))² ] / (n − k + 1)

Confidence Interval = TVaR̂_q(X) ± z √Var(TVaR̂_q(X))

Estimate of VaR_q(X) is Y_(k) where k = ⌊nq⌋ + 1

So if the simulation has 1000 runs and you're estimating the 95th percentile,
then use Y_(951)
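The order-statistic estimates of VaR and TVaR can be sketched on a tiny invented sample:

```python
import math

# Estimating VaR and TVaR at q = 0.8 from a small "simulated" sample.
sample = [4, 1, 9, 2, 7, 10, 3, 6, 5, 8]   # pretend simulated losses
q = 0.8
y = sorted(sample)                          # order statistics Y_(1)..Y_(n)
n = len(y)
k = math.floor(n * q) + 1                   # k = ⌊nq⌋ + 1

var_q = y[k - 1]                            # VaR estimate = Y_(k)
tail = y[k - 1:]                            # upper tail Y_(k), ..., Y_(n)
tvar_q = sum(tail) / (n - k + 1)            # TVaR estimate
```

With n = 10 and q = 0.8, k = 9, so VaR is Y_(9) = 9 and TVaR is the mean of the top two values, 9.5.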

Bootstrap Approximation

θ(F) is the parameter
g(x1, …, xn) is an estimator based on a sample of n items

MSE[g(x1, …, xn) − θ(F)] ≈ E_{F_n}[ (g(x1, …, xn) − θ(F_n))² ]

σ̂² = Σ_{i=1}^n (xi − x̄)² / n    (empirical variance)

Estimating the mean with the Sample Mean:

MSE(x̄) = σ̂² / n = Σ_{i=1}^n (xi − x̄)² / n²
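For the sample mean, the bootstrap MSE has the closed form above, so no resampling is needed. A sketch on invented data:

```python
# Bootstrap MSE of the sample mean: MSE(x̄) = Σ(xi − x̄)² / n²  (data made up).
data = [1.0, 3.0, 5.0, 7.0]
n = len(data)
x_bar = sum(data) / n

sigma2_hat = sum((x - x_bar) ** 2 for x in data) / n   # empirical variance σ̂²
mse_boot = sigma2_hat / n                              # = Σ(xi − x̄)² / n²
```

Here σ̂² = 5 and the bootstrap MSE of x̄ is 1.25.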
Sums of Distributions

Sum of independent copies:

Single               Multiple
Bernoulli            Binomial
Binomial             Binomial
Poisson              Poisson
Geometric            Negative Binomial
Negative Binomial    Negative Binomial
Normal               Normal
Exponential          Gamma
Gamma                Gamma
Chi-Square           Chi-Square
