
To Buyers of 2013 Exam 4 / Exam C Study Guides


Howard C. Mahler, FCAS, MAAA
How much detail is needed and how many problems need to be done varies by person and topic.
In order to help you to concentrate your efforts:
1. About 1/6 of the many problems are labeled highly recommended,
while another 1/6 are labeled recommended.
2. Important Sections are listed in bold in the table of contents.
Extremely important Sections are listed in larger type and in bold.
3. Important ideas and formulas are in bold.
4. Each Study Guide has a Section of Important Ideas and Formulas.
5. Each Study Guide has a chart of past exam questions by Section.
6. There is a breakdown of percent of questions on each past exam by Study Guide.
My Study Aids are a thick stack of paper.1 However, many students find they do not need to look
at the textbooks. For those who have trouble getting through the material, concentrate on
the introductions and sections in bold.
Highly Recommended problems (about 1/6 of the total) are double underlined.
Recommended problems (about 1/6 of the total) are underlined.
Do at least the Highly Recommended problems your first time through.
It is important that you do problems when learning a subject and then some more problems
a few weeks later.
Be sure to do all the questions from the recent Course 4 Exams at some point.
I have written some easier and some tougher problems.2 The former exam questions are arranged in
chronological order. The more recent exam questions are, on average, more similar to what you will
be asked on your exam than are less recent questions.
All of the 2009 Sample Exam questions are included (there were 289 prior to deletions).
Their locations are shown in my final study guide, Breakdown of Past Exams.
Each of my study guides is divided into sections as shown in its table of contents.
The solutions to the problems in a section of a study guide are at the end of that section.

1 The number of pages is not as important as how long it takes you to understand the material. One page in a
textbook might take someone as long to understand as ten pages in my Study Guides.
2 Points are based on 100 points = a 4 hour exam.


In the electronic version use the bookmarks / table of contents in the Navigation Panel
in order to help you find what you want. The Find function will also help.
You may find it helpful to print out selected portions, such as the Table of Contents and Important
Ideas Section in each of my study guides.
Mahler's Guides for Joint Exam 4/C have 14 parts,
which are listed below, along with my estimated percent of the exam.3

Study Guides for Joint Exam C / Exam 4

 1.  Mahler's Guide to Frequency Distributions                        6%
 2.  Mahler's Guide to Loss Distributions                             8%
 3.  Mahler's Guide to Aggregate Distributions                        7%
 4.  Mahler's Guide to Risk Measures                                  3%
 5.  Mahler's Guide to Fitting Frequency Distributions                4%
 6.  Mahler's Guide to Fitting Loss Distributions                    26%
 7.  Mahler's Guide to Survival Analysis                             10%
 8.  Mahler's Guide to Classical Credibility                          3%
 9.  Mahler's Guide to Buhlmann Credibility & Bayesian Analysis      14%
10.  Mahler's Guide to Conjugate Priors                               7%
11.  Mahler's Guide to Semiparametric Estimation                      1%
12.  Mahler's Guide to Empirical Bayesian Credibility                 3%
13.  Mahler's Guide to Simulation                                     8%
14.  Breakdown of Past Exams

My Practice Exams are sold separately.

3 This is my best estimate, which should be used with appropriate caution, particularly in light of the changes in the
syllabus. In any case, the number of questions by topic varies from exam to exam.


Author Biography:
Howard C. Mahler is a Fellow of the Casualty Actuarial Society,
and a Member of the American Academy of Actuaries.
He has taught actuarial exam seminars and published study guides since 1994.
He spent over 20 years in the insurance industry, the last 15 as Vice President and Actuary at the
Workers' Compensation Rating and Inspection Bureau of Massachusetts.
He has published dozens of major research papers and won the 1987 CAS Dorweiler prize.
He served 12 years on the CAS Examination Committee including three years as head of the
whole committee (1990-1993).
Mr. Mahler has taught live seminars for Joint Exam 4/C, Joint Exam MFE/3F, CAS Exam 3L,
CAS Exam 5, and what is now CAS Exam 8.
He has written study guides for all of the above.
Mr. Mahler teaches weekly classes in Boston for Joint Exam 4/C, and Joint Exam MFE/3F.
hmahler@mac.com
www.howardmahler.com/Teaching


Loss Models, 3rd Edition              Mahler Study Guides

Chapter 3.1                           Loss Distributions: Sections 2-4, 7-8.
Chapter 3.2                           Loss Distributions: Section 19.
Chapter 3.3                           Freq. Dists.: Section 9, Aggregate Dists.: Sections 4-5.
Chapter 3.4                           Loss Distributions: Sections 30, 33, 34.
Chapter 3.5                           Risk Measures.
Chapter 4                             Loss Distributions: Sections 21, 38.
Chapter 5.2                           Loss Distributions: Sections 29, 39, 40.
Chapter 5.3                           Loss Distributions: Sections 22, 24-28.
Chapter 5.4                           Conjugate Priors: Section 11.
Chapters 6.1-6.5, 6.7                 Frequency Distributions: Sections 1-6, 9, 11-14, 19.
Chapter 8                             Loss Dists.: Sections 6, 15-18, 36, Freq. Dists.: Sections 3-6.
Chapters 9.1-9.7, 9.11.1-9.11.2 (4)   Aggregate Distributions.
Chapter 12                            Fitting Loss Dists.: Sections 14, 26, Freq. Dists.: Section 7.
Chapter 13                            Loss Dists.: Section 4, Fitting Loss Dists.: Sections 5-6, Survival Analysis: Sections 1, 5.
Chapter 14                            Survival Analysis: Sections 1, 2, 3, 4, 6, 7, 9, Loss Dists.: Sections 16-17, Fitting Loss Dists.: Sections 5, 6.
Chapters 15.1-15.4                    Fit Loss Dists.: Secs. 7-11, 20-25, 27-31, Surv. Anal.: Sec. 8.
Chapter 15.5                          Buhlmann Cred.: Sections 4-6, 16, Conj. Priors: Section 10.
Chapter 15.6 (5)                      Fitting Freq. Dists.: Sections 2, 3, 6.
Chapter 16                            Fitting Freq. Dists.: Secs. 4-5, Fitting Loss Dists.: Secs. 12-19.
Chapter 20.2                          Classical Credibility.
Chapter 20.3 (6)                      Buhlmann Credibility, Conjugate Priors.
Chapter 20.4 (7)                      Empirical Bayes, Semiparametric Estimation.
Chapter 21 (8)                        Simulation.

4 Excluding 9.6.1 and examples 9.9 and 9.11.
5 Sections 15.6.1-15.6.4, 15.6.6 only.
6 Excluding 20.3.8.
7 Excluding 20.4.3.
8 Sections 21.2-21.2 (excluding 21.2.4).


Besides many past exam questions from the CAS and SOA, my study guides include some past
questions from exams given by the Institute of Actuaries and Faculty of Actuaries in Great Britain.
These questions are copyright by the Institute of Actuaries and Faculty of Actuaries, and are
reproduced here solely to aid students studying for actuarial exams. These IOA questions are
somewhat different in format than those on your exam, but should provide some additional
perspective on the syllabus material.
Your exam will be 3.5 hours and will consist of approximately 35 multiple choice questions, each of
equal value.9 The examination will be offered via computer-based testing.
Download from the CAS or SOA website a copy of the tables to be attached to your exam.10
Read the Hints on Study and Exam Techniques in the CAS Syllabus.11
Read Tips for Taking Exams.12
Some students have reported success with the following guessing strategy.
When you are ready to guess (a few minutes before time runs out), count how many of each letter
you have answered so far.
Then fill in the least-used letter at each stage.
For example, if the fewest answers so far are A's, fill in A's until some other letter becomes the fewest.
Then fill in that letter, and so on.
Remember that for every question you should fill in a letter answer.13
On Exam 4/C, the following rule applies to the use of the Normal Table:
When using the normal distribution, choose the nearest z-value to find the probability, or
if the probability is given, choose the nearest z-value. No interpolation should be used.
Example: If the given z-value is 0.759, and you need to find Pr(Z < 0.759) from the normal
distribution table, then choose the probability value for z-value = 0.76; Pr(Z < 0.76) = 0.7764.
When using the Normal Approximation to a discrete distribution, use the continuity correction.
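For example (an added illustration): if N is discrete with mean 8 and variance 9, then with the continuity
correction, Prob[N ≥ 10] ≅ 1 - Φ[(9.5 - 8)/3] = 1 - Φ[0.5] = 1 - 0.6915 = 0.3085; one uses 9.5 rather than 10
because N ≥ 10 for the discrete N corresponds to N > 9.5 for the approximating continuous Normal.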

9 Equivalent to 2.5 points each, if 100 points = a 4 hour exam.
10 http://www.soa.org http://casact.org
11 http://casact.org/admissions/syllabus/index.cfm?fa=hints
12 www.casact.org/admissions/index.cfm?fa=tips
13 Nothing will be added for an unanswered question and nothing will be subtracted for an incorrect answer.
Therefore put down an answer, even a total guess, for every question.


I suggest you buy and use the TI-30XS Multiview calculator. You will save time doing repeated
calculations using the same formula. Examples include calculating process variances to compute an
EPV, constructing a distribution table of a frequency distribution, simulating from the same
continuous distribution several times, etc.
While studying, you should do as many problems as possible. Going back and forth between
reading and doing problems is the only way to pass this exam. The only way to learn to solve
problems is to solve lots of problems. You should not feel satisfied with your study of a subject until
you can solve a reasonable number of the problems.
There are two manners in which you should be doing problems. First you can do problems in order
to learn the material. Take as long on each problem as you need to fully understand the concepts
and the solution. Reread the relevant syllabus material. Carefully go over the solution to see if you
really know what to do. Think about what would happen if one or more aspects of the question were
revised.14 This manner of doing problems should be gradually replaced by the following manner as
you get closer to the exam.
The second manner is to do a series of problems under exam conditions, with the items you will
have when you take the exam. Decide in advance on a number of points to try, based on the time
available. For example, if you have an uninterrupted hour, then one might try either
60/2.5 = 24 points or 60/3 = 20 points of problems. Do problems as you would on an exam in any
order, skipping some and coming back to some, until you run out of time. I suggest you leave time
to double check your work.
Expose yourself somewhat to everything on the syllabus. Concentrate on sections and items in
bold. Do not read sections or material in italics your first time through the material.15 Each study guide
has a chart of where the past exam questions have been; this may also help you to direct your
efforts.16 Try not to get bogged down on a single topic. On hard subjects, try to learn at least the
simplest important idea. The first time through do enough problems in each section, but leave some
problems in each section to do closer to the exam. At least every few weeks review the important
ideas and formulas sections of those study guides you have already completed.
Make a schedule and stick to it. Spend a minimum of one hour every day.
I recommend at least two study sessions every day, each of at least 1/2 hour.

14 Some may also find it useful to read about a dozen questions on an important subject, thinking about how to set
up the solution to each one, but only working out in detail any questions they do not quickly see how to solve.
15 Material in italics is provided for those who want to know more about a particular subject and/or to be prepared for
more challenging exam questions. Material in italics could be directly needed to answer perhaps one or two
questions on an exam.
16 While this may indicate what ideas questions on your exam are likely to cover, every exam contains a few questions
on ideas that have yet to be asked.


Use whatever order to go through the material that works best for you.
Here is a schedule that may work for some people.17
A 15 week Study Schedule for Exam 4/C:
1. Frequency Distributions
2. Start of Loss Distributions: sections 1 to 30.
3. Rest of Loss Distributions: Remainder.
4. Aggregate Distributions
5. Fitting Frequency Distributions
Classical Credibility
6. Start of Buhlmann Credibility and Bayesian Analysis: sections 1-6 and 12.
7. Start of Fitting Loss Distributions: sections 1 to 10.
8. More Buhlmann Credibility and Bayesian Analysis: sections 7 to 10.
9. More Fitting Loss Distributions: sections 11 to 19.
10. Rest of Buhlmann Credibility and Bayesian Analysis: Remainder.
Semiparametric Estimation
11. Rest of Fitting Loss Distributions: Remainder.
12. Conjugate Priors
13. Survival Analysis
14. Empirical Bayesian Credibility
Risk Measures
15. Simulation

17 This is just an example of one possible schedule. Adjust it to suit your needs or make one up yourself.


Most of you will need to spend a total of 300 or more hours of study time on the entire syllabus; this
means an average of at least 2 hours a day.
Throughout do Exam Problems and Practice Problems in my study guides. At least 50% of your
time should be spent doing problems. As you get closer to the Exam, the portion of time spent
doing problems should increase.
Review the important formulas and ideas section at the end of each study guide.
During the last several weeks do my practice exams, sold separately.
The CAS/SOA has posted a preview of the tables for Computer Based Testing:
http://www.beanactuary.org/exams/4C/Split.html
I would suggest you use them if possible when doing practice exams.
Past students' helpful suggestions and questions have greatly improved these Study Aids.
I thank them.
Feel free to send me any questions or suggestions:
Howard Mahler, Email: hmahler@mac.com
Please do not copy the Study Aids, except for your own personal use. Giving them to others is
unfair to yourself, to your fellow students who have paid for them, and to me.18
If you found them useful, tell a friend to buy his own.
Please send me any suspected errors by Email.
(Please specify as carefully as possible the page, Study Guide and Course.)
The errata sheet will be posted on my webpage: www.howardmahler.com/Teaching

18 These study aids represent thousands of hours of work.


Pass Marks and Passing Percentages for Past Exams:19

Exam 4/C       Pass Mark   Number     Effective Number   Number    Percent   % Effective
                           of Exams   of Exams           Passing   Passing   Passing
Spring 2007    N.A.        2079       1976               887       42.7%     44.9%
Fall 2007      63%         1857       1786               926       49.9%     51.8%
Spring 2008    60%         1848       1757               868       47.0%     49.4%
Fall 2008      55%         1763       1698               769       43.6%     45.3%
Spring 2009    55%         1957       1861               746       38.1%     40.1%
Fall 2009      58% (20)    2198       2004               959       43.6%     47.9%
Spring 2010    66%         1674       1559               702       41.9%     45.0%
Aug. 2010      66% (21)    1252       1163               552       44.1%     47.5%
Nov. 2010      64%         1512       1358               612       40.5%     45.1%
Feb. 2011      64%         1470       1304               598       40.7%     45.9%
June 2011      64%         1890       1681               745       39.4%     44.3%
Oct. 2011      64%         1962       1723               858       43.7%     49.8%
Feb. 2012      67%         1461       1318               665       45.5%     50.5%
May 2012       67%         2008       1798               904       45.0%     50.3%

19 Information taken from the CAS and SOA webpages. Check the webpages for updated information.
20 Starting in Fall 2009, there was computer-based testing. All versions of the exam are constructed to be of
comparable difficulty to one another. Apparently, the passing percentage varies somewhat by version of the exam.
On average, 58% correct was needed to pass the exam.
21 Examination C/4 is administered using computer-based testing (CBT). Under CBT, it is not possible to schedule
everyone to take the examination at the same time. As a result, each administration consists of multiple versions of
the examination given over a period of several days. The examinations are constructed and scored using Item
Response Theory (IRT). Under IRT, each operational item that appears on an examination has been calibrated for
difficulty and other test statistics, and the pass mark for each examination is determined before the examination is
given. All versions of the examination are constructed to be of comparable difficulty to one another.
For the August 2010 administration of Examination C/4, an average of 66% correct was needed to pass the exam.

Mahler's Guide to

Frequency Distributions
Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-1


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching


Mahler's Guide to Frequency Distributions


Copyright 2013 by Howard C. Mahler.
Information in bold or sections whose title is in bold are more important for passing the exam.
Larger bold type indicates it is extremely important.
Information presented in italics (or sections whose title is in italics) should not be needed to directly
answer exam questions and should be skipped on first reading. It is provided to aid the reader's
overall understanding of the subject, and to be useful in practical applications.
Highly Recommended problems are double underlined.
Recommended problems are underlined.1
Solutions to the problems in each section are at the end of that section.

Section #   Pages      Section Name

 1          4          Introduction
 2          5-15       Basic Concepts
 3          16-41      Binomial Distribution
 4          42-72      Poisson Distribution
 5          73-94      Geometric Distribution
 6          95-120     Negative Binomial Distribution
 7          121-148    Normal Approximation
 8          149-161    Skewness
 9          162-175    Probability Generating Functions
10          176-188    Factorial Moments
11          189-210    (a, b, 0) Class of Distributions
12          211-222    Accident Profiles
13          223-242    Zero-Truncated Distributions
14          243-261    Zero-Modified Distributions
15          262-276    Compound Frequency Distributions
16          277-294    Moments of Compound Distributions
17          295-334    Mixed Frequency Distributions
18          335-345    Gamma Function
19          346-387    Gamma-Poisson Frequency Process
20          388-398    Tails of Frequency Distributions
21          399-406    Important Formulas and Ideas

1 Note that problems include both some written by me and some from past exams. The latter are copyright by the
CAS and SOA, and are reproduced here solely to aid students in studying for exams. The solutions and comments
are solely the responsibility of the author; the CAS and SOA bear no responsibility for their accuracy. While some of
the comments may seem critical of certain questions, this is intended solely to aid you in studying and in no way is
intended as a criticism of the many volunteers who work extremely long and hard to produce quality exams. In some
cases I've rewritten these questions in order to match the notation in the current Syllabus.


Past Exam Questions by Section of this Study Aid2


Course 3 Course 3 Course 3 Course 3 Course 3 Course 3
Section

Sample

5/00

11/00

5/01

11/01

11/02

CAS 3

SOA 3

CAS 3

11/03

11/03

5/04

1
2
3

14

16

5
6

18

7
8

28

9
10
11

25

28

32

26

12
13
14

37

15
16

17

13

16 36

30

27

3 15

27

18
19

12

15

20

The CAS/SOA did not release the 5/02 and 5/03 exams.
From 5/00 to 5/03, the Course 3 Exam was jointly administered by the CAS and SOA.
Starting in 11/03, the CAS and SOA gave separate exams. (See the next page.)

Excluding any questions that are no longer on the syllabus.


CAS 3 SOA 3
Section


CAS 3 SOA M CAS 3 SOA M CAS 3

11/04

11/04

5/05

22 24

15

23

5/05

11/05

39

24

11/05

5/06

CAS 3 SOA M
11/06

11/06
Section
5/07

1
2
32

5
6

21

28

32

23 24 31

22

7
8
9
10

25

11

16

19

31

12
13
14
15

27

16

18

17

35

32

18
19

30
19

10

20

The SOA did not release its 5/04 and 5/06 exams.
This material was moved to Exam 4/C in 2007.
The CAS/SOA did not release the 11/07 and subsequent exams.

4/C

39


Section 1, Introduction
This Study Aid will review what a student needs to know about the frequency distributions in
Loss Models. Much of the material in the first seven sections you should have learned for Exam 1 / Exam P.
In actuarial work, frequency distributions are applied to the number of losses, the number of claims,
the number of accidents, the number of persons injured per accident, etc.
Frequency Distributions are discrete functions on the nonnegative integers: 0, 1, 2, 3, ...
There are three named frequency distributions you should know:
Binomial, with special case Bernoulli
Poisson
Negative Binomial, with special case Geometric.
Most of the information you need to know about each of these distributions is shown in
Appendix B, attached to the exam. Nevertheless, since they appear often in exam questions, it is
desirable to know these frequency distributions well, particularly the Poisson Distribution.
In addition, one can make up a frequency distribution.
How to work with such unnamed frequency distributions is discussed in the next section.
In later sections, the important concepts of Compound Distributions and Mixed Distributions will be
discussed.3
The most important case of a mixed frequency distribution is the Gamma-Poisson frequency
process.

3 Compound Distributions are mathematically equivalent to Aggregate Distributions, which are discussed in
Mahler's Guide to Aggregate Distributions.


Section 2, Basic Concepts


The probability density function4 f(i) can be non-zero at either a finite or infinite number of points.
In the former case, the probability density function is determined by a table of its values at these
finite number of points.

The f(i) can take on any values provided they satisfy 0 ≤ f(i) ≤ 1 and Σ f(i) = 1, summing over i = 0 to ∞.

For example:

Number        Probability          Cumulative
of Claims     Density Function     Distribution Function
 0            0.1                  0.1
 1            0.2                  0.3
 2            0                    0.3
 3            0.1                  0.4
 4            0                    0.4
 5            0                    0.4
 6            0.1                  0.5
 7            0                    0.5
 8            0                    0.5
 9            0.1                  0.6
10            0.3                  0.9
11            0.1                  1

Sum           1

The Distribution Function5 is the cumulative sum of the probability density function:

F(j) = Σ f(i), summing i from 0 to j.

In the above example, F(3) = f(0) + f(1) + f(2) + f(3) = 0.1 + 0.2 + 0 + 0.1 = 0.4.

4 Loss Models calls the probability density function of frequency the probability function or
p.f., and uses the notation pk for f(k), the density at k.
5 Also called the cumulative distribution function.


Moments:
One can calculate the moments of such a distribution.
For example, the first moment or mean is:
(0)(0.1) + (1)(0.2) + (2)(0) + (3)(0.1) + (4)(0) + (5)(0) + (6)(0.1) + (7)(0) + (8)(0) + (9)(0.1)
+ (10)(0.3) + (11)(0.1) = 6.1.

Number        Probability          Probability x   Probability x Square
of Claims     Density Function     # of Claims     of # of Claims
 0            0.1                  0               0
 1            0.2                  0.2             0.2
 2            0                    0               0
 3            0.1                  0.3             0.9
 4            0                    0               0
 5            0                    0               0
 6            0.1                  0.6             3.6
 7            0                    0               0
 8            0                    0               0
 9            0.1                  0.9             8.1
10            0.3                  3               30
11            0.1                  1.1             12.1

Sum                                6.1             54.9

E[X] = Σ i f(i) = Average of X = 1st moment about the origin = 6.1.

E[X²] = Σ i² f(i) = Average of X² = 2nd moment about the origin = 54.9.

The second moment is:
(0²)(0.1) + (1²)(0.2) + (2²)(0) + (3²)(0.1) + (4²)(0) + (5²)(0) + (6²)(0.1) + (7²)(0) + (8²)(0)
+ (9²)(0.1) + (10²)(0.3) + (11²)(0.1) = 54.9.

Mean = E[X] = 6.1.
Variance = second central moment = E[(X - E[X])²] = E[X²] - E[X]² = 17.69.
Standard Deviation = Square Root of Variance = 4.206.
The mean is the average or expected value of the random variable. For the above example, the
mean is 6.1 claims.
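For those who like to check such arithmetic by computer, here is a minimal Python sketch of my own
(not part of the original Study Guide) that reproduces the moments computed above:

    # Densities of the example distribution; claim counts with zero probability are omitted.
    densities = {0: 0.1, 1: 0.2, 3: 0.1, 6: 0.1, 9: 0.1, 10: 0.3, 11: 0.1}

    mean = sum(n * p for n, p in densities.items())               # 6.1
    second_moment = sum(n * n * p for n, p in densities.items())  # 54.9
    variance = second_moment - mean**2                            # approximately 17.69
    print(mean, second_moment, variance, variance**0.5)           # standard deviation approximately 4.206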


In general means add; E[X+Y] = E[X] + E[Y]. Also multiplying a variable by a constant multiplies the
mean by the same constant; E[kX] = kE[X].
The mean is a linear operator: E[aX + bY] = aE[X] + bE[Y].
The mean of a frequency distribution can also be computed as a sum of its survival functions:6

E[X] = Σ Prob[X > i] = Σ {1 - F(i)}, summing i from 0 to ∞.
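As a quick check (my own addition), for the example above: the survival function values at
i = 0, 1, ..., 11 are 0.9, 0.7, 0.7, 0.6, 0.6, 0.6, 0.5, 0.5, 0.5, 0.4, 0.1, 0, and their sum is 6.1,
matching the mean computed previously.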

Mode and Median:


The mean differs from the mode which represents the value most likely to occur. The mode is the
point at which the density function reaches its maximum. The mode for the above example is 10
claims.
For a discrete distribution, take the 100pth percentile as the first value at which F(x) ≥ p.7
The 80th percentile for the above example is 10; F(9) = 0.6, F(10) = 0.9.
The median is the 50th percentile. For frequency distributions, and other discrete distributions,
the median is the first value at which the distribution function is greater than or equal to 0.5. The
median for the above example is 6 claims; F(6) = .5.
Definitions:
Exposure Base: The basic unit of measurement upon which premium is determined.
For example, the exposure base could be car-years, $100 of payrolls, number of insured lives, etc.
The rate for Workers Compensation Insurance might be $3.18 per $100 of payroll, with $100 of
payroll being one exposure.
Frequency: The number of losses or number of payments random variable, (unless indicated
otherwise) stated per exposure unit.
For example the frequency could be the number of losses per (insured) house-year.
Mean Frequency: Expected value of the frequency.
For example, the mean frequency might be 0.03 claims per insured life per year.
6 This is analogous to the situation for a continuous loss distribution; the mean of a Loss Distribution can be
computed as the integral of its survival function.
7 Definition 3.7 in Loss Models: F(πp-) ≤ p ≤ F(πp), where πp is the 100pth percentile.


Problems:
Use the following frequency distribution for the next 5 questions:
Number of Claims Probability
0
0.02
1
0.04
2
0.14
3
0.31
4
0.36
5
0.13
2.1 (1 point) What is the mean of the above frequency distribution?
A. less than 3
B. at least 3.1 but less than 3.2
C. at least 3.2 but less than 3.3
D. at least 3.3 but less than 3.4
E. at least 3.4
2.2 (1 point) What is the mode of the above frequency distribution?
A. 2
B. 3
C. 4
D. 5
E. None of the above.
2.3 (1 point) What is the median of the above frequency distribution?
A. 2
B. 3
C. 4
D. 5
E. None of the above.
2.4 (1 point) What is the standard deviation of the above frequency distribution?
A. less than 1.1
B. at least 1.1 but less than 1.2
C. at least 1.2 but less than 1.3
D. at least 1.3 but less than 1.4
E. at least 1.4
2.5 (1 point) What is the 80th percentile of the above frequency distribution?
A. 2
B. 3
C. 4
D. 5
E. None of A, B, C, or D.


2.6 (1 point) The number of claims, N, made on an insurance portfolio follows the following
distribution:
n
Pr(N=n)
0
0.7
1
0.2
2
0.1
What is the variance of N?
A. less than 0.3
B. at least 0.3 but less than 0.4
C. at least 0.4 but less than 0.5
D. at least 0.5 but less than 0.6
E. at least 0.6
Use the following information for the next 8 questions:
V and X are each given by the result of rolling a six-sided die.
V and X are independent of each other.
Y= V + X.
Z = 2X.
Hint: The mean of X is 3.5 and the variance of X is 35/12.
2.7 (1 point) What is the mean of Y?
A. less than 7.0
B. at least 7.0 but less than 7.1
C. at least 7.1 but less than 7.2
D. at least 7.2 but less than 7.3
E. at least 7.4
2.8 (1 point) What is the mean of Z?
A. less than 7.0
B. at least 7.0 but less than 7.1
C. at least 7.1 but less than 7.2
D. at least 7.2 but less than 7.3
E. at least 7.4
2.9 (1 point) What is the standard deviation of Y?
A. less than 2.0
B. at least 2.0 but less than 2.3
C. at least 2.3 but less than 2.6
D. at least 2.9 but less than 3.2
E. at least 3.2


2.10 (1 point) What is the standard deviation of Z?


A. less than 2.0
B. at least 2.0 but less than 2.3
C. at least 2.3 but less than 2.6
D. at least 2.9 but less than 3.2
E. at least 3.2
2.11 (1 point) What is the probability that Y = 8?
A. less than .10
B. at least .10 but less than .12
C. at least .12 but less than .14
D. at least .14 but less than .16
E. at least .16
2.12 (1 point) What is the probability that Z = 8?
A. less than .10
B. at least .10 but less than .12
C. at least .12 but less than .14
D. at least .14 but less than .16
E. at least .16
2.13 (1 point) What is the probability that X = 5 if Y ≥ 10?
A. less than .30
B. at least .30 but less than .32
C. at least .32 but less than .34
D. at least .34 but less than .36
E. at least .36
2.14 (1 point) What is the expected value of X if Y ≥ 10?
A. less than 5.0
B. at least 5.0 but less than 5.2
C. at least 5.2 but less than 5.4
D. at least 5.4 but less than 5.6
E. at least 5.6
2.15 (3 points) N is uniform and discrete from 0 to b; Prob[N = n] = 1/(b+1), n = 0, 1, 2, ... , b.
N ∧ 10 ≡ Minimum[N, 10].
If E[N ∧ 10] = 0.875 E[N], determine b.
A. 13    B. 14    C. 15    D. 16    E. 17


2.16 (2 points) What is the variance of the following distribution?


Claim Count:
0
1
2
3
4
5
>5
Percentage of Insureds: 60.0% 24.0% 9.8% 3.9% 1.6% 0.7% 0%
A. 0.2
B. 0.4
C. 0.6
D. 0.8
E. 1.0
2.17 (3 points) N is uniform and discrete from 1 to S; Prob[N = n] = 1/S, n = 1, 2, ... , S.
Determine the variance of N, as a function of S.
2.18 (4, 5/88, Q.31) (1 point) The following table represents data observed for a certain class of
insureds. The regional claims office is being set up to service a group of 10,000 policyholders from
this class.
Number of Claims
Probability of a Policyholder
n
Making n Claims in a Year
0
0.84
1
0.07
2
0.05
3
0.04
If each claims examiner can service a maximum of 500 claims in a year, and you want to staff the
office so that you can handle a number of claims equal to two standard deviations more than the
mean, how many examiners do you need?
A. 5 or less
B. 6
C. 7
D. 8
E. 9 or more
2.19 (4B, 11/99, Q.7) (2 points) A player in a game may select one of two fair, six-sided dice.
Die A has faces marked with 1, 2, 3, 4, 5, and 6. Die B has faces marked with 1, 1, 1, 6, 6, and 6.
If the player selects Die A, the payoff is equal to the result of one roll of Die A. If the player selects
Die B, the payoff is equal to the mean of the results of n rolls of Die B.
The player would like the variance of the payoff to be as small as possible.
Determine the smallest value of n for which the player should select Die B.
A. 1
B. 2
C. 3
D. 4
E. 5
2.20 (1, 11/01, Q.32) (1.9 points) The number of injury claims per month is modeled by a random
variable N with P[N = n] = 1/{(n+1)(n+2)}, where n ≥ 0.
Determine the probability of at least one claim during a particular month, given
that there have been at most four claims during that month.
(A) 1/3    (B) 2/5    (C) 1/2    (D) 3/5    (E) 5/6


Solutions to Problems:
2.1. D. mean = (0)(.02) + (1)(.04) + (2)(.14) + (3)(.31) + (4)(.36) + (5)(.13) = 3.34.
Comment: Let S(n) = Prob[N > n] = survival function at n.
S(0) = 0.98. S(1) = 0.94.
E[N] = Σ S(i), summing from i = 0: 0.98 + 0.94 + 0.80 + 0.49 + 0.13 + 0 = 3.34.
2.2. C. f(4) = 36% which is the greatest value attained by the probability density function, therefore
the mode is 4.
2.3. B. Since F(2) = 0.20 < 0.5 and F(3) = 0.51 ≥ 0.5, the median is 3.

Number of Claims   Probability   Distribution
0                  2%            2%
1                  4%            6%
2                  14%           20%
3                  31%           51%
4                  36%           87%
5                  13%           100%

2.4. B. Variance = (second moment) - (mean)² = 12.4 - 3.34² = 1.244.
Standard Deviation = √1.244 = 1.116.

2.5. C. Since F(3) = 0.51 < 0.8 and F(4) = 0.87 ≥ 0.8, the 80th percentile is 4.
2.6. C. Mean = (.7)(0) + (.2)(1) + (.1)(2) = .4.
Variance = (.7)(0 - .4)2 + (.2)(1 - .4)2 + (.1)(2 - .4)2 = 0.44.
Alternately, Second Moment = (.7)(02 ) + (.2)(12 ) + (.1)(22 ) = .6. Variance = .6 - .42 = 0.44.
2.7. B. E[Y] = E[V + X] = E[V] + E[X] = 3.5 + 3.5 = 7.
2.8. B. E[Z] = E[2X] = 2 E[X] = (2)(3.5) = 7.
2.9. C. Var[Y] = Var[V+X] = Var[V] + Var[X] = (35/12) + (35/12) = 35/6 = 5.83.
Standard Deviation[Y] = √5.83 = 2.41.

2.10. E. Var[Z] = Var[2X] = 2² Var[X] = (4)(35/12) = 35/3 = 11.67.
Standard Deviation[Z] = √11.67 = 3.42.


2.11. C. For Y = 8 we have the following possibilities: V=2, X=6; V=3, X=5; V=4, X=4; V=5, X=3;
V=6, X=2. Each of these has a (1/6)(1/6) = 1/36 chance, so the total chance that Y = 8 is 5/36 =
0.139.
Comment: The density function for Y is:

y      2     3     4     5     6     7     8     9     10    11    12
f(y)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

2.12. E. Z = 8 when X = 4, which has probability 1/6.


Comment: The density function for Z is:

z      2    4    6    8    10   12
f(z)   1/6  1/6  1/6  1/6  1/6  1/6

Note that even though Z has the same mean as Y, it has a significantly different distribution.
This illustrates the difference between adding the results of several independent identically distributed
variables, and just multiplying a single result by a constant. (If the variable has a finite variance), the
Central Limit Theorem applies to the former situation, but not the latter. The sum of N independent dice
starts to look like a Normal Distribution as N gets large.
N times a single die has a flat distribution similar to that of X or Z, regardless of N.
2.13. C. If Y ≥ 10, then we have the possibilities V=4, X=6; V=5, X=5; V=5, X=6;
V=6, X=4; V=6, X=5; V=6, X=6. Out of these 6 equally likely possibilities, for 2 of them X = 5.
Therefore if Y ≥ 10, there is a 2/6 = 0.333 chance that X = 5.
Comment: This is an example of a conditional distribution.
The distribution of f(x | y ≥ 10) is:

x               4     5     6
f(x | y ≥ 10)   1/6   2/6   3/6

The distribution of f(x | y = 10) is:

x               4     5     6
f(x | y = 10)   1/3   1/3   1/3
2.14. C. The distribution of f(x | y ≥ 10) is:

x               4     5     6
f(x | y ≥ 10)   1/6   2/6   3/6

(1/6)(4) + (2/6)(5) + (3/6)(6) = 32/6 = 5.33.


2.15. C. E[N] = (0 + 1 + 2 + ... + b)/(b + 1) = {b(b+1)/2}/(b + 1).
For b ≥ 10, E[N ∧ 10] = {0 + 1 + 2 + ... + 9 + (b-9)(10)}/(b + 1) = (45 + 10b - 90)/(b + 1).
E[N ∧ 10] = 0.875 E[N]. ⇒ 10b - 45 = 0.875 b(b+1)/2. ⇒ 0.875b² - 19.125b + 90 = 0.
b = {19.125 ± √(19.125² - (4)(0.875)(90))}/1.75 = (19.125 ± 7.125)/1.75 = 15 or 6.857.
However, b has to be integer and at least 10, so b = 15.
Comment: The limited expected value is discussed in Mahler's Guide to Loss Distributions.
If b = 15, then there are 6 terms that enter the limited expected value as 10:
E[N ∧ 10] = (0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 10 + 10 + 10 + 10 + 10)/16 = 105/16.
E[N] = (0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15)/16 = 15/2.
Their ratio is 0.875.
2.16. E. Mean = 0.652 and the variance = 1.414 - 0.652² = 0.989.

Number      A Priori      Number Times    Number Squared
of Claims   Probability   Probability     Times Probability
0           0.60000       0.00000         0.00000
1           0.24000       0.24000         0.24000
2           0.09800       0.19600         0.39200
3           0.03900       0.11700         0.35100
4           0.01600       0.06400         0.25600
5           0.00700       0.03500         0.17500

Sum                       0.652           1.41400

2.17. E[N] = (1 + 2 + ... + S)/S = {S(S+1)/2}/S = (S + 1)/2.
E[N²] = (1² + 2² + ... + S²)/S = {S(S+1)(2S + 1)/6}/S = (S + 1)(2S + 1)/6.
Var[N] = E[N²] - E[N]² = (S + 1)(2S + 1)/6 - {(S + 1)/2}² = {(S + 1)/12}{2(2S + 1) - 3(S + 1)}
= {(S + 1)/12}(S - 1) = (S² - 1)/12.
Comment: For S = 6, a six-sided die, Var[N] = 35/12.
2.18. C. The first moment is: (.84)(0) + (.07)(1) + (.05)(2) + (.04)(3) = 0.29.
The 2nd moment is: (.84)(0²) + (.07)(1²) + (.05)(2²) + (.04)(3²) = 0.63. Thus the variance is:
0.63 - 0.29² = 0.5459 for a single policyholder. For 10,000 independent policyholders, the variance
of the sum is (10000)(0.5459) = 5459. The standard deviation is: √5459 = 73.9.
The mean number of claims is (10000)(0.29) = 2900. Adding two standard deviations one gets
3047.8. This requires 7 claims handlers (since 6 can only handle 3000 claims).


2.19. C. Both Die A and Die B have a mean of 3.5.
The variance of Die A is: (2.5² + 1.5² + 0.5² + 0.5² + 1.5² + 2.5²)/6 = 35/12.
The variance of Die B is: 2.5² = 6.25.
The variance of an average of n rolls of Die B is 6.25/n. We want 6.25/n < 35/12.
Thus n > (6.25)(12/35) = 2.14. Thus the smallest n is 3.
2.20. B. Prob[N ≥ 1 | N ≤ 4] = Prob[1 ≤ N ≤ 4]/Prob[N ≤ 4] =
(1/6 + 1/12 + 1/20 + 1/30)/(1/2 + 1/6 + 1/12 + 1/20 + 1/30) = 20/50 = 2/5.
Comment: For integer a and b, such that 0 < a < b,
Σ from k = a to b-1 of 1/k = (b-a) Σ from n = 0 to ∞ of 1/{(n+a)(n+b)}.
Therefore, f(n) = {(b-a) / (Σ from k = a to b-1 of 1/k)} / {(n+a)(n+b)}, n ≥ 0, is a frequency distribution.
This is a heavy-tailed distribution without a finite mean.
If b = a + 1, then f(n) = a/{(n+a)(n+a+1)}, n ≥ 0.
In this question, a = 1, b = 2, and f(n) = 1/{(n+1)(n+2)}, n ≥ 0.

Section 3, Binomial Distribution


Assume one has five independent lives, each of which has a 10% chance of dying over the next
year. What is the chance of observing two deaths? This is given by the product of three factors. The
first is the chance of death to the power two. The second factor is the chance of not dying to the
power 3 = 5 - 2. The final factor is the number of ways to pick two lives out of five, or the binomial
coefficient:
(5 choose 2) = 5! / (2! 3!) = 10.
The chance of observing two deaths is:
(5 choose 2) (0.1²) (0.9³) = 7.29%.
The chance of observing other numbers of deaths in this case is:
Number       Chance            Binomial
of Deaths    of Observation    Coefficient
0            59.049%           1
1            32.805%           5
2            7.290%            10
3            0.810%            10
4            0.045%            5
5            0.001%            1

Sum          100.000%

This is just an example of a Binomial distribution, for q = 0.1 and m = 5.


For the Binomial Distribution:  f(x) = m! q^x (1-q)^(m-x) / {x! (m-x)!},  x = 0, 1, 2, 3, ..., m.

Note that the binomial density function is only positive for x ≤ m; there are at most m claims. The
Binomial has two parameters m and q. m is the maximum number of claims and q is the chance of
success.8
Written in terms of the binomial coefficient, the Binomial density function is:
f(x) = (m choose x) q^x (1-q)^(m-x),  x = 0, 1, 2, 3, ..., m.
8 I will use the notation in Loss Models and the tables attached to your exam. Many of you are familiar with the
notation in which the parameters for the Binomial Distribution are n and p rather than m and q as in Loss Models.
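As an added illustration (mine, not from the original text), the following short Python sketch evaluates
this density and reproduces the table of death probabilities for m = 5 and q = 0.1 shown above:

    from math import comb

    m, q = 5, 0.1
    for x in range(m + 1):
        # f(x) = C(m, x) q^x (1-q)^(m-x)
        print(x, comb(m, x) * q**x * (1 - q)**(m - x))
    # 0 0.59049, 1 0.32805, 2 0.0729, 3 0.0081, 4 0.00045, 5 1e-05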


Bernoulli Distribution:
The Bernoulli is a distribution with q chance of 1 claim and 1-q chance of 0 claims. There are only two
possibilities: either a success or a failure. The Bernoulli is a special case of the Binomial for m = 1.
The mean of the Bernoulli is q. The second moment of the Bernoulli is (0²)(1-q) + (1²)(q) = q.
Therefore the variance is q - q² = q(1-q).
Binomial as a Sum of Independent Bernoullis:
The example of five independent lives was the sum of five variables each of which was a Bernoulli
trial with chance of a claim 10%. In general, the Binomial can be thought of as the sum of the
results of m independent Bernoulli trials, each with a chance of success q. Therefore, the
sum of two independent Binomial distributions with the same chance of success q, is another
Binomial distribution; if X is Binomial with parameters q and m1 , while Y is Binomial with parameters
q and m2 , then X+Y is Binomial with parameters q and m1 + m2 .
Mean and Variance:
Since the Binomial is a sum of the results of m identical Bernoulli trials, the mean of the Binomial is m
times the mean of a Bernoulli, which is mq.
The mean of the Binomial is mq.
Similarly the variance of a Binomial is m times the variance of the corresponding Bernoulli, which is
mq(1-q).
The variance of a Binomial is mq(1-q).
For the case m = 5 and q = 0.1 presented previously:
Number      Probability         Probability x   Probability x Square   Probability x Cube
of Claims   Density Function    # of Claims     of # of Claims         of # of Claims
0           59.049%             0.00000         0.00000                0.00000
1           32.805%             0.32805         0.32805                0.32805
2           7.290%              0.14580         0.29160                0.58320
3           0.810%              0.02430         0.07290                0.21870
4           0.045%              0.00180         0.00720                0.02880
5           0.001%              0.00005         0.00025                0.00125

Sum                             0.50000         0.70000                1.16000
The mean is: 0.5 = (5)(0.1) = mq.

The variance is: E[X²] - E[X]² = 0.7 - 0.5² = 0.45 = (5)(0.1)(0.9) = mq(1-q).


Properties of the Binomial Distribution:


Since 0 < q < 1: mq(1-q) < mq.
Therefore, the variance of any Binomial is less than its mean.
A Binomial Distribution with parameters m and q, is the sum of m independent Bernoullis, each with
parameter q. Therefore, if one sums independent Binomials with the same q, then one gets
another Binomial, with the same q parameter and the sum of their m parameters.
Exercise: X is a Binomial with q = 0.4 and m = 8. Y is a Binomial with q = 0.4 and m = 22.
Z is a Binomial with q = 0.4 and m = 17. X, Y, and Z are independent of each other.
What form does X + Y + Z have?
[Solution: X + Y + Z is a Binomial with q = 0.4 and m = 8 + 22 + 17 = 47.]
Specifically, the sum of n independent identically distributed Binomial variables, with the same
parameters q and m, is a Binomial with parameters q and nm.
Exercise: X is a Binomial with q = 0.4 and m = 8.
What is the form of the sum of 25 independent random draws from X?
[Solution: A random draw from a Binomial Distribution with q = 0.4 and m = (25)(8) = 200.]
Thus if one had 25 exposures, each of which had an independent Binomial frequency process with
q = 0.4 and m = 8, then the portfolio of 25 exposures has a Binomial frequency process with q = 0.4
and m = 200.
Thinning a Binomial:
If one selects only some of the claims, in a manner independent of frequency, then if all claims are
Binomial with parameters m and q, the selected claims are also Binomial with parameters m and
q' = q (expected portion of claims selected).
For example, assume that the number of claims is given by a Binomial Distribution with
m = 9 and q = 0.3. Assume that on average 1/3 of claims are large.
Then the number of large losses is also Binomial, but with parameters m = 9 and q = 0.3/3 = 0.1.
The number of small losses is also Binomial, but with parameters m = 9 and q = (0.3)(2/3) = 0.2.9

9 The number of small and large losses are not independent; in the case of a Binomial they are negatively correlated.
In the case of a Poisson, they are independent.
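As a quick check of the thinning result (my own addition): each of the m = 9 trials independently produces a
large claim exactly when it produces a claim (probability 0.3) that turns out to be large (probability 1/3),
i.e. with probability (0.3)(1/3) = 0.1. Thus the number of large claims is Binomial with m = 9 and q = 0.1,
and its mean of (9)(0.1) = 0.9 is indeed one third of the overall mean of (9)(0.3) = 2.7.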


Binomial Distribution

Support: x = 0, 1, 2, 3, ..., m.          Parameters: 1 > q > 0, m ≥ 1. m is an integer.
                                          m = 1 is a Bernoulli Distribution.

D. f.:    F(x) = 1 - β(x+1, m-x; q) = β(m-x, x+1; 1-q)        Incomplete Beta Function

P. d. f.: f(x) = m! q^x (1-q)^(m-x) / {x! (m-x)!} = (m choose x) q^x (1-q)^(m-x).

Mean = mq

Variance = mq(1-q)                        Variance / Mean = 1 - q < 1.

Coefficient of Variation = sqrt[(1-q) / (mq)].

Skewness = (1 - 2q) / sqrt[mq(1-q)].

Kurtosis = 3 + 1/{mq(1-q)} - 6/m.

Mode = largest integer in mq + q (if mq + q is an integer, then f(mq + q) = f(mq + q - 1)
and both mq + q and mq + q - 1 are modes.)

Probability Generating Function: P(z) = {1 + q(z-1)}^m

f(x+1)/f(x) = a + b/(x+1), a = -q/(1-q), b = (m+1)q/(1-q), f(0) = (1-q)^m.

Moment Generating Function: M(s) = (q e^s + 1 - q)^m
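For instance (a worked check I have added), for the Binomial Distribution with m = 8 and q = 0.7 graphed
below: mean = (8)(0.7) = 5.6, variance = (8)(0.7)(0.3) = 1.68, and since mq + q = 6.3 is not an integer,
the mode is the largest integer in 6.3, which is 6.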


Binomial Distribution with m = 8 and q = 0.7:
[graph of the probability function omitted]

Binomial Distribution with m = 8 and q = 0.2:
[graph of the probability function omitted]

Binomial Distribution with m = 8 and q = 0.5:
[graph of the probability function omitted]

Binomial Coefficients:
The binomial coefficient of x out of n trials is:

(n choose x) = n! / {x! (n-x)!} = {n(n-1)(n-2) ... (n+1-x)} / {x(x-1)(x-2) ... (1)} = Γ(n+1) / {Γ(x+1) Γ(n+1-x)}.
Below are some examples of Binomial Coefficients:

n    x=0  x=1  x=2  x=3  x=4  x=5  x=6  x=7  x=8  x=9  x=10  x=11
2    1    2    1
3    1    3    3    1
4    1    4    6    4    1
5    1    5    10   10   5    1
6    1    6    15   20   15   6    1
7    1    7    21   35   35   21   7    1
8    1    8    28   56   70   56   28   8    1
9    1    9    36   84   126  126  84   36   9    1
10   1    10   45   120  210  252  210  120  45   10   1
11   1    11   55   165  330  462  462  330  165  55   11    1

It is interesting to note that the entries in a row sum to 2^n.
For example, 1 + 6 + 15 + 20 + 15 + 6 + 1 = 64 = 2^6.
Also note that for x = 0 or x = n the binomial coefficient is one.
The entries in a row can be computed from the previous row. For example, the entry 45 in the
row n = 10 is the sum of 9 and 36, the two entries above it and to the left. Similarly, 120 = 36 + 84.
Note that: (n choose x) = (n choose n-x).
For example,
(11 choose 5) = 11! / {5! (11-5)!} = 39,916,800 / {(120)(720)} = 462 = 11! / {6! (11-6)!} = (11 choose 6).
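These facts are easy to verify by computer; here is a small Python check of my own (not part of the
original text), using the math.comb function for binomial coefficients:

    from math import comb

    row = [comb(6, x) for x in range(7)]     # [1, 6, 15, 20, 15, 6, 1]
    print(sum(row))                          # 64 = 2**6
    print(comb(11, 5), comb(11, 6))          # 462 462, illustrating C(n, x) = C(n, n-x)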


Using the Functions of the Calculator to Compute Binomial Coefficients:

Using the TI-30X-IIS, the binomial coefficient (n choose i) can be calculated as follows:
n
PRB
nCr
Enter
i
Enter

For example, in order to calculate (10 choose 3) = 10! / (3! 7!) = 120:
10
PRB
nCr
Enter
3
Enter

Using instead the BA II Plus Professional, in order to calculate (10 choose 3) = 10! / (3! 7!) = 120:
10
2nd
nCr
3
=

The TI-30XS Multiview calculator saves time doing repeated calculations using the same formula.
For example, constructing a table of the densities of a Binomial distribution, with m = 5 and q = 0.1:10

f(x) = (5 choose x) 0.1^x 0.9^(5-x).

table
y = (5 nCr x) * 0.1^x * 0.9^(5-x)
Enter
Start = 0
Step = 1
Auto
OK

x = 0    y = 0.59049
x = 1    y = 0.32805
x = 2    y = 0.07290
x = 3    y = 0.00810
x = 4    y = 0.00045
x = 5    y = 0.00001

10 Note that to get Binomial coefficients hit the prb key and select nCr.


Relation to the Beta Distribution:

The binomial coefficient looks almost like 1 over a complete Beta function.11
The incomplete Beta distribution for integer parameters can be used to compute the sum of terms
from the Binomial Distribution.12

β(a, b; x) = Σ from i = a to a+b-1 of (a+b-1 choose i) x^i (1-x)^(a+b-1-i).

For example, β(6, 9; 0.3) = 0.21948 = Σ from i = 6 to 14 of (14 choose i) 0.3^i 0.7^(14-i).

By taking appropriate differences of two Betas one can get any sum of binomial terms.
For example:
(n choose a) q^a (1-q)^(n-a) = β(a, n-(a-1); q) - β(a+1, n-a; q).

For example, (10 choose 3) 0.2³ 0.8⁷ = (120) 0.2³ 0.8⁷ = 0.20133 = β(3, 8; 0.2) - β(4, 7; 0.2).

β(a, b; x) = 1 - β(b, a; 1-x) = F2a,2b[bx / {a(1-x)}], where F is the distribution function of the
F-distribution with 2a and 2b degrees of freedom.
For example, β(4, 7; 0.607) = 0.950 = F8,14[(7)(0.607) / {(4)(0.393)}] = F8,14[2.70].13
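As a numerical check (my own sketch, assuming the SciPy library is available; it is not needed for the
exam), the Binomial tail probability and the incomplete Beta value quoted above can be compared directly:

    from scipy.stats import binom
    from scipy.special import betainc

    tail = binom.sf(5, 14, 0.3)      # P[X >= 6] for X Binomial with m = 14, q = 0.3
    beta_value = betainc(6, 9, 0.3)  # regularized incomplete Beta function, beta(6, 9; 0.3)
    print(tail, beta_value)          # both approximately 0.21948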

11 The complete Beta Function is defined as Γ(a)Γ(b) / Γ(a+b).
It is the divisor in front of the incomplete Beta function and is equal to the integral from 0 to 1 of x^(a-1) (1-x)^(b-1).
12 For a discussion of the Beta Distribution, see Mahler's Guide to Loss Distributions. On the exam you should
either compute the sum of binomial terms directly or via the Normal Approximation. Note that the use of the Beta
Distribution is an exact result, not an approximation. See for example the Handbook of Mathematical Functions, by
Abramowitz, et. al.
13 If one did an F-Test with 8 and 14 degrees of freedom, then there would be a 5% chance that the value exceeds
2.7.


Problems:
Use the following information for the next seven questions:
One observes 9 independent lives, each of which has a 20% chance of death over the coming
year.
3.1 (1 point) What is the mean number of deaths over the coming year?
A. less than 1.8
B. at least 1.8 but less than 1.9
C. at least 1.9 but less than 2.0
D. at least 2.0 but less than 2.1
E. at least 2.1
3.2 (1 point) What is the variance of the number of deaths observed over the coming year?
A. less than 1.5
B. at least 1.5 but less than 1.6
C. at least 1.6 but less than 1.7
D. at least 1.7 but less than 1.8
E. at least 1.8
3.3 (1 point) What is the chance of observing 4 deaths over the coming year?
A. less than 7%
B. at least 7% but less than 8%
C. at least 8% but less than 9%
D. at least 9% but less than 10%
E. at least 10%
3.4 (1 point) What is the chance of observing no deaths over the coming year?
A. less than 13%
B. at least 13% but less than 14%
C. at least 14% but less than 15%
D. at least 15% but less than 16%
E. at least 16%
3.5 (3 points) What is the chance of observing 6 or more deaths over the coming year?
A. less than .1%
B. at least .1% but less than .2%
C. at least .2% but less than .3%
D. at least .3% but less than .4%
E. at least .4%


3.6 (1 point) What is the median number of deaths per year?


A. 0
B. 1
C. 2
D. 3
E. None of A, B, C, or D
3.7 (1 point) What is the mode of the distribution of deaths per year?
A. 0
B. 1
C. 2
D. 3
E. None of A, B, C, or D

3.8 (1 point) Assume that each year that Joe starts alive, there is a 20% chance that he will die over
the coming year. What is the chance that Joe will die over the next 5 years?
A. less than 67%
B. at least 67% but less than 68%
C. at least 68% but less than 69%
D. at least 69% but less than 70%
E. at least 70%
3.9 (2 points) One insures 10 independent lives for 5 years. Assume that each year that an insured
starts alive, there is a 20% chance that he will die over the coming year.
What is the chance that 6 of these 10 insureds will die over the next 5 years?
A. less than 20%
B. at least 20% but less than 21%
C. at least 21% but less than 22%
D. at least 22% but less than 23%
E. at least 23%
3.10 (1 point) You roll 13 six-sided dice. What is the chance of observing exactly 4 sixes?
A. less than 10%
B. at least 10% but less than 11%
C. at least 11% but less than 12%
D. at least 12% but less than 13%
E. at least 13%
3.11 (1 point) You roll 13 six-sided dice. What is the average number of sixes observed?
A. less than 1.9
B. at least 1.9 but less than 2.0
C. at least 2.0 but less than 2.1
D. at least 2.1 but less than 2.2
E. at least 2.2
3.12 (1 point) You roll 13 six-sided dice.
What is the mode of the distribution of the number of sixes observed?
A. 1
B. 2
C. 3
D. 4
E. None of A, B, C, or D


3.13 (3 points) You roll 13 six-sided dice.


What is the median of the distribution of the number of sixes observed?
A. 1
B. 2
C. 3
D. 4
E. None of A, B, C, or D
3.14 (1 point) You roll 13 six-sided dice. What is the variance of the number of sixes observed?
A. less than 1.9
B. at least 1.9 but less than 2.0
C. at least 2.0 but less than 2.1
D. at least 2.1 but less than 2.2
E. at least 2.2
3.15 (2 points) The number of losses is Binomial with q = 0.4 and m = 90.
The sizes of loss are Exponential with mean 50: F(x) = 1 - e^(-x/50).
The number of losses and the sizes of loss are independent.
What is the probability of seeing exactly 3 losses of size greater than 100?
A. 9%
B. 11%
C. 13%
D. 15%
E. 17%
3.16 (2 points) Total claim counts generated from Policy A follow a Binomial distribution with
parameters m = 2 and q = 0.1. Total claim counts generated from Policy B follow a Binomial
distribution with parameters m = 2 and q = 0.6. Policy A is independent of Policy B.
For the two policies combined, what is the probability of observing 2 claims in total?
A. 32%
B. 34%
C. 36%
D. 38%
E. 40%
3.17 (2 points) Total claim counts generated from a portfolio follow a Binomial distribution with
parameters m = 9 and q = 0.1. Total claim counts generated from another independent portfolio
follow a Binomial distribution with parameters m = 15 and q = 0.1.
For the two portfolios combined, what is the probability of observing exactly 4 claims in total?
A. 11%
B. 13%
C. 15%
D. 17%
E. 19%
3.18 (3 points) The number of losses follows a Binomial distribution with m = 6 and q = 0.4.
Sizes of loss follow a Pareto Distribution with α = 4 and θ = 50,000.
There is a deductible of 5000, and a coinsurance of 80%.
Determine the probability that there are exactly two payments of size greater than 10,000.
A. 11%
B. 13%
C. 15%
D. 17%
E. 19%


Use the following information for the next two questions:


A state holds a lottery once a week.

The cost of a ticket is 1.


1,000,000 tickets are sold each week.
The prize is 1,000,000.
The chance of each ticket winning the prize is 1 in 1,400,000, independent of any other ticket.
In a given week, there can be either no winner, one winner, or multiple winners.
If there are multiple winners, each winner gets a 1,000,000 prize.
The lottery commission is given a reserve fund of 2,000,000 at the beginning of the year.

In any week where no prize is won, the lottery commission sends its receipts of 1 million to the
state department of revenue.
In any week in which prize(s) are won, the lottery commission pays the prize(s) from receipts
and if necessary the reserve fund.
If any week there is insufficient money to pay the prizes, the lottery commissioner must call
the governor of the state, in order to ask the governor to authorize the state department of
revenue to provide money to pay owed prizes and reestablish the reserve fund.
3.19 (3 points) What is the probability that the lottery commissioner has to call the governor the first
week?
A. 0.5%
B. 0.6%
C. 0.7%
D. 0.8%
E. 0.9%
3.20 (4 points) What is the probability that the lottery commissioner does not have to call the
governor the first year (52 weeks)?
A. 0.36%
B. 0.40%
C. 0.44%
D. 0.48%
E. 0.52%

3.21 (3 points) The number of children per family follows a Binomial Distribution m = 4 and q = 0.5.
For a child chosen at random, how many siblings (brothers and sisters) does he have on average?
A. 1.00
B. 1.25
C. 1.50
D. 1.75
E. 2.00
3.22 (2, 5/85, Q.2) (1.5 points) Suppose 30 percent of all electrical fuses manufactured by a
certain company fail to meet municipal building standards. What is the probability that in a random
sample of 10 fuses, exactly 3 will fail to meet municipal building standards?
A. (10 choose 3) (0.3^7) (0.7^3)
B. (10 choose 3) (0.3^3) (0.7^7)
C. 10 (0.3^3) (0.7^7)
D. Σ from i = 0 to 3 of (10 choose i) (0.3^i) (0.7^(10-i))
E. 1


3.23 (160, 11/86, Q.14) (2.1 points) In a certain population, 40p25 = 0.9 (i.e., the probability that a life aged 25 survives to age 65 is 0.9).
From a random sample of 100 lives at exact age 25, the random variable X is the number of lives
who survive to age 65. Determine the value one standard deviation above the mean of X.
(A) 90
(B) 91
(C) 92
(D) 93
(E) 94
3.24 (160, 5/91, Q.14) (1.9 points)
From a study of 100 independent lives over the interval (x, x+1], you are given:
(i) The underlying mortality rate, qx, is 0.1.
(ii) lx+s is linear over the interval.
(iii) There are no unscheduled withdrawals or intermediate entrants.
(iv) Thirty of the 100 lives are scheduled to end observation, all at age x + 1/3.
(v) Dx is the random variable for the number of observed deaths.
Calculate Var(Dx).
(A) 6.9

(B) 7.0

(C) 7.1

(D) 7.2

(E) 7.3

3.25 (2, 2/96, Q.10) (1.7 points) Let X1, X2, and X3 be independent discrete random variables
with probability functions P[Xi = k] = (ni choose k) p^k (1-p)^(ni - k), for i = 1, 2, 3, where 0 < p < 1.
Determine the probability function of S = X1 + X2 + X3, where positive.
A. (n1 + n2 + n3 choose s) p^s (1-p)^(n1 + n2 + n3 - s)

B.

n + nni + n3
i=1

ni
ps (1 p)ni - s
s

3 n i

C. p s (1 p)n i - s
i = 1s
3

D.

s i ps (1 p)ni i=1

n1 n2 n3
n
E.
ps (1 p)n1 n2 3 - s
s


3.26 (2, 2/96, Q.44) (1.7 points) The probability that a particular machine breaks down on any day
is 0.2 and is independent of the breakdowns on any other day.
The machine can break down only once per day.
Calculate the probability that the machine breaks down two or more times in ten days.
A. 0.0175
B. 0.0400
C. 0.2684
D. 0.6242
E. 0.9596
3.27 (4B, 11/96, Q.23) (2 points) Two observations are made of a random variable having a
binomial distribution with parameters m = 4 and q = 0.5.
Determine the probability that the sample variance is zero.
A. 0
B. Greater than 0, but less than 0.05
C. At least 0.05, but less than 0.15
D. At least 0.15, but less than 0.25
E. At least 0.25
3.28 (Course 1 Sample Exam, Q.40) (1.9 points) A small commuter plane has 30 seats.
The probability that any particular passenger will not show up for a flight is 0.10, independent of
other passengers. The airline sells 32 tickets for the flight. Calculate the probability that more
passengers show up for the flight than there are seats available.
A. 0.0042
B. 0.0343
C. 0.0382
D. 0.1221
E. 0.1564
3.29 (1, 5/00, Q.40) (1.9 points)
A company prices its hurricane insurance using the following assumptions:
(i) In any calendar year, there can be at most one hurricane.
(ii) In any calendar year, the probability of a hurricane is 0.05 .
(iii) The number of hurricanes in any calendar year is independent of the number of
hurricanes in any other calendar year.
Using the companys assumptions, calculate the probability that there are fewer
than 3 hurricanes in a 20-year period.
(A) 0.06
(B) 0.19
(C) 0.38
(D) 0.62
(E) 0.92
3.30 (1, 5/01, Q.13) (1.9 points) A study is being conducted in which the health of two
independent groups of ten policyholders is being monitored over a one-year period of time.
Individual participants in the study drop out before the end of the study with probability 0.2
(independently of the other participants). What is the probability that at least 9 participants complete
the study in one of the two groups, but not in both groups?
(A) 0.096
(B) 0.192
(C) 0.235
(D) 0.376
(E) 0.469


3.31 (1, 5/01, Q.37) (1.9 points) A tour operator has a bus that can accommodate 20 tourists. The
operator knows that tourists may not show up, so he sells 21 tickets. The probability that an
individual tourist will not show up is 0.02, independent of all other tourists.
Each ticket costs 50, and is non-refundable if a tourist fails to show up. If a tourist shows
up and a seat is not available, the tour operator has to pay 100, the ticket cost plus a penalty of 50,
to the tourist. What is the expected revenue of the tour operator?
(A) 935
(B) 950
(C) 967
(D) 976
(E) 985
3.32 (1, 11/01, Q.27) (1.9 points) A company establishes a fund of 120 from which it wants to
pay an amount, C, to any of its 20 employees who achieve a high performance level during the
coming year. Each employee has a 2% chance of achieving a high performance level during the
coming year, independent of any other employee.
Determine the maximum value of C for which the probability is less than 1% that the
fund will be inadequate to cover all payments for high performance.
(A) 24
(B) 30
(C) 40
(D) 60
(E) 120
3.33 (CAS3, 11/03, Q.14) (2.5 points) The Independent Insurance Company insures 25 risks,
each with a 4% probability of loss. The probabilities of loss are independent.
On average, how often would 4 or more risks have losses in the same year?
A. Once in 13 years
B. Once in 17 years
C. Once in 39 years
D. Once in 60 years
E. Once in 72 years
3.34 (CAS3, 11/04, Q.22) (2.5 points) An insurer covers 60 independent risks.
Each risk has a 4% probability of loss in a year.
Calculate how often 5 or more risks would be expected to have losses in the same year.
A. Once every 3 years
B. Once every 7 years
C. Once every 11 years
D. Once every 14 years
E. Once every 17 years


3.35 (CAS3, 11/04, Q.24) (2.5 points) A pharmaceutical company must decide how many
experiments to run in order to maximize its profits.

The company will receive a grant of $1 million if one or more of its experiments is successful.
Each experiment costs $2,900.
Each experiment has a 2% probability of success, independent of the other experiments.
All experiments are run simultaneously.
Fixed expenses are $500,000.
Ignore investment income.
The company performs the number of experiments that maximizes its expected profit.
Determine the company's expected profit before it starts the experiments.
A. 77,818
B. 77,829
C. 77,840
D. 77,851
E. 77,862
3.36 (SOA3, 11/04, Q.8 & 2009 Sample Q.124) (2.5 points)
For a tyrannosaur with a taste for scientists:
(i) The number of scientists eaten has a binomial distribution with q = 0.6 and m = 8.
(ii) The number of calories of a scientist is uniformly distributed on (7000, 9000).
(iii) The numbers of calories of scientists eaten are independent, and are independent of
the number of scientists eaten.
Calculate the probability that two or more scientists are eaten and exactly two of those eaten
have at least 8000 calories each.
(A) 0.23
(B) 0.25
(C) 0.27
(D) 0.30
(E) 0.33
3.37 (CAS3, 5/05, Q.15) (2.5 points) A service guarantee covers 20 television sets.
Each year, each set has a 5% chance of failing. These probabilities are independent.
If a set fails, it is replaced with a new set at the end of the year of failure.
This new set is included under the service guarantee.
Calculate the probability of no more than 1 failure in the first two years.
A. Less than 40.5%
B. At least 40.5%, but less than 41.0%
C. At least 41.0%, but less than 41.5%
D. At least 41.5%, but less than 42.0%
E. 42.0% or more


Solutions to Problems:
3.1. B. Binomial with q = 0.2 and m = 9. Mean = (9)(0.2) = 1.8.
3.2. A. Binomial with q = 0.2 and m = 9. Variance = (9)(0.2)(1 - 0.2) = 1.44.
3.3. A. Binomial with q = 0.2 and m = 9. f(4) = [9!/(4! 5!)] (0.2^4)(0.8^5) = 6.61%.
3.4. B. Binomial with q = 0.2 and m = 9. f(0) = [9!/(0! 9!)] (0.2^0)(0.8^9) = 13.4%.
3.5. D. Binomial with q = 0.2 and m = 9.
The chance of observing different numbers of deaths is:
Number of Deaths    Chance of Observation    Binomial Coefficient
0                   13.4218%                 1
1                   30.1990%                 9
2                   30.1990%                 36
3                   17.6161%                 84
4                   6.6060%                  126
5                   1.6515%                  126
6                   0.2753%                  84
7                   0.0295%                  36
8                   0.0018%                  9
9                   0.0001%                  1

Adding the chances of having 6, 7, 8 or 9 claims, the answer is 0.307%.
Alternately, one can add the chances of having 0, 1, 2, 3, 4 or 5 claims and subtract this sum from unity.
Comment: Although you should not do so for the exam, one could also answer this question using
the Incomplete Beta Function. The chance of more than x claims is β(x+1, m-x; q).
The chance of more than 5 claims is: β(5+1, 9-5; 0.2) = β(6, 4; 0.2) = 0.00307.
3.6. C. For a discrete distribution such as we have here, employ the convention that the median is
the first value at which the distribution function is greater than or equal to .5.
F(1) = 0.134 + 0.302 = 0.436 < 50%, F(2) = 0.134 + 0.302 + 0.302 = 0.738 > 50%,
and therefore the median is 2.


3.7. E. The mode is the value at which f(n) is a maximum; f(1) = .302 = f(2) and both 1 and 2 are
modes. Alternately, in general for the Binomial the mode is the largest integer in mq + q; the largest
integer in 2 is 2, but when mq + q is an integer both it and the integer one less are modes.
Comment: This is a somewhat unfair question. While it seems to me that E is the best single
answer, one could also argue for B or C. If you are unfortunate enough to have an apparently unfair
question on your exam, do not let it upset you while taking the exam.
3.8. B. The chance that Joe is alive at the end of 5 years is (1 - 0.2)^5 = 0.32768. Therefore, the chance
that he died is 1 - 0.32768 = 0.67232.
3.9. D. Based on the solution of the previous problem, for each life the chance of dying during the
five year period is 0.67232. Therefore, the number of deaths for the 10 independent lives is
Binomial with m = 10 and q = 0.67232.
f(6) = [10!/{(6!)(4!)}] (0.67232^6)(0.32768^4) = (210)(0.0924)(0.01153) = 0.224.
The chances of other numbers of deaths are as follows:
Number of Deaths    Chance of Observation    Binomial Coefficient
0                   0.001%                   1
1                   0.029%                   10
2                   0.270%                   45
3                   1.479%                   120
4                   5.312%                   210
5                   13.078%                  252
6                   22.360%                  210
7                   26.216%                  120
8                   20.171%                  45
9                   9.197%                   10
10                  1.887%                   1
Sum                                          1024

3.10. B. The chance of observing a six on an individual six-sided die is 1/6. Assuming the results
of the dice are independent, one has a Binomial distribution with q = 1/6 and m = 13.
f(4) = [13!/(4! 9!)] (1/6)^4 (5/6)^9 = 10.7%.


3.11. D, 3.12. B, & 3.13. B. Binomial with q = 1/6 and m = 13. Mean = (1/6)(13) = 2.17.
For the Binomial the mode is the largest integer in mq + q = (13)(1/6) + (1/6) = 2.33; the largest
integer in 2.33 is 2. Alternately compute all of the possibilities and 2 is the most likely.
F(1) = 0.336 < 0.5 and F(2) = 0.628 ≥ 0.5, therefore the median is 2.
Number of Sixes   Chance of Observation   Binomial Coefficient   Cumulative Distribution
0                 9.3463879%              1                      9.346%
1                 24.3006085%             13                     33.647%
2                 29.1607302%             78                     62.808%
3                 21.3845355%             286                    84.192%
4                 10.6922678%             715                    94.885%
5                 3.8492164%              1287                   98.734%
6                 1.0264577%              1716                   99.760%
7                 0.2052915%              1716                   99.965%
8                 0.0307937%              1287                   99.996%
9                 0.0034215%              715                    100.000%
10                0.0002737%              286                    100.000%
11                0.0000149%              78                     100.000%
12                0.0000005%              13                     100.000%
13                0.0000000%              1                      100.000%
Sum               1                       8192

3.14. A. Binomial with q =1/6 and m =13. Variance = (13)(1/6)(1-1/6) = 1.806.


3.15. D. S(100) = e^(-100/50) = 0.1353. Therefore, thinning the original Binomial, the number of large
losses is Binomial with m = 90 and q = (0.1353)(0.4) = 0.05413.
f(3) = {(90)(89)(88)/3!} (0.05413^3)(1 - 0.05413)^87 = 0.147.
3.16. D. Prob[2 claims in total] =
Prob[A = 0] Prob[B = 2] + Prob[A = 1] Prob[B = 1] + Prob[A = 2] Prob[B = 0] =
(0.9^2)(0.6^2) + {(2)(0.1)(0.9)}{(2)(0.6)(0.4)} + (0.1^2)(0.4^2) = 37.96%.
Comment: The sum of A and B is not a Binomial distribution, since their q parameters differ.
3.17. B. For the two portfolios combined, total claim counts follow a Binomial distribution with
parameters m = 9 + 15 = 24 and q = 0.1.
f(4) = [24!/(4! 20!)] q^4 (1-q)^20 = {(24)(23)(22)(21)/4!}(0.1^4)(0.9^20) = 12.9%.


3.18. B. A payment is of size greater than 10,000 if the loss is of size greater than:
10,000/0.8 + 5,000 = 17,500.
Probability of a loss of size greater than 17,500 is: {50/(50 + 17.5)}^4 = 30.1%.
The large losses are Binomial with m = 6 and q = (0.301)(0.4) = 0.1204.
f(2) = [6!/(2! 4!)] (0.1204^2)(1 - 0.1204)^4 = 13.0%.
Comment: An example of thinning a Binomial.
3.19. B. The number of prizes is Binomial with m = 1 million and q = 1/1,400,000.
f(0) = (1 - 1/1,400,000)^1,000,000 = 48.95%.
f(1) = 1,000,000 (1 - 1/1,400,000)^999,999 (1/1,400,000) = 34.97%.
f(2) = {(1,000,000)(999,999)/2} (1 - 1/1,400,000)^999,998 (1/1,400,000)^2 = 12.49%.
f(3) = {(1,000,000)(999,999)(999,998)/6} (1 - 1/1,400,000)^999,997 (1/1,400,000)^3 = 2.97%.
n      f(n)
0      48.95%
1      34.97%
2      12.49%
3      2.97%
4      0.53%
5      0.08%
6      0.01%
Sum    100.00%

The first week, the lottery has enough money to pay 3 prizes,
(1 million in receipts + 2 million in the reserve fund.)
The probability of more than 3 prizes is: 1 - (48.95% + 34.97% + 12.49% + 2.97%) = 0.62%.
3.20. C. Each week there is a .4895 + .3497 = .8392 chance of no need for the reserve fund.
Each week there is a .1249 chance of a 1 million need from the reserve fund.
Each week there is a .0297 chance of a 2 million need from the reserve fund.
Each week there is a .0062 chance of a 3 million or more need from the reserve fund.
The governor will be called if there is at least 3 weeks with 2 prizes each (since each such week
depletes the reserve fund by 1 million), or if there is 1 week with 2 prizes plus 1 week with 3 prizes,
or if there is a week with 4 prizes.
Prob[Governor not called] = Prob[no weeks with more than 1 prize] +
Prob[1 week @2, no weeks more than 2] + Prob[2 weeks @2, no weeks more than 2] +
Prob[0 week @2, 1 week @3, no weeks more than 3] =
(0.8392^52) + (52)(0.1249)(0.8392^51) + {(52)(51)/2}(0.1249^2)(0.8392^50) + (52)(0.0297)(0.8392^51) =
0.00011 + 0.00085 + 0.00323 + 0.00020 = 0.00439.
Comment: The lottery can not keep receipts from good weeks in order to build up the reserve fund.
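For readers who want to double-check the arithmetic, the four terms above can be re-evaluated with a few lines of Python. This sketch simply recomputes the solution's own weekly probabilities; it is not part of the original solution.
from math import comb

p01, p2, p3 = 0.8392, 0.1249, 0.0297   # weekly probabilities of 0-1, exactly 2, exactly 3 prizes (from the solution)

terms = [
    p01**52,                         # no weeks with more than 1 prize
    52 * p2 * p01**51,               # exactly one week with 2 prizes
    comb(52, 2) * p2**2 * p01**50,   # exactly two weeks with 2 prizes
    52 * p3 * p01**51,               # exactly one week with 3 prizes
]
print([round(t, 5) for t in terms], round(sum(terms), 5))   # 0.00011, 0.00085, 0.00323, 0.00020; total about 0.00439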


3.21. C. Let n be the number of children in a family.
The probability that the child picked is in a family of size n is proportional to the product of the size of
the family and the proportion of families of that size: n f(n).
Thus, Prob[child is in a family of size n] = n f(n) / Σ n f(n) = n f(n) / E[N].
For n > 0, the number of siblings the child has is n - 1.
Thus the mean number of siblings is:
Σ n f(n) (n - 1) / E[N] = Σ (n^2 - n) f(n) / E[N] = (E[N^2] - E[N]) / E[N] = E[N^2]/E[N] - 1 =
(Var[N] + E[N]^2)/E[N] - 1 = Var[N]/E[N] + E[N] - 1 = mq(1 - q)/(mq) + mq - 1 = 1 - q + mq - 1
= (m - 1)q = (3)(0.5) = 1.5.
Alternately, assume for example 10,000 families.
Number of   Binomial   Number of   Number of   Number of   Product of # of Children
Children    Density    Families    Children    Siblings    Times # of Siblings
0           0.0625     625         0           0           0
1           0.2500     2,500       2,500       0           0
2           0.3750     3,750       7,500       1           7,500
3           0.2500     2,500       7,500       2           15,000
4           0.0625     625         2,500       3           7,500
Total       1.0000     10,000      20,000                  30,000

Mean of number of siblings for a child chosen at random is: 30,000 / 20,000 = 1.5.
Comment: The average size family has two children; each of these children has one sibling.
However, a child chosen at random is more likely to be from a large family.
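As a quick numerical check of the size-biasing argument in this solution, here is a short Python sketch (mine, not the author's) that recomputes the 1.5 siblings directly from the Binomial densities with m = 4 and q = 0.5.
from math import comb

m, q = 4, 0.5
f = [comb(m, n) * q**n * (1 - q)**(m - n) for n in range(m + 1)]   # family-size densities

# A randomly chosen child is in a family of size n with probability n f(n) / E[N].
EN = sum(n * f[n] for n in range(m + 1))
mean_siblings = sum(n * f[n] * (n - 1) for n in range(m + 1)) / EN
print(mean_siblings)   # 1.5 = (m - 1) q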
3.22. B. A Binomial Distribution with m = 10 and q = 0.3. f(3) = [10!/(3! 7!)] (0.3^3)(0.7^7).
3.23. D. The number of people who survive is Binomial with m = 100 and q = 0.9.
Mean = (100)(.9) = 90. Variance = (100)(.9)(.1) = 9. Mean + Standard Deviation = 93.
3.24. E. The number of deaths is sum of two Binomials, one with m = 30 and q = 0.1/3, and the
other with m = 70 and q = 0.1.
The sum of their variances is: (30)(.03333)(1 - .03333) + (70)(.1)(.9) = .967 + 6.3 = 7.267.
3.25. A. Each Xi is Binomial with parameters ni and p.
The sum is Binomial with parameters n1 + n2 + n3 and p.


3.26. D. Binomial with m = 10 and q = 0.2. 1 - f(0) - f(1) = 1 - 0.8^10 - (10)(0.2)(0.8^9) = 0.624.
3.27. E. The sample variance is the average squared deviation from the mean; thus the sample
variance is positive unless all the observations are equal. In this case, the sample variance is zero if
and only if the two observations are equal. For this Binomial the chance of observing a given
number of claims is:
number of claims:   0      1      2      3      4
probability:        1/16   4/16   6/16   4/16   1/16
Thus the chance that the two observations are equal is:
(1/16)^2 + (4/16)^2 + (6/16)^2 + (4/16)^2 + (1/16)^2 = 70/256 = 0.273.
Comment: For example, the chance of 3 claims is: [m!/{3! (m-3)!}] q^3 (1-q)^(m-3) =
[4!/{3! 1!}] (0.5^3)(1 - 0.5) = 4/16.
3.28. E. The number of passengers that show up for a flight is Binomial with m = 32 and
q = 0.90. Prob[more show up than seats] = f(31) + f(32) = 32(0.1)(0.9^31) + 0.9^32 = 0.1564.
3.29. E. The number of hurricanes is Binomial with m = 20 and q = 0.05.
Prob[< 3 hurricanes] = f(0) + f(1) + f(2) = 0.95^20 + 20(0.05)(0.95^19) + 190(0.05^2)(0.95^18) = 0.9245.
3.30. E. Each group is Binomial with m = 10 and q = 0.8.
Prob[at least 9 complete] = f(9) + f(10) = 10(0.2)(0.8^9) + 0.8^10 = 0.376.
Prob[one group has at least 9 and one group does not] = (2)(0.376)(1 - 0.376) = 0.469.
3.31. E. The bus driver collects (21)(50) = 1050 for the 21 tickets he sells. However, he may be
required to refund 100 to one passenger if all 21 ticket holders show up. The number of tourists who
show up is Binomial with m = 21 and q = 0.98.
Expected penalty is: 100 f(21) = 100(0.98^21) = 65.425.
Expected revenue is: (21)(50) - 65.425 = 984.6.


3.32. D. The fund will be inadequate if there are more than 120/C payments.
The number of payments is Binomial with m = 20 and q = .02.
x    f(x)       F(x)
0    0.66761    0.66761
1    0.27249    0.94010
2    0.05283    0.99293
3    0.00647    0.99940

There is a 1 - 0.94010 = 5.990% chance of needing more than one payment.
There is a 1 - 0.99293 = 0.707% chance of needing more than two payments.
Thus we need to require that the fund be able to make two payments. 120/C = 2. C = 60.
3.33. D. This is the sum of 25 independent Bernoullis, each with q = 0.04.
The number of losses per year is Binomial with m = 25 and q = 0.04.
f(0) = (1 - q)^m = (1 - 0.04)^25 = 0.3604.
f(1) = mq(1 - q)^(m-1) = (25)(0.04)(1 - 0.04)^24 = 0.3754.
f(2) = {m(m-1)/2!} q^2 (1 - q)^(m-2) = (25)(24/2)(0.04^2)(1 - 0.04)^23 = 0.1877.
f(3) = {m(m-1)(m-2)/3!} q^3 (1 - q)^(m-3) = (25)(24)(23/6)(0.04^3)(1 - 0.04)^22 = 0.0600.
Prob[at least 4] = 1 - {f(0) + f(1) + f(2) + f(3)} = 1 - 0.9835 = 0.0165.
4 or more risks have losses in the same year on average once in: 1/0.0165 = 60.6 years.
3.34. C. A Binomial Distribution with m = 60 and q = 0.04.
f(0) = 0.96^60 = 0.08635. f(1) = (60)(0.04)(0.96^59) = 0.21588. f(2) = {(60)(59)/2}(0.04^2)(0.96^58) = 0.26535.
f(3) = {(60)(59)(58)/6}(0.04^3)(0.96^57) = 0.21376. f(4) = {(60)(59)(58)(57)/24}(0.04^4)(0.96^56) = 0.12692.
1 - f(0) - f(1) - f(2) - f(3) - f(4) = 1 - 0.08635 - 0.21588 - 0.26535 - 0.21376 - 0.12692 = 0.09174.
1/0.09174 = Once every 11 years.
3.35. A. Assume n experiments are run. Then the probability of no successes is 0.98^n.
Thus the probability of at least one success is: 1 - 0.98^n.
Expected profit is:
(1,000,000)(1 - 0.98^n) - 2900n - 500,000 = 500,000 - (1,000,000)(0.98^n) - 2900n.
Setting the derivative with respect to n equal to zero:
0 = -ln(0.98)(1,000,000)(0.98^n) - 2900. 0.98^n = 0.143545. n = 96.1.
Taking n = 96, the expected profit is 77,818.
Comment: For n = 97, the expected profit is 77,794.
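Since the optimal n here is found by calculus and then rounded, a brute-force check is easy; this is a minimal sketch using only the costs stated in the problem, not part of the original solution.
def expected_profit(n):
    # grant of 1,000,000 if at least one success, less experiment costs and fixed expenses
    return 1_000_000 * (1 - 0.98**n) - 2900 * n - 500_000

best_n = max(range(1, 201), key=expected_profit)
print(best_n, round(expected_profit(best_n)))   # 96 experiments, expected profit about 77,818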


3.36. D. (9000 - 8000)/(9000 - 7000) = 1/2. Half the scientists are large.
Therefore, thinning the original Binomial, the number of large scientists is Binomial with m = 8 and
q = 0.6/2 = 0.3. f(2) = {(8)(7)/2} (0.7^6)(0.3^2) = 0.2965.
Alternately, this is a compound frequency distribution, with primary distribution a Binomial with
q = 0.6 and m = 8, and secondary distribution a Bernoulli with q = 1/2 (half chance a scientist is large.)
One can use the Panjer Algorithm. For the primary Binomial Distribution,
a = -q/(1-q) = -0.6/0.4 = -1.5. b = (m+1)q/(1-q) = (9)(1.5) = 13.5. P(z) = {1 + q(z-1)}^m.
c(0) = Pp(s(0)) = Pp(0.5) = {1 + (0.6)(0.5 - 1)}^8 = 0.057648.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (1/1.75) Σ_{j=1}^{x} (-1.5 + 13.5j/x) s(j) c(x-j).
c(1) = (1/1.75)(-1.5 + 13.5) s(1) c(0) = (1/1.75)(12)(1/2)(0.057648) = 0.197650.
c(2) = (1/1.75){(-1.5 + 13.5/2) s(1) c(1) + (-1.5 + (2)(13.5)/2) s(2) c(0)} =
(1/1.75){(5.25)(1/2)(0.197650) + (12)(0)(0.057648)} = 0.296475.
Alternately, one can list all the possibilities:
Number of    Binomial      Given the number of Scientists, the        Extension
Scientists   Probability   Probability that exactly two are large
0            0.00066       0                                          0.00000
1            0.00786       0                                          0.00000
2            0.04129       0.25                                       0.01032
3            0.12386       0.375                                      0.04645
4            0.23224       0.375                                      0.08709
5            0.27869       0.3125                                     0.08709
6            0.20902       0.234375                                   0.04899
7            0.08958       0.1640625                                  0.01470
8            0.01680       0.109375                                   0.00184
Sum          1.00000                                                  0.29648

For example, if 6 scientists have been eaten, then the chance that exactly two of them are large is:
(0.5^6) 6!/(4! 2!) = 0.234375. In algebraic form, this solution is:
Σ_{n=2}^{8} {8!/(n! (8-n)!)} 0.6^n 0.4^(8-n) {n!/(2! (n-2)!)} 0.5^n = (1/2) Σ_{n=2}^{8} {8!/((n-2)! (8-n)!)} 0.3^n 0.4^(8-n) =
(1/2)(8)(7)(0.3^2) Σ_{i=0}^{6} {6!/(i! (6-i)!)} 0.3^i 0.4^(6-i) = (28)(0.09)(0.3 + 0.4)^6 = 0.2965.

Comment: The Panjer Algorithm (Recursive Method) is discussed in Mahler's Guide to Aggregate
Distributions.


"Two or more scientists are eaten and exactly two of those eaten have at least 8000 calories each"
is the same event as "exactly two large scientists are eaten, as well as some unknown number of small scientists."
"At least 2 claims, of which exactly two are large" is the same event as
"exactly 2 large claims and some unknown number of small claims."
3.37. A. One year is a Binomial Distribution with m = 20 and q = 0.05.
The years are independent of each other. Therefore, the number of failures over 2 years is a Binomial
Distribution with m = 40 and q = 0.05.
Prob[0 or 1 failures] = 0.95^40 + (40)(0.95^39)(0.05) = 39.9%.
Comment: In this question, when a TV fails it is replaced. Therefore, we can have a failure in both
years for a given customer. A somewhat different question than asked would be,
assuming each customer owns one set, calculate the probability that no more than one customer
suffers a failure during the two years. For a given customer, the probability of no failure in the first two
years is: 0.95^2 = 0.9025. The probability of 0 or 1 customers suffering a failure is:
0.9025^20 + (20)(0.0975)(0.9025^19) = 40.6%.


Section 4, Poisson Distribution


The Poisson Distribution is the single most important frequency distribution to study for the exam.14
The density function for the Poisson is:
f(x) = λ^x e^(-λ) / x!, x ≥ 0.
Note that unlike the Binomial, the Poisson density function is positive for all x ≥ 0; there is no limit on
the possible number of claims. The Poisson has a single parameter λ.
The Distribution Function is 1 at infinity since Σ λ^x / x! is the series for e^λ.
For example, here's a Poisson for λ = 2.5:
n      f(n)     F(n)
0      0.082    0.082
1      0.205    0.287
2      0.257    0.544
3      0.214    0.758
4      0.134    0.891
5      0.067    0.958
6      0.028    0.986
7      0.010    0.996
8      0.003    0.999
9      0.001    1.000
10     0.000    1.000

[Figure: bar chart of the Poisson density for λ = 2.5, probability (vertical axis, 0 to 0.25) versus x (horizontal axis, 0 to 10).]

For example, the chance of 4 claims is: f(4) = λ^4 e^(-λ) / 4! = 2.5^4 e^(-2.5) / 4! = 0.1336.
Remember, there is a small chance of a very large number of claims.
For example, f(15) = 2.5^15 e^(-2.5) / 15! = 6 x 10^(-8).
Such large numbers of claims can contribute significantly to the higher moments of the distribution.
14 The Poisson comes up among other places in the Gamma-Poisson frequency process, to be discussed in a
subsequent section. Poisson processes are discussed in Mahler's Guide to Stochastic Models.
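The density and distribution values tabulated above for λ = 2.5 are easy to reproduce; a minimal Python sketch (using only the standard library):
from math import exp, factorial

lam = 2.5
f = lambda n: lam**n * exp(-lam) / factorial(n)   # Poisson density

F = 0.0
for n in range(11):
    F += f(n)
    print(n, round(f(n), 3), round(F, 3))   # e.g. f(4) = 0.134, F(4) = 0.891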


Let's calculate the first two moments for this Poisson distribution with λ = 2.5:

Number      Probability          Probability x     Probability x Square   Distribution
of Claims   Density Function     # of Claims       of # of Claims         Function
0           0.08208500           0.00000000        0.00000000             0.08208500
1           0.20521250           0.20521250        0.20521250             0.28729750
2           0.25651562           0.51303124        1.02606248             0.54381312
3           0.21376302           0.64128905        1.92386716             0.75757613
4           0.13360189           0.53440754        2.13763017             0.89117802
5           0.06680094           0.33400471        1.67002357             0.95797896
6           0.02783373           0.16700236        1.00201414             0.98581269
7           0.00994062           0.06958432        0.48709021             0.99575330
8           0.00310644           0.02485154        0.19881233             0.99885975
9           0.00086290           0.00776611        0.06989496             0.99972265
10          0.00021573           0.00215725        0.02157252             0.99993837
11          0.00004903           0.00053931        0.00593244             0.99998740
12          0.00001021           0.00012257        0.00147085             0.99999762
13          0.00000196           0.00002554        0.00033196             0.99999958
14          0.00000035           0.00000491        0.00006875             0.99999993
15          0.00000006           0.00000088        0.00001315             0.99999999
16          0.00000001           0.00000015        0.00000234             1.00000000
17          0.00000000           0.00000002        0.00000039             1.00000000
18          0.00000000           0.00000000        0.00000006             1.00000000
19          0.00000000           0.00000000        0.00000001             1.00000000
20          0.00000000           0.00000000        0.00000000             1.00000000
Sum         1.00000000           2.50000000        8.75000000

The mean is 2.5 = λ. The variance is: E[X^2] - E[X]^2 = 8.75 - 2.5^2 = 2.5 = λ.
In general, the mean of the Poisson is λ and the variance is λ.
In this case the mode is 2, since f(2) = 0.2565, larger than any other value of the probability density
function. In general, the mode of the Poisson is the largest integer in λ.15 This follows from the fact that
for the Poisson f(x+1) / f(x) = λ / (x+1). Thus for the Poisson the mode is less than or equal to the
mean λ.
The median in this case is 2, since F(2) = 0.544 ≥ 0.5, while F(1) = 0.287 < 0.5. The median as well as
the mode are less than the mean, which is typical for distributions skewed to the right.

15 If λ is an integer then f(λ) = f(λ-1), and both λ and λ-1 are modes.
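A quick numerical confirmation that the mean and variance are both λ, using the same λ = 2.5 (a sketch, not part of the text):
from math import exp, factorial

lam = 2.5
f = [lam**n * exp(-lam) / factorial(n) for n in range(60)]   # truncate far out in the tail

mean = sum(n * p for n, p in enumerate(f))
second_moment = sum(n * n * p for n, p in enumerate(f))
print(round(mean, 6), round(second_moment - mean**2, 6))   # both 2.5 = lambda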


Claim Intensity, Derivation of the Poisson:


Assume one has a claim intensity of λ. The chance of having a claim over an extremely small period
of time Δt is approximately λ(Δt). (The claim intensity is analogous to the force of mortality in Life
Contingencies.) If the claim intensity is a constant over time and the chance of having a claim in any
interval is independent of the chance of having a claim in any other disjoint interval, then the number
of claims observed over a period of time t is given by a Poisson Distribution, with parameter λt.
A Poisson is characterized by a constant independent claim intensity and vice versa. For
example, if the chance of a claim each month is 0.1%, and months are independent of each other,
the distribution of the number of claims over a 5 year period (60 months) is Poisson with mean = 6%.
For the Poisson, the parameter λ = mean = (claim intensity)(total time covered). Therefore, if for
example one has a Poisson in each of five years with parameter λ, then over the entire 5 year period
one has a Poisson with parameter 5λ.
Adding Poissons:
The sum of two independent variables each of which is Poisson with parameters λ1 and
λ2 is also Poisson, with parameter λ1 + λ2.16 This follows from the fact that for a very small
time interval the chance of a claim is the sum of the chance of a claim from either variable, since they
are independent.17 If the total time interval is one, then the chance of a claim from either variable over
a very small time interval Δt is λ1 Δt + λ2 Δt = (λ1 + λ2)Δt. Thus the sum of the variables has constant
claim intensity (λ1 + λ2) over a time interval of one, and is therefore a Poisson with parameter
λ1 + λ2.
For example, the sum of two independent Poisson variables with means 3% and 5%
is a Poisson variable with mean 8%. So if a portfolio consists of one risk Poisson with mean 3% and
one risk Poisson with mean 5%, the number of claims observed for the whole portfolio is Poisson
with mean 8%.

16 See Theorem 6.1 in Loss Models.
17 This can also be shown from simple algebra, by summing over i + j = k the terms (λ1^i e^(-λ1) / i!) (λ2^j e^(-λ2) / j!) =
e^(-(λ1 + λ2)) (λ1^i λ2^j / i! j!). By the Binomial Theorem, these terms sum to e^(-(λ1 + λ2)) (λ1 + λ2)^k / k!.
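The convolution argument in footnote 17 can also be checked directly. The sketch below uses the 3% and 5% means from the example in the text; the agreement of the two printed columns illustrates that the convolution of the two Poissons equals a Poisson with mean λ1 + λ2.
from math import exp, factorial

def poisson(lam, n):
    return lam**n * exp(-lam) / factorial(n)

lam1, lam2 = 0.03, 0.05
for k in range(5):
    convolution = sum(poisson(lam1, i) * poisson(lam2, k - i) for i in range(k + 1))
    print(k, round(convolution, 10), round(poisson(lam1 + lam2, k), 10))   # the two columns agree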


Exercise: Assume one had a portfolio of 25 exposures. Assume each exposure has an
independent Poisson frequency process, with mean 3%. What is the frequency distribution for the
claims from the whole portfolio?
[Solution: A Poisson Distribution with mean: (25)(3%) = 0.75.]
If one has a large number of independent events each with a small probability of occurrence,
then the number of events that occurs has approximately a constant claims intensity and is
thus approximately Poisson Distributed. Therefore the Poisson Distribution can be useful in
modeling such situations.
Thinning a Poisson:18
Sometimes one selects only some of the claims. This is sometimes referred to as thinning the
Poisson distribution. For example, if frequency is given by a Poisson and severity is
independent of frequency, then the number of claims above a certain amount (in
constant dollars) is also a Poisson.
For example, assume that we have a Poisson with mean frequency of 30 and that the size of loss
distribution is such that 20% of the losses are greater than $1 million (in constant dollars). Then the
number of losses observed greater than $1 million (in constant dollars) is also Poisson but with a
mean of (20%)(30) = 6. Similarly, losses observed smaller than $1 million (in constant dollars) is
also Poisson, but with a mean of (80%)(30) = 24.
Exercise: Frequency is Poisson with λ = 5. Sizes of loss are Exponential with θ = 100.
Frequency and severity are independent.
What is the distribution of the number of losses of size between 50 and 200?
[Solution: F(200) - F(50) = e^(-0.5) - e^(-2) = 0.471.
Thus the number of medium sized losses is Poisson with mean: (0.471)(5) = 2.36.
Comment: The number of large losses is Poisson with mean 5 S(200) = (0.135)(5) = 0.68.
The number of small losses is Poisson with mean 5 F(50) = (0.393)(5) = 1.97.
These three Poisson Distributions are independent of each other.]
In this example, the total number of losses is Poisson and therefore has a constant independent
claims intensity of 5. Since frequency and severity are independent, the large losses also have a
constant independent claims intensity of 5 S(200), and their number is therefore Poisson with mean 5 S(200).
Similarly, the small losses have constant independent claims intensity of 5 F(50) and therefore are
Poisson. Also, these two processes are independent of each other.
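The three thinned means in this exercise can be reproduced in a couple of lines; this sketch assumes only the Exponential survival function S(x) = e^(-x/θ) used above.
from math import exp

lam, theta = 5, 100
S = lambda x: exp(-x / theta)      # Exponential survival function

p_small  = 1 - S(50)               # losses of size less than 50
p_medium = S(50) - S(200)          # losses of size between 50 and 200
p_large  = S(200)                  # losses of size greater than 200

print(round(lam * p_small, 2), round(lam * p_medium, 2), round(lam * p_large, 2))
# 1.97, 2.36, 0.68 -- three independent Poisson means that add back up to 5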

18

See Theorem 6.2 in Loss Models.


If in this example we had a deductible of 200, then only losses of size greater than 200 would result
in a (non-zero) payment. Loss Models refers to the number of payments as NP, in contrast to NL
the number of losses.19 In this example, NL is Poisson with mean 5, while for a 200 deductible NP is
Poisson with mean 0.68.
Thinning a Poisson based on size of loss is a special case of decomposing Poisson frequencies.
The key idea is that there is some way to divide the claims up into mutually exclusive types that are
independent. Then each type is also Poisson, and the Poisson Distributions for each
type are independent.
Exercise: Claim frequency follows a Poisson Distribution with a mean of 20% per year. 1/4 of all
claims involve attorneys. If attorney involvement is independent between different claims, what is
the probability of getting 2 claims involving attorneys in the next year?
[Solution: Claims with attorney involvement are Poisson with mean frequency 20%/4 = 5%.
Thus f(2) = (0.05^2) e^(-0.05) / 2! = 0.00119.]
Derivation of Results for Thinning Poissons:20
If losses are Poisson with mean λ, and one selects a portion, t, of the losses in a manner
independent of the frequency, then the selected losses are also Poisson but with mean λt.
Prob[# selected losses = n] = Σ_{m=n}^{∞} Prob[m total # losses] Prob[n of m losses are selected]
= Σ_{m=n}^{∞} {e^(-λ) λ^m / m!} {m! / (n! (m-n)!)} t^n (1-t)^(m-n)
= {e^(-λ) t^n λ^n / n!} Σ_{m=n}^{∞} λ^(m-n) (1-t)^(m-n) / (m-n)!
= {e^(-λ) t^n λ^n / n!} Σ_{i=0}^{∞} {λ(1-t)}^i / i! = {e^(-λ) t^n λ^n / n!} e^(λ(1-t))
= e^(-λt) (λt)^n / n! = f(n) for a Poisson with mean λt.
In a similar manner, the number not selected follows a Poisson with mean λ(1-t).

19 I do not regard this notation as particularly important. See Section 8.6 of Loss Models.
20 I previously discussed how these results follow from the constant, independent claims intensity.


Prob[# selected losses = n | # not selected losses = j] =
Prob[total # = n + j and # not selected losses = j] / Prob[# not selected losses = j] =
Prob[total # = n + j] Prob[# not selected losses = j | total # = n + j] / Prob[# not selected losses = j] =
{e^(-λ) λ^(n+j) / (n + j)!} {(n + j)! / (n! j!)} (1-t)^j t^n / [e^(-λ(1-t)) {λ(1-t)}^j / j!]
= e^(-λt) (λt)^n / n! =
f(n) for a Poisson with mean λt = Prob[# selected losses = n].
Thus the number selected and the number not selected are independent. They are independent
Poisson distributions. The same result follows when dividing into more than 2 disjoint subsets.
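A simulation is another way to see both conclusions (each thinned count is Poisson, and the two counts are independent). The sketch below is purely illustrative; the values λ = 4 and t = 0.3 are arbitrary choices of mine, not from the text.
import math, random

random.seed(1)
lam, t, trials = 4.0, 0.3, 200_000   # hypothetical values, for illustration only

def poisson_draw(rate):
    # inversion by sequential search; adequate for a small mean
    u, n, p = random.random(), 0, math.exp(-rate)
    cdf = p
    while u > cdf:
        n += 1
        p *= rate / n
        cdf += p
    return n

selected, not_selected = [], []
for _ in range(trials):
    total = poisson_draw(lam)
    k = sum(random.random() < t for _ in range(total))   # each loss selected with probability t
    selected.append(k)
    not_selected.append(total - k)

m1 = sum(selected) / trials
m2 = sum(not_selected) / trials
cov = sum(a * b for a, b in zip(selected, not_selected)) / trials - m1 * m2
print(round(m1, 2), round(m2, 2), round(cov, 3))   # about 1.2 = lam*t, 2.8 = lam*(1-t), covariance near 0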
Effect of Exposures:21
Assume one has 100 exposures with independent, identically distributed frequency distributions. If
each one is Poisson with mean λ, then so is the sum, with mean 100λ. If we change the number of exposures to
for example 150, then the sum is Poisson with mean 150λ, or 1.5 times the mean in the first case. In
general, as the exposures change, the distribution remains Poisson with the mean changing in
proportion.
Exercise: The total number of claims from a portfolio of private passenger automobile insureds has a
Poisson Distribution with λ = 60. If next year the portfolio has only 80% of the current exposures,
what is its frequency distribution?
[Solution: Poisson with λ = (0.8)(60) = 48.]
This same result holds for a Compound Frequency Distribution, to be discussed subsequently, with
a primary distribution that is Poisson.

21

See Section 6.12 of Loss Models.


Poisson Distribution
Parameters: λ > 0                  Support: x = 0, 1, 2, 3, ...
D. f. :   F(x) = 1 - Γ(x+1 ; λ)    Incomplete Gamma Function22
P. d. f. :   f(x) = λ^x e^(-λ) / x!
Mean = λ
Variance = λ                       Variance / Mean = 1.
Coefficient of Variation = 1/√λ
Skewness = 1/√λ = CV.
Kurtosis = 3 + 1/λ = 3 + CV^2.
Mode = largest integer in λ (if λ is an integer then both λ and λ-1 are modes.)
nth Factorial Moment = λ^n.
Probability Generating Function: P(z) = e^(λ(z-1)), λ > 0.
Moment Generating Function: M(s) = exp[λ(e^s - 1)].
f(x+1) / f(x) = a + b / (x+1), a = 0, b = λ, f(0) = e^(-λ).
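The f(x+1)/f(x) relation in the box is the (a, b, 0) recursion with a = 0 and b = λ; a short sketch using it to build the densities for the λ = 10 example graphed below:
from math import exp

lam = 10.0
f = [exp(-lam)]                            # f(0) = e^(-lambda)
for x in range(30):
    f.append(f[x] * (0 + lam / (x + 1)))   # f(x+1) = f(x) (a + b/(x+1)), with a = 0 and b = lambda
print(round(f[9], 5), round(f[10], 5))     # equal, so 9 and 10 are both modes when lambda = 10
print(round(sum(f), 5))                    # probabilities through x = 30 sum to nearly 1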
A Poisson Distribution for λ = 10:
[Figure: bar chart of the Poisson density for λ = 10, probability (vertical axis, up to about 0.12) versus x (horizontal axis, 0 to 25).]

22 x+1 is the shape parameter of the Incomplete Gamma which is evaluated at the point λ. Thus one can get the sum
of terms for the Poisson Distribution by using the Incomplete Gamma Function.


Problems:
Use the following information for the next five questions:
The density function for n is: f(n) = 6.9^n e^(-6.9) / n!, n = 0, 1, 2, ...
4.1 (1 point) What is the mean of the distribution?
A. less than 6.9
B. at least 6.9 but less than 7.0
C. at least 7.0 but less than 7.1
D. at least 7.1 but less than 7.2
E. at least 7.2
4.2 (1 point) What is the variance of the distribution?
A. less than 6.9
B. at least 6.9 but less than 7.0
C. at least 7.0 but less than 7.1
D. at least 7.1 but less than 7.2
E. at least 7.2
4.3 (2 points) What is the chance of having less than 4 claims?
A. less than 9%
B. at least 9% but less than 10%
C. at least 10% but less than 11%
D. at least 11% but less than 12%
E. at least 12%
4.4 (2 points) What is the mode of the distribution?
A. 5
B. 6
C. 7
D. 8

E. None of the Above.

4.5 (2 points) What is the median of the distribution?


A. 5
B. 6
C. 7
D. 8

E. None of the Above.



4.6 (2 points) The male drivers in the State of Grace each have their annual claim frequency given
by a Poisson distribution with parameter equal to 0.05.
The female drivers in the State of Grace each have their annual claim frequency given by a Poisson
distribution with parameter equal to 0.03.
You insure in the State of Grace 20 male drivers and 10 female drivers.
Assume the claim frequency distributions of the individual drivers are independent.
What is the chance of observing 3 claims in a year?
A. less than 9.6%
B. at least 9.6% but less than 9.7%
C. at least 9.7% but less than 9.8%
D. at least 9.8% but less than 9.9%
E. at least 9.9 %
4.7 (2 points) Assume that the frequency of hurricanes hitting the State of Windiana is given by a
Poisson distribution, with an average annual claim frequency of 82%. Assume that the losses in
millions of constant 1998 dollars from such a hurricane are given by a Pareto Distribution with
α = 2.5 and θ = 400 million. Assuming frequency and severity are independent, what is the chance of
two or more hurricanes each with more than $250 million (in constant 1998 dollars) of loss hitting the
State of Windiana next year?
(There may or may not be hurricanes of other sizes.)
A. less than 2.1%
B. at least 2.1% but less than 2.2%
C. at least 2.2% but less than 2.3%
D. at least 2.3% but less than 2.4%
E. at least 2.4%
Use the following information for the next 3 questions:
The claim frequency follows a Poisson Distribution with a mean of 10 claims per year.
4.8 (2 points) What is the chance of having more than 5 claims in a year?
A. 92%
B. 93%
C. 94%
D. 95%
E. 96%
4.9 (2 points) What is the chance of having more than 8 claims in a year?
A. 67%
B. 69%
C. 71%
D. 73%
E. 75%
4.10 (1 point) What is the chance of having 6, 7, or 8 claims in a year?
A. 19%
B. 21%
C. 23%
D. 25%
E. 27%


4.11 (2 points) You are given the following:


Claims follow a Poisson Distribution, with a mean of 27 per year.

The sizes of claims are given by a Weibull Distribution with θ = 1000 and τ = 3.
Frequency and severity are independent.
Given that during a year there are 7 claims of size less than 500, what is the expected number of
claims of size greater than 500 during that year?
(A) 20
(B) 21
(C) 22
(D) 23
(E) 24
4.12 (1 point) Frequency follows a Poisson Distribution with λ = 7.
20% of losses are of size greater than $50,000.
Frequency and severity are independent.
Let N be the number of losses of size greater than $50,000.
What is the probability that N = 3?
A. less than 9%
B. at least 9% but less than 10%
C. at least 10% but less than 11%
D. at least 11% but less than 12%
E. at least 12%
4.13 (1 point) N follows a Poisson Distribution with λ = 0.1. What is Prob[N = 1 | N ≤ 1]?
A. 8%

B. 9%

C. 10%

D. 11%

E. 12%

4.14 (1 point) N follows a Poisson Distribution with λ = 0.1. What is Prob[N = 1 | N ≥ 1]?
A. 91%

B. 92%

C. 93%

D. 94%

E. 95%

4.15 (2 points) N follows a Poisson Distribution with λ = 0.2. What is E[1/(N+1)]?


A. less than 0.75
B. at least 0.75 but less than 0.80
C. at least 0.80 but less than 0.85
D. at least 0.85 but less than 0.90
E. at least 0.90
4.16 (2 points) N follows a Poisson Distribution with λ = 2. What is E[N | N > 1]?
A. 2.6

B. 2.7

C. 2.8

D. 2.9

E. 3.0


4.17 (2 points) The total number of claims from a book of business with 500 exposures has a
Poisson Distribution with λ = 27. Next year, this book of business will have 600 exposures.
Next year, what is the probability of this book of business having a total of 30 claims?
A. 5.8%
B. 6.0%
C. 6.2%
D. 6.4%
E. 6.6%
Use the following information for the next two questions:
N follows a Poisson Distribution with λ = 1.3. Define (N-j)+ = n - j if n ≥ j, and 0 otherwise.
4.18 (2 points) Determine E[(N-1)+].
A. 0.48     B. 0.51     C. 0.54     D. 0.57     E. 0.60
4.19 (2 points) Determine E[(N-2)+].
A. 0.19     B. 0.20     C. 0.21     D. 0.22     E. 0.23

4.20 (2 points) The total number of non-zero payments from a policy with a $500 deductible
follows a Poisson Distribution with λ = 3.3.
The ground up losses follow a Weibull Distribution with τ = 0.7 and θ = 2000.
If this policy instead had a $1000 deductible, what would be the probability of having 4
non-zero payments?
A. 14%
B. 15%
C. 16%
D. 17%
E. 18%
4.21 (3 points) The number of major earthquakes that hit the state of Allshookup is given by a
Poisson Distribution with 0.05 major earthquakes expected per year.

Allshookup establishes a fund that will pay 1000/major earthquake.


The fund charges an annual premium, payable at the start of each year, of 60.
At the start of this year (before the premium is paid) the fund has 300.
Claims are paid immediately when there is a major earthquake.
If the fund ever runs out of money, it immediately ceases to exist.
Assume no investment income and no expenses.
What is the probability that the fund is still functioning in 40 years?
A. Less than 40%
B. At least 40%, but less than 41%
C. At least 41%, but less than 42%
D. At least 42%, but less than 43%
E. At least 43%


4.22 (2 points) You are given the following:

A business has bought a collision policy to cover its fleet of automobiles.


The number of collision losses per year follows a Poisson Distribution.
The size of collision losses follows an Exponential Distribution with a mean of 600.
Frequency and severity are independent.
This policy has an ordinary deductible of 1000 per collision.
The probability of no payments on this policy during a year is 74%.
Determine the probability that of the collision losses this business has during a year, exactly three of
them result in no payment on this policy.
(A) 8%
(B) 9%
(C) 10%
(D) 11%
(E) 12%
4.23 (2 points) A Poisson Distribution has a coefficient of variation 0.5.
Determine the probability of exactly seven claims.
(A) 4%
(B) 5%
(C) 6%
(D) 7%
(E) 8%
4.24 (2, 5/83, Q.4) (1.5 points) If X is the mean of a random sample of size n from a Poisson
distribution with parameter λ, then which of the following statements is true?
A. X has a Normal distribution with mean λ and variance λ.
B. X has a Normal distribution with mean λ and variance λ/n.
C. X has a Poisson distribution with parameter λ.
D. nX has a Poisson distribution with parameter nλ.
E. nX has a Poisson distribution with parameter n.
4.25 (2, 5/83, Q.28) (1.5 points) The number of traffic accidents per week in a small city has a
Poisson distribution with mean equal to 3.
What is the probability of exactly 2 accidents in 2 weeks?
A. 9e^(-6)
B. 18e^(-6)
C. 25e^(-6)
D. 4.5e^(-3)
E. 9.5e^(-3)

4.26 (2, 5/83, Q.45) (1.5 points) Let X have a Poisson distribution with parameter λ = 1.
What is the probability that X ≥ 2, given that X ≤ 4?
A. 5/65
B. 5/41
C. 17/65
D. 17/41

E. 3/5

4.27 (2, 5/85, Q.9) (1.5 points) The number of automobiles crossing a certain intersection during
any time interval of length t minutes between 3:00 P.M. and 4:00 P.M. has a Poisson distribution
with mean t. Let W be time elapsed after 3:00 P.M. before the first automobile crosses the
intersection. What is the probability that W is less than 2 minutes?
A. 1 - 2e^(-1) - e^(-2)
B. e^(-2)
C. 2e^(-1)
D. 1 - e^(-2)
E. 2e^(-1) + e^(-2)


4.28 (2, 5/85, Q.16) (1.5 points) In a certain communications system, there is an average of 1
transmission error per 10 seconds. Let the distribution of transmission errors be Poisson.
What is the probability of more than 1 error in a communication one-half minute in duration?
A. 1 - 2e^(-1)
B. 1 - e^(-1)
C. 1 - 4e^(-3)
D. 1 - 3e^(-3)
E. 1 - e^(-3)

4.29 (2, 5/88, Q.49) (1.5 points) The number of power surges in an electric grid has a Poisson
distribution with a mean of 1 power surge every 12 hours.
What is the probability that there will be no more than 1 power surge in a 24-hour period?
A. 2e^(-2)
B. 3e^(-2)
C. e^(-1/2)
D. (3/2)e^(-1/2)
E. 3e^(-1)

4.30 (4, 5/88, Q.48) (1 point) An insurer's portfolio is made up of 3 independent policyholders
with expected annual frequencies of 0.05, 0.1, and 0.15.
Assume that each insured's number of claims follows a Poisson distribution.
What is the probability that the insurer experiences fewer than 2 claims in a given year?
A. Less than .9
B. At least .9, but less than .95
C. At least .95, but less than .97
D. At least .97, but less than .99
E. Greater than .99
4.31 (2, 5/90, Q.39) (1.7 points) Let X, Y, and Z be independent Poisson random variables with
E(X) = 3, E(Y) = 1, and E(Z) = 4. What is P[X + Y + Z ≤ 1]?
A. 13e^(-12)
B. 9e^(-8)
C. (13/12)e^(-1/12)
D. 9e^(-1/8)
E. (9/8)e^(-1/8)

4.32 (4B, 5/93, Q.1) (1 point) You are given the following:
A portfolio consists of 10 independent risks.
The distribution of the annual number of claims for each risk in the portfolio is given
by a Poisson distribution with mean = 0.1.
Determine the probability of the portfolio having more than 1 claim per year.
A. 5%
B. 10%
C. 26%
D. 37%
E. 63%
4.33 (4B, 11/94, Q.19) (3 points) The density function for a certain parameter, x, is
f(x) = 4.6^x e^(-4.6) / x!, x = 0, 1, 2, ...
Which of the following statements are true concerning the distribution function for x?
1. The mode is less than the mean.
2. The variance is greater than the mean.
3. The median is less than the mean.
A. 1
B. 2
C. 3
D. 1, 2

E. 1, 3


4.34 (4B, 5/95, Q.9) (2 points) You are given the following:
The number of claims for each risk in a group of identical risks follows a Poisson distribution.
The expected number of risks in the group that will have no claims is 96.
The expected number of risks in the group that will have 2 claims is 3.
Determine the expected number of risks in the group that will have 4 claims.
A. Less than .01
B. At least .01, but less than .05
C. At least .05, but less than .10
D. At least .10, but less than .20
E. At least .20
4.35 (2, 2/96, Q.21) (1.7 points) Let X be a Poisson random variable with E(X) = ln(2).
Calculate E[cos(πX)].
A. 0     B. 1/4     C. 1/2     D. 1     E. 2 ln(2)

4.36 (4B, 11/98, Q.1) (1 point) You are given the following:

The number of claims follows a Poisson distribution.

Claim sizes follow a Pareto distribution.


Determine the type of distribution that the number of claims with sizes greater than 1,000 follows.
A. Poisson
B. Pareto
C. Gamma D. Binomial
E. Negative Binomial
4.37 (4B, 11/98, Q.2) (2 points) The random variable X has a Poisson distribution with mean
n - 1/2, where n is a positive integer greater than 1. Determine the mode of X.
A. n-2
B. n-1
C. n
D. n+1
E. n+2
4.38 (4B, 11/98, Q.18) (2 points) The number of claims per year for a given risk follows a
distribution with probability function p(n) = λ^n e^(-λ) / n!, n = 0, 1, ..., λ > 0.
Determine the smallest value of λ for which the probability of observing three or more claims during
two given years combined is greater than 0.1.
A. Less than 0.7
B. At least 0.7, but less than 1.0
C. At least 1.0, but less than 1.3
D. At least 1.3, but less than 1.6
E. At least 1.6


4.39 (4B, 5/99, Q.8) (3 points) You are given the following:

Each loss event is either an aircraft loss or a marine loss.

The number of aircraft losses has a Poisson distribution with a mean of 0.1 per year.
Each loss is always 10,000,000.

The number of marine losses has a Poisson distribution with a mean of 0.2 per year.
Each loss is always 20,000,000.

Aircraft losses occur independently of marine losses.

From the first two events each year,


the insurer pays the portion of the combined losses that exceeds 10,000,000.
Determine the insurer's expected annual payments.
A. Less than 1,300,000
B. At least 1,300,000, but less than 1,800,000
C. At least 1,800,000, but less than 2,300,000
D. At least 2,300,000, but less than 2,800,000
E. At least 2,800,000
4.40 (IOA 101, 4/00, Q.5) (2.25 points) An insurance companys records suggest that
experienced drivers (those aged over 21) submit claims at a rate of 0.1 per year, and
inexperienced drivers (those 21 years old or younger) submit claims at a rate of 0.15 per year.
A driver can submit more than one claim a year.
The company has 40 experienced and 20 inexperienced drivers insured with it.
The number of claims for each driver can be modeled by a Poisson distribution, and claims are
independent of each other.
Calculate the probability the company will receive three or fewer claims in a year.
4.41 (1, 5/00, Q.24) (1.9 points) An actuary has discovered that policyholders are three times as
likely to file two claims as to file four claims.
If the number of claims filed has a Poisson distribution, what is the variance of the
number of claims filed?
(A) 1/√3
(B) 1
(C) √2
(D) 2
(E) 4

4.42 (3, 5/00, Q.2) (2.5 points) Lucky Tom finds coins on his way to work at a Poisson rate of 0.5
coins/minute. The denominations are randomly distributed:
(i)
60% of the coins are worth 1;
(ii)
20% of the coins are worth 5; and
(iii)
20% of the coins are worth 10.
Calculate the conditional expected value of the coins Tom found during his one-hour walk today,
given that among the coins he found exactly ten were worth 5 each.
(A) 108
(B) 115
(C) 128
(D) 165
(E) 180


4.43 (1, 11/00, Q.23) (1.9 points) A company buys a policy to insure its revenue in the event of
major snowstorms that shut down business. The policy pays nothing for the first such snowstorm of
the year and 10,000 for each one thereafter, until the end of the year.
The number of major snowstorms per year that shut down business is assumed to have a Poisson
distribution with mean 1.5.
What is the expected amount paid to the company under this policy during a one-year period?
(A) 2,769
(B) 5,000
(C) 7,231
(D) 8,347
(E) 10,578
4.44 (3, 11/00, Q.29) (2.5 points) Job offers for a college graduate arrive according to a Poisson
process with mean 2 per month. A job offer is acceptable if the wages are at least 28,000.
Wages offered are mutually independent and follow a lognormal distribution,
with µ = 10.12 and σ = 0.12.
Calculate the probability that it will take a college graduate more than 3 months to receive an
acceptable job offer.
(A) 0.27
(B) 0.39
(C) 0.45
(D) 0.58
(E) 0.61
4.45 (1, 11/01, Q.19) (1.9 points) A baseball team has scheduled its opening game for April 1.
If it rains on April 1, the game is postponed and will be played on the next day that it does not rain.
The team purchases insurance against rain. The policy will pay 1000 for each day, up to 2 days, that
the opening game is postponed. The insurance company determines that the number of
consecutive days of rain beginning on April 1 is a Poisson random variable with mean 0.6.
What is the standard deviation of the amount the insurance company will have to pay?
(A) 668
(B) 699
(C) 775
(D) 817
(E) 904
4.46 (CAS3, 11/03, Q.31) (2.5 points) Vehicles arrive at the Bun-and-Run drive-thru at a Poisson
rate of 20 per hour. On average, 30% of these vehicles are trucks.
Calculate the probability that at least 3 trucks arrive between noon and 1:00 PM.
A. Less than 0.80
B. At least 0.80, but less than 0.85
C. At least 0.85, but less than 0.90
D. At least 0.90, but less than 0.95
E. At least 0.95


4.47 (CAS3, 5/04, Q.16) (2.5 points) The number of major hurricanes that hit the island nation of
Justcoast is given by a Poisson Distribution with 0.100 storms expected per year.

Justcoast establishes a fund that will pay 100/storm.


The fund charges an annual premium, payable at the start of each year, of 10.
At the start of this year (before the premium is paid) the fund has 65.
Claims are paid immediately when there is a storm.
If the fund ever runs out of money, it immediately ceases to exist.
Assume no investment income and no expenses.
What is the probability that the fund is still functioning in 10 years?
A. Less than 60%
B. At least 60%, but less than 61%
C. At least 61%, but less than 62%
D. At least 62%, but less than 63%
E. At least 63%
4.48 (CAS3, 11/04, Q.17) (2.5 points) You are given:

Claims are reported at a Poisson rate of 5 per year.


The probability that a claim will settle for less than $100,000 is 0.9.
What is the probability that no claim of $100,000 or more is reported during the next 3 years?
A. 20.59% B. 22.31% C. 59.06% D. 60.65% E. 74.08%
4.49 (CAS3, 11/04, Q.23) (2.5 points) Dental Insurance Company sells a policy that covers two
types of dental procedures: root canals and fillings.
There is a limit of 1 root canal per year and a separate limit of 2 fillings per year.
The number of root canals a person needs in a year follows a Poisson distribution with λ = 1,
and the number of fillings a person needs in a year is Poisson with λ = 2.
The company is considering replacing the single limits with a combined limit of 3 claims per year,
regardless of the type of claim.
Determine the change in the expected number of claims per year if the combined limit is adopted.
A. No change
B. More than 0.00, but less than 0.20 claims
C. At least 0.20, but less than 0.25 claims
D. At least 0.25, but less than 0.30 claims
E. At least 0.30 claims


4.50 (SOA M, 5/05, Q.5) (2.5 points)


Kings of Fredonia drink glasses of wine at a Poisson rate of 2 glasses per day.
Assassins attempt to poison the kings wine glasses. There is a 0.01 probability that any
given glass is poisoned. Drinking poisoned wine is always fatal instantly and is the only
cause of death. The occurrences of poison in the glasses and the number of glasses drunk are
independent events.
Calculate the probability that the current king survives at least 30 days.
(A) 0.40
(B) 0.45
(C) 0.50
(D) 0.55
(E) 0.60
4.51 (CAS3, 11/05, Q.24) (2.5 points) For a compound loss model you are given:

The claim count follows a Poisson distribution with λ = 0.01.


Individual losses are distributed as follows:
x
F(x)
100 0.10
300 0.20
500 0.25
600 0.40
700 0.50
800 0.70
900 0.80
1,000 0.90
1,200 1.00
Calculate the probability of paying at least one claim after implementing a $500 deductible.
A. Less than 0.005
B. At least 0.005, but less than 0.010
C. At least 0.010, but less than 0.015
D. At least 0.015, but less than 0.020
E. At least 0.020
4.52 (CAS3, 11/05, Q.31) (2.5 points) The Toronto Bay Leaves attempt shots in a hockey game
according to a Poisson process with mean 30. Each shot is independent.
For each attempted shot, the probability of scoring a goal is 0.10.
Calculate the standard deviation of the number of goals scored by the Bay Leaves in a game.
A. Less than 1.4
B. At least 1.4, but less than 1.6
C. At least 1.6, but less than 1.8
D. At least 1.8, but less than 2.0
E. At least 2.0


4.53 (CAS3, 11/06, Q.32) (2.5 points) You are given:

Annual frequency follows a Poisson distribution with mean 0.3.


Severity follows a normal distribution with F(100,000) = 0.6.
Calculate the probability that there is at least one loss greater than 100,000 in a year.
A. Less than 11 %
B. At least 11%, but less than 13%
C. At least 13%, but less than 15%
D. At least 15%, but less than 17%
E. At least 17%
4.54 (SOA M, 11/06, Q.9) (2.5 points) A casino has a game that makes payouts at a Poisson rate
of 5 per hour and the payout amounts are 1, 2, 3, ... without limit.
The probability that any given payout is equal to i is 1/2^i. Payouts are independent.
Calculate the probability that there are no payouts of 1, 2, or 3 in a given 20 minute period.
(A) 0.08
(B) 0.13
(C) 0.18
(D) 0.23
(E) 0.28
4.55 (CAS3L, 5/09, Q.8) (2.5 points) Bill receives mail at a Poisson rate of 10 items per day.
The contents of the items are randomly distributed:

50% of the items are credit card applications.


30% of the items are catalogs.
20% of the items are letters from friends.
Bill has received 20 credit card applications in two days.
Calculate the probability that for those same two days, he receives at least 3 letters from friends and
exactly 5 catalogs.
A. Less than 6%
B. At least 6%, but less than 10%
C. At least 10%, but less than 14%
D. At least 14%, but less than 18%
E. At least 18%


4.56 (CAS3L, 5/09, Q.9) (2.5 points) You are given the following information:

Policyholder calls to a call center follow a homogeneous Poisson process with λ = 250 per day.
Policyholders may call for 3 reasons: Endorsement, Cancellation, or Payment.
The distribution of calls is as follows:
Call Type
Percent of Calls
Endorsement
50%
Cancellation
10%
Payment
40%
Using the normal approximation with continuity correction, calculate the probability of receiving more
than 156 calls in a day that are either endorsements or cancellations.
A. Less than 27%
B. At least 27%, but less than 29%
C. At least 29%, but less than 31%
D. At least 31%, but less than 33%
E. At least 33%
4.57 (CAS3L, 11/09, Q.11) (2.5 points) You are given the following information:

Claims follow a compound Poisson process.


Claims occur at the rate of λ = 10 per day.
Claim severity follows an exponential distribution with θ = 15,000.
A claim is considered a large loss if its severity is greater than 50,000.
What is the probability that there are exactly 9 large losses in a 30-day period?
A. Less than 5%
B. At least 5%, but less than 7.5%
C. At least 7.5%, but less than 10%
D. At least 10%, but less than 12.5%
E. At least 12.5%


Solutions to Problems:
4.1. B. This is a Poisson distribution with a parameter of 6.9. The mean is therefore 6.9.
4.2. B. This is a Poisson distribution with a parameter of 6.9.
The variance is therefore 6.9.
4.3. A. One needs to sum the chances of having 0, 1, 2, and 3 claims:
n    f(n)     F(n)
0    0.001    0.001
1    0.007    0.008
2    0.024    0.032
3    0.055    0.087

For example, f(3) = 6.9^3 e^(-6.9) / 3! = (328.5)(0.001008)/6 = 0.055.


4.4. B. The mode is the value at which f(n) is a maximum; f(6) = .151 and the mode is therefore 6.
n    f(n)
0    0.001
1    0.007
2    0.024
3    0.055
4    0.095
5    0.131
6    0.151
7    0.149
8    0.128

Alternately, in general for the Poisson the mode is the largest integer in the parameter; the largest
integer in 6.9 is 6.
4.5. C. For a discrete distribution such as we have here, employ the convention that the median is
the first value at which the distribution function is greater than or equal to .5.
F(7) ≥ 50% and F(6) < 50%, and therefore the median is 7.
n    f(n)     F(n)
0    0.001    0.001
1    0.007    0.008
2    0.024    0.032
3    0.055    0.087
4    0.095    0.182
5    0.131    0.314
6    0.151    0.465
7    0.149    0.614
8    0.128    0.742

4.6. E. The sum of Poisson variables is a Poisson with the sum of the parameters.
The sum has a Poisson parameter of (20)(.05) + (10)(.03) = 1.3.
The chance of three claims is (1.3^3) e^(-1.3) / 3! = 9.98%.
4.7. E. For the Pareto Distribution, S(x) = 1 - F(x) = {θ/(θ+x)}^α.
S(250) = {400/(400+250)}^2.5 = 0.2971.
Thus the distribution of hurricanes with more than $250 million of loss is Poisson with mean frequency
of (82%)(.2971) = 24.36%.
The chance of zero such hurricanes is e-0.2436 = 0.7838.
The chance of one such hurricane is: (0.2436)e-0.2436 = 0.1909.
The chance of more than one such hurricane is: 1 - (0.7838 + 0.1909) = 0.0253.


4.8. B. f(n) = e^(-λ) λ^n / n! = e^(-10) 10^n / n!


n    f(n)      F(n)
0    0.0000    0.0000
1    0.0005    0.0005
2    0.0023    0.0028
3    0.0076    0.0103
4    0.0189    0.0293
5    0.0378    0.0671

Thus the chance of having more than 5 claims is 1 - .0671 = .9329.


Comment: Although one should not do so on the exam, one can also solve this using the
Incomplete Gamma Function. The chance of having more than 5 claims is
the Incomplete Gamma with shape parameter 5+1 = 6 at the value 10: Γ(6; 10) = 0.9329.
4.9. A. f(n) = e^(-λ) λ^n / n! = e^(-10) 10^n / n!
n    f(n)      F(n)
0    0.0000    0.0000
1    0.0005    0.0005
2    0.0023    0.0028
3    0.0076    0.0103
4    0.0189    0.0293
5    0.0378    0.0671
6    0.0631    0.1301
7    0.0901    0.2202
8    0.1126    0.3328
Thus the chance of having more than 8 claims is 1 - 0.3328 = 0.6672.
Comment: The chance of having more than 8 claims is the incomplete Gamma with shape
parameter 8 + 1 = 9 at the value 10: Γ(9; 10) = 0.6672.
4.10. E. One can add up: f(6) + f(7) + f(8) = 0.0631 + 0.0901 + 0.1126 = 0.2657.
Alternately, one can use the solutions to the two previous questions.
F(8) - F(5) = {1 - F(5)} - {1 - F(8)} = 0.9329 - 0.6672 = 0.2657.
Comment: Prob[6, 7, or 8 claims] = Γ(6; 10) - Γ(9; 10) = 0.9329 - 0.6672 = 0.2657.
4.11. E. The large and small claims are independent Poisson Distributions. Therefore, the
observed number of small claims has no effect on the expected number of large claims.
S(500) = exp(-(500/1000)^3) = 0.8825. Expected number of large claims is: (27)(0.8825) = 23.8.
4.12. D. Frequency of large losses follows a Poisson Distribution with λ = (20%)(7) = 1.4.
f(3) = 1.4^3 e^(-1.4)/3! = 11.3%.
4.13. B. Prob[N = 1 | N ≤ 1] = Prob[N = 1]/Prob[N ≤ 1] = λe^(-λ)/(e^(-λ) + λe^(-λ)) = λ/(1 + λ) = 0.0909.
4.14. E. Prob[N = 1 | N ≥ 1] = Prob[N = 1]/Prob[N ≥ 1] = λe^(-λ)/(1 - e^(-λ)) = λ/(e^λ - 1) = 0.9508.


4.15. E. E[1/(N+1)] = Σ_{n=0}^∞ f(n)/(n+1) = Σ_{n=0}^∞ {e^(-0.2) 0.2^n/n!}/(n+1) = (e^(-0.2)/0.2) Σ_{n=0}^∞ 0.2^(n+1)/(n+1)!
= (e^(-0.2)/0.2) Σ_{i=1}^∞ 0.2^i/i! = (e^(-0.2)/0.2){Σ_{i=0}^∞ 0.2^i/i! - 0.2^0/0!} = (e^(-0.2)/0.2)(e^0.2 - 1) = (1 - e^(-0.2))/0.2 = 0.906.

4.16. D. E[N] = P[N = 0]·0 + P[N = 1]·1 + P[N > 1]·E[N | N > 1].
2 = 2e^(-2) + (1 - e^(-2) - 2e^(-2)) E[N | N > 1]. ⇒ E[N | N > 1] = (2 - 2e^(-2))/(1 - e^(-2) - 2e^(-2)) = 2.91.
4.17. E. Next year the frequency is Poisson with λ = (600/500)(27) = 32.4.
f(30) = e^(-32.4) 32.4^30/30! = 6.63%.
4.18. D. E[N | N ≥ 1] Prob[N ≥ 1] + 0·Prob[N = 0] = E[N] = λ. ⇒ E[N | N ≥ 1] Prob[N ≥ 1] = λ.
E[(N-1)+] = E[N - 1 | N ≥ 1] Prob[N ≥ 1] + 0·Prob[N = 0] = (E[N | N ≥ 1] - 1) Prob[N ≥ 1] =
E[N | N ≥ 1] Prob[N ≥ 1] - Prob[N ≥ 1] = λ - (1 - e^(-λ)) = λ + e^(-λ) - 1 = 1.3 + e^(-1.3) - 1 = 0.5725.
Alternately, E[(N-1)+] = Σ_{n=1}^∞ (n-1) f(n) = Σ_{n=1}^∞ n f(n) - Σ_{n=1}^∞ f(n) = E[N] - Prob[N ≥ 1] = λ + e^(-λ) - 1.
Alternately, E[(N-1)+] = E[N] - E[N ∧ 1] = λ - Prob[N ≥ 1] = λ + e^(-λ) - 1.
Alternately, E[(N-1)+] = E[(1-N)+] + E[N] - 1 = Prob[N = 0] + λ - 1 = e^(-λ) + λ - 1.
Comment: For the last two alternate solutions, see Mahler's Guide to Loss Distributions.


4.19. B. E[N | N ≥ 2] Prob[N ≥ 2] + 1·Prob[N = 1] + 0·Prob[N = 0] = E[N] = λ.
⇒ E[N | N ≥ 2] Prob[N ≥ 2] = λ - λe^(-λ).
E[(N-2)+] = E[N - 2 | N ≥ 2] Prob[N ≥ 2] + 0·Prob[N < 2] = (E[N | N ≥ 2] - 2) Prob[N ≥ 2] =
E[N | N ≥ 2] Prob[N ≥ 2] - 2 Prob[N ≥ 2] = λ - λe^(-λ) - 2(1 - e^(-λ) - λe^(-λ)) = λ + 2e^(-λ) + λe^(-λ) - 2 =
1.3 + 2e^(-1.3) + 1.3e^(-1.3) - 2 = 0.199.
Alternately, E[(N-2)+] = Σ_{n=2}^∞ (n-2) f(n) = Σ_{n=2}^∞ n f(n) - 2 Σ_{n=2}^∞ f(n) = E[N] - λe^(-λ) - 2 Prob[N ≥ 2]
= λ - λe^(-λ) - 2(1 - e^(-λ) - λe^(-λ)) = λ + 2e^(-λ) + λe^(-λ) - 2.
Alternately, E[(N-2)+] = E[N] - E[N ∧ 2] = λ - (Prob[N = 1] + 2 Prob[N ≥ 2]) = λ + 2e^(-λ) + λe^(-λ) - 2.
Alternately, E[(N-2)+] = E[(2-N)+] + E[N] - 2 = 2 Prob[N = 0] + Prob[N = 1] + λ - 2 =
2e^(-λ) + λe^(-λ) + λ - 2.
Comment: For the last two alternate solutions, see Mahler's Guide to Loss Distributions.
4.20. A. For the Weibull, S(500) = exp[-(500/2000)^0.7] = 0.6846.
S(1000) = exp[-(1000/2000)^0.7] = 0.5403. Therefore, with the $1000 deductible, the non-zero
payments are Poisson, with λ = (0.5403/0.6846)(3.3) = 2.60. f(4) = e^(-2.6) 2.6^4/4! = 14.1%.
4.21. D. In the absence of losses, by the beginning of year 12, the fund would have:
300 + (12)(60) = 1020 > 1000.
In the absence of losses, by the beginning of year 29, the fund would have:
300 + (29)(60) = 2040 > 2000.
Thus in order to survive for 40 years there have to be 0 events in the first 11 years,
at most one event during the first 28 years, and at most two events during the first 40 years.
Prob[survival through 40 years] =
Prob[0 in first 11 years]{Prob[0 in next 17 years]Prob[0, 1, or 2 in final 12 years] +
Prob[1 in next 17 years]Prob[0 or 1 in final 12 years]} =
e^(-0.55){(e^(-0.85))(e^(-0.6) + 0.6e^(-0.6) + 0.6^2 e^(-0.6)/2) + (0.85e^(-0.85))(e^(-0.6) + 0.6e^(-0.6))} = 3.14e^(-2) = 0.425.
Comment: Similar to CAS3, 5/04, Q.16.


4.22. C. The percentage of large losses is: e^(-1000/600) = 18.89%.
Let λ be the mean of the Poisson distribution of losses.
Then the large losses, those of size greater than 1000, are also Poisson with mean 0.1889λ.
74% = Prob[0 large losses] = exp[-0.1889λ]. ⇒ λ = 1.59.
The small losses, those of size less than 1000, are also Poisson with mean:
(1 - 0.1889)(1.59) = 1.29.
Prob[3 small losses] = 1.29^3 e^(-1.29)/6 = 9.8%.
4.23. C. The coefficient of variation is the ratio of the standard deviation to the mean, which for a
Poisson Distribution is: √λ/λ = 1/√λ. 1/√λ = 0.5. ⇒ λ = 4. f(7) = 4^7 e^(-4)/7! = 5.95%.

4.24. E. nX̄, the sum of n Poissons each with mean λ, is a Poisson with mean nλ.
Comment: X̄ can be non-integer, and therefore it cannot have a Poisson distribution.
As n → ∞, X̄ approaches a Normal distribution with mean λ and variance λ/n.
4.25. B. Over two weeks, the number of accidents is Poisson with mean 6.
f(2) = e^(-λ) λ^2/2 = e^(-6) 6^2/2 = 18e^(-6).
4.26. C. Prob[X ≥ 2 | X ≤ 4] = {f(2) + f(3) + f(4)}/{f(0) + f(1) + f(2) + f(3) + f(4)} =
e^(-1)(1/2 + 1/6 + 1/24)/{e^(-1)(1 + 1 + 1/2 + 1/6 + 1/24)} = (12 + 4 + 1)/(24 + 24 + 12 + 4 + 1)
= 17/65.
4.27. D. Prob[W ≤ 2] = 1 - Prob[no cars by time 2] = 1 - e^(-2λ).
4.28. C. Prob[0 errors in 30 seconds] = e-30/10 = e-3. Prob[1 error in 30 seconds] = 3e-3.
Prob[more than one error in 30 seconds] = 1 - 4e- 3.
4.29. B. Prob[0 or 1 surges] = e-24/12 + 2e-2 = 3e- 2.
4.30. C. The sum of three independent Poissons is also a Poisson, whose mean is the sum of the
individual means. Thus the portfolio of three insureds has a Poisson distribution with mean
0.05 + 0.10 + 0.15 = 0.30. For a Poisson distribution with mean λ, the chance of zero claims is e^(-λ) and
that of 1 claim is λe^(-λ). Thus the chance of fewer than 2 claims is: (1+λ)e^(-λ).
Thus for this portfolio, the chance of fewer than 2 claims is: (1 + 0.3)e^(-0.3) = 0.963.


4.31. B. X + Y + Z is Poisson with λ = 3 + 1 + 4 = 8. f(0) + f(1) = e^(-8) + 8e^(-8) = 9e^(-8).


4.32. C. The sum of independent Poissons is a Poisson, with a parameter the sum of the
individual Poisson parameters. In this case the portfolio is Poisson with a parameter λ = (10)(0.1) = 1.
Chance of zero claims is e^(-1). Chance of one claim is (1)e^(-1).
Chance of more than one claim is: 1 - (e^(-1) + e^(-1)) = 0.264.
4.33. E. This is a Poisson distribution with a parameter of 4.6. The mean is therefore 4.6. The
mode is the value at which f(n) is a maximum; f(4) = 0.188 and the mode is 4. Therefore statement
1 is true. For the Poisson the variance equals the mean, and therefore statement 2 is false.
For a discrete distribution such as we have here, the median is the first value at which the distribution
function is greater than or equal to 0.5. F(4) > 50%, and therefore the median is 4 and less than the
mean of 4.6. Therefore statement 3 is true.
n    f(n)     F(n)
0    0.010    0.010
1    0.046    0.056
2    0.106    0.163
3    0.163    0.326
4    0.188    0.513
5    0.173    0.686
6    0.132    0.818
7    0.087    0.905
8    0.050    0.955
Comment: For a Poisson with parameter λ, the mode is the largest integer in λ. In this case
λ = 4.6 so the mode is 4. Items 1 and 3 can be answered by computing enough values of the
density and adding them up. Alternately, since the distribution is skewed to the right (has positive
skewness), both the peak of the curve and the 50% point are expected to be to the left of the
mean. The median is less affected by the few extremely large values than is the mean, and therefore
for curves skewed to the right the median is smaller than the mean. For curves skewed to the right,
the largest single probability most commonly occurs at less than the mean, but this is not true of all
such curves.
4.34. B. Assume we have R risks in total. The Poisson distribution is given by:
f(n) = e^(-λ) λ^n / n!, n = 0, 1, 2, 3, ... Thus for n = 0 we have R e^(-λ) = 96. For n = 2 we have
R e^(-λ) λ^2/2 = 3. By dividing these two equations we can solve for λ = (6/96)^0.5 = 1/4.
The number of risks we expect to have 4 claims is: R e^(-λ) λ^4/4! = (96)(1/4)^4/24 = 0.0156.

4.35. B. cos(0) = 1. cos(π) = -1. cos(2π) = 1. cos(3π) = -1.
E[cos(πx)] = Σ (-1)^x f(x) = e^(-λ){1 - λ + λ^2/2! - λ^3/3! + λ^4/4! - λ^5/5! + ...} =
e^(-λ){1 + (-λ) + (-λ)^2/2! + (-λ)^3/3! + (-λ)^4/4! + (-λ)^5/5! + ...} = e^(-λ)e^(-λ) = e^(-2λ).
For λ = ln(2), e^(-2λ) = 2^(-2) = 1/4.


4.36. A. If frequency is given by a Poisson and severity is independent of frequency, then the
number of claims above a certain amount (in constant dollars) is also a Poisson.
4.37. B. The mode of the Poisson with mean λ is the largest integer in λ.
The largest integer in n - 1/2 is n - 1.
Alternately, for the Poisson f(x)/f(x-1) = λ/x. Thus f increases when λ > x and decreases when λ < x.
Thus f increases for x < λ = n - 0.5. For integer n, x < n - 0.5 for x ≤ n - 1.
Thus the density increases up to n - 1 and decreases thereafter. Therefore the mode is n - 1.
4.38. A. We want the chance of less than 3 claims to be less than 0.9. For a Poisson with mean λ,
the probability of 0, 1 or 2 claims is: e^(-λ)(1 + λ + λ^2/2). Over two years we have a Poisson with mean
2λ. Thus we want e^(-2λ)(1 + 2λ + 2λ^2) < 0.9. Trying the endpoints of the given intervals we determine
that the smallest such λ must be less than 0.7.
Comment: In fact the smallest such λ is about 0.56.


4.39. D. If there are more than two losses, we are not concerned about those beyond the first two.
Since the sum of two independent Poissons is a Poisson, the portfolio has a Poisson frequency
distribution with mean of 0.3. Therefore, the chance of zero claims is e^(-0.3) = 0.7408, of one claim is
0.3e^(-0.3) = 0.2222, and of two or more claims is 1 - 0.741 - 0.222 = 0.0370. 0.1/0.3 = 1/3 of the losses are
aircraft and 0.2/0.3 = 2/3 of the losses are marine. Thus the probability of the first two events, in the
case of two or more events, is divided up as (1/3)(1/3) = 1/9, 2(1/3)(2/3) = 4/9, (2/3)(2/3) = 4/9,
between 2 aircraft, 1 aircraft and 1 marine, and 2 marine, using the binomial expansion for two
events. (0.2222)(1/3) = 0.0741 = probability of one aircraft,
(0.2222)(2/3) = probability of one marine, (0.0370)(1/9) = 0.0041 = probability of 2 aircraft,
(0.0370)(4/9) = probability of one of each type, (0.0370)(4/9) = 0.0164 = probability of 2 marine.
If there are zero claims, the insurer pays nothing. If there is one aircraft loss, the insurer pays nothing.
If there is one marine loss, the insurer pays 10 million. If there are two or more events there are three
possibilities for the first two events. If the first two events are aircraft, the insurer pays 10 million. If the
first two events are one aircraft and one marine, the insurer pays 20 million.
If the first two events are marine, the insurer pays 30 million.
Events (first 2 only)     Probability   Losses from First 2 events ($ million)   Amount Paid by the Insurer ($ million)
None                      0.7408        0                                        0
1 Aircraft                0.0741        10                                       0
1 Marine                  0.1481        20                                       10
2 Aircraft                0.0041        20                                       10
1 Aircraft, 1 Marine      0.0164        30                                       20
2 Marine                  0.0164        40                                       30
Thus the insurer's expected annual payment is: (0.1481)(10) + (0.0041)(10) + (0.0164)(20) + (0.0164)(30) = $2.345 million.


4.40. The total number of claims from inexperienced drivers is Poisson with mean: (20)(.15) = 3.
The total number of claims from experienced drivers is Poisson with mean: (40)(.1) = 4.
The total number of claims from all drivers is Poisson with mean: 3 + 4 = 7.
Prob[# claims ≤ 3] = e^(-7)(1 + 7 + 7^2/2 + 7^3/6) = 8.177%.
4.41. D. f(2) = 3 f(4). ⇒ e^(-λ) λ^2/2 = 3e^(-λ) λ^4/24. ⇒ λ^2 = 4. ⇒ λ = 2. Variance = λ = 2.
4.42. C. The findings of the three different types of coins are independent Poisson processes.
Over the course of 60 minutes, Tom expects to find (.6)(.5)(60) = 18 coins worth 1 each and
(.2)(.5)(60) = 6 coins worth 10 each. Tom finds 10 coins worth 5 each. The expected worth of the
coins he finds is: (18)(1) + (10)(5) + (6)(10) = 128.


4.43. C. E[(X-1)+] = E[X] - E[X ∧ 1] = 1.5 - {0·f(0) + 1·(1 - f(0))} = 0.5 + f(0) = 0.5 + e^(-1.5) = 0.7231.
Expected Amount Paid is: 10,000 E[(X-1)+] = 7231.
Alternately, Expected Amount Paid is: 10,000{1·f(2) + 2·f(3) + 3·f(4) + 4·f(5) + 5·f(6) + ...} =
(10,000)e^(-1.5){1.5^2/2 + (2)(1.5^3/6) + (3)(1.5^4/24) + (4)(1.5^5/120) + (5)(1.5^6/720) + ...} =
2231{1.125 + 1.125 + 0.6328 + 0.2531 + 0.0791 + 0.0203 + ...} ≈ 7200.
4.44. B. For this Lognormal Distribution, S(28,000) = 1 - Φ[(ln(28000) - 10.12)/0.12] =
1 - Φ(1) = 1 - 0.8413 = 0.1587. Acceptable offers arrive via a Poisson Process at rate
2 S(28000) = (2)(0.1587) = 0.3174 per month. Thus the number of acceptable offers over the first
3 months is Poisson distributed with mean (3)(0.3174) = 0.9522.
The probability of no acceptable offers over the first 3 months is: e^(-0.9522) = 0.386.
Alternately, the probability of no acceptable offers in a month is: e^(-0.3174).
Probability of no acceptable offers in 3 months is: (e^(-0.3174))^3 = e^(-0.9522) = 0.386.
4.45. B. E[X ∧ 2] = 0·f(0) + 1·f(1) + 2{1 - f(0) - f(1)} = 0.6e^(-0.6) + 2{1 - e^(-0.6) - 0.6e^(-0.6)} = 0.573.
E[(X ∧ 2)^2] = 0·f(0) + 1·f(1) + 4{1 - f(0) - f(1)} = 0.6e^(-0.6) + 4{1 - e^(-0.6) - 0.6e^(-0.6)} = 0.8169.
Var[X ∧ 2] = 0.8169 - 0.573^2 = 0.4886.
Var[1000(X ∧ 2)] = (1000^2)(0.4886) = 488,600.
StdDev[1000(X ∧ 2)] = √488,600 = 699.

4.46. D. Trucks arrive at a Poisson rate of: (30%)(20) = 6 per hour.
f(0) = e^(-6). f(1) = 6e^(-6). f(2) = 6^2 e^(-6)/2. 1 - {f(0) + f(1) + f(2)} = 1 - 25e^(-6) = 0.938.
4.47. D. If there is a storm within the first three years, then there is ruin, since the fund would have
only 65 + 30 = 95 or less. If there are two or more storms in the first ten years, then the fund is
ruined. Thus survival requires no storms during the first three years and at most one storm during the
next seven years. Prob[survival through 10 years] =
Prob[0 storms during 3 years] Prob[0 or 1 storm during 7 years] = (e-.3)(e-.7 + .7e-.7) = 0.625.
4.48. B. Claims of $100,000 or more are Poisson with mean: (5)(1 - 0.9) = 0.5 per year.
The number of large claims during 3 years is Poisson with mean: (3)(0.5) = 1.5.
f(0) = e-1.5 = 0.2231.

2013-4-1,

Frequency Distributions, 4 Poisson Dist.

4.49. C. For = 1, E[N


For = 2, E[N

HCM 10/4/12,

Page 71

1] = 0f(0) + 1(1 - f(0)) = 1 - e-1 = .6321.

2] = 0f(0) + 1f(1) + 2(1 - f(0) - f(1)) = 2e-2 + 2(1 - e-2 - 2e-2) = 1.4587.

Expected number of claims before change: .6321 + 1.4587 = 2.091.


The sum of number of root canals and the number of fillings is Poisson with = 3.
For = 3, E[N

3] = 0f(0) + 1f(1) + 2f(2) + 3(1 - f(0) - f(1) - f(2)) =

3e-3 + (2)(9e-3/2) + 3(1 - e-3 - 3e-3 - 4.5e-3) = 2.328. Change is: 2.328 - 2.091 = 0.237.
Comment: Although it is not stated, we must assume that the number of root canals and the number
of fillings are independent.
4.50. D. Poisoned glasses of wine are Poisson with mean: (0.01)(2) = 0.02 per day.
The probability of no poisoned glasses over 30 days is: e-(30)(0.02) = e-0.6 = 0.549.
Comment: Survival corresponds to zero poisoned glasses of wine.
The king can drink any number of non-poisoned glasses of wine.
The poisoned and non-poisoned glasses are independent Poisson Processes.
4.51. B. After implementing a $500 deductible, only losses of size greater than 500 result in a
claim payment. Prob[loss > 500] = 1 - F(500) = 1 - .25 = .75.
Via thinning, large losses are Poisson with mean: (.75)(.01) = .0075.
Prob[at least one large loss] = 1 - e-.0075 = 0.00747.
4.52. C. By thinning, the number of goals is Poisson with mean (30)(0.1) = 3.
This Poisson has variance 3, and standard deviation √3 = 1.732.

4.53. B. Large losses are Poisson with mean: (1 - .6)(.3) = 0.12.


Prob[at least one large loss] = 1 - e-.12 = 11.3%.


4.54. D. Payouts of size one are Poisson with λ = (1/2)(5) = 2.5 per hour.
Payouts of size two are Poisson with λ = (1/4)(5) = 1.25 per hour.
Payouts of size three are Poisson with λ = (1/8)(5) = 0.625 per hour.
Prob[0 of size 1 over 1/3 of an hour] = e^(-2.5/3).
Prob[0 of size 2 over 1/3 of an hour] = e^(-1.25/3).
Prob[0 of size 3 over 1/3 of an hour] = e^(-0.625/3).
The three Poisson Processes are independent, so we can multiply the above probabilities:
e^(-2.5/3) e^(-1.25/3) e^(-0.625/3) = e^(-1.458) = 0.233.
Alternately, payouts of sizes one, two, or three are Poisson with
λ = (1/2 + 1/4 + 1/8)(5) = 4.375 per hour.
Prob[0 of sizes 1, 2, or 3, over 1/3 of an hour] = e^(-4.375/3) = 0.233.
4.55. C. Catalogs are Poisson with mean over two days of: (2)(30%)(10) = 6.
Letters are Poisson with mean over two days of: (2)(20%)(10) = 4.
The Poisson processes are all independent. Therefore, knowing he got 20 applications tells us
nothing about the number of catalogs or letters.
Prob[at least 3 letters] = 1 - e^(-4) - 4e^(-4) - e^(-4) 4^2/2 = 0.7619.
Prob[5 catalogs] = e^(-6) 6^5/120 = 0.1606.
Prob[at least 3 letters and exactly 5 catalogs] = (0.7619)(0.1606) = 12.2%.
4.56. C. The number of endorsements and cancellations is Poisson with λ = (250)(60%) = 150.
Applying the normal approximation with mean and variance equal to 150:
Prob[more than 156] ≈ 1 - Φ[(156.5 - 150)/√150] = 1 - Φ[0.53] = 29.8%.
Comment: Using the continuity correction, 156 is out and 157 is in, so the cutoff is taken at 156.5.
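To see the arithmetic of the continuity correction in one place, here is a minimal Python sketch (my illustration, not part of the original solution); it uses only the standard library, with the error function standing in for the standard normal distribution function Φ:

```python
from math import erf, sqrt

def std_normal_cdf(x):
    # Phi(x) expressed via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

lam = 250 * 0.60          # endorsements + cancellations: thinned Poisson mean
mean, var = lam, lam      # for a Poisson, variance equals the mean
# "more than 156" with the continuity correction uses 156.5 as the cutoff
prob = 1.0 - std_normal_cdf((156.5 - mean) / sqrt(var))
print(prob)               # roughly 0.298
```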
4.57. D. S(50) = e^(-50/15) = 0.03567. Large losses are Poisson with mean: 10 S(50) = 0.3567.
Over 30 days, large losses are Poisson with mean: (30)(0.3567) = 10.70.
Prob[exactly 9 large losses in a 30-day period] = e^(-10.7) 10.7^9/9! = 11.4%.
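As a quick numerical check of the thinning argument (an illustrative sketch of mine, not part of the original solution), the following Python snippet recomputes the 11.4% answer directly:

```python
from math import exp, factorial

theta = 15000.0
S_50000 = exp(-50000.0 / theta)       # survival probability of the Exponential severity
lam_large = 30 * 10 * S_50000         # thinned Poisson mean over 30 days, about 10.70
prob_9 = exp(-lam_large) * lam_large**9 / factorial(9)
print(prob_9)                         # roughly 0.114
```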


Section 5, Geometric Distribution


The Geometric Distribution, a special case of the Negative Binomial Distribution, will be discussed
first.

Geometric Distribution

Parameters: β > 0.          Support: x = 0, 1, 2, 3, ...
D. f.:      F(x) = 1 - {β/(1+β)}^(x+1)
P. d. f.:   f(x) = β^x / (1+β)^(x+1)
f(0) = 1/(1+β).   f(1) = β/(1+β)^2.   f(2) = β^2/(1+β)^3.   f(3) = β^3/(1+β)^4.
Mean = β          Variance = β(1+β)          Variance / Mean = 1 + β > 1.
Coefficient of Variation = √((1+β)/β).        Skewness = (1+2β)/√(β(1+β)).
Kurtosis = 3 + (6β^2 + 6β + 1)/(β(1+β)).
Mode = 0.
Probability Generating Function: P(z) = 1/{1 - β(z-1)}, z < 1 + 1/β.
f(x+1)/f(x) = a + b/(x+1), a = β/(1+β), b = 0, f(0) = 1/(1+β).
Moment Generating Function: M(s) = 1/{1 - β(e^s - 1)}, s < ln(1+β) - ln(β).
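As an illustrative sketch (mine, not from the text), the formulas above translate directly into code; the parameter name beta below is just the Loss Models β:

```python
def geometric_pmf(x, beta):
    # f(x) = beta^x / (1+beta)^(x+1), x = 0, 1, 2, ...
    return beta**x / (1.0 + beta)**(x + 1)

def geometric_cdf(x, beta):
    # F(x) = 1 - {beta/(1+beta)}^(x+1)
    return 1.0 - (beta / (1.0 + beta))**(x + 1)

beta = 4.0
mean = beta                      # mean = beta
variance = beta * (1.0 + beta)   # variance = beta(1+beta)
print(geometric_pmf(0, beta), geometric_cdf(2, beta), mean, variance)
```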


Using the notation in Loss Models, the Geometric Distribution is:
f(x) = {β/(1+β)}^x {1/(1+β)} = β^x/(1+β)^(x+1), x = 0, 1, 2, 3, ...
For example, for β = 4, f(x) = 4^x/5^(x+1), x = 0, 1, 2, 3, ...


A Geometric Distribution for β = 4:
[graph of the probability density f(x), declining geometrically from f(0) = 0.2 toward zero by x = 20]

The densities decline geometrically by a factor of β/(1+β); f(x+1)/f(x) = β/(1+β).
This is similar to the Exponential Distribution, for which f(x+1)/f(x) = e^(-1/θ).
The Geometric Distribution is the discrete analog of the continuous Exponential Distribution.


For q = 0.3 or β = 0.7/0.3 = 2.333, the Geometric distribution is:
Number of Claims   f(x)      F(x)      x times f(x)   x^2 times f(x)
0                  0.30000   0.30000   0.00000        0.00000
1                  0.21000   0.51000   0.21000        0.21000
2                  0.14700   0.65700   0.29400        0.58800
3                  0.10290   0.75990   0.30870        0.92610
4                  0.07203   0.83193   0.28812        1.15248
5                  0.05042   0.88235   0.25211        1.26052
6                  0.03529   0.91765   0.21177        1.27061
7                  0.02471   0.94235   0.17294        1.21061
8                  0.01729   0.95965   0.13836        1.10684
9                  0.01211   0.97175   0.10895        0.98059
10                 0.00847   0.98023   0.08474        0.84743
11                 0.00593   0.98616   0.06525        0.71777
12                 0.00415   0.99031   0.04983        0.59794
13                 0.00291   0.99322   0.03779        0.49123
14                 0.00203   0.99525   0.02849        0.39880
15                 0.00142   0.99668   0.02136        0.32046
16                 0.00100   0.99767   0.01595        0.25523
17                 0.00070   0.99837   0.01186        0.20169
18                 0.00049   0.99886   0.00879        0.15828
19                 0.00034   0.99920   0.00650        0.12345
20                 0.00024   0.99944   0.00479        0.09575
21                 0.00017   0.99961   0.00352        0.07390
22                 0.00012   0.99973   0.00258        0.05677
23                 0.00008   0.99981   0.00189        0.04343
24                 0.00006   0.99987   0.00138        0.03311
25                 0.00004   0.99991   0.00101        0.02515
Sum                                    2.33           13.15

As computed above, the mean is about 2.33. The second moment is about 13.15, so that the
variance is about 13.15 - 2.33^2 = 7.72. Since the Geometric has a significant tail, the terms involving
the number of claims greater than 25 would have to be taken into account in order to compute a
more accurate value of the variance or higher moments. Rather than taking additional terms it is better
to have a general formula for the moments.
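To illustrate the point about truncation (a sketch of my own, not in the original text), summing only the first 26 terms for β = 7/3 recovers the mean closely but understates the variance relative to the exact β(1+β) = 7.78:

```python
beta = 7.0 / 3.0                       # q = 0.3 corresponds to beta = 0.7/0.3
f = lambda x: beta**x / (1.0 + beta)**(x + 1)

mean_trunc = sum(x * f(x) for x in range(26))          # terms x = 0, ..., 25 only
second_trunc = sum(x * x * f(x) for x in range(26))
var_trunc = second_trunc - mean_trunc**2
print(mean_trunc, var_trunc)           # about 2.33 and 7.7, versus exact 2.333 and 7.78
```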


The mean can be computed as follows:
E[X] = Σ_{j=0}^∞ Prob[X > j] = Σ_{j=0}^∞ {β/(1+β)}^(j+1) = {β/(1+β)} / {1 - β/(1+β)} = β.
Thus the mean for the Geometric distribution is β. For the example, β = 2.333 = mean.
The variance of the Geometric is β(1+β), which for β = 2.333 is 7.78.23
Survival Function:
Note that there is a small but positive chance of any very large number of claims.
Specifically for the Geometric distribution the chance of x > j is:
Σ_{i=j+1}^∞ β^i/(1+β)^(i+1) = {1/(1+β)} Σ_{i=j+1}^∞ {β/(1+β)}^i = {1/(1+β)} {β/(1+β)}^(j+1) / {1 - β/(1+β)} = {β/(1+β)}^(j+1).
1 - F(x) = S(x) = {β/(1+β)}^(x+1).
For example, the chance of more than 19 claims is 0.7^20 = 0.00080, so that
F(19) = 1 - 0.00080 = 0.99920, which matches the result above.
Thus for a Geometric Distribution, for n > 0, the chance of at least n claims is {β/(1+β)}^n.
The survival function decreases geometrically. The chance of 0 claims from a Geometric is:
1/(1+β) = 1 - β/(1+β) = 1 - geometric factor of decline of the survival function.
Exercise: There is a 0.25 chance of 0 claims, 0.75 chance of at least one claim,
0.75^2 chance of at least 2 claims, 0.75^3 chance of at least 3 claims, etc. What distribution is this?
[Solution: This is a Geometric Distribution with 1/(1+β) = 0.25, β/(1+β) = 0.75, or β = 3.]
For the Geometric, F(x) = 1 - {β/(1+β)}^(x+1). Thus the Geometric distribution is the discrete analog of
the continuous Exponential Distribution, which has F(x) = 1 - e^(-x/θ) = 1 - (exp[-1/θ])^x.
In each case the density function decreases by a constant multiple as x increases.
For the Geometric Distribution: f(x) = {β/(1+β)}^x / (1+β),
while for the Exponential Distribution: f(x) = e^(-x/θ)/θ = (exp[-1/θ])^x / θ.
23 The variance is shown in Appendix B attached to the exam. One way to get the variance as well as higher moments
is via the probability generating function and factorial moments, as will be discussed subsequently.


Memoryless Property:
The Geometric shares with the Exponential distribution the memoryless property.24 Given that
there are at least m claims, the probability distribution of the number of claims in excess of m does
not depend on m. In other words, if one were to truncate and shift a Geometric
Distribution, then one obtains the same Geometric Distribution.
Exercise: Let the number of claims be given by a Geometric Distribution with β = 1.7. Eliminate
from the data all instances where there are 3 or fewer claims and subtract 4
from each of the remaining data points. (Truncate and shift at 4.)
What is the resulting distribution?
[Solution: Due to the memoryless property, the result is a Geometric Distribution with β = 1.7.]
Generally, let f(x) be the original Geometric Distribution. Let g(x) be the truncated and shifted
distribution. Take as an example, a truncation point of 4 as in the exercise.
Then g(x) = f(x+4) / {1 - (f(0) + f(1) + f(2) + f(3))} = f(x+4) / S(3) = {β^(x+4)/(1+β)^(x+5)} / {β^4/(1+β)^4} = β^x/(1+β)^(x+1),
which is again a Geometric Distribution with the same parameter β.
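The truncate-and-shift algebra can also be checked numerically; this small sketch (my illustration, not from the text) confirms that conditioning on at least 4 claims and shifting back by 4 reproduces the original densities:

```python
beta = 1.7
f = lambda x: beta**x / (1.0 + beta)**(x + 1)
S3 = (beta / (1.0 + beta))**4          # S(3) = Prob[N >= 4]

for x in range(5):
    g = f(x + 4) / S3                  # truncated and shifted density
    print(x, round(g, 6), round(f(x), 6))   # the two columns agree
```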

Constant Force of Mortality:


Another application where the Geometric Distribution arises is constant force of mortality, when one
only looks at regular time intervals rather than at time continuously.25
Exercise: Every year in which Jim starts off alive, he has a 10% chance of dying during that year.
If Jim is currently alive, what is the distribution of his curtate future lifetime?
[Solution: There is a 10% chance he dies during the first year, and has a curtate future lifetime of 0.
If he survives the first year, there is a 10% chance he dies during the second year. Thus there is a
(0.9)(0.1) = 0.09 chance he dies during the second year, and has a curtate future lifetime of 1. If he
survives the second year, which has probability 0.9^2, there is a 10% chance he dies during the third
year. Prob[curtate future lifetime = 2] = (0.9^2)(0.1).
Similarly, Prob[curtate future lifetime = n] = (0.9^n)(0.1).
This is a Geometric Distribution with β/(1+β) = 0.9, or β = 0.9/0.1 = 9.]
24 See Loss Models, pages 105-106. It is due to this memoryless property of the Exponential and Geometric
distributions, that they have constant mean residual lives, as discussed subsequently.
25 When one has a constant force of mortality and looks at time continuously, one gets the Exponential Distribution,
the continuous analog of the Geometric Distribution.


In general, if there is a constant probability of death each year q, then the curtate future lifetime,26 K(x),
follows a Geometric Distribution, with β = (1-q)/q =
probability of continuing sequence / probability of ending sequence.
Therefore, for a constant probability of death each year, q, the curtate expectation of life,27 e_x, is
β = (1-q)/q, the mean of this Geometric Distribution. The variance of the curtate future lifetime is:
β(1+β) = {(1-q)/q}(1/q) = (1-q)/q^2.
Exercise: Every year in which Jim starts off alive, he has a 10% chance of dying during that year.
What is Jim's curtate expectation of life and variance of his curtate future lifetime?
[Solution: Jim's curtate future lifetime is Geometric, with mean = β = (1 - 0.1)/0.1 = 9,
and variance β(1+β) = (9)(10) = 90.]
Exercise: Jim has a constant force of mortality, μ = 0.10536.
What is the distribution of Jim's future lifetime?
What is Jim's complete expectation of life?
What is the variance of Jim's future lifetime?
[Solution: It is Exponential, with mean θ = 1/μ = 9.49, and variance θ^2 = 90.1.
Comment: Jim has a 1 - e^(-0.10536) = 10% chance of dying each year in which he starts off alive.
However, here we look at time continuously.]
With a constant force of mortality:
observe continuously ⇒ Exponential Distribution.
observe at discrete intervals ⇒ Geometric Distribution.
Exercise: Every year in which Jim starts off alive, he has a 10% chance of dying during that year.
Jims estate will be paid $1 at the end of the year of his death.
At a 5% annual rate of interest, what is the present value of this benefit?

26 The curtate future lifetime is the number of whole years completed prior to death.
See page 54 of Actuarial Mathematics.
27 The curtate expectation of life is the expected value of the curtate future lifetime.
See page 69 of Actuarial Mathematics.


[Solution: If Jim's curtate future lifetime is n, there is a payment of 1 at time n + 1.
Present value = Σ_{n=0}^∞ f(n) v^(n+1) = Σ_{n=0}^∞ (9^n/10^(n+1)) 0.9524^(n+1) = (0.09524)/(1 - 0.8572) = 0.667.]
In general, if there is a constant probability of death each year q, the present value of $1 paid at the
end of the year of death is:
Σ_{n=0}^∞ {β^n/(1+β)^(n+1)} v^(n+1) = {v/(1+β)} / {1 - vβ/(1+β)} = 1/{(1+β)/v - β}
= 1/{(1+β)(1+i) - β} = 1/{(1+i)/q - (1-q)/q} = q/(q + i).28
Exercise: Every year in which Jim starts off alive, he has a 10% chance of dying during that year.
At a 5% annual rate of interest, what is the present value of the benefits from an annuity immediate
with annual payment of 1?
[Solution: If Jim's curtate future lifetime is n, there are n payments of 1 each at times:
1, 2, 3, ..., n. The present value of these payments is: v + v^2 + ... + v^n = (1 - v^n)/i.
Present value = Σ_{n=0}^∞ f(n)(1 - v^n)/i = (1/i){Σ f(n) - Σ (9^n/10^(n+1)) v^n} =
20{1 - (0.1)/(1 - 0.9/1.05)} = 6.]
In general, if there is a constant probability of death each year q, the present value of the benefits
from an annuity immediate with annual payment of 1 is:
Σ_{n=0}^∞ f(n)(1 - v^n)/i = (1/i){Σ_{n=0}^∞ f(n) - Σ_{n=0}^∞ β^n v^n/(1+β)^(n+1)} =
(1/i){1 - [1/(1+β)]/[1 - vβ/(1+β)]} = (1/i){1 - 1/(1 + β - vβ)} = (1/i){(β - vβ)/(1 + β - vβ)} =
(1/(1+i)){β/(1 + β - vβ)} = (1/(1+i)){1/(1/β + 1 - v)} = (1/(1+i)){1/(q/(1-q) + 1 - v)} =
(1/(1+i)){(1-q)/(q + (1 - v)(1-q))} = (1-q)/{q(1+i) + i(1-q)} = (1-q)/(q + i).29
In the previous exercise, the present value of benefits is: (1 - 0.1)/(0.1 + 0.05) = 0.9/0.15 = 6.
For i = 0, (1-q)/(q+i) becomes (1-q)/q, the mean of the Geometric Distribution of curtate future
lifetimes. For q = 0, (1-q)/(q+i) becomes 1/i, the present value of a perpetuity, with the first
payment one year from now.
28 With a constant force of mortality μ, the present value of $1 paid at the time of death is: μ/(μ + δ).
See page 99 of Actuarial Mathematics.
29 With a constant force of mortality μ, the present value of a life annuity paid continuously is: 1/(μ + δ).
See page 136 of Actuarial Mathematics.
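A brief numerical cross-check of the two closed forms, using the q and i from the exercises above (a sketch of my own, not part of the original text):

```python
q, i = 0.10, 0.05
v = 1.0 / (1.0 + i)
f = lambda n: (1 - q)**n * q            # Prob[curtate future lifetime = n]

# $1 at the end of the year of death, summed directly vs. q/(q+i)
apv_insurance = sum(f(n) * v**(n + 1) for n in range(2000))
# annuity immediate of 1 per year, summed directly vs. (1-q)/(q+i)
apv_annuity = sum(f(n) * (1 - v**n) / i for n in range(2000))
print(apv_insurance, q / (q + i))        # both about 0.667
print(apv_annuity, (1 - q) / (q + i))    # both about 6.0
```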


Series of Bernoulli Trials:
For a series of Bernoulli trials with chance of success 0.3, the probability that there are no successes in
the first four trials is: (1 - 0.3)^4 = 0.24.
Exercise: What is the probability that there is no success in the first four trials and the fifth trial is a
success?
[Solution: (1 - 0.3)^4 (0.3) = 0.072 = the probability of the first success occurring on the fifth trial.]
In general, the chance of the first success after x failures is: (1 - 0.3)^x (0.3).
More generally, take a series of Bernoulli trials with chance of success q. The probability of the first
success on trial x+1 is: (1-q)^x q.
f(x) = (1-q)^x q,   x = 0, 1, 2, 3, ...
This is the Geometric distribution. It is a special case of the Negative Binomial Distribution.30
Loss Models uses the notation β, where q = 1/(1+β).
β = (1-q)/q = probability of a failure / probability of a success.
For a series of independent identical Bernoulli trials, the chance of the first success
following x failures is given by a Geometric Distribution with mean:
β = chance of a failure / chance of a success.
The number of trials = 1 + number of failures = 1 + Geometric.
The Geometric Distribution shows up in many applications, including Markov Chains and Ruin
Theory. In many contexts:
β = probability of continuing sequence / probability of ending sequence
= probability of remaining in the loop / probability of leaving the loop.

30 The Geometric distribution with parameter β is the Negative Binomial Distribution with parameters β and r = 1.
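As a sketch (mine, not from the text) of the failures-before-first-success interpretation, a short simulation should roughly match the Geometric densities with β = 0.7/0.3:

```python
import random

random.seed(1)
q = 0.3                                 # chance of success on each trial
beta = (1 - q) / q                      # chance of failure / chance of success

def failures_before_first_success():
    count = 0
    while random.random() >= q:         # failure with probability 1 - q
        count += 1
    return count

trials = [failures_before_first_success() for _ in range(100_000)]
for x in range(4):
    simulated = trials.count(x) / len(trials)
    exact = beta**x / (1 + beta)**(x + 1)    # 0.3, 0.21, 0.147, 0.1029
    print(x, round(simulated, 4), round(exact, 4))
```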


Problems:
The following five questions all deal with a Geometric distribution with β = 0.6.
5.1 (1 point) What is the mean?
(A) 0.4

(B) 0.5

(C) 0.6

(D) 2/3

(E) 1.5

5.2 (1 point) What is the variance?


A. less than 1.0
B. at least 1.0 but less than 1.1
C. at least 1.1 but less than 1.2
D. at least 1.2 but less than 1.3
E. at least 1.3
5.3 (2 points) What is the chance of having 3 claims?
A. less than 3%
B. at least 3% but less than 4%
C. at least 4% but less than 5%
D. at least 5% but less than 6%
E. at least 6%
5.4 (2 points) What is the mode?
A. 0
B. 1
C. 2

D. 3

E. None of A, B, C, or D.

5.5 (2 points) What is the chance of having 3 claims or more?


A. less than 3%
B. at least 3% but less than 4%
C. at least 4% but less than 5%
D. at least 5% but less than 6%
E. at least 6%


5.6 (1 point) The variable N is generated by the following algorithm:


(1) N = 0.
(2) 25% chance of exiting.
(3) N = N + 1.
(4) Return to step #2.
What is the variance of N?
A. less than 10
B. at least 10 but less than 15
C. at least 15 but less than 20
D. at least 20 but less than 25
E. at least 25
5.7 (2 points) Use the following information:

Assume a Rating Bureau has been making Workers Compensation classification


rates for a very, very long time.
Assume every year the rate for the Carpenters class is based on a credibility
weighting of the indicated rate based on the latest year of data and the current rate.
Each year, the indicated rate for the Carpenters class is given 20% credibility.

Each year, the rate for year Y, was based on the data from year Y-3 and the rate
in the year Y-1. Specifically, the rate in the year 2001 is based on the data from
1998 and the rate in the year 2000.
What portion of the rate in the year 2001 is based on the data from the year 1990?
A. less than 1%
B. at least 1% but less than 2%
C. at least 2% but less than 3%
D. at least 3% but less than 4%
E. at least 4%
5.8 (3 points) An insurance company has stopped writing new general liability insurance policies.
However, the insurer is still paying claims on previously written policies. Assume for simplicity that
payments are made at the end of each quarter of a year. It is estimated that at the end of each
quarter of a year the insurer pays 8% of the total amount remaining to be paid. The next payment
will be made today.
Let X be the total amount the insurer has remaining to pay.
Let Y be the present value of the total amount the insurer has remaining to pay.
If the annual rate of interest is 5%, what is Y/X?
A. 0.80   B. 0.82   C. 0.84   D. 0.86   E. 0.88


Use the following information for the next 4 questions:


There is a constant force of mortality of 3%.
There is an annual interest rate of 4%.
5.9 (1 point) What is the curtate expectation of life?
(A) 32.0
(B) 32.2
(C) 32.4
(D) 32.6

(E) 32.8

5.10 (1 point) What is variance of the curtate future lifetime?


(A) 900
(B) 1000
(C) 1100
(D) 1200
(E) 1300
5.11 (2 points) What is the actuarial present value of a life insurance which pays 100,000 at the end
of the year of death?
(A) 41,500 (B) 42,000 (C) 42,500 (D) 43,000 (E) 43,500
5.12 (2 points) What is the actuarial present value of an annuity immediate which pays 10,000 per
year?
(A) 125,000
(B) 130,000
(C) 135,000
(D) 140,000
(E) 145,000
5.13 (1 point) After each time Mark Orfe eats at a restaurant, there is 95% chance he will eat there
again at some time in the future. Mark has eaten today at the Phoenicia Restaurant.
What is the probability that Mark will eat at the Phoenicia Restaurant precisely 7 times in the future?
A. 2.0%
B. 2.5%
C. 3.0%
D. 3.5%
E. 4.0%
5.14 (3 points) Use the following information:
The number of days of work missed by a work related injury to a worker's arm is
Geometrically distributed with β = 4.
If a worker is disabled for 5 days or less, nothing is paid for his lost wages under
workers compensation insurance.
If he is disabled for more than 5 days due to a work related injury, workers
compensation insurance pays him his wages for all of the days he was out of work.
What is the average number of days of wages reimbursed under workers compensation insurance
for a work related injury to a worker's arm?
(A) 2.2
(B) 2.4
(C) 2.6
(D) 2.8
(E) 3.0


Use the following information for the next two questions:


The variable X is generated by the following algorithm:
(1) X = 0.
(2) Roll a fair die with six sides and call the result Y.
(3) X = X + Y.
(4) If Y = 6 return to step #2.
(5) Exit.
5.15 (2 points) What is the mean of X?
A. less than 4.0
B. at least 4.0 but less than 4.5
C. at least 4.5 but less than 5.0
D. at least 5.0 but less than 5.5
E. at least 5.5
5.16 (2 points) What is the variance of X?
A. less than 8
B. at least 8 but less than 9
C. at least 9 but less than 10
D. at least 10 but less than 11
E. at least 11
5.17 (1 point) N follows a Geometric Distribution with β = 0.2. What is Prob[N = 1 | N ≤ 1]?
A. 8%

B. 10%

C. 12%

D. 14%

E. 16%

5.18 (1 point) N follows a Geometric Distribution with β = 0.4. What is Prob[N = 2 | N ≥ 2]?
A. 62%

B. 65%

C. 68%

D. 71%

E. 74%

5.19 (2 points) N follows a Geometric Distribution with β = 1.5. What is E[1/(N+1)]?
Hint: x + x^2/2 + x^3/3 + x^4/4 + ... = -ln(1-x), for 0 < x < 1.
A. 0.5

B. 0.6

C. 0.7

D. 0.8

E. 0.9

5.20 (2 points) N follows a Geometric Distribution with β = 0.8. What is E[N | N > 1]?
A. 2.6

B. 2.7

C. 2.8

D. 2.9

E. 3.0


5.21 (2 points) Use the following information:

Larry, his brother Darryl, and his other brother Darryl are playing as a three man
basketball team at the school yard.

Larry, Darryl, and Darryl have a 20% chance of winning each game, independent of any
other game.

When a teams turn to play comes, they play the previous winning team.
Each time a team wins a game it plays again.
Each time a team loses a game it sits down and waits for its next chance to play.
It is currently the turn of Larry, Darryl, and Darryl to play again after sitting for a while.
Let X be the number of games Larry, Darryl, and Darryl play until they sit down again.
What is the variance of X?
A. 0.10
B. 0.16
C. 0.20
D. 0.24
E. 0.31
Use the following information for the next two questions:
N follows a Geometric Distribution with β = 1.3.
Define (N - j)+ = N - j if N ≥ j, and 0 otherwise.
5.22 (2 points) Determine E[(N - 1)+].
A. 0.73   B. 0.76   C. 0.79   D. 0.82   E. 0.85
5.23 (2 points) Determine E[(N - 2)+].
A. 0.30   B. 0.33   C. 0.36   D. 0.39   E. 0.42

Use the following information for the next three questions:


Ethan is an unemployed worker. Ethan has a 25% probability of finding a job each week.
5.24 (2 points) What is the probability that Ethan is still unemployed after looking for a job for 6
weeks?
A. 12%
B. 14%
C. 16%
D. 18%
E. 20%
5.25 (1 point) If Ethan finds a job the first week he looks, count this as being unemployed 0 weeks.
If Ethan finds a job the second week he looks, count this as being unemployed 1 week, etc.
What is the mean number of weeks that Ethan remains unemployed?
A. 2
B. 3
C. 4
D. 5
E. 6
5.26 (1 point) What is the variance of the number of weeks that Ethan remains unemployed?
A. 12
B. 13
C. 14
D. 15
E. 16


5.27 (3 points) For a discrete density p_k, define the entropy as: -Σ_{k=0}^∞ p_k ln[p_k].
Determine the entropy for a Geometric Distribution as per Loss Models.
Hint: Why is the mean of the Geometric Distribution β?
5.28 (2, 5/85, Q.44) (1.5 points) Let X denote the number of independent rolls of a fair die
required to obtain the first "3". What is P[X ≥ 6]?
A. (1/6)^5 (5/6)   B. (1/6)^5   C. (5/6)^5 (1/6)   D. (5/6)^6   E. (5/6)^5

5.29 (2, 5/88, Q.22) (1.5 points) Let X be a discrete random variable with probability function
P[X = x] = 2/3^x for x = 1, 2, 3, . . .  What is the probability that X is even?
A. 1/4   B. 2/7   C. 1/3   D. 2/3   E. 3/4

5.30 (2, 5/90, Q.5) (1.7 points) A fair die is tossed until a 2 is obtained. If X is the number of trials
required to obtain the first 2, what is the smallest value of x for which P[X ≤ x] ≥ 1/2?
A. 2 B. 3 C. 4 D. 5 E. 6
5.31 (2, 5/92, Q.35) (1.7 points) Ten percent of all new businesses fail within the first year. The
records of new businesses are examined until a business that failed within the first year is found. Let
X be the total number of businesses examined which did not fail within the first year, prior to finding a
business that failed within the first year. What is the probability function for X?
A. 0.1(0.9x) for x = 0, 1, 2, 3,...

B. 0.9x(0.1x) for x = 1, 2, 3,...

C. 0.1x(0.9x) for x = 0, 1, 2, 3,...

D. 0.9x(0.1x) for x = 1, 2,3,...

E. 0.1(x - 1)(0.9x) for x = 2, 3, 4,...

5.32 (Course 1 Sample Exam, Q. 7) (1.9 points) As part of the underwriting process for
insurance, each prospective policyholder is tested for high blood pressure. Let X represent the
number of tests completed when the first person with high blood pressure is found.
The expected value of X is 12.5.
Calculate the probability that the sixth person tested is the first one with high blood pressure.
A. 0.000
B. 0.053
C. 0.080
D. 0.316
E. 0.394
5.33 (1, 5/00, Q.36) (1.9 points) In modeling the number of claims filed by an individual under an
automobile policy during a three-year period, an actuary makes the simplifying assumption that for all
integers n ≥ 0, p_(n+1) = p_n/5, where p_n represents the probability that the policyholder files n claims
during the period. Under this assumption, what is the probability that a policyholder files more than
one claim during the period?
(A) 0.04
(B) 0.16
(C) 0.20
(D) 0.80
(E) 0.96


5.34 (1, 11/01, Q.33) (1.9 points) An insurance policy on an electrical device pays a benefit of
4000 if the device fails during the first year. The amount of the benefit decreases by 1000 each
successive year until it reaches 0. If the device has not failed by the beginning of any given year,
the probability of failure during that year is 0.4.
What is the expected benefit under this policy?
(A) 2234
(B) 2400
(C) 2500
(D) 2667
(E) 2694


Solutions to Problems:
5.1. C. mean = β = 0.6.
5.2. A. variance = β(1+β) = (0.6)(1.6) = 0.96.
5.3. B. f(x) = β^x/(1+β)^(x+1). f(3) = 0.6^3/1.6^4 = 3.30%.
5.4. A. The mode is 0, since f(0) is larger than any other value.
n    f(n)
0    0.6250
1    0.2344
2    0.0879
3    0.0330
4    0.0124
5    0.0046
6    0.0017
7    0.0007
Comment: Just as with the Exponential Distribution, the Geometric Distribution always has a mode
of zero.
5.5. D. 1 - {f(0) + f(1) + f(2)} = 1 - (0.6250 + 0.2344 + 0.0879) = 5.27%.
Alternately, S(x) = {β/(1+β)}^(x+1). S(2) = (0.6/1.6)^3 = 5.27%.
5.6. B. This is a loop, in which each time through there is a 25% chance of exiting and a 75% chance of
staying in the loop. Therefore, N is Geometric with
β = probability of remaining in the loop / probability of leaving the loop = 0.75/0.25 = 3.
Variance = β(1 + β) = (3)(4) = 12.
5.7. D. The weight given to the data from year 1997 in the rate for year 2000 is 0.20.
Therefore, the weight given to the data from year 1997 in the rate for year 2001 is: (1 - 0.2)(0.2).
Similarly, the weight given to the data from year 1996 in the rate for year 2001 is: (1 - 0.2)^2 (0.2).
The weight given to the data from year 1990 in the rate for year 2001 is: (1 - 0.2)^8 (0.2) = 3.4%.
Comment: The weights are from a geometric distribution with β = 1/Z - 1 and β/(1+β) = 1 - Z. The
weights are: (1-Z)^n Z for n = 0, 1, 2, ... Older years of data get less weight.
This is a simplification of a real world application, as discussed in "An Example of Credibility
and Shifting Risk Parameters," by Howard C. Mahler, PCAS 1990.


5.8. E. Let Z be the amount remaining to be paid prior to quarter n. Then the payment in quarter n
is 0.08Z. This leaves 0.92Z remaining to be paid prior to quarter n+1. Thus the payment in quarter n+1
is (0.08)(0.92)Z. The payment in quarter n+1 is 0.92 times the payment in quarter n. The payments
each quarter decline by a factor of 0.92.
Therefore, the proportion of the total paid in each quarter is a Geometric Distribution with β/(1+β) =
0.92. ⇒ β = 0.92/(1 - 0.92) = 11.5. The payment at the end of quarter n is:
X f(n) = X β^n/(1+β)^(n+1), n = 0, 1, 2, ... (The sum of these payments is X.)
The present value of the payment at the end of quarter n is:
X f(n)/(1.05)^(n/4) = X (0.9879^n) β^n/(1+β)^(n+1), n = 0, 1, 2, ...
Y, the total present value is:
Σ_{n=0}^∞ X (0.9879^n) β^n/(1+β)^(n+1) = {X/(1+β)} Σ_{n=0}^∞ (0.9879^n)(β/(1+β))^n = (X/12.5) Σ_{n=0}^∞ (0.9879^n)(0.92^n)
= (X/12.5)/(1 - 0.9089) = 0.878X. Y/X = 0.878.
Comment: In "Measuring the Interest Rate Sensitivity of Loss Reserves," by Richard Gorvett
and Stephen D'Arcy, PCAS 2000, a geometric payment pattern is used in order to estimate
Macaulay durations, modified durations, and effective durations.
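A quick numeric check of the 0.878 factor (an illustrative sketch of mine, not part of the original solution):

```python
quarterly_discount = 1.05 ** -0.25     # about 0.9879
paid_fraction = 0.08
factor = 0.0
remaining = 1.0
for n in range(2000):                  # payment at the end of quarter n, discounted n quarters
    factor += paid_fraction * remaining * quarterly_discount ** n
    remaining *= 1.0 - paid_fraction
print(factor)                          # about 0.878
```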
5.9. E. If a person is alive at the beginning of the year, the chance they die during the next year is:
1 - e^(-μ) = 1 - e^(-0.03) = 0.02955. Therefore, the distribution of curtate future lifetimes is Geometric with
mean β = (1-q)/q = 0.97045/0.02955 = 32.84 years.
Alternately, the complete expectation of life is: 1/μ = 1/0.03 = 33.33.
The curtate future lifetime is on average about 1/2 less; 33.33 - 1/2 = 32.83.
5.10. C. The variance of the Geometric Distribution is: β(1+β) = (32.84)(33.84) = 1111.
Alternately, the future lifetime is Exponentially distributed with mean θ = 1/μ = 1/0.03 = 33.33, and
variance θ^2 = 33.33^2 = 1111. Since approximately they differ by a constant, 1/2, the variance of the
curtate future lifetimes is approximately that of the future lifetimes, 1111.
5.11. C. With constant probability of death, q, the present value of the insurance is: q/(q + i).
(100,000)q/(q + i) = (100,000)(0.02955)/(0.02955 + 0.04) = 42,487.
Alternately, the present value of an insurance that pays at the moment of death is:
μ/(μ + δ) = 0.03/(0.03 + ln(1.04)) = 0.03/(0.03 + 0.03922) = 0.43340. (100,000)(0.43340) = 43,340.
The insurance paid at the end of the year of death is paid on average about 1/2 year later:
43,340/(1.04^0.5) = 42,498.


5.12. D. With constant probability of death, q, the present value of an annuity immediate is:
(1-q)/(q + i).
(10,000)(1-q)/(q + i) = (10,000)(1 - 0.02955)/(0.02955 + 0.04) = 139,533.
Alternately, the present value of an annuity that pays continuously is:
1/(μ + δ) = 1/(0.03 + ln(1.04)) = 1/(0.03 + 0.03922) = 14.4467. (10,000)(14.4467) = 144,467.
Discounting for another half year of interest and mortality, the present value of the annuity immediate
is approximately: 144,467/{(1.04^0.5)(1.03^0.5)} = 139,583.
5.13. D. There is a 95% chance Mark will return. If he returns, there is another 95% chance he will
return again, etc. The chance of returning 7 times and then not returning an 8th time is:
(0.95^7)(0.05) = 3.5%.
Comment: The number of future visits is a Geometric Distribution with β =
probability of continuing sequence / probability of ending sequence = 0.95/0.05 = 19.
f(7) = β^7/(1+β)^8 = 19^7/20^8 = 3.5%.
5.14. C. If he is disabled for n days, then he is paid 0 if n ≤ 5, and n days of wages if n ≥ 6.
Therefore, the mean number of days of wages paid is:
Σ_{n=6}^∞ n f(n) = Σ_{n=0}^∞ n f(n) - Σ_{n=0}^5 n f(n) = E[N] - {0·f(0) + 1·f(1) + 2·f(2) + 3·f(3) + 4·f(4) + 5·f(5)} =
4 - {(1)(0.2)(0.8) + (2)(0.2)(0.8^2) + (3)(0.2)(0.8^3) + (4)(0.2)(0.8^4) + (5)(0.2)(0.8^5)} = 2.62.
Alternately, due to the memoryless property of the Geometric Distribution (analogous to its
continuous analog the Exponential), truncated and shifted from below at 6, we get the same
Geometric Distribution. Thus if only those days beyond 6 were paid for, the average nonzero
payment is 4. However, in each case where we have at least 6 days of disability we pay the full
length of disability which is 6 days longer, so the average nonzero payment is: 4 + 6 = 10.
The probability of a nonzero payment is: 1 - {f(0) + f(1) + f(2) + f(3) + f(4) + f(5)} =
1 - {0.2 + (0.2)(0.8) + (0.2)(0.8^2) + (0.2)(0.8^3) + (0.2)(0.8^4) + (0.2)(0.8^5)} = 0.262.
Thus the average payment (including zero payments) is: (0.262)(10 days) = 2.62 days.
Comment: Just an exam type question, not intended as a model of the real world.
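A direct summation confirms the 2.62 days (a sketch of my own, not part of the original solution):

```python
beta = 4.0
f = lambda n: beta**n / (1.0 + beta)**(n + 1)   # equals 0.2 * 0.8^n

# wages are reimbursed for all n days whenever n >= 6, and for none otherwise
expected_days = sum(n * f(n) for n in range(6, 2000))
print(expected_days)   # about 2.62
```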


5.15. B. & 5.16. D. The number of additional dice rolled beyond the first is Geometric with
β = probability of remaining in the loop / probability of leaving the loop = (1/6)/(5/6) = 1/5.
Let N be the number of dice rolled; then N - 1 is Geometric with β = 1/5.
X = 6(N - 1) + the result of the last 6-sided die rolled.
The result of the last six-sided die to be rolled is equally likely to be a 1, 2, 3, 4 or 5 (it can't be a six
or we would have rolled an additional die.)
E[X] = (6)(mean of a Geometric with β = 1/5) + (average of 1, 2, 3, 4, 5) = (6)(1/5) + 3 = 4.2.
Variance of the distribution equally likely to be 1, 2, 3, 4, or 5 is: (2^2 + 1^2 + 0^2 + 1^2 + 2^2)/5 = 2.
Var[X] = 6^2 (variance of a Geometric with β = 1/5) + 2 = (36)(1/5)(6/5) + 2 = 10.64.
5.17. D. Prob[N = 1 | N ≤ 1] = Prob[N = 1]/Prob[N ≤ 1] = {β/(1+β)^2} / {1/(1+β) + β/(1+β)^2} =
β/(1 + 2β) = 0.2/1.4 = 0.143.
5.18. D. Prob[N = 2 | N ≥ 2] = Prob[N = 2]/Prob[N ≥ 2] = {β^2/(1+β)^3} / {β^2/(1+β)^2} = 1/(1+β).
Alternately, from the memoryless property, Prob[N = 2 | N ≥ 2] = Prob[N = 0] = 1/(1+β) = 0.714.

5.19. B. E[1/(N+1)] = Σ_{n=0}^∞ f(n)/(n+1) = Σ_{m=1}^∞ f(m-1)/m = (1/β) Σ_{m=1}^∞ {β/(1+β)}^m/m = (1/β){-ln(1 - β/(1+β))}
= ln(1+β)/β = ln(2.5)/1.5 = 0.611.


5.20. C. E[N | N > 1]Prob[N > 1] + (1) Prob[N = 1] + (0) Prob[N = 0] = E[N] = .
E[N | N > 1] = { - /(1+)2 } / {2/(1+)2 } = 2 + = 2.8.
5.21. E. X is 1 + a Geometric Distribution with
β = (chance of remaining in the loop)/(chance of leaving the loop) = 0.2/0.8 = 1/4.
Variance of X is: β(1+β) = (1/4)(5/4) = 5/16 = 0.3125.
Comment: Prob[X = 1] = 1 - 0.2 = 0.8. Prob[X = 2] = (0.2)(0.8). Prob[X = 3] = (0.2^2)(0.8).
Prob[X = 4] = (0.2^3)(0.8). Prob[X = x] = (0.2^(x-1))(0.8). While this is a series of Bernoulli trials, it ends
when the team has its first failure. X is the number of trials through the first failure.


5.22. A. E[(N - 1)+] = E[N] - E[N ∧ 1] = β - Prob[N ≥ 1] = β - β/(1+β) = β^2/(1+β) = 0.7348.
Alternately, E[(N-1)+] = E[(1-N)+] + E[N] - 1 = Prob[N = 0] + β - 1 =
β + 1/(1+β) - 1 = 1.3 + 1/2.3 - 1 = 0.7348.
Alternately, the memoryless property of the Geometric ⇒ E[(N-1)+]/Prob[N ≥ 1] = E[N] = β.
⇒ E[(N-1)+] = β Prob[N ≥ 1] = β·β/(1+β) = β^2/(1+β) = 0.7348.
5.23. E. E[(N - 2)+] = E[N] - E[N ∧ 2] = β - (Prob[N = 1] + 2 Prob[N ≥ 2]) =
β - β/(1+β)^2 - 2β^2/(1+β)^2 = {β(1+β)^2 - β - 2β^2}/(1+β)^2 = β^3/(1+β)^2 = 1.3^3/2.3^2 = 0.415.
Alternately, E[(N-2)+] = E[(2-N)+] + E[N] - 2 = 2 Prob[N = 0] + Prob[N = 1] + β - 2 =
β + 2/(1+β) + β/(1+β)^2 - 2 = 1.3 + 2/2.3 + 1.3/2.3^2 - 2 = 0.415.
Alternately, the memoryless property of the Geometric ⇒ E[(N-2)+]/Prob[N ≥ 2] = E[N] = β.
⇒ E[(N-2)+] = β Prob[N ≥ 2] = β·β^2/(1+β)^2 = β^3/(1+β)^2 = 0.415.
Comment: For integral j, for the Geometric, E[(N - j)+] = β^(j+1)/(1+β)^j.
5.24. D. Probability of finding a job within six weeks is:
(0.25){1 + 0.75 + 0.75^2 + 0.75^3 + 0.75^4 + 0.75^5} = 0.822. 1 - 0.822 = 17.8%.
5.25. B. The number of weeks he remains unemployed is Geometric with
β = (chance of failure)/(chance of success) = 0.75/0.25 = 3. Mean = β = 3.
5.26. A. Variance of this Geometric is: β(1 + β) = (3)(4) = 12.


5.27. The mean of a Geometric Distribution is β: Σ_{k=0}^∞ k p_k = β.
p_k = β^k/(1+β)^(k+1). ln[p_k] = k ln[β] - (k+1) ln[1+β] = {ln[β] - ln[1+β]} k - ln[1+β].
-Σ_{k=0}^∞ p_k ln[p_k] = {ln[1+β] - ln[β]} Σ_{k=0}^∞ k p_k + ln[1+β] Σ_{k=0}^∞ p_k = {ln[1+β] - ln[β]} β + ln[1+β] (1) =
(1+β) ln[1+β] - β ln[β].
Comment: The Shannon entropy from information theory, except there the log is to the base 2.
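A numeric check of the closed form (an illustrative sketch, not in the original):

```python
from math import log

beta = 2.0
f = lambda k: beta**k / (1.0 + beta)**(k + 1)
entropy_sum = -sum(f(k) * log(f(k)) for k in range(500))       # truncated direct sum
entropy_formula = (1.0 + beta) * log(1.0 + beta) - beta * log(beta)
print(entropy_sum, entropy_formula)    # both about 1.909
```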
5.28. E. Prob[X ≥ 6] = Prob[first 5 rolls each ≠ 3] = (5/6)^5.
Alternately, the number of failures before the first success, X - 1, is Geometric with
β = chance of failure / chance of success = (5/6)/(1/6) = 5.
Prob[X ≥ 6] = Prob[# failures ≥ 5] = 1 - F(4) = {β/(1+β)}^(4+1) = (5/6)^5.
5.29. A. Prob[X is even] = 2/3^2 + 2/3^4 + 2/3^6 + ... = (2/9)/(1 - 1/9) = 1/4.
Comment: X - 1 follows a Geometric Distribution with β = 1/2.
5.30. C. X - 1 is Geometric with β = chance of failure / chance of success = (5/6)/(1/6) = 5.
Prob[X > x] = Prob[X - 1 ≥ x] = {β/(1+β)}^x = (5/6)^x. Set this equal to 1/2: 1/2 = (5/6)^x.
⇒ x = ln(1/2)/ln(5/6) = 3.8. The next greatest integer is 4. P[X ≤ 4] = 1 - (5/6)^4 = 0.518 ≥ 1/2.
Alternately, Prob[X = x] = Prob[first x - 1 tosses ≠ 2] Prob[toss x = 2] = (5/6)^(x-1)/6.
X    Probability   Cumulative
1    0.1667        0.1667
2    0.1389        0.3056
3    0.1157        0.4213
4    0.0965        0.5177
5    0.0804        0.5981
6    0.0670        0.6651

5.31. A. X has a Geometric Distribution with β = chance of continuing / chance of ending = 0.9/0.1 = 9.
f(x) = 9^x/10^(x+1) = (0.1)(0.9^x), for x = 0, 1, 2, 3, ...


5.32. B. This is a series of Bernoulli trials, and X - 1 is the number of failures before the first success.
Thus X - 1 is Geometric. β = E[X - 1] = 12.5 - 1 = 11.5.
Prob[X = 6] = Prob[X - 1 = 5] = f(5) = β^5/(1+β)^6 = 11.5^5/12.5^6 = 0.0527.
Alternately, Prob[person has high blood pressure] = 1/E[X] = 1/12.5 = 8%.
Prob[sixth person is the first one with high blood pressure]
= Prob[first five don't have high blood pressure] Prob[sixth has high blood pressure]
= (1 - 0.08)^5 (0.08) = 0.0527.
5.33. A. The densities are declining geometrically.
Therefore, this is a Geometric Distribution, with β/(1+β) = 1/5. ⇒ β = 1/4.
Prob[more than one claim] = 1 - f(0) - f(1) = 1 - 1/(1+β) - β/(1+β)^2 = 1 - 4/5 - 4/25 = 0.04.
5.34. E. Expected Benefit =
(4000)(0.4) + (3000)(0.6)(0.4) + (2000)(0.6^2)(0.4) + (1000)(0.6^3)(0.4) = 2694.
Alternately, the benefit is 1000(4 - N)+, where N is the number of years before the device fails.
N is Geometric, with 1/(1 + β) = 0.4. ⇒ β = 1.5.
E[N ∧ 4] = 0·f(0) + 1·f(1) + 2·f(2) + 3·f(3) + 4{1 - f(0) - f(1) - f(2) - f(3)} = 4 - 4f(0) - 3f(1) - 2f(2) - f(3).
Expected Benefit = 1000 E[(4 - N)+] = 1000(4 - E[N ∧ 4]) = 1000{4f(0) + 3f(1) + 2f(2) + f(3)}
= 1000{4(0.4) + 3(0.4)(0.6) + 2(0.4)(0.6^2) + (0.4)(0.6^3)} = 2694.


Section 6, Negative Binomial Distribution


The third and final important frequency distribution is the Negative Binomial, which has the Geometric
as a special case.

Negative Binomial Distribution

Parameters: β > 0, r ≥ 0.          r = 1 is a Geometric Distribution.
Support: x = 0, 1, 2, 3, ...
D. f.:      F(x) = β(r, x+1; 1/(1+β)) = 1 - β(x+1, r; β/(1+β))          Incomplete Beta Function
P. d. f.:   f(x) = [(x+r-1) choose x] β^x/(1+β)^(x+r) = {r(r+1)...(r+x-1)/x!} β^x/(1+β)^(x+r)
f(0) = 1/(1+β)^r.     f(1) = rβ/(1+β)^(r+1).     f(2) = {r(r+1)/2} β^2/(1+β)^(r+2).     f(3) = {r(r+1)(r+2)/6} β^3/(1+β)^(r+3).
Mean = rβ          Variance = rβ(1+β)          Variance / Mean = 1 + β > 1.
Coefficient of Variation = √((1+β)/(rβ)).        Skewness = (1+2β)/√(rβ(1+β)) = CV(1+2β)/(1+β).
Kurtosis = 3 + (6β^2 + 6β + 1)/(rβ(1+β)).
Mode = largest integer in β(r-1)     (if β(r-1) is an integer, then both β(r-1) and β(r-1) - 1 are modes.)
Probability Generating Function: P(z) = {1 - β(z-1)}^(-r), z < 1 + 1/β.
Moment Generating Function: M(s) = {1 - β(e^s - 1)}^(-r), s < ln(1+β) - ln(β).
f(x+1)/f(x) = a + b/(x+1), a = β/(1+β), b = (r-1)β/(1+β), f(0) = (1+β)^(-r).
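As an illustrative sketch (mine, not part of the text), the density can be computed for any r > 0 with the (a, b, 0) recursion quoted above:

```python
def negative_binomial_pmf(n_max, r, beta):
    # f(0) = (1+beta)^(-r); f(x+1) = f(x) * (x+r) * beta / ((x+1) * (1+beta))
    f = [(1.0 + beta) ** -r]
    for x in range(n_max):
        f.append(f[x] * (x + r) * beta / ((x + 1) * (1.0 + beta)))
    return f

f = negative_binomial_pmf(10, r=8, beta=2/3)
print(round(f[5], 4))                          # about 0.1362, matching the table below
print(sum(x * p for x, p in enumerate(f)))     # partial sum of the mean; the full mean is r*beta = 16/3
```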


A Negative Binomial Distribution with r = 2 and β = 4: [graph of the probability density]
A Negative Binomial Distribution with r = 0.5 and β = 10: [graph of the probability density]


Here is a Negative Binomial Distribution with parameters β = 2/3 and r = 8:31
Number of Claims   f(x)        F(x)      x times f(x)   x^2 times f(x)
0                  0.0167962   0.01680   0.00000        0.00000
1                  0.0537477   0.07054   0.05375        0.05375
2                  0.0967459   0.16729   0.19349        0.38698
3                  0.1289945   0.29628   0.38698        1.16095
4                  0.1418940   0.43818   0.56758        2.27030
5                  0.1362182   0.57440   0.68109        3.40546
6                  0.1180558   0.69245   0.70833        4.25001
7                  0.0944446   0.78690   0.66111        4.62779
8                  0.0708335   0.85773   0.56667        4.53334
9                  0.0503705   0.90810   0.45333        4.08001
10                 0.0342519   0.94235   0.34252        3.42519
11                 0.0224194   0.96477   0.24661        2.71275
12                 0.0141990   0.97897   0.17039        2.04465
13                 0.0087378   0.98771   0.11359        1.47669
14                 0.0052427   0.99295   0.07340        1.02757
15                 0.0030757   0.99603   0.04614        0.69204
16                 0.0017685   0.99780   0.02830        0.45275
17                 0.0009987   0.99879   0.01698        0.28863
18                 0.0005548   0.99935   0.00999        0.17977
19                 0.0003037   0.99965   0.00577        0.10964
20                 0.0001640   0.99982   0.00328        0.06560
21                 0.0000875   0.99990   0.00184        0.03857
22                 0.0000461   0.99995   0.00101        0.02232
23                 0.0000241   0.99997   0.00055        0.01273
24                 0.0000124   0.99999   0.00030        0.00716
25                 0.0000064   0.99999   0.00016        0.00398
26                 0.0000032   1.00000   0.00008        0.00218
27                 0.0000016   1.00000   0.00004        0.00119
28                 0.0000008   1.00000   0.00002        0.00064
29                 0.0000004   1.00000   0.00001        0.00034
30                 0.0000002   1.00000   0.00001        0.00018
Sum                1.00000               5.33333        37.33314

For example, f(5) = {(2/3)^5 / (1 + 2/3)^(8+5)} (12!)/{(5!)(7!)}
= (0.000171993)(479,001,600)/{(120)(5040)} = 0.136.
The mean is: rβ = 8(2/3) = 5.333. The variance is: 8(2/3)(1 + 2/3) = 8.89.
The variance can also be computed as: (mean)(1+β) = 5.333(5/3) = 8.89.
The variance is indeed = E[X^2] - E[X]^2 = 37.333 - 5.3333^2 = 8.89.
According to the formula given above, the mode should be the largest integer in β(r-1) =
(2/3)(8-1) = 4.67, which contains the integer 4. In fact, f(4) = 14.2% is the largest value of the
probability density function. Since F(5) = 0.57 ≥ 0.5 and F(4) = 0.44 < 0.5, 5 is the median.
31 The values for the Negative Binomial probability density function in the table were computed using:
f(0) = 1/(1+β)^r and f(x+1)/f(x) = (x+r)β / {(x+1)(1+β)}.
For example, f(12) = f(11)(11+r)β / {12(1+β)} = (0.02242)(2/3)(19)/20 = 0.01420.


Mean and Variance of the Negative Binomial Distribution:
The mean of a Geometric distribution is β and its variance is β(1+β). Since the Negative Binomial is
a sum of r Geometric Distributions, it follows that the mean of the Negative Binomial is rβ and
the variance of the Negative Binomial is rβ(1+β).
Since β > 0, 1+β > 1, so for the Negative Binomial Distribution the variance is greater than
the mean.
For the Negative Binomial, the ratio of the variance to the mean is 1+β, while variance/mean = 1 for
the Poisson Distribution.
Thus (β)(mean) is the extra variance for the Negative Binomial compared to the Poisson.
Non-Integer Values of r:
Note that even if r is not integer, the binomial coefficient in front of the Negative Binomial density
can be calculated as:
(x+r-1 choose x) = (x+r-1)! / {x! (r-1)!} = (x+r-1)(x+r-2)...(r) / x!.
For example with r = 6.2, if one wanted to compute f(4), then the binomial coefficient in front is:
(4+6.2-1 choose 4) = (9.2 choose 4) = 9.2! / {5.2! 4!} = (9.2)(8.2)(7.2)(6.2) / 4! = 140.32.
Note that the numerator has 4 factors; in general it will have x factors. These four factors are:
9.2! / (9.2-4)! = 9.2!/5.2!, or if you prefer: Γ(10.2) / Γ(6.2) = (9.2)(8.2)(7.2)(6.2).
As shown in Loss Models, in general one can rewrite the density of the Negative Binomial as:
f(x) = {r(r+1)...(r+x-1) / x!} β^x / (1+β)^(x+r), where there are x factors in the product in the numerator.
Exercise: For a Negative Binomial with parameters r = 6.2 and β = 7/3, compute f(4).
[Solution: f(4) = {(9.2)(8.2)(7.2)(6.2)/4!} (7/3)^4 / (1 + 7/3)^(6.2+4) = 0.0193.]
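A quick numerical check of this exercise (my own sketch, not from the text): since x! = Γ(x+1), the binomial coefficient with non-integer r can be evaluated with the Gamma function.

from math import gamma

# f(4) for a Negative Binomial with r = 6.2 and beta = 7/3.
r, beta, x = 6.2, 7.0 / 3.0, 4

# (x+r-1 choose x) = Gamma(x+r) / {Gamma(r) Gamma(x+1)}
coeff = gamma(x + r) / (gamma(r) * gamma(x + 1))
f4 = coeff * beta**x / (1.0 + beta)**(x + r)

print(round(coeff, 2))  # 140.32
print(round(f4, 4))     # 0.0193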


Negative Binomial as a Mixture of Poissons:


As discussed subsequently, when Poissons are mixed via a Gamma Distribution, the mixed
distribution is always a Negative Binomial Distribution, with r = α = shape parameter of the Gamma
and β = θ = scale parameter of the Gamma. The mixture of Poissons via a Gamma distribution
produces a Negative Binomial Distribution and increases the variance above the mean.
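The small simulation below (my own sketch for illustration; the parameter values are arbitrary) previews this fact: drawing a Poisson mean from a Gamma with shape alpha and scale theta produces counts whose mean and variance are close to those of a Negative Binomial with r = alpha and beta = theta.

import math
import random

# lambda ~ Gamma(alpha, theta), then N ~ Poisson(lambda).
alpha, theta, trials = 3.0, 0.5, 200000
random.seed(1)

def poisson(lam):
    # Simple Poisson generator (adequate for small lambda).
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

counts = [poisson(random.gammavariate(alpha, theta)) for _ in range(trials)]
mean = sum(counts) / trials
variance = sum((n - mean)**2 for n in counts) / trials
print(round(mean, 2))      # close to r*beta = (3)(0.5) = 1.5
print(round(variance, 2))  # close to r*beta*(1+beta) = (3)(0.5)(1.5) = 2.25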
Series of Bernoulli Trials:
Return to the situation that resulted in the Geometric distribution, involving a series of independent
Bernoulli trials each with chance of success 1/(1+β), and chance of failure of β/(1+β).
What is the probability of two successes and four failures in the first six trials?
It is given by the Binomial Distribution:
(6 choose 2) {1/(1+β)}^2 {1 - 1/(1+β)}^4 = (6 choose 2) β^4 / (1+β)^6.
The chance of having the third success on the seventh trial is given by 1/(1+β) times the above
probability:
(6 choose 2) β^4 / (1+β)^7.
Similarly the chance of the third success on trial x + 3 is given by 1/(1+β) times the probability of
3 - 1 = 2 successes and x failures on the first x + 3 - 1 = x + 2 trials:
(x+2 choose 2) β^x / (1+β)^(x+3).
More generally, the chance of the rth success on trial x+r is given by 1/(1+β) times the probability
of r-1 successes and x failures on the first x+r-1 trials:
f(x) = {1/(1+β)} (x+r-1 choose r-1) β^x / (1+β)^(x+r-1) = (x+r-1 choose x) β^x / (1+β)^(x+r), x = 0, 1, 2, 3...


This is the Negative Binomial Distribution. Thus we see that one source of the Negative Binomial is
the chance of experiencing failures on a series of independent Bernoulli trials prior to getting a certain
number of successes.32 Note that in the derivation, 1/(1+β) is the chance of success on each
Bernoulli trial.
Thus, β = {β/(1+β)} / {1/(1+β)} = chance of a failure / chance of a success.
For a series of independent identical Bernoulli trials, the chance of success number r
following x failures is given by a Negative Binomial Distribution with parameters
β = (chance of a failure) / (chance of a success), and r.
Exercise: One has a series of independent Bernoulli trials, each with chance of success 0.3.
What is the distribution of the number of failures prior to the 5th success?
[Solution: A Negative Binomial Distribution, as per Loss Models, with parameters β = 0.7/0.3 = 7/3,
and r = 5.]
While this is one derivation of the Negative Binomial distribution, note that the Negative Binomial
Distribution is used to model claim counts in many situations that have no relation to this derivation.
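Here is a small Monte Carlo sketch of the derivation (my own illustration): count the failures before the 5th success when the chance of success on each trial is 0.3, and compare the sample mean and variance to rβ and rβ(1+β) with β = 0.7/0.3 = 7/3.

import random

random.seed(2)
p_success, r, trials = 0.3, 5, 100000
beta = (1 - p_success) / p_success  # 7/3

def failures_before_rth_success():
    successes = failures = 0
    while successes < r:
        if random.random() < p_success:
            successes += 1
        else:
            failures += 1
    return failures

sample = [failures_before_rth_success() for _ in range(trials)]
mean = sum(sample) / trials
variance = sum((n - mean)**2 for n in sample) / trials
print(round(mean, 2))      # close to r*beta = 11.67
print(round(variance, 2))  # close to r*beta*(1+beta) = 38.89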

32

Even though the Negative Binomial Distribution was derived here for integer values of r, as has been discussed,
the Negative Binomial Distribution is well defined for r non-integer as well.


Negative Binomial as a Sum of Geometric Distributions:


The number of claims for a Negative Binomial Distribution was modeled as the number of failures
prior to getting a total of r successes on a series of independent Bernoulli trials. Instead one can add
up the number of failures associated with getting a single success r times independently of each
other. As seen before, each of these is given by a Geometric distribution. Therefore, obtaining r
successes is the sum of r separate independent variables each involving getting a single success.
Number of Failures until the third success has a
Negative Binomial Distribution: r = 3, β = (1 - q)/q.

Time 0 --[Geometric, β = (1-q)/q]--> Success #1 --[Geometric, β = (1-q)/q]--> Success #2 --[Geometric, β = (1-q)/q]--> Success #3

Therefore, the Negative Binomial Distribution with parameters β and r, with r integer, can
be thought of as the sum of r independent Geometric distributions with parameter β.

The Negative Binomial Distribution for r = 1 is a Geometric Distribution.
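As a numerical illustration of this statement (my own sketch; the truncation point of 40 is arbitrary), convolving three Geometric distributions with β = 0.4 reproduces the Negative Binomial probabilities with r = 3 and β = 0.4.

# Convolve three Geometrics with beta = 0.4 and compare to the Negative Binomial with r = 3.
beta, size = 0.4, 41
geometric = [beta**x / (1 + beta)**(x + 1) for x in range(size)]

def convolve(a, b):
    out = [0.0] * size
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < size:
                out[i + j] += ai * bj
    return out

sum_of_three = convolve(convolve(geometric, geometric), geometric)

r = 3
neg_bin = [1.0 / (1 + beta)**r]
for x in range(size - 1):
    neg_bin.append(neg_bin[x] * beta * (x + r) / ((x + 1) * (1 + beta)))

print(round(sum_of_three[4], 5), round(neg_bin[4], 5))  # both 0.03643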


Since the Geometric distribution is the discrete analog of the Exponential distribution, the Negative
Binomial distribution is the discrete analog of the continuous Gamma Distribution33.
The parameter r in the Negative Binomial is analogous to the parameter α in the Gamma
Distribution.34
(1+β)/β in the Negative Binomial Distribution is analogous to e^(1/θ) in the Gamma Distribution.

33
Recall that the Gamma Distribution is a sum of independent Exponential Distributions, just as the Negative
Binomial is the sum of r independent Geometric Distributions.
34
Note that the mean and variance of the Negative Binomial and the Gamma are proportional respectively to r and α.


Adding Negative Binomial Distributions:


Since the Negative Binomial is a sum of Geometric Distributions, if one sums independent Negative
Binomials with the same β, then one gets another Negative Binomial, with the same β parameter
and the sum of their r parameters.35
Exercise: X is a Negative Binomial with β = 1.4 and r = 0.8. Y is a Negative Binomial with
β = 1.4 and r = 2.2. Z is a Negative Binomial with β = 1.4 and r = 1.7. X, Y, and Z are independent
of each other. What form does X + Y + Z have?
[Solution: X + Y + Z is a Negative Binomial with β = 1.4 and r = 0.8 + 2.2 + 1.7 = 4.7.]
If X is Negative Binomial with parameters β and r1, and Y is Negative Binomial with
parameters β and r2, X and Y independent, then X + Y is Negative Binomial with
parameters β and r1 + r2.
Specifically, the sum of n independent identically distributed Negative Binomial variables, with the
same parameters β and r, is a Negative Binomial with parameters β and nr.
Exercise: X is a Negative Binomial with β = 1.4 and r = 0.8.
What is the form of the sum of 25 independent random draws from X?
[Solution: A random draw from a Negative Binomial with β = 1.4 and r = (25)(0.8) = 20.]
Thus if one had 25 exposures, each of which had an independent Negative Binomial frequency
process with β = 1.4 and r = 0.8, then the portfolio of 25 exposures has a
Negative Binomial frequency process with β = 1.4 and r = 20.

35
This holds whether or not r is integer. This is analogous to adding independent Gammas with the same θ
parameter. One obtains a Gamma, with the same θ parameter, but with the new α parameter equal to the sum of the
individual α parameters.


Effect of Exposures:
Assume one has 100 exposures with independent, identically distributed frequency distributions.
If each one is Negative Binomial with parameters β and r, then so is the sum, with parameters β and
100r. If we change the number of exposures to for example 150, then the sum is Negative Binomial
with parameters β and 150r, or 1.5 times the r parameter in the first case.
In general, as the exposures change, the r parameter changes in proportion.36
Exercise: The total number of claims from a portfolio of insureds has a Negative Binomial Distribution
with β = 0.2 and r = 30.
If next year the portfolio has 120% of the current exposures, what is its frequency distribution?
[Solution: Negative Binomial with β = 0.2 and r = (1.2)(30) = 36.]

Thinning Negative Binomial Distributions:


Thinning can also be applied to the Negative Binomial Distribution.37
The β parameter of the Negative Binomial Distribution is multiplied by the thinning factor.
Exercise: Claim frequency follows a Negative Binomial Distribution with parameters
β = 0.20 and r = 1.5. One quarter of all claims involve attorneys. If attorney involvement is
independent between different claims, what is the probability of getting two claims involving
attorneys in the next year?
[Solution: Claims with attorney involvement are Negative Binomial with
β = (0.20)(25%) = 0.05 and r = 1.5.
Thus f(2) = r(r+1)β² / {2! (1+β)^(r+2)} = (1.5)(2.5)(0.05²) / {2 (1.05)^3.5} = 0.395%.]
Note that when thinning, the β parameter is altered, while when adding, the r parameter is affected.
As discussed previously, if one adds two independent Negative Binomial Distributions with the
same β, then the result is also a Negative Binomial Distribution, with the sum of the r parameters.
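A one-line numerical check of the attorney example above (my own sketch):

# Probability of exactly 2 attorney claims: thinned Negative Binomial with r = 1.5, beta = 0.05.
r, beta = 1.5, 0.25 * 0.20
f2 = (r * (r + 1) / 2) * beta**2 / (1 + beta)**(r + 2)
print(round(f2, 5))  # 0.00395, i.e. about 0.395%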

36

See Section 6.12 of Loss Models. This same result holds for a Compound Frequency Distribution, to be
discussed subsequently, with a primary distribution that is Negative Binomial.
37
See Table 8.3 in Loss Models. However, unlike the Poisson case, the large and small accidents are not
independent processes.


Problems:
The following six questions all deal with a Negative Binomial distribution with parameters
β = 0.4 and r = 3.
6.1 (1 point) What is the mean?
A. less than .9
B. at least .9 but less than 1.0
C. at least 1.0 but less than 1.1
D. at least 1.1 but less than 1.2
E. at least 1.2
6.2 (1 point) What is the variance?
A. less than 1.8
B. at least 1.8 but less than 1.9
C. at least 1.9 but less than 2.0
D. at least 2.0 but less than 2.1
E. at least 2.1
6.3 (2 points) What is the chance of having 4 claims?
A. less than 3%
B. at least 3% but less than 4%
C. at least 4% but less than 5%
D. at least 5% but less than 6%
E. at least 6%
6.4 (2 points) What is the mode?
A. 0
B. 1
C. 2

D. 3

E. None of A, B, C, or D.

6.5 (2 points) What is the median?


A. 0
B. 1
C. 2

D. 3

E. None of A, B, C, or D.

6.6 (2 points) What is the chance of having 4 claims or less?


A. 90%
B. 92%
C. 94%
D. 96%
E. 98%



6.7 (2 points) Bud and Lou play a series of games. Bud has a 60% chance of winning each game.
Lou has a 40% chance of winning each game. The outcome of each game is independent of any
other. Let N be the number of games Bud wins prior to Lou winning 5 games.
What is the variance of N?
A. less than 14
B. at least 14 but less than 16
C. at least 16 but less than 18
D. at least 18 but less than 20
E. at least 20
6.8 (1 point) For a Negative Binomial distribution with β = 2/9 and r = 1.5, what is the chance of
having 3 claims?
A. 1%
B. 2%

C. 3%

D. 4%

E. 5%

6.9 (2 points) In baseball a team bats in an inning until it makes 3 outs. Assume each batter has a
40% chance of getting on base and a 60% chance of making an out. Then what is the chance of a
team sending exactly 8 batters to the plate in an inning? (Assume no double or triple plays.
Assume nobody is picked off base, caught stealing or thrown out on the bases. Assume each
batter's chance of getting on base is independent of whether another batter got on base.)
A. less than 1%
B. at least 1% but less than 2%
C. at least 2% but less than 3%
D. at least 3% but less than 4%
E. at least 4%
6.10 (1 point) Assume each exposure has a Negative Binomial frequency distribution, as per Loss
Models, with β = 0.1 and r = 0.27. You insure 20,000 independent exposures.
What is the frequency distribution for your portfolio?
A. Negative Binomial with β = 0.1 and r = 0.27.
B. Negative Binomial with β = 0.1 and r = 5400.
C. Negative Binomial with β = 2000 and r = 0.27.
D. Negative Binomial with β = 2000 and r = 5400.
E. None of the above.


6.11 (3 points) Frequency is given by a Negative Binomial distribution with β = 1.38 and r = 3.
Severity is given by a Weibull Distribution with τ = 0.3 and θ = 1000.
Frequency and severity are independent.
What is chance of two losses each of size greater than $25,000?
A. 1%
B. 2%
C. 3%
D. 4%

E. 5%

Use the following information for the next two questions:


Six friends each have their own phone.
The number of calls each friend gets per night from telemarketers is Geometric with β = 0.3.
The number of calls each friend gets is independent of the others.
6.12 (2 points) Tonight, what is the probability that three of the friends get one or more calls from
telemarketers, while the other three do not?
A. 11%
B. 14%
C. 17%
D. 20%
E. 23%
6.13 (2 points) Tonight, what is the probability that the friends get a total of three calls from
telemarketers?
A. 11%
B. 14%
C. 17%
D. 20%
E. 23%
6.14 (2 points) The total number of claims from a group of 80 drivers has a
Negative Binomial Distribution with β = 0.5 and r = 4.
What is the probability that a group of 40 similar drivers have a total of 2 or more claims?
A. 22%
B. 24%
C. 26%
D. 28%
E. 30%
6.15 (2 points) The total number of non-zero payments from a policy with a $1000 deductible
follows a Negative Binomial Distribution with β = 0.8 and r = 3.
The ground up losses follow an Exponential Distribution with θ = 2500.
If this policy instead had a $5000 deductible, what would be the probability of having no
non-zero payments?
A. 56%
B. 58%
C. 60%
D. 62%
E. 64%


6.16 (3 points) The mathematician Stefan Banach smoked a pipe.


In order to light his pipe, he carried a matchbox in each of two pockets.
Each time he needs a match, he is equally likely to take it from either matchbox.
Assume that he starts the month with two matchboxes each containing 20 matches.
Eventually Banach finds that when he tries to get a match from one of his matchboxes it is empty.
What is the probability that when this occurs, the other matchbox has exactly 5 matches in it?
A. less than 6%
B. at least 6% but less than 7%
C. at least 7% but less than 8%
D. at least 8% but less than 9%
E. at least 9%
6.17 (2 points) Total claim counts generated from a portfolio of 400 policies follow a Negative
Binomial distribution with parameters r = 3 and β = 0.4. If the portfolio increases to 500 policies,
what is the probability of observing exactly 2 claims in total?
A. 21%
B. 23%
C. 25%
D. 27%
E. 29%
Use the following information for the next three questions:
Two teams are playing against one another in a seven game series.
The results of each game are independent of the others.
The first team to win 4 games wins the series.
6.18 (3 points) The Flint Tropics have a 45% chance of winning each game.
What is the Flint Tropics chance of winning the series?
A. 33%
B. 35%
C. 37%
D. 39%
E. 41%
6.19 (3 points) The Durham Bulls have a 60% chance of winning each game.
What is the Durham Bulls chance of winning the series?
A. 67%
B. 69%
C. 71%
D. 73%
E. 75%
6.20 (3 points) The New York Knights have a 40% chance of winning each game.
The Knights lose the first game. The opposing manager offers to split the next two games with the
Knights (each team would win one of the next two games.)
Should the Knights accept this offer?
6.21 (3 points) The number of losses follows a Negative Binomial distribution with r = 4 and
β = 3. Sizes of loss are uniform from 0 to 15,000.
There is a deductible of 1000, a maximum covered loss of 10,000, and a coinsurance of 90%.
Determine the probability that there are exactly six payments of size greater than 5000.
A. 9.0%
B. 9.5%
C. 10.0%
D. 10.5%
E. 11.0%


6.22 (2 points) Define (N - j)+ = N - j if N ≥ j, and 0 otherwise.

N follows a Negative Binomial distribution with r = 5 and β = 0.3. Determine E[(N - 2)+].
A. 0.25

B. 0.30

C. 0.35

D. 0.40

E. 0.45

6.23 (3 points) The number of new claims the State House Insurance Company receives in a day
follows a Negative Binomial Distribution with r = 5 and β = 0.8. For a claim chosen at random, on
average how many other claims were also made on the same day?
A. 4.0
B. 4.2
C. 4.4
D. 4.6
E. 4.8
6.24 (2, 5/83, Q.44) (1.5 points) If a fair coin is tossed repeatedly, what is the probability that the
third head occurs on the nth toss?
A. (n-1)/2^(n+1)
B. (n-1)(n-2)/2^(n+1)
C. (n-1)(n-2)/2^n
D. (n-1)/2^n
E. (n choose 3)/2^n
6.25 (2, 5/90, Q.45) (1.7 points) A coin is twice as likely to turn up tails as heads. If the coin is
tossed independently, what is the probability that the third head occurs on the fifth trial?
A. 8/81
B. 40/243
C. 16/81
D. 80/243
E. 3/5
6.26 (2, 2/96, Q.28) (1.7 points) Let X be the number of independent Bernoulli trials performed
until a success occurs. Let Y be the number of independent Bernoulli trials performed until 5
successes occur. A success occurs with probability p and Var(X) = 3/4.
Calculate Var(Y).
A. 3/20

B. 3/(4√5)

C. 3/4

D. 15/4

E. 75/4

6.27 (1, 11/01, Q.11) (1.9 points) A company takes out an insurance policy to cover accidents that
occur at its manufacturing plant. The probability that one or more accidents will occur during any given
month is 3/5. The number of accidents that occur in any given month is independent of the number
of accidents that occur in all other months.
Calculate the probability that there will be at least four months in which no accidents
occur before the fourth month in which at least one accident occurs.
(A) 0.01
(B) 0.12
(C) 0.23
(D) 0.29
(E) 0.41
6.28 (1, 11/01, Q.21) (1.9 points) An insurance company determines that N, the number of claims
received in a week, is a random variable with P[N = n] = 1/2^(n+1), where n ≥ 0.
The company also determines that the number of claims received in a given week is independent of
the number of claims received in any other week. Determine the probability that exactly seven
claims will be received during a given two-week period.
(A) 1/256
(B) 1/128
(C) 7/512
(D) 1/64
(E) 1/32


6.29 (CAS3, 11/03, Q.18) (2.5 points) A new actuarial student analyzed the claim frequencies of a
group of drivers and concluded that they were distributed according to a negative binomial
distribution and that the two parameters, r and β, were equal.
An experienced actuary reviewed the analysis and pointed out the following:
"Yes, it is a negative binomial distribution. The r parameter is fine, but the value of the β parameter is
wrong. Your parameters indicate that 1/9 of the drivers should be claim-free, but
in fact, 4/9 of them are claim-free."
Based on this information, calculate the variance of the corrected negative binomial distribution.
A. 0.50
B. 1.00
C. 1.50
D. 2.00
E. 2.50
6.30 (CAS3, 11/04, Q.21) (2.5 points) The number of auto claims for a group of 1,000 insured
drivers has a negative binomial distribution with β = 0.5 and r = 5.
Determine the parameters β and r for the distribution of the number of auto claims for a group of
2,500 such individuals.
A. β = 1.25 and r = 5
B. β = 0.20 and r = 5
C. β = 0.50 and r = 5
D. β = 0.20 and r = 12.5
E. β = 0.50 and r = 12.5
6.31 (CAS3, 5/05, Q.28) (2.5 points)
You are given a negative binomial distribution with r = 2.5 and β = 5.
For what value of k does p_k take on its largest value?
A. Less than 7

B. 7

C. 8

D. 9

E. 10 or more

6.32 (CAS3, 5/06, Q.32) (2.5 points) Total claim counts generated from a portfolio of 1,000
policies follow a Negative Binomial distribution with parameters r = 5 and β = 0.2.
Calculate the variance in total claim counts if the portfolio increases to 2,000 policies.
A. Less than 1.0
B. At least 1.0 but less than 1.5
C. At least 1.5 but less than 2.0
D. At least 2.0 but less than 2.5
E. At least 2.5


6.33 (CAS3, 11/06, Q.23) (2.5 points) An actuary has determined that the number of claims
follows a negative binomial distribution with mean 3 and variance 12.
Calculate the probability that the number of claims is at least 3 but less than 6.
A. Less than 0.20
B. At least 0.20, but less than 0.25
C. At least 0.25, but less than 0.30
D. At least 0.30, but less than 0.35
E. At least 0.35
6.34 (CAS3, 11/06, Q.24) (2.5 points) Two independent random variables, X1 and X2 , follow the
negative binomial distribution with parameters (r1, β1) and (r2, β2), respectively.
Under which of the following circumstances will X1 + X2 always be negative binomial?
1. r1 = r2.
2. β1 = β2.
3. The coefficients of variation of X1 and X2 are equal.
A. 1 only

B. 2 only

C. 3 only

D. 1 and 3 only

E. 2 and 3 only

6.35 (CAS3, 11/06, Q.31) (2.5 points)


You are given the following information for a group of policyholders:

The frequency distribution is negative binomial with r = 3 and β = 4.

The severity distribution is Pareto with α = 2 and θ = 2,000.
Calculate the variance of the number of payments if a $500 deductible is introduced.
A. Less than 30
B. At least 30, but less than 40
C. At least 40, but less than 50
D. At least 50, but less than 60
E. At least 60
6.36 (SOA M, 11/06, Q.22 & 2009 Sample Q.283) (2.5 points) The annual number of doctor
visits for each individual in a family of 4 has a geometric distribution with mean 1.5.
The annual numbers of visits for the family members are mutually independent.
An insurance pays 100 per doctor visit beginning with the 4th visit per family.
Calculate the expected payments per year for this family.
(A) 320
(B) 323
(C) 326
(D) 329
(E) 332


Solutions to Problems:
6.1. E. mean = rβ = (3)(0.4) = 1.2.
6.2. A. variance = rβ(1+β) = (3)(0.4)(1.4) = 1.68.
6.3. B. (x+r-1 choose x) β^x / (1+β)^(x+r) = (6 choose 4) (0.4)^4 / (1.4)^(4+3) = 15(0.0256)/(10.54) = 0.0364.
6.4. A. & 6.5. B. The mode is 0, since f(0) is larger than any other value.

n     f(n)      F(n)
0     0.3644    0.364
1     0.3124    0.677
2     0.1785    0.855
3     0.0850    0.940
4     0.0364    0.977

The median is 1, since F(0) < 0.5 and F(1) ≥ 0.5.
Comment: I've used the formulas: f(0) = 1/(1+β)^r and f(x+1) / f(x) = β(x+r) / {(x+1)(1+β)}.
Just as with the Gamma Distribution, the Negative Binomial can have either a mode of zero or a
positive mode. For r < 1 + 1/β, as is the case here, the mode is zero, and the Negative Binomial
looks somewhat similar to an Exponential Distribution.
6.6. E. F(4) = f(0) + f(1) + f(2) + f(3) + f(4) = 97.7%.

n     f(n)      F(n)
0     0.3644    0.3644
1     0.3124    0.6768
2     0.1785    0.8553
3     0.0850    0.9403
4     0.0364    0.9767

Comment: Using the Incomplete Beta Function: F(4) = 1 - β(4+1, r; β/(1+β)) = 1 - β(5, 3; 0.4/1.4) =
1 - 0.0233 = 0.9767.
6.7. D. This is a series of Bernoulli trials. Treating Lou's winning as a success, the chance of
success is 40%. N is the number of failures prior to the 5th success.
Therefore N has a Negative Binomial Distribution with r = 5 and
β = chance of failure / chance of success = 60%/40% = 1.5.
Variance is: rβ(1+β) = (5)(1.5)(2.5) = 18.75.
6.8. A. f(3) = {r(r+1)(r+2)/3!} β³ / (1+β)^(3+r) = {(1.5)(2.5)(3.5)/6} (2/9)³ (11/9)^-4.5 = 0.0097.


6.9. E. For the defense a batter reaching base is a failure and an out is a success. The number of
batters reaching base is the number of failures prior to 3 successes for the defense. The chance of a
success for the defense is 0.6. Therefore the number of batters who reach base is given by a
Negative Binomial with r = 3 and
β = (chance of failure for the defense)/(chance of success for the defense) = 0.4/0.6 = 2/3.
If 8 batters come to the plate, then 5 reach base and 3 make out. The chance of exactly 5 batters
reaching base is f(5) for r = 3 and β = 2/3: {(3)(4)(5)(6)(7)/5!} β⁵ / (1+β)^(5+r) = (21)(0.13169)/(59.537)
= 0.0464.
Alternately, for there to be exactly 8 batters, the last one has to make an out, and exactly two of the
first 7 must make an out. Prob[2 of 7 make out] =
density at 2 of Binomial Distribution with m = 7 and q = 0.6 = {(7)(6)/2}(0.6²)(0.4⁵) = 0.0774.
Prob[8th batter makes an out] Prob[2 of 7 make an out] = (0.6)(0.0774) = 0.0464.
Comment: Generally, one can use either a Negative Binomial Distribution or some reasoning and a
Binomial Distribution in order to answer these types of questions.
6.10. B. The sum of independent Negative Binomials, each with the same β, is another Negative
Binomial, with the sum of the r parameters. In this case we get a Negative Binomial with β = 0.1 and
r = (0.27)(20000) = 5400.
6.11. D. S(25,000) = exp(-(25000/1000)^0.3) = 0.0723. The number of losses greater than $25,000 is
another Negative Binomial with r = 3 and β = (1.38)(0.0723) = 0.0998.
For a Negative Binomial, f(2) = {r(r+1)/2} β² / (1+β)^(r+2) = {(3)(4)/2} 0.0998² / (1.0998)⁵ = 3.71%.
Comment: An example of thinning a Negative Binomial.
6.12. A. For the Geometric, f(0) = 1/(1+β) = 1/1.3. 1 - f(0) = 0.3/1.3.
Prob[3 with 0 and 3 not with 0] = {6! / (3! 3!)} (1/1.3)³ (0.3/1.3)³ = 0.112.
6.13. B. The total number of calls is Negative Binomial with r = 6 and β = 0.3.
f(3) = {r(r+1)(r+2)/3!} β³/(1+β)^(3+r) = {(6)(7)(8)/3!} 0.3³ / 1.3⁹ = 0.143.
6.14. C. The frequency for the 40 drivers is Negative Binomial Distribution with parameters
r = (40/80)(4) = 2 and β = 0.5.
f(0) = 1/1.5² = 44.44%. f(1) = 2(0.5/1.5³) = 29.63%. 1 - f(0) - f(1) = 25.9%.


6.15. E. For the Exponential, S(1000) = exp[-1000/2500] = 0.6703.
S(5000) = exp[-5000/2500] = 0.1353. Therefore, with the $5000 deductible, the non-zero
payments are Negative Binomial Distribution with r = 3 and β = (0.1353/0.6703)(0.8) = 0.16.
f(0) = 1/1.16³ = 64%.
6.16. E. Let us assume the righthand matchbox is the one discovered to be empty.
Call a success choosing the righthand box and a failure choosing the lefthand box.
Then we have a series of Bernoulli trials, with chance of success 1/2.
The number of failures prior to the 21st success (looking in the righthand matchbox 20 times and
getting a match and once more finding no matches are left) is Negative Binomial with r = 21 and
β = (chance of failure)/(chance of success) = (1/2)/(1/2) = 1.
For the lefthand matchbox to then have 5 matches, we must have had 15 failures.
Density at 15 for this Negative Binomial is: {(21)(22)...(35) / 15!} 1¹⁵/(1 + 1)^(15+21) = 4.73%.
However, it is equally likely that the lefthand matchbox is the one discovered to be out of matches.
Thus we double this probability: (2)(4.73%) = 9.5%.
Comment: Difficult. The famous Banach Match problem.
6.17. A. When one changes the number of exposures, the r parameter changes in proportion.
For 500 policies, total claim counts follow a Negative Binomial distribution with parameters
r = 3(500/400) = 3.75 and β = 0.4.
f(2) = {r(r+1)/2} β²/(1+β)^(r+2) = (3.75)(4.75)(0.5)(0.4²)/(1.4^5.75) = 20.6%.
Comment: Similar to CAS3, 5/06, Q.32.
6.18. D. Ignoring the fact that once a team wins four games, the final games of the series will not be
played, the total number of games won out of seven by the Tropics is Binomial with q = 0.45 and
m = 7. We want the sum of the densities of this Binomial from 4 to 7:
35(0.45⁴)(0.55³) + 21(0.45⁵)(0.55²) + 7(0.45⁶)(0.55) + 0.45⁷
= 0.2388 + 0.1172 + 0.0320 + 0.0037 = 0.3917.
Alternately, the number of failures by the Tropics prior to their 4th success is Negative Binomial with
r = 4 and β = 0.55/0.45 = 11/9.
For the Tropics to win the series they have to have 3 or fewer losses prior to their 4th win.
The probability of this is the sum of the densities of the Negative Binomial at 0 to 3:
1/(20/9)⁴ + 4(11/9)/(20/9)⁵ + {(4)(5)(11/9)²/2!}/(20/9)⁶ + {(4)(5)(6)(11/9)³/3!}/(20/9)⁷
= 0.0410 + 0.0902 + 0.1240 + 0.1364 = 0.3916.
Comment: The question ignores any effect of home field advantage.


6.19. C. Ignoring the fact that once a team wins four games, the final games of the series will not be
played, the total number of games won out of seven by the Bulls is Binomial with q = 0.60 and
m = 7. We want the sum of the densities of this Binomial from 4 to 7:
35(0.6⁴)(0.4³) + 21(0.6⁵)(0.4²) + 7(0.6⁶)(0.4) + 0.6⁷
= 0.2903 + 0.2613 + 0.1306 + 0.0280 = 0.7102.
Alternately, the number of failures by the Bulls prior to their 4th success is Negative Binomial with
r = 4 and β = 0.4/0.6 = 2/3.
For the Bulls to win the series they have to have 3 or fewer losses prior to their 4th win.
The probability of this is the sum of the densities of the Negative Binomial at 0 to 3:
1/(5/3)⁴ + 4(2/3)/(5/3)⁵ + {(4)(5)(2/3)²/2!}/(5/3)⁶ + {(4)(5)(6)(2/3)³/3!}/(5/3)⁷
= 0.1296 + 0.2074 + 0.2074 + 0.1659 = 0.7103.
Comment: According to Bill James, "A useful rule of thumb is that the advantage doubles in a
seven-game series." In other words, if one team would win 51% of the games between two
opponents, then they would win 52% of the seven-game series. If one team would win 55% of the
games, then they would win 60% of the series.
Here is a graph of the chance of winning the seven game series, as a function of the chance of
winning each game:
[Graph: probability of winning the series (vertical axis, 0.2 to 0.8) versus probability of winning each game (horizontal axis, 0.3 to 0.8).]


6.20. If the Knights do not accept the offer, then they need to win four of six games.
We want the sum of the densities from 4 to 6 of a Binomial with q = 0.4 and m = 6:
15(0.4⁴)(0.6²) + 6(0.4⁵)(0.6) + 0.4⁶ = 0.1382 + 0.0369 + 0.0041 = 0.1792.
If the Knights accept the offer, then they need to win three of four games.
We want the sum of the densities from 3 to 4 of a Binomial with q = 0.4 and m = 4:
4(0.4³)(0.6) + 0.4⁴ = 0.1536 + 0.0256 = 0.1792.
Thus the Knights are indifferent between accepting this offer or not.
Alternately, if the Knights do not accept the offer, then they need to win four of six games.
The number of failures by the Knights prior to their 4th success is Negative Binomial with
r = 4 and β = 0.6/0.4 = 1.5. The Knights win the series if they have 2 or fewer failures:
1/2.5⁴ + 4(1.5)/2.5⁵ + {(4)(5)(1.5²)/2!}/2.5⁶ = 0.0256 + 0.0614 + 0.0922 = 0.1792.
If the Knights accept the offer, then they need to win three of four games.
The number of failures by the Knights prior to their 3rd success is Negative Binomial with
r = 3 and β = 0.6/0.4 = 1.5. The Knights win the series if they have 1 or fewer failures:
1/2.5³ + 3(1.5)/2.5⁴ = 0.0640 + 0.1152 = 0.1792.
Thus the Knights are indifferent between accepting this offer or not.
Comment: A comparison of their chances of winning the series as a function of their chance of
winning a game, accepting the offer (dashed) and not accepting the offer (solid):
[Graph: probability of winning the series (vertical axis, 0.1 to 0.35) versus probability of winning each game (horizontal axis, 0.3 to 0.5).]
The Knights should accept the offer if their chance of winning each game is less than 40%.


6.21. C. A payment is of size greater than 5000 if the loss is of size greater than:
5000/0.9 + 1000 = 6556. Probability of a loss of size greater than 6556 is: 1 - 6556/15000 =
56.3%. The large losses are Negative Binomial with r = 4 and β = (56.3%)(3) = 1.69.
f(6) = {r(r+1)(r+2)(r+3)(r+4)(r+5)/6!} β⁶/(1+β)^(r+6) = {(4)(5)(6)(7)(8)(9)/720} 1.69⁶/2.69¹⁰ = 9.9%.
Comment: An example of thinning a Negative Binomial.
6.22. C. f(0) = 1/1.3⁵ = 0.2693. f(1) = (5)(0.3)/1.3⁶ = 0.3108.
E[N] = 0f(0) + 1f(1) + 2f(2) + 3f(3) + 4f(4) + 5f(5) + ... E[(N - 2)+] = 1f(3) + 2f(4) + 3f(5) + ....
E[N] - E[(N - 2)+] = f(1) + 2f(2) + 2f(3) + 2f(4) + 2f(5) + ... = f(1) + 2{1 - f(0) - f(1)} = 2 - 2f(0) - f(1).
E[(N - 2)+] = E[N] - {2 - 2f(0) - f(1)} = (5)(0.3) - {2 - (2)(0.2693) - 0.3108} = 0.3494.
Alternately, E[N ∧ 2] = 0f(0) + 1f(1) + 2{1 - f(0) - f(1)} = 1.1506.
E[(N - 2)+] = E[N] - E[N ∧ 2] = (5)(0.3) - 1.1506 = 0.3494.
Alternately, E[(N - 2)+] = E[(2-N)+] + E[N] - 2 = 2f(0) + f(1) + (5)(0.3) - 2 = 0.3494.
Comment: See the section on Limited Expected Values in Mahler's Guide to Fitting Loss
Distributions.
6.23. E. Let n be the number of claims made on a day.
The probability that the claim picked is on a day of size n is proportional to the product of the
number of claims on that day and the proportion of days of that size: n f(n).
Thus, Prob[claim is from a day with n claims] = n f(n) / Σ n f(n) = n f(n) / E[N].
For n > 0, the number of other claims on the same day is n - 1.
Average number of other claims is:
Σ n f(n) (n - 1) / E[N] = Σ (n² - n) f(n) / E[N] = {E[N²] - E[N]} / E[N] = E[N²]/E[N] - 1
= {Var[N] + E[N]²}/E[N] - 1 = Var[N]/E[N] + E[N] - 1 = 1 + β + rβ - 1 = β(r + 1) = (6)(0.8) = 4.8.
Comment: The average day has four claims; on the average day there are three other claims.
However, a claim chosen at random is more likely to be from a day that had a lot of claims.


6.24. B. This is a Negative Binomial with r = 3, β = chance of failure / chance of success = 1,
and x = number of failures = n - 3.
f(x) = {r(r+1)...(r+x-1)/x!} β^x/(1+β)^(r+x) = {(3)(4)...(x+2)/x!}/2^(3+x) = (x+1)(x+2)/2^(4+x).
f(n-3) = (n-2)(n-1)/2^(n+1).
Alternately, for the third head to occur on the nth toss, for n ≥ 3, we have to have had two heads out of
the first n-1 tosses, which has probability (n-1 choose 2)/2^(n-1) = (n-2)(n-1)/2^n, and a head on the
nth toss, which has probability 1/2. Thus the total probability is: (n-2)(n-1)/2^(n+1).
6.25. A. The number of tails before the third head is Negative Binomial, with r = 3 and
β = chance of failure / chance of success = chance of tail / chance of head = 2.
Prob[third head occurs on the fifth trial] = Prob[2 tails when they get the 3rd head] = f(2) =
{r(r+1)/2} β²/(1+β)^(r+2) = (6)(4)/3⁵ = 8/81.
Alternately, need 2 heads and 2 tails out of the first 4 tosses, and then a head on the fifth toss:
{4!/(2!2!)}(1/3)²(2/3)²(1/3) = 8/81.
6.26. D. X-1 is Geometric with β = chance of failure / chance of success = (1 - p)/p = 1/p - 1.
Therefore, 3/4 = Var(X) = Var(X-1) = β(1 + β) = (1/p - 1)(1/p).
0.75p² + p - 1 = 0. p = {-1 + √(1 + 3)}/1.5 = 2/3.
β = 3/2 - 1 = 1/2. Y-5 is Negative Binomial with r = 5 and β = 1/2.
Var[Y - 5] = Var[Y] = (5)(1/2)(3/2) = 15/4.
Alternately, once one has gotten the first success, the number of additional trials until the second
success is independent of and has the same distribution as X, the number of additional trials until the
first success. Y = X + X + X + X + X. Var[Y] = 5Var[X] = (5)(3/4) = 15/4.


6.27. D. Define a success as a month in which at least one accident occurs.
We have a series of independent Bernoulli trials, and we stop upon the fourth success.
The number of failures before the fourth success is Negative Binomial with r = 4 and
β = chance of failure / chance of success = (2/5)/(3/5) = 2/3.
f(0) = 1/(1 + 2/3)⁴ = 0.1296. f(1) = 4(2/3)/(5/3)⁵ = 0.20736.
f(2) = {(4)(5)/2!}(2/3)²/(5/3)⁶ = 0.20736. f(3) = {(4)(5)(6)/3!}(2/3)³/(5/3)⁷ = 0.165888.
Prob[at least 4 failures] = 1 - (0.1296 + 0.20736 + 0.20736 + 0.165888) = 0.289792.
Alternately, instead define a success as a month in which no accident occurs.
We have a series of independent Bernoulli trials, and we stop upon the fourth success.
The number of failures before the fourth success is Negative Binomial with r = 4 and
β = chance of failure / chance of success = (3/5)/(2/5) = 1.5.
f(0) = 1/(1 + 1.5)⁴ = 0.0256. f(1) = (4)(1.5)/2.5⁵ = 0.06144.
f(2) = {(4)(5)/2!}(1.5²)/2.5⁶ = 0.09216. f(3) = {(4)(5)(6)/3!}(1.5³)/2.5⁷ = 0.110592.
The event we want will occur if at the time of the fourth success, the fourth month in which no
accidents occur, there have been fewer than four failures, in other words fewer than four months in
which at least one accident occurs.
Prob[fewer than 4 failures] = 0.0256 + 0.06144 + 0.09216 + 0.110592 = 0.289792.
6.28. D. The number of claims in a week is Geometric with β/(1+β) = 1/2. β = 1.
The sum of two independent Geometrics is a Negative Binomial with r = 2 and β = 1.
f(7) = {(2)(3)(4)(5)(6)(7)(8)/7!} β⁷/(1+β)⁹ = 1/64.
6.29. C. For the student's Negative Binomial, r = β: f(0) = 1/(1+β)^r = 1/(1+r)^r = 1/9. Therefore r = 2.
For the corrected Negative Binomial, r = 2 and: f(0) = 1/(1+β)^r = 1/(1+β)² = 4/9. Therefore β = 0.5.
Variance of the corrected Negative Binomial = rβ(1+β) = (2)(0.5)(1.5) = 1.5.
6.30. E. For a Negative Binomial distribution, as the exposures change we get another Negative
Binomial; the r parameter changes in proportion, while β remains the same.
The new r = (2500/1000)(5) = 12.5. β = 0.5 and r = 12.5.


6.31. B. For a Negative Binomial, a = β/(1 + β) = 5/6,
and b = (r - 1)β/(1 + β) = (1.5)(5/6) = 5/4.
f(x)/f(x-1) = a + b/x = 5/6 + (5/4)/x, x = 1, 2, 3, ...
To find the mode, where the density is largest, find when this ratio is greater than 1.
5/6 + (5/4)/x = 1. x/6 = 5/4. x = 7.5.
So f(7)/f(6) > 1 while f(8)/f(7) < 1, and 7 is the mode.
Comment: f(6) = .0556878. f(7) = .0563507. f(8) = .0557637.
6.32. D. Doubling the exposures multiplies r by 2. For 2000 policies, total claim counts follow a
Negative Binomial distribution with parameters r = 10 and β = 0.2.
Variance = rβ(1+β) = (10)(0.2)(1.2) = 2.4.
Alternately, for 1000 policies, the variance of total claim counts is: (5)(0.2)(1.2) = 1.2.
2000 policies ⇔ 1000 policies + 1000 policies.
For 2000 policies, the variance of total claim counts is: 1.2 + 1.2 = 2.4.
Comment: When one adds independent Negative Binomial Distributions with the same β, one gets
another Negative Binomial Distribution with the sum of the r parameters. When one changes the
number of exposures, the r parameter changes in proportion.
6.33. B. rβ = 3. rβ(1+β) = 12. 1 + β = 12/3 = 4. β = 3. r = 1.
f(3) + f(4) + f(5) = 3³/4⁴ + 3⁴/4⁵ + 3⁵/4⁶ = 0.244.
Comment: We have fit via Method of Moments. Since r = 1, this is a Geometric Distribution.
6.34. B. 1. False. 2. True.
3. CV = √[rβ(1+β)] / (rβ) = √[(1+β)/(rβ)]. False.
Comment: For the Negative Binomial, P(z) = 1/{1 - β(z-1)}^r.
The p.g.f. of the sum of two independent variables is the product of their p.g.f.s:
1/({1 - β1(z-1)}^r1 {1 - β2(z-1)}^r2).
This has the same form as a Negative Binomial if and only if β1 = β2.
6.35. A. For the Pareto, S(500) = (2/2.5)² = 0.64. Thus the number of losses of size greater than
500 is Negative Binomial with r = 3 and β = (0.64)(4) = 2.56.
The variance of the number of large losses is: (3)(2.56)(3.56) = 27.34.


6.36. D. The total number of visits is the sum of 4 independent, identically distributed Geometric
Distributions, which is a Negative Binomial with r = 4 and β = 1.5.
f(0) = 1/2.5⁴ = 0.0256. f(1) = (4)(1.5)/2.5⁵ = 0.06144. f(2) = {(4)(5)/2}(1.5²)/2.5⁶ = 0.09216.
E[N ∧ 3] = 0f(0) + 1f(1) + 2f(2) + 3{1 - f(0) - f(1) - f(2)} = 2.708.
E[(N-3)+] = E[N] - E[N ∧ 3] = (4)(1.5) - 2.708 = 3.292. 100E[(N-3)+] = 329.2.
Alternately, E[(N-3)+] = E[(3-N)+] + E[N] - 3 = 3f(0) + 2f(1) + f(2) + (4)(1.5) - 3 = 3.292.
Comment: See the section on Limited Expected Values in Mahler's Guide to Fitting Loss
Distributions.


Section 7, Normal Approximation


This section will go over important information that Loss Models assumes the reader already knows
concerning the Normal Distribution and its use to approximate frequency distributions. These ideas
are important for practical applications of frequency distributions.38
The Binomial Distribution with parameters q and m is the sum of m independent Bernoulli trials, each
with parameter q. The Poisson Distribution with λ integer, is the sum of λ independent Poisson
variables each with mean of one. The Negative Binomial Distribution with parameters β and r, with r
integer, is the sum of r independent Geometric distributions each with parameter β.
Thus by the Central Limit Theorem, each of these distributions can be approximated by a Normal
Distribution with the same mean and variance.
For the Binomial as m → ∞, for the Poisson as λ → ∞, and for the Negative Binomial as
r → ∞, the distribution approaches a Normal39. The approximation is quite good for large values of
the relevant parameter, but not very good for extremely small values.
For example, here is the graph of a Binomial Distribution with q = 0.4 and m = 30.
It has mean (30)(0.4) = 12 and variance (30)(0.4)(0.6) = 7.2.
Also shown is a Normal Distribution with μ = 12 and σ = √7.2 = 2.683.
[Graph of the Binomial probabilities for x from 0 to 30, together with the approximating Normal density.]

38
These ideas also underlie Classical Credibility.
39
In fact as discussed in a subsequent section, the Binomial and the Negative Binomial each approach a Poisson
which in turn approaches a Normal.


Here is the graph of a Poisson Distribution with λ = 10, and the approximating Normal Distribution
with μ = 10 and σ = √10 = 3.162:
[Graph of the Poisson probabilities for x from 0 to 30, together with the approximating Normal density.]

Here is the graph of a Negative Binomial Distribution with β = 0.5 and r = 20, with
mean (20)(0.5) = 10 and variance (20)(0.5)(1.5) = 15, and the approximating Normal Distribution
with μ = 10 and σ = √15 = 3.873:
[Graph of the Negative Binomial probabilities for x from 0 to 30, together with the approximating Normal density.]


A typical use of the Normal Approximation would be to find the probability of observing a certain
range of claims. For example, given a certain distribution, what is the probability of at least 10 and no
more than 20 claims.
Exercise: Given a Binomial with parameters q = 0.3 and m = 10, what is the chance of observing 1
or 2 claims?
[Solution: 10(0.3¹)(0.7⁹) + 45(0.3²)(0.7⁸) = 0.1211 + 0.2335 = 0.3546.]
In this case one could compute the exact answer as the sum of only two terms.
Nevertheless, let us illustrate how the Normal Approximation could be used in this case.
The Binomial distribution with q = 0.3 and m = 10 has a mean of: (0.3)(10) = 3, and a variance of:
(10)(0.3)(0.7) = 2.1. This Binomial Distribution can be approximated by a Normal Distribution with
mean of 3 and variance of 2.1, as shown below:
[Graph of the Binomial probabilities for q = 0.3 and m = 10, for x from 0 to 10, together with the approximating Normal density with mean 3 and variance 2.1.]

Prob[1 claim] = the area of a rectangle of width one and height f(1) = .1211.
Prob[2 claims] = the area of a rectangle of width one and height f(2) = .2335.
The chance of either one or two claims is the sum of these two rectangles; this is approximated by
the area under this Normal Distribution, with mean 3 and variance 2.1, from 1 - .5 = .5 to 2 + .5 = 2.5.
Prob[1 or 2 claims] ≅ Φ[(2.5 - 3)/√2.1] - Φ[(0.5 - 3)/√2.1] = Φ[-0.345] - Φ[-1.725] = 0.365 - 0.042 = 0.323.
Note that in order to get the probability for two values on the discrete Binomial Distribution, one has
to cover an interval of length two on the real line for the continuous Normal Distribution. We
subtracted 1/2 from the lower end of 1 and added 1/2 to the upper end of 2.
This is called the continuity correction.
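Here is a short Python sketch (my own illustration, using only the standard library) that reproduces both numbers in this example: the exact Binomial probability and the Normal approximation with the continuity correction.

from math import comb, erf, sqrt

def phi(z):
    # Standard Normal distribution function.
    return 0.5 * (1 + erf(z / sqrt(2)))

q, m = 0.3, 10
exact = sum(comb(m, k) * q**k * (1 - q)**(m - k) for k in (1, 2))

mu, sigma = m * q, sqrt(m * q * (1 - q))
approx = phi((2.5 - mu) / sigma) - phi((0.5 - mu) / sigma)

print(round(exact, 4))   # 0.3545 (the 0.3546 above comes from rounding the two terms first)
print(round(approx, 3))  # 0.323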


Below, I have zoomed in on the relevant part of the previous diagram:


[Close-up of the Normal density and the rectangles for 1 and 2 claims over the interval from 0.5 to 2.5, with the areas labeled A, B, C, and D that are referred to below.]

It should make it clear why the continuity correction is needed. In this case the chance of having 1 or 2
claims is equal to the area under the two rectangles, which is not close to the area under the Normal
from 1 to 2, but is approximated by the area under the Normal from 0.5 to 2.5.
In order to use the Normal Approximation, one must translate to the so called Standard Normal
Distribution40. In this case, we therefore need to standardize the variables by subtracting the mean
of 3 and dividing by the standard deviation of √2.1 = 1.449. In this case,
0.5 → (0.5 - 3)/1.449 = -1.725, while 2.5 → (2.5 - 3)/1.449 = -0.345. Thus, the chance of
observing either 1 or 2 claims is approximately: Φ[-0.345] - Φ[-1.725] = 0.365 - 0.042 = 0.323.
This compares to the exact result of .3546 calculated above. The diagram above shows why the
approximation was too small in this particular case41. Area A is within the first rectangle, but not under
the Normal Distribution. Area B is not within the first rectangle, but is under the Normal Distribution.
Area C is within the second rectangle, but not under the Normal Distribution. Area D is not within the
second rectangle, but is under the Normal Distribution.
Normal Approximation minus Exact Result = (Area B - Area A) + (Area D - Area C).
While there was no advantage to using the Normal approximation in this example, it saves a lot of
time when trying to deal with many terms.
40
Attached to the exam and shown below.
41
The approximation gets better as the mean of the Binomial gets larger. The error can be either positive or
negative.


In general, let μ be the mean of the frequency distribution, while σ is the standard
deviation of the frequency distribution; then the chance of observing at least i claims
and not more than j claims is approximately: Φ[{(j + 0.5) - μ}/σ] - Φ[{(i - 0.5) - μ}/σ].

Exercise: Use the Normal Approximation in order to estimate the probability
of observing at least 10 claims but no more than 18 claims from a Negative Binomial Distribution
with parameters β = 2/3 and r = 20.
[Solution: Mean = rβ = 13.33 and variance = rβ(1+β) = 22.22.
Prob[at least 10 claims but no more than 18 claims]
≅ Φ[(18.5 - 13.33)/√22.22] - Φ[(9.5 - 13.33)/√22.22] = Φ[1.097] - Φ[-0.813] =
0.864 - 0.208 = 0.656.
Comment: The exact answer is 0.648.]
Here is a graph of the Normal Approximation used in this exercise:
[Graph of the Negative Binomial probabilities for β = 2/3 and r = 20, for x from 0 to 30, together with the approximating Normal density.]

The continuity correction in this case: at least 10 claims but no more than 18 claims
↔ 10 - 1/2 = 9.5 to 18 + 1/2 = 18.5 on the Normal Distribution.
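A short check of this exercise (my own sketch, reusing the density recursion for the Negative Binomial) reproduces the exact value of about 0.648 and the approximation of about 0.655:

from math import erf, sqrt

def phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

beta, r = 2.0 / 3.0, 20
f = [1.0 / (1 + beta)**r]
for x in range(30):
    f.append(f[x] * beta * (x + r) / ((x + 1) * (1 + beta)))

exact = sum(f[10:19])  # Prob[10 <= N <= 18]
mu, sigma = r * beta, sqrt(r * beta * (1 + beta))
approx = phi((18.5 - mu) / sigma) - phi((9.5 - mu) / sigma)
print(round(exact, 3), round(approx, 3))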


Note that Prob[10 ≤ # claims ≤ 18] = Prob[9 < # claims < 19]. Thus one must be careful to
check the wording, to distinguish between open and closed intervals.
Prob[9 < # claims < 19] = Prob[10 ≤ # claims ≤ 18] ≅ Φ[{18.5 - μ}/σ] - Φ[{9.5 - μ}/σ].
One should use the continuity correction whenever one is using the Normal Distribution
in order to approximate the probability associated with a discrete distribution.
Do not use the continuity correction when one is using the Normal Distribution in order to
approximate continuous distributions, such as aggregate distributions42 or the Gamma Distribution.
Exercise: Use the Normal Approximation in order to estimate the probability
of observing more than 15 claims from a Poisson Distribution with λ = 10.
[Solution: Mean = variance = 10. Prob[# claims > 15] = 1 - Prob[# claims ≤ 15]
≅ 1 - Φ[(15.5 - 10)/√10] = 1 - Φ[1.739] = 1 - 0.9590 = 4.10%.
Comment: The exact answer is 4.87%.]


The area under the Normal Distribution and to the right of the vertical line at 15.5 is the approximation
used in this exercise:
Prob.
0.12
0.1
0.08
0.06
0.04
0.02

42

10

See Mahlers Guide to Aggregate Distributions.

15

20


Diagrams:
Some of you will find the following simple diagrams useful when applying the Normal
Approximation to discrete distributions.
More than 15 claims ⇔ At least 16 claims ⇔ 16 claims or more

        15        15.5        16
                   |

Prob[More than 15 claims] ≅ 1 - Φ[(15.5 - μ)/σ].
Exercise: For a frequency distribution with mean 14 and standard deviation 2, using the Normal
Approximation, what is the probability of at least 16 claims?
[Solution: Prob[At least 16 claims] = Prob[More than 15 claims] ≅ 1 - Φ[(15.5 - μ)/σ] =
1 - Φ[(15.5 - 14)/2] = 1 - Φ[0.75] = 1 - 0.7734 = 22.66%.]
Less than 12 claims ⇔ At most 11 claims ⇔ 11 claims or less

        11        11.5        12
                   |

Prob[Less than 12 claims] ≅ Φ[(11.5 - μ)/σ].
Exercise: For a frequency distribution with mean 10 and standard deviation 4, using the Normal
Approximation, what is the probability of at most 11 claims?
[Solution: Prob[At most 11 claims] = Prob[Less than 12 claims] ≅ Φ[(11.5 - μ)/σ] =
Φ[(11.5 - 10)/4] = Φ[0.375] = 64.6%.]


At least 10 claims and at most 13 claims ⇔ More than 9 claims and less than 14 claims

    9    9.5    10    11    12    13    13.5    14
          |                            |

Prob[At least 10 claims and at most 13 claims] ≅ Φ[(13.5 - μ)/σ] - Φ[(9.5 - μ)/σ].
Exercise: For a frequency distribution with mean 10 and standard deviation 4, using the Normal
Approximation, what is the probability of more than 9 claims and less than 14 claims?
[Solution: Prob[more than 9 claims and less than 14 claims] =
Prob[At least 10 claims and at most 13 claims] ≅ Φ[(13.5 - μ)/σ] - Φ[(9.5 - μ)/σ] =
Φ[(13.5 - 10)/4] - Φ[(9.5 - 10)/4] = Φ[0.875] - Φ[-0.125] = 0.809 - 0.450 = 35.9%.]
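The three diagrams can be collected into one small helper (my own sketch; the function and argument names are arbitrary): translate the verbal statement into a closed interval [i, j] of claim counts, then use i - 0.5 and j + 0.5 as the cutoffs.

from math import erf, sqrt

def phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx(i, j, mu, sigma):
    # Prob[i <= number of claims <= j], with the continuity correction.
    # Pass i = None for "no lower bound" and j = None for "no upper bound".
    lower = phi((i - 0.5 - mu) / sigma) if i is not None else 0.0
    upper = phi((j + 0.5 - mu) / sigma) if j is not None else 1.0
    return upper - lower

print(round(normal_approx(16, None, 14, 2), 4))  # at least 16 claims: 0.2266
print(round(normal_approx(None, 11, 10, 4), 3))  # at most 11 claims: 0.646
print(round(normal_approx(10, 13, 10, 4), 3))    # 10 to 13 claims: 0.359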


Confidence Intervals:
One can use the lower portion of the Normal Distribution table in order to get confidence intervals.
For example, in order to get a 95% confidence interval, one allows 2.5% probability on either tail.
Φ(1.96) = (1 + 95%)/2 = 97.5%.
Thus 95% of the probability on the Standard Normal Distribution is between -1.96 and 1.96:
[Graph of the Standard Normal density, with 2.5% probability in each tail beyond -1.96 and 1.96.]
Thus a 95% confidence interval for a Normal would be: mean ± 1.960 standard deviations.
Similarly, since Φ(1.645) = (1 + 90%)/2 = 95%, a 90% confidence interval is:
mean ± 1.645 standard deviations.
[Graph of the Standard Normal density, with 5% probability in each tail beyond -1.645 and 1.645.]


Normal Distribution:
The Normal Distribution is a bell-shaped symmetric distribution. Its two parameters are
its mean μ and its standard deviation σ.

f(x) = exp[-(x - μ)²/(2σ²)] / {σ√(2π)}, -∞ < x < ∞.

The sum of two independent Normal Distributions is also a Normal Distribution, with the
sum of the means and variances. If X is normally distributed, then so is aX + b, but with mean
aμ + b and standard deviation aσ. If one standardizes a normally distributed variable by subtracting μ
and dividing by σ, then one obtains a Standard Normal with mean 0 and standard deviation of 1.
A Normal Distribution with μ = 10 and σ = 5:
[Graph of the Normal density with μ = 10 and σ = 5, for x from -10 to 30.]

The density of the Standard Normal is denoted by φ(x) = exp[-x²/2]/√(2π), -∞ < x < ∞.43
The corresponding distribution function is denoted by Φ(x).

Φ(x) ≅ 1 - φ(x){0.4361836t - 0.1201676t² + 0.9372980t³}, where t = 1/(1 + 0.33267x).44

43
As shown near the bottom of the first page of the Tables for Exam 4/C.
44
See pages 103-104 of Simulation by Ross or 26.2.16 in Handbook of Mathematical Functions.
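As an aside (my own sketch), the polynomial approximation quoted above can be checked against an erf-based Φ from the Python standard library; the two agree to roughly four decimal places.

from math import erf, exp, pi, sqrt

def phi_pdf(x):
    return exp(-x * x / 2) / sqrt(2 * pi)

def phi_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def phi_approx(x):
    # Polynomial approximation quoted above, for x >= 0.
    t = 1.0 / (1.0 + 0.33267 * x)
    return 1.0 - phi_pdf(x) * (0.4361836 * t - 0.1201676 * t**2 + 0.9372980 * t**3)

for x in (0.5, 1.0, 1.96):
    print(round(phi_cdf(x), 5), round(phi_approx(x), 5))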


Normal Distribution

Support: ∞ > x > -∞          Parameters: ∞ > μ > -∞ (location parameter)
                                          σ > 0 (scale parameter)

D. f.:   F(x) = Φ[(x - μ)/σ]

P. d. f.:   f(x) = φ[(x - μ)/σ] / σ = exp[-(x - μ)²/(2σ²)] / {σ√(2π)}.     φ(x) = exp[-x²/2]/√(2π)

Central Moments: E[(X - μ)^n] = σ^n n! / {2^(n/2) (n/2)!}, n even, n ≥ 2
                 E[(X - μ)^n] = 0, n odd, n ≥ 1

Mean = μ          Variance = σ²

Coefficient of Variation = Standard Deviation / Mean = σ/μ

Skewness = 0 (distribution is symmetric)          Kurtosis = 3

Mode = μ          Median = μ

Limited Expected Value Function:
E[X ∧ x] = μΦ[(x - μ)/σ] - σ exp[-(x - μ)²/(2σ²)]/√(2π) + x {1 - Φ[(x - μ)/σ]}

Excess Ratio: R(x) = {1 - x/μ}{1 - Φ((x - μ)/σ)} + (σ/μ) exp(-(x - μ)²/(2σ²))/√(2π)

Mean Residual Life: e(x) = μ - x + σ exp(-(x - μ)²/(2σ²)) / {√(2π) {1 - Φ((x - μ)/σ)}}

Derivatives of d.f.:   ∂F(x)/∂μ = -φ((x - μ)/σ)/σ          ∂F(x)/∂σ = -(x - μ)φ((x - μ)/σ)/σ²

Method of Moments: μ = m₁, σ = (m₂ - m₁²)^0.5, where m₁ and m₂ are the first two empirical moments.

Percentile Matching: Set g_i = Φ⁻¹(p_i), then σ = (x₁ - x₂)/(g₁ - g₂), μ = x₁ - σg₁

Method of Maximum Likelihood: Same as Method of Moments.

Using the Normal Table:


When using the normal distribution, choose the nearest z-value to find the probability, or
if the probability is given, choose the nearest z-value. No interpolation should be used.
Example: If the given z-value is 0.759, and you need to find Pr(Z < 0.759) from the normal
distribution table, then choose the probability value for z-value = 0.76; Pr(Z < 0.76) = 0.7764.
When using the Normal Approximation to a discrete distribution, use the continuity correction.45
When using the top portion of the table, use the symmetry of the Standard Normal Distribution
around zero: Φ[-x] = 1 - Φ[x].
For example, Φ[-0.4] = 1 - Φ[0.4] = 1 - 0.6554 = 0.3446.
The bottom portion of the table can be used to get confidence intervals.
To cover a confidence interval of probability P, find y such that Φ[y] = (1 + P)/2.
For example, in order to get a 95% confidence interval, find y such that Φ[y] = 97.5%.
Thus, y = 1.960.
[-1.960, 1.960] covers 95% probability on a Standard Normal Distribution.

45

The instructions for Exam 4/C from the SOA/CAS.


Normal Distribution Table

Entries represent the area under the standardized normal distribution from -∞ to z, Pr(Z < z).
The value of z to the first decimal place is given in the left column. The second decimal is given in the top row.

z      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0   0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1   0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2   0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3   0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4   0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5   0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6   0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7   0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8   0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9   0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0   0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1   0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2   0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3   0.9995  0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4   0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
3.5   0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998
3.6   0.9998  0.9998  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.7   0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.8   0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.9   1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000

Values of z for selected values of Pr(Z < z):
z          0.842   1.036   1.282   1.645   1.960   2.326   2.576
Pr(Z < z)  0.800   0.850   0.900   0.950   0.975   0.990   0.995


Problems:
7.1 (2 points) You roll 1000 6-sided dice.
What is the chance of observing exactly 167 sixes?
(Use the Normal Approximation.)
A. less than 2.5%
B. at least 2.5% but less than 3.0%
C. at least 3.0% but less than 3.5%
D. at least 3.5% but less than 4.0%
E. at least 4.0%
7.2 (2 points) You roll 1000 6-sided dice.
What is the chance of observing 150 or more sixes but less than or equal to 180 sixes?
(Use the Normal Approximation.)
A. less than 78%
B. at least 78% but less than 79%
C. at least 79% but less than 80%
D. at least 80% but less than 81%
E. at least 81%
7.3 (2 points) You conduct 100 independent Bernoulli Trials, each with chance of success 1/4.
What is the chance of observing a total of at least 16 but not more than 20 successes?
(Use the Normal Approximation.)
A. less than 11%
B. at least 11% but less than 12%
C. at least 12% but less than 13%
D. at least 13% but less than 14%
E. at least 14%
7.4 (2 points) One observes 10,000 independent lives, each of which has a 2% chance of death
over the coming year. What is the chance of observing 205 or more deaths?
(Use the Normal Approximation.)
A. less than 36%
B. at least 36% but less than 37%
C. at least 37% but less than 38%
D. at least 38% but less than 39%
E. at least 39%

7.5 (2 points) The number of claims in a year is given by a Poisson distribution with parameter
λ = 400. What is the probability of observing at least 420 but no more than 440 claims over the
next year? (Use the Normal Approximation.)
A. less than 11%
B. at least 11% but less than 12%
C. at least 12% but less than 13%
D. at least 13% but less than 14%
E. at least 14%
Use the following information in the next three questions:
The Few States Insurance Company writes insurance in the states of Taxachusetts, Florgia and
Calizonia. Claims frequency for Few States Insurance in each state is Poisson, with expected claims
per year of 400 in Taxachusetts, 500 in Florgia and 1000 in Calizonia. The claim frequencies in the
three states are independent.
7.6 (2 points) What is the chance of Few States Insurance having a total of more than 1950 claims
next year? (Use the Normal Approximation.)
A. less than 10%
B. at least 10% but less than 11%
C. at least 11% but less than 12%
D. at least 12% but less than 13%
E. at least 13%
7.7 (3 points) What is the chance that Few States Insurance has more claims next year from
Taxachusetts and Florgia combined than from Calizonia?
(Use the Normal Approximation.)
A. less than 1.0%
B. at least 1.0% but less than 1.2%
C. at least 1.2% but less than 1.4%
D. at least 1.4% but less than 1.6%
E. at least 1.6%
7.8 (3 points) Define a large claim as one larger than $10,000. Assume that 30% of claims are large
in Taxachusetts, 25% in Florgia and 20% in Calizonia. Which of the following is an approximate 90%
confidence interval for the number of large claims observed by Few States Insurance over the next
year? Frequency and severity are independent.
(Use the Normal Approximation.)
A. [390, 500]
B. [395, 495]
C. [400, 490]
D. [405, 485]
E. [410, 480]

7.9 (2 points) A six-sided die is rolled five times. Using the Central Limit Theorem, what is the
estimated probability of obtaining a total of 20 on the five rolls?
A. less than 9.0%
B. at least 9% but less than 9.5%
C. at least 9.5% but less than 10%
D. at least 10% but less than 10.5%
E. at least 10.5%
7.10 (2 points) The number of claims in a year is given by the negative binomial distribution:
P[X = x] = ((9999 + x) choose x) 0.6^10000 0.4^x, x = 0, 1, 2, 3, ...
Using the Central Limit Theorem, what is the estimated probability of having 6800 or more claims in
a year?
A. less than 10.5%
B. at least 10.5% but less than 11%
C. at least 11% but less than 11.5%
D. at least 11.5% but less than 12%
E. at least 12%
7.11 (2 points) In order to estimate 1 - Φ(4), use the approximation:
Φ(x) ≅ 1 - φ(x){0.4361836 t - 0.1201676 t² + 0.9372980 t³}, where t = 1/(1 + 0.33267x),
and φ(x) is the density of the Standard Normal Distribution.
What is the resulting estimate of 1 - Φ(4)?
A. less than 0.0020%
B. at least 0.0020% but less than 0.0025%
C. at least 0.0025% but less than 0.0030%
D. at least 0.0030% but less than 0.0035%
E. at least 0.0035%
7.12 (2 points) You are given the following:

The New York Yankees baseball team plays 162 games.

Assume the Yankees have an a priori chance of winning each game of 65%.

Assume the results of the games are independent of each other.

What is the chance of the Yankees winning 114 or more games?


(Use the Normal Approximation.)
A. less than 6%
B. at least 6% but less than 7%
C. at least 7% but less than 8%
D. at least 8% but less than 9%
E. at least 9%

7.13 (2 points) You are given the following:

Sue takes an actuarial exam with 40 multiple choice questions, each of equal value.
Sue knows the answers to 13 questions and answers them correctly.
Sue guesses at random on the remaining 27 questions, with a 1/5 chance of
getting each such question correct, with each question independent of the others.
If 22 correct answers are needed to pass the exam, what is the probability that Sue passed her
exam?
Use the Normal Approximation.
A. 4%
B. 5%
C. 6%
D. 7%
E. 8%
7.14 (3 points) You are given the following:

The New York Yankees baseball team plays 162 games, 81 at home and 81 on
the road.

The Yankees have an a priori chance of winning each home game of 80%.

The Yankees have an a priori chance of winning each road game of 50%.

Assume the results of the games are independent of each other.

What is the chance of the Yankees winning 114 or more games?


(Use the Normal Approximation.)
A. less than 6%
B. at least 6% but less than 7%
C. at least 7% but less than 8%
D. at least 8% but less than 9%
E. at least 9%
7.15 (2 points) You are given the following:

Lucky Tom takes an actuarial exam with 40 multiple choice questions, each of equal value.
Lucky Tom knows absolutely nothing about the material being tested.
Lucky Tom guesses at random on each question, with a 40% chance of
getting each question correct, independent of the others.
If 24 correct answers are needed to pass the exam, what is the probability that Lucky Tom passed
his exam? Use the Normal Approximation.
A. 0.4%
B. 0.5%
C. 0.6%
D. 0.7%
E. 0.8%
7.16 (4 points) X has a Normal Distribution with mean μ and standard deviation σ.
Determine the expected value of |X|.

7.17 (4, 5/86, Q.48) (2 points) Assume an insurer has 400 claims drawn independently from a
distribution with mean 500 and variance 10,000.
Assuming that the Central Limit Theorem applies, find M such that the probability of the sum of
these claims being less than or equal to M is approximately 99%.
In which of the following intervals is M?
A. Less than 202,000
B. At least 202,000, but less than 203,000
C. At least 203,000, but less than 204,000
D. At least 204,000, but less than 205,000
E. 205,000 or more
7.18 (4, 5/86, Q.51) (1 point) Suppose X has a Poisson distribution with mean q.
Let Φ be the (Cumulative) Standard Normal Distribution.
Which of the following is an approximation for Prob(1 ≤ x ≤ 4) for sufficiently large q?
A. Φ[(4 - q)/√q] - Φ[(1 - q)/√q]
B. Φ[(4.5 - q)/√q] - Φ[(0.5 - q)/√q]
C. Φ[(1.5 - q)/√q] - Φ[(3.5 - q)/√q]
D. Φ[(3.5 - q)/√q] - Φ[(1.5 - q)/√q]
E. Φ[(4 - q)/q] - Φ[(1 - q)/q]
7.19 (4, 5/87, Q.51) (2 points) Suppose that the number of claims for an individual policy during a
year has a Poisson distribution with mean 0.01. What is the probability that there will be 5, 6, or 7
claims from 400 identical policies in one year, assuming a normal approximation?
A. Less than 0.30
B. At least 0.30, but less than 0.35
C. At least 0.35, but less than 0.40
D. At least 0.40, but less than 0.45
E. 0.45 or more.
7.20 (4, 5/88, Q.46) (1 point) A random variable X is normally distributed with mean 4.8 and
variance 4. The probability that X lies between 3.6 and 7.2 is Φ(b) - Φ(a), where Φ is the distribution
function of the unit normal variable. What are a and b, respectively?
A. 0.6, 1.2 B. 0.6, -0.3 C. -0.3, 0.6 D. -0.6, 1.2 E. None A, B, C, or D.
7.21 (4, 5/88, Q.49) (1 point) An unbiased coin is tossed 20 times. Using the normal
approximation, what is the probability of obtaining at least 8 heads?
The cumulative unit normal distribution is denoted by Φ(x).
A. Φ(-1.118)   B. Φ(-0.671)   C. 1 - Φ(-0.447)   D. Φ(0.671)   E. Φ(1.118)

7.22 (4, 5/90, Q.25) (1 point) Suppose the distribution of claim amounts is normal with a mean of
$1,500. If the probability that a claim exceeds $5,000 is .015, in what range is the standard
deviation, σ, of the distribution?
A. σ < 1,600
B. 1,600 ≤ σ < 1,625
C. 1,625 ≤ σ < 1,650
D. 1,650 ≤ σ < 1,675
E. 1,675 ≤ σ
7.23 (4, 5/90, Q.36) (2 points) The number of claims for each insured written by the
Homogeneous Insurance Company follows a Poisson process with a mean of .16.
The company has 100 independent insureds.
Let p be the probability that the company has more than 12 claims and less than 20 claims.
In what range does p fall? You may use the normal approximation.
A. p < 0.61
B. 0.61 < p < 0.63
C. 0.63 < p < 0.65
D. 0.65 < p < 0.67
E. 0.67 < p
7.24 (4, 5/91, Q.29) (2 points) A sample of 1,000 policies yields an estimated claim frequency of
0.210. Assuming the number of claims for each policy has a Poisson distribution, use the Normal
Approximation to find a 95% confidence interval for this estimate.
A. (0.198, 0.225) B. (0.191, 0.232) C. (0.183, 0.240)
D. (0.173, 0.251) E. (0.161, 0.264)
7.25 (4B, 5/92, Q.5) (2 points) You are given the following information:

Number of large claims follows a Poisson distribution.

Exposures are constant and there are no inflationary effects.

In the past 5 years, the following number of large claims has occurred: 12, 15, 19, 11, 18
Estimate the probability that more than 25 large claims occur in one year.
(The Poisson distribution should be approximated by the normal distribution.)
A. Less than .002
B. At least .002 but less than .003
C. At least .003 but less than .004
D. At least .004 but less than .005
E. At least .005

7.26 (4B, 11/92, Q.13) (2 points) You are given the following information:

The occurrence of hurricanes in a given year has a Poisson distribution.


For the last 10 years, the following number of hurricanes has occurred:
2, 4, 3, 8, 2, 7, 6, 3, 5, 2
Using the normal approximation to the Poisson, determine the probability of more than 10
hurricanes occurring in a single year.
A. Less than 0.0005
B. At least 0.0005 but less than 0.0025
C. At least 0.0025 but less than 0.0045
D. At least 0.0045 but less than 0.0065
E. At least 0.0065

7.27 (4B, 5/94, Q.20) (2 points) The occurrence of tornadoes in a given year is assumed to follow
a binomial distribution with parameters m = 50 and q = 0.60.
Using the Normal approximation to the binomial, determine the probability that at least 25 and at
most 40 tornadoes occur in a given year.
A. Less than 0.80
B. At least 0.80, but less than 0.85
C. At least 0.85, but less than 0.90
D. At least 0.90, but less than 0.95
E. At least 0.95
7.28 (5A, 11/94, Q.35) (1.5 points)
An insurance contract was priced with the following assumptions:
Claim frequency is Poisson with mean 0.01.
All claims are of size $5000.
Premiums are 110% of expected losses.
The company requires a 99% probability of not having losses exceed premiums.
(3/4 point) a. What is the minimum number of policies that the company must write given
the above surplus requirement?
(3/4 point) b. After the rate has been established, it was discovered that the claim severity
assumption was incorrect and that the claim severity should be 5% greater than
originally assumed. Now, what is the minimum number of policies that
the company must write given the above surplus requirement?

7.29 (4B, 11/96, Q.31) (2 points) You are given the following:

A portfolio consists of 1,600 independent risks.

For each risk the probability of at least one claim is 0.5.


Using the Central Limit Theorem, determine the approximate probability that the number of risks in
the portfolio with at least one claim will be greater than 850.
A. Less than 0.01
B. At least 0.01, but less than 0.05
C. At least 0.05, but less than 0.10
D. At least 0.10, but less than 0.20
E. At least 0.20
7.30 (4B, 11/97, Q.1) (2 points) You are given the following:
A portfolio consists of 10,000 identical and independent risks.
The number of claims per year for each risk follows a Poisson distribution with mean λ.
During the latest year, 1000 claims have been observed for the entire portfolio.
Determine the lower bound of a symmetric 95% confidence interval for λ.
A. Less than 0.0825
B. At least 0.0825, but less than 0.0875
C. At least 0.0875, but less than 0.0925
D. At least 0.0925, but less than 0.0975
E. At least 0.0975
7.31 (IOA 101, 9/00, Q.3) (1.5 points) The number of claims arising in a period of one month from
a group of policies can be modeled by a Poisson distribution with mean 24.
Using the Normal Approximation, determine the probability that fewer than 20 claims arise in a
particular month.
7.32 (IOA 101, 4/01, Q.4) (1.5 points) For a certain type of policy the probability that a
policyholder will make a claim in a year is 0.001. If a random sample of 10,000 policyholders is
selected, using the Normal Approximation, calculate an approximate value for the probability that
not more than 5 will make a claim next year.

Solutions to Problems:
7.1. C. This is Binomial with q = 1/6 and m = 1000. Mean = 1000/6 = 166.66.
Standard Deviation = √[(1000)(5/6)(1/6)] = 11.785.
Φ[(167.5 - 166.66)/11.785] - Φ[(166.5 - 166.66)/11.785] = Φ[0.07] - Φ[-0.01] = 0.5279 - 0.4960 = 0.0319.

7.2. D. This is Binomial with q = 1/6 and m = 1000. Mean = 1000/6 = 166.66.
Standard Deviation = √[(1000)(5/6)(1/6)] = 11.785. The interval from 150 to 180 corresponds on
the Standard Normal to the interval from (149.5 - 166.66)/11.785 to (180.5 - 166.66)/11.785.
Therefore the desired probability is:
Φ[(180.5 - 166.66)/11.785] - Φ[(149.5 - 166.66)/11.785] = Φ(1.17) - Φ(-1.46) = 0.8790 - 0.0721 = 0.8069.
Comment: The exact answer is 0.8080, so the Normal Approximation is quite good.
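For readers who like to double-check this arithmetic, here is a minimal Python sketch (not part of the original solution; it assumes scipy is installed) reproducing the continuity-corrected approximation of 7.2 and comparing it to the exact Binomial probability.

```python
# A minimal check (not part of the original solution; assumes scipy is installed) of the
# continuity-corrected Normal Approximation in 7.2 against the exact Binomial probability.
from scipy.stats import binom, norm

m, q = 1000, 1 / 6                     # 1000 six-sided dice; "success" = rolling a six
mean = m * q                           # 166.67
sd = (m * q * (1 - q)) ** 0.5          # 11.785

# P[150 <= N <= 180], using 149.5 and 180.5 because of the continuity correction.
approx = norm.cdf((180.5 - mean) / sd) - norm.cdf((149.5 - mean) / sd)
exact = binom.cdf(180, m, q) - binom.cdf(149, m, q)
print(approx, exact)                   # about 0.807 versus 0.808
```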
7.3. D. This is the Binomial Distribution with q = 0.25 and m = 100. Therefore the mean is (100)(0.25)
= 25. The Variance is: (100)(0.25)(0.75) = 18.75, and the Standard Deviation is: √18.75 = 4.330.
Therefore the desired probability is:
Φ[(20.5 - 25)/4.330] - Φ[(15.5 - 25)/4.330] = Φ(-1.04) - Φ(-2.19) = 0.1492 - 0.0143 = 0.1349.
Comment: The exact answer is 0.1377, so the Normal Approximation is okay.
7.4. C. Binomial Distribution with mean = 200 and variance = (10,000)(0.02)(1 - 0.02) = 196.
Standard deviation = 14. Chance of 205 or more claims = 1 - chance of 204 claims or less ≅
1 - Φ[(204.5 - 200)/14] = 1 - Φ(0.32) = 1 - 0.6255 = 0.3745.
7.5. E. Mean = 400 = variance. Standard deviation = 20.
Φ[(440.5 - 400)/20] - Φ[(419.5 - 400)/20] = Φ(2.03) - Φ(0.98) = 0.9788 - 0.8365 = 0.1423.
7.6. D. The total claims follow a Poisson Distribution with mean 400 + 500 + 1000 = 1900, since
independent Poisson variables add. This has a variance equal to the mean of 1900, and therefore a
standard deviation of √1900 = 43.59.
Prob[more than 1950 claims] ≅ 1 - Φ[(1950.5 - 1900)/43.59] = 1 - Φ(1.16) = 1 - 0.8770 = 0.123.

7.7. B. The number of claims in Taxachusetts and Florgia is given by a Poisson with mean
400 + 500 = 900. (Since the sum of independent Poisson variables is a Poisson.) This is
approximated by a Normal distribution with a mean of 900 and variance of 900. The number of
claims in Calizonia is approximated by the Normal distribution with mean 1000 and variance of
1000. The difference between the number of claims in Calizonia and the sum of the claims in
Taxachusetts and Florgia is therefore approximately a Normal Distribution with
mean = 1000 - 900 = 100 and variance = 1000 + 900 = 1900.
More claims next year from Taxachusetts and Florgia combined than from Calizonia
⇔ (# in Calizonia) - (# in Taxachusetts + # in Florgia) < 0
⇔ (# in Calizonia) - (# in Taxachusetts + # in Florgia) ≤ -1.
The probability of this is approximately: Φ[{(0 - 0.5) - 100}/√1900] = Φ(-2.31) = 0.0104.
Comment: The sum of independent Normal variables is a Normal.
If X is Normal, then so is -X, so the difference of Normal variables is also Normal.
Also E[X - Y] = E[X] - E[Y]. For X and Y independent variables, Var[X - Y] = Var[X] + Var[Y].
7.8. E. The number of large claims in Taxachusetts is Poisson with mean (30%)(400) = 120. (This
is the concept of thinning a Poisson.) Similarly, the number of large claims in Florgia and Calizonia
are Poisson with means of 125 and 200 respectively. Thus the large claims from all three states are
Poisson with mean = 120 + 125 + 200 = 445. (The sum of independent Poisson variables is a
Poisson.) This Poisson is approximated by a Normal with a mean of 445 and a variance of 445.
The standard deviation is √445 = 21.10. Φ(1.645) = 0.95, and thus a 90% confidence interval is
the mean ± 1.645 standard deviations, which in this case is about 445 ± (1.645)(21.10) =
410.3 to 479.7. Thus [410, 480] covers a little more than 90% of the probability.
7.9. A. For a six-sided die the mean is 3.5 and the variance is 35/12. For 5 such dice the mean is:
(5)(3.5) = 17.5 and the variance is: (5)(35/12) = 175/12. The standard deviation = 3.819. Thus the
interval from 19.5 to 20.5 corresponds to (19.5 - 17.5)/3.819 = 0.524 to
(20.5 - 17.5)/3.819 = 0.786 on the standard unit normal distribution. Using the Standard Normal Table,
this has a probability of Φ(0.79) - Φ(0.52) = 0.7852 - 0.6985 = 0.0867.
7.10. A. A Negative Binomial distribution with β = 2/3 and r = 10000.
Mean = rβ = (10000)(2/3) = 6666.66. Variance = mean(1 + β) = 11111.11.
Standard Deviation = 105.4. 1 - Φ[(6799.5 - 6666.66)/105.4] = 1 - Φ(1.26) = 1 - 0.8962 = 10.38%.
Comment: You have to recognize that this is an alternate way of writing the Negative Binomial
Distribution. In the tables attached to the exam, f(x) = {r(r+1)...(r+x-1)/x!} β^x / (1+β)^(x+r). The factor
β/(1 + β) is taken to the power x. Thus for the form of the distribution in this question, β/(1 + β) =
0.4, and solving, β = 2/3. Then line up with the formula in Appendix B and note that r = 10,000.

7.11. D. t = 1/(1 + (0.33267)(4)) = 0.42906.
φ(4) = exp(-4²/2)/√(2π) = 0.00033546/2.5066 = 0.00013383.
1 - Φ(4) ≅ φ(4){0.4361836(0.42906) - 0.1201676(0.42906)² + 0.9372980(0.42906)³} =
(0.00013383)(0.23906) = 0.00003199.
Comment: The exact answer is 1 - Φ(4) = 0.000031672.
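A short Python sketch of the same polynomial tail approximation, assuming only the standard library; the coefficients are those given in the problem, and math.erfc supplies the exact value for comparison.

```python
# Sketch of the polynomial tail approximation used in 7.11 (standard library only);
# the coefficients are those given in the problem, and math.erfc supplies the exact value.
import math

def phi(x):                 # Standard Normal density
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def upper_tail(x):          # approximates 1 - Phi(x)
    t = 1 / (1 + 0.33267 * x)
    return phi(x) * (0.4361836 * t - 0.1201676 * t ** 2 + 0.9372980 * t ** 3)

print(upper_tail(4))                        # about 0.0000320
print(0.5 * math.erfc(4 / math.sqrt(2)))    # exact 1 - Phi(4) = 0.0000317
```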
7.12. D. The number of games won is a binomial with m = 162 and q = 0.65.
The mean is: (162)(0.65) = 105.3. The variance is: (162)(0.65)(1 - 0.65) = 36.855.
The standard deviation is √36.855 = 6.07. Thus the chance of 114 or more wins is about:
1 - Φ[(113.5 - 105.3)/6.07] = 1 - Φ(1.35) = 1 - 0.9115 = 8.85%.
Comment: The exact probability is 8.72%.
7.13. D. The number of correct guesses is Binomial with parameters m = 27 and
q = 1/5, with mean: (1/5)(27) = 5.4 and variance: (1/5)(4/5)(27) = 4.32.
Therefore, Prob(# correct guesses ≥ 9) ≅ 1 - Φ[(8.5 - 5.4)/√4.32] = 1 - Φ(1.49) = 6.81%.
Comment: Any resemblance between the situation in this question and actual exams is coincidental.
The exact answer is in terms of an incomplete Beta Function: 1 - β(19, 9, 0.8) = 7.4%.
If Sue knows c questions, c ≤ 22, then her chance of passing is: 1 - β(19, 22-c, 0.8),
as displayed below:
[Figure: plot of the chance of passing as a function of c, for c from 10 to 22.]

7.14. C. The number of home games won is a binomial with m = 81 and q = 0.8. The mean is:
(81)(0.8) = 64.8 and variance is: (81)(0.8)(1 - 0.8) = 12.96. The number of road games won is a
binomial with m = 81 and q = 0.5. The mean is: (81)(0.5) = 40.5 and variance is: (81)(0.5)(1 - 0.5) =
20.25. The number of home and road wins are independent random variables, thus the variance of
their sum is the sum of their variances: 12.96 + 20.25 = 33.21.
The standard deviation is √33.21 = 5.76. The mean number of wins is: 64.8 + 40.5 = 105.3.
Thus the chance of 114 or more wins is about:
1 - Φ[(113.5 - 105.3)/5.76] = 1 - Φ(1.42) = 1 - 0.9228 = 7.78%.
Comment: The exact probability is 7.62%, obtained by convoluting two binomial distributions.
7.15. E. The number of questions Lucky Tom guesses correctly is Binomial with mean
(0.4)(40) = 16, and variance (40)(0.4)(0.6) = 9.6. The probability he guesses 24 or more correctly
is approximately: 1 - Φ[(23.5 - 16)/√9.6] = 1 - Φ(2.42) = 1 - 0.9922 = 0.78%.
Comment: The exact answer is 0.834177%. An ordinary person would only have a 20% chance of
randomly guessing correctly on each question. Therefore, their chance of passing would be
approximately: 1 - Φ[(23.5 - 8)/√6.4] = 1 - Φ(6.13) = 4.5 x 10^-10.
7.16. Let y = (x - μ)/σ. Then y follows a Standard Normal Distribution with mean 0
and standard deviation 1, with density f(y) = exp[-y²/2]/√(2π). x = σy + μ.
Expected value of |x| = Expected value of |σy + μ|
= ∫ from -∞ to ∞ of |σy + μ| exp[-y²/2]/√(2π) dy
= -∫ from -∞ to -μ/σ of (σy + μ) exp[-y²/2]/√(2π) dy + ∫ from -μ/σ to ∞ of (σy + μ) exp[-y²/2]/√(2π) dy
= -σ ∫ from -∞ to -μ/σ of y exp[-y²/2]/√(2π) dy - μΦ(-μ/σ)
  + σ ∫ from -μ/σ to ∞ of y exp[-y²/2]/√(2π) dy + μ{1 - Φ(-μ/σ)}
= σ exp[-y²/2]/√(2π) evaluated at y = -μ/σ, plus σ exp[-y²/2]/√(2π) evaluated at y = -μ/σ,
  plus μ{1 - 2Φ(-μ/σ)}
= σ √(2/π) exp[-μ²/(2σ²)] + μ{1 - 2Φ(-μ/σ)}.
Comment: For a Standard Normal, with μ = 0 and σ = 1, E[|X|] = √(2/π).
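The closed form just derived can be checked numerically; the sketch below is an illustration only, with arbitrarily chosen μ = 1.3 and σ = 2, and assumes numpy and scipy are available.

```python
# Numerical check of the closed form for E[|X|] derived above; mu and sigma are
# arbitrary illustrative values, and numpy/scipy are assumed to be available.
import math
import numpy as np
from scipy.stats import norm

mu, sigma = 1.3, 2.0

closed_form = (sigma * math.sqrt(2 / math.pi) * math.exp(-mu ** 2 / (2 * sigma ** 2))
               + mu * (1 - 2 * norm.cdf(-mu / sigma)))

rng = np.random.default_rng(0)
simulated = np.abs(rng.normal(mu, sigma, 1_000_000)).mean()   # Monte Carlo estimate of E[|X|]
print(closed_form, simulated)    # the two should agree to two or three decimal places
```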

7.17. D. The sum of 400 claims has a mean of (400)(500) = 200,000 and a variance of
(400)(10000). Thus the standard deviation of the sum is (20)(100) = 2000.
In order to standardize the variables one subtracts the mean and divides by the standard deviation,
thus standardizing M gives: (M - 200,000)/2000.
We wish the probability of the sum of the claims being less than or equal to M to be 99%.
For the standard Normal Distribution, Φ(2.327) = 0.99.
Setting 2.327 = (M - 200,000)/2000, we get M = 200,000 + (2.327)(2000) = 204,654.
7.18. B. Φ[(4.5 - q)/√q] - Φ[(0.5 - q)/√q].
7.19. C. The portfolio has a mean of (400)(0.01) = 4. Since each policy has a variance of 0.01 and
they should be assumed to be independent, then the variance of the portfolio is (400)(0.01) = 4.
Thus the probability of 5, 6 or 7 claims is approximately:
Φ[(7.5 - 4)/2] - Φ[(4.5 - 4)/2] = Φ[1.75] - Φ[0.25] = 0.9599 - 0.5987 = 0.3612.
7.20. D. The standard deviation of the Normal is √4 = 2.
Thus 3.6 corresponds to (3.6 - 4.8)/2 = -0.6, while 7.2 corresponds to (7.2 - 4.8)/2 = 1.2.
7.21. E. The distribution is Binomial with q = 0.5 and m = 20. That has mean (20)(0.5) = 10 and
variance (20)(0.5)(1 - 0.5) = 5. The chance of obtaining 8 or more heads is approximately:
1 - Φ[(7.5 - 10)/√5] = 1 - Φ(-1.118) = 1 - {1 - Φ(1.118)} = Φ(1.118).
7.22. B. The chance that a claim exceeds 5000 is 1 - Φ[(5000 - 1500)/σ] = 0.015.
Thus Φ(3500/σ) = 0.985. Consulting the Standard Normal Distribution, Φ(2.17) = 0.985,
therefore 3500/σ = 2.17. σ = 3500/2.17 = 1613.
7.23. B. The sum of independent Poisson variables is a Poisson. The mean number of claims is
(100)(0.16) = 16. Since for a Poisson the mean and variance are equal, the variance is also 16.
The standard deviation is 4. The probability is:
Φ[(19.5 - 16)/4] - Φ[(12.5 - 16)/4] = Φ(0.87) - Φ(-0.88) = 0.8078 - 0.1894 = 0.6184.
Comment: More than 12 claims (greater than or equal to 13 claims) corresponds to 12.5 due to the
continuity correction.

7.24. C. The observed number of claims is (1000)(0.210) = 210. Since for the Poisson Distribution
the variance is equal to the mean, the estimated variance for the sum is also 210. The standard
deviation is √210 = 14.49. Using the Normal Approximation, an approximate 95% confidence
interval is ± 1.96 standard deviations. (Φ(1.96) = 0.975.) Therefore a 95% confidence interval for the
number of claims from 1000 policies is 210 ± (1.96)(14.49) = 210 ± 28.4.
A 95% confidence interval for the claim frequency is: 0.210 ± 0.028.
Alternately, the standard deviation for the estimated frequency declines as the square root of the
number of policies used to estimate it: √(0.210/1000) = 0.458/31.62 = 0.01449. Thus a 95%
confidence interval for the claim frequency is: 0.210 ± (1.96)(0.01449) = 0.210 ± 0.028.
Alternately, one can be a little more precise and let λ be the Poisson frequency. Then the standard
deviation is √(λ/1000), and the 95% confidence interval has λ within 1.96 standard deviations of
0.210: -1.96√(λ/1000) ≤ (0.210 - λ) ≤ 1.96√(λ/1000). We can solve for the boundaries of this
interval: 1.96²λ/1000 = (0.210 - λ)². ⇒ λ² - 0.4238416λ + 0.0441 = 0.
λ = {0.4238416 ± √(0.4238416² - (4)(1)(0.0441))}/{(2)(1)} = 0.2119 ± 0.0285.
Thus the boundaries are 0.2119 - 0.0285 = 0.183 and 0.2119 + 0.0285 = 0.240.
Comment: One needs to assume that the policies have independent claim frequencies.
The sum of independent Poisson variables is again a Poisson.
7.25. C. The average number of large claims observed per year is: (12+15+19+11+18)/5 = 15.
Thus we estimate that the Poisson has a mean of 15 and thus a variance of 15.
Thus Prob(N > 25) ≅ 1 - Φ[(25.5 - 15)/√15] = 1 - Φ(2.71) ≅ 1 - 0.9966 = 0.0034.
7.26. B. The observed mean is 42/10 = 4.2. Assume a Poisson with mean of 4.2 and therefore
variance of 4.2. Using the continuity correction, more than 10 on the discrete Poisson
(11, 12, 13, ...) will correspond to more than 10.5 on the continuous Normal Distribution.
With a mean of 4.2 and a standard deviation of √4.2 = 2.05, 10.5 is standardized to:
(10.5 - 4.2)/2.05 = 3.07. Thus P(N > 10) ≅ 1 - Φ(3.07) = 1 - 0.9989 = 0.0011.
7.27. D. The mean of the Binomial is mq = (50)(0.6) = 30.
The variance of the Binomial is mq(1-q) = (50)(0.6)(1 - 0.6) = 12.
Thus the standard deviation is √12 = 3.464.
Φ[(40.5 - 30)/3.464] - Φ[(24.5 - 30)/3.464] = Φ(3.03) - Φ(-1.59) = 0.9988 - 0.0559 = 0.9429.

7.28. (a) With N policies, the mean aggregate loss = N(0.01)(5000), and the variance of aggregate
losses = N(0.01)(5000²). Thus premiums are 1.1N(0.01)(5000).
The 99th percentile of the Standard Normal Distribution is 2.326. Thus we want:
Premiums - Expected Losses = 2.326 (standard deviation of aggregate losses).
0.1N(0.01)(5000) = 2.326√[N(0.01)(5000²)]. Therefore, N = (2.326/0.1)²/0.01 = 54,103.
(b) With a severity of (1.05)(5000) = 5250 and N policies, the mean aggregate loss =
N(0.01)(5250), and the variance of aggregate losses = N(0.01)(5250²).
Premiums are still: 1.1N(0.01)(5000). Therefore, we want:
1.1N(0.01)(5000) - N(0.01)(5250) = 2.326√[N(0.01)(5250²)].
2.5N = 1221.15√N. ⇒ N = 238,593.
7.29. A. The mean is 800, while the variance is: (0.5)(1 - 0.5)(1600) = 400.
Thus the standard deviation is 20.
Using the continuity correction, more than 850 corresponds on the continuous Normal Distribution to:
1 - Φ[(850.5 - 800)/20] = 1 - Φ(2.53) = 0.0057.
7.30. D. The observed frequency is 1000/10000 = 0.1, which is the point estimate for λ.
Since for a Poisson the variance is equal to the mean, the estimated variance for a single insured of
its observed frequency is 0.1. For the sum of 10,000 identical insureds, the variance is divided by
10,000; thus the variance of the observed frequency is: 0.1/10,000 = 1/100,000. The standard
deviation is √(1/100,000) = 0.00316. Using the Normal Approximation, ± 1.96 standard deviations
would produce an approximate 95% confidence interval:
0.1 ± (1.96)(0.00316) = 0.1 ± 0.0062 = [0.0938, 0.1062].
7.31. Prob[N < 20] ≅ Φ[(19.5 - 24)/√24] = Φ[-0.92] = 17.88%.
7.32. The Binomial has mean: (0.001)(10,000) = 10, and variance: (0.001)(0.999)(10,000) = 9.99.
Prob[N ≤ 5] ≅ Φ[(5.5 - 10)/√9.99] = Φ[-1.42] = 7.78%.
Comment: The exact answer is 0.06699.

Section 8, Skewness
Skewness is one measure of the shape of a distribution.46
For example, take the following frequency distribution:

Number      Probability         Probability x   Probability x        Probability x
of Claims   Density Function    # of Claims     Square of # Claims   Cube of # Claims
0           0.1                 0               0                    0.0
1           0.2                 0.2             0.2                  0.2
2           0                   0               0                    0.0
3           0.1                 0.3             0.9                  2.7
4           0                   0               0                    0.0
5           0                   0               0                    0.0
6           0.1                 0.6             3.6                  21.6
7           0                   0               0                    0.0
8           0                   0               0                    0.0
9           0.1                 0.9             8.1                  72.9
10          0.3                 3               30                   300.0
11          0.1                 1.1             12.1                 133.1
Sum                             6.1             54.9                 530.5

E[X] = 1st moment about the origin = 6.1
E[X²] = 2nd moment about the origin = 54.9
E[X³] = 3rd moment about the origin = 530.5
Variance ≡ 2nd Central Moment = E[X²] - E[X]² = 54.9 - 6.1² = 17.69.
Standard Deviation = √17.69 = 4.206.
3rd Central Moment ≡ E[(X - E[X])³] = E[X³ - 3X²E[X] + 3XE[X]² - E[X]³]
= E[X³] - 3E[X]E[X²] + 2E[X]³ = 530.5 - (3)(6.1)(54.9) + (2)(6.1³) = -20.2.
(Coefficient of) Skewness ≡ Third Central Moment / STDDEV³ = -20.2/4.206³ = -0.27.
The third central moment: μ3 ≡ E[(X - E[X])³] = E[X³] - 3E[X]E[X²] + 2E[X]³.
The (coefficient of) skewness is defined as the 3rd central moment divided by the cube
of the standard deviation = E[(X - E[X])³] / STDDEV³.

46 The coefficient of variation and kurtosis are others. See Mahler's Guide to Loss Distributions.
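For those who like to verify such calculations, here is a minimal Python sketch (not from the original text) that recomputes the moments and skewness of the example distribution above; claim counts with zero probability are simply omitted from the dictionary.

```python
# Recomputing the moments and skewness of the example distribution above;
# claim counts with zero probability are simply omitted from the dictionary.
probs = {0: 0.1, 1: 0.2, 3: 0.1, 6: 0.1, 9: 0.1, 10: 0.3, 11: 0.1}

ex  = sum(p * n      for n, p in probs.items())    # 6.1
ex2 = sum(p * n ** 2 for n, p in probs.items())    # 54.9
ex3 = sum(p * n ** 3 for n, p in probs.items())    # 530.5

variance = ex2 - ex ** 2                           # 17.69
third_central = ex3 - 3 * ex * ex2 + 2 * ex ** 3   # about -20.2
print(variance, third_central / variance ** 1.5)   # skewness of about -0.27
```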

In the above example, the skewness is -0.27.


A negative skewness indicates a curve skewed to the left.
The Binomial Distribution for q > 0.5 is skewed to the left.
For example, here is a Binomial Distribution with m = 10 and q = 0.7:

[Figure: plot of the densities for n = 0 through 10.]

In contrast, the Binomial Distribution for q < 0.5 has positive skewness and is skewed to the right.
For example, here is a Binomial Distribution with m = 10 and q = 0.2:
[Figure: plot of the densities for n = 0 through 10.]

The Poisson Distribution, the Negative Binomial Distribution (including the special case of the
Geometric Distribution), as well as most size of loss distributions, are skewed to the right; they have
a small but significant probability of very large values.
For example, here is a Geometric Distribution with β = 2:47
[Figure: plot of the densities for n = 0 through 10.]

47 Even though only densities up to 10 are shown, the Geometric Distribution has support from zero to infinity.

As another example of a distribution skewed to the right, here is a Poisson Distribution with λ = 3:48
[Figure: plot of the densities for n = 0 through 10.]
For the Poisson distribution the skewness is positive and therefore the distribution is skewed to the
right. However, as λ gets very large, the skewness of a Poisson approaches zero; in fact the
Poisson approaches a Normal Distribution.49
For example, here is a Poisson Distribution with λ = 30:
[Figure: plot of the densities for n = 0 through 60.]

48 Even though only densities up to 10 are shown, the Poisson Distribution has support from zero to infinity.
49 This follows from the Central Limit Theorem and the fact that for integral N, a Poisson with parameter N is the sum
of N independent variables each with a Poisson distribution with a parameter of unity. The Normal Distribution is
symmetric and therefore has zero skewness.

A symmetric distribution has zero skewness.


Therefore, the Binomial Distribution for q = 0.5 and the Normal Distribution each have zero
skewness. For example, here is a Binomial Distribution with m = 10 and q = 0.5:
[Figure: plot of the densities for n = 0 through 10.]

Binomial Distribution:
For a Binomial Distribution with m = 5 and q = 0.1:
Number      Probability         Probability x   Probability x        Probability x
of Claims   Density Function    # of Claims     Square of # Claims   Cube of # Claims
0           59.049%             0.00000         0.00000              0.00000
1           32.805%             0.32805         0.32805              0.32805
2           7.290%              0.14580         0.29160              0.58320
3           0.810%              0.02430         0.07290              0.21870
4           0.045%              0.00180         0.00720              0.02880
5           0.001%              0.00005         0.00025              0.00125
Sum                             0.50000         0.70000              1.16000

The mean is: 0.5 = (5)(0.1) = mq.
The variance is: E[X²] - E[X]² = 0.7 - 0.5² = 0.45 = (5)(0.1)(0.9) = mq(1-q).
The skewness is:
{E[X³] - 3E[X]E[X²] + 2E[X]³} / σ³ = {1.16 - (3)(0.7)(0.5) + 2(0.5³)} / 0.45^(3/2) = 0.8/√0.45 = 1.1925
= (1 - 2q)/√[mq(1-q)].

For a Binomial Distribution, the skewness is: (1 - 2q)/√[mq(1-q)].

Binomial Distribution with q < 1/2 ⇔ positive skewness ⇔ skewed to the right.
Binomial Distribution with q = 1/2 ⇔ symmetric ⇔ zero skewness.
Binomial Distribution with q > 1/2 ⇔ negative skewness ⇔ skewed to the left.
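As a quick numerical check (an optional aid, not from the original text, assuming scipy is available), the formula can be compared to the skewness scipy reports for the same Binomial.

```python
# Checking the Binomial skewness formula against scipy (optional aid only).
from scipy.stats import binom

m, q = 5, 0.1
print((1 - 2 * q) / (m * q * (1 - q)) ** 0.5)   # 1.1925...
print(binom.stats(m, q, moments='s'))           # same value reported by scipy
```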

Poisson Distribution:
For a Poisson distribution with λ = 2.5:

Number      Probability         Probability x   Probability x        Probability x        Distribution
of Claims   Density Function    # of Claims     Square of # Claims   Cube of # Claims     Function
0           0.08208500          0.00000000      0.00000000           0.00000000           0.08208500
1           0.20521250          0.20521250      0.20521250           0.20521250           0.28729750
2           0.25651562          0.51303124      1.02606248           2.05212497           0.54381312
3           0.21376302          0.64128905      1.92386716           5.77160147           0.75757613
4           0.13360189          0.53440754      2.13763017           8.55052069           0.89117802
5           0.06680094          0.33400471      1.67002357           8.35011786           0.95797896
6           0.02783373          0.16700236      1.00201414           6.01208486           0.98581269
7           0.00994062          0.06958432      0.48709021           3.40963146           0.99575330
8           0.00310644          0.02485154      0.19881233           1.59049864           0.99885975
9           0.00086290          0.00776611      0.06989496           0.62905464           0.99972265
10          0.00021573          0.00215725      0.02157252           0.21572518           0.99993837
11          0.00004903          0.00053931      0.00593244           0.06525687           0.99998740
12          0.00001021          0.00012257      0.00147085           0.01765024           0.99999762
13          0.00000196          0.00002554      0.00033196           0.00431553           0.99999958
14          0.00000035          0.00000491      0.00006875           0.00096250           0.99999993
15          0.00000006          0.00000088      0.00001315           0.00019731           0.99999999
16          0.00000001          0.00000015      0.00000234           0.00003741           1.00000000
17          0.00000000          0.00000002      0.00000039           0.00000660           1.00000000
18          0.00000000          0.00000000      0.00000006           0.00000109           1.00000000
19          0.00000000          0.00000000      0.00000001           0.00000017           1.00000000
20          0.00000000          0.00000000      0.00000000           0.00000002           1.00000000
Sum         1.00000000          2.50000000      8.75000000           36.87500000

The mean is: 2.5 = λ.
The variance is: E[X²] - E[X]² = 8.75 - 2.5² = 2.5 = λ.
The coefficient of variation = √variance / mean = √λ/λ = 1/√λ.
The skewness is:
{E[X³] - 3E[X]E[X²] + 2E[X]³} / σ³ = {36.875 - (3)(2.5)(8.75) + 2(2.5³)} / 2.5^(3/2) = 1/√2.5.

For the Poisson Distribution, the skewness is: 1/√λ. For the Poisson, Skewness = CV.

Poisson Distribution ⇔ positive skewness ⇔ skewed to the right.
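Again, a short check against scipy (an optional aid, not part of the syllabus).

```python
# For a Poisson, skewness = 1/sqrt(lambda) = coefficient of variation (optional check).
from scipy.stats import poisson

lam = 2.5
print(1 / lam ** 0.5)                     # 0.6325
print(poisson.stats(lam, moments='s'))    # same skewness reported by scipy
```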

Negative Binomial Distribution:


For a Negative Binomial distribution with r = 3 and β = 0.4:

Number      Probability         Probability x   Probability x        Probability x        Distribution
of Claims   Density Function    # of Claims     Square of # Claims   Cube of # Claims     Function
0           0.36443149          0.00000000      0.00000000           0.00000000           0.36443149
1           0.31236985          0.31236985      0.31236985           0.31236985           0.67680133
2           0.17849705          0.35699411      0.71398822           1.42797644           0.85529839
3           0.08499860          0.25499579      0.76498738           2.29496213           0.94029699
4           0.03642797          0.14571188      0.58284753           2.33139010           0.97672496
5           0.01457119          0.07285594      0.36427970           1.82139852           0.99129614
6           0.00555093          0.03330557      0.19983344           1.19900062           0.99684707
7           0.00203912          0.01427382      0.09991672           0.69941703           0.99888619
8           0.00072826          0.00582605      0.04660838           0.37286706           0.99961445
9           0.00025431          0.00228880      0.02059924           0.18539316           0.99986876
10          0.00008719          0.00087193      0.00871926           0.08719255           0.99995595
11          0.00002944          0.00032386      0.00356244           0.03918682           0.99998539
12          0.00000981          0.00011777      0.00141320           0.01695839           0.99999520
13          0.00000324          0.00004206      0.00054677           0.00710805           0.99999844
14          0.00000106          0.00001479      0.00020706           0.00289887           0.99999950
15          0.00000034          0.00000513      0.00007697           0.00115454           0.99999984
16          0.00000011          0.00000176      0.00002815           0.00045038           0.99999995
17          0.00000004          0.00000060      0.00001015           0.00017251           0.99999998
18          0.00000001          0.00000020      0.00000361           0.00006501           0.99999999
19          0.00000000          0.00000007      0.00000127           0.00002414           1.00000000
20          0.00000000          0.00000002      0.00000044           0.00000885           1.00000000
Sum         1.00000000          1.19999999      3.11999977           10.79999502

The mean is: 1.2 = (3)(0.4) = rβ.
The variance is: E[X²] - E[X]² = 3.12 - 1.2² = 1.68 = (3)(0.4)(1.4) = rβ(1+β).
The third central moment is: E[X³] - 3E[X]E[X²] + 2E[X]³ = 10.8 - (3)(1.2)(3.12) + 2(1.2³) =
3.024 = (1.8)(3)(0.4)(1.4) = (1 + 2β)rβ(1 + β).50
The skewness is: 3.024 / 1.68^(3/2) = 1.389 = 1.8/√[(3)(0.4)(1.4)].

For the Negative Binomial Distribution, the skewness is: (1 + 2β)/√[rβ(1 + β)].

Negative Binomial Distribution ⇔ positive skewness ⇔ skewed to the right.

50 For the Negative Binomial Distribution, 3σ² - 2μ + 2(σ² - μ)²/μ = 3rβ(1+β) - 2rβ + 2{rβ(1+β) - rβ}²/(rβ) =
rβ + 3rβ² + 2rβ³ = rβ(1 + β)(1 + 2β) = third central moment.
This property of the Negative Binomial is discussed in Section 6.9 of Loss Models, not on the syllabus.
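The Negative Binomial formula can be checked the same way (optional aid, not from the original text); scipy parameterizes the Negative Binomial by (n, p), with n = r and p = 1/(1 + β).

```python
# Checking the Negative Binomial skewness formula against scipy.
# scipy parameterizes the Negative Binomial by (n, p), with n = r and p = 1/(1 + beta).
from scipy.stats import nbinom

r, beta = 3, 0.4
print((1 + 2 * beta) / (r * beta * (1 + beta)) ** 0.5)    # 1.389
print(nbinom.stats(r, 1 / (1 + beta), moments='s'))       # same value reported by scipy
```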

Problems:
8.1 (3 points) What is the skewness of the following frequency distribution?
Number of Claims   Probability
0                  0.02
1                  0.04
2                  0.14
3                  0.31
4                  0.36
5                  0.13
A. less than -1.0
B. at least -1.0 but less than -0.5
C. at least -0.5 but less than 0
D. at least 0 but less than 0.5
E. at least 0.5
8.2 (2 points) A distribution has first moment = m, second moment about the origin = m + m², and
third moment about the origin = m + 3m² + m³.
Which of the following is the skewness of this distribution?
A. m   B. m^0.5   C. 1   D. m^-0.5   E. m^-1

8.3 (3 points) The number of claims filed by a commercial auto insured as the result of
at-fault accidents caused by its drivers is shown below:
Year   Claims
2002   7
2001   3
2000   5
1999   10
1998   5
Calculate the skewness of the empirical distribution of the number of claims per year.
A. Less than 0.50
B. At least 0.50, but less than 0.75
C. At least 0.75, but less than 1.00
D. At least 1.00, but less than 1.25
E. At least 1.25

8.4 (4 points) You are given the following distribution of the number of claims on 100,000 motor
vehicle comprehensive polices:
Number of claims   Observed number of policies
0                  88,585
1                  10,577
2                  779
3                  54
4                  4
5                  1
6 or more          0
Calculate the skewness of this distribution.
A. 1.0
B. 1.5
C. 2.0
D. 2.5
E. 3.0
8.5 (4, 5/87, Q.33) (1 point) There are 1000 insurance policies in force for one year.
The results are as follows:
Number of Claims
Policies
0
800
1
130
2
50
3
20
1000
Which of the following statements are true?
1. The mean of this distribution is 0.29.
2. The variance of this distribution is at least 0.45.
3. The skewness of this distribution is negative.
A. 1
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3

8.6 (CAS3, 5/04, Q.28) (2.5 points) A pizza delivery company has purchased an automobile
liability policy for its delivery drivers from the same insurance company for the past five years.
The number of claims filed by the pizza delivery company as the result of at-fault accidents caused
by its drivers is shown below:
Year   Claims
2002   4
2001   1
2000   3
1999   2
1998   15
Calculate the skewness of the empirical distribution of the number of claims per year.
A. Less than 0.50
B. At least 0.50, but less than 0.75
C. At least 0.75, but less than 1.00
D. At least 1.00, but less than 1.25
E. At least 1.25

Solutions to Problems:
8.1. B. Variance = E[X²] - E[X]² = 12.4 - 3.34² = 1.244.
Standard Deviation = √1.244 = 1.116.
Skewness = {E[X³] - (3 E[X] E[X²]) + (2 E[X]³)} / STDDEV³ =
{48.82 - (3)(3.34)(12.4) + (2)(3.34³)} / (1.116³) = -0.65.

Number      Probability         Probability x   Probability x        Probability x
of Claims   Density Function    # of Claims     Square of # Claims   Cube of # Claims
0           2%                  0               0                    0.0
1           4%                  0.04            0.04                 0.0
2           14%                 0.28            0.56                 1.1
3           31%                 0.93            2.79                 8.4
4           36%                 1.44            5.76                 23.0
5           13%                 0.65            3.25                 16.2
Sum                             3.34            12.4                 48.82

8.2. D. σ² = μ2′ - μ1′² = (m + m²) - m² = m.
Skewness = {μ3′ - (3 μ1′ μ2′) + (2 μ1′³)} / σ³ =
{(m + 3m² + m³) - 3(m + m²)m + 2m³} / m^(3/2) = m^(-0.5).
Comment: The moments are those of the Poisson Distribution with mean m.
8.3. B. E[X] = (7 + 3 + 5 + 10 + 5)/5 = 6. E[X²] = (7² + 3² + 5² + 10² + 5²)/5 = 41.6.
Var[X] = 41.6 - 6² = 5.6.
E[X³] = (7³ + 3³ + 5³ + 10³ + 5³)/5 = 324.
Skewness = {E[X³] - 3 E[X²]E[X] + 2E[X]³} / Var[X]^1.5
= {324 - (3)(41.6)(6) + (2)(6³)}/5.6^1.5 = 7.2/13.25 = 0.54.
Comment: Similar to CAS3, 5/04, Q.28. E[(X - X̄)³] = (1³ + (-3)³ + (-1)³ + 4³ + (-1)³)/5 = 7.2.

8.4. E. E[X] = 12318/100000 = 0.12318. E[X²] = 14268/100000 = 0.14268.

Number      Number of    Contribution to   Contribution to   Contribution to
of Claims   Policies     First Moment      Second Moment     Third Moment
0           88,585       0                 0                 0
1           10,577       10577             10577             10577
2           779          1558              3116              6232
3           54           162               486               1458
4           4            16                64                256
5           1            5                 25                125
Total       100,000      12,318            14,268            18,648

Var[X] = 0.14268 - 0.12318² = 0.12751.
E[X³] = 18648/100000 = 0.18648.
Third Central Moment = E[X³] - 3 E[X]E[X²] + 2E[X]³
= 0.18648 - (3)(0.12318)(0.14268) + (2)(0.12318³) = 0.13749.
Skewness = (Third Central Moment) / Var[X]^1.5 = 0.13749/0.12751^1.5 = 3.02.
Comment: Data taken from Table 5.9.1 in Introductory Statistics with Applications in General
Insurance by Hossack, Pollard and Zehnwirth.
8.5. A. 1. True. The mean is {(0)(800) + (1)(130) + (2)(50) + (3)(20)}/1000 = 0.290.
2. False. The second moment is {(0²)(800) + (1²)(130) + (2²)(50) + (3²)(20)}/1000 = 0.510.
Thus the variance = 0.510 - 0.29² = 0.4259.
3. False. The distribution is skewed to the right and thus of positive skewness. The third moment is:
{(0³)(800) + (1³)(130) + (2³)(50) + (3³)(20)}/1000 = 1.070.
Therefore, skewness = {μ3′ - (3 μ1′ μ2′) + (2 μ1′³)} / STDDEV³ =
{1.070 - (3)(0.29)(0.51) + (2)(0.29³)}/0.278 = 2.4 > 0.
8.6. E. E[X] = (4 + 1 + 3 + 2 + 15)/5 = 5. E[X²] = (4² + 1² + 3² + 2² + 15²)/5 = 51.
Var[X] = 51 - 5² = 26.
E[X³] = (4³ + 1³ + 3³ + 2³ + 15³)/5 = 695.
Skewness = {E[X³] - 3 E[X²]E[X] + 2E[X]³} / Var[X]^1.5
= {695 - (3)(51)(5) + (2)(5³)}/26^1.5 = 180/132.57 = 1.358.
Alternately, the third central moment is:
{(4 - 5)³ + (1 - 5)³ + (3 - 5)³ + (2 - 5)³ + (15 - 5)³}/5 = 180. Skewness = 180/26^1.5 = 1.358.
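A small Python sketch of the same empirical calculation (not part of the original solution; it uses population-style moments, dividing by n rather than n - 1, as in the solution above).

```python
# The same empirical calculation, using population-style moments (dividing by n, not n - 1).
claims = [4, 1, 3, 2, 15]
n = len(claims)

mean = sum(claims) / n
variance = sum((x - mean) ** 2 for x in claims) / n        # 26
third = sum((x - mean) ** 3 for x in claims) / n           # 180
print(third / variance ** 1.5)                             # about 1.358
```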

Section 9, Probability Generating Functions51


The Probability Generating Function, p.g.f., is useful for working with frequency distributions.52

P(z) = Expected Value of z^n ≡ E[z^n] = Σ_{n=0}^∞ f(n) z^n.
Note that as with other generating functions, there is a dummy variable, in this case z.
Exercise: Assume a distribution with 1-q chance of 0 claims and q chance of 1 claim. (This is a Bernoulli
distribution with parameter q.) What is the probability generating function?
[Solution: P(z) = E[z^n] = (1-q)(z⁰) + q(z¹) = 1 + q(z-1).]
The Probability Generating Function of the sum of independent frequencies is the
product of the individual Probability Generating Functions.
Specifically, if X and Y are independent random variables, then
P_{X+Y}(z) = E[z^{x+y}] = E[z^x z^y] = E[z^x] E[z^y] = P_X(z) P_Y(z).
Exercise: What is the probability generating function of the sum of two independent Bernoulli
distributions, each with parameter q?
[Solution: It is the product of the probability generating functions of each Bernoulli:
{1 + q(z-1)}². Alternately, one can compute that for the sum of the two Bernoullis there is:
(1-q)² chance of zero claims, 2q(1-q) chance of 1 claim, and q² chance of 2 claims.
Thus P(z) = (1-q)²z⁰ + 2q(1-q)z¹ + q²z² = 1 - 2q + q² + 2qz - 2q²z + q²z² =
1 + 2q(z-1) + (z² - 2z + 1)q² = {1 + q(z-1)}².]
As discussed, a Binomial distribution with parameters q and m is the sum of m independent
Bernoulli distributions each with parameter q. Therefore the probability generating function of a
Binomial distribution is that of the Bernoulli, to the power m: {1 + q(z-1)}^m.
The probability generating functions, as well as much other useful information
on each frequency distribution, are given in the tables attached to the exam.

51 See Definition 3.9 in Loss Models.
52 The Probability Generating Function is similar to the Moment Generating Function: M(z) = E[e^{zn}].
See Mahler's Guide to Aggregate Distributions. They are related via P(z) = M(ln(z)). They share many properties.
Loss Models uses the Probability Generating Function when dealing with frequency distributions.

Densities:
The distribution determines the probability generating function and vice versa.
Given a p.g.f., one can obtain the probabilities by repeated differentiation as follows:
f(n) = [d^n P(z)/dz^n, evaluated at z = 0] / n!.
f(0) = P(0).53  f(1) = P′(0).  f(2) = P′′(0)/2.  f(3) = P′′′(0)/6.  f(4) = P′′′′(0)/24.
Exercise: Given the probability generating function e^{λ(z-1)}, what is the probability of three claims?
[Solution: P(z) = e^{λ(z-1)} = e^{λz}e^{-λ}. P′(z) = λe^{λz}e^{-λ}. P′′(z) = λ²e^{λz}e^{-λ}. P′′′(z) = λ³e^{λz}e^{-λ}.
f(3) = (d³P(z)/dz³)_{z=0} / 3! = (d³e^{λ(z-1)}/dz³)_{z=0} / 3! = λ³e⁰e^{-λ} / 3! = λ³e^{-λ} / 3!.
Note that this is the p.g.f. of a Poisson Distribution with parameter lambda, and this is indeed the
probability of 3 claims for a Poisson.]
Alternately, the probability of n claims is the coefficient of z^n in the p.g.f.
So for example, given the probability generating function: e^{λ(z-1)} = e^{λz}e^{-λ} =
e^{-λ} Σ_{i=0}^∞ (λz)^i / i! = Σ_{i=0}^∞ {e^{-λ} λ^i / i!} z^i. Thus for this p.g.f., f(i) = e^{-λ} λ^i / i!,
which is the density function of a Poisson distribution.


Mean:
P(z) = Σ_{n=0}^∞ f(n) z^n.  P(1) = Σ_{n=0}^∞ f(n) = 1.
P′(z) = Σ_{n=1}^∞ f(n) n z^{n-1}.  P′(1) = Σ_{n=1}^∞ f(n) n = Σ_{n=0}^∞ f(n) n = Mean.
P′(1) = Mean.54

53 As z → 0, z^n → 0 for n > 0. Therefore, P(z) = Σ f(n) z^n → f(0) as z → 0.
54 This is a special case of a result discussed subsequently in the section on factorial moments.
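For those who find a symbolic check helpful, the sketch below (not from the original text; it assumes sympy is installed and uses a Poisson p.g.f. with λ = 0.4 purely as an illustration) recovers the densities by repeated differentiation at z = 0 and the mean as P′(1).

```python
# Recovering densities and the mean from a p.g.f. symbolically (assumes sympy is installed).
# The Poisson p.g.f. with lambda = 0.4 is used purely as an illustration.
import sympy as sp

z = sp.symbols('z')
lam = sp.Rational(2, 5)
P = sp.exp(lam * (z - 1))

# f(n) = P^(n)(0) / n!  -- repeated differentiation at z = 0
densities = [sp.diff(P, z, n).subs(z, 0) / sp.factorial(n) for n in range(4)]
print([sp.simplify(d) for d in densities])     # e^(-0.4) * 0.4^n / n!

print(sp.diff(P, z).subs(z, 1))                # Mean = P'(1) = 0.4
```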

Proof of Results for Adding Distributions:


One can use the probability generating function, in order to determine the results of adding Poisson,
Binomial, or Negative Binomial Distributions.
Assume one has two independent Poisson Distributions with means λ1 and λ2.
The p.g.f. of a Poisson is P(z) = e^{λ(z-1)}.
The p.g.f. of the sum of these two Poisson Distributions is the product of the p.g.f.s of the two
Poisson Distributions: exp[λ1(z-1)] exp[λ2(z-1)] = exp[(λ1 + λ2)(z-1)].
This is the p.g.f. of a Poisson Distribution with mean λ1 + λ2.
In general, the sum of two independent Poisson Distributions is also Poisson, with mean equal to the
sum of the means.
Similarly, assume we are summing two independent Binomial Distributions with parameters m1 and
q, and m2 and q. The p.g.f. of a Binomial is P(z) = {1 + q(z-1)}^m.
The p.g.f. of the sum is: {1 + q(z-1)}^{m1} {1 + q(z-1)}^{m2} = {1 + q(z-1)}^{m1 + m2}.
This is the p.g.f. of a Binomial Distribution with parameters m1 + m2 and q.
In general, the sum of two independent Binomial Distributions with the same q parameter is also
Binomial, with parameters m1 + m2 and q.
Assume we are summing two independent Negative Binomial Distributions with parameters r1 and
β, and r2 and β. The p.g.f. of the Negative Binomial is P(z) = {1 - β(z-1)}^{-r}.
The p.g.f. of the sum is: {1 - β(z-1)}^{-r1} {1 - β(z-1)}^{-r2} = {1 - β(z-1)}^{-(r1 + r2)}.
This is the p.g.f. of a Negative Binomial Distribution with parameters r1 + r2 and β.
In general, the sum of two independent Negative Binomial Distributions with the same β parameter
is also Negative Binomial, with parameters r1 + r2 and β.
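The defining property P_{X+Y}(z) = P_X(z)P_Y(z) can also be illustrated by simulation; this sketch (illustrative values only, not from the original text) compares a simulated E[z^{X+Y}] for two independent Poissons to the p.g.f. of a Poisson with the summed mean.

```python
# Illustration (simulation, not a proof): E[z^(X+Y)] = E[z^X] E[z^Y] for independent Poissons,
# so the p.g.f. of the sum is the Poisson p.g.f. with the summed mean.
import numpy as np

rng = np.random.default_rng(1)
lam1, lam2, zval = 1.5, 2.5, 0.7
x = rng.poisson(lam1, 500_000)
y = rng.poisson(lam2, 500_000)

print((zval ** (x + y)).mean())             # simulated p.g.f. of the sum at z = 0.7
print(np.exp((lam1 + lam2) * (zval - 1)))   # Poisson p.g.f. with mean 4 at z = 0.7
```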

Infinite Divisibility:55
If a distribution is infinitely divisible, then if one takes the probability generating function to any
positive power, one gets the probability generating function of another member of the same family
of distributions.56
For example, for the Poisson P(z) = e^{λ(z-1)}. If we take this p.g.f. to the power p > 0,
P(z)^p = e^{pλ(z-1)}, which is the p.g.f. of a Poisson with mean pλ.
The p.g.f. of a sum of r independent identically distributed variables, is the individual p.g.f. to the
power r. Since for the Geometric P(z) = 1/{1 - β(z-1)}, for the Negative Binomial distribution:
P(z) = {1 - β(z-1)}^{-r}, for r > 0, β > 0.
Exercise: P(z) = {1 - β(z-1)}^{-r}, for r > 0, β > 0.
Is the corresponding distribution infinitely divisible?
[Solution: P(z)^p = {1 - β(z-1)}^{-rp}. Which is of the same form, but with rp rather than r. Thus the
corresponding Negative Binomial Distribution is infinitely divisible.]
Infinitely divisible distributions include: Poisson, Negative Binomial, Compound Poisson,
Compound Negative Binomial, Normal, Gamma, and Inverse Gaussian.57
Exercise: P(z) = {1 + q(z-1)}^m, for m a positive integer and 0 < q < 1.
Is the corresponding distribution infinitely divisible?
[Solution: P(z)^p = {1 + q(z-1)}^{mp}. Which is of the same form, but with mp rather than m. However,
unless p is integral, mp is not. Thus the corresponding distribution is not infinitely divisible. This is a
Binomial Distribution. While Binomials can be added up, they can not be divided into pieces smaller
than a Bernoulli Distribution.]
If a distribution is infinitely divisible, and one adds up independent identically distributed random
variables, then one gets a member of the same family. As has been discussed this is the case for
the Poisson and for the Negative Binomial.

55 See Definition 6.17 of Loss Models, not on the syllabus, and Section 9.2 of Loss Models, on the syllabus.
56 One can work with either the probability generating function, the moment generating function, or the
characteristic function.
57 Compound Distributions will be discussed in a subsequent section.

In particular infinitely divisible distributions are preserved under a change of exposure.58 One can find
a distribution of the same type such that when one adds up independent identical copies they add
up to the original distribution.
Exercise: Find a Poisson Distribution, such that the sum of 5 independent identical copies will be a
Poisson Distribution with λ = 3.5.
[Solution: A Poisson Distribution with λ = 3.5/5 = 0.7.]
Exercise: Find a Negative Binomial Distribution, such that the sum of 8 independent identical copies
will be a Negative Binomial Distribution with β = 0.37 and r = 1.2.
[Solution: A Negative Binomial Distribution with β = 0.37 and r = 1.2/8 = 0.15.]
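This last exercise can be verified numerically; the sketch below (illustrative only, not from the original text, assuming numpy and scipy which parameterize the distribution by (n, p) with n = r and p = 1/(1 + β)) sums 8 independent Negative Binomial draws with r = 0.15, β = 0.37 and compares the empirical frequencies to the Negative Binomial with r = 1.2.

```python
# Simulation check: the sum of 8 independent Negative Binomial draws with r = 0.15, beta = 0.37
# should again be Negative Binomial, with r = 1.2 and the same beta.
import numpy as np
from scipy.stats import nbinom

r_piece, beta = 0.15, 0.37
p = 1 / (1 + beta)
rng = np.random.default_rng(3)

sums = rng.negative_binomial(r_piece, p, size=(200_000, 8)).sum(axis=1)
for k in range(4):
    print(k, (sums == k).mean(), nbinom.pmf(k, 8 * r_piece, p))   # empirical vs. r = 1.2
```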

Distribution          Probability Generating Function, P(z)59      Infinitely Divisible
Binomial              {1 + q(z-1)}^m                               No60
Poisson               e^{λ(z-1)}                                   Yes
Negative Binomial     {1 - β(z-1)}^{-r}, z < 1 + 1/β               Yes

58 See Section 6.12 of Loss Models.
59 As shown in Appendix B, attached to the exam.
60 Since m is an integer.

A(z):61
Let a_n = Prob[x > n]. Define A(z) = Σ_{n=0}^∞ a_n z^n.
(1 - z) A(z) = Σ_{n=0}^∞ a_n z^n - Σ_{n=0}^∞ a_n z^{n+1} = a_0 - Σ_{n=1}^∞ (a_{n-1} - a_n) z^n
= 1 - p_0 - Σ_{n=1}^∞ p_n z^n = 1 - P(z).
Thus A(z) = {1 - P(z)}/(1 - z).
P(1) = 1. Therefore, as z → 1, A(z) = {P(z) - 1}/(z - 1) → P′(1) = E[X]. Thus Σ_{n=0}^∞ a_n = A(1) = E[X].
This is analogous to the result that the mean is the integral of the survival function from 0 to ∞.
For example, for a Geometric Distribution, P(z) = 1/{1 - β(z-1)}.
Thus A(z) = {1 - P(z)}/(1 - z) = {-β(z-1)}/[{1 - β(z-1)}(1 - z)] = β/(1 + β - βz). A(1) = β = mean.
Now in general, 1/(1 - x/c) = 1 + (x/c) + (x/c)² + (x/c)³ + (x/c)⁴ + ....
Thus A(z) = β/(1 + β - βz) = {β/(1+β)} · 1/{1 - βz/(1+β)} =
{β/(1+β)} {1 + βz/(1+β) + {βz/(1+β)}² + {βz/(1+β)}³ + {βz/(1+β)}⁴ + ....} =
β/(1+β) + z{β/(1+β)}² + z²{β/(1+β)}³ + z³{β/(1+β)}⁴ + z⁴{β/(1+β)}⁵ + ...
Thus matching up coefficients of z^n, we have: a_n = {β/(1+β)}^{n+1}.
Thus for the Geometric, Prob[x > n] = {β/(1+β)}^{n+1}, a result that has been discussed previously.

61 See Exercise 6.34 in Loss Models.
Problems:
9.1 (2 points) The number of claims, N, made on an insurance portfolio follows the following
distribution:
n      Pr(N=n)
0      0.35
1      0.25
2      0.20
3      0.15
4      0.05
What is the Probability Generating Function, P(z)?
A. 1 + 0.65z + 0.4z² + 0.2z³ + 0.05z⁴
B. 0.35 + 0.6z + 0.8z² + 0.95z³ + z⁴
C. 0.35 + 0.25z + 0.2z² + 0.15z³ + 0.05z⁴
D. 0.65 + 0.75z + 0.8z² + 0.85z³ + 0.95z⁴
E. None of A, B, C, or D.
9.2 (1 point) For a Poisson Distribution with λ = 0.3, what is the Probability Generating Function at
5?
A. less than 3
B. at least 3 but less than 4
C. at least 4 but less than 5
D. at least 5 but less than 6
E. at least 6
9.3 (1 point) Which of the following distributions is not infinitely divisible?
A. Binomial
B. Poisson
C. Negative Binomial
D. Normal
E. Gamma

9.4 (3 points) Given the Probability Generating Function, P(z) = (e^{0.4z} - 1)/(e^{0.4} - 1),
what is the density at 3 for the corresponding frequency distribution?
A. 1/2%   B. 1%   C. 2%   D. 3%   E. 4%

9.5 (5 points) You are given the following data on the number of runs scored during half innings of
major league baseball games from 1980 to 1998:
Runs   Number of Occurrences
0      518,228
1      105,070
2      47,936
3      21,673
4      9736
5      4033
6      1689
7      639
8      274
9      107
10     36
11     25
12     5
13     7
14     1
15     0
16     1
Total  709,460
With the aid of a computer, from z = -2.5 to z = 2.5, graph P(z), the probability generating function of
the empirical model corresponding to this data.
9.6 (2 points) A variable B has probability generating function P(z) = 0.8z² + 0.2z⁴.
A variable C has probability generating function P(z) = 0.7z + 0.3z⁵.
B and C are independent.
What is the probability generating function of B + C?
A. 1.5z³ + 0.9z⁵ + 1.1z⁷ + 0.5z⁹
B. 0.25z³ + 0.25z⁵ + 0.25z⁷ + 0.25z⁹
C. 0.06z³ + 0.24z⁵ + 0.14z⁷ + 0.56z⁹
D. 0.56z³ + 0.14z⁵ + 0.24z⁷ + 0.06z⁹
E. None of A, B, C, or D.
9.7 (1 point) Given the Probability Generating Function, P(z) = 0.5z + 0.3z² + 0.2z⁴, what is the
density at 2 for the corresponding frequency distribution?
A. 0.2
B. 0.3
C. 0.4
D. 0.5
E. 0.6

9.8 (1 point) For a Binomial Distribution with m = 4 and q = 0.7, what is the Probability Generating
Function at 10?
A. less than 1000
B. at least 1000 but less than 1500
C. at least 1500 but less than 2000
D. at least 2000 but less than 2500
E. at least 2500
9.9 (1 point) N follows a Poisson Distribution with λ = 5.6. Determine E[3^N].
A. 10,000   B. 25,000   C. 50,000   D. 75,000   E. 100,000
9.10 (7 points) A frequency distribution has P(z) = 1 - (1-z)^{-r},
where r is a parameter between 0 and -1.
(a) (3 points) Determine the density at 0, 1, 2, 3, etc.
(b) (1 point) Determine the mean.
(c) (2 points) Let a_n = Prob[x > n]. Define A(z) = Σ_{n=0}^∞ a_n z^n.
Show that in general, A(z) = {1 - P(z)}/(1 - z).
(d) (2 points) Using the result in part (c), show that for this distribution,
a_n = ((n + r) choose n) = (r+1)(r+2)....(r+n)/n!.

9.11 (2, 2/96, Q.15) (1.7 points) Let X1,..., Xn be independent Poisson random variables with
expectations λ1, . . . , λn, respectively. Let Y = c Σ_{i=1}^n Xi, where c is a constant.
Determine the probability generating function of Y.
A. exp[(zc + z²c²/2) Σ_{i=1}^n λi]
B. exp[(zc - 1) Σ_{i=1}^n λi]
C. exp[zc Σ_{i=1}^n λi + (z²c²/2) Σ_{i=1}^n λi²]
D. exp[(z^c - 1) Σ_{i=1}^n λi]
E. (z^c - 1)^n Σ_{i=1}^n λi

9.12 (IOA 101, 4/00, Q.10) (4.5 points) Under a particular model for the evolution of the size of
a population over time, the probability generating function of Xt , the size at time t, is given by:
P(z) = {z + λt(1 - z)}/{1 + λt(1 - z)}, λ > 0.
If the population dies out, it remains in this extinct state for ever.
(i) (2.25 points) Determine the expected size of the population at time t.
(ii) (1.5 points) Determine the probability that the population has become extinct by time t.
(iii) (0.75 points) Comment briefly on the future prospects for the population.
9.13 (IOA 101, 9/01, Q.2) (1.5 points) Let X1 and X2 be independent Poisson random variables
with respective means λ1 and λ2. Determine the probability generating function of
X1 + X2 and hence state the distribution of X1 + X2 .

Solutions to Problems:
9.1. C. P(z) = E[z^n] = (0.35)(z^0) + (0.25)(z^1) + (0.20)(z^2) + (0.15)(z^3) + (0.05)(z^4) =
0.35 + 0.25z + 0.2z^2 + 0.15z^3 + 0.05z^4.
9.2. B. As shown in the Appendix B attached to the exam, for a Poisson Distribution
P(z) = e^(λ(z-1)). P(5) = e^(4λ) = e^1.2 = 3.32.
9.3. A. The Binomial is not infinitely divisible.
Comment: In the Binomial m is an integer. For m = 1 one has a Bernoulli. One can not divide a
Bernoulli into smaller pieces.
9.4. C. P(z) = (e^(0.4z) - 1)/(e^0.4 - 1). P'(z) = 0.4e^(0.4z)/(e^0.4 - 1). P''(z) = 0.16e^(0.4z)/(e^0.4 - 1).
P'''(z) = 0.064e^(0.4z)/(e^0.4 - 1). f(3) = (d^3 P(z)/dz^3 at z = 0) / 3! = {0.064/(e^0.4 - 1)}/6 = 2.17%.
Comment: This is a zero-truncated Poisson Distribution with λ = 0.4, not on the syllabus.
9.5. P(z) = {518,228 + 105,070z + 47,936z^2 + ... + z^16} / 709,460.
Here is a graph of P(z) for z from -2.5 to 2.5: [Graph of the probability generating function P(z).]
Comment: For example, P(-2) = 0.599825, and P(2) = 2.73582.
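For those who want to reproduce this graph with a computer, here is a minimal Python sketch (my own illustration, not part of the solution; it assumes numpy and matplotlib are available). It builds the empirical p.g.f. from the claim counts and evaluates it over the requested range.

import numpy as np
import matplotlib.pyplot as plt

counts = [518228, 105070, 47936, 21673, 9736, 4033, 1689, 639,
          274, 107, 36, 25, 5, 7, 1, 0, 1]          # claims 0 through 16
total = sum(counts)                                  # 709,460
f = [c / total for c in counts]                      # empirical densities

def pgf(z):
    # P(z) = sum over n of f(n) z^n
    return sum(fn * z**n for n, fn in enumerate(f))

z = np.linspace(-2.5, 2.5, 501)
plt.plot(z, [pgf(zi) for zi in z])
plt.xlabel("z"); plt.ylabel("P(z)")
plt.show()

print(round(pgf(-2), 6), round(pgf(2), 5))           # about 0.599825 and 2.73582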


9.6. D. The probability generating function of a sum of independent variables is the product of the
probability generating functions.
P_{B+C}(z) = P_B(z)P_C(z) = (0.8z^2 + 0.2z^4)(0.7z + 0.3z^5) = 0.56z^3 + 0.14z^5 + 0.24z^7 + 0.06z^9.
Alternately, B has 80% probability of being 2 and 20% probability of being 4.
C has 70% probability of being 1 and 30% probability of being 5.
Therefore, B + C has: (80%)(70%) = 56% chance of being 1 + 2 = 3,
(20%)(70%) = 14% chance of being 4 + 1 = 5, (80%)(30%) = 24% chance of being 2 + 5 = 7,
and (20%)(30%) = 6% chance of being 4 + 5 = 9.
P_{B+C}(z) = 0.56z^3 + 0.14z^5 + 0.24z^7 + 0.06z^9.
Comment: An example of a convolution.
9.7. B. P(z) = Expected Value of z^n = Σ f(n) z^n. Thus f(2) = the coefficient of z^2 = 0.3.
Alternately, P(z) = 0.5z + 0.3z^2 + 0.2z^4. P'(z) = 0.5 + 0.6z + 0.8z^3. P''(z) = 0.6 + 2.4z^2.
f(2) = (d^2 P(z)/dz^2 at z = 0) / 2! = 0.6/2 = 0.3.
9.8. E. As shown in the Appendix B attached to the exam, for a Binomial Distribution
P(z) = {1 + q(z-1)}^m = {1 + (0.7)(z-1)}^4. P(10) = {1 + (0.7)(9)}^4 = 2840.
9.9. D. The p.g.f. of the Poisson Distribution is: P(z) = e^(λ(z-1)) = e^(5.6(z-1)).
E[3^N] = P(3) = e^(5.6(3-1)) = e^11.2 = 73,130.


9.10. (a) f(0) = P(0) = 1 - (1-0)^(-r) = 0.
P'(z) = -r(1-z)^(-(r+1)). f(1) = P'(0) = -r.
P''(z) = -r(r+1)(1-z)^(-(r+2)). f(2) = P''(0)/2 = -r(r+1)/2.
P'''(z) = -r(r+1)(r+2)(1-z)^(-(r+3)). f(3) = P'''(0)/3! = -r(r+1)(r+2)/6.
f(x) = -r(r+1) ... (r+x-1)/x! = -Γ[x+r] / {Γ[x+1] Γ[r]}, x = 1, 2, 3, ...
(b) Mean = P'(1) = infinity. The densities go to zero too slowly; thus there is no finite mean.
(c) (1 - z)A(z) = Σ_{n=0}^∞ a_n z^n - Σ_{n=0}^∞ a_n z^(n+1) = a_0 - Σ_{n=1}^∞ (a_{n-1} - a_n) z^n
= 1 - p_0 - Σ_{n=1}^∞ p_n z^n = 1 - P(z).
Thus A(z) = {1 - P(z)} / (1 - z).
(d) Thus for this distribution A(z) = (1-z)^(-r) / (1-z) = (1-z)^(-(r+1)) = Σ_{n=0}^∞ (n+r choose n) z^n,
from the Taylor series.
Thus since A(z) = Σ_{n=0}^∞ a_n z^n, a_n = (n+r choose n) = (r+1)(r+2) .... (r+n) / n!.
Comment: This is called a Sibuya frequency distribution.
It is the limit of an Extended Zero-Truncated Negative Binomial Distribution, as β → ∞.
See Exercises 6.13, 6.34, and 8.32 in Loss Models.
For r = -0.7, the densities at 1 through 10: 0.7, 0.105, 0.0455, 0.0261625, 0.0172672, 0.0123749,
0.00936954, 0.00737851, 0.00598479, 0.00496738.
For r = -0.7, Prob[n > 10] = (0.3)(1.3) ... (9.3) / 10! = 0.065995.
P(1) = 1. Therefore, as z → 1, A(z) = {1 - P(z)} / (1 - z) → P'(1) = E[X].
Thus Σ_{n=0}^∞ a_n = A(1) = E[X].
This is analogous to the result that the mean is the integral of the survival function from 0 to ∞.
For this distribution, Mean = A(1) = (1 - 1)^(-(r+1)) = ∞.
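As a check on the densities listed above, one can compute them recursively on a computer. Here is a minimal Python sketch (my own, not part of the solution); it uses f(1) = -r and the ratio f(x+1)/f(x) = (x+r)/(x+1), which follows from f(x) = -r(r+1)...(r+x-1)/x!.

r = -0.7
f = {0: 0.0, 1: -r}
for x in range(1, 10):
    f[x + 1] = f[x] * (x + r) / (x + 1)       # f(x+1) = f(x) (x + r)/(x + 1)

print([round(f[x], 7) for x in range(1, 11)])  # 0.7, 0.105, 0.0455, 0.0261625, ...

# Prob[n > 10] = 1 - {f(0) + ... + f(10)}; should be about 0.065995
print(round(1 - sum(f[x] for x in range(0, 11)), 6))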


9.11. D. For each Poisson, the probability generating function is: P(z) = exp[λ_i(z-1)].
Multiplying a variable by a constant: P_cX[z] = E[z^(cX)] = E[(z^c)^X] = P_X[z^c].
For each Poisson times c, the p.g.f. is: exp[λ_i(z^c - 1)].
The p.g.f. of the sum of variables is a product of the p.g.f.s: P_Y(z) = exp[(z^c - 1) Σ_{i=1}^n λ_i].
Comment: Multiplying a Poisson variable by a constant does not result in another Poisson; rather it
results in what is called an Over-Dispersed Poisson Distribution.
Since Var[cX]/E[cX] = c Var[X]/E[X], for a constant c > 1, the Over-Dispersed Poisson Distribution
has a variance greater than its mean. See for example "A Primer on the Exponential Family of
Distributions," by David R. Clark and Charles Thayer, CAS 2004 Discussion Paper Program.
9.12. (i) P'(z) = ({1 - λt}{1 + λt(1-z)} + λt{z + λt(1-z)}) / {1 + λt(1-z)}^2 = 1 / {1 + λt(1-z)}^2.
E[X] = P'(1) = 1. The expected size of the population is 1 regardless of time.
(ii) f(0) = P(0) = λt/(1 + λt). This is the probability of extinction by time t.
The probability of survival to time t is: 1 - λt/(1 + λt) = 1/(1 + λt) = (1/λ)/{(1/λ) + t},
the survival function of a Pareto Distribution with α = 1 and θ = 1/λ.
(iii) As t approaches infinity, the probability of survival approaches zero.
Comment: P''(z) = 2λt/{1 + λt(1-z)}^3. E[X(X-1)] = P''(1) = 2λt.
E[X^2] = E[X] + 2λt = 1 + 2λt. Var[X] = 1 + 2λt - 1^2 = 2λt.


9.13. P1(z) = exp[λ1(z-1)]. P2(z) = exp[λ2(z-1)].
Since X1 and X2 are independent, the probability generating function of X1 + X2 is:
P1(z)P2(z) = exp[λ1(z-1) + λ2(z-1)] = exp[(λ1 + λ2)(z-1)].
This is the probability generating function of a Poisson with mean λ1 + λ2, which must therefore be
the distribution of X1 + X2.


Section 10, Factorial Moments


When working with frequency distributions, in addition to moments around the origin and central
moments, one sometimes uses factorial moments. The nth factorial moment is the expected value
of the product of n factors: X(X-1)...(X+1-n).
μ(n) = E[X(X-1) ... (X+1-n)].62
So for example, μ(1) = E[X], μ(2) = E[X(X-1)], μ(3) = E[X(X-1)(X-2)].
Exercise: What is the second factorial moment of a Binomial Distribution with parameters
m = 4 and q = 0.3?
[Solution: The density function is:
f(0) = 0.7^4, f(1) = (4)(0.7^3)(0.3), f(2) = (6)(0.7^2)(0.3^2), f(3) = (4)(0.7)(0.3)^3, f(4) = 0.3^4.
E[X(X-1)] = (0)(-1)f(0) + (1)(0)f(1) + (2)(1)f(2) + (3)(2)f(3) + (4)(3)f(4) =
(12)(0.7^2)(0.3^2) + (24)(0.7)(0.3^3) + 12(0.3^4) = 12(0.3^2){0.7^2 + (2)(0.7)(0.3) + 0.3^2} =
(12)(0.3^2)(0.7 + 0.3)^2 = (12)(0.3^2) = 1.08.]
The factorial moments are related to the moments about the origin as follows:63
μ(1) = μ1' = μ
μ(2) = μ2' - μ1'
μ(3) = μ3' - 3μ2' + 2μ1'
μ(4) = μ4' - 6μ3' + 11μ2' - 6μ1'
The moments about the origin are related to the factorial moments as follows:
μ1' = μ(1) = μ
μ2' = μ(2) + μ(1)
μ3' = μ(3) + 3μ(2) + μ(1)
μ4' = μ(4) + 6μ(3) + 7μ(2) + μ(1)
Note that one can use the factorial moments to compute the variance, etc.
62 See the first page of Appendix B of Loss Models.
63 Moments about the origin are sometimes referred to as raw moments.


For example, for a Binomial Distribution with m = 4 and q = 0.3, the mean is mq = 1.2, while the
second factorial moment was computed to be 1.08. Thus the second moment around the origin is
μ2' = μ(2) + μ(1) = μ(2) + μ = 1.08 + 1.2 = 2.28. Thus the variance is 2.28 - 1.2^2 = 0.84.
This in fact equals mq(1-q) = (4)(0.3)(0.7) = 0.84.
In general the variance (the second central moment) is related to the factorial moments as follows:
variance = μ2 = μ2' - μ1'^2 = μ(2) + μ(1) - μ(1)^2.
Using the Probability Generating Function to Get Factorial Moments:
One can use the Probability Generating Function to get the factorial moments.
To get the nth factorial moment, one differentiates the p.g.f. n times and sets z = 1:
μ(n) = d^n P(z) / dz^n at z = 1 = P^(n)(1).
So for example, μ(1) = E[X] = P'(1), and μ(2) = E[X(X-1)] = P''(1).64
Exercise: Given that the p.g.f. of a Poisson Distribution is e^(λ(z-1)), determine its first four factorial
moments.
[Solution: P(z) = e^(λ(z-1)) = e^(λz) e^(-λ). P'(z) = λe^(λz) e^(-λ). P''(z) = λ^2 e^(λz) e^(-λ).
P'''(z) = λ^3 e^(λz) e^(-λ). P''''(z) = λ^4 e^(λz) e^(-λ).
μ(1) = P'(1) = λe^λ e^(-λ) = λ. μ(2) = P''(1) = λ^2 e^λ e^(-λ) = λ^2.
μ(3) = P'''(1) = λ^3. μ(4) = P''''(1) = λ^4.
Comment: For the Poisson Distribution, μ(n) = λ^n.]
Exercise: Using the first four factorial moments of a Poisson Distribution, determine the first four
moments of a Poisson Distribution.
[Solution: μ1' = μ(1) = λ.
μ2' = μ(2) + μ(1) = λ^2 + λ.
μ3' = μ(3) + 3μ(2) + μ(1) = λ^3 + 3λ^2 + λ.
μ4' = μ(4) + 6μ(3) + 7μ(2) + μ(1) = λ^4 + 6λ^3 + 7λ^2 + λ.]
64 Exercise 6.1 in Loss Models.
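A quick numerical check of μ(n) = λ^n is easy to do by brute force on a computer. Here is a minimal Python sketch (my own illustration; the value of λ is arbitrary) that computes E[X(X-1)...(X-n+1)] directly from the Poisson densities and compares it to λ^n.

from math import exp, factorial

lam = 5.6                            # arbitrary example value of lambda

def poisson_density(x, lam):
    return exp(-lam) * lam**x / factorial(x)

def factorial_moment(n, lam, x_max=200):
    total = 0.0
    for x in range(x_max + 1):
        term = 1.0
        for k in range(n):           # x(x-1)...(x-n+1)
            term *= (x - k)
        total += term * poisson_density(x, lam)
    return total

for n in range(1, 5):
    print(n, round(factorial_moment(n, lam), 4), round(lam**n, 4))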


Exercise: Using the first four moments of a Poisson Distribution, determine its coefficient of variation,
skewness, and kurtosis.
[Solution: variance = μ2' - μ1'^2 = λ^2 + λ - λ^2 = λ.
Coefficient of variation = standard deviation / mean = √λ / λ = 1/√λ.
third central moment = μ3' - 3μ1'μ2' + 2μ1'^3 = λ^3 + 3λ^2 + λ - 3λ(λ^2 + λ) + 2λ^3 = λ.
skewness = third central moment / variance^1.5 = λ/λ^1.5 = 1/√λ.
fourth central moment = μ4' - 4μ1'μ3' + 6μ1'^2 μ2' - 3μ1'^4 =
λ^4 + 6λ^3 + 7λ^2 + λ - 4λ(λ^3 + 3λ^2 + λ) + 6λ^2(λ^2 + λ) - 3λ^4 = 3λ^2 + λ.
kurtosis = fourth central moment / variance^2 = (3λ^2 + λ)/λ^2 = 3 + 1/λ.
Comment: While there is a possibility you might use the skewness of the Poisson Distribution, you
are extremely unlikely to ever use the kurtosis of the Poisson Distribution!
Kurtosis is discussed in Mahler's Guide to Loss Distributions.
As lambda approaches infinity, the kurtosis of a Poisson approaches 3, that of a Normal Distribution.
As lambda approaches infinity, the Poisson approaches a Normal Distribution.]
Exercise: Derive the p.g.f. of the Geometric Distribution and use it to determine the variance.
[Solution: P(z) = Expected Value of z^n = Σ_{n=0}^∞ f(n) z^n = Σ_{n=0}^∞ {β^n / (1+β)^(n+1)} z^n
= {1/(1+β)} Σ_{n=0}^∞ {βz/(1+β)}^n = {1/(1+β)} 1/{1 - βz/(1+β)} = 1/{1 - β(z-1)}, z < 1 + 1/β.
P'(z) = β/{1 - β(z-1)}^2. P''(z) = 2β^2/{1 - β(z-1)}^3. μ(1) = P'(1) = β. μ(2) = P''(1) = 2β^2.
Thus the variance of the Geometric distribution is: μ(2) + μ(1) - μ(1)^2 = 2β^2 + β - β^2 = β(1+β).]


Formulas for the (a, b, 0) class:65
One can use iteration to calculate the factorial moments of a member of the (a, b, 0) class.66
μ(1) = (a + b)/(1-a).
μ(n) = (an + b) μ(n-1) / (1-a).
Exercise: Use the above formulas to compute the first three factorial moments of a
Negative Binomial Distribution.
[Solution: For a Negative Binomial Distribution: a = β/(1+β) and b = (r-1)β/(1+β).
μ(1) = (a + b)/(1 - a) = {rβ/(1+β)} / {1/(1+β)} = rβ.
μ(2) = (2a + b)μ(1)/(1 - a) = {(r+1)β/(1+β)} rβ / {1/(1+β)} = r(r+1)β^2.
μ(3) = (3a + b)μ(2)/(1 - a) = {(r+2)β/(1+β)} r(r+1)β^2 / {1/(1+β)} = r(r+1)(r+2)β^3.]
In general, the nth factorial moment of a Negative Binomial Distribution is:
μ(n) = r(r+1)...(r+n-1)β^n.
Exercise: Use the first three factorial moments to compute the first three moments about the origin of
a Negative Binomial Distribution.
[Solution: μ1' = μ(1) = rβ. μ2' = μ(2) + μ(1) = r(r+1)β^2 + rβ.
μ3' = μ(3) + 3μ(2) + μ(1) = r(r+1)(r+2)β^3 + 3r(r+1)β^2 + rβ.]
Exercise: Use the first two moments about the origin of a Negative Binomial Distribution to compute
its variance.
[Solution: The variance of the Negative Binomial is μ2' - μ1'^2 = r(r+1)β^2 + rβ - (rβ)^2 = rβ(1+β).]
Exercise: Use the first three moments about the origin of a Negative Binomial Distribution to
compute its skewness.
[Solution: Third central moment = μ3' - 3μ1'μ2' + 2μ1'^3 =
r(r+1)(r+2)β^3 + 3r(r+1)β^2 + rβ - (3)(rβ){r(r+1)β^2 + rβ} + 2(rβ)^3 = 2rβ^3 + 3rβ^2 + rβ.
Variance = rβ(1+β).
Therefore, skewness = (2rβ^3 + 3rβ^2 + rβ) / {rβ(1+β)}^1.5 = (1 + 2β) / √(rβ(1+β)).]
65 The (a, b, 0) class will be discussed subsequently.
66 See Appendix B.2 of Loss Models.
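The recursion μ(n) = (an + b)μ(n-1)/(1 - a) is straightforward to code. Here is a minimal Python sketch (my own, with illustrative r and β) comparing it to the closed form r(r+1)...(r+n-1)β^n for a Negative Binomial.

r, beta = 9, 7/3                           # illustrative parameters
a = beta / (1 + beta)
b = (r - 1) * beta / (1 + beta)

mu = {1: (a + b) / (1 - a)}                # mu(1) = (a+b)/(1-a)
for n in range(2, 5):
    mu[n] = (a * n + b) * mu[n - 1] / (1 - a)

def closed_form(n):
    prod = 1.0
    for k in range(n):
        prod *= (r + k) * beta             # r(r+1)...(r+n-1) beta^n
    return prod

for n in range(1, 5):
    print(n, round(mu[n], 2), round(closed_form(n), 2))
# n = 4 reproduces the 352,147 that appears in problem 10.5 below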


One can derive that for any member of the (a, b, 0) class, the variance = (a+b)/(1-a)^2.67
For example, for the Negative Binomial Distribution, a = β/(1+β) and b = (r-1)β/(1+β):
variance = (a+b)/(1-a)^2 = {rβ/(1+β)} / {1/(1+β)^2} = rβ(1+β).
The derivation is as follows:
μ(1) = (a+b)/(1-a). μ(2) = (2a+b)μ(1)/(1-a) = (2a+b)(a+b)/(1-a)^2.
μ2' = μ(2) + μ(1) = (2a+b)(a+b)/(1-a)^2 + (a+b)/(1-a) = (a+b+1)(a+b)/(1-a)^2.
variance = μ2' - μ1'^2 = (a+b+1)(a+b)/(1-a)^2 - {(a+b)/(1-a)}^2 = (a+b)/(1-a)^2.
Exercise: Use the above formula for the variance of a member of the (a, b, 0) class to compute the
variance of a Binomial Distribution.
[Solution: For the Binomial, a = -q/(1-q) and b = (m+1)q/(1-q).
variance = (a+b)/(1-a)^2 = {mq/(1-q)} / {1/(1-q)}^2 = mq(1-q).]

Distribution            nth Factorial Moment
Binomial                m(m-1)...(m+1-n) q^n for n ≤ m, 0 for n > m
Poisson                 λ^n
Negative Binomial       r(r+1)...(r+n-1) β^n

67 See Appendix B.2 of Loss Models.


Problems:
10.1 (2 points) The number of claims, N, made on an insurance portfolio follows the following
distribution:
n          Pr(N = n)
0          0.3
1          0.3
2          0.2
3          0.1
4          0.1
What is the second factorial moment of N?
A. 1.6
B. 1.8
C. 2.0
D. 2.2
E. 2.4
10.2 (3 points) Determine the third moment of a Poisson Distribution with λ = 5.
A. less than 140
B. at least 140 but less than 160
C. at least 160 but less than 180
D. at least 180 but less than 200
E. at least 200
10.3 (2 points) The random variable X has a Binomial distribution with parameters q and m = 8.
Determine the expected value of X(X -1)(X - 2).
A. 512

B. 512q(q-1)(q-2)

C. q(q-1)(q-2)

D. q^3

E. None of A, B, C, or D

10.4 (2 points) You are given the following information about the probability generating function for
a discrete distribution:

P'(1) = 10
P"(1) = 98
Calculate the variance of the distribution.
A. 8
B. 10
C. 12
D. 14

E. 16


10.5 (3 points) The random variable X has a Negative Binomial distribution with parameters
β = 7/3 and r = 9. Determine the expected value of X(X-1)(X-2)(X-3).
A. less than 200,000
B. at least 200,000 but less than 300,000
C. at least 300,000 but less than 400,000
D. at least 400,000 but less than 500,000
E. at least 500,000
10.6 (3 points) Determine the third moment of a Binomial Distribution with m = 10 and q = 0.3.
A. less than 40
B. at least 40 but less than 50
C. at least 50 but less than 60
D. at least 60 but less than 70
E. at least 70
10.7 (3 points) Determine the third moment of a Negative Binomial Distribution with r = 10 and
β = 3.
A. less than 36,000
B. at least 36,000 but less than 38,000
C. at least 38,000 but less than 40,000
D. at least 40,000 but less than 42,000
E. at least 42,000
10.8 (4B, 11/97, Q.21) (2 points) The random variable X has a Poisson distribution with mean λ.
Determine the expected value of X(X-1)...(X-9).
A. 1     B. λ     C. λ(λ-1)...(λ-9)     D. λ^10     E. λ(λ+1)...(λ+9)

10.9 (CAS3, 11/06, Q.25) (2.5 points) You are given the following information about the
probability generating function for a discrete distribution:

P'(1) = 2
P"(1) = 6
Calculate the variance of the distribution.
A. Less than 1.5
B. At least 1.5, but less than 2.5
C. At least 2.5, but less than 3.5
D. At least 3.5, but less than 4.5
E. At least 4.5


Solutions to Problems:
10.1. D. The 2nd factorial moment is:
E[N(N-1)] = (.3)(0)(-1) + (.3)(1)(0) + (.2)(2)(1) + (.1)(3)(2) + (.1)(4)(3) = 2.2.


10.2. E. The factorial moments for a Poisson are: λ^n. mean = first factorial moment = λ = 5.
Second factorial moment = 5^2 = 25 = E[X(X-1)] = E[X^2] - E[X]. E[X^2] = 25 + 5 = 30.
Third factorial moment = 5^3 = 125 = E[X(X-1)(X-2)] = E[X^3] - 3E[X^2] + 2E[X].
E[X^3] = 125 + (3)(30) - (2)(5) = 205.
Alternately, for the Poisson P(z) = e^(λ(z-1)).
P'(z) = λe^(λ(z-1)). P''(z) = λ^2 e^(λ(z-1)). P'''(z) = λ^3 e^(λ(z-1)).
mean = first factorial moment = P'(1) = λ. Second factorial moment = P''(1) = λ^2.
Third factorial moment = P'''(1) = λ^3. Proceed as before.
Alternately, the skewness of a Poisson is 1/√λ.
Since the variance is λ, the third central moment is: λ^1.5 / √λ = λ.
λ = E[(X - λ)^3] = E[X^3] - 3λE[X^2] + 3λ^2 E[X] - λ^3.
E[X^3] = λ + 3λE[X^2] - 3λ^2 E[X] + λ^3 = λ + 3λ(λ + λ^2) - 3λ^3 + λ^3 = λ^3 + 3λ^2 + λ
= 125 + 75 + 5 = 205.
Comment: One could compute enough of the densities and then calculate the third moment:

Number      Probability         Probability x    Probability x        Probability x
of Claims   Density Function    # of Claims      Square of # Claims   Cube of # Claims
0           0.674%              0.00000          0.00000              0.00000
1           3.369%              0.03369          0.03369              0.03369
2           8.422%              0.16845          0.33690              0.67379
3           14.037%             0.42112          1.26337              3.79010
4           17.547%             0.70187          2.80748              11.22991
5           17.547%             0.87734          4.38668              21.93342
6           14.622%             0.87734          5.26402              31.58413
7           10.444%             0.73111          5.11780              35.82459
8           6.528%              0.52222          4.17779              33.42236
9           3.627%              0.32639          2.93751              26.43761
10          1.813%              0.18133          1.81328              18.13279
11          0.824%              0.09066          0.99730              10.97034
12          0.343%              0.04121          0.49453              5.93437
13          0.132%              0.01717          0.22323              2.90193
14          0.047%              0.00660          0.09246              1.29444
15          0.016%              0.00236          0.03538              0.53070
16          0.005%              0.00079          0.01258              0.20127
17          0.001%              0.00025          0.00418              0.07101
Sum         0.99999458366       4.99990          29.99818             204.96644

10.3. E. f(x) = [8! / {x! (8-x)!}] q^x (1-q)^(8-x), for x = 0 to 8.
E[X(X-1)(X-2)] = Σ_{x=0}^8 x(x-1)(x-2) f(x) = Σ_{x=3}^8 x(x-1)(x-2) [8! / {x! (8-x)!}] q^x (1-q)^(8-x)
= (8)(7)(6) q^3 Σ_{x=3}^8 [5! / {(x-3)! (8-x)!}] q^(x-3) (1-q)^(8-x)
= 336q^3 Σ_{y=0}^5 [5! / {y! (5-y)!}] q^y (1-q)^(5-y) = 336q^3.
Alternately, the 3rd factorial moment is the 3rd derivative of the p.g.f. at z = 1.
For the Binomial: P(z) = {1 + q(z-1)}^m. dP/dz = mq{1 + q(z-1)}^(m-1).
P''(z) = m(m-1)q^2 {1 + q(z-1)}^(m-2). P'''(z) = m(m-1)(m-2)q^3 {1 + q(z-1)}^(m-3).
P'''(1) = m(m-1)(m-2)q^3 = (8)(7)(6)q^3 = 336q^3.
Comment: Note that the product x(x-1)(x-2) is zero for x = 0, 1 and 2, so only terms for x ≥ 3
contribute to the sum. Then a change of variables is made: y = x - 3. Then the resulting sum is the
sum of Binomial terms from y = 0 to 5, which sum is one, since the Binomial is a Distribution, with
support in this case 0 to 5. The expected value of X(X-1)(X-2) is an example of what is referred
to as a factorial moment.
In the case of the Binomial, the kth factorial moment for k ≤ m is:
q^k (m!)/(m-k)! = q^k (m)(m-1)...(m-(k-1)). In our case we have the 3rd factorial moment (involving the
product of 3 terms) equal to: q^3 (m)(m-1)(m-2).
10.4. A. 10 = P'(1) = E[N]. 98 = P"(1) = E[N(N-1)] = E[N2 ] - E[N]. E[N2 ] = 98 + 10 = 108.
Var[N] = E[N2 ] - E[N]2 = 108 - 102 = 8.
Comment: Similar to CAS3, 11/06, Q.25.

10.5. C. f(x) = [(x+8)! / {x! 8!}] (7/3)^x / (1 + 7/3)^(x+9).
E[X(X-1)(X-2)(X-3)] = Σ_{x=0}^∞ x(x-1)(x-2)(x-3) f(x)
= Σ_{x=4}^∞ x(x-1)(x-2)(x-3) [(x+8)! / {x! 8!}] (7/3)^x / (1 + 7/3)^(x+9)
= (12)(11)(10)(9)(7/3)^4 Σ_{x=4}^∞ [(x+8)! / {(x-4)! 12!}] (7/3)^(x-4) / (1 + 7/3)^(x+9)
= 352,147 Σ_{y=0}^∞ [(y+12)! / {y! 12!}] (7/3)^y / (1 + 7/3)^(y+13) = 352,147.
Alternately, the 4th factorial moment is the 4th derivative of the p.g.f. at z = 1.
For the Negative Binomial: P(z) = {1 - β(z-1)}^(-r). dP/dz = rβ{1 - β(z-1)}^(-(r+1)).
P''(z) = r(r+1)β^2 {1 - β(z-1)}^(-(r+2)). P'''(z) = r(r+1)(r+2)β^3 {1 - β(z-1)}^(-(r+3)).
P''''(z) = r(r+1)(r+2)(r+3)β^4 {1 - β(z-1)}^(-(r+4)).
P''''(1) = r(r+1)(r+2)(r+3)β^4 = (9)(10)(11)(12)(7/3)^4 = 352,147.
Comments: Note that the product x(x-1)(x-2)(x-3) is zero for x = 0, 1, 2 and 3, so only terms for x ≥ 4
contribute to the sum. Then a change of variables is made: y = x - 4. Then the resulting sum is the
sum of Negative Binomial terms, with β = 7/3 and r = 13, from y = 0 to infinity, which sum is one,
since the Negative Binomial is a Distribution with support from 0 to ∞.
The expected value of X(X-1)(X-2)(X-3) is an example of a factorial moment.
In the case of the Negative Binomial, the mth factorial moment is: β^m (r)(r+1)...(r+m-1).
In our case we have the 4th factorial moment (involving the product of 4 terms) equal to:
β^4 (r)(r+1)(r+2)(r+3), with β = 7/3 and r = 9.


10.6. B. P(z) = {1 + q(z-1)}^m = {1 + 0.3(z-1)}^10 = {0.7 + 0.3z}^10.
P'(z) = (10)(0.3){0.7 + 0.3z}^9. P''(z) = (3)(2.7){0.7 + 0.3z}^8. P'''(z) = (3)(2.7)(2.4){0.7 + 0.3z}^7.
mean = first factorial moment = P'(1) = 3.
Second factorial moment = P''(1) = (3)(2.7) = 8.1.
Third factorial moment = P'''(1) = (3)(2.7)(2.4) = 19.44.
Second factorial moment = 8.1 = E[X(X-1)] = E[X^2] - E[X]. E[X^2] = 8.1 + 3 = 11.1.
Third factorial moment = 19.44 = E[X(X-1)(X-2)] = E[X^3] - 3E[X^2] + 2E[X].
E[X^3] = 19.44 + (3)(11.1) - (2)(3) = 46.74.
Comment: E[X^2] = variance + mean^2 = 2.1 + 3^2 = 11.1.
One could compute all of the densities and then calculate the third moment:

Number      Probability         Probability x    Probability x        Probability x
of Claims   Density Function    # of Claims      Square of # Claims   Cube of # Claims
0           2.825%              0.00000          0.00000              0.00000
1           12.106%             0.12106          0.12106              0.12106
2           23.347%             0.46695          0.93390              1.86780
3           26.683%             0.80048          2.40145              7.20435
4           20.012%             0.80048          3.20194              12.80774
5           10.292%             0.51460          2.57298              12.86492
6           3.676%              0.22054          1.32325              7.93949
7           0.900%              0.06301          0.44108              3.08758
8           0.145%              0.01157          0.09259              0.74071
9           0.014%              0.00124          0.01116              0.10044
10          0.001%              0.00006          0.00059              0.00590
Sum                             3.00000          11.10000             46.74000

10.7. C. P(z) = {1 - β(z-1)}^(-r) = {1 - 3(z-1)}^(-10) = (4 - 3z)^(-10).
P'(z) = (-10)(-3)(4 - 3z)^(-11) = 30(4 - 3z)^(-11). P''(z) = (30)(33)(4 - 3z)^(-12).
P'''(z) = (30)(33)(36)(4 - 3z)^(-13).
mean = first factorial moment = P'(1) = 30.
Second factorial moment = P''(1) = (30)(33) = 990.
Third factorial moment = P'''(1) = (30)(33)(36) = 35,640.
Second factorial moment = 990 = E[X(X-1)] = E[X^2] - E[X]. E[X^2] = 990 + 30 = 1020.
Third factorial moment = 35,640 = E[X(X-1)(X-2)] = E[X^3] - 3E[X^2] + 2E[X].
E[X^3] = 35,640 + (3)(1020) - (2)(30) = 38,640.
Comment: E[X^2] = variance + mean^2 = (10)(3)(4) + 30^2 = 1020.


10.8. D. For a discrete distribution, the expected value of a quantity is determined by taking the
sum of its product with the probability density function. In this case, the density of the Poisson is:
e^(-λ) λ^x / x!, x = 0, 1, 2... Thus E[X(X-1)...(X-9)] =
Σ_{x=0}^∞ x(x-1)(x-2)(x-3)(x-4)(x-5)(x-6)(x-7)(x-8)(x-9) e^(-λ) λ^x / x!
= e^(-λ) λ^10 Σ_{x=10}^∞ λ^(x-10)/(x-10)! = e^(-λ) λ^10 Σ_{y=0}^∞ λ^y/y! = e^(-λ) λ^10 e^λ = λ^10.
Alternately, the 10th factorial moment is the 10th derivative of the p.g.f. at z = 1.
For the Poisson: P(z) = exp(λ(z-1)). dP/dz = λ exp(λ(z-1)). P''(z) = λ^2 exp(λ(z-1)).
P'''(z) = λ^3 exp(λ(z-1)). P^(10)(z) = λ^10 exp(λ(z-1)). P^(10)(1) = λ^10.
Comment: Note that the product x(x-1)...(x-9) is zero for x = 0, 1, ..., 9, so only terms for x ≥ 10
contribute to the sum. The expected value of X(X-1)...(X-9) is an example of a factorial moment.
In the case of the Poisson, the nth factorial moment is λ to the nth power. In this case we have the
10th factorial moment (involving the product of 10 terms) equal to λ^10.
10.9. D. 2 = P'(1) = E[N]. 6 = P"(1) = E[N(N-1)] = E[N^2] - E[N]. E[N^2] = 6 + 2 = 8.
Var[N] = E[N^2] - E[N]^2 = 8 - 2^2 = 4.
Comment: P(z) = E[z^n] = Σ f(n)z^n. P'(z) = Σ n f(n) z^(n-1). P'(1) = Σ n f(n) = E[N].
P''(z) = Σ n(n-1) f(n) z^(n-2). P''(1) = Σ n(n-1) f(n) = E[N(N-1)].


Section 11, (a, b, 0) Class of Distributions


The (a,b,0) class of frequency distributions consists of the three common
distributions: Binomial, Poisson, and Negative Binomial.
Distribution            Mean       Variance
Binomial                mq         mq(1-q)
Poisson                 λ          λ
Negative Binomial       rβ         rβ(1+β)

Distribution            Variance / Mean
Binomial                1 - q < 1            Variance < Mean
Poisson                 1                    Variance = Mean
Negative Binomial       1 + β > 1            Variance > Mean

Distribution            Skewness
Binomial                (1 - 2q) / √(mq(1-q))      If q < 0.5 skewed right, if q > 0.5 skewed left
Poisson                 1/√λ                       Skewed to the right
Negative Binomial       (1 + 2β) / √(rβ(1+β))      Skewed to the right

Distribution            f(x)                                          f(x+1)
Binomial                [m! / {x!(m-x)!}] q^x (1-q)^(m-x)             [m! / {(x+1)!(m-x-1)!}] q^(x+1) (1-q)^(m-x-1)
Poisson                 λ^x e^(-λ) / x!                               λ^(x+1) e^(-λ) / (x+1)!
Negative Binomial       [r(r+1)...(r+x-1)/x!] β^x / (1+β)^(x+r)       [r(r+1)...(r+x)/(x+1)!] β^(x+1) / (1+β)^(x+r+1)

Distribution            f(x+1) / f(x)
Binomial                {q/(1-q)} (m-x)/(x+1)
Poisson                 λ/(x+1)
Negative Binomial       {β/(1+β)} (x+r)/(x+1)


(a, b, 0) relationship:
For each of these three frequency distributions: f(x+1) / f(x) = a + b/(x+1), x = 0, 1, 2, ...
where a and b depend on the parameters of the distribution:68

Distribution            a                 b                    f(0)
Binomial                -q/(1-q)          (m+1)q/(1-q)         (1-q)^m
Poisson                 0                 λ                    e^(-λ)
Negative Binomial       β/(1+β)           (r-1)β/(1+β)         1/(1+β)^r

Loss Models writes this recursion formula equivalently as: pk/pk-1 = a + b/k, k = 1, 2, 3 ... 69
This relationship defines the (a,b,0) class of frequency distributions.70 The (a, b ,0) class of
frequency distributions consists of the three common distributions: Binomial, Poisson, and Negative
Binomial.71 Therefore, it also includes the Bernoulli, which is a special case of the Binomial, and the
Geometric, which is a special case of the Negative Binomial.
Note that a is positive for the Negative Binomial, zero for the Poisson, and negative for the Binomial.
These formula can be useful when programming these frequency distributions into spreadsheets.
One calculates f(0) and then one gets additional values of the density function via iteration:
f(x+1) = f(x){a + b / (x+1)}.
f(1) = f(0) (a + b). f(2) = f(1) (a + b/2). f(3) = f(2) (a + b/3). f(4) = f(3) (a + b/4), etc.
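Here is a minimal Python sketch of that iteration (my own illustration; the parameter values are arbitrary examples), using the a, b, and f(0) from the table above.

from math import exp

def ab0_densities(a, b, f0, x_max):
    # f(x+1) = f(x) {a + b/(x+1)}, starting from f(0)
    f = [f0]
    for x in range(x_max):
        f.append(f[x] * (a + b / (x + 1)))
    return f

# Poisson, lambda = 3:      a = 0,        b = lambda,        f(0) = e^(-lambda)
# Binomial, m = 5, q = 0.2: a = -q/(1-q), b = (m+1)q/(1-q),  f(0) = (1-q)^m
# Negative Binomial, r = 2, beta = 0.5:
#                           a = beta/(1+beta), b = (r-1)beta/(1+beta), f(0) = (1+beta)^(-r)
print(ab0_densities(0.0, 3.0, exp(-3.0), 5))
print(ab0_densities(-0.25, 1.5, 0.8**5, 5))
print(ab0_densities(1/3, 1/3, 1.5**-2, 5))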

68

These a and b values are shown in the tables attached to the exam. This relationship is used in the Panjer
Algorithm (recursive formula), a manner of computing either the aggregate loss distribution or a compound
frequency distribution. For a member of the (a,b,0) class, the values of a and b determine everything about the
distribution. Given the density at zero, all of the densities would follow; however, the sum of all of the densities must
be one.
69
See Definition 6.4 in Loss Models.
70
See Table 6.1 and Appendix B.2 in Loss Models. The (a, b, 0) class is distinguished from the (a, b, 1) class, to be
discussed in a subsequent section, by the fact that the relationship holds starting with the density at zero, rather
than possibly only starting with the density at one.
71
As stated in Loss Models, these are the only members of the (a, b, 0) class. This is proved in Lemma 6.6.1 of
Insurance Risk Models, by Panjer and Willmot. Only certain combinations of a and b are acceptable. Each of the
densities must be nonnegative and they must sum to one, a finite amount.


Thinning and Adding:

Distribution            Thinning by factor of t       Adding n independent, identical copies
Binomial                q → tq                        m → nm
Poisson                 λ → tλ                        λ → nλ
Negative Binomial       β → tβ                        r → nr

If for example, we assume 1/4 of all claims are large:

If All Claims                              Then Large Claims
Binomial m = 5, q = 0.04                   Binomial m = 5, q = 0.01
Poisson λ = 0.20                           Poisson λ = 0.05
Negative Binomial r = 2, β = 0.10          Negative Binomial r = 2, β = 0.025

In the Poisson case, small and large claims are independent Poisson Distributions.72
For X and Y independent:

X                               Y                               X + Y
Binomial(m1, q)                 Binomial(m2, q)                 Binomial(m1 + m2, q)
Poisson(λ1)                     Poisson(λ2)                     Poisson(λ1 + λ2)
Negative Binomial(r1, β)        Negative Binomial(r2, β)        Negative Binomial(r1 + r2, β)

If Claims each Year                        Then Claims for 6 Independent Years
Binomial m = 5, q = 0.04                   Binomial m = 30, q = 0.04
Poisson λ = 0.20                           Poisson λ = 1.20
Negative Binomial r = 2, β = 0.10          Negative Binomial r = 12, β = 0.10

72

As discussed in the section on the Gamma-Poisson Frequency Process, in the Negative Binomial case, the
number of large and small claims are positively correlated. In the Binomial case, the number of large and small claims
are negatively correlated.


Probability Generating Functions:
Recall that the probability generating function for a given distribution is P(z) = E[z^N].

Distribution            Probability Generating Function
Binomial                P(z) = {1 + q(z-1)}^m
Poisson                 P(z) = e^(λ(z-1))
Negative Binomial       P(z) = {1 - β(z-1)}^(-r), z < 1 + 1/β
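As a sanity check, each closed-form p.g.f. can be compared with the defining expectation E[z^N] summed over the density. Here is a minimal Python sketch (my own; the parameter values and the choice z = 0.7 are arbitrary examples).

from math import exp, comb, factorial

z = 0.7                                   # any z inside each region of convergence

# Binomial, m = 4, q = 0.3
closed = (1 + 0.3 * (z - 1)) ** 4
direct = sum(comb(4, n) * 0.3**n * 0.7**(4 - n) * z**n for n in range(5))
print(round(closed, 10), round(direct, 10))

# Poisson, lambda = 2
closed = exp(2 * (z - 1))
direct = sum(exp(-2) * 2**n / factorial(n) * z**n for n in range(80))
print(round(closed, 10), round(direct, 10))

# Negative Binomial, r = 3, beta = 0.5 (needs z < 1 + 1/beta)
closed = (1 - 0.5 * (z - 1)) ** -3
direct = sum(comb(n + 2, n) * 0.5**n / 1.5**(n + 3) * z**n for n in range(400))
print(round(closed, 10), round(direct, 10))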

Parametric Models:
Some advantages of parametric models:
1. They summarize the information in terms of the form of the distribution and the parameter values.
2. They serve to smooth the empirical data.
3. They greatly reduce the dimensionality of the information.
In addition one can use parametric models to extrapolate beyond the largest observation.
As will be discussed in a subsequent section, the behavior in the righthand tail is an important
feature of any frequency distribution.
Some advantages of working with separate distributions of frequency and severity:73
1. Can obtain a deeper understanding of a variety of issues surrounding insurance.
2. Allows one to address issues of modification of an insurance contract (for example, deductibles.)
3. Frequency distributions are easy to obtain and do a good job of modeling the empirical
situations.

73

See Section 6.1 of Loss Models.


Limits:
Since the probability generating function determines the distribution, one can take limits of a
distribution by instead taking limits of the Probability Generating Function.
Assume one takes a limit of the probability generating function of a Binomial distribution, holding
qm = λ fixed, as m → ∞ and q → 0:
P(z) = {1 + q(z-1)}^m = {1 + q(z-1)}^(λ/q) = [{1 + q(z-1)}^(1/q)]^λ → {e^(z-1)}^λ = e^(λ(z-1)).
Where we have used the fact that as x → 0, (1 + ax)^(1/x) → e^a.
Thus the limit of the Binomial Probability Generating Function is the Poisson Probability Generating
Function. Therefore, as we let q get very small in a Binomial but keep the mean constant, in the limit
one approaches a Poisson with the same mean.74
For example, a Poisson (triangles) with mean 10 is compared to a Binomial (squares) with q = 1/3
and m = 30 (mean = 10, variance = 20/3):
[Graph comparing the Poisson and Binomial densities over the number of claims.]

While the Binomial is shorter-tailed than the Poisson, they are not that dissimilar.

74

The limit of the probability generating function is the probability generating function of the limit of the distributions
if it exists.


Assume one takes a limit of the probability generating function of a Negative Binomial distribution,
holding rβ = λ fixed, as r → ∞ and β → 0:
P(z) = {1 - β(z-1)}^(-r) = {1 - β(z-1)}^(-λ/β) = [{1 - β(z-1)}^(1/β)]^(-λ) → {e^(-(z-1))}^(-λ) = e^(λ(z-1)).
Thus the limit of the Negative Binomial Probability Generating Function is the Poisson Probability
Generating Function. Therefore, as we let β get very close to zero in a Negative Binomial but keep
the mean constant, in the limit one approaches a Poisson with the same mean.75
A Poisson (triangles) with mean 10 is compared to a Negative Binomial Distribution (squares) with
r = 20 and β = 0.5 (mean = 10, variance = 15):
[Graph comparing the Poisson and Negative Binomial densities over the number of claims.]

For the three distributions graphed here and previously, while the means are the same, the
variances are significantly different; thus the Binomial is more concentrated around the mean while the
Negative Binomial is more dispersed from the mean. Nevertheless, one can see how the three
distributions are starting to resemble each other.76

75

The limit of the probability generating function is the probability generating function of the limit of the distributions
if it exists.
76
They are each approximated by a Normal Distribution. While these three Normal Distributions have the same mean,
they have different variances.


If the Binomial q were smaller and m larger such that the mean remained 10, for example q = 1/30
and m = 300, then the Binomial would have been much closer to the Poisson. Similarly, if on the
Negative Binomial one had β closer to zero with r larger such that the mean remained 10, for
example β = 1/9 and r = 90, then the Negative Binomial would have been much closer to the
Poisson.
Thus the Poisson is the limit of either a series of Binomial or Negative Binomial Distributions as they
come from different sides.77 The Binomial has q go to zero; one adds up very many Bernoulli
Trials each with a very small chance of success. This approaches a constant chance of success per
very small unit of time, which is a Poisson. Note that for each Binomial the mean is greater than the
variance, but as q goes to zero the variance approaches the mean.
For the Negative Binomial one lets β go to zero; one adds up very many Geometric distributions
each with very small chance of a claim.78 Again this limit is a Poisson, but in this case for each
Negative Binomial the variance is greater than the mean. As β goes to zero, the variance
approaches the mean.
As mentioned previously the Distribution Function of the Binomial Distribution is a form of the
Incomplete Beta Function, while that of the Poisson is in the form of an Incomplete Gamma
Function. As q → 0 and the Binomial approaches a Poisson, the Distribution Function of the
Binomial approaches that of the Poisson. An Incomplete Gamma Function can thus be
obtained as a limit of Incomplete Beta Distributions. Similarly, the Distribution Function of the
Negative Binomial is a somewhat different form of the Incomplete Beta Distribution.
As β → 0 and the Negative Binomial approaches a Poisson, the Distribution Function of the
Negative Binomial approaches that of the Poisson. Again, an Incomplete Gamma Function can
be obtained as a limit of Incomplete Beta Distributions.
77 One can also show this via the use of Stirling's formula to directly calculate the limits rather than via the use of
Probability Generating Functions.
78 The mean of a Geometric is β, thus as β → 0, the chance of a claim becomes very small.
For the Negative Binomial, r = mean/β, so that as β → 0 for a fixed mean, r → ∞.


Modes:
The mode, where the density is largest, can be located by observing where f(x+1)/f(x) switches
from being greater than 1 to being less than 1.79
Exercise: For a member of the (a, b, 0) frequency class, when is f(x+1)/f(x) greater than one, equal
to one, and less than one?
[Solution: f(x+1)/f(x) = 1 when a + b/(x+1) = 1. This occurs when x = b/(1-a) - 1.
For x < b/(1-a) - 1, f(x+1)/f(x) > 1. For x > b/(1-a) - 1, f(x+1)/f(x) < 1.]
For example, for a Binomial Distribution with m = 10 and q = .23, a = - q/(1-q) = -.2987 and
b = (m+1)q/(1-q) = 3.2857. For x > b/(1-a) - 1 = 1.53, f(x+1)/f(x) < 1.
Thus f(3) < f(2). For x < 1.53, f(x+1)/f(x) > 1. Thus f(2) > f(1). Therefore, the mode is 2.
In general, since for x < b/(1-a) - 1, f(x+1) > f(x), if c is the largest integer in b/(1-a), f(c) > f(c-1).
Since for x > b/(1-a) - 1, f(x+1) < f(x), f(c+1) < f(c). Thus c is the mode.
For a member of the (a, b, 0) class, the mode is the largest integer in b/(1-a).
If b/(1-a) is an integer, then f(b/(1-a) - 1) = f(b/(1-a)), and there are two modes.
For the Binomial Distribution, a = -q/(1-q) and b = (m+1)q/(1-q), so b/(1-a) = (m+1)q.
Thus the mode is the largest integer in (m+1)q.
If (m+1)q is an integer, there are two modes at: (m+1)q and (m+1)q - 1.
For the Poisson Distribution, a = 0 and b = λ, so b/(1-a) = λ. Thus the mode is the largest integer in
λ. If λ is an integer, there are two modes at: λ and λ - 1.
For the Negative Binomial Distribution, a = β/(1+β) and b = (r-1)β/(1+β), so b/(1-a) = (r-1)β.
Thus the mode is the largest integer in (r-1)β.
If (r-1)β is an integer, there are two modes at: (r-1)β and (r-1)β - 1.
Note that in each case the mode is close to the mean.80
So one could usefully start a numerical search for the mode at the mean.
79 In general this is only a local maximum, but members of the (a, b, 0) class do not have local maxima other than the
mode.
80 The means are mq, λ, and rβ.


Moments:
Formulas for the Factorial Moments of the (a, b, 0) class have been discussed in a previous section.
It can be derived from those formulas that for a member of the (a, b, 0) class:

Mean                  (a + b)/(1 - a)
Second Moment         (a + b)(a + b + 1)/(1 - a)^2
Variance              (a + b)/(1 - a)^2
Third Moment          (a + b){(a + b + 1)(a + b + 2) + a - 1}/(1 - a)^3
Skewness              (a + 1)/√(a + b)
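These expressions, together with the mode rule above, are easy to code. Here is a minimal Python sketch (my own illustration) that computes the mean, variance, skewness, and mode of a member of the (a, b, 0) class directly from a and b, using the Negative Binomial with r = 8 and β = 2/3 (the distribution in problem 11.6 below) as the example.

from math import floor, sqrt

def ab0_summary(a, b):
    mean = (a + b) / (1 - a)
    variance = (a + b) / (1 - a) ** 2
    skewness = (1 + a) / sqrt(a + b)
    mode = floor(b / (1 - a))            # largest integer in b/(1-a)
    return mean, variance, skewness, mode

r, beta = 8, 2/3
a = beta / (1 + beta)                    # 0.4
b = (r - 1) * beta / (1 + beta)          # 2.8
print(ab0_summary(a, b))
# mean = r*beta = 16/3, variance = r*beta*(1+beta) = 80/9, mode = 4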

A Generalization of the (a, b, 0) Class:


The (a, b, 0) relationship is: f(x+1) / f(x) = a + {b / (x+1)}, x = 0, 1, 2, ...
or equivalently: pk / pk-1 = a + b/k, k = 1, 2, 3, ...
A more general relationship is: pk / pk-1 = (ak + b) / (k + c), k = 1, 2, 3, ...
If c = 0, then this would reduce to the (a, b, 0) relationship.


Contagion Model:81
Assume one has a claim intensity of λ. The chance of having a claim over an extremely small period
of time Δt is approximately λ(Δt).82
As mentioned previously, if the claim intensity is a constant over time, then the number of claims
observed over a period of time t is given by a Poisson, with parameter λt. If the claim intensity
depends on the number of claims that have occurred so far, then the frequency distribution is other
than Poisson.
Given one has had x-1 claims so far, let λ_x Δt be the chance of having the xth claim in small time
period Δt. Assume λ_x = λ + δ(x-1).
Then for δ > 0, one gets a Negative Binomial Distribution. As one observes more claims the chance
of observing another claim goes up. This is referred to as positive contagion; examples might be
claims due to a contagious disease or from a very large fire. Over time period (0, t), the parameters
of the Negative Binomial are r = λ/δ and β = e^(δt) - 1.
Then for δ < 0, one gets a Binomial distribution. As one observes more claims, the chance of future
claims goes down. This is referred to as negative contagion. Over time period (0, t), the parameters
of the Binomial are m = -λ/δ and q = 1 - e^(δt).
For δ = 0 one gets the Poisson. There is no contagion and the claim intensity is constant. Thus the
contagion model is another mathematical connection between the three common frequency
distributions. We expect as δ → 0 in either the Binomial or Negative Binomial that we approach a
Poisson. This is indeed the case as discussed previously.

81
82

Not on the syllabus of your exam. See pages 52-53 of Mathematical Methods of Risk Theory by Buhlmann.
The claim intensity is analogous to the force of mortality in Life Contingencies.


As used in the Heckman-Meyers algorithm to calculate aggregate losses, the frequency distributions
are parameterized in a related but somewhat different manner via their mean λ and
a contagion parameter c:83

Distribution            Mean λ        Contagion parameter c
Binomial                mq            -1/m
Poisson                 λ             0
Negative Binomial       rβ            1/r

HyperGeometric Distribution:
For the HyperGeometric Distribution with parameters m, n, and N, the density is:84
f(x) = C(m, x) C(N-m, n-x) / C(N, n), x = 0, 1, ..., n,
where C(j, k) denotes the binomial coefficient "j choose k".
f(x+1)/f(x) = {C(m, x+1) C(N-m, n-x-1)} / {C(m, x) C(N-m, n-x)}
= [(m-x)! x! / {(m-x-1)! (x+1)!}] [(N-m-n+x)! (n-x)! / {(N-m-n+x+1)! (n-x-1)!}]
= {(m-x)/(x+1)} {(n-x)/(N-m-n+x+1)}.
Thus the HyperGeometric Distribution is not a member of the (a, b, 0) family.
Mean = nm/N.       Variance = nm(N-m)(N-n) / {(N-1)N^2}.
83 Not on the syllabus of your exam. See PCAS 1983 p. 35-36, "The Calculation of Aggregate Loss Distributions
from Claim Severity and Claim Count Distributions," by Phil Heckman and Glenn Meyers.
84
Not on the syllabus of your exam. See for example, A First Course in Probability, by Sheldon Ross.
If we had an urn with N balls,of which m were white, and we took a sample of size n, then f(x) is the probability that x of
the balls in the sample were white.
For example, tests with 35 questions will be selected at random from a bank of 500 questions.
Treat the 35 questions on the first randomly selected test as white balls.
Then the number of white balls in a sample of size n from the 500 balls is HyperGeometric with m = 35 and N = 500.
Thus the number of questions a second test of 35 questions has in common with the first test is HyperGeometric
with m = 35, n = 35, and N = 500. The densities from 0 to 10 are: 0.0717862, 0.204033, 0.272988, 0.228856,
0.134993, 0.0596454, 0.0205202, 0.00564155, 0.00126226, 0.000232901, 0.000035782.


Problems:
11.1 (1 point) Which of the following statements are true?
1. The variance of the Negative Binomial Distribution is less than the mean.
2. The variance of the Poisson Distribution only exists for λ > 2.
3. The variance of the Binomial Distribution is greater than the mean.
A. 1
B. 2
C. 3
D. 1, 2, and 3
E. None of A, B, C, or D
11.2 (1 point) A member of the (a, b, 0) class of frequency distributions has a = -2.
Which of the following types of Distributions is it?
A. Binomial B. Poisson C. Negative Binomial
D. Logarithmic
E. None A, B, C, or D.
11.3 (1 point) A member of the (a, b, 0) class of frequency distributions has a = 0.4 and b = 2.
Given f(4) = 0.1505, what is f(7)?
A. Less than 0.06
B. At least 0.06, but less than 0.07
C. At least 0.07, but less than 0.08
D. At least 0.08, but less than 0.09
E. At least 0.09
11.4 (2 points) X is a discrete random variable with a probability function which is a member of the
(a,b,0) class of distributions. P(X = 1) = 0.0064. P(X = 2) = 0.0512. P(X = 3) = 0.2048.
Calculate P(X = 4).
(A) 0.37
(B) 0.38
(C) 0.39
(D) 0.40
(E) 0.41
11.5 (2 points) For a discrete probability distribution, you are given the recursion relation:
f(x+1) = {1/3 + 0.6/(x+1)} f(x), x = 0, 1, 2, ...

Determine f(3).
(A) 0.09
(B) 0.10

(C) 0.11

(D) 0.12

(E) 0.13

11.6 (2 points) A member of the (a, b, 0) class of frequency distributions has a = 0.4, and
b = 2.8. What is the mode?
A. 0 or 1
B. 2
C. 3
D. 4
E. 5 or more
11.7 (2 points) For a discrete probability distribution, you are given the recursion relation:
p(x) = (-2/3 + 4/x)p(x-1), x = 1, 2,.
Determine p(3).
(A) 0.19
(B) 0.20
(C) 0.21
(D) 0.22
(E) 0.23


11.8 (3 points) X is a discrete random variable with a probability function which is a member of the
(a, b, 0) class of distributions.
P(X = 100) = 0.0350252. P(X = 101) = 0.0329445. P(X = 102) = 0.0306836.
Calculate P(X = 105).
(A) .022
(B) .023
(C) .024
(D) .025
(E) .026
11.9 (2 points) X is a discrete random variable with a probability function which is a member of the
(a, b, 0) class of distributions.
P(X = 10) = 0.1074. P(X = 11) = 0.
Calculate P(X = 6).
(A) 6%
(B) 7%
(C) 8%
(D) 9%
(E) 10%
11.10 (3 points) Show that the (a, b, 0) relationship with a = -2 and b = 6 leads to a legitimate
distribution while a = -2 and b = 5 does not.
11.11 (2 points) A discrete probability distribution has the following properties:
(i) pk = c(-1 + 4/k)pk-1 for k = 1, 2,
(ii) p0 = 0.7.
Calculate c.
(A) 0.06

(B) 0.13

(C) 0.29

(D) 0.35

(E) 0.40

11.12 (3 points) Show that the (a, b, 0) relationship with a = 1 and b = -1/2 does not lead to a
legitimate distribution.
11.13 (3 points) A member of the (a, b, 0) class of frequency distributions has been fit via
maximum likelihood to the number of claims observed on 10,000 policies.

Number of claims        Number of Policies        Fitted Model
0                       6587                      6590.79
1                       2598                      2586.27
2                       647                       656.41
3                       136                       136.28
4                       25                        25.14
5                       7                         4.29
6 or more               0                         0.80

Determine what type of distribution has been fit and the value of the fitted parameters.


11.14 (4, 5/86, Q.50) (1 point) Which of the following statements are true?
1. For a Poisson distribution the mean and variance are equal.
2. For a binomial distribution the mean is less than the variance.
3. The negative binomial distribution is a useful model of the distribution of claim
frequencies of a heterogeneous group of risks.
A. 1
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3
11.15 (4B, 11/92, Q.21) (1 point) A portfolio of 10,000 risks yields the following:
Number of Claims        Number of Insureds
0                       6,070
1                       3,022
2                       764
3                       126
4                       18
Based on the portfolio's sample moments, which of the following distributions provides the best fit
to the portfolio's number of claims?
A. Binomial
B. Poisson
C. Negative Binomial
D. Lognormal
E. Pareto
11.16 (5A, 11/94, Q.24) (1 point) Let X and Y be random variables representing the number of
claims for two separate portfolios of insurance risks. You are asked to model the distributions of the
number of claims using either the Poisson or Negative Binomial distributions. Given the following
information about the moments of X and Y, which distribution would be the best choice for each?
E[X] = 2.40 E[Y] = 3.50
E[X2 ] = 8.16 E[Y2 ] = 20.25
A. X is Poisson and Y is Negative Binomial
B. X is Poisson and Y is Poisson
C. X is Negative Binomial and Y is Negative Binomial
D. X is Negative Binomial and Y is Poisson
E. Neither distribution is appropriate for modeling numbers of claims.
11.17 (5A, 11/99, Q.39) (2 points) You are given the following information concerning the
distribution, S, of the aggregate claims of a particular line of business:
E[S] = $500,000 and Var[S] = 7.5 x 109 .
The claim severity follows a Normal Distribution with both mean and standard deviation equal to
$5,000.
What conclusion can be drawn regarding the individual claim propensity of the insureds in this line of
business?


11.18 (3, 5/01, Q.25 & 2009 Sample Q.108) (2.5 points) For a discrete probability distribution,
you are given the recursion relation
p(k) = (2/k) p(k-1), k = 1, 2,.
Determine p(4).
(A) 0.07
(B) 0.08
(C) 0.09
(D) 0.10
(E) 0.11
11.19 (3, 11/02, Q.28 & 2009 Sample Q.94) (2.5 points) X is a discrete random variable with a
probability function which is a member of the (a,b,0) class of distributions.
You are given:
(i) P(X = 0) = P(X = 1) = 0.25
(ii) P(X = 2) = 0.1875
Calculate P(X = 3).
(A) 0.120
(B) 0.125
(C) 0.130
(D) 0.135
(E) 0.140
11.20 (CAS3, 5/04, Q.32) (2.5 points) Which of the following statements are true about the sums
of discrete, independent random variables?
1. The sum of two Poisson random variables is always a Poisson random variable.
2. The sum of two negative binomial random variables with parameters (r, β) and (r', β') is a
negative binomial random variable if r = r'.
3. The sum of two binomial random variables with parameters (m, q) and (m', q') is a binomial
random variable if q = q'.
A. None of 1, 2, or 3 is true. B. 1 and 2 only C. 1 and 3 only D. 2 and 3 only E. 1, 2, and 3
11.21 (CAS3, 5/05, Q.16) (2.5 points)
Which of the following are true regarding sums of random variables?
1. The sum of two independent negative binomial distributions with parameters (r1, β1) and
(r2, β2) is negative binomial if and only if r1 = r2.
2. The sum of two independent binomial distributions with parameters (q1, m1) and (q2, m2)
is binomial if and only if m1 = m2.
3. The sum of two independent Poisson distributions with parameters λ1 and λ2 is Poisson if
and only if λ1 = λ2.
A. None are true     B. 1 only     C. 2 only     D. 3 only     E. 1 and 3 only


11.22 (SOA M, 5/05, Q.19 & 2009 Sample Q.166) (2.5 points)
A discrete probability distribution has the following properties:
(i) pk = c (1 + 1/k) pk-1 for k = 1, 2,
(ii) p0 = 0.5.
Calculate c.
(A) 0.06

(B) 0.13

(C) 0.29

(D) 0.35

(E) 0.40

11.23 (CAS3, 5/06, Q.31) (2.5 points)


N is a discrete random variable from the (a, b, 0) class of distributions.
The following information is known about the distribution:

Pr(N = 0) = 0.327680
Pr(N = 1) = 0.327680
Pr(N = 2) = 0.196608
E(N) = 1.25
Based on this information, which of the following are true statements?
I. Pr(N = 3) = 0.107965
II. N is from a Binomial distribution.
III. N is from a Negative Binomial distribution.
A. I only
B. II only
C. III only
D. I and II
E. I and III



Solutions to Problems:
11.1. E. 1. The variance of the Negative Binomial Distribution is greater than the mean. Thus
Statement #1 is false. 2. The variance of the Poisson always exists (and is equal to the mean.)
Thus Statement #2 is false.
3. The variance of the Binomial Distribution is less than the mean. Thus Statement #3 is false.
11.2. A. For a < 0, one has a Binomial Distribution.
Comment: Since a = -q/(1-q), q = a/(a-1) = -2/(-3) = 2/3.
a = 0 is a Poisson, 1> a > 0 is a Negative Binomial. The Logarithmic Distribution is not a member of
the (a,b,0) class. The Logarithmic Distribution is a member of the (a,b,1) class.
11.3. B. f(x+1) = f(x) {a + b/(x+1)} = f(x){0.4 + 2/(x+1)} = f(x)(0.4)(x+6)/(x+1).
Then proceed iteratively. For example f(5) = f(4)(0.4)(10)/5 = (0.1505)(0.8) = 0.1204.

n           0        1        2        3        4        5        6        7
f(n)        0.0467   0.1120   0.1568   0.1672   0.1505   0.1204   0.0883   0.0605

Comment: Since 0 < a < 1 we have a Negative Binomial Distribution. r = 1 + b/a = 1 + (2/0.4) = 6.
β = a/(1-a) = 0.4/0.6 = 2/3. Thus once a and b are given, in fact f(4) is determined. Normally one would
compute f(0) = (1+β)^(-r) = 0.6^6 = 0.0467, and proceed iteratively from there.
11.4. E. For a member of the (a,b,0) class of distributions, f(x+1) / f(x) = a + {b / (x+1)}.
f(2)/f(1) = a + b/2. .0512/.0064 = 8 = a + b/2.
f(3)/f(2) = a + b/3. .2048/.0512 = 4 = a + b/3.
Therefore, a = -4 and b = 24. f(4) = f(3)(a + b/4) = (.2048)(-4 + 24/4) = 0.4096.
Alternately, once one solves for a and b, a < 0 a Binomial Distribution.
-4 = a = -q/(1-q). q = .8. 24 = b = (m+1)q/(1-q). m + 1 = 6. m = 5.
f(4) = (5)(0.8^4)(0.2) = 0.4096.
Comment: Similar to 3, 11/02, Q.28.
11.5. B. This is a member of the (a, b, 0) class of frequency distributions with a = 1/3 and
b = 0.6. Since a > 0, this is a Negative Binomial, with a = β/(1+β) = 1/3, and
b = (r - 1)β/(1 + β) = 0.6. Therefore, r - 1 = 0.6/(1/3) = 1.8. r = 2.8. β = 0.5.
f(3) = {(2.8)(3.8)(4.8)/3!} (0.5^3)/(1.5^(2.8+3)) = 0.1013.
Comment: Similar to 3, 5/01, Q.25. f(x+1) = f(x) {a + b/(x+1)}, x = 0, 1, 2, ...


11.6. D. For a member of the (a, b, 0) class, the mode is the largest integer in b/(1-a) =
2.8/(1 - 0.4) = 4.667. Therefore, the mode is 4.
Alternately, f(x+1)/f(x) = a + b/(x+1) = 0.4 + 2.8/(x+1).

x                0        1        2        3        4        5        6
f(x+1)/f(x)      3.200    1.800    1.333    1.100    0.960    0.867    0.800

Therefore, f(4) = 1.1 f(3) > f(3), but f(5) = 0.96 f(4) < f(4). Therefore, the mode is 4.
Alternately, since a > 0, this is a Negative Binomial Distribution with a = β/(1+β) and
b = (r-1)β/(1+β). Therefore, β = a/(1-a) = 0.4/0.6 = 2/3 and r = b/a + 1 = 2.8/0.4 + 1 = 8.
The mode of a Negative Binomial is the largest integer in: (r-1)β = (7)(2/3) = 4.6667.
Therefore, the mode is 4.
11.7. E. This is a member of the (a, b, 0) class of frequency distributions with a = -2/3 and
b = 4. Since a < 0, this is a Binomial, with a = -q/(1-q) = -2/3, and b = (m+1)q/(1-q) = 4.
Therefore, m + 1 = 4/(2/3) = 6; m = 5. q = .4. f(3) = {(5!)/((3!)(2!))}.43 .62 = 0.2304.
Comment: Similar to 3, 5/01, Q.25. f(x) = f(x-1) {a + b/x}, x = 1, 2, 3, ...
11.8. B. For a member of the (a, b, 0) class of distributions, f(x+1) / f(x) = a + {b / (x+1)}.
f(101)/f(100) = a + b/101. 0.0329445/0.0350252 = 0.940594 = a + b/101.
f(102)/f(101) = a + b/102. 0.0306836/0.0329445 = 0.931372 = a + b/102.
Therefore, a = 0 and b = 95.0. f(105) = f(102)(a + b/103)(a + b/104)(a + b/105) =
(0.0306836)(95/103)(95/104)(95/105) = 0.0233893.
Comment: Alternately, once one solves for a and b, a = 0 implies a Poisson Distribution.
λ = b = 95. f(105) = e^(-95) 95^105 / 105! = 0.0233893, difficult to calculate using most calculators.
11.9. D. The Binomial is the only member of the (a, b, 0) class with finite support.
P(X = 11) = 0 and P(X = 10) > 0 implies m = 10.
0.1074 = P(X = 10) = q^10. q = 0.800.
P(X = 6) = {10!/(6! 4!)} (1-q)^4 q^6 = (210)(0.2^4)(0.8^6) = 0.088.


11.10. f(1) = f(0) (a + b). f(2) = f(1) (a + b/2). f(3) = f(2) (a + b/3). f(4) = f(3) (a + b/4), etc.
For a = -2 and b = 6:
f(1) = f(0) (-2 + 6) = 4 f(0). f(2) = f(1) (-2 + 6/2) = f(1). f(3) = f(2) (-2 + 6/3) = 0. f(4) = 0, etc.
This is a Binomial with m = 2 and q = a/(a-1) = 2/3.
f(0) = 1/9. f(1) = 4/9. f(2) = 4/9.
For a = -2 and b = 5:
f(1) = f(0) (-2 + 5) = 3 f(0). f(2) = f(1) (-2 + 5/2) = 1.5f(1). f(3) = f(2) (-2 + 5/3) < 0. No good!
Comment: Similar to Exercise 6.3 in Loss Models.
For a < 0, we require that b/a be a negative integer.


11.11. B. This is the (a, b, 0) relationship, with a = -c and b = 4c.


For the Binomial, a < 0. For the Poisson a = 0. For the Negative Binomial, a > 0.
c must be positive, since the densities are positive, therefore, a < 0 and this is a
Binomial. For the Binomial, a = -q/(1-q) and b = (m+1)q/(1-q).
b = -4a. m + 1 = 4. m = 3.
0.7 = p0 = (1 - q)^m = (1 - q)^3. q = 0.1121.
c = -a = q/(1-q) = .1121/.8879 = 0.126.
Comment: Similar to SOA M, 5/05, Q.19.
11.12. f(1) = f(0) (1 - 1/2) = (1/2) f(0). f(2) = f(1) (1 - 1/4) = (3/4)f(1).
f(3) = f(2) (1 - 1/6) = (5/6)f(2). f(4) = f(3) (1 - 1/8) = (7/8)f(3). f(5) = f(4) (1 - 1/10) = (9/10)f(4).
The sum of these densities is:
f(0){1 + 1/2 + (3/4)(1/2) + (5/6)(3/4)(1/2) + (7/8)(5/6)(3/4)(1/2) + (9/10)(7/8)(5/6)(3/4)(1/2) + ...}
f(0){1 + 1/2 + 3/8 + 5/16 + 35/128 + 315/1280 + ...} > f(0){1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + ...}.
However, the sum 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + ..., diverges.
Therefore, these densities would sum to infinity.
Comment: We require that a < 1. a is positive for a Negative Binomial; a = β/(1 + β) < 1.
11.13. For a member of the (a, b, 0) class f(1)/f(0) = a + b, and f(2)/f(1) = a + b/2.
a + b = 2586.27/6590.79 = 0.39241.
a + b/2 = 656.41/2586.27 = 0.25381.

b = 0.27720. a = 0.11521.
Looking in Appendix B in the tables attached to the exam, a is positive for the Negative Binomial.
Therefore, we have a Negative Binomial.
0.11521 = a = β/(1+β). 1/β = 1/0.11521 - 1 = 7.6798. β = 0.1302.
0.27720 = b = (r-1)β/(1+β). r - 1 = 0.27720/0.11521 = 2.4060. r = 3.406.


Comment: Similar to Exercise 16.21b in Loss Models.
11.14. C. 1. True.
2. False. The variance = nq(1-q) is less than the mean = nq, since q < 1.
3. True. Statement 3 is referring to the mixture of Poissons via a Gamma, which results in a Negative
Binomial frequency distribution for the entire portfolio.


11.15. B. The mean frequency is 0.5 and the variance is: 0.75 - 0.5^2 = 0.5.

Number of Insureds      Number of Claims        Square of Number of Claims
6070                    0                       0
3022                    1                       1
764                     2                       4
126                     3                       9
18                      4                       16
Average                 0.5000                  0.7500
Since estimated mean = estimated variance, we expect the Poisson to provide the best fit.
Comment: If the estimated mean is approximately equal to the estimated variance, then the
Poisson is likely to provide a good fit. The Pareto and the LogNormal are continuous distributions
not used to fit discrete frequency distributions.
11.16. A. Var[X] = E[X^2] - E[X]^2 = 8.16 - 2.40^2 = 2.4 = E[X], so a Poisson Distribution is a good
choice for X. Var[Y] = E[Y^2] - E[Y]^2 = 20.25 - 3.50^2 = 8 > 3.5 = E[Y], so a Negative Binomial
Distribution is a good choice for Y.
11.17. Mean frequency = $500,000/$5000 = 100. Assuming frequency and severity are
independent: Var[S] = 7.5 x 10^9 = (100)(5000^2) + (5000^2)(Variance of the frequency).
Variance of the frequency = 200. Thus if each insured has the same frequency distribution, then it has
variance > mean, so it might be a Negative Binomial. Alternately, each insured could have a Poisson
frequency, but with the means varying across the portfolio. In that case, the mean of the mixing
distribution = 100. When mixing Poissons, Variance of the mixed distribution
= Mean of the mixing Distribution + Variance of the mixing distribution,
so the variance of the mixing distribution = 200 - 100 = 100.
Comment: There are many possible other answers.
11.18. C. f(x+1)/f(x) = 2/(x+1), x = 0, 1, 2,...
This is a member of the (a, b, 0) class of frequency distributions:
with f(x+1)/f(x) = a + b/(x+1), for a = 0 and b = 2.
Since a = 0, this is a Poisson with λ = b = 2. f(4) = e^(-2) 2^4 / 4! = 0.090.
Alternately, let f(0) = c. Then f(1) = 2c, f(2) = 2^2 c/2!, f(3) = 2^3 c/3!, f(4) = 2^4 c/4!, ....
1 = Σ f(i) = Σ 2^i c/i! = c Σ 2^i/i! = c e^2. Therefore, c = e^(-2). f(4) = e^(-2) 2^4 / 4! = 0.090.


11.19. B. For a member of the (a, b, 0) class of distributions, f(x+1) / f(x) = a + {b / (x+1)}.
f(1)/f(0) = a + b. .25/.25 = 1 = a + b.
f(2)/f(1) = a + b/2. .1875/.25 = .75 = a + b/2.
Therefore, a = .5 and b = .5.
f(3) = f(2)(a + b/3) = (.1875)(.5 + .5/3) = 0.125.
Alternately, once one solves for a and b, a > 0 implies a Negative Binomial Distribution.
1/2 = a = β/(1 + β). β = 1. 1/2 = b = (r-1)β/(1 + β). r - 1 = 1. r = 2.
f(3) = r(r + 1)(r + 2) β^3 / {(1 + β)^(r+3) 3!} = (2)(3)(4)/{(2^5)(6)} = 0.125.
11.20. C. 1. True. 2. False. Would be true if β = β', in which case the sum would have the sum of
the r parameters. 3. True. The sum would have the sum of the m parameters.
Comment: Note the requirement that the variables be independent.
11.21. A. The sum of two independent negative binomial distributions with parameters
(r1, β1) and (r2, β2) is negative binomial if and only if β1 = β2. Statement 1 is false.
The sum of two independent binomial distributions with parameters (q1, m1) and (q2, m2) is
binomial if and only if q1 = q2. Statement 2 is false.
The sum of two independent Poisson distributions with parameters λ1 and λ2 is Poisson, regardless
of the values of lambda. Statement 3 is false.
11.22. C. This is the (a, b, 0) relationship, with a = c and b = c.
For the Binomial, a < 0. For the Poisson a = 0. For the Negative Binomial, a > 0.
c must be positive, since the densities are positive, therefore, a > 0 and this is a
Negative Binomial. For the Negative Binomial, a = β/(1+β) and b = (r-1)β/(1+β).
a = b, so r - 1 = 1. r = 2.
0.5 = p0 = 1/(1+β)^r = 1/(1+β)^2. (1+β)^2 = 2. β = √2 - 1 = 0.4142.
c = a = β/(1+β) = 0.4142/1.4142 = 0.293.


11.23. C. For a member of the (a, b, 0) class, f(1)/f(0) = a + b, and f(2)/f(1) = a + b/2.
Therefore, a + b = 1, and a + b/2 = 0.196608/0.327680 = 0.6. ⇒ a = 0.2 and b = 0.8.
Since a is positive, we have a Negative Binomial Distribution. Statement III is true.
f(3) = f(2)(a + b/3) = (0.196608)(0.2 + 0.8/3) = 0.0917504. Statement I is false.
Comment: 0.2 = a = β/(1+β) and 0.8 = b = (r-1)β/(1+β). ⇒ r = 5 and β = 0.25.
E[N] = rβ = (5)(0.25) = 1.25, as given.
f(3) = {r(r+1)(r+2)/3!} β³/(1+β)^(3+r) = {(5)(6)(7)/6} 0.25³/1.25⁸ = 0.0917504.


Section 12, Accident Profiles85


Constructing an Accident Profile is a technique in Loss Models that can be used to decide whether
data was generated by a member of the (a, b, 0) class of frequency distributions and if so which
member.
As discussed previously, the (a,b,0) class of frequency distributions consists of the three common
distributions: Binomial, Poisson, and Negative Binomial. Therefore, it also includes the Bernoulli,
which is a special case of the Binomial, and the Geometric, which is a special case of the Negative
Binomial. As discussed previously, for members of the (a, b, 0) class:
f(x+1) / f(x) = a + b / (x+1),
where a and b depend on the parameters of the distribution:86
Distribution          a               b                  f(0)

Binomial              -q/(1-q)        (m+1)q/(1-q)       (1-q)^m

Poisson               0               λ                  e^-λ

Negative Binomial     β/(1+β)         (r-1)β/(1+β)       1/(1+β)^r

Note that a < 0 is a Binomial, a = 0 is a Poisson, and 1 > a > 0 is a Negative Binomial.
For the Binomial: q = a/(a-1) = |a| / ( |a| +1).
For the Negative Binomial: β = a/(1-a).
For the Binomial: m = -(a+b)/a = (a + b)/ |a|.87 The Bernoulli has m = 1 and b = -2a.
For the Poisson: λ = b.
For the Negative Binomial: r = 1 + b/a. The Geometric has r =1 and b = 0.
Thus given values of a and b, one can determine which member of the (a,b,0) class one has and its
parameters.
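
For readers who like to automate this last step, here is a short Python sketch (my own illustration, not from Loss Models; the function name is made up) that maps a fitted pair (a, b) back to the implied member of the (a, b, 0) class and its parameters, using the relations above.

# Illustrative sketch: recover the (a, b, 0) member and its parameters
# from the recursion constants a and b in f(x+1)/f(x) = a + b/(x+1).

def abo_member(a, b):
    if a < 0:
        q = a / (a - 1.0)          # q = |a| / (|a| + 1)
        m = -(a + b) / a           # should come out (close to) a positive integer
        return "Binomial", {"q": q, "m": m}
    elif a == 0:
        return "Poisson", {"lambda": b}
    else:                          # 0 < a < 1
        beta = a / (1.0 - a)
        r = 1.0 + b / a
        return "Negative Binomial", {"beta": beta, "r": r}

# Example: a = 0.2, b = 0.8 gives a Negative Binomial with beta = 0.25 and r = 5,
# matching the comment to problem 11.23.
print(abo_member(0.2, 0.8))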

85 See the latter portion of Section 6.5 of Loss Models.
86 See Appendix B of Loss Models.
87 Since for the Binomial m is an integer, we require that b/|a| be an integer.


Accident Profile:
Also note that for a member of the (a, b, 0) class, (x+1)f(x+1)/f(x) = (x+1)a + b, so that
(x+1)f(x+1)/f(x) is linear in x. It is a straight line with slope a and intercept a + b.
Thus graphing (x+1)f(x+1)/f(x) can be a useful method of determining whether one of these three
distributions fits the given data.88 If a straight line does seem to fit this accident profile, then one
should use a member of the (a, b, 0) class.
The slope determines which of the three distributions is likely to fit: if the slope is close to zero then a
Poisson, if significantly negative then a Binomial, and if significantly positive then a Negative
Binomial.
For example, here is the accident profile for some data:

Number of Claims    Observed    Observed Density Function    (x+1)f(x+1)/f(x)
0                   17,649      0.73932                      0.27361
1                   4,829       0.20229                      0.45807
2                   1,106       0.04633                      0.62116
3                   229         0.00959                      0.76856
4                   44          0.00184                      1.02273
5                   9           0.00038                      2.66667
6                   4           0.00017                      1.75000
7                   1           0.00004                      8.00000
8                   1           0.00004
9 & +               0

Prior to the tail where the data thins out, (x+1)f(x+1)/f(x) approximately follows a straight line with a
positive slope of about 0.2, which indicates a Negative Binomial with β/(1+β) ≈ 0.2.89 90
The intercept is rβ/(1+β), so that r ≈ 0.27 / 0.2 ≈ 1.4.91
In general, an accident profile is used to see whether data is likely to have come from a member of
the (a, b, 0) class. One would do this test prior to attempting to fit a Negative Binomial, Poisson, or
Binomial Distribution to the data. One starts with the hypothesis that the data was drawn from a
member of the (a, b, 0) class, without specifying which one. If this hypothesis is true the accident
profile should be approximately linear.92

88 This computation is performed using the empirical densities.
89 One should not significantly rely on those ratios involving few observations.
90 Slope is: a = β/(1+β).
91 Intercept is: a + b = β/(1+β) + (r-1)β/(1+β) = rβ/(1+β).
92 Approximate, because any finite sample of data is subject to random fluctuations.


If the accident profile is approximately linear, then we do not reject the hypothesis, and we decide
which member of the (a, b, 0) class to fit based on the slope of this line.93
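
The computation of an accident profile is easy to script. The following Python sketch (my own illustration, not part of the text; the data vector is the worked example above) computes (x+1)f(x+1)/f(x) from observed counts, reproducing the ratios shown earlier.

# Illustrative sketch: compute the accident profile (x+1) f(x+1) / f(x)
# from observed claim counts, using the empirical densities.

def accident_profile(counts):
    """counts[x] = number of risks observed with exactly x claims."""
    total = sum(counts)
    dens = [c / total for c in counts]
    profile = []
    for x in range(len(counts) - 1):
        if counts[x] > 0:
            profile.append((x + 1) * dens[x + 1] / dens[x])
        else:
            profile.append(None)   # ratio undefined where the data runs out
    return profile

# Claim counts 0 through 8 from the example above (ignoring the empty 9+ cell):
counts = [17649, 4829, 1106, 229, 44, 9, 4, 1, 1]
for x, ratio in enumerate(accident_profile(counts)):
    print(x, round(ratio, 5))
# The early ratios rise roughly linearly with slope about 0.2,
# pointing to a Negative Binomial, since a = beta/(1+beta) > 0.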

Comparing the Mean to the Variance:


Another way to decide which of the members of the (a,b,0) class is most likely to fit a given set of
data is to compare the sample mean and sample variance.
Binomial               Mean > Variance

Poisson                Mean = Variance94

Negative Binomial      Mean < Variance

93 There is not a numerical statistical test to perform, such as with the Chi-Square Test.
94 For data from a Poisson Distribution, the sample mean and sample variance will be approximately equal rather than
equal, because any finite sample of data is subject to random fluctuations.


Problems:
12.1 (2 points) You are given the following accident data:
Number of accidents    Number of policies
0                      91,304
1                      7,586
2                      955
3                      133
4                      18
5                      3
6                      1
7+                     0
Total                  100,000
Which of the following distributions would be the most appropriate model for this data?
(A) Binomial
(B) Poisson
(C) Negative Binomial, r ≤ 1
(D) Negative Binomial, r > 1
(E) None of A, B, C, or D
12.2 (3 points) You are given the following accident data:
Number of accidents    Number of policies
0                      860
1                      2057
2                      2506
3                      2231
4                      1279
5                      643
6                      276
7                      101
8                      41
9                      4
10                     2
11 & +                 0
Total                  10,000
Which of the following distributions would be the most appropriate model for this data?
(A) Binomial
(B) Poisson
(C) Negative Binomial, r ≤ 1
(D) Negative Binomial, r > 1
(E) None of the above


12.3 (3 points) You are given the following data on the number of runs scored during half innings of
major league baseball games from 1980 to 1998:
Runs    Number of Occurrences
0       518,288
1       105,070
2       47,936
3       21,673
4       9736
5       4033
6       1689
7       639
8       274
9       107
10      36
11      25
12      5
13      7
14      1
15      0
16      1
Total   709,460
Which of the following distributions would be the most appropriate model for this data?
(A) Binomial
(B) Poisson
(C) Negative Binomial, r ≤ 1
(D) Negative Binomial, r > 1
(E) None of the above


12.4 (3 points) You are given the following accident data:


Number of accidents    Number of policies
0                      820
1                      1375
2                      2231
3                      1919
4                      1397
5                      1002
6                      681
7                      330
8                      172
9                      56
10                     14
11                     3
12 & +                 0
Total                  10,000
Which of the following distributions would be the most appropriate model for this data?
(A) Binomial
(B) Poisson
(C) Negative Binomial, r ≤ 1
(D) Negative Binomial, r > 1
(E) None of the above
12.5 (2 points) You are given the following distribution of the number of claims per policy during a
one-year period for 20,000 policies.
Number of claims per policy    Number of Policies
0                              6503
1                              8199
2                              4094
3                              1073
4                              128
5                              3
6+                             0
Which of the following distributions would be the most appropriate model for this data?
(A) Binomial
(B) Poisson
(C) Negative Binomial, r ≤ 1
(D) Negative Binomial, r > 1
(E) None of the above


12.6 (2 points) You are given the following distribution of the number of claims on motor vehicle
polices:
Number of claims in a year    Observed frequency
0                             565,664
1                             68,714
2                             5,177
3                             365
4                             24
5                             6
6                             0
Which of the following distributions would be the most appropriate model for this data?
(A) Binomial
(B) Poisson
(C) Negative Binomial, r ≤ 1
(D) Negative Binomial, r > 1
(E) None of the above
12.7 (4, 5/00, Q.40) (2.5 points)
You are given the following accident data from 1000 insurance policies:
Number of accidents    Number of policies
0                      100
1                      267
2                      311
3                      208
4                      87
5                      23
6                      4
7+                     0
Total                  1000
Which of the following distributions would be the most appropriate model for this data?
(A) Binomial
(B) Poisson
(C) Negative Binomial
(D) Normal
(E) Gamma


12.8 (4, 11/03, Q.32 & 2009 Sample Q.25) (2.5 points)
The distribution of accidents for 84 randomly selected policies is as follows:
Number of Accidents    Number of Policies
0                      32
1                      26
2                      12
3                      7
4                      4
5                      2
6                      1
Total                  84
Which of the following models best represents these data?
(A) Negative binomial
(B) Discrete uniform
(C) Poisson
(D) Binomial
(E) Either Poisson or Binomial


Solutions to Problems:
12.1. C. Calculate (x+1)f(x+1)/f(x). Since it is approximately linear, we seem to have a member of
the (a, b, 0) class. f(x+1)/f(x) = a + b/(x+1), so (x+1)f(x+1)/f(x) = a(x+1) + b = ax + a + b.
The slope is positive, so a > 0 and we have a Negative Binomial.
The slope is a ≈ 0.17. The intercept is about 0.08. Thus a + b ≈ 0.08.
Therefore, b ≈ 0.08 - 0.17 = -0.09 < 0.
For the Negative Binomial, b = (r-1)β/(1+β). Thus b < 0 implies r < 1.

Number of Accidents    Observed    Observed Density Function    (x+1)f(x+1)/f(x)    Differences
0                      91,304      0.91304                      0.083
1                      7,586       0.07586                      0.252               0.169
2                      955         0.00955                      0.418               0.166
3                      133         0.00133                      0.541               0.124
4                      18          0.00018                      0.833               0.292
5                      3           0.00003                      2.000
6                      1           0.00001
7+                     0           0.00000

Comment: Similar to 4, 5/00, Q.40. Do not put much weight on the values of (x+1)f(x+1)/f(x) in the
righthand tail, which can be greatly affected by random fluctuation.
The first moment is 0.09988, and the second moment is 0.13002.
The variance is: 0.13002 - 0.09988² = 0.12004, significantly greater than the mean.


12.2. B. Calculate (x+1)f(x+1)/f(x). Since it is approximately linear, we seem to have a member of
the (a, b, 0) class. f(x+1)/f(x) = a + b/(x+1), so (x+1)f(x+1)/f(x) = a(x+1) + b = ax + a + b.
The slope seems close to zero, until the data starts to get thin, so a ≈ 0 and therefore we
assume this data probably came from a Poisson.

Number of Accidents    Observed    Observed Density Function    (x+1)f(x+1)/f(x)
0                      860         0.0860                       2.392
1                      2,057       0.2057                       2.437
2                      2,506       0.2506                       2.671
3                      2,231       0.2231                       2.293
4                      1,279       0.1279                       2.514
5                      643         0.0643                       2.575
6                      276         0.0276                       2.562
7                      101         0.0101                       3.248
8                      41          0.0041                       0.878
9                      4           0.0004                       5.000
10                     2           0.0002

Comment: Any actual data set is subject to random fluctuation, and therefore the observed slope of
the accident profile will never be exactly zero. One can never distinguish between the possibility
that the model was a Binomial with q small, a Poisson, or a Negative Binomial with β small.
This data was simulated as 10,000 independent random draws from a Poisson with λ = 2.5.
12.3. E. Calculate (x+1)f(x+1)/f(x).
Note that f(x+1)/f(x) = (number with x+1)/(number with x).
Since (x+1)f(x+1)/f(x) is not linear, we do not have a member of the (a, b, 0) class.

Number of Runs    Observed    (x+1)f(x+1)/f(x)    Differences
0                 518,288     0.203
1                 105,070     0.912               0.710
2                 47,936      1.356               0.444
3                 21,673      1.797               0.441
4                 9,736       2.071               0.274
5                 4,033       2.513               0.442
6                 1,689       2.648               0.136
7                 639         3.430               0.782
8                 274         3.515               0.084
9                 107         3.364               -0.150
10                36          7.639               4.274
11                25

Comment: At high numbers of runs, where the data starts to thin out, one would not put much
reliance on the values of (x+1)f(x+1)/f(x). The data is taken from An Analytic Model for Per-inning
Scoring Distributions, by Keith Woolner.


12.4. E. Calculate (x+1)f(x+1)/f(x). Since it does not appear to be linear, we do not seem to
have a member of the (a, b, 0) class.

Number of Accidents    Observed    Observed Density Function    (x+1)f(x+1)/f(x)
0                      820         0.0820                       1.677
1                      1,375       0.1375                       3.245
2                      2,231       0.2232                       2.580
3                      1,919       0.1920                       2.912
4                      1,397       0.1397                       3.586
5                      1,002       0.1002                       4.078
6                      681         0.0681                       3.392
7                      330         0.0330                       4.170
8                      172         0.0172                       2.930
9                      56          0.0056                       2.500
10                     14          0.0014                       2.357
11                     3           0.0003
12.5. A. Calculate (x+1)f(x+1)/f(x) = (x+1)(number with x+1)/(number with x).

Number of Claims    Observed    (x+1)f(x+1)/f(x)    Differences
0                   6,503       1.261
1                   8,199       0.999               -0.262
2                   4,094       0.786               -0.212
3                   1,073       0.477               -0.309
4                   128         0.117               -0.360
5                   3

Since (x+1)f(x+1)/f(x) is approximately linear, we probably have a member of the (a, b, 0) class.
a = slope < 0. ⇒ Binomial Distribution.
Comment: The data was simulated from a Binomial Distribution with m = 5 and q = 0.2.
12.6. E. Calculate (x+1)f(x+1)/f(x).
Note that f(x+1)/f(x) = (number with x+1)/(number with x).

Number of Claims    Observed    (x+1)f(x+1)/f(x)    Differences
0                   565,664     0.121
1                   68,714      0.151               0.029
2                   5,177       0.212               0.061
3                   365         0.263               0.052
4                   24          1.250               0.987
5                   6

Even ignoring the final value, (x+1)f(x+1)/f(x) is not linear.


Therefore, we do not have a member of the (a, b, 0) class.
Comment: Data taken from Table 6.6.2 in Introductory Statistics with Applications in General
Insurance by Hossack, Pollard and Zehnwirth. See also Table 6.5 in Loss Models.


12.7. A. Calculate (x+1)f(x+1)/f(x). Since it seems to be decreasing linearly, we seem to have a
member of the (a, b, 0) class, with a < 0, which is a Binomial Distribution.

Number of Accidents    Observed    Observed Density Function    (x+1)f(x+1)/f(x)
0                      100         0.10000                      2.67
1                      267         0.26700                      2.33
2                      311         0.31100                      2.01
3                      208         0.20800                      1.67
4                      87          0.08700                      1.32
5                      23          0.02300                      1.04
6                      4           0.00400
7+                     0           0.00000

Alternately, the mean is 2, and the second moment is 5.494. Therefore, the sample variance is
(1000/999)(5.494 - 2²) = 1.495. Since the variance is significantly less than the mean, this indicates
a Binomial Distribution.
Comment: One would not use a continuous distribution such as the Normal or the Gamma to model
a frequency distribution. (x+1)f(x+1)/f(x) = a(x+1) + b. In this case, a ≈ -0.33.
For the Binomial, a = -q/(1-q), so q ≈ 0.25. In this case, b ≈ 2.67 + 0.33 = 3.00.
For the Binomial, b = (m+1)q/(1-q), so m ≈ (3/0.33) - 1 = 8.
12.8. A. Calculate (x+1)f(x+1)/f(x). For example, (3)(7/84)/(12/84) = (3)(7)/12 = 1.75.

Number of Accidents    Observed    (x+1)f(x+1)/f(x)
0                      32          0.81
1                      26          0.92
2                      12          1.75
3                      7           2.29
4                      4           2.50
5                      2           3.00
6                      1

Since this quantity seems to be increasing roughly linearly, we seem to have a member of the
(a, b, 0) class, with a = slope > 0, which is a Negative Binomial Distribution.
Alternately, the mean is: 103/84 = 1.226, and the second moment is: 287/84 = 3.417.
The sample variance is: (84/83)(3.417 - 1.226²) = 1.937. Since the sample variance is significantly
more than the sample mean, this indicates a Negative Binomial.
Comment: If (x+1)f(x+1)/f(x) had been approximately linear with a slope that was close to zero,
then one could not distinguish between the possibility that the model was a Binomial with q small, a
Poisson, or a Negative Binomial with β small. If the correct model were the discrete uniform, then we
would expect the observed number of policies to be similar for each number of accidents.


Section 13, Zero-Truncated Distributions95


Frequency distributions can be constructed that have support on the positive integers, or alternately
have f(0) = 0. For example, let f(x) = (e⁻³ 3ˣ / x!) / (1 - e⁻³), for x = 1, 2, 3, ...

x       f(x)        F(x)
1       15.719%     15.719%
2       23.578%     39.297%
3       23.578%     62.875%
4       17.684%     80.558%
5       10.610%     91.169%
6       5.305%      96.4736%
7       2.274%      98.74718%

Exercise: Verify that the sum of f(x) = (e⁻³ 3ˣ / x!) / (1 - e⁻³) for x = 1 to ∞ is unity.

[Solution: The sum of the Poisson Distribution from 0 to ∞ is 1: Σ_{x=0 to ∞} e⁻³ 3ˣ / x! = 1.

Therefore, Σ_{x=1 to ∞} e⁻³ 3ˣ / x! = 1 - e⁻³. ⇒ Σ_{x=1 to ∞} f(x) = 1.]

This is an example of a Poisson Distribution Truncated from Below at Zero, with λ = 3.


In general, if f is a distribution on 0, 1, 2, 3,...,

then g(x) = f(x) / {1 - f(0)} is a distribution on 1, 2, 3, ...

This is a special case of truncation from below. The general concept of truncation of a distribution is
covered in "Mahler's Guide to Loss Distributions."
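
As a quick numerical illustration of this construction (my own sketch, not from the text), the following Python code builds the zero-truncated density from an untruncated Poisson and checks that it reproduces the table above and sums to one.

# Illustrative sketch: build a zero-truncated density g from an untruncated
# density f on 0, 1, 2, ... via g(x) = f(x) / (1 - f(0)) for x >= 1.
from math import exp, factorial

def poisson_pmf(lam, x):
    return exp(-lam) * lam**x / factorial(x)

def zero_truncated(f, x):
    if x < 1:
        return 0.0
    return f(x) / (1.0 - f(0))

f = lambda x: poisson_pmf(3.0, x)
print(zero_truncated(f, 1))                               # 0.15719..., as in the table above
print(sum(zero_truncated(f, x) for x in range(1, 60)))    # approximately 1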
We have the following three examples, shown in Appendix B.3.1 of Loss Models:

Distribution          Density of the Zero-Truncated Distribution

Binomial              [m! / {x! (m-x)!}] qˣ (1-q)^(m-x) / {1 - (1-q)^m},    x = 1, 2, 3,..., m

Poisson               (e^-λ λˣ / x!) / (1 - e^-λ),    x = 1, 2, 3,...

Negative Binomial     [r(r+1)...(r+x-1) / x!] βˣ / (1+β)^(x+r) / {1 - 1/(1+β)^r},    x = 1, 2, 3,...

95 See Section 6.7 in Loss Models.


Moments:
Exercise: For a Zero-Truncated Poisson with λ = 3, what is the mean?
[Solution: Let f(x) be the untruncated Poisson, and g(x) be the truncated distribution.
Then g(x) = f(x) / {1 - f(0)}.
The mean of g = Σ_{x=1 to ∞} x g(x) = Σ_{x=1 to ∞} x f(x) / {1 - f(0)} = Σ_{x=0 to ∞} x f(x) / {1 - f(0)}
= (mean of f) / {1 - f(0)} = λ / (1 - e^-λ) = 3 / (1 - e⁻³) = 3.157.]

In general, the moments of a zero-truncated distribution, g, are given in terms of those of the
corresponding untruncated distribution, f, by: E_g[Xⁿ] = E_f[Xⁿ] / {1 - f(0)}.

For example, for the Zero-Truncated Poisson the mean is: λ / (1 - e^-λ),
while the second moment is: (λ + λ²) / (1 - e^-λ).

Exercise: For a Zero-Truncated Poisson with λ = 3 what is the second moment?
[Solution: Let f(x) be the untruncated Poisson, and g(x) be the truncated distribution.
Then g(x) = f(x) / {1 - f(0)}.
The second moment of f is its variance plus the square of its mean = λ + λ².
The second moment of g = (the second moment of f) / {1 - f(0)} =
(λ + λ²) / (1 - e^-λ) = (3 + 3²) / (1 - e⁻³) = 12.629.]

Thus a Zero-Truncated Poisson with λ = 3 has a variance of 12.629 - 3.157² = 2.66.
This matches the result of using the formula in Appendix B of Loss Models:
λ{1 - (λ+1)e^-λ} / (1 - e^-λ)² = (3){1 - 4e⁻³} / (1 - e⁻³)² = (3)(0.8009)/(0.9502)² = 2.66.
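
A short numerical check of these moment formulas (my own sketch, not from the text) for the zero-truncated Poisson with λ = 3:

# Illustrative check of the zero-truncated Poisson moments for lambda = 3,
# using E_g[X^n] = E_f[X^n] / (1 - f(0)).
from math import exp

lam = 3.0
f0 = exp(-lam)                                   # untruncated Poisson density at 0
mean_zt = lam / (1.0 - f0)                       # 3.157
second_zt = (lam + lam**2) / (1.0 - f0)          # 12.629
var_zt = second_zt - mean_zt**2                  # 2.66
print(round(mean_zt, 3), round(second_zt, 3), round(var_zt, 2))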
It turns out that for the Zero-Truncated Negative Binomial, the parameter r can take on values
between -1 and 0, as well as the usual positive values, r > 0. This is sometimes referred to as the
Extended Zero-Truncated Negative Binomial; however, provided r ≠ 0 all the same formulas apply.
As r approaches zero, the Zero-Truncated Negative Binomial approaches the Logarithmic
Distribution.


Logarithmic Distribution:96

The Logarithmic Distribution with parameter β has support equal to the positive integers:

f(x) = {β / (1+β)}ˣ / {x ln(1+β)}, for x = 1, 2, 3,...

with mean: β / ln(1+β), and variance: β{1 + β - β/ln(1+β)} / ln(1+β).

a = β/(1+β).    b = -β/(1+β).    P(z) = 1 - ln[1 - β(z-1)] / ln[1+β], z < 1 + 1/β.

Exercise: Assume the number of vehicles involved in each automobile accident is given by
f(x) = 0.2ˣ / {x ln(1.25)}, for x = 1, 2, 3,...
Then what is the mean number of vehicles involved per automobile accident?
[Solution: This is a Logarithmic Distribution with β = 0.25. Mean = β/ln(1+β) = 0.25/ln(1.25) = 1.12.
Comment: β/(1+β) = 0.25/1.25 = 0.2.]
The density function of this Logarithmic Distribution with β = 0.25 is as follows:

x       f(x)        F(x)
1       89.6284%    89.628%
2       8.9628%     98.591%
3       1.1950%     99.786%
4       0.1793%     99.966%
5       0.0287%     99.994%
6       0.0048%     99.9990%
7       0.0008%     99.9998%
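
The densities and mean of a Logarithmic Distribution are easy to verify numerically. Here is a small Python sketch (mine, for illustration only) that reproduces the table above.

# Illustrative sketch: density, distribution function, and mean of a
# Logarithmic Distribution with parameter beta.
from math import log

def logarithmic_pmf(beta, x):
    return (beta / (1.0 + beta))**x / (x * log(1.0 + beta))

beta = 0.25
print("mean =", round(beta / log(1.0 + beta), 2))   # 1.12, as in the exercise above
cdf = 0.0
for x in range(1, 8):
    cdf += logarithmic_pmf(beta, x)
    print(x, round(100 * logarithmic_pmf(beta, x), 4), round(100 * cdf, 4))
# Reproduces the table above: f(1) = 89.6284%, F(7) = 99.9998%, etc.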

Exercise: Show that the densities of a Logarithmic Distribution sum to one.

Hint: ln[1/(1-y)] = Σ_{k=1 to ∞} yᵏ / k, for |y| < 1.97

[Solution: Σ_{k=1 to ∞} f(k) = {1/ln(1+β)} Σ_{k=1 to ∞} {β/(1+β)}ᵏ / k.

Let y = β/(1+β). Then, 1/(1-y) = 1 + β.

Thus Σ_{k=1 to ∞} f(k) = {1/ln(1+β)} Σ_{k=1 to ∞} yᵏ / k = ln[1/(1-y)] / ln(1+β) = ln(1+β) / ln(1+β) = 1.]

96 Sometimes called instead a Log Series Distribution.
97 Not something you need to know for your exam. This result can be derived as a Taylor series.



Exercise: Show that the limit as r → 0 of Zero-Truncated Negative Binomial Distributions with
the other parameter β fixed, is a Logarithmic Distribution with parameter β.
[Solution: For the Zero-Truncated Negative Binomial Distribution:
f(x) = [{(r+x-1)! / ((r-1)! x!)} βˣ / (1+β)^(x+r)] / {1 - 1/(1+β)^r}
= {r(r+1)...(r+x-1) / x!} {β/(1+β)}ˣ / {(1+β)^r - 1}.
lim_{r→0} f(x) = {{β/(1+β)}ˣ / x!} lim_{r→0} (r+1)...(r+x-1) {r / ((1+β)^r - 1)} =
{{β/(1+β)}ˣ / x!} (x-1)! lim_{r→0} r / {(1+β)^r - 1} = {{β/(1+β)}ˣ / x} lim_{r→0} 1 / {ln(1+β) (1+β)^r} =
{{β/(1+β)}ˣ / x} / ln(1+β). Where I have used L'Hospital's Rule.
This is the density of a Logarithmic Distribution.
Alternately, the p.g.f. of a Zero-Truncated Negative Binomial Distribution is:
P(z) = [{1 - β(z-1)}^-r - (1+β)^-r] / {1 - (1+β)^-r}.
lim_{r→0} P(z) = lim_{r→0} [{1 - β(z-1)}^-r - (1+β)^-r] / {1 - (1+β)^-r} =
lim_{r→0} [-ln(1 - β(z-1)) {1 - β(z-1)}^-r + ln(1+β) (1+β)^-r] / {ln(1+β) (1+β)^-r} =
{ln(1+β) - ln[1 - β(z-1)]} / ln(1+β) = 1 - ln[1 - β(z-1)] / ln(1+β).
Where I have used L'Hospital's Rule.
This is the p.g.f. of a Logarithmic Distribution.]
(a,b,1) Class:98
The (a,b,1) class of frequency distributions in Loss Models is a generalization of the (a,b,0)
class. As with the (a,b,0) class, the recursion formula: f(x)/f(x-1) = a + b/x applies.
However, this relationship need only apply now for x ≥ 2, rather than x ≥ 1.
Members of the (a,b,1) family include: all the members of the (a,b,0) family,99 the zero-truncated
versions of those distributions: Zero-Truncated Binomial, Zero-Truncated Poisson, and
Extended Truncated Negative Binomial,100 and the Logarithmic Distribution.
In addition the (a,b,1) class includes the zero-modified distributions corresponding to these, to be
discussed in the next section.

98 See Table 6.4 and Appendix B.3 in Loss Models.
99 Binomial, Poisson, and the Negative Binomial.
100 The Zero-Truncated Negative Binomial where in addition to r > 0, -1 < r < 0 is also allowed.


Probability Generating Functions:

The Probability Generating Function, P(z) = E[z^N], for a zero-truncated distribution can be obtained
from that for the untruncated distribution:

PT(z) = {P(z) - f(0)} / {1 - f(0)},

where P(z) is the p.g.f. for the untruncated distribution, PT(z) is the p.g.f. for the zero-truncated
distribution, and f(0) is the probability at zero for the untruncated distribution.

Exercise: What is the Probability Generating Function for a Zero-Truncated Poisson Distribution?
[Solution: For the untruncated Poisson P(z) = e^(λ(z-1)). f(0) = e^-λ.
PT(z) = {P(z) - f(0)} / {1 - f(0)} = {e^(λ(z-1)) - e^-λ} / {1 - e^-λ} = {e^(λz) - 1} / {e^λ - 1}.]

One can derive this relationship as follows:

PT(z) = Σ_{n=1 to ∞} zⁿ g(n) = Σ_{n=1 to ∞} zⁿ f(n) / {1 - f(0)} = {Σ_{n=0 to ∞} zⁿ f(n) - f(0)} / {1 - f(0)} = {P(z) - f(0)} / {1 - f(0)}.

In any case, Appendix B of Loss Models displays the Probability Generating Functions for all of the
Zero-Truncated Distributions.
Loss Models Notation:

pₖ ≡ the density function of the untruncated frequency distribution at k.
pₖT ≡ the density function of the zero-truncated frequency distribution at k.
pₖM ≡ the density function of the zero-modified frequency distribution at k.101

Exercise: Give a verbal description of the following terms: p₇, p₄M, and p₆T.
[Solution: p₇ is the density of the frequency at 7, f(7).
p₄M is the density of the zero-modified frequency at 4, fM(4).
p₆T is the density of the zero-truncated frequency at 6, fT(6).]

101 Zero-modified distributions will be discussed in the next section.


Problems:
13.1 (1 point) The number of persons injured in an accident is assumed to follow a
Zero-Truncated Poisson Distribution with parameter λ = 0.3.
Given an accident, what is the probability that exactly 3 persons were injured in it?
A. Less than 1.0%
B. At least 1.0% but less than 1.5%
C. At least 1.5% but less than 2.0%
D. At least 2.0% but less than 2.5%
E. At least 2.5%
Use the following information for the next four questions:
The number of vehicles involved in an automobile accident is given by a Zero-Truncated Binomial
Distribution with parameters q = 0.3 and m = 5.
13.2 (1 point) What is the mean number of vehicles involved in an accident?
A. less than 1.8
B. at least 1.8 but less than 1.9
C. at least 1.9 but less than 2.0
D. at least 2.0 but less than 2.1
E. at least 2.1
13.3 (2 points) What is the variance of the number of vehicles involved in an accident?
A. less than 0.5
B. at least 0.5 but less than 0.6
C. at least 0.6 but less than 0.7
D. at least 0.7 but less than 0.8
E. at least 0.8
13.4 (1 point) What is the chance of observing exactly 3 vehicles involved in an accident?
A. less than 11%
B. at least 11% but less than 13%
C. at least 13% but less than 15%
D. at least 15% but less than 17%
E. at least 17%
13.5 (2 points) What is the median number of vehicles involved in an accident?
A. 1 B. 2 C. 3 D. 4 E. 5


Use the following information for the next five questions:


The number of family members is given by a Zero-Truncated Negative Binomial Distribution with
parameters r = 4 and β = 0.5.
13.6 (1 point) What is the mean number of family members?
A. less than 2.0
B. at least 2.0 but less than 2.1
C. at least 2.1 but less than 2.2
D. at least 2.2 but less than 2.3
E. at least 2.3
13.7 (2 points) What is the variance of the number of family members?
A. less than 2.0
B. at least 2.0 but less than 2.2
C. at least 2.2 but less than 2.4
D. at least 2.4 but less than 2.6
E. at least 2.6
13.8 (2 points) What is the chance of a family having 7 members?
A. less than 1.1%
B. at least 1.1% but less than 1.3%
C. at least 1.3% but less than 1.5%
D. at least 1.5% but less than 1.7%
E. at least 1.7%
13.9 (3 points) What is the probability of a family having more than 5 members?
A. less than 1%
B. at least 1%, but less than 3%
C. at least 3%, but less than 5%
D. at least 5%, but less than 7%
E. at least 7%
13.10 (1 point) What is the probability generating function?


Use the following information for the next three questions:


A Logarithmic Distribution with parameter β = 2.
13.11 (1 point) What is the mean?
A. less than 2.0
B. at least 2.0 but less than 2.1
C. at least 2.1 but less than 2.2
D. at least 2.2 but less than 2.3
E. at least 2.3
13.12 (2 points) What is the variance?
A. less than 2.0
B. at least 2.0 but less than 2.2
C. at least 2.2 but less than 2.4
D. at least 2.4 but less than 2.6
E. at least 2.6
13.13 (1 point) What is the density function at 6?
A. less than 1.1%
B. at least 1.1% but less than 1.3%
C. at least 1.3% but less than 1.5%
D. at least 1.5% but less than 1.7%
E. at least 1.7%

13.14 (1 point) For a Zero-Truncated Negative Binomial Distribution with parameters r = -0.6
and β = 3, what is the density function at 5?
A. less than 1.1%
B. at least 1.1% but less than 1.3%
C. at least 1.3% but less than 1.5%
D. at least 1.5% but less than 1.7%
E. at least 1.7%


Use the following information for the next five questions:


The number of days per hospital stay is given by a Zero-Truncated Poisson Distribution with
parameter λ = 2.5.
13.15 (1 point) What is the mean number of days per hospital stay?
A. less than 2.5
B. at least 2.5 but less than 2.6
C. at least 2.6 but less than 2.7
D. at least 2.7 but less than 2.8
E. at least 2.8
13.16 (2 points) What is the variance of the number of days per hospital stay?
A. less than 2.2
B. at least 2.2 but less than 2.3
C. at least 2.3 but less than 2.4
D. at least 2.4 but less than 2.5
E. at least 2.5
13.17 (1 point) What is the chance that a hospital stay is 6 days?
A. less than 3%
B. at least 3% but less than 4%
C. at least 4% but less than 5%
D. at least 5% but less than 6%
E. at least 6%
13.18 (2 points) What is the chance that a hospital stay is fewer than 4 days?
A. less than 50%
B. at least 50% but less than 60%
C. at least 60% but less than 70%
D. at least 70% but less than 80%
E. at least 80%
13.19 (2 points) What is the mode of this frequency distribution?
A. 1
B. 2
C. 3
D. 4
E. 5
13.20 (4 points) Let X follow an Exponential with mean θ.
Let Y be the minimum of a random sample from X of size k.
However, K in turn follows a Logarithmic Distribution with parameter β.
What is the distribution function of Y?


Use the following information for the next 2 questions:


Harvey Wallbanker, the Automatic Teller Machine, works 24 hours a day, seven days a week,
without a vacation or even an occasional day off.
Harvey services on average one customer every 10 minutes.
60% of Harveys customers are male and 40% are female.
The gender of a customer is independent of the gender of the previous customers.
Harveys hobby is to observe patterns of customers. For example, FMF denotes a female
customer, followed by a male customer, followed by a female customer.
Harvey starts looking at customers who arrive after serving Pat, his most recent customer.
How long does it take on average until he sees the following patterns.
13.21 (2 points) How long on average until Harvey sees M?
13.22 (2 points) How long on average until Harvey sees F?

13.23 (1 point) X and Y are independently, identically distributed


Zero-Truncated Poisson Distributions, each with λ = 3.
What is the probability generating function of their sum?
13.24 (3 points) Let X follow an Exponential with mean θ.
Let Y be the minimum of a random sample from X of size k.
However, K in turn follows a Zero-Truncated Geometric Distribution with parameter β.
What is the mean of Y?
Hint: The densities of a Logarithmic Distribution sum to one.
A. θ / (1 + β)
B. (θ/β) ln[1 + β]
C. θ / (1 + ln[1 + β])
D. θ (1 + β)
E. None of A, B, C, or D.
13.25 (5 points) At the Hyperion Hotel, the number of days a guest stays is distributed via
a zero-truncated Poisson with λ = 4.
On the day they check out, each guest leaves a tip for the maid equal to $3 per day of their stay.
The guest in room 666 is checking out today. What is the expected value of the tip?


13.26 (5 points) The Krusty Burger Restaurant has started a new sales promotion.
With the purchase of each meal they give the customer a coupon.
There are ten different coupons, each with the face of a different famous resident of Springfield.
A customer is equally likely to get each type of coupon, independent of the other coupons he has
gotten in the past.
Once you get one coupon of each type, you can turn your 10 different coupons for a free meal.
(a) Assuming a customer saves his coupons, and does not trade with anyone else, what is the
mean number of meals he must buy until he gets a free meal?
(b) What is the variance of the number of meals until he gets a free meal?
13.27 (Course 151 Sample Exam #1, Q.12) (1.7 points)
A new business has initial capital 700 and will have annual net earnings of 1000.
It faces the risk of a one time loss with the following characteristics:

The loss occurs at the end of the year.


The year of the loss is one plus a Geometric distribution with β = 0.538.
(So the loss may either occur at the end of the first year, second year, etc.)

The size of the loss is uniformly distributed on the ten integers:


500,1000,1500, ..., 5000.
Determine the probability of ruin.
(A) 0.00
(B) 0.41
(C) 0.46

(D) 0.60

(E) 0.65


Solutions to Problems:
13.1. B. Let f(x) be the density of a Poisson Distribution; then the distribution truncated from below
at zero is: g(x) = f(x) / {1 - f(0)}. Thus for λ = 0.3, g(x) = {0.3ˣ e^-0.3 / x!} / {1 - e^-0.3}.
g(3) = {0.3³ e^-0.3 / 3!} / {1 - e^-0.3} = 0.00333 / 0.259 = 1.3%.
13.2. B. Mean is that of the non-truncated binomial, divided by 1 - f(0): (0.3)(5) / (1 - 0.7⁵) = 1.803.
13.3. D. The second moment is that of the non-truncated binomial, divided by 1 - f(0):
(1.05 + 1.5²) / (1 - 0.7⁵) = 3.967. Variance = 3.967 - 1.803² = 0.716.
Comment: Using the formula in Appendix B of Loss Models:
Variance = mq{(1-q) - (1 - q + mq)(1-q)^m} / {1 - (1-q)^m}²
= (5)(0.3){0.7 - (0.7 + 1.5)(0.7⁵)} / {1 - 0.7⁵}² = (1.5)(0.3303)/0.8319² = 0.716.
13.4. D. For a non-truncated binomial, f(3) = 5!/{(3!)(2!)} 0.3³ 0.7² = 0.1323. For the zero-truncated
distribution one gets the density by dividing by 1 - f(0): (0.1323) / (1 - 0.7⁵) = 15.9%.
13.5. B. For a discrete distribution such as we have here, employ the convention that the median is
the first value at which the distribution function is greater than or equal to .5.
F(1) = .433 < 50%, F(2) = .804 > 50%, and therefore the median is 2.
Number of Vehicles    Untruncated Binomial    Zero-Truncated Binomial    Cumulative Zero-Truncated Binomial
0                     16.81%
1                     36.02%                  43.29%                     43.29%
2                     30.87%                  37.11%                     80.40%
3                     13.23%                  15.90%                     96.30%
4                     2.83%                   3.41%                      99.71%
5                     0.24%                   0.29%                      100.00%

13.6. E. Mean is that of the non-truncated negative binomial, divided by 1 - f(0):
(4)(0.5) / (1 - 1.5⁻⁴) = 2 / 0.8025 = 2.49.


13.7. D. The second moment is that of the non-truncated negative binomial, divided by 1 - f(0):
(3 + 2²) / (1 - 1.5⁻⁴) = 8.723. Variance = 8.723 - 2.492² = 2.51.
Comment: Using the formula in Appendix B of Loss Models:
Variance = rβ{(1+β) - (1 + β + rβ)(1+β)^-r} / {1 - (1+β)^-r}²
= (4)(0.5){1.5 - (1 + 0.5 + 2)(1.5⁻⁴)} / (1 - 1.5⁻⁴)² = (2)(0.8086)/0.8025² = 2.51.
The non-truncated negative binomial has mean = rβ = 2, and variance = rβ(1+β) = 3,
and thus a second moment of: 3 + 2² = 7.
13.8. C. For the non-truncated negative binomial,
f(7) = (4)(5)(6)(7)(8)(9)(10) 0.5⁷ / {(7!)(1.5¹¹)} = 1.08%. For the zero-truncated distribution one gets
the density by dividing by 1 - f(0): (1.08%) / (1 - 1.5⁻⁴) = 1.35%.
13.9. D. The chance of more than 5 is: 1 - 0.9471 = 5.29%.

Number of Members    Untruncated Neg. Binomial    Zero-Truncated Neg. Binomial    Cumulative Zero-Truncated Neg. Binomial
0                    19.75%
1                    26.34%                       32.82%                          32.82%
2                    21.95%                       27.35%                          60.17%
3                    14.63%                       18.23%                          78.40%
4                    8.54%                        10.64%                          89.04%
5                    4.55%                        5.67%                           94.71%
6                    2.28%                        2.84%                           97.55%
7                    1.08%                        1.35%                           98.90%
8                    0.50%                        0.62%                           99.52%
9                    0.22%                        0.28%                           99.79%
13.10. As shown in Appendix B of Loss Models,
P(z) = [{1 - β(z-1)}^-r - (1+β)^-r] / [1 - (1+β)^-r] = [(1.5 - 0.5z)⁻⁴ - 1/1.5⁴] / [1 - 1/1.5⁴]
= [1.5⁴ / (1.5 - 0.5z)⁴ - 1] / [1.5⁴ - 1].
Alternately, for the Negative Binomial, f(0) = 1/(1+β)^r = 1/1.5⁴,
and P(z) = {1 - β(z-1)}^-r = {1 - (0.5)(z - 1)}⁻⁴ = (1.5 - 0.5z)⁻⁴.
PT(z) = {P(z) - f(0)} / {1 - f(0)} = [(1.5 - 0.5z)⁻⁴ - 1/1.5⁴] / [1 - 1/1.5⁴] = [1.5⁴ / (1.5 - 0.5z)⁴ - 1] / [1.5⁴ - 1].
Comment: This probability generating function only exists for z < 1 + 1/β = 1 + 1/0.5 = 3.
13.11. A. Mean of the logarithmic distribution is: β/ln(1+β) = 2 / ln(3) = 1.82.


13.12. B. Variance of the logarithmic distribution is: β{1 + β - β/ln(1+β)} / ln(1+β) =
2{3 - 1.82} / ln(3) = 2.15.
13.13. C. For the logarithmic distribution, f(x) = {β/(1+β)}ˣ / {x ln(1+β)}.
f(6) = (2/3)⁶ / {6 ln(3)} = 1.33%.
13.14. A. For the zero-truncated Negative Binomial Distribution,
f(5) = r(r+1)(r+2)(r+3)(r+4) {β/(1+β)}⁵ / {(5!)((1+β)^r - 1)} =
(-0.6)(0.4)(1.4)(2.4)(3.4)(3/4)⁵ / {(120)(4^-0.6 - 1)} = (-2.742)(0.2373) / {(120)(-0.5647)} = 0.96%.
Comment: Note this is an extended zero-truncated negative binomial distribution, with
0 > r > -1. The same formulas apply as when r > 0. (As r approaches zero one gets a logarithmic
distribution.) For the untruncated negative binomial distribution we must have r > 0. So in this case
there is no corresponding untruncated distribution.
13.15. D. Mean is that of the non-truncated Poisson, divided by 1 - f(0):
(2.5) / (1 - e^-2.5) = 2.5/0.9179 = 2.724.
Comment: Note that since the probability at zero has been distributed over the positive integers,
the mean is larger for the zero-truncated distribution than for the corresponding untruncated
distribution.
13.16. A. The second moment is that of the non-truncated Poisson, divided by 1 - f(0):
(2.5 + 2.5²) / (1 - e^-2.5) = 9.533. Variance = 9.533 - 2.724² = 2.11.
Comment: Using the formula in Appendix B of Loss Models:
Variance = λ{1 - (λ+1)e^-λ} / (1 - e^-λ)² = (2.5){1 - 3.5e^-2.5}/(1 - e^-2.5)² = (2.5)(0.7127)/0.9179² = 2.11.
13.17. B. For an untruncated Poisson, f(6) = (2.5⁶)e^-2.5/6! = 0.0278. For the zero-truncated
distribution one gets the density by dividing by 1 - f(0): (0.0278) / (1 - e^-2.5) = 3.03%.


13.18. D. One adds up the chances of 1, 2 and 3 days, and gets 73.59%.

Number of Days    Untruncated Poisson    Zero-Truncated Poisson    Cumulative Zero-Truncated Poisson
0                 8.21%
1                 20.52%                 22.36%                    22.36%
2                 25.65%                 27.95%                    50.30%
3                 21.38%                 23.29%                    73.59%
4                 13.36%                 14.55%                    88.14%
5                 6.68%                  7.28%                     95.42%
6                 2.78%                  3.03%                     98.45%
7                 0.99%                  1.08%                     99.54%
8                 0.31%                  0.34%                     99.88%
Comment: By definition, there is no probability of zero items for a zero-truncated distribution.


13.19. B. The mode is where the density function is greatest, 2.
Number of Days    Untruncated Poisson    Zero-Truncated Poisson
0                 8.21%
1                 20.52%                 22.36%
2                 25.65%                 27.95%
3                 21.38%                 23.29%
4                 13.36%                 14.55%

Comment: Unless the mode of the untruncated distribution is 0, the mode of the zero-truncated
distribution is the same as that of the untruncated distribution. For example, in this case all the
densities on the positive integers are increased by the same factor 1/(1 - .0821). Thus since the
density at 2 was largest prior to truncation, it remains the largest after truncation at zero.


13.20. Assuming a sample of size k, then
Prob[Min > y | k] = Prob[all elements of the sample > y] = (e^(-y/θ))ᵏ = e^(-yk/θ).
Let pₖ be the Logarithmic density.
Prob[Min > y] = Σ_{k=1 to ∞} Prob[Min > y | k] pₖ = Σ_{k=1 to ∞} (e^(-y/θ))ᵏ pₖ = E_K[(e^(-y/θ))ᵏ].
However, the P.G.F. of a frequency distribution is defined as E[zᵏ].
For the Logarithmic Distribution, P(z) = 1 - ln[1 - β(z-1)] / ln(1+β).
Therefore, taking z = e^(-y/θ),
Prob[Min > y] = 1 - ln[1 - β(e^(-y/θ) - 1)] / ln(1+β).
Thus Prob[Min ≤ y], in other words the distribution function, is:
F(y) = ln[1 - β(e^(-y/θ) - 1)] / ln(1+β), y > 0.
Comment: The distribution of Y is called an Exponential-Logarithmic Distribution.
If one lets p = 1/(1+β), then one can show that F(y) = 1 - ln[1 - (1-p)e^(-y/θ)] / ln(p).
As β approaches 0, in other words as p approaches 1, the distribution of Y approaches an
Exponential Distribution.
The Exponential-Logarithmic Distribution has a declining hazard rate.
In general, if S(x) is the survival function of severity, Y is the minimum of a random sample from X of
size k, and K in turn follows a frequency distribution with support k ≥ 1 and Probability Generating
Function P(z), then F(y) = 1 - P(S(y)).
13.21. The number of customers he has to wait is a Zero-Truncated Geometric Distribution with
β = chance of failure / chance of success = (1 - 0.6)/0.6 = 1/0.6 - 1.
So the mean number of customers is 1/0.6 = 1.67. ⇒ 16.7 minutes on average.
Comment: The mean of the Zero-Truncated Geometric Distribution is:
β / {1 - 1/(1+β)} = 1 + β.


13.22. The number of customers he has to wait is a Zero-Truncated Geometric Distribution with
β = chance of failure / chance of success = (1 - 0.4)/0.4 = 1/0.4 - 1.
So the mean number of customers is 1/0.4 = 2.5. ⇒ 25 minutes on average.
Comment: Longer patterns can be handled via Markov Chain ideas not on the syllabus.
See Example 4.20 in Probability Models by Ross.
13.23. As shown in Appendix B of Loss Models, for the zero-truncated Poisson:
P(z) = {e^(λz) - 1} / {e^λ - 1} = {e^(3z) - 1} / {e³ - 1}.
The p.g.f. for the sum of two independently, identically distributed variables is: P(z) P(z) = P(z)²:
[{e^(3z) - 1} / {e³ - 1}]².
Comment: The sum of two zero-truncated distributions has a minimum of two events.
Therefore, the sum of two zero-truncated Poissons is not a zero-truncated Poisson.
13.24. B. Prob[Min > y | k] = Prob[all elements of the sample > y] = (e^(-y/θ))ᵏ = e^(-yk/θ).
Thus the minimum from a sample of size k follows an Exponential Distribution with mean θ/k.
Therefore, E[Y] = E[E[Y|k]] = E[θ/k] = θ E[1/k].
For a Zero-Truncated Geometric, pₖ = βᵏ⁻¹ / (1+β)ᵏ, for k = 1, 2, 3,...
Thus E[1/k] = (1/β) Σ_{k=1 to ∞} {β/(1+β)}ᵏ / k.
However, for the Logarithmic: pₖ = {β/(1+β)}ᵏ / {k ln(1+β)}, for k = 1, 2, 3,...
Therefore, since these Logarithmic densities sum to one: Σ_{k=1 to ∞} {β/(1+β)}ᵏ / k = ln(1+β).
Thus E[1/k] = (1/β) ln[1+β]. Thus E[Y] = θ E[1/k] = (θ/β) ln[1+β].


13.25. The probability of a stay of length k is pₖT.
If a stay is of length k, the probability that today is the last day is 1/k.
Therefore, for an occupied room picked at random, the probability that its guest is checking out
today is: Σ_{k=1 to ∞} pₖT / k.
The tip for a stay of length k is 3k.
Thus, the expected tip left by the guest checking out of room 666 is:
Σ 3k (pₖT / k) / Σ (pₖT / k) = 3 Σ pₖT / Σ (pₖT / k) = 3 / Σ (pₖT / k).
For the zero-truncated Poisson,
Σ_{k=1 to ∞} pₖT / k = {e^-λ / (1 - e^-λ)} (λ + λ²/4 + λ³/18 + λ⁴/96 + λ⁵/600 + λ⁶/4320 + λ⁷/35,280
+ λ⁸/322,560 + λ⁹/3,265,920 + λ¹⁰/36,288,000 + λ¹¹/439,084,800 + ...) = 0.330.
Thus, the expected tip left by the guest checking out of room 666 is: 3 / 0.330 = 9.09.
Alternately, the (average) tip per day is 3.
3 = (0)(probability not last day) + (average tip if last day)(probability last day).
3 = (average tip if last day)(0.330).
Therefore, the average tip if it is the last day is: 3 / 0.330 = 9.09.


13.26. (a) After the customer gets his first coupon, there is 9/10 probability that his next coupon is
different. Therefore, the number of meals it takes him to get his next unique coupon after his first is
a zero-truncated Geometric Distribution,
with β = (probability of failure) / (probability of success) = (1/10)/(9/10) = 1/9.
(Alternately, it is one plus a Geometric Distribution with β = 1/9.)
Thus the mean number of meals from the first to the second unique coupon is: 1 + 1/9 = 10/9.
After the customer gets his second unique coupon, there is 8/10 probability his next coupon is
different than those he already has. Therefore, the number of meals it takes him to get his third
unique coupon after his second is a zero-truncated Geometric Distribution,
with β = (probability of failure) / (probability of success) = (2/10)/(8/10) = 2/8.
Thus the mean number of meals from the second to the third unique coupon is: 1 + 2/8 = 10/8.
Similarly, the number of meals it takes him to get his fourth unique coupon after his third is a
zero-truncated Geometric Distribution, with β = 3/7, and mean 10/7.
Proceeding in a similar manner, the means to get the remaining coupons are: 10/6 + ... + 10/1.
Including one meal to get the first coupon, the mean total number of meals is:
(10) (1/10 + 1/9 + 1/8 + 1/7 + 1/6 + 1/5 + 1/4 + 1/3 + 1/2 + 1/1) = 29.29.
(b) It takes one meal to get the first coupon; variance is zero.
The number of additional meals to get the second unique coupon is a zero-truncated Geometric
Distribution, with β = 1/9 and variance: (1/9)(10/9).
Similarly, the variance of the number of meals from the second to the third unique coupon is:
(2/8)(10/8).
The number of meals in intervals between unique coupons are independent, so their variances add.
Thus, the variance of the total number of meals is:
(10) (1/92 + 2/82 + 3/72 + 4/62 + 5/52 + 6/42 + 7/32 + 8/22 + 9/12 ) = 125.69.
Comment: The coupon collectors problem.
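
For those who want to sanity-check these answers, here is a short Python simulation sketch (mine, not part of the solution) of the coupon collector's problem with 10 coupon types; with enough trials the simulated mean and variance should land near 29.29 and 125.69.

# Illustrative simulation of the coupon collector's problem with 10 coupons,
# as a rough check on the mean (about 29.29) and variance (about 125.69).
import random

def meals_until_full_set(n_types=10):
    seen, meals = set(), 0
    while len(seen) < n_types:
        meals += 1
        seen.add(random.randrange(n_types))   # each coupon type equally likely
    return meals

random.seed(1)
sims = [meals_until_full_set() for _ in range(100000)]
mean = sum(sims) / len(sims)
var = sum((s - mean)**2 for s in sims) / (len(sims) - 1)
print(round(mean, 2), round(var, 2))   # should be near 29.29 and 125.69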


13.27. D. At the end of year one the business has 1700. Thus, if the loss occurs at the end of year
one, there is ruin if the size of loss is > 1700, a 70% chance. Similarly, at the end of year 2, if the loss
did not occur in year 1, the business has 2700. Thus, if the loss occurs at the end of year two there
is ruin if the size of loss is > 2700, a 50% chance.
If the loss occurs at the end of year three there is ruin if the size of loss is > 3700, a 30% chance.
If the loss occurs at the end of year four there is ruin if the size of loss is > 4700, a 10% chance.
If the loss occurs in year 5 or later there is no chance of ruin.
The probability of the loss being in year n is: {1/(1+β)} {β/(1+β)}ⁿ⁻¹ = 0.65 (0.35ⁿ⁻¹).

Year    Probability of Loss in this Year    Probability of Ruin if Loss Occurs in this Year    Product
1       0.6500                              0.7                                                0.4550
2       0.2275                              0.5                                                0.1138
3       0.0796                              0.3                                                0.0239
4       0.0279                              0.1                                                0.0028
5       0.0098                              0                                                  0.0000
Total                                                                                          0.5954

Alternately, if the loss is of size 500, 1000, or 1500 there is not ruin. If the loss is of size 2000 or
2500, then there is ruin if the loss occurs in year 1. If the loss is of size 3000 or 3500, then there is
ruin if the loss occurs by year 2. If the loss is of size 4000 or 4500, then there is ruin if the loss occurs
by year 3. If the loss is of size 5000, then there is ruin if the loss occurs by year 4.

Size of Loss       Probability of a Loss of this Size    Year by which Loss Must Occur for Ruin    Probability Loss Occurs by that Year    Product
500, 1000, 1500    0.3                                   none                                      0.000                                   0.000
2000 or 2500       0.2                                   1                                         0.650                                   0.130
3000 or 3500       0.2                                   2                                         0.877                                   0.175
4000 or 4500       0.2                                   3                                         0.957                                   0.191
5000               0.1                                   4                                         0.985                                   0.099
Total                                                                                                                                      0.595


Section 14, Zero-Modified Distributions102


Frequency distributions can be constructed whose densities on the positive integers are
proportional to those of a well-known distribution, but with f(0) having any value between zero and
one.

For example, let g(x) = {e⁻³ 3ˣ / x!} (1 - 0.25) / (1 - e⁻³), for x = 1, 2, 3, ..., and g(0) = 0.25.

Exercise: Verify that the sum of this density is in fact unity.

[Solution: The sum of the Poisson Distribution from 0 to ∞ is 1: Σ_{x=0 to ∞} e⁻³ 3ˣ / x! = 1.

Therefore, Σ_{x=1 to ∞} e⁻³ 3ˣ / x! = 1 - e⁻³.

Thus Σ_{x=1 to ∞} g(x) = 1 - 0.25. ⇒ Σ_{x=0 to ∞} g(x) = 1 - 0.25 + 0.25 = 1.]

This is just an example of a Poisson Distribution Modified at Zero, with λ = 3 and 25% probability
placed at zero.
For a Zero-Modified distribution, an arbitrary amount of probability has been placed at zero. In the
example above it is 25%. Loss Models uses p₀M to denote this probability at zero. The remaining
probability is spread out proportional to some well-known distribution such as the Poisson. In
general if f is a distribution on 0, 1, 2, 3,..., and 0 < p₀M < 1, then:

g(0) = p₀M,   g(x) = f(x) (1 - p₀M) / {1 - f(0)}, x = 1, 2, 3,...

is a distribution on 0, 1, 2, 3, ....

Exercise: For a Poisson Distribution Modified at Zero, with λ = 3 and 25% probability placed at
zero, what are the densities at 0, 1, 2, 3, and 4?
[Solution: For example the density at 4 is: (0.75)(3⁴)e⁻³/{(4!)(1 - e⁻³)} = 0.133.

x       0        1        2        3        4
f(x)    0.250    0.118    0.177    0.177    0.133]

In the case of a Zero-Modified Distribution, there is no relationship assumed between the density at
zero and the other densities, other than the fact that all of the densities sum to one.

102 See Section 6.7 and Appendix B.3.2 in Loss Models.


We have the following four cases:

Distribution            Zero-Modified Distribution, f(0) = p₀M

Binomial                (1 - p₀M) [m! / {x! (m-x)!}] qˣ (1-q)^(m-x) / {1 - (1-q)^m},    x = 1, 2, 3,..., m

Poisson                 (1 - p₀M) (e^-λ λˣ / x!) / {1 - e^-λ},    x = 1, 2, 3,...

Negative Binomial103    (1 - p₀M) [r(r+1)...(r+x-1) / x!] βˣ / (1+β)^(x+r) / {1 - 1/(1+β)^r},    x = 1, 2, 3,...

Logarithmic             (1 - p₀M) {β/(1+β)}ˣ / {x ln(1+β)},    x = 1, 2, 3,...

These four zero-modified distributions complete the (a, b, 1) class of frequency distributions.104
They each follow the formula: f(x)/f(x-1) = a + b/x, for x ≥ 2.
Note that if p₀M = 0, one has f(0) = 0 and the zero-modified distribution reduces to a zero-truncated
distribution. However, even though it might be useful to think of the zero-truncated distributions as a
special case of the zero-modified distributions, Loss Models restricts the term zero-modified to
those cases where f(0) > 0.
Moments:
The moments of a zero-modified distribution h are given in terms of those of f by:
E_h[Xⁿ] = (1 - p₀M) E_f[Xⁿ] / {1 - f(0)}.
For example, for the Zero-Modified Poisson the mean is: (1 - p₀M) λ / (1 - e^-λ),
while the second moment is: (1 - p₀M) (λ + λ²) / (1 - e^-λ).

103 The zero-modified version of the Negative Binomial is referred to by Loss Models as the Zero-Modified Extended
Truncated Negative Binomial.
104 See Table 6.4 and Appendix B.3 in Loss Models.


Exercise: For a Zero-Modified Poisson with λ = 3 and 25% chance of zero claims, what is the mean?
[Solution: Let f(x) be the untruncated Poisson, and g(x) be the zero-modified distribution.
Then g(x) = 0.75 f(x) / {1 - f(0)}, x > 0. The mean of g is:
Σ_{x=1 to ∞} x g(x) = 0.75 Σ_{x=1 to ∞} x f(x) / {1 - f(0)} = 0.75 Σ_{x=0 to ∞} x f(x) / {1 - f(0)} = 0.75 (mean of f) / {1 - f(0)}
= (0.75) 3 / (1 - e⁻³) = (0.75)(3.157) = 2.368.
Comment: The term involving x = 0 would contribute nothing to the mean.]

Note that in order to get the moments of a zero-modified distribution, one could first compute the
moment of the zero-truncated distribution and then multiply by (1 - p₀M). For example, the mean of
a zero-truncated Poisson with λ = 3 is 3.157. Then the zero-modified distribution with λ = 3 and
25% chance of zero claims has a mean of: (3.157)(1 - 0.25) = 2.368.
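
A quick numerical check (my own sketch, not from the text) that the zero-modified mean is (1 - p₀M) times the zero-truncated mean, for the example above:

# Illustrative check: moments of a zero-modified distribution equal
# (1 - p0M) times the corresponding zero-truncated moments.
from math import exp

lam, p0M = 3.0, 0.25
f0 = exp(-lam)                       # untruncated Poisson density at 0
mean_zt = lam / (1.0 - f0)           # zero-truncated mean, 3.157
mean_zm = (1.0 - p0M) * mean_zt      # zero-modified mean, 2.368
print(round(mean_zt, 3), round(mean_zm, 3))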
Exercise: For a Negative Binomial with r = 0.7 and β = 3 what is the second moment?
[Solution: The mean is (0.7)(3) = 2.1, the variance is (0.7)(3)(1+3) = 8.4, so the second moment is:
8.4 + 2.1² = 12.81.]
Exercise: For a Zero-Truncated Negative Binomial with r = 0.7 and β = 3 what is the second
moment?
[Solution: For a Negative Binomial with r = 0.7 and β = 3,
the density at zero is: 1/(1+β)^r = 4^-0.7 = 0.379, and the second moment is 12.81.
Thus the second moment of the zero-truncated distribution is: 12.81/(1 - 0.379) = 20.62.]
Exercise: For a Zero-Modified Negative Binomial with r = 0.7 and β = 3, with a 15% chance of zero
claims, what is the second moment?
[Solution: For a Zero-Truncated Negative Binomial with r = 0.7 and β = 3, the second moment is
20.62. Thus the second moment of the zero-modified distribution with a 15% chance of zero claims
is: (20.62)(1 - 0.15) = 17.52.]


Probability Generating Functions:

The zero-modified distribution can be thought of as a mixture of a point mass of probability at zero
and a zero-truncated distribution. The probability generating function of a mixture is the mixture of the
probability generating functions. A point mass of probability at zero has probability generating
function E[zⁿ] = E[z⁰] = 1. Therefore, the Probability Generating Function, P(z) = E[z^N], for a
zero-modified distribution can be obtained from that for the zero-truncated distribution:

PM(z) = p₀M + (1 - p₀M) PT(z),

where PM(z) is the p.g.f. for the zero-modified distribution, PT(z) is the p.g.f. for the
zero-truncated distribution, and p₀M is the probability at zero for the zero-modified distribution.

Exercise: What is the Probability Generating Function for a Zero-Modified Poisson Distribution,
with 30% probability placed at zero?
[Solution: For the zero-truncated Poisson, PT(z) = {e^(λz) - 1} / {e^λ - 1}.
PM(z) = p₀M + (1 - p₀M) PT(z) = 0.3 + 0.7 {e^(λz) - 1} / {e^λ - 1}.]

One can derive this relationship as follows:
Let g(n) be the zero-modified distribution and h(n) be the zero-truncated distribution.
Then g(0) = p₀M and g(n) = h(n)(1 - p₀M) for n > 0.
PM(z) = Σ_{n=0 to ∞} zⁿ g(n) = p₀M + Σ_{n=1 to ∞} zⁿ (1 - p₀M) h(n) = p₀M + (1 - p₀M) PT(z).


Thinning:105

If we take at random a fraction t of the events, then we get a distribution of the same family.
One parameter is altered by the thinning as per the non-zero-modified case.
In addition, the probability at zero, p₀M, is altered by thinning.

Distribution                         Result of thinning by a factor of t

Zero-Modified Binomial               q → tq, m remains the same
                                     p₀M → {p₀M - (1-q)^m + (1-tq)^m - p₀M (1-tq)^m} / {1 - (1-q)^m}

Zero-Modified Poisson                λ → tλ
                                     p₀M → {p₀M - e^-λ + e^-tλ - p₀M e^-tλ} / {1 - e^-λ}

Zero-Modified Negative Binomial106   β → tβ, r remains the same
                                     p₀M → {p₀M - (1+β)^-r + (1+tβ)^-r - p₀M (1+tβ)^-r} / {1 - (1+β)^-r}

Zero-Modified Logarithmic            β → tβ
                                     p₀M → 1 - (1 - p₀M) ln[1+tβ] / ln[1+β]

In each case, the new probability of zero claims is the probability generating function for the original
zero-modified distribution at 1 - t, where t is the thinning factor.
For example, for the Zero-Modified Binomial, P(z) = p₀M + (1 - p₀M) (p.g.f. of zero-truncated) =
p₀M + (1 - p₀M) [{1 + q(z-1)}^m - (1-q)^m] / [1 - (1-q)^m].
P(1 - t) = p₀M + (1 - p₀M) [{1 - qt}^m - (1-q)^m] / [1 - (1-q)^m]
= {p₀M - (1-q)^m + (1-tq)^m - p₀M (1-tq)^m} / {1 - (1-q)^m}.

105 See Table 8.3 in Loss Models.
106 Including the special case of the zero-modified geometric.

For example, let us assume we look at only large claims, which are t of all claims.
Then if we have n claims, the probability of zero large claims is: (1-t)ⁿ.
Thus the probability of zero large claims is:
Prob[zero claims] (1-t)⁰ + Prob[1 claim] (1-t)¹ + Prob[2 claims] (1-t)² + Prob[3 claims] (1-t)³ + ...
= E[(1-t)ⁿ] = P(1 - t) = p.g.f. for the original distribution at 1 - t.
Exercise: Show that the p.g.f. for the original zero-modified Logarithmic distribution at 1 - t matches
the above result for the density at zero for the thinned distribution.
[Solution: For the Zero-Modified Logarithmic, P(z) = p₀M + (1 - p₀M) (p.g.f. of Logarithmic) =
p₀M + (1 - p₀M) {1 - ln[1 - β(z-1)] / ln[1+β]}.
P(1 - t) = p₀M + (1 - p₀M) {1 - ln[1 + βt] / ln[1+β]} = 1 - (1 - p₀M) ln[1+βt] / ln[1+β].]

Exercise: The number of losses follows a zero-modified Poisson with λ = 2 and p₀M = 10%.
30% of losses are large. What is the distribution of the large losses?
[Solution: Large losses follow a zero-modified Poisson with λ = (30%)(2) = 0.6 and
p₀M = {0.1 - e⁻² + e^-0.6 - (0.1)e^-0.6} / {1 - e⁻²} = 0.5304.]
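
A quick numerical check (my own sketch, not part of the text) of this exercise, computing the thinned probability at zero both from the table formula and from the p.g.f. evaluated at 1 - t:

# Illustrative check of the thinning formula for a zero-modified Poisson:
# the thinned probability at zero equals the original p.g.f. at 1 - t.
from math import exp

lam, p0M, t = 2.0, 0.10, 0.30
# Direct formula from the table above:
p0M_thinned = (p0M - exp(-lam) + exp(-t * lam) - p0M * exp(-t * lam)) / (1.0 - exp(-lam))
# Via the p.g.f.: P(z) = p0M + (1 - p0M) (e^(lam z) - 1) / (e^lam - 1), at z = 1 - t.
z = 1.0 - t
p0M_via_pgf = p0M + (1.0 - p0M) * (exp(lam * z) - 1.0) / (exp(lam) - 1.0)
print(round(p0M_thinned, 4), round(p0M_via_pgf, 4))   # both 0.5304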

Exercise: The number of members per family follows a zero-truncated Negative Binomial
with r = 0.5 and β = 4.
It is assumed that 60% of people have first names that begin with the letters A through M,
and that size of family is independent of the letters of the first names of its members.
What is the distribution of the number of family members with first names that begin with the letters
A through M?
[Solution: The zero-truncated distribution is mathematically the same as a zero-modified distribution
with p₀M = 0.
Thus the thinned distribution is a zero-modified Negative Binomial with r = 0.5, β = (60%)(4) = 2.4,
and p₀M = {0 - 5^-0.5 + 3.4^-0.5 - (0)(3.4^-0.5)} / {1 - 5^-0.5} = 0.1721.
Comment: While prior to thinning there is no probability of zero members, after thinning there is a
probability of zero members with first names that begin with the letters A through M.
Thus the thinned distribution is zero-modified rather than zero-truncated.]


Problems:
Use the following information for the next six questions:
The number of claims per year is given by a Zero-Modified Binomial Distribution with parameters
q = 0.3 and m = 5, and with 15% probability of zero claims.
14.1 (1 point) What is the mean number of claims over the coming year?
A. less than 1.4
B. at least 1.4 but less than 1.5
C. at least 1.5 but less than 1.6
D. at least 1.6 but less than 1.7
E. at least 1.7
14.2 (2 points) What is the variance of the number of claims per year?
A. less than 0.98
B. at least 0.98 but less than 1.00
C. at least 1.00 but less than 1.02
D. at least 1.02 but less than 1.04
E. at least 1.04
14.3 (1 point) What is the chance of observing 3 claims over the coming year?
A. less than 13.0%
B. at least 13.0% but less than 13.4%
C. at least 13.4% but less than 13.8%
D. at least 13.8% but less than 14.2%
E. at least 14.2%
14.4 (2 points) What is the 95th percentile of the distribution of the number of claims per year?
A. 1
B. 2
C. 3
D. 4
E. 5
14.5 (2 points) What is the probability generating function at 3?
A. less than 9
B. at least 9 but less than 10
C. at least 10 but less than 11
D. at least 11 but less than 12
E. at least 12
14.6 (2 points) Small claims are 70% of all claims.
What is the chance of observing exactly 2 small claims over the coming year?
A. 20%
B. 22%
C. 24%
D. 26%
E. 28%


Use the following information for the next five questions:


The number of claims per year is given by a Zero-Modified Negative Binomial Distribution with
parameters r = 4 and β = 0.5, and with 35% chance of zero claims.
14.7 (1 point) What is the mean number of claims over the coming year?
A. less than 1.7
B. at least 1.7 but less than 1.8
C. at least 1.8 but less than 1.9
D. at least 1.9 but less than 2.0
E. at least 2.0
14.8 (2 points) What is the variance of the number of claims per year?
A. less than 2.0
B. at least 2.0 but less than 2.2
C. at least 2.2 but less than 2.4
D. at least 2.4 but less than 2.6
E. at least 2.6
14.9 (1 point) What is the chance of observing 7 claims over the coming year?
A. less than 0.8%
B. at least 0.8% but less than 1.0%
C. at least 1.0% but less than 1.2%
D. at least 1.2% but less than 1.4%
E. at least 1.4%
14.10 (3 points) What is the probability of more than 5 claims in the coming year?
A. less than 1%
B. at least 1%, but less than 3%
C. at least 3%, but less than 5%
D. at least 5%, but less than 7%
E. at least 7%
14.11 (3 points) Large claims are 40% of all claims.
What is the chance of observing more than 1 large claim over the coming year?
A. 10%
B. 12%
C. 14%
D. 16%
E. 18%


Use the following information for the next four questions:


The number of claims per year is given by a Zero-Modified Logarithmic Distribution with parameter
β = 2, and a 25% chance of zero claims.
14.12 (1 point) What is the mean number of claims over the coming year?
A. less than 1.0
B. at least 1.0 but less than 1.1
C. at least 1.1 but less than 1.2
D. at least 1.2 but less than 1.3
E. at least 1.3
14.13 (2 points) What is the variance of the number of claims per year?
A. less than 2.0
B. at least 2.0 but less than 2.2
C. at least 2.2 but less than 2.4
D. at least 2.4 but less than 2.6
E. at least 2.6
14.14 (1 point) What is the chance of observing 6 claims over the coming year?
A. less than 1.1%
B. at least 1.1% but less than 1.3%
C. at least 1.3% but less than 1.5%
D. at least 1.5% but less than 1.7%
E. at least 1.7%
14.15 (2 points) Medium sized claims are 60% of all claims.
What is the chance of observing exactly one medium sized claim over the coming year?
A. 31%
B. 33%
C. 35%
D. 37%
E. 39%

14.16 (1 point) The number of claims per year is given by a Zero-Modified Negative Binomial
Distribution with parameters r = -0.6 and β = 3, and with a 20% chance of zero claims.
What is the chance of observing 5 claims over the coming year?
A. less than 0.8%
B. at least 0.8% but less than 1.0%
C. at least 1.0% but less than 1.2%
D. at least 1.2% but less than 1.4%
E. at least 1.4%


Use the following information for the next seven questions:


The number of claims per year is given by a Zero-Modified Poisson Distribution with parameter
λ = 2.5, and with 30% chance of zero claims.
14.17 (1 point) What is the mean number of claims over the coming year?
A. 1.9      B. 2.0      C. 2.1      D. 2.2      E. 2.3

14.18 (2 points) What is the variance of the number of claims per year?
A. less than 2.7
B. at least 2.7 but less than 2.8
C. at least 2.8 but less than 2.9
D. at least 2.9 but less than 3.0
E. at least 3.0
14.19 (1 point) What is the chance of observing 6 claims over the coming year?
A. less than 2%
B. at least 2% but less than 3%
C. at least 3% but less than 4%
D. at least 4% but less than 5%
E. at least 5%
14.20 (1 point) What is the chance of observing 2 claims over the coming year?
A. 18%
B. 20%
C. 22%
D. 24%
E. 26%
14.21 (2 points) What is the chance of observing fewer than 4 claims over the coming year?
A. less than 70%
B. at least 70% but less than 75%
C. at least 75% but less than 80%
D. at least 80% but less than 85%
E. at least 85%
14.22 (2 points) What is the mode of this frequency distribution?
A. 0
B. 1
C. 2
D. 3
E. 4
14.23 (2 points) Large claims are 20% of all claims.
What is the chance of observing exactly one large claim over the coming year?
A. 15%
B. 17%
C. 19%
D. 21%
E. 23%


14.24 (3 points) Let pk denote the probability that the number of claims equals k for k = 0, 1, ...
If pn / pm = 2.4^(n-m) m! / n!, for m ≥ 0, n ≥ 0, then using the corresponding zero-modified claim count
distribution with p0^M = 0.31, calculate p3^M.
(A) 16%      (B) 18%      (C) 20%      (D) 22%      (E) 24%

14.25 (3 points) The number of losses follows a zero-modified Poisson Distribution with λ and p0^M.
Small losses are 70% of all losses.
From first principles determine the probability of zero small losses.
14.26 (3 points) The following data is the number of sick days taken at a large company during the
previous year.
Number of days:          0       1     2     3     4     5     6    7    8+
Number of employees:  50,122  9190  5509  3258  1944  1160   693  418  621
Is it likely that this data was drawn from a member of the (a, b, 0) class?
Is it likely that this data was drawn from a member of the (a, b, 1) class?
14.27 (3 points) For a zero-modified Poisson, p2^M = 27.3% and p3^M = 12.7%.
Determine p0^M.
(A) 11%      (B) 12%      (C) 13%      (D) 14%      (E) 15%

14.28 (3 points) X is a discrete random variable with a probability function which is a member of the
(a, b, 1) class of distributions.
pk denotes the probability that X = k.
p1 = 0.1637, p2 = 0.1754, and p3 = 0.1503.
Calculate p5.
(A) 7.5%      (B) 7.7%      (C) 7.9%      (D) 8.1%      (E) 8.3%

14.29 (3, 5/00, Q.37) (2.5 points) Given:
(i) pk denotes the probability that the number of claims equals k for k = 0, 1, 2, ...
(ii) pn / pm = m! / n!, for m ≥ 0, n ≥ 0
Using the corresponding zero-modified claim count distribution with p0^M = 0.1, calculate p1^M.
(A) 0.1      (B) 0.3      (C) 0.5      (D) 0.7      (E) 0.9


Solutions to Problems:
14.1. C. Mean is that of the unmodified Binomial, multiplied by (1 - 0.15) and divided by 1 - f(0):
(0.3)(5)(0.85) / (1 - 0.7^5) = 1.533.
14.2. D. The second moment is that of the unmodified Binomial,
multiplied by (1 - 0.15) and divided by 1 - f(0):
(1.05 + 1.5^2)(0.85) / (1 - 0.7^5) = 3.372. Variance = 3.372 - 1.533^2 = 1.022.
14.3. C. For an unmodified binomial, f(3) = {5! / ((3!)(2!))} 0.3^3 0.7^2 = 0.1323.
For the zero-modified distribution one gets the density by multiplying by (1 - 0.15)
and dividing by 1 - f(0):
(0.1323)(0.85) / (1 - 0.7^5) = 13.5%.
14.4. C. The 95th percentile is that value corresponding to the distribution function being 95%.
For a discrete distribution such as we have here, employ the convention that the 95th percentile is
the first value at which the distribution function is greater than or equal to 0.95.
F(2) = 0.8334 < 95%, F(3) = 0.9686 ≥ 95%, and therefore the 95th percentile is 3.

Number     Unmodified   Zero-Modified   Cumulative
of Claims  Binomial     Binomial        Zero-Modified Binomial
0          16.81%       15.00%          15.00%
1          36.02%       36.80%          51.80%
2          30.87%       31.54%          83.34%
3          13.23%       13.52%          96.86%
4          2.83%        2.90%           99.75%
5          0.24%        0.25%           100.00%


14.5. C. As shown in Appendix B of Loss Models, for the zero-truncated Binomial Distribution:
P^T(z) = [{1 + q(z - 1)}^m - (1 - q)^m] / [1 - (1 - q)^m].
P^T(3) = [{1 + (0.3)(3 - 1)}^5 - (1 - 0.3)^5] / [1 - (1 - 0.3)^5] = 12.402.
The p.g.f. for the zero-modified distribution is:
P^M(z) = p0^M + (1 - p0^M) P^T(z). P^M(3) = 0.15 + (0.85)(12.402) = 10.69.
Comment: The densities of the zero-modified distribution:

Number     Unmodified   Zero-Modified
of Claims  Binomial     Binomial
0          16.807%      15.000%
1          36.015%      36.797%
2          30.870%      31.541%
3          13.230%      13.517%
4          2.835%       2.897%
5          0.243%       0.248%

P^M(3) is the expected value of 3^n:
(15%)(3^0) + (36.797%)(3^1) + (31.541%)(3^2) + (13.517%)(3^3) + (2.897%)(3^4) + (0.248%)(3^5) =
10.69.
14.6. B. After thinning we get another zero-modified Binomial, with m = 5,
but q = (0.7)(0.3) = 0.21, and
p0^M* = {p0^M - (1 - q)^m + (1 - tq)^m - p0^M (1 - tq)^m} / {1 - (1 - q)^m}
= {0.15 - 0.7^5 + 0.79^5 - (0.15)(0.79^5)} / (1 - 0.7^5) = 0.2927.
The density at two of the new zero-modified Binomial is:
{(1 - 0.2927) / (1 - 0.79^5)} (10)(0.21^2)(0.79^3) = 22.21%.
Comment: The probability of zero claims for the thinned distribution is the p.g.f. for the original
zero-modified distribution at 1 - t, where t is the thinning factor.
14.7. A. Mean is that of the unmodified negative binomial, multiplied by (1 - 0.35) and divided by
1 - f(0): (4)(0.5)(0.65) / (1 - 1.5^-4) = 1.3 / 0.8025 = 1.62.
14.8. E. The second moment is that of the unmodified negative binomial, multiplied by (1 - 0.35) and
divided by 1 - f(0): (3 + 2^2)(0.65) / (1 - 1.5^-4) = 5.67. Variance = 5.67 - 1.62^2 = 3.05.
14.9. B. For the unmodified negative binomial,
f(7) = (4)(5)(6)(7)(8)(9)(10) (0.5^7) / {(7!)(1.5^11)} = 1.08%. For the zero-modified distribution one gets
the density by multiplying by (1 - 0.35) and dividing by 1 - f(0): (1.08%)(0.65) / (1 - 1.5^-4) = 0.87%.


14.10. C. The chance of more than 5 claims is: 1 - 0.9656 = 3.44%.

Number     Unmodified      Zero-Modified   Cumulative
of Claims  Neg. Binomial   Neg. Binomial   Zero-Modified Neg. Binomial
0          19.75%          35.00%          35.00%
1          26.34%          21.33%          56.33%
2          21.95%          17.78%          74.11%
3          14.63%          11.85%          85.96%
4          8.54%           6.91%           92.88%
5          4.55%           3.69%           96.56%
6          2.28%           1.84%           98.41%
7          1.08%           0.88%           99.29%
8          0.50%           0.40%           99.69%
9          0.22%           0.18%           99.87%
14.11. D. After thinning we get another zero-modified Negative Binomial, with r = 4,
but β = (40%)(0.5) = 0.2, and
p0^M* = {p0^M - (1 + β)^-r + (1 + tβ)^-r - p0^M (1 + tβ)^-r} / {1 - (1 + β)^-r}
= {0.35 - 1.5^-4 + 1.2^-4 - (0.35)(1.2^-4)} / (1 - 1.5^-4) = 0.5806.
The density at one of the new zero-modified Negative Binomial is:
{(1 - 0.5806) / (1 - 1/1.2^4)} (4)(0.2) / 1.2^5 = 0.2604.
Probability of more than one large claim is: 1 - 0.5806 - 0.2604 = 15.90%.
Comment: The probability of zero claims for the thinned distribution is the p.g.f. for the original
zero-modified distribution at 1 - t, where t is the thinning factor.
Similar to Example 8.9 in Loss Models.
14.12. E. Mean of the logarithmic distribution is: β / ln(1 + β) = 2 / ln(3) = 1.82.
For the zero-modified distribution, the mean is multiplied by 1 - 0.25: (0.75)(1.82) = 1.37.
Comment: Note the unmodified logarithmic distribution has no chance of zero claims.
Therefore, we need not divide by 1 - f(0) to get to the zero-modified distribution (or alternately we
are dividing by 1 - 0 = 1.)
14.13. C. Variance of the unmodified logarithmic distribution is:
β {1 + β - β/ln(1 + β)} / ln(1 + β) = 2 {3 - 1.82} / ln(3) = 2.15.
Thus the unmodified logarithmic has a second moment of: 2.15 + 1.82^2 = 5.46.
For the zero-modified distribution, the second moment is multiplied by 1 - 0.25: (0.75)(5.46) = 4.10.
Thus the variance of the zero-modified distribution is: 4.10 - 1.37^2 = 2.22.


14.14. A. For the unmodified logarithmic distribution, f(x) = {β / (1 + β)}^x / {x ln(1 + β)}.
f(6) = (2/3)^6 / {6 ln(3)} = 1.33%.
For the zero-modified distribution, the density at 6 is multiplied by 1 - 0.25: (0.75)(1.33%) = 1.00%.
14.15. D. After thinning we get another zero-modified Logarithmic, with β = (60%)(2) = 1.2, and
p0^M* = 1 - (1 - p0^M) ln[1 + tβ] / ln[1 + β] = 1 - (0.75) ln[2.2] / ln[3] = 0.4617.
The density at one of the new zero-modified Logarithmic is:
(1 - 0.4617) (1.2) / {2.2 ln[2.2]} = 37.23%.
Comment: The probability of zero claims for the thinned distribution is the p.g.f. for the original
zero-modified distribution at 1 - t, where t is the thinning factor.
14.16. A. For the zero-truncated Negative Binomial Distribution,
f(5) = r(r+1)(r+2)(r+3)(r+4) {β / (1 + β)}^5 / {(5!)((1 + β)^r - 1)} =
(-0.6)(0.4)(1.4)(2.4)(3.4)(3/4)^5 / {(120)(4^-0.6 - 1)} = (-2.742)(0.2373) / {(120)(-0.5647)} = 0.96%.
For the zero-modified distribution, multiply by 1 - 0.2: (0.8)(0.96%) = 0.77%.
Comment: Note this is an extended zero-truncated negative binomial distribution, with
0 > r > -1. The same formulas apply as when r > 0. (As r approaches zero one gets a logarithmic
distribution.) For the unmodified negative binomial distribution we must have r > 0. So in this case
there is no corresponding unmodified distribution.
14.17. A. The mean is that of the non-modified Poisson, multiplied by (1 - 0.3) and divided by
1 - f(0): (2.5)(0.7) / (1 - e^-2.5) = 1.907.
14.18. E. The second moment is that of the unmodified Poisson, multiplied by (1 - 0.3) and divided
by 1 - f(0): (2.5 + 2.5^2)(0.7) / (1 - e^-2.5) = 6.673. Variance = 6.673 - 1.907^2 = 3.04.
14.19. B. For an unmodified Poisson, f(6) = (2.5^6) e^-2.5 / 6! = 0.0278.
For the zero-modified distribution one gets the density by multiplying by (1 - 0.3) and dividing by
1 - f(0): (0.0278)(0.7) / (1 - e^-2.5) = 2.12%.
14.20. B. For the unmodified Poisson, f(0) = e^-2.5 = 8.208%, and f(2) = 2.5^2 e^-2.5 / 2 = 25.652%.
The zero-modified Poisson has a density at 2 of: (25.652%)(1 - 30%) / (1 - 8.208%) = 19.56%.


14.21. D. One adds up the chances of 0, 1, 2 and 3 claims, and gets 81.5%.

Number     Unmodified   Zero-Modified   Cumulative
of Claims  Poisson      Poisson         Zero-Modified Poisson
0          8.21%        30.00%          30.00%
1          20.52%       15.65%          45.65%
2          25.65%       19.56%          65.21%
3          21.38%       16.30%          81.51%
4          13.36%       10.19%          91.70%
5          6.68%        5.09%           96.80%
6          2.78%        2.12%           98.92%
7          0.99%        0.76%           99.68%
8          0.31%        0.24%           99.91%

Comment: We are given a 30% chance of zero claims.
The remaining 70% is spread in proportion to the unmodified Poisson. For example,
(70%)(20.52%) / (1 - 0.0821) = 15.65%, and (70%)(25.65%) / (1 - 0.0821) = 19.56%.
Unlike the zero-truncated distribution, the zero-modified distribution has a probability of zero events.
14.22. A. The mode is where the density function is greatest, 0.

Number     Unmodified   Zero-Modified
of Claims  Poisson      Poisson
0          8.21%        30.00%
1          20.52%       15.65%
2          25.65%       19.56%
3          21.38%       16.30%
4          13.36%       10.19%
5          6.68%        5.09%
6          2.78%        2.12%
7          0.99%        0.76%
8          0.31%        0.24%
Comment: Since all the densities on the positive integers are multiplied by the same factor, the
mode of the zero-modified distribution is either 0 or the mode of the unmodified distribution;
here the 30% probability at zero exceeds every other density, so the mode is 0.
14.23. E. After thinning we get another zero-modified Poisson, with λ = (20%)(2.5) = 0.5, and
p0^M* = {p0^M - e^-λ + e^-tλ - p0^M e^-tλ} / (1 - e^-λ)
= {0.3 - e^-2.5 + e^-0.5 - (0.3)(e^-0.5)} / (1 - e^-2.5) = 0.6999.
The density at one of the new zero-modified Poisson is:
{(1 - 0.6999) / (1 - e^-0.5)} (0.5 e^-0.5) = 23.13%.
Comment: The probability of zero claims for the thinned distribution is the p.g.f. for the original
zero-modified distribution at 1 - t, where t is the thinning factor.


14.24. A. f(x+1) / f(x) = 2.4 {x! / (x+1)!} = 2.4/(x+1). Thus this is a member of the (a, b, 0) subclass,
f(x+1) / f(x) = a + b/(x+1), with a = 0 and b = 2.4. This is a Poisson Distribution, with λ = 2.4.
For the unmodified Poisson, the probability of more than zero claims is: 1 - e^-2.4.
After zero-modification, this probability is: 1 - 0.31 = 0.69. Thus the zero-modified distribution is:
f^M(x) = {0.69 / (1 - e^-2.4)} f(x) = {0.69 / (1 - e^-2.4)} e^-2.4 2.4^x / x! = 2.4^x (0.69) / {(e^2.4 - 1) x!}, x ≥ 1.
f^M(3) = 2.4^3 (0.69) / {(e^2.4 - 1) 3!} = 0.159.

# claims:                0      1       2       3       4       5       6       7
zero-modified density:   0.31   0.1652  0.1983  0.1586  0.0952  0.0457  0.0183  0.0063

Comment: For a Poisson with λ = 2.4, f(n)/f(m) = (e^-2.4 2.4^n / n!) / (e^-2.4 2.4^m / m!) = 2.4^(n-m) m! / n!.
14.25. If there are n losses, then the probability that zero of them are small is 0.3^n.
Prob[0 small losses] =
Prob[0 losses] + Prob[1 loss] Prob[loss is big] + Prob[2 losses] Prob[both losses are big] + ... =
p0^M + {(1 - p0^M)/(1 - e^-λ)} (λ e^-λ)(0.3) + {(1 - p0^M)/(1 - e^-λ)} (λ^2 e^-λ / 2!)(0.3^2)
+ {(1 - p0^M)/(1 - e^-λ)} (λ^3 e^-λ / 3!)(0.3^3) + ... =
p0^M + {(1 - p0^M)/(1 - e^-λ)} e^-λ {0.3λ + (0.3λ)^2/2! + (0.3λ)^3/3! + ...} =
p0^M + {(1 - p0^M)/(1 - e^-λ)} e^-λ {e^0.3λ - 1} =
p0^M + {(1 - p0^M)/(1 - e^-λ)} {e^-0.7λ - e^-λ} =
{(1 - e^-λ) p0^M + (1 - p0^M)(e^-0.7λ - e^-λ)} / (1 - e^-λ) =
{p0^M - e^-λ + e^-0.7λ - p0^M e^-0.7λ} / (1 - e^-λ).
Comment: Matches the general formula with t = 0.7:
{p0^M - e^-λ + e^-tλ - p0^M e^-tλ} / (1 - e^-λ).
The thinned distribution is also a zero-modified Poisson, with λ* = 0.7λ.
The probability of zero claims for the thinned distribution is the P.G.F. for the original
zero-modified distribution at 1 - t, where t is the thinning factor.


14.26. Calculate (x+1) f(x+1) / f(x) = (x+1) (number with x+1) / (number with x).

Number of Days   Observed   (x+1)f(x+1)/f(x)   Differences
0                50,122     0.183
1                9,190      1.199              1.016
2                5,509      1.774              0.575
3                3,258      2.387              0.613
4                1,944      2.984              0.597
5                1,160      3.584              0.601
6                693        4.222              0.638
7                418
8+               621

The accident profile is not approximately linear starting at zero.
Thus, this is probably not from a member of the (a, b, 0) class.
The accident profile is approximately linear starting at one.
Thus, this is probably from a member of the (a, b, 1) class.
Comment: f(x+1)/f(x) = a + b/(x+1), so (x+1)f(x+1)/f(x) = a(x+1) + b = ax + a + b.
The slope is positive, so a > 0 and we have a Negative Binomial.
The slope is a ≈ 0.6. The intercept is about 0.6. Thus a + b ≈ 0.6. Therefore, b ≈ 0.
For the Negative Binomial, b = (r-1)β/(1+β). Thus b ≈ 0 implies r ≈ 1.
Thus the data may have been drawn from a Zero-Modified Geometric, with β/(1+β) ≈ 0.6.
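For those who like to check such computations, here is a small Python sketch (mine, not from the text) that reproduces the accident profile above:

# Observed number of employees by number of sick days (0 through 7).
counts = {0: 50122, 1: 9190, 2: 5509, 3: 3258, 4: 1944, 5: 1160, 6: 693, 7: 418}
ratios = {x: (x + 1) * counts[x + 1] / counts[x] for x in range(7)}
diffs = {x: ratios[x] - ratios[x - 1] for x in range(1, 7)}
print({x: round(r, 3) for x, r in ratios.items()})   # 0.183, 1.199, 1.774, 2.387, 2.984, 3.584, 4.222
print({x: round(d, 3) for x, d in diffs.items()})    # roughly constant at about 0.6 from x = 1 on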

14.27. E. p2^M = f(2) (1 - p0^M) / (1 - f(0)), and p3^M = f(3) (1 - p0^M) / (1 - f(0)).
Thus p3^M / p2^M = f(3) / f(2) = (λ^3 e^-λ / 6) / (λ^2 e^-λ / 2) = λ/3.
λ/3 = 12.7% / 27.3%. λ = 1.396.
27.3% = {(1.396^2) e^-1.396 / 2} (1 - p0^M) / (1 - e^-1.396). Therefore p0^M = 14.86%.
Comment: p1^M = 39.11%.


14.28. B. Since we have a member of the (a, b, 1) family:
p2 / p1 = a + b/2, so 2a + b = (2)(0.1754)/0.1637 = 2.1429.
p3 / p2 = a + b/3, so 3a + b = (3)(0.1503)/0.1754 = 2.5707.
Therefore a = 0.4278 and b = 1.2873.
p4 = (a + b/4) p3 = (0.4278 + 1.2873/4)(0.1503) = 0.1127.
p5 = (a + b/5) p4 = (0.4278 + 1.2873/5)(0.1127) = 0.0772.
Comment: Based on a zero-modified Negative Binomial, with r = 4, β = 0.75, and p0^M = 20%.
14.29. C. f(x+1) / f(x) = x! / (x+1)! = 1/(x+1). Thus this is a member of the (a, b, 0) subclass,
f(x+1) / f(x) = a + b/(x+1), with a = 0 and b = 1. This is a Poisson Distribution, with λ = 1.
For the unmodified Poisson, the probability of more than zero claims is: 1 - e^-1.
After zero-modification, this probability is: 1 - 0.1 = 0.9. Thus the zero-modified distribution is:
f^M(x) = {0.9 / (1 - e^-1)} f(x) = {0.9 / (1 - e^-1)} e^-1 1^x / x! = 0.9 / {(e - 1) x!}, x ≥ 1.
f^M(1) = 0.9 / (e - 1) = 0.524.

# claims:                0     1       2       3       4       5       6
zero-modified density:   0.1   0.5238  0.2619  0.0873  0.0218  0.0044  0.0007

Comment: For a Poisson with λ = 1, f(n)/f(m) = (e^-1 1^n / n!) / (e^-1 1^m / m!) = m! / n!.


Section 15, Compound Frequency Distributions107


A compound frequency distribution has a primary and secondary distribution, each of which is a
frequency distribution. The primary distribution determines how many independent random draws
from the secondary distribution we sum.
For example, assume the number of taxicabs that arrive per minute at the Heartbreak Hotel is
Poisson with mean 1.3. In addition, assume that the number of passengers dropped off at the hotel
by each taxicab is Binomial with q = 0.4 and m = 5. The number of passengers dropped off by
each taxicab is independent of the number of taxicabs that arrive and is independent of the number
of passengers dropped off by any other taxicab.
Then the aggregate number of passengers dropped off per minute at the Heartbreak Hotel is an
example of a compound frequency distribution. It is a compound Poisson-Binomial distribution, with
parameters λ = 1.3, q = 0.4, m = 5.108
The distribution function of the primary Poisson with λ = 1.3 is as follows:

Number     Probability        Cumulative
of Claims  Density Function   Distribution Function
0          27.253%            0.27253
1          35.429%            0.62682
2          23.029%            0.85711
3          9.979%             0.95690
4          3.243%             0.98934
5          0.843%             0.99777
6          0.183%             0.99960

So for example, there is a 3.243% chance that 4 taxicabs arrive; in which case the number of
passengers dropped off is the sum of 4 independent identically distributed Binomials109, given by
the secondary Binomial Distribution. There is a 27.253% chance there are no taxicabs, a 35.429%
chance we take one Binomial, 23.029% chance we sum the result of 2 independent identically
distributed Binomials, etc.

107 See Section 6.8 of Loss Models, not on the syllabus. However, compound distributions are mathematically the
same as aggregate distributions. See Mahler's Guide to Aggregate Distributions. Some of you may better
understand the idea of compound distributions by seeing how they are simulated in Mahler's Guide to Simulation.
108 In the name of a compound distribution, the primary distribution is listed first and the secondary distribution is
listed second.
109 While we happen to know that the sum of 4 independent Binomials each with q = 0.4, m = 5 is another Binomial
with parameters q = 0.4, m = 20, that fact is not essential to the general concept of a compound distribution.


The secondary Binomial Distribution with q = 0.4, m = 5 is as follows:

Number     Probability        Cumulative
of Claims  Density Function   Distribution Function
0          7.776%             0.07776
1          25.920%            0.33696
2          34.560%            0.68256
3          23.040%            0.91296
4          7.680%             0.98976
5          1.024%             1.00000

Thus assuming a taxicab arrives, there is a 34.560% chance that 2 passengers are dropped off.
In this example, the primary distribution determines how many taxicabs arrive, while the secondary
distribution determines the number of passengers departing per taxicab. Instead, the primary
distribution could be the number of envelopes arriving and the secondary distribution could be the
number of claims in each envelope.110
Actuaries often use compound distributions when the primary distribution determines
how many accidents there are, while for each accident the number of persons injured or
number of claimants is determined by the secondary distribution.111 This particular model,
while useful for comprehension, may or may not apply to any particular use of the mathematical
concept of compound frequency distributions.
There are a number of methods of computing the density of compound distributions, among them the
use of convolutions and the use of the Recursive Method (Panjer Algorithm).112
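To make the convolution approach concrete, here is a rough Python sketch (my own illustration, with truncation points chosen for this example; it is not the Panjer Algorithm itself) that computes the densities of the Poisson-Binomial example above:

from math import comb, exp, factorial

# Compound Poisson(1.3)-Binomial(q = 0.4, m = 5) by direct convolution.
lam, q, m = 1.3, 0.4, 5
n_max, k_max = 30, 60   # truncation points; the omitted tail probability is negligible here

primary = [exp(-lam) * lam**n / factorial(n) for n in range(n_max + 1)]
secondary = [comb(m, k) * q**k * (1 - q)**(m - k) for k in range(m + 1)]

def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

compound = [0.0] * (k_max + 1)
power = [1.0]   # 0-fold convolution of the secondary: a point mass at 0
for n in range(n_max + 1):
    for k, p in enumerate(power[:k_max + 1]):
        compound[k] += primary[n] * p   # weight the n-fold convolution by Prob[n taxicabs]
    power = convolve(power, secondary)

print(round(compound[0], 4))   # about 0.3015 = exp[1.3(0.6^5 - 1)]
print(round(sum(k * p for k, p in enumerate(compound)), 4))   # mean, about 2.6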

Probability Generating Function of Compound Distributions:


One can get the Probability Generating Function of a compound distribution in terms of those of its
primary and secondary distributions:
p.g.f. of compound distribution = p.g.f. of primary distribution[p.g.f. of secondary distribution]
P(z) = P1 [P2 (z)].

110 See 3, 11/01, Q.30.
111 See 3, 5/01, Q.36.
112 Both discussed in Mahler's Guide to Aggregate Distributions, where they are applied to both compound and
aggregate distributions.


Exercise: What is the Probability Generating Function of a Compound
Geometric-Binomial Distribution, with β = 3, q = 0.1, and m = 2?
[Solution: The p.g.f. of the primary Geometric is: 1 / {1 - 3(z - 1)} = 1 / (4 - 3z), z < 1 + 1/β = 4/3.
The p.g.f. of the secondary Binomial is: {1 + (0.1)(z - 1)}^2 = (0.9 + 0.1z)^2 = 0.01z^2 + 0.18z + 0.81.
P(z) = P1[P2(z)] = 1 / {4 - 3(0.01z^2 + 0.18z + 0.81)} = -1 / (0.03z^2 + 0.54z - 1.57), z < 4/3.]
Recall that for any frequency distribution, f(0) = P(0). Therefore, for a compound distribution,
c(0) = Pc(0) = P1[P2(0)] = P1[s(0)].
compound density at 0 = p.g.f. of the primary at the density at 0 of the secondary.113
For example, in the previous exercise, the density of the compound distribution at zero is its p.g.f. at
z = 0: 1/1.57 = 0.637. The density at 0 of the secondary Binomial Distribution is:
0.9^2 = 0.81. The p.g.f. of the primary distribution at 0.81 is: 1 / {4 - (3)(0.81)} = 1/1.57 = 0.637.
If one takes the p.g.f. of a compound distribution to a power ρ > 0, P(z)^ρ = P1[P2(z)]^ρ.
Thus if the primary distribution is infinitely divisible, i.e., P1^ρ has the same form as P1, then P^ρ
has the same form as P. If the primary distribution is infinitely divisible, then so is the
compound distribution.
Since the Poisson and the Negative Binomial are each infinitely divisible, so are compound
distributions with a primary distribution which is either a Poisson or a Negative Binomial
(including a Geometric.)
Adding Compound Distributions:
For example, let us assume that taxi cabs arrive at a hotel (primary distribution) and drop people off
(secondary distribution.) Assume two independent Compound Poisson Distributions with the same
secondary distribution. The first compound distribution represents those cabs whose drivers were
born in January through June and has λ = 11, while the second compound distribution represents
those cabs whose drivers were born in July through December and has λ = 9.
Then the sum of the two distributions represents the passengers from all of the cabs, and is a
Compound Poisson Distribution with λ = 11 + 9 = 20, and the same secondary distribution as each
of the individual Compound Distributions.
Note that the parameter of the primary rather than secondary distribution was affected.
113

This is the first step of the Panjer Algorithm, discussed in Mahlers Guide to Aggregate Distributions.


Exercise: Let X be a Poisson-Binomial Distribution compound frequency distribution with


λ = 4.3, q = 0.2, and m = 5. Let Y be a Poisson-Binomial Distribution compound frequency
distribution with λ = 2.4, q = 0.2, and m = 5. What is the distribution of X + Y?
[Solution: A Poisson-Binomial Distribution with λ = 4.3 + 2.4 = 6.7, q = 0.2, and m = 5.]
The sum of two independent identically distributed Compound Poisson variables
has the same form. The sum of two independent identically distributed Compound Negative
Binomial variables has the same form.
Exercise: Let X be a Negative Binomial-Poisson compound frequency distribution with
β = 0.7, r = 2.5, and λ = 3.
What is the distributional form of the sum of two independent random draws from X?
[Solution: A Negative Binomial-Poisson with β = 0.7, r = (2)(2.5) = 5, and λ = 3.]
Exercise: Let X be a Poisson-Geometric compound frequency distribution with λ = 0.3 and
β = 1.5.
What is the distributional form of the sum of twenty independent random draws from X?
[Solution: The sum of 20 independent identically distributed variables is of the same form.
However, λ = (20)(0.3) = 6. We get a Poisson-Geometric compound frequency distribution with
λ = 6 and β = 1.5.]
If one adds independent identically distributed Compound Binomial variables one gets the same
form.
Exercise: Let X be a Binomial-Geometric compound frequency distribution with q = 0.2, m = 3, and
β = 1.5.
What is the distributional form of the sum of twenty independent random draws from X?
[Solution: The sum of 20 independent identically distributed binomial variables is of the same form,
with m = (20)(3) = 60. We get a Binomial-Geometric compound frequency distribution with q = 0.2,
m = 60, and β = 1.5.]


Thinning Compound Distributions:


Thinning compound distributions can be done in two different manners, one manner affects
the primary distribution, and the other manner affects the secondary distribution.
For example, assume that taxi cabs arrive at a hotel (primary distribution) and drop people off
(secondary distribution.) Then we can either select certain types of cabs or certain types of people.
Depending on which we select, we affect the primary or secondary distribution.
Assume we select only those cabs that are less than one year old (and assume age of cab is
independent of the number of people dropped off and the frequency of arrival of cabs.)
Then this would affect the primary distribution, the number of cabs.
Exercise: Cabs arrive via a Poisson with mean 1.3. The number of people dropped off by each
cab is Binomial with q = 0.2 and m = 5. The number of people dropped off per cab is independent
of the number of cabs that arrive. 30% of cabs are less than a year old.
The age of cabs is independent of the number of people dropped off and the frequency of arrival
of cabs.
What is the distribution of the number of people dropped off by cabs less than one year old?
[Solution: Cabs less than a year old arrive via a Poisson with λ = (30%)(1.3) = 0.39.
There is no effect on the number of people per cab (secondary distribution.)
We get a Poisson-Binomial Distribution compound frequency distribution with λ = 0.39, q = 0.2, and
m = 5.]
This first manner of thinning affects the primary distribution. For example, it might occur if the primary
distribution represents the number of accidents and the secondary distribution represents the
number of claims.
For example, assume that the number of accidents is Negative Binomial with β = 2 and r = 30, and
the number of claims per accident is Binomial with q = 0.3 and m = 7. Then the total number of claims
is Compound Negative Binomial-Binomial with parameters β = 2, r = 30,
q = 0.3 and m = 7.


Exercise: Accidents are assigned at random to one of four claims adjusters:


Jerry, George, Elaine, or Cosmo.
What is the distribution of the number of claims adjusted by George?
[Solution: We are selecting at random 1/4 of the accidents. We are thinning the Negative Binomial
Distribution of the number of accidents. Therefore, the number of accidents assigned to George is
Negative Binomial with β = 2/4 = 0.5 and r = 30.
The number of claims adjusted by George is Compound Negative Binomial-Binomial with
parameters β = 0.5, r = 30, q = 0.3 and m = 7.]
Returning to the cab example, assume we select only female passengers, (and gender of
passenger is independent of the number of people dropped off and the frequency of arrival of
cabs.). Then this would affect the secondary distribution, the number of passengers.
Exercise: Cabs arrive via a Poisson with mean 1.3. The number of people dropped off by each
cab is Binomial with q = 0.2 and m = 5. The number of people dropped off per cab is independent
of the number of cabs that arrive. 40% of the passengers are female.
The gender of passengers is independent of the number of people dropped off and the frequency
of arrival of cabs.
What is the distribution of the number of females dropped off by cabs?
[Solution: The distribution of female passengers per cab is Binomial with q = (0.4)(0.2) = 0.08 and
m = 5. There is no effect on the number of cabs (primary distribution.)
We get a Poisson-Binomial Distribution compound frequency distribution with λ = 1.3,
q = 0.08, and m = 5.]
This second manner of thinning a compound distribution affects the secondary distribution.
It is mathematically the same as what happens when one takes only the large claims in a frequency
and severity situation, when the frequency distribution itself is compound.114
For example, if frequency is Poisson-Binomial with λ = 1.3, q = 0.2, and m = 5, and 40% of the
claims are large. The number of large claims would be simulated by first getting a random draw from
the Poisson, then simulating the appropriate number of random Binomials, and then for each claim
from the Binomial there is a 40% chance of selecting it at random independent of any other claims.
This is mathematically the same as thinning the Binomial. Therefore, large claims have a Poisson-
Binomial Distribution compound frequency distribution with λ = 1.3, q = (0.4)(0.2) = 0.08 and m = 5.

114

This is what is considered in Section 8.6 of Loss Models.


Exercise: Let frequency be given by a Geometric-Binomial compound frequency distribution with


= 1.5, q = 0.2, and m = 3. Severity follows an Exponential Distribution with mean 1000.
Frequency and severity are independent.
What is the frequency distribution of losses of size between 500 and 2000?
[Solution: The fraction of losses that are of size between 500 and 2000 is:
F(2000) - F(500) = (1-e-2000/1000) - (1-e-500/1000) = e-.5 - e-2 = 0.4712. Thus the losses of size
between 500 and 2000 follow a Geometric-Binomial compound frequency distribution with
= 1.5, q = (0.4712)(0.2) = 0.0942, and m = 3.]
Proof of Some Thinning Results:115
One can use the result for the probability generating function for a compound distribution,
p.g.f. of compound distribution = p.g.f. of primary distribution[p.g.f. of secondary distribution],
in order to determine the results of thinning a Poisson, Binomial, or Negative Binomial Distribution.
Assume one has a Poisson Distribution with mean λ.
Assume one selects at random 30% of the claims.
This is mathematically the same as a compound distribution with a primary distribution that is Poisson
with mean λ and a secondary distribution that is Bernoulli with q = 0.3.
The p.g.f. of the Poisson is P(z) = exp[λ(z - 1)].
The p.g.f. of the Bernoulli is P(z) = 1 + 0.3(z - 1).
The p.g.f. of the compound distribution is obtained by replacing z in the p.g.f. of the primary
Poisson with the p.g.f. of the secondary Bernoulli:
P(z) = exp[λ{1 + 0.3(z - 1) - 1}] = exp[(0.3λ)(z - 1)].
This is the p.g.f. of a Poisson Distribution with mean 0.3λ.
Thus the thinned distribution is also Poisson, with mean 0.3λ.
In general, when thinning a Poisson by a factor of t, the thinned distribution is also Poisson with mean
tλ.
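This composition of probability generating functions can be checked symbolically; the following sketch (assuming the sympy package is available; it is my illustration, not part of the text) verifies that the thinned p.g.f. is that of a Poisson with mean tλ:

import sympy as sp

z, lam, t = sp.symbols('z lam t', positive=True)
poisson_pgf = sp.exp(lam * (z - 1))    # p.g.f. of the original Poisson
bernoulli_pgf = 1 - t + t * z          # p.g.f. of the Bernoulli used for thinning

thinned = poisson_pgf.subs(z, bernoulli_pgf)
print(sp.simplify(thinned - sp.exp(lam * t * (z - 1))))   # should print 0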

115

See Section 8.6 of Loss Models.


Similarly, assume we are thinning a Binomial Distribution with parameters q and m.
The p.g.f. of the Binomial is P(z) = {1 + q(z - 1)}^m.
This is mathematically the same as a compound distribution with secondary distribution a Bernoulli
with mean t.
The p.g.f. of this compound distribution is: {1 + q(1 + t(z - 1) - 1)}^m = {1 + tq(z - 1)}^m.
This is the p.g.f. of a Binomial Distribution with parameters tq and m.
In general, when thinning a Binomial by a factor of t, the thinned distribution is also Binomial with
parameters tq and m.
Assume we are thinning a Negative Binomial Distribution with parameters β and r.
The p.g.f. of the Negative Binomial is P(z) = {1 - β(z - 1)}^-r.
This is mathematically the same as a compound distribution with secondary distribution a Bernoulli
with mean t.
The p.g.f. of this compound distribution is: {1 - β(1 + t(z - 1) - 1)}^-r = {1 - tβ(z - 1)}^-r.
This is the p.g.f. of a Negative Binomial Distribution with parameters tβ and r.
In general, when thinning a Negative Binomial by a factor of t, the thinned distribution is also
Negative Binomial with parameters tβ and r.116
Since thinning is mathematically the same as a compound distribution with secondary distribution a
Bernoulli with mean t, and the p.g.f. of the Bernoulli is 1 - t + tz,
the p.g.f. of the thinned distribution is P(1 - t + tz),
where P(z) is the p.g.f. of the original distribution. In general, P(0) = f(0).
Thus the density at zero for the thinned distribution is: P(1 - t + t0) = P(1 - t).
The density of the thinned distribution at zero is the p.g.f. of the original distribution at 1 - t.117
Let us assume instead we start with a zero-modified distribution.
Let P(z) be the p.g.f. of the original distribution prior to being zero-modified.
Then P_ZM(z) = p0^M + (1 - p0^M) P_ZT(z) = p0^M + (1 - p0^M) {P(z) - f(0)} / {1 - f(0)}.
Now the density at zero for the thinned version of the original distribution is: P(1 - t).
The density at zero for the thinned version of the zero-modified distribution is:
p0^M* = P_ZM(1 - t) = p0^M + (1 - p0^M) {P(1 - t) - f(0)} / {1 - f(0)}.
Therefore, 1 - p0^M* = (1 - p0^M) {1 - P(1 - t)} / {1 - f(0)}.
116 Including the special case of the Geometric Distribution.
117 This general result was discussed previously with respect to thinning zero-modified distributions.


The p.g.f. of the thinned zero-modified distribution is:
P_ZM(1 - t + tz) = p0^M + (1 - p0^M) {P(1 - t + tz) - f(0)} / {1 - f(0)}
= p0^M* - (1 - p0^M) {P(1 - t) - f(0)} / {1 - f(0)} + (1 - p0^M) {P(1 - t + tz) - f(0)} / {1 - f(0)}
= p0^M* + (1 - p0^M) {P(1 - t + tz) - P(1 - t)} / {1 - f(0)}
= p0^M* + (1 - p0^M*) {P(1 - t + tz) - P(1 - t)} / {1 - P(1 - t)}.
Now, {P(1 - t + tz) - P(1 - t)} / {1 - P(1 - t)} =
{(p.g.f. of thinned non-modified dist.) - (density at zero of thinned non-modified dist.)} /
{1 - (density at zero of thinned non-modified distribution)}.
Therefore, the form of the p.g.f. of the thinned zero-modified distribution:
p0^M* + (1 - p0^M*) {P(1 - t + tz) - P(1 - t)} / {1 - P(1 - t)},
is the usual form of the p.g.f. of a zero-modified distribution,
with the thinned version of the original distribution taking the place of the original distribution.
Therefore, provided thinning preserves the family of the original distribution, the thinned
zero-modified distribution is of the same family with p0^M*, and with the other parameters as per
thinning of the non-modified distribution. Specifically as discussed before:

Distribution                       Result of thinning by a factor of t
Zero-Modified Binomial             q becomes tq, m remains the same,
                                   p0^M becomes {p0^M - (1-q)^m + (1-tq)^m - p0^M (1-tq)^m} / {1 - (1-q)^m}
Zero-Modified Poisson              λ becomes tλ,
                                   p0^M becomes {p0^M - e^-λ + e^-tλ - p0^M e^-tλ} / {1 - e^-λ}
Zero-Modified Negative Binomial    β becomes tβ, r remains the same,
                                   p0^M becomes {p0^M - (1+β)^-r + (1+tβ)^-r - p0^M (1+tβ)^-r} / {1 - (1+β)^-r}
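The three p0^M* formulas in this table can be collected into short helper functions; the following Python sketch (the function names are my own) reproduces the values found in Solutions 14.6, 14.23, and 14.11:

import math

def thin_zm_binomial(p0m, q, m, t):
    f0, f0_new = (1 - q)**m, (1 - t*q)**m
    return (p0m - f0 + f0_new - p0m * f0_new) / (1 - f0)

def thin_zm_poisson(p0m, lam, t):
    f0, f0_new = math.exp(-lam), math.exp(-t*lam)
    return (p0m - f0 + f0_new - p0m * f0_new) / (1 - f0)

def thin_zm_neg_binomial(p0m, beta, r, t):
    f0, f0_new = (1 + beta)**-r, (1 + t*beta)**-r
    return (p0m - f0 + f0_new - p0m * f0_new) / (1 - f0)

print(round(thin_zm_binomial(0.15, 0.3, 5, 0.7), 4))       # 0.2927, as in Solution 14.6
print(round(thin_zm_poisson(0.30, 2.5, 0.2), 4))           # 0.6999, as in Solution 14.23
print(round(thin_zm_neg_binomial(0.35, 0.5, 4, 0.4), 4))   # 0.5806, as in Solution 14.11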


As discussed previously, things work similarly for a zero-modified Logarithmic.
Let P(z) be the p.g.f. of the original Logarithmic distribution prior to being zero-modified.
Then P_ZM(z) = p0^M + (1 - p0^M) P(z).
Now the density at zero for the thinned version of the original distribution is: P(1 - t).
The density at zero for the thinned version of the zero-modified distribution is:
p0^M* = P_ZM(1 - t) = p0^M + (1 - p0^M) P(1 - t).
1 - p0^M* = (1 - p0^M) {1 - P(1 - t)}.
As before, since the p.g.f. of the secondary Bernoulli is 1 - t + tz,
the p.g.f. of the thinned zero-modified distribution is:
P_ZM(1 - t + tz) = p0^M + (1 - p0^M) P(1 - t + tz) = p0^M* - (1 - p0^M) P(1 - t) + (1 - p0^M) P(1 - t + tz)
= p0^M* + (1 - p0^M*) {P(1 - t + tz) - P(1 - t)} / {1 - P(1 - t)}.
Now, {P(1 - t + tz) - P(1 - t)} / {1 - P(1 - t)} =
{(p.g.f. of thinned non-modified dist.) - (density at zero of thinned non-modified dist.)} /
{1 - (density at zero of thinned non-modified distribution)}.
Therefore, the form of the p.g.f. of the thinned zero-modified distribution:
p0^M* + (1 - p0^M*) {P(1 - t + tz) - P(1 - t)} / {1 - P(1 - t)},
is the usual form of the p.g.f. of a zero-modified distribution,
with the thinned version of the original distribution taking the place of the original distribution.
Therefore, since thinning results in another Logarithmic, the thinned zero-modified distribution is of
the same family with p0^M*, and the other parameter as per thinning of the non-modified distribution.
As discussed before:

Zero-Modified Logarithmic    β becomes tβ,
                             p0^M becomes 1 - (1 - p0^M) ln[1 + tβ] / ln[1 + β]


Problems:
15.1 (2 points) The number of accidents is Geometric with β = 1.7.
The number of claims per accident is Poisson with λ = 3.1.
For the total number of claims, what is the Probability Generating Function, P(z)?
A. exp[3.1(z - 1)] / (2.7 - 1.7z)
B. 1 / {2.7 - 1.7 exp[3.1(z - 1)]}
C. exp[3.1(z - 1)] + (2.7 - 1.7z)
D. exp[3.1(z - 1.7) / (2.7 - 1.7z)]
E. None of the above
15.2 (1 point) Frequency is given by a Poisson-Binomial compound frequency distribution, with
λ = 0.18, q = 0.3, and m = 3.
One third of all losses are greater than $10,000. Frequency and severity are independent.
What is the frequency distribution of losses of size greater than $10,000?
A. Compound Poisson-Binomial with λ = 0.18, q = 0.3, and m = 3.
B. Compound Poisson-Binomial with λ = 0.18, q = 0.1, and m = 3.
C. Compound Poisson-Binomial with λ = 0.18, q = 0.3, and m = 1.
D. Compound Poisson-Binomial with λ = 0.06, q = 0.3, and m = 3.
E. None of the above.
15.3 (1 point) X is given by a Binomial-Geometric compound frequency distribution, with
q = 0.15, m = 3, and β = 2.3. Y is given by a Binomial-Geometric compound frequency distribution,
with q = 0.15, m = 5, and β = 2.3. X and Y are independent.
What is the distributional form of X + Y?
A. Compound Binomial-Geometric with q = 0.15, m = 4, and β = 2.3
B. Compound Binomial-Geometric with q = 0.15, m = 8, and β = 2.3
C. Compound Binomial-Geometric with q = 0.15, m = 4, and β = 4.6
D. Compound Binomial-Geometric with q = 0.15, m = 8, and β = 4.6
E. None of the above.


15.4 (2 points) A compound claims frequency model has the following properties:
(i) The primary distribution has probability generating function:
P(z) = 0.2z + 0.5z^2 + 0.3z^3.
(ii) The secondary distribution has probability generating function:
P(z) = exp[0.7(z - 1)].
Calculate the probability of no claims from this compound distribution.
(A) 18%
(B) 20%
(C) 22%
(D) 24%
(E) 26%
15.5 (1 point) Assume each exposure has a Poisson-Poisson compound frequency distribution, as
per Loss Models, with λ1 = 0.03 and λ2 = 0.07. You insure 20,000 independent exposures. What
is the frequency distribution for your portfolio?
A. Compound Poisson-Poisson with λ1 = 0.03 and λ2 = 0.07
B. Compound Poisson-Poisson with λ1 = 0.03 and λ2 = 1400
C. Compound Poisson-Poisson with λ1 = 600 and λ2 = 0.07
D. Compound Poisson-Poisson with λ1 = 600 and λ2 = 1400
E. None of the above.
15.6 (2 points) Frequency is given by a Poisson-Binomial compound frequency distribution,
with parameters λ = 1.2, q = 0.1, and m = 4.
What is the Probability Generating Function?
A. {1 + 0.1(z - 1)}^4
B. exp[1.2(z - 1)]
C. exp[1.2({1 + 0.1(z - 1)}^4 - 1)]
D. {1 + 0.1(exp[1.2(z - 1)] - 1)}^4
E. None of the above

15.7 (1 point) The total number of claims from a book of business with 100 exposures has a
Compound Poisson-Geometric Distribution with λ = 4 and β = 0.8.
Next year this book of business will have 75 exposures.
Next year, what is the distribution of the total number of claims from this book of business?
A. Compound Poisson-Geometric with λ = 4 and β = 0.8.
B. Compound Poisson-Geometric with λ = 3 and β = 0.8.
C. Compound Poisson-Geometric with λ = 4 and β = 0.6.
D. Compound Poisson-Geometric with λ = 3 and β = 0.6.
E. None of the above.


15.8 (2 points) A compound claims frequency model has the following properties:
(i) The primary distribution has probability generating function:
P(z) = 1 / (5 - 4z).
(ii) The secondary distribution has probability generating function:
P(z) = (0.8 + 0.2z)^3.
Calculate the probability of no claims from this compound distribution.
(A) 28%
(B) 30%
(C) 32%
(D) 34%
(E) 36%
15.9 (1 point) The total number of claims from a group of 50 drivers has a
Compound Negative Binomial-Poisson Distribution with β = 0.4, r = 3, and λ = 0.7.
What is the distribution of the total number of claims from 500 similar drivers?
A. Compound Negative Binomial-Poisson with β = 0.4, r = 30, and λ = 0.7.
B. Compound Negative Binomial-Poisson with β = 4, r = 3, and λ = 0.7.
C. Compound Negative Binomial-Poisson with β = 0.4, r = 3, and λ = 7.
D. Compound Negative Binomial-Poisson with β = 4, r = 30, and λ = 7.
E. None of the above.
15.10 (SOA M, 11/05, Q.27 & 2009 Sample Q.208) (2.5 points)
An actuary has created a compound claims frequency model with the following properties:
(i) The primary distribution is the negative binomial with probability generating function
P(z) = [1 - 3(z - 1)]^-2.
(ii) The secondary distribution is the Poisson with probability generating function
P(z) = exp[λ(z - 1)].
(iii) The probability of no claims equals 0.067.
Calculate λ.
(A) 0.1      (B) 0.4      (C) 1.6      (D) 2.7      (E) 3.1


Solutions to Problems:
15.1. B. P(z) = P1[P2(z)].
The p.g.f. of the primary Geometric is: 1 / {1 - β(z - 1)} = 1 / {1 - 1.7(z - 1)} = 1 / (2.7 - 1.7z).
The p.g.f. of the secondary Poisson is: exp[λ(z - 1)] = exp[3.1(z - 1)].
Thus the p.g.f. of the compound distribution is: 1 / {2.7 - 1.7 exp[3.1(z - 1)]}.
Comment: P(z) only exists for z < 1 + 1/β = 1 + 1/1.7.
15.2. B. We are taking 1/3 of the claims from the secondary Binomial. Thus the secondary
distribution is Binomial with q = 0.3/3 = 0.1 and m = 3. Thus the frequency distribution of losses of
size greater than $10,000 is given by a Poisson-Binomial compound frequency distribution, as per
Loss Models, with λ = 0.18, q = 0.1, and m = 3.
15.3. B. Provided the secondary distributions are the same, the primary distributions add as they
usually would. The sum of two independent Binomials with the same q, is another Binomial with the
sum of the m parameters. In this case it is a Binomial with q = 0.15 and
m = 3 + 5 = 8. X + Y is a Binomial-Geometric with q = 0.15, m = 8, and β = 2.3.
Comment: The secondary distributions determine how many claims there are per accident. The
primary distributions determine how many accidents. In this case the Binomial distributions of the
number of accidents add.
15.4. E. P(z) = P1[P2(z)].
Density at 0 is: P(0) = P1[P2(0)] = P1[e^-0.7] = 0.2e^-0.7 + 0.5e^-1.4 + 0.3e^-2.1 = 0.259.
Alternately, the primary distribution has 20% probability of 1, 50% probability of 2, and 30%
probability of 3, while the secondary distribution is a Poisson with λ = 0.7.
The density at zero of the secondary distribution is e^-0.7.
Therefore, the probability of zero claims for the compound distribution is:
(0.2)(Prob 0 from secondary) + (0.5)(Prob 0 from secondary)^2 + (0.3)(Prob 0 from secondary)^3
= 0.2e^-0.7 + 0.5(e^-0.7)^2 + 0.3(e^-0.7)^3 = 0.259.
15.5. C. One adds up 20,000 independent identically distributed variables. In the case of a
Compound Poisson distribution, the primary Poissons add to give another Poisson with
λ1 = (20000)(0.03) = 600. The secondary distribution stays the same.
The portfolio has a compound Poisson-Poisson with λ1 = 600 and λ2 = 0.07.


15.6. C. The p.g.f. of the primary Poisson is exp[λ(z - 1)] = exp[1.2(z - 1)].
The p.g.f. of the secondary Binomial is {1 + q(z - 1)}^m = {1 + 0.1(z - 1)}^4.
Thus the p.g.f. of the compound distribution is P(z) = P1[P2(z)] = exp[1.2({1 + 0.1(z - 1)}^4 - 1)].
15.7. B. Poisson-Geometric with λ = (75/100)(4) = 3 and β = 0.8.
Comment: One adjusts the primary Poisson distribution in a manner similar to that if one just had a
Poisson distribution.
15.8. D. P(z) = P1[P2(z)].
Density at 0 is: P(0) = P1[P2(0)] = P1[0.8^3] = 1 / {5 - 4(0.8^3)} = 0.339.
Alternately, the secondary distribution is a Binomial with m = 3 and q = 0.2.
The density at zero of the secondary distribution is 0.8^3.
Therefore, the probability of zero claims for the compound distribution is:
P1[0.8^3] = 1 / {5 - 4(0.8^3)} = 0.339.
15.9. A. Negative Binomial-Poisson with β = 0.4, r = (500/50)(3) = 30, and λ = 0.7.
Comment: One adjusts the primary Negative Binomial distribution in a manner similar to that if one
just had a Negative Binomial distribution.
15.10. E. The p.g.f. of the compound distribution is the p.g.f. of the primary distribution at the p.g.f.
of the secondary distribution: P(z) = [1 - 3(exp[λ(z - 1)] - 1)]^-2.
0.067 = f(0) = P(0) = [1 - 3(exp[λ(0 - 1)] - 1)]^-2 = [1 - 3(exp[-λ] - 1)]^-2.
1 - 3(exp[-λ] - 1) = 0.067^-0.5 = 3.8633. exp[-λ] = 0.04555. λ = 3.089.
Alternately, the Poisson secondary distribution at zero is e^-λ.
From the first step of the Panjer Algorithm, c(0) = Pp[s(0)] = [1 - 3(e^-λ - 1)]^-2. Proceed as before.
Comment: P(z) = E[z^n] = Σ f(n) z^n. Therefore, letting z approach zero, P(0) = f(0).
The probability generating function of the Negative Binomial only exists for z < 1 + 1/β = 4/3.


Section 16, Moments of Compound Frequency Distributions118


One may find it helpful to think of the secondary distribution as taking the role of a severity
distribution in the calculation of aggregate losses.119 Since the situations are mathematically
equivalent, many of the techniques and formulas that apply to aggregate losses apply to
compound frequency distributions.
For example, the same formulas for the mean, variance and skewness apply.120
Mean of Compound Dist. = (Mean of Primary Dist.) (Mean of Secondary Dist.)

Variance of Compound Dist. = (Mean of Primary Dist.) (Variance of Secondary Dist.) +
(Mean of Secondary Dist.)^2 (Variance of Primary Dist.)

Skewness of Compound Dist. =
{(Mean of Primary Dist.)(Variance of Second. Dist.)^(3/2)(Skewness of Secondary Dist.) +
3(Variance of Primary Dist.)(Mean of Secondary Dist.)(Variance of Second. Dist.) +
(Variance of Primary Dist.)^(3/2)(Skewness of Primary Dist.)(Mean of Second. Dist.)^3} /
(Variance of Compound Dist.)^(3/2)
Exercise: What are the mean and variance of a compound Poisson-Binomial distribution, with
parameters λ = 1.3, q = 0.4, m = 5?
[Solution: The mean and variance of the primary Poisson Distribution are both 1.3.
The mean and variance of the secondary Binomial Distribution are
(0.4)(5) = 2 and (0.4)(0.6)(5) = 1.2.
Thus the mean of the compound distribution is: (1.3)(2) = 2.6.
The variance of the compound distribution is: (1.3)(1.2) + (2^2)(1.3) = 6.76.]
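A two-line helper makes these formulas easy to apply; the following Python sketch (mine, not from the text) reproduces the exercise above:

def compound_moments(mean_p, var_p, mean_s, var_s):
    # Mean and variance of a compound distribution from the moments of the
    # primary (p) and secondary (s) distributions, per the formulas above.
    mean_c = mean_p * mean_s
    var_c = mean_p * var_s + mean_s**2 * var_p
    return mean_c, var_c

print(compound_moments(1.3, 1.3, 2.0, 1.2))   # (2.6, 6.76) for the Poisson-Binomial example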

118 See Section 6.8 of Loss Models, not on the syllabus. However, since compound distributions are mathematically
the same as aggregate distributions, I believe that a majority of the questions in this section would be legitimate
questions for your exam. Compound frequency distributions used to be on the syllabus.
119 In the case of aggregate losses, the frequency distribution determines how many independent identically
distributed severity variables we will sum.
120 The secondary distribution takes the place of the severity, while the primary distribution takes the place of the
frequency, in the formulas involving aggregate losses. σ_agg^2 = μ_F σ_S^2 + μ_S^2 σ_F^2.
See Mahler's Guide to Aggregate Distributions.

Thus in the case of the Heartbreak Hotel example in the previous section, on average 2.6
passengers are dropped off per minute.121 The variance of the number of passengers dropped off
is 6.76.
Poisson Primary Distribution:
In the case of a Poisson primary distribution with mean λ, the variance of the compound distribution
could be rewritten as:
λ(Variance of Secondary Dist.) + λ(Mean of Secondary Dist.)^2 =
λ(Variance of Secondary Dist. + Mean of Secondary Dist.^2) =
λ(2nd moment of Secondary Distribution).
It also turns out that the third central moment of a compound Poisson distribution =
λ(3rd moment of Secondary Distribution).
For a Compound Poisson Distribution:
Mean = λ(mean of Secondary Distribution).
Variance = λ(2nd moment of Secondary Distribution).
3rd central moment = λ(3rd moment of Secondary Distribution).
Skewness = λ^-0.5 (3rd moment of Second. Dist.) / (2nd moment of Second. Dist.)^1.5.122
Exercise: The number of accidents follows a Poisson Distribution with λ = 0.04.
Each accident generates 1, 2 or 3 claimants with probabilities 60%, 30%, and 10%.
Determine the mean, variance, and skewness of the total number of claimants.
[Solution: The secondary distribution has mean 1.5, second moment 2.7, and third moment 5.7.
Thus the mean number of claimants is: (0.04)(1.5) = 0.06.
The variance of the number of claimants is: (0.04)(2.7) = 0.108.
The skewness of the number of claimants is: (0.04^-0.5)(5.7) / (2.7^1.5) = 6.42.]
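The compound Poisson shortcut is equally easy to code; this sketch (my own illustration) reproduces the exercise above from the moments of the secondary distribution:

def compound_poisson_stats(lam, secondary_probs):
    # secondary_probs maps each possible count to its probability.
    m1 = sum(k * p for k, p in secondary_probs.items())
    m2 = sum(k**2 * p for k, p in secondary_probs.items())
    m3 = sum(k**3 * p for k, p in secondary_probs.items())
    mean = lam * m1
    variance = lam * m2
    skewness = lam**-0.5 * m3 / m2**1.5
    return mean, variance, skewness

print(compound_poisson_stats(0.04, {1: 0.6, 2: 0.3, 3: 0.1}))   # (0.06, 0.108, about 6.42)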
121 The number of taxicabs that arrive per minute at the Heartbreak Hotel is Poisson with mean 1.3. The number of
passengers dropped off at the hotel by each taxicab is Binomial with q = 0.4 and m = 5. The number of passengers
dropped off by each taxicab is independent of the number of taxicabs that arrive and is independent of the number
of passengers dropped off by any other taxicab.
122 Skewness = (third central moment) / Variance^1.5.

Problems:
16.1 (1 point) For a compound distribution:
Mean of primary distribution = 15.
Standard Deviation of primary distribution = 3.
Mean of secondary distribution = 10.
Standard Deviation of secondary distribution = 4.
What is the standard deviation of the compound distribution?
A. 26
B. 28
C. 30
D. 32
E. 34
16.2 (2 points) The number of accidents follows a Poisson distribution with mean 10 per month.
Each accident generates 1, 2, or 3 claimants with probabilities 40%, 40%, 20%, respectively.
Calculate the variance in the total number of claimants in a year.
A. 250
B. 300
C. 350
D. 400
E. 450
Use the following information for the next 3 questions:
The number of customers per minute is Geometric with β = 1.7.
The number of items sold to each customer is Poisson with λ = 3.1.
The number of items sold per customer is independent of the number of customers.
16.3 (1 point) What is the mean?
A. less than 5.0
B. at least 5.0 but less than 5.5
C. at least 5.5 but less than 6.0
D. at least 6.0 but less than 6.5
E. at least 6.5
16.4 (1 point) What is the variance?
A. less than 50
B. at least 50 but less than 51
C. at least 51 but less than 52
D. at least 52 but less than 53
E. at least 53
16.5 (2 points) What is the chance that more than 4 items are sold during the next minute?
Use the Normal Approximation.
A. 46%
B. 48%
C. 50%
D. 52%
E. 54%

16.6 (3 points) A dam is proposed for a river which is currently used for salmon breeding.
You have modeled:
(i) For each hour the dam is opened the number of female salmon that will pass through and
reach the breeding grounds has a distribution with mean 50 and variance 100.
(ii) The number of eggs released by each female salmon has a distribution
with mean of 3000 and variance of 1 million.
(iii) The number of female salmon going through the dam each hour it is open and the
numbers of eggs released by the female salmon are independent.
Using the normal approximation for the aggregate number of eggs released, determine the least
number of whole hours the dam should be left open so the probability that 2 million eggs will be
released is greater than 99.5%.
(A) 14
(B) 15
(C) 16
(D) 17
(E) 18
16.7 (3 points) The claims department of an insurance company receives envelopes with claims for
insurance coverage at a Poisson rate of λ = 7 envelopes per day. For any period of time, the
number of envelopes and the numbers of claims in the envelopes are independent. The numbers
of claims in the envelopes have the following distribution:
Number of Claims    Probability
1                   0.60
2                   0.30
3                   0.10
Using the normal approximation, calculate the 99th percentile of the number of claims
received in 5 days.
(A) 73      (B) 75      (C) 77      (D) 79      (E) 81

16.8 (3 points) The number of persons using an ATM per hour has a Negative Binomial Distribution
with β = 2 and r = 13. Each hour is independent of the others.
The number of transactions per person has the following distribution:
Number of Transactions    Probability
1                         0.30
2                         0.40
3                         0.20
4                         0.10
Using the normal approximation, calculate the 80th percentile of the number of transactions in 5
hours.
A. 300      B. 305      C. 310      D. 315      E. 320

Use the following information for the next 3 questions:

The number of automobile accidents follows a Negative Binomial distribution


with β = 0.6 and r = 100.

For each automobile accident the number of claimants with bodily injury follows
a Binomial Distribution with q = 0.1 and m = 4.
The number of claimants with bodily injury is independent between accidents.

16.9 (2 points) Calculate the variance in the total number of claimants.


(A) 33
(B) 34
(C) 35
(D) 36
(E) 37
16.10 (1 point) What is probability that there are 20 or fewer claimants in total?
(A) 22%
(B) 24%
(C) 26%
(D) 28%
(E) 30%
16.11 (3 points) The amount of the payment to each claimant follows a Gamma Distribution with
α = 3 and θ = 4000. The amount of payments to different claimants are independent of each other
and are independent of the number of claimants.
What is the probability that the aggregate payment exceeds 300,000?
(A) 44%
(B) 46%
(C) 48%
(D) 50%
(E) 52%
16.12 (3 points) The number of batters per half-inning of a baseball game is:
3 + a Negative Binomial Distribution with β = 1 and r = 1.4.
The number of pitches thrown per batter is:
1 + a Negative Binomial Distribution with β = 1.5 and r = 1.8.
What is the probability of more than 30 pitches in a half-inning?
Use the normal approximation with continuity correction.
A. 1/2%
B. 1%
C. 2%
D. 3%
E. 4%
16.13 (3 points) The number of taxicabs that arrive per minute at the Gotham City Railroad Station
is Poisson with mean 5.6. The number of passengers dropped off at the station by each taxicab is
Binomial with q = 0.3 and m = 4. The number of passengers dropped off by each taxicab is
independent of the number of taxicabs that arrive and is independent of the number of passengers
dropped off by any other taxicab. Using the normal approximation for the aggregate passengers
dropped off, determine the least number of whole minutes one must observe in order that the
probability that at least 1000 passengers will be dropped off is greater than 90%.
A. 155
B. 156
C. 157
D. 158
E. 159

16.14 (4 points) At a storefront legal clinic, the number of lawyers who volunteer to provide legal aid
to the poor on any day is uniformly distributed on the integers 1 through 4. The number of hours
each lawyer volunteers on a given day is Binomial with q = 0.6 and m = 7. The number of clients
that can be served by a given lawyer per hour is a Poisson distribution with mean 5.
Determine the probability that 40 or more clients can be served in a day at this storefront law clinic,
using the normal approximation.
(A) 69%
(B) 71%
(C) 73%
(D) 75%
(E) 77%
Use the following information for the next 3 questions:
The number of persons entering a library per minute is Poisson with λ = 1.2.
The number of books returned per person is Binomial with q = 0.1 and m = 4.
The number of books returned per person is independent of the number of persons.
16.15 (1 point) What is the mean number of books returned per minute?
A. less than 0.5
B. at least 0.6 but less than 0.7
C. at least 0.7 but less than 0.8
D. at least 0.8 but less than 0.9
E. at least 0.9
16.16 (1 point) What is the variance of the number of books returned per minute?
A. less than 0.6
B. at least 0.6 but less than 0.7
C. at least 0.7 but less than 0.8
D. at least 0.8 but less than 0.9
E. at least 0.9
16.17 (1 point) What is the probability of observing more than two books returned in the next
minute?
Use the Normal Approximation.
A. less than 0.6%
B. at least 0.6% but less than 0.7%
C. at least 0.7% but less than 0.8%
D. at least 0.8% but less than 0.9%
E. at least 0.9%

16.18 (2 points) Yosemite Sam is panning for gold.
The number of pans with gold nuggets he finds per day is Poisson with mean 3.
The number of nuggets per such pan are: 1, 5, or 25, with probabilities: 90%, 9%, and 1%
respectively.
The number of pans and the number of nuggets per pan are independent.
Using the normal approximation with continuity correction, what is the probability that the number of
nuggets found by Sam over the next ten days is less than 30?
(A) Φ(-1.2)    (B) Φ(-1.1)    (C) Φ(-1.0)    (D) Φ(-0.9)    (E) Φ(-0.8)

16.19 (3, 11/00, Q.2 & 2009 Sample Q.112) (2.5 points) In a clinic, physicians volunteer their
time on a daily basis to provide care to those who are not eligible to obtain care otherwise. The
number of physicians who volunteer in any day is uniformly distributed on the integers 1 through 5.
The number of patients that can be served by a given physician has a Poisson distribution with
mean 30.
Determine the probability that 120 or more patients can be served in a day at the clinic,
using the normal approximation with continuity correction.
(A) 1 - Φ(0.68)    (B) 1 - Φ(0.72)    (C) 1 - Φ(0.93)    (D) 1 - Φ(3.13)    (E) 1 - Φ(3.16)

16.20 (3, 5/01, Q.16 & 2009 Sample Q.106) (2.5 points) A dam is proposed for a river which is
currently used for salmon breeding. You have modeled:
(i) For each hour the dam is opened the number of salmon that will pass through and
reach the breeding grounds has a distribution with mean 100 and variance 900.
(ii) The number of eggs released by each salmon has a distribution with mean of 5
and variance of 5.
(iii) The number of salmon going through the dam each hour it is open and the
numbers of eggs released by the salmon are independent.
Using the normal approximation for the aggregate number of eggs released, determine the least
number of whole hours the dam should be left open so the probability that 10,000 eggs will be
released is greater than 95%.
(A) 20
(B) 23
(C) 26
(D) 29
(E) 32
16.21 (3, 5/01, Q.36 & 2009 Sample Q.111) (2.5 points)
The number of accidents follows a Poisson distribution with mean 12.
Each accident generates 1, 2, or 3 claimants with probabilities 1/2, 1/3, 1/6, respectively.
Calculate the variance in the total number of claimants.
(A) 20
(B) 25
(C) 30
(D) 35
(E) 40

16.22 (3, 11/01, Q.30) (2.5 points) The claims department of an insurance company receives
envelopes with claims for insurance coverage at a Poisson rate of λ = 50 envelopes per week.
For any period of time, the number of envelopes and the numbers of claims in the envelopes are
independent. The numbers of claims in the envelopes have the following distribution:
Number of Claims    Probability
1                   0.20
2                   0.25
3                   0.40
4                   0.15
Using the normal approximation, calculate the 90th percentile of the number of claims
received in 13 weeks.
(A) 1690    (B) 1710    (C) 1730    (D) 1750    (E) 1770

16.23 (3, 11/02, Q.27 & 2009 Sample Q.93) (2.5 points) At the beginning of each round of a
game of chance the player pays 12.5. The player then rolls one die with outcome N. The player
then rolls N dice and wins an amount equal to the total of the numbers showing on the N dice.
All dice have 6 sides and are fair.
Using the normal approximation, calculate the probability that a player starting with
15,000 will have at least 15,000 after 1000 rounds.
(A) 0.01
(B) 0.04
(C) 0.06
(D) 0.09
(E) 0.12
16.24 (CAS3, 5/04, Q.26) (2.5 points) On Time Shuttle Service has one plane that travels from
Appleton to Zebrashire and back each day.
Flights are delayed at a Poisson rate of two per month.
Each passenger on a delayed flight is compensated $100.
The numbers of passengers on each flight are independent and distributed with mean 30 and
standard deviation 50.
(You may assume that all months are 30 days long and that years are 360 days long.)
Calculate the standard deviation of the annual compensation for delayed flights.
A. Less than $25,000
B. At least $25,000, but less than $50,000
C. At least $50,000, but less than $75,000
D. At least $75,000, but less than $100,000
E. At least $100,000

16.25 (SOA M, 11/05, Q.18 & 2009 Sample Q.205) (2.5 points) In a CCRC, residents start
each month in one of the following three states: Independent Living (State #1), Temporarily in a
Health Center (State #2) or Permanently in a Health Center (State #3). Transitions between states
occur at the end of the month. If a resident receives physical therapy, the number of sessions that
the resident receives in a month has a geometric distribution with a mean which depends on the
state in which the resident begins the month. The numbers of sessions received are independent.
The number in each state at the beginning of a given month, the probability of needing physical
therapy in the month, and the mean number of sessions received for residents receiving therapy are
displayed in the following table:
State#    Number in state    Probability of needing therapy    Mean number of visits
1         400                0.2                               2
2         300                0.5                               15
3         200                0.3                               9
Using the normal approximation for the aggregate distribution, calculate the probability that
more than 3000 physical therapy sessions will be required for the given month.
(A) 0.21
(B) 0.27
(C) 0.34
(D) 0.42
(E) 0.50
16.26 (SOA M, 11/05, Q.39 & 2009 Sample Q.213) (2.5 points) For an insurance portfolio:
(i) The number of claims has the probability distribution:
n    pn
0    0.1
1    0.4
2    0.3
3    0.2
(ii) Each claim amount has a Poisson distribution with mean 3; and
(iii) The number of claims and claim amounts are mutually independent.
Calculate the variance of aggregate claims.
(A) 4.8
(B) 6.4
(C) 8.0
(D) 10.2
(E) 12.4

16.27 (CAS3, 5/06, Q.35) (2.5 points)
The following information is known about a consumer electronics store:
The number of people who make some type of purchase follows a Poisson distribution with
a mean of 100 per day.
The number of televisions bought by a purchasing customer follows a Negative Binomial
distribution with parameters r = 1.1 and β = 1.0.
Using the normal approximation, calculate the minimum number of televisions the store must have in
its inventory at the beginning of each day to ensure that the probability of its inventory being
depleted during that day is no more than 1.0%.
A. Fewer than 138
B. At least 138, but fewer than 143
C. At least 143, but fewer than 148
D. At least 148, but fewer than 153
E. At least 153
16.28 (SOA M, 11/06, Q.30 & 2009 Sample Q.285) (2.5 points)
You are the producer for the television show Actuarial Idol.
Each year, 1000 actuarial clubs audition for the show.
The probability of a club being accepted is 0.20.
The number of members of an accepted club has a distribution with mean 20 and variance 20.
Club acceptances and the numbers of club members are mutually independent.
Your annual budget for persons appearing on the show equals 10 times the expected number
of persons plus 10 times the standard deviation of the number of persons.
Calculate your annual budget for persons appearing on the show.
(A) 42,600 (B) 44,200 (C) 45,800 (D) 47,400 (E) 49,000

Solutions to Problems:
16.1. E. Standard deviation of the compound distribution is:
√[(15)(4²) + (10²)(3²)] = √1140 = 33.8.
16.2. E. The frequency over a year is Poisson with mean: (12)(10) = 120 accidents.
Second moment of the secondary distribution is: (40%)(1²) + (40%)(2²) + (20%)(3²) = 3.8.
Variance of compound distribution is: (120)(3.8) = 456.
Comment: Similar to 3, 5/01, Q.36.
16.3. B. The mean of the primary Geometric Distribution is 1.7. The mean of the secondary
Poisson Distribution is 3.1. Thus the mean of the compound distribution is: (1.7)(3.1) = 5.27.
16.4. A. The mean of the primary Geometric Distribution is 1.7. The mean of the secondary
Poisson Distribution is 3.1. The variance of the primary Geometric is: (1.7)(1+1.7) = 4.59.
The variance of the secondary Poisson Distribution is 3.1.
The variance of the compound distribution is: (1.7)(3.1) + (3.1²)(4.59) = 49.38.
Comment: The variance of the compound distribution is large compared to its mean. A very large
number of items can result if there are a large number of customers from the Geometric combined
with some of those customers buying a large number of items from the Poisson.
Compound distributions tend to have relatively heavy tails.
16.5. E. From the previous solutions, the mean of the compound distribution is 5.27, and the
variance of the compound distribution is 49.38. Thus the standard deviation is 7.03.
1 - Φ[(4.5 - 5.27)/7.03] = 1 - Φ(-0.11) = Φ(0.11) = 0.5438.

16.6. C. Over y hours, the number of salmon has mean 50y and variance 100y.
The mean aggregate number of eggs is: (50y)(3000) = 150000y.
The standard deviation of the aggregate number of eggs is:
√[(50y)(1000²) + (3000²)(100y)] = 30,822√y.
Thus the probability that the aggregate number of eggs is < 2 million is approximately:
Φ[(1,999,999.5 - 150,000y) / (30,822√y)].
Since Φ(2.576) = 0.995, this probability will be 1/2% if:
(1,999,999.5 - 150,000y) / (30,822√y) = -2.576. ⇒ 150,000y - 79,397√y - 1,999,999.5 = 0.
√y = {79,397 ± √[79,397² + (4)(150,000)(1,999,999.5)]} / {(2)(150,000)} = 0.2647 ± 3.6611.
√y = 3.926. y = 15.4. The smallest whole number of hours is therefore 16.


Alternately, try the given choices and stop when (Mean - 2 million)/StdDev > 2.576.
Hours    Mean         Standard Deviation    (Mean - 2 million)/StdDev
14       2,100,000    115,325               0.867
15       2,250,000    119,373               2.094
16       2,400,000    123,288               3.244
17       2,550,000    127,082               4.328
18       2,700,000    130,767               5.353
Comment: Similar to 3, 5/01, Q.16.


Note that since the variance over one hour is 100, the variance of the number of salmon over two
hours is: (2)(100) = 200.
Number of salmon over two hours = number over the first hour + number over the second hour.

Var[Number over two hours] = Var[number over first hour] + Var[number over second hour]
= 2 Var[number over an hour]. We are adding independent random variables, rather than
multiplying an individual variable by a constant.
16.7. B. The mean frequency over 5 days is: (7)(5) = 35.
Mean number of claims per envelope is: (60%)(1) + (30%)(2) + (10%)(3) = 1.5.
Mean of compound distribution is: (35)(1.5) = 52.5.
Second moment of number of claims per envelope is: (60%)(1²) + (30%)(2²) + (10%)(3²) = 2.7.
Variance of compound distribution is: (35)(2.7) = 94.5.
99th percentile ≅ mean + (2.326)(standard deviation) = 52.5 + 2.326√94.5 = 75.1.
Comment: Similar to 3, 11/01, Q.30.
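For readers who like to check this sort of arithmetic by computer, here is a minimal Python sketch (not part of the original solution; it assumes Python 3.8+ for statistics.NormalDist, and the variable names are mine):

from statistics import NormalDist

lam = 7 * 5                                    # Poisson mean number of envelopes over 5 days
claims = {1: 0.60, 2: 0.30, 3: 0.10}           # distribution of claims per envelope

mean_per_env = sum(n * p for n, p in claims.items())        # 1.5
second_moment = sum(n * n * p for n, p in claims.items())   # 2.7

mean_agg = lam * mean_per_env                  # 52.5
var_agg = lam * second_moment                  # 94.5, variance of a compound Poisson

z99 = NormalDist().inv_cdf(0.99)               # about 2.326
print(mean_agg + z99 * var_agg ** 0.5)         # about 75.1, matching choice (B)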

16.8. C. The number of persons has mean: (13)(2) = 26,
and variance: (13)(2)(2 + 1) = 78.
The number of transactions per person has mean:
(30%)(1) + (40%)(2) + (20%)(3) + (10%)(4) = 2.1,
second moment: (30%)(1²) + (40%)(2²) + (20%)(3²) + (10%)(4²) = 5.3,
and variance: 5.3 - 2.1² = 0.89.
The number of transactions in an hour has mean: (26)(2.1) = 54.6,
and variance: (26)(0.89) + (2.1²)(78) = 367.12.
The number of transactions in 5 hours has mean: (5)(54.6) = 273,
and variance: (5)(367.12) = 1835.6.
Φ(0.842) = 80%. 80th percentile ≅ 273 + 0.842√1835.6 = 309.1.
16.9. E. Mean of the Primary Negative Binomial = (100)(0.6) = 60.
Variance of the Primary Negative Binomial = (100)(0.6)(1.6) = 96.
Mean of the Secondary Binomial = (4)(0.1) = 0.4.
Variance of the Secondary Binomial = (4)(0.1)(0.9) = 0.36.
Variance of the Compound Distribution = (60)(0.36) + (0.4²)(96) = 36.96.
16.10. D. Mean of the Compound Distribution = (60)(0.4) = 24.
Prob[# claimants ≤ 20] ≅ Φ[(20.5 - 24)/√36.96] = Φ(-0.58) = 1 - 0.7190 = 28.1%.
16.11. A. Mean Frequency: 24. Variance of Frequency: 36.96.
Mean Severity: (3)(4000) = 12,000. Variance of Severity: (3)(4000²) = 48,000,000.
Mean Aggregate Loss = (24)(12,000) = 288,000.
Variance of the Aggregate Loss = (24)(48,000,000) + (12,000²)(36.96) = 6474 million.
Prob[Aggregate loss > 300,000] ≅ 1 - Φ[(300,000 - 288,000)/√(6474 million)] =
1 - Φ(0.15) = 1 - 0.5596 = 44%.
16.12. E. The number of batters has mean: 3 + (1.4)(1) = 4.4, and variance: (1.4)(1)(1 + 1) = 2.8.
The number of pitches per batter has mean: 1 + (1.8)(1.5) = 3.7,
and variance: (1.8)(1.5)(1 + 1.5) = 6.75.
The number of pitches per half-inning has mean: (4.4)(3.7) = 16.28,
and variance: (4.4)(6.75) + (3.7²)(2.8) = 68.032.
Prob[# pitches > 30] ≅ 1 - Φ[(30.5 - 16.28)/√68.032] = 1 - Φ(1.72) = 4.27%.

16.13. D. Over y minutes, the number of taxicabs has mean 5.6y and variance 5.6y.
The passengers per cab has mean: (0.3)(4) = 1.2, and variance: (0.3)(1 - 0.3)(4) = 0.84.
The mean aggregate number of passengers is: (5.6y)(1.2) = 6.72y.
The standard deviation of the aggregate number of passengers is:
√[(5.6y)(0.84) + (1.2²)(5.6y)] = 3.573√y.
Thus the probability that the aggregate number of passengers is ≥ 1000 is approximately:
1 - Φ[(999.5 - 6.72y)/(3.573√y)]. Since Φ(1.282) = 0.90, this probability will be greater than 90% if:
(Mean - 999.5)/StdDev = (6.72y - 999.5)/(3.573√y) > 1.282.
Try the given choices and stop when (Mean - 999.5)/StdDev > 1.282.
Minutes    Mean       Standard Deviation    (Mean - 999.5)/StdDev
155        1,041.6    44.48                 0.946
156        1,048.3    44.63                 1.094
157        1,055.0    44.77                 1.241
158        1,061.8    44.91                 1.386
159        1,068.5    45.05                 1.531
The smallest whole number of minutes is therefore 158.
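The scan over whole minutes can also be done by computer; the following Python sketch (illustrative only, with my own variable names, assuming Python 3.8+) reproduces the table above:

from statistics import NormalDist

mean_per_min = 5.6 * 1.2                       # 6.72 passengers per minute
var_per_min = 5.6 * 0.84 + 1.2 ** 2 * 5.6      # 12.768 per minute

for minutes in range(155, 160):
    mean = mean_per_min * minutes
    sd = (var_per_min * minutes) ** 0.5
    prob = 1 - NormalDist().cdf((999.5 - mean) / sd)   # continuity corrected
    print(minutes, round(prob, 3))             # probability first exceeds 90% at 158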


16.14. A. The mean number of lawyers is 2.5 and the variance is:
{(1 - 2.5)² + (2 - 2.5)² + (3 - 2.5)² + (4 - 2.5)²}/4 = 1.25.
The mean number of hours per lawyer is: (7)(0.6) = 4.2 and the variance is: (7)(0.4)(0.6) = 1.68.
Therefore, the total number of hours volunteered per day has mean: (2.5)(4.2) = 10.5 and variance:
(2.5)(1.68) + (4.2²)(1.25) = 26.25.
The number of clients per hour has mean 5 and variance 5.
Therefore, the total number of clients per day has mean: (5)(10.5) = 52.5,
and variance: (10.5)(5) + (5²)(26.25) = 708.75.
Prob[# clients ≥ 40] ≅ 1 - Φ[(39.5 - 52.5)/√708.75] = 1 - Φ(-0.49) = 68.79%.
Alternately, the mean number of clients per lawyer is: (4.2)(5) = 21
with variance: (4.2)(5) + (5²)(1.68) = 63.
Therefore, the total number of clients per day has mean: (2.5)(21) = 52.5 and
variance: (2.5)(63) + (21²)(1.25) = 708.75. Proceed as before.
Comment: Similar to 3, 11/00, Q.2.
16.15. A. The mean of the primary Poisson Distribution is 1.2.
The mean of the secondary Binomial Distribution is: (4)(.1) = .4.
Thus the mean of the compound distribution is: (1.2)(.4) = 0.48.

16.16. B. The mean of the primary Poisson Distribution is 1.2. The mean of the secondary
Binomial Distribution is: (4)(0.1) = 0.4. The variance of the primary Poisson Distribution is 1.2.
The variance of the secondary Binomial Distribution is: (4)(0.1)(.9) = 0.36.
The variance of the compound distribution is: (1.2)(0.36) + (0.4²)(1.2) = 0.624.
16.17. A. The compound distribution has mean of 0.48 and variance of 0.624.
Prob[# books > 2] ≅ 1 - Φ[(2.5 - 0.48)/√0.624] = 1 - Φ(2.56) = 1 - 0.9948 = 0.0052.
16.18. B. The mean number of nuggets per pan is: (90%)(1) + (9%)(5) + (1%)(25) = 1.6.
2nd moment of the number of nuggets per pan is: (90%)(1²) + (9%)(5²) + (1%)(25²) = 9.4.
Mean aggregate over 10 days is: (10)(3)(1.6) = 48.
Variance of aggregate over 10 days is: (10)(3)(9.4) = 282.
Prob[aggregate < 30] ≅ Φ[(29.5 - 48)/√282] = Φ(-1.10) = 13.57%.
16.19. A. This is a compound frequency distribution with a primary distribution that is discrete and
uniform on 1 through 5 and with secondary distribution which is Poisson with λ = 30. The primary
distribution has mean of 3 and second moment of:
(1² + 2² + 3² + 4² + 5²)/5 = 11. Thus the primary distribution has variance: 11 - 3² = 2.
Mean of the Compound Dist. = (Mean of Primary Dist.)(Mean of Secondary Dist.) = (3)(30) = 90.
Variance of the Compound Distribution =
(Mean of Primary Dist.)(Variance of Secondary Dist.) +
(Mean of Secondary Dist.)²(Variance of Primary Dist.) = (3)(30) + (30²)(2) = 1890.
Probability of 120 or more patients ≅ 1 - Φ[(119.5 - 90)/√1890] = 1 - Φ(0.68).

16.20. B. Over y hours, the number of salmon has mean 100y and variance 900y.
The mean aggregate number of eggs is: (100y)(5) = 500y.
The variance of the aggregate number of eggs is: (100y)(5) + (5²)(900y) = 23,000y.
Thus the probability that the aggregate number of eggs is < 10,000 is approximately:
Φ[(9999.5 - 500y)/√(23,000y)]. Since Φ(1.645) = 0.95, this probability will be 5% if:
(9999.5 - 500y)/√(23,000y) = -1.645. ⇒ 500y - 249.48√y - 9999.5 = 0.
√y = {249.48 ± √[249.48² + (4)(500)(9999.5)]} / {(2)(500)} = 0.24948 ± 4.479.
√y = 4.729. y = 22.3. The smallest whole number of hours is therefore 23.
Alternately, calculate the probability for each of the number of hours in the choices.
Hours    Mean      Variance    Probability of at least 10,000 eggs
20       10,000    460,000     1 - Φ[(9999.5 - 10,000)/√460,000] = 1 - Φ(-0.0007) = 50.0%
23       11,500    529,000     1 - Φ[(9999.5 - 11,500)/√529,000] = 1 - Φ(-2.063) = 98.0%
26       13,000    598,000     1 - Φ[(9999.5 - 13,000)/√598,000] = 1 - Φ(-3.880) = 99.995%
Thus 20 hours is not enough and 23 hours is enough so that the probability is greater than 95%.
Comment: The number of salmon acts as the primary distribution, and the number of eggs per
salmon as the secondary distribution. This exam question should have been worded better. They
intended to say so the probability that at least 10,000 eggs will be released is greater than 95%.
The probability of exactly 10,000 eggs being released is very small.
16.21. E. The second moment of the number of claimants per accident is:
(1/2)(1²) + (1/3)(2²) + (1/6)(3²) = 3.333. The variance of a Compound Poisson Distribution is:
λ(2nd moment of the secondary distribution) = (12)(3.333) = 40.
Alternately, thinning the original Poisson, those accidents with 1, 2, or 3 claimants are independent
Poissons. Their means are: (1/2)(12) = 6, (1/3)(12) = 4, and (1/6)(12) = 2.
The number of accidents with 3 claimants is Poisson with mean 2.
The variance of the number of accidents with 3 claimants is 2.
Number of claimants for those accidents with 3 claimants = (3)(# of accidents with 3 claimants).
The variance of the # of claimants for those accidents with 3 claimants is: (3²)(2).
Due to independence, the variances of the three processes add: (1²)(6) + (2²)(4) + (3²)(2) = 40.
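Either calculation is easy to confirm with a few lines of Python (an illustrative check only, not part of the original solution):

lam = 12
secondary = {1: 1/2, 2: 1/3, 3: 1/6}           # claimants per accident

# variance of a compound Poisson = lambda times the 2nd moment of the secondary
second_moment = sum(n * n * p for n, p in secondary.items())
print(lam * second_moment)                     # 40.0

# thinning: accidents with n claimants are Poisson with mean lam * p(n),
# and contribute n^2 times that mean to the variance
print(sum(n * n * lam * p for n, p in secondary.items()))   # 40.0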

16.22. B. Mean # claims / envelope = (1)(0.2) + (2)(0.25) + (3)(0.4) + (4)(0.15) = 2.5.
2nd moment # claims / envelope = (1²)(0.2) + (2²)(0.25) + (3²)(0.4) + (4²)(0.15) = 7.2.
Over 13 weeks, the number of envelopes is Poisson with mean: (13)(50) = 650.
Mean of the compound distribution = (650)(2.5) = 1625.
Variance of the aggregate number of claims = Variance of a compound Poisson distribution =
(mean of primary Poisson distribution)(2nd moment of the secondary distribution) = (650)(7.2) = 4680.
Φ(1.282) = 0.90. Estimated 90th percentile = 1625 + 1.282√4680 = 1713.

16.23. E. The amount won per round of the game is a compound frequency distribution.
Primary distribution (determining how many dice are rolled) is a six-sided die, uniform and discrete on
1 through 6, with mean 3.5, second moment (1² + 2² + 3² + 4² + 5² + 6²)/6 = 91/6,
and variance 91/6 - 3.5² = 35/12.
Secondary distribution is also a six-sided die, with mean 3.5 and variance 35/12.
Mean of the compound distribution is: (3.5)(3.5) = 12.25.
Variance of the compound distribution is: (3.5)(35/12) + (3.5²)(35/12) = 45.94.
Therefore, the net result of a round has mean 12.25 - 12.5 = -0.25, and variance 45.94.
1000 rounds have a net result with mean -250 and variance 45,940.
Prob[net result ≥ 0] ≅ 1 - Φ[(-0.5 + 250)/√45,940] = 1 - Φ(1.16) = 1 - 0.8770 = 0.1220.
16.24. B. The total number of delayed passengers is a compound frequency distribution, with
primary distribution the number of delayed flights, and the secondary distribution the number of
passengers on a flight.
The number of flights delayed per year is Poisson with mean: (2)(12) = 24.
The second moment of the secondary distribution is: 50² + 30² = 3400.
The variance of the number of passengers delayed per year is: (24)(3400) = 81,600.
The standard deviation of the number of passengers delayed per year is: √81,600 = 285.66.
The standard deviation of the annual compensation is: (100)(285.66) = 28,566.

16.25. D. The mean number of sessions is:
(400)(0.2)(2) + (300)(0.5)(15) + (200)(0.3)(9) = 2950.
For a single resident we have a Bernoulli primary (whether the resident needs therapy) and a
geometric secondary (how many visits).
This has variance: (mean of primary)(variance of secondary) + (mean of secondary)²(variance of primary)
= qβ(1 + β) + β²q(1 - q).
For a resident in state 1, the variance of the number of visits is:
(0.2)(2)(3) + (2²)(0.2)(1 - 0.2) = 1.84.
For state 2, the variance of the number of visits is: (0.5)(15)(16) + (15²)(0.5)(1 - 0.5) = 176.25.
For state 3, the variance of the number of visits is: (0.3)(9)(10) + (9²)(0.3)(1 - 0.3) = 44.01.
The sum of the visits from 400 residents in state 1, 300 in state 2, and 200 in state 3, has variance:
(400)(1.84) + (300)(176.25) + (200)(44.01) = 62,413.
Prob[sessions > 3000] ≅ 1 - Φ[(3000.5 - 2950)/√62,413] = 1 - Φ(0.20) = 0.4207.
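A short Python sketch (illustrative only; the tuples below just restate the table in the question, and Python 3.8+ is assumed) reproduces the variance and the final probability:

from statistics import NormalDist

# (number in state, probability of needing therapy q, mean number of visits beta)
states = [(400, 0.2, 2), (300, 0.5, 15), (200, 0.3, 9)]

mean = sum(n * q * b for n, q, b in states)                                   # 2950
var = sum(n * (q * b * (1 + b) + b * b * q * (1 - q)) for n, q, b in states)  # 62413
print(var)
print(1 - NormalDist().cdf((3000.5 - mean) / var ** 0.5))                     # about 0.42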
16.26. E. Primary distribution has mean: (0)(0.1) + (1)(0.4) + (2)(0.3) + (3)(0.2) = 1.6,
second moment: (0²)(0.1) + (1²)(0.4) + (2²)(0.3) + (3²)(0.2) = 3.4, and variance: 3.4 - 1.6² = 0.84.
The secondary distribution has mean 3 and variance 3.
The compound distribution has variance: (1.6)(3) + (3²)(0.84) = 12.36.
16.27. E. Mean = (mean primary)(mean secondary) = (100)(1.1)(1.0) = 110.
Variance = (mean primary)(variance of secondary) + (mean secondary)²(variance of primary) =
(100)(1.1)(1)(1 + 1) + {(1.1)(1.0)}²(100) = 341. Φ(2.326) = 0.99.
99th percentile ≅ 110 + 2.326√341 = 152.95. Need at least 153 televisions.
16.28. A. The primary distribution is Binomial with m = 1000 and q = 0.2, with mean 200 and
variance 160. The mean of the compound distribution is: (200)(20) = 4000.
The variance of the compound distribution is: (200)(20) + (20²)(160) = 68,000.
Annual budget is: 10(4000 + √68,000) = 42,608.


Section 17, Mixed Frequency Distributions


One can mix frequency models together by taking a weighted average of different frequency
models. This can involve either a discrete mixture of several different frequency distributions or a
continuous mixture over a portfolio as a parameter varies.
For example, one could mix together Poisson Distributions with different means.123
Discrete Mixtures:
Assume there are four types of risks, each with claim frequency given by a Poisson distribution:
Type         Average Annual Claim Frequency    A Priori Probability
Excellent    1                                 40%
Good         2                                 30%
Bad          3                                 20%
Ugly         4                                 10%

Recall that for a Poisson Distribution with parameter λ the chance of having n claims is given by:
f(n) = λ^n e^(-λ) / n!.
So for example for an Ugly risk with λ = 4, the chance of n claims is: 4^n e^(-4) / n!.
For an Ugly risk the chance of 6 claims is: 4^6 e^(-4) / 6! = 10.4%.
Similarly the chance of 6 claims for Excellent, Good, or Bad risks are: 0.05%, 1.20%, and 5.04%
respectively.
If we have a risk but do not know what type it is, we weight together the 4 different chances of
having 6 claims, using the a priori probabilities of each type of risk in order to get the chance of
having 6 claims: (0.4)(0.05%) + (0.3)(1.20%) + (0.2)(5.04%) + (0.1)(10.42%) = 2.43%.
The table below displays similar values for other numbers of claims.
The probabilities in the final column represent the assumed distribution of the number of claims for
the entire portfolio of risks.124 This distribution for all risks is the mixed distribution. While the mixed
distribution is easily computed by weighting together the four Poisson distributions, it is not itself a
Poisson nor any other well known distribution.
123 The parameter of a Poisson is its mean. While one can mix together other frequency distributions, for example
Binomials or Negative Binomials, you are most likely to be asked about mixing Poissons. (It is unclear what if anything
they will ask on this subject.)
124 Prior to any observations. The effect of observations will be discussed in Mahler's Guide to Buhlmann Credibility
and Mahler's Guide to Conjugate Priors.



Number of    Probability for    Probability for    Probability for    Probability for    Probability for
Claims       Excellent Risks    Good Risks         Bad Risks          Ugly Risks         All Risks
0            0.3679             0.1353             0.0498             0.0183             0.1995
1            0.3679             0.2707             0.1494             0.0733             0.2656
2            0.1839             0.2707             0.2240             0.1465             0.2142
3            0.0613             0.1804             0.2240             0.1954             0.1430
4            0.0153             0.0902             0.1680             0.1954             0.0863
5            0.0031             0.0361             0.1008             0.1563             0.0478
6            0.0005             0.0120             0.0504             0.1042             0.0243
7            0.0001             0.0034             0.0216             0.0595             0.0113
8            0.0000             0.0009             0.0081             0.0298             0.0049
9            0.0000             0.0002             0.0027             0.0132             0.0019
10           0.0000             0.0000             0.0008             0.0053             0.0007
11           0.0000             0.0000             0.0002             0.0019             0.0002
12           0.0000             0.0000             0.0001             0.0006             0.0001
13           0.0000             0.0000             0.0000             0.0002             0.0000
14           0.0000             0.0000             0.0000             0.0001             0.0000
SUM          1.0000             1.0000             1.0000             1.0000             1.0000

The density function of the mixed distribution is the mixture of the density functions for
specific values of the parameter λ that is mixed.
Moments of Mixed Distributions:
The overall (a priori) mean can be computed in either one of two ways.
First one can weight together the means for each type of risk, using their (a priori) probabilities:
(0.4)(1) + (0.3)(2) + (0.2)(3) + (0.1)(4) = 2.
Alternately, one can compute the mean of the mixed distribution:
(0)(0.1995) + (1)(0.2656) + (2)(0.2142) + ... = 2.
In either case, the mean of this mixed distribution is 2.
The mean of a mixed distribution is the mixture of the means for specific values of the
parameter λ: E[X] = Eλ[E[X | λ]].
One can calculate the second moment of a mixture in a similar manner.
Exercise: What is the second moment of a Poisson distribution with λ = 3?
[Solution: Second Moment = Variance + Mean² = 3 + 3² = 12.]



In general, the second moment of a mixture is the mixture of the second moments.
In the case of this mixture, the second moment is:
(0.4)(2) + (0.3)(6) + (0.2)(12) + (0.1)(20) = 7.
One can verify this second moment, by working directly with the mixed distribution:
Number
Probability
Probability
of for
for
Excellent
Claims
Good
Bad
Ugly
All
Risks
Risks
Risks
Risks
Risks
0.1995
0.2656
0.2142
0.1430
0.0863
0.0478
0.0243
0.0113
0.0049
0.0019
0.0007
0.0002
0.0001
0.0000
0.0000
Mean
Average

Number of
Claims

Square of #
of Claims

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14

0
1
4
9
16
25
36
49
64
81
100
121
144
169
196

2.000

7.000

Exercise: What is the variance of this mixed distribution?
[Solution: 7 - 2² = 3.]
First one mixes the moments, and then computes the variance of the mixture from its
first and second moments.125
In general, the nth moment of a mixed distribution is the mixture of the nth moments for
specific values of the parameter λ: E[X^n] = Eλ[E[X^n | λ]].126
There is nothing unique about assuming four types of risks. If one had assumed for example 100
different types of risks, with mean frequencies from 0.1 to 10, there would have been no change in
the conceptual complexity of the situation, although the computational complexity would have
increased. This discrete example can be extended to a continuous case.
125 As discussed in Mahler's Guide to Buhlmann Credibility, one can split the variance of a mixed distribution into
two pieces, the Expected Value of the Process Variance and the Variance of the Hypothetical Means.
126 Third and higher moments are more likely to be asked about for Loss Distributions.
Mixtures of Loss Distributions are discussed in Mahler's Guide to Loss Distributions.
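The mixed density and its moments above are easy to reproduce by computer. The following Python sketch (an illustrative check, not part of the text) mixes the four Poisson densities with the a priori weights:

from math import exp, factorial

types = [(0.4, 1), (0.3, 2), (0.2, 3), (0.1, 4)]        # (a priori weight, lambda)

def mixed_density(n):
    return sum(w * lam ** n * exp(-lam) / factorial(n) for w, lam in types)

print(round(mixed_density(6), 4))                       # about 0.0243, as in the table
mean = sum(n * mixed_density(n) for n in range(60))
second = sum(n * n * mixed_density(n) for n in range(60))
print(mean, second, second - mean ** 2)                 # 2, 7, and variance 3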



Continuous Mixtures:
We have seen how one can mix a discrete number of Poisson Distributions.127 For a continuous
mixture, the mixed distribution is given as the integral of the product of the distribution of the
parameter λ times the Poisson density function given λ:128
g(x) = ∫ f(x; λ) u(λ) dλ.
The density function of the mixed distribution is the mixture of the density functions for
specific values of the parameter λ that is mixed.
Exercise: The claim count N for an individual insured has a Poisson distribution with mean λ.
λ is uniformly distributed between 0.3 and 0.8.
Find the probability that a randomly selected insured will have one claim.
[Solution: For the Poisson Distribution, f(1 | λ) = λe^(-λ).
∫ from 0.3 to 0.8 of (1/0.5) λe^(-λ) dλ = (2){-λe^(-λ) - e^(-λ)} evaluated from λ = 0.3 to λ = 0.8
= (2)(1.3e^(-0.3) - 1.8e^(-0.8)) = 30.85%.]
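The integral can be checked numerically; here is a minimal Python sketch (illustrative only) using the midpoint rule:

from math import exp

a, b, steps = 0.3, 0.8, 100000
h = (b - a) / steps
# (1/0.5) times the integral of lambda * exp(-lambda) over (0.3, 0.8)
prob_one_claim = sum(2.0 * (a + (i + 0.5) * h) * exp(-(a + (i + 0.5) * h)) * h
                     for i in range(steps))
print(prob_one_claim)        # about 0.3085, matching 30.85%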

Continuous mixtures can be performed of either frequency distributions or loss distributions.129
Such a continuous mixture is called a Mixture Distribution.130
Mixture Distribution ⇔ Continuous Mixture of Models.
Mixture Distributions can be created from other frequency distributions than the Poisson.
For example, if f is a Binomial with fixed m, one could mix on the parameter q:
g(x) = ∫ f(x; q) u(q) dq.
For example, if f is a Negative Binomial with fixed r, one could mix on the parameter β:
g(x) = ∫ f(x; β) u(β) dβ.
If f is a Negative Binomial with fixed r, one could instead mix on the parameter p = 1/(1+β).
127 One can mix other frequency distributions besides the Poisson.
128 The very important Gamma-Poisson situation is discussed in a subsequent section.
129 See the section on Continuous Mixtures of Models in Mahler's Guide to Loss Distributions.
130 See Section 5.2.4 of Loss Models.



Moments of Continuous Mixtures:
As in the case of discrete mixtures, the nth moment of a continuous mixture is the mixture of the nth
moments for specific values of the parameter λ: E[X^n] = Eλ[E[X^n | λ]].
Exercise: What is the mean for a mixture of Poissons?
[Solution: For a given value of lambda, the mean of a Poisson Distribution is λ. We need to weight
these first moments together via the density of lambda u(λ): ∫ λ u(λ) dλ = mean of u.]
If, for example, λ were uniformly distributed from 0.1 to 0.5, then the mean of the mixed distribution
would be 0.3.
In general, the mean of a mixture of Poissons is the mean of the mixing distribution.131 For the case of
a mixture of Poissons via a Gamma Distribution with parameters α and θ, the mean of the mixed
distribution is that of the Gamma, αθ.132
Exercise: What is the Second Moment for Poissons mixed via a Gamma Distribution with
parameters α and θ?
[Solution: For a given value of lambda, the second moment of a Poisson Distribution is λ + λ².
We need to weight these second moments together via the density of lambda: λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)}.
∫ from 0 to ∞ of (λ + λ²) λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)} dλ
= ∫ from 0 to ∞ of (λ^α + λ^(α+1)) e^(-λ/θ) / {θ^α Γ(α)} dλ
= {Γ(α+1)θ^(α+1) + Γ(α+2)θ^(α+2)} / {θ^α Γ(α)} = αθ + α(α+1)θ².]
Since the mean of the mixed distribution is that of the Gamma, αθ, the variance of the mixed
distribution is: αθ + α(α+1)θ² - (αθ)² = αθ + αθ².
As will be discussed, the mixed distribution is a Negative Binomial Distribution, with r = α and
β = θ. Thus the variance of the mixed distribution is: αθ + αθ² = rβ + rβ² = rβ(1+β), which is in fact
the variance of a Negative Binomial Distribution.
131 This result will hold whenever the parameter being mixed is the mean, as it was in the case of the Poisson.
132 The Gamma-Poisson will be discussed in a subsequent section.
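This Negative Binomial result can also be checked numerically. The Python sketch below (illustrative only; alpha = 3 and theta = 2 are arbitrary values chosen just for the check) integrates the Poisson density against the Gamma density and compares the result to the Negative Binomial density with r = alpha and beta = theta:

from math import exp, factorial, gamma

alpha, theta = 3.0, 2.0

def mixed_density(n, steps=100000, top=60.0):
    # midpoint-rule integral of Poisson(n | lam) times the Gamma density of lam
    h = top / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = lam ** n * exp(-lam) / factorial(n)
        gamma_density = lam ** (alpha - 1) * exp(-lam / theta) / (theta ** alpha * gamma(alpha))
        total += poisson * gamma_density * h
    return total

def neg_binomial_density(n, r, beta):
    return gamma(r + n) / (gamma(r) * factorial(n)) * beta ** n / (1 + beta) ** (r + n)

for n in range(5):
    print(n, round(mixed_density(n), 6), round(neg_binomial_density(n, alpha, theta), 6))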



Factorial Moments of Mixed Distributions:
The nth factorial moment of a mixed distribution is the mixture of the nth factorial moments for specific
values of the parameter λ:
E[(X)(X-1) ... (X-n+1)] = Eλ[E[(X)(X-1) ... (X-n+1) | λ]].
When we are mixing Poissons, the factorial moments of the mixed distribution have a simple form.
nth factorial moment of mixed Poisson = E[(X)(X-1) ... (X-n+1)] = Eλ[E[(X)(X-1) ... (X-n+1) | λ]] =
Eλ[nth factorial moment of Poisson] = Eλ[λ^n] = nth moment of the mixing distribution.133
Exercise: Given Poissons are mixed via a distribution u(λ), what are the mean and variance of the
mixed distribution?
[Solution: The mean of the mixed distribution = first factorial moment =
mean of the mixing distribution.
The second moment of the mixed distribution =
second factorial moment + first factorial moment =
second moment of the mixing distribution + mean of the mixing distribution.
Variance of the mixed distribution =
second moment of mixed distribution - (mean of mixed distribution)² =
second moment of the mixing distribution + mean of the mixing distribution -
(mean of the mixing distribution)² =
Variance of the mixing distribution + Mean of the mixing distribution.]
When mixing Poissons, Mean of the Mixed Distribution = Mean of the Mixing Distribution,
and the Variance of the Mixed Distribution =
Variance of the Mixing Distribution + Mean of the Mixing Distribution.
Therefore, for a mixture of Poissons, the variance of the mixed distribution is always
greater than the mean of the mixed distribution.
For example, for a Gamma mixing distribution, the variance of the mixed Poisson is:
Variance of the Gamma + Mean of the Gamma = αθ² + αθ.

133 See equation 8.24 in Insurance Risk Models by Panjer & Willmot.
Probability Generating Functions of Mixed Distributions:
The Probability Generating Function of the mixed distribution is the mixture of the probability
generating functions for specific values of the parameter λ:
P(z) = ∫ P(z; λ) u(λ) dλ.

Exercise: What is the Probability Generating Function for Poissons mixed via a Gamma Distribution
with parameters α and θ?
[Solution: For a given value of lambda, the p.g.f. of a Poisson Distribution is e^(λ(z-1)).
We need to weight these Probability Generating Functions together via the density of lambda:
λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)}.
P(z) = ∫ from 0 to ∞ of e^(λ(z-1)) λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)} dλ
= ∫ from 0 to ∞ of λ^(α-1) e^(-λ(1/θ + 1 - z)) / {θ^α Γ(α)} dλ
= {Γ(α) (1/θ + 1 - z)^(-α)} / {θ^α Γ(α)} = {1 - θ(z-1)}^(-α).]
This is the p.g.f. of a Negative Binomial Distribution with r = α and β = θ. This is one way to establish
that when Poissons are mixed via a Gamma Distribution, the mixed distribution is always a Negative
Binomial Distribution, with r = α = shape parameter of the Gamma and
β = θ = scale parameter of the Gamma.134

134 The Gamma-Poisson frequency process is the subject of an important subsequent section.
Mixing Poissons:135
In the very important case of mixing Poisson frequency distributions, the p.g.f. of the mixed
distribution can be put in terms of the Moment Generating Function of the mixing distribution of λ.
The Moment Generating Function of a distribution is defined as: M_X(t) = E[e^(xt)].136
For a mixture of Poissons:
P_mixed distribution(z) = E_mixing distribution of λ[P_Poisson(z)] = E_mixing distribution of λ[exp[λ(z - 1)]]
= M_mixing distribution of λ(z - 1).
Thus when mixing Poissons, P_mixed distribution(z) = M_mixing distribution of λ(z - 1).137
Exercise: Apply the above formula for probability generating functions to Poissons mixed via a
Gamma Distribution.
[Solution: The m.g.f. of a Gamma Distribution with parameters α and θ is: (1 - θt)^(-α).
Therefore, the p.g.f. of the mixed distribution is:
M_mixing distribution(z - 1) = {1 - θ(z - 1)}^(-α).
Comment: This is the p.g.f. of a Negative Binomial Distribution, with r = α and β = θ.
Therefore, the mixture of Poissons via a Gamma, with parameters α and θ, is a Negative Binomial
Distribution, with r = α and β = θ.]
M_X(t) = E_X[e^(xt)] = E_X[(e^t)^x] = P_X[e^t]. Therefore, when mixing Poissons:
M_mixed distribution(t) = P_mixed distribution(e^t) = M_mixing distribution of λ(e^t - 1).
Exercise: Apply the above formula for moment generating functions to Poissons mixed via an
Inverse Gaussian Distribution with parameters μ and θ.
[Solution: The m.g.f. of an Inverse Gaussian Distribution with parameters μ and θ is:
exp[(θ/μ)(1 - √(1 - 2μ²t/θ))].
Therefore, the moment generating function of the mixed distribution is:
M_mixing distribution of λ(e^t - 1) = exp[(θ/μ){1 - √(1 - 2μ²(e^t - 1)/θ)}].]

135 See Section 6.10.2 of Loss Models, not on the syllabus.
136 See Definition 3.9 in Loss Models and Mahler's Guide to Aggregate Distributions.
The moment generating functions of loss distributions are shown in Appendix B, when they exist.
137 See Equation 6.45 in Loss Models.



Exercise: The p.g.f. of the Zero-Truncated Negative Binomial Distribution is:
P(z) = [{1 - β(z - 1)}^(-r) - (1 + β)^(-r)] / [1 - (1 + β)^(-r)]
     = 1 + [{1 - β(z - 1)}^(-r) - 1] / [1 - (1 + β)^(-r)],  z < 1 + 1/β.
What is the moment generating function of a compound Poisson-Extended Truncated Negative
Binomial Distribution, with parameters λ = (θ/μ){(1 + 2μ²/θ)^0.5 - 1}, r = -1/2, and β = 2μ²/θ?
[Solution: The p.g.f. of a Poisson Distribution with parameter λ is: P(z) = e^(λ(z - 1)).
For a compound distribution, the m.g.f. can be written in terms of the p.g.f. of the primary distribution
and the m.g.f. of the secondary distribution:
M_compound dist.(t) = P_primary[M_secondary[t]] = P_primary[P_secondary[e^t]] =
exp[λ{P_secondary[e^t] - 1}] = exp[λ ({1 - β(e^t - 1)}^(-r) - 1) / {1 - (1 + β)^(-r)}] =
exp[ (θ/μ){√(1 + 2μ²/θ) - 1}{√(1 - 2(μ²/θ)(e^t - 1)) - 1} / {1 - √(1 + 2μ²/θ)} ] =
exp[(θ/μ){1 - √(1 - 2μ²(e^t - 1)/θ)}].
Comment: This is the same as the m.g.f. of Poissons mixed via an Inverse Gaussian Distribution
with parameters μ and θ.]
Since their moment generating functions are equal, if a Poisson is mixed by an Inverse
Gaussian as per Loss Models, with parameters μ and θ, then the mixed distribution is a
compound Poisson-Extended Truncated Negative Binomial Distribution as per Loss Models,
with parameters: λ = (θ/μ)(√(1 + 2μ²/θ) - 1), r = -1/2, and β = 2μ²/θ.138

This is an example of a general result:139 If one mixes Poissons and the mixing distribution is infinitely
divisible,140 then the resulting mixed distribution can also be written as a compound Poisson
distribution, with a unique secondary distribution.
The Inverse Gaussian Mixing Distribution was infinitely divisible and the result of mixing the
Poissons was a Compound Poisson Distribution with a particular Extended Truncated Negative
Binomial Distribution as a secondary distribution.
138 See Example 6.26 in Loss Models.
139 See Theorem 6.20 in Loss Models.
140 As discussed previously, if a distribution is infinitely divisible, then if one takes the probability generating function
to any positive power, one gets the probability generating function of another member of the same family of
distributions. Examples of infinitely divisible distributions include: Poisson, Negative Binomial, Compound Poisson,
Compound Negative Binomial, Normal, Gamma, and Inverse Gaussian.



Another example is mixing Poissons via a Gamma. The Gamma is infinitely divisible, and therefore
the mixed distribution can be written as a compound distribution. As discussed previously, the
mixed distribution is a Negative Binomial. It turns out that the Negative Binomial can also be written
as a Compound Poisson with a logarithmic secondary distribution.
Exercise: The logarithmic frequency distribution has:
f(x) = {β/(1+β)}^x / {x ln(1+β)}, x = 1, 2, 3,...   P(z) = 1 - ln[1 - β(z - 1)]/ln[1+β], z < 1 + 1/β.
Determine the probability generating function of a Compound Poisson with a logarithmic secondary
distribution.
[Solution: P_compound distribution(z) = P_primary[P_secondary[z]] =
exp[λ{P_secondary[z] - 1}] = exp[-λ ln[1 - β(z - 1)] / ln[1+β]]
= {1 - β(z - 1)}^(-λ/ln[1+β]).]
The p.g.f. of the Negative Binomial is: P(z) = {1 - β(z - 1)}^(-r). This is the same form as the probability
generating function obtained in the exercise, with r = λ/ln[1+β] and β = β.
Therefore, a Compound Poisson with a logarithmic secondary distribution is a Negative Binomial
Distribution with parameters r = λ/ln[1 + β] and β = β.141
Mixing versus Adding:
The number of accidents Alice has is Poisson with mean 3%.
The number of accidents Bob has is Poisson with mean 5%.
The number of accidents Alice and Bob have are independent.
Exercise: Determine the probability that Alice and Bob have a total of two accidents.
[Solution: Their total number of accidents is Poisson with mean 8%. 0.08² e^(-0.08) / 2 = 0.30%.
Comment: An example of adding two Poisson variables.]
Exercise: We choose either Alice or Bob at random.
Determine the probability that the chosen person has two accidents.
[Solution: (50%)(0.03² e^(-0.03) / 2) + (50%)(0.05² e^(-0.05) / 2) = 0.081%.
Comment: A 50%-50% mixture of two Poisson Distributions with means 3% and 5%.
Mixing is different than adding.]
141 See Example 6.14 in Loss Models, not on the syllabus.
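The contrast between adding and mixing is also easy to see by computer; a minimal Python sketch (illustrative only) reproduces both answers:

from math import exp, factorial

def poisson(n, lam):
    return lam ** n * exp(-lam) / factorial(n)

# adding: the total for Alice and Bob is Poisson with mean 0.03 + 0.05 = 0.08
print(poisson(2, 0.08))                                    # about 0.0030

# mixing: pick one of the two people at random, then count that person's accidents
print(0.5 * poisson(2, 0.03) + 0.5 * poisson(2, 0.05))     # about 0.00081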



Problems:
Use the following information for the next three questions:
Each insureds claim frequency follows a Poisson process.
There are three types of insureds as follows:
Type    A Priori Probability    Mean Annual Claim Frequency (Poisson Parameter)
A       60%                     1
B       30%                     2
C       10%                     3
17.1 (1 point) What is the chance of a single individual having 4 claims in a year?
A. less than 0.03
B. at least 0.03 but less than 0.04
C. at least 0.04 but less than 0.05
D. at least 0.05 but less than 0.06
E. at least 0.06
17.2 (1 point) What is the mean of this mixed distribution?
A. 1.1
B. 1.2
C. 1.3
D. 1.4
E. 1.5
17.3 (2 points) What is the variance of this mixed distribution?
A. less than 2.0
B. at least 2.0 but less than 2.1
C. at least 2.1 but less than 2.2
D. at least 2.2 but less than 2.3
E. at least 2.3

17.4 (7 points) Each insured has its annual number of claims given by a Geometric Distribution
with mean β. Across a portfolio of insureds, β is distributed as follows: π(β) = 3/(1+β)⁴, 0 < β < ∞.
(a) Determine the algebraic form of the density of this mixed distribution.
(b) List the first several values of this mixed density.
(c) Determine the mean of this mixed distribution.
(d) Determine the variance of this mixed distribution.



17.5 (1 point) Each insured's claim frequency follows a Binomial Distribution, with m = 5.
There are three types of insureds as follows:
Type    A Priori Probability    Binomial Parameter q
A       60%                     0.1
B       30%                     0.2
C       10%                     0.3
What is the chance of a single individual having 3 claims in a year?
A. less than 0.03
B. at least 0.03 but less than 0.04
C. at least 0.04 but less than 0.05
D. at least 0.05 but less than 0.06
E. at least 0.06
Use the following information for the following four questions:

The claim count N for an individual insured has a Poisson distribution with mean λ.
λ is uniformly distributed between 0 and 4.
17.6 (2 points) Find the probability that a randomly selected insured will have no claims.
A. Less than 0.22
B. At least 0.22 but less than 0.24
C. At least 0.24 but less than 0.26
D. At least 0.26 but less than 0.28
E. At least 0.28
17.7 (2 points) Find the probability that a randomly selected insured will have one claim.
A. Less than 0.22
B. At least 0.22 but less than 0.24
C. At least 0.24 but less than 0.26
D. At least 0.26 but less than 0.28
E. At least 0.28
17.8 (1 point) What is the mean claim frequency?
17.9 (1 point) What is the variance of the mixed frequency distribution?

17.10 (4 points) For a given value of q, the number of claims is Binomial with parameters m and q.
However, m is distributed via a Negative Binomial with parameters r and .
What is the mixed distribution of the number of claims?



Use the following information for the next 3 questions:
Assume that given q, the number of claims observed for one risk in m trials is given by a Binomial
distribution with mean mq and variance mq(1-q). Also assume that the parameter q varies between
0 and 1 for the different risks, with q following a Beta distribution:
g(q) = {Γ(a + b) / (Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1), with mean a/(a + b) and variance ab/{(a + b)²(a + b + 1)}.
17.11 (2 points) What is the unconditional mean frequency?
A. m / (a + b)
B. m a / (a + b)
C. m ab / (a + b)
D. m a / {(a + b)(a + b + 1)}
E. m a / {(a + b)²(a + b + 1)}

17.12 (4 points) What is the unconditional variance?
A. m² a / {(a + b)(a + b + 1)}
B. m² a / (a + b)
C. m² ab / {(a + b)(a + b + 1)}
D. m(m + a + b) ab / {(a + b)²(a + b + 1)}
E. m(m + a + b) ab / {(a + b)(a + b + 1)(a + b + 2)}

17.13 (4 points) If a = 2 and b = 4, then what is the probability of observing 5 claims in 7 trials for an
individual insured?
A. less than 0.068
B. at least 0.068 but less than 0.070
C. at least 0.070 but less than 0.072
D. at least 0.072 but less than 0.074
E. at least 0.074

17.14 (2 points)
Each insured's claim frequency follows a Negative Binomial Distribution, with r = 0.8.
There are two types of insureds as follows:
Type    A Priori Probability    β
A       70%                     0.2
B       30%                     0.5
What is the chance of an insured picked at random having 1 claim next year?
A. 13%
B. 14%
C. 15%
D. 16%
E. 17%



17.15 (3 points) For a given value of q, the number of claims is Binomial with parameters m and q.
However, m is distributed via a Poisson with mean .
What is the mixed distribution of the number of claims?
Use the following information for the next two questions:
The number of claims a particular policyholder makes in a year follows a distribution with parameter
p: f(x) = p(1-p)^x, x = 0, 1, 2, ....
The values of the parameter p for the individual policyholders in a portfolio follow a Beta Distribution,
with parameters a = 4, b = 5, and θ = 1: g(p) = 280 p³(1-p)⁴, 0 ≤ p ≤ 1.
17.16 (2 points) What is the a priori mean annual claim frequency for the portfolio?
A. less than 1.5
B. at least 1.5 but less than 1.6
C. at least 1.6 but less than 1.7
D. at least 1.7 but less than 1.8
E. at least 1.8
17.17 (3 points) For an insured picked at random from this portfolio, what is the probability
of observing 2 claims next year?
A. 9%
B. 10%
C. 11%
D. 12%
E. 13%

Use the following information for the next 2 questions:


(i) An individual insured has an annual claim frequency that follows a Poisson distribution with
mean λ.
(ii) Across the portfolio of insureds, the parameter λ has probability density function:
π(λ) = (0.8)(40e^(-40λ)) + (0.2)(10e^(-10λ)).
17.18 (1 point) What is the expected annual frequency?
(A) 3.6%
(B) 3.7%
(C) 3.8%
(D) 3.9%
(E) 4.0%
17.19 (2 points) For an insured picked at random, what is the probability that he will have at least
one claim in the coming year?
(A) 3.6%
(B) 3.7%
(C) 3.8%
(D) 3.9%
(E) 4.0%

17.20 (4 points) For a given value of q, the number of claims is Binomial with parameters m and q.
However, m is distributed via a Binomial with parameters 5 and 0.1.
What is the mixed distribution of the number of claims?



Use the following information for the next four questions:
For a given value of q, the number of claims is Binomial distributed with parameters m = 3 and q.
In turn q is distributed uniformly from 0 to 0.4.
17.21 (2 points) What is the chance that zero claims are observed?
A. Less than 0.52
B. At least 0.52 but less than 0.53
C. At least 0.53 but less than 0.54
D. At least 0.54 but less than 0.55
E. At least 0.55
17.22 (2 points) What is the chance that one claim is observed?
A. Less than 0.32
B. At least 0.32 but less than 0.33
C. At least 0.33 but less than 0.34
D. At least 0.34 but less than 0.35
E. At least 0.35
17.23 (2 points) What is the chance that two claims are observed?
A. Less than 0.12
B. At least 0.12 but less than 0.13
C. At least 0.13 but less than 0.14
D. At least 0.14 but less than 0.15
E. At least 0.15
17.24 (2 points) What is the chance that three claims are observed?
A. Less than 0.01
B. At least 0.01 but less than 0.02
C. At least 0.02 but less than 0.03
D. At least 0.03 but less than 0.04
E. At least 0.04

17.25 (2 points) For students at a certain college, 40% do not own cars and do not drive.
For the rest of the students, their accident frequency is Poisson with λ = 0.07.
Let T = the total number of accidents for a group of 100 students picked at random.
What is the variance of T?
A. 4.0
B. 4.1
C. 4.2
D. 4.3
E. 4.4



Use the following information for the next 7 questions:
On his daily walk, Clumsy Klem loses coins at a Poisson rate.
At random, on half the days, Klem loses coins at a rate of 0.2 per minute.
On the other half of the days, Klem loses coins at a rate of 0.6 per minute.
The rate on any day is independent of the rate on any other day.
17.26 (2 points) Calculate the probability that Clumsy Klem loses exactly one coin during the sixth
minute of todays walk.
(A) 0.21
(B) 0.23
(C) 0.25
(D) 0.27
(E) 0.29
17.27 (2 points) Calculate the probability that Clumsy Klem loses exactly one coin during the first
two minutes of todays walk.
A. Less than 32%
B. At least 32%, but less than 34%
C. At least 34%, but less than 36%
D. At least 36%, but less than 38%
E. At least 38%
17.28 (2 points) Let A = the number of coins that Clumsy Klem loses during the first minute of
todays walk. Let B = the number of coins that Clumsy Klem loses during the first minute of
tomorrows walk. Calculate Prob[A + B = 1].
(A) 0.30
(B) 0.32
(C) 0.34
(D) 0.36
(E) 0.38
17.29 (2 points) Calculate the probability that Clumsy Klem loses exactly one coin during the third
minute of todays walk and exactly one coin during the fifth minute of todays walk.
(A) 0.05
(B) 0.06
(C) 0.07
(D) 0.08
(E) 0.09
17.30 (2 points) Calculate the probability that Clumsy Klem loses exactly one coin during the third
minute of todays walk and exactly one coin during the fifth minute of tomorrows walk.
(A) 0.05
(B) 0.06
(C) 0.07
(D) 0.08
(E) 0.09
17.31 (2 points) Calculate the probability that Clumsy Klem loses exactly one coin during the first
four minutes of todays walk and exactly one coin during the first four minutes of tomorrows walk.
A. Less than 8.5%
B. At least 8.5%, but less than 9.0%
C. At least 9.0%, but less than 9.5%
D. At least 9.5%, but less than 10.0%
E. At least 10.0%
17.32 (3 points) Calculate the probability that Clumsy Klem loses exactly one coin during the first
2 minutes of todays walk, and exactly two coins during the following 3 minutes of todays walk.
(A) 0.05
(B) 0.06
(C) 0.07
(D) 0.08
(E) 0.09



Use the following information for the next two questions:
Each insured has its accident frequency given by a Poisson Distribution with mean λ.
For a portfolio of insureds, λ is distributed as follows on the interval from a to b:
f(λ) = (d + 1)λ^d / (b^(d+1) - a^(d+1)), 0 ≤ a ≤ λ ≤ b.
17.33 (2 points) If the parameter d = -1/2, and if a = 0.2 and b = 0.6, what is the mean frequency?
A. less than 0.35
B. at least 0.35 but less than 0.36
C. at least 0.36 but less than 0.37
D. at least 0.37 but less than 0.38
E. at least 0.38
17.34 (2 points) If the parameter d = -1/2, and if a = 0.2 and b = 0.6, what is the variance of the
frequency?
A. less than 0.39
B. at least 0.39 but less than 0.40
C. at least 0.40 but less than 0.41
D. at least 0.41 but less than 0.42
E. at least 0.42
17.35 (3 points) Let X be a 50%-50% weighting of two Binomial Distributions.
The first Binomial has parameters m = 6 and q = 0.8.
The second Binomial has parameters m = 6 and q unknown.
For what value of q, does the mean of X equal the variance of X?
A. 0.3
B. 0.4
C. 0.5
D. 0.6
E. 0.7
Use the following information for the next 2 questions:
(i) Claim counts for individual insureds follow a Poisson distribution.
(ii) Half of the insureds have expected annual claim frequency of 4%.
(iii) The other half of the insureds have expected annual claim frequency of 10%.
17.36 (1 point) An insured is picked at random.
What is the probability that this insured has more than 1 claim next year?
(A) 0.21% (B) 0.23% (C) 0.25% (D) 0.27% (E) 0.29%
17.37 (1 point) A large group of such insured is observed for one year.
What is the variance of the distribution of the number of claims observed for individuals?
(A) 0.070
(B) 0.071
(C) 0.072
(D) 0.073
(E) 0.074



Use the following information for the next three questions:
An insurance company sells two types of policies with the following characteristics:
Type of Policy    Proportion of Total Policies    Annual Claim Frequency
I                 25%                             Poisson with λ = 0.25
II                75%                             Poisson with λ = 0.50

17.38 (1 point) What is the probability that an insured picked at random will have no claims next
year?
A. 50%
B. 55%
C. 60%
D. 65%
E. 70%
17.39 (1 point) What is the probability that an insured picked at random will have one claim next
year?
A. less than 30%
B. at least 30% but less than 35%
C. at least 35% but less than 40%
D. at least 40% but less than 45%
E. at least 45%
17.40 (1 point) What is the probability that an insured picked at random will have two claims next
year?
A. 4%
B. 6%
C. 8%
D. 10%
E. 12%

17.41 (3 points) The Spiders sports team will play a best of 3 games playoff series.
They have an 80% chance to win each home game and only a 40% chance to win each road game.
The results of each game are independent of the results of any other game.
It has yet to be determined whether one or two of the three games will be home games for the
Spiders, but you assume these two possibilities are equally likely.
What is the chance that the Spiders win their playoff series?
A. 63%
B. 64%
C. 65%
D. 66%
E. 67%
17.42 (4 points) The number of claims is modeled as a two point mixture of Poisson Distributions,
with weight p to a Poisson with mean λ1 and weight (1-p) to a Poisson with mean λ2.
(a) For the mixture, determine the ratio of the variance to the mean as a function of λ1, λ2, and p.
(b) With the aid of a computer, for λ1 = 10% and λ2 = 20%,
graph this ratio as a function of p for 0 ≤ p ≤ 1.



Use the following information for the next five questions:
For a given value of q, the number of claims is Binomial distributed with parameters m = 4 and q.
In turn q is distributed from 0 to 0.6 via: π(q) = (2500/99) q²(1-q).

17.43 (3 points) What is the chance that zero claims are observed?
A. 12%
B. 14%
C. 16%
D. 18%
E. 20%
17.44 (3 points) What is the chance that one claim is observed?
A. 26%
B. 26%
C. 28%
D. 30%
E. 32%
17.45 (3 points) What is the chance that two claims are observed?
A. 26%
B. 26%
C. 28%
D. 30%
E. 32%
17.46 (2 points) What is the chance that three claims are observed?
A. 19%
B. 21%
C. 23%
D. 25%
E. 27%
17.47 (2 points) What is the chance that four claims are observed?
A. 3%
B. 4%
C. 5%
D. 6%
E. 7%
17.48 (3 points) Use the following information:
• There are two types of insurance policies.
• Three quarters are low risk policies, while the remaining one quarter are high risk policies.
• The annual claims from each type of policy are Poisson.
• The mean number of claims from a high risk policy is 0.4.
• The variance of the mixed distribution of the number of claims is 0.2575.
Determine the mean annual claims from a low risk policy.
A. 12%
B. 14%
C. 16%
D. 18%
E. 20%
17.49 (4, 11/82, Q.48) (3 points)
Let f(x|θ) = frequency distribution for a particular risk having parameter θ.
f(x|θ) = θ(1-θ)^x, where θ is in the interval [p, 1], p is a fixed value such that 0 < p < 1,
and x is a non-negative integer.
g(θ) = distribution of θ within a given class of risks.
g(θ) = -1 / {θ ln(p)}, for p ≤ θ ≤ 1.
Find the frequency distribution for the class of risks.
A. -(x+1)(1-p)^x / {p^2 ln(p)}
B. -p^(x+1) / {(x+1) ln(p)}
C. -(1-p)^(x+1) / {(x+1) ln(p)}
D. -(x+1) p^x / ln(p)
E. None of A, B, C, or D.
17.50 (2, 5/88, Q.33) (1.5 points) Let X have a binomial distribution with parameters m and q, and
let the conditional distribution of Y given X = x be Poisson with mean x.
What is the variance of Y?
A. x   B. mq   C. mq(1 - q)   D. mq^2   E. mq(2 - q)
17.51 (4, 5/88, Q.32) (2 points) Let N be the random variable which represents the number of
claims observed in a one year period. N is Poisson distributed with a probability density function
with parameter λ: P[N = n | λ] = e^-λ λ^n / n!, n = 0, 1, 2, ...
The probability of observing no claims in a year is less than 0.450.
Which of the following describe possible probability distributions for λ?
1. λ is uniformly distributed on (0, 2).
2. The probability density function of λ is f(λ) = e^-λ for λ > 0.
3. P[λ = 1] = 1 and P[λ ≠ 1] = 0.
A. 1   B. 2   C. 3   D. 1, 2   E. 1, 3

17.52 (3, 11/00, Q.13 & 2009 Sample Q.114) (2.5 points)
A claim count distribution can be expressed as a mixed Poisson distribution.
The mean of the Poisson distribution is uniformly distributed over the interval [0, 5].
Calculate the probability that there are 2 or more claims.
(A) 0.61
(B) 0.66
(C) 0.71
(D) 0.76
(E) 0.81
17.53 (SOA3, 11/04, Q.32 & 2009 Sample Q.130) (2.5 points)
Bob is a carnival operator of a game in which a player receives a prize worth W = 2^N if the player
has N successes, N = 0, 1, 2, 3, ...
Bob models the probability of success for a player as follows:
(i) N has a Poisson distribution with mean λ.
(ii) λ has a uniform distribution on the interval (0, 4).
Calculate E[W].
(A) 5   (B) 7   (C) 9   (D) 11   (E) 13
17.54 (CAS3, 11/06, Q.19) (2.5 points)
In 2006, annual claim frequency follows a negative binomial distribution with parameters β and r.
β follows a uniform distribution on the interval (0, 2), and r = 4.
Calculate the probability that there is at least 1 claim in 2006.
A. Less than 0.85
B. At least 0.85, but less than 0.88
C. At least 0.88, but less than 0.91
D. At least 0.91, but less than 0.94
E. At least 0.94
17.55 (SOA M, 11/06, Q.39 & 2009 Sample Q.288) (2.5 points)
The random variable N has a mixed distribution:
(i) With probability p, N has a binomial distribution with q = 0.5 and m = 2.
(ii) With probability 1 - p, N has a binomial distribution with q = 0.5 and m = 4.
Which of the following is a correct expression for Prob(N = 2)?
(A) 0.125p2
(B) 0.375 + 0.125p
(C) 0.375 + 0.125p2
(D) 0.375 - 0.125p2
(E) 0.375 - 0.125p
Solutions to Problems:
17.1. D. The chance of observing 4 accidents is λ^4 e^-λ / 24. Weight the chances of observing 4
accidents by the a priori probability of each λ:
Type   A Priori Probability   Poisson Parameter   Chance of 4 Claims
A      0.6                    1                   0.0153
B      0.3                    2                   0.0902
C      0.1                    3                   0.1680
Weighted average: (0.6)(0.0153) + (0.3)(0.0902) + (0.1)(0.1680) = 0.053.
17.2. E. (60%)(1) + (30%)(2) + (10%)(3) = 1.5.
17.3. A. For a Type A insured, the second moment is: variance + mean2 = 1 + 12 = 2.
For a Type B insured, the second moment is: variance + mean2 = 2 + 22 = 6.
For a Type C insured, the second moment is: variance + mean2 = 3 + 32 = 12.
The second moment of the mixture is: (60%)(2) + (30%)(6) + (10%)(12) = 4.2.
The variance of the mixture is: 4.2 - 1.52 = 1.95.
Alternately, the Expected Value of the Process Variance is:
(60%)(1) + (30%)(2) + (10%)(3) = 1.5.
The Variance of the Hypothetical Means is:
(60%)(1 - 1.5)2 + (30%)(2 - 1.5)2 + (10%)(3 - 1.5)2 = 0.45.
Total Variance = EPV + VHM = 1.5 + 0.45 = 1.95.
Comment: For the mixed distribution, the variance is greater than the mean.
17.4. (a) For the Geometric distribution, f(x) = β^x / (1+β)^(x+1).
For the mixed distribution,
f(x) = ∫ f(x; β) π(β) dβ = ∫0^∞ {β^x / (1+β)^(x+1)} {3 / (1+β)^4} dβ = 3 ∫0^1 u^3 (1-u)^x du,
where u = 1/(1+β), 1 - u = β/(1+β), and du = -dβ/(1+β)^2.
This integral is of the Beta variety; its value of Γ(x+1) Γ(3+1) / Γ(x + 1 + 3 + 1)
follows from the fact that the density of a Beta Distribution integrates to one over its support.
Therefore, f(x) = (3) Γ(x+1) Γ(3+1) / Γ(x+5) = (3)(x!)(3!) / (x+4)! = 18 / {(x+1)(x+2)(x+3)(x+4)}.
(b) The densities from 0 to 20 are:
3/4, 3/20, 1/20, 3/140, 3/280, 1/168, 1/280, 1/440, 1/660, 3/2860, 3/4004, 1/1820, 3/7280,
3/9520, 1/4080, 1/5168, 1/6460, 1/7980, 3/29260, 3/35420, 1/14168.
(c) The mean of this mixed distribution is:
∫ β π(β) dβ = ∫0^∞ β 3/(1+β)^4 dβ = 3 ∫0^1 u (1-u) du = (3)(1/2 - 1/3) = 1/2.
(d) The second moment of a Geometric is: variance + mean^2 = β(1+β) + β^2 = β + 2β^2.
∫ β^2 π(β) dβ = ∫0^∞ β^2 3/(1+β)^4 dβ = 3 ∫0^1 (1-u)^2 du = 3/3 = 1.
Therefore, the second moment of this mixed distribution is: 1/2 + (2)(1) = 2.5.
The variance of this mixed distribution is: 2.5 - 0.5^2 = 2.25.
Comment: This is a Yule Distribution as discussed in Example 6.22 of Loss Models, with a = 3.
17.5. B. The chance of observing 3 claims is 10 q^3 (1-q)^2. Weight the chances of observing 3 claims
by the a priori probability of each q:
Type   A Priori Probability   q Parameter   Chance of 3 Claims
A      0.6                    0.1           0.0081
B      0.3                    0.2           0.0512
C      0.1                    0.3           0.1323
Weighted average: (0.6)(0.0081) + (0.3)(0.0512) + (0.1)(0.1323) = 0.033.

17.6. C. The chance of no claims for a Poisson is: e^-λ.
We average over the possible values of λ:
∫0^4 (1/4) e^-λ dλ = (1/4)(-e^-λ) ] from λ=0 to λ=4 = (1/4)(1 - e^-4) = 0.245.
17.7. B. The chance of one claim for a Poisson is: λ e^-λ.
We average over the possible values of λ:
∫0^4 (1/4) λ e^-λ dλ = (1/4)(-λ e^-λ - e^-λ) ] from λ=0 to λ=4 = (1/4)(1 - 5e^-4) = 0.227.
Comment: The densities of this mixed distribution from 0 to 9 are: 0.245421, 0.227105, 0.190474,
0.141632, 0.0927908, 0.0537174, 0.0276685, 0.0127834, 0.00534086, 0.00203306.
17.8. E[λ] = (0 + 4)/2 = 2.
17.9. The second moment of a Poisson is: variance + mean^2 = λ + λ^2.
E[λ + λ^2] = E[λ] + E[λ^2] = mean of uniform distribution + second moment of uniform distribution
= 2 + {2^2 + (4 - 0)^2 / 12} = 2 + 4 + 1.333 = 7.333.
Variance = second moment - mean^2 = 7.333 - 2^2 = 3.333.
17.10. The p.g.f. of each Binomial is: {1 + q(z-1)}^m.
The p.g.f. of the mixture is the mixture of the p.g.f.s:
P_mixture[z] = Σ f(m){1 + q(z-1)}^m = p.g.f. of f at: 1 + q(z-1).
However, f(m) is Negative Binomial, with p.g.f.: {1 - β(z-1)}^-r.
Therefore, P_mixture[z] = {1 - β(1 + q(z-1) - 1)}^-r = {1 - βq(z-1)}^-r.
However, this is the p.g.f. of a Negative Binomial Distribution with parameters r and βq, which is
therefore the mixed distribution.
Alternately, the mixed distribution at k is:
Σ_{m=k to ∞} Prob[k | m] Prob[m] = Σ_{m=k to ∞} {m! / ((m-k)! k!)} q^k (1-q)^(m-k) {(r+m-1)! / ((r-1)! m!)} β^m / (1+β)^(r+m)
= {(r+k-1)! q^k β^k / ((1+β)^(r+k) (r-1)! k!)} Σ_{n=0 to ∞} {(r+k+n-1)! / (n! (r+k-1)!)} {(1-q)β / (1+β)}^n
= {(r+k-1)! q^k β^k / ((1+β)^(r+k) (r-1)! k!)} {1 / (1 - (1-q)β/(1+β))}^(r+k)
= {(r+k-1)! / ((r-1)! k!)} (qβ)^k / (1 + qβ)^(r+k).
This is a Negative Binomial Distribution with parameters r and qβ.
Comment: The sum was simplified using the fact that the Negative Binomial densities sum to 1:
1 = Σ_{i=0 to ∞} {(s+i-1)! / (i! (s-1)!)} β^i / (1+β)^(s+i). ⇒ Σ_{i=0 to ∞} {(s+i-1)! / (i! (s-1)!)} {β/(1+β)}^i = (1+β)^s.
Equivalently, Σ_{i=0 to ∞} {(s+i-1)! / (i! (s-1)!)} ψ^i = (1-ψ)^-s, where ψ = β/(1+β), β = ψ/(1-ψ), and 1+β = 1/(1-ψ).
17.11. B. The conditional mean given q is: mq. The unconditional mean can be obtained by
integrating the conditional means versus the distribution of q:
E[X] = ∫0^1 E[X | q] g(q) dq = ∫0^1 mq {Γ(a+b) / (Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1) dq
= m {Γ(a+b) / (Γ(a) Γ(b))} ∫0^1 q^a (1-q)^(b-1) dq = m {Γ(a+b) / (Γ(a) Γ(b))} {Γ(a+1) Γ(b) / Γ(a+b+1)}
= m Γ(a+1) Γ(a+b) / {Γ(a) Γ(a+b+1)} = ma / (a+b).
Alternately,
E[X] = ∫0^1 E[X | q] g(q) dq = m ∫0^1 q g(q) dq = m (mean of Beta Distribution) = ma / (a+b).
Comment: The Beta distribution with θ = 1 has density from 0 to 1 of: {Γ(a+b) / (Γ(a) Γ(b))} x^(a-1) (1-x)^(b-1).
Therefore, the integral from zero to one of x^(a-1) (1-x)^(b-1) is: Γ(a) Γ(b) / Γ(a+b).
17.12. D. The conditional variance given q is: mq(1-q) = mq - mq^2. Thus the conditional second
moment given q is: mq - mq^2 + (mq)^2 = mq + (m^2 - m)q^2. The unconditional second moment can
be obtained by integrating the conditional second moments versus the distribution of q:
E[X^2] = ∫0^1 E[X^2 | q] g(q) dq = ∫0^1 {mq + (m^2 - m)q^2} {Γ(a+b) / (Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1) dq
= m {Γ(a+b) / (Γ(a) Γ(b))} {Γ(a+1) Γ(b) / Γ(a+b+1)} + (m^2 - m) {Γ(a+b) / (Γ(a) Γ(b))} {Γ(a+2) Γ(b) / Γ(a+b+2)}
= m {Γ(a+1)/Γ(a)} {Γ(a+b)/Γ(a+b+1)} + (m^2 - m) {Γ(a+2)/Γ(a)} {Γ(a+b)/Γ(a+b+2)}
= ma/(a+b) + (m^2 - m) a(a+1) / {(a+b)(a+b+1)}. Since the mean is ma/(a+b), the variance is:
ma/(a+b) + (m^2 - m) a(a+1) / {(a+b)(a+b+1)} - m^2 a^2 / (a+b)^2
= {ma / ((a+b)^2 (a+b+1))} {(a+b+1)(a+b) + (m-1)(a+1)(a+b) - ma(a+b+1)}
= {ma / ((a+b)^2 (a+b+1))} {ab + b^2 + mb} = m(m+a+b)ab / {(a+b)^2 (a+b+1)}.
Alternately,
E[X^2] = ∫0^1 E[X^2 | q] g(q) dq = m ∫0^1 q g(q) dq + (m^2 - m) ∫0^1 q^2 g(q) dq
= m (mean of Beta Distribution) + (m^2 - m) (second moment of the Beta Distribution)
= ma/(a+b) + (m^2 - m) {ab / ((a+b+1)(a+b)^2) + a^2 / (a+b)^2}
= ma/(a+b) + (m^2 - m) {a / ((a+b+1)(a+b)^2)} (b + a^2 + ab + a)
= ma/(a+b) + (m^2 - m) a(a+1) / {(a+b)(a+b+1)}. Then proceed as before.
Comment: This is an example of the Beta-Binomial Conjugate Prior Process. See Mahler's Guide
to Conjugate Priors. The unconditional distribution is sometimes called a Beta-Binomial
Distribution. See Example 6.21 in Loss Models or Kendall's Advanced Theory of Statistics by
Stuart and Ord.
17.13. E. The probability density of q is a Beta Distribution with parameters a and b:
{(a+b-1)! / ((a-1)! (b-1)!)} q^(a-1) (1-q)^(b-1) = {Γ(a+b) / (Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1).
One can compute the unconditional density at n ≤ 7 via integration:
f(n) = ∫0^1 f(n | q) {Γ(a+b) / (Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1) dq
= ∫0^1 {Γ(a+b) / (Γ(a) Γ(b))} {m! / (n! (m-n)!)} q^n (1-q)^(m-n) q^(a-1) (1-q)^(b-1) dq
= {Γ(a+b) / (Γ(a) Γ(b))} {Γ(m+1) / (Γ(n+1) Γ(m+1-n))} ∫0^1 q^(a+n-1) (1-q)^(b+m-n-1) dq
= Γ(a+b) Γ(m+1) Γ(a+n) Γ(b+m-n) / {Γ(a) Γ(b) Γ(n+1) Γ(m+1-n) Γ(a+b+m)}.
For n = 5, a = 2, b = 4, and m = 7:
f(5) = Γ(6) Γ(8) Γ(7) Γ(6) / {Γ(2) Γ(4) Γ(6) Γ(3) Γ(13)} = 5! 7! 6! 5! / {1! 3! 5! 2! 12!} = 0.07576.
Comment: Beyond what you are likely to be asked on your exam.
The probability of observing other numbers of claims in 7 trials is as follows:
n    f(n)      F(n)
0    0.15152   0.15152
1    0.21212   0.36364
2    0.21212   0.57576
3    0.17677   0.75253
4    0.12626   0.87879
5    0.07576   0.95455
6    0.03535   0.98990
7    0.01010   1.00000
This is an example of the Binomial-Beta distribution with: a = 2, b = 4, and m = 7.
17.14. B. For a Negative Binomial Distribution, f(1) = rβ / (1+β)^(r+1).
For Type A: f(1) = (0.8)(0.2) / (1.2^1.8) = 11.52%.
For Type B: f(1) = (0.8)(0.5) / (1.5^1.8) = 19.28%.
(70%)(11.52%) + (30%)(19.28%) = 13.85%.

17.15. The p.g.f. of each Binomial is: {1 + q(z-1)}^m.
The p.g.f. of the mixture is the mixture of the p.g.f.s:
P_mixture[z] = Σ f(m){1 + q(z-1)}^m = p.g.f. of f at: 1 + q(z-1).
However, f(m) is Poisson, with p.g.f.: exp[λ(z-1)].
Therefore, P_mixture[z] = exp[λ{1 + q(z-1) - 1}] = exp[λq(z-1)].
However, this is the p.g.f. of a Poisson Distribution with mean λq, which is therefore the mixed
distribution.
Alternately, the mixed distribution at k is:
Σ_{m=k to ∞} Prob[k | m] Prob[m] = Σ_{m=k to ∞} {m!/((m-k)!k!)} q^k (1-q)^(m-k) e^-λ λ^m / m!
= (q^k e^-λ λ^k / k!) Σ_{n=0 to ∞} {(1-q)λ}^n / n! = (q^k e^-λ λ^k / k!) exp[(1-q)λ] = (qλ)^k e^-qλ / k!.
This is a Poisson Distribution with mean λq.


17.16. C. This is a Geometric Distribution (a Negative Binomial with r = 1), parameterized
somewhat differently than in Loss Models, with p = 1/(1 + β). Therefore for a given value of p the
mean is: μ(p) = β = (1-p)/p. In order to get the average mean over the whole portfolio we need to
take the integral of μ(p) g(p) dp:
∫0^1 μ(p) g(p) dp = ∫0^1 {(1-p)/p} 280 p^3 (1-p)^4 dp = 280 ∫0^1 p^2 (1-p)^5 dp = 280 Γ(3) Γ(6) / Γ(3+6)
= 280 (2!)(5!) / 8! = 5/3.
Comment: Difficult! Special case of mixing a Negative Binomial (for r fixed) via a Beta Distribution.
See Example 6.22 in Loss Models, where the mixed distribution is called the Generalized Waring.
For the Generalized Waring in general, the a priori mean turns out to be rb/(a-1). For r = 1, b = 5 and
a = 4, the a priori mean is (1)(5)/3 = 5/3.
17.17. D. The probability density of p is a Beta Distribution with parameters a and b:
{Γ(a+b) / (Γ(a) Γ(b))} p^(a-1) (1-p)^(b-1).
One can compute the unconditional density at n via integration:
f(n) = ∫0^1 f(n | p) {Γ(a+b) / (Γ(a) Γ(b))} p^(a-1) (1-p)^(b-1) dp
= ∫0^1 {Γ(a+b) / (Γ(a) Γ(b))} p(1-p)^n p^(a-1) (1-p)^(b-1) dp = {Γ(a+b) / (Γ(a) Γ(b))} ∫0^1 p^a (1-p)^(b+n-1) dp
= {Γ(a+b) / (Γ(a) Γ(b))} {Γ(a+1) Γ(b+n) / Γ(a+b+n+1)} = a Γ(a+b) Γ(b+n) / {Γ(b) Γ(a+b+n+1)}.
For a = 4, b = 5: f(n) = 4 Γ(9) Γ(5+n) / {Γ(5) Γ(10+n)} = 4 8! (n+4)! / {4! (n+9)!} = 6720 (n+4)! / (n+9)!.
f(2) = 6720 6! / 11! = 12.1%.
Comment: The Beta distribution with θ = 1 has density from 0 to 1 of:
{Γ(a+b) / (Γ(a) Γ(b))} x^(a-1) (1-x)^(b-1).
Therefore, the integral from zero to one of x^(a-1) (1-x)^(b-1) is: Γ(a) Γ(b) / Γ(a+b).
This is an example of a Generalized Waring Distribution, with r = 1, a = 4 and b = 5.
See Example 6.22 in Loss Models. The probabilities of observing 0 to 20 claims are as follows:
0.444444, 0.222222, 0.121212, 0.0707071, 0.043512, 0.027972, 0.018648, 0.0128205,
0.00904977, 0.00653595, 0.00481596, 0.00361197, 0.00275198, 0.00212653, 0.00166424,
0.00131752, 0.00105402, 0.000851323, 0.00069367, 0.000569801, 0.000471559.
Since the densities must add to unity:
1 = Σ_{n=0 to ∞} a Γ(a+b) Γ(b+n) / {Γ(b) Γ(a+b+n+1)}. ⇒ Σ_{n=0 to ∞} Γ(b+n) / Γ(a+b+n+1) = Γ(b) / {a Γ(a+b)}.

17.18. E. E[] = the mean of the prior mixed exponential = weighted average of the means of the
two exponential distributions = (.8)(1/40) + (.2)(1/10) = 4.0%.

17.19. C. Given λ, f(0) = e^-λ.
∫0^∞ f(0; λ) π(λ) dλ = ∫0^∞ 32 e^-41λ dλ + ∫0^∞ 2 e^-11λ dλ = (32/41) + (2/11) = 0.9623.
Prob[at least one claim] = 1 - 0.9623 = 3.77%.


17.20. The p.g.f. of each Binomial is: {1 + q(z-1)}^m.
The p.g.f. of the mixture is the mixture of the p.g.f.s:
P_mixture[z] = Σ f(m){1 + q(z-1)}^m = p.g.f. of f at: 1 + q(z-1).
However, f(m) is Binomial with parameters 5 and 0.1, with p.g.f.: {1 + 0.1(z-1)}^5.
Therefore, P_mixture[z] = {1 + 0.1(1 + q(z-1) - 1)}^5 = {1 + 0.1q(z-1)}^5.
However, this is the p.g.f. of a Binomial Distribution with parameters 5 and 0.1q, which is therefore the
mixed distribution.
Alternately, the mixed distribution at k ≤ 5 is:
Σ_{m=k to 5} Prob[k | m] Prob[m] = Σ_{m=k to 5} {m!/((m-k)!k!)} q^k (1-q)^(m-k) {5!/((5-m)!m!)} 0.1^m 0.9^(5-m)
= q^k 0.1^k {5!/((5-k)!k!)} Σ_{n=0 to 5-k} {(5-k)!/(n!(5-k-n)!)} (1-q)^n 0.1^n 0.9^(5-k-n)
= q^k 0.1^k {5!/((5-k)!k!)} {(1-q)(0.1) + 0.9}^(5-k) = {5!/((5-k)!k!)} (0.1q)^k (1 - 0.1q)^(5-k).
This is a Binomial Distribution with parameters 5 and 0.1q.
Comment: The sum was simplified using the Binomial expansion:
(x + y)^m = Σ_{i=0 to m} x^i y^(m-i) m!/{i!(m-i)!}.
17.21. D. Given q, we have a Binomial with parameters m = 3 and q. The chance that we observe
zero claims is: (1-q)^3. The distribution of q is uniform: π(q) = 2.5 for 0 ≤ q ≤ 0.4.
f(0) = ∫0^0.4 f(0 | q) π(q) dq = ∫0^0.4 (1-q)^3 (2.5) dq = (-2.5/4)(1-q)^4 ] from q=0 to q=0.4
= (-0.625)(0.6^4 - 1^4) = 0.544.
17.22. B. Given q, we have a Binomial with parameters m = 3 and q.
The chance that we observe one claim is: 3q(1-q)^2 = 3q - 6q^2 + 3q^3.
P(c=1) = ∫0^0.4 P(c=1 | q) f(q) dq = ∫0^0.4 (3q - 6q^2 + 3q^3)(2.5) dq = (2.5)(1.5q^2 - 2q^3 + 0.75q^4) ] from q=0 to q=0.4
= (2.5)(0.24 - 0.128 + 0.0192) = 0.328.


17.23. A. Given q, we have a Binomial with parameters m = 3 and q.
The chance that we observe two claims is: 3q^2(1-q) = 3q^2 - 3q^3.
P(c=2) = ∫0^0.4 P(c=2 | q) f(q) dq = ∫0^0.4 (3q^2 - 3q^3)(2.5) dq = (2.5)(q^3 - 0.75q^4) ] from q=0 to q=0.4
= (2.5)(0.064 - 0.0192) = 0.112.


17.24. B. Given q, we have a Binomial with parameters m = 3 and q. The chance that we observe
three claims is: q^3.
P(c=3) = ∫0^0.4 P(c=3 | q) f(q) dq = ∫0^0.4 q^3 (2.5) dq = (2.5)(q^4/4) ] from q=0 to q=0.4 = (2.5)(0.0064) = 0.016.
Comment: Since we have a Binomial with m = 3, the only possibilities are 0, 1, 2 or 3 claims.
Therefore, the probabilities for 0, 1, 2 and 3 claims (calculated in this and the prior three questions)
add to one: 0.544 + 0.328 + 0.112 + 0.016 = 1.
17.25. D. This is a 40%-60% mixture of zero and a Poisson with = .07.
The second moment of the Poisson is: variance + mean2 = .07 + .072 = .0749.
The mean of the mixture is: (40%)(0) + (60%)(.07) = .042.
The second moment of the mixture is: (40%)(0) + (60%)(.0749) = .04494.
The variance of the mixture is: .04494 - .0422 = .0432, per student.
For a group of 100 students the variance is: (100)(.0432) = 4.32.
17.26. C. For = 0.2, f(1) = .2e-.2 = 0.1638. For = 0.6, f(1) = .6e-.6 = 0.3293.
Prob[1 coin] = (.5)(.1638) + (.5)(.3293) = 24.65%.
17.27. A. Over two minutes, the mean is either 0.4: f(1) = .4e-.4 = 0.2681,
or the mean is 1.2: f(1) = 1.2e-1.2 = 0.3614.
Prob[1 coin] = (.5)(.2681) + (.5)(.3614) = 31.48%.
17.28. C. Prob[0 coins during a minute] = (.5)e-.2 + (.5)e-.6 = 0.6838.
Prob[1 coin during a minute] = (.5).2e-.2 + (.5).6e-.6 = 0.2465.
Prob[A + B = 1] = Prob[A = 0]Prob[B = 1] + Prob[A = 1]Prob[B = 0] = (2)(.6838)(.2465) = 33.71%.
Comment: Since the minutes are on different days, their lambdas are picked independently.
17.29. C.
Prob[1 coin during third minute and 1 coin during fifth minute | = 0.2] = (.2e-.2)(.2e-.2) = 0.0268.
Prob[1 coin during third minute and 1 coin during fifth minute | = 0.6] = (.6e-.6)(.6e-.6) = 0.1084.
(.5)(.0268) + (.5)(.1084) = 6.76%.
Comment: Since the minutes are on the same day, they have the same , whichever it is.
17.30. B. Prob[1 coin during a minute] = (.5).2e-.2 + (.5).6e-.6 = 0.2465.
Since the minutes are on different days, their lambdas are picked independently.
Prob[1 coin during 1 minute today and 1 coin during 1 minute tomorrow] =
Prob[1 coin during a minute] Prob[1 coin during a minute] = 0.24652 = 6.08%.
17.31. A. Prob[1 coin during 4 minutes] = (.5).8e-.8 + (.5)2.4e-2.4 = 0.2866.
Since the time intervals are on different days, their lambdas are picked independently.
Prob[1 coin during 4 minutes today and 1 coin during 4 minutes tomorrow] =
Prob[1 coin during 4 minutes] Prob[1 coin during 4 minutes] = 0.28662 = 8.33%.
17.32. B. Prob[1 coin during two minutes and 2 coins during following 3 minutes | λ = 0.2] =
(0.4e^-0.4)(0.6^2 e^-0.6/2) = 0.0265.
Prob[1 coin during two minutes and 2 coins during following 3 minutes | λ = 0.6] =
(1.2e^-1.2)(1.8^2 e^-1.8/2) = 0.0968.
(.5)(.0265) + (.5)(.0968) = 6.17%.
17.33. E. E[λ] = ∫a^b λ f(λ) dλ = ∫a^b λ (d+1) λ^d / {b^(d+1) - a^(d+1)} dλ = {(d+1)/(b^(d+1) - a^(d+1))} λ^(d+2)/(d+2) ] from λ=a to λ=b
= {(d+1)/(d+2)} {b^(d+2) - a^(d+2)} / {b^(d+1) - a^(d+1)} = (0.5/1.5)(0.6^1.5 - 0.2^1.5) / (0.6^0.5 - 0.2^0.5) = 0.3821.
17.34. B. E[λ^2] = ∫a^b λ^2 f(λ) dλ = {(d+1)/(b^(d+1) - a^(d+1))} λ^(d+3)/(d+3) ] from λ=a to λ=b
= {(d+1)/(d+3)} {b^(d+3) - a^(d+3)} / {b^(d+1) - a^(d+1)} = (0.5/2.5)(0.6^2.5 - 0.2^2.5) / (0.6^0.5 - 0.2^0.5) = 0.15943.
For fixed λ, the second moment of a Poisson is: λ + λ^2.
Therefore, the second moment of the mixture is: E[λ] + E[λ^2] = 0.3821 + 0.15943 = 0.5415.
Therefore, the variance of the mixture is: 0.5415 - 0.3821^2 = 0.3955.
Alternately, Variance[λ] = Second Moment[λ] - Mean[λ]^2 = 0.15943 - 0.3821^2 = 0.0134.
The variance of frequency for a mixture of Poissons is: E[λ] + Var[λ] = 0.3821 + 0.0134 = 0.3955.
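The moments in 17.33-17.34 can be double-checked by numerically integrating against f(λ); this is an illustrative sketch (not part of the original solution), using only the standard library.
a, b, d = 0.2, 0.6, -0.5
norm = (d + 1) / (b**(d + 1) - a**(d + 1))   # normalizing constant of f(lambda)
def moment(k, steps=100000):
    # midpoint-rule integral of lambda^k f(lambda) over [a, b]
    h = (b - a) / steps
    return sum(norm * (a + (i + 0.5) * h)**(d + k) * h for i in range(steps))
e1, e2 = moment(1), moment(2)
print(e1, e1 + e2 - e1**2)   # about 0.3821 and 0.3955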
17.35. A. E[X] = (0.5)(6)(0.8) + (0.5)(6)q = 2.4 + 3q.
The second moment of a Binomial is: mq(1 - q) + (mq)^2 = mq - mq^2 + m^2 q^2.
E[X^2] = (0.5){(6)(0.8) - (6)(0.8^2) + (6^2)(0.8^2)} + (0.5){6q - 6q^2 + 36q^2} = 12 + 3q + 15q^2.
Var[X] = 12 + 3q + 15q^2 - (2.4 + 3q)^2 = 6.24 - 11.4q + 6q^2.
E[X] = Var[X]. ⇒ 2.4 + 3q = 6.24 - 11.4q + 6q^2. ⇒ 6q^2 - 14.4q + 3.84 = 0.
q = {14.4 ± √(14.4^2 - (4)(6)(3.84))} / 12 = {14.4 ± 10.7331} / 12 = 2.094 or 0.3056.
Comment: 0 ≤ q ≤ 1. When one mixes distributions, the variance increases. As discussed in
Mahler's Guide to Buhlmann Credibility, Var[X] = E[Var[X | q]] + Var[E[X | q]] ≥ E[Var[X | q]].
Since for a Binomial Distribution, the variance is less than the mean, for a mixture of Binomial
Distributions, the variance can be either less than, greater than, or equal to the mean.
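The roots of the quadratic in 17.35 can be recomputed directly; a short sketch, not part of the original solution:
from math import sqrt
# 6q^2 - 14.4q + 3.84 = 0, from setting mean = variance of the 50%-50% Binomial mixture
a, b, c = 6.0, -14.4, 3.84
disc = sqrt(b * b - 4 * a * c)
print((-b - disc) / (2 * a), (-b + disc) / (2 * a))   # about 0.3056 and 2.094; only 0.3056 lies in [0, 1]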
17.36. D. For = .04, Prob[more than 1 claim] = 1 - e-.04 - .04 e-.04 = .00077898.
For = .10, Prob[more than 1 claim] = 1 - e-.10 - .10 e-.10 = .00467884.
Prob[more than 1 claim] = (.5)(.00077898) + (.5)(.00467884) = 0.273%.
17.37. B. For = .04, the mean is .04 and the second moment is: + 2 = .04 + .042 = .0416.
For = .10, the mean is .10 and the second moment is: + 2 = .10 + .102 = .11.
Therefore, the mean of the mixture is: (.5)(.04) + (.5)(.10) = 0.07, and the second moment of the
mixture is: (.5)(.0416) + (.5)(.11) = 0.0758.
The variance of the mixed distribution is: 0.0758 - 0.072 = 0.0709.
Alternately, Variance[] = Second Moment[] - Mean[]2 = (.5)(.042 ) + (.5)(.12 ) - .072 = .0009.
The variance of frequency for a mixture of Poissons is:
Expected Value of the Process Variance + Variance of the Hypothetical Means =
E[] + Var[] = .07 + .0009 = 0.0709.
17.38. D. (25%)(e-.25) + (75%)(e-.5) = 65.0%.
17.39. A. (25%)(.25 e-.25) + (75%)(.5 e-.5) = 27.6%.
17.40. B. (25%)(.252 e-.25/2) + (75%)(.52 e-.5/2) = 6.3%.
17.41. D. If there is one home game and two road games, then the distributions of road wins is:
2 @ 16%, 1 @ 48%, 0 @ 36%.
Thus the chance of winning at least 2 games is:
Prob[win 2 road] + Prob[win 1 road] Prob[win one home] = 16% + (48%)(80%) = 0.544.
If instead there is one road game and two home games, then the distributions of home wins is:
2 @ 64%, 1 @ 32%, 0 @ 4%.
Thus the chance of winning at least 2 games is:
Prob[win 2 home] + Prob[win one home] Prob[win 1 road] = 64% + (32%)(40%) = 0.768.
Thus the chance the Spiders win the series is:
(50%)(0.544) + (50%)(0.768) = 65.6%.
Comment: This is a 50%-50% mixture of two situations.
(Each situation has its own distribution of games won.)
While in professional sports there is a home field advantage, it is not usually this big.
Note that for m = 3 and q = (0.8 + 0.4)/2 = 0.6, the probability of at least two wins is:
0.6^3 + (3)(0.6^2)(0.4) = 0.648 ≠ 0.656.
17.42. The mean of the mixture is: p λ1 + (1 - p) λ2.
The second moment of a Poisson is: variance + mean^2 = λ + λ^2.
Therefore, the second moment of the mixture is: p(λ1 + λ1^2) + (1 - p)(λ2 + λ2^2).
Variance of the mixture is: p(λ1 + λ1^2) + (1 - p)(λ2 + λ2^2) - {p λ1 + (1 - p) λ2}^2.
For the mixture, the ratio of the variance to the mean is:
{p(λ1 + λ1^2) + (1 - p)(λ2 + λ2^2)} / {p λ1 + (1 - p) λ2} - {p λ1 + (1 - p) λ2}.
For λ1 = 10% and λ2 = 20%, the ratio of the variance to the mean is:
{0.11p + 0.24(1 - p)} / {0.1p + 0.2(1 - p)} - {0.1p + 0.2(1 - p)}
= (0.24 - 0.13p) / (0.2 - 0.1p) - (0.2 - 0.1p).
[Graph omitted: the ratio of the variance to the mean as a function of p, rising from 1 at p = 0 to a maximum of about 1.017 near p = 0.6, and returning to 1 at p = 1.]
Comment: For either p = 0 or p = 1, this ratio is 1.
For either p = 0 or p = 1, we have a single Poisson and the mean is equal to the variance.
For 0 < p < 1, mixing increases the variance, and the variance of the mixture is greater than its mean.
For example, for p = 80%, (0.24 - 0.13p) / (0.2 - 0.1p) - (0.2 - 0.1p) = 1.013.
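Part (b) of 17.42 asks for a computer-drawn graph; a minimal sketch of the ratio function (the function name is my own, not from the text):
def var_to_mean_ratio(p, lam1=0.10, lam2=0.20):
    # mean and second moment of the two-point mixture of Poissons
    mean = p * lam1 + (1 - p) * lam2
    second = p * (lam1 + lam1**2) + (1 - p) * (lam2 + lam2**2)
    return (second - mean**2) / mean
for p in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(p, round(var_to_mean_ratio(p), 4))
# the printed ratios start at 1, rise to about 1.017 near p = 0.6, and return to 1 at p = 1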
17.43. B. Given q, we have a Binomial with parameters m = 4 and q. The chance that we observe
zero claims is: (1-q)^4. The distribution of q is: π(q) = (2500/99) q^2 (1-q).
f(0) = ∫0^0.6 f(0 | q) π(q) dq = (2500/99) ∫0^0.6 q^2 (1-q)^5 dq = (2500/99) ∫0^0.6 q^2 - 5q^3 + 10q^4 - 10q^5 + 5q^6 - q^7 dq
= (2500/99) {0.6^3/3 - (5)(0.6^4)/4 + (10)(0.6^5)/5 - (10)(0.6^6)/6 + (5)(0.6^7)/7 - 0.6^8/8} = 14.28%.
17.44. D. Given q, we have a Binomial with parameters m = 4 and q. The chance that we observe
one claim is: 4q(1-q)^3. The distribution of q is: π(q) = (2500/99) q^2 (1-q).
f(1) = ∫0^0.6 f(1 | q) π(q) dq = (4)(2500/99) ∫0^0.6 q^3 (1-q)^4 dq = (10,000/99) ∫0^0.6 q^3 - 4q^4 + 6q^5 - 4q^6 + q^7 dq
= (10,000/99) {0.6^4/4 - (4)(0.6^5)/5 + (6)(0.6^6)/6 - (4)(0.6^7)/7 + 0.6^8/8} = 29.81%.
17.45. E. Given q, we have a Binomial with parameters m = 4 and q. The chance that we observe
two claims is: 6q^2(1-q)^2. The distribution of q is: π(q) = (2500/99) q^2 (1-q).
f(2) = ∫0^0.6 f(2 | q) π(q) dq = (6)(2500/99) ∫0^0.6 q^4 (1-q)^3 dq = (5000/33) ∫0^0.6 q^4 - 3q^5 + 3q^6 - q^7 dq
= (5000/33) {0.6^5/5 - (3)(0.6^6)/6 + (3)(0.6^7)/7 - 0.6^8/8} = 32.15%.
17.46. A. Given q, we have a Binomial with parameters m = 4 and q. The chance that we observe
three claims is: 4q^3(1-q). The distribution of q is: π(q) = (2500/99) q^2 (1-q).
f(3) = ∫0^0.6 f(3 | q) π(q) dq = (4)(2500/99) ∫0^0.6 q^5 (1-q)^2 dq = (10,000/99) ∫0^0.6 q^5 - 2q^6 + q^7 dq
= (10,000/99) {0.6^6/6 - (2)(0.6^7)/7 + 0.6^8/8} = 18.96%.
17.47. C. Given q, we have a Binomial with parameters m = 4 and q. The chance that we observe
four claims is: q^4. The distribution of q is: π(q) = (2500/99) q^2 (1-q).
f(4) = ∫0^0.6 f(4 | q) π(q) dq = (2500/99) ∫0^0.6 q^6 (1-q) dq = (2500/99) ∫0^0.6 q^6 - q^7 dq
= (2500/99) {0.6^7/7 - 0.6^8/8} = 4.80%.
Comment: Since we have a Binomial with m = 4, the only possibilities are 0, 1, 2, 3 or 4 claims.
Therefore, the probabilities for 0, 1, 2, 3, and 4 claims must add to one:
14.28% + 29.81% + 32.15% + 18.96% + 4.80% = 1.
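The five probabilities in 17.43-17.47 can be reproduced by numerically mixing the Binomial over π(q); an illustrative sketch, not from the original text:
from math import comb
def mixed_binomial_pmf(k, m=4, steps=50000):
    # integrate C(m,k) q^k (1-q)^(m-k) * (2500/99) q^2 (1-q) over q in [0, 0.6], midpoint rule
    total, h = 0.0, 0.6 / steps
    for i in range(steps):
        q = (i + 0.5) * h
        total += comb(m, k) * q**k * (1 - q)**(m - k) * (2500 / 99) * q**2 * (1 - q) * h
    return total
print([round(mixed_binomial_pmf(k), 4) for k in range(5)])
# approximately [0.1428, 0.2981, 0.3215, 0.1896, 0.0480], which sum to 1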
17.48. E. Let x be the mean for the low risk policies.
The mean of the mixture is: (3/4)x + 0.4/4 = 0.75x + 0.1.
The second moment of the mixture is the mixture of the second moments:
(3/4)(x + x^2) + (0.4 + 0.4^2)/4 = 0.75x^2 + 0.75x + 0.14.
Thus the variance of the mixture is:
0.75x^2 + 0.75x + 0.14 - (0.75x + 0.1)^2 = 0.1875x^2 + 0.6x + 0.13.
Thus, 0.2575 = 0.1875x^2 + 0.6x + 0.13. ⇒ 0.1875x^2 + 0.6x - 0.1275 = 0.
x = {-0.6 + √(0.6^2 - (4)(0.1875)(-0.1275))} / {(2)(0.1875)} = 0.20, taking the positive root.
Comment: You can try the choices and see which one works.
17.49. C. The frequency distribution for the class is:
∫p^1 f(x | θ) g(θ) dθ = ∫p^1 θ(1-θ)^x {-1/(θ ln(p))} dθ = {-1/ln(p)} ∫p^1 (1-θ)^x dθ = {(1-θ)^(x+1) / ((x+1) ln(p))} ] from θ=p to θ=1
= -(1-p)^(x+1) / {(x+1) ln(p)}.
Comment: 4, 11/82, Q.48, rewritten. Note that f(x|θ) is a geometric distribution.
The mixed frequency distribution for the class is a logarithmic distribution, with
β = 1/p - 1 and x+1 running from 1 to infinity (so that f(0) is the logarithmic distribution at 1, f(1) is the
logarithmic at 2, etc. The support of the logarithmic is 1, 2, 3, ...).
17.50. E. Var[Y] = E_X[Var[Y | X]] + Var_X[E[Y | X]] = E_X[X] + Var_X[X] = mq + mq(1 - q) = mq(2 - q).
Comment: Total Variance = Expected Value of the Process Variance + Variance of the Hypothetical
Means. See Mahler's Guide to Buhlmann Credibility.
17.51. E. P(Y=0) = ∫ P(Y = 0 | λ) f(λ) dλ = ∫ e^-λ f(λ) dλ.
For the first case, f(λ) = 1/2 for 0 ≤ λ ≤ 2, and
P(Y=0) = ∫0^2 e^-λ / 2 dλ = (1 - e^-2)/2 = 0.432.
For the second case, f(λ) = e^-λ for λ > 0, and
P(Y=0) = ∫0^∞ e^-2λ dλ = 1/2.
For the third case, P(Y=0) = e^-1 = 0.368. In the first and third cases P(Y=0) < 0.45.
Comment: Three separate problems in which you need to calculate P(Y=0) given
three different distributions of λ.
17.52. A. The chance of zero or one claim for a Poisson distribution is: e^-λ + λe^-λ.
We average over the possible values of λ:
Prob(0 or 1 claim) = (1/5) ∫0^5 e^-λ + λe^-λ dλ = (1/5)(-2e^-λ - λe^-λ) ] from λ=0 to λ=5 = (1/5)(2 - 7e^-5) = 0.391.
Probability that there are 2 or more claims = 1 - Prob(0 or 1 claim) = 1 - 0.391 = 0.609.
17.53. E. P(z) ≡ E[z^N]. The p.g.f. of the Poisson Distribution is: P(z) = e^(λ(z-1)).
Therefore, for the Poisson, E[z^N] = e^(λ(z-1)). E[2^N | λ] = P(2) = e^(λ(2-1)) = e^λ.
E[W] = ∫0^4 E[2^N | λ] (1/4) dλ = (1/4) ∫0^4 e^λ dλ = (1/4)(e^4 - 1) = 13.4.
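The p.g.f. shortcut in 17.53 can be confirmed by brute force, summing 2^n times the mixed Poisson density; a sketch under the problem's assumptions (truncation point nmax is my own choice):
from math import exp, factorial
def expected_prize(steps=4000, nmax=60):
    # E[W] = integral over lambda in (0,4) of (1/4) * sum_n 2^n e^-lambda lambda^n / n!
    total, h = 0.0, 4.0 / steps
    for i in range(steps):
        lam = (i + 0.5) * h
        inner = sum(2**n * exp(-lam) * lam**n / factorial(n) for n in range(nmax))
        total += 0.25 * inner * h
    return total
print(expected_prize())   # about 13.4 = (e^4 - 1)/4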

17.54. A. For a Negative Binomial with r = 4, f(0) = 1/(1+β)^4.
Prob[0 claims] = ∫0^2 {1/(1+β)^4} (1/2) dβ = -1/{6(1+β)^3} ] from β=0 to β=2 = (1/6)(1 - 3^-3) = 0.1605.
Prob[at least 1 claim] = 1 - 0.1605 = 0.8395.
17.55. E. For q = 0.5 and m = 2, f(2) = 0.5^2 = 0.25.
For q = 0.5 and m = 4, f(2) = C(4,2)(0.5^2)(0.5^2) = 0.375.
Probability that the mixed distribution is 2 is: p(0.25) + (1 - p)(0.375) = 0.375 - 0.125p.
Comment: The solution cannot involve p^2, eliminating choices A, C, and D.
Section 18, Gamma Function142
The quantity x^(α-1) e^-x is finite for x ≥ 0 and α ≥ 1.
Since it declines quickly to zero as x approaches infinity, its integral from zero to infinity exists.
This is the much studied and tabulated (complete) Gamma Function.
Γ(α) = ∫0^∞ t^(α-1) e^-t dt = θ^-α ∫0^∞ t^(α-1) e^(-t/θ) dt, for α > 0, θ > 0.
We prove the equality of these two integrals by making the change of variables x = tθ:
∫0^∞ t^(α-1) e^-t dt = ∫0^∞ (x/θ)^(α-1) e^(-x/θ) dx/θ = θ^-α ∫0^∞ x^(α-1) e^(-x/θ) dx.
For integer α: Γ(α) = (α - 1)!   In general: Γ(α) = (α-1) Γ(α-1).
Γ(1) = 1. Γ(2) = 1. Γ(3) = 2. Γ(4) = 6. Γ(5) = 24. Γ(6) = 120. Γ(7) = 720. Γ(8) = 5040.
One does not need to know how to compute the complete Gamma Function for noninteger alpha.
Many computer programs will give values of the complete Gamma Function.

Γ(1/2) = √π.   Γ(3/2) = 0.5√π.   Γ(-1/2) = -2√π.   Γ(-3/2) = (4/3)√π.
For α ≥ 10:
ln Γ(α) ≅ (α - 0.5) ln(α) - α + ln(2π)/2 + 1/(12α) - 1/(360α^3) + 1/(1260α^5) - 1/(1680α^7)
+ 1/(1188α^9) - 691/(360,360 α^11) + 1/(156 α^13) - 3617/(122,400 α^15).143
For α < 10 use the recursion relationship Γ(α) = Γ(α+1)/α.
The Gamma function is undefined at the negative integers and zero.
For large α: Γ(α) ≅ e^-α α^(α - 1/2) √(2π), which is Stirling's formula.144

The ratios of two Gamma functions with arguments that differ by an integer can be computed in
terms of a product of factors, just as one would with a ratio of factorials.

142 See Appendix A of Loss Models. Also see the Handbook of Mathematical Functions, by M. Abramowitz, et. al.
143 See Appendix A of Loss Models, and the Handbook of Mathematical Functions, by M. Abramowitz, et. al.
144 See the Handbook of Mathematical Functions, by M. Abramowitz, et. al.
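The series for ln Γ(α) quoted above can be checked against a library value; an illustrative sketch (math.lgamma is the standard-library log-gamma):
from math import log, lgamma, pi
def ln_gamma_series(a):
    # asymptotic series quoted above, intended for alpha >= 10
    s = (a - 0.5) * log(a) - a + 0.5 * log(2 * pi)
    s += 1/(12*a) - 1/(360*a**3) + 1/(1260*a**5) - 1/(1680*a**7)
    s += 1/(1188*a**9) - 691/(360360*a**11) + 1/(156*a**13) - 3617/(122400*a**15)
    return s
print(ln_gamma_series(10.0), lgamma(10.0))   # both about 12.80183 = ln(9!)
# For alpha < 10, shift the argument up with Gamma(alpha) = Gamma(alpha + 1)/alpha before applying the series.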
Exercise: What is Γ(7) / Γ(4)?
[Solution: Γ(7) / Γ(4) = 6! / 3! = (6)(5)(4) = 120.]
Exercise: What is Γ(7.2) / Γ(4.2)?
[Solution: Γ(7.2) / Γ(4.2) = 6.2! / 3.2! = (6.2)(5.2)(4.2) = 135.4.]
Note that even when the arguments are not integer, the ratio still involves a product of factors.
The solution of the last exercise depended on the fact that 7.2 - 4.2 = 3 is an integer.
Integrals involving e^-x and powers of x can be written in terms of the Gamma function:
∫0^∞ t^(α-1) e^(-t/θ) dt = Γ(α) θ^α, or for integer n: ∫0^∞ t^n e^(-ct) dt = n! / c^(n+1).
Exercise: What is the integral from 0 to ∞ of: t^3 e^(-t/10)?
[Solution: With α = 4 and θ = 10, this integral is: Γ(4) 10^4 = (6)(10000) = 60,000.]
This formula for gamma-type integrals is very useful for working with anything involving the Gamma
distribution, for example the Gamma-Poisson process. It follows from the definition of the Gamma
function and a change of variables.
The Gamma density in the Appendix of Loss Models is: x^(α-1) e^(-x/θ) / {θ^α Γ(α)}.
Since this probability density function must integrate to unity, the above formula for gamma-type
integrals follows. This is a useful way to remember this formula on the exam.
Incomplete Gamma Function:
As shown in Appendix A of Loss Models, the Incomplete Gamma Function is defined as:
Γ(α ; x) = ∫0^x t^(α-1) e^-t dt / Γ(α).
Γ(α ; 0) = 0. Γ(α ; ∞) = Γ(α)/Γ(α) = 1. As discussed below, the Incomplete Gamma Function with
the introduction of a scale parameter is the Gamma Distribution.
Exercise: Via integration by parts, put Γ(2 ; x) in terms of Exponentials and powers of x.
[Solution: Γ(2 ; x) = ∫0^x t e^-t dt / Γ(2) = ∫0^x t e^-t dt = (-t e^-t - e^-t) ] from t=0 to t=x = 1 - e^-x - x e^-x.]
One can prove via integration by parts that Γ(α ; x) = Γ(α-1 ; x) - x^(α-1) e^-x / Γ(α).145
This recursion formula for integer alpha is: Γ(n ; x) = Γ(n-1 ; x) - x^(n-1) e^-x / (n-1)!.
Combined with the fact that Γ(1 ; x) = ∫0^x e^-t dt = 1 - e^-x, this leads to the following formula for the
Incomplete Gamma for positive integer alpha:146
Γ(n ; x) = 1 - Σ_{i=0 to n-1} x^i e^-x / i!.
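For integer α the closed form above is easy to verify against the defining integral; a small sketch with hypothetical helper names:
from math import exp, factorial
def incomplete_gamma_sum(n, x):
    # Gamma(n ; x) = 1 - sum_{i=0}^{n-1} x^i e^-x / i!, for positive integer n
    return 1.0 - sum(x**i * exp(-x) / factorial(i) for i in range(n))
def incomplete_gamma_numeric(n, x, steps=100000):
    # midpoint-rule evaluation of the defining integral, divided by Gamma(n) = (n-1)!
    h = x / steps
    return sum(((i + 0.5) * h)**(n - 1) * exp(-(i + 0.5) * h) * h for i in range(steps)) / factorial(n - 1)
print(incomplete_gamma_sum(3, 8.4), incomplete_gamma_numeric(3, 8.4))   # both about 0.99, as used in problem 18.4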

Integrals Involving Exponentials times Powers:
One can use the incomplete Gamma Function to handle integrals involving t e^(-t/θ):
∫0^x t e^(-t/θ) dt = θ^2 ∫0^(x/θ) s e^-s ds = θ^2 Γ(2 ; x/θ) Γ(2) = θ^2 {1 - e^(-x/θ) - (x/θ) e^(-x/θ)}.
Exercise: What is the integral from 0 to 3.4 of: t e^(-t/10)?
[Solution: (10^2) {1 - e^(-3.4/10) - (3.4/10) e^(-3.4/10)} = 4.62.]
Such integrals can also be done via integration by parts, or as discussed below using the formula for
the present value of a continuously increasing annuity, or one can make use of the formula for the
Limited Expected Value of an Exponential Distribution:147
145 See for example, Formula 6.5.13 in the Handbook of Mathematical Functions, by Abramowitz, et. al.
146 See Theorem A.1 in Appendix A of Loss Models. One can also establish this result by computing the waiting time
until the nth claim for a Poisson Process, as shown in Mahler's Guide to Stochastic Processes, on another exam.
147 See Appendix A of Loss Models.

∫0^x t e^(-t/θ) dt = θ ∫0^x t e^(-t/θ) / θ dt = θ {E[X ∧ x] - x S(x)} = θ {θ(1 - e^(-x/θ)) - x e^(-x/θ)} = θ^2 {1 - e^(-x/θ) - (x/θ) e^(-x/θ)}.


When the upper limit is infinity, the integral simplifies: ∫0^∞ t e^(-t/θ) dt = θ^2.148

In a similar manner, one can use the incomplete Gamma Function to handle integrals involving
t^n e^(-t/θ), for n integer:
∫0^x t^n e^(-t/θ) dt = θ^(n+1) ∫0^(x/θ) s^n e^-s ds = θ^(n+1) Γ(n+1 ; x/θ) Γ(n+1) = n! θ^(n+1) {1 - Σ_{i=0 to n} (x/θ)^i e^(-x/θ) / i!}.

Exercise: What is the integral from 0 to 3.4 of: t^3 e^(-t/10)?
[Solution: ∫0^x t^3 e^(-t/θ) dt = θ^4 ∫0^(x/θ) s^3 e^-s ds = θ^4 Γ(4 ; x/θ) Γ(4)
= 6θ^4 {1 - e^(-x/θ) - (x/θ) e^(-x/θ) - (x/θ)^2 e^(-x/θ)/2 - (x/θ)^3 e^(-x/θ)/6}.
For θ = 10 and x = 3.4, this is:
60000 {1 - e^-0.34 - 0.34 e^-0.34 - 0.34^2 e^-0.34/2 - 0.34^3 e^-0.34/6} = 25.49.]

148 If one divided by θ, then the integrand would be t times the density of an Exponential Distribution. Therefore, the
given integral is θ (mean of an Exponential Distribution) = θ^2.
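The two worked integrals above (4.62 and 25.49) can be reproduced from the closed form; an illustrative sketch, not from the original text:
from math import exp, factorial
def gamma_type_integral(n, x, theta):
    # integral from 0 to x of t^n e^(-t/theta) dt = n! theta^(n+1) {1 - sum_{i=0}^{n} (x/theta)^i e^(-x/theta)/i!}
    u = x / theta
    tail = sum(u**i * exp(-u) / factorial(i) for i in range(n + 1))
    return factorial(n) * theta**(n + 1) * (1 - tail)
print(gamma_type_integral(1, 3.4, 10))   # about 4.62
print(gamma_type_integral(3, 3.4, 10))   # about 25.49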
Continuously Increasing Annuities:
The present value of a continuously increasing annuity of term n, with force of interest δ, is:149
(Ia)_n = (a_n - n e^(-nδ)) / δ,
where the present value of a continuous annuity of term n, with force of interest δ, is:
a_n = (1 - e^(-nδ)) / δ.
However, the present value of a continuously increasing annuity can also be written as the integral
from 0 to n of t e^(-δt). Therefore,
∫0^n t e^(-δt) dt = {(1 - e^(-nδ))/δ - n e^(-nδ)} / δ = (1 - e^(-nδ))/δ^2 - n e^(-nδ)/δ.
Those who remember the formula for the present value of an increasing continuous annuity will find
writing such integrals involving t e^(-δt) in terms of increasing annuities to be faster than doing integration
by parts.
Exercise: What is the integral from 0 to 3.4 of: t e^(-t/10)?
[Solution: {(1 - e^(-3.4/10))/0.1 - (3.4) e^(-3.4/10)} / 0.1 = (2.882 - 2.420)/0.1 = 4.62.
Comment: Matches the answer gotten above using Incomplete Gamma Functions.
4.62 is the present value of a continuously increasing annuity with term 3.4 years and force of interest
10%.]
149 See for example, The Theory of Interest by Kellison.
Gamma Distribution:150
The Gamma Distribution can be defined in terms of the Incomplete Gamma Function,
F(x) = Γ(α ; x/θ). Note that Γ(α ; ∞) = Γ(α)/Γ(α) = 1 and Γ(α ; 0) = 0, so we have as required for a
distribution function F(∞) = 1 and F(0) = 0.
f(x) = (x/θ)^α e^(-x/θ) / {x Γ(α)} = x^(α-1) e^(-x/θ) / {θ^α Γ(α)}, x > 0.
Exercise: What is the mean of a Gamma Distribution?
[Solution: ∫0^∞ x f(x) dx = ∫0^∞ x^α e^(-x/θ) dx / {θ^α Γ(α)} = Γ(α+1) θ^(α+1) / {θ^α Γ(α)} = αθ.]
Exercise: What is the nth moment of a Gamma Distribution?
[Solution: ∫0^∞ x^n f(x) dx = ∫0^∞ x^(α+n-1) e^(-x/θ) dx / {θ^α Γ(α)} = Γ(α+n) θ^(α+n) / {θ^α Γ(α)} = {Γ(α+n)/Γ(α)} θ^n
= (α+n-1)(α+n-2)....(α) θ^n.
Comment: This is the formula shown in Appendix A of Loss Models.]
Exercise: What is the 3rd moment of a Gamma Distribution with α = 5 and θ = 2.5?
[Solution: (α+n-1)(α+n-2)....(α) θ^n = (5+3-1)(5+3-2)(5)(2.5^3) = 3281.25.]
Since the Chi-Square Distribution with ν degrees of freedom is a Gamma Distribution with
shape parameter of ν/2 and (inverse) scale parameter of 1/2: χ²_ν(x) = Γ(ν/2 ; x/2).
Therefore, one can look up values of the Incomplete Gamma Function (for half integer or
integer values of α) by using the cumulative values of the Chi-Square Distribution.
For example, Γ(6 ; 10) = the Chi-Square Distribution for 2 x 6 = 12 degrees of freedom at a value
of 2 x 10 = 20. For the Chi-Square with 12 d.f. there is a 0.067 chance of a value greater than
20, so the value of the distribution function is: χ²₁₂(20) = 1 - 0.067 = 0.933 = Γ(6 ; 10).
150 See Mahler's Guide to Loss Distributions.
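As a check on the Gamma moment formula above, a small sketch (math.gamma is the standard-library Gamma function):
from math import gamma
def gamma_moment(n, alpha, theta):
    # E[X^n] for a Gamma(alpha, theta): theta^n Gamma(alpha + n)/Gamma(alpha)
    return theta**n * gamma(alpha + n) / gamma(alpha)
print(gamma_moment(1, 3, 2/3))   # 2.0, the mean of the alpha = 3, theta = 2/3 prior used in the next section
print(gamma_moment(3, 5, 2.5))   # 3281.25, as in the exercise above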
Inverse Gamma Distribution:151
By employing the change of variables y = 1/x, integrals involving e^(-θ/x) and powers of 1/x can be
written in terms of the Gamma function:
∫0^∞ t^-(α+1) e^(-θ/t) dt = Γ(α) θ^-α.
The Inverse Gamma Distribution can be defined in terms of the Incomplete Gamma Function,
F(x) = 1 - Γ[α ; θ/x].
The density of the Inverse Gamma is: θ^α e^(-θ/x) / {x^(α+1) Γ(α)}, for 0 < x < ∞.
A good way to remember the result for integrals from zero to infinity of powers of 1/x times
Exponentials of 1/x, is that the density of the Inverse Gamma Distribution must integrate to unity.
151 See Mahler's Guide to Loss Distributions. Appendix A of Loss Models.
Problems:
18.1 (1 point) What is the value of the integral from zero to infinity of: x5 e-8x?
A. less than 0.0004
B. at least 0.0004 but less than 0.0005
C. at least 0.0005 but less than 0.0006
D. at least 0.0006 but less than 0.0007
E. at least 0.0007
18.2 (1 point) What is the density at x = 8 of the Gamma distribution with parameters
= 3 and = 10?
A. less than 0.012
B. at least 0.012 but less than 0.013
C. at least 0.013 but less than 0.014
D. at least 0.014 but less than 0.015
E. at least 0.015

18.3 (1 point) Determine ∫0^∞ x^-6 e^(-4/x) dx.

A. less than 0.02


B. at least 0.02 but less than 0.03
C. at least 0.03 but less than 0.04
D. at least 0.04 but less than 0.05
E. at least 0.05
18.4 (2 points) What is the integral from 6.3 to 8.4 of x^2 e^-x / 2?
Hint: Use the Chi-Square table.
A. less than 0.01
B. at least 0.01 but less than 0.03
C. at least 0.03 but less than 0.05
D. at least 0.05 but less than 0.07
E. at least 0.07
18.5 (2 points) What is the integral from 4 to 8 of: x e^(-x/5)?
A. 7   B. 8   C. 9   D. 10   E. 11
Use the following information for the next 3 questions:
Define the following distribution function in terms of the Incomplete Gamma Function:
F(x) = Γ[α ; ln(x)/θ], 1 < x.
18.6 (2 points) What is the probability density function corresponding to this distribution function?
A. x^α e^(-θ/x) / Γ[α]
B. {ln(x)}^(α-1) / {θ^α x^(1+1/θ) Γ[α]}
C. x^(α+1) e^(-θ/x) / Γ[α]
D. {ln(x)}^α / {θ^α x^(1+1/θ) Γ[α]}
E. None of the above
18.7 (2 points) What is the mean of this distribution?
A. θ/(1-θ)
B. α/θ
C. (1-θ)^α
D. αθ
E. None of the above
18.8 (3 points) If α = 5 and θ = 1/7, what is the 3rd moment of this distribution?
A. less than 12
B. at least 12 but less than 13
C. at least 13 but less than 14
D. at least 14 but less than 15
E. at least 15
Solutions to Problems:
18.1. B. Γ(5+1) / 8^(5+1) = 5! / 8^6 = 0.000458.
18.2. D. x^(α-1) e^(-x/θ) / {θ^α Γ(α)} = (10^-3) 8^2 e^-0.8 / Γ(3) = 0.0144.
18.3. B. The density of the Inverse Gamma is: θ^α e^(-θ/x) / {x^(α+1) Γ(α)}, 0 < x < ∞.
Since this density integrates to one, x^-(α+1) e^(-θ/x) integrates to Γ(α) θ^-α.
Thus taking α = 5 and θ = 4, x^-6 e^(-4/x) integrates to: 4^-5 Γ(5) = 24 / 4^5 = 0.0234.
Comment: Alternately, one can make the change of variables y = 1/x.
18.4. C. The integrand is that of the Incomplete Gamma Function for α = 3:
x^(α-1) e^-x / Γ(α) = x^2 e^-x / 2. Thus the integral is: Γ(3 ; 8.4) - Γ(3 ; 6.3).
Since the Chi-Square Distribution with ν degrees of freedom is a Gamma Distribution with shape
parameter of ν/2 and scale parameter of 2: χ²₆(x) = Γ(3 ; x/2).
Looking up the Chi-Square Distribution for 6 degrees of freedom, the distribution function is 99% at
16.8 and 95% at 12.6.
99% = Γ(3 ; 16.8/2) = Γ(3 ; 8.4), and 95% = Γ(3 ; 12.6/2) = Γ(3 ; 6.3).
Thus Γ(3 ; 8.4) - Γ(3 ; 6.3) = 0.99 - 0.95 = 0.04.
Comment: The particular integral can be done via repeated integration by parts.
One gets: -e^-x {(x^2/2) + x + 1}. Evaluating at the limits of 8.4 and 6.3 gives the same result.
18.5. A. ∫0^x t e^(-t/θ) dt = θ^2 {1 - e^(-x/θ) - (x/θ) e^(-x/θ)}. Set θ = 5.
∫4^8 t e^(-t/5) dt = ∫0^8 t e^(-t/5) dt - ∫0^4 t e^(-t/5) dt = (5^2){1 - e^(-x/5) - (x/5) e^(-x/5)} ] from x=4 to x=8
= (25){e^-0.8 + (0.8) e^-0.8 - e^-1.6 - (1.6) e^-1.6} = 7.10.
Comment: Can also be done using integration by parts or the increasing annuity technique.
18.6. B. Let y = ln(x)/θ. If y follows a Gamma Distribution with parameters α and 1, then x follows a
LogGamma Distribution with parameters α and θ. If y follows a Gamma Distribution with
parameters α and 1, then f(y) = y^(α-1) e^-y / Γ(α). Then the density of x is given by:
f(y)(dy/dx) = {(ln(x)/θ)^(α-1) exp(-ln(x)/θ) / Γ(α)} / (θx) = {ln(x)}^(α-1) / {θ^α x^(1+1/θ) Γ(α)}.
Comment: This is called the LogGamma Distribution and bears the same relationship to the Gamma
Distribution as the LogNormal bears to the Normal Distribution.
Note that the support for the LogGamma is 1 to ∞, since when y = 0, x = exp(0) = 1.
18.7. E. ∫1^∞ x f(x) dx = ∫1^∞ {ln(x)}^(α-1) / {θ^α x^(1/θ) Γ(α)} dx.
Let y = ln(x)/θ, and thus x = exp(θy), dx = θ exp(θy) dy; then the integral for the first moment is:
∫0^∞ {θy}^(α-1) {θ exp(θy) dy} / {exp(y) θ^α Γ(α)} = ∫0^∞ y^(α-1) exp[-y(1-θ)] dy / Γ(α) = (1-θ)^-α.
18.8. E. The formula for the nth moment is derived as follows:
∫1^∞ x^n f(x) dx = ∫1^∞ x^n {ln(x)}^(α-1) dx / {x^(1+1/θ) θ^α Γ(α)} = ∫1^∞ {ln(x)}^(α-1) x^(n-(1+1/θ)) dx / {θ^α Γ(α)}.
Let y = ln(x)/θ, and thus x = exp(θy), dx = θ exp(θy) dy; then the integral for the nth moment is:
∫0^∞ {θy}^(α-1) exp({n-(1+1/θ)}θy) {θ exp(θy) dy} / {θ^α Γ(α)} = ∫0^∞ y^(α-1) exp[-y(1-nθ)] dy / Γ(α)
= (1-nθ)^-α, for nθ < 1. Thus the 3rd moment with α = 5 and θ = 1/7 is: (1-nθ)^-α = (1-3/7)^-5 = 16.41.
Comment: One could plug in n = 3 and the value of the parameters at any stage in the computation.
I have chosen to do so at the very end.
Section 19, Gamma-Poisson Frequency Process152
The single most important specific example of mixing frequency distributions, is mixing Poisson
Frequency Distributions via a Gamma Distribution. Each insured in a portfolio is assumed to have a
Poisson distribution with mean λ. Across the portfolio, λ is assumed to be distributed via a Gamma
Distribution. Due to the mathematical properties of the Gamma and Poisson there are some specific
relationships. For example, as will be discussed, the mixed distribution is a Negative Binomial
Distribution.
Prior Distribution:
The number of claims a particular policyholder makes in a year is assumed to be Poisson with mean
λ. For example, the chance of having 6 claims is given by: λ^6 e^-λ / 6!
Assume the λ values of the portfolio of policyholders are Gamma distributed with α = 3 and
θ = 2/3, and therefore probability density function:153
f(λ) = 1.6875 λ^2 e^-1.5λ, λ ≥ 0.
This prior Gamma Distribution of Poisson parameters is displayed below:
[Graph omitted: density of the prior Gamma Distribution, plotted against the Poisson parameter λ.]
152 Section 6.3 of Loss Models.
Additional aspects of the Gamma-Poisson are discussed in Mahler's Guide to Conjugate Priors.
153 For the Gamma Distribution, f(x) = x^(α-1) e^(-x/θ) / {θ^α Γ(α)}.
The Prior Distribution Function is given in terms of the Incomplete Gamma Function:
F(λ) = Γ(3 ; 1.5λ). So for example, the a priori chance that λ lies between
4 and 5 is: F(5) - F(4) = Γ(3 ; 7.5) - Γ(3 ; 6) = 0.9797 - 0.9380 = 0.0417.
Graphically, this is the area between 4 and 5 and under the prior Gamma.
Mixed Distribution:
If we have a risk and do not know what type it is, in order to get the chance of having 6 claims, one
would weight together the chances of having 6 claims, using the a priori probabilities and integrating
from zero to infinity:154
∫0^∞ {λ^6 e^-λ / 6!} f(λ) dλ = ∫0^∞ {λ^6 e^-λ / 6!} 1.6875 λ^2 e^-1.5λ dλ = 0.00234375 ∫0^∞ λ^8 e^-2.5λ dλ.
This integral can be written in terms of the (complete) Gamma function:
∫0^∞ λ^(α-1) e^(-λ/θ) dλ = Γ(α) θ^α.
Thus ∫0^∞ λ^8 e^-2.5λ dλ = Γ(9) 2.5^-9 = (8!) (0.4)^9 ≅ 10.57.
Thus the chance of having 6 claims ≅ (0.00234375)(10.57) ≅ 2.5%.


More generally, if the distribution of Poisson parameters is given by a Gamma distribution
f() = 1 e // (), and we compute the chance of having n accidents by integrating from zero
to infinity:

6 -

n e-
e
1 e- /
1
f()
d
=
d
=
n + 1 e- (1 + 1 / ) d =
n!
6!

()
n! () 0
0
0

n
n
1
(n + )
(n+ )
( + 1)...( + n -1)
=
=
.
() n! n + (1 + 1/ )n +
(1 + )n +
n! () (1 + 1/ )n +
n!
The mixed distribution is in the form of the Negative Binomial distribution with parameters
r = and = :
Probability of n accidents =

x
r(r +1)...(r + x - 1)
.
(1+ ) x + r
x!

Note the way both the Gamma and the Poisson have factors involving powers of and e and these similar
factors combine in the product.
154
For the specific case dealt with previously: n = 6, α = 3 and θ = 2/3.
Therefore, the mixed Negative Binomial Distribution has parameters r = α = 3 and β = θ = 2/3.
Thus the chance of having 6 claims is:
{(3)(4)(5)(6)(7)(8) / 6!} (2/3)^6 / (1 + 2/3)^(6+3) = 2.477%.
This is the same result as calculated above.
This mixed Negative Binomial Distribution is displayed below, through 10 claims:
[Bar chart omitted: densities of the mixed Negative Binomial at 0 through 10 claims.]
On the exam, one should not go through the calculation above. Rather remember that the mixed
distribution is a Negative Binomial.

When Poissons are mixed via a Gamma Distribution,


the mixed distribution is always a Negative Binomial Distribution,
with r = = shape parameter of the Gamma and
= = scale parameter of the Gamma.
r goes with alpha, beta rhymes with theta.
Note that the overall (a priori) mean can be computed in either one of two ways.
First one can weight together the means for each type of risk, using the a priori probabilities.
This is E[λ] = the mean of the prior Gamma = αθ = 3(2/3) = 2.
Alternately, one can compute the mean of the mixed distribution: the mean of a Negative Binomial is
rβ = 3(2/3) = 2. Of course the two results match.
Exponential-Poisson:155
It is important to note that the Exponential distribution is a special case of the Gamma
distribution, for α = 1.
For the important special case α = 1, we have an Exponential distribution of λ: f(λ) = e^(-λ/θ)/θ, λ ≥ 0.
The mixed distribution is a Negative Binomial Distribution with r = 1 and β = θ.
For the Exponential-Poisson, the mixed distribution is a Geometric Distribution with
β = θ.
Mixed Distribution for the Gamma-Poisson, When Observing Several Years of Data:
One can observe for a period of time longer than a year. If an insured has a Poisson parameter of λ
for each individual year, with the same λ for each year, and the years are independent, then for
example one has a Poisson parameter of 7λ for 7 years. The chances of such an insured having a
given number of claims over 7 years is given by a Poisson with parameter 7λ. For a portfolio of
insureds, each of its Poisson parameters is multiplied by 7. This is mathematically just like inflation.
If before their each being multiplied by 7, the Poisson parameters follow a Gamma distribution with
parameters α and θ, then after being multiplied by 7 they follow a Gamma with parameters α and
7θ.156 Thus the mixed distribution for 7 years of data is given by a Negative Binomial with
parameters r = α and β = 7θ.
155 See for example 3/11/01, Q.27.
156 Under uniform inflation, the scale parameter of the Gamma Distribution is multiplied by the inflation factor.
See Mahler's Guide to Loss Distributions.
In general, if one observes a Gamma-Poisson situation for Y years, and each insured's
Poisson parameter λ does not change over time, then the distribution of Poisson
parameters for Y years is given by a Gamma Distribution with parameters α and Yθ, and
the mixed distribution for Y years of data is given by a Negative Binomial Distribution,
with parameters r = α and β = Yθ.157
Exercise: Assume that the number of claims in a year for each insured has a Poisson Distribution with
mean λ. The distribution of λ over the portfolio of insureds is a Gamma Distribution with parameters
α = 3 and θ = 0.01.
What is the mean annual claim frequency for the portfolio of insureds?
[Solution: The mean annual claims frequency = mean of the (prior) Gamma = αθ = (3)(0.01) = 3%.]
Exercise: Assume that the number of claims in a year for each insured has a Poisson Distribution with
mean λ. For each insured, λ does not change over time. For each insured, the numbers of claims in
one year is independent of the number of claims in another year. The distribution of λ over the
portfolio of insureds is a Gamma Distribution with parameters α = 3 and θ = 0.01.
An insured is picked at random and observed for 9 years.
What is the chance of observing exactly 4 claims from this insured?
[Solution: The mixed distribution for 9 years of data is given by a Negative Binomial Distribution with
parameters r = α = 3 and β = Yθ = (9)(0.01) = 0.09.
f(4) = {(4 + 3 - 1)! / (4! 2!)} 0.09^4 / (1 + 0.09)^(3+4) = 0.054%.]
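A quick numerical check of this 9-year probability, and of the contrast with the changing-λ case discussed below; a sketch with an assumed helper name:
from math import comb
def nb_pmf(k, r, beta):
    # Negative Binomial density: C(k + r - 1, k) beta^k / (1 + beta)^(k + r), for integer r
    return comb(k + r - 1, k) * beta**k / (1 + beta)**(k + r)
print(nb_pmf(4, r=3, beta=0.09))    # about 0.00054: same lambda every year (theta multiplied by Y = 9)
print(nb_pmf(4, r=27, beta=0.01))   # about 0.00020: lambda redrawn every year (r multiplied by Y = 9)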

If Lois has a low expected annual claim frequency, for example 2%, then over 9 years she has a
Poisson Distribution with mean 18%. Her chance of having 4 claims during these nine years is:
0.184 e-0.18/ 24 = 0.004%.
If Hi has a very high expected annual claim frequency, for example 20%, then over 9 years he has a
Poisson Distribution with mean 180%. His chance of having 4 claims during these nine years is:
1.84 e-1.8/ 24 = 7.23%.
Drivers such as Lois with a low in one year are assumed to have the same low every year.
Such good drivers have an extremely small chance of having four claims in 9 years.
157

Each insureds Poisson parameter does not change over time. If Alans lambda is 4% this year, it is 4% next year,
and every year. Similarly, if Bonnies lambda is 3% this year , then it is 3% every year.
Unless stated otherwise, on the exam assume lambda does not vary over time.
Drivers such as Hi with a very high λ in one year are assumed to have the same high λ every year.
Such drivers have a significant chance of having four claims in 9 years. It is such very bad drivers
which contribute significantly to the 0.054% probability of four claims in 9 years for an insured picked
at random.
This situation in which λ for a given insured is the same over time, contrasts with that in which λ
changes randomly each year.
Exercise: Assume that the number of claims in a year for each insured has a Poisson Distribution with
mean λ. For each insured, λ changes each year at random; the λ in one year is independent of the
λ in another year.
The distribution of λ is a Gamma Distribution with parameters α = 3 and θ = 0.01.
An insured is picked at random and observed for 9 years.
What is the chance of observing exactly 4 claims from this insured?
[Solution: The mixed distribution for 1 year of data is given by a Negative Binomial Distribution with
parameters r = α = 3 and β = θ = 0.01. Over 9 years, we get a sum of 9 independent Negative
Binomials, with r = (9)(3) = 27 and β = 0.01.
f(4) = {(4 + 27 - 1)! / (4! 26!)} 0.01^4 / (1 + 0.01)^(27+4) = 0.00020.]
This is different than the Gamma-Poisson process in which we assume that the lambda for an
individual insured is the same each year. For the Gamma-Poisson the θ parameter is multiplied by
Y, while here the r parameter is multiplied by Y. This situation in which λ instead changes each year
is mathematically the same as if we assume an insured each year has a Negative Binomial
Distribution.
For example, assume an insured has a Negative Binomial with parameters r and β. Assume the
numbers of claims in one year is independent of the number of claims in another year. Then over Y
years, we add up Y independent identically distributed Negative Binomials; over Y years, the
frequency distribution for this insured is Negative Binomial with parameters Yr and β.
Exercise: Assume that the number of claims in a year for an insured has a Negative Binomial
Distribution with parameters r = 3 and β = 0.01. What is the mean annual claim frequency?
[Solution: rβ = (3)(0.01) = 3%.]
Exercise: Assume that the number of claims in a year for an insured has a Negative Binomial
Distribution with parameters r = 3 and β = 0.01. The numbers of claims in one year is independent
of the number of claims in another year. What is the chance of observing exactly 4 claims over 9
years from this insured?
[Solution: Over 9 years, the frequency distribution for this insured is Negative Binomial with
parameters r = (9)(3) = 27 and β = 0.01.
f(4) = {(4 + 27 - 1)! / (4! 26!)} 0.01^4 / (1 + 0.01)^(27+4) = 0.00020.]
Even though both situations had a 3% mean annual claim frequency, the probability of observing 4
claims over 9 years was higher in the Gamma-Poisson situation with the same λ each year for a
given insured, than when we assumed λ changed each year or equivalently an insured had the same
Negative Binomial Distribution each year. In the Gamma-Poisson situation with the same λ each
year for a given insured, we were more likely to see extreme results such as 4 claims in 9 years,
since there is a small probability of picking at random an insured with a high expected annual claim
frequency, such as Hi with λ = 20%.
Thinning a Negative Binomial Distribution:
Since the Gamma-Poisson is one source of the Negative Binomial Distribution, it can be used to aid
our understanding of the Negative Binomial Distribution.
For example, assume we have a Negative Binomial Distribution with r = 4 and β = 2.
We can think of that as resulting from a mixture of Poisson Distributions, with λ distributed via a
Gamma Distribution with α = 4 and θ = 2.158
Assume frequency and severity are independent, and that 30% of losses are large.
Then for each insured, his large losses are Poisson with mean 0.3λ. If λ is distributed via a Gamma
with α = 4 and θ = 2, then 0.3λ is distributed via a Gamma with α = 4 and θ = (0.3)(2) = 0.6.159
The large losses are a Gamma-Poisson Process, and therefore, across the whole portfolio, the
distribution of large losses is Negative Binomial, with r = 4 and β = 0.6.
158 While this may not be the real world situation that the Negative Binomial is modeling, since the results are
mathematically identical, we can assume it is for the purpose of deriving general mathematical results.
159 When a variable is Gamma Distributed, then a constant times that variable is also Gamma Distributed, with the same
shape parameter, but with the scale parameter multiplied by that constant. See the discussion of uniform inflation in
Mahler's Guide to Loss Distributions.

In this manner one can show, as has been discussed previously, that if losses are Negative
Binomial with parameters r and β, then if we take a fraction t of all the losses in a manner independent
of frequency, then these selected losses are Negative Binomial with parameters r and tβ.160
Returning to the example, the small losses for an individual insured are Poisson with mean 0.7λ.
Since λ is Gamma distributed, 0.7λ is distributed via a Gamma with α = 4 and θ = (0.7)(2) = 1.4.
Therefore, across the whole portfolio, the distribution of small losses is Negative Binomial, with r = 4
and β = 1.4.
Thus as in the Poisson situation, the overall process has been thinned into two similar processes.
However, unlike the Poisson case, these two Negative Binomials are not independent.
If, for example, we observe a lot of large losses, such as 5, it is more likely that the observation
came from an insured with a large λ. This implies we are more likely to also have observed a higher
than average number of small losses. The number of large losses and the number of small losses
are positively correlated.161
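A quick Monte Carlo sketch of this dependence (an added illustration, assuming numpy is available): simulate the portfolio with α = 4 and θ = 2, thin each insured's Poisson count with t = 30%, and look at the sample correlation of the two counts.

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha, theta, t = 200_000, 4.0, 2.0, 0.30

lam = rng.gamma(shape=alpha, scale=theta, size=n)  # each insured's Poisson mean
large = rng.poisson(t * lam)           # large losses: Poisson with mean 0.3*lambda
small = rng.poisson((1.0 - t) * lam)   # small losses: Poisson with mean 0.7*lambda

# Marginally: Negative Binomial with (r = 4, beta = 0.6) and (r = 4, beta = 1.4).
print(large.mean(), large.var())        # near 2.4 and 3.84
print(small.mean(), small.var())        # near 5.6 and 13.44
print(np.corrcoef(large, small)[0, 1])  # positive, about 0.47 (see the derivation below)
```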
Correlation of Number of Small and Large Losses, Negative Binomial:
Assume the number of losses follows a Negative Binomial Distribution with parameters r and β, and
that large losses are a fraction t of all the losses. As previously, assume each insured is Poisson with
mean λ, and λ is distributed via a Gamma with α = r and θ = β.
Then the number of large losses is a Gamma-Poisson with α = r and θ = tβ.
Posterior to observing L large losses, the distribution of the mean frequency for large losses is
Gamma with α = r + L and 1/θ = 1/(tβ) + 1, i.e. θ = tβ/(1 + tβ).162
Since the mean frequency of large losses is t times the mean frequency, posterior to observing L
large losses, the distribution of the mean frequency is Gamma with α = r + L and θ = β/(1 + tβ).
Therefore, given we have observed L large losses, the small losses are
Gamma-Poisson with α = r + L, and θ = (1-t)β/(1 + tβ).
160 This can be derived via probability generating functions. See Example 8.8 in Loss Models.
161 In the case of thinning a Binomial, the number of large and small losses would be negatively correlated.
162 See Mahler's Guide to Conjugate Priors.
One computes the correlation between the number of small losses, S, and the number of large
losses, L, as follows:
E[LS] = E_L[E[LS | L]] = E_L[L E[S | L]] = E_L[L (r + L) (1-t)β / (1 + tβ)]
= {(1-t)β / (1 + tβ)} {r E_L[L] + E_L[L^2]} = {(1-t)β / (1 + tβ)} {r(rtβ) + rtβ(1 + tβ) + (rtβ)^2}
= (1-t)tβ^2 r(1+r).163
Cov[L, S] = E[LS] - E[L]E[S] = (1-t)tβ^2 r(1+r) - (rtβ){r(1-t)β} = β^2 rt(1-t).
Corr[L, S] = β^2 rt(1-t) / sqrt[rtβ(1+tβ) r(1-t)β{1+(1-t)β}]
= 1 / sqrt[{1 + 1/(tβ)} {1 + 1/((1-t)β)}] > 0.
For example, assume we have a Negative Binomial Distribution with r = 4 and β = 2.
Assume frequency and severity are independent, and that 30% of losses are large.
Then the number of large losses is Negative Binomial with r = 4 and β = 0.6, and the number of
small losses is Negative Binomial with r = 4 and β = 1.4.
The correlation of the number of large and small losses is:
1 / sqrt[{1 + 1/(tβ)} {1 + 1/((1-t)β)}] = 1 / sqrt[(1 + 1/0.6)(1 + 1/1.4)] = 0.468.
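A small sketch evaluating this closed form (an added check, standard library only; nb_split_corr is a name introduced here for illustration, not from the text):

```python
from math import sqrt

def nb_split_corr(beta: float, t: float) -> float:
    # Correlation of the large-loss and small-loss counts when a Negative Binomial
    # with parameters (r, beta) is thinned with large-loss fraction t.
    # Note that r drops out of the formula.
    return 1.0 / sqrt((1.0 + 1.0 / (t * beta)) * (1.0 + 1.0 / ((1.0 - t) * beta)))

print(nb_split_corr(beta=2.0, t=0.30))  # 0.4678..., matching the 0.468 above
```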

163 Large losses are Negative Binomial with parameters r and tβ.
Thus, E_L[L^2] = Var[L] + E[L]^2 = rtβ(1 + tβ) + (rtβ)^2.
Problems:
Use the following information to answer the next 2 questions:
The number of claims a particular insured makes in a year is Poisson with mean λ.
λ for a particular insured remains the same each year.
The values of the Poisson parameter λ (for annual claim frequency) for the insureds in a portfolio
follow a Gamma distribution, with parameters α = 3 and θ = 1/12.
19.1 (2 points) What is the chance that an insured picked at random from the portfolio will have no
claims over the next three years?
A. less than 35%
B. at least 35% but less than 40%
C. at least 40% but less than 45%
D. at least 45% but less than 50%
E. at least 50%
19.2 (2 points) What is the chance that an insured picked at random from the portfolio will have one
claim over the next three years?
A. less than 35%
B. at least 35% but less than 40%
C. at least 40% but less than 45%
D. at least 45% but less than 50%
E. at least 50%

19.3 (2 points) The distribution of the annual number of claims for an insured chosen at random is
modeled by the negative binomial distribution with mean 0.6 and variance 0.9.
The number of claims for each individual insured has a Poisson distribution and the means of these
Poisson distributions are gamma distributed over the population of insureds.
Calculate the variance of this gamma distribution.
(A) 0.20
(B) 0.25
(C) 0.30
(D) 0.35
(E) 0.40

19.4 (2 points) The number of claims a particular policyholder makes in a year has a
Poisson distribution with mean λ. The λ-values for policyholders follow a gamma distribution with
variance equal to 0.3. The resulting distribution of policyholders by number of claims is a Negative
Binomial with parameters r and β such that the variance is equal to 0.7.
What is the value of r(1+β)?
A. less than 0.90
B. at least 0.90 but less than 0.95
C. at least 0.95 but less than 1.00
D. at least 1.00 but less than 1.05
E. at least 1.05
Use the following information for the next 3 questions:
Assume that the number of claims for an individual insured is given by a Poisson distribution with
mean (annual) claim frequency λ and variance λ. Also assume that the parameter λ varies for the
different insureds, with λ following a Gamma distribution:
g(λ) = λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)}, for 0 < λ < ∞, with mean αθ, and variance αθ^2.
19.5 (2 points) An insured is picked at random and observed for one year.
What is the chance of observing 2 claims?
A. αθ^2 / (1+θ)^(α+2)
B. α(α+1)θ^2 / (1+θ)^(α+2)
C. α(α+1)θ^2 / {2(1+θ)^(α+2)}
D. α^2(α+1)θ^2 / {6(1+θ)^(α+2)}
E. α(α+1)(α+2)θ^2 / {6(1+θ)^(α+2)}
19.6 (2 points) What is the unconditional mean frequency?
A. αθ
B. (α-1)θ
C. (α-1)θ^2
D. α(α-1)θ^2
E. (α-1)(α+1)θ^2/2
19.7 (3 points) What is the unconditional variance?
A. αθ^2
B. αθ + αθ^2
C. αθ + 2αθ^2
D. 2αθ^2
E. α(α+1)θ
Use the following information for the next 8 questions:
• As he walks, Clumsy Klem loses coins at a Poisson rate. The Poisson rate, expressed in coins per
minute, is constant during any one day, but varies from day to day according to a gamma distribution
with mean 0.2 and variance 0.016.
• The denominations of coins are randomly distributed: 50% of the coins are worth 5;
30% of the coins are worth 10; and 20% of the coins are worth 25.
19.8 (2 points) Calculate the probability that Clumsy Klem loses exactly one coin during the tenth
minute of today's walk.
(A) 0.09   (B) 0.11   (C) 0.13   (D) 0.15   (E) 0.17
19.9 (3 points) Calculate the probability that Clumsy Klem loses exactly two coins during the first 10
minutes of today's walk.
(A) 0.12   (B) 0.14   (C) 0.16   (D) 0.18   (E) 0.20
19.10 (4 points) Calculate the probability that the worth of the coins Clumsy Klem loses during his
one-hour walk today is greater than 300.
A. 1%
B. 3%
C. 5%
D. 7%
E. 9%
19.11 (2 points) Calculate the probability that the sum of the worth of the coins Clumsy Klem loses
during his one-hour walks each day for the next 5 days is greater than 900.
A. 1%
B. 3%
C. 5%
D. 7%
E. 9%
19.12 (2 points) During the first 10 minutes of today's walk, what is the chance that Clumsy Klem
loses exactly one coin of worth 5, and possibly coins of other denominations?
A. 31%
B. 33%
C. 35%
D. 37%
E. 39%
19.13 (3 points) During the first 10 minutes of today's walk, what is the chance that Clumsy Klem
loses exactly one coin of worth 5, and no coins of other denominations?
A. 11.6%
B. 12.0%
C. 12.4%
D. 12.8%
E. 13.2%
19.14 (3 points) Let A be the number of coins Clumsy Klem loses during the first minute of his walk
today. Let B be the number of coins Clumsy Klem loses during the first minute of his walk tomorrow.
What is the probability that A + B = 3?
A. 0.2%
B. 0.4%
C. 0.6%
D. 0.8%
E. 1.0%
19.15 (3 points) Let A be the number of coins Clumsy Klem loses during the first minute of his walk
today. Let B be the number of coins Clumsy Klem loses during the first minute of his walk tomorrow.
Let C be the number of coins Clumsy Klem loses during the first minute of his walk the day after
tomorrow. What is the probability that A + B + C = 2?
A. 8%
B. 10%
C. 12%
D. 14%
E. 16%

19.16 (2 points) For an insurance portfolio the distribution of the number of claims a particular
policyholder makes in a year is Poisson with mean λ.
The λ-values of the policyholders follow the Gamma distribution, with parameters α = 4 and θ = 1/9.
The probability that a policyholder chosen at random will experience x claims is given by which of
the following?
A. {(x + 3)! / (x! 3!)} 0.9^4 0.1^x
B. {(x + 3)! / (x! 3!)} 0.1^4 0.9^x
C. {(x + 8)! / (x! 8!)} 0.75^4 0.25^x
D. {(x + 8)! / (x! 8!)} 0.25^4 0.75^x
E. None of A, B, C, or D.
19.17 (2 points) The number of claims a particular policyholder makes in a year has a Poisson
distribution with mean λ. The λ-values for policyholders follow a Gamma distribution.
This Gamma Distribution has a variance equal to one quarter that of the resulting Negative Binomial
distribution of policyholders by number of claims.
What is the value of the β parameter of this Negative Binomial Distribution?
A. 1/6   B. 1/5   C. 1/4   D. 1/3   E. Cannot be determined

19.18 (1 point) Use the following information:
• The random variable representing the number of claims for a single policyholder follows
a Poisson distribution.
• For a portfolio of policyholders, the Poisson parameters follow a Gamma distribution
representing the heterogeneity of risks within that portfolio.
• The random variable representing the number of claims in a year of a policyholder,
chosen at random, follows a Negative Binomial distribution with parameters:
r = 4 and β = 3/17.
Determine the variance of the Gamma distribution.
(A) 0.110   (B) 0.115   (C) 0.120   (D) 0.125   (E) 0.130
19.19 (2 points) Tom will generate via simulation 100,000 values of the random variable X as
follows:
(i) He will generate the observed value of λ from a distribution with density λe^(-λ/1.4) / 1.96.
(ii) He then generates x from the Poisson distribution with mean λ.
(iii) He repeats the process 99,999 more times: first generating a value λ, then
generating x from the Poisson distribution with mean λ.
Calculate the expected number of Tom's 100,000 simulated values of X that are 6.
(A) 4200
(B) 4400
(C) 4600
(D) 4800
(E) 5000
19.20 (2 points) In the previous question, let V = the variance of a single simulated set of 100,000
values. What is the expected value of V?
A. 0
B. 2.8
C. 3.92
D. 5.6
E. 6.72
19.21 (2 points) Dick will generate via simulation 100,000 values of the random variable X as
follows:
(i) He will generate the observed value of λ from a distribution with density λe^(-λ/1.4) / 1.96.
(ii) He will then generate 100,000 independent values from the Poisson distribution
with mean λ.
Calculate the expected number of Dick's 100,000 simulated values of X that are 6.
(A) 4200
(B) 4400
(C) 4600
(D) 4800
(E) 5000
19.22 (2 points) In the previous question, let V = the variance of a single simulated set of 100,000
values. What is the expected value of V?
A. 0
B. 2.8
C. 3.92
D. 5.6
E. 6.72
19.23 (1 point) Harry will generate via simulation 100,000 values of the random variable X as
follows:
(i) He will generate the observed value of λ from a distribution with density λe^(-λ/1.4) / 1.96.
(ii) He then generates x from the Poisson distribution with mean λ.
(iii) He will then copy this value of x 99,999 times.
Calculate the expected number of Harry's 100,000 simulated values of X that are 6.
(A) 4200
(B) 4400
(C) 4600
(D) 4800
(E) 5000
19.24 (1 point) In the previous question, let V = the variance of a single simulated set of 100,000
values. What is the expected value of V?
A. 0
B. 2.8
C. 3.92
D. 5.6
E. 6.72

Use the following information for the next 7 questions:
• The number of vehicles arriving at an amusement park per day is Poisson with mean λ.
• λ varies from day to day via a Gamma Distribution with α = 40 and θ = 10.
• The value of λ on one day is independent of the value of λ on another day.
• The number of people leaving each vehicle is: 1 + a Negative Binomial Distribution
with r = 1.6 and β = 6.
• The amount of money spent at the amusement park by each person is
LogNormal with μ = 5 and σ = 0.8.
19.25 (1 point) What is the variance of the number of vehicles that will show up tomorrow at the
amusement park?
A. 4,000
B. 4,400
C. 4,800
D. 5,200
E. 5,600
19.26 (1 point) What is the variance of the number of vehicles that will show up over the next 7
days at the amusement park?
A. 25,000
B. 27,000
C. 29,000
D. 31,000
E. 33,000
19.27 (2 points) What is the variance of the number of people that will show up tomorrow at the
amusement park?
A. 480,000 B. 490,000 C. 500,000 D. 510,000 E. 520,000
19.28 (1 point) What is the variance of the number of people that will show up over the next 7 days
at the amusement park?
A. 2.8 million
B. 3.0 million C. 3.2 million D. 3.4 million E. 3.6 million
19.29 (3 points) What is the standard deviation of the money spent tomorrow at the amusement
park?
A. 150,000 B. 160,000 C. 170,000 D. 180,000 E. 190,000
19.30 (1 point) What is the standard deviation of the money spent over the next 7 days at the
amusement park?
A. 360,000 B. 370,000 C. 380,000 D. 390,000 E. 400,000
19.31 (2 points) You simulate the amount of the money spent over the next 7 days at the
amusement park. You run this simulation a total of 1000 times.
How many runs do you expect in which less than 5 million is spent?
A. 1
B. 2
C. 3
D. 4
E. 5

Use the following information for the next 6 questions:

For each individual driver, the number of accidents in a year follows a Poisson Distribution.
For each individual driver, the mean of their Poisson Distribution is the same each year.
For each individual driver, the number of accidents each year is independent of other years.
The number of accidents for different drivers are independent.
λ varies between drivers via a Gamma Distribution with mean 0.08 and variance 0.0032.
Moe, Larry, and Curly are each drivers.
19.32 (2 points) What is the probability that Moe has exactly one accident next year?
A. 6.9%
B. 7.1%
C. 7.3%
D. 7.5%
E. 7.7%
19.33 (2 points) What is the probability that Larry has exactly 2 accidents over the next 3 years?
A. 2.25%
B. 2.50%
C. 2.75%
D. 3.00%
E. 3.25%
19.34 (2 points) What is the probability that Moe, Larry, and Curly have a total of exactly 2
accidents during the next year?
A. 2.25%
B. 2.50%
C. 2.75%
D. 3.00%
E. 3.25%
19.35 (2 points) What is the probability that Moe, Larry, and Curly have a total of exactly 3
accidents during the next four years?
A. 5.2%
B. 5.4%
C. 5.6%
D. 5.8%
E. 6.0%
19.36 (3 points) What is the probability that Moe has no accidents next year, Larry has exactly one
accident over the next two years, and Curly has exactly two accidents over the next three years?
A. 0.3%
B. 0.4%
C. 0.5%
D. 0.6%
E. 0.7%
19.37 (9 points) Let M = the number of accidents Moe has next year.
Let L = the number of accidents Larry has over the next two years.
Let C = the number of accidents Curly has over the next three years.
Determine the probability that M + L + C = 3.
A. 0.9%
B. 1.1%
C. 1.3%
D. 1.5%
E. 1.7%

Use the following information to answer the next 3 questions:
The number of claims a particular policyholder makes in a year is Poisson. The values of the Poisson
parameter λ (for annual claim frequency) for the individual policyholders in a portfolio of 10,000 follow a
Gamma distribution, with parameters α = 4 and θ = 0.1.
You observe this portfolio for one year and divide it into three groups based on how many claims
you observe for each policyholder:
Group A: Those with no claims.
Group B: Those with one claim.
Group C: Those with two or more claims.
19.38 (1 point) What is the expected size of Group A?
(A) 6200
(B) 6400
(C) 6600
(D) 6800
(E) 7000
19.39 (1 point) What is the expected size of Group B?
(A) 2400
(B) 2500
(C) 2600
(D) 2700
(E) 2800
19.40 (1 point) What is the expected size of Group C?
(A) 630
(B) 650
(C) 670
(D) 690
(E) 710

19.41 (3 points) The claims from a particular insured in a time period t are Poisson with mean λt.
The values of λ for the individual insureds in a portfolio follow a Gamma distribution,
with parameters α = 3 and θ = 0.02.
For an insured picked at random, what is the average wait until the first claim?
A. 17
B. 19
C. 21
D. 23
E. 25
19.42 (2 points) Use the following information:
• Frequency for an individual is an 80-20 mixture of two Poissons with means λ and 3λ.
• The distribution of λ is Exponential with a mean of 0.1.
For an insured picked at random, what is the probability of seeing two claims?
A. 1.2%
B. 1.3%
C. 1.4%
D. 1.5%
E. 1.6%
19.43 (2 points) Claim frequency follows a Poisson distribution with parameter λ.
λ is distributed according to: g(λ) = 25λe^(-5λ).
Determine the probability that there will be at least 2 claims during the next year.
A. 5%
B. 7%
C. 9%
D. 11%
E. 13%

Use the following information for the next two questions:
• 60% of claims are small.
• 40% of claims are large.
• The annual number of claims from a particular insured is Poisson with mean λ.
• λ is distributed across a group of insureds via a Gamma with α = 2 and θ = 0.5.
• You pick an insured at random and observe for one year.
19.44 (2 points) What is the variance of the number of small claims?
A. 0.78
B. 0.80
C. 0.82
D. 0.84
E. 0.86
19.45 (2 points) What is the variance of the number of large claims?
A. 0.40
B. 0.42
C. 0.44
D. 0.46
E. 0.48

19.46 (4B, 11/96, Q.15) (2 points) You are given the following:
• The number of claims for a single policyholder follows a Poisson distribution with mean λ.
• λ follows a gamma distribution.
• The number of claims for a policyholder chosen at random follows a distribution
with mean 0.10 and variance 0.15.
Determine the variance of the gamma distribution.
A. 0.05
B. 0.10
C. 0.15
D. 0.25
E. 0.30
19.47 (4B, 11/96, Q.26) (2 points) You are given the following:
• The probability that a single insured will produce 0 claims during the next
exposure period is e^(-θ).
• θ varies by insured and follows a distribution with density function
f(θ) = 36θe^(-6θ), 0 < θ < ∞.

Determine the probability that a randomly selected insured will produce 0 claims during the next
exposure period.
A. Less than 0.72
B. At least 0.72, but less than 0.77
C. At least 0.77, but less than 0.82
D. At least 0.82, but less than 0.87
E. At least 0.87

19.48 (Course 3 Sample Exam, Q.12) The annual number of accidents for an individual driver
has a Poisson distribution with mean λ. The Poisson means, λ, of a heterogeneous population of
drivers have a gamma distribution with mean 0.1 and variance 0.01.
Calculate the probability that a driver selected at random from the population will have 2 or more
accidents in one year.
A. 1/121
B. 1/110
C. 1/100
D. 1/90
E. 1/81
19.49 (3, 5/00, Q.4) (2.5 points) You are given:
(i) The claim count N has a Poisson distribution with mean λ.
(ii) λ has a gamma distribution with mean 1 and variance 2.
Calculate the probability that N = 1.
(A) 0.19
(B) 0.24
(C) 0.31

(D) 0.34

(E) 0.37

19.50 (3, 5/01, Q.3 & 2009 Sample Q.104) (2.5 points) Glen is practicing his simulation skills.
He generates 1000 values of the random variable X as follows:
(i) He generates the observed value of λ from the gamma distribution with α = 2 and
θ = 1 (hence with mean 2 and variance 2).
(ii) He then generates x from the Poisson distribution with mean λ.
(iii) He repeats the process 999 more times: first generating a value λ, then
generating x from the Poisson distribution with mean λ.
(iv) The repetitions are mutually independent.
Calculate the expected number of times that his simulated value of X is 3.
(A) 75
(B) 100
(C) 125
(D) 150
(E) 175
19.51 (3, 5/01, Q.15 & 2009 Sample Q.105) (2.5 points) An actuary for an automobile insurance
company determines that the distribution of the annual number of claims for an insured chosen at
random is modeled by the negative binomial distribution with mean 0.2 and variance 0.4.
The number of claims for each individual insured has a Poisson distribution and the means of these
Poisson distributions are gamma distributed over the population of insureds.
Calculate the variance of this gamma distribution.
(A) 0.20
(B) 0.25
(C) 0.30
(D) 0.35
(E) 0.40

19.52 (3, 11/01, Q.27) (2.5 points) On his walk to work, Lucky Tom finds coins on the ground at a
Poisson rate. The Poisson rate, expressed in coins per minute, is constant during any one day, but
varies from day to day according to a gamma distribution with mean 2 and variance 4.
Calculate the probability that Lucky Tom finds exactly one coin during the sixth minute of todays
walk.
(A) 0.22
(B) 0.24
(C) 0.26
(D) 0.28
(E) 0.30
19.53 (2 points) In 3, 11/01, Q.27, calculate the probability that Lucky Tom finds exactly one coin
during the first two minutes of todays walk.
(A) 0.12
(B) 0.14
(C) 0.16
(D) 0.18
(E) 0.20
19.54 (3 points) In 3, 11/01, Q.27, let A = the number of coins that Lucky Tom finds during the first
minute of todays walk. Let B = the number of coins that Lucky Tom finds during the first minute of
tomorrows walk. Calculate Prob[A + B = 1].
(A) 0.09
(B) 0.11
(C) 0.13
(D) 0.15
(E) 0.17
19.55 (3 points) In 3, 11/01, Q.27, calculate the probability that Lucky Tom finds exactly one coin
during the third minute of todays walk and exactly one coin during the fifth minute of todays walk.
A. Less than 4.5%
B. At least 4.5%, but less than 5.0%
C. At least 5.0%, but less than 5.5%
D. At least 5.5%, but less than 6.0%
E. At least 6.0%
19.56 (3 points) In 3, 11/01, Q.27, calculate the probability that Lucky Tom finds exactly one coin
during the first minute of todays walk, exactly two coins during the second minute of todays walk,
and exactly three coins during the third minute of todays walk.
A. Less than 0.2%
B. At least 0.2%, but less than 0.3%
C. At least 0.3%, but less than 0.4%
D. At least 0.4%, but less than 0.5%
E. At least 0.5%
19.57 (2 points) In 3, 11/01, Q.27, calculate the probability that Lucky Tom finds exactly one coin
during the first minute of todays walk and exactly one coin during the fifth minute of tomorrows walk.
(A) 0.05
(B) 0.06
(C) 0.07
(D) 0.08
(E) 0.09
19.58 (2 points) In 3, 11/01, Q.27, calculate the probability that Lucky Tom finds exactly one coin
during the first three minutes of todays walk and exactly one coin during the first three minutes of
tomorrows walk.
(E) 0.025
(A) 0.005
(B) 0.010
(C) 0.015
(D) 0.020

19.59 (3, 11/02, Q.5 & 2009 Sample Q.90) (2.5 points) Actuaries have modeled auto
windshield claim frequencies. They have concluded that the number of windshield claims filed per
year per driver follows the Poisson distribution with parameter λ, where λ follows the gamma
distribution with mean 3 and variance 3.
Calculate the probability that a driver selected at random will file no more than 1 windshield claim
next year.
(A) 0.15
(B) 0.19
(C) 0.20
(D) 0.24
(E) 0.31
19.60 (CAS3, 11/03, Q.15) (2.5 points)
Two actuaries are simulating the number of automobile claims for a book of business.
For the population that they are studying:
i) The claim frequency for each individual driver has a Poisson distribution.
ii) The means of the Poisson distributions are distributed as a random variable, Λ.
iii) Λ has a gamma distribution.
In the first actuary's simulation, a driver is selected and one year's experience is generated. This
process of selecting a driver and simulating one year is repeated N times.
In the second actuary's simulation, a driver is selected and N years of experience are generated for
that driver.
Which of the following is/are true?
I. The ratio of the number of claims the first actuary simulates to the number of claims the
second actuary simulates should tend towards 1 as N tends to infinity.
II. The ratio of the number of claims the first actuary simulates to the number of claims the
second actuary simulates will equal 1, provided that the same uniform random
numbers are used.
III. When the variances of the two sequences of claim counts are compared, the first actuary's
sequence will have a smaller variance because more random numbers are used in
computing it.
A. I only
B. I and II only
C. I and III only
D. II and III only
E. None of I, II, or III is true
19.61 (CAS3, 5/05, Q.10) (2.5 points) Low Risk Insurance Company provides liability coverage
to a population of 1,000 private passenger automobile drivers.
The number of claims during a given year from this population is Poisson distributed.
If a driver is selected at random from this population, his expected number of claims per year is a
random variable with a Gamma distribution such that = 2 and = 1.
Calculate the probability that a driver selected at random will not have a claim during the year.
A. 11.1%
B. 13.5%
C. 25.0%
D. 33.3%
E. 50.0%
19.62 (2 points) In CAS3, 5/05, Q.10, what is the probability that at most 265 of these 1000
drivers will not have a claim during the year?
A. 75%
B. 78%
C. 81%
D. 84%
E. 87%
19.63 (2 points) In CAS3, 5/05, Q.10, what is the probability that these 1000 drivers will have a
total of more than 2020 claims during the year?
A. 31%
B. 33%
C. 35%
D. 37%
E. 39%
19.64 (4 points) In CAS3, 5/05, Q.10, let A be the number of these 1000 drivers that have one
claim during the year and B be the number of these 1000 drivers that have two claims during
the year. Determine the correlation of A and B.
A. -0.32
B. -0.30
C. -0.28
D. -0.26
E. -0.24

Solutions to Problems:
19.1. E. The Poisson parameters over three years are three times those on an annual basis.
Therefore they are given by a Gamma distribution with α = 3 and θ = 3/12 = 1/4.
(The mean frequency is now 3/4 per three years rather than 3/12 = 1/4 on an annual basis. It might
be helpful to recall that θ is the scale parameter for the Gamma Distribution.)
The mixed distribution is a Negative Binomial, with parameters r = α = 3 and β = θ = 1/4.
f(0) = 1/(1+β)^r = 1/1.25^3 = 0.512.
Comment: Over one year, the mixed distribution is Negative Binomial, with parameters r = α = 3
and β = θ = 1/12. Thus for a driver picked at random, the probability of no claims next year is:
1/(1 + 1/12)^3 = 0.7865. Then one might be tempted to think that the probability of no claims over
the next three years for a driver picked at random is: 0.7865^3 = 0.4865.
However, drivers with a low λ in one year are assumed to have the same low λ every year.
Such good drivers have a large chance of having 0 claims in 3 years.
Drivers with a high λ in one year are assumed to have the same high λ every year.
Such drivers have a smaller chance of having 0 claims in 3 years.
As discussed in Mahler's Guide to Conjugate Priors, a driver who has no claims the first year has a
posterior distribution of lambda that is Gamma, but with α = 3 + 0 = 3, and 1/θ = 12 + 1 = 13.
Therefore for a driver with no claims in year one, the mixed distribution in year two is Negative
Binomial with parameters r = α = 3 and β = θ = 1/13. Thus for a driver with no claims in year one, the
probability of no claims in year two is: 1/(1 + 1/13)^3 = 0.8007.
A driver who has no claims the first two years has a posterior distribution of lambda that is Gamma,
but with α = 3 + 0 = 3, and 1/θ = 12 + 2 = 14.
Therefore for a driver with no claims in the first two years, the mixed distribution in year three is
Negative Binomial with parameters r = α = 3 and β = θ = 1/14. Thus for a driver with no claims in the
first two years, the probability of no claims in year three is: 1/(1 + 1/14)^3 = 0.8130.
Prob[0 claims in three years] = (0.7865)(0.8007)(0.8130) = 0.512 ≠ 0.4865.
19.2. A. From the previous solution, f(1) = rβ/(1+β)^(r+1) = (3)(1/4)/1.25^4 = 0.3072.
19.3. C. The mean of the Negative Binomial is rβ = 0.6, while the variance is rβ(1+β) = 0.9.
Therefore, 1 + β = 0.9/0.6 = 1.5, and β = 0.5. Therefore r = 1.2.
For a Gamma-Poisson, α = r = 1.2 and θ = β = 0.5.
Therefore, the variance of the Gamma Distribution is: αθ^2 = (1.2)(0.5^2) = 0.3.
Comment: Similar to 3, 5/01, Q.15.
19.4. B. For the Gamma-Poisson, the variance of the mixed Negative Binomial is equal to:
mean of the Gamma + variance of the Gamma. Thus mean of Gamma + 0.3 = 0.7. Therefore,
mean of Gamma = 0.4 = αθ. Variance of Gamma = 0.3 = αθ^2. Therefore, θ = 0.3/0.4 = 3/4.
α = 0.4/θ = 0.5333. r = α = 0.5333 and β = θ = 3/4. r(1+β) = (0.5333)(7/4) = 0.933.
19.5. C. The conditional chance of 2 claims given λ is e^(-λ) λ^2 / 2.
The unconditional chance can be obtained by integrating the conditional chances versus the
distribution of λ:
f(2) = ∫ f(2 | λ) g(λ) dλ = ∫ {e^(-λ) λ^2 / 2} λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)} dλ
= {1 / (2 θ^α Γ(α))} ∫ λ^(α+1) e^(-λ(1 + 1/θ)) dλ = {1 / (2 θ^α Γ(α))} Γ(α+2) / (1 + 1/θ)^(α+2)
= α(α+1)θ^2 / {2(1+θ)^(α+2)}.
Comment: The mixed distribution is a Negative Binomial with r = α and β = θ.
f(x) = {(r+x-1)! / (x! (r-1)!)} β^x / (1+β)^(x+r). f(2) = α(α+1)θ^2 / {2(1+θ)^(α+2)}.
19.6. A. The conditional mean given λ is: λ.
The unconditional mean can be obtained by integrating the conditional means versus the distribution
of λ:
E[X] = ∫ E[X | λ] g(λ) dλ = ∫ λ λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)} dλ = {1 / (θ^α Γ(α))} ∫ λ^α e^(-λ/θ) dλ
= {1 / (θ^α Γ(α))} Γ(α+1) θ^(α+1) = αθ.
Alternately, E[X] = ∫ E[X | λ] g(λ) dλ = ∫ λ g(λ) dλ = Mean of the Gamma Distribution = αθ.
19.7. B. The conditional mean given λ is: λ. The conditional variance given λ is: λ. Thus the
conditional second moment given λ is: λ + λ^2. The unconditional second moment can be obtained
by integrating the conditional second moments versus the distribution of λ:
E[X^2] = ∫ E[X^2 | λ] g(λ) dλ = ∫ {λ + λ^2} λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)} dλ
= {1 / (θ^α Γ(α))} ∫ λ^α e^(-λ/θ) dλ + {1 / (θ^α Γ(α))} ∫ λ^(α+1) e^(-λ/θ) dλ
= {1 / (θ^α Γ(α))} Γ(α+1) θ^(α+1) + {1 / (θ^α Γ(α))} Γ(α+2) θ^(α+2) = αθ + α(α+1)θ^2.
Since the mean is αθ, the variance is: αθ + α(α+1)θ^2 - α^2θ^2 = αθ + αθ^2.
Comment: Note that one integrates the conditional second moments in order to obtain the
unconditional second moment. If instead one integrated the conditional variances one would obtain
the Expected Value of the Process Variance (in this case αθ), which is only one piece of the total
unconditional variance. One would need to also add the Variance of the Hypothetical Means (which
in this case is αθ^2), in order to obtain the total variance of αθ + αθ^2.
The mixed distribution is a Negative Binomial with r = α and β = θ.
It has variance: rβ(1+β) = αθ + αθ^2.
19.8. D. For the Gamma, mean = αθ = 0.2, and variance = αθ^2 = 0.016.
Thus θ = 0.016/0.2 = 0.08 and α = 2.5.
This is a Gamma-Poisson, with mixed distribution a Negative Binomial, with r = α = 2.5 and
β = θ = 0.08. f(1) = rβ/(1+β)^(1+r) = (2.5)(0.08)/(1+0.08)^3.5 = 0.153.
Comment: Similar to 3, 11/01, Q.27.
19.9. E. Over 10 minutes, the rate of loss is Poisson, with 10 times that for one minute.
λ has a Gamma distribution with α = 2.5 and θ = 0.08.
10λ has a Gamma distribution with α = 2.5, and θ = (10)(0.08) = 0.8.
The mixed distribution is a Negative Binomial, with r = α = 2.5 and β = θ = 0.8.
f(2) = {r(r+1)/2} β^2/(1+β)^(2+r) = {(2.5)(3.5)/2} 0.8^2/(1+0.8)^4.5 = 0.199.
19.10. B. Mean value of a coin is: (50%)(5) + (30%)(10) + (20%)(25) = 10.5.
Second moment of the value of a coin is: (50%)(5^2) + (30%)(10^2) + (20%)(25^2) = 167.5.
Over 60 minutes, the rate of loss is Poisson, with 60 times that for one minute.
λ has a Gamma distribution with α = 2.5 and θ = 0.08.
60λ has a Gamma distribution with α = 2.5 and θ = (60)(0.08) = 4.8.
The mixed distribution is a Negative Binomial, with r = α = 2.5 and β = θ = 4.8.
Therefore, the mean number of coins is: rβ = (2.5)(4.8) = 12,
and the variance of the number of coins is: rβ(1+β) = (2.5)(4.8)(5.8) = 69.6.
The mean worth is: (10.5)(12) = 126.
Variance of worth is: (12)(167.5 - 10.5^2) + (10.5^2)(69.6) = 8360.4.
Prob[worth > 300] ≈ 1 - Φ[(300.5 - 126)/√8360.4] = 1 - Φ[1.91] = 2.81%.
Klem loses money in units of 5 cents or more.
Therefore, if he loses more than 300, he loses 305 or more.
Thus it might be better to approximate the probability as:
1 - Φ[(304.5 - 126)/√8360.4] = 1 - Φ[1.95] = 2.56%.
Along this same line of thinking, one could instead approximate the probability by taking the
probability from 302.5 to infinity: 1 - Φ[(302.5 - 126)/√8360.4] = 1 - Φ[1.93] = 2.68%.
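A sketch of this calculation in Python (added here as a check; standard library only):

```python
from math import erf, sqrt

def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Coins lost in 60 minutes: Negative Binomial with r = 2.5, beta = 4.8.
r, beta = 2.5, 4.8
mean_n, var_n = r * beta, r * beta * (1.0 + beta)          # 12 and 69.6

# Coin values: 5, 10, 25 with probabilities 0.5, 0.3, 0.2.
vals, probs = (5, 10, 25), (0.5, 0.3, 0.2)
mean_x = sum(v * p for v, p in zip(vals, probs))           # 10.5
second_x = sum(v * v * p for v, p in zip(vals, probs))     # 167.5
var_x = second_x - mean_x ** 2

mean_w = mean_n * mean_x                                   # 126
var_w = mean_n * var_x + var_n * mean_x ** 2               # 8360.4

# Normal approximation with continuity correction at 300.5.
print(1.0 - norm_cdf((300.5 - mean_w) / sqrt(var_w)))      # about 0.028
```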
19.11. E. From the previous solution, for a day chosen at random, the worth has mean 126 and
variance 8360.4. The worth over five days is the sum of 5 independent variables; the sum of 5
days has mean: (5)(126) = 630 and variance: (5)(8360.4) = 41,802.
Prob[worth > 900] ≈ 1 - Φ[(900.5 - 630)/√41,802] = 1 - Φ[1.32] = 9.34%.
Klem loses money in units of 5 cents or more.
Therefore, if he loses more than 900, he loses 905 or more.
It might be better to approximate the probability as:
Prob[worth > 900] = Prob[worth ≥ 905] ≈ 1 - Φ[(904.5 - 630)/√41,802] = 1 - Φ[1.34] = 9.01%.
One might have instead approximated as: 1 - Φ[(902.5 - 630)/√41,802] = 1 - Φ[1.33] = 9.18%.
19.12. A. 50% of the coins are worth 5, so if the overall process is Poisson with mean λ, then
losing coins of worth 5 is Poisson with mean 0.5λ.
Over 10 minutes it is Poisson with mean 5λ.
λ has a Gamma distribution with α = 2.5 and θ = 0.08.
5λ has a Gamma distribution with α = 2.5 and θ = (5)(0.08) = 0.4.
The mixed distribution is a Negative Binomial, with r = α = 2.5 and β = θ = 0.4.
f(1) = rβ/(1 + β)^(r+1) = (2.5)(0.4)/(1.4^3.5) = 30.8%.
19.13. D. Losing coins of worth 5, 10, and 25 are three independent Poisson Processes.
Over 10 minutes losing coins of worth 5 is Poisson with mean 5λ.
Over 10 minutes losing coins of worth 10 is Poisson with mean 3λ.
Over 10 minutes losing coins of worth 25 is Poisson with mean 2λ.
Prob[1 coin @ 5] Prob[0 coins @ 10] Prob[0 coins @ 25] = 5λe^(-5λ) e^(-3λ) e^(-2λ) = 5λe^(-10λ).
λ has a Gamma distribution with α = 2.5 and θ = 0.08. 1/θ = 12.5.
f(λ) = 12.5^2.5 λ^1.5 e^(-12.5λ) / Γ(2.5).
∫ 5λe^(-10λ) f(λ) dλ = ∫ 5λe^(-10λ) 12.5^2.5 λ^1.5 e^(-12.5λ) / Γ(2.5) dλ
= {5 (12.5^2.5) / Γ(2.5)} ∫ λ^2.5 e^(-22.5λ) dλ = {5 (12.5^2.5) / Γ(2.5)} {Γ(3.5) / 22.5^3.5}
= (5)(2.5)(12.5^2.5) / 22.5^3.5 = 12.8%.
Comment: While given lambda the three Poisson Processes are independent, the mixed Negative
Binomials are not independent, since each day we use the same lambda (appropriately thinned) for
each denomination of coin. From the previous solution, the probability of one coin worth 5 is
30.8%. The distribution of coins worth ten is Negative Binomial with r = 2.5 and β = (3)(0.08) = 0.24.
Therefore, the chance of seeing no coins worth 10 is: 1/1.24^2.5 = 58.4%. The distribution of coins
worth 25 is Negative Binomial with r = 2.5 and β = (2)(0.08) = 0.16. Therefore, the chance of seeing
no coins worth 25 is: 1/1.16^2.5 = 69.0%.
However, (30.8%)(58.4%)(69.0%) = 12.4% ≠ 12.8%, the correct solution.
One cannot multiply the three probabilities together, because the three events are not
independent. The three probabilities each depend on the same lambda value for the given day.
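The mixing integral can also be checked numerically; a sketch (added, assuming scipy is available):

```python
from math import exp, gamma
from scipy.integrate import quad

alpha, theta = 2.5, 0.08  # Gamma mixing distribution of the per-minute rate

def gamma_pdf(lam):
    return lam ** (alpha - 1.0) * exp(-lam / theta) / (theta ** alpha * gamma(alpha))

# Given lambda, over 10 minutes the counts of coins worth 5, 10, and 25 are
# independent Poissons with means 5*lambda, 3*lambda, and 2*lambda.
def integrand(lam):
    return (5.0 * lam * exp(-5.0 * lam)) * exp(-3.0 * lam) * exp(-2.0 * lam) * gamma_pdf(lam)

prob, _ = quad(integrand, 0.0, 50.0)
print(prob)  # about 0.128
```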

19.14. E. A is Poisson with mean λ_A, where λ_A is a random draw from a Gamma Distribution with
α = 2.5 and θ = 0.08. B is Poisson with mean λ_B, where λ_B is a random draw from a Gamma
Distribution with α = 2.5 and θ = 0.08. Since A and B are from walks on different days, λ_A and λ_B
are independent random draws from the same Gamma.
Thus λ_A + λ_B is from a Gamma Distribution with α = 2.5 + 2.5 = 5 and θ = 0.08.
Thus A + B is from a Negative Binomial Distribution with r = 5 and β = 0.08.
The density at 3 of this Negative Binomial Distribution is: {(5)(6)(7)/3!} 0.08^3/1.08^8 = 0.97%.
Alternately, A and B are independent Negative Binomials each with r = 2.5 and β = 0.08.
Thus A + B is a Negative Binomial Distribution with r = 5 and β = 0.08. Proceed as before.
Alternately, for A and B the densities for each are:
f(0) = 1/(1+β)^r = 1/1.08^2.5 = 0.825, f(1) = rβ/(1+β)^(1+r) = (2.5)(0.08)/1.08^3.5 = 0.153,
f(2) = {r(r+1)/2} β^2/(1+β)^(2+r) = {(2.5)(3.5)/2} 0.08^2/1.08^4.5 = 0.0198,
f(3) = {r(r+1)(r+2)/3!} β^3/(1+β)^(3+r) = {(2.5)(3.5)(4.5)/6} 0.08^3/1.08^5.5 = 0.00220.
Prob[A + B = 3] =
Prob[A=0]Prob[B=3] + Prob[A=1]Prob[B=2] + Prob[A=2]Prob[B=1] + Prob[A=3]Prob[B=0] =
(0.825)(0.00220) + (0.153)(0.0198) + (0.0198)(0.153) + (0.00220)(0.825) = 0.97%.
Comment: For two independent Gamma Distributions with the same θ:
Gamma(α1, θ) + Gamma(α2, θ) = Gamma(α1 + α2, θ).
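A sketch verifying the two routes to 0.97% (added; it assumes scipy's nbinom, which accepts non-integer r):

```python
from scipy.stats import nbinom

beta = 0.08
p = 1.0 / (1.0 + beta)

# Direct route: A + B is Negative Binomial with r = 5, beta = 0.08.
print(nbinom.pmf(3, 5.0, p))  # about 0.0097

# Convolution of two independent Negative Binomials, each with r = 2.5.
print(sum(nbinom.pmf(k, 2.5, p) * nbinom.pmf(3 - k, 2.5, p) for k in range(4)))
```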
19.15. B. λ_A + λ_B + λ_C is from a Gamma Distribution with α = (3)(2.5) = 7.5 and θ = 0.08.
Thus A + B + C is from a Negative Binomial Distribution with r = 7.5 and β = 0.08.
The density at 2 of this Negative Binomial Distribution is: {(7.5)(8.5)/2!} 0.08^2/1.08^9.5 = 9.8%.
19.16. A. Mixing a Poisson via a Gamma leads to a negative binomial overall frequency
distribution. The negative binomial has parameters r = α = 4 and β = θ = 1/9.
f(x) = {r(r+1)...(r+x-1)/x!} β^x / (1+β)^(x+r) = {(4)(5)...(x+3)/x!} (1/9)^x / (10/9)^(x+4)
= {(x+3)! / (x! 3!)} 0.9^4 0.1^x.
19.17. D. For the Gamma-Poisson, θ = β and r = α. Therefore, the variance of the Gamma
= αθ^2 = rβ^2. Total Variance = Variance of the mixed Negative Binomial = rβ(1+β).
Thus for the Gamma-Poisson we have: (Variance of the Gamma)/(Variance of the Negative Binomial)
= β/(1+β) = 1/{1 + 1/β}. Thus in this case 1/{1 + 1/β} = 0.25. β = 1/3.
19.18. D. The parameters of the Gamma can be gotten from those of the Negative Binomial:
α = r = 4, θ = β = 3/17. Then the Variance of the Gamma = αθ^2 = (4)(3/17)^2 = 0.125.
Alternately, the variance of the Gamma is the Variance of the Hypothetical Means =
Total Variance - Expected Value of the Process Variance =
Variance of the Negative Binomial - Mean of the Gamma =
Variance of the Negative Binomial - Mean of the Negative Binomial =
rβ(1+β) - rβ = rβ^2 = (4)(3/17)^2 = 0.125.
19.19. D. This is a Gamma-Poisson with α = 2 and θ = 1.4. The mixed distribution is Negative
Binomial with r = α = 2, and β = θ = 1.4.
For a Negative Binomial Distribution, f(6) = {(r)(r+1)(r+2)(r+3)(r+4)(r+5)/6!} β^6/(1+β)^(r+6) =
{(2)(3)(4)(5)(6)(7)/720}(1.4^6)/(2.4^8) = 0.04788.
Thus we expect (100,000)(0.04788) = 4788 out of 100,000 simulated values to be 6.
Comment: Similar to 3, 5/01, Q.3. One need know nothing about simulation in order to answer
these questions.
19.20. E. Each year is a random draw from a different Poisson with unknown λ.
The simulated set consists of random draws each from different Poisson Distributions.
Thus each simulated set is a mixed distribution for a Gamma-Poisson, a Negative Binomial
Distribution with r = α = 2, and β = θ = 1.4.
E[V] = variance of this Negative Binomial = (2)(1.4)(1 + 1.4) = 6.72.
Alternately, the Expected Value of the Process Variance is:
E[Var[X | λ]] = E[λ] = αθ = (2)(1.4) = 2.8.
The Variance of the Hypothetical Means is: Var[E[X | λ]] = Var[λ] = αθ^2 = (2)(1.4^2) = 3.92.
Total Variance is: EPV + VHM = 2.8 + 3.92 = 6.72.
Comment: Difficult! In other words, Var[X] = E[Var[X|Y]] + Var[E[X|Y]].
19.21. D. This is a Gamma-Poisson with α = 2 and θ = 1.4. The mixed distribution is Negative
Binomial with r = α = 2, and β = θ = 1.4.
For this Negative Binomial Distribution, 100,000 f(6) = 4788.
19.22. B. Each year is a random draw from the same Poisson with unknown λ.
The simulated set is from this Poisson Distribution with mean λ. V ≈ λ.
E[V] = E[λ] = mean of the Gamma = αθ = (2)(1.4) = 2.8.
Comment: Difficult! What Tom did was simulate one year each from 100,000 randomly selected
insureds. What Dick did was pick a random insured and simulate 100,000 years for that insured; each
year is an independent random draw from the same Poisson distribution with unknown λ. The two
situations are different, even though they have the same mean. In Dick's case there is no variance
associated with the selection of the parameter lambda; the only variance is associated with the
variance of the Poisson Distribution. In Tom's case there is variance associated with the selection of
the parameter lambda as well as variance associated with the variance of the Poisson Distribution.
19.23. D. This is a Gamma-Poisson with α = 2 and θ = 1.4.
The mixed distribution is Negative Binomial with r = α = 2, and β = θ = 1.4.
For this Negative Binomial Distribution, 100,000 f(6) = 4788.
19.24. A. Since all 100,000 values in the simulated set are the same, V = 0. E[V] = 0.
Comment: Contrast Tom, Dick, and Harry's simulations. Even though they all have the same mean,
they are simulating somewhat different situations.
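A simulation sketch of the three schemes (an added illustration; numpy assumed) makes the contrast concrete:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, theta, n = 2.0, 1.4, 100_000

# Tom: a new lambda for every simulated value (mixed Negative Binomial).
tom = rng.poisson(rng.gamma(alpha, theta, size=n))

# Dick: one lambda, then n independent Poisson values with that mean.
dick = rng.poisson(rng.gamma(alpha, theta), size=n)

# Harry: one lambda, one Poisson value, copied n times.
harry = np.full(n, rng.poisson(rng.gamma(alpha, theta)))

print(tom.var())    # near 6.72
print(dick.var())   # near the lambda that was drawn; averages 2.8 over many runs
print(harry.var())  # exactly 0
```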
19.25. B. The number of vehicles is Negative Binomial with r = α = 40 and β = θ = 10.
It has variance: rβ(1 + β) = (40)(10)(11) = 4400.
19.26. D. This is the sum of 7 independent variables, each with variance 4400.
(7)(4400) = 30,800.
Comment: Although λ is constant on any given day, it varies from day to day. A day picked at
random is a Negative Binomial with r = 40 and β = 10. The sum of seven independent Negative
Binomials is a Negative Binomial with r = (7)(40) = 280 and β = 10.
This has variance: (280)(10)(11) = 30,800.
If λ instead had been the same for a whole week, the answer would have changed.
In that case, one would get a Negative Binomial with r = 40 and β = (7)(10) = 70, with variance:
(40)(70)(71) = 198,800.
19.27. E. The mean number of people per vehicle is: 1 + (1.6)(6) = 10.6.
The variance of the people per vehicle is: (1.6)(6)(1 + 6) = 67.2.
Variance of the number of people is: (400)(67.2) + (10.6^2)(4400) = 521,264.
19.28. E. This is the sum of 7 independent variables. (7)(521,264) = 3,648,848.


19.29. A. The number of people has mean: (400)(10.6) = 4240, and variance: 521,264.
The LogNormal has mean: exp[5 + 0.8^2/2] = 204.38, second moment:
exp[(2)(5) + (2)(0.8^2)] = 79,221, and variance: 79,221 - 204.38^2 = 37,450.
Variance of the money spent: (4240)(37,450) + (204.38^2)(521,264) = 21,933 million.
√(21,933 million) = 148,098.
19.30. D. This is the sum of 7 independent variables, with variance:
(7)(21,933 million) = 153,531 million.
√(153,531 million) = 391,830.
19.31. C. The mean amount spent per day is: (4240)(204.38) = 866,571.
Over 7 days the mean amount spent is: (7)(866,571) = 6,065,997, with variance 153,531 million.
Prob[amount spent < 5 million] ≈ Φ[(5 million - 6.066 million)/√(153,531 million)] = Φ(-2.72) = 0.33%.
So we expect: (1000)(0.33%) = 3 such runs.
19.32. B. For the Gamma, mean = αθ = 0.08, and variance = αθ^2 = 0.0032.
Thus θ = 0.04 and α = 2.
This is a Gamma-Poisson, with mixed distribution a Negative Binomial,
with r = α = 2 and β = θ = 0.04.
f(1) = rβ / (1 + β)^(r+1) = (2)(0.04) / (1 + 0.04)^3 = 7.11%.
Comment: The fact that it is the next year rather than some other year is irrelevant.
19.33. C. For one year, each insured's mean is λ, and λ is distributed via a Gamma with:
θ = 0.04 and α = 2.
Over three years, each insured's mean is 3λ, and 3λ is distributed via a Gamma with:
θ = (3)(0.04) = 0.12, and α = 2.
This is a Gamma-Poisson, with mixed distribution a Negative Binomial,
with r = α = 2 and β = θ = 0.12.
f(2) = {r(r+1)/2} β^2 / (1 + β)^(r+2) = {(2)(3)/2} 0.12^2 / (1 + 0.12)^4 = 2.75%.
19.34. B. For one year, each insured's mean is λ, and λ is distributed via a Gamma with:
θ = 0.04 and α = 2.
This is a Gamma-Poisson, with mixed distribution a Negative Binomial,
with r = α = 2 and β = θ = 0.04.
We add up three individual independent drivers and we get a Negative Binomial with:
r = (3)(2) = 6, and β = 0.04.
f(2) = {r(r+1)/2} β^2 / (1 + β)^(r+2) = {(6)(7)/2} 0.04^2 / (1 + 0.04)^8 = 2.46%.
Comment: The Negative Binomial Distributions here and in the previous solution have the same
mean, however the densities are not the same. [Graph omitted: ratios of the densities of the
Negative Binomial in the previous solution to those of the Negative Binomial here.]
19.35. E. For one year, each insured's mean is λ, and λ is distributed via a Gamma with:
θ = 0.04 and α = 2.
Over four years, each insured's mean is 4λ, and 4λ is distributed via a Gamma with:
θ = (4)(0.04) = 0.16, and α = 2.
This is a Gamma-Poisson, with mixed distribution a Negative Binomial,
with r = α = 2 and β = θ = 0.16.
We add up three individual independent drivers and we get a Negative Binomial with:
r = (3)(2) = 6, and β = 0.16.
f(3) = {r(r+1)(r+2)/6} β^3 / (1 + β)^(r+3) = {(6)(7)(8)/6} 0.16^3 / (1 + 0.16)^9 = 6.03%.
19.36. A. The number of accidents Moe has over one year is Negative Binomial,
with r = α = 2 and β = θ = 0.04.
f(0) = 1/(1 + β)^r = 1/(1 + 0.04)^2 = 0.9246.
The number of accidents Larry has over two years is Negative Binomial,
with r = α = 2 and β = 2θ = 0.08.
f(1) = rβ/(1 + β)^(r+1) = (2)(0.08)/(1 + 0.08)^3 = 0.1270.
The number of accidents Curly has over three years is Negative Binomial,
with r = α = 2 and β = 3θ = 0.12.
f(2) = {r(r+1)/2} β^2/(1 + β)^(r+2) = {(2)(3)/2} 0.12^2/(1 + 0.12)^4 = 0.0275.
Prob[Moe = 0, Larry = 1, and Curly = 2] = (0.9246)(0.1270)(0.0275) = 0.32%.
19.37. D. The number of accidents Moe has over one year is Negative Binomial,
with r = α = 2 and β = θ = 0.04:
f(0) = 0.9246. f(1) = 0.0711. f(2) = 0.0041. f(3) = 0.0002.
The number of accidents Larry has over two years is Negative Binomial,
with r = α = 2 and β = 2θ = 0.08:
f(0) = 0.8573. f(1) = 0.1270. f(2) = 0.0141. f(3) = 0.0014.
The number of accidents Curly has over three years is Negative Binomial,
with r = α = 2 and β = 3θ = 0.12:
f(0) = 0.7972. f(1) = 0.1708. f(2) = 0.0275. f(3) = 0.0039.
We need to list all of the possibilities:
Prob[M = 0, L = 0, C = 3] + Prob[M = 0, L = 1, C = 2] + Prob[M = 0, L = 2, C = 1] +
Prob[M = 0, L = 3, C = 0] + Prob[M = 1, L = 0, C = 2] + Prob[M = 1, L = 1, C = 1] +
Prob[M = 1, L = 2, C = 0] + Prob[M = 2, L = 0, C = 1] + Prob[M = 2, L = 1, C = 0] +
Prob[M = 3, L = 0, C = 0] =
(0.9246) {(0.8573)(0.0039) + (0.1270)(0.0275) + (0.0141)(0.1708) + (0.0014)(0.7972)} +
(0.0711) {(0.8573)(0.0275) + (0.1270)(0.1708) + (0.0141)(0.7972)} +
(0.0041) {(0.8573)(0.1708) + (0.1270)(0.7972)} + (0.0002)(0.8573)(0.7972) = 1.475%.
Comment: Adding up the three independent drivers, M + L + C does not follow a Negative
Binomial, since the betas are not the same.
Note that the solution to the previous question is one of the possibilities here.
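This enumeration is easy to check by brute force; a sketch (added, scipy assumed):

```python
from scipy.stats import nbinom

def nb_pmf(k, r, beta):
    return nbinom.pmf(k, r, 1.0 / (1.0 + beta))

total = sum(
    nb_pmf(m, 2, 0.04) * nb_pmf(l, 2, 0.08) * nb_pmf(c, 2, 0.12)
    for m in range(4) for l in range(4) for c in range(4)
    if m + l + c == 3
)
print(total)  # about 0.0148
```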
19.38. D, 19.39. B, & 19.40. D.
The mixed distribution is a Negative Binomial with r = = 4 and = = .1.
f(0) = (1+)-r = 1.1-4 = .6830. Expected size of group A: 6830.
f(1) = r(1+)-(r+1) = (4)(.1)1.1-5 = .2484. Expected size of group B: 2484.
Expected size of group C: 10000 - (6830 + 2484) = 686.
19.41. E. For an individual insured, the probability of no claims by time t is the density at zero of a
Poisson Distribution with mean λt: exp[-λt].
In other words, the probability the first claim occurs by time t is: 1 - exp[-λt].
This is an Exponential Distribution with mean 1/λ.
Thus, for an individual the average wait until the first claim is 1/λ.
(This is a general result for Poisson Processes.)
For a Gamma Distribution, E[X^(-1)] = θ^(-1) Γ(α - 1) / Γ(α) = 1/{θ(α-1)}, α > 1.
Lambda is Gamma Distributed, thus E[1/λ] = 1/{θ(α-1)} = 1/{(0.02)(3 - 1)} = 25.
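The value of E[1/λ] can be confirmed either from the closed form or by simulation; a sketch (added; numpy assumed):

```python
import numpy as np

alpha, theta = 3.0, 0.02
exact = 1.0 / (theta * (alpha - 1.0))     # 25: E[1/lambda] for a Gamma with alpha > 1

rng = np.random.default_rng(2)
lam = rng.gamma(alpha, theta, size=1_000_000)
print(exact, (1.0 / lam).mean())          # both close to 25
```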

19.42. C. There is an 80% chance we get a random draw from the Poisson with mean λ.
In which case, we have a Gamma-Poisson with α = 1 and θ = 0.1.
The mixed distribution is Negative Binomial with r = 1 and β = 0.1.
f(2) = 0.1^2 / 1.1^3 = 0.751%.
There is a 20% chance we get a random draw from the Poisson with mean 3λ.
3λ follows an Exponential with mean 0.3.
We have a Gamma-Poisson with α = 1 and θ = 0.3.
The mixed distribution is Negative Binomial with r = 1 and β = 0.3.
f(2) = 0.3^2 / 1.3^3 = 4.096%.
Thus the overall probability of two claims is: (0.8)(0.751%) + (0.2)(4.096%) = 1.420%.
19.43. B. This is a Gamma-Poisson with α = 2 and θ = 1/5.
Thus the mixed distribution is Negative Binomial with r = 2 and β = 1/5.
For this Negative Binomial: f(0) = 1/(1 + 1/5)^2 = 25/36. f(1) = (2)(1/5)/(1 + 1/5)^3 = 25/108.
Probability of at least 2 claims is: 1 - 25/36 - 25/108 = 8/108 = 2/27 = 7.41%.
19.44. A. The mixed distribution is Negative Binomial with r = 2 and β = 0.5.
Thinning, small claims are Negative Binomial with r = 2 and β = (60%)(0.5) = 0.3.
Variance of the number of small claims is: (2)(0.3)(1.3) = 0.78.
Alternately, for each insured, the number of small claims is Poisson with mean: 0.6λ.
0.6λ follows a Gamma Distribution with α = 2 and θ = (0.6)(0.5) = 0.3.
Thus the mixed distribution for small claims is Negative Binomial with r = 2 and β = 0.3.
Variance of the number of small claims is: (2)(0.3)(1.3) = 0.78.
19.45. E. The mixed distribution is Negative Binomial with r = 2 and β = 0.5.
Thinning, large claims are Negative Binomial with r = 2 and β = (40%)(0.5) = 0.2.
Variance of the number of large claims is: (2)(0.2)(1.2) = 0.48.
Alternately, for each insured, the number of large claims is Poisson with mean: 0.4λ.
0.4λ follows a Gamma Distribution with α = 2 and θ = (0.4)(0.5) = 0.2.
Thus the mixed distribution for large claims is Negative Binomial with r = 2 and β = 0.2.
Variance of the number of large claims is: (2)(0.2)(1.2) = 0.48.
Comment: The number of small and large claims is positively correlated.
The distribution of claims of all sizes is Negative Binomial with r = 2 and β = 0.5;
it has a variance of: (2)(0.5)(1.5) = 1.5 > 1.26 = 0.78 + 0.48.
19.46. A. For the Gamma-Poisson, the mixed distribution is a Negative Binomial with mean rβ and
variance rβ(1+β). Thus we have rβ = 0.1 and 0.15/0.1 = 1+β. Thus β = 0.5, and r = 0.1/0.5 = 0.2.
The parameters of the Gamma follow from those of the Negative Binomial: α = r = 0.2 and
θ = β = 0.5. The variance of the Gamma is αθ^2 = 0.05.
Alternately, the total variance is 0.15.
For the Gamma-Poisson, the variance of the mixed Negative Binomial is equal to: mean of the
Gamma + variance of the Gamma.
Therefore, the variance of the Gamma = 0.15 - 0.10 = 0.05.
19.47. B. ∫ e^(-θ) f(θ) dθ = ∫ e^(-θ) 36θe^(-6θ) dθ = 36 ∫ θe^(-7θ) dθ = (36){Γ(2)/7^2}
= (36)(1/49) = 0.735.
Alternately, assume that the frequency for a single insured is given by a Poisson with a mean of θ.
(This is consistent with the given information that the chance of 0 claims is e^(-θ).) In that case one
would have a Gamma-Poisson process and the mixed distribution is a Negative Binomial. The given
Gamma distribution of θ has shape parameter 2 and scale parameter 1/6. The mixed Negative
Binomial has r = 2 and β = 1/6, and f(0) = (1+β)^(-r) = (1 + 1/6)^(-2) = 36/49.
Comment: Note that while the situation described is consistent with a Gamma-Poisson, it need not
be a Gamma-Poisson.
19.48. A. One can solve for the parameters of the Gamma: αθ = 0.1, and αθ^2 = 0.01, therefore
θ = 0.1 and α = 1. The mixed distribution is a Negative Binomial with parameters r = α = 1 and
β = θ = 0.1, a Geometric Distribution. f(0) = 1/(1+β) = 1/1.1 = 10/11. f(1) = β/(1+β)^2 = 0.1/1.1^2 = 10/121.
The chance of 2 or more accidents is: 1 - f(0) - f(1) = 1 - 10/11 - 10/121 = 1/121.
19.49. A. Mean of Gamma = αθ = 1 and variance of Gamma = αθ^2 = 2.
Therefore, θ = 2 and α = 1/2.
The mixed distribution is a Negative Binomial with r = α = 1/2 and β = θ = 2.
f(1) = rβ/(1+β)^(1+r) = (1/2)(2)/(3^(3/2)) = 0.192.
Alternately, f(1) = ∫ f(1 | λ) g(λ) dλ = ∫ λe^(-λ) {λ^(-1/2) e^(-λ/2) / (2^(1/2) Γ(1/2))} dλ
= {1 / (2^(1/2) Γ(1/2))} ∫ λ^(1/2) e^(-3λ/2) dλ = {1 / (2^(1/2) Γ(1/2))} Γ(3/2) (2/3)^(3/2)
= (1/2)(2/3)^(3/2) / 2^(1/2) = 3^(-3/2) = 0.192.
19.50. C. This is a Gamma-Poisson with α = 2 and θ = 1.
The mixed distribution is Negative Binomial with r = α = 2, and β = θ = 1.
For a Negative Binomial Distribution,
f(3) = {(r)(r+1)(r+2)/3!} β^3/(1+β)^(r+3) = {(2)(3)(4)/6}(1^3)/(2^5) = 1/8.
Thus we expect (1000)(1/8) = 125 out of 1000 simulated values to be 3.
19.51. A. The mean of the Negative Binomial is rβ = 0.2, while the variance is rβ(1+β) = 0.4.
Therefore, 1 + β = 2, so β = 1 and r = 0.2. For a Gamma-Poisson, α = r = 0.2 and θ = β = 1.
Therefore, the variance of the Gamma Distribution is: αθ^2 = (0.2)(1^2) = 0.2.
Alternately, for the Gamma-Poisson, the variance of the mixed Negative Binomial is equal to:
mean of the Gamma + variance of the Gamma. Variance of the Gamma =
Variance of the Negative Binomial - Mean of the Gamma =
Variance of the Negative Binomial - Overall Mean =
Variance of the Negative Binomial - Mean of the Negative Binomial = 0.4 - 0.2 = 0.2.
19.52. A. For the Gamma, mean = αθ = 2, and variance = αθ^2 = 4. Thus θ = 2 and α = 1.
This is a Gamma-Poisson, with mixed distribution a Negative Binomial, with r = α = 1 and β = θ = 2.
This is a Geometric with f(1) = β/(1+β)^2 = 2/(1+2)^2 = 2/9 = 0.222.
Alternately, λ is distributed via an Exponential with mean 2, f(λ) = e^(-λ/2)/2.
Prob[1 claim] = ∫ Prob[1 claim | λ] f(λ) dλ = ∫ λe^(-λ) e^(-λ/2)/2 dλ = (1/2) ∫ λe^(-3λ/2) dλ
= (1/2) (2/3)^2 Γ(2) = (1/2)(4/9)(1!) = 2/9 = 0.222.
Alternately, for the Gamma-Poisson, the variance of the mixed Negative Binomial = total variance =
E[Var[N | λ]] + Var[E[N | λ]] = E[λ] + Var[λ] = mean of the Gamma + variance of the Gamma = 2 + 4 = 6.
The mean of the mixed Negative Binomial = overall mean = E[λ] = mean of the Gamma = 2.
Therefore, rβ = 2 and rβ(1+β) = 6. r = 1 and β = 2.
f(1) = β/(1+β)^2 = 2/(1+2)^2 = 2/9 = 0.222.
Comment: The fact that it is the sixth rather than some other minute is irrelevant.
19.53. C. Over two minutes (on the same day) we have a Poisson with mean 2λ.
λ ~ Gamma(α, θ) = Gamma(1, 2).
2λ ~ Gamma(α, 2θ) = Gamma(1, 4), as per inflation.
The mixed distribution is Negative Binomial, with r = α = 1 and β = θ = 4.
f(1) = β/(1 + β)^2 = 4/(1 + 4)^2 = 16%.
Comment: If one multiplies a Gamma variable by a constant, one gets another Gamma with the
same alpha and with the new theta equal to that constant times the original theta.
19.54. D. A ~ Negative Binomial with r = 1 and β = 2.
B ~ Negative Binomial with r = 1 and β = 2.
A + B ~ Negative Binomial with r = 2 and β = 2.
f(1) = rβ / (1 + β)^(1+r) = (2)(2) / (1 + 2)^3 = 14.8%.
Alternately, the numbers of coins found in the two minutes are independent Poissons with means λ1
and λ2. The total number found is Poisson with mean λ1 + λ2.
λ1 + λ2 ~ Gamma(2α, θ) = Gamma(2, 2).
The mixed Negative Binomial has r = 2 and β = 2. Proceed as before.
Alternately, P[A + B = 1] = P[A = 1]P[B = 0] + P[A = 0]P[B = 1] = (2/9)(1/3) + (1/3)(2/9) = 14.8%.
Comment: The sum of two independent Gamma variables with the same theta is another Gamma
with the same theta and with the new alpha equal to the sum of the alphas.
19.55. E. Prob[1 coin during minute 3 | λ] = λe^(-λ). Prob[1 coin during minute 5 | λ] = λe^(-λ).
The Gamma has α = 1 and θ = 2, an Exponential. π(λ) = e^(-λ/2)/2.
Prob[1 coin during minute 3 and 1 coin during minute 5] =
∫ Prob[1 coin during minute 3 | λ] Prob[1 coin during minute 5 | λ] π(λ) dλ =
∫ (λe^(-λ)) (λe^(-λ)) (e^(-λ/2)/2) dλ = (1/2) ∫ λ^2 e^(-2.5λ) dλ = (1/2) Γ(3) (1/2.5)^3
= (1/2)(2/2.5^3) = 6.4%.
Comment: It is true that Prob[1 coin during minute 3] = Prob[1 coin during minute 5] = 2/9.
(2/9)(2/9) = 4.94%. However, since the two probabilities both depend on the same lambda, they
are not independent.
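A numerical check of this mixing integral (added; scipy assumed):

```python
from math import exp
from scipy.integrate import quad

# lambda is Exponential with mean 2; each one-coin probability given lambda is lambda*e^(-lambda).
def integrand(lam):
    return (lam * exp(-lam)) ** 2 * exp(-lam / 2.0) / 2.0

prob, _ = quad(integrand, 0.0, 60.0)
print(prob)  # about 0.064, versus (2/9)^2 = 0.049 if the two minutes were treated as independent
```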

19.56. D. Prob[1 coin during minute 1 | λ] = λe^(-λ). Prob[2 coins during minute 2 | λ] = λ^2 e^(-λ)/2.
Prob[3 coins during minute 3 | λ] = λ^3 e^(-λ)/6.
The Gamma has α = 1 and θ = 2, an Exponential. π(λ) = e^(-λ/2)/2.
Prob[1 coin during minute 1, 2 coins during minute 2, and 3 coins during minute 3] =
∫ Prob[1 coin minute 1 | λ] Prob[2 coins minute 2 | λ] Prob[3 coins minute 3 | λ] π(λ) dλ =
∫ (λe^(-λ)) (λ^2 e^(-λ)/2) (λ^3 e^(-λ)/6) (e^(-λ/2)/2) dλ = (1/24) ∫ λ^6 e^(-3.5λ) dλ
= (1/24) Γ(7) (1/3.5)^7 = (720/24)/3.5^7 = 0.466%.
Comment: Prob[1 coin during minute 1] = 2/9. Prob[2 coins during minute 2] = 4/27.
Prob[3 coins during minute 3] = 8/81. (2/9)(4/27)(8/81) = 0.325%. However, since the three
probabilities depend on the same lambda, they are not independent.
19.57. A. From a previous solution, for one minute, the mixed distribution is Geometric with β = 2.
f(1) = β/(1+β)^2 = 2/(1+2)^2 = 2/9 = 0.2222.
Since the minutes are on different days, their lambdas are picked independently.
Prob[1 coin during 1 minute today and 1 coin during 1 minute tomorrow] =
Prob[1 coin during a minute] Prob[1 coin during a minute] = 0.2222^2 = 4.94%.
19.58. C. Over three minutes (on the same day) we have a Poisson with mean 3λ.
λ is Gamma(α, θ) = Gamma(1, 2).
3λ is Gamma(α, 3θ) = Gamma(1, 6).
Mixed Distribution is Negative Binomial, with r = α = 1 and β = 3θ = 6.
f(1) = β/(1 + β)^2 = 6/(1 + 6)^2 = 0.1224.
Since the time intervals are on different days, their lambdas are picked independently.
Prob[1 coin during 3 minutes today and 1 coin during 3 minutes tomorrow] =
Prob[1 coin during 3 minutes] Prob[1 coin during 3 minutes] = 0.12242 = 1.50%.
19.59. E. Gamma has mean = αθ = 3 and variance = αθ^2 = 3. Therefore θ = 1 and α = 3.
The Negative Binomial mixed distribution has r = α = 3 and β = θ = 1.
f(0) = 1/(1+β)^3 = 1/8. f(1) = rβ/(1+β)^4 = 3/16. F(1) = 1/8 + 3/16 = 5/16 = 0.3125.


19.60. E. Assume the prior Gamma, used by both actuaries, has parameters α and θ.
The first actuary is simulating N drivers from a Gamma-Poisson frequency process.
The number of claims from a random driver is Negative Binomial with r = α and β = θ.
The total number of claims is a sum of N independent, identically distributed Negative Binomials,
which is Negative Binomial with parameters r = Nα and β = θ.
The second actuary is simulating N years for a single driver.
An individual who is Poisson with mean λ, over N years is Poisson with mean Nλ.
I. The Negative Binomial Distribution simulated by the first actuary has mean Nαθ.
The Poisson simulated by the second actuary has mean Nλ, where λ depends on which driver the
second actuary has picked at random. There is no reason why the mean number of claims simulated
by the two actuaries should be the same. Thus statement I is not true.
II. The number of claims simulated will usually be different, since they are from two different
distributions. Thus statement II is not true.
III. The first actuary's Negative Binomial has variance Nαθ(1 + θ). The second actuary's simulated
sequence has an expected variance of Nλ, where λ depends on which driver the second actuary has
picked at random. The expected variance for the second actuary's simulated sequence could be
higher or lower than the first actuary's, depending on which driver he has picked. Thus statement III is
not true.
19.61. C. Gamma-Poisson. The mixed distribution is Negative Binomial with r = α = 2 and
β = θ = 1. f(0) = 1/(1 + β)^r = 1/(1 + 1)^2 = 1/4.
19.62. E. From the previous solution, the probability that each driver does not have a claim is 1/4.
Thus for 1000 independent drivers, the number of drivers with no claims is Binomial with m = 1000
and q = 1/4. This Binomial has mean mq = 250, and variance
mq(1 - q) = 187.5. Using the Normal Approximation with continuity correction,
Prob[At most 265 claim-free drivers] ≅ Φ[(265.5 - 250)/√187.5] = Φ[1.13] = 87.08%.
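A small Python sketch of the Normal Approximation with the continuity correction used in this solution (not part of the original guide; the function name is mine):

import math

def normal_cdf(x):
    # Standard Normal distribution function, via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mean, variance = 250.0, 187.5
# Prob[at most 265 claim-free drivers], with the continuity correction:
print(round(normal_cdf((265.5 - mean) / math.sqrt(variance)), 4))  # about 0.871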


19.63. D. The distribution of number of claims from a single driver is Negative Binomial with
r = 2 and β = 1. The distribution of the sum of 1000 independent drivers is Negative Binomial with
r = (1000)(2) = 2000 and β = 1. This Negative Binomial has mean rβ = 2000, and
variance rβ(1 + β) = 4000. Using the Normal Approximation with continuity correction,
Prob[more than 2020 claims] ≅ 1 - Φ[(2020.5 - 2000)/√4000] = 1 - Φ[0.32] = 37.45%.
Alternately, the mean of the sum of 1000 independent drivers is 1000 times the mean of single
driver: (1000) (2) = 2000.
The variance of the sum of 1000 independent drivers is 1000 times the variance of single driver:
(1000) (2) (1) (1+1) = 4000. Proceed as before.
19.64. C. The distribution of number of claims from a single driver is Negative Binomial with r = 2
and β = 1. f(0) = 1/4. f(1) = rβ/(1 + β)^(r+1) = (2)(1)/(1 + 1)^3 = 1/4.
f(2) = {r(r + 1)/2} β^2/(1 + β)^(r+2) = {(2)(3)/2}(1^2)/(1 + 1)^4 = 3/16.
The number of drivers with given numbers of claims is a multinomial distribution,
with parameters 1000, f(0), f(1), f(2), ... = 1000, 1/4, 1/4, 3/16, ....
The covariance of the number of drivers with 1 claim and the number with 2 claims is:
-(1000)(1/4)(3/16) = -46.875.
The variance of the number of drivers with 1 claim is: (1000)(1/4)(1 - 1/4) = 187.5.
The variance of the number of drivers with 2 claims is: (1000)(3/16)(1 - 3/16) = 152.34.
The correlation of the number of drivers with 1 claim and the number with 2 claims is:
-46.875/√{(187.5)(152.34)} = -0.277.
Comment: Well beyond what you are likely to be asked on your exam!
The multinomial distribution is discussed in A First Course in Probability by Ross.
The correlation is: -√[ f(1) f(2) / ({1 - f(1)} {1 - f(2)}) ] = -1/√13 = -0.277.


Section 20, Tails of Frequency Distributions


Actuaries are sometimes interested in the behavior of a frequency distribution as the number of
claims gets very large.164 The question of interest is how quickly the density and survival function go
to zero as x approaches infinity. If the density and survival function go to zero more slowly, one
describes that as a "heavier-tailed distribution."
Those frequency distributions which are heavier-tailed than the Geometric distribution are often
considered to have heavy tails, while those lighter-tailed than the Geometric are considered to have
light tails.165 There are a number of general methods by which one can distinguish which distribution or
empirical data set has the heavier tail. Lighter-tailed distributions have more moments that exist.
For the frequency distributions on the exam all of the moments exist.
Nevertheless, the three common frequency distributions differ in their tail behavior. Since the
Binomial has finite support, f(x) = 0 for x > n, it is very light-tailed. The Negative Binomial has its
variance greater than its mean, so that the Negative Binomial is heavier-tailed than the Poisson which
has its variance equal to its mean.
From lightest to heaviest tailed, the frequency distributions in the (a,b,0) class are:
Binomial, Poisson, Negative Binomial r > 1, Geometric, Negative Binomial r < 1.
Skewness:
The larger the skewness, the heavier-tailed the distribution. The Binomial distribution for
q > 0.5 is skewed to the left (has negative skewness.) The Binomial distribution for q < 0.5, the
Poisson distribution, and the Negative Binomial distribution are skewed to the right (have positive
skewness); they have a few very large values and many smaller values. A symmetric distribution
has zero skewness. Therefore, the Binomial Distribution for q = 0.5 has zero skewness.
Mean Residual Lives/ Mean Excess Loss:
As with loss distributions one can define the concept of the mean residual life.
The Mean Residual Life, e(x) is defined as:
e(x) = (average number of claims for those insureds with more than x claims) - x.
Thus we only count those insureds with more than x and only that part of each number of claims
greater than x.166 Heavier-tailed distributions have their mean residual life increase to infinity, while
lighter-tailed distributions have their mean residual life approach a constant or decline to zero.
164 Actuaries are more commonly concerned with the tail behavior of loss distributions, as discussed in Mahler's Guide to Loss Distributions.
165 See Section 6.3 of Loss Models.
166 Thus the Mean Residual Life is the mean of the frequency distribution truncated and shifted at x.


One complication is that for discrete distributions this definition is discontinuous at the integers. For
example, assume we are interested in the mean residual life at 3. As we take the limit from below
we include those insureds with 3 claims in our average; as we approach 3 from above, we don't
include insureds with 3 claims in our average.
Define e(3-) as the limit as x approaches 3 from below of e(x). Similarly, one can define e(3+) as the
limit as x approaches 3 from above of e(x). Then it turns out that e(0-) = mean, in analogy to the
situation for continuous loss distributions. For purposes of comparing tail behavior of frequency
distributions, one can use either e(x-) or e(x+). I will use the former, since the results using e(x-) are
directly comparable to those for the continuous size of loss distributions. At integer values of x:

e(x-) = Σ_{i=x to ∞} (i - x) f(i) / Σ_{i=x to ∞} f(i) = Σ_{i=x to ∞} (i - x) f(i) / S(x-1).

One can compute the mean residual life for the Geometric Distribution, letting q = β/(1+β) and thus
1 - q = 1/(1+β):

e(x-) S(x-1) = Σ_{i=x+1 to ∞} (i - x) f(i) = Σ_{i=x+1 to ∞} (i - x) β^i / (1+β)^(i+1) = {1/(1+β)} Σ_{i=x+1 to ∞} (i - x) q^i

= (1 - q) {Σ_{i=x+1 to ∞} q^i + Σ_{i=x+2 to ∞} q^i + Σ_{i=x+3 to ∞} q^i + ...}
= (1 - q) {q^(x+1)/(1-q) + q^(x+2)/(1-q) + q^(x+3)/(1-q) + ...}
= q^(x+1) + q^(x+2) + q^(x+3) + ... = q^(x+1)/(1-q) = {β/(1+β)}^(x+1) (1+β) = β^(x+1) / (1+β)^x.

In a previous section, the survival function for the Geometric distribution was computed as:
S(x) = {β/(1+β)}^(x+1). Therefore, S(x-1) = {β/(1+β)}^x.

Thus e(x-) = {β^(x+1) / (1+β)^x} / {β/(1+β)}^x = β.
The mean residual life for the Geometric distribution is constant.167 As discussed previously, the
Geometric distribution is the discrete analog of the Exponential distribution which also has a constant
mean residual life.168
167 e(x-) = β = E[X].
168 The Exponential and Geometric distributions have constant mean residual lives due to their memoryless property, as discussed in Section 6.3 of Loss Models.


As discussed previously, the Negative Binomial is the discrete analog of the Gamma Distribution.
The tail behavior of the Negative Binomial is analogous to that of the Gamma.169 The mean residual
life for a Negative Binomial goes to a constant. For r < 1, e(x-) increases to β, the mean of the
corresponding Geometric, while for r > 1, e(x-) decreases to β as x approaches infinity. For r = 1, one
has the Geometric Distribution with e(x-) constant.
Using the relation between the Poisson Distribution and the Incomplete Gamma Function, it
turns out that for the Poisson e(x-) = (λ - x) + λ^x e^(-λ) / {Γ(x) Γ(x; λ)}.170 The mean residual life e(x-) for the
Poisson Distribution declines to zero as x approaches infinity.171 172 This is another way of seeing that
the Poisson has a lighter tail than the Negative Binomial Distribution.
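The comparison can also be seen numerically. The following short Python sketch (not part of the original guide; the function and variable names are mine) computes e(x-) for a Poisson and for a Negative Binomial with r < 1 having the same mean, using the (a, b, 0) recursion for the densities:

import math

def pmf_table(a, b, f0, n):
    # Members of the (a,b,0) class satisfy f(k)/f(k-1) = a + b/k.
    f = [f0]
    for k in range(1, n + 1):
        f.append(f[-1] * (a + b / k))
    return f

def mrl_minus(f, x):
    # e(x-) = sum_{i>=x} (i - x) f(i) / sum_{i>=x} f(i), at integer x.
    num = sum((i - x) * p for i, p in enumerate(f) if i >= x)
    den = sum(p for i, p in enumerate(f) if i >= x)
    return num / den

lam = 2.0                      # Poisson with mean 2
r, beta = 0.5, 4.0             # Negative Binomial with the same mean, r < 1
n = 400
poisson = pmf_table(0.0, lam, math.exp(-lam), n)
neg_bin = pmf_table(beta / (1 + beta), (r - 1) * beta / (1 + beta), (1 + beta) ** -r, n)

for x in (1, 5, 10, 20):
    print(x, round(mrl_minus(poisson, x), 3), round(mrl_minus(neg_bin, x), 3))
# The Poisson values fall toward zero, while the Negative Binomial values with r < 1 rise toward beta = 4.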
Summary:

Here are the common frequency distributions, arranged from lightest to heaviest righthand tail:

Frequency Distribution        Skewness    Righthand Tail Behavior             Tail Similar to
Binomial, q > 0.5             negative    Finite Support
Binomial, q = 0.5             zero        Finite Support
Binomial, q < 0.5             positive    Finite Support
Poisson                       positive    e(x-) → 0, approximately as 1/x     Normal Distribution
Negative Binomial, r > 1      positive    e(x-) decreases to β                Gamma, α > 1
Geometric                     positive    e(x-) constant = β                  Exponential (Gamma, α = 1)
 (Negative Binomial, r = 1)
Negative Binomial, r < 1      positive    e(x-) increases to β                Gamma, α < 1

169 See Mahler's Guide to Loss Distributions, for a discussion of the mean residual life for the Gamma and other size of loss distributions. For a Gamma Distribution with α > 1, e(x) decreases towards a horizontal asymptote θ. For a Gamma Distribution with α < 1, e(x) increases towards a horizontal asymptote θ.
170 For the Poisson F(x) = 1 - Γ(x+1; λ).
171 It turns out that e(x-) ≅ λ/x for very large x. This is similar to the tail behavior for the Normal Distribution. While e(x-) declines to zero, e(x+) for the Poisson Distribution declines to one as x approaches infinity.
172 This follows from the fact that the Poisson is a limit of Negative Binomial Distributions. For a sequence of Negative Binomial distributions with rβ = λ as r → ∞ (and β → 0), in the limit one approaches a Poisson Distribution with mean λ. The tails of each Negative Binomial have e(x-) decreasing to β as x approaches infinity. As β → 0, the limits of e(x-) → 0.


Skewness and Kurtosis of the Poisson versus the Negative Binomial:173

The Poisson has skewness: 1/√λ.
The Negative Binomial has skewness: (1 + 2β) / √{rβ(1 + β)}.
Therefore, if a Poisson and Negative Binomial have the same mean, λ = rβ, then the ratio of the
skewness of the Negative Binomial to that of the Poisson is: (1 + 2β) / √(1 + β) > 1.

The Poisson has kurtosis: 3 + 1/λ.
The Negative Binomial has kurtosis: 3 + (6β^2 + 6β + 1) / {rβ(1 + β)}.
Therefore, if a Poisson and Negative Binomial have the same mean, λ = rβ, then the ratio of the
kurtosis minus 3 of the Negative Binomial to that of the Poisson is:174 (6β^2 + 6β + 1) / (1 + β) > 1.
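A two-line numerical check of the skewness ratio (this sketch is mine, not from the guide), using a Poisson and a Negative Binomial with the same mean:

import math

lam = 2.0           # Poisson mean
r, beta = 4.0, 0.5  # Negative Binomial with the same mean, r * beta = 2

poisson_skew = 1.0 / math.sqrt(lam)
neg_bin_skew = (1.0 + 2.0 * beta) / math.sqrt(r * beta * (1.0 + beta))

# The ratio equals (1 + 2*beta) / sqrt(1 + beta) > 1:
print(round(neg_bin_skew / poisson_skew, 4), round((1.0 + 2.0 * beta) / math.sqrt(1.0 + beta), 4))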
Tails of Compound Distributions:
Compound frequency distributions can have longer tails than either their primary or secondary
distribution. If the primary distribution is the number of accidents, and the secondary distribution is the
number of claims, then one can have a large number of claims either due to a large number of
accidents, or an accident with a large number of claims, or a combination of the two. Thus there is
more chance for an unusually large number of claims.
Generally the longer-tailed the primary distribution and the longer-tailed the secondary distribution,
the longer-tailed the compound distribution. The skewness of a compound distribution can be rather
large.

173 See "The Negative Binomial and Poisson Distributions Compared," by Leroy J. Simon, PCAS 1960.
174 The kurtosis minus 3 is sometimes called the excess.


Tails of Mixed Distributions:


Mixed distributions can also have long tails. For example, the Gamma Mixture of Poissons is a
Negative Binomial, with a longer tail than the Poisson. As with compound distributions, with mixed
distributions there is more chance for an unusually large number of claims. One can either have an
unusually large number of claims for a typical value of the parameter, have an unusual value of the
parameter which corresponds to a large expected claim frequency, or a combination of the two.
Generally the longer tailed the distribution type being mixed and the longer tailed the mixing
distribution, the longer tailed the mixed distribution.
Tails of Aggregate Loss Distributions:
Actuaries commonly look at the combination of frequency and severity. This is termed the aggregate
loss distribution. The tail behavior of this aggregate distribution is determined by the behavior of the
heavier-tailed of the frequency and severity distributions.175
Since the common frequency distributions have tails that are similar to the Gamma Distribution or
lighter and the common severity distributions for Casualty Insurance have tails at least as heavy as
the Gamma, actuaries working on liability or workers compensation insurance are usually most
concerned with the heaviness of the tail of the severity distribution. It is the rare extremely large
claims that then are of concern.
However, natural catastrophes such as hurricanes or earthquakes can be examples where a
large number of claims can be the concern.176 (Tens of thousands of homeowners' claims, even
limited to, for example, 1/4 million dollars each, can add up to a lot of money!) In that case the
tail of the frequency distribution could be heavier than a Negative Binomial.

175 See for example Panjer & Willmot, Insurance Risk Models.
176 Natural catastrophes are now commonly modeled using simulation models that incorporate the science of the particular physical phenomenon and the particular distribution of insured exposures.


Problems:
20.1 (1 point) Which of the following frequency distributions have positive skewness?
1. Negative Binomial Distribution with r = 3, β = 0.4.
2. Poisson Distribution with λ = 0.7.
3. Binomial Distribution with m = 3, q = 0.7.
A. 1, 2 only
B. 1, 3 only
C. 2, 3 only
D. 1, 2, and 3
E. The correct answer is not given by A, B, C, or D.

Use the following information for the next five questions:


Five friends: Oleg Puller, Minnie Van, Bob Alou, Louis Liu, and Shelly Fish, are discussing studying
for their next actuarial exam. They've counted 10,000 pages worth of readings and agree that on
average they expect to find about 2000 important ideas. However, they are debating how many
of these pages there are expected to be with 3 or more important ideas.
20.2 (2 points) Oleg assumes the important ideas are distributed as a Binomial with
q = 0.04 and m = 5.
How many pages should Oleg expect to find with 3 or more important ideas?
A. Less than 10
B. At least 10 but less than 20
C. At least 20 but less than 40
D. At least 40 but less than 80
E. At least 80
20.3 (2 points) Minnie assumes the important ideas are distributed as a Poisson with λ = 0.20.
How many pages should Minnie expect to find with 3 or more important ideas?
A. Less than 10
B. At least 10 but less than 20
C. At least 20 but less than 40
D. At least 40 but less than 80
E. At least 80


20.4 (2 points) Bob assumes the important ideas are distributed as a Negative Binomial with
β = 0.1 and r = 2. How many pages should Bob expect to find with 3 or more important ideas?
A. Less than 10
B. At least 10 but less than 20
C. At least 20 but less than 40
D. At least 40 but less than 80
E. At least 80
20.5 (3 points) Louis assumes the important ideas are distributed as a compound
Poisson-Poisson distribution, with λ1 = 1 and λ2 = 0.2.
How many pages should Louis expect to find with 3 or more important ideas?
A. Less than 10
B. At least 10 but less than 20
C. At least 20 but less than 40
D. At least 40 but less than 80
E. At least 80
20.6 (3 points) Shelly assumes the important ideas are distributed as a compound
Poisson-Poisson distribution, with λ1 = 0.2 and λ2 = 1.
How many pages should Shelly expect to find with 3 or more important ideas?
A. Less than 10
B. At least 10 but less than 20
C. At least 20 but less than 40
D. At least 40 but less than 80
E. At least 80
20.7 (3 points) Define Riemann's zeta function as: ζ(s) = Σ_{k=1 to ∞} 1/k^s, s > 1.
Let the zeta distribution be: f(x) = 1 / {x^(ρ+1) ζ(ρ+1)}, x = 1, 2, 3, ..., ρ > 0.
Determine the moments of the zeta distribution.
20.8 (4B, 5/99, Q.29) (2 points) A Bernoulli distribution, a Poisson distribution, and a uniform
distribution each has mean 0.8. Rank their skewness from smallest to largest.
A. Bernoulli, uniform, Poisson
B. Poisson, Bernoulli, uniform
C. Poisson, uniform, Bernoulli
D. uniform, Bernoulli, Poisson
E. uniform, Poisson, Bernoulli


Solutions to Problems:
20.1. A. 1. True. The skewness of any Negative Binomial Distribution is positive.
2. True. The skewness of any Poisson Distribution is positive. 3. False. The skewness of the
Binomial Distribution depends on the value of q. For q > .5, the skewness is negative.
20.2. A. f(x) = (5! / {(x!)(5-x)!}) 0.04^x 0.96^(5-x).
One needs to sum the chances of having x = 0, 1, and 2:

n    f(n)       F(n)
0    0.81537    0.81537
1    0.16987    0.98524
2    0.01416    0.99940

Thus the chance of 3 or more important ideas is: 1 - 0.99940 = 0.00060.
Thus we expect: (10000)(0.00060) = 6.0 such pages.
20.3. B. f(x) = e^(-0.2) 0.2^x / x!.
One needs to sum the chances of having x = 0, 1, and 2:

n    f(n)       F(n)
0    0.81873    0.81873
1    0.16375    0.98248
2    0.01637    0.99885

Thus the chance of 3 or more important ideas is: 1 - 0.99885 = 0.00115.
Thus we expect: (10000)(0.00115) = 11.5 such pages.
20.4. C. f(x) = ((x+2-1)! / {(x!)(2-1)!}) (0.1)^x / (1.1)^(x+2) = (x+1) (10/11)^2 (1/11)^x.
One needs to sum the chances of having x = 0, 1, and 2:

n    f(n)       F(n)
0    0.82645    0.82645
1    0.15026    0.97671
2    0.02049    0.99720

Thus the chance of 3 or more important ideas is: 1 - 0.99720 = 0.00280.
Thus we expect: (10000)(0.00280) = 28.0 such pages.
Comment: Note that the distributions of important ideas in these three questions all have a mean of
0.2. Since the Negative Binomial has the longest tail, it has the largest expected number of pages
with lots of important ideas. Since the Binomial has the shortest tail, it has the smallest expected
number of pages with lots of important ideas.
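A short Python sketch (not part of the original guide; the helper names are mine) reproduces the comparison in the comment above, computing the expected number of pages with 3 or more ideas under each of the three assumed distributions:

import math

def binomial_pmf(x, m, q):
    return math.comb(m, x) * q ** x * (1 - q) ** (m - x)

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam ** x / math.factorial(x)

def neg_bin_pmf(x, r, beta):
    coef = math.gamma(r + x) / (math.gamma(r) * math.factorial(x))
    return coef * beta ** x / (1 + beta) ** (x + r)

pages = 10000
for name, pmf in (("Binomial m=5, q=0.04", lambda x: binomial_pmf(x, 5, 0.04)),
                  ("Poisson lambda=0.2", lambda x: poisson_pmf(x, 0.2)),
                  ("Neg. Binomial r=2, beta=0.1", lambda x: neg_bin_pmf(x, 2, 0.1))):
    prob_3_plus = 1.0 - sum(pmf(x) for x in range(3))
    print(name, round(pages * prob_3_plus, 1))
# Expected pages with 3 or more ideas: about 6.0, 11.5, and 28.0.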


20.5. D. For the Primary Poisson, a = 0 and b = λ1 = 1. The secondary Poisson has density at zero
of e^(-0.2) = 0.8187. The p.g.f. of the Primary Poisson is P(z) = e^(z-1). The density of the compound
distribution at zero is the p.g.f. of the primary distribution at 0.8187: e^(0.8187-1) = 0.83421.
The densities of the secondary Poisson Distribution with λ = 0.2 are:

n    s(n)
0    0.818731
1    0.163746
2    0.016375
3    0.001092
4    0.000055
5    0.000002

Use the Panjer Algorithm, c(x) = {1/(1 - a s(0))} Σ_{j=1 to x} (a + jb/x) s(j) c(x-j) = (1/x) Σ_{j=1 to x} j s(j) c(x-j).
c(1) = (1/1) (1) s(1) c(0) = (0.16375)(0.83421) = 0.13660.
c(2) = (1/2) {(1) s(1) c(1) + (2) s(2) c(0)} = (1/2){(0.16375)(0.13660) + (2)(0.01638)(0.83421)} = 0.02485.
Thus c(0) + c(1) +c(2) = 0.83421 + 0.13660 + 0.02485 = 0.99566.
Thus the chance of 3 or more important ideas is: 1 - 0.99566 = 0.00434.
Thus we expect: (10000)(0.00434) = 43.4 such pages.
Comment: The Panjer Algorithm (recursive method) is discussed in Mahlers Guide to Aggregate
Distributions.
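The recursion used in this solution can be carried further on a computer. Below is a minimal Python sketch (not from the guide; the function name and the truncation of the secondary distribution are my own choices) of the Panjer recursion with a Poisson primary distribution, reproducing the numbers above:

import math

def panjer_compound_poisson(lam, secondary, n_max):
    # Panjer recursion with a Poisson primary: a = 0, b = lambda, so
    # c(x) = (lambda/x) * sum_{j=1..x} j * s(j) * c(x-j), with c(0) = exp(lambda*(s(0) - 1)).
    c = [math.exp(lam * (secondary[0] - 1.0))]
    for x in range(1, n_max + 1):
        c.append(lam / x * sum(j * secondary[j] * c[x - j]
                               for j in range(1, min(x, len(secondary) - 1) + 1)))
    return c

# Secondary Poisson with mean 0.2, primary Poisson with mean 1 (problem 20.5):
sec = [math.exp(-0.2) * 0.2 ** n / math.factorial(n) for n in range(8)]
c = panjer_compound_poisson(1.0, sec, 2)
print([round(v, 5) for v in c])         # about [0.83421, 0.13660, 0.02485]
print(round(10000 * (1 - sum(c)), 1))   # about 43.4 pages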


20.6. E. For the Primary Poisson, a = 0 and b = λ1 = 0.2. The secondary Poisson has density at zero
of e^(-1) = 0.3679. The p.g.f. of the Primary Poisson is P(z) = e^(0.2(z-1)). The density of the compound
distribution at zero is the p.g.f. of the primary distribution at 0.3679: e^(0.2(0.3679-1)) = 0.88124.
The densities of the secondary Poisson Distribution with λ = 1 are:

n    s(n)
0    0.367879
1    0.367879
2    0.183940
3    0.061313
4    0.015328
5    0.003066

Use the Panjer Algorithm, c(x) = {1/(1 - a s(0))} Σ_{j=1 to x} (a + jb/x) s(j) c(x-j) = (0.2/x) Σ_{j=1 to x} j s(j) c(x-j).
c(1) = (0.2/1) (1) s(1) c(0) = (0.2)(0.3679)(0.88124) = 0.06484.
c(2) = (0.2/2) {(1) s(1) c(1) + (2) s(2) c(0)} = (0.1){(0.3679)(0.06484) + (2)(0.1839)(0.88124)} = 0.03480.
Thus c(0) + c(1) + c(2) = 0.88124 + 0.06484 + 0.03480 = 0.98088.
Thus the chance of 3 or more important ideas is: 1 - 0.98088 = 0.01912.
Thus we expect: (10000)(0.01912) = 191.2 such pages.
Comment: This Poisson-Poisson has a mean of 0.2, but an even longer tail than the previous
Poisson-Poisson which has the same mean. Note that it has a variance of (0.2)(1) + (1)^2 (0.2) = 0.40,
while the previous Poisson-Poisson has a variance of (1)(0.2) + (0.2)^2 (1) = 0.24. The Negative
Binomial has a variance of (2)(0.1)(1.1) = 0.22. The variance of the Poisson is 0.20. The variance of the
Binomial is (5)(0.04)(0.96) = 0.192.
20.7. E[X^n] = Σ_{x=1 to ∞} x^n {1/x^(ρ+1)} / ζ(ρ+1) = Σ_{x=1 to ∞} {1/x^(ρ+1-n)} / ζ(ρ+1) = ζ(ρ + 1 - n)/ζ(ρ + 1), n < ρ.
Comment: You are extremely unlikely to be asked about the zeta distribution!
The zeta distribution is discrete and has a heavy righthand tail similar to a Pareto Distribution or a
Single Parameter Pareto Distribution, with only some of its moments existing.
The zeta distribution is mentioned in Exercise 16.33 in Loss Models.
ζ(2) = π^2/6. ζ(4) = π^4/90. See the Handbook of Mathematical Functions.


20.8. A. The uniform distribution is symmetric, so it has a skewness of zero.


The Poisson has a positive skewness.
The Bernoulli has a negative skewness for q = 0.8 > 0.5.
Comment: For the Poisson with mean λ, the skewness is 1/√λ.
For the Bernoulli, the skewness is: (1 - 2q) / √{q(1 - q)} = {1 - (2)(0.8)} / √{(0.8)(1 - 0.8)} = -1.5.
If one computes for this Bernoulli the third central moment, E[(X - 0.8)^3] =
0.2(0 - 0.8)^3 + 0.8(1 - 0.8)^3 = -0.096. Thus the skewness is: -0.096 / {(0.8)(1 - 0.8)}^1.5 = -1.5.


Section 21, Important Formulas and Ideas


Here are what I believe are the most important formulas and ideas from this study guide to know for
the exam.
Basic Concepts (Section 2)
The mean is the average or expected value of the random variable.
The mode is the point at which the density function reaches its maximum.
The median, the 50th percentile, is the first value at which the distribution function is 0.5.
The 100p-th percentile is the first value at which the distribution function is ≥ p.
Variance = second central moment = E[(X - E[X])^2] = E[X^2] - E[X]^2.
Standard Deviation = Square Root of Variance.
Binomial Distribution (Section 3)
f(x) = {m! / (x! (m-x)!)} q^x (1 - q)^(m-x), 0 ≤ x ≤ m.
Mean = mq     Variance = mq(1-q)
Probability Generating Function: P(z) = {1 + q(z-1)}^m

The Binomial Distribution for m = 1 is a Bernoulli Distribution.
If X is Binomial with parameters q and m1, and Y is Binomial with parameters q and m2,
with X and Y independent, then X + Y is Binomial with parameters q and m1 + m2.
Poisson Distribution (Section 4)
f(x) = λ^x e^(-λ) / x!, x ≥ 0.
Mean = λ     Variance = λ
Probability Generating Function: P(z) = e^(λ(z-1)), λ > 0.

A Poisson is characterized by a constant independent claim intensity and vice versa.
The sum of two independent variables each of which is Poisson with parameters λ1 and
λ2 is also Poisson, with parameter λ1 + λ2.
If frequency is given by a Poisson and severity is independent of frequency, then the
number of claims above a certain amount (in constant dollars) is also a Poisson.


Geometric Distribution (Section 5)


f(x) = β^x / (1 + β)^(x+1).
Mean = β     Variance = β(1+β)
Probability Generating Function: P(z) = 1 / {1 - β(z-1)}.

For a Geometric Distribution, for n > 0, the chance of at least n claims is: {β/(1+β)}^n.
For a series of independent identical Bernoulli trials, the chance of the first success following x failures
is given by a Geometric Distribution with mean
β = (chance of a failure) / (chance of a success).
Negative Binomial Distribution (Section 6)
f(x) = {r(r + 1)...(r + x - 1) / x!} β^x / (1+β)^(x+r).
Mean = rβ     Variance = rβ(1+β)

The Negative Binomial for r = 1 is a Geometric Distribution.

The Negative Binomial Distribution with parameters β and r, with r integer, can be
thought of as the sum of r independent Geometric distributions with parameter β.
If X is Negative Binomial with parameters β and r1, and Y is Negative Binomial with parameters
β and r2, with X and Y independent, then X + Y is Negative Binomial with parameters β and r1 + r2.
For a series of independent identical Bernoulli trials, the chance of success number r following x
failures is given by a Negative Binomial Distribution with parameters r and
β = (chance of a failure) / (chance of a success).


Normal Approximation (Section 7)


In general, let μ be the mean of the frequency distribution, while σ is the standard
deviation of the frequency distribution; then the chance of observing at least i claims
and not more than j claims is approximately:
Φ[(j + 0.5 - μ)/σ] - Φ[(i - 0.5 - μ)/σ].

Normal Distribution
F(x) = Φ((x - μ)/σ)
f(x) = φ((x - μ)/σ) / σ = exp[-(x - μ)^2 / (2σ^2)] / {σ √(2π)}, -∞ < x < ∞.
φ(x) = exp[-x^2 / 2] / √(2π), -∞ < x < ∞.
Mean = μ     Variance = σ^2
Skewness = 0 (distribution is symmetric)     Kurtosis = 3

Skewness (Section 8)
Skewness = third central moment / STDDEV^3 = E[(X - E[X])^3] / STDDEV^3
= {E[X^3] - 3 E[X] E[X^2] + 2 E[X]^3} / Variance^(3/2).
A symmetric distribution has zero skewness.
Binomial Distribution with q < 1/2: positive skewness, skewed to the right.
Binomial Distribution with q = 1/2: symmetric, zero skewness.
Binomial Distribution with q > 1/2: negative skewness, skewed to the left.
Poisson and Negative Binomial have positive skewness.
Probability Generating Function (Section 9)
Probability Generating Function, p.g.f.: P(z) = Expected Value of z^n = E[z^n] = Σ_{n=0 to ∞} f(n) z^n.


The Probability Generating Function of the sum of independent frequencies is the product of the
individual Probability Generating Functions.
The distribution determines the probability generating function and vice versa.
f(n) = (d^n P(z) / dz^n)|_{z=0} / n!.     f(0) = P(0).     P′(1) = Mean.

If a distribution is infinitely divisible, then if one takes the probability generating function to any
positive power, one gets the probability generating function of another member of the same family
of distributions. Examples of infinitely divisible distributions include: Poisson, Negative Binomial,
Compound Poisson, Compound Negative Binomial, Normal, Gamma.
Factorial Moments (Section 10)
nth factorial moment = μ_(n) = E[X(X-1)...(X+1-n)].
μ_(n) = (d^n P(z) / dz^n)|_{z=1}.
P′(1) = E[X].     P′′(1) = E[X(X-1)].

(a, b, 0) Class of Distributions (Section 11)


For each of these three frequency distributions: f(x+1) / f(x) = a + {b / (x+1)}, x = 0, 1, ...
where a and b depend on the parameters of the distribution:
Distribution          a                 b
Binomial              -q/(1-q)          (m+1)q/(1-q)
Poisson               0                 λ
Negative Binomial     β/(1+β)           (r-1)β/(1+β)

Distribution          Mean    Variance     f(0)          Variance Over Mean
Binomial              mq      mq(1-q)      (1-q)^m       1-q < 1    Variance < Mean
Poisson               λ       λ            e^(-λ)        1          Variance = Mean
Negative Binomial     rβ      rβ(1+β)      1/(1+β)^r     1+β > 1    Variance > Mean

Distribution          Thinning by a factor of t     Adding n independent, identical copies
Binomial              q → tq                        m → nm
Poisson               λ → tλ                        λ → nλ
Negative Binomial     β → tβ                        r → nr


For X and Y independent:

X                           Y                           X + Y
Binomial(q, m1)             Binomial(q, m2)             Binomial(q, m1 + m2)
Poisson(λ1)                 Poisson(λ2)                 Poisson(λ1 + λ2)
Negative Binomial(β, r1)    Negative Binomial(β, r2)    Negative Binomial(β, r1 + r2)

Accident Profiles (Section 12)


For the Binomial, Poisson and Negative Binomial Distributions:
(x+1) f(x+1) / f(x) = a(x + 1) + b, where a and b depend on the parameters of the distribution.
a < 0 for the Binomial, a = 0 for the Poisson, and a > 0 for the Negative Binomial Distribution.
Thus if data is drawn from one of these three distributions, then we expect (x+1) f(x+1) / f(x) for this
data to be approximately linear with slope a; the sign of the slope, and thus the sign of a,
distinguishes between these three distributions of the (a, b, 0) class.
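As an illustration (this sketch is not from the guide, and the observed counts below are hypothetical), an accident profile can be computed from data as follows:

# Given observed counts of policies with 0, 1, 2, ... claims, compute (x+1) f(x+1)/f(x)
# and look at whether it is roughly linear with negative, zero, or positive slope.
observed = [861, 121, 13, 3, 1, 1]   # hypothetical data, not from the guide
total = sum(observed)
f = [n / total for n in observed]

for x in range(len(f) - 1):
    if f[x] > 0:
        print(x, round((x + 1) * f[x + 1] / f[x], 3))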
Zero-Truncated Distributions (Section 13)
In general if f is a distribution on 0,1,2,3,..., then g(x) = f(x) / {1 - f(0)} is a distribution on 1,2,3, ....
We have the following three examples:
Distribution          Density of the Zero-Truncated Distribution
Binomial              {m! / (x! (m-x)!)} q^x (1-q)^(m-x) / {1 - (1-q)^m},  x = 1, 2, 3, ..., m
Poisson               {e^(-λ) λ^x / x!} / {1 - e^(-λ)},  x = 1, 2, 3, ...
Negative Binomial     {r(r+1)...(r+x-1) / x!} {β^x / (1+β)^(x+r)} / {1 - 1/(1+β)^r},  x = 1, 2, 3, ...

The moments of a zero-truncated distribution, g, are given in terms of those of the corresponding
untruncated distribution, f, by: Eg[X^n] = Ef[X^n] / {1 - f(0)}.

The Logarithmic Distribution has support equal to the positive integers: f(x) = {β/(1+β)}^x / {x ln(1+β)}.


The (a,b,1) class of frequency distributions is a generalization of the (a,b,0) class.


As with the (a,b,0) class, the recursion formula f(x)/f(x-1) = a + b/x applies.
However, it need only apply now for x ≥ 2, rather than x ≥ 1.
Members of the (a,b,1) family include: all the members of the (a,b,0) family, the zero-truncated
versions of those distributions: Zero-Truncated Binomial, Zero-Truncated Poisson,
Extended Truncated Negative Binomial, and the Logarithmic Distribution.
In addition the (a,b,1) class includes the zero-modified distributions corresponding to these.
Zero-Modified Distributions (Section 14)
If f is a distribution on 0, 1, 2, 3, ..., and 0 < p0M < 1,
then g(0) = p0M, g(x) = f(x){1 - p0M} / {1 - f(0)}, x = 1, 2, 3, ..., is a distribution on 0, 1, 2, 3, ....

The moments of a zero-modified distribution g are given in terms of those of f by:
Eg[X^n] = (1 - p0M) Ef[X^n] / {1 - f(0)}.

Compound Frequency Distributions (Section 15)


A compound frequency distribution has a primary and secondary distribution, each of which is a
frequency distribution. The primary distribution determines how many independent random draws
from the secondary distribution we sum.
p.g.f. of compound distribution = p.g.f. of primary dist.[p.g.f. of secondary dist.]
P(z) = P1 [P2 (z)].
compound density at 0 = p.g.f. of the primary at the density at 0 of the secondary.
Moments of Compound Distributions (Section 16)
Mean of Compound Dist. = (Mean of Primary Dist.)(Mean of Sec. Dist.)
Variance of Compound Dist. = (Mean of Primary Dist.)(Var. of Sec. Dist.)
+ (Mean of Secondary Dist.)^2 (Variance of Primary Dist.)
In the case of a Poisson primary distribution with mean λ, the variance of the compound distribution
could be rewritten as: λ(2nd moment of Sec. Dist.).
The third central moment of a compound Poisson distribution = λ(3rd moment of Sec. Dist.).
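As a rough illustration (not part of the original guide; the parameter values and names below are arbitrary choices of mine), these formulas can be checked against a crude simulation of a compound Poisson-Geometric distribution:

import random

random.seed(1)
lam, beta = 1.3, 0.7          # primary Poisson mean; secondary Geometric beta
mean_sec, var_sec = beta, beta * (1 + beta)

mean_formula = lam * mean_sec
var_formula = lam * var_sec + mean_sec ** 2 * lam   # = lam * (2nd moment of secondary)

def simulate_compound():
    # Draw the primary Poisson count by counting unit-rate exponential arrivals before time lam.
    t, count = random.expovariate(1.0), 0
    while t < lam:
        count += 1
        t += random.expovariate(1.0)
    total = 0
    for _ in range(count):
        # Geometric(beta): failures before a success, with success probability 1/(1+beta).
        k = 0
        while random.random() > 1.0 / (1.0 + beta):
            k += 1
        total += k
    return total

draws = [simulate_compound() for _ in range(50000)]
m = sum(draws) / len(draws)
v = sum((d - m) ** 2 for d in draws) / len(draws)
print(round(mean_formula, 3), round(m, 3))   # both about 0.91
print(round(var_formula, 3), round(v, 3))    # both about 2.18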


Mixed Frequency Distributions (Section 17)


The density function of the mixed distribution, is the mixture of the density function for
specific values of the parameter that is mixed.
The nth moment of a mixed distribution is the mixture of the nth moments.
First one mixes the moments, and then computes the variance of the mixture from its first and
second moments.
The Probability Generating Function of the mixed distribution, is the mixture of the probability
generating functions for specific values of the parameter.
For a mixture of Poissons, the variance is always greater than the mean.
Gamma Function (Section 18)
The (complete) Gamma Function is defined as:
Γ(α) = ∫0 to ∞ t^(α-1) e^(-t) dt = ∫0 to ∞ t^(α-1) e^(-t/θ) θ^(-α) dt, for α ≥ 0, θ ≥ 0.
Γ(α) = (α-1) Γ(α-1).     Γ(α) = (α-1)!
∫0 to ∞ t^(α-1) e^(-t/θ) dt = Γ(α) θ^α.

The Incomplete Gamma Function is defined as:
Γ(α; x) = ∫0 to x t^(α-1) e^(-t) dt / Γ(α).


Gamma-Poisson Frequency Process (Section 19)


If one mixes Poissons via a Gamma, then the mixed distribution is in the form of the
Negative Binomial distribution with r = α and β = θ.
If one mixes Poissons via a Gamma Distribution with parameters α and θ, then over a period of
length Y, the mixed distribution is Negative Binomial with r = α and β = Yθ.
For the Gamma-Poisson, the variance of the mixed Negative Binomial is equal to:
mean of the Gamma + variance of the Gamma.
Var[X] = E[Var[X | λ]] + Var[E[X | λ]].

Mixing increases the variance.

Tails of Frequency Distributions (Section 20)


From lightest to heaviest tailed, the frequency distributions in the (a,b,0) class are:
Binomial, Poisson, Negative Binomial r > 1, Geometric, Negative Binomial r < 1.

Mahler's Guide to Loss Distributions
Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-2


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching


Mahler's Guide to Loss Distributions


Copyright 2013 by Howard C. Mahler.
The Loss Distributions concepts in Loss Models, by Klugman, Panjer, and Willmot,
are demonstrated.1
Information in bold or sections whose title is in bold are more important for passing the exam. Larger
bold type indicates it is extremely important. Information presented in italics (and sections whose
titles are in italics) should not be needed to directly answer exam questions and should be skipped
on first reading. It is provided to aid the reader's overall understanding of the subject, and to be
useful in practical applications.
Highly Recommended problems are double underlined.
Recommended problems are underlined.
Solutions to the problems in each section are at the end of that section.
Note that problems include both some written by me and some from past exams.2 The latter are
copyright by the Casualty Actuarial Society and SOA and are reproduced here solely to aid
students in studying for exams. The solutions and comments are solely the responsibility of the
author; the CAS and SOA bear no responsibility for their accuracy. While some of the comments
may seem critical of certain questions, this is intended solely to aid you in studying and in no way is
intended as a criticism of the many volunteers who work extremely long and hard to produce quality
exams.
Greek letters used in Loss Models:
α = alpha, β = beta, γ = gamma, θ = theta, λ = lambda, μ = mu, σ = sigma, τ = tau.
β = Beta, used for the Beta and incomplete Beta functions.
Γ = Gamma, used for the Gamma and incomplete Gamma functions.
Φ = Phi, used for the Normal distribution. φ = phi, used for the Normal density function.
Π = Pi is used for the continued product, just as Σ = Sigma is used for the continued sum.

1 In some cases the material covered is preliminary to the current Syllabus; you will be assumed to know it in order to answer exam questions, but it will not be specifically tested.
2 In some cases I've rewritten these questions in order to match the notation in the current Syllabus.


Loss Distributions as per Loss Models

Distribution Name         Distribution Function                                        Probability Density Function
Exponential               F(x) = 1 - e^(-x/θ)                                          f(x) = e^(-x/θ) / θ
Single Parameter Pareto   F(x) = 1 - (θ/x)^α, x > θ                                    f(x) = α θ^α / x^(α+1), x > θ
Weibull                   F(x) = 1 - exp[-(x/θ)^τ]                                     f(x) = τ (x/θ)^τ exp[-(x/θ)^τ] / x
Gamma                     F(x) = Γ[α; x/θ]                                             f(x) = (x/θ)^α e^(-x/θ) / {x Γ(α)}
LogNormal                 F(x) = Φ[(ln(x) - μ)/σ]                                      f(x) = exp[-(ln(x) - μ)^2 / (2σ^2)] / {x σ √(2π)}
Pareto                    F(x) = 1 - {θ/(θ+x)}^α                                       f(x) = α θ^α / (θ+x)^(α+1)
Inverse Gaussian          F(x) = Φ[(x/μ - 1)√(θ/x)] + e^(2θ/μ) Φ[-(x/μ + 1)√(θ/x)]     f(x) = √(θ/(2π)) exp[-θ(x/μ - 1)^2 / (2x)] / x^1.5
Inverse Gamma             F(x) = 1 - Γ[α; θ/x]                                         f(x) = θ^α e^(-θ/x) / {x^(α+1) Γ(α)}


Moments of Loss Distributions as per Loss Models

Distribution Name         Mean                Variance                               Moments
Exponential               θ                   θ^2                                    E[X^n] = n! θ^n
Single Parameter Pareto   αθ/(α-1)            αθ^2 / {(α-1)^2 (α-2)}                 E[X^n] = αθ^n / (α-n), α > n
Weibull                   θ Γ[1 + 1/τ]        θ^2 {Γ[1 + 2/τ] - Γ[1 + 1/τ]^2}        E[X^n] = θ^n Γ[1 + n/τ]
Gamma                     αθ                  αθ^2                                   E[X^n] = θ^n α(α+1)...(α+n-1) = θ^n Γ[α + n] / Γ[α]
LogNormal                 exp[μ + σ^2/2]      exp[2μ + σ^2] (exp[σ^2] - 1)           E[X^n] = exp[nμ + n^2 σ^2 / 2]
Pareto                    θ/(α-1)             αθ^2 / {(α-1)^2 (α-2)}                 E[X^n] = n! θ^n / {(α-1)...(α-n)}, α > n
Inverse Gaussian          μ                   μ^3/θ                                  E[X^n] = √(2θ/π) e^(θ/μ) μ^(n - 1/2) K_(n-1/2)(θ/μ)
Inverse Gamma             θ/(α-1)             θ^2 / {(α-1)^2 (α-2)}                  E[X^n] = θ^n / {(α-1)...(α-n)}, α > n

Section #   Pages      Section Name
1           9-10       Ungrouped Data
2           11-27      Statistics of Ungrouped Data
3           28-47      Coefficient of Variation, Skewness, and Kurtosis
4           48-57      Empirical Distribution Function
5           58-62      Limited Losses
6           63-64      Losses Eliminated
7           67-71      Excess Losses
8           72-78      Mean Excess Loss
9           79-86      Layers of Loss
10          87-99      Average Size of Losses in an Interval
11          100        Grouped Data
12          101-114    Working with Grouped Data
13          115-127    Uniform Distribution
14          128-139    Statistics of Grouped Data
15          140-157    Policy Provisions
16          158-170    Truncated Data
17          171-180    Censored Data
18          181-212    Average Sizes
19          213-218    Percentiles
20          219-223    Definitions
21          224-228    Parameters of Distributions
22          229-247    Exponential Distribution
23          248-262    Single Parameter Pareto Distribution
24          263-320    Common Two Parameter Distributions
25          321-336    Other Two Parameter Distributions
26          337-349    Three Parameter Distributions
27          350-363    Beta Function and Distribution
28          364-371    Transformed Beta Distribution
29          372-384    Producing Additional Distributions
30          385-410    Tails of Loss Distributions
31          411-468    Limited Expected Values
32          469-506    Limited Higher Moments
33          507-538    Mean Excess Loss
34          539-562    Hazard Rate
35          563-580    Loss Elimination Ratios and Excess Ratios
36          581-666    The Effects of Inflation
37          667-731    Lee Diagrams
38          732-784    N-Point Mixtures of Models
39          785-825    Continuous Mixtures of Models
40          826-849    Spliced Models
41          850-853    Relationship to Life Contingencies
42          854-873    Gini Coefficient
43          874-893    Important Ideas & Formulas

Exam 3/M Questions by Section of this Study Aid

[Charts in the original cross-reference past exam questions to the sections of this study aid, for the exams: Sample, 5/00, 11/00, 5/01, 11/01, 11/02, CAS 3 11/03, SOA 3 11/03, CAS 3 5/04, CAS 3 11/04, SOA 3 11/04, CAS 3 5/05, SOA M 5/05, CAS 3 11/05, SOA M 11/05, CAS 3 5/06, CAS 3 11/06, SOA M 11/06.]

The CAS/SOA did not release the 5/02 and 5/03 exams.
Starting in 11/03, the CAS and SOA gave separate exams.
The SOA did not release its 5/04 exam.
The SOA did not release its 5/06 exam.

Course 4 Exam Questions by Section of this Study Aid3

[A chart in the original cross-references past Course 4 / Exam 4 questions to the sections of this study aid, for the exams: Sample, 5/00, 11/00, 5/01, 11/01, 11/02, 11/03, 11/04, 5/05, 11/05, 11/06, 5/07.]

The CAS/SOA did not release the 5/02, 5/03, 5/04, 5/06, 11/07 and subsequent exams.

3 Questions on more advanced ideas are in Mahler's Guide to Fitting Loss Distributions.


Section 1, Ungrouped Data


There are 130 losses of sizes:
300
400
2,800
4,500
4,900
5,000
7,700
9,600
10,400
10,600
11,200
11,400
12,200
12,900
13,400
14,100
15,500
19,300
19,400
22,100
24,800
29,600
32,200
32,500
33,700
34,300

37,300
39,500
39,900
41,200
42,800
45,900
49,200
54,600
56,700
57,200
57,500
59,100
60,800
62,500
63,600
66,400
66,900
68,100
68,900
71,100
72,100
79,900
80,700
83,200
84,500
84,600

86,600
88,600
91,700
96,600
96,900
106,800
107,800
111,900
113,000
113,200
115,000
117,100
119,300
122,000
123,100
126,600
127,300
127,600
127,900
128,000
131,300
132,900
134,300
134,700
135,800
146,100

150,300
171,800
173,200
177,700
183,000
183,300
190,100
209,400
212,900
225,100
226,600
233,200
234,200
244,900
253,400
261,300
261,800
273,300
276,200
284,300
316,300
322,600
343,400
350,700
395,800
406,900

423,200
437,900
442,700
457,800
463,000
469,300
469,600
544,300
552,700
566,700
571,800
596,500
737,700
766,100
846,100
852,700
920,300
981,100
988,300
1,078,800
1,117,600
1,546,800
2,211,000
2,229,700
3,961,000
4,802,200

Each individual value is shown, rather than the data being grouped into intervals.
The type of data shown here is called individual or ungrouped data.
Some students will find it helpful to put this data set on a computer and follow along with the
computations in the study guide to the best of their ability.4 The best way to learn is by doing.

4 Even this data set is far bigger than would be presented on an exam. In many actual applications, there are many
thousands of claims, but such a large data set is very difficult to present in a Study Aid. It is important to realize that
with modern computers, actuaries routinely deal with such large data sets. There are other situations where all that is
available is a small data set such as presented here.


This ungrouped data set is used in many examples throughout this study guide:
300, 400, 2800, 4500, 4900, 5000, 7700, 9600, 10400, 10600, 11200, 11400, 12200, 12900,
13400, 14100, 15500, 19300, 19400, 22100, 24800, 29600, 32200, 32500, 33700, 34300,
37300, 39500, 39900, 41200, 42800, 45900, 49200, 54600, 56700, 57200, 57500, 59100,
60800, 62500, 63600, 66400, 66900, 68100, 68900, 71100, 72100, 79900, 80700, 83200,
84500, 84600, 86600, 88600, 91700, 96600, 96900, 106800, 107800, 111900, 113000,
113200, 115000, 117100, 119300, 122000, 123100, 126600, 127300, 127600, 127900,
128000, 131300, 132900, 134300, 134700, 135800, 146100, 150300, 171800, 173200,
177700, 183000, 183300, 190100, 209400, 212900, 225100, 226600, 233200, 234200,
244900, 253400, 261300, 261800, 273300, 276200, 284300, 316300, 322600, 343400,
350700, 395800, 406900, 423200, 437900, 442700, 457800, 463000, 469300, 469600,
544300, 552700, 566700, 571800, 596500, 737700, 766100, 846100, 852700, 920300,
981100, 988300, 1078800, 1117600, 1546800, 2211000, 2229700, 3961000, 4802200


Section 2, Statistics of Ungrouped Data


For the ungrouped data in Section 1:
Average of X = E[X] = 1st moment = ΣXi/n = 40,647,700/130 ≅ 312,674.6.
Average of X^2 = E[X^2] = 2nd moment about the origin = ΣXi^2/n = 4.9284598 x 10^11.
(empirical) Mean = X̄ = 312,674.6.
(empirical) Variance = E[X^2] - E[X]^2 = 3.9508 x 10^11.
(empirical) Standard Deviation = Square Root of Variance = 6.286 x 10^5.

Mean:
The mean is the average or expected value of the random variable.
Mean of the variable X = E[X].
Empirical Mean of a sample of random draws from X = X .
The mean of the data allows you to set the scale for the fitted distribution.
In general means add: E[X + Y] = E[X] + E[Y].
Also multiplying a variable by a constant multiplies the mean by the same constant;
E[kX] = kE[X]. The mean is a linear operator, E[aX + bY] = aE[X] + bE[Y].
Mode:
The mean differs from the mode which represents the value most likely to occur. For a continuous
distribution function the mode is the point at which the density function reaches its
maximum. For the empirical data in Section 1 there is no clear mode5. For discrete distributions, for
example frequency distributions, the mode has the same definition but is easier to pick out. If one
multiplies all the claims by a constant, the mode is multiplied by that same constant.
Median:
The median is that value such that half of the claims are on either side. At the median the
distribution function is 0.5. The median is the 50th percentile. For a discrete loss distribution, one
may linearly interpolate in order to estimate the median. If one multiplies all the claims by a constant,
the median is multiplied by that same constant.

5 One would expect a curve fit to this data to have a mode much smaller than the mean.


The sample median for the data in Section 1 is about $121 thousand.6 This is much less than the
sample mean of about $313 thousand. While the mean can be affected greatly by a few large
claims, the median is affected equally by each claim size.
For a continuous7 distribution with positive skewness typically: mean > median > mode (alphabetical
order.) The situation is reversed for negative skewness. Also usually the median is closer to the
mean than to the mode (just as it is in the dictionary.)8

Variance:
The variance is the expected value of the squared difference of the variable and its mean.
The variance is the second central moment.
Var[X] = E[(X - E[X])^2] = E[X^2] - E[X]^2 = second moment minus the square of the first moment.
For the Ungrouped Data in Section 1, we calculate the empirical variance as:
(1/N) Σ(Xi - X̄)^2 = E[X^2] - E[X]^2 = 4.92845 x 10^11 - 312,674.6^2 = 3.9508 x 10^11.
Thus if X is in dollars, then Var[X] is in dollars squared.
Multiplying a variable by a constant multiplies the variance by the square of that constant;
Var[kX] = k^2 Var[X]. In particular, Var[-X] = Var[X].
Exercise: Var[X] = 6. What is Var[3X]?
[Solution: Var[3X] = 3^2 Var[X] = (9)(6) = 54.]
For independent random variables the variances add.9
If X and Y are independent, then Var [X + Y] = Var [X] + Var [Y].
Also, if X and Y are independent, then Var[aX + bY] = a^2 Var[X] + b^2 Var[Y].
In particular, Var[X - Y] = Var[X] + Var[Y], for X and Y independent.
6 The 65th out of 130 claims is $119,300 and the 66th claim is $122,000. As discussed in Mahler's Guide to Fitting Loss Distributions, a point estimate for the median would be at the (0.5)(1+130) = 65.5th claim. So one would linearly interpolate halfway between the 65th and 66th claims to get a point estimate of the median of: (0.5)(119,300) + (0.5)(122,000) = $120,650.
7 For frequency distributions the relationship may be different due to the fact that only certain discrete values can appear as the mode or median.
8 See page 49 of Kendall's Advanced Theory of Statistics, Volume 1 (1994) by Stuart & Ord.
9 In general Var[X+Y] = Var[X] + Var[Y] + 2Cov[X,Y], where Cov[X,Y] = E[XY] - E[X]E[Y] = covariance of X and Y. For X and Y independent, E[XY] = E[X]E[Y] and Cov[X,Y] = 0.


Exercise: Var[X] = 6. Var[Y] = 8. X and Y are independent. What is Var[3X + 10Y]?


[Solution: Var[3X + 10Y] = 9Var[X] + 100Var[Y] = 54 + 800 = 854.]
Note that if X and Y are independent and identically distributed, then Var[X1 + X2 ] =
2 Var [X]. Adding up n such variables gives a variable with variance = nVar[X].
Exercise: Var[X] = 6. What is Var[X1 + X2 + X3 ]?
[Solution: Var[X1 + X2 + X3 ] = Var[X] + Var[X] + Var[X] = 3Var[X] = (3)(6) = 18.]
Averaging consists of summing n random draws, and then dividing by n.
Averaging n such variables gives a variable with variance:
Var[(1/n) ΣXi] = Var[ΣXi] / n^2 = n Var[X] / n^2 = Var[X] / n.
Thus the sample mean has a variance that is inversely proportional to the number of
points. Therefore, the sample mean has a standard deviation that is inversely proportional to the
square root of the number of points.
Exercise: Var[X] = 6.
What is the variance of the average of 100 independent random draws from X?
[Solution: Var[X] / n = 6/100 = .06.]
While variances add for independent variables, more generally:
Var[X + Y] = Var[X] + Var[Y] + 2Cov[X,Y].
Exercise: Var[X] = 6, Var[Y] = 8, and Cov[X, Y] = 5. What is Var[X + Y]?
[Solution: Var[X + Y] = Var[X] + Var[Y] + 2Cov[X,Y] = 6 + 8 + (2)(5) = 24.]

Covariances and Correlations:


The Covariance of two variables X and Y is defined by:
Cov[X, Y] = E[XY] - E[X]E[Y].
Exercise: E[X] = 3, E[Y] = 5, and E[XY] = 25. What is the covariance of X and Y?
[Solution: Cov[X,Y] = E[XY] - E[X]E[Y] = 25 - (3)(5) = 10.]
Since Cov[X, X] = E[X^2] - E[X]E[X] = Var[X], the covariance is a generalization of the variance.


Covariances have the following useful properties:


Cov[X, aY] = aCov[X, Y].
Cov[X, Y] = Cov[Y, X].
Cov[X, Y + Z] = Cov[X, Y] + Cov[X, Z].
The Correlation of two random variables is defined in terms of their covariances:
Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).
Exercise: Var[X] = 6, Var[Y] = 8, and Cov[X, Y] = 5. What is Corr[X, Y]?
[Solution: Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]) = 5 / √{(6)(8)} = 0.72.]
The correlation is always between -1 and +1.
Corr[X, Y] = Corr[Y, X].     Corr[X, X] = 1.     Corr[X, -X] = -1.
Corr[X, aY] = Corr[X, Y] if a > 0; 0 if a = 0; -Corr[X, Y] if a < 0.
Corr[X, aX] = 1 if a > 0.


Two variables that are proportional with a positive proportionality constant are perfectly correlated
and have a correlation of one. Closely related variables, such as height and weight, have a
correlation close to but less than one. Unrelated variables have a correlation near zero. Inversely
related variables, such as the average temperature and use of heating oil, are negatively correlated.
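A brief Python sketch of these definitions (not from the guide; the two data lists are made up for illustration):

import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y   # Cov[X,Y] = E[XY] - E[X]E[Y]
var_x = sum(a * a for a in x) / n - mean_x ** 2
var_y = sum(b * b for b in y) / n - mean_y ** 2
corr = cov / math.sqrt(var_x * var_y)
print(round(cov, 3), round(corr, 3))   # strongly positively correlated data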
Standard Deviation:
The standard deviation is the square root of the variance.
If X is in dollars, then the standard deviation of X is also in dollars.
STDDEV[kX] = kSTDDEV[X].
Exercise: Var[X] = 16. Var[Y] = 9. X and Y are independent.
What is the standard deviation of X + Y?
[Solution: Var[X + Y] = 16 + 9 = 25. StdDev[X + Y] = √25 = 5.
Comment: Standard deviations do not add. In the exercise, 4 + 3 ≠ 5.]


Exercise: Var[X] = 16.
What is the standard deviation of the average of 100 independent random draws from X?
[Solution: variance of the average = Var[X] / n = 16/100 = 0.16.
standard deviation of the average = √0.16 = 0.4.
Alternately, StdDev[X] = √16 = 4.
standard deviation of the average = StdDev[X] / √n = 4/10 = 0.4.]

Sample Variance:
Sample Mean = ΣXi / N = X̄.

Note that the variance as calculated above, Σ(Xi - X̄)^2 / N, is a biased estimator of the variance of the
distribution from which this data set was drawn.

The sample variance is an unbiased estimator of the variance of the distribution from
which a data set was drawn:10

Sample Variance = Σ(Xi - X̄)^2 / (N - 1).

For the Ungrouped Data in Section 1, we calculate the sample variance as:
Σ(Xi - X̄)^2 / (N - 1) = 3.9814 x 10^11.

For 130 data points, the sample variance is the empirical variance multiplied by:
N / (N - 1) = 130/129.
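A short Python sketch of these two estimators (not part of the original guide; plug in the full 130 losses of Section 1 in place of the abbreviated list shown):

losses = [300, 400, 2800, 4500, 4900]   # abbreviated; use the full data set from Section 1

n = len(losses)
mean = sum(losses) / n
empirical_variance = sum((x - mean) ** 2 for x in losses) / n       # biased estimator
sample_variance = sum((x - mean) ** 2 for x in losses) / (n - 1)    # unbiased estimator
print(mean, empirical_variance, sample_variance, sample_variance / empirical_variance)
# The last value is always N/(N - 1); with all 130 losses the two variances are
# roughly 3.9508e11 and 3.9814e11, as computed above.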

10 Bias is discussed in Mahler's Guide to Fitting Loss Distributions.


Problems:
Use the following information for the next two questions:
Prob[ X = 1] = 70%, Prob[X = 5] = 20%, Prob[X =10] = 10%.
2.1 (1 point) What is the mean of X?
A. less than 3.0
B. at least 3.0 but less than 3.5
C. at least 3.5 but less than 4.0
D. at least 4.0 but less than 4.5
E. at least 4.5
2.2 (1 point) What is the variance of X?
A. less than 6
B. at least 6 but less than 7
C. at least 7 but less than 8
D. at least 8 but less than 9
E. at least 9
Use the following data set for the next two questions:
4, 7, 13, 20.
2.3 (1 point) What is the mean?
A. 8     B. 9     C. 10     D. 11     E. 12

2.4 (1 point) What is the sample variance?
A. 30     B. 35     C. 40     D. 45     E. 50

2.5 (1 point) You are given the following:
Let X be a random variable.
Y is defined to be X/2.
Determine the correlation coefficient of X and Y.
A. 0.00     B. 0.25     C. 0.50     D. 0.75     E. 1.00

2.6 (2 points) X and Y are two independent variables.


E[X] = 3. Var[X] = 5. E[Y] = 6. Var[Y] = 2.
Let Z = XY. Determine the standard deviation of Z.
A. 14
B. 16
C. 18
D. 20
E. 22



2.7 (1 point) Let X1 , X2 , ..., X20, be 20 independent, identically distributed


random variables, each of which has variance of 17. If one estimates the mean by the average of
the 20 observed values, what is the variance of this estimate?
A. less than .6
B. at least .6 but less than .7
C. at least .7 but less than .8
D. at least .8 but less than .9
E. at least .9
2.8 (1 point) Let X and Y be independent random variables.
Which of the following statements are true?
1. If Z is the sum of X and Y, the variance of Z is the sum of the variance of X and the
variance of Y.
2. If Z is the difference between X and Y, the variance of Z is the difference between the
variance of X and the variance of Y.
3. If Z is the product of X and Y, then the expected value of Z is the product of the
expected values of X and Y.
A. None of 1, 2, or 3
B. 1
C. 2
D. 3
E. None of A ,B, C, or D
Use the following information for the next two questions:
Height of Husband (inches):   66   68   69   71   73
Height of Wife (inches):      64   63   67   65   69

2.9 (2 points) What is the covariance of the heights of husbands and wives?
A. 2
B. 3
C. 4
D. 5
E. 6
2.10 (2 points) What is the correlation of the heights of husbands and wives?
A. 70%
B. 75%
C. 80%
D. 85%
E. 90%
2.11 (1 point) Let x and y be two independent random draws from a continuous distribution.
Demonstrate that the mean squared difference between x and y is twice the variance of the
distribution.
2.12 (2, 5/83, Q.24) (1.5 points) Let the random variable X have the density function
f(x) = kx for 0 < x < √(2/k). If the mode of this distribution is at x = √2/4,
then what is the median of this distribution?
A. √2/6     B. 1/4     C. √2/4     D. √2/24     E. 1/2


2.13 (2, 5/83, Q.30) (1.5 point) Below are shown the probability density functions of two
symmetric bounded distributions with the same median.

Which of the following statements about the means and standard deviations of the two distributions
are true?
A. μII > μI and σII = σI
B. μII > μI and σII > σI
C. μI = μII and σII < σI
D. μI = μII and σI < σII
E. Cannot be determined from the given information

2.14 (2, 5/83, Q.49) (1.5 point) Let X and Y be random variables with Var(X) = 4,
Var(Y) = 9, and Var(X - Y) = 16. What is Cov(X, Y)?
A. -3/2
B. -1/2
C. 1/2
D. 3/2
E. 13/16
2.15 (2, 5/85, Q.5) (1.5 points)
Let X and Y be random variables with variances 2 and 3, respectively, and covariance -1.
Which of the following random variables has the smallest variance?
A. 2X + Y
B. 2X - Y
C. 3X - Y
D. 4X
E. 3Y
2.16 (2, 5/85, Q.11) (1.5 points) Let X and Y be independent random variables, each with
density f(t) = 1/(2θ) for -θ < t < θ. If Var(XY) = 64/9, then what is θ?
A. 1     B. 2     C. 4√3/3     D. 2√2     E. 8√3/3

2.17 (2, 5/85, Q.47) (1.5 points) Let X be a random variable with finite variance.
If Y = 15 - X, then determine Corr[X, (X + Y)X].
A. -1
B. 0
C. 1/15
D. 1
E. Cannot be determined from the information given.
2.18 (4, 5/86, Q.31) (1 point) Which of the following statements are true about the distribution of a
random variable X?
1. If X is discrete, the value of X which occurs most frequently is the mode.
2. If X is continuous, the expected value of X is equal to the mode of X.
3. The median of X is the value of X which divides the distribution in half.
A. 1
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3


2.19 (4, 5/86, Q.32) (1 point) Let X, Y and Z be random variables.


Which of the following statements are true?
1. The variance of X is the second moment about the origin of X.
2. If Z is the product of X and Y, then the expected value of Z is the product of the expected
values of X and Y.
3. The expected value of X is equal to the expectation over all possible values of Y, of the
conditional expectation of X given Y.
A. 2
B. 3
C. 1, 2
D. 1, 3
E. 2, 3
2.20 (2, 5/88, Q.43) (1.5 points) X, Y, and Z have means 1, 2, and 3, respectively, and variances
4, 5, and 9, respectively. The covariance of X and Y is 2, the covariance of X and Z is 3, and the
covariance of Y and Z is 1.
What are the mean and variance, respectively, of the random variable 3X + 2Y - Z?
A. 4 and 31
B. 4 and 65
C. 4 and 67
D. 14 and 13
E. 14 and 65
2.21 (4, 5/89, Q.26) (1 point) If the random variables X and Y are not independent, which of the
following equations will still be true?
1. E(X + Y) = E(X) + E(Y)
2. E(XY) = E(X) E(Y)
3. Var (X + Y) = Var (X) + Var (Y)
A. 1
B. 2
C. 1, 2
D. 1, 3
E. None of the above
2.22 (2, 5/90, Q.18) (1.7 points) Let X be a continuous random variable with density function
f(x) = x(4 - x)/9 for 0 < x < 3. What is the mode of X?
A. 4/9
B. 1
C. 1/2
D. 7/4
E. 2
2.23 (2, 5/92, Q.2) (1.7 points) Let X be a random variable such that E(X) = 2, E(X3 ) = 9, and
E[(X - 2)3 ] = 0. What is Var(X)?
A. 1/6
B. 13/6
C. 25/6

D. 49/6

E. 17/2

2.24 (4B, 5/93, Q.9) (1 point) If X and Y are independent random variables, which of the following
statements are true?
1. Var[X + Y] = Var[X] + Var[Y]
2. Var[X - Y] = Var[X] + Var[Y]
3. Var[aX + bY] = a²E[X²] - a(E[X])² + b²E[Y²] - b(E[Y])²
A. 1    B. 1,2    C. 1,3    D. 2,3    E. 1,2,3


2.25 (4B, 5/94, Q.5) (2 points) Two honest, six-sided dice are rolled, and the results D1 and D2
are observed. Let S = D1 + D2 . Which of the following are true concerning the conditional
distribution of D1 given that S<6?
1. The mean is less than the median.
2. The mode is less than the mean.
3. The probability that D1 = 2 is 1/3.
A. 2

B. 3

C. 1, 2

D. 2, 3

E. None of A, B, C, or D

2.26 (Course 160 Sample Exam #3, 1994, Q.1) (1.9 points) You are given:
(i) T is the failure time random variable.
(ii) f(t) = {(10 - t)/10}⁹ for 0 < t ≤ 10.
Calculate the ratio of the mean of T to the median of T.
(A) 0.67
(B) 0.74
(C) 1.00
(D) 1.36
(E) 1.49
2.27 (2, 2/96, Q.25) (1.7 points) The sum of the sample mean and median of ten distinct data
points is equal to 20. The largest data point is equal to 15. Calculate the sum of the sample mean
and median if the largest data point were replaced by 25.
A. 20
B. 21
C. 22
D. 30
E. 31
2.28 (Course 1 Sample Exam, Q.9) (1.9 points)
The distribution of loss due to fire damage to a warehouse is:
Amount of Loss     Probability
      0              0.900
    500              0.060
  1,000              0.030
 10,000              0.008
 50,000              0.001
100,000              0.001
Given that a loss is greater than zero, calculate the expected amount of the loss.
A. 290
B. 322
C. 1,704
D. 2,900
E. 32,222


2.29 (1, 5/00, Q.8) (1.9 points) A probability distribution of the claim sizes for an auto insurance
policy is given in the table below:
Claim Size     Probability
    20            0.15
    30            0.10
    40            0.05
    50            0.20
    60            0.10
    70            0.10
    80            0.30
What percentage of the claims are within one standard deviation of the mean claim size?
(A) 45%
(B) 55%
(C) 68%
(D) 85%
(E) 100%
2.30 (IOA 101, 9/00, Q.2) (3 points) Consider a random sample of 47 white-collar workers and a
random sample of 24 blue-collar workers from the workforce of a large company. The mean salary
for the sample of white-collar workers is 28,470 and the standard deviation is 4,270; whereas the
mean salary for the sample of blue-collar workers is 21,420 and the standard deviation is 3,020.
Calculate the mean and the standard deviation of the salaries in the combined sample of 71
employees.
2.31 (1, 11/00, Q.1) (1.9 points) A recent study indicates that the annual cost of maintaining and
repairing a car in a town in Ontario averages 200 with a variance of 260.
If a tax of 20% is introduced on all items associated with the maintenance and repair of
cars (i.e., everything is made 20% more expensive), what will be the variance of the
annual cost of maintaining and repairing a car?
(A) 208
(B) 260
(C) 270
(D) 312
(E) 374
2.32 (1, 11/00, Q.38) (1.9 points) The profit for a new product is given by Z = 3X - Y - 5.
X and Y are independent random variables with Var(X) = 1 and Var(Y) = 2.
What is the variance of Z?
(A) 1
(B) 5
(C) 7
(D) 11
(E) 16
2.33 (1, 11/01, Q.7) (1.9 points) Let X denote the size of a surgical claim and let Y denote the size
of the associated hospital claim. An actuary is using a model in which E(X) = 5,
E(X2 ) = 27.4, E(Y) = 7, E(Y2 ) = 51.4, and Var(X+Y) = 8.
Let C1 = X+Y denote the size of the combined claims before the application of a 20%
surcharge on the hospital portion of the claim, and let C2 denote the size of the combined
claims after the application of that surcharge. Calculate Cov(C1 , C2 ).
(A) 8.80
(B) 9.60
(C) 9.76
(D) 11.52
(E) 12.32


2.34 (1, 5/03, Q.15) (2.5 points) An insurance policy pays a total medical benefit consisting of two
parts for each claim. Let X represent the part of the benefit that is paid to the surgeon, and let Y
represent the part that is paid to the hospital. The variance of X is 5000, the variance of Y is 10,000,
and the variance of the total benefit, X + Y, is 17,000.
Due to increasing medical costs, the company that issues the policy decides to increase
X by a flat amount of 100 per claim and to increase Y by 10% per claim.
Calculate the variance of the total benefit after these revisions have been made.
(A) 18,200 (B) 18,800 (C) 19,300 (D) 19,520 (E) 20,670


Solutions to Problems:
2.1. A. E[X] = (70%)(1) + (20%)(5) + (10%)(10) = 2.7.
2.2. D. E[X2 ] = (70%)(12 ) + (20%)(52 ) + (10%)(102 ) = 15.7. Var[X] = 15.7 - 2.72 = 8.41.
Alternately, Var[X] = (70%)(1 - 2.7)2 + (20%)(5 - 2.7)2 + (10%)(10 - 2.7)2 = 8.41.
2.3. D. mean = (4 + 7 + 13 + 20)/4 = 11.
2.4. E. sample variance = {(4 - 11)2 + (7 - 11)2 + (13 - 11)2 + (20 - 11)2 }/(4 -1) = 50.
2.5. E. Var[Y] = Var[X/2] = Var[X]/4.
Cov[X, Y] = Cov[X, X/2] = Cov[X, X]/2 = Var[X]/2.
Therefore, Corr[X, Y] = (Var[X]/2) / √(Var[X] Var[X]/4) = 1.
Comments: Two variables that are proportional with a positive proportionality constant are perfectly
correlated and have a correlation of one.
2.6. A. E[Z] = E[X]E[Y] = (3)(6) = 18.
E[Z²] = E[X²]E[Y²] = (5 + 3²)(2 + 6²) = 532.
Var[Z] = 532 - 18² = 208. StdDev[Z] = √208 = 14.4.

2.7. D. The estimated mean is: (1/20) Σ xi, where the sum runs from i = 1 to 20. Therefore,
Var(mean) = Var(Σ xi/20) = (1/20²) Σ Var(xi) = (20)(1/20²) Var(x) = 17/20 = .85.
Comment: Since the xi are independent, Var(x1 +x2 ) = Var(x1 )+Var(x2 ). Since they are identically
distributed Var(x1 )= Var(x2 ). Since Var(aY) = a2 Var(Y), Var(x1 /20) =
(1/202 )Var(x1 ). Note that as the number of observations n increases, the variance of the mean
decreases as 1/n.
2.8. E. 1. True, since X and Y are independent.
2. False. In general VAR[X - Y] = VAR[X] + VAR[Y] - 2COV[X, Y]. When X and Y are
independent, Cov[X, Y] = 0 and therefore, VAR[X - Y] = VAR[X] + VAR[Y].
3. True. In general E[XY] ≠ E[X]E[Y], although this equality does hold if X and Y are independent.


2.9. C. Let H = husbands heights and W = wives heights. E[H] = 69.4. E[W] = 65.6.
E[HW] = {(66)(64) + (68)(63) + (69)(67) + (71)(65) + (73)(69)}/5 = 4556.6.
Cov[H, W] = 4556.6 - (69.4)(65.6) = 3.96.
2.10. B. Var[H] = {(66 - 69.4)² + (68 - 69.4)² + (69 - 69.4)² + (71 - 69.4)² + (73 - 69.4)²}/5 = 5.84.
Var[W] = {(64 - 65.6)² + (63 - 65.6)² + (67 - 65.6)² + (65 - 65.6)² + (69 - 65.6)²}/5 = 4.64.
Corr[H, W] = Cov[H, W]/√(Var[H] Var[W]) = 3.96/√((5.84)(4.64)) = 0.761.
2.11. E[(X - Y)2 ] = E[X2 - 2XY + Y2 ] = E[X2 ] - 2E[XY] + E[Y2 ] = E[X2 ] - 2 E[X]E[Y] + E[X2 ] =
2E[X2 ] - 2E[X]2 = 2 Var[X].
2.12. B. The mode is where the density is largest, which in this case is at the righthand endpoint of
the support, √(2/k). √(2/k) = √2/4. ⇒ √k = 4. ⇒ k = 16.
f(x) = 16x. F(x) = 8x². At the median, F(x) = 0.5. 8x² = 0.5. ⇒ x = 1/4.
2.13. D. The means are each equal to the medians, since the distributions are symmetric.
The two medians are equal, therefore so are the means.
The second central moment of distribution II is larger than that of distribution I, since distribution II is
more dispersed around its mean. σII² > σI². ⇒ σII > σI.
2.14. A. 16 = Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y). Cov(X, Y) = -(16 - 4 - 9)/2 = -3/2.
2.15. A. Var[2X + Y] = (4)(2) + 3 + (2)(2)(-1) = 7.
Var[2X - Y] = (4)(2) + 3 - (2)(2)(-1) = 15. Var[3X - Y] = (9)(2) + 3 - (2)(3)(-1) = 27.
Var[4X] = (16)(2) = 32. Var[3Y] = (9)(3) = 27. 2X + Y has the smallest variance.
2.16. D. E[X] = E[Y] = 0. Var[X] = Var[Y] = (2θ)²/12 = θ²/3. E[X²] = E[Y²] = θ²/3.
Var[XY] = E[(XY)²] - E[XY]² = E[X²]E[Y²] - E[X]²E[Y]² = (θ²/3)(θ²/3) - 0 = θ⁴/9.
θ⁴/9 = 64/9. ⇒ θ = √8 = 2√2.

2.17. D. X + Y = 15. Corr[X, (X + Y)X] = Corr[X, 15X] = 1.


Comment: Two variables that are proportional with a positive constant have a correlation of 1.
2.18. C. 1. True. 2. False. The expected value of X is the mean. Usually the mean and the mode
are not equal. 3. True.


2.19. B. 1. False. The variance is the second central moment: VAR[X] = E[(X-E[X])2] =
E[X2] - E[X]2. The second moment around the origin is E[X2].
2. False. COVAR[X,Y] = E[XY] - E[X]E[Y], so statement 2 only holds when the covariance of X and
Y is zero. (This is true if X and Y are independent.) 3. True. E[X] = EY[E[X|Y]].
2.20. C. E[3X + 2Y - Z] = 3E[X] + 2E[Y] - E[Z] = (3)(1) + (2)(2) - 3 = 4.
Var[3X + 2Y - Z] = 9Var[X] + 4Var[Y] + Var[Z] + 12Cov[X, Y] - 6Cov[X, Z] - 4Cov[Y, Z] =
(9)(4) + (4)(5) + 9 + (12)(2) - (6)(3) - (4)(1) = 67.
2.21. A. Means always add, so statement 1 is true. E[XY] =E[X]E[Y] + COVAR[X,Y], therefore
E[XY] = E[X]E[Y] if and only if the covariance of X and Y is zero. Thus statement 2 is not true in
general. In general VAR[X+Y] = VAR[X] + VAR[Y] + 2 COVAR[X,Y]. If X and Y are independent
then their covariance is zero, and statement 3 would hold. However, statement 3 is not true in
general.
2.22. E. f′(x) = (4 - x)/9 - x/9 = 0. ⇒ x = 2. f(2) = 4/9. Check endpoints: f(0) = 0, f(3) = 1/3.
2.23. A. 0 = E[(X - 2)3 ] = E[X3 ] - 6E[X2 ] + 12E[X] - 8. 9 - 6E[X2 ] + (12)(2) - 8 = 0.

E[X2 ] = 25/6. Var(X) = E[X2 ] - E[X]2 = 25/6 - 22 = 1/6.


2.24. B. 1. True. 2. True. 3. For X and Y Independent, Var[aX + bY] = a2 Var[X] + b2 Var[Y] =
a2 E[X2 ] - a2 E[X]2 + b2 E[Y2 ] - b2 E[Y]2 , therefore Statement #3 is False.
2.25. A. When S < 6 we have the following equally likely possibilities:
                 D2                                Conditional Density Function
D1        1    2    3    4    Possibilities        of D1 given that S < 6
1         x    x    x    x          4                      4/10
2         x    x    x               3                      3/10
3         x    x                    2                      2/10
4         x                         1                      1/10
The mean of the conditional density function of D1 given that S < 6 is:
(.4)(1) + (.3)(2) + (.2)(3) + (.1)(4) = 2.
The median is equal to 2, since the Distribution Function at 2 is .7 ≥ .5, but at 1 it is .4 < .5. The
mode is 1, since that is the value at which the density is a maximum. 1. F, 2. T, 3. F.


2.26. D. S(t) = {(10 - t)/10}¹⁰. E[T] = ∫ S(t) dt from 0 to 10 = 10/11 = .9091.
Set 0.5 = S(t) = {(10 - t)/10}¹⁰. ⇒ Median = 10(1 - 0.5^0.1) = 0.6697.
Mean/Median = 0.9091/0.6697 = 1.357.
2.27. B. The sample median remains the same, while the sample mean is increased by
(25 - 15)/10 = 1. The sum of the sample mean and median is now: 20 + 1 = 21.
2.28. D. {(500)(0.060) + (1,000)(0.030) + (10,000)(0.008) + (50,000)(0.001) +
(100,000)(0.001)}/(.06 + .03 + .008 + .001 + .001) = 29000/.10 = 2900.
2.29. A. mean = (20)(0.15) + (30)(0.10) + (40)(0.05) + (50)(0.20) + (60)(0.10) + (70)(0.10) +
(80)(0.30) = 55.
second moment = (20²)(0.15) + (30²)(0.10) + (40²)(0.05) + (50²)(0.20) + (60²)(0.10) +
(70²)(0.10) + (80²)(0.30) = 3500. standard deviation = √(3500 - 55²) = 21.79.
Prob[within one standard deviation of the mean] = Prob[33.21 ≤ X ≤ 76.79]
= .05 + .20 + .10 + .10 = 45%.
2.30. Total is: (47)(28,470) + (24)(21,420) = 1,852,170.
Overall mean is: 1,852,170/(47 + 24) = 26,087.
The overall second moment is: {(47)(4,270² + 28,470²) + (24)(3,020² + 21,420²)}/(47 + 24) =
706,800,730.
Overall variance is: 706,800,730 - 26,087² = 26,269,161.
Overall standard deviation is: √26,269,161 = 5125.

2.31. E. When one multiplies a variable by a constant, in this case 1.2, one multiplies the variance
by the square of that constant, in this case 1.2² = 1.44. (1.44)(260) = 374.4.
2.32. D. Var[Z] = Var[3X - Y - 5] = 9Var[X] + Var[Y] = (9)(1) + 2 = 11.
2.33. A. Var[X] = 27.4 - 52 = 2.4. Var[Y] = 51.4 - 72 = 2.4.
Cov[X, Y] = (Var[X+Y] - Var[X] - Var[Y])/2 = (8 - 2.4 - 2.4)/2 = 1.6.
Cov[C1 , C2 ] = Cov[X + Y, X + 1.2Y] = Cov[X, X] + Cov[X, 1.2 Y] + Cov[Y, X] + Cov[Y, 1.2 Y]
= Var[X] + 1.2 Cov[X, Y] + Cov[Y, X] + 1.2 Cov[Y, Y]
=Var[X] + 1.2 Var[Y] + 2.2 Cov[X, Y] = 2.4 + (1.2)(2.4) + (2.2)(1.6) = 8.8.


2.34. C. Cov[X, Y] = (Var[X + Y] - Var[X] - Var[Y])/2 = 1000.


Adding a flat amount of 100 does not affect the variance.
Var[X + 1.10Y] = Var[X] + 1.21Var[Y] + 2(1.1)Cov[X, Y] = 5000 + (1.21)(10000) + (2.2)(1000) =
19,300.


Section 3, Coefficient of Variation, Skewness, and Kurtosis


The coefficient of variation, skewness, and kurtosis, all help to describe the shape of a size of loss
distribution.
For the ungrouped data in Section 1:
Average of X³ = E[X³] = 3rd moment about the origin = Σxi³/n = 1.600225 x 10¹⁸.
Average of X⁴ = E[X⁴] = 4th moment about the origin = Σxi⁴/n = 6.465278 x 10²⁴.
Coefficient of Variation = Standard Deviation / Mean = 6.286 x 10⁵ / 312,675 = 2.01.
(Coefficient of) Skewness = γ1 = E[(X - E[X])³]/StdDev³ = {E[X³] - 3 E[X] E[X²] + 2 E[X]³}/StdDev³ = 4.83.
Kurtosis = γ2 = E[(X - E[X])⁴]/Variance² = {E[X⁴] - 4 E[X] E[X³] + 6 E[X]² E[X²] - 3 E[X]⁴}/Variance² = 30.3.

Coefficient of Variation:
The coefficient of variation (CV) = standard deviation / mean.
The coefficient of variation measures how dispersed the sizes of loss are around their mean. The
larger the coefficient of variation the more dispersed the distribution. The coefficient of variation helps
describe the shape of the distribution.
Exercise: Let 5 and 77 be the first two moments (around the origin) of a distribution.
What is the coefficient of variation of this distribution?
[Solution: Variance = 77 - 5² = 52. CV = √52 / 5 = 1.44.]

Since if X is in dollars then both the standard deviation and the mean are in dollars, the coefficient of
variation is a dimensionless quantity; i.e., it is a pure number which is not in any particular currency.
Thus the coefficient of variation of X is unaffected if X is multiplied by a constant.

When adding two independent random variables: CV[X + Y] = √(Var[X] + Var[Y]) / (E[X] + E[Y]).
In particular, when adding two independent identically distributed random variables: CV[X + X] = CV[X]/√2.


So if one adds up more and more independent identically distributed random variables, then the
coefficient of variation declines towards zero.11
The following formula for unity plus the square of the coefficient of variation follows directly from the
definition of the Coefficient of Variation.
CV² = Variance / E[X]² = (E[X²] - E[X]²) / E[X]² = (E[X²] / E[X]²) - 1.
Thus, 1 + CV² = E[X²] / E[X]² = 2nd moment divided by the square of the mean.
Exercise: One observes losses of sizes: $300, $600, $1,200, $1,500, and $2,800.
Determine the empirical coefficient of variation.
[Solution: X̄ = (300 + 600 + 1200 + 1500 + 2800)/5 = 1280. Variance =
{(300 - 1280)² + (600 - 1280)² + (1200 - 1280)² + (1500 - 1280)² + (2800 - 1280)²}/5 =
757,600. CV = √757,600 / 1280 = 0.680.
Alternately, the 2nd moment is: (300² + 600² + 1200² + 1500² + 2800²)/5 = 2,396,000.
1 + CV² = 2,396,000 / 1280² = 1.4624. ⇒ CV = 0.680.
Comment: Note the use of the biased estimator of the variance rather than the sample variance.
The CV would be the same using each of the losses divided by 100.]
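For those who like to check such computations by machine, here is a brief Python sketch (an illustrative addition, not part of the original exercise); it reproduces the figures above, using the biased variance with n in the denominator.

```python
# Illustrative sketch (an addition, not from the study guide): reproduce the exercise
# above, using the biased variance, i.e. dividing by n rather than n - 1.
losses = [300, 600, 1200, 1500, 2800]
n = len(losses)
mean = sum(losses) / n
variance = sum((x - mean) ** 2 for x in losses) / n      # biased estimator
cv = variance ** 0.5 / mean
second_moment = sum(x ** 2 for x in losses) / n
print(mean, variance, round(cv, 3))                      # 1280.0 757600.0 0.68
print(round(second_moment / mean ** 2, 4))               # 1.4624 = 1 + CV^2
```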
Skewness:
The (coefficient of) skewness is defined as the 3rd central moment divided by the cube of the
standard deviation:
Skewness = γ1 = E[(X - E[X])³] / StdDev³.

The third central moment can be written in terms of moments around the origin:
E[(X - E[X])3 ] = E[X3 - 3X2 E[X] + 3XE[X]2 - E[X]3 ] = E[X3 ] - 3 E[X] E[X2 ] + 3E[X]E[X]2 - E[X]3
= E[X3 ] - 3 E[X] E[X2 ] + 2 E[X]3 .
E[(X - E[X])3 ] = E[X3 ] - 3 E[X] E[X2 ] + 2 E[X]3 .
Exercise: Let 5, 77, 812 be the first three moments (around the origin) of a distribution.
What is the skewness of this distribution?
[Solution: Variance = 77 - 5² = 52.
Third central moment = E[X³] - 3 E[X] E[X²] + 2 E[X]³ = 812 - (3)(5)(77) + 2(5³) = -93.
Skewness = Third central moment / Variance^1.5 = -93/52^1.5 = -0.248.]
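The same computation can be packaged as a small function of the first three moments about the origin. The following Python sketch is an illustrative addition; the function name is mine, not standard notation.

```python
# Illustrative sketch (an addition): skewness from the first three moments about the
# origin, matching the exercise above; the function name is just for this example.
def skewness_from_raw_moments(m1, m2, m3):
    variance = m2 - m1 ** 2
    third_central = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    return third_central / variance ** 1.5

print(round(skewness_from_raw_moments(5, 77, 812), 3))   # -0.248
```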
11 This is the fundamental idea behind the usefulness of Credibility.


The skewness helps to describe the shape of the distribution. Typically size of loss distributions
have positive skewness, such as the following Pareto Distribution with mode of zero:12

[Graph of the Pareto density, declining from its mode at zero, over losses from 0 to 100,000.]

Or the following LogNormal Distribution with a positive mode:

[Graph of the LogNormal density, rising to a positive mode and then declining, over losses from 0 to 100,000.]

Positive skewness ⇔ skewed to the right.
There is a significant probability of very large results.
A symmetric distribution has zero skewness.13
While a symmetric curve has a skewness of zero, the converse is not true.
12 The Pareto Distribution is discussed in the subsequent section on Common Two Parameter Distributions.
13 For example, the Normal distribution has zero skewness.


The following Weibull Distribution has skewness of zero, but it is not symmetric:14

[Graph of this Weibull density over losses from 0 to 100,000.]

The following Weibull Distribution has negative skewness and is skewed to the left:15
[Graph of this Weibull density over losses from 0 to 100,000.]

14 With τ = 3.60235. See the Section on Common Two Parameter Distributions.
15 With τ = 6. The skewness depends on the value of the shape parameter tau. For τ > 3.60235, the Weibull has
negative skewness. For τ < 3.60235, the Weibull has positive skewness.


If X is in dollars, both the third central moments of X and the cube of the standard deviation are in
dollars cubed. Therefore the skewness is a dimensionless quantity; i.e., it is a pure number which is
not in any particular currency. Thus the skewness of X is unaffected if X is multiplied by a positive
constant. However, Skew[-X] = -Skew[X].16
Thus if X has positive skewness then -X has negative skewness.17
Exercise: The skewness of a random variable X is 3.5. What is the skewness of 1.1X?
[Solution: 3.5. The skewness is unaffected when a variable is multiplied by a positive constant.
Comment: This could be due to the impact of 10% inflation.]
Exercise: The skewness of a random variable X is 3.5. What is the skewness of -1.1X?
[Solution: -3.5. The skewness is multiplied by -1 when a variable is multiplied by a negative
constant.]
The numerator and the denominator of the skewness both involve central moments. The numerator
is the third central moment, while the denominator is the second central moment taken to the 3/2
power. Therefore they are unaffected by the addition or subtraction of a constant. Therefore, the
skewness of X + c is the same as the skewness of X. Translating a curve to the left or the right does
not change its shape; specifically it does not change its skewness.
Exercise: The skewness of a random variable X is 3.5.
What is the skewness of 10X + 7?
[Solution: 3.5. The skewness is unaffected when a variable is multiplied by a positive constant.
Also, the skewness is unaffected when a constant is added.]
Note that skewnesses do not add. However, since third central moments of independent variables
do add, one can derive useful formulas.18

16 The numerator of the skewness is negative of what it was, but the denominator is unaffected since the standard
deviation is never negative by definition.
17 If X is skewed to the right, then -X, which is X reflected in the Y-Axis, is skewed to the left.
18 For X and Y independent the 2nd and 3rd central moments add; the 4th central moment and higher central
moments do not add. Cumulants of independent variables add and the 2nd and 3rd central moments are equal to
the 2nd and 3rd cumulants. See for example, Practical Risk Theory for Actuaries, by Daykin, Pentikainen and
Pesonen.


For X and Y independent:
3rd central moment of X+Y = 3rd central moment of X + 3rd central moment of Y =
Skew[X]Var[X]^1.5 + Skew[Y]Var[Y]^1.5.
Thus for X and Y independent:
Skew[X + Y] = {Skew[X]Var[X]^1.5 + Skew[Y]Var[Y]^1.5} / (Var[X] + Var[Y])^1.5.
In particular, when adding two independent identically distributed random variables,
Skew[X + X] = Skew[X]/√2. As we add more identically distributed random variables the skewness
goes to zero; the sum goes to a symmetric distribution.19
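A quick Monte Carlo check of this fact is sketched below in Python (an illustrative addition; the Exponential severity with mean 1000 is an arbitrary choice, not taken from the text).

```python
# Monte Carlo sketch (an addition; the Exponential with mean 1000 is an arbitrary
# choice): the skewness of the sum of two independent identical variables is the
# skewness of one of them divided by the square root of 2.
import numpy as np

rng = np.random.default_rng(seed=1)

def skew(sample):
    # biased (population) skewness: third central moment over the standard deviation cubed
    m = sample.mean()
    return ((sample - m) ** 3).mean() / sample.std() ** 3

x1 = rng.exponential(scale=1000, size=1_000_000)
x2 = rng.exponential(scale=1000, size=1_000_000)
print(round(skew(x1), 2))        # close to 2, the skewness of an Exponential
print(round(skew(x1 + x2), 2))   # close to 2/sqrt(2) = 1.41
```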
The coefficient of variation and the skewness are useful summary statistics that describe the shape of
the distribution. They give you an idea of which type of distribution is likely to fit. Note that the
Coefficient of Variation and Skewness do not depend on the scale parameter if any.
Most size of loss distributions have a positive skewness (skewed to the right), with a few very large
claims and many smaller claims. The more of the total dollars of loss represented by the rare large
claims, the more skewed the distribution.20
Exercise: One observes losses of sizes: $300, $600, $1,200, $1,500, and $2,800.
Determine the empirical coefficient of skewness.
[Solution: From a previous exercise X̄ = 1280, and the variance = 757,600. 3rd central moment =
{(300 - 1280)³ + (600 - 1280)³ + (1200 - 1280)³ + (1500 - 1280)³ + (2800 - 1280)³}/5 =
453,264,000. Skewness = 453,264,000 / 757,600^1.5 = 0.687.
Comment: We have again used the biased estimate of the variance rather than the sample
variance. The skewness would be the same using each of the losses divided by 100.]

19 Note the relation to the central limit theorem, where a sum of standardized identical distributions goes to a
symmetric normal distribution.
20 As discussed subsequently, this situation is referred to as a heavy-tailed distribution.


Kurtosis:
The kurtosis is defined as the fourth central moment divided by the square of the variance.

Kurtosis = E[(X - E[X])⁴] / Variance².

Exercise: One observes losses of sizes: $300, $600, $1,200, $1,500, and $2,800.
Determine the empirical kurtosis.
[Solution: From a previous exercise X̄ = 1280, and the variance = 757,600. 4th central moment =
{(300 - 1280)⁴ + (600 - 1280)⁴ + (1200 - 1280)⁴ + (1500 - 1280)⁴ + (2800 - 1280)⁴}/5 =
1,295,302,720,000. Kurtosis = 1,295,302,720,000 / 757,600² = 2.257.
Comment: The kurtosis would be the same using each of the losses divided by 100.]
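As a machine check of the skewness and kurtosis exercises (an illustrative addition, not part of the original text), the following Python sketch computes both statistics for the same five losses.

```python
# Illustrative sketch (an addition): the empirical skewness and kurtosis of the same
# five losses, again using the biased variance with n in the denominator.
losses = [300, 600, 1200, 1500, 2800]
n = len(losses)
mean = sum(losses) / n

def central(k):
    return sum((x - mean) ** k for x in losses) / n

variance = central(2)
skewness = central(3) / variance ** 1.5
kurtosis = central(4) / variance ** 2
print(round(skewness, 3), round(kurtosis, 3))   # 0.687 2.257
```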
As with the skewness, the kurtosis is a dimensionless quantity, which describes the shape of the
distribution.21 Thus the kurtosis is unaffected when a variable is multiplied by a (non-zero) constant.
Since the fourth central moment is always non-negative, so is the kurtosis.
Large kurtosis ⇔ a heavy-tailed distribution.
Exercise: The kurtosis of a random variable X is 3.5. What is the kurtosis of 1.1X?
[Solution: 3.5. The kurtosis is unaffected when a variable is multiplied by a constant.
Comment: This exercise could be referring to the impact of 10% inflation.]
Exercise: The kurtosis of a random variable X is 3.5. What is the kurtosis of -1.1X?
[Solution: 3.5. The kurtosis is unaffected when a variable is multiplied by a non-zero constant.
Comment: Remember that the kurtosis is always positive.]
The numerator and the denominator of the kurtosis both involve central moments. The numerator is
the fourth central moment, while the denominator is the second central moment squared. Therefore
they are unaffected by the addition or subtraction of a constant; the kurtosis of X + c is the same as
the kurtosis of X. Translating a curve to the left or the right does not change its shape; it does not
change its kurtosis.

21 Both the numerator and denominator are in dollars to the fourth power.


Exercise: The kurtosis of a random variable X is 3.5. What is the kurtosis of 10X + 7?
[Solution: 3.5. The kurtosis is unaffected when a variable is multiplied by a non-zero constant. Also,
the kurtosis is unaffected when a constant is added.]
If X is a Normal Distribution, then (X - μ)/σ is a Standard Normal.
X = σ(Standard Normal + μ/σ) = a constant times (Standard Normal plus another constant.)
Thus all Normal Distributions have the same kurtosis as a Standard Normal.
It turns out that all Normal Distributions have a kurtosis of 3.
Distributions with a kurtosis less than 3 are lighter-tailed than a Normal Distribution. Distributions with a
kurtosis more than 3 are heavier-tailed than a Normal Distribution; they have their densities go to
zero more slowly as x approaches infinity than a Normal.
Most size of loss distributions encountered in practice have a kurtosis greater than 3.
For example, the kurtosis of a Gamma Distribution with shape parameter α is: 3 + 6/α.
Exercise: What is the 4th central moment in terms of moments around the origin?
[Solution: The 4th central moment is: E[(X - E[X])4 ] = E[X4 - 4E[X]X3 + 6E[X]2 X2 - 4E[X]3 X + E[X]4 ]
= E[X4 ] - 4E[X]E[X3 ] + 6E[X]2 E[X2 ] - 4E[X]3 E[X] + E[X]4
= E[X4 ] - 4E[X]E[X3 ] + 6E[X]2 E[X2 ] - 3E[X]4 .]
Thus we have the formula for the Kurtosis in terms of moments around the origin:
Kurtosis = γ2 = E[(X - E[X])⁴]/Variance² = {E[X⁴] - 4 E[X] E[X³] + 6 E[X]² E[X²] - 3 E[X]⁴} / Variance².

The empirical kurtosis of the ungrouped data in Section 1 is:
{E[X⁴] - 4 E[X] E[X³] + 6 E[X]² E[X²] - 3 E[X]⁴} / Variance² =
{6.465278 x 10²⁴ - (4)(312,674.6)(1.600225 x 10¹⁸) + (6)(312,674.6²)(4.9284598 x 10¹¹)
- (3)(312,674.6⁴)} / (3.9508 x 10¹¹)² = 30.3.
It should be noted that empirical estimates of the kurtosis are subject to large estimation errors, since
the empirical kurtosis is very heavily affected by the absence or presence of a few large claims.


Exercise: Let 5, 77, 812, 10423 be the first four moments (around the origin) of a distribution.
What is the kurtosis of this distribution?
[Solution: Variance = 77 - 5² = 52.
Kurtosis = {E[X⁴] - 4 E[X] E[X³] + 6 E[X]² E[X²] - 3 E[X]⁴} / Variance² =
{10423 - (4)(5)(812) + (6)(5²)(77) - (3)(5⁴)}/52² = 1.43.]
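The same formula is easy to code. The Python sketch below is an illustrative addition; it reproduces the 1.43 of the exercise above, and the function name is mine, not standard notation.

```python
# Illustrative sketch (an addition): kurtosis from the first four moments about the
# origin, reproducing the 1.43 of the exercise above.
def kurtosis_from_raw_moments(m1, m2, m3, m4):
    variance = m2 - m1 ** 2
    fourth_central = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return fourth_central / variance ** 2

print(round(kurtosis_from_raw_moments(5, 77, 812, 10423), 2))   # 1.43
```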


Jensen's Inequality states that for a convex function f, E[f(X)] ≥ f(E[X]).22
f(x) = x² is an example of a convex function; its second derivative is positive.
Therefore, by Jensen's Inequality, E[X²] ≥ E[X]².23
Letting X = Y², we therefore have that E[Y⁴] ≥ E[Y²]².
The fourth moment is greater than or equal to the square of the second moment.
Letting X = (Y - E[Y])², we therefore have that E[(Y - E[Y])⁴] ≥ E[(Y - E[Y])²]².
The fourth central moment is greater than or equal to the square of the variance.
Therefore, the Kurtosis is always greater than or equal to one.
In fact, Kurtosis ≥ 1 + Skewness².24
Exercise: Let Prob[X = -10] = 50% and Prob[X = 10] = 50%.
Determine the skewness and kurtosis of X.
[Solution: Since this distribution is symmetric around its mean of 0, skewness = 0.
Variance = 10² = 100. Fourth Central Moment = 10⁴. Kurtosis = 10⁴/100² = 1.]
Exercise: One observes losses of sizes: $300, $600, $1,200, $1,500, and $2,800.
Determine the empirical kurtosis.
[Solution: From a previous exercise X̄ = 1280, and the variance = 757,600. 4th central moment =
{(300 - 1280)⁴ + (600 - 1280)⁴ + (1200 - 1280)⁴ + (1500 - 1280)⁴ + (2800 - 1280)⁴}/5 =
1,295,302,720,000. Kurtosis = 1,295,302,720,000 / 757,600² = 2.257.]
When computing the empirical coefficient of variation, skewness, or kurtosis, we use
the biased estimate of the variance, with n in the denominator, rather than the sample
variance. We do so since everyone else does.25
22 See for example Actuarial Mathematics.
23 This also follows from the fact that the variance is never negative.
24 See Exercise 3.19 in Volume I of Kendall's Advanced Theory of Statistics.
25 See for example, 4, 5/01, Q.3.


Problems:
3.1 (1 point) A size of loss distribution has moments as follows: First moment = 3,
Second moment = 50, Third Moment = 2000. Determine the skewness.
A. less than 6
B. at least 6 but less than 6.2
C. at least 6.2 but less than 6.4
D. at least 6.4 but less than 6.6
E. at least 6.6
Use the following information for the next 4 questions:
E[X] = 5, E[X2 ] = 42.8571, E[X3 ] = 584.184, E[X4 ] = 11,503.3.
3.2 (1 point) What is the variance of X?
A. less than 17
B. at least 17 but less than 18
C. at least 18 but less than 19
D. at least 19 but less than 20
E. at least 20
3.3 (1 point) What is the coefficient of variation of X?
A. less than 0.6
B. at least 0.6 but less than 0.7
C. at least 0.7 but less than 0.8
D. at least 0.8 but less than 0.9
E. at least 0.9
3.4 (2 points) What is the skewness of X?
A. less than 2.4
B. at least 2.4 but less than 2.5
C. at least 2.5 but less than 2.6
D. at least 2.6 but less than 2.7
E. at least 2.7
3.5 (3 points) What is the kurtosis of X?
A. less than 10
B. at least 10 but less than 11
C. at least 11 but less than 12
D. at least 12 but less than 13
E. at least 13


3.6 (1 point) Let X be a random variable. Which of the following statements are true?
1. A measure of skewness of X is E[X3 ] / Var[X]3/2.
2. The measure of skewness is positive if X has a heavy tail to the right.
3. If X is given by Standard Unit Normal, then X has kurtosis equal to one.
A. None of 1, 2, or 3
B. 1
C. 2
D. 3
E. None of A, B, C, or D
3.7 (3 points) There are 10,000 claims observed as follows:
Size of Claim     Number of Claims
    100                9000
    200                 800
    300                 170
    400                  30
Which of the following statements are true?
1. The mean of this distribution is 112.3.
2. The variance of this distribution is 14210.
3. The skewness of this distribution is positive.
A. None of 1, 2, or 3
B. 1
C. 2
D. 3
E. None of A, B, C, or D
Use the following information for the next three questions:
There are five losses of sizes: 5, 10, 20, 50, 100.
3.8 (2 points) What is the empirical coefficient of variation?
A. 0.90
B. 0.95
C. 1.00
D. 1.05
E. 1.1
3.9 (2 points) What is the empirical coefficient of skewness?
A. 0.8
B. 0.9
C. 1.0
D. 1.2
E. 1.4
3.10 (2 points) What is the empirical kurtosis?
A. 1.1
B. 1.4
C. 1.7
D. 2.0

E. 2.3

3.11 (3 points) f(x) = 2x, 0 < x < 1. Determine the skewness.


A. -0.6
B. -0.3
C. 0
D. 0.3

E. 0.6

3.12 (3 points) f(x) = 1, 0 < x < 1. Determine the skewness.


A. -0.6
B. -0.3
C. 0
D. 0.3

E. 0.6

3.13 (3 points) f(x) = 2(1 - x), 0 < x < 1. Determine the skewness.
A. -0.6
B. -0.3
C. 0
D. 0.3

E. 0.6


3.14 (4, 5/86, Q.33) (1 point)


Which of the following statements are true about the random variable X?
1. If X is given by a unit normal distribution, then X has its measure of skewness equal to one.
2. A measure of the skewness of X is E[X3 ]/(VAR[X])3 .
3. The measure of skewness of X is positive if X has a heavy tail to the right.
A. 1
B. 2
C. 3
D. 1, 2
E. 1, 3
3.15 (4, 5/88, Q.30) (1 point) Let X be a random variable with mean m, and let ar denote the rth
moment of X about the origin. Which of the following statements are true?
1. m = a1
2. The third central moment is equivalent to a3 + 3a2 - 2a1 3 .
3. The variance of X is the second central moment of X.
A. 1
B. 2
C. 2, 3
D. 1, 3

E. 1, 2 and 3

3.16 (4, 5/89, Q.27) (1 point) There are 30 claims for a total of $180,000.
Given the following claim size distribution, calculate the coefficient of skewness.
Claim Size ($000)     Number of Claims
        2                     2
        4                     6
        6                    12
        8                    10
A. Less than -.6
B. At least -.6, but less than -.2
C. At least -.2, but less than .2
D. At least .2, but less than .6
E. .6 or more
3.17 (2 points) In the previous question, determine the kurtosis.
A. 1.0
B. 1.5
C. 2.0
D. 2.5
E. 3.0


3.18 (4B, 5/93, Q.34) (1 point) Claim severity has the following distribution:
Claim Size     Probability
  $100            0.05
  $200            0.20
  $300            0.50
  $400            0.20
  $500            0.05
Determine the distribution's measure of skewness.
A. -0.25
B. 0.00
C. 0.15
D. 0.35
E. Cannot be determined
3.19 (2 points) In the previous question, determine the kurtosis.
A. 1.0
B. 1.5
C. 2.0
D. 2.5
E. 3.0
3.20 (4B, 5/95, Q.28) (2 points) You are given the following:
For any random variable X with finite first three moments, the skewness of the distribution of X is
denoted Sk(X).
X and Y are independent, identically distributed random variables with mean = 0 and
finite second and third moments.
Which of the following statements must be true?
1. 2Sk(X) = Sk(2X)
2. -Sk(Y) = Sk(-Y)
3. |Sk(X)| ≥ |Sk(X+Y)|
A. 2
B. 3
C. 1,2
D. 2,3
E. None of A, B, C, or D
3.21 (4B, 5/97, Q.21) (2 points) You are given the following:

Both the mean and the coefficient of variation of a particular distribution are 2.

The third moment of this distribution about the origin is 136.


Determine the skewness of this distribution.
Hint: The skewness of a distribution is defined to be the third central moment divided by the cube of
the standard deviation.
A. 1/4
B. 1/2
C. 1
D. 4
E. 17
3.22 (4B, 11/99. Q.29) (2 points) You are given the following:
A is a random variable with mean 5 and coefficient of variation 1.
B is a random variable with mean 5 and coefficient of variation 1.
C is a random variable with mean 20 and coefficient of variation 1/2.
A, B, and C are independent.
X=A+B
Y=A+C
Determine the correlation coefficient between X and Y.
A. -2/√10    B. -1/√10    C. 0    D. 1/√10    E. 2/√10


3.23 (4, 5/01, Q.3) (2.5 points) You are given the following times of first claim for five randomly
selected auto insurance policies observed from time t = 0: 1, 2, 3, 4, 5.
Calculate the kurtosis of this sample.
(A) 0.0
(B) 0.5
(C) 1.7
(D) 3.4
(E) 6.8
3.24 (4, 11/06, Q.3 & 2009 Sample Q.248) (2.9 points)
You are given a random sample of 10 claims consisting of
two claims of 400, seven claims of 800, and one claim of 1600.
Determine the empirical skewness coefficient.
(A) Less than 1.0
(B) At least 1.0, but less than 1.5
(C) At least 1.5, but less than 2.0
(D) At least 2.0, but less than 2.5
(E) At least 2.5


Solutions to Problems:
3.1. B. Standard deviation = √41 = 6.403. Skewness = (2000 - 450 + 54) / 262.5 = 6.1.

3.2. B. Variance = 42.8571 - 5² = 17.8571.
Comment: The given moments are for an Inverse Gaussian Distribution with parameters μ = 5 and
θ = 7. The variance for an Inverse Gaussian is: μ³/θ = 125/7 = 17.8571.
3.3. D. CV = √17.8571 / 5 = 0.845.
Comment: The given moments are for an Inverse Gaussian Distribution with parameters μ = 5 and
θ = 7. The coefficient of variation for an Inverse Gaussian is: √(μ/θ) = √(5/7) = 0.845.

3.4. C. Skewness = {E[X³] - 3 E[X] E[X²] + 2 E[X]³} / StdDev³ =
{584.184 - (3)(5)(42.8571) + (2)(125)} / 17.8571^1.5 = 191.3275 / 75.4599 = 2.535.
Comment: The given moments are for an Inverse Gaussian Distribution with parameters μ = 5 and
θ = 7. The skewness for an Inverse Gaussian is 3√(μ/θ) = 3√(5/7) = 2.535.

3.5. E. Kurtosis = {E[X⁴] - 4 E[X] E[X³] + 6 E[X]² E[X²] - 3 E[X]⁴} / Variance² =
{11503.3 - (4)(5)(584.184) + (6)(25)(42.8571) - (3)(625)}/17.8571² = 4373.19/318.88 = 13.71.
Comment: The given moments are for an Inverse Gaussian Distribution with parameters μ = 5 and
θ = 7. The kurtosis for an Inverse Gaussian is 3 + 15μ/θ = 96/7 = 13.71.
3.6. C. 1. The numerator of the skewness should be the third central moment:
E[(X - E[X])3 ] = E[X3 ] - 3 E[X] E[X2 ] + 2 E[X]3 . Thus Statement 1 is not true in general.
2. Statement 2 is true. A good example is the Pareto Distribution.
3. The Normal Distribution has a kurtosis of three, thus Statement #3 is not true.


3.7. E. The mean is: 1,123,000 / 10,000 = 112.3. So Statement #1 is true.
The second moment is: 142,100,000 / 10,000 = 14,210, thus the variance is:
14,210 - 112.3² = 1598.7. Thus Statement #2 is false.
Size of Claim   Number of Claims   Col.A x Col.B   Sq. of Col.A x Col.B   Cube of Col.A x Col.B
    100              9000             900,000          90,000,000            9,000,000,000
    200               800             160,000          32,000,000            6,400,000,000
    300               170              51,000          15,300,000            4,590,000,000
    400                30              12,000           4,800,000            1,920,000,000
Sum                 10,000           1,123,000         142,100,000          21,910,000,000
E[X] = 1,123,000 / 10,000 = 112.3. E[X²] = 142,100,000 / 10,000 = 14,210.
E[X³] = 21,910,000,000 / 10,000 = 2,191,000. STDDEV = √1598.7 = 39.98.
Skewness = {E[X³] - 3 E[X] E[X²] + 2 E[X]³} / STDDEV³ =
{2,191,000 - (3)(112.3)(14,210) + (2)(112.3³)} / 39.98³ = 3.7. Thus Statement #3 is true.


3.8. B. X̄ = (5 + 10 + 20 + 50 + 100)/5 = 37.
Variance = {(5 - 37)² + (10 - 37)² + (20 - 37)² + (50 - 37)² + (100 - 37)²}/5 = 1236.
CV = √1236 / 37 = 0.950.
Comment: Note the use of the biased estimator of the variance rather than the sample variance.
3.9. B. Third Central Moment = {(5 - 37)³ + (10 - 37)³ + (20 - 37)³ + (50 - 37)³ + (100 - 37)³}/5
= 38,976. Skewness = 38,976 / 1236^1.5 = 0.897.

3.10. E. 4th Central Moment = {(5 - 37)⁴ + (10 - 37)⁴ + (20 - 37)⁴ + (50 - 37)⁴ + (100 - 37)⁴}/5 =
3,489,012. Kurtosis = 3,489,012 / 1236² = 2.284.
3.11. A. E[X] = ∫₀¹ x f(x) dx = 2/3. E[X²] = ∫₀¹ x² f(x) dx = 1/2. E[X³] = ∫₀¹ x³ f(x) dx = 2/5.
variance = (1/2) - (2/3)² = 1/18.
third central moment = E[X³] - 3E[X]E[X²] + 2E[X]³ = 2/5 - (3)(2/3)(1/2) + 2(2/3)³ = -0.0074074.
Skewness = -0.0074074/(1/18)^1.5 = -0.5657.
Comment: A Beta Distribution with a = 2, b = 1, and θ = 1.
Skewness = 2(b - a)√(a + b + 1) / {(a + b + 2)√(ab)} = -2√4 / {5√2} = -0.5657.


3.12. C. The distribution is symmetric around its mean of 1/2. The skewness is 0.
E[X] = ∫₀¹ x f(x) dx = 1/2. E[X²] = ∫₀¹ x² f(x) dx = 1/3. E[X³] = ∫₀¹ x³ f(x) dx = 1/4.
variance = (1/3) - (1/2)² = 1/12. third central moment = E[X³] - 3E[X]E[X²] + 2E[X]³
= 1/4 - (3)(1/2)(1/3) + 2(1/2)³ = 0. Skewness = 0/(1/12)^1.5 = 0.
3.13. E. E[X] = ∫₀¹ x f(x) dx = 1 - 2/3 = 1/3. E[X²] = ∫₀¹ x² f(x) dx = 2/3 - 1/2 = 1/6.
variance = (1/6) - (1/3)² = 1/18. E[X³] = ∫₀¹ x³ f(x) dx = 1/2 - 2/5 = 1/10.
3rd central moment = E[X³] - 3E[X]E[X²] + 2E[X]³ = 1/10 - (3)(1/3)(1/6) + 2(1/3)³ = 0.0074074.
Skewness = 0.0074074/(1/18)^1.5 = 0.5657.
Comment: A Beta Distribution with a = 1, b = 2, and θ = 1.
Skewness = 2(b - a)√(a + b + 1) / {(a + b + 2)√(ab)} = 2√4 / {5√2} = 0.5657.

3.14. C. 1. False . The Normal Distribution is symmetric with a skewness of zero. 2. False. The
numerator should be the third central moment, while the denominator should be the standard
deviation cubed. SKEW[X] = E[(X-E[X])3 ]/(VAR[X])3/2 . 3. True.
3.15. D. Statement one is true, the mean is the 1st moment around the origin: E[X].
Statement 2 is false. The 3rd central moment = E[(X-m)3 ] = a3 - 3a2 a1 + 2a1 3 ;
Statement 3 is true, the variance is the 2nd central moment: E[(X-m)2].
Comment: In the 3rd central moment each term must be in dollars cubed.


3.16. B. & 3.17. D. Calculate the moments:
Number of Claims   Size of Claim   Square of Size of Claim   Cube of Size of Claim
       2                 2                    4                        8
       6                 4                   16                       64
      12                 6                   36                      216
      10                 8                   64                      512
Average                6.000               39.200                  270.400
E[X] = 6, E[X²] = 39.2, E[X³] = {(2)(2³) + (6)(4³) + (12)(6³) + (10)(8³)}/(2 + 6 + 10 + 12) = 270.4.
Variance = E[X²] - E[X]² = 39.2 - 6² = 3.2.
(Coefficient of) Skewness = {E[X³] - (3 E[X] E[X²]) + (2 E[X]³)} / STDDEV³ =
{(270.4) - (3)(6)(39.2) + (2)(6³)} / (3.2)^1.5 = -3.2 / 5.724 = -0.56.
Alternately, Third central moment = {2(2 - 6)³ + 6(4 - 6)³ + 12(6 - 6)³ + 10(8 - 6)³}/30 = -3.2.
Skewness = -3.2/(3.2)^1.5 = -0.56.
Fourth central moment = {2(2 - 6)⁴ + 6(4 - 6)⁴ + 12(6 - 6)⁴ + 10(8 - 6)⁴}/30 = 25.6.
Kurtosis = (Fourth central moment)/Variance² = 25.6/3.2² = 2.5.
Comment: The distribution is skewed to the left and therefore has a negative skewness.
3.18. B. A symmetric distribution has zero skewness.
3.19. E. Calculate the moments:
Probability   Size of Claim ($00)   Square of Size of Claim
    5%                1                        1
   20%                2                        4
   50%                3                        9
   20%                4                       16
    5%                5                       25
Average              3.0                      9.8
E[X] = 3, E[X²] = 9.8. Variance = E[X²] - E[X]² = 9.8 - 3² = 0.8.
4th central moment = (.05)(1 - 3)⁴ + (.2)(2 - 3)⁴ + (.5)(3 - 3)⁴ + (.2)(4 - 3)⁴ + (.05)(5 - 3)⁴ = 2.
Kurtosis = (Fourth central moment)/Variance² = 2/0.8² = 3.125.
Comment: The kurtosis does not depend on the scale. So dividing all of the claim sizes by 100
makes the arithmetic easier, but does not affect the answer.


3.20. D. Statement 1 is false. The skewness is a dimensionless quantity; i.e., it is a pure number
which is not in any particular currency. Thus the skewness of X is unaffected if X is multiplied by a
positive constant. In this specific case both the 3rd central moment and the cube of the standard
deviation are multiplied by 2³ = 8. Therefore the skewness, which is their ratio, is unaffected.
Statement 2 is true. The skewness is defined as the 3rd central moment divided by the cube of the
standard deviation. The former is multiplied by -1, since by definition the third central moment is
E[(X - E[X])³]. Alternately, recall that the third central moment = E[X³] - 3 E[X] E[X²] + 2 E[X]³, each of
whose terms is multiplied by -1. (The odd powered moments around the origin are each multiplied by -1,
while the even powered moments are unaffected.) The cube of the standard deviation is unaffected
since the standard deviation is always positive. Statement 3 is true. Skewnesses do not add.
However, since third central moments of independent variables do add, for X and Y independent,
3rd central moment of X+Y = 3rd central moment of X + 3rd central moment of Y =
Skew[X]Var[X]^1.5 + Skew[Y]Var[Y]^1.5. Thus for X and Y independent,
Skew[X+Y] = {Skew[X]Var[X]^1.5 + Skew[Y]Var[Y]^1.5} / {Var[X] + Var[Y]}^1.5.
In particular, when adding two independent identically distributed random variables,
Skew[X + Y] = Skew[X]/√2 ≤ Skew[X].
Comment: Long and difficult. Tests important concepts. Statement 2 says that if X is skewed to the
right, then -X is skewed to the left by the same amount.
3.21. B. We are given E[X³] = 136, E[X] = 2 and CV = σ/E[X] = 2. Therefore σ = 4.
Therefore E[X²] = σ² + E[X]² = 4² + 2² = 20. Skewness = {E[X³] - (3 E[X] E[X²]) + (2 E[X]³)} / σ³ =
{136 - (3)(20)(2) + 2(2³)} / 4³ = 32/64 = 1/2.
3.22. D. Var[A] = {(mean)(CV)}2 = 25. Var[B] = {(5)(1)}2 = 25. Var[C] = {(20)(1/2)}2 = 100.
Var[X] = Var[A] + Var[B] = 25 + 25 = 50, since A and B are independent.
Var[Y] = Var[A] + Var[C] = 25 + 100 = 125, since A and C are independent.
Cov[X, Y] = Cov[A+B, A+C] = Cov[A, A] + Cov[A, C] + Cov[B, A] + Cov[B, C] =
Var[A] + 0 + 0 + 0 = 25.
Corr[X, Y] = Cov[X, Y]/√(Var[X] Var[Y]) = 25/√((50)(125)) = 1/√10.

Comment: Since A, B, and C are independent, Cov[A, C] = Cov[B, A] = Cov[B, C] = 0.


3.23. C. Mean = (1 + 2 + 3 + 4 +5)/5 = 3.
Variance = 2nd central moment = {(1-3)2 + (2-3)2 + (3-3)2 + (4-3)2 + (5-3)2 }/5 = 2.
4th central moment = {(1-3)4 + (2-3)4 + (3-3)4 + (4-3)4 + (5-3)4 }/5 = 34/5.
Kurtosis = the fourth central moment divided by the variance squared = (34/5)/22 = 1.7.
Comment: We use the biased estimator of the variance rather than the sample variance.


3.24. B. E[X] = {(2)(400) + (7)(800) + 1600}/10 = 800.


E[X2 ] = {(2)(4002 ) + (7)(8002 ) + 16002 }/10 = 736,000.
E[X3 ] = {(2)(4003 ) + (7)(8003 ) + 16003 }/10 = 780,800,000.
Variance is: 736,000 - 8002 = 96,000.
Third Central Moment is: 780,800,000 - (3)(736,000)(800) + (2)(8003 ) = 38,400,000.
Skewness is: 38,400,000/96,000^1.5 = 1.291.
Alternately, the Third Central Moment is: {(2)(400 - 800)³ + (7)(800 - 800)³ + (1600 - 800)³}/10 =
38,400,000. Proceed as before.
Comment: If one divides all of the claim sizes by 100, then the skewness is unaffected.
Note that the denominator is not based on using the sample variance.


Section 4, Empirical Distribution Function


This section will discuss the Distribution and Survival Functions.
Cumulative Distribution Function:
For the Cumulative Distribution Function, F(x) = Prob[X ≤ x].
Various Distribution Functions are listed in Appendix A attached to your exam.
For example, for the Exponential Distribution, F(x) = 1 - e^(-x/θ).26
Exercise: What is the value at 3 of an Exponential Distribution with θ = 2?
[Solution: F(x) = 1 - e^(-x/θ). F(3) = 1 - e^(-3/2) = .777.]
F′(x) = f(x) ≥ 0.
0 ≤ F(x) ≤ 1, nondecreasing, right-continuous, starts at 0 and ends at 1.27
Here is a graph of the Exponential Distribution with θ = 2:
[Graph of F(x), rising from 0 toward 1, for x from 0 to 10.]

26 See Appendix A in the tables attached to the exam. The Exponential Distribution will be discussed in detail in a
subsequent section.
27 As x approaches y from above, F(x) approaches F(y). F would not be continuous at a jump discontinuity, but would
still be right continuous. See Section 2.2 of Loss Models.


Survival Function:
Similarly, we can define the Survival Function, S(x) = 1 - F(x) = Prob[X > x].
S′(x) = -f(x) ≤ 0.
0 ≤ S(x) ≤ 1, nonincreasing, right-continuous, starts at 1 and ends at 0.28
S(x) = 1 - F(x) = Prob[X > x] = the Survival Function
= the tail probability of the Distribution Function F.
For example, for the Exponential Distribution, S(x) = 1 - F(x) = 1 - (1 - e^(-x/θ)) = e^(-x/θ).
Here is a graph of the Survival Function of an Exponential with θ = 2:
[Graph of S(x), falling from 1 toward 0, for x from 0 to 10.]

Exercise: What is S(5) for a Pareto Distribution29 with α = 2 and θ = 3?
[Solution: F(x) = 1 - {θ/(x+θ)}^α. S(x) = {θ/(x+θ)}^α. S(5) = {3/(3+5)}² = 9/64 = 14.1%.]
In many situations you may find that the survival function is easier for you to use than the distribution
function. Whenever a formula has S(x), one can always use 1 - F(x) instead, and vice-versa.
In many situations you may find that the survival function is easier for you to use than the distribution
function. Whenever a formula has S(x), one can always use 1 - F(x) instead, and vice-versa.

28 See Definition 2.4 in Loss Models.
29 See Appendix A in the tables attached to the exam.
The Pareto Distribution will be discussed in a subsequent section.


Empirical Model:
The Empirical Model: probability of 1/(# data points) is assigned to each observed value.30
For the ungrouped data set in Section 1, the corresponding empirical model has density of 1/130 at
each of the 130 data points:
p(300) = 1/130, p(400) = 1/130, ..., p(4802200) = 1/130.
Exercise: The following observations: 17, 16, 16, 19 are taken from a random sample.
What is the probability function (pdf) of the corresponding empirical model?
[Solution: p(16) = 1/2, p(17) = 1/4, p(19) = 1/4.]
Empirical Distribution Function:
The Empirical Model is the density that corresponds to the Empirical Distribution Function:
Fn(x) = (# data points ≤ x)/(total # of data points).
The Empirical Distribution Function at x is the observed number of claims less than or
equal to x divided by the total number of claims observed.
At each observed claim size the Empirical Distribution Function has a jump
discontinuity.
For example, for the ungrouped data in Section 1, just prior to 37,300 the Empirical Distribution
Function is 26/130 = .2000, while at 37,300 it is 27/130 = .2077.
Exercise: One observes losses of sizes: $300, $600, $1,200, $1,500, and $2,800.
What is the Empirical Distribution Function?
[Solution: Fn(x) is: 0 for x < 300, 1/5 for 300 ≤ x < 600, 2/5 for 600 ≤ x < 1200,
3/5 for 1200 ≤ x < 1500, 4/5 for 1500 ≤ x < 2800, 1 for x ≥ 2800.]
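The following short Python sketch (an illustrative addition, not part of the original text) evaluates this empirical distribution function at a few points, including just below and at 600.

```python
# Illustrative sketch (an addition): the empirical distribution function of the five
# losses above, evaluated as (# data points <= x) / n.
losses = [300, 600, 1200, 1500, 2800]

def empirical_F(x, data=losses):
    return sum(1 for d in data if d <= x) / len(data)

for x in (299, 300, 599.99999, 600, 1000, 2800):
    print(x, empirical_F(x))
# 0 at 299; 0.2 at 300 and just below 600; 0.4 at 600 and at 1000; 1.0 at 2800
```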

30 See Definition 3.2 in Loss Models.


Here is a graph of this Empirical Distribution Function:
[Step-function graph: the probability rises from 0 to 1 in jumps of 1/5 at 300, 600, 1200, 1500, and 2800.]

The empirical distribution function is constant on intervals, with jumps up of 1/5 at each of the five
observed points. For example, it is 1/5 at 599.99999 but 2/5 at 600.
Mean and Variance of the Empirical Distribution Function:
Assume the losses are drawn from a Distribution Function F(x). Then each observed loss has a
chance of F(x) of being less than or equal to x. Thus the number of losses observed less than or
equal to x is a sum of N independent Bernoulli trials with chance of success F(x). Thus if one has a
sample of N losses, the number of losses observed less than or equal to x is Binomially distributed
with parameters N and F(x).
Therefore, the Empirical Distribution Function is (1/N) times a Binomial Distribution with parameters N
and F(x). Therefore, the Empirical Distribution Function has mean of F(x)
and a variance of: F(x){1-F(x)}/N.
Exercise: Assume 130 losses are independently drawn from an Exponential Distribution:
F(x) = 1 - e^(-x/300,000).
Then what is the distribution of the number of losses less than or equal to 100,000?
[Solution: The number of losses observed less than or equal to 100,000 is Binomially distributed
with parameters 130 and 1 - e^(-1/3) = 0.283.]
Exercise: Assume 130 losses are independently drawn from an Exponential Distribution:
F(x) = 1 - e^(-x/300,000).
Then what is the variance of the number of losses less than or equal to 100,000?
[Solution: The number of losses observed less than or equal to 100,000 is Binomially distributed
with parameters 130 and 1 - e^(-1/3) = 0.283.
Thus it has a variance of: (130)(0.283)(1 - 0.283) = 26.38.]
Exercise: 130 losses are independently drawn from an Exponential Distribution:
F(x) = 1 - e^(-x/300,000). What is the distribution of the empirical distribution function at 100,000?
[Solution: The number of losses observed less than or equal to 100,000 is Binomially distributed
with parameters 130 and .283. The empirical distribution function at 100,000, Fn(100,000), is the
percentage of losses ≤ 100,000. Thus the empirical distribution function at 100,000 is (1/130) times
a Binomial with parameters 130 and .283.]
Exercise: 130 losses are independently drawn from an Exponential Distribution:
F(x) = 1 - e^(-x/300,000).
What is the variance of the percentage of losses less than or equal to 100,000?
[Solution: Fn(100,000) is (1/130) times a Binomial with parameters 130 and .283. Thus it has a
variance of (1/130)²(130)(0.283)(1 - 0.283) = 0.00156.]
As the number of losses, N, increases, the variance of the estimate of the distribution decreases as
1/N. All other things being equal, the variance of the empirical distribution function is largest when
trying to estimate the middle of the distribution rather than either of the tails31.
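For concreteness, the following Python sketch (an illustrative addition, not part of the original text) reproduces the numbers from the exercises above for the Exponential with θ = 300,000 at x = 100,000.

```python
# Illustrative sketch (an addition): the empirical distribution function at x is (1/N)
# times a Binomial(N, F(x)), so its variance is F(x){1 - F(x)}/N. Here F is the
# Exponential with theta = 300,000 used in the exercises above, at x = 100,000.
from math import exp

N = 130
theta = 300_000
x = 100_000
F = 1 - exp(-x / theta)          # about 0.283
var_count = N * F * (1 - F)      # variance of the number of losses <= x; about 26.4
var_Fn = F * (1 - F) / N         # variance of the empirical distribution function; about 0.00156
print(round(F, 3), round(var_count, 2), round(var_Fn, 5))
```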
Empirical Survival Function:
The Empirical Survival Function is: 1 - Empirical Distribution Function.
Empirical Distribution Function at x is: (# losses ≤ x)/(total # of losses).32
Empirical Survival Function at x is: (# losses > x)/(total # of losses).
Exercise: One observes losses of sizes: $300, $600, $1,200, $1,500, and $2,800.
What are the empirical distribution function and survival function at 1000?
[Solution: Fn(1000) = (# losses ≤ 1000) / (# losses) = 2/5.
Sn(1000) = (# losses > 1000) / (# losses) = 3/5.]

31 F(x){1-F(x)} is largest for F(x) = 1/2. However, small differences in the tail probabilities can be important.
32 More generally, the empirical distribution function is: (# observations ≤ x) / (total # of observations).


For 300, 600, 1,200, 1,500, and 2,800, here is a graph of the Empirical Survival Function:
[Step-function graph: the probability falls from 1 to 0 in drops of 1/5 at 300, 600, 1200, 1500, and 2800.]

The empirical survival function is constant on intervals, with jumps down of 1/5 at each of the five
observed points. For example, it is 4/5 at 599.99999 but 3/5 at 600.
Exercise: Determine the area under this empirical survival function.
[Solution: (1)(300) + (.8)(300) + (.6)(600) + (.4)(300) + (.2)(1300) = 1280.]
The sample mean, X̄ = (300 + 600 + 1200 + 1500 + 2800)/5 = 1280.
The sample mean is equal to the integral of the empirical survival function.
As will be discussed in a subsequent section, the mean is equal to the integral of the survival
function, for those cases where the support of the survival function starts at zero.
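Here is a small Python check (an illustrative addition, not part of the original text) that the area under this empirical survival function equals the sample mean.

```python
# Illustrative sketch (an addition): the area under the empirical survival function of
# the five losses equals the sample mean.
losses = sorted([300, 600, 1200, 1500, 2800])
n = len(losses)

area = 0.0
prev = 0
for i, x in enumerate(losses):
    survival = 1 - i / n           # S just below x: 1, 0.8, 0.6, 0.4, 0.2
    area += survival * (x - prev)  # rectangle between consecutive observed values
    prev = x

print(area, sum(losses) / n)       # 1280.0 1280.0
```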


Problems:
4.1 (1 point) Insureds suffer six losses of sizes: 3, 8, 13, 22, 35, 62.
What is the empirical survival function at 30?
A. 1/6
B. 1/3
C. 1/2
D. 2/3

E. 5/6

4.2 (1 point) You observe 5 losses of sizes: 15, 35, 70, 90, 140.
What is the empirical distribution function at 50?
A. 20%
B. 30%
C. 40%
D. 50%
E. 60%
4.3 (2 points) F(200) = 0.9, F(d) = 0.25, and ∫_d^200 x f(x) dx = 75.
∫_d^200 F(x) dx + d = 150.
Determine d.
A. 60    B. 70    C. 80    D. 90    E. 100


Use the following information for the next two questions:
You are given the following graph of an empirical distribution function:
[Step-function graph: the empirical distribution function is 0 for sizes below 7, jumps to 0.4 at 7,
to 0.6 at 11, and to 1 at 19. The horizontal axis is Size; the vertical axis is Probability.]
4.4 (1 point) Determine the mean of the data.
(A) Less than 10
(B) At least 10, but less than 11
(C) At least 11, but less than 12
(D) At least 12, but less than 13
(E) At least 13
4.5 (1 point) For this data, determine the biased estimator of the variance, Σ(Xi - X̄)²/n.
(A) Less than 26
(B) At least 26, but less than 28
(C) At least 28, but less than 30
(D) At least 30, but less than 32
(E) At least 34

4.6 (CAS9, 11/99, Q.16) (1 point) Which of the following can cause distortions in a loss claim
size distribution derived from empirical data?
1. Claim values tend to cluster around target values, such as $5,000 or $10,000.
2. Individual claims may come from policies with different policy limits.
3. Final individual claim sizes are not always known.
A. 1
B. 2
C. 3
D. 1, 2
E. 1, 2, 3


4.7 (IOA 101, 9/01, Q.7) (3.75 points) The probability density function of a random variable X is
given by f(x) = kx(1 - ax²), 0 ≤ x ≤ 1, where k and a are positive constants.
(i) (2.25 points) Show that a ≤ 1, and determine the value of k in terms of a.
(ii) (1.5 points) For the case a = 1, determine the mean of X.


Solutions to Problems:
4.1. B. S(30) = (# losses > 30)/(# losses) = 2/6 = 1/3.
4.2. C. There are 2 losses of size ≤ 50. Empirical distribution function at 50 is: 2/5 = 0.4.
4.3. A. By integration by parts: ∫_d^200 F(x) dx = xF(x)]_d^200 - ∫_d^200 x f(x) dx = (200)F(200) - dF(d) - 75 =
(200)(.9) - .25d - 75 = 105 - .25d. 105 - .25d + d = 150. ⇒ d = 45/.75 = 60.


4.4. D. From the empirical distribution function, 40% of the data is 7, 60% - 40% = 20% of the data
is 11, and 100% - 60% = 40% of the data is 19.
The mean is: (40%)(7) + (20%)(11) + (40%)(19) = 12.6.
Comment: If the data set was of size five, then it was: 7, 7, 11, 19, 19. The mean is: 63/5 = 12.6.
4.5. C. From the empirical distribution function, 40% of the data is 7,
60% - 40% = 20% of the data is 11, and 100% - 60% = 40% of the data is 19.
The mean is: (40%)(7) + (20%)(11) + (40%)(19) = 12.6.
The second moment is: (40%)(7²) + (20%)(11²) + (40%)(19²) = 188.2.
Σ(Xi - X̄)²/n = 188.2 - 12.6² = 29.44.

4.6 . E. All of these are true.


Item #3 is referring to the time between when the insurer knows about a claim and sets up a
reserve, and when the claim is paid and closed.
4.7. (i) f(x) ≥ 0. ⇒ 1 - ax² ≥ 0 for 0 ≤ x ≤ 1. ⇒ a ≤ 1.
The integral from 0 to 1 of f(x) = k(x - ax³) is: k(1/2 - a/4).
Setting this integral equal to one: k(1/2 - a/4) = 1. ⇒ k = 4/(2 - a).
(ii) k = 4/(2 - a) = 4/(2 - 1) = 4. f(x) = 4x - 4x³.
The integral from zero to one of x f(x) = 4x² - 4x⁴ is: 4/3 - 4/5 = 8/15.


Section 5, Limited Losses


The next few sections will introduce a number of related ideas: the Limited Loss Variable,
Limited Expected Value, Losses Eliminated, Loss Elimination Ratio, Excess Losses,
Excess Ratio, Excess Loss Variable, Mean Residual Life/ Mean Excess Loss, and
Hazard Rate/ Failure Rate.
X ∧ 1000 ≡ Minimum of X and 1000 = Limited Loss Variable.

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is X ∧ 1000?
[Solution: X ∧ 1000 = $300, $600, $1000, $1000, $1000.]
If the insured had a policy with a $1000 policy limit (and no deductible), then the insurer would pay
$300, $600, $1000, $1000, and $1000, for a total of $3900 for these five losses.
The Limited Loss Variable³³ corresponding to a limit L ≡ X ∧ L
≡ censored from above at L ≡ right censored at L³⁴
≡ the payments with a policy limit L (and no deductible) ≡ X for X < L, L for X ≥ L.
Limited Expected Value:
Limited Expected Value at 1000 = E[X ∧ 1000] =
an average over all sizes of loss of the minimum of 1000 and the size of loss.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the (empirical) limited expected value at $1000?
[Solution: E[X ∧ 1000] = (300 + 600 + 1000 + 1000 + 1000)/5 = 3900/5 = 780.]
In this case, the insurer pays 3900 on 5 losses or an average of 780 per loss.
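The empirical limited expected value is easy to compute directly; here is a minimal Python sketch (not part of the original text), using the five illustrative losses above:

# Minimal sketch: empirical limited expected value E[X ^ L],
# the average of min(loss, L) over all losses.
losses = [300, 600, 1200, 1500, 2800]

def empirical_lev(limit):
    return sum(min(x, limit) for x in losses) / len(losses)

print(empirical_lev(1000))   # 780.0, matching the exercise above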
The mean of the limited loss variable corresponding to L = E[X ∧ L] =
the average payment per loss with a policy limit of L.

Since E[X ∧ L] = E[Min[X, L]] = an average of numbers each ≤ L, E[X ∧ L] ≤ L.

Since E[X ∧ L] = E[Min[X, L]] = an average of numbers each ≤ X, E[X ∧ L] ≤ E[X].

³³ See Definition 3.6 in Loss Models.
³⁴ Censoring will be discussed in a subsequent section.


Exercise: For the ungrouped data in Section 1, what is the Limited Expected Value at $10,000?
[Solution: E[X ∧ 10000] is an average over all sizes of loss of the minimum of $10,000 and the size
of loss. So the first 8 losses in Section 1 would all enter into the average at their total size, while the
remaining 122 losses all enter at $10,000.
E[X ∧ 10000] = {35,200 + (122)(10,000)} / 130 = $1,255,200 / 130 = $9655.4.]

The limited expected value can be written as the sum of the contributions of the small losses and the
large losses. The (theoretical) Limited Expected Value (LEV),
E[X ∧ L], would be written for a continuous size of loss distribution as two pieces:

E[X ∧ L] = ∫_0^L x f(x) dx + L S(L)
= contribution of small losses + contribution of large losses.

The first piece represents the contribution of losses up to L in size, while the second piece
represents the contribution of those losses larger than L. The smaller losses each contribute their
size, while the larger losses each contribute L to the average.

For example, for the Exponential Distribution:

E[X ∧ L] = ∫_0^L x e^(-x/θ)/θ dx + L e^(-L/θ) = (-x e^(-x/θ) - θ e^(-x/θ))]_(x=0)^(x=L) + L e^(-L/θ) = θ(1 - e^(-L/θ)).³⁵

³⁵ See Appendix A of Loss Models and the tables attached to the exam.
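As a numerical check of this formula, here is a minimal Python sketch (illustrative only, not from the text) comparing the closed form θ(1 - e^(-L/θ)) with a direct numerical integration of the two pieces:

import math

# Minimal sketch: E[X ^ L] for an Exponential with mean theta,
# checked against numerical integration of the two pieces.
theta, L = 100.0, 70.0

closed_form = theta * (1 - math.exp(-L / theta))

# Crude midpoint Riemann sum of the small-loss piece, plus the large-loss piece L*S(L).
n = 100000
dx = L / n
small_piece = sum((i + 0.5) * dx * math.exp(-(i + 0.5) * dx / theta) / theta * dx
                  for i in range(n))
numerical = small_piece + L * math.exp(-L / theta)

print(closed_form, numerical)   # both about 50.34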


Problems:
5.1 (1 point) Insureds suffer six losses of sizes: 3, 8, 13, 22, 35, 62.
What is the Limited Expected Value, for a Limit of 25?
A. 15
B. 16
C. 17
D. 18
E. 19
5.2 (1 point) You observe 5 losses of sizes: 15, 35, 70, 90, 140.
What is the Limited Expected Value at 50?
A. 10
B. 20
C. 30
D. 40

E. 50

Use the following information for the next two questions:
• Frequency is Poisson with λ = 20.
• E[X] = $10,000.
• E[X ∧ 25,000] = $8000.
5.3 (1 point) If there is no policy limit, what is the expected aggregate annual loss?
5.4 (1 point) If there is a 25,000 policy limit, what is the expected aggregate annual loss?

5.5 (2 points) For an insurance policy, you are given:


(i) The policy limit is 100,000 per loss, with no deductible.
(ii) Expected aggregate losses are 1,000,000 annually.
(iii) The number of losses follows a Poisson distribution.
(iv) The claim severity distribution has:
S(50,000) = 10%.
S(100,000) = 4%.
E[X ∧ 50,000] = 28,000.
E[X ∧ 100,000] = 32,000.
(v) Frequency and severity are independent.
Determine the probability that no losses will exceed 50,000 during the next year.
(A) 3.0%
(B) 3.5%
(C) 4.0%
(D) 4.5%
(E) 5.0%
5.6 (1 point) E[X ∧ 5000] = 3200.

Size              Number of Losses    Dollars of Loss
0 to 5000         170                 ???
5001 to 25,000    60                  700,000
over 25,000       20                  ???

Determine E[X ∧ 25,000].
(A) 5600
(B) 5800
(C) 6000
(D) 6200
(E) 6400


5.7 (4, 11/01, Q.36) (2.5 points) For an insurance policy, you are given:
(i) The policy limit is 1,000,000 per loss, with no deductible.
(ii) Expected aggregate losses are 2,000,000 annually.
(iii) The number of losses exceeding 500,000 follows a Poisson distribution.
(iv) The claim severity distribution has
Pr(Loss > 500,000) = 0.0106
E[min(Loss; 500,000)] = 20,133
E[min(Loss; 1,000,000)] = 23,759
Determine the probability that no losses will exceed 500,000 during 5 years.
(A) 0.01
(B) 0.02
(C) 0.03
(D) 0.04
(E) 0.05


Solutions to Problems:
5.1. B. E[X ∧ 25] = (3 + 8 + 13 + 22 + 25 + 25) / 6 = 96 / 6 = 16.
5.2. D. E[X ∧ 50] = (15 + 35 + 50 + 50 + 50)/5 = 40.

5.3. (20)($10000) = $200,000.


5.4. (20)($8000) = $160,000.
5.5. D. 1,000,000 = expected annual aggregate loss = (mean frequency) E[X ∧ 100,000] =
(mean frequency)(32,000). Mean frequency = 1 million / 32,000 = 31.25 losses per year.
The expected number of losses exceeding 50,000 is: (31.25) S(50,000) = 3.125.
The large losses are Poisson; the chance of having zero of them is: e^(-3.125) = 4.4%.
Comment: Similar to 4, 11/01, Q.36.
5.6. E. E[X ∧ 25,000] =
E[X ∧ 5000] + (contribution above 5000 from medium claims)
+ (contribution above 5000 from large claims)
= 3200 + {700,000 - (60)(5000)} / 250 + (20)(25,000 - 5000) / 250 = 6400.
Alternately, let y be the dollars of loss on losses of size 0 to 5000.
Then, 3200 = E[X ∧ 5000] = {y + (5000)(60 + 20)} / 250. y = 400,000.
E[X ∧ 25,000] = {400,000 + 700,000 + (20)(25,000)} / 250 = 6400.
Comment: Each loss of size more than 25,000 contributes an additional 20,000 to E[X ∧ 25,000],
compared to E[X ∧ 5000].
Each loss of size 5001 to 25,000 contributes an additional x - 5000 to E[X ∧ 25,000],
compared to E[X ∧ 5000].
5.7. A. 2,000,000 = expected annual aggregate loss = (mean frequency) E[X ∧ 1 million] =
(mean frequency)(23,759). Therefore, mean frequency = 2 million / 23,759 = 84.18 per year.
Assuming frequency and severity are independent, the expected number of losses exceeding
1/2 million is: (84.18)(0.0106) = 0.892 per year.
Over 5 years we expect (5)(0.892) = 4.461 losses > 1/2 million. Since we are told these losses are
Poisson Distributed, the chance of having zero of them is: e^(-4.461) = 0.012.
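A minimal Python sketch (illustrative only) of the calculation pattern used in 5.5 and 5.7: infer the mean frequency from the expected aggregate losses, scale by the probability of a large loss, and then use the Poisson probability of zero such losses.

import math

# Numbers from 4, 11/01, Q.36 (solution 5.7 above).
agg_expected = 2_000_000          # expected aggregate losses per year
lev_at_limit = 23_759             # E[X ^ 1,000,000]
prob_large   = 0.0106             # Pr(Loss > 500,000)
years        = 5

mean_freq = agg_expected / lev_at_limit          # about 84.18 losses per year
lambda_large = mean_freq * prob_large * years    # expected large losses in 5 years
print(math.exp(-lambda_large))                   # about 0.012, answer (A)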


Section 6, Losses Eliminated


Assume an (ordinary) deductible of $10,000 and the ground up36 loss sizes from Section 1. Then
the insurer would pay nothing for the first 8 losses, each of which is less than the $10,000 deductible.
For the ninth loss of size $10,400, the insurer would pay $400 while the insured would have to
absorb $10,000. For a loss of $37,300, the insurer would pay:
$37,300 - $10,000 = $27,300. Similarly, for each of the larger losses $10,000 is eliminated, from
the point of view of the insurer.
The total dollars of loss eliminated is computed by summing up the sizes of loss for all losses less
than the deductible amount of $10,000, and adding to that the sum of $10,000 per each loss greater
than or equal to $10,000. In this case the losses eliminated are:
$35,200 + (122)($10,000) = $1,255,200. Note that the Empirical Losses Eliminated are a
continuous function of the deductible amount; a small increase in the deductible amount produces a
corresponding small increase in the empirical losses eliminated.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How many dollars of loss are eliminated by a deductible of $1000?
[Solution: $300 + $600 + (3)($1000) = $3900.]
Let N be the total number of losses.
Then the Losses Eliminated by a deductible d would be written for a continuous size of loss
distribution as the sum of the same two pieces, the contribution of the small losses plus the
contribution of the large losses:

N ∫_0^d x f(x) dx + N d S(d).

The first piece is the sum of losses less than d. (We have multiplied by the total number of losses
since f(x) is normalized to integrate to unity.) The second piece is the number of losses greater than
d times d per such loss. Note that the losses eliminated are just the number of losses times the
Limited Expected Value.

Losses Eliminated by deductible d are: N E[X ∧ d].

³⁶ By ground up I mean the economic loss to the insured, prior to the impact of any deductible.


Loss Elimination Ratio:


The total losses in Section 1 are $40,647,700. Therefore, the $1,255,200 of losses eliminated by a
deductible of size $10,000 represent $1255200 / $40647700 = 3.09% of the total losses.
This corresponds to an empirical Loss Elimination Ratio (LER) at 10,000 of 3.09%.
Loss Elimination Ratio at d = LER(d) =

Losses Eliminated by a deductible of size d


.
Total Losses

In general the LER is the ratio of the losses eliminated to the total losses. Since its numerator is
continuous while its denominator is independent of the deductible amount, the empirical loss
elimination ratio is a continuous function of the deductible amount.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the (empirical) loss elimination ratio at $1000?
[Solution: $3900 losses are eliminated out of a total of 300 + 600 + 1200 + 1500 + 2800 = 6400.
Therefore, LER(1000) = 3900/6400 = 60.9%.]
The loss elimination ratio at x can be written as:

LER(x) = (dollars of loss limited by x) / (total losses)
= {(dollars of loss limited by x) / N} / {(total losses) / N} = E[X ∧ x] / Mean.

LER(x) = E[X ∧ x] / E[X].

For example, for the ungrouped data in Section 1, E[X ∧ 10000] is equal to the losses eliminated
by a deductible of 10,000: $1,255,200, divided by the total number of losses 130.
E[X ∧ 10000] = 1,255,200 / 130 = 9655.4.
The mean is the total losses of $40,647,700 divided by 130. E[X] = 40,647,700/130 = 312,675.
Therefore, LER(10000) = E[X ∧ 10000] / E[X] = 9655.4 / 312,675 = 3.09%,
matching the previous computation of LER(10000).
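A minimal Python sketch (illustrative sample, not from the text) of the empirical loss elimination ratio; it reproduces the 60.9% from the exercise above:

# Empirical loss elimination ratio: losses eliminated by a deductible d,
# divided by total losses.  Equivalently E[X ^ d] / E[X].
losses = [300, 600, 1200, 1500, 2800]

def ler(d):
    eliminated = sum(min(x, d) for x in losses)
    return eliminated / sum(losses)

print(ler(1000))   # 0.609..., i.e. about 60.9%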


Problems:
6.1 (1 point) Insureds suffer six losses of sizes: 3, 8, 13, 22, 35, 62.
What is the Loss Elimination Ratio, for a deductible of 10?
A. less than 0.37
B. at least 0.37 but less than 0.39
C. at least 0.39 but less than 0.41
D. at least 0.41 but less than 0.43
E. at least 0.43
6.2 (1 point) You observe 5 losses of sizes: 15, 35, 70, 90, 140.
What is the Loss Elimination Ratio at 50?
A. 54%
B. 57%
C. 60%
D. 63%
E. 66%
6.3 (2 points) You observe the following payments on 6 losses with no deductible applied:
$200, $300, $400, $800, $900, $1,600.
Let A be the loss elimination ratio (LER) for a $500 deductible.
Let B be the loss elimination ratio (LER) for a $1000 deductible. Determine B - A.
A. 30%
B. 35%
C. 40%
D. 45%
E. 50%
6.4 (4, 5/89, Q.57) (1 point) Given the following payments on 6 losses, calculate the loss
elimination ratio (LER) for a $300 deductible (assume the paid losses had no deductible applied).
Paid Losses: $200, $300, $400, $800, $900, $1,600.
A. LER < .40   B. .40 < LER ≤ .41   C. .41 < LER < .42   D. .42 < LER ≤ .43   E. .43 < LER
6.5 (CAS5, 5/03, Q.38) (3 points) Given the information below, calculate the loss elimination ratio
for ABC Company's collision coverage in State X at a $250 deductible. Show all work.

ABC insures 5,000 cars at a $250 deductible with the following fully credible data on the
collision claims:
Paid losses are $1,000,000 per year.
The average number of claims per year is 500.

A fully credible study found that in State X:


The average number of car accidents per year involving collision damage was 10,000.
The average number of vehicles was 67,000.

Assume ABC Company's expected ground-up claims frequency is equal to that of State X.
Assume the average size of accidents that fall below the deductible is $150.

2012-4-2,

Loss Distributions, 6 Losses Eliminated

HCM

10/8/12,

Page 66

Solutions to Problems:
6.1. A. LER(10) = Losses Eliminated / Total Losses =
(3 + 8 + 10 + 10 + 10 + 10) / (3 + 8 + 13 + 22 + 35 + 62) = 51 / 143 = 0.357.
6.2. B. E[X] = (15 + 35 + 70 + 90 + 140)/5 = 70. LER(50) = E[X ∧ 50]/E[X] = 40/70 = 0.571.

6.3. A. The Losses Eliminated for a $500 deductible are: 200 + 300 + 400 + (3)(500) = 2400.
The total losses are 4200.
Thus LER(500) = Losses Eliminated / Total Losses = 2400/4200 = .571.
Losses Eliminated for a $1000 deductible are: 200 + 300 + 400 + 800 + 900 + 1000 = 3600.
Thus LER(1000) = Losses Eliminated / Total Losses = 3600/4200 = .857.
LER(1000) - LER(500) = .857 - .571 = 0.286.
6.4. B. The Losses Eliminated are: (200)+(300)+(4)(300) = 1700. The total losses are 4200.
Thus the LER = Losses Eliminated / Total Losses = 1700/4200 = 0.405.
6.5. Accident Frequency for State X is: 10,000/67,000 = 14.925%.
For 5000 cars, expect: (14.925%)(5000) = 746.3 accidents.
There were 500 claims, in other words 500 accidents of size greater than the $250 deductible.
Thus we infer: 746.3 - 500 = 246.3 small accidents.
These small accidents had average size $150, for a total of: (246.3)($150) = $36,945.
Deductible eliminates $250 for each large accident, for a total of: ($250)(500) = $125,000.
Losses eliminated = $36,945 + $125,000 = $161,945.
Total losses = losses eliminated + losses paid = $161,945 + $1,000,000 = $1,161,945.
LER at $250 = Losses Eliminated / Total Losses = $161,945 / $1,161,945 = 13.9%.
Alternately, frequency of loss = 10,000/67,000 = 14.925%.
Frequency of claims (accidents of size > 250) = 500/5000 = 10%.
S(250) = 10%/14.925% = 0.6700. F(250) = 1 - S(250) = 0.3300.
Average size of accidents that fall below the deductible = average size of small accidents =
$150 = {E[X ∧ 250] - 250 S(250)}/F(250) = {E[X ∧ 250] - ($250)(0.67)}/0.33.
⇒ E[X ∧ 250] = (0.33)($150) + (0.67)($250) = $217.
Average payment per non-zero payment = $1,000,000/500 =
$2000 = (E[X] - E[X ∧ 250])/S(250) = (E[X] - E[X ∧ 250])/0.67.
⇒ E[X] - E[X ∧ 250] = $1340. E[X] = $1340 + $217 = $1557.
LER(250) = E[X ∧ 250] / E[X] = $217/$1557 = 13.9%.


Section 7, Excess Losses


The dollars of loss excess of $10,000 per loss are also of interest. These are precisely the dollars of
loss not eliminated by a deductible of size $10,000. For the ungrouped data in Section 1, the
losses excess of $10,000 are $40,647,700 - $1,255,200 = $39,392,500.
(X - d)+ ≡ 0 when X ≤ d, X - d when X > d.³⁷

(X - d)+ is the amount paid to an insured with a deductible of d.

The insurer pays nothing if X ≤ d, and pays X - d if X > d.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is (X - 1000)+?
[Solution: 0, 0, $200, $500, and $1800.]
(X - d)+ is referred to as the left censored and shifted variable at d.³⁸
(X - d)+ ≡ left censored and shifted variable at d
≡ 0 when X ≤ d, X - d when X > d ≡ the amounts paid to an insured with a deductible of d
≡ payments per loss, including when the insured is paid nothing due to the deductible of d
≡ amount paid per loss.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is E[(X - 1000)+]?
[Solution: (0 + 0 + $200 + $500 + $1800)/ 5 = $500.]
The expected losses excess of 10,000 per loss would be written for a continuous size of loss
distribution as:

E[(X - 10,000)+] = Losses Excess of 10,000 per loss = ∫_10,000^∞ (x - 10,000) f(x) dx.

Note that we only integrate over those losses greater than $10,000 in size, since smaller losses
contribute nothing to the excess losses. Also larger losses only contribute the amount by which each
exceeds $10,000.

³⁷ The + refers to taking the variable X - d when it is positive, and otherwise setting the result equal to zero.
³⁸ Censoring will be discussed in a subsequent section. See Definition 3.5 in Loss Models.

E[(X - 10,000)+] = ∫_10,000^∞ (x - 10,000) f(x) dx = ∫_10,000^∞ x f(x) dx - 10,000 ∫_10,000^∞ f(x) dx

= ∫_0^∞ x f(x) dx - {∫_0^10,000 x f(x) dx + 10,000 S(10,000)} = E[X] - E[X ∧ 10,000].

Losses Excess of L per loss = E[(X - L)+] = E[X] - E[X ∧ L].

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
Show that E[(X - 1000)+] = E[X] - E[X ∧ 1000].
[Solution: E[X] - E[X ∧ 1000] = 1280 - 780 = 500 = E[(X - 1000)+].]

Exercise: For an Exponential Distribution with θ = 100, what is E[(X - 70)+]?
[Solution: E[(X - 70)+] = E[X] - E[X ∧ 70] = 100 - 100(1 - e^(-70/100)) = 49.7.]
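A minimal Python sketch (illustrative only) showing that the empirical E[(X - d)+] matches E[X] - E[X ∧ d], using the five losses from the exercises above:

# E[(X - d)+]: the average payment per loss under a deductible d,
# which equals E[X] - E[X ^ d].
losses = [300, 600, 1200, 1500, 2800]
d = 1000
n = len(losses)

expected_excess = sum(max(x - d, 0) for x in losses) / n        # 500.0
mean = sum(losses) / n                                          # 1280.0
lev = sum(min(x, d) for x in losses) / n                        # 780.0
print(expected_excess, mean - lev)                              # both 500.0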
Excess Ratio:39
The Excess Ratio is the losses excess of the given limit divided by the total losses.
Excess Ratio at x = R(x) ≡ (losses excess of x)/(total losses)
= E[(X - x)+] / E[X] = (E[X] - E[X ∧ x]) / E[X].
Therefore, for the data in Section 1, the empirical Excess Ratio,
R(10,000) = (40,647,700 - 1,255,200) / 40,647,700 = 96.91%.
Note that: R(10,000) = 96.91% = 1 - 3.09% = 1 - LER(10,000).
R(10,000) = (losses excess of 10,000) / (total losses) = N E[(X - 10,000)+] / (N E[X]) =
(E[X] - E[X ∧ 10,000]) / E[X] = 1 - E[X ∧ 10,000] / E[X] = 1 - LER(10,000).
R(x) = 1 - LER(x) = 1 - E[X ∧ x] / E[X].
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the (empirical) excess ratio at $1000?
[Solution: R(1000) = (200 + 500 + 1800)/6400 = 39.1% = 1 - 60.9% = 1 - LER(1000)
= 1 - E[X ∧ 1000] / E[X] = 1 - 780/1280.]
39

Loss Models does not use the commonly used term Excess Ratio. However, this important concept may help you
to understand and answer questions. Since the Excess Ratio is just one minus the Loss Elimination Ratio, one can
always work with the Loss Elimination Ratio instead of the Excess Ratio.


One can also write the Excess Ratio in terms of integrals as:

R(L) = [∫_L^∞ (x - L) f(x) dx] / [∫_0^∞ x f(x) dx] = [∫_L^∞ x f(x) dx - L S(L)] / [∫_0^∞ x f(x) dx].

However, in order to compute the excess ratio or loss elimination ratio, it is usually faster to use the
formulas in Appendix A of Loss Models for the Mean and Limited Expected Value.
Exercise: For an Exponential Distribution with θ = 100, what is R(70)?
[Solution: R(70) = 1 - E[X ∧ 70]/E[X] = 1 - 100(1 - e^(-70/100))/100 = 49.7%.]
Total Losses = Limited Losses + Excess Losses:

Exercise: For a loss of size 6 and a loss of size 15, list X ∧ 10, (X - 10)+, and (X ∧ 10) + (X - 10)+.
[Solution:
X     X ∧ 10    (X - 10)+    (X ∧ 10) + (X - 10)+
6     6         0            6
15    10        5            15 ]

In general, X = (X ∧ d) + (X - d)+.

In other words, buying two policies, one with a policy limit of 1000 (and no deductible), and another
with a deductible of 1000 (and no policy limit), provides the same coverage as a single policy with
no deductible or policy limit.
A deductible of 1000 caps the policyholder's payments at 1000, so from his point of view the 1000
deductible acts as a limit. The policyholder's retained loss is: X ∧ 1000. The insurer's payment to
the policyholder is: (X - 1000)+. Together they total to the loss, X.
A deductible from one point of view is a policy limit from another point of view.⁴⁰ Remember the
losses eliminated by a deductible of size 1000 are E[X ∧ 1000], the same expression as the
losses paid under a policy with limit of size 1000 (and no deductible).
X = (X ∧ d) + (X - d)+.    E[X] = E[X ∧ d] + E[(X - d)+].    E[(X - d)+] = E[X] - E[X ∧ d].

Expected Excess = Expected Total Losses - Expected Limited Losses.


40

An insurer who buys reinsurance with a per claim deductible of 1 million, has capped its retained losses at
1 million per claim. In that sense the 1 million deductible from the point of view of the reinsurer acts as if the insurer
had sold policies with a 1 million policy limit from the point of view of the insurer.


Problems:
7.1 (1 point) Insureds suffer six losses of sizes: 3, 8, 13, 22, 35, 62.
What is the Excess Ratio, excess of 30?
A. less than .16
B. at least .16 but less than .19
C. at least .19 but less than .22
D. at least .22 but less than .25
E. at least .25
7.2 (2 points) Determine the excess ratio at $200,000.

Frequency of Losses    Dollar Amount
40%                    $5,000
20%                    $10,000
15%                    $25,000
10%                    $50,000
5%                     $100,000
4%                     $250,000
3%                     $500,000
2%                     $1,000,000
1%                     $2,000,000
7.3 (1 point) X is 70 with probability 40% and 700 with probability 60%.
Determine E[(X - 100)+ ].
A. less than 345
B. at least 345 but less than 350
C. at least 350 but less than 355
D. at least 355 but less than 360
E. at least 360
7.4 (1 point) X is 5 with probability 80% and 25 with probability 20%.
If E[(X - d)+ ] = 3, determine d.
A. 4   B. 6   C. 8   D. 10   E. 12



Solutions to Problems:
7.1. E. R(30) = (dollars excess of 30) / (total dollars) =
(5 + 32) / (3 + 8 + 13 + 22 + 35 + 62) = 37 / 143 = 0.259.
7.2. Excess Ratio = expected excess losses / expected total losses = 45,000/82,750 = 54.4%.

Probability    Amount        Product    Excess of 200,000    Product
0.4            $5,000        $2,000     $0                   $0
0.2            $10,000       $2,000     $0                   $0
0.15           $25,000       $3,750     $0                   $0
0.1            $50,000       $5,000     $0                   $0
0.05           $100,000      $5,000     $0                   $0
0.04           $250,000      $10,000    $50,000              $2,000
0.03           $500,000      $15,000    $300,000             $9,000
0.02           $1,000,000    $20,000    $800,000             $16,000
0.01           $2,000,000    $20,000    $1,800,000           $18,000
Total                        $82,750                         $45,000
7.3. E. (70 - 100)+ = 0. (700 - 100)+ = 600. E[(X - 100)+ ] = (40%)(0) + (60%)(600) = 360.
7.4. D. E[(X - 5)+ ] = (0)(80%) + (25 - 5)(20%) = 4 > 3. d must be greater than 5.
Therefore, E[(X - d)+ ] = (.2)(25 - d) = 3. d = 10.
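A minimal Python sketch (illustrative only) of the discrete-severity computation in 7.2: expected excess losses over expected total losses, for a severity taking finitely many values.

# Discrete severity: amount -> probability, as in problem 7.2.
severity = {5_000: 0.40, 10_000: 0.20, 25_000: 0.15, 50_000: 0.10,
            100_000: 0.05, 250_000: 0.04, 500_000: 0.03,
            1_000_000: 0.02, 2_000_000: 0.01}

def excess_ratio(limit):
    mean = sum(p * x for x, p in severity.items())
    excess = sum(p * max(x - limit, 0) for x, p in severity.items())
    return excess / mean

print(excess_ratio(200_000))   # about 0.544, matching solution 7.2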


Section 8, Mean Excess Loss


The Excess Loss Variable for d is defined for X > d as X - d, and is undefined for X ≤ d.⁴¹
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the Excess Loss Variable for $1000?
[Solution: undefined, undefined, $200, $500, $1,800.]
Excess Loss Variable for d ≡ the nonzero payments excess of a deductible of d
≡ X - d for X > d ≡ truncated from below at d and shifted⁴²
≡ amount paid per (non-zero) payment.
The Excess Loss Variable at d, which could be called the left truncated and shifted variable at d, is
similar to (X - d)+, the left censored and shifted variable at d. However, the Excess Loss Variable at
d is undefined for X ≤ d, while in contrast (X - d)+ is zero for X ≤ d.
Excess Loss Variable ≡ undefined for X ≤ d ≡ amount paid per (non-zero) payment.
(X - d)+ ≡ 0 for X ≤ d ≡ amount paid per loss.
Exercise: An insured has four losses of size: 700, 3500, 16,000 and 40,000.
What are the excess loss variable at 5000, the left censored and shifted variable at 5000, and the
limited loss variable at 5000?
[Solution: Excess Loss Variable at 5000: 11,000 and 35,000, corresponding to the last two losses.
(It is not defined for the first two losses of size less than 5000.)
Left censored and shifted variable at 5000: 0, 0, 11,000 and 35,000.
Limited Loss Variable at 5000: 700, 3500, 5000, 5000.]

⁴¹ See Definition 3.4 in Loss Models.
⁴² Truncation will be discussed in a subsequent section.


Mean Residual Life / Mean Excess Loss:


The mean of the excess loss variable for d =
the mean excess loss, e(d) =
(Losses Excess of d) / (number of losses > d) =
(E[X] - E[X ∧ d])/S(d) =
the average payment per (nonzero) payment with a deductible of d.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the (empirical) mean excess loss at $1000?
[Solution: e(1000) = ($200 + $500 + $1,800)/3 = $833.33.]
Note that the first step in computing e(1000) is to ignore the two losses that died before 1000.
Then one computes the average lifetime beyond 1000 for the 3 remaining losses.43
In this situation, on a policy with a $1000 deductible, the insurer would make 3
(non-zero) payments totaling $2500, for an average (non-zero) payment of $833.33.
The Mean Residual Life or Mean Excess Loss⁴⁴ at x, e(x), is defined as the average dollars of
loss above x on losses of size exceeding x.
For the ungrouped data in Section 1, there are 122 losses of size greater than $10,000 and they
have (40,647,700 - 1,255,200) dollars of loss above $10,000.⁴⁵ Therefore, e(10,000) =
$39,392,500 / 122 = $322,889. This can also be written as:
{mean - E[X ∧ 10,000]} / S(10,000) = ($312,674.6 - $9655.4) / (122/130) = $322,889.

e(x) = (E[X] - E[X ∧ x]) / S(x).

Note that the empirical mean excess loss is discontinuous. While the excess losses in the numerator
are continuous, the empirical survival function in the denominator is discontinuous. The denominator
has a jump discontinuity at every observed claim size.

⁴³ In Life Contingencies, this is how one computes the mean residual life. See Actuarial Mathematics.
⁴⁴ See Definition 3.4 in Loss Models.
⁴⁵ Note that only losses which exceed the limit even enter into the computation; we ignore small losses. Thus the
denominator in this case is 122 rather than 130.


The (theoretical) mean excess loss would be written for a continuous size of loss distribution as:

e(L) = [∫_0^∞ x f(x) dx - {∫_0^L x f(x) dx + L S(L)}] / S(L) = [∫_L^∞ x f(x) dx - L S(L)] / S(L).

The numerator of e(L) is the losses excess of L divided by the total number of losses; this is equal to
the excess ratio R(L) times the mean.
Thus e(x) = R(x) mean / S(x).
Specifically, for the ungrouped data in Section 1,
e(10000) = (96.91%)(312,674.6) / (122/130) ≈ 322,883.
One can also write e(L) as:

e(L) = [∫_L^∞ x f(x) dx] / S(L) - L = (dollars on losses of size > L) / (# of losses of size > L) - L.

Thus, e(x) = (average size of those losses of size greater than x) - x.

Summary of Related Ideas:

Loss Elimination Ratio at x = LER(x) = E[X ∧ x] / E[X].

Excess Ratio at x = R(x) = (E[X] - E[X ∧ x]) / E[X] = 1 - E[X ∧ x]/E[X] = 1 - LER(x).

Mean Residual Life at x = Mean Excess Loss at x = e(x) = (E[X] - E[X ∧ x]) / S(x).

On the exam, one wants to avoid doing integrals if at all possible. Therefore, one should use the
formulas for the Limited Expected Value, E[X ∧ x], in Appendix A of Loss Models whenever
possible. Those who are graphically oriented may find the Section on Lee Diagrams helps them to
understand these concepts.
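The three related quantities are easy to tabulate once E[X ∧ x], E[X], and S(x) are available. A minimal Python sketch (illustrative only), using an Exponential with θ = 100 as in the earlier exercises:

import math

# LER, excess ratio R, and mean excess loss e for an Exponential with mean theta.
theta = 100.0

def lev(x):      return theta * (1 - math.exp(-x / theta))    # E[X ^ x]
def survival(x): return math.exp(-x / theta)                  # S(x)

def ler(x):  return lev(x) / theta
def R(x):    return 1 - ler(x)
def e(x):    return (theta - lev(x)) / survival(x)

print(ler(70), R(70), e(70))   # about 0.503, 0.497, and 100 (the Exponential is memoryless)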


Exercise: For the data in Section 1, what are the Limited Expected Value, Loss Elimination Ratio, the
Excess Ratio, and the mean excess loss, all at $25,000?
[Solution: The Limited Expected Value at $25,000 is:
{(sum of losses < $25,000) + ($25,000)(# losses > $25,000)} / (# losses)
= {$232,500 + ($25,000)(109)} / 130 = $2,957,500 / 130 = $22,750.
The Loss Elimination Ratio at $25,000 is:
E[X ∧ 25000] / mean = $22,750 / $312,674.6 = 7.28%.
The Excess Ratio at $25,000 = R(25000) = 1 - LER(25000) = 1 - 7.28% = 92.72%.
The mean excess loss at $25,000 = e(25000) = (mean - E[X ∧ 25000]) / S(25000)
= ($312,674.6 - $22,750) / (109/130) = $345,782.]

Hazard Rate / Failure Rate:

The failure rate, force of mortality, or hazard rate, is defined as:
h(x) = f(x)/S(x), x ≥ 0.
For a given age x, the hazard rate is the density of the deaths, divided by the number of people still
alive at age x.
The hazard rate determines the survival (distribution) function and vice versa:

S(x) = exp[ - ∫_0^x h(t) dt ].        -d ln[S(x)] / dx = h(x).

As will be discussed in a subsequent section, the limit as x approaches infinity of e(x) is equal
to the limit as x approaches infinity of 1/h(x). These behaviors will be used to distinguish
the tails of distributions.
Exercise: For the data in Section 1, estimate the empirical hazard rate at $25,000.
[Solution: There is no unique estimate of the hazard rate at 25,000.
However, there are 109 claims greater than 25,000 and 3 claims within 5000 of 25,000.
Thus the density at 25,000 is about 3/5000, while the empirical survival function is 109/130.
Therefore, h(25000) ≈ (3/5000)/(109/130) = 0.0007.]
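A minimal Python sketch (illustrative only) of the relationship S(x) = exp(-∫₀ˣ h(t) dt), checked numerically for an Exponential, whose hazard rate is constant at 1/θ:

import math

# For an Exponential with mean theta, h(x) = f(x)/S(x) = 1/theta for all x,
# so exp(-integral of h from 0 to x) should reproduce S(x) = exp(-x/theta).
theta, x = 100.0, 70.0

def hazard(t):
    return (math.exp(-t / theta) / theta) / math.exp(-t / theta)   # equals 1/theta

# Crude numerical integration of the hazard rate from 0 to x.
n = 10000
integral_h = sum(hazard((i + 0.5) * x / n) * (x / n) for i in range(n))

print(math.exp(-integral_h), math.exp(-x / theta))   # both about 0.4966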


Problems:
8.1 (1 point) Insureds suffer six losses of sizes: 3, 8, 13, 22, 35, 62.
What is the empirical mean excess loss at 20?
A. less than 16
B. at least 16 but less than 18
C. at least 18 but less than 20
D. at least 20 but less than 22
E. at least 22
8.2 (2 points) Match the concepts.
1. Limited Loss Variable at 20        a. 3, 8, 13, 20, 20, 20.
2. Excess Loss Variable at 20         b. 2, 15, 42.
3. (X - 20)+                          c. 0, 0, 0, 2, 15, 42.
A. 1a, 2b, 3c   B. 1a, 2c, 3b   C. 1b, 2a, 3c   D. 1b, 2c, 3a   E. 1c, 2b, 3a

8.3 (2 points) The random variable for a loss, X, has the following characteristics:
x        F(x)    Limited Expected Value at x
0        0.0     0
500      0.3     360
1000     0.9     670
5000     1.0     770
Calculate the mean excess loss for a deductible of 500.
A. less than 600
B. at least 600 but less than 625
C. at least 625 but less than 650
D. at least 650 but less than 675
E. at least 675
8.4 (2 points) You are given S(60) = .50, S(70) = .40, and e(70) = 13.
Assuming the survival function is a straight line between ages 60 and 70, estimate e(67).
A. 14.8
B. 15.0
C. 15.2
D. 15.4
E. 15.6
8.5 (1 point) You observe 5 losses of sizes: 15, 35, 70, 90, 140.
What is the Mean Excess Loss at 50?
A. 10
B. 20
C. 30
D. 40
E. 50


8.6 (4, 5/88, Q.60) (1 point) What is the empirical mean residual life at x = 4 given the following
sample of total lifetimes:
3, 2, 5, 8, 10, 1, 6, 9.
A. Less than 1.5
B. At least 1.5, but less than 2.5
C. At least 2.5, but less than 3.5
D. 3.5 or more
E. Cannot be determined from the information given
8.7 (4B, 5/93, Q.25) (2 points) The following random sample has been observed:
2.0, 10.3, 4.8, 16.4, 21.6, 3.7, 21.4, 34.4
Calculate the value of the empirical mean excess loss function for x = 8.
A. less than 7.00
B. at least 7.00 but less than 9.00
C. at least 9.00 but less than 11.00
D. at least 11.00 but less than 13.00
E. at least 13.00
8.8 (4B, 11/94, Q.16) (1 point) A random sample of auto glass claims has yielded the following
five observed claim amounts:
100, 125, 200, 250, 300.
What is the value of the empirical mean excess loss function at x = 150?
A. 75
B. 100
C. 200
D. 225
E. 250
8.9 (3, 11/01, Q.35 & 2009 Sample Q.101) (2.5 points)
The random variable for a loss, X, has the following characteristics:
x       F(x)    Limited Expected Value at x
0       0.0     0
100     0.2     91
200     0.6     153
1000    1.0     331
Calculate the mean excess loss for a deductible of 100.
(A) 250
(B) 300
(C) 350
(D) 400
(E) 450

2013-4-2,

Loss Distributions, 8 Mean Excess Loss

HCM 10/8/12,

Page 78

Solutions to Problems:
8.1. C. e(20) = (dollars excess of 20) / (# claims greater than 20) = (2 + 15 + 42) / 3 =
59 /3 = 19.7.
8.2. A. Limited Loss Variable at 20, limit each large loss to 20: 3, 8, 13, 20, 20, 20.
Excess Loss Variable at 20: 2, 15, 42, corresponding to 20 subtracted from each of the last 3
losses. It is not defined for the first 3 losses, each of size less than 20.
(X - 20)+ is 0 for X ≤ 20, and X - 20 for X > 20: 0, 0, 0, 2, 15, 42.
8.3. A. F(5000) = 1 ⇒ E[X] = E[X ∧ 5000] = 770.
e(500) = (E[X] - E[X ∧ 500])/S(500) = (770 - 360)/(1 - 0.3) = 586.
Comment: Similar to 3, 11/01, Q.35.
8.4. B. Years excess of 70 = S(70) e(70) = (0.4)(13) = 5.2.
S(70) = 0.40, S(69) ≈ 0.41, S(68) ≈ 0.42, S(67) ≈ 0.43.
Years lived between ages 67 and 70 ≈ 0.425 + 0.415 + 0.405 = 1.245.
e(67) = (years excess of 67)/S(67) ≈ (5.2 + 1.245)/0.43 = 15.0.
8.5. E. e(50) = (20 + 40 + 90)/3 = 50.
Alternately, e(50) = (E[X] - E[X ∧ 50])/S(50) = (70 - 40)/(1 - 0.4) = 50.
8.6. D. We ignore all claims of size 4 or less. Each of the 5 claims greater than 4 contributes the
amount by which it exceeds 4. The empirical mean excess loss at x=4 is:
{(5-4) + (8-4) + (10-4) + (6-4) + (9-4)} / 5 = 18/5 = 3.6.
8.7. D. To compute the mean excess loss at 8, we only look at accidents greater than 8.
There are 5 such accidents, and we compute the average amount by which they exceed 8:
e(8) = (2.3 +8.4 + 13.6 + 13.4 + 26.4) / 5 = 64.1 / 5 = 12.82.
8.8. B. Add up the dollars excess of 150 and divide by the 3 claims of size exceeding 150.
e(150) = (50 + 100 + 150) / 3 = 100.
8.9. B. F(1000) = 1 ⇒ E[X] = E[X ∧ 1000] = 331.
e(100) = (E[X] - E[X ∧ 100])/S(100) = (331 - 91)/(1 - 0.2) = 300.


Section 9, Layers of Loss


Actuaries, particularly those working with reinsurance, often look at the losses in a layer. The following
diagram shows how the Layer of Loss between $10,000 and $25,000 relates to three specific
claims of size: $30,000, $16,000 and $7,000.
The claim of size $30,000 contributes to the layer $15,000, the width of the layer, since it is larger
than the upper boundary of the layer. The claim of size $16,000 contributes to the layer $16,000 - $10,000 = $6,000; since it is between the two boundaries of the layer it contributes its size minus
the lower boundary of the layer. The claim of size $7,000 contributes nothing to the layer, since it is
smaller than the lower boundary of the layer.
[Diagram: a vertical scale of loss size showing the layer boundaries at $10,000 and $25,000, with the three claims of $30,000, $16,000, and $7,000 marked against them.]

For example, for the data in Section 1 the losses in the layer between $10,000 and $25,000 are
calculated in three pieces. The 8 losses smaller than $10,000 contribute nothing to this layer. The 13
losses between $10,000 and $25,000 each contribute their value minus $10,000. This sums to
$67,300. The remaining 109 losses which are bigger than the upper limit of the interval at $25,000,
each contribute the width of the interval, $25,000 - $10,000 = $15,000.
Thus the total losses in the layer between $10,000 and $25,000 are: 0 + $67,300 +
(109)($15,000) = $1,702,300. This is $1,702,300 / $40,647,700 = 4.19% of the total losses.


Exercise: For the data in Section 1, what is the percentage of total losses in the layer from $25,000
to $50,000?
[Solution: $2,583,100 / $40,647,700 = 6.35%.]
For a continuous size of loss distribution, the percentage of losses in the layer from $10,000 to
$25,000 would be written as:

[∫_10,000^25,000 (x - 10,000) f(x) dx + S(25,000)(25,000 - 10,000)] / ∫_0^∞ x f(x) dx.

The percentage of losses in a layer can be rewritten in terms of limited expected values.
The percentage of losses in the layer from $10,000 to $25,000 is:
(E[X ∧ 25000] - E[X ∧ 10000]) / mean = (22,750 - 9655.4) / 312,675 = 4.19%.
This can also be written in terms of the Loss Elimination Ratios:
LER(25000) - LER(10000) = 7.28% - 3.09% = 4.19%.
This can also be written in terms of the Excess Ratios (with the order reversed):
R(10000) - R(25000) = 96.91% - 92.72% = 4.19%.
The percentage of losses in the layer from d to u =

[∫_d^u (x - d) f(x) dx + S(u)(u - d)] / ∫_0^∞ x f(x) dx
= (E[X ∧ u] - E[X ∧ d]) / E[X] = LER(u) - LER(d) = R(d) - R(u).

Layer Average Severity for the layer from d to u ≡
the mean losses in the layer from d to u = E[X ∧ u] - E[X ∧ d] =
{LER(u) - LER(d)} E[X] = {R(d) - R(u)} E[X].

The Layer from d to u can be thought of as either:
(Layer from 0 to u) - (Layer from 0 to d),
or (Layer from d to ∞) - (Layer from u to ∞).
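A minimal Python sketch (illustrative only; the limited expected values are the ones given in problems 9.3-9.4 below) of the layer average severity E[X ∧ u] - E[X ∧ d] and of expected annual layer losses:

# Expected losses in the layer from d to u, per accident and per year,
# using limited expected values (here the values from problems 9.3-9.4).
lev = {1_000_000: 300_000, 4_000_000: 375_000, 5_000_000: 390_000,
       9_000_000: 420_000, 10_000_000: 425_000}
accidents_per_year = 50

def layer_severity(d, u):
    return lev[u] - lev[d]            # E[X ^ u] - E[X ^ d]

per_accident = layer_severity(1_000_000, 5_000_000)        # 90,000
per_year = accidents_per_year * per_accident               # 4.5 million
print(per_accident, per_year)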


For example, the Layer from 10 to 25 can be thought of as either:
(Layer from 0 to 25) - (Layer from 0 to 10),
or (Layer from 10 to ∞) - (Layer from 25 to ∞):

[Diagram: a vertical scale of loss size with marks at 0, 10, and 25; the portions up to 10 and up to 25 correspond to LER(10) and LER(25), while the portions above 10 and above 25 correspond to R(10) and R(25).]

The percentage of losses in the layer from 10 to 25 is: LER(25) - LER(10) = R(10) - R(25).
Those who are graphically oriented may find that my Section on Lee Diagrams helps them to
understand these concepts.


Problems:
9.1 (1 point) Insureds suffer six losses of sizes: 3, 8, 13, 22, 35, 62.
What is the percentage of total losses in the layer from 10 to 25?
A. less than 29%
B. at least 29% but less than 30%
C. at least 30% but less than 31%
D. at least 31% but less than 32%
E. at least 32%
9.2 (1 point) Four accidents occur of sizes: $230,000, $810,000, $1,170,000, and $2,570,000.
A reinsurer is responsible for the layer of loss from $500,000 to $1,500,000
($1 million excess of 1/2 million.)
How much does the reinsurer pay as a result of these four accidents?
A. $1.7 million B. $1.8 million C. $1.9 million D. $2.0 million E. $2.1 million
Use the following information for the next two questions:
A reinsurer expects 50 accidents per year from a certain book of business.
Limited Expected Values for this book of business are estimated to be:
E[X ∧ $1 million] = $300,000
E[X ∧ $4 million] = $375,000
E[X ∧ $5 million] = $390,000
E[X ∧ $9 million] = $420,000
E[X ∧ $10 million] = $425,000
9.3 (1 point) If the reinsurer were responsible for the layer of loss from $1 million to $5 million
($4 million excess of $1 million), how much does the reinsurer expect to pay per year as a result of
accidents from this book of business?
A. $4.0 million B. $4.5 million C. $5.0 million D. $5.5 million E. $6.0 million
9.4 (1 point) Let A be the amount the reinsurer would expect to pay per year as a result of
accidents from this book of business, if the reinsurer were responsible for the layer of loss from
$1 million to $5 million ($4 million excess of $1 million). Let B be the amount the reinsurer would
expect to pay per year as a result of accidents from this book of business, if the reinsurer were
instead responsible for the layer of loss from $1 million to $10 million
($9 million excess of $1 million).
What is the ratio of B/A?
A. 1.30
B. 1.35
C. 1.40
D. 1.45
E. 1.50


Use the following information for the next four questions:


A reinsurer expects 50 accidents per year from a certain book of business.
The average size of accident from this book of business is estimated as $450,000.
Excess Ratios (Unity minus the Loss Elimination Ratio) for this book of business are:
R($1 million) = 0.100
R($4 million) = 0.025
R($5 million) = 0.015
R($9 million) = 0.006
R($10 million) = 0.005
9.5 (1 point) What is the percentage of total losses in the layer from $1 million to $5 million?
A. less than 6%
B. at least 6% but less than 7%
C. at least 7% but less than 8%
D. at least 8% but less than 9%
E. at least 9%
9.6 (1 point) If the reinsurer were responsible for the layer of loss from $1 million to $5 million
($4 million excess of $1 million), how much does the reinsurer expect to pay per year as a result of
accidents from this book of business?
A. less than $1 million
B. at least $1 million but less than $2 million
C. at least $2 million but less than $3 million
D. at least $3 million but less than $4 million
E. at least $4 million
9.7 (1 point) What is the percentage of total losses in the layer from $1 million to $10 million?
A. less than 6%
B. at least 6% but less than 7%
C. at least 7% but less than 8%
D. at least 8% but less than 9%
E. at least 9%
9.8 (1 point) Let A be the amount the reinsurer would expect to pay per year as a result of
accidents from this book of business, if the reinsurer were responsible for the layer of loss from
$1 million to $5 million ($4 million excess of $1 million). Let B be the amount the reinsurer would
expect to pay per year as a result of accidents from this book of business, if the reinsurer were
instead responsible for the layer of loss from $1 million to $10 million
($9 million excess of $1 million). What is the ratio of B/A?
A. 1.1
B. 1.2
C. 1.3
D. 1.4
E. 1.5


9.9 (2 points) Assume you have a Pareto distribution with α = 5 and θ = $1000.
What percentage of total losses are represented by the layer from $500 to $2000?
A. less than 16%
B. at least 16% but less than 17%
C. at least 17% but less than 18%
D. at least 18% but less than 19%
E. at least 19%
9.10 (1 point) There are seven losses of sizes: 2, 5, 8, 11, 13, 21, 32.
What is the percentage of total losses in the layer from 5 to 15?
A. 35%
B. 40%
C. 45%
D. 50%
E. 55%
9.11. Use the following information:
Limited Expected Values for Security Blanket Insurance are estimated to be:
E[X ∧ 100,000] = 40,000
E[X ∧ 200,000] = 50,000
E[X ∧ 300,000] = 57,000
E[X ∧ 400,000] = 61,000
E[X ∧ 500,000] = 63,000
Security Blanket Insurance buys reinsurance from Plantagenet Reinsurance.
Let A be the amount Plantagenet would expect to pay per year as a result of accidents from
Security Blanket, if the reinsurance had a deductible of 100,000, maximum covered loss of 300,000,
and a coinsurance factor of 90%. Let B be the amount Plantagenet would expect to pay per year
as a result of accidents from Security Blanket, if the reinsurance had a deductible of 100,000, a
maximum covered loss of 400,000, and a coinsurance factor of 80%.
What is the ratio of B/A?
(A) 1.05
(B) 1.10
(C) 1.15
(D) 1.20
(E) 1.25
9.12 (CAS5, 5/07, Q.9) (1 point)
Using the table below, what is the formula for the loss elimination ratio at deductible D?
Loss Limit     Number of Losses    Total Loss Amount
D and Below    N1                  L1
Over D         N2                  L2
Total          N1 + N2             L1 + L2
A. 1 - [L1 + L2 - (N1)(D)] / [L1 + L2]
B. 1 - [L1 + (N2)(D)] / [L1 + L2]
C. 1 - [L2 - (N2)(D)] / [L1 + L2]
D. [L2 + (N2)(D)] / [L1 +(N2)(D)]
E. [L1 + (N1)(D)] / [L1]


Solutions to Problems:
9.1. D. (losses in layer from 10 to 25) / total losses =
(0+0+3+12+15+15) / (3+8+13+22+35+62) = 45 / 143 = .315.
9.2. D. The accidents of sizes $230,000, $810,000, $1,170,000, and $2,570,000 contribute to
the layer of loss from $500,000 to $1,500,000:
0 + 310,000 + 670,000 + 1,000,000 = $1,980,000.
9.3. B. (50)(E[X ∧ $5 million] - E[X ∧ $1 million]) = (50)(390,000 - 300,000) = $4.5 million.
9.4. C. A = (50)(E[X ∧ $5 million] - E[X ∧ $1 million]) = (50)(390,000 - 300,000) = $4.5 million.
B = (50)(E[X ∧ $10 million] - E[X ∧ $1 million]) = (50)(425,000 - 300,000) = $6.25 million.
B/A = 6.25 / 4.5 = 1.389.
Comment: One can solve this problem without knowing that 50 accidents are expected per year,
since 50 multiplies both the numerator and denominator. The ratio between two layers of loss
depends on the severity distribution, not the frequency distribution.
9.5. D. R($1 million) - R($5 million) = 0.100 - 0.015 = 0.085.
9.6. B. The annual losses from the layer from $1 million to $5 million =
(number of accidents per year)(mean accident){R($1 million) - R($5 million)} =
(50)($450,000){R($1 million) - R($5 million)} = ($22.5 million){.100 - .015} = $1,912,500.
Alternately, the total expected losses are:
(# of accidents per year)(mean accident) = (50)($450,000) = $22,500,000.
(.085)($22,500,000) = $1,912,500.
9.7. E. R($1 million) - R($10 million) = 0.100 - 0.005 = 0.095.
9.8. A. B/A = {R($1 million) - R($10 million)} / {R($1 million) - R($5 million)}
= (0.100 - 0.005) / (0.100 - 0.015) = 0.095 / 0.085 = 1.12.
Comment: A = (number of accidents per year)(mean accident){R($1 million) - R($5 million)}.
B = (number of accidents per year)(mean accident){R($1 million) - R($10 million)}.


9.9. D. Use the formula given in Appendix A of Loss Models for the Limited Expected Value of
the Pareto, E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)}.
Percentage of Losses in the Layer $500 to $2000 = (E[X ∧ 2000] - E[X ∧ 500]) / mean =
(246.9 - 200.6)/250 = 18.5%.
Alternately, use the formula given in a subsequent section for the Excess Ratio of the Pareto,
R(x) = {θ/(θ+x)}^(α-1). Percentage of Losses in the Layer $500 to $2000 = R(500) - R(2000) =
19.75% - 1.23% = 18.5%.
9.10. B.
Loss:                                 2    5    8    11   13   21   32
Contribution to Layer from 5 to 15:   0    0    3    6    8    10   10
(0 + 0 + 3 + 6 + 8 + 10 + 10) / (2 + 5 + 8 + 11 + 13 + 21 + 32) = 37/92 = 40.2%.
9.11. B. A = (0.9)(E[X ∧ 300,000] - E[X ∧ 100,000]) = (0.9)(57,000 - 40,000) = 15,300.
B = (0.8)(E[X ∧ 400,000] - E[X ∧ 100,000]) = (0.8)(61,000 - 40,000) = 16,800.
B/A = 16,800 / 15,300 = 1.098.
Comment: Both A and B have been calculated per accident. Their ratio does not depend on the
expected number of accidents.
9.12. C. The losses eliminated are: L1 + (N2)(D).
Loss Elimination Ratio is: {L1 + (N2)(D)} / (L1 + L2) = 1 - {L2 - (N2)(D)}/(L1 + L2).
Alternately, each loss of size less than D contributes nothing to the excess losses.
Each loss of size x > D, contributes x - D to the excess losses.
Therefore, the excess losses = L2 - (N2)(D).
Excess Ratio = (Excess Losses)/(Total Losses) = {L2 - (N2)(D)}/(L1 + L2).
Loss Elimination Ratio = 1 - Excess Ratio = 1 - {L2 - (N2)(D)}/(L1 + L2).


Section 10, Average Size of Losses in an Interval


One might want to know the average size of those losses between $10,000 and $25,000 in size.
For the ungrouped data in Section 1, this is calculated as:
(sum of losses of size between $10,000 & $25,000) / (# losses between $10,000 & $25,000)
= $197,300 / 13 = $15,177.
Exercise: For the data in Section 1, what is the average size of loss for those losses of size from
$25,000 to $50,000?
[Solution: (29600 + 32200 + 32500 + 33700 + 34300 + 37300 + 39500 + 39900 + 41200 +
42800 + 45900 + 49200) / 12 = $458,100 / 12 = $38,175.
Comment: The answer had to be between 25,000 and 50,000.]
Note that this concept differs from a layer of loss. Here we are ignoring all losses other than those in
a certain size category. In contrast, losses of all sizes contribute to each layer.
Exercise: An insured has losses of sizes: $300, $600, $1200, $1500, and $2800.
Determine the losses in the layer from $500 to $2500.
[Solution: The loss of size 300 contributes nothing. The loss of size 600 contributes 100.
The loss size 1200 contributes 700. The loss of 1500 contributes 1000.
The loss of 2800 contributes the width of the layer or 2000.
0 + 100 + 700 + 1000 + 2000 = 3800.]
Exercise: An insured has losses of sizes: $300, $600, $1200, $1500, and $2800.
Determine the sum of those losses of size from $500 to $2500.
[Solution: 600 + 1200 + 1500 = 3300.
Comment: The average size of these three losses is: 3300/3 = 1100.]
For a discrete size of loss distribution, the dollars from those losses of size ≤ 10,000 is:

Σ_{xi ≤ 10,000} xi Prob[X = xi].

For a continuous size of loss distribution, the dollars from those losses of size ≤ 10,000 is:⁴⁶

∫_0^10,000 x f(x) dx = E[X ∧ 10,000] - 10,000 S(10,000).

⁴⁶ The limited expected value = contribution of small losses + contribution of large losses.
Therefore, contribution of small losses = limited expected value - contribution of large losses.


For a continuous size of loss distribution, the average size of loss for those losses of size less than
or equal to 10,000 is:

[∫_0^10,000 x f(x) dx] / F(10,000) = {E[X ∧ 10,000] - 10,000 S(10,000)} / F(10,000).

Exercise: For an Exponential Distribution with mean θ = 50,000, what is the average size of those
losses of size less than or equal to 10,000?
[Solution: E[X ∧ x] = θ(1 - e^(-x/θ)). E[X ∧ 10000] = 50,000(1 - e^(-1/5)) = 9063.
S(x) = e^(-x/θ). S(10000) = e^(-1/5) = 0.8187. {E[X ∧ 10000] - 10000 S(10000)}/F(10000) = 4832.]
For a continuous size of loss distribution the average size of loss for those losses of size between
10,000 and 25,000 would be written as:

[∫_10,000^25,000 x f(x) dx] / {F(25,000) - F(10,000)}
= [∫_0^25,000 x f(x) dx - ∫_0^10,000 x f(x) dx] / {F(25,000) - F(10,000)}
= [{E[X ∧ 25000] - 25000 S(25000)} - {E[X ∧ 10000] - 10000 S(10000)}] / {F(25000) - F(10000)}.

Exercise: For an Exponential Distribution with mean θ = 50,000, what is the average size of those
losses of size between 10,000 and 25,000?
[Solution: E[X ∧ x] = θ(1 - e^(-x/θ)). E[X ∧ 10000] = 50,000(1 - e^(-1/5)) = 9063.
E[X ∧ 25000] = 50,000(1 - e^(-1/2)) = 19,673.
S(x) = e^(-x/θ). S(10000) = e^(-1/5) = 0.8187. S(25000) = e^(-1/2) = 0.6065.
({E[X ∧ 25000] - 25000 S(25000)} - {E[X ∧ 10000] - 10000 S(10000)}) / {F(25000) - F(10000)} =
({19,673 - (25,000)(0.6065)} - {9063 - (10,000)(0.8187)}) / {0.3935 - 0.1813} = 17,127.]

In general, the average size of loss for those losses of size between a and b is:

[{E[X ∧ b] - b S(b)} - {E[X ∧ a] - a S(a)}] / {F(b) - F(a)}.

The numerator is the dollars per loss contributed by the losses of size a to b =
(contribution of losses of size 0 to b) minus (contribution of losses of size 0 to a).
The denominator is the percent of losses of size a to b =
(percent of losses of size 0 to b) minus (percent of losses of size 0 to a).
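A minimal Python sketch (illustrative only) of this formula for the Exponential with mean 50,000; it reproduces the roughly 17,126 of the exercise above and the entries of the table that follows:

import math

# Average size of those losses of size between a and b,
# for an Exponential with mean theta.
theta = 50_000.0

def lev(x):      return theta * (1 - math.exp(-x / theta))   # E[X ^ x]
def cdf(x):      return 1 - math.exp(-x / theta)              # F(x)
def survival(x): return math.exp(-x / theta)                  # S(x)

def avg_size_in_interval(a, b):
    dollars = (lev(b) - b * survival(b)) - (lev(a) - a * survival(a))
    return dollars / (cdf(b) - cdf(a))

print(avg_size_in_interval(10_000, 25_000))   # about 17,126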


For an Exponential with θ = 50,000, here are the average sizes for various size intervals:

Bottom     Top        E[X ∧ Top]    S(Top)    Average Size
0          10,000     9,063         81.9%     4,833
10,000     25,000     19,673        60.7%     17,126
25,000     50,000     31,606        36.8%     36,463
50,000     100,000    43,233        13.5%     70,901
100,000    250,000    49,663        0.7%      142,141
250,000    Infinity   50,000        0.0%      300,000

For a Pareto Distribution, S(x) = {θ/(θ+x)}^α, and E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)}.

A Pareto Distribution with α = 3 and θ = 100,000 has a mean of: θ/(α-1) = 50,000.
For this Pareto Distribution, here are the average sizes for various size intervals:

Bottom     Top        E[X ∧ Top]    S(Top)    Average Size
0          10,000     8,678         75.1%     4,683
10,000     25,000     18,000        51.2%     16,863
25,000     50,000     27,778        29.6%     35,989
50,000     100,000    37,500        12.5%     70,270
100,000    250,000    45,918        2.3%      148,387
250,000    Infinity   50,000        0.0%      425,000

Notice the difference between the results for the Pareto and the Exponential Distributions.
Proportion of Dollars of Loss From Losses of a Given Size:

Another quantity of interest is the percentage of the total losses from losses in a certain size interval.
The Proportion of Total Losses from Losses in the Interval [a, b] is:

[∫_a^b x f(x) dx] / E[X] = [{E[X ∧ b] - b S(b)} - {E[X ∧ a] - a S(a)}] / E[X].

Exercise: For an Exponential Distribution with mean of 50,000, what percentage of the total dollars of
losses come from losses of size between 10,000 and 25,000?
[Solution: E[X ∧ x] = θ(1 - e^(-x/θ)). E[X ∧ 10000] = 50,000(1 - e^(-1/5)) = 9063.
E[X ∧ 25000] = 50,000(1 - e^(-1/2)) = 19,673.
S(x) = e^(-x/θ). S(10000) = e^(-1/5) = 0.8187. S(25000) = e^(-1/2) = 0.6065.
({E[X ∧ 25000] - 25000 S(25000)} - {E[X ∧ 10000] - 10000 S(10000)}) / E[X] =
({19,673 - (25,000)(0.6065)} - {9063 - (10,000)(0.8187)}) / 50,000 = 7.3%.]
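A minimal Python sketch (illustrative only) of this proportion for the same Exponential; it reproduces the 7.3% above and can regenerate the percentage column of the tables that follow:

import math

# Proportion of total losses coming from losses of size between a and b,
# for an Exponential with mean theta.
theta = 50_000.0

def lev(x):      return theta * (1 - math.exp(-x / theta))
def survival(x): return math.exp(-x / theta)

def share_of_total(a, b):
    dollars = (lev(b) - b * survival(b)) - (lev(a) - a * survival(a))
    return dollars / theta          # E[X] = theta for the Exponential

print(share_of_total(10_000, 25_000))   # about 0.073, i.e. 7.3%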


For an Exponential with θ = 50,000, here are the percentages for various size intervals:

Bottom     Top        E[X ∧ Top]    S(Top)    Percentage of Total Losses
0          10,000     9,063         81.9%     1.8%
10,000     25,000     19,673        60.7%     7.3%
25,000     50,000     31,606        36.8%     17.4%
50,000     100,000    43,233        13.5%     33.0%
100,000    250,000    49,663        0.7%      36.6%
250,000    Infinity   50,000        0.0%      4.0%

For a Pareto Distribution with α = 3 and θ = 100,000, here are the percentages for various size intervals:

Bottom     Top        E[X ∧ Top]    S(Top)    Percentage of Total Losses
0          10,000     8,678         75.1%     2.3%
10,000     25,000     18,000        51.2%     8.1%
25,000     50,000     27,778        29.6%     15.5%
50,000     100,000    37,500        12.5%     24.1%
100,000    250,000    45,918        2.3%      30.2%
250,000    Infinity   50,000        0.0%      19.8%

Notice the difference between the results for the Pareto and the Exponential Distributions.


Problems:
For each of the following three problems, assume you have a Pareto distribution with parameters
α = 5 and θ = $1000.
10.1 (2 points) What is the average size of those losses less than $500 in size?
A. less than $160
B. at least $160 but less than $170
C. at least $170 but less than $180
D. at least $180 but less than $190
E. at least $190
10.2 (2 points) What is the average size of those losses greater than $500 in size but less than
$2000?
A. less than $800
B. at least $800 but less than $825
C. at least $825 but less than $850
D. at least $850 but less than $875
E. at least $875
10.3 (2 points) Assume you expect 100 losses per year. What is the expected dollars of loss paid
on those losses greater than $500 in size but less than $2000?
A. less than $10,500
B. at least $10,500 but less than $11,000
C. at least $11,000 but less than $11,500
D. at least $11,500 but less than $12,000
E. at least $12,000
10.4 (2 points) You are given the following:
A sample of 5,000 losses contains 1800 that are no greater than $100, 2500 that are
greater than $100 but no greater than $1000, and 700 that are greater than $1000.
The empirical limited expected value function for this sample evaluated at $100 is $73.
The empirical limited expected value function for this sample evaluated at $1000 is $450.
Determine the total amount of the 2500 losses that are greater than $100 but no greater than $1000.
A. Less than $1.50 million
B. At least $1.50 million, but less than $1.52 million
C. At least $1.52 million, but less than $1.54 million
D. At least $1.54 million, but less than $1.56 million
E. At least $1.56 million


10.5 (3 points) Severity is LogNormal with μ = 5 and σ = 3.


What is the average size of those losses greater than 20,000 in size but less than 35,000?
A. less than 25,000
B. at least 25,000 but less than 27,000
C. at least 27,000 but less than 29,000
D. at least 29,000 but less than 31,000
E. at least 31,000
10.6 (2 points) You are given the following:
x          F(x)    E[X ∧ x]
$20,000    0.75    $7050
$30,000    0.80    $9340
Determine the average size of those losses of size between $20,000 and $30,000.
A. Less than $23,500
B. At least $23,500, but less than $24,500
C. At least $24,500, but less than $25,500
D. At least $25,500, but less than $26,500
E. At least $26,500
10.7 (2 points) You are given the following:
A sample of 3,000 losses contains 2100 that are no greater than $1,000, 830 that are
greater than $1,000 but no greater than $5,000, and 70 that are greater than $5,000.
The total amount of the 830 losses that are greater than $1,000 but no greater than $5,000
is $1,600,000.
The empirical limited expected value function for this sample evaluated at $1,000 is $560.
Determine the empirical limited expected value function for this sample evaluated at $5,000.
A. Less than $905
B. At least $905, but less than $915
C. At least $915, but less than $925
D. At least $925, but less than $935
E. At least $935
10.8 (2 points) The random variable for a loss, X, has the following characteristics:
x       F(x)    Limited Expected Value at x
0       0.0     0
100     0.2     91
200     0.6     153
1000    1.0     331
Calculate the average size of those losses of size greater than 100 but less than 200.
(A) 140
(B) 145
(C) 150
(D) 155
(E) 160


10.9 (160, 5/88, Q.5) (2.1 points) A population experiences mortality consistent with an
exponential distribution with θ = 10. Calculate the average fraction of the interval (x, x+3] lived by
those who die during the interval.
(A) (1 + e^(-0.1) + e^(-0.2) - 3e^(-0.3)) e / {6(1 - e^(-0.3))}
(B) (1 + e^(-0.1) + e^(-0.2) - 3e^(-0.3)) / {3(1 - e^(-0.3))}
(C) 1/3
(D) (13 - 10e^(-0.3)) / {3(1 - e^(-0.3))}
(E) (10 - 13e^(-0.3)) / {3(1 - e^(-0.3))}
10.10 (4B, 5/92, Q.23) (2 points) You are given the following information:
A large risk has a lognormal claim size distribution with parameters μ = 8.443 and σ = 1.239.
The insurance agent for the risk settles all claims under $5,000.
(Claims of $5,000 or more are settled by the insurer, not the agent.)
Determine the expected value of a claim settled by the insurance agent.
A. Less than 500
B. At least 500 but less than 1,000
C. At least 1,000 but less than 1,500
D. At least 1,500 but less than 2,000
E. At least 2,000
10.11 (4B, 5/93, Q.33) (3 points) The distribution for claim severity follows a Single Parameter
Pareto distribution of the following form:
f(x) = (3/1000)(x/1000)^(-4), x > 1000
Determine the average size of a claim between $10,000 and $100,000, given that the claim is
between $10,000 and $100,000.
A. Less than $18,000
B. At least $18,000 but less than $28,000
C. At least $28,000 but less than $38,000
D. At least $38,000 but less than $48,000
E. At least $48,000


10.12 (4B, 5/99, Q.10) (2 points) You are given the following:
One hundred claims greater than 3,000 have been recorded as follows:
Interval            Number of Claims
(3,000, 5,000]      6
(5,000, 10,000]     29
(10,000, 25,000]    39
(25,000, ∞)         26
Claims of 3,000 or less have not been recorded.
The null hypothesis, H0, is that claim sizes follow a Pareto distribution,
with parameters α = 2 and θ = 25,000.
If H0 is true, determine the expected claim size for claims in the interval (25,000, ∞).
A. 12,500   B. 25,000   C. 50,000   D. 75,000   E. 100,000

10.13 (4B, 11/99, Q.1) (2 points) You are given the following:
Losses follow a distribution (prior to the application of any deductible) with mean 2,000.
The loss elimination ratio (LER) at a deductible of 1,000 is 0.30.
60 percent of the losses (in number) are less than the deductible of 1,000.
Determine the average size of a loss that is less than the deductible of 1,000.
A. Less than 350
B. At least 350, but less than 550
C. At least 550, but less than 750
D. At least 750, but less than 950
E. At least 950
Solutions to Problems:

10.1. A. The Limited Expected Value of the Pareto: E[X ∧ x] = {θ/(α-1)} {1 - (θ/(θ+x))^(α-1)}.
∫ from 0 to 500 of x f(x) dx = E[X ∧ 500] - 500 S(500) = 200.6 - 500 (1000/1500)^5 = 200.6 - 65.8 = 134.8.
Average size of claim = 134.8 / F(500) = 134.8 / 0.869 = $155.

10.2. B. ∫ from 500 to 2000 of x f(x) dx = ∫ from 0 to 2000 of x f(x) dx - ∫ from 0 to 500 of x f(x) dx =
{E[X ∧ 2000] - 2000 S(2000)} - {E[X ∧ 500] - 500 S(500)} =
{246.9 - 2000 (1000/3000)^5} - {200.6 - 500 (1000/1500)^5} = 238.7 - 134.8 = 103.9.
Average Size of Claim = 103.9 / {F(2000) - F(500)} = 103.9 / (0.996 - 0.869) = $818.
10.3. A. 100 ∫ from 500 to 2000 of x f(x) dx = 100 ∫ from 0 to 2000 of x f(x) dx - 100 ∫ from 0 to 500 of x f(x) dx =
100 { {E[X ∧ 2000] - 2000 S(2000)} - {E[X ∧ 500] - 500 S(500)} } =
100 { {246.9 - 2000 (1000/3000)^5} - {200.6 - 500 (1000/1500)^5} } = 100 {238.7 - 134.8} =
$10,390. Alternately, one expects 100 {F(2000) - F(500)} = 100 {0.996 - 0.869} = 12.7 such claims per
year, with an average size of $818, based on the previous problem. Thus the expected dollars of
loss on these claims = (12.7)($818) = $10,389.
10.4. B. The average size of those claims of size between 100 and 1,000 equals:
({E[X ∧ 1000] - 1000 S(1000)} - {E[X ∧ 100] - 100 S(100)}) / {F(1000) - F(100)}
= {(450 - (1000)(700/5000)) - (73 - (100)(3200/5000))} / {(4300/5000) - (1800/5000)} =
(310 - 9) / 0.5 = $602. Thus these 2500 claims total: (2500)($602) = $1,505,000.
Alternately, (Losses Limited to $100) / (Number of Claims) = E[X ∧ 100] = $73.
Since there are 5000 claims, Losses Limited to $100 = ($73)(5000) = $365,000.
Now there are: 2500 + 700 = 3200 claims greater than $100 in size.
Since these claims contribute $100 each to the losses limited to $100, they contribute a total of:
(3200)($100) = $320,000.
Losses limited to $100 = (losses on claims ≤ $100) + (contribution of claims > $100).
Thus losses on claims ≤ $100 is: $365,000 - $320,000 = $45,000.
(Losses Limited to $1000) / (Number of Claims) = E[X ∧ 1000] = $450.
Since there are 5000 claims, Losses Limited to $1000 = ($450)(5000) = $2,250,000.
Now there are 700 claims greater than $1000 in size.
Since these claims contribute $1000 each to the losses limited to $1000, they contribute a total of:
(700)($1000) = $700,000.
Losses limited to $1000 = (losses on claims ≤ $1000) + (contribution of claims > $1000).
Thus losses on claims ≤ $1000 = $2,250,000 - $700,000 = $1,550,000.
The total amount of the claims that are greater than $100 but no greater than $1000 is:
(losses on claims ≤ $1000) - (losses on claims ≤ $100) = $1,550,000 - $45,000 = $1,505,000.
10.5. B. F(x) = Φ[(ln x - μ)/σ]. F(20000) = Φ[1.63]. F(35000) = Φ[1.82].
E[X ∧ x] = exp(μ + σ²/2) Φ[(ln x - μ - σ²)/σ] + x {1 - Φ[(ln x - μ)/σ]}.
E[X ∧ x] - x S(x) = exp(μ + σ²/2) Φ[(ln x - μ - σ²)/σ].
E[X ∧ 20,000] - (20,000) S(20,000) = (e^9.5) Φ[(ln 20000 - 5 - 3²)/3] = 13,360 Φ[-1.37].
E[X ∧ 35,000] - (35,000) S(35,000) = (e^9.5) Φ[(ln 35000 - 5 - 3²)/3] = 13,360 Φ[-1.18].
The average size of claims of size between $20,000 and $35,000 is:
(E[X ∧ 35000] - 35000 S(35000) - {E[X ∧ 20000] - 20000 S(20000)}) / {F(35000) - F(20000)} =
13,360 {Φ[-1.18] - Φ[-1.37]} / {Φ[1.82] - Φ[1.63]} =
13,360 (0.1190 - 0.0853) / (0.9656 - 0.9484) = 26,176.
10.6. D. The average size of those claims of size between 20,000 and 30,000 equals:
({E[X ∧ 30000] - 30000 S(30000)} - {E[X ∧ 20000] - 20000 S(20000)}) / {F(30000) - F(20000)}
= {(9340 - (0.2)(30000)) - (7050 - (0.25)(20000))} / (0.80 - 0.75) = (3340 - 2050)/0.05 = $25,800.
10.7. B. (Losses Limited to $1000) / (Number of Claims) = E[X ∧ 1000] = $560.
Since there are 3000 claims, Losses Limited to $1000 = ($560)(3000) = $1,680,000.
Now there are 830 + 70 = 900 claims greater than $1000 in size.
Since these claims contribute $1000 each to the losses limited to $1000, they contribute a total of
(900)($1000) = $900,000.
Losses limited to $1000 = (losses on claims ≤ $1000) + (contribution of claims > $1000).
Thus the losses on claims ≤ $1000 = $1,680,000 - $900,000 = $780,000.
Now the losses on claims ≤ $5000 =
(losses on claims ≤ $1000) + (losses on claims > $1000 and ≤ $5000) =
$780,000 + $1,600,000 = $2,380,000.
Finally, the losses limited to $5000 =
(the losses on claims ≤ $5000) + (Number of Claims > $5000)($5000) =
$2,380,000 + (70)($5000) = $2,730,000.
E[X ∧ 5000] = (Losses limited to $5000) / (Total Number of Claims) = $2,730,000 / 3000 = $910.
Alternately, the average size of those claims of size between 1,000 and 5,000 equals:
({E[X ∧ 5000] - 5000 S(5000)} - {E[X ∧ 1000] - 1000 S(1000)}) / {F(5000) - F(1000)}.
We are given that: S(1000) = 900/3000 = 0.30, S(5000) = 70/3000 = 0.0233, E[X ∧ 1000] = 560.
The observed average size of those claims of size 1000 to 5000 is: 1,600,000 / 830 = 1927.7.
Setting the observed average size of those claims of size 1000 to 5000 equal to the above
formula for the same quantity:
1927.7 = ({E[X ∧ 5000] - 5000 S(5000)} - {E[X ∧ 1000] - 1000 S(1000)}) / {F(5000) - F(1000)} =
({E[X ∧ 5000] - 5000(0.0233)} - {560 - 1000(0.30)}) / {0.9767 - 0.70}.
Solving, E[X ∧ 5000] = (1927.7)(0.2767) + 116.5 + 560 - 300 = $910.
10.8. D. Average size of losses between 100 and 200 is:
({E[X ∧ 200] - 200 S(200)} - {E[X ∧ 100] - 100 S(100)}) / (F(200) - F(100)) =
({153 - (200)(1 - 0.6)} - {91 - (100)(1 - 0.2)}) / (0.6 - 0.2) = (73 - 11)/0.4 = 155.
Comment: Same information as in 3, 11/01, Q.35.
10.9. E. The average size for losses of size between x and x + 3 is:
{E[X ∧ (x+3)] - (x+3) S(x+3) - E[X ∧ x] + x S(x)} / {F(x+3) - F(x)} =
{10(1 - e^(-(x+3)/10)) - (x+3)e^(-(x+3)/10) - 10(1 - e^(-x/10)) + x e^(-x/10)} / (e^(-x/10) - e^(-(x+3)/10)) =
{10e^(-x/10) - 13e^(-(x+3)/10) - x e^(-(x+3)/10) + x e^(-x/10)} / {e^(-x/10) (1 - e^-0.3)} =
(10 - 13e^-0.3 - x e^-0.3 + x) / (1 - e^-0.3) = (10 - 13e^-0.3)/(1 - e^-0.3) + x.
The average fraction of the interval (x, x+3] lived by those who die during the interval =
{(the average size for losses of size between x and x+3) - x} / (x + 3 - x) =
(10 - 13e^-0.3) / {3(1 - e^-0.3)}.
Alternately, the fraction for someone who dies at age x + t is: t/3. The average fraction is:
∫ from t=0 to 3 of (t/3) e^(-(x+t)/10)/10 dt, divided by ∫ from t=0 to 3 of e^(-(x+t)/10)/10 dt =
{(e^(-x/10)/3)(-t e^(-t/10) - 10 e^(-t/10))] from t=0 to 3} / {e^(-x/10)(-e^(-t/10))] from t=0 to 3} =
(10 - 13e^-0.3) / {3(1 - e^-0.3)}.
10.10. E. One is asked for the average size of those claims of size less than 5000. This is:
∫ from 0 to 5000 of x f(x) dx / F(5000) = {E[X ∧ 5000] - 5000(1 - F(5000))} / F(5000).
For this LogNormal Distribution:
F(5000) = Φ[{ln(x) - μ}/σ] = Φ[{ln(5000) - 8.443}/1.239] = Φ[0.060] = 0.5239.
E[X ∧ 5000] = exp(μ + σ²/2) Φ[(ln x - μ - σ²)/σ] + x {1 - Φ[(ln x - μ)/σ]} =
exp(8.443 + 1.239²/2) Φ[(ln 5000 - 8.443 - 1.239²)/1.239] + 5000 {1 - 0.5239} =
10,002 Φ[-1.18] + 2381 = (10,002)(0.1190) + 2381 = 3571.
{E[X ∧ 5000] - 5000(1 - F(5000))} / F(5000) = (3571 - 2381)/0.5239 = 2271.
10.11. A. F(x) = 1 - (x/1000)^-3, x > 1000. S(10,000) = 0.001. S(100,000) = 0.000001.
The average size of claim between 10,000 and 100,000 is the ratio of the dollars of loss on such
claims to the number of such claims:
∫ from 10,000 to 100,000 of x f(x) dx / {F(100,000) - F(10,000)} =
∫ from 10,000 to 100,000 of (3 x 10^9) x^-3 dx / 0.000999 =
(3.003 x 10^12)(1/2)(10,000^-2 - 100,000^-2) = (1.5015)(10,000 - 100) = 14,865.
Comment: One can get the Distribution Function either by integrating the density function from 1000
to x or by recognizing that this is a Single Parameter Pareto Distribution. Note that in this case, as is
common for a distribution skewed to the right, the average size of claim is near the left end of the
interval rather than near the middle.
10.12. D. F(x) = 1 - (θ/(θ+x))^α. S(25000) = 1 - F(25000) = {25000/(25000+25000)}² = 1/4.
E[X ∧ x] = {θ/(α-1)} {1 - (θ/(θ+x))^(α-1)}.
E[X ∧ 25000] = {25000/(2-1)} {1 - (25000/(25000+25000))^(2-1)} = 25000(1/2) = 12,500.
The expected claim size for claims in the interval (25,000, ∞) =
(E[X] - {E[X ∧ 25000] - 25000 S(25000)}) / S(25000) = (25000 - (12500 - (1/4)(25000))) / (1/4) =
(18750)(4) = 75,000.
Alternately, the average payment the insurer would make excess of 25,000, per non-zero such
payment, is {E[X] - E[X ∧ 25000]} / S(25000) = 50,000. Then the expected claim size for claims in
the interval (25,000, ∞) is this portion excess of 25,000 plus an additional 25,000 per large claim:
50,000 + 25,000 = 75,000.
10.13. A. LER(1000) = 0.30. E[X] = 2000. F(1000) = 0.60.
E[X ∧ 1000] = LER(1000) E[X] = (0.30)(2000) = 600.
The average size of those losses less than 1000 is:
{E[X ∧ 1000] - 1000 S(1000)} / F(1000) = {600 - (1000)(1 - 0.6)}/0.6 = (600 - 400)/0.6 = 333.33.
Section 11, Grouped Data


Unlike the ungrouped data in Section 1, often one is called upon to work with data grouped into
intervals.47 In this example, both the number of losses in each interval and the dollars of loss on
those losses are shown. Sometimes the latter information is missing or sometimes additional
information may be available.
Interval ($000)    Number of Losses    Total of Losses in the Interval ($000)
0 - 5                   2,208                     5,974
5 - 10                  2,247                    16,725
10 - 15                 1,701                    21,071
15 - 20                 1,220                    21,127
20 - 25                   799                    17,880
25 - 50                 1,481                    50,115
50 - 75                   254                    15,303
75 - 100                   57                     4,893
100 - ∞                    33                     4,295
SUM                    10,000                   157,383

The estimated mean is $15,738.


As will be seen, in some cases one has to deal with grouped data in a somewhat different manner
than ungrouped data. With modern computing power, the actuary is usually better off working
with the data in an ungrouped format if available. The grouping process discards valuable
information. The wider the intervals, the worse is the loss of information.

47 Note that in this example, for simplicity I have not made a big deal over whether, for example, the 10-15 interval
includes 15 or not. In many real world applications, in which claims cluster at round numbers, that can be important.
Section 12, Working with Grouped Data


For the Grouped Data in Section 11, it is relatively easy to compute the various items discussed
previously.
The Limited Expected Value can be computed provided the limit is the upper boundary of one of
the intervals. One sums the losses for all the intervals less than the limit, and adds the product of the
number of claims greater than the limit times the limit. For example, the Limited Expected Value at
$25,000 is:
{82,777 thousand + (25 thousand)(1825 claims)} / (10,000 claims) = 12.84 thousand.
The Loss Elimination Ratio, LER(x), can be computed provided x is a boundary of an interval.
LER(x) = E[X ∧ x] / mean. So LER(25,000) = 12.84 / 15.74 = 81.6%.
The Excess Ratio, R(x), can be computed provided x is a boundary of an interval. The excess
losses are the sum of the losses for the intervals greater than x, minus the product of x and the number
of claims greater than x. For example, the losses excess of $75 thousand are:
(4893 + 4295) - (57 + 33)(75) = 2438 thousand.
R(75,000) = 2438 / 157,383 = 1.5%. Also, R(x) = 1 - LER(x) = 1 - E[X ∧ x] / mean.
The mean excess loss, e(x), can be computed provided x is a boundary of an interval.
e(x) is the losses excess of x, divided by the number of claims greater than x.
So for example, using the excess losses computed above,
e(75,000) = 2,438,000 / (57 + 33) = 27.1 thousand.
e(x) can also be computed using e(x) = R(x) mean / S(x).
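To make these grouped-data calculations concrete, here is a minimal Python sketch using the Section 11 totals; the function and variable names (lev, ler, excess_ratio, mean_excess) are my own, chosen for illustration:

# Grouped data from Section 11: (lower, upper, number of losses, total losses), amounts in $000.
intervals = [(0, 5, 2208, 5974), (5, 10, 2247, 16725), (10, 15, 1701, 21071),
             (15, 20, 1220, 21127), (20, 25, 799, 17880), (25, 50, 1481, 50115),
             (50, 75, 254, 15303), (75, 100, 57, 4893), (100, float("inf"), 33, 4295)]
n = sum(c for _, _, c, _ in intervals)        # 10,000 claims
total = sum(l for _, _, _, l in intervals)    # 157,383 thousand
mean = total / n                              # 15.74 thousand

def lev(x):
    """E[X ^ x] in $000; x must be an interval boundary."""
    small = sum(l for _, top, _, l in intervals if top <= x)   # losses on claims below the limit
    censored = sum(c for bot, _, c, _ in intervals if bot >= x)  # claims contributing x each
    return (small + x * censored) / n

def ler(x): return lev(x) / mean
def excess_ratio(x): return 1 - ler(x)
def mean_excess(x):
    s = sum(c for bot, _, c, _ in intervals if bot >= x) / n   # S(x)
    return (mean - lev(x)) / s

print(lev(25), ler(25), excess_ratio(75), mean_excess(75))
# roughly 12.84, 0.816, 0.015, 27.1 -- matching the figures in the text above.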
Exercise: Compute the Limited Expected Values, Loss Elimination Ratios, Excess Ratios, and
Mean Residual Lives for the Grouped Data in Section 11.
[Solution: (all amounts in $000)
x        # claims in     # claims    Loss in     Cumulative    LEV(x)    LER(x)     R(x)      e(x)
         (prior x, x]    > x         Interval    Losses
0              -          10,000          -             0        0.0      0.0%    100.0%     15.7
5          2,208           7,792      5,974         5,974        4.5     28.6%     71.4%     14.4
10         2,247           5,545     16,725        22,699        7.8     49.7%     50.3%     14.3
15         1,701           3,844     21,071        43,770       10.1     64.4%     35.6%     14.6
20         1,220           2,624     21,127        64,897       11.7     74.6%     25.4%     15.2
25           799           1,825     17,880        82,777       12.8     81.6%     18.4%     15.9
50         1,481             344     50,115       132,892       15.0     95.4%      4.6%     21.2
75           254              90     15,303       148,195       15.5     98.5%      1.5%     27.1
100           57              33      4,893       153,088       15.6     99.4%      0.6%     30.2
∞             33               0      4,295       157,383       15.7    100.0%      0.0%       -
SUM       10,000                    157,383

For example, the numerator of the limited expected value at 20,000 is:
contribution of small losses + contribution of large losses =
$5,974,000 + $16,725,000 + $21,071,000 + $21,127,000 +
($20,000)(799 + 1481 + 254 + 57 + 33) = $64,897,000 + ($20,000)(2624) = $117,377,000.
E[X ∧ 20,000] = $117,377,000 / 10,000 = $11,738.
LER(20,000) = E[X ∧ 20,000] / E[X] = $11,738 / $15,738 = 74.6%.
e(20,000) = (E[X] - E[X ∧ 20,000]) / S(20,000) = ($15,738 - $11,738) / (2624/10,000) = $15,244.]
Problems:
Use the following information to answer the next four questions:
Range ($)      # of claims      Loss
0-100              6300       $300,000
100-200            2350       $350,000
200-300             850       $200,000
300-400             320       $100,000
400-500             110        $50,000
over 500             70        $50,000
Total            10,000     $1,050,000

12.1 (1 point) What is the Loss Elimination Ratio, for a deductible of $200?
A. less than .83
B. at least .83 but less than .85
C. at least .85 but less than .87
D. at least .87 but less than .89
E. at least .89
12.2 (1 point) What is the Limited Expected Value, for a Limit of $300?
A. less than $95
B. at least $95 but less than $105
C. at least $105 but less than $115
D. at least $115 but less than $125
E. at least $125
12.3 (1 point) What is the empirical mean excess loss at $400?
A. less than $140
B. at least $140 but less than $150
C. at least $150 but less than $160
D. at least $160 but less than $170
E. at least $170
12.4 (1 point) What is the Excess Ratio, excess of $500?
A. less than 1.5%
B. at least 1.5% but less than 1.6%
C. at least 1.6% but less than 1.7%
D. at least 1.7% but less than 1.8%
E. at least 1.8%
12.5 (1 point) Calculate the loss elimination ratio for a $500 deductible.
Loss Size          Number of Claims     Total Amount of Loss
$0-249                  5,000              $1,125,000
250-499                 2,250                 765,000
500-999                   950                 640,000
1,000-2,499               575                 610,000
2,500 or more             200                 890,000
Total                   8,975              $4,030,000
A. Less than 55.0%
B. At least 55.0%, but less than 60.0%
C. At least 60.0%, but less than 65.0%
D. At least 65.0%, but less than 70.0%
E. 70.0% or more

12.6 (2 points) An individual health insurance policy will pay:
  None of the first $500 of annual medical costs.
  80% of the next $2500 of annual medical costs.
  100% of annual medical costs excess of $3000.
Annual Medical Costs     Frequency     Average Amount
$0                          20%             -
$1-$500                     20%           $300
$501-$1000                  10%           $800
$1001-$1500                 10%          $1250
$1501-$2000                 10%          $1700
$2001-$2500                 10%          $2150
$2501-$3000                 10%          $2600
over $3000                  10%          $4500
What is the average annual amount paid by this policy?
A. $800    B. $810    C. $820    D. $830    E. $840
12.7 (3 points) Use the following information:
Range ($)              # of claims     Loss ($000)
0                          2370              0
1-10,000                   1496          4,500
10,001-25,000               365          6,437
25,001-100,000              267         13,933
100,001-300,000              99         16,488
300,001-1,000,000            15          7,207
Over 1,000,000                1          2,050
Total                      4613         50,615
Determine the loss elimination ratios at 10,000, 25,000, 100,000, 300,000, and 1 million.
12.8 (2 points) You are given the following data on sizes of loss:
Range            # of claims     Loss
0-99                  29         1,000
100-199               38         6,000
200-299               13         3,000
300-399                9         3,000
400-499                7         3,000
500 or more            4         4,000
Total                100        20,000
Determine the empirical limited expected value E[X ∧ 300].
A. 145    B. 150    C. 155    D. 160    E. 165
12.9 (3 points) Use the following information:
Range ($)              # of claims     Loss ($000)
0                          2711              0
1-10,000                   1124          3,082
10,001-50,000               372          7,851
50,001-100,000               83          5,422
100,001-300,000              51          7,607
300,001-1,000,000             5          2,050
Over 1,000,000                2          3,000
Total                      4348         29,012
Determine the loss elimination ratios at 10,000, 50,000, 100,000, 300,000, and 1 million.
12.10 (3 points) Using the following data, determine the mean excess loss at the endpoints of the
intervals.
Interval                 Number of claims     Dollars of Loss
$1 - 10,000                   1,600           $16,900,000
$10,001 - 30,000                600           $14,000,000
$30,001 - 100,000               250           $12,500,000
$100,001 - 500,000               48            $5,500,000
Over $500,000                     2            $1,100,000
Total                         2,500           $50,000,000

12.11 (4B, 5/96, Q.22) (2 points) Forty (40) observed losses have been recorded in thousands
of dollars and are grouped as follows:
Interval ($000)     Number of Losses     Total Losses ($000)
(1, 4/3)                  16                    20
[4/3, 2)                  10                    15
[2, 4)                    10                    35
[4, ∞)                     4                    20
Determine the empirical limited expected value function evaluated at 2 (thousand).
A. Less than 0.5
B. At least 0.5, but less than 1.0
C. At least 1.0, but less than 1.5
D. At least 1.5, but less than 2.0
E. At least 2.0
12.12 (4B, 11/97, Q.8) (2 points) You are given the following:
A sample of 2,000 claims contains 1,700 that are no greater than $6,000, 30 that are
greater than $6,000 but no greater than $7,000, and 270 that are greater than $7,000.
The total amount of the 30 claims that are greater than $6,000 but no greater than $7,000 is
$200,000.
The empirical limited expected value function for this sample evaluated at $6,000 is $1,810.
Determine the empirical limited expected value function for this sample evaluated at $7,000.
A. Less than $1,900
B. At least $1,900, but less than $1,925
C. At least $1,925, but less than $1,950
D. At least $1,950, but less than $1,975
E. At least $1,975
12.13 (Course 4 Sample Exam 2000, Q.7) Summary statistics of 100 losses are:
Interval            Number of Losses       Sum          Sum of Squares
(0, 2000]                  39              38,065           52,170,078
(2000, 4000]               22              63,816          194,241,387
(4000, 8000]               17              96,447          572,753,313
(8000, 15000]              12             137,595        1,628,670,023
(15,000, ∞)                10             331,831       17,906,839,238
Total                     100             667,754       20,354,674,039
Determine the empirical limited expected value E[X ∧ 15,000].
12.14 (CAS5, 5/00, Q.27) (1 point) Calculate the excess ratio at $100.
(The excess ratio is one minus the loss elimination ratio.)
Loss Size ($)     Number of Claims     Amount of Losses ($)
0 - 50                  600                 21,000
51 - 100                500                 37,500
101 - 250               400                 70,000
251 - 500               300                120,000
501 - 1000              200                150,000
Over 1000               100                200,000
Total                 2,100                598,500
A. Less than 0.700
B. At least 0.700, but less than 0.725
C. At least 0.725, but less than 0.750
D. At least 0.750, but less than 0.775
E. At least 0.775
12.15 (CAS5, 5/02, Q.14) (1 point) Based on the following full-coverage loss experience,
calculate the excess ratio at $500. (The excess ratio is one minus the loss elimination ratio.)
Loss Size ($)     Number of Claims     Amount of Losses ($)
0 - 100                1,100                 77,000
101 - 250                800                148,000
251 - 500                500                180,000
501 - 1000               350                245,000
1001 - 2000              200                300,000
Over 2000                 50                150,000
Total                  3,000              1,100,000
A. Less than 0.250
B. At least 0.250, but less than 0.350
C. At least 0.350, but less than 0.450
D. At least 0.450, but less than 0.550
E. At least 0.550
12.16 (CAS3, 11/04, Q.26) (2.5 points) A sample of size 2,000 is distributed as follows:
Range                    Count
0 ≤ X ≤ 6,000            1,700
6,000 < X ≤ 7,000           30
7,000 < X                  270
The sum of the 30 observations between 6,000 and 7,000 is 200,000.
For the empirical distribution X, E(X ∧ 6,000) = 1,810.
Determine E(X ∧ 7,000).
A. Less than 1,910
B. At least 1,910, but less than 1,930
C. At least 1,930, but less than 1,950
D. At least 1,950, but less than 1,970
E. At least 1,970
12.17 (CAS5, 5/05, Q.19) (1 point)
Given the following information, calculate the loss elimination ratio at a $500 deductible.
Loss Amount      Claim Count     Total Loss
Below $500           150          $15,000
$500                   6           $3,000
Over $500             16          $22,000
A. Less than 0.4
B. At least 0.4, but less than 0.5
C. At least 0.5, but less than 0.6
D. At least 0.6, but less than 0.7
E. At least 0.7
Solutions to Problems:
12.1. D. LER(200) = Losses Eliminated / Total Losses =
{(300,000 + 350,000) + (200)(850 + 320 + 110 + 70)} / 1,050,000 = $920,000 / $1,050,000 = 0.876.
Alternately, E[X ∧ 200] = {(300,000 + 350,000) + (200)(850 + 320 + 110 + 70)} / 10,000 = $92.
Mean = $105. Thus, LER(200) = E[X ∧ 200] / mean = $92 / $105 = 0.876.
12.2. B. E[X ∧ 300] = {(300,000 + 350,000 + 200,000) + (300)(320 + 110 + 70)} / 10,000 = 100.
12.3. C. e(400) = (dollars on claims excess of 400) / (# claims greater than 400) - 400 =
{(50,000 + 50,000) / (110 + 70)} - 400 = 555.56 - 400 = $155.56.
Alternately, e(400) = {mean - E[X ∧ 400]} / {1 - F(400)} = ($105 - $102.2) / 0.018 = $155.56.
12.4. A. R(500) = (dollars excess of 500) / (total dollars) = {50,000 - (70)(500)} / 1,050,000 = 0.0143.
Alternately, R(500) = 1 - E[X ∧ 500] / mean = 1 - $103.5 / $105 = 0.0143.
12.5. D. Losses eliminated = 1,125,000 + 765,000 + (500)(950 + 575 + 200) = 2,752,500.
Loss Elimination Ratio = 2,752,500 / 4,030,000 = 68.3%.
12.6. D. 80% of the layer from 500 to 3000 is paid, plus 100% of the layer excess of 3000.
Annual Cost      Frequency   Average Cost   Layer 500 to 3000   Layer excess of 3000    Paid
0                   20%            -               $0                  $0                 $0
1 to 500            20%          $300              $0                  $0                 $0
501 to 1000         10%          $800            $300                  $0               $240
1001 to 1500        10%         $1250            $750                  $0               $600
1501 to 2000        10%         $1700           $1200                  $0               $960
2001 to 2500        10%         $2150           $1650                  $0              $1320
2501 to 3000        10%         $2600           $2100                  $0              $1680
over 3000           10%         $4500           $2500               $1500              $3500
Average                                                                                  $830
12.7. The losses eliminated by a deductible of size 10,000 are, in thousands:
4500 + 10(365 + 267 + 99 + 15 + 1) = 11,970. LER(10,000) = 11,970/50,615 = 23.65%.
Bottom    Top       # claims       # claims    Loss in      Cumulative    Losses         LER(top)
($000)    ($000)    in Interval    > Top       Interval     Losses        Eliminated
                                                ($000)       ($000)        ($000)
0         0            2370          2243           0             0
0         10           1496           747       4,500         4,500        11,970         23.65%
10        25            365           382       6,437        10,937        20,487         40.48%
25        100           267           115      13,933        24,870        36,370         71.86%
100       300            99            16      16,488        41,358        46,158         91.19%
300       1000           15             1       7,207        48,565        49,565         97.93%
1000      ∞               1             0       2,050        50,615
Total                  4613                    50,615
Comment: Data taken from AIA Closed Claim Study (1974) in Table IV of "Estimating Pure
Premiums by Layer - An Approach" by Robert J. Finger, PCAS 1976. Finger calculates excess
ratios, which are one minus the loss elimination ratios.
12.8. D. The contribution to the numerator of E[X ∧ 300] from the claims of size less than 300 is the
sum of their losses: 1000 + 6000 + 3000 = 10,000.
Each claim of size 300 or more contributes 300 to the numerator of E[X ∧ 300]; the sum of their
contributions is: (300)(9 + 7 + 4) = 6000.
E[X ∧ 300] = (10,000 + 6000)/100 = 160.

12.9. The losses eliminated by a deductible of size 100,000 are, in thousands:
3082 + 7851 + 5422 + 100(51 + 5 + 2) = 22,155. LER(100,000) = 22,155/29,012 = 76.36%.
Bottom    Top       # claims       # claims    Loss in      Cumulative    Losses         LER(top)
($000)    ($000)    in Interval    > Top       Interval     Losses        Eliminated
                                                ($000)       ($000)        ($000)
0         0            2711          1637           0             0
0         10           1124           513       3,082         3,082         8,212         28.31%
10        50            372           141       7,851        10,933        17,983         61.98%
50        100            83            58       5,422        16,355        22,155         76.36%
100       300            51             7       7,607        23,962        26,062         89.83%
300       1000            5             2       2,050        26,012        28,012         96.55%
1000      ∞               2             0       3,000        29,012
Total                  4348                    29,012
Comment: Data taken from NAIC Closed Claim Study (1975) in Table VII of "Estimating Pure
Premiums by Layer - An Approach" by Robert J. Finger, PCAS 1976. Finger calculates excess
ratios, which are one minus the loss elimination ratios.
12.10. e(0) = E[X] = $50,000,000 / 2500 = $20,000.
e(10,000) = ($14,000,000 + $12,500,000 + $5,500,000 + $1,100,000) / (600 + 250 + 48 + 2) - 10,000 = $26,778.
e(30,000) = ($12,500,000 + $5,500,000 + $1,100,000) / (250 + 48 + 2) - 30,000 = $33,667.
e(100,000) = ($5,500,000 + $1,100,000) / (48 + 2) - 100,000 = $32,000.
e(500,000) = $1,100,000 / 2 - 500,000 = $50,000.
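A small illustrative Python check of these mean excess losses (the lists simply transcribe the table in the problem; the variable names are hypothetical):

counts = [1600, 600, 250, 48, 2]
dollars = [16_900_000, 14_000_000, 12_500_000, 5_500_000, 1_100_000]
endpoints = [0, 10_000, 30_000, 100_000, 500_000]   # lower boundary of each interval

for i, x in enumerate(endpoints):
    excess_dollars = sum(dollars[i:]) - x * sum(counts[i:])   # dollars excess of x
    e_x = excess_dollars / sum(counts[i:])                    # divided by claims of size > x
    print(x, round(e_x))
# 0 -> 20000, 10000 -> 26778, 30000 -> 33667, 100000 -> 32000, 500000 -> 50000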
12.11. D. Include in the numerator the small losses at their reported value, while limiting the large
losses to 2 (thousand) each. The denominator is the total number of claims.
Therefore, E[X ∧ 2] = (20 + 15 + (10)(2) + (4)(2)) / (16 + 10 + 10 + 4) = 63/40 = 1.575.
12.12. D. (Losses Limited to $6000) / (Number of Claims) = E[X ∧ 6000] = $1810.
Since there are 2000 claims, Losses Limited to $6000 = ($1810)(2000) = $3,620,000.
Now there are 30 + 270 = 300 claims greater than $6000 in size.
Since these claims contribute $6000 each to the losses limited to $6000, they contribute a total of
(300)($6000) = $1,800,000.
Losses limited to $6000 = (losses on claims ≤ $6000) + (contribution of claims > $6000). Thus the
losses on claims ≤ $6000 = $3,620,000 - $1,800,000 = $1,820,000.
Now the losses on claims ≤ $7000 =
(losses on claims ≤ $6000) + (losses on claims > $6000 and ≤ $7000) =
$1,820,000 + $200,000 = $2,020,000. Finally, the losses limited to $7000 =
(the losses on claims ≤ $7000) + (Number of Claims > $7000)($7000) =
$2,020,000 + (270)($7000) = $3,910,000.
E[X ∧ 7000] = (Losses limited to $7000) / (Total Number of Claims) = $3,910,000 / 2000 = $1955.
Alternately, the average size of those claims of size between 6,000 and 7,000 equals:
({E[X ∧ 7000] - 7000 S(7000)} - {E[X ∧ 6000] - 6000 S(6000)}) / {F(7000) - F(6000)}.
We are given that: S(6000) = 300/2000 = 0.15, S(7000) = 270/2000 = 0.135, E[X ∧ 6000] = 1810.
The observed average size of those claims of size 6000 to 7000 is: 200,000/30 = 6666.7.
Setting the observed average size of those claims of size 6000 to 7000 equal to the above
formula for the same quantity:
6666.7 = ({E[X ∧ 7000] - 7000 S(7000)} - {E[X ∧ 6000] - 6000 S(6000)}) / {F(7000) - F(6000)} =
({E[X ∧ 7000] - 7000(0.135)} - {1810 - 6000(0.15)}) / {0.865 - 0.85}.
Solving, E[X ∧ 7000] = (6666.7)(0.015) + 945 + 1810 - 900 = $1955.
Comment: While Lee Diagrams are not on the syllabus, this question may also be answered via a
Lee Diagram, not to scale, as follows:
[Lee Diagram: size of loss on the vertical axis, with horizontal lines at $6000 and $7000; probability on the
horizontal axis, with marks at 0.850, 0.865, and 1.000; regions A, B, C, D, E as referenced below.]
A + B + C = E[X ∧ 6000] = $1810.
D + B = (Losses on claims of size 6000 to 7000) / (total number of claims) = $200,000 / 2000 = $100.
B = ($6000)(30/2000) = ($6000)(0.865 - 0.850) = $90.
Therefore, D = (D + B) - B = $100 - $90 = $10.
E = ($7000 - $6000)(270/2000) = ($1000)(1 - 0.865) = $135.
E[X ∧ 7000] = A + B + C + D + E = $1810 + $10 + $135 = $1955.
12.13. The limited losses are: (dollars from small losses) + (15,000)(number of large losses) =
(667,754 - 331,831) + (15,000)(10) = 485,923.
E[X ∧ 15,000] = (losses limited to 15,000)/(number of losses) = 485,923/100 = 4859.
12.14. C. E[X ∧ 100] = {(21,000 + 37,500) + (100)(400 + 300 + 200 + 100)} / 2100 =
158,500/2100 = 75.48. Mean = 598,500/2100 = 285.
Excess Ratio at 100: R(100) = 1 - E[X ∧ 100]/E[X] = 1 - 75.48/285 = 73.5%.
Alternately, the losses excess of 100 are contributed by the last four intervals:
70,000 - (400)(100) + 120,000 - (300)(100) + 150,000 - (200)(100) + 200,000 - (100)(100) =
30,000 + 90,000 + 130,000 + 190,000 = 440,000.
Excess Ratio at 100: Losses Excess of $100 / Total Losses = 440/598.5 = 73.5%.
12.15. C. E[X ∧ 500] = {(77,000 + 148,000 + 180,000) + (500)(350 + 200 + 50)} / 3000 =
705,000/3000 = 235. Mean = 1,100,000/3000 = 366.67.
Excess Ratio at 500: R(500) = 1 - E[X ∧ 500]/E[X] = 1 - 235/366.67 = 35.9%.
Alternately, the losses excess of 500 are contributed by the last three intervals:
245,000 - (350)(500) + 300,000 - (200)(500) + 150,000 - (50)(500) = 395,000.
Excess Ratio at 500: Losses Excess of $500 / Total Losses = 395/1100 = 35.9%.
Comments: Here is the calculation of the excess ratios at various amounts:
x          # claims in     # claims    Loss in       Cumulative     LEV(x)      R(x)
           (prior x, x]    > x         Interval      Losses
0                -           3,000           -               0         0.0    100.0%
100          1,100           1,900      77,000          77,000        89.0     75.7%
250            800           1,100     148,000         225,000       166.7     54.5%
500            500             600     180,000         405,000       235.0     35.9%
1000           350             250     245,000         650,000       300.0     18.2%
2000           200              50     300,000         950,000       350.0      4.5%
∞               50               0     150,000       1,100,000       366.7      0.0%
Total        3,000                   1,100,000
The Loss Elimination Ratio at $500 is: 1 - 35.9% = 64.1% = 235/366.67 = E[X ∧ 500]/E[X].
12.16. D. (X ∧ 7,000) - (X ∧ 6,000) = 0 for X ≤ 6,000;
X - 6,000 for 6,000 < X ≤ 7,000; and 1,000 for 7,000 < X.
For the 30 observations between 6,000 and 7,000:
Σ(xi - 6000) = Σxi - (30)(6000) = 200,000 - 180,000 = 20,000.
The 270 observations greater than 7,000 contribute to this difference: (270)(1000) = 270,000.
Σ{(xi ∧ 7,000) - (xi ∧ 6,000)} = 20,000 + 270,000 = 290,000.
E(X ∧ 7,000) - E(X ∧ 6,000) = 290,000/(1700 + 30 + 270) = 145.
E(X ∧ 7,000) = E(X ∧ 6,000) + 145 = 1810 + 145 = 1955.
Alternately, 1810 = E(X ∧ 6,000) = {sum of small losses + (300)(6000)}/2000.
Sum of losses of size less than 6000 is: (1810)(2000) - (300)(6000) = 1,820,000.
E(X ∧ 7,000) = {1,820,000 + 200,000 + (7000)(270)}/2000 = 1955.

12.17. D. Total Losses: 15,000 + 3000 + 22000 = 40,000.


Losses eliminated: 15,000 + (6 + 16)(500) = 26,000.
Loss Elimination Ratio: 26/40 = 0.65.
Section 13, Uniform Distribution


If losses are uniformly distributed on an interval [a, b], then it is equally likely that a loss is anywhere in
that interval. The probability of a loss being in any subinterval is proportional to the width of that
subinterval.
Exercise: Losses are uniformly distributed on the interval [3, 7].
What is the probability that a loss chosen at random will be in the interval [3.6, 3.8]?
[Solution: (3.8 - 3.6)/(7 - 3) = .05.]

Uniform Distribution

Support: a ≤ x ≤ b                  Parameters: None

D.f.:    F(x) = (x - a) / (b - a)
P.d.f.:  f(x) = 1 / (b - a)

Moments: E[X^n] = (b^(n+1) - a^(n+1)) / {(b - a)(n + 1)}
Mean = (b + a)/2
Variance = (b - a)²/12
Coefficient of Variation = Standard Deviation / Mean = (b - a) / {(b + a)√3}
Skewness = 0
Kurtosis = 9/5
Median = (b + a)/2
Limited Expected Value Function: E[X ∧ x] = (2xb - a² - x²) / {2(b - a)}, for a ≤ x ≤ b
E[(X ∧ x)^n] = {(n+1) x^n b - a^(n+1) - n x^(n+1)} / {(n + 1)(b - a)}, for a ≤ x ≤ b
e(x) = (b - x)/2, for a ≤ x < b
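These closed forms are easy to sanity-check numerically; here is a minimal Python sketch using a crude midpoint-rule integral (the helper names are my own, and the [3, 7] interval with limit 6 matches the exercise above):

a, b, x = 3.0, 7.0, 6.0

def f(t):
    # density of the uniform distribution on [a, b]
    return 1.0 / (b - a) if a <= t <= b else 0.0

def integral(g, lo, hi, n=200_000):
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

mean = integral(lambda t: t * f(t), a, b)                     # (a+b)/2 = 5
var = integral(lambda t: t * t * f(t), a, b) - mean ** 2      # (b-a)^2/12 = 1.333
lev = integral(lambda t: min(t, x) * f(t), a, b)              # (2xb - a^2 - x^2)/{2(b-a)} = 4.875
print(round(mean, 3), round(var, 3), round(lev, 3))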
Exercise: Assume losses are uniformly distributed on the interval [20, 25].
What are the mean, second moment, third moment, fourth moment, variance, skewness and kurtosis?
[Solution: The mean is: (20 + 25)/2 = 22.5.
The second moment is: (b³ - a³) / {(b-a)(3)} = (25³ - 20³) / {(25-20)(3)} = 508.3333.
The third moment is: (b⁴ - a⁴) / {(b-a)(4)} = (25⁴ - 20⁴) / {(25-20)(4)} = 11,531.25.
The fourth moment is: (b⁵ - a⁵) / {(b-a)(5)} = (25⁵ - 20⁵) / {(25-20)(5)} = 262,625.
Therefore the variance is: 508.3333 - 22.5² = 2.08 = (25-20)²/12.
The skewness is: {11,531.25 - (3)(508.3333)(22.5) + 2(22.5³)} / 2.08^1.5 = 0.
The kurtosis is: {262,625 - (4)(11,531.25)(22.5) + (6)(508.3333)(22.5²) - 3(22.5⁴)} / 2.08² = 1.8.
Comment: The skewness of a uniform distribution is always zero, since it is symmetric.
The kurtosis of a uniform distribution is always 9/5.]

Discrete Uniform Distribution:


The uniform distribution discussed above is a continuous distribution. It is different than a distribution
uniform and discrete on integers.
For example, assume a distribution uniform and discrete on the integers from 10 to 13 inclusive:
f(10) = 1/4, f(11) = 1/4, f(12) = 1/4, and f(13) = 1/4.
It has mean of: (10 + 11 + 12 + 13)/4 = 11.5 = (10 + 13)/2.
It has variance of: {(10 - 11.5)² + (11 - 11.5)² + (12 - 11.5)² + (13 - 11.5)²}/4 = 1.25.
In general, for a distribution uniform and discrete on the integers from i to j inclusive:
Mean = (i + j)/2
Variance = {(j + 1 - i)² - 1}/12.
For i = 10 and j = 13, the variance = {(13 + 1 - 10)² - 1}/12 = 15/12 = 1.25, matching the previous
result. Note that the variance formula is somewhat different for the discrete case than the continuous
case.48
Exercise: What is the variance of a six-sided die?
[Solution: Uniform and discrete from 1 to 6: variance = {(6 + 1 - 1)² - 1}/12 = 35/12.]
The variance of an S-sided die is: (S² - 1)/12.
48 I would not memorize the formula for the variance in the discrete case.
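A quick brute-force enumeration confirms the discrete formula (an illustrative snippet; the function name is hypothetical):

def discrete_uniform_var(i, j):
    vals = list(range(i, j + 1))
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

print(discrete_uniform_var(10, 13))   # 1.25 = {(13 + 1 - 10)^2 - 1}/12
print(discrete_uniform_var(1, 6))     # 35/12 = 2.9167, the variance of a six-sided die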
Problems:
Use the following information for the next 8 questions:
The size of claims is uniformly distributed on the interval from 3 to 7.
13.1 (1 point) What is the probability density function at 6?
A. 0.20    B. 0.25    C. 0.30    D. 0.35    E. 0.40
13.2 (1 point) What is the distribution function at 6?
A. 0.60    B. 0.65    C. 0.70    D. 0.75    E. 0.80

13.3 (1 point) What is the mean of the severity distribution?


A. less than 4
B. at least 4 but less than 5
C. at least 5 but less than 6
D. at least 6 but less than 7
E. at least 7
13.4 (1 point) What is the variance of the severity distribution?
A. less than 1.4
B. at least 1.4 but less than 1.5
C. at least 1.5 but less than 1.6
D. at least 1.6 but less than 1.7
E. at least 1.7
13.5 (2 points) What is the limited expected value at 6?
A. less than 4.5
B. at least 4.5 but less than 4.6
C. at least 4.6 but less than 4.7
D. at least 4.7 but less than 4.8
E. at least 4.8
13.6 (1 point) What is the excess ratio at 6?
A. less than 2.0%
B. at least 2.0% but less than 2.1%
C. at least 2.1% but less than 2.2%
D. at least 2.2% but less than 2.3%
E. at least 2.3%
13.7 (1 point) What is the skewness of the severity distribution?


(A) -1.0
(B) -0.5
(C) 0
(D) 0.5
(E) 1.0
13.8 (1 point) What is the mean excess loss at 4, e(4)?
(A) 1.0
(B) 1.5
(C) 2.0
(D) 2.5
(E) 3.0
13.9 (2 points) Losses for a coverage are uniformly distributed on the interval 0 to $10,000.
What is the Loss Elimination Ratio for a deductible of $1000?
A. less than 0.16
B. at least 0.16 but less than 0.18
C. at least 0.18 but less than 0.20
D. at least 0.20 but less than 0.22
E. at least 0.22
13.10 (3 points) X is uniformly distributed on the interval 0 to 10,000.
Determine the covariance of X ∧ 1000 and (X - 1000)+.
A. less than 200,000
B. at least 200,000 but less than 210,000
C. at least 210,000 but less than 220,000
D. at least 220,000 but less than 230,000
E. at least 230,000
13.11 (3 points) X is uniformly distributed on the interval 0 to 10,000.
Determine the correlation of X ∧ 1000 and (X - 1000)+.
A. less than 0.3
B. at least 0.3 but less than 0.4
C. at least 0.4 but less than 0.5
D. at least 0.5 but less than 0.6
E. at least 0.6
13.12 (3 points) X is uniform on [0, 20]. Y is uniform on [0, 30].
X and Y are independent.
Z is the maximum of X and Y.
Determine E[Z].
A. less than 16.0
B. at least 16.0 but less than 16.5
C. at least 16.5 but less than 17.0
D. at least 17.0 but less than 17.5
E. at least 17.5
13.13 (1 point) X and Y are independent. X has a uniform distribution on 0 to 100.
Y has a uniform distribution on 0 to ω. eY(40) = eX(40) - 5. What is ω?
A. 90    B. 95    C. 100    D. 105    E. 110

13.14 (2, 5/83, Q.13) (1.5 points) A box is to be constructed so that its height is 10 inches and its
base is X inches by X inches. If X has a uniform distribution over the interval (2, 8), then what is the
expected volume of the box in cubic inches?
A. 80.0
B. 250.0
C. 252.5
D. 255.0
E. 280.0
13.15 (160, 11/86, Q.5) (2.1 points) A population has a survival density function
f(x) = 0.01, 0 < x < 100. Determine the probability that a life now aged 60 will live longer than a life
now aged 50.
(A) 0.1
(B) 0.2
(C) 0.3
(D) 0.4
(E) 0.5
13.16 (2, 5/88, Q.40) (1.5 points) Let X be a random variable with a uniform distribution on the
interval (1, a) where a > 1. If E(X) = 6 Var(X), then what is a?
A. 2    B. 3    C. 3√2    D. 7    E. 8

13.17 (4B, 11/95, Q.28) (2 points) Two numbers are drawn independently from a uniform
distribution on [0,1]. What is the variance of their product?
A. 1/144
B. 3/144
C. 4/144
D. 7/144
E. 9/144
13.18 (Course 160 Sample Exam #1, 1999, Q.4) (1.9 points)
A cohort of eight fruit flies is hatched at time t = 0. You are given:
(i) The survival distribution for fruit flies is known to be uniform over (0, 10].
(ii) Deaths are observed at times 1, 1, 2, 4, 5, 6, 6 and 7.
(iii) y is the number of fruit flies from the cohort observed to survive past e(0).
For any future cohort of eight fruit flies, determine the probability that exactly y will survive beyond
e(0).
(A) 0.22
(B) 0.23
(C) 0.25
(D) 0.27
(E) 0.28
13.19 (Course 1 Sample Exam, Q.35) (1.9 points) Suppose the remaining lifetimes of a
husband and wife are independent and uniformly distributed on the interval [0, 40].
An insurance company offers two products to married couples:
One which pays when the husband dies; and
One which pays when both the husband and wife have died.
Calculate the covariance of the two payment times.
A. 0.0
B. 44.4
C. 66.7
D. 200.0
E. 466.7
13.20 (1, 5/00, Q.38) (1.9 points) An insurance policy is written to cover a loss, X, where X has a
uniform distribution on [0, 1000]. At what level must a deductible be set in order for the expected
payment to be 25% of what it would be with no deductible?
(A) 250
(B) 375
(C) 500
(D) 625
(E) 750
13.21 (1, 11/01, Q.28) (1.9 points) Two insurers provide bids on an insurance policy to a large
company. The bids must be between 2000 and 2200. The company decides to accept the lower
bid if the two bids differ by 20 or more. Otherwise, the company will consider the two bids further.
Assume that the two bids are independent and are both uniformly distributed on the interval from
2000 to 2200.
Determine the probability that the company considers the two bids further.
(A) 0.10
(B) 0.19
(C) 0.20
(D) 0.41
(E) 0.60
13.22 (1, 11/01, Q.29) (1.9 points) The owner of an automobile insures it against damage by
purchasing an insurance policy with a deductible of 250. In the event that the automobile is
damaged, repair costs can be modeled by a uniform random variable on the interval
(0, 1500). Determine the standard deviation of the insurance payment in the event that the
automobile is damaged.
(A) 361
(B) 403
(C) 433
(D) 464
(E) 521
13.23 (3, 11/02, Q.33) (2.5 points) XYZ Co. has just purchased two new tools with independent
future lifetimes. Each tool has its own distinct De Moivre survival pattern.
One tool has a 10-year maximum lifetime and the other a 7-year maximum lifetime.
Calculate the expected time until both tools have failed.
(A) 5.0
(B) 5.2
(C) 5.4
(D) 5.6
(E) 5.8
13.24 (2 points) In the previous question, calculate the expected time until at least one of the two
tools has failed.
(A) 2.6
(B) 2.7
(C) 2.8
(D) 2.9
(E) 3.0
13.25 (CAS3, 11/03, Q.5) (2.5 points) Given:
i) Mortality follows De Moivre's Law.
ii) e 20 = 30.
Calculate q20.
A. 1/60

B. 1/70

C. 1/80

D. 1/90

E. 1/100
13.26 (SOA3, 11/03, Q.39) (2.5 points) You are given:
(i) Mortality follows DeMoivre's law with ω = 105.
(ii) (45) and (65) have independent future lifetimes.
Calculate e̊ for the last-survivor status of (45) and (65).
(A) 33    (B) 34    (C) 35    (D) 36    (E) 37
Solutions to Problems:
13.1. B. f(x) = 1/(7-3) = 0.25 for 3 < x < 7.
13.2. D. F(x) = (x-3)/(7-3) for 3 < x < 7. F(6) = (6-3)/(7-3) = 0.75.
13.3. C. Mean = (3+7)/2 = 5.
13.4. A. Second moment = ∫ from 3 to 7 of x²(1/4) dx = x³/12 evaluated from 3 to 7 = (343 - 27)/12 = 26.33.
Mean = (7+3)/2 = 5. Thus variance = 26.33 - 5² = 1.33.
Alternately, one can use the general formula for the variance of a uniform distribution on [a, b]:
Variance = (b-a)²/12 = (7-3)²/12 = 1.333.
13.5. E. E[X ∧ 6] = ∫ from 3 to 6 of x(1/4) dx + 6{1 - F(6)} = (36 - 9)/8 + (6)(1/4) = 4.875.
Alternately, one can use the general formula for E[X ∧ x] of a uniform distribution on [a, b]:
E[X ∧ x] = (2xb - a² - x²) / {2(b-a)}. E[X ∧ 6] = {(2)(6)(7) - 3² - 6²} / {2(7-3)} = 4.875.
13.6. E. The excess ratio R(6) = 1 - E[X ∧ 6]/E[X] = 1 - 4.875/5 = 2.5%.
13.7. C. Since the uniform distribution is symmetric, the skewness is zero.
13.8. B. The losses of size larger than 4 are uniform from 4 to 7. The amount by which they
exceed 4 is uniformly distributed from 0 to 3. e(4) = average amount by which those losses of size
greater than 4 exceed 4 = (0 + 3)/2 = 1.5.
Comment: For a uniform distribution from a to b, e(x) = (b - x)/2, for a ≤ x < b.
13.9. C. The overall mean is (10,000 + 0)/2 = 5000.
E[X ∧ 1000] = ∫ from 0 to 1000 of (x/10000) dx + 1000(9000/10000) = 50 + 900 = 950.
LER(1000) = 950/5000 = 0.19.
13.10. B. E[X ∧ 1000] = (1/10)(500) + (9/10)(1000) = 950.
E[(X - 1000)+] = E[X] - E[X ∧ 1000] = 5000 - 950 = 4050.
(X ∧ 1000)(X - 1000)+ is 0 for x ≤ 1000, and (1000)(x - 1000) for x > 1000.
E[(X ∧ 1000)(X - 1000)+] = ∫ from 1000 to 10,000 of (1000)(x - 1000)/10,000 dx = 4,050,000.
Cov[(X ∧ 1000), (X - 1000)+] = E[(X ∧ 1000)(X - 1000)+] - E[X ∧ 1000] E[(X - 1000)+] =
4,050,000 - (950)(4050) = 202,500.
13.11. C. E[(X ∧ 1000)²] = ∫ from 0 to 1000 of x²/10,000 dx + ∫ from 1000 to 10,000 of 1000²/10,000 dx =
33,333 + 900,000 = 933,333.
Var[X ∧ 1000] = 933,333 - 950² = 30,833.
E[(X - 1000)+²] = ∫ from 1000 to 10,000 of (x - 1000)²/10,000 dx = 24,300,000.
Var[(X - 1000)+] = 24,300,000 - 4050² = 7,897,500.
Corr[(X ∧ 1000), (X - 1000)+] = Cov[(X ∧ 1000), (X - 1000)+] / √(Var[X ∧ 1000] Var[(X - 1000)+]) =
202,500 / √{(30,833)(7,897,500)} = 0.410.
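These covariance and correlation figures can also be checked by simulation; a rough illustrative sketch (seed and names are arbitrary):

import random

random.seed(1)
n = 1_000_000
xs = [random.uniform(0, 10_000) for _ in range(n)]
u = [min(x, 1000) for x in xs]        # min(X, 1000)
v = [max(x - 1000, 0) for x in xs]    # (X - 1000)+

def mean(z):
    return sum(z) / len(z)

mu, mv = mean(u), mean(v)
cov = mean([a * b for a, b in zip(u, v)]) - mu * mv
corr = cov / ((mean([a * a for a in u]) - mu**2) * (mean([b * b for b in v]) - mv**2)) ** 0.5
print(round(cov), round(corr, 3))     # roughly 202,500 and 0.41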
13.12. D. Prob[Z ≤ z] = Prob[X ≤ z] Prob[Y ≤ z]: (z/20)(z/30) for z ≤ 20, and z/30 for 20 ≤ z ≤ 30.
S(z) = 1 - z²/600 for z ≤ 20, 1 - z/30 for 20 ≤ z ≤ 30, and 0 for z ≥ 30.
Mean = ∫ S(z) dz = ∫ from 0 to 20 of (1 - z²/600) dz + ∫ from 20 to 30 of (1 - z/30) dz =
[z - z³/1800] from 0 to 20 + [z - z²/60] from 20 to 30 = 20 - 4.444 + 10 - 8.333 = 17.22.
Alternately, f(z) = z/300 for z ≤ 20, 1/30 for 20 ≤ z ≤ 30, and 0 for z ≥ 30.
Mean = ∫ z f(z) dz = ∫ from 0 to 20 of z²/300 dz + ∫ from 20 to 30 of z/30 dz = 8.888 + 8.333 = 17.22.
Comment: Similar to 3, 11/02, Q.33.
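The same survival-function integration can be done numerically; a hypothetical check in Python:

def survival_max(z, a=20.0, b=30.0):
    # S(z) = 1 - Prob[X <= z] Prob[Y <= z], for X uniform on [0, a] and Y uniform on [0, b]
    fx = min(max(z / a, 0.0), 1.0)
    fy = min(max(z / b, 0.0), 1.0)
    return 1.0 - fx * fy

steps = 100_000
h = 30.0 / steps
mean_z = sum(survival_max((i + 0.5) * h) for i in range(steps)) * h   # E[Z] = integral of S(z)
print(round(mean_z, 2))   # about 17.22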


13.13. A. eX(40) = (100 - 40)/2 = 30. 30 - 5 = 25 = eY(40) = (ω - 40)/2. ω = 90.
13.14. E. E[10X²] = ∫ from 2 to 8 of 10x²/6 dx = 10(8³ - 2³)/18 = 280.

13.15. D. The future life time of the life aged 60 is uniform from 0 to 40. The future lifetime of the life
aged 50 is uniform from 0 to 50. If the age 60 dies at time = t, then the probability it lived longer
40

than the life aged 50 is: t/50.

(t / 50) / 40 dt = 0.4.
0

13.16. B. E[X] = (1 + a)/2. Var[X] = (a - 1)²/12. (1 + a)/2 = 6(a - 1)²/12.
(1 + a) = (a - 1)². a² - 3a = 0. a = 3.
13.17. D. For X and Y independent: E[XY] = E[X]E[Y], and E[X²Y²] = E[X²]E[Y²].
Var[XY] = E[(XY)²] - E²[XY] = E[X²Y²] - {E[X]E[Y]}² = E[X²]E[Y²] - E[X]²E[Y]².
For the uniform distribution on [0, 1], E[X] = 1/2, E[X²] = ∫ from 0 to 1 of x² f(x) dx = 1/3.
Therefore, Var[XY] = (1/3)(1/3) - (1/4)(1/4) = 1/9 - 1/16 = (16 - 9)/144 = 7/144.
13.18. A. e(0) = mean = (0 + 10)/2 = 5. Three flies survive past 5, so y = 3.
For the uniform, the probability that 3 out of 8 survive past time 5 is: {8!/(3! 5!)} 0.5³ 0.5⁵ = 0.21875.
13.19. C. Let X be the time of death of the husband. Let Y be the time of death of the wife.
The first policy pays at X.
The second policy pays at Max[X, Y].
E[X] = 20.
Prob[Max[X, Y] ≤ t] = Prob[X ≤ t and Y ≤ t] = Prob[X ≤ t] Prob[Y ≤ t] = (t/40)(t/40) = t²/1600.
E[Max[X, Y]] = ∫ from 0 to 40 of (1 - t²/1600) dt = 40 - 13.333 = 26.667.
X Max[X, Y] = X² if X ≥ Y, and XY if X < Y.
E[X Max[X, Y] | X = x] = (x²) Prob[Y ≤ x] + x E[Y | Y > x] Prob[Y > x] =
(x²)(x/40) + x{(x + 40)/2}(1 - x/40) = x³/80 + 20x.
E[X Max[X, Y]] = ∫ from 0 to 40 of E[X Max[X, Y] | X = x] f(x) dx = ∫ from 0 to 40 of (x³/80 + 20x)(1/40) dx = 600.
Cov[X, Max[X, Y]] = 600 - (20)(26.667) = 66.7.
13.20. C. The expected payment is to be 25% of what it would be with no deductible,
so 75% of the losses are eliminated.
0.75 = LER[d] = E[X ∧ d]/E[X] = {(d/2)(d/1000) + d(1 - d/1000)}/500. 375 = d - d²/2000.
d² - 2000d + 750,000 = 0. d = {2000 ± √(4,000,000 - 3,000,000)}/2 = 500.
Comment: The other root of 1500 is greater than 1000 and thus not a solution to the question.
13.21. B. The two variables are within 20 of each other when they are in the southwest to northeast
strip of the 200 by 200 square of possible pairs of bids (from 2000 to 2200 on each axis).
Area of strip = 200² - 180²/2 - 180²/2 = 7600. 7600/200² = 19%.
13.22. B. Mean payment is: (1/6)(0) + (5/6)(1250/2) = 520.83.
Second moment of payment is: (1/6)(0²) + ∫ from 250 to 1500 of (x - 250)²/1500 dx = 434,028.
Variance of payment is: 434,028 - 520.83² = 162,764.
Standard Deviation of payment is: √162,764 = 403.
Comment: I have included the probability of a payment of zero due to the deductible.
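A numeric check of this mixed (zero plus uniform layer) calculation, purely for illustration:

steps = 400_000
h = 1500 / steps
# X uniform on (0, 1500); payment is (X - 250)+, including the zeros below the deductible.
m1 = sum(max((i + 0.5) * h - 250, 0) for i in range(steps)) * h / 1500
m2 = sum(max((i + 0.5) * h - 250, 0) ** 2 for i in range(steps)) * h / 1500
print(round(m1, 2), round((m2 - m1 ** 2) ** 0.5))   # about 520.83 and 403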
13.23. E. X is uniform on [0, 10] and Y is uniform on [0, 7].
The probability that both are dead by time t is: (t/10)(t/7) for t ≤ 7, and t/10 for 7 ≤ t ≤ 10.
The corresponding survival function S(t) = 1 - t²/70 for t ≤ 7, 1 - t/10 for 7 ≤ t ≤ 10, and 0 for t ≥ 10.
Mean = ∫ S(t) dt = ∫ from 0 to 7 of (1 - t²/70) dt + ∫ from 7 to 10 of (1 - t/10) dt =
[t - t³/210] from 0 to 7 + [t - t²/20] from 7 to 10 = 5.82.
Alternately, the corresponding density function f(t) = t/35 for t ≤ 7, 1/10 for 7 ≤ t ≤ 10, and 0 for t ≥ 10.
Mean = ∫ t f(t) dt = ∫ from 0 to 7 of t²/35 dt + ∫ from 7 to 10 of t/10 dt = 3.267 + 5 - 2.45 = 5.82.
Comment: Last survivor status is discussed in Section 9.4 of Actuarial Mathematics.
One could instead use equation 9.5.4 in Actuarial Mathematics:
e(last survivor of x, y) = e(x) + e(y) - e(joint status xy).
13.24. B. X is uniform on [0, 10] and Y is uniform on [0, 7].
The probability that X has not failed by time t is: 1 - t/10, for t ≤ 10.
The probability that Y has not failed by time t is: 1 - t/7, for t ≤ 7.
The probability that neither tool has failed by time t is: (1 - t/10)(1 - t/7) for t ≤ 7.
Corresponding survival function: S(t) = (1 - t/10)(1 - t/7) = 1 - 0.24286t + t²/70, for t ≤ 7.
Mean = ∫ S(t) dt = ∫ from 0 to 7 of (1 - 0.24286t + t²/70) dt = [t - 0.12143t² + t³/210] from 0 to 7 = 2.68.
One could instead use equation 9.5.4 in Actuarial Mathematics, together with the
solution to the previous question:
e(joint status xy) = e(x) + e(y) - e(last survivor of x, y) = 5 + 3.5 - 5.82 = 2.68.
13.25. A. Under De Moivre's Law, the age of death for a life aged 20 is uniform from 20 to ω.
30 = e20 = the average future lifetime = (ω - 20)/2 = ω/2 - 10. ω = 80.
q20 = {F(21) - F(20)}/S(20) = {21/80 - 20/80}/(60/80) = 1/60.
13.26. B. The life aged 45 has a future lifetime uniform from 0 to 60, so e45 = 30.
The life aged 65 has a future lifetime uniform from 0 to 40, so e65 = 20.
For t < 40, Prob[both alive] = S(t) = (1 - t/60)(1 - t/40) = 1 - 0.04167t + t²/2400.
e(joint status 45:65) = ∫ from 0 to 40 of S(t) dt = ∫ from 0 to 40 of (1 - 0.04167t + t²/2400) dt =
[t - 0.02083t² + t³/7200] from 0 to 40 = 15.56.
e(last survivor of 45 and 65) = e45 + e65 - e(joint status 45:65) = 30 + 20 - 15.56 = 34.44.
Section 14, Statistics of Grouped Data

Since the grouped data in Section 11 displays the losses in each interval, one estimates the mean
as the total losses divided by the total number of claims: 157,383,000 / 10,000 = 15,738. If the
losses were not given, one could estimate the mean by assuming that each claim in an interval was at
the center of the interval.49
If one were given additional information, such as the sum of squares of the claim sizes, then one
could directly compute the second moment and variance.
Exercise: Summary statistics of 100 losses are:
Interval            Number of Losses       Sum          Sum of Squares
(0, 2000]                  39              38,065           52,170,078
(2000, 4000]               22              63,816          194,241,387
(4000, 8000]               17              96,447          572,753,313
(8000, 15000]              12             137,595        1,628,670,023
(15,000, ∞)                10             331,831       17,906,839,238
Total                     100             667,754       20,354,674,039
Estimate the mean, second moment and variance.
[Solution: The observed mean is: 667,754/100 = 6677.54.
The observed second moment is: (sum of squared loss sizes)/(number of losses) =
20,354,674,039/100 = 203,546,740.39.
The observed variance is: 203,546,740.39 - 6677.54² = 158,957,200.
Comment: In this case, when it comes to the first two moments, we have enough information to
proceed in exactly the same manner as if we had ungrouped data.50]
More generally, we can estimate moments by assuming the losses are uniformly distributed on
each interval.

49 In the case of skewed distributions this will lead to an underestimate of the mean. Also, one would have to guess
what value to assign to the claims in a final interval [c, ∞).
50 See Course 4 Sample Examination, Q.8.
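As a minimal sketch of the computation in the exercise above (the totals are copied from the summary statistics; the variable names are mine):

n = 100
total = 667_754
sum_sq = 20_354_674_039

mean = total / n                      # 6,677.54
second_moment = sum_sq / n            # 203,546,740.39
variance = second_moment - mean ** 2  # about 158,957,200
print(mean, second_moment, round(variance))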
Exercise: Given the following grouped data, assuming the losses in each interval are uniformly
distributed, calculate the mean, second moment, third moment and fourth moment.
0-10      6
10-20    11
20-25     3
[Solution: For each interval [a, b], the nth moment is (b^(n+1) - a^(n+1)) / {(b-a)(n+1)}. (Those for the
interval [20, 25] match those calculated in the previous exercise.) Then we weight together the
moments for each interval by the number of claims observed in each interval.
Lower       Upper       Number       First       Second        Third          Fourth
Endpoint    Endpoint    of Claims    Moment      Moment        Moment         Moment
0             10            6          5.00        33.33         250.00        2,000.00
10            20           11         15.00       233.33       3,750.00       62,000.00
20            25            3         22.50       508.33      11,531.25      262,625.00
Weighted                   20         13.12       214.58       3,867.19       74,093.75
For example, {(33.33)(6) + (233.33)(11) + (508.33)(3)} / 20 = 214.58.
Thus the estimated mean, second moment, third moment and fourth moment are: 13.12, 214.58,
3867.19, 74,093.75.]
As long as the final interval in which there are claims has a finite upper endpoint, this technique can be
applied to estimate the moments of any grouped data set. The estimates of second and higher
moments may be poor when the intervals are wide and/or the distribution is highly skewed.
These estimates of the moments can then be used to estimate the variance, CV, skewness and
kurtosis.
Exercise: Given the following grouped data, assuming the losses in each interval are uniformly
distributed, calculate the variance, CV, skewness and kurtosis.
0-10      6
10-20    11
20-25     3
[Solution: From the solution to the previous exercise, the estimated mean, second moment, third
moment and fourth moment are: 13.12, 214.58, 3867.19, 74,093.75. Therefore the variance is:
214.58 - 13.12² = 42.45. CV = 42.45^0.5 / 13.12 = 0.497.
The skewness is: {3867.19 - (3)(214.58)(13.12) + 2(13.12³)} / 42.45^1.5 = -0.226.
The kurtosis is: {74,093.75 - (4)(3867.19)(13.12) + (6)(214.58)(13.12²) - 3(13.12⁴)} / 42.45² = 2.15.]
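The two exercises above can be reproduced with a short script; a hypothetical sketch (the uniform-on-interval moment formula is the one stated above, and the small differences from the text come from it carrying the rounded mean 13.12):

groups = [(0, 10, 6), (10, 20, 11), (20, 25, 3)]   # (lower, upper, number of claims)
n = sum(c for _, _, c in groups)

def raw_moment(k):
    # kth moment of a uniform on [a, b] is (b^(k+1) - a^(k+1)) / {(b - a)(k + 1)}
    return sum(c * (b**(k+1) - a**(k+1)) / ((b - a) * (k + 1)) for a, b, c in groups) / n

m1, m2, m3, m4 = (raw_moment(k) for k in (1, 2, 3, 4))
var = m2 - m1**2
cv = var**0.5 / m1
skew = (m3 - 3*m1*m2 + 2*m1**3) / var**1.5
kurt = (m4 - 4*m1*m3 + 6*m1**2*m2 - 3*m1**4) / var**2
print(m1, m2, var, cv, skew, kurt)   # roughly 13.12, 214.58, 42.3, 0.50, -0.22, 2.1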
Estimating Statistics of Grouped Data When Given the Losses in Each Interval:
While the syllabus does not discuss how to estimate higher moments for grouped data, when given
the losses in each interval, here is an example of how one can estimate the variance of grouped
data. First one can compute the between interval variance by assuming that all claims in an interval
are at the average. Clearly this underestimates the total variance, because it ignores the variance
within intervals. For narrow intervals this will not produce a major error.
For a uniform distribution over an interval from a to b, the variance is (b-a)²/12. Thus one can
estimate the within interval variance by computing the weighted average squared width of the
intervals, and dividing by 12.
The total variance can be estimated by adding the between interval variance to the within interval
variance. As an illustrative example, for the Grouped Data in Section 11 the variance could be
estimated as follows:
Interval    # claims    Loss       Average      Square of     # claims x       Square of        # claims x
($000)                  ($000)     Severity     Severity      Squared Sev.     Interval Width   Squared Width
0-5           2,208      5,974        2.7           7.3           16,163            25              55,200
5-10          2,247     16,725        7.4          55.4          124,488            25              56,175
10-15         1,701     21,071       12.4         153.4          261,015            25              42,525
15-20         1,220     21,127       17.3         299.9          365,861            25              30,500
20-25           799     17,880       22.4         500.8          400,118            25              19,975
25-50         1,481     50,115       33.8       1,145.1        1,695,823           625             925,625
50-75           254     15,303       60.2       3,629.8          921,976           625             158,750
75-100           57      4,893       85.8       7,368.9          420,025           625              35,625
over 100         33      4,295      130.2      16,939.4          559,001        10,000             330,000
Total        10,000    157,383                            (average: 476)                     (average: 165)

First the between variance is estimated as the difference between the average squared severity
minus the square of the average severity :
Estimated Variance Between Intervals = 476 million - (15,7382) = 228 million.
Next the within variance is estimated by calculating the average squared width of interval. For the
over $100,000 interval we select an equivalent width of about $100,000 (This is based on the
average severity for this interval being $130,000 only 30 thousand more than the lower bound of
the interval; therefore, this is not a very heavier-tailed distribution. For heavier-tailed distributions, the
rare large claims can contribute a significant portion of the overall mean and an even more significant
portion of the variance.)
Estimated Variance Within Intervals = 165 million / 12 = 14 million.
Then the estimated total variance is the sum of the between and within variances:
Estimated Variance = 228 million + 14 million = 242 million.
The estimated coefficient of variation is the estimated standard deviation divided by the estimated
mean. Estimated Coefficient of Variation = 15.6 / 15.7 = 0.99.
While the estimated variance and thus the estimated coefficient of variation is dependent to some
extent on exactly how one corrects for the grouping, particularly how one deals with the last interval,
for this grouped data the coefficient of variation is clearly close to one. (The adjustment for the within
interval variance did not have much effect due to the width of the intervals and the relatively light tail
of this loss distribution.)
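A sketch of this between/within decomposition in Python, using the Section 11 data (severities in $000, so the variances below are in millions of dollars squared; the "over 100" interval is assigned an equivalent width of 100, as in the text):

counts = [2208, 2247, 1701, 1220, 799, 1481, 254, 57, 33]
losses = [5974, 16725, 21071, 21127, 17880, 50115, 15303, 4893, 4295]   # $000
widths = [5, 5, 5, 5, 5, 25, 25, 25, 100]                               # $000
n = sum(counts)

sev = [l / c for l, c in zip(losses, counts)]                 # average severity per interval
mean = sum(losses) / n
between = sum(c * s * s for c, s in zip(counts, sev)) / n - mean**2     # about 228
within = sum(c * w * w for c, w in zip(counts, widths)) / n / 12        # about 14
print(round(between), round(within), round(between + within))          # about 242 total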
Similarly, using the average values for each interval, one can estimate the third moment and thus the
skewness. Taking the weighted average of the cubes of the severities for each interval, using the
number of claims in each interval as the weight, gives an estimate for the third moment of the
grouped data in Section 11: m3 = 2.41 x 10^13.
However, by comparing for a uniform distribution the integral over an interval of x³ versus the cube
of the average severity, one can derive a correction term. This correction term, which should be
added to the previous estimate of the third moment, is:
(square of the width of each interval)(mean severity for each interval) / 4,
taking the weighted average over all intervals, using the number of claims as the weights.
For the grouped data in Section 11, one gets a correction term of 0.22 x 10^13, where the last interval
has been assigned an equivalent width of 100,000.
Adding in the correction term gives an estimate of: m3 = 2.63 x 10^13.
Using the formula Skewness = {m3 - 3 m1 m2 + 2 m1³} / StdDev³, and the estimates
m1 = 1.57 x 10^4, m2 = variance + mean² = 4.88 x 10^8, and standard deviation = 1.56 x 10^4, gives an
estimate for the skewness of: 1.10 / 0.38 = 2.9.
Exercise: What is the estimate of the skewness, if in the correction to the third moment one assumes
an equivalent width of 150,000 rather than 100,000?
[Solution: Estimated skewness is 3.2 rather than 2.9.]
The estimated skewness is even more affected than the estimated coefficient of variation by the
loss of information inherent in the grouping of data. However, it is clear that the skewness for the
grouped data in Section 11 is somewhere around three.51
51 One could use the Single Parameter Pareto Distribution in order to sharpen the estimate of the contribution of the
last interval. This can be particularly useful when dealing with more highly skewed distributions.
Problems:

Use the following information for the next 6 questions:


You are given the following grouped data:
Claim Size     Number of Claims
0 - 1                16
1 - 5                54
5 - 25               23
25 - 100              7
There are no reported losses of size greater than 100.
Assume the losses in each interval are uniformly distributed.
14.1 (1 point) Estimate the mean.
A. less than 10
B. at least 10 but less than 11
C. at least 11 but less than 12
D. at least 12 but less than 13
E. at least 13
14.2 (3 points) Estimate the variance.
A. 260    B. 270    C. 280    D. 290    E. 300
14.3 (3 points) Estimate the skewness.
A. less than 3.5
B. at least 3.5 but less than 4.0
C. at least 4.0 but less than 4.5
D. at least 4.5 but less than 5.0
E. at least 5.0
14.4 (3 points) Estimate the kurtosis.
A. less than 8
B. at least 8 but less than 10
C. at least 10 but less than 12
D. at least 12 but less than 14
E. at least 14
14.5 (2 points) Estimate the limited expected value at 50, E[X ∧ 50].
A. less than 7.0
B. at least 7.0 but less than 7.5
C. at least 7.5 but less than 8.0
D. at least 8.0 but less than 8.5
E. at least 8.5
14.6 (3 points) Estimate the limited second moment at 50, E[(X ∧ 50)^2].
(A) 195    (B) 200    (C) 205    (D) 210    (E) 215
14.7 (3 points) You are given the following grouped data:
Claim Size     Number of Claims
0 to 10        82
10 to 25       133
25 to 50       65
50 to 100      20
There are no reported losses of size greater than 100.
Assume a uniform distribution of claim sizes within each interval.
Estimate the second raw moment of the claim size distribution.
A. less than 500
B. at least 500 but less than 600
C. at least 600 but less than 700
D. at least 700 but less than 800
E. at least 800

14.8 (3, 11/00, Q.31 & 2009 Sample Q.117) (2.5 points) For an industrywide study of patients
admitted to hospitals for treatment of cardiovascular illness in 1998, you are given:
(i)
Duration In Days    Number of Patients Remaining Hospitalized
0                   4,386,000
5                   1,461,554
10                  486,739
15                  161,801
20                  53,488
25                  17,384
30                  5,349
35                  1,337
40                  0
(ii) Discharges from the hospital are uniformly distributed between the durations shown in the table.
Calculate the mean residual time remaining hospitalized, in days, for a patient who has been
hospitalized for 21 days.
(A) 4.4    (B) 4.9    (C) 5.3    (D) 5.8    (E) 6.3
14.9 (2 points) In the previous question, 3, 11/00, Q.31, what is the Excess Ratio at 21 days?
(Excess Ratio = 1 - Loss Elimination Ratio.)
(A) 0.5%
(B) 0.7%
(C) 0.9%
(D) 1.1%
(E) 1.3%

14.10 (4, 11/01, Q.2 & 2009 Sample Q. 58) (2.5 points) You are given:
Claim Size    Number of Claims
0-25          30
25-50         32
50-100        20
100-200       8
Assume a uniform distribution of claim sizes within each interval.
Estimate the second raw moment of the claim size distribution.
(A) Less than 3300
(B) At least 3300, but less than 3500
(C) At least 3500, but less than 3700
(D) At least 3700, but less than 3900
(E) At least 3900

14.11 (4, 5/07, Q.7) (2.5 points) You are given:
(i)
Claim Size    Number of Claims
(0, 50]       30
(50, 100]     36
(100, 200]    18
(200, 400]    16
(ii) Claim sizes within each interval are uniformly distributed.
(iii) The second moment of the uniform distribution on (a, b] is (b^3 - a^3) / {3(b - a)}.
Estimate E[(X ∧ 350)^2], the second moment of the claim size distribution subject to a limit of 350.
(A) 18,362    (B) 18,950    (C) 20,237    (D) 20,662    (E) 20,750
14.12 (2 points) In the previous question, estimate Var[X ∧ 350].
Solutions to Problems:
14.1. A. For each interval [a, b], the nth moment is: (b^(n+1) - a^(n+1)) / {(b - a)(n+1)}.
Then we weight together the moments for each interval by the number of claims observed in each
interval.
Lower      Upper      Number      First     Second      Third          Fourth
Endpoint   Endpoint   of Claims   Moment    Moment      Moment         Moment
0          1          16          0.50      0.33        0.25           0.20
1          5          54          3.00      10.33       39.00          156.20
5          25         23          15.00     258.33      4,875.00       97,625.00
25         100        7           62.50     4,375.00    332,031.25     26,640,625.00
Total                 100         9.525     371.30      24,384.54      1,887,381.88
For example, {(.33)(16) + (10.33)(54) + (258.33)(23) + (4375)(7)} / 100 = 371.3.
Thus the estimated mean, second moment, third moment and fourth moment are: 9.525, 371.3,
24,384.54, and 1,887,381.88.
14.2. C. The estimated variance = 371.3 - 9.525^2 = 280.6.
14.3. A. The estimated skewness = {m3 - 3 m1 m2 + 2 m1^3} / Variance^1.5 =
{24,384.54 - (3)(9.525)(371.3) + (2)(9.525^3)} / 280.6^1.5 = 3.30.
14.4. E. The estimated kurtosis = {m4 - 4 m1 m3 + 6 m1^2 m2 - 3 m1^4} / Variance^2 =
{1,887,382 - (4)(9.525)(24,384.54) + (6)(9.525^2)(371.3) - (3)(9.525^4)} / 280.6^2 = 14.4.
14.5. D. & 14.6. E. There are 7 losses in the interval 25 to 100, so we assume 7/3 losses in the
interval 25 to 50 and 14/3 losses in the interval 50 to 100.
The losses of size 50 or more contribute 50 to E[X ∧ 50].
The losses of size 50 or more contribute 50^2 to E[(X ∧ 50)^2].
Lower Endpoint   Upper Endpoint   # Claims   1st Limited Moment at 50   2nd Limited Moment at 50
0                1                16         0.50                       0.33
1                5                54         3.00                       10.33
5                25               23         15.00                      258.33
25               50               2.333      37.50                      1,458.33
50               100              4.667      50                         2,500
Total                             100        8.358                      215.74
E[X ∧ 50] = {(.5)(16) + (3)(54) + (15)(23) + (37.5)(7/3) + (50)(14/3)} / 100 = 8.358.
E[(X ∧ 50)^2] = {(.33)(16) + (10.33)(54) + (258.33)(23) + (1458.33)(7/3) + (2500)(14/3)} / 100 = 215.74.

14.7. E. For each interval [a, b], we assume the losses are uniformly distributed, and therefore the
nth moment is: (b^(n+1) - a^(n+1)) / {(b - a)(n+1)}.
The second moment is: (b^3 - a^3) / {3(b - a)}.
For example, for the second interval: (25^3 - 10^3) / {3(25 - 10)} = 325.
Then we weight together the moments for each interval by the number of claims observed in each
interval: {(82)(33.3) + (133)(325) + (65)(1458.3) + (20)(5833.3)} / 300 = 858.
Lower      Upper      Number      First     Second
Endpoint   Endpoint   of Claims   Moment    Moment
0          10         82          5         33.3
10         25         133         18        325.0
25         50         65          38        1,458.3
50         100        20          75        5,833.3
Total                 300         22.25     858.1
Comment: Estimated variance = 858.1 - 22.25^2 = 363.

14.8. A. Since discharges are uniform from 20 to 25, there are assumed to be:
(4/5)(53,488 - 17,384) = 28,883.2 discharges from 21 to 25.
For a discharge at time t > 21, the contribution to the time excess of 21 is: t - 21.
For example, for the interval from 25 to 30, t is assumed uniform on [25, 30], with mean (25 + 30)/2 = 27.5.
Thus the average contribution to the excess of 21 from the interval [25, 30] is:
E[t - 21] = E[t] - 21 = 27.5 - 21 = 6.5.
If discharges are uniformly distributed on [a, b], with a > 21, then the average contribution to the time
excess of 21 from those patients discharged between a and b is: (b + a)/2 - 21.
For each interval [a, b], the contribution to the time excess of 21 is:
(# who are discharged between a and b)(average contribution to the excess).
Bottom of   Top of     Average        Number       Contribution to
Interval    Interval   Contribution   Discharged   Time Excess of 21
21          25         2              28,883.2     57,766.4
25          30         6.5            12,035.0     78,227.5
30          35         11.5           4,012.0      46,138.0
35          40         16.5           1,337.0      22,060.5
Sum                                   46,267.2     204,192.4
e(21) = (total time excess of 21)/(# patients staying more than 21 days) =
204,192.4 days / 46,267.2 = 4.4 days.
Comment: Can also be answered using Actuarial Mathematics. The number discharged between
30 and 35 is: (the number remaining at time 30) - (number remaining at time 35) = 5,349 - 1,337 = 4,012.

14.9. C. From the previous solution, the total time excess of 21 days is 204,192 days.
Assuming uniformity on each interval, one can calculate the total number of days as 21,903,260:
Bottom of   Top of     Average   Number       Contribution to
Interval    Interval             Discharged   Total Time
0           5          2.5       2,924,446    7,311,115
5           10         7.5       974,815      7,311,112
10          15         12.5      324,938      4,061,725
15          20         17.5      108,313      1,895,478
20          25         22.5      36,104       812,340
25          30         27.5      12,035       330,962
30          35         32.5      4,012        130,390
35          40         37.5      1,337        50,138
Sum                              4,386,000    21,903,260
R(21) = (time excess of 21)/(total time) = 204,192/21,903,260 = 0.93%.


14.10. E. The second moment for a uniform distribution on [a, b] is:
∫[a to b] x^2 / (b - a) dx = (b^3 - a^3) / {3(b - a)}. Weight together the 2nd moments for each interval:
Lower      Upper      Number      First     Second
Endpoint   Endpoint   of Claims   Moment    Moment
0          25         30          12.50     208.33
25         50         32          37.50     1,458.33
50         100        20          75.00     5,833.33
100        200        8           150.00    23,333.33
Total                 90          47.500    3,958.33
{(30)(208.33) + (32)(1458.33) + (20)(5833.33) + (8)(23,333.33)} / 90 = 3,958.33.
Comment: The estimated variance is: 3,958.33 - 47.5^2 = 1,702.08.
14.11. E. Since we assume a uniform distribution from 200 to 400, we assume 12 of the 16 claims
in this interval are from 200 to 350, while the remaining 4 claims are from 350 to 400.
Interval       Second Moment of Uniform Distribution              Number of Claims
0 to 50        (50^3 - 0^3) / {(3)(50 - 0)} = 833.33              30
50 to 100      (100^3 - 50^3) / {(3)(100 - 50)} = 5,833.33        36
100 to 200     (200^3 - 100^3) / {(3)(200 - 100)} = 23,333.33     18
200 to 350     (350^3 - 200^3) / {(3)(350 - 200)} = 77,500        12
The 4 claims of size greater than 350 each contribute 350^2 to the numerator of E[(X ∧ 350)^2].
E[(X ∧ 350)^2] = {(833.33)(30) + (5833.33)(36) + (23,333.33)(18) + (77,500)(12) + (350^2)(4)} /
(30 + 36 + 18 + 12 + 4) = 2,075,000/100 = 20,750.

14.12. Again, we assume 12 of the 16 claims in the final interval are from 200 to 350, while the
remaining 4 claims are from 350 to 400.
E[X ∧ 350] = {(25)(30) + (75)(36) + (150)(18) + (275)(12) + (350)(4)} / (30 + 36 + 18 + 12 + 4)
= 10,850/100 = 108.5.
Var[X ∧ 350] = E[(X ∧ 350)^2] - E[X ∧ 350]^2 = 20,750 - 108.5^2 = 8,977.75.

Section 15, Policy Provisions


Insurance policies may have various provisions which determine the amount paid, such as
deductibles, maximum covered losses, and coinsurance clauses.
(Ordinary) Deductible:
An ordinary deductible is a provision which states that when the loss is less than or equal to the
deductible there is no payment, and when the loss exceeds the deductible the amount paid is the
loss less the deductible.52 Unless specifically stated otherwise, assume a deductible is ordinary.
Unless stated otherwise assume the deductible operates per loss.
In actual applications, deductibles can apply per claim, per person, per accident, per occurrence,
per event, per location, per annual aggregate, etc.53
Exercise: An insured suffers losses of size: $3000, $8000 and $17,000.
If the insured has a $5000 (ordinary) deductible, what does the insurer pay for each loss?
[Solution: Nothing, $8000 - $5000 = $3000, and $17,000 - $5000 = $12,000.]
Here is a graph of the payment under an ordinary deductible of 5000:
[Graph omitted: Payment (vertical axis, 0 to 20,000) versus Loss (horizontal axis, 0 to 25,000).]
52 See Definition 8.1 in Loss Models.
53 An annual aggregate deductible is discussed in the section on Stop Loss Premiums in Mahler's Guide to Aggregate Losses.
Maximum Covered Loss:54
Maximum Covered Loss u ≡ size of loss above which no additional payments are made
≡ censorship point from above.
Exercise: An insured suffers losses of size: $2,000, $13,000, $38,000.
If the insured has a $25,000 maximum covered loss, what does the insurer pay for each loss?
[Solution: $2,000, $13,000, $25,000.]
Most insurance policies have a maximum covered loss or equivalent. For example, a liability policy
with a $100,000 per occurrence limit would pay at most $100,000 in losses from any single
occurrence, regardless of the total losses suffered by any claimants.
An automobile collision policy will never pay more than the total value of the covered
automobile minus any deductible, thus it has an implicit maximum covered loss.
An exception is a Workers Compensation policy, which provides unlimited medical coverage
to injured workers.55
Coinsurance:
A coinsurance factor is the proportion of any loss that is paid by the insurer after any other
modifications (such as deductibles or limits) have been applied. A coinsurance is a provision
which states that a coinsurance factor is to be applied.
For example, a policy might have an 80% coinsurance factor. Then the insurer pays 80% of what it
would have paid in the absence of the coinsurance factor.

54 See Section 8.5 of Loss Models. Professor Klugman made up the term maximum covered loss.
55 While benefits for lost wages are frequently also unlimited, since they are based on a formula in the specific workers compensation law, which includes a maximum weekly benefit, there is an implicit maximum benefit for lost wages, assuming a maximum possible lifetime.

Policy Limit:56
Policy Limit ≡ maximum possible payment on a single claim.
Policy Limit = c (u - d),
where c = coinsurance factor, u = maximum covered loss, and d = deductible.
If c = 90%, d = 1000, and u = 5000, then the policy limit = (90%)(5000 - 1000) = 3600;
if a loss is of size 5000 or greater, the insurer pays 3600.
With a coinsurance factor, deductible, and Policy Limit L:
u = d + L/c.
In the above example, 1000 + 3600/.9 = 5000.
With no deductible and no coinsurance, the policy limit is the same as the maximum
covered loss.
Exercise: An insured has a policy with a $25,000 maximum covered loss, $5000 deductible, and a
80% coinsurance factor. The insured suffers a losses of: $5000, $15,000, $38,000.
How much does the insurer pay?
[Solution: Nothing for the loss of $5000. (.8)(15000 - 5000) = $8000 for the loss of $15,000.
For the loss of $38,000, first the insurer limits the loss to $25,000. Then it reduces the loss by the
$5,000 deductible, $25,000 - $5,000 = $20,000. Then the 80% coinsurance factor is applied:
(80%)($20,000) = $16,000.
Comment: The maximum possible amount paid for any loss, $16,000, is the policy limit.]
If an insured with a policy with a $25,000 maximum covered loss, $5000 deductible, and a
coinsurance factor of 80% suffers a loss of size x, then the insurer pays:
0, if x ≤ $5000
0.8(x - 5000), if $5000 < x ≤ $25,000
$16,000, if x ≥ $25,000
More generally, if an insured who has a policy with a maximum covered loss of u, a deductible of d, and a
coinsurance factor of c suffers a loss of size x, then the insurer pays:
0, if x ≤ d
c(x - d), if d < x ≤ u
c(u - d), if x ≥ u
56 See Section 8.5 of Loss Models. This definition of a policy limit differs from that used by many actuaries.

If an insured who has a policy with a policy limit of L, a deductible of d, and a coinsurance factor of c
suffers a loss of size x, then the insurer pays:
0, if x ≤ d
c(x - d), if d < x ≤ d + L/c
L, if x ≥ d + L/c
Exercise: There is a deductible of $10,000, policy limit of $100,000, and a coinsurance factor of
90%. Let Xi be the individual loss amount of the ith claim and Yi be the claim payment of the ith
claim. What is the relationship between Xi and Yi?
[Solution: The maximum covered loss is u = 10,000 + 100,000/0.9 = $121,111.
Yi = 0, if Xi ≤ 10,000
Yi = 0.90(Xi - 10,000), if 10,000 < Xi ≤ 121,111
Yi = 100,000, if Xi > 121,111.]

Order of Operations:
If one has a deductible, maximum covered loss, and a coinsurance, then on this exam, in order to
determine the amount paid on a loss, the order of operations is:
1. Limit the size of loss to the maximum covered loss.
2. Subtract the deductible. If the result is negative, set the payment equal to zero.
3. Multiply by the coinsurance factor.
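As a minimal illustration of this order of operations, here is a short Python sketch of the payment on a single loss with deductible d, maximum covered loss u, and coinsurance factor c; the example values reproduce the earlier exercise (d = 5000, u = 25,000, c = 80%).

def payment(x, d, u, c):
    capped = min(x, u)               # 1. limit the loss to the maximum covered loss
    after_ded = max(capped - d, 0)   # 2. subtract the deductible (not below zero)
    return c * after_ded             # 3. apply the coinsurance factor

for loss in [5000, 15000, 38000]:
    print(loss, payment(loss, d=5000, u=25000, c=0.80))
# Payments: 0, 8000, and 16,000; the largest possible payment,
# c(u - d) = 16,000, is the policy limit.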

Franchise Deductible:57
Besides an ordinary deductible, there is the franchise deductible. Unless specifically stated
otherwise, assume a deductible is ordinary.
Under a franchise deductible the insurer pays nothing if the loss is less than the deductible
amount, but ignores the deductible if the loss is greater than the deductible amount.
Exercise: An insured suffers losses of size: $3000, $8000 and $17,000. If the insured has a $5000
franchise deductible, what does the insurer pay for each loss?
[Solution: Nothing, $8000, and $17,000.]

57 In Definition 8.2 in Loss Models.

Under a franchise deductible with deductible amount d, if the insured has a loss of size x, then the
insurer pays:
0, if x ≤ d
x, if x > d
Thus data from a policy with a franchise deductible is truncated from below at the
deductible amount.58
Therefore under a franchise deductible, the average nonzero payment is:
e(d) + d = {E[X] - E[X ∧ d]}/S(d) + d.59
The average cost per loss is: (average nonzero payment)(chance of nonzero payment) =
{(E[X] - E[X ∧ d])/S(d) + d} S(d) = (E[X] - E[X ∧ d]) + d S(d).60
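To contrast the two deductibles, here is a brief Python sketch of the per-loss payments under an ordinary deductible versus a franchise deductible of 5000, using the losses of the earlier exercises ($3000, $8000, $17,000):

losses = [3000, 8000, 17000]
d = 5000

# Ordinary deductible: pay the loss less the deductible, but never less than zero.
ordinary = [max(x - d, 0) for x in losses]
# Franchise deductible: pay the full loss once it exceeds the deductible.
franchise = [x if x > d else 0 for x in losses]

print(ordinary)    # [0, 3000, 12000]
print(franchise)   # [0, 8000, 17000]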
Here is a graph of the payment under a franchise deductible of 5000:
[Graph omitted: Payment (vertical axis, 0 to 25,000) versus Loss (horizontal axis, 0 to 25,000).]
58 See the next section for a discussion of truncation from below (truncation from the left).
59 See Theorem 8.3 in Loss Models.
60 See Theorem 8.3 in Loss Models.

Definitions of Loss and Payment Random Variables:61

Name: ground-up loss62
Description: Losses prior to the impact of any deductible or maximum covered loss;
the full economic value of the loss suffered by the insured,
regardless of how much the insurer is required to pay
in light of any deductible, maximum covered loss, coinsurance, etc.

Name: amount paid per payment
Description: Undefined when there is no payment due to a deductible or
other policy provision. Otherwise it is the amount paid by
the insurer. Thus for example, data truncated and shifted
from below consists of the amounts paid per payment.

Name: amount paid per loss
Description: Defined as zero when the insured suffers a loss but there is
no payment due to a deductible or other policy provision.
Otherwise it is the amount paid by the insurer.

The per loss variable is: 0 if X ≤ d, X if X > d.
The per payment variable is: undefined if X ≤ d, X if X > d.
Loss Models uses the notation YL for the per loss variable and YP for the per payment variable.
Unless stated otherwise, assume a distribution from Appendix A of Loss Models will be used to
model ground-up losses, prior to the effects of any coverage modifications. The effects on
distributions of coverage modifications will be discussed in subsequent sections.

61 See Section 8.2 of Loss Models.
62 Sometimes referred to as ground-up unlimited losses.
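To make the per-loss versus per-payment distinction concrete, here is a minimal Python sketch using a hypothetical set of ground-up losses and an ordinary deductible of 1000. The per-loss average includes the zeros for losses at or below the deductible; the per-payment average uses only the losses that produce a payment.

# Hypothetical ground-up losses; ordinary deductible of 1000.
losses = [300, 600, 1200, 1500, 2800]
d = 1000

per_loss = [max(x - d, 0) for x in losses]        # YL: zero when x <= d
per_payment = [x - d for x in losses if x > d]    # YP: only the non-zero payments

print(sum(per_loss) / len(per_loss))          # 500, averaging in the zeros
print(sum(per_payment) / len(per_payment))    # 833.33, the mean excess loss e(1000)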

Problems:
Use the following information for the next 3 questions:
The ABC Bookstore has an insurance policy with a $100,000 maximum covered loss,
$20,000 per loss deductible, and a 90% coinsurance factor.
During the year, ABC Bookstore suffers three losses of sizes: $17,000, $60,000 and $234,000.
15.1 (1 point) How much does the insurer pay in total?
A. less than $95,000
B. at least $95,000 but less than $100,000
C. at least $100,000 but less than $105,000
D. at least $105,000 but less than $110,000
E. at least $110,000
15.2 (1 point) What is the amount paid per loss?
A. less than $35,000
B. at least $35,000 but less than $40,000
C. at least $40,000 but less than $45,000
D. at least $45,000 but less than $50,000
E. at least $50,000
15.3 (1 point) What is the amount paid per payment?
A. less than $35,000
B. at least $35,000 but less than $40,000
C. at least $40,000 but less than $45,000
D. at least $45,000 but less than $50,000
E. at least $50,000

15.4 (2 points) The size of loss is uniform on [0, 400].


Policy A has an ordinary deductible of 100.
Policy B has a franchise deductible of 100.
What is the ratio of the expected losses paid under Policy B to the expected losses paid under
Policy A?
A. 7/6
B. 5/4
C. 4/3
D. 3/2
E. 5/3

15.5 (1 point) An insured suffers 4 losses of size: $2500, $7700, $10,100, and $23,200.
The insured has a $10,000 franchise deductible. How much does the insurer pay in total?
A. less than 32,500
B. at least 32,500 but less than 33,000
C. at least 33,000 but less than 33,500
D. at least 33,500 but less than 34,000
E. at least 34,000
Use the following information for the next 3 questions:
An insurance policy has a deductible of 10,000, policy limit of 100,000, and a coinsurance factor of
80%. (The policy limit is the maximum possible payment by the insurer on a single loss.) During the
year, the insured suffers six losses of sizes: 3000, 8000, 14,000, 80,000, 120,000, and 200,000.
15.6 (2 points) How much does the insurer pay in total?
A. less than 235,000
B. at least 235,000 but less than 240,000
C. at least 240,000 but less than 245,000
D. at least 245,000 but less than 250,000
E. at least 250,000
15.7 (1 point) What is the amount paid per loss?
A. less than 45,000
B. at least 45,000 but less than 50,000
C. at least 50,000 but less than 55,000
D. at least 55,000 but less than 60,000
E. at least 60,000
15.8 (1 point) What is the amount paid per payment?
A. less than 45,000
B. at least 45,000 but less than 50,000
C. at least 50,000 but less than 55,000
D. at least 55,000 but less than 60,000
E. at least 60,000

Use the following size of loss distribution for the next 2 questions:
Size of Loss    Probability
100             70%
1000            20%
10,000          10%
15.9 (2 points) If there is an ordinary deductible of 500, what is the coefficient of variation of the
nonzero payments?
A. less than 1.0
B. at least 1.0 but less than 1.1
C. at least 1.1 but less than 1.2
D. at least 1.2 but less than 1.3
E. at least 1.3
15.10 (2 points) If there is a franchise deductible of 500, what is the coefficient of variation of the
nonzero payments?
A. less than 1.0
B. at least 1.0 but less than 1.1
C. at least 1.1 but less than 1.2
D. at least 1.2 but less than 1.3
E. at least 1.3
Use the following information for the next four questions:
The Mockingbird Tequila Company buys insurance from the Atticus Insurance Company, with a
deductible of $5000, maximum covered loss of $250,000, and coinsurance factor of 90%.
Atticus Insurance Company buys reinsurance from the Finch Reinsurance Company.
Finch will pay Atticus for the portion of any payment in excess of $100,000.
Let X be an individual loss amount suffered by the Mockingbird Tequila Company.
15.11 (2 points) Let Y be the amount retained by the Mockingbird Tequila Company.
What is the relationship between X and Y?
15.12 (2 points) Let Y be the amount paid by the Atticus Insurance Company to the Mockingbird
Tequila Company, prior to the impact of reinsurance. What is the relationship between X and Y?
15.13 (2 points) Let Y be the payment made by the Finch Reinsurance Company to the Atticus
Insurance Company. What is the relationship between X and Y?
15.14 (2 points) Let Y be the net amount paid by the Atticus Insurance Company after the impact
of reinsurance. What is the relationship between X and Y?

15.15 (2 points) Assume a loss of size x.


Policy A calculates the payment based on limiting to a maximum of 10,000, then subtracting a
deductible of 1000, and then applying a coinsurance factor of 90%.
Policy B instead calculates the payment based on subtracting a deductible of 1000, then limiting it to
a maximum of 10,000, and then applying a coinsurance factor of 90%.
What is the difference in payments between that under Policy A and Policy B?
15.16 (CAS6, 5/94, Q.21) (1 point) Last year, an insured in a group medical plan incurred charges
of $600. This year, the same medical care resulted in a charge of $660. The group comprehensive
medical care plan provides 80% payment after a $100 deductible. Determine the increase in the
insured's retention under his or her comprehensive medical care plan.
A. Less than 7.0%
B. At least 7.0% but less than 9.0%
C. At least 9.0% but less than 11.0%
D. At least 11.0% but less than 13.0%
E. 13.0% or more
15.17 (CAS6, 5/96, Q.41) (2 points)
You are given the following full coverage experience:
Loss Size         Number of Claims    Amount of Loss
$0-99             1,400               $76,000
$100-249          400                 $80,000
$250-499          200                 $84,000
$500-999          100                 $85,000
$1,000 or more    50                  $125,000
Total             2,150               $450,000
(a) (1 point) Calculate the expected percentage reduction in losses for a $250 ordinary deductible.
(b) (1 point) Calculate the expected percentage reduction in losses for a $250 franchise deductible.

Use the following information for the next two questions:
Full Coverage Loss Data
Loss Size        Number of Claims    Amount of Loss
0 - 250          1,500               375,000
250 - 500        1,000               450,000
500 - 750        750                 487,500
750 - 1,000      500                 400,000
1,000 - 1,500    250                 312,500
1,500 or more    100                 300,000
Total            4,100               2,325,000
15.18 (CAS6, 5/97, Q.32a) (1 point) Calculate the percentage reduction in the loss costs for a
$500 franchise deductible compared to full coverage.
15.19 (1 point) Calculate the percentage reduction in the loss costs for a $500 ordinary deductible
compared to full coverage.

15.20 (CAS9 11/97, Q.36a) (2 points)


An insured is trying to decide which type of policy to purchase:

A policy with a franchise deductible of $50 will cost her $8 more than a policy with
a straight deductible of $50.

A policy with a franchise deductible of $100 will cost her $10 more than a policy with
a straight deductible of $100.
An expected ground-up claim frequency of 1.000 is assumed for each of the policies
described above.
Calculate the probability that the insured will suffer a loss between $50 and $100. Show all work.

Use the following information for the next two questions:
Loss Size        Number of Claims    Total Amount of Loss
$0-249           5,000               $1,125,000
250-499          2,250               765,000
500-999          950                 640,000
1,000-2,499      575                 610,000
2,500 or more    200                 890,000
Total            8,975               $4,030,000
15.21 (CAS6, 5/98, Q.7) (1 point) Calculate the percentage reduction in loss costs caused by the
introduction of a $500 franchise deductible. Assume there is no adverse selection or padding of
claims to reach the deductible.
A. Less than 25.0%
B. At least 25.0%, but less than 40.0%
C. At least 40.0%, but less than 55.0%
D. At least 55.0%, but less than 70.0%
E. 70.0% or more
15.22 (1 point) Calculate the percentage reduction in loss costs caused by the introduction of a
$500 ordinary deductible. Assume there is no adverse selection or padding of claims to reach the
deductible.
15.23 (1, 5/03, Q.25) (2.5 points) An insurance policy pays for a random loss X subject to a
deductible of C, where 0 < C < 1. The loss amount is modeled as a continuous random variable
with density function f(x) = 2x for 0 < x < 1.
Given a random loss X, the probability that the insurance payment is less than 0.5
is equal to 0.64. Calculate C.
(A) 0.1
(B) 0.3
(C) 0.4
(D) 0.6
(E) 0.8
15.24 (CAS5, 5/03, Q.9) (1 point) An insured has a catastrophic health insurance policy with a
$1,500 deductible and a 75% coinsurance clause. The policy has a $3,000 maximum retention.
If the insured incurs a $10,000 loss, what amount of the loss must the insurer pay?
Note: I have rewritten this past exam question in order to match the current syllabus.

15.25 (CAS3, 5/04, Q.35) (2.5 points) The XYZ Insurance Company sells property insurance
policies with a deductible of $5,000, policy limit of $500,000, and a coinsurance factor of 80%.
Let Xi be the individual loss amount of the ith claim and Yi be the claim payment of the ith claim.
Which of the following represents the relationship between Xi and Yi?
A. Yi = 0 if Xi ≤ 5,000;  0.80(Xi - 5,000) if 5,000 < Xi ≤ 625,000;  500,000 if Xi > 625,000
B. Yi = 0 if Xi ≤ 4,000;  0.80(Xi - 4,000) if 4,000 < Xi ≤ 500,000;  500,000 if Xi > 500,000
C. Yi = 0 if Xi ≤ 5,000;  0.80(Xi - 5,000) if 5,000 < Xi ≤ 630,000;  500,000 if Xi > 630,000
D. Yi = 0 if Xi ≤ 6,250;  0.80(Xi - 6,250) if 6,250 < Xi ≤ 631,250;  500,000 if Xi > 631,250
E. Yi = 0 if Xi ≤ 5,000;  0.80(Xi - 5,000) if 5,000 < Xi ≤ 505,000;  500,000 if Xi > 505,000

15.26 (SOA M, 5/05, Q.32 & 2009 Sample Q.168) (2.5 points) For an insurance:
(i) Losses can be 100, 200, or 300 with respective probabilities 0.2, 0.2, and 0.6.
(ii) The insurance has an ordinary deductible of 150 per loss.
(iii) YP is the claim payment per payment random variable.
Calculate Var(YP).
(A) 1500    (B) 1875    (C) 2250    (D) 2625    (E) 3000

Solutions to Problems:
15.1. D. First the insurer limits each loss to $100,000: 17, 60, 100. Then it reduces each loss by
the $20,000 deductible: 0, 40, 80. Then the 90% coinsurance factor is applied: 0, 36, 72.
The insurer pays a total of 0 + 36 + 72 = $108 thousand.
15.2. B. There are three losses and 108,000 in total is paid: $108,000/3 = $36,000.
15.3. E. There are two (non-zero) payments and 108,000 in total is paid: $108,000/2 = $54,000.
15.4. E. Under Policy A one pays x - 100 for x > 100. 3/4 of the losses are greater than 100, and
those losses have average size (100 + 400)/2 = 250.
Thus under Policy A the expected payment per loss is: (3/4)(250 -100) = 112.5
Under Policy B, one pays x for x > 100.
Thus the expected payment per loss is: (3/4)(250) = 187.5. Ratio is: 187.5/112.5 = 5/3.
15.5. C. The insurer pays: 0 + 0 + $10,100 + $23,200 = $33,300.
15.6. D. Subtract the deductible: 0, 0, 4000, 70,000, 110,000, 190,000.
Multiply by the coinsurance factor: 0, 0, 3200, 56,000, 88,000, 152,000.
Limit each payment to 100,000: 0, 0, 3200, 56,000, 88,000, 100,000.
0 + 0 + 3200 + 56,000 + 88,000 + 100,000 = 247,200.
Alternately, the maximum covered loss is: 10000 + 100000/.8 = 135,000.
Limit each loss to the maximum covered loss: 3000, 8000, 14,000, 80,000, 120,000, 135,000.
Subtract the deductible: 0, 0, 4000, 70,000, 110,000, and 125,000.
Multiply by the coinsurance factor: 0, 0, 3200, 56,000, 88,000, and 100,000.
0 + 0 + 3200 + 56,000 + 88,000 + 100,000 = 247,200.
15.7. A. 247,200/6 = 41,200.
15.8. E. 247,200/4 = 61,800.
15.9. D. The nonzero payments are: 500@2/3 and 9500@1/3.
Mean = (2/3)(500) + (1/3)(9500) = 3500.
2nd moment = (2/3)(500^2) + (1/3)(9500^2) = 30,250,000.
Variance = 30,250,000 - 3500^2 = 18,000,000.
CV = sqrt(18,000,000) / 3500 = 1.212.

15.10. B. The nonzero payments are: 1000@2/3 and 10,000@1/3.
Mean = (2/3)(1000) + (1/3)(10,000) = 4000.
2nd moment = (2/3)(1000^2) + (1/3)(10,000^2) = 34,000,000.
Variance = 34,000,000 - 4000^2 = 18,000,000. CV = sqrt(18,000,000) / 4000 = 1.061.

15.11. Mockingbird retains all of any loss less than $5000.


For a loss of size greater than $5000, it retains $5000 plus 10% of the portion above $5000.
Mockingbird retains the portion of any loss above the maximum covered loss of $250,000.
Y = X, for X ≤ 5000.
Y = 5000 + (0.1)(X - 5000) = 4500 + 0.1X, for 5000 ≤ X ≤ 250,000.
Y = 4500 + (0.1)(250,000) + (X - 250,000) = X - 220,500, for 250,000 ≤ X.
Comment: The maximum amount that Atticus Insurance Company retains on any loss is:
(.9)(250,000 - 5000) = 220,500. Therefore, for a loss X of size greater than 250,000, Mockingbird
retains X - 220,500.
15.12. Atticus Insurance pays nothing for a loss less than $5000. For a loss of size greater than
$5000, Atticus Insurance pays 90% of the portion above $5000.
For a loss of size 250,000, Atticus Insurance pays: (.9)(250,000 - 5000) = 220,500.
Atticus Insurance pays no more for a loss larger than the maximum covered loss of $250,000.
Y = 0, for X ≤ 5000.
Y = (0.9)(X - 5000) = 0.9X - 4500, for 5000 ≤ X ≤ 250,000.
Y = 220,500, for 250,000 ≤ X.
Comment: The amount retained by Mockingbird, plus the amount paid by Atticus to Mockingbird,
equals the total loss.
15.13. Finch Reinsurance pays something when the loss results in a payment by Atticus of more
than $100,000. Solve for the loss that results in a payment of $100,000:
100,000 = (0.9)(X - 5000). ⇒ X = 116,111.
Y = 0, for X ≤ 116,111.
Y = (0.9)(X - 116,111) = 0.9X - 104,500, for 116,111 < X ≤ 250,000.
Y = 120,500, for 250,000 ≤ X.
15.14. For a loss greater than 116,111, Atticus pays 100,000 net of reinsurance.
Y = 0, for X ≤ 5000.
Y = (0.9)(X - 5000) = 0.9X - 4500, for 5000 ≤ X ≤ 116,111.
Y = 100,000, for 116,111 < X.

15.15. Policy A: (0.9)(Min[x, 10,000] - 1000)+ = (0.9)(Min[x - 1000, 9000])+
= (Min[0.9x - 900, 8100])+.
Policy B: (0.9) Min[(x - 1000)+, 10,000] = Min[(0.9x - 900)+, 9000]
= (Min[0.9x - 900, 9000])+.
Policy A - Policy B = (Min[0.9x - 900, 8100])+ - (Min[0.9x - 900, 9000])+.
If x ≥ 11,000, then this difference is: 8100 - 9000 = -900.
If 11,000 > x > 10,000, then this difference is: 8100 - (0.9x - 900) = 9000 - 0.9x.
If 10,000 ≥ x > 1000, then this difference is: (0.9x - 900) - (0.9x - 900) = 0.
If x ≤ 1000, then this difference is: 0 - 0 = 0.
Comment: Policy A follows the order of operations you should follow on your exam, unless
specifically stated otherwise.
[Graph omitted: the difference in payments, Policy A minus Policy B, versus the size of loss x
(horizontal axis up to 14,000; vertical axis from about -900 up to 0).]
I would attack this type of problem by just trying various values for x.
Here are some examples:
x         Payment Under A    Payment Under B    Difference
12,000    8100               9000               -900
10,300    8100               8370               -270
8000      6300               6300               0
700       0                  0                  0
15.16. A. Last year, the insured gets paid: (80%)(600 - 100) = 400.
Insured retains: 600 - 400 = 200.
This year, the insured gets paid: (80%)(660 - 100) = 448.
Insured retains: 660 - 448 = 212.
Increase in the insured's retention is: 212 / 200 - 1 = 6.0%.

15.17. (a) Losses eliminated: 76,000 + 80,000 + (250)(200 + 100 + 50) = $243,500.
$243,500/$450,000 = 0.541 = 54.1% reduction in expected losses.
(b) Under the franchise deductible, we pay the whole loss for a loss of size greater than 250.
Losses eliminated: 76,000 + 80,000 = $156,000.
$156,000/$450,000 = 0.347 = 34.7% reduction in expected losses.
15.18. (375,000 + 450,000) / 2,325,000 = 35.5%.
15.19. {375,000 + 450,000 + (500)(750 + 500 + 250 + 100)} / 2,325,000 = 69.9%.
15.20. When a payment is made, the $50 franchise deductible pays $50 more than the $50
straight deductible. Therefore, $8 = (1.000) S(50) ($50). S(50) = 16%.
When a payment is made, the $100 franchise deductible pays $100 more than the $100 straight
deductible. Therefore, $10 = (1.000) S(100) ($100). S(100) = 10%.
The probability that the insured will suffer a loss between $50 and $100 is:
S(50) - S(100) = 16% - 10% = 6%.
15.21. C. (1125000 + 765000)/4030000 = 46.9%.
15.22. {1125000 + 765000 + (500)(950 + 575 + 200)}/4030000 = 68.3%.
15.23. B. F(x) = x^2.
0.64 = Prob[payment < 0.5] = Prob[X - C < 0.5] = Prob[X < 0.5 + C] = (0.5 + C)^2.
0.5 + C = sqrt(0.64) = 0.8. ⇒ C = 0.8 - 0.5 = 0.3.


15.24. Without the maximum retention the insurer would pay: (10,000 - 1500)(0.75) = 6375.
In that case the insured would retain: 10,000 - 6375 = 3625.
However, the insured retains at most 3000, so the insurer pays: 10,000 - 3000 = 7000.
15.25. C. The policy limit is: c(u-d), where u is the maximum covered loss.
Therefore, u = d + (policy limit)/c = 5000 + 500,000/0.8 = 630,000.
Therefore the payment is: 0 if X ≤ 5,000, 0.80(X - 5,000) if 5,000 < X ≤ 630,000,
and 500,000 if X > 630,000.
Comment: For a loss of size 3000 nothing is paid. For a loss of size 100,000, the payment is:
.8(100,000 - 5000) = 76,000. For a loss of size 700,000, the payment would be:
.8(700,000 - 5000) = 556,000, except that the maximum payment is the policy limit of 500,000.
Increasing the size of loss above the maximum covered loss of 630,000, results in no increase in
payment beyond 500,000.

15.26. B. Prob[YP = 50] = 0.2/(0.2 + 0.6) = 1/4. Prob[YP = 150] = 0.6/(0.2 + 0.6) = 3/4.
E[YP] = (50)(1/4) + (150)(3/4) = 125. E[(YP)^2] = (50^2)(1/4) + (150^2)(3/4) = 17,500.
Var[YP] = 17,500 - 125^2 = 1875.

Section 16, Truncated Data


The ungrouped data in Section 1 is assumed to be ground-up (first dollar) unlimited losses, on all
loss events that occurred. By first dollar, I mean that we start counting from the first dollar of
economic loss, in other words as if there were no deductible. By unlimited I mean we count every
dollar of economic loss, as if there were no maximum covered loss.
Sometimes some of this information is not reported, most commonly due to a deductible and/or
maximum covered loss.
There are four such situations likely to come up on your exam, each of which has two names:
left truncation ⇔ truncation from below
left truncation and shifting ⇔ truncation and shifting from below
left censoring and shifting ⇔ censoring and shifting from below
right censoring ⇔ censoring from above.
In the following, ground-up, unlimited losses are assumed to have distribution function F(x).
G(x) is what one would see after the effects of either a deductible or a maximum covered loss.
Left Truncation / Truncation from Below:63
Left Truncation at d ⇔ Truncation from Below at d
⇔ deductible d, and record the size of loss for losses of size > d.
For example, the same data as in Section 1, but left truncated or truncated from below at
$10,000, would have no information on the first eight losses, each of which resulted in less than
$10,000 of loss. The actuary would not even know that there were eight such losses.64 The same
information would be reported as shown in Section 1 on the remaining 122 large losses.
When data is truncated from below at the value d, losses of size less than d are not in the reported
data base.65 This generally occurs when there is an (ordinary) deductible of size d, and the insurer
records the amount of loss to the insured.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is left truncated / truncated from below at $1000?
[Solution: $1,200, $1,500 and $2,800 .
Comment: The two smaller losses are never reported to the insurer.]
63 The terms left truncation and truncation from below are synonymous.
64 This would commonly occur in the case of a $10,000 deductible.
65 Note that the Mean Excess Loss, e(x), is unaffected by truncation from below at d, provided x > d.

The distribution function and the probability density functions are revised as follows:
G(x) = {F(x) - F(d)} / S(d), x > d.
g(x) = f(x) / S(d), x > d.
x ⇔ the size of loss.
Thus the data truncated from below has a distribution function which is zero at d and 1 at infinity.
The revised probability density function has been divided by the original chance of having a loss of
size greater than d. Thus for the revised p.d.f. the probability from d to infinity integrates to unity as it
should.
Note that G(x) = {F(x) - F(d)} / S(d) = (S(d) - S(x))/S(d) = 1 - S(x)/S(d), x> d.
1 - G(x) = S(x)/S(d). The revised survival function after truncation from below is the survival function
prior to truncation divided by the survival function at the truncation point.
Both data truncated from below and the mean excess loss exclude the smaller losses. In order to
compute the mean excess loss, we would take the average of the losses greater than d, and then
subtract d. Therefore, the average size of the data truncated from below at d, is d plus the mean
excess loss at d, e(d) + d.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average size of the data reported, if it is truncated from below at $1000?
[Solution: ($1,200+ $1,500 + $2,800) /3 = 1833.33 = 1000 + 833.33 = 1000 + e(1000).]
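A minimal Python sketch of truncation from below, using the losses of the exercise above, confirming that the average of the truncated data is d plus the empirical mean excess loss e(d):

losses = [300, 600, 1200, 1500, 2800]
d = 1000

# Losses of size d or less never appear in the reported data base.
reported = [x for x in losses if x > d]
print(reported)                                        # [1200, 1500, 2800]

e_d = sum(x - d for x in reported) / len(reported)     # empirical e(1000) = 833.33
print(sum(reported) / len(reported), e_d + d)          # both equal 1833.33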
Franchise Deductible:
Under a franchise deductible the insurer pays nothing if the loss is less than the deductible amount,
but ignores the deductible if the loss is greater than the deductible amount. If the deductible amount
is d and the insured has a loss of size x, then the insurer pays:
0, if x ≤ d
x, if x > d
Thus data from a policy with a franchise deductible is truncated from below at the
deductible amount.

Left Truncation and Shifting / Truncation and Shifting from Below:


Left Truncation & Shifting at d ⇔ Truncation & Shifting from Below at d
⇔ Excess Loss Variable ⇔ deductible d, and record the non-zero payment.
If the data in Section 1 were truncated and shifted from below at $10,000, the data on these
remaining 122 losses would have each amount reduced by $10,000. For the ninth loss with
$10,400 of loss, $400 would be reported66. When data is truncated and shifted from below at the
value d, losses of size less than d are not in the reported data base, and larger losses have their
reported values reduced by d. This generally occurs when there is an (ordinary) deductible of size
d, and the insurer records the amount of payment to the insured.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is truncated and shifted from below at $1000?
[Solution: $200, $500 and $1,800.]
The distribution, survival, and probability density functions are revised as follows:
G(x) = {F(x + d) - F(d)} / S(d), x > 0.
New survival function = 1 - G(x) = S(x + d)/S(d), x > 0.
g(x) = f(x + d) / S(d), x > 0.
x ⇔ the size of (non-zero) payment.    x + d ⇔ the size of loss.

As discussed previously, the Excess Loss Variable for d is defined for X > d as X - d and is
undefined for X ≤ d, which is the same as the effect of truncating and shifting from below at d.
Exercise: Prior to a deductible, losses are Weibull with τ = 2 and θ = 1000.
What is the probability density function of the excess loss variable corresponding to d = 500?
[Solution: For the Weibull, F(x) = 1 - exp[-(x/θ)^τ] and f(x) = τ x^(τ-1) exp[-(x/θ)^τ] / θ^τ.
S(500) = exp[-(500/1000)^2] = 0.7788. Let Y be the truncated and shifted variable.
Then, g(y) = f(500 + y)/S(500) = (y + 500) exp[-((y + 500)/1000)^2] / 389,400, y > 0.]
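One can check numerically that this truncated and shifted density integrates to one. The following Python sketch does a crude numerical integration of g(y) = f(y + d)/S(d) under the Weibull assumptions of the exercise (tau = 2, theta = 1000, d = 500); it also computes the mean of g, which is the mean excess loss e(500).

import math

tau, theta, d = 2.0, 1000.0, 500.0

def f(x):   # Weibull density
    return tau * x ** (tau - 1) * math.exp(-(x / theta) ** tau) / theta ** tau

def S(x):   # Weibull survival function
    return math.exp(-(x / theta) ** tau)

def g(y):   # density of the excess loss variable, truncated and shifted at d
    return f(y + d) / S(d)

h = 1.0
ys = [h * (i + 0.5) for i in range(20000)]   # midpoint rule on (0, 20000)
print(sum(g(y) for y in ys) * h)             # close to 1
print(sum(y * g(y) for y in ys) * h)         # e(500), the mean excess loss at 500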
The average size of the data truncated and shifted from below at d, is the mean excess loss (mean
residual life) at d, e(d).

66 With a $10,000 deductible, the insurer would pay $400 while the insured would have to absorb $10,000.

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average size of the data reported, if it is truncated and shifted from below at $1000?
[Solution: ($200+ $500 + $1,800) /3 = 833.33 = e(1000).]
Complete Expectation of Life:67
These ideas are mathematically equivalent to ideas discussed in Life Contingencies.
The expected future lifetime for an age x is e°x, the complete expectation of life.
e°x is the mean residual life (mean excess loss) at x, e(x).
e°x is the mean of the lives truncated and shifted from below at x.
Exercise: Three people die at ages: 55, 70, 80. Calculate e°65.
[Solution: Truncate and shift the data at 65; eliminate any ages ≤ 65 and subtract 65:
70 - 65 = 5, 80 - 65 = 15.
Average the truncated and shifted data: e°65 = (5 + 15)/2 = 10.]
The survival function at age x + t for the data truncated and shifted at x is: S(x + t)/S(x) = tpx.
As will be discussed in a subsequent section, one can get the mean by integrating the survival
function.
e°x = mean of the data truncated and shifted at x = ∫[0 to ∞] tpx dt.68

67 See Actuarial Mathematics.
68 See equation 3.5.2 in Actuarial Mathematics.

Right Truncation / Truncation from Above:69


In the case of the data in Section 1, right truncated or truncated from above at $25,000, there
would be no information on the 109 losses larger than $25,000. Truncation from above contrasts
with data censored from above at $25,000, which would have each of the 109 large losses in
Section 1 reported as being $25,000 or more.70
When data is right truncated or truncated from above at the value L, losses of size greater than L are
not in the reported data base.71
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is truncated from above at $1000?
[Solution: $300 and $600.]
The distribution function and the probability density functions are revised as follows:
G(x) = F(x) / F(L), x ≤ L.
g(x) = f(x) / F(L), x ≤ L.
The average size of the data truncated from above at L is the average size of losses from 0 to L:
{E[X ∧ L] - L S(L)} / F(L).
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average size of the data reported, if it is truncated from above at $1000?
[Solution: ($300 + $600)/2 = $450.]

69 See Definition 14.1 in Loss Models.
70 Under censoring from above, one would not know the total size of loss for each of these large losses. This is quite common for data reporting when there are maximum covered losses.
71 Right truncation can happen when insufficient time has elapsed to receive all of the data. For example, one might be doing a mortality study based on death records, which would exclude from the data anyone who has yet to die. Right truncation can also occur when looking at claim count development. One might not have data beyond a given report and fit a distribution function (truncated from above) to the available claim counts by report. See for example, "Estimation of the Distribution of Report Lags by the Method of Maximum Likelihood," by Edward W. Weisner, PCAS 1978.

Truncation from Both Above and Below:


When data is both truncated from above at L and truncated from below at the value d, losses of size
greater than L or less than or equal to d are not in the reported data base.
Exercise: Events occur at times: 3, 6, 12, 15, 24, and 28.
How is this data reported, if it is truncated from below at 10 and truncated from above at 20?
[Solution: 12 and 15.
Comment: One starts observing at time 10 and stops observing at time 20.]
The distribution function and the probability density functions are revised as follows:
G(x) = {F(x) - F(d)} / {F(L) - F(d)}, d < x ≤ L.
g(x) = f(x) / {F(L) - F(d)}, d < x ≤ L.
Note that whenever we have truncation, the probability remaining after truncation is the denominator
of the altered density and distribution functions.
The average size of the data truncated from below at d and truncated from above at L is the
average size of losses from d to L:
[{E[X ∧ L] - L S(L)} - {E[X ∧ d] - d S(d)}] / {F(L) - F(d)}.
Exercise: Events occur at times: 3, 6, 12, 15, 24, and 28. What is the average size of the data
reported, if it is truncated from below at 10 and truncated from above at 20?
[Solution: (12 + 15)/2 = 13.5.]
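As a check of the average-size formula above, the Python sketch below assumes a hypothetical Exponential distribution with mean 1000, d = 500, and L = 2000, and compares the closed-form average of losses between d and L with a simulated average:

import math, random

theta, d, L = 1000.0, 500.0, 2000.0
S = lambda x: math.exp(-x / theta)                    # survival function
F = lambda x: 1 - S(x)                                # distribution function
lev = lambda x: theta * (1 - math.exp(-x / theta))    # E[X ∧ x] for the Exponential

# Average size of losses between d and L, per the formula above.
formula = ((lev(L) - L * S(L)) - (lev(d) - d * S(d))) / (F(L) - F(d))

random.seed(1)
sample = [random.expovariate(1 / theta) for _ in range(100000)]
kept = [x for x in sample if d < x <= L]              # doubly truncated data
print(formula, sum(kept) / len(kept))                 # the two should be close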

Problems:
Use the following information for the next 4 questions:
There are 6 losses: 100, 400, 700, 800, 1200, and 2300.
16.1 (1 point) If these losses are truncated from below (left truncated) at 500, what appears in the
data base?
16.2 (1 point) If these losses are truncated and shifted from below (left truncated and shifted) at 500,
what appears in the data base?
16.3 (1 point) If these losses are truncated from above (right truncated) at 1000, what appears in
the data base?
16.4 (1 point) If these losses are truncated from below (left truncated) at 500 and truncated from
above (right truncated) at 1000, what appears in the data base.
Use the following information for the next 2 questions:
Losses are uniformly distributed from 0 to 1000.
16.5 (1 point) What is the distribution function for these losses left truncated at 100?
16.6 (1 point) What is the distribution function for these losses left truncated and shifted at 100?

16.7 (1 point) Assume that claims follow a distribution with F(x) = 1 - [1 / {1 + (x/θ)^γ}]^α.
Which of the following represents the distribution function for the data truncated from below at d?
A. {(θ^γ + d^γ) / (θ^γ + x^γ)}^α, x > d
B. 1 - {(θ^γ + d^γ) / (θ^γ + x^γ)}^α, x > d
C. {θ^γ / (θ^γ + x^γ)}^α, x > d
D. 1 - {θ^γ / (θ^γ + x^γ)}^α, x > d
E. None of the above.

16.8 (1 point) The report lag for claims is assumed to be exponentially distributed:
F(x) = 1 - e^(-x), where x is the delay in reporting.
What is the probability density function for data truncated from above at 3?
A. 3e^(-3x)
B. e^(-3)
C. e^(-x) / (1 - e^(-3))
D. e^(-x) / (1 - e^(-3))
E. e^(-(x-3))

Use the following information for the next three questions:

Losses follow a Distribution Function F(x) = 1 - {2000 / (2000 + x)}^3.


16.9 (1 point) If the reported data is truncated from below at 400, what is the Density Function at
1000?
A. less than 0.0004
B. at least 0.0004 but less than 0.0005
C. at least 0.0005 but less than 0.0006
D. at least 0.0006 but less than 0.0007
E. at least 0.0007
16.10 (1 point) If the reported data is truncated from above at 2000, what is the Density Function at
1000?
A. less than 0.0004
B. at least 0.0004 but less than 0.0005
C. at least 0.0005 but less than 0.0006
D. at least 0.0006 but less than 0.0007
E. at least 0.0007
16.11 (1 point) If the reported data is truncated from below at 400 and truncated from above at
2000, what is the Density Function at 1000?
A. less than 0.0004
B. at least 0.0004 but less than 0.0005
C. at least 0.0005 but less than 0.0006
D. at least 0.0006 but less than 0.0007
E. at least 0.0007
Use the following information for the next 2 questions:
There are 3 losses: 800, 2500, 7000.
16.12 (1 point) If these losses are truncated from below (left truncated) at 1000, what appears in the
data base?
16.13 (1 point) If these losses are truncated and shifted from below (left truncated and shifted) at
1000, what appears in the data base?

Use the following information for the next 3 questions:


The probability density function is: f(x) = x/50, 0 ≤ x ≤ 10.
16.14 (2 points) Determine the mean of this distribution left truncated at 4.
A. 7.3
B. 7.4
C. 7.5
D. 7.6
E. 7.7
16.15 (2 points) Determine the median of this distribution left truncated at 4
A. 7.3
B. 7.4
C. 7.5
D. 7.6
E. 7.7
16.16 (2 points) Determine the variance of this distribution left truncated at 4
A. 2.8
B. 3.0
C. 3.2
D. 3.4
E. 3.6

16.17 (4, 5/85, Q.56) (2 points) Let f be the probability density function of x, and let F be the
distribution function of x. Which of the following expressions represent the probability density
function of x truncated and shifted from below at d?
A. 0, for x ≤ d;  f(x) / {1 - F(d)}, for d < x.
B. 0, for x ≤ 0;  f(x) / {1 - F(d)}, for 0 < x.
C. 0, for x ≤ d;  f(x - d) / {1 - F(d)}, for d < x.
D. 0, for x ≤ -d;  f(x + d) / {1 - F(d)}, for -d < x.
E. 0, for x ≤ 0;  f(x + d) / {1 - F(d)}, for 0 < x.

16.18 (4B, 11/92, Q.3) (1 point) You are given the following:
Based on observed data truncated from above at $10,000,
the probability of a claim exceeding $3,000 is 0.30.
Based on the underlying distribution of losses, the
probability of a claim exceeding $10,000 is 0.02.
Determine the probability that a claim exceeds $3,000.
A. Less than 0.28
B. At least 0.28 but less than 0.30
C. At least 0.30 but less than 0.32
D. At least 0.32 but less than 0.34
E. At least 0.34

Solutions to Problems:
16.1. If these losses are truncated from below (left truncated) at 500, the two small losses do not
appear: 700, 800, 1200, 2300.
16.2. If these losses are truncated and shifted from below at 500, the two small losses do not
appear and the other losses have 500 subtracted from them: 200, 300, 700, 1800.
Comment: The (non-zero) payments with a $500 deductible.
16.3. If these losses are truncated from above (right truncated) at 1000, the two large losses do not
appear: 100, 400, 700, 800.
16.4. Neither the two small losses nor the two large losses appear: 700, 800.
16.5. Losses of size less than 100 do not appear.
G(x) = (x - 100)/900 for 100 < x < 1000.
Alternately, F(x) = x/1000, 0 < x < 1000 and G(x) = {F(x) - F(100)}/S(100) =
{(x/1000) - 100/1000)}/(1 - 100/1000) = (x - 100)/900 for 100 < x < 1000.
16.6. Losses of size less than 100 do not appear. We record the payment amount with a $100
deductible. G(x) = x/900 for 0 < x < 900.
Alternately, F(x) = x/1000, 0 < x < 1000 and G(x) = {F(x + 100) - F(100)}/S(100) =
{((x + 100)/1000) - 100/1000)}/(1 - 100/1000) = x/900 for 0 < x < 900.
Comment: A uniform distribution from 0 to 900.
16.7. B. The new distribution function is, for x > d: {F(x) - F(d)} / {1 - F(d)} =
[{θ^γ / (θ^γ + d^γ)}^α - {θ^γ / (θ^γ + x^γ)}^α] / {θ^γ / (θ^γ + d^γ)}^α = 1 - {(θ^γ + d^γ) / (θ^γ + x^γ)}^α.
Comment: The F(x) given in the problem is a Burr Distribution.
16.8. D. The Distribution Function of the data truncated from above at 3 is G(x) = F(x)/F(3) =
(1 - e^(-x)) / (1 - e^(-3)). The density function is g(x) = G′(x) = e^(-x) / (1 - e^(-3)).
16.9. C. Prior to truncation, the density function is: f(x) = (3)(2000^3) / (2000 + x)^4.
After truncation from below at 400, the density function is: g(x) = f(x) / {1 - F(400)} =
f(x) / {2000 / (2000 + 400)}^3 = f(x) / 0.5787.
f(1000) = (3)(2000^3) / (2000 + 1000)^4 = 0.000296.
g(1000) = 0.000296 / 0.5787 = 0.00051.

16.10. A. Prior to truncation, the density function is: f(x) = (3)(2000^3) / (2000 + x)^4.
After truncation from above at 2000, the density function is:
g(x) = f(x) / F(2000) = f(x) / {1 - (2000 / (2000 + 2000))^3} = f(x) / 0.875.
f(1000) = (3)(2000^3) / (2000 + 1000)^4 = 0.000296.
g(1000) = 0.000296 / 0.875 = 0.00034.
16.11. D. Prior to truncation, the density function is: f(x) = (3)(2000^3) / (2000 + x)^4.
After truncation from below at 400 and from above at 2000, the density function is:
g(x) = f(x) / {F(2000) - F(400)}. F(2000) = 0.875. F(400) = 0.4213.
f(1000) = (3)(2000^3) / (2000 + 1000)^4 = 0.000296.
g(1000) = 0.000296 / (0.875 - 0.4213) = 0.00065.
16.12. If these losses are truncated from below (left truncated) at 1000, then losses of size less than
or equal to 1000 do not appear. The data base is: 2500, 7000.
16.13. If these losses are truncated and shifted from below at 1000, then losses of size less than or
equal to 1000 do not appear, and the other losses have 1000 subtracted from them. The data base
is: 1500, 6000.
Comment: The (non-zero) payments with a $1000 deductible.
16.14. B. & 16.15. D. & 16.16. A. For the original distribution: F(4) = 4^2/100 = 0.16.
Therefore, the density left truncated at 4 is: (x/50)/(1 - 0.16) = x/42, 4 ≤ x ≤ 10.
The mean of the truncated distribution is: ∫[4 to 10] x (x/42) dx = (10^3 - 4^3)/126 = 7.429.
For x > 4, by integrating the truncated density, the truncated distribution is: (x^2 - 4^2) / 84.
Set the truncated distribution equal to 50%: 0.5 = (x^2 - 4^2) / 84. ⇒ x = 7.616.
The second moment of the truncated distribution is: ∫[4 to 10] x^2 (x/42) dx = (10^4 - 4^4)/168 = 58.
The variance of the truncated distribution is: 58 - 7.429^2 = 2.81.
Comment: The density right truncated at 4 is: (x/50)/0.16 = x/8, 0 ≤ x ≤ 4.
16.17. E. The p.d.f. is: 0 for x ≤ 0, and f(x + d)/{1 - F(d)} for 0 < x.
Comment: Choice A is the p.d.f. for the data truncated from below at d, but not shifted.

16.18. C. P(x ≤ 3000 | x ≤ 10,000) = P(x ≤ 3000) / P(x ≤ 10,000).
Thus, 1 - 0.3 = P(x ≤ 3000) / (1 - 0.02).
P(x ≤ 3000) = (1 - 0.3)(0.98) = 0.686.
P(x > 3000) = 1 - 0.686 = 0.314.
Alternately, let F(x) be the distribution of the untruncated losses.
Let G(x) be the distribution of the losses truncated from above at 10,000.
Then G(x) = F(x) / F(10,000), for x ≤ 10,000.
We are given that 1 - G(3000) = 0.3, and that 1 - F(10,000) = 0.02.
Thus F(10,000) = 0.98.
Also, 0.7 = G(3000) = F(3000) / F(10,000) = F(3000) / 0.98.
Thus F(3000) = (0.7)(0.98) = 0.686.
1 - F(3000) = 1 - 0.686 = 0.314.

Section 17, Censored Data


Censoring is somewhat different than truncation. With truncation we do not know of the existence of
certain losses. With censoring we do not know the size of certain losses.
The most important example of censoring is due to the effect of a maximum covered loss.
Right Censored / Censored from Above:72
Right Censored at u ⇔ Censored from Above at u ⇔ X ∧ u ≡ Min[X, u]
⇔ Maximum Covered Loss u, and don't know the exact size of loss when it is ≥ u.
When data is right censored or censored from above at the value u, losses of size more than u
are recorded in the data base as u. This generally occurs when there is a maximum covered loss of
size u. When a loss (covered by insurance) is larger than the maximum covered loss, the insurer
pays the maximum covered loss (if there is no deductible) and may neither know nor care how much
bigger the loss is than the maximum covered loss.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is censored from above at $1000?
[Solution: $300, $600, $1000, $1000, $1000.]
The revised Distribution Function under censoring from above at u is:
G(x) = F(x), for x < u; G(u) = 1.
g(x) = f(x), for x < u; a point mass of probability S(u) at x = u.
The data censored from above at u is the limited loss variable, X ∧ u ≡ Min[X, u], discussed
previously. The average size of the data censored from above at u is the Limited Expected Value
at u, E[X ∧ u].

72 The terms right censored and censored from above are synonymous. From the right refers to a graph with the
size of loss along the x-axis, with the large values on the righthand side. From above uses similar terminology as
higher layers of loss. From above is how the effect of a maximum covered loss looks in a Lee Diagram.


Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average size of the data reported, if it is censored from above at $1000?
[Solution: ($300 + $600 + $1000 + $1000 + $1000)/5 = $780 = E[X ∧ 1000].]
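Such empirical limited expected values are easy to check with a computer. Below is a minimal Python sketch (illustrative only, not part of the syllabus; the function name is just for illustration) that reproduces E[X ∧ 1000] = $780 for the five losses in this exercise:

losses = [300, 600, 1200, 1500, 2800]

def limited_expected_value(xs, u):
    # empirical E[X ^ u]: the average of Min[x, u] over the observed losses
    return sum(min(x, u) for x in xs) / len(xs)

print(limited_expected_value(losses, 1000))  # 780.0, matching the solution above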
Truncation from Below and Censoring from Above:
When data is subject to both a maximum covered loss and a deductible, and one records the loss
by the insured, then the data is censored from above and truncated from below.
For example, with a deductible of $1000 and a maximum covered loss of $25,000:
Loss Size     As Recorded after Truncation from Below at 1000 and Censoring from Above at 25,000
600           Not recorded
4500          4500
37,000        25,000
For truncation from below at d and censoring from above at u,73 the data are recorded as follows:
Loss by Insured     Amount Recorded by Insurer
x ≤ d               Not Recorded
d < x < u           x
u ≤ x               u

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is truncated from below at $1000 and censored from above at
$2000?
[Solution: $1200, $1500, $2000.]
The revised Distribution Function under censoring from above at u and truncation from below at d is:
G(x) = {F(x) - F(d)} / S(d) for d < x < u, and G(u) = 1.
g(x) = f(x) / S(d) for d < x < u, with a point mass of probability S(u)/S(d) at x = u.

73 For the example, u = $25,000 and d = $1000.


The total losses of the data censored from above at u and truncated from below at d are the losses in
the layer from d to u, plus d times the number of losses in the data set. The number of losses in the
data set is proportional to S(d). Therefore, the average size of the data censored from above at u and
truncated from below at d is:
{(E[X ∧ u] - E[X ∧ d]) + d S(d)} / S(d) = (E[X ∧ u] - E[X ∧ d]) / S(d) + d.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average size of the data reported, if it is truncated from below at $1000 and censored
from above at $2000?
[Solution: ($1200 + $1500 + $2000)/3 = $1567.
Comment: (E[X ∧ 2000] - E[X ∧ 1000])/S(1000) + 1000 = (1120 - 780)/0.6 + 1000 = $1567.]
Truncation and Shifting from Below and Censoring from Above:
When data is subject to both a maximum covered loss and a deductible, and one records the
amount paid by the insurer, then the data is censored from above and truncated and shifted from
below.
For example, with a deductible of $1000 and a maximum covered loss of $25,000:
Loss Size     As Recorded after Truncation and Shifting from Below at 1000 and Censoring from Above at 25,000
600           Not recorded
4500          3500
37,000        24,000
For truncation and shifting from below at d and censoring from above at u, the data are recorded as
follows:
Loss by Insured     Amount Recorded by Insurer
x ≤ d               Not Recorded
d < x < u           x - d
u ≤ x               u - d


The revised Distribution Function under censoring from above at u and truncation and shifting from
below at d is:
G(x) = {F(x + d) - F(d)} / S(d) for 0 < x < u - d, and G(u - d) = 1.
g(x) = f(x + d) / S(d) for 0 < x < u - d, with a point mass of probability S(u)/S(d) at x = u - d.
Here x is the size of the (non-zero) payment, and x + d is the size of the loss.

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is truncated and shifted from below at $1000 and censored from
above at $2000?
[Solution: $200, $500, $1000.]
The total losses of the data censored from above at u and truncated and shifted from below at d are
the losses in the layer from d to u. The number of losses in the data base is proportional to S(d).
Therefore, the average size of the data censored from above at u and truncated and shifted from
below at d is: (E[X ∧ u] - E[X ∧ d]) / S(d).
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average size of the data reported, if it is truncated and shifted from below at $1000 and
censored from above at $2000?
[Solution: ($200 + $500 + $1000)/3 = $567.
Comment: (E[X ∧ 2000] - E[X ∧ 1000])/S(1000) = (1120 - 780)/0.6 = $567.]
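As a check on the formula (E[X ∧ u] - E[X ∧ d])/S(d), the following Python sketch (illustrative only; the variable names are arbitrary) computes both the direct average of the recorded payments and the limited expected value form for this exercise, with d = 1000 and u = 2000:

losses = [300, 600, 1200, 1500, 2800]
d, u = 1000, 2000

def lev(xs, limit):
    # empirical E[X ^ limit]
    return sum(min(x, limit) for x in xs) / len(xs)

# data truncated and shifted from below at d and censored from above at u
payments = [min(x, u) - d for x in losses if x > d]
direct = sum(payments) / len(payments)

S_d = sum(1 for x in losses if x > d) / len(losses)
formula = (lev(losses, u) - lev(losses, d)) / S_d

print(direct, formula)  # both 566.67, in other words about $567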

Left Censored and Shifted / Censored and Shifted from Below:


For example, the same data as in Section 1, but left censored and shifted or censored and
shifted from below at $10,000, would have each of the 8 small losses in Section 1 reported as
resulting in no payment in the presence of a $10,000 deductible, but we would not know their exact
sizes. For the remaining 122 large losses, the payment of $10,000 less than their size would be
reported.


Left censored and shifted variable74 at d ⇔ (X - d)+
⇔ 0 when X ≤ d, and X - d when X > d ⇔ the amounts paid to the insured with a deductible of d
⇔ payments per loss, including when the insured is paid nothing due to the deductible of d
⇔ amount paid per loss.
When data is left censored and shifted at the value d, losses of size less than d are recorded in
the data base as 0. Losses of size x > d are recorded as x - d.
What appears in the data base is (X - d)+.
The revised Distribution Function under left censoring and shifting at d is:
G(x) = F(x + d) for x ≥ 0.
g(x) has a point mass of probability F(d) at x = 0, and g(x) = f(x + d) for x > 0.
Here x is the size of the payment, and x + d is the size of the loss.

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is left censored and shifted at $1000?
[Solution: $0, $0, $200, $500 and $1,800.]
The mean of the left censored and shifted at d variable =
the average payment per loss with a deductible of d = E[X] - E[X ∧ d] ⇔ the layer from d to ∞.
E[(X - d)+] = E[X] - E[X ∧ d] = ∫[d, ∞) (x - d) f(x) dx.
E[(X - d)+^n] = ∫[d, ∞) (x - d)^n f(x) dx.

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average of the data reported, if it is left censored and shifted at $1000?
[Solution: (0 + 0 + $200 + $500 + $1800)/5 = $500.
E[X] - E[X ∧ 1000] = $1280 - $780 = $500 = average payment per loss. In contrast, the average
payment per non-zero payment is: ($200 + $500 + $1800)/3 = $833.33.]
74 Discussed previously. See Definition 3.5 in Loss Models.


Left Censored / Censored from Below:75


Sometimes data is censored from below, so that one only knows how many small values there are,
but does not know their exact sizes.76
For example, an actuary might have access to detailed information on all Workers Compensation
losses of size greater than $2000, including the size of each such loss, but might only know how
many losses there were of size less than or equal to $2000. Such data has been censored from
below at $2000.
For example, the same data as in Section 1, but censored from below at $10,000, would have each
of the 8 small losses in Section 1 reported as being $10,000 or less. The same information would
be reported as shown in Section 1 on the remaining 122 losses. When data is censored from
below at the value d, losses of size less than d are recorded in the data base as d.
The revised Distribution Function under censoring from below at d is:
G(x) = 0 for x < d, and G(x) = F(x) for x ≥ d.
g(x) has a point mass of probability F(d) at x = d, and g(x) = f(x) for x > d.

Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
How is this data reported, if it is censored from below at $1000?
[Solution: $1000, $1000, $1,200, $1,500 and $2,800.]
The average size of the data censored from below at d is: (E[X] - E[X ∧ d]) + d.
The losses are those in the layer from d to ∞, plus d per loss.
Exercise: An insured has losses of sizes: $300, $600, $1,200, $1,500 and $2,800.
What is the average of the data reported, if it is censored from below at $1000?
[Solution: ($1000 + $1000 + $1200 + $1500 + $2800) / 5 = $1500.
Comment: (E[X] - E[X ∧ 1000]) + 1000 = (1280 - 780) + 1000 = 1500.]

75 See Definition 14.1 in Loss Models.
76 See for example, 4, 11/06, Q.5.


Problems:
Use the following information for the next 4 questions:
Losses are uniformly distributed from 0 to 1000.
17.1 (1 point) What is the distribution function for these losses left censored at 100?
17.2 (1 point) What is the distribution function for these losses left censored and shifted at 100?
17.3 (1 point) What is the distribution function for these losses right censored at 800?
17.4 (1 point) What is the distribution function for these losses left truncated and shifted at 100 and
right censored at 800?

Use the following information for the next 4 questions:


There are 6 losses: 100, 400, 700, 800, 1200, and 2300.
17.5 (1 point) If these losses are left censored (censored from below) at 500, what appears in the
data base?
17.6 (1 point) If these losses are left censored and shifted at 500, what appears in the data base?
17.7 (1 point) If these losses are censored from above (right censored) at 2000, what appears in
the data base?
17.8 (1 point) If these losses are left truncated and shifted at 500 and right censored at 2000, what
appears in the data base?

17.9 (1 point) There are five accidents with losses equal to:
$500, $2500, $4000, $6000, and $8000.
Which of the following statements are true regarding the reporting of this data?
1. If the data is censored from above at $5000, then the data is reported as:
$500, $2500, $4000.
2. If the data is truncated from below at $1000, then the data is reported as:
$2500, $4000, $6000, $8000.
3. If the data is truncated and shifted at $1000, then the data is reported as:
$1500, $3000, $5000, $7000.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3
E. None of A, B, C or D


17.10 (2 points) It can take many years for a Medical Malpractice claim to be reported to an insurer
and can take many more years to be closed, in other words resolved.
You are studying how long it takes Medical Malpractice claims to be reported to your insurer. You
have data on incidents that occurred two years ago and how long they took to be reported. You are
also studying how long it takes for Medical Malpractice claims to be closed once they are reported.
You have data on all incidents that were reported two years ago and how long it took to close those
that are not still open. For each of these two sets of data, state whether it is truncated and/or
censored and briefly explain why.


Solutions to Problems:
17.1. We only know that small losses are of size at most 100.
G(100) = .1; G(x) = x/1000 for 100 < x < 1000.
17.2. G(0) = .1; G(x) = (x + 100)/1000 for 0 < x < 900.
17.3. All losses are limited to 800. G(x) = x/1000 for 0 < x < 800; G(800) = 1.
17.4. Losses less than 100 do not appear. Other losses are limited to 800 and then have 100
subtracted. G(x) = x/900 for x < 700; G(700) = 1.
Alternately, F(x) = x/1000 for 0 < x < 1000, and G(x) = {F(x + 100) - F(100)}/S(100) =
{(x + 100)/1000 - 100/1000}/(1 - 100/1000) = x/900 for 0 < x < 800 - 100 = 700; G(700) = 1.
17.5. The two smaller losses appear as 500: 500, 500, 700, 800, 1200, 2300.
17.6. (X - 500)+ = 0, 0, 200, 300, 700, 1800.
Comment: The amounts the insured receives with a $500 deductible.
17.7. The large loss is limited to 2000: 100, 400, 700, 800, 1200, 2000.
Comment: Payments with a 2000 maximum covered loss.
Right censored observations might be indicated with a plus as follows:
100, 400, 700, 800, 1200, 2000+. The 2000 corresponds to a loss of 2000 or more.
17.8. The two small losses do not appear; the other losses are limited to 2000 and then have 500
subtracted: 200, 300, 700, 1500.
Comment: Payments with a 500 deductible and 2000 maximum covered loss.
Apply the maximum covered loss first and then the deductible; therefore, apply the censorship first
and then the truncation.
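The four transformations in problems 17.5 through 17.8 can be written compactly in code. Here is an illustrative Python sketch (the list comprehensions are mine, purely for illustration) that reproduces the four data bases in these solutions:

losses = [100, 400, 700, 800, 1200, 2300]
d, u = 500, 2000

left_censored        = [max(x, d) for x in losses]               # 17.5: 500, 500, 700, 800, 1200, 2300
left_cens_shifted    = [max(x - d, 0) for x in losses]           # 17.6: 0, 0, 200, 300, 700, 1800
right_censored       = [min(x, u) for x in losses]               # 17.7: 100, 400, 700, 800, 1200, 2000
trunc_shift_censored = [min(x, u) - d for x in losses if x > d]  # 17.8: 200, 300, 700, 1500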
17.9. C. If the data is censored by a $5000 limit, then the data is reported as: $500, $2500,
$4000, $5000, $5000. Statement 1 would be true if it referred to truncation from above rather than
censoring. Under censoring the size of large accidents is limited in the reported data to the maximum
covered loss. Under truncation from above, the large accidents do not even make it into the reported
data. Statements 2 and 3 are each true.


17.10. The data on incidents that occurred two years ago is truncated from above at two years.
Those incidents, if any, that will take more than 2 years to be reported will not be in our data base
yet. We don't know how many of them there may be nor how long they will take to be reported.
The data on claims that were reported two years ago is censored from above at two years. Those
claims that are still open, we know will be closed eventually. However, while we know it will take
more than 2 years to close each of them, we don't know exactly how long it will take.


Section 18, Average Sizes


For each of the different types of data there are corresponding average sizes. The most important
cases involve a deductible and/or a maximum covered loss; one should know well the average
payment per loss and the average payment per (non-zero) payment.
Average Amount Paid per Loss:
Exercise: When there is a deductible of size 1000, a maximum covered loss of 25,000, and thus a
policy limit of 25,000 - 1000 = 24,000, what is the average amount paid per loss?
[Solution: The average amount paid per loss is the average loss in the layer from 1000 to
25,000: E[X ∧ 25,000] - E[X ∧ 1000].]
Situation                                                Average Amount Paid per Loss
No Maximum Covered Loss, No Deductible                   E[X]
Maximum Covered Loss u, No Deductible                    E[X ∧ u]
No Maximum Covered Loss, (ordinary) Deductible d         E[X] - E[X ∧ d]
Maximum Covered Loss u, (ordinary) Deductible d          E[X ∧ u] - E[X ∧ d]
Recalling that E[X ∧ ∞] = E[X] and E[X ∧ 0] = 0, we have a single formula that covers all four
situations:
With Maximum Covered Loss of u and an (ordinary) deductible of d, the average amount
paid by the insurer per loss is: E[X ∧ u] - E[X ∧ d].
Note that the average payment per loss is just the layer from d to u. As discussed previously, this
layer can also be expressed as: (layer from d to ∞) - (layer from u to ∞) =
E[(X - d)+] - E[(X - u)+] = {E[X] - E[X ∧ d]} - {E[X] - E[X ∧ u]} = E[X ∧ u] - E[X ∧ d].
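For those who like to verify such identities numerically, here is a brief Python sketch. It is illustrative only; the Exponential distribution with mean 10,000 and the limits d = 1000 and u = 25,000 are arbitrary choices, and E[X ∧ x] = θ(1 - e^(-x/θ)) and E[(X - x)+] = θ e^(-x/θ) are the standard Exponential formulas:

import math

theta, d, u = 10000.0, 1000.0, 25000.0

def lev(x):
    return theta * (1.0 - math.exp(-x / theta))     # E[X ^ x]

def excess(x):
    return theta * math.exp(-x / theta)             # E[(X - x)+]

print(lev(u) - lev(d), excess(d) - excess(u))       # both about 8227.5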
Average Amount Paid per Non-Zero Payment:
Exercise: What is the average non-zero payment when there is a deductible of size 1000 and no
maximum covered loss?
[Solution: The average non-zero payment when there is a deductible of size 1000 is the ratio of the
losses excess of 1000, E[X] - E[X ∧ 1000], to the probability of a loss greater than 1000, S(1000).
Thus the expected non-zero payment is: (E[X] - E[X ∧ 1000]) / S(1000).]


In case of a deductible, some losses to the insured are too small to result in a payment by the
insurer. Thus there are fewer non-zero payments than losses. In order to convert the average
amount paid per loss to the average amount paid per non-zero payment, one needs to divide by
S(d).
With Maximum Covered Loss of u and an (ordinary) deductible of d, the average amount
paid by the insurer per non-zero payment to the insured is: (E[X ∧ u] - E[X ∧ d]) / S(d).
If u = ∞, in other words there is no maximum covered loss, then this is e(d).
Coinsurance Factor:
For example, an insurance policy might have an 80% coinsurance factor.
Then the insurer pays 80% of what it would have paid in the absence of the coinsurance factor.
Thus the average payment, either per loss or per non-zero payment, would be multiplied by 80%.
In general, a coinsurance factor of c multiplies the average payment, either per loss or per non-zero
payment, by c.
With Maximum Covered Loss of u, an (ordinary) deductible of d, and a coinsurance factor of c, the
average amount paid by the insurer per loss by the insured is: c (E[X ∧ u] - E[X ∧ d]).
With Maximum Covered Loss of u, an (ordinary) deductible of d, and a coinsurance factor of c, the
average amount paid by the insurer per non-zero payment to the insured is:
c (E[X ∧ u] - E[X ∧ d]) / S(d).
Exercise: Prior to the application of any coverage modifications, losses follow a Pareto Distribution,
as per Loss Models, with parameters α = 3 and θ = 20,000.
An insured has a policy with a $100,000 maximum covered loss, a $5000 deductible, and a 90%
coinsurance factor. Thus the policy limit is: (0.9)(100,000 - 5000) = 85,500.
Determine the average amount per non-zero payment.
[Solution: For the Pareto Distribution, as shown in Appendix A of Loss Models,
S(x) = {θ/(θ + x)}^α.  E[X ∧ x] = {θ/(α - 1)} {1 - (θ/(θ + x))^(α-1)}.
S(5000) = (20/25)³ = 0.512.  E[X ∧ 5000] = 10,000 {1 - (20/25)²} = 3600.
E[X ∧ 100,000] = 10,000 {1 - (20/120)²} = 9722.
(90%) (E[X ∧ 100,000] - E[X ∧ 5000]) / S(5000) = (90%)(9722 - 3600) / 0.512 = $10,761.]
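The same calculation can be checked in a few lines of code. This is a minimal illustrative sketch (the function names are arbitrary), using the Pareto S(x) and E[X ∧ x] formulas quoted above:

alpha, theta = 3.0, 20000.0
d, u, c = 5000.0, 100000.0, 0.90

def pareto_survival(x):
    return (theta / (theta + x)) ** alpha

def pareto_lev(x):
    # E[X ^ x] = {theta/(alpha - 1)} {1 - (theta/(theta + x))^(alpha - 1)}
    return theta / (alpha - 1.0) * (1.0 - (theta / (theta + x)) ** (alpha - 1.0))

per_payment = c * (pareto_lev(u) - pareto_lev(d)) / pareto_survival(d)
print(per_payment)  # about 10,761, matching the solution above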


Average Sizes for the Different Types of Data Sets:

Type of Data                                                 Average Size
Ground-up, Total Limits                                      E[X]
Censored from Above at u                                     E[X ∧ u]
Truncated from Below at d                                    e(d) + d = (E[X] - E[X ∧ d])/S(d) + d
Truncated and Shifted from Below at d                        e(d) = (E[X] - E[X ∧ d])/S(d)
Truncated from Above at L                                    {E[X ∧ L] - L S(L)} / F(L)
Censored from Below at d                                     (E[X] - E[X ∧ d]) + d
Left Censored and Shifted at d                               E[(X - d)+] = E[X] - E[X ∧ d]
Censored from Above at u and
  Truncated from Below at d                                  (E[X ∧ u] - E[X ∧ d])/S(d) + d
Censored from Above at u and
  Truncated and Shifted from Below at d                      (E[X ∧ u] - E[X ∧ d])/S(d)
Truncated from Above at L and
  Truncated from Below at d                                  ({E[X ∧ L] - L S(L)} - {E[X ∧ d] - d S(d)}) / {F(L) - F(d)}
Truncated from Above at L and
  Truncated and Shifted from Below at d                      ({E[X ∧ L] - L S(L)} - {E[X ∧ d] - d S(d)}) / {F(L) - F(d)} - d
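Each row of this table can be computed mechanically from a sample of ground-up losses. Below is an illustrative Python sketch (the helper names are arbitrary) that evaluates several of the rows empirically; with the five losses from the earlier exercises and d = 1000, u = 2000, it reproduces averages such as $500, $833.33, and $567 found above:

losses = [300, 600, 1200, 1500, 2800]
d, u = 1000, 2000

def mean(xs):
    return sum(xs) / len(xs)

ground_up              = mean(losses)                                    # E[X] = 1280
censored_above         = mean([min(x, u) for x in losses])               # E[X ^ u] = 1120
trunc_shift_below      = mean([x - d for x in losses if x > d])          # e(d) = 833.33
trunc_below            = mean([x for x in losses if x > d])              # e(d) + d = 1833.33
left_cens_shifted      = mean([max(x - d, 0) for x in losses])           # E[(X - d)+] = 500
cens_above_trunc_shift = mean([min(x, u) - d for x in losses if x > d])  # 566.67
print(ground_up, censored_above, left_cens_shifted, cens_above_trunc_shift)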


Problems:
Use the following information for the next 5 questions:
There are five losses of size: $2000, $6000, $12,000, $27,000, and $48,000.
18.1 (1 point) With a $5000 deductible, what is the average payment per loss?
A. less than 12,000
B. at least 12,000 but less than 13,000
C. at least 13,000 but less than 14,000
D. at least 14,000 but less than 15,000
E. at least 15,000
18.2 (1 point) With a $5000 deductible, what is the average payment per (non-zero) payment?
A. less than 17,000
B. at least 17,000 but less than 18,000
C. at least 18,000 but less than 19,000
D. at least 19,000 but less than 20,000
E. at least 20,000
18.3 (1 point) With a $25,000 policy limit, what is the average payment?
A. less than 14,000
B. at least 14,000 but less than 15,000
C. at least 15,000 but less than 16,000
D. at least 16,000 but less than 17,000
E. at least 17,000
18.4 (1 point) With a $5000 deductible and a $25,000 maximum covered loss,
what is the average payment per loss?
A. less than 10,000
B. at least 10,000 but less than 11,000
C. at least 11,000 but less than 12,000
D. at least 12,000 but less than 13,000
E. at least 13,000
18.5 (1 point) With a $5000 deductible and a $25,000 maximum covered loss,
what is the average payment per (non-zero) payment?
A. less than 11,000
B. at least 11,000 but less than 12,000
C. at least 12,000 but less than 13,000
D. at least 13,000 but less than 14,000
E. at least 14,000


Use the following information for the next 5 questions:
Loss Size (x)     F(x)       E[X ∧ x]
10,000            0.325      8,418
25,000            0.599      16,198
50,000            0.784      23,544
100,000           0.906      30,668
250,000           0.978      37,507
∞                 1.000      41,982
18.6 (1 point) With a 25,000 deductible, determine E[YL ].
A. less than 22,000
B. at least 22,000 but less than 23,000
C. at least 23,000 but less than 24,000
D. at least 24,000 but less than 25,000
E. at least 25,000
18.7 (1 point) With a 25,000 deductible, determine E[YP].
A. less than 62,000
B. at least 62,000 but less than 63,000
C. at least 63,000 but less than 64,000
D. at least 64,000 but less than 65,000
E. at least 65,000
18.8 (1 point) With a 100,000 policy limit, what is the average payment?
A. less than 27,000
B. at least 27,000 but less than 28,000
C. at least 28,000 but less than 29,000
D. at least 29,000 but less than 30,000
E. at least 30,000
18.9 (1 point) With a 25,000 deductible and a 100,000 maximum covered loss,
what is the average payment per loss?
A. less than 15,000
B. at least 15,000 but less than 16,000
C. at least 16,000 but less than 17,000
D. at least 17,000 but less than 18,000
E. at least 18,000
18.10 (1 point) With a 25,000 deductible and a 100,000 maximum covered loss, determine E[YP].
A. 32,000
B. 33,000
C. 34,000
D. 35,000
E. 36,000


Use the following information for the next five questions:


There are six accidents of size: $800, $2100, $3300, $4600, $6100, and $8900.
18.11 (1 point) If the reported data is truncated from below at $1000,
what is the average size of claim in the reported data?
A. 4800
B. 4900
C. 5000
D. 5100
E. 5200
18.12 (1 point) If the reported data is truncated and shifted from below at $1000,
what is the average size of claim in the reported data?
A. less than 3900
B. at least 3900 but less than 4000
C. at least 4000 but less than 4100
D. at least 4100 but less than 4200
E. at least 4200
18.13 (1 point) If the reported data is left censored and shifted at $1000,
what is the average size of claim in the reported data?
A. less than 3700
B. at least 3700 but less than 3800
C. at least 3800 but less than 3900
D. at least 3900 but less than 4000
E. at least 4000
18.14 (1 point) If the reported data is censored from above at $5000,
what is the average size of claim in the reported data?
A. less than 3400
B. at least 3400 but less than 3500
C. at least 3500 but less than 3600
D. at least 3600 but less than 3700
E. at least 3700.
18.15 (1 point) If the reported data is truncated from above at $5000,
what is the average size of claim in the reported data?
A. less than 2300
B. at least 2300 but less than 2400
C. at least 2400 but less than 2500
D. at least 2500 but less than 2600
E. at least 2600


Use the following information for the next four questions:


Losses follow a uniform distribution from 0 to 20,000.
18.16 (2 points) If there is a deductible of 1000,
what is the average payment by the insurer per loss?
A. less than 8800
B. at least 8800 but less than 8900
C. at least 8900 but less than 9000
D. at least 9000 but less than 9100
E. at least 9100
18.17 (2 points) If there is a policy limit of 15,000,
what is the average payment by the insurer per loss?
A. less than 9300
B. at least 9300 but less than 9400
C. at least 9400 but less than 9500
D. at least 9500 but less than 9600
E. at least 9600
18.18 (2 points) There is a maximum covered loss of 15,000 and a deductible of 1000.
What is the average payment by the insurer per loss?
(Include situations where the insurer pays nothing.)
A. less than 8200
B. at least 8200 but less than 8300
C. at least 8300 but less than 8400
D. at least 8400 but less than 8500
E. at least 8500
18.19 (2 points) There is a maximum covered loss of 15,000 and a deductible of 1000.
What is the average value of a non-zero payment by the insurer?
A. less than 8700
B. at least 8700 but less than 8800
C. at least 8800 but less than 8900
D. at least 8900 but less than 9000
E. at least 9000


18.20 (2 points) An insurance policy has a maximum covered loss of 2000 and a deductible of 100.
For the ground up unlimited losses: F(100) = 0.20, F(2000) = 0.97, and ∫[100, 2000] x f(x) dx = 400.
What is the average payment per loss?
A. 360     B. 380     C. 400     D. 420     E. 440

18.21 (1 point) A policy has a policy limit of 50,000 and deductible of 1000.
What is the expected payment per loss?
A. E[X ∧ 49,000] - E[X ∧ 1000]
B. E[X ∧ 50,000] - E[X ∧ 1000]
C. E[X ∧ 51,000] - E[X ∧ 1000]
D. E[X ∧ 49,000] - E[X ∧ 1000] + 1000
E. E[X ∧ 51,000] - 1000
18.22 (2 points) You are given:
In the absence of a deductible the average loss is 15,900.

With a 10,000 deductible, the average amount paid per loss is 7,800.
With a 10,000 deductible, the average amount paid per nonzero payment is 13,300.
What is the average of those losses of size less than 10,000?
(A) 5000
(B) 5200
(C) 5400
(D) 5600
(E) 5800
18.23 (1 point) E[(X - 1000)+] = 3500. E[(X - 25,000)+] = 500.
There is a 1000 deductible and a 25,000 maximum covered loss.
Determine the average payment per loss.


Use the following information for the next two questions:


Flushing Reinsurance reinsures a certain book of business.
Limited Expected Values for this book of business are estimated to be:
E[X ∧ $1 million] = $300,000
E[X ∧ $4 million] = $375,000
E[X ∧ $5 million] = $390,000
E[X ∧ $9 million] = $420,000
E[X ∧ $10 million] = $425,000
The survival functions, S(x) = 1 - F(x), for this book of business are estimated to be:
S($1 million) = 3.50%
S($4 million) = 1.70%
S($5 million) = 1.30%
S($9 million) = 0.55%
S($10 million) = 0.45%
Flushing Reinsurance makes a nonzero payment, y, on this book of business.
18.24 (1 point) If Flushing Reinsurance were responsible for the layer of loss from $1 million to
$5 million ($4 million excess of $1 million), what is the expected value of y?
A. less than $1 million
B. at least $1 million but less than $2 million
C. at least $2 million but less than $3 million
D. at least $3 million but less than $4 million
E. at least $4 million
18.25 (1 point) If Flushing Reinsurance were responsible for the layer of loss from $1 million to
$10 million ($9 million excess of $1 million), what is the expected value of y?
A. less than $1 million
B. at least $1 million but less than $2 million
C. at least $2 million but less than $3 million
D. at least $3 million but less than $4 million
E. at least $4 million

18.26 (3 points) Losses are distributed uniformly from 0 to θ.
There is a deductible of size d < θ.
Determine the variance of the payment per loss.


Use the following information for the next 12 questions:


The distribution of losses suffered by insureds is estimated to have the following
limited expected values:
E[X ∧ 5,000] = 3,600
E[X ∧ 20,000] = 7,500
E[X ∧ 25,000] = 8,025
E[X ∧ ∞] = 10,000
The survival functions, S(x), for the distribution of losses suffered by insureds
is estimated to have the following values:
S(5,000) = 51.2%
S(20,000) = 12.5%
S(25,000) = 8.8%
18.27 (1 point) What is average loss suffered by the insureds?
A. 9,600
B. 9,700
C. 9,800
D. 9,900
E. 10,000
18.28 (1 point) What is the average size of data truncated from above at 25,000?
A. less than 6,300
B. at least 6,300 but less than 6,400
C. at least 6,400 but less than 6,500
D. at least 6,500 but less than 6,600
E. at least 6,600
18.29 (1 point) What is the average size of data truncated and shifted from below at 5000?
A. 12,500

B. 12,600

C. 12,700

D. 12,800

E. 12,900

18.30 (1 point) What is the average size of data censored from above at 25,000?
A. 7800

B. 7900

C. 8000

D. 8100

E. 8200

18.31 (1 point) What is the average size of data censored from below at 5,000?
A. less than 10,700
B. at least 10,700 but less than 10,800
C. at least 10,800 but less than 10,900
D. at least 10,900 but less than 11,000
E. at least 11,000
18.32 (1 point) What is the average size of data left censored and shifted at 5,000?
A. 6200
B. 6300
C. 6400
D. 6500
E. 6600


18.33 (2 points) What is the average size of data truncated from below at 5,000 and truncated from
above at 25,000?
A. less than 11,200
B. at least 11,200 but less than 11,300
C. at least 11,300 but less than 11,400
D. at least 11,400 but less than 11,500
E. at least 11,500.
18.34 (1 point) What is the average size of data truncated from below at 5,000 and censored from
above at 25,000?
A. less than 13,500
B. at least 13,500 but less than 13,600
C. at least 13,600 but less than 13,700
D. at least 13,700 but less than 13,800
E. at least 13,800
18.35 (2 points) What is the average size of data censored from below at 5,000 and censored from
above at 25,000?
A. 9100
B. 9200
C. 9300
D. 9400
E. 9500
18.36 (2 points) What is the average size of data truncated and shifted from below at 5,000 and
truncated from above at 25,000?
A. less than 6,100
B. at least 6,100 but less than 6,200
C. at least 6,200 but less than 6,300
D. at least 6,300 but less than 6,400
E. at least 6,400
18.37 (2 points) What is the average size of data truncated and shifted from below at 5,000 and
censored from above at 25,000?
A. less than 8,700
B. at least 8,700 but less than 8,800
C. at least 8,800 but less than 8,900
D. at least 8,900 but less than 9,000
E. at least 9,000
18.38 (1 point) What is the average size of data truncated from below at 5000?
A. 17,000
B. 17,500
C. 18,000
D. 18,500
E. 19,000


18.39 (2 points) The size of loss distribution has the following characteristics:
(i) S(100) = 0.65.
(ii) E[X | X > 100] = 345.
There is an ordinary deductible of 100 per loss.
Determine the average payment per loss.
(A) 160
(B) 165
(C) 170
(D) 175
(E) 180
18.40 (3 points) A business has obtained two separate insurance policies that together provide full
coverage. You are given:
(i) The average ground-up loss is 27,000.
(ii) Policy B has no deductible and a maximum covered loss of 25,000.
(iii) Policy A has an ordinary deductible of 25,000 with no maximum covered loss.
(iv) Under policy A, the expected amount paid per loss is 10,000.
(v) Under policy A, the expected amount paid per payment is 22,000.
Given that a loss less than or equal to 25,000 has occurred, what is the expected payment under
policy B?
A. Less than 11,000
B. At least 11,000, but less than 12,000
C. At least 12,000, but less than 13,000
D. At least 13,000, but less than 14,000
E. At least 14,000
18.41 (2 points) X is the size of loss prior to the effects of any policy provisions.
Given the following information, calculate the average payment per loss under a policy with a 1000
deductible and a 25,000 maximum covered loss.
x           e(x)         F(x)
1000        30,000       72.7%
25,000      980,000      99.7%
A. 4250
B. 4500
C. 4750
D. 5000
E. 5250
18.42 (1 point) For a certain policy, in order to determine the payment on a claim, first the deductible
of 500 is applied, and then the payment is capped at 10,000.
What is the expected payment per loss?
A. E[X ∧ 10,000] - E[X ∧ 500]
B. E[X ∧ 10,500] - E[X ∧ 500]
C. E[X ∧ 10,000] - E[X ∧ 500] + 500
D. E[X ∧ 10,500] - E[X ∧ 500] + 500
E. None of A, B, C, or D


18.43 (4B, 5/92, Q.20) (1 point) Accidents for a coverage are uniformly distributed on the interval
0 to $5,000. An insurer sells a policy for the coverage which has a $500 deductible.
Determine the insurer's expected payment per loss.
A. $1,575
B. $2,000
C. $2,025
D. $2,475
E. $2,500
18.44 (4B, 5/95, Q.22) (2 points) You are given the following:
Losses follow a Pareto distribution, with parameters θ = 1000 and α = 2.
10 losses are expected each year.
The number of losses and the individual loss amounts are independent.
For each loss that occurs, the insurer's payment is equal to the entire amount of the loss
if the loss is greater than 100.
The insurer makes no payment if the loss is less than or equal to 100.
Determine the insurer's expected annual payments.
A. Less than 8,000
B. At least 8,000, but less than 9,000
C. At least 9,000, but less than 9,500
D. At least 9,500, but less than 9,900
E. At least 9,900
18.45 (4B, 11/95, Q.13 & 4B, 5/98 Q.9) (3 points) You are given the following:
Losses follow a uniform distribution on the interval from 0 to 50,000.

There is a maximum covered loss of 25,000 per loss and a deductible of 5,000 per loss.
The insurer applies the maximum covered loss prior to applying the deductible
(i.e., the insurer's maximum payment is 20,000 per loss).
The insurer makes a nonzero payment p.
Determine the expected value of p.
A. Less than 15,000
B. At least 15,000, but less than 17,000
C. At least 17,000, but less than 19,000
D. At least 19,000, but less than 21,000
E. At least 21,000


18.46 (CAS9, 11/97, Q.40a) (2.5 points) You are the large accounts actuary for Pacific International
Group, and you have a risk with a $1 million limit.
The facultative underwriters from AnyRe have indicated that they are willing to reinsure
the following layers:
from $100,000 to $200,000 ($100,000 excess of $100,000)
from $200,000 to $500,000 ($300,000 excess of $200,000)
from $500,000 to $1 million ($500,000 excess of $500,000).
You have gathered the following information:
Limit          E[X ∧ x]       F(x)
100,000        58,175         0.603
200,000        89,629         0.748
500,000        139,699        0.885
1,000,000      179,602        0.943
Expected frequency = 100 claims.
Calculate the frequency, severity, and expected losses for each of the facultative layers.
Show all work.
18.47 (4B, 11/98, Q.12) (2 points) You are given the following:
Losses follow a distribution (prior to the application of any deductible) with
cumulative distribution function and limited expected values as follows:
Loss Size (x)     F(x)      E[X ∧ x]
10,000            0.60      6,000
15,000            0.70      7,700
22,500            0.80      9,500
∞                 1.00      20,000
There is a deductible of 15,000 per loss and no maximum covered loss.
The insurer makes a nonzero payment p.
Determine the expected value of p.
A. Less than 15,000
B. At least 15,000, but less than 30,000
C. At least 30,000, but less than 45,000
D. At least 45,000, but less than 60,000
E. At least 60,000


18.48 (4B, 5/99, Q.7) (2 points) You are given the following:
Losses follow a distribution (prior to the application of any deductible) with
cumulative distribution function and limited expected values as follows:
Loss Size (x)     F(x)      E[X ∧ x]
10,000            0.60      6,000
15,000            0.70      7,700
22,500            0.80      9,500
32,500            0.90      11,000
∞                 1.00      20,000
There is a deductible of 10,000 per loss and no maximum covered loss.
The insurer makes a payment on a loss only if the loss exceeds the deductible.
The deductible is raised so that half the number of losses exceed the new deductible compared to
the old deductible of 10,000.
Determine the percentage change in the expected size of a nonzero payment made by the insurer.
A. Less than -37.5%
B. At least -37.5%, but less than -12.5%
C. At least -12.5%, but less than 12.5%
D. At least 12.5%, but less than 37.5%
E. At least 37.5%
18.49 (Course 3 Sample Exam, Q.5) You are given the following:
The probability density function for the amount of a single loss is
f(x) = 0.01(1 - q + 0.01qx)e^(-0.01x), x > 0.
If an ordinary deductible of 100 is imposed, the expected payment
(given that a payment is made) is 125.
Determine the expected payment (given that a payment is made) if the deductible is increased to
200.


18.50 (4, 5/00, Q.6) (2.5 points) A jewelry store has obtained two separate insurance policies that
together provide full coverage. You are given:
(i) The average ground-up loss is 11,100.
(ii) Policy A has an ordinary deductible of 5,000 with no maximum covered loss
(iii) Under policy A, the expected amount paid per loss is 6,500.
(iv) Under policy A, the expected amount paid per payment is 10,000.
(v) Policy B has no deductible and a maximum covered loss of 5,000.
Given that a loss less than or equal to 5,000 has occurred, what is the expected payment under
policy B?
(A) Less than 2,500
(B) At least 2,500, but less than 3,000
(C) At least 3,000, but less than 3,500
(D) At least 3,500, but less than 4,000
(E) At least 4,000
18.51 (4, 11/00, Q.18) (2.5 points) A jewelry store has obtained two separate insurance policies
that together provide full coverage.
You are given:
(i) The average ground-up loss is 11,100.
(ii) Policy A has an ordinary deductible of 5,000 with no maximum covered loss.
(iii) Under policy A, the expected amount paid per loss is 6,500.
(iv) Under policy A, the expected amount paid per payment is 10,000.
(v) Policy B has no deductible and a maximum covered loss of 5,000.
Given that a loss has occurred, determine the probability that the payment under policy B is 5,000.
(A) Less than 0.3
(B) At least 0.3, but less than 0.4
(C) At least 0.4, but less than 0.5
(D) At least 0.5, but less than 0.6
(E) At least 0.6


18.52 (CAS3, 11/03, Q.22) (2.5 points) The severity distribution function of claims data for
automobile property damage coverage for Le Behemoth Insurance Company is given by an
exponential distribution, F(x).
F(x) = 1 - exp(-x/5000).
To improve the profitability of this portfolio of policies, Le Behemoth institutes the following policy
modifications:
i) It imposes a per-claim deductible of 500.
ii) It imposes a per-claim limit of 25,000.
(The maximum paid per claim is 25,000 - 500 = 24,500.)
Previously, there was no deductible and no limit.
Calculate the average savings per (old) claim if the new deductible and policy limit had been in
place.
A. 490
B. 500
C. 510
D. 520
E. 530
18.53 (SOA M, 11/05, Q.26 & 2009 Sample Q.207) (2.5 points) For an insurance:
(i) Losses have density function fX(x) = 0.02x for 0 < x < 10, and 0 elsewhere.
(ii) The insurance has an ordinary deductible of 4 per loss.
(iii) YP is the claim payment per payment random variable.
Calculate E[YP].
(A) 2.9     (B) 3.0     (C) 3.2     (D) 3.3     (E) 3.4

18.54 (SOA M, 11/06, Q.6 & 2009 Sample Q.279) (2.5 points)
Loss amounts have the distribution function
F(x) = (x/100)² for 0 ≤ x ≤ 100, and F(x) = 1 for 100 < x.
An insurance pays 80% of the amount of the loss in excess of an ordinary deductible of 20,
subject to a maximum payment of 60 per loss.
Calculate the conditional expected claim payment, given that a payment has been made.
(A) 37
(B) 39
(C) 43
(D) 47
(E) 49


18.55 (CAS5, 5/07, Q.47) (2.0 points) You are given the following information:
Claim       Ground-up Uncensored Loss Amount
A           $250,000
B           $300,000
C           $450,000
D           $750,000
E           $1,200,000
F           $2,500,000
G           $4,000,000
H           $7,500,000
I           $9,000,000
J           $15,000,000

a. (1.25 points) Calculate the ratio of the limited expected value at $5 million to the limited
expected value at $1 million
b. (0.75 points) Calculate the average payment per payment with a deductible of $1 million and a
maximum covered loss of $5 million.
Comment: I have reworded this exam question in order to match the syllabus of your exam.


Solutions to Problems:
18.1. D. The payments are: 0, 1000, 7000, 22000 and 43000.
Average payment per loss is: (0 + 1000 + 7000 + 22,000 + 43,000)/5 = 14,600.
18.2. C. Average payment per payment is: (1000 + 7000 + 22,000 + 43,000)/4 = 18,250.
18.3. B. The payments are: $2000, $6000, $12,000, $25,000, and $25,000.
Average payment is: (2000 + 6000 + 12000 + 25000 + 25000)/5 = 14,000.
18.4. A. The payments are: 0, 1000, 7000, 20000 and 20000.
Average payment per loss is: (0 + 1000 + 7000 + 20000 + 20000)/5 = 9,600.
Alternately, E[X ∧ 5000] = (2000 + 5000 + 5000 + 5000 + 5000)/5 = 4400.
E[X ∧ 25000] = (2000 + 6000 + 12,000 + 25,000 + 25,000)/5 = 14,000.
E[X ∧ 25000] - E[X ∧ 5000] = 14,000 - 4400 = 9,600.
Comment: The layer from 5000 to 25,000.
18.5. C. Average payment per payment is: (1000 + 7000 + 20000 + 20000)/4 = 12,000.
Alternately, (E[X ∧ 25000] - E[X ∧ 5000])/S(5000) = (14,000 - 4400)/0.8 = 12,000.
18.6. E. E[X] - E[X ∧ 25,000] = 41,982 - 16,198 = 25,784.
Comment: Based on a LogNormal distribution with μ = 9.8 and σ = 1.3.
18.7. D. Average payment per payment is:
(E[X] - E[X ∧ 25,000]) / S(25,000) = (41,982 - 16,198) / (1 - 0.599) = 64,299.
18.8. E. E[X ∧ 100,000] = 30,668.
18.9. A. E[X ∧ 100,000] - E[X ∧ 25,000] = 30,668 - 16,198 = 14,470.
Comment: The layer from 25,000 to 100,000.
18.10. E. (E[X ∧ 100,000] - E[X ∧ 25,000])/S(25,000) = (30,668 - 16,198)/(1 - 0.599) = 36,085.
18.11. C. ($2100 + $3300 + $4600 + $6100 + $8900) / 5 = 5000.
18.12. C. ($1100 + $2300 + $3600 + $5100 + $7900) / 5 = 4000.
Alternately, one can subtract 1000 from the solution to the previous question.
18.13. A. ($0 + $1100 + $2300 + $3600 + $5100 + $7900) / 6 = 3333.


18.14. B. ($800 + $2100 + $3300 + $4600 + $5000 + $5000) / 6 = 3467.


18.15. E. ($800 + $2100 + $3300 + $4600 ) / 4 = 2700.
18.16. D. For this uniform distribution, f(x) = 1/20,000 for 0 ≤ x ≤ 20,000.
The payment by the insurer depends as follows on the size of loss x:
Size of Loss x            Insurer's Payment
x ≤ 1000                  0
1000 ≤ x ≤ 20,000         x - 1000
We need to compute the average dollars paid by the insurer per loss:
∫[1000, 20,000] (x - 1000) f(x) dx = ∫[1000, 20,000] {(x - 1000)/20,000} dx =
(0.5)(x - 1000)²/20,000, evaluated from 1000 to 20,000, = 9025.

18.17. B. f(x) = 1/20,000 for 0 ≤ x ≤ 20,000.
Size of Loss x            Insurer's Payment
x ≤ 15,000                x
15,000 ≤ x ≤ 20,000       15,000
We need to compute the average dollars paid by the insurer per loss, the sum of two terms
corresponding to 0 ≤ x ≤ 15,000 and 15,000 ≤ x ≤ 20,000:
∫[0, 15,000] x f(x) dx + 15,000 {1 - F(15,000)} = ∫[0, 15,000] (x/20,000) dx + (15,000)(1 - 0.75) =
x²/40,000, evaluated from 0 to 15,000, + 3750 = 5625 + 3750 = 9375.


18.18. D. For this uniform distribution, f(x) = 1/20,000 for 0 ≤ x ≤ 20,000. The payment by the insurer
depends as follows on the size of loss x:
Size of Loss x            Insurer's Payment
x ≤ 1000                  0
1000 ≤ x ≤ 15,000         x - 1000
x ≥ 15,000                14,000
We need to compute the average dollars paid by the insurer per loss, as the sum of two terms
corresponding to 1000 ≤ x ≤ 15,000 and 15,000 ≤ x ≤ 20,000:
∫[1000, 15,000] (x - 1000) f(x) dx + 14,000 {1 - F(15,000)} = ∫[1000, 15,000] {(x - 1000)/20,000} dx + 14,000 (1 - 0.75) =
(x - 1000)²/40,000, evaluated from 1000 to 15,000, + 3500 = 4900 + 3500 = 8400.
18.19. C. We need to compute the ratio of two quantities, the average dollars paid by the insurer
per loss and the probability that a loss will result in a non-zero payment. The latter is the chance that
x > 1000, which is for the uniform distribution: 1 - (1000/20,000) = 0.95. The former is the solution to the
previous question: 8400. Therefore, the average non-zero payment is: 8400 / 0.95 = 8842.
Comment: Similar to 4B, 11/95, Q.13.
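These uniform distribution results are also easy to confirm by simulation. A minimal illustrative sketch (the seed and sample size are arbitrary):

import random

random.seed(0)
n = 1_000_000
losses = [random.uniform(0.0, 20000.0) for _ in range(n)]

payments = [min(x, 15000.0) - 1000.0 if x > 1000.0 else 0.0 for x in losses]
per_loss = sum(payments) / n                         # about 8400, as in 18.18
nonzero = [p for p in payments if p > 0.0]
per_payment = sum(nonzero) / len(nonzero)            # about 8842, as in 18.19
print(per_loss, per_payment)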
18.20. B. Average payment per loss = E[X ∧ u] - E[X ∧ d] = E[X ∧ 2000] - E[X ∧ 100] =
{∫[0, 2000] x f(x) dx + 2000 S(2000)} - {∫[0, 100] x f(x) dx + 100 S(100)} =
∫[100, 2000] x f(x) dx + 2000 S(2000) - 100 S(100) = 400 + (2000)(1 - 0.97) - (100)(1 - 0.20) = 380.
Comment: Can be done via a Lee Diagram.


18.21. C. Policy limit = maximum covered loss - deductible.
Thus the maximum covered loss = 51,000.
Expected payment per loss = E[X ∧ u] - E[X ∧ d] = E[X ∧ 51,000] - E[X ∧ 1000].


18.22. C. E[X] = 15,900.
With a 10,000 deductible, the average amount paid per loss = E[X] - E[X ∧ 10,000] = 7800.
Therefore, E[X ∧ 10,000] = 15,900 - 7800 = 8,100.
With a 10,000 deductible, the average amount paid per nonzero payment =
(E[X] - E[X ∧ 10,000])/S(10,000) = 13,300.
Therefore, S(10,000) = 7800/13,300 = 0.5865.
Average loss of size between 0 and 10,000 =
∫[0, 10,000] x f(x) dx / F(10,000) = {E[X ∧ 10,000] - 10,000 S(10,000)}/F(10,000) =
{8100 - (10,000)(0.5865)}/(1 - 0.5865) = 2235/0.4135 = 5405.


18.23. E[(X - 1000)+] - E[(X - 25,000)+] = {E[X] - E[X ∧ 1000]} - {E[X] - E[X ∧ 25,000]} =
E[X ∧ 25,000] - E[X ∧ 1000] = average payment per loss.
Thus, the average payment per loss is: 3500 - 500 = 3000.
18.24. C. The reinsurer pays the dollars in the layer of loss from $1 million to
$5 million, which are: E[X ∧ $5 million] - E[X ∧ $1 million]. The number of nonzero payments is
1 - F($1 million) = S($1 million). Thus the average nonzero payment is:
(E[X ∧ $5 million] - E[X ∧ $1 million]) / S($1 million) = (390,000 - 300,000)/0.035 = 2,571,429.
18.25. D. The reinsurer pays the dollars in the layer of loss from $1 million to
$10 million, which are: E[X ∧ $10 million] - E[X ∧ $1 million]. The number of nonzero payments is
1 - F($1 million) = S($1 million). Thus the average nonzero payment is:
(E[X ∧ $10 million] - E[X ∧ $1 million]) / S($1 million) = (425,000 - 300,000)/0.035 = 3,571,429.
18.26. The payment per loss of size x is: 0 for x ≤ d, and x - d for x > d.
Mean payment per loss: ∫[d, θ] (x - d) (1/θ) dx = (θ - d)² / (2θ).
Second moment of the payment per loss: ∫[d, θ] (x - d)² (1/θ) dx = (θ - d)³ / (3θ).
Thus, the variance of the payment per loss is:
(θ - d)³ / (3θ) - (θ - d)⁴ / (2θ)² = (θ - d)³ (θ + 3d) / (12θ²).
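As a sanity check on this algebra, one can simulate. The following Python sketch is illustrative only; the values θ = 10,000 and d = 2000 are arbitrary. It compares a simulated variance of the per-loss payment with the closed form (θ - d)³(θ + 3d)/(12θ²):

import random

theta, d, n = 10000.0, 2000.0, 1_000_000
random.seed(1)

payments = [max(random.uniform(0.0, theta) - d, 0.0) for _ in range(n)]
m = sum(payments) / n
sim_var = sum((p - m) ** 2 for p in payments) / n

closed_form = (theta - d) ** 3 * (theta + 3.0 * d) / (12.0 * theta ** 2)
print(sim_var, closed_form)  # both near 6,826,667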
18.27. E. E[X] = E[X ∧ ∞] = 10,000.

18.28. B. {E[X ∧ 25,000] - 25,000 S(25,000)} / F(25,000) =
{8025 - (25,000)(0.088)} / (1 - 0.088) = 6387.
18.29. A. e(5000) = {E[X] - E[X ∧ 5000]}/S(5000) = (10,000 - 3600)/0.512 = 12,500.
18.30. C. E[X ∧ 25,000] = 8025.
18.31. E. The small losses are each recorded at 5000. Subtracting 5000 from every recorded
loss, we would get the layer from 5000 to ∞. Thus the average loss is:
(E[X] - E[X ∧ 5000]) + 5000 = (10,000 - 3600) + 5000 = 11,400.
18.32. C. E[(X - 5000)+] = E[X] - E[X ∧ 5000] = 10,000 - 3600 = 6,400.
18.33. B. This is the average size of losses in the interval from 5000 to 25,000:
({E[X ∧ 25,000] - 25,000 S(25,000)} - {E[X ∧ 5000] - 5000 S(5000)}) / {F(25,000) - F(5000)} =
{(8025 - (25,000)(0.088)) - (3600 - (5000)(0.512))}/(0.512 - 0.088) = (5825 - 1040)/0.424 = 11,285.
18.34. C. {E[X ∧ 25,000] - E[X ∧ 5000]}/S(5000) + 5000 = (8025 - 3600)/0.512 + 5000 = 13,643.
18.35. D. The losses are 5000 per loss plus the layer of losses from 5000 to 25,000. Thus the
average loss is: (E[X ∧ 25,000] - E[X ∧ 5000]) + 5000 = (8025 - 3600) + 5000 = 9425.
Alternately, the average size of loss is reduced compared to data just censored from below,
by E[X] - E[X ∧ 25,000] = 10,000 - 8025 = 1975. Since from a previous solution, the average size of
data censored from below at 5,000 is 11,400, the solution to this question is: 11,400 - 1975 = 9425.
18.36. C. Using a previous solution, where there was no shifting, this is 5000 less than the average
size of data truncated from below at 5,000 and truncated from above at 25,000: 11,285 - 5000 = 6,285.
Alternately, the dollars of loss are those for the layer from 5,000 to 25,000, less the width of the
layer times the number of losses greater than 25,000:
{E[X ∧ 25,000] - E[X ∧ 5000]} - (20,000) S(25,000). The number of losses in the database is:
F(25,000) - F(5000) = S(5000) - S(25,000). Thus the average size is:
{(8025 - 3600) - (20,000)(0.088)} / (0.512 - 0.088) = 6,285.
18.37. A. {E[X ∧ 25,000] - E[X ∧ 5000]}/S(5000) = (8025 - 3600)/0.512 = 8,643.
18.38. B. e(5000) + 5000 = {E[X] - E[X ∧ 5000]}/S(5000) + 5000 =
(10,000 - 3600)/0.512 + 5000 = 17,500.

18.39. A. E[X | X > 100] = ∫[100, ∞) x f(x) dx / S(100).
⇒ ∫[100, ∞) x f(x) dx = S(100) E[X | X > 100] = (0.65)(345) = 224.25.
With a deductible of 100 per loss, the average payment per loss is:
∫[100, ∞) (x - 100) f(x) dx = ∫[100, ∞) x f(x) dx - 100 ∫[100, ∞) f(x) dx = 224.25 - (100)(0.65) = 159.25.
Alternately, average payment per loss = S(100) (average payment per payment) =
S(100) E[X - 100 | X > 100] = S(100) {E[X | X > 100] - 100} = (0.65)(345 - 100) = 159.25.
18.40. A. Average ground-up loss = E[X] = 27,000.
Under policy A, average amount paid per loss = E[X] - E[X ∧ 25,000] = 10,000.
Therefore, E[X ∧ 25,000] = 27,000 - 10,000 = 17,000.
Under policy A, average amount paid per payment = (E[X] - E[X ∧ 25,000])/S(25,000) = 22,000.
Therefore, S(25,000) = 10,000/22,000 = 0.4545.
Given that a loss less than or equal to 25,000 has occurred, the expected payment under policy B =
average loss of size between 0 and 25,000 =
∫[0, 25,000] x f(x) dx / F(25,000) = {E[X ∧ 25,000] - 25,000 S(25,000)}/F(25,000) =
{17,000 - (25,000)(0.4545)} / (1 - 0.4545) = 10,335.
Comment: Similar to 4, 5/00, Q.6.
18.41. E. E[(X - 1000)+] = e(1000) S(1000) = (30,000)(1 - 0.727) = 8190.
E[(X - 25,000)+] = e(25,000) S(25,000) = (980,000)(1 - 0.997) = 2940.
The average payment per loss is:
E[X ∧ 25,000] - E[X ∧ 1000] = E[(X - 1000)+] - E[(X - 25,000)+] = 8190 - 2940 = 5250.


18.42. B. Let X be the size of loss.
If for example x = 11,000, then the payment is: Min[10,500, 10,000] = 10,000.
If for example x = 10,200, then the payment is: Min[9700, 10,000] = 9700.
If for example x = 7000, then the payment is: Min[6500, 10,000] = 6500.
If for example x = 300, then the payment is: Min[0, 10,000] = 0.
Payment = 0 for x ≤ 500; x - 500 for 500 < x < 10,500; 10,000 for x ≥ 10,500.
Thus the average payment per loss is:
∫[500, 10,500] (x - 500) f(x) dx + 10,000 S(10,500) =
∫[500, 10,500] x f(x) dx - 500 ∫[500, 10,500] f(x) dx + 10,000 S(10,500) =
E[X ∧ 10,500] - 10,500 S(10,500) - {E[X ∧ 500] - 500 S(500)} - 500 {F(10,500) - F(500)} + 10,000 S(10,500) =
E[X ∧ 10,500] - 500 S(10,500) - E[X ∧ 500] + 500 S(500) - 500 F(10,500) + 500 F(500) =
E[X ∧ 10,500] - E[X ∧ 500] + 500 {F(500) + S(500) - F(10,500) - S(10,500)} =
E[X ∧ 10,500] - E[X ∧ 500] + (500)(1 - 1) = E[X ∧ 10,500] - E[X ∧ 500].
Alternately, 10,000 = maximum payment = policy limit = maximum covered loss - deductible.
Thus the maximum covered loss = 10,500.
Expected payment per loss = E[X ∧ u] - E[X ∧ d] = E[X ∧ 10,500] - E[X ∧ 500].
Comment: As mentioned in the section on Policy Provisions, the default on the exam is to apply the
maximum covered loss first and then apply the deductible.
What is done in this question is mathematically the same as first applying a maximum covered loss
of 10,500 and then applying a deductible of 500.
18.43. C. For an accident that does not exceed $500 the insurer pays nothing.
For an accident of size x > 500, the insurer pays x - 500.
The density function for x is f(x) = 1/5000 for 0 ≤ x ≤ 5000.
Thus the insurer's average payment per accident is:
∫[500, 5000] (x - 500) f(x) dx = ∫[500, 5000] (x - 500) (1/5000) dx = (x - 500)²/10,000, evaluated from 500 to 5000, = 2025.


18.44. E. Expected amount paid per loss = ∫[100, ∞) x f(x) dx = ∫[0, ∞) x f(x) dx - ∫[0, 100] x f(x) dx =
Mean - {E[X ∧ 100] - 100 S(100)}.
S(100) = {θ/(θ + 100)}^α = (1000/1100)² = 0.8264.
E[X ∧ 100] = {θ/(α - 1)} {1 - (θ/(θ + 100))^(α-1)} = {1000/(2 - 1)} {1 - (1000/1100)^(2-1)} = 90.90.
Mean = θ/(α - 1) = 1000.
Therefore, expected amount paid per loss is: 1000 - {90.90 - 82.64} = 991.74.
Expect 10 losses per year, so the average cost per year is: (10)(991.7) = $9917.
Alternately, the expected cost per year of 10 losses is:
10 ∫[100, ∞) x f(x) dx = (10)(2)(1000²) ∫[100, ∞) x (1000 + x)⁻³ dx =
10⁷ {-x (1000 + x)⁻²}, evaluated from 100 to ∞, + 10⁷ ∫[100, ∞) (1000 + x)⁻² dx = 10⁷ {100/1100² + 1/1100} = 9917.
Alternately, the average severity per loss > $100 is:
100 + e(100) = 100 + (θ + 100)/(α - 1) = 100 + 1100 = $1200.
Expected number of losses > $100 = 10 S(100) = 8.2645.
Expected annual payment = ($1200)(8.2645) = $9917.
Comment: Almost all questions involve the ordinary deductible, in which for a loss X larger than d,
X - d is paid. For these situations the average payment per loss is: E[X] - E[X ∧ d].
Instead, here for a large loss the whole amount is paid. This is a franchise deductible, as discussed in
the section on Policy Provisions. In this case, the average payment per loss is d S(d) more than for
the ordinary deductible, or: E[X] - E[X ∧ d] + d S(d).
One can compute the expected total amount paid per year by an insurer either as (average
payment insured receives per loss)(expected losses the insured has per year) or as (average
payment insurer makes per non-zero payment)(expected non-zero payments the insurer makes
per year). The former is ($991.7)(10) = $9917; the latter is ($1200)(8.2645) = $9917. Thus
whether one looks at it from the point of view of the insurer or the insured, one gets the same result.


18.45. B. For this uniform distribution, f(x) = 1/50,000 for 0 ≤ x ≤ 50,000. The payment by the insurer
depends as follows on the size of loss x:
Size of Loss x            Insurer's Payment
x ≤ 5000                  0
5000 ≤ x ≤ 25,000         x - 5000
x ≥ 25,000                20,000
We need to compute the ratio of two quantities, the average dollars paid by the insurer per loss and
the probability that a loss will result in a nonzero payment. The latter is the chance that x > 5000,
which is: 1 - (5000/50,000) = 0.9. The former is the sum of two terms corresponding to
5000 ≤ x ≤ 25,000 and x > 25,000:
∫[5000, 25,000] (x - 5000) f(x) dx + 20,000 {1 - F(25,000)} = ∫[5000, 25,000] {(x - 5000)/50,000} dx + 20,000 (1 - 0.5) =
(x - 5000)²/100,000, evaluated from 5000 to 25,000, + 10,000 = 4000 + 10,000 = 14,000.
Thus the average nonzero payment by the insurer is: 14,000 / 0.9 = 15,556.
Alternately, S(x) = 1 - x/50,000, x < 50,000.
The average payment per (nonzero) payment is:
(E[X ∧ L] - E[X ∧ d])/S(d) = (E[X ∧ 25,000] - E[X ∧ 5000])/S(5000) =
∫[5000, 25,000] S(x) dx / S(5000) = ∫[5000, 25,000] {1 - x/50,000} dx / 0.9 = (20,000 - 6250 + 250)/0.9 = 15,556.

18.46. For the layer from $100,000 to $200,000, the expected number of payments is:
100 S(100,000) = 39.7.
The expected losses are: (100) (E[X ∧ 200,000] - E[X ∧ 100,000]) = $3,145,400.
The average payment per payment in the layer is: 3,145,400/39.7 = $79,229.
For the layer from $200,000 to $500,000, the expected number of payments is:
100 S(200,000) = 25.2.
The expected losses are: (100) (E[X ∧ 500,000] - E[X ∧ 200,000]) = $5,007,000.
The average payment per payment in the layer is: 5,007,000/25.2 = $198,690.
For the layer from $500,000 to $1,000,000, the expected number of payments is:
100 S(500,000) = 11.5.
The expected losses are: (100) (E[X ∧ 1,000,000] - E[X ∧ 500,000]) = $3,990,300.
The average payment per payment in the layer is: 3,990,300/11.5 = $346,983.


18.47. C. The insurer pays the dollars of loss excess of $15,000, which are:
E[X] - E[X ∧ 15,000] = E[X ∧ ∞] - E[X ∧ 15,000]. The number of non-zero payments is
1 - F(15,000). Thus the average nonzero payment is:
(E[X] - E[X ∧ 15,000]) / (1 - F(15,000)) = (20,000 - 7700)/(1 - 0.7) = 12,300 / 0.3 = 41,000.
18.48. E. Since 40% of the losses exceed a deductible of 10,000 and half of 40% is 20%, the
new deductible is 22,500, which is exceeded by 20% of the losses.
In other words, S(22,500) = 20% = 40%/2 = S(10,000)/2.
For a deductible of size d, the expected size of a nonzero payment made by the insurer is
(E[X] - E[X ∧ d])/{1 - F(d)} = e(d) = the mean excess loss at d.
e(10,000) = (20,000 - 6000) / (1 - 0.6) = 35,000.
e(22,500) = (20,000 - 9500) / (1 - 0.8) = 52,500.
52,500 / 35,000 = 1.5, or a 50% increase.
Comment: One can do the problem without using the specific numbers in the Loss Size column.


18.49. For this density, the survival function is:

S(x) = 0.01(1 - q + 0.01qt)e-0.01t dt = -e-0.01t - 0.01qte-0.01t ] = e-0.01x + 0.01qxe-0.01x.


x

We will also need integrals of xf(x):

t f(t)dt = 0.01 (1 - q)te-0.01t + 0.01qt2e-0.01t dt =


x

-.01e-0.01t {(1-q)(100t+1002 ) +q(t2 +2t100 + (2)1002 )}

]=
x

e-0.01x {(1-q)(x+100) +q(.01x2 +2x + 200)} = e-0.01x {((x+100) + q(.01x2 +x + 100)}.


First, given q, calculate the average value of a non-zero payment, given a deductible of 100. We
need to compute:

(t-100) f(t)dt / S(100) = t f(t)dt / S(100) - 100 =


100

100

{e-1 {200 + q(300)}} /(1 + q)e-1 -100 = {200 + 300q} /(1 + q) -100.
Setting this equal to 125, one can solve for q: {200 + 300q} /(1 + q) -100 = 125.
225(1 + q) = 200 + 300q. q = 25/75 = 1/3.
Now the average non-zero payment, given a deductible of 200 is:

t f(t)dt / S(200) - 200 = {e-2 {300 + (700/3)}} /(1 + 2/3)e-2 - 200 = (1600/3)/(5/3) - 200 =
200

= 320 - 200 = 120.


Alternately, the given density is a mixture of an Exponential with θ = 100, given weight 1 - q, and a
Gamma Distribution with parameters α = 2 and θ = 100, given weight q.
The mean for the Exponential Distribution is 100.
The mean for this Gamma Distribution is (2)(100) = 200.
Thus, the mean for the mixed distribution is: (1 - q)(100) + 200q = 100 + 100q.
For this Exponential Distribution, E[X ∧ x] = 100(1 - e^(-0.01x)).
For this Gamma Distribution, E[X ∧ x] = 200 Γ(3; 0.01x) + x{1 - Γ(2; 0.01x)}.
Making use of Theorem A.1 in Appendix A of Loss Models,
Γ(3; 0.01x) = 1 - e^(-0.01x){1 + 0.01x + 0.00005x²} and
Γ(2; 0.01x) = 1 - e^(-0.01x){1 + 0.01x}. Therefore, for this Gamma Distribution,
E[X ∧ x] = 200 - e^(-0.01x){200 + 2x + 0.01x²} + e^(-0.01x){x + 0.01x²} = 200 - e^(-0.01x)(200 + x).
Thus, for the mixed distribution, E[X ∧ x] =
q{200 - e^(-0.01x)(200 + x)} + (1 - q)100(1 - e^(-0.01x)) =
100(1 - e^(-0.01x)) + q(100 - 100e^(-0.01x) - xe^(-0.01x)).
For this Exponential Distribution, S(x) = e^(-0.01x).
For this Gamma Distribution, S(x) = 1 - Γ(2; 0.01x) = e^(-0.01x){1 + 0.01x}.
Thus for the mixed distribution, S(x) = q{e^(-0.01x)(1 + 0.01x)} + (1 - q)e^(-0.01x) = e^(-0.01x) + 0.01qx e^(-0.01x).
The expected non-zero payment given a deductible of size x is:
(E[X] - E[X ∧ x])/S(x) = {100e^(-0.01x) + q(100 + x)e^(-0.01x)} / {e^(-0.01x) + 0.01qx e^(-0.01x)} = {100 + q(100 + x)}/{1 + 0.01qx}.
Thus for a deductible of 100, the average non-zero payment is:
(100 + 200q)/(1 + q). Setting this equal to 125 and solving for q,
125 = (100 + 200q)/(1 + q). q = 25/75 = 1/3.
Thus for a deductible of 200, the average non-zero payment is:
(100 + 300/3)/(1 + 2/3) = 200/(5/3) = 120.
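As a numerical cross-check (not part of the original solution), the answer of 120 can be confirmed by integrating the given density directly; the sketch below assumes q = 1/3 and a deductible of 200, and uses scipy for the improper integrals.

# Sketch: numerical check of 18.49, assuming q = 1/3 and a deductible of 200.
import math
from scipy.integrate import quad

q = 1.0 / 3.0
f = lambda t: 0.01 * (1 - q + 0.01 * q * t) * math.exp(-0.01 * t)   # the given density

d = 200
excess, _ = quad(lambda t: (t - d) * f(t), d, math.inf)   # expected payment per loss, excess of d
s_d, _ = quad(f, d, math.inf)                             # S(d) = probability of a non-zero payment
print(excess / s_d)   # average non-zero payment; about 120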
18.50. D. Average ground-up loss = E[X] = 11,100.
Under policy A, average amount paid per loss = E[X] - E[X ∧ 5000] = 6500.
Therefore, E[X ∧ 5000] = 11,100 - 6500 = 4600.
Under policy A, average amount paid per payment = (E[X] - E[X ∧ 5000])/S(5000) = 10,000.
Therefore, S(5000) = 6500/10,000 = 0.65.
Given that a loss less than or equal to 5,000 has occurred, the expected payment under policy B =
average loss of size between 0 and 5000 =
∫_0^5000 x f(x) dx / F(5000) = {E[X ∧ 5000] - 5000 S(5000)}/F(5000) = {4600 - (5000)(0.65)}/(1 - 0.65) = 1350/0.35 = 3857.
Comment: F, S, f, the mean, and the Limited Expected Value are all for the ground-up unlimited
losses of the jewelry store, whether or not it has insurance.


18.51. E. Under policy A, with an ordinary deductible of 5,000 with no maximum covered loss, the
expected amount paid per loss is: E[X] - E[X ∧ 5000] = 6,500. Under policy A,
the expected amount paid per payment is: (E[X] - E[X ∧ 5000])/S(5000) = 10,000.
Therefore, S(5000) = 6500/10,000 = 0.65. Given that a loss has occurred, the payment under
policy B, with no deductible and a policy limit of 5,000, is 5,000 if and only if the original loss is 5000
or more. The probability of this is S(5000) = 0.65.
18.52. C. An Exponential Distribution with θ = 5000. E[X ∧ x] = θ(1 - e^(-x/θ)) = 5000(1 - e^(-x/5000)).
E[X ∧ 500] = 5000(1 - e^(-0.1)) = 475.8. E[X ∧ 25,000] = 5000(1 - e^(-5)) = 4966.3. E[X] = θ = 5000.
Average payment per loss before: E[X] = 5000.
Average payment per loss after: E[X ∧ 25,000] - E[X ∧ 500] = 4966.3 - 475.8 = 4490.5.
Average savings per loss: 5000 - 4490.5 = 509.5.

18.53. E. By integrating f(x), F(x) = 0.01x², 0 < x < 10. S(4) = 1 - (0.01)(4²) = 0.84.
E[X] = ∫_0^10 S(x) dx = ∫_0^10 (1 - 0.01x²) dx = [x - 0.01x³/3] evaluated from x = 0 to x = 10 = 6.667.
E[X ∧ 4] = ∫_0^4 S(x) dx = ∫_0^4 (1 - 0.01x²) dx = [x - 0.01x³/3] evaluated from x = 0 to x = 4 = 3.787.
E[Y^P] = (E[X] - E[X ∧ 4])/S(4) = (6.667 - 3.787)/0.84 = 3.43.

18.54. B. Let m be the maximum covered loss. 60 = 0.8(m - 20). m = 95.
The insurance pays 80% of the layer from 20 to 95.
Expected payment per loss is: 0.8 ∫_20^95 S(x) dx = 0.8 ∫_20^95 (1 - x²/10,000) dx = 0.8(75 - 28.31) = 37.35.
Expected payment per payment is: 37.35/S(20) = 37.35/(1 - 0.2²) = 38.91.
Alternately, f(x) = x/5000, 0 ≤ x ≤ 100.
E[X ∧ 95] = ∫_0^95 x f(x) dx + 95 S(95) = 57.16 + (95)(1 - 0.95²) = 66.42.
E[X ∧ 20] = ∫_0^20 x f(x) dx + 20 S(20) = 0.533 + (20)(1 - 0.20²) = 19.73.
E[Y^P] = 0.8 {E[X ∧ 95] - E[X ∧ 20]} / S(20) = 0.8 (66.42 - 19.73) / 0.96 = 38.91.


18.55. a. E[X ∧ 1 million] = 7,750,000/10 = $775,000.
E[X ∧ 5 million] = 24,450,000/10 = $2,445,000. $2,445,000/$775,000 = 3.155.

Claim    Loss           Limited to 1 million    Limited to 5 million
A        $250,000       $250,000                $250,000
B        $300,000       $300,000                $300,000
C        $450,000       $450,000                $450,000
D        $750,000       $750,000                $750,000
E        $1,200,000     $1,000,000              $1,200,000
F        $2,500,000     $1,000,000              $2,500,000
G        $4,000,000     $1,000,000              $4,000,000
H        $7,500,000     $1,000,000              $5,000,000
I        $9,000,000     $1,000,000              $5,000,000
J        $15,000,000    $1,000,000              $5,000,000

Sum      $40,950,000    $7,750,000              $24,450,000

b. With a deductible of $1 million there are 6 non-zero payments out of 10 losses.
Average payment per payment is: ($2,445,000 - $775,000)/0.6 = $2,783,333.
Alternately, the six non-zero payments are, in millions: 0.2, 1.5, 3, 4, 4, 4.
(0.2 + 1.5 + 3 + 4 + 4 + 4)/6 = 16.7 million/6 = $2,783,333.
Comment: The solution to part a is one way to determine the $5 million increased limit factor
for a basic limit of $1 million.


Section 19, Percentiles


The 80th percentile is the place where the distribution function is 0.80.
For a continuous distribution, the 100p-th percentile is the first value at which F(x) = p.
Exercise: Let F(x) = 1 - e^(-x/10). Find the 75th percentile of this distribution.
[Solution: 0.75 = 1 - e^(-x/10). x = -10 ln(1 - 0.75) = 13.86.
Comment: Check: 1 - e^(-13.86/10) = 1 - 0.250 = 0.75.]
The Value at Risk, VaR_p, is defined as the 100p-th percentile.77
In Appendix A of the Tables attached to the exam, there are formulas for VaR_p(X) for many of the
distributions: Exponential, Pareto, Single Parameter Pareto, Weibull, Loglogistic, Inverse Pareto,
Inverse Weibull, Burr, Inverse Burr, Inverse Exponential, Paralogistic, Inverse Paralogistic.
One can use these formulas for VaR_p in order to determine percentiles.
For example, for the Exponential Distribution as shown in Appendix A: VaR_p(X) = -θ ln(1-p).
Thus in the previous exercise, VaR_0.75 = (-10) ln(0.25) = 13.86.
The 50th percentile is the median, the place where the distribution function is 0.50.
The 25th percentile is the lower or first quartile.
The 50th percentile is the middle or second quartile.
The 75th percentile is the upper or third quartile.78
Exercise: What is the 90th percentile of a Weibull Distribution with parameters τ = 3 and θ = 1000?
[Solution: F(x) = 1 - exp[-(x/θ)^τ] = 1 - exp[-(x/1000)³]. Set F(x) = 0.90 and solve for x.
0.90 = 1 - exp[-(x/1000)³]. -ln[0.10] = (x/1000)³. x = 1000 ln[10]^(1/3) = 1321.
Alternately, as shown in Appendix A: VaR_p(X) = θ {-ln(1-p)}^(1/τ).
VaR_0.90 = (1000) {-ln(0.1)}^(1/3) = 1321.]
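The percentile calculations above are easy to reproduce by inverting F(x) directly; here is a minimal Python sketch for the Exponential and Weibull examples (the parameter names theta and tau follow the Appendix A conventions).

# Sketch: VaR_p (percentiles) for the Exponential and Weibull, by inverting F(x).
import math

def var_exponential(p, theta):
    # F(x) = 1 - exp(-x/theta)  =>  VaR_p = -theta * ln(1 - p)
    return -theta * math.log(1 - p)

def var_weibull(p, theta, tau):
    # F(x) = 1 - exp(-(x/theta)^tau)  =>  VaR_p = theta * (-ln(1 - p))^(1/tau)
    return theta * (-math.log(1 - p)) ** (1 / tau)

print(var_exponential(0.75, 10))    # about 13.86
print(var_weibull(0.90, 1000, 3))   # about 1321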

77 As discussed in Mahler's Guide to Risk Measures.
78 The difference between the 75th and 25th percentiles is called the interquartile range.


Percentiles of Discrete Distributions:


A more precise mathematical definition also covers situations other than the continuous loss
distributions:79 The 100p-th percentile of a distribution F(x) is any number, π_p, such that:
F(π_p-) ≤ p ≤ F(π_p), where F(y-) is the limit of F(x) as x approaches y from below.
Exercise: Let a distribution be such that there is a 30% chance of a loss of $100, a 50% chance of a
loss of $200, and a 20% chance of a loss of $500.
Determine the 70th and 80th percentiles of this distribution.
[Solution: F(100) = 0.3, F(200) = 0.8. Since F(x) = 0.3 for 100 ≤ x < 200, F(200-) = 0.3.
Thus, F(200-) ≤ 0.7 ≤ F(200), so that π_0.70 = 200. 200 is the first value at which F(x) > 0.7.
F(x) = 0.8 for 200 ≤ x < 500, so that 200 ≤ π_0.80 ≤ 500.
Comment: Since there is a value at which F(x) = 0.8, there is no unique value of the 80th percentile
for this discrete distribution. For example, F(200-) = 0.3 ≤ 0.8 = F(200), F(300-) = 0.8 = F(300), and
F(500-) = 0.8 ≤ 1.0 = F(500). Thus each of 200, 300 and 500 satisfy the definition of the 80th
percentile. In this case I would use 200 as the 80th percentile.]
For a discrete distribution, take the 100p-th percentile as the first value at which F(x) ≥ p.
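For a discrete distribution, the convention above (the first value at which F(x) ≥ p) is easy to apply mechanically; the following sketch uses the $100/$200/$500 example from the exercise.

# Sketch: percentile of a discrete distribution, taken as the first value with F(x) >= p.
def discrete_percentile(values_and_probs, p):
    cumulative = 0.0
    for value, prob in sorted(values_and_probs):
        cumulative += prob
        if cumulative >= p:
            return value

dist = [(100, 0.3), (200, 0.5), (500, 0.2)]
print(discrete_percentile(dist, 0.70))   # 200
print(discrete_percentile(dist, 0.80))   # 200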
Quantiles:
The 95th percentile is also referred to as Q0.95, the 95% quantile.
25th percentile ⇔ Q0.25 ⇔ 25% quantile ⇔ first quartile.
50th percentile ⇔ Q0.50 ⇔ 50% quantile ⇔ median.
75th percentile ⇔ Q0.75 ⇔ 75% quantile ⇔ third quartile.
90th percentile ⇔ Q0.90 ⇔ 90% quantile.
99th percentile ⇔ Q0.99 ⇔ 99% quantile.

79 Definition 3.7 at page 34 of Loss Models.


Problems:
19.1 (1 point) What is the 90th percentile of a Pareto Distribution with parameters α = 3 and
θ = 100?
A. less than 120
B. at least 120 but less than 125
C. at least 125 but less than 130
D. at least 130 but less than 135
E. at least 135
19.2 (1 point) Severity is Exponential.
What is the ratio of the 95th percentile to the median?
A. 4.1
B. 4.3
C. 4.5
D. 4.7

E. 4.9

19.3 (1 point) What is the 80th percentile of a Weibull Distribution with parameters τ = 2 and
θ = 100?
A. less than 120
B. at least 120 but less than 125
C. at least 125 but less than 130
D. at least 130 but less than 135
E. at least 135


19.4 (2, 5/85, Q.4) (1.5 points) Let the continuous random variable X have the density function f(x)
as shown in the figure below:
[Figure: the density f(x) increases linearly from the point (0, 0) to the point (a, 1/4).]
What is the 25th percentile of the distribution of X ?
A. 2
B. 4
C. 8
D. 16

E. 32

19.5 (160, 11/86, Q.7) (2.1 points) You are given:
(i) F(x) = 1 - (θ/x)^a where a > 0, θ > 0, x > θ.
(ii) The 90th percentile is (2)(10^(1/4)).
(iii) The 99th percentile is (2)(100^(1/4)).
Determine the median of the distribution.
(A) 2^(1/4)   (B) 2(2^(1/4))   (C)   (D) 2√2   (E) 4√2

19.6 (2, 5/92, Q.10) (1.7 points) Let Y be a continuous random variable with cumulative
distribution function F(y) = 1 - exp[-(y - a)²/2], y > a, where a is a constant.
What is the 75th percentile of Y?
A. F(0.75)   B. a - √(2 ln(4/3))   C. a + √(2 ln(4/3))   D. a - 2√(ln(2))   E. a + 2√(ln(2))

19.7 (IOA 101, 9/01, Q.1) (1.5 points) Data were collected on 100 consecutive days for the
number of claims, x, arising from a group of policies.
This resulted in the following frequency distribution:
x:  0   1   2   3   4   5
f:  14  25  26  18  12  5
Calculate the median, 25th percentile, and 75th percentile for these data.


Solutions to Problems:
19.1. A. F(x) = 1 - {100/(100 + x)}³. Set F(x) = 0.90 and solve for x.
0.90 = 1 - {100/(100 + x)}³. x = 100 {(1 - 0.9)^(-1/3) - 1} = 115.4.
Alternately, for the Pareto Distribution: VaR_p(X) = θ[(1-p)^(-1/α) - 1].
VaR_0.9 = (100) {(0.1)^(-1/3) - 1} = 115.4.
Comment: Check. F(115.4) = 1 - {100/(100 + 115.4)}³ = 0.900.
19.2. B. F(x) = 1 - e^(-x/θ). Set F(x) = 0.5 to find the median. median = -θ ln(1 - 0.5) = 0.693θ.
Set F(x) = 0.95 to find the 95th percentile. 95th percentile = -θ ln(1 - 0.95) = 2.996θ.
Ratio of the 95th percentile to the median = 2.996θ/0.693θ = 4.32.
Alternately, for the Exponential Distribution as shown in Appendix A: VaR_p(X) = -θ ln(1-p).
VaR_0.95 / VaR_0.5 = ln(1 - 0.95)/ln(1 - 0.5) = 4.32.
19.3. C. F(x) = 1 - exp[-(x/100)²]. Set F(x) = 0.80 and solve for x.
0.80 = 1 - exp[-(x/100)²]. ln(0.2) = -(x/100)². x = 100 {-ln(0.2)}^(1/2) = 126.9.
Alternately, as shown in Appendix A: VaR_p(X) = θ {-ln(1-p)}^(1/τ).
VaR_0.80 = (100) {-ln(0.2)}^(1/2) = 126.9.
Comment: Check. F(126.9) = 1 - exp[-(126.9/100)²] = 0.800.
19.4. B. The area under the density must be: 1. (1/2)(a)(1/4) = 1. a = 8.
The 25th percentile is where F(x) = 0.25, or where 1/4 of the area below the density
is to the left. This occurs at: a/2 = 8/2 = 4.


19.5. B. 0.90 = F[2(10^(1/4))] = 1 - (θ/{2(10^(1/4))})^a. -ln10 = a lnθ - a ln2 - (a/4)ln10.
0.99 = F[2(100^(1/4))] = 1 - (θ/{2(100^(1/4))})^a. -2ln10 = a lnθ - a ln2 - (a/2)ln10.
Subtracting the two equations: ln10 = (a/4)ln(10). a = 4. θ = 2.
0.5 = F(x) = 1 - (2/x)⁴. x = 2(2^(1/4)).
Alternately, this is a Single Parameter Pareto Distribution.
As shown in Appendix A: VaR_p(X) = θ(1 - p)^(-1/α).
Therefore, (2)(10^(1/4)) = θ(0.1)^(-1/a), and (2)(100^(1/4)) = θ(0.01)^(-1/a).
Dividing the second equation by the first equation: 10^(1/4) = (0.1)^(-1/a). a = 4. θ = 2.
19.6. E. Set F(x) = 0.75. exp[-(y - a)²/2] = 0.25. -(y - a)²/2 = -ln(4). (y - a)² = 4 ln(2).
y = a + 2√(ln(2)).
19.7. For a discrete distribution, the median is the first place the Distribution Function is at least 50%.
x:  0     1     2     3     4
F:  0.14  0.39  0.65  0.83  0.95
Thus the median is 2. Similarly, the 25th percentile is 1 and the 75th percentile is 3.


Section 20, Definitions


Those definitions that are not discussed elsewhere are included here.
Cumulative Distribution Function and Survival Function:80
Cumulative Distribution Function of X ⇔ cdf of X ⇔ Distribution Function of X
F(x) = Prob[X ≤ x].
The distribution function is defined on the real line and satisfies:
1. 0 ≤ F(x) ≤ 1.
2. F(x) is nondecreasing; F(x) ≤ F(y) for x < y.
3. F(x) is right continuous; the limit of F(x + δ) as δ decreases to 0 is F(x).
4. F(-∞) = 0 and F(∞) = 1; the limit of F(x) as x → -∞ is 0 and the limit of F(x) as x → ∞ is 1.
Most theoretical size of loss distributions, such as the Exponential, F(x) = 1 - e^(-x/θ), are continuous,
increasing functions, with F(0) = 0.
The survival function, S(x) = 1 - F(x) = Prob[X > x]. 0 ≤ S(x) ≤ 1.
S(x) is nonincreasing. S(x) is left continuous. S(-∞) = 1 and S(∞) = 0.
For the Exponential, S(x) = e^(-x/θ) is a continuous, decreasing function, with S(0) = 1.
Discrete, Continuous, and Mixed Random Variables:81
Discrete Random Variable ⇔ support is finite or countable
Continuous Random Variable ⇔ support is an interval or a union of intervals
Mixed Random Variable82 ⇔ combination of discrete and continuous random variables

Examples of Discrete Random Variables: Frequency Distributions, Made-up Loss Distributions
such as 70% chance of 100 and 30% chance of 500.
80 See Definitions 2.1 and 2.4 in Loss Models.
81 See Definition 2.3 in Loss Models. The support is the set of input values for the distribution; for a loss distribution
it is the set of possible sizes of loss. The set of integers is countable. The set of real numbers is not countable.
82 This is a different use of the term mixed than for n-point mixtures of Loss Distributions; in an n-point mixture one
weights together n individual distributions in order to create a new distribution.


Examples of Continuous Random Variables: Loss Distributions in Appendix A of Loss Models


such as the Exponential, Pareto, Gamma, LogNormal, Weibull, etc.
Examples of Mixed Random Variables: Loss Distributions censored from above (censored from
the right) by a maximum covered loss; there is a point mass of probability at the maximum covered
loss.
Probability Function ⇔ Probability Mass Function ⇔ probability density function for a discrete
distribution or a point mass of probability for a mixed random variable
p_X(x) = Prob[X = x].83
Loss Events, etc.:
A loss event or claim is an incident in which someone suffers damages which result in an economic
loss. For example, an insured driver may damage his car in an accident, an insured business may
have its factory damaged by fire, someone with health insurance may enter the hospital, etc. Each of
these is a loss event.
On this exam, do not distinguish between the illness or accident that caused the insured to enter the
hospital, the entry into the hospital or the bill from the hospital; each or all of these combined could
be considered the loss event. Quite often I will refer to a loss event as an accident. However, loss
events can involve a death, illness, natural event, etc., with no accident involved.
On this exam, the term claim is not distinguished from a loss event. However, in common usage
the term claim is reserved for those situations where someone actually contacts the insurer
and asks for a payment. So for example, if an insured with a $5000 deductible suffered $1000
of otherwise covered damage to his home, he would probably not even bother to inform the
insurer. This is a loss event, but in common usage it would not be called a claim. One
common definition of a claim is: A claim is a demand for payment by an insured or by an allegedly
injured third party under the terms and conditions of an insurance contract. 84 The same
mathematical methods can be applied to self-insureds.
On this exam, do not distinguish between first party and third party claims. For example, if Bob
is driving his car and it hits Sue's car, then Sue may make a claim against Bob's insurer. Under
liability insurance, there is no requirement that a loss event or claim be made by an insured.
For example, a customer may slip and fall in a store and sue the store. The stores insurer may
have to pay the customer.
83 See Definition 2.6 in Loss Models.
84 Foundations of Casualty Actuarial Science, Chapter 2.


The loss, size of loss, or severity, is the dollar amount of damage as a result of a loss event. The
loss may be zero. If an insured suffers $20,000 of damage and the insured has a $5,000
deductible, then the insurer would only pay the insured $15,000. However, the (size of) loss is
$20,000, the amount of damage suffered by the insured. If the insured suffered only $1000 of
damage, then the insurer would pay nothing due to the $5000 deductible, but the (size of) loss is
$1000.
A payment event is an incident in which someone receives a (non-zero) payment as a result of a
loss event covered by an insurance contract.
The amount paid is the actual dollar amount paid as a result of a loss event or a payment event. If it
is as the result of a loss event, the amount paid may be zero.
So if an insureds home suffers $20,000 of damage and the insured has a $5,000 deductible, then
the insurer would pay the insured $15,000. The amount paid is $15,000. If the insured suffered
$1000 damage, then the insurer would pay nothing due to the $5000 deductible. The amount paid
is 0.
If an injured worker makes a claim for Workers Compensation benefits, but it is found that the
injury is not work related, then the amount paid may be zero. If a doctor is sued for medical
malpractice, quite often the claim is closed without payment; the amount paid can be zero.
The allocated loss adjustment expense (ALAE) is the amount of expense incurred directly as
a result of a loss event.
Loss adjustment expenses (LAE) which can be directly related to a specific claim are classified
as ALAE, while those that can not are classified as unallocated loss adjustment expense
(ULAE).85 Examples of ALAE are fees for defense attorneys, expert witnesses for the defense,
medical evaluations, court costs, laboratory and x-ray costs, etc. Quite often claims closed
without (loss) payment, will have a positive amount of ALAE, sometimes a very large amount.
Note that any loss payment is not a part of the ALAE and vice versa.
A loss distribution is the probability distribution of either the loss or the amount paid from a loss
event or of the amount paid from a payment event. The distribution may or may not exclude
payments of zero and may or may not include ALAE.

85

For specific lines of insurance the distinction between ALAE and ULAE may depend on the statistical plan or
reporting requirement.


Loss distributions can be discrete. For example, a 20% chance of $100 and an 80% chance of
$500. The loss distributions in the Appendix A of Loss Models, such as the Exponential Distribution
or the Pareto Distribution, are all continuous and all exclude the chance of a loss of size zero. Thus in
order to model losses for lines of insurance with many claims closed without payment one
would have to include a point mass of probability at zero. On Liability lines of insurance, losses
and ALAE are often reported together, so they are frequently modeled together.
Frequency, Severity, and Exposure:
The frequency is the number of losses or number of payments random variable. Its expected
value is called the mean frequency. Unless indicated otherwise the frequency is for one exposure
unit. Frequency distributions are discrete with support on all or part of the non-negative integers.
They can be made-up, or can be named distributions, such as the Poisson or Binomial
Distributions.86
The severity can be either the loss or amount paid random variable. Its expected value is called the
mean severity.
Severity and frequency together determine the aggregate amount paid by an insurer. The number
of claims or loss events determines how many times we take random draws from the severity
variable. Note that first we determine the number of losses from the frequency distribution and then
we determine the size of each loss. Thus frequency and severity do not enter into the aggregate
loss distribution in a symmetric manner. This will be seen for example when we calculate the variance
of the aggregate losses.87
The exposure base is the basic unit of measurement upon which premiums are determined.
For example, insuring one automobile for one year is a car-year of exposure. Insuring $100 dollars
of payroll in Workers Compensation is the unit of exposure. So if the rate for carpenters in State X
were $4 per $100 of payroll, insuring the Close to You Carpenters, which paid its carpenters
$250,000 per year in total, would cost:
(250,000/100)($4) = $10,000 per year.88

86 See Mahler's Guide to Frequency Distributions, and Appendix B of Loss Models.
87 See Mahler's Guide to Aggregate Distributions.
88 This is a simplified example.


Some Definitions from Joint Principles of Actuarial Science:89
Phenomena ⇔ occurrences that can be observed
Experiment ⇔ observation of a given phenomena under specified conditions
Event ⇔ set of one or more possible outcomes
Stochastic phenomenon ⇔ more than one possible outcome
contingent event ⇔ outcome of a stochastic phenomenon; more than one possible outcome
probability ⇔ measure of the likelihood of an event, on a scale from 0 to 1
random variable ⇔ function that assigns a numerical value to every possible outcome
Data Dependent Distributions:
Data-Dependent Distributions ⇔ complexity that of the data; complexity increases as the sample size increases.90
The most important example of a data-dependent distribution is the Empirical Distribution Function.
Another example is Kernel Smoothing.
As discussed previously, the empirical model assigns probability 1/n to each of n observed
values. For example, with the following observations: 81, 157, 213, the probability function (pdf) of
the corresponding empirical model is: p(81) = 1/3, p(157) = 1/3, p(213) = 1/3.
As discussed previously, the corresponding Empirical Distribution Function is:
F(x) = 0 for x < 81, F(x) = 1/3 for 81 ≤ x < 157, F(x) = 2/3 for 157 ≤ x < 213, F(x) = 1 for 213 ≤ x.
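Here is a minimal sketch of the Empirical Distribution Function for the three observed values above; it is the same step function written out in code.

# Sketch: the empirical distribution function for the observations 81, 157, 213.
def empirical_cdf(sample):
    data = sorted(sample)
    n = len(data)
    return lambda x: sum(1 for value in data if value <= x) / n

F = empirical_cdf([81, 157, 213])
print(F(80), F(81), F(160), F(213))   # 0.0  0.333...  0.666...  1.0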
In a Kernel Smoothing Model, the data is smoothed using a kernel function.91

89 See Section 2.1 of Loss Models.
90 See Definition 4.7 in Loss Models.
91 Kernel smoothing is covered in Mahler's Guide to Fitting Loss Distributions.


Section 21, Parameters of Distributions


The probability that a loss is less than or equal to x is given by a distribution function F(x).
For a given type of distribution, in addition to the size of loss x, F(x) depends on
what are called parameters. Each type of distribution has a set of parameters, which enter into F(x)
and its derivative f(x) in a manner particular to that type of distribution.
For example, the Pareto Distribution has a set of two parameters α and θ. For fixed α and θ, F(x) is a
function of x. For α = 2 and θ = 10, the Pareto Distribution is given by:
F(x) = 1 - {θ/(θ + x)}^α = 1 - (1 + x/10)^(-2).
The Pareto Distribution is an example of a parametric distribution.92
The parameter(s) tell you which member of the family one has. For example, if one has a Pareto
Distribution with parameters α = 2 and θ = 10, the Distribution is completely described. In addition
one needs to know the support of the distributions in the family. For example, all Pareto Distributions
have support x > 0. Finally, one needs to know the set from which the parameter or parameters
may be drawn. For the Pareto Distribution, α > 0 and θ > 0.
It is useful to group distributions based on how many parameters they have. Those in the
Appendix of Loss Models have one, two, three or even occasionally four parameters. For example,
the Exponential has one parameter, the Gamma has two parameters, while the Transformed
Gamma has three parameters.
It is also useful to divide parameters into those that are related to the scale of the distribution and
those that are related to the shape of the distribution. A scale parameter is a parameter which divides
x everywhere it appears in the distribution function. For example, θ is a scale parameter for the
Pareto distribution: F(x) = 1 - {θ/(θ + x)}^α = 1 - (1 + x/θ)^(-α). θ normalizes the scale of x, so that one
standard distribution can fit otherwise similar data sets. Thus for example, different units of
measurement (dollars, yen, pounds, marks, etc.) or the effects of inflation can be easily
accommodated by changing the scale parameter.
A scale parameter will appear to the nth power in the formula for the nth moment of the distribution.
Thus, the nth moment of the Pareto has a factor of θ^n.
92 See Definition 4.1 in Loss Models.


Also, a scale parameter will not appear in the coefficient of variation, the skewness, or kurtosis. The
coefficient of variation, the skewness, and the kurtosis each measure the shape of the distribution.
Thus a change in scale should not affect the shape of the fitted Pareto; θ does not appear in the
coefficient of variation or the skewness of the Pareto. However, α does, and thus α is called a
shape parameter.
A parameter such as μ in the Normal distribution is referred to as a location parameter. Altering μ
shifts the whole distribution to the left or right.

Gamma Distribution, an Example:
For the Gamma Distribution, α is the shape parameter and θ is the scale parameter.
Here are graphs of four different Gamma Distributions, each with a mean of 200:
[Four graphs of the probability density: an Exponential (α = 1), and Gammas with α = 2, α = 4, and α = 10, each with mean 200, shown for x from 0 to 600.]
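The shapes described above can be explored numerically; the sketch below evaluates the Gamma density for α = 1, 2, 4, 10 with the mean held at 200 (so θ = 200/α). The plotting itself is omitted; only a few density values are printed.

# Sketch: Gamma densities with mean fixed at 200, i.e. theta = 200/alpha.
import math

def gamma_pdf(x, alpha, theta):
    return x ** (alpha - 1) * math.exp(-x / theta) / (math.gamma(alpha) * theta ** alpha)

for alpha in (1, 2, 4, 10):
    theta = 200 / alpha
    print(alpha, [round(gamma_pdf(x, alpha, theta), 5) for x in (50, 200, 400)])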


Advantages of Parametric Estimation:93


Parametric estimation has the following advantages:
1. Accuracy.
Maximum likelihood estimators of parameters have good properties.
2. Inferences can be made beyond the population that generated the data.
For example, even if the largest observed loss is 100, we can estimate the chance that the next
observed loss will have a size greater than 100.
3. Parsimony.
Distributions can be summarized using only a few parameters.
4. Allows Hypothesis Tests.94
5. Scale parameters/families.
Allow one to more easily handle the effects of (uniform) inflation.
Some Desirable Properties of Size of Loss Distributions:95
As a model for claim sizes, in order to be practical, a distribution should have the following
desirable characteristics:
1. The estimate of the mean should be efficient and reasonably easy to use.
2. A confidence interval about the mean should be calculable.
3. All moments of the distribution function should exist.
4. The characteristic function can be written in closed form.
As will be seen, some distributions such as the Pareto do not have all of their moments. While this
does not prevent their use, it does indicate some caution must be exercised. For the LogNormal,
the characteristic function cannot be written in closed form.

93 See Section 2.6 of the first edition of Loss Models.
94 For example through the use of the Chi-Square Statistic or the Kolmogorov-Smirnov Statistic, discussed in
Mahler's Guide to Fitting Loss Distributions.
95 See "Estimating Pure Premiums by Layer - An Approach," by Robert J. Finger, Discussion by Lee R. Steeneck,
PCAS 1976, not on the syllabus.


Problems:
21.1 (1 point) Which of the following are true with respect to the application in Property/Casualty
Insurance of theoretical size of loss distributions?
1. When data is extensive, theoretical distributions are not essential.
2. Inferences can be made beyond the population that generated the data.
3. Their inconvenience restricts their use to unusual circumstances.
A. 1
B. 2
C. 3
D. 2, 3
E. None of A, B, C, or D
21.2 (4B, 5/95, Q.29) (1 point) Which of the following are reasons for the importance of theoretical
distributions?
1. They permit calculations to be made without the formulation of a model.
2. They are completely summarized by a small number of parameters.
3. Their convenience for mathematical manipulation allows for the development of useful
theoretical results.
A. 2
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3


Solutions to Problems:
21.1. B. 1. F. Even when data are extensive theoretical distributions may be essential, depending
on the question to be answered. For example, theoretical distributions may be essential to estimate
the tail of the distribution. 2. T. 3. F.
21.2. D. 1. F. 2. T. 3. T.


Section 22, Exponential Distribution


This single parameter distribution is extremely simple to work with and thus appears in many exam
questions. In most actual applications, the Exponential doesn't provide enough flexibility to fit the
data. Thus, it is much more common to use the Gamma or Weibull Distributions, of which the
Exponential is a special case.96 Following is a summary of the Exponential Distribution.
Exponential Distribution
Support: x > 0          Parameters: θ > 0 (θ is a scale parameter)
D. f.:      F(x) = 1 - e^(-x/θ)
P. d. f.:   f(x) = e^(-x/θ) / θ
Moments: E[X^n] = n! θ^n
Mean = θ
Variance = θ²
Coefficient of Variation = Standard Deviation / Mean = 1
Skewness = 2
Kurtosis = 9
Median = θ ln(2)
Mode = 0
Limited Expected Value Function: E[X ∧ x] = θ(1 - e^(-x/θ))
R(x) = Excess Ratio = e^(-x/θ)
e(x) = Mean Excess Loss = θ
Derivative of the d.f. with respect to θ: ∂F(x)/∂θ = -(x/θ²) e^(-x/θ)
Method of Moments: θ = X̄.
Percentile Matching: θ = -x₁ / ln(1 - p₁)
Method of Maximum Likelihood: θ = X̄ = Σxᵢ / n, same as method of moments.
96 For a shape parameter of unity, either the Gamma or the Weibull distributions reduce to the Exponential.
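Several entries of the summary above can be checked by simulation; the sketch below assumes θ = 10 and compares simulated values to the formulas for the mean, variance, median, and E[X ∧ 15].

# Sketch: simulation check of the Exponential summary, assuming theta = 10.
import math, random, statistics

random.seed(1)
theta = 10.0
sample = [random.expovariate(1 / theta) for _ in range(200_000)]

print(statistics.mean(sample), theta)                    # mean = theta
print(statistics.pvariance(sample), theta ** 2)          # variance = theta^2
print(statistics.median(sample), theta * math.log(2))    # median = theta ln(2)
limit = 15.0
print(statistics.mean(min(x, limit) for x in sample),
      theta * (1 - math.exp(-limit / theta)))            # E[X ^ 15] = theta(1 - e^(-15/theta))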


All Exponential Distributions have the same shape.


The Coefficient of Variation is always 1 and the skewness is always 2.
Here's a graph of the density function of an Exponential. All Exponential Distributions look the same
except for the scale; in this case the mean is 10. Also note that while I've only shown x ≤ 50, the
density is positive for all x > 0.
[Graph: the density f(x) = e^(-x/10)/10, starting at 0.10 at x = 0 and decreasing toward 0, shown for 0 ≤ x ≤ 50.]

The Exponential Distribution has a constant Mean Excess Loss and therefore a constant
hazard rate; it is the only continuous distribution with this memoryless property.
Exercise: Losses prior to any deductible follow an Exponential Distribution with θ = 8.
A policy has a deductible of size 5.
What is the distribution of non-zero payments under that policy?
[Solution: After truncating and shifting by d:
G(x) = 1 - S(x + d)/S(d) = 1 - S(x + 5)/S(5) = 1 - e^(-(x + 5)/8)/e^(-5/8) = 1 - e^(-x/8).
Comment: This is an Exponential Distribution with θ = 8.]
When an Exponential Distribution is truncated and shifted from below,
one gets the same Exponential Distribution, due to its memoryless
property. On any exam question involving an Exponential Distribution, check whether its
memoryless property helps to answer the question.97
97 See for example, 3, 11/00, Q.21.
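The memoryless property in the exercise above can also be seen by simulation; the sketch assumes θ = 8 and a deductible of 5, and the non-zero payments come out approximately Exponential with the same mean of 8.

# Sketch: simulation of the exercise above (theta = 8, deductible 5).
import random, statistics

random.seed(1)
theta, d = 8.0, 5.0
losses = [random.expovariate(1 / theta) for _ in range(200_000)]
payments = [x - d for x in losses if x > d]    # non-zero payments, truncated and shifted at d

print(statistics.mean(payments))    # about 8: the same theta, by the memoryless property
print(statistics.pstdev(payments))  # about 8 as well, since the CV of any Exponential is 1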


Integrals Involving the Density of the Exponential Distribution:
Let f(x) = e^(-x/θ)/θ, the density of an Exponential Distribution with mean θ.
∫_0^x t^n f(t) dt = ∫_0^x t^n e^(-t/θ)/θ dt = θ^n ∫_0^(x/θ) s^n e^(-s) ds = θ^n Γ(n+1; x/θ) Γ(n+1).98
Γ(n+1; x/θ) = 1 - Σ_{i=0}^{n} (x/θ)^i e^(-x/θ) / i!.        Γ(n+1) = n!.
Therefore, ∫_0^x t^n f(t) dt = θ^n {1 - Σ_{i=0}^{n} (x/θ)^i e^(-x/θ) / i!} n!.
∫_0^x t f(t) dt = θ{1 - e^(-x/θ) - (x/θ)e^(-x/θ)} 1! = θ - (θ + x)e^(-x/θ).
∫_0^x t² f(t) dt = θ²{1 - e^(-x/θ) - (x/θ)e^(-x/θ) - (x/θ)² e^(-x/θ)/2} 2! = 2θ² - (2θ² + 2θx + x²)e^(-x/θ).
∫_0^x t³ f(t) dt = θ³{1 - e^(-x/θ) - (x/θ)e^(-x/θ) - (x/θ)² e^(-x/θ)/2 - (x/θ)³ e^(-x/θ)/6} 3!
= 6θ³ - (6θ³ + 6θ²x + 3θx² + x³)e^(-x/θ).
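The first of these formulas is easy to verify numerically; the sketch below assumes θ = 10 and x = 25 and compares the closed form θ - (θ + x)e^(-x/θ) to direct numerical integration (scipy is used for the quadrature).

# Sketch: numerical check of  integral from 0 to x of t f(t) dt = theta - (theta + x) e^(-x/theta).
import math
from scipy.integrate import quad

theta, x = 10.0, 25.0
f = lambda t: math.exp(-t / theta) / theta
numeric, _ = quad(lambda t: t * f(t), 0, x)
closed = theta - (theta + x) * math.exp(-x / theta)
print(numeric, closed)   # both about 7.13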


Inverse Exponential Distribution:
If x follows an Exponential Distribution, then 1/x follows an Inverse Exponential Distribution.
F(x) = e^(-θ/x).        f(x) = θ e^(-θ/x) / x², x > 0.
The Inverse Exponential Distribution is very heavy-tailed. It fails to have a finite mean, let alone any
higher moments!
98 The Incomplete Gamma Function is discussed in Mahler's Guide to Frequency Distributions.


Problems:
Use the following information for the next eight questions:
Let X be an exponentially distributed random variable, the probability density function of which is:
f(x) = 8e^(-8x), x ≥ 0
22.1 (1 point) Which of the following is the mean of X?
A. less than 0.06
B. at least 0.06 but less than 0.08
C. at least 0.08 but less than 0.10
D. at least 0.10 but less than 0.12
E. at least 0.12
22.2 (1 point) Which of the following is the median of X?
A. less than 0.06
B. at least 0.06 but less than 0.08
C. at least 0.08 but less than 0.10
D. at least 0.10 but less than 0.12
E. at least 0.12
22.3 (1 point) Which of the following is the mode of X?
A. less than 0.06
B. at least 0.06 but less than 0.08
C. at least 0.08 but less than 0.10
D. at least 0.10 but less than 0.12
E. at least 0.12
22.4 (1 point) What is the chance that X is greater than 0.3?
A. less than 0.06
B. at least 0.06 but less than 0.08
C. at least 0.08 but less than 0.10
D. at least 0.10 but less than 0.12
E. at least 0.12


22.5 (1 point) What is the variance of X?


A. less than 0.015
B. at least 0.015 but less than 0.016
C. at least 0.016 but less than 0.017
D. at least 0.017 but less than 0.018
E. at least 0.018
22.6 (1 point) What is the coefficient of variation of X?
A. less than 0.5
B. at least 0.5 but less than 0.7
C. at least 0.7 but less than 0.9
D. at least 0.9 but less than 1.1
E. at least 1.1
22.7 (2 points) What is the skewness of X?
A. less than 0
B. at least 0 but less than 1
C. at least 1 but less than 2
D. at least 2 but less than 3
E. at least 3
22.8 (3 points) What is the kurtosis of X?
A. less than 3
B. at least 3 but less than 5
C. at least 5 but less than 7
D. at least 7 but less than 9
E. at least 9

22.9 (1 point) Prior to the application of any deductible, losses follow an Exponential Distribution
with θ = 135. If there is a deductible of 25, what is the density of non-zero payments at 60?
A. less than 0.0045
B. at least 0.0045 but less than 0.0050
C. at least 0.0050 but less than 0.0055
D. at least 0.0055 but less than 0.0060
E. at least 0.0060


22.10 (2 points) You are given the following:

Claim sizes follow an exponential distribution with density function


f(x) = 0.1 e-0.1x , 0 < x < .

You observe 8 claims.

The number of claims and claim sizes are independent.


Determine the probability that the largest of these claim is less than 17.
A. less than 80%
B. at least 80% but less than 85%
C. at least 85% but less than 90%
D. at least 90% but less than 95%
E. at least 95%
22.11 (1 point) What is F(3), for an Inverse Exponential Distribution, with θ = 10?
A. less than 3%
B. at least 3% but less than 5%
C. at least 5% but less than 7%
D. at least 7% but less than 9%
E. at least 9%
22.12 (2 points) You are given:
• Future lifetimes follow an Exponential distribution with a mean of θ.
• The force of interest is δ.
• A whole life insurance policy pays 1 upon death.
What is the actuarial present value of this insurance?
(A) e
(B) 1 / (1 + )
(C) e2
(D) 1 / (1 + )2
(E) None of A, B, C, or D.
22.13 (1 point) Prior to the application of any deductible, losses follow an Exponential Distribution
with θ = 25. If there is a deductible of 5, what is the variance of the non-zero payments?
A. less than 600
B. at least 600 but less than 650
C. at least 650 but less than 700
D. at least 700 but less than 750
E. at least 750


22.14 (2 points) Prior to the application of any deductible, losses follow an Exponential Distribution
with θ = 31. There is a deductible of 10. What is the variance of amount paid by the insurer for one
loss, including the possibility that the amount paid is zero?
A. less than 900
B. at least 900 but less than 950
C. at least 950 but less than 1000
D. at least 1000 but less than 1050
E. at least 1050
22.15 (2 points) Size of loss is Exponential with mean θ.
Y is the minimum of N losses.
What is the distribution of Y?
22.16 (2 points) You are given:
• A claimant receives payments at a rate of 1 paid continuously while disabled.
• Payments start immediately.
• The force of interest is δ.
• The length of disability follows an Exponential distribution with a mean of θ.
At the time of disability, what is the actuarial present value of these payments?
(A) 1/(δ + θ)   (B) 1/(1 + δθ)   (C) θ/(δ + θ)   (D) θ/(1 + δθ)   (E) None of A, B, C, or D.

22.17 (2 points) You are given the following graph of the density of an Exponential Distribution.
[Graph: the density starts at 0.1 at x = 0 and decreases, with the x-axis shown from 0 to 50.]

What is the third moment of this Exponential Distribution?


A. 1000
B. 2000
C. 4000
D. 6000
E. 8000


22.18 (3 points) Belle Chimes and Leif Blower are engaged to be married.
The cost of their wedding will be 110,000. They will receive 200 gifts at their wedding.
The size of each gift has distribution: F(x) = 1 - exp[-(x - 100)/500], x > 100.
What is the probability that the total value of the gifts will not exceed the cost of their wedding?
A. 6%
B. 8%
C. 10%
D. 12%
E. 14%
22.19 (5 points) Define the quartiles as the 25th, 50th, and 75th percentiles.
Define the interquartile range as the difference between the third and first quartiles, in other words as
the 75th percentile minus the 25th percentile.
Determine the interquartile range for an Exponential Distribution.
Define the Quartile Skewness Coefficient as:
{(3rd quartile - 2nd quartile) - (2nd quartile - 1st quartile)} / (3rd quartile - 1st quartile).
Determine the Quartile Skewness Coefficient for an Exponential Distribution,
and compare it to the skewness.
22.20 (3 points) Define the Mean Absolute Deviation as: E[ |X - E[X]| ].
Determine the Mean Absolute Deviation for an Exponential Distribution.
22.21 (160, 11/86, Q.9) (2.1 points) X1 and X2 are independent random variables each with
Exponential distributions. The expected value of X1 is 9.5. The variance of X2 is 2.25.
Determine the probability that X1 < X2 .
(A) 2/19

(B) 3/22

(C) 3/19

(D) 3/16

(E) 2/3

22.22 (4, 5/87, Q.32) (1 point) Let X be an exponentially distributed random variable, the
probability density function of which is: f(x) = 10 exp(-10x), where x 0.
Which of the following statements regarding the mode and median of X is true?
A. The median of X is 0; the mode is 1/2.
B. The median of X is (ln 2) / 10; the mode of X is 0.
C. The median of X is 1/2; the mode of X does not exist.
D. The median of X is 1/2; the mode of X is 0.
E. The median of X is 1/10; and the mode of X is (ln 2) /10.
22.23 (2, 5/90, Q.11) (1.7 points) Let X be a continuous random variable with density function
f(x) = λe^(-λx) for x > 0. If the median of this distribution is 1/3, then what is λ?
A. (1/3) ln(1/2)   B. (1/3) ln(2)   C. 2 ln(3/2)   D. 3 ln(2)   E. 3


22.24 (160, 5/90, Q.3) (2.1 points) You are given:
(i) Tu is the failure time random variable assuming the uniform distribution from 0 to ω.
(ii) Te is the failure time random variable assuming the exponential distribution.
(iii) Var(Tu) = 3 Var(Te).
(iv) f(te) / S(te) = 0.5.
Calculate the uniform distribution parameter ω.
(A) 3   (B) 4   (C) 8   (D) 12   (E) 15

22.25 (2, 2/96, Q.40) (1.7 points) Let X1, ..., X100 be a random sample from an exponential
distribution with mean 1/2.
Determine the approximate value of P[X1 + ... + X100 > 57] using the Central Limit Theorem.
A. 0.08   B. 0.16   C. 0.31   D. 0.38   E. 0.46

22.26 (2, 2/96, Q.41) (1.7 points) Let X be a continuous random variable with density function
f(x) = e-x/2/2 for x > 0. Determine the 25th percentile of the distribution of X.
A. In(4/9)
B. ln(16/9) C. ln(4)
D. 2
E. ln(16)
22.27 (Course 151 Sample Exam #2, Q.24) (2.5 points)
An insurer's portfolio consists of a single possible claim. You are given:
(i) the claim amount is uniformly distributed over (100, 500).
(ii) the probability that the claim occurs after time t is e-0.1t, t > 0.
(iii) the claim time and amount are independent.
(iv) the insurer's initial surplus is 20.
(v) premium income is received continuously at the rate of 40 per annum.
Determine the probability of ruin (not having enough money to pay the claim.)
(A) 0.3
(B) 0.4
(C) 0.5
(D) 0.6
(E) 0.7
22.28 (Course 160 Sample Exam #2, 1996, Q.1) (1.9 points) You are given:
(i) Two independent random variables X1 and X2 have exponential distributions with means
θ1 and θ2, respectively.
(ii) Y = X1 X2.
Determine E[Y].
(A) 1/θ1 + 1/θ2   (B) θ1θ2/(θ1 + θ2)   (C) θ1 + θ2   (D) 1/(θ1θ2)   (E) θ1θ2


22.29 (4B, 11/99, Q.17) (2 points) Claim sizes follow a distribution with density function
f(x) = e-x, 0 < x < . Determine the probability that the second claim observed will be more than
twice as large as the first claim observed.
A. e-3

B. e-2

C. 1/3

D. e-1

E. 1/2

22.30 (Course 1 Sample Exam, Q.23) (1.9 points) The value, v, of an appliance is based on
the number of years since purchase, t, as follows: v(t) = e(7- 0.2t).
If the appliance fails within seven years of purchase, a warranty pays the owner the value of the
appliance. After seven years, the warranty pays nothing. The time until failure of the appliance has
an exponential distribution with mean 10. Calculate the expected payment from the warranty.
A. 98.70
B. 109.66
C. 270.43
D. 320.78
E. 352.16
22.31 (1, 5/00, Q.3) (1.9 points) The lifetime of a printer costing 200 is exponentially distributed
with mean 2 years. The manufacturer agrees to pay a full refund to a buyer if the printer fails during
the first year following its purchase, and a one-half refund if it fails during the second year. If the
manufacturer sells 100 printers, how much should it expect to pay in refunds?
(A) 6,321
(B) 7,358
(C) 7,869
(D) 10,256 (E) 12,642
22.32 (1, 5/00, Q.18) (1.9 points) An insurance policy reimburses dental expense, X, up to a
maximum benefit of 250. The probability density function for X is: c e-0.004x for x > 0,
where c is a constant. Calculate the median benefit for this policy.
(A) 161
(B) 165
(C) 173
(D) 182
(E) 250
22.33 (1, 11/00, Q.9) (1.9 points) An insurance company sells an auto insurance policy that covers
losses incurred by a policyholder, subject to a deductible of 100.
Losses incurred follow an exponential distribution with mean 300.
What is the 95th percentile of actual losses that exceed the deductible?
(A) 600
(B) 700
(C) 800
(D) 900
(E) 1000
22.34 (1, 11/00, Q.14) (1.9 points) A piece of equipment is being insured against early failure.
The time from purchase until failure of the equipment is exponentially distributed with mean 10
years. The insurance will pay an amount x if the equipment fails during the first year, and it will pay
0.5x if failure occurs during the second or third year. If failure occurs after the first three years, no
payment will be made. At what level must x be set if the expected payment made under this
insurance is to be 1000?
(A) 3858
(B) 4449
(C) 5382
(D) 5644
(E) 7235


22.35 (1, 5/01, Q.20) (1.9 points) A device that continuously measures and records seismic
activity is placed in a remote region. The time, T, to failure of this device is exponentially distributed
with mean 3 years. Since the device will not be monitored during its first two years of service, the
time to discovery of its failure is X = max(T, 2). Determine E[X] .
(A) 2 + e-6/3

(B) 2 - 2e-2/3 + 5e-4/3

(C) 3

(D) 2 + 3e-2/3

(E) 5

22.36 (1, 5/01, Q.32) (1.9 points) A company has two electric generators. The time until failure for
each generator follows an exponential distribution with mean 10. The company will begin using the
second generator immediately after the first one fails.
What is the variance of the total time that the generators produce electricity?
(A) 10
(B) 20
(C) 50
(D) 100
(E) 200
22.37 (1, 5/03, Q.4) (2.5 points) The time to failure of a component in an electronic device has an
exponential distribution with a median of four hours. Calculate the probability that the component will
work without failing for at least five hours.
(A) 0.07
(B) 0.29
(C) 0.38
(D) 0.42
(E) 0.57
22.38 (CAS3, 11/03, Q.17) (2.5 points) Losses have an Inverse Exponential distribution.
The mode is 10,000. Calculate the median.
A. Less than 10,000
B. At least 10,000, but less than 15,000
C. At least 15,000, but less than 20,000
D. At least 20,000, but less than 25,000
E. At least 25,000
22.39 (SOA3, 11/03, Q.34 & 2009 Sample Q.89) (2.5 points) You are given:
(i) Losses follow an exponential distribution with the same mean in all years.
(ii) The loss elimination ratio this year is 70%.
(iii) The ordinary deductible for the coming year is 4/3 of the current deductible.
Compute the loss elimination ratio for the coming year.
(A) 70%
(B) 75%
(C) 80%
(D) 85%
(E) 90%
22.40 (CAS3, 5/04, Q.20) (2.5 points)
Losses have an exponential distribution with a mean of 1,000.
There is a deductible of 500.
The insurer wants to double the loss elimination ratio.
Determine the new deductible that achieves this.
A. 219
B. 693
C. 1,046
D. 1,193
E. 1,546


22.41 (CAS3, 11/05, Q.20) (2.5 points)
• Losses follow an exponential distribution with parameter θ.
• For a deductible of 100, the expected payment per loss is 2,000.
Which of the following represents the expected payment per loss for a deductible of 500?
A. θ
B. θ(1 - e^(-500/θ))
C. 2,000 e^(-400/θ)
D. 2,000 e^(-5/θ)
E. 2,000 (1 - e^(-500/θ)) / (1 - e^(-100/θ))
22.42 (4, 11/06, Q.26 & 2009 Sample Q.269) (2.9 points) The random variables X1 , X2 , ... , Xn ,
are independent and identically distributed with probability density function f(x) = e-x//, x 0.
Determine E[ X 2 ].
(A)

n+ 1 2

(B)

n+ 1 2

n2

(C)

2
n

(D)

2
n

(E) 2


Solutions to Problems:
22.1. E. An exponential with θ = 1/8; mean = θ = 0.125.
22.2. C. An exponential with θ = 1/8; F(x) = 1 - e^(-x/θ) = 1 - e^(-8x).
At the median: F(x) = 0.5 = 1 - e^(-8x). x = -ln(0.5)/8 = 0.0866.
22.3. A. The mode of the exponential is always zero.
(The density 8e^(-8x) decreases for x > 0 and thus attains its maximum at x = 0.)
22.4. C. An exponential with 1/θ = 8; F(x) = 1 - e^(-x/θ) = 1 - e^(-8x).
1 - F(0.3) = e^(-(8)(0.3)) = e^(-2.4) = 0.0907.
22.5. B. An exponential with 1/θ = 8; variance = θ² = 0.015625.
22.6. D. An exponential always has a coefficient of variation of 1.
The C.V. = standard deviation / mean = (0.015625)^0.5 / 0.125 = 1.
22.7. D. An exponential always has skewness of 2. Specifically the moments are:
μ1′ = (1!)θ = 1/8 = 0.125. μ2′ = (2!)θ² = 2/8² = 0.03125. μ3′ = (3!)θ³ = 6/8³ = 0.01172.
Standard Deviation = (0.03125 - 0.125²)^0.5 = 0.125.
Skewness = {μ3′ - (3 μ1′ μ2′) + (2 μ1′³)} / STDDEV³ =
{0.01172 - (3)(0.125)(0.03125) + (2)(0.125)³} / (0.125)³ = 0.0039075/0.001953 = 2.00.
22.8. E. An exponential always has kurtosis of 9. Specifically the moments are:
μ1′ = (1!)θ = θ. μ2′ = (2!)θ² = 2θ². μ3′ = (3!)θ³ = 6θ³. μ4′ = (4!)θ⁴ = 24θ⁴.
Standard Deviation = (2θ² - θ²)^0.5 = θ. μ4 = μ4′ - (4 μ1′ μ3′) + (6 μ1′² μ2′) - 3μ1′⁴ =
24θ⁴ - (4)(θ)(6θ³) + (6)(θ²)(2θ²) - (3)θ⁴ = 9θ⁴. Kurtosis = μ4 / STDDEV⁴ = 9θ⁴/θ⁴ = 9.
22.9. B. After truncating and shifting from below, one gets the same Exponential Distribution with
θ = 135, due to its memoryless property.
The density is: e^(-x/135)/135, which at x = 60 is: e^(-60/135)/135 = 0.00475.


22.10. A. For this exponential distribution, F(x) = 1 - e^(-0.1x). F(17) = 1 - e^(-(0.1)(17)) = 0.817.
The chance that all eight claims will be less than or equal to 17 is: F(17)⁸ = 0.817⁸ = 19.9%.
Comment: This is an example of an order statistic. The maximum of the 8 claims is less than or equal
to 17 if and only if each of the 8 claims is less than or equal to 17.
22.11. B. F(x) = e^(-θ/x). F(3) = e^(-10/3) = 0.036.
22.12. B. The probability of death at time t is the density of the Exponential Distribution:
f(t) = e^(-t/θ)/θ. The present value of a payment of one at time t is e^(-δt).
Therefore, the actuarial present value of this insurance is:
∫_0^∞ e^(-δt) e^(-t/θ)/θ dt = (1/θ) ∫_0^∞ e^(-(δ + 1/θ)t) dt = (1/θ) / (δ + 1/θ) = 1/(1 + δθ).

22.13. B. After truncating and shifting from below, one gets the same Exponential Distribution with
θ = 25, due to its memoryless property. The variance is θ² = 25² = 625.
22.14. A. After truncating and shifting from below, one gets the same Exponential Distribution with
θ = 31, due to its memoryless property. Thus the nonzero payments are Exponential with θ = 31,
with mean 31 and variance 31². The probability of a nonzero payment is the probability that a loss
is greater than the deductible of 10; S(10) = e^(-10/31) = 0.7243. Thus the payments of the insurer can
be thought of as an aggregate distribution, with Bernoulli frequency with mean 0.7243 and
Exponential severity with mean 31. The variance of this aggregate distribution is:
(Mean Freq.)(Var. Sev.) + (Mean Sev.)²(Var. Freq.) =
(0.7243)(31²) + (31)²{(0.7243)(1 - 0.7243)} = 888.
Comment: Similar to 3, 11/00, Q.21.
22.15. The survival function of Y is: Prob[all N losses > y] = S(y)^N = (e^(-y/θ))^N = e^(-yN/θ).
The distribution of Y is Exponential with mean θ/N.


22.16. D. Given a disability of length t, the present value of an annuity certain is:
(1 - e^(-δt))/δ. The expected present value is the average of this over all t:
∫_0^∞ {(1 - e^(-δt))/δ} f(t) dt = ∫_0^∞ {(1 - e^(-δt))/δ} e^(-t/θ)/θ dt = (1/δ) ∫_0^∞ {e^(-t/θ)/θ - e^(-t(δ + 1/θ))/θ} dt =
(1/δ){1 - (1/(δ + 1/θ))/θ} = (1/δ){1 - 1/(1 + δθ)} = θ/(1 + δθ).


22.17. D. For an Exponential, f(x) = e^(-x/θ)/θ. f(0) = 1/θ. Thus 1/10 = 1/θ. θ = 10.
Third moment is: 6θ³ = 6000.
22.18. B. Let Y = X - 100. Then Y is Exponential with mean θ = 500.
E[X] = E[Y] + 100 = 500 + 100 = 600. Var[X] = Var[Y] = 500² = 250,000.
The mean total value of gifts is: (200)(600) = 120,000.
The variance of the total value of gifts is: (200)(250,000) = 50,000,000.
Prob[gifts ≤ 110,000] ≅ Φ[(110,000 - 120,000)/√50,000,000] = Φ[-1.41] = 7.9%.
Comment: The distribution of the size of gifts is a Shifted Exponential.
22.19. 0.25 = 1 - exp[-Q0.25/θ]. Q0.25 = θ ln[4/3].
0.5 = 1 - exp[-Q0.5/θ]. Q0.5 = θ ln[2].
0.75 = 1 - exp[-Q0.75/θ]. Q0.75 = θ ln[4].
Interquartile range = Q0.75 - Q0.25 = θ ln[4] - θ ln[4/3] = θ ln[3] = 1.0986θ.
Quartile Skewness Coefficient = {(ln(4) - ln(2)) - (ln(2) - ln(4/3))} / ln(3) = ln(4/3)/ln(3) = 0.262.
The skewness of any Exponential Distribution is 2.
Specifically, the third central moment is: E[X³] - 3E[X²]E[X] + 2E[X]³ = 6θ³ - (3)(2θ²)(θ) + 2θ³ = 2θ³.
The variance is θ². Thus the skewness is: 2θ³ / (θ²)^(3/2) = 2.
Comment: The first quartile is also called the lower quartile, while the 3rd quartile is also called the
upper quartile.
The Quartile Skewness Coefficient as applied to a small sample of data would be a robust
estimator of the skewness of the distribution from which the data was drawn; it would not be
significantly affected by unusual values in the sample, in other words by outliers.

22.20. ∫_0^∞ |x - θ| f(x) dx = ∫_0^θ (θ - x) e^(-x/θ)/θ dx + ∫_θ^∞ (x - θ) e^(-x/θ)/θ dx =
θ ∫_0^θ e^(-x/θ)/θ dx - θ ∫_θ^∞ e^(-x/θ)/θ dx + ∫_θ^∞ x e^(-x/θ)/θ dx - ∫_0^θ x e^(-x/θ)/θ dx =
θ(1 - e^(-1)) - θe^(-1) + [-x exp[-x/θ] - θ exp[-x/θ]] evaluated from x = θ to x = ∞
- [-x exp[-x/θ] - θ exp[-x/θ]] evaluated from x = 0 to x = θ =
θ(1 - 2e^(-1)) + 2θe^(-1) + (2θe^(-1) - θ) = 2θe^(-1) = 0.7358θ.


22.21. B. θ2² = 2.25. θ2 = 1.5. Given X1 = t, Prob[X2 > t] = e^(-t/1.5).
Prob[X2 > X1] = ∫_0^∞ e^(-t/1.5) e^(-t/9.5)/9.5 dt = (1/9.5)/(1/1.5 + 1/9.5) = 3/22.
Comment: This is mathematically equivalent to two independent Poisson Processes, with
λ1 = 1/9.5 and λ2 = 1/1.5. The probability of observing an event from the first process before the
second process is: λ1/(λ1 + λ2) = (1/9.5)/(1/9.5 + 1/1.5) = 3/22.
See Mahler's Guide to Stochastic Processes, on CAS Exam 3L.
22.22. B. The median is where F(x) = 0.5. F(x) = 1 - e^(-10x). Therefore solving for x,
the median = -ln(0.5)/10 = ln(2)/10. The mode is that point where f(x) is largest. Since f(x) declines
for x ≥ 0, f(x) is at its maximum at x = 0. Therefore, the mode is zero.
22.23. D. F(x) = 1 - e^(-λx). F(1/3) = 0.5. 0.5 = e^(-λ/3). λ = 3 ln(2).
22.24. D. For the Exponential, the hazard rate is given as 0.5, and therefore θ = 1/0.5 = 2.
Variance of the Exponential is: θ² = 4. Variance of the Uniform is: ω²/12. ω²/12 = (3)(4). ω = 12.
22.25. A. The sum of 100 Exponential distributions has mean (100)(1/2) = 50, and variance
(100)(1/2²) = 25. P[X1 + ... + X100 > 57] ≅ 1 - Φ[(57 - 50)/5] = 1 - Φ[1.4] = 0.0808.
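The normal approximation in 22.25 can be compared to simulation; the sketch below draws repeated sums of 100 Exponentials with mean 1/2 (the exact probability is slightly above the approximate 0.0808, since the Gamma sum is skewed to the right).

# Sketch: simulation of 22.25 -- P[sum of 100 Exponentials with mean 1/2 exceeds 57].
import random

random.seed(1)
trials = 20_000
count = sum(1 for _ in range(trials)
            if sum(random.expovariate(2.0) for _ in range(100)) > 57)
print(count / trials)   # roughly 0.085; the normal approximation gives 0.0808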


22.26. B. 0.25 = F(x) = 1 - e^(-x/2). x = -2 ln(0.75) = 2 ln(4/3) = ln(16/9).
Comment: F(ln(16/9)) = 1 - exp[-ln(16/9)/2] = 1 - √(9/16) = 1 - 3/4 = 1/4.

22.27. C. If the loss occurs prior to t = 2, then since the insurer has less than 100 in assets, the
probability of ruin is 100%. If the loss occurs subsequent to t = 12, then since the insurer has assets
of more than 500, the probability of ruin is 0. If the loss occurs at time 2 ≤ t ≤ 12, then the insurer has
assets of 20 + 40t, and the probability of ruin is: {500 - (20 + 40t)}/400 = (12 - t)/10.
Adding up the different situations, using that the time of loss is exponentially distributed, the
probability of ruin is:
F(2) + ∫_2^12 f(t)(12 - t)/10 dt + 0 = (1 - e^(-0.2)) + ∫_2^12 0.1e^(-0.1t)(12 - t)/10 dt =
0.181 + (12/10)[-e^(-0.1t)] evaluated from 2 to 12 - 0.01 ∫_2^12 t e^(-0.1t) dt =
0.181 + 0.621 - 0.01[(-10t - 100)e^(-0.1t)] evaluated from 2 to 12 = 0.802 - 0.320 = 0.482.
Alternately, if the loss is of size 100, there is ruin if the loss occurs at time t < 2, which has probability:
1 - e^(-0.2).
If the loss is of size 500, there is ruin if the loss occurs at time t < 12, which has probability: 1 - e^(-1.2).
If the loss is of size x, then there is ruin if the loss occurs prior to time (x - 20)/40,
since at t = (x - 20)/40 the assets are: 20 + 40(x - 20)/40 = x.
Thus for a loss of size x, the probability of ruin is: 1 - exp[-0.1(x - 20)/40] = 1 - e^(-(x - 20)/400).
The losses are uniformly distributed from 100 to 500, so the overall probability of ruin is:
∫_100^500 {1 - e^(-(x - 20)/400)}(1/400) dx = [x/400 + e^(-(x - 20)/400)] evaluated from 100 to 500 =
(1.25 - 0.25) + (e^(-1.2) - e^(-0.2)) = 1 - 0.518 = 0.482.

22.28. E. E[X1 X2] = E[X1] E[X2] = θ1θ2.
22.29. C. Given that the first claim is of size x, the probability that the second will be more than
twice as large is 1 - F(2x) = S(2x) = e^(-2x). The overall average probability is:
∫_0^∞ (Probability given x) f(x) dx = ∫_0^∞ e^(-2x) e^(-x) dx = ∫_0^∞ e^(-3x) dx = 1/3.

22.30. D. The density of the time of failure is: f(t) = e^(-t/10)/10.
Expected payment is: ∫_0^7 v(t) f(t) dt = ∫_0^7 e^(7 - 0.2t) e^(-t/10)/10 dt = 0.1 e^7 ∫_0^7 e^(-0.3t) dt =
0.1 e^7 (1 - e^(-2.1))/0.3 = 320.78.


22.31. D. Prob[fails during the first year] = F(1) = 1 - e-1/2 = .3935.
Prob[fails during the second year] = F(2) - F(1) = e-1/2 - e-2/2 = .2386.
Expected Cost = 100{(200)(.3935) + (100)(.2386)} = 10,256.
22.32. C. This is an Exponential with 1/θ = 0.004. θ = 250. The median of this Exponential is:
250 ln(2) = 173.3, which since it is less than 250 is also the median benefit.
22.33. E. By the memoryless property of the Exponential Distribution, the non-zero payments
excess of a deductible are also an Exponential Distribution with mean 300. Thus the 95th percentile
of the nonzero payments is: -300 ln(1 - .95) = 899. Adding back the 100 deductible, the 95th
percentile of the losses that exceed the deductible is: 999.
22.34. D. Prob[fails during the first year] = F(1) = 1 - e-1/10 = .09516.
Prob[fails during the second or third year] = F(3) - F(1) = e-1/10 - e-3/10 = .16402.
Expected Cost = .09516x + .16402x/2 = 1000. x = 5644.
22.35. D. max(T, 2) + min(T, 2) = T + 2.
E[max(T, 2)] = E[T+ 2] - E[min(T, 2)] = E[T] + 2 - E[T

2] = 3 + 2 - 3(1 - e-2/3) = 2 + 3e- 2 / 3.

22.36. E. Each Exponential has variance 102 = 100.


The variances of independent variables add: 100 + 100 = 200.
Comment: The total time is Gamma with = 2, = 10, and variance (2)(102 ) = 200.
22.37. D. Median = 4. ⇒ 0.5 = 1 - e^{-4/θ}. ⇒ θ = 4/ln(2) = 5.771. S(5) = e^{-5/5.771} = 0.421.
22.38. E. The mode of the Inverse Exponential is θ/2. θ/2 = 10000. ⇒ θ = 20000.
To get the median: F(x) = e^{-θ/x} = e^{-20000/x} = 0.5. ⇒ x = 20000/ln(2) = 28,854.
Comment: One could derive the mode: f(x) = θe^{-θ/x}/x². f'(x) = -2f(x)/x + θf(x)/x² = 0. ⇒ x = θ/2.

22.39. C. LER(x) = E[X ∧ x]/E[X] = θ(1 - e^{-x/θ})/θ = 1 - e^{-x/θ}.
LER(d) = 1 - e^{-d/θ} = 0.7. ⇒ d = 1.204θ.
LER(4d/3) = LER(1.605θ) = 1 - e^{-1.605θ/θ} = 1 - e^{-1.605} = 80.0%.
22.40. E. For the Exponential, LER[x] = E[X ∧ x]/E[X] = 1 - e^{-x/θ}.
1 - e^{-500/1000} = 0.3935. We want: 1 - e^{-d/1000} = (2)(0.3935) = 0.7869. ⇒ d = 1546.

22.41. C. Due to the memoryless property of the Exponential, the expected payment per
payment is θ, regardless of the deductible.
Therefore, for a deductible of d, the expected payment per loss is: θS(d) = θe^{-d/θ}.
Thus 2000 = θe^{-100/θ}. ⇒ θ = 2000e^{100/θ}.
Therefore, the expected payment per loss for a deductible of 500 is:
θe^{-500/θ} = 2000e^{100/θ} e^{-500/θ} = 2,000 e^{-400/θ}.
Alternately, the expected payment per loss is:
E[X] - E[X ∧ d] = θ - θ(1 - e^{-d/θ}) = θe^{-d/θ}. Proceed as before.
Comment: One could solve numerically for θ, with result θ = 2098.
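A minimal sketch in Python of that numerical solution (mine, not from the exam): it solves θe^{-100/θ} = 2000 by simple bisection; the bracket [1000, 5000] is an assumption chosen by inspection.

import math

def f(theta):
    # expected payment per loss with a 100 deductible, minus the given 2000
    return theta * math.exp(-100 / theta) - 2000

lo, hi = 1000.0, 5000.0   # assumed bracket; f(lo) < 0 < f(hi)
for _ in range(100):
    mid = (lo + hi) / 2
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
print(round((lo + hi) / 2))   # about 2098, matching the comment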


22.42. A. Xi is Exponential. ΣXi is Gamma with α = n and θ.
X̄ = ΣXi/n is Gamma with α = n and θ/n.
Therefore X̄ has 2nd moment: (θ/n)²(n)(n + 1) = {(n+1)/n} θ².
Alternately, E[X̄²] = Var[X̄] + E[X̄]² = Var[X]/n + E[X]² = θ²/n + θ² = {(n+1)/n} θ².
Alternately, for i = j, E[Xi Xj] = E[X²] = 2θ². For i ≠ j, E[Xi Xj] = E[Xi] E[Xj] = E[X]² = θ².
E[X̄²] = E[ΣXi ΣXj]/n² = ΣΣ E[Xi Xj]/n² = {(n)(2θ²) + (n² - n)θ²}/n² = {(n+1)/n} θ².

Section 23, Single Parameter Pareto Distribution


The Single Parameter Pareto Distribution is described in Appendix A.4.1.4 of Loss Models.
It is not the same as the Pareto distribution described in Appendix A.2.3.1 of Loss Models.99
The Single Parameter Pareto applies to a size of claim distribution above a lower limit θ > 0.100

F(x) = 1 - (θ/x)^α, x > θ. Note that F(θ) = 0.

f(x) = αθ^α / x^{α+1}, x > θ.

Since this single parameter distribution is simple to work with it is very widely used by
actuaries in actual applications involving excess losses or layers of loss.101 It also has appeared
in many past exam questions.
Exercise: What is the limited expected value for the Single Parameter Pareto Distribution?
[Solution: E[X ∧ x] = ∫_θ^x y f(y) dy + x S(x) = ∫_θ^x αθ^α y^{-α} dy + x(θ/x)^α
= αθ^α (x^{1-α} - θ^{1-α})/(1 - α) + θ^α x^{1-α} = αθ/(α - 1) - θ^α / {(α - 1) x^{α-1}}.
Comment: In Appendix A of Loss Models, E[(X ∧ x)^k] = αθ^k/(α - k) - kθ^α / {(α - k) x^{α-k}}.
For k = 1, E[X ∧ x] = αθ/(α - 1) - θ^α / {(α - 1) x^{α-1}}, matching the above formula.]

99 If one takes F(x) = 1 - {(β + θ)/(x + θ)}^α for x > β, then one gets a distribution function of which the Pareto and
Single Parameter Pareto are each special cases.
One gets the former Pareto for β = 0 and the latter Single Parameter Pareto for θ = 0.
100 The Single Parameter Pareto is designed to work directly with data truncated from below at θ.
See Mahler's Guide to Fitting Loss Distributions.
101 See "A Practical Guide to the Single Parameter Pareto Distribution," by Stephen W. Philbrick, PCAS LXXII, 1985, pp. 44.

Exercise: Using the formula for the Limited Expected Value, what is the mean excess loss, e(x), for
the Single Parameter Pareto Distribution?
[Solution: e(x) = {E[X] - E[X ∧ x]}/S(x) = {αθ/(α - 1) - αθ/(α - 1) + θ^α/((α - 1)x^{α-1})} / (θ/x)^α
= x/(α - 1).]

Single Parameter Pareto Distribution

Support: x > θ        Parameters: α > 0 (shape parameter)

D. f.:   F(x) = 1 - (θ/x)^α

P. d. f.: f(x) = αθ^α / x^{α+1}

Moments: E[X^n] = αθ^n / (α - n), α > n

Mean = αθ/(α - 1), α > 1

Variance = αθ² / {(α - 1)²(α - 2)}, α > 2

Coefficient of Variation = 1 / √(α(α - 2)), α > 2

Mode = θ

Skewness = {2(α + 1)/(α - 3)} √((α - 2)/α), α > 3

Median = θ 2^{1/α}

Limited Expected Value Function: E[X ∧ x] = αθ/(α - 1) - θ^α / {(α - 1) x^{α-1}}, α > 1

R(x) = Excess Ratio = (1/α)(x/θ)^{1-α}, α > 1, x > θ

e(x) = Mean Excess Loss = x/(α - 1), α > 1

Derivative of d.f. with respect to α: ∂F(x)/∂α = (θ/x)^α ln(x/θ)

Method of Moments: α = m1 / (m1 - θ)

Method of Maximum Likelihood: α = N / Σ ln[xi / θ]

Percentile Matching: α = -ln(1 - p1) / ln(x1 / θ)
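As a quick way to experiment with these formulas, here is a small Python sketch (mine, not from Loss Models) that evaluates several of the quantities above; the printed values use θ = 9 and α = 2.5, the distribution behind the problems that follow.

def spp_mean(alpha, theta):
    return alpha * theta / (alpha - 1)                        # requires alpha > 1

def spp_variance(alpha, theta):
    return alpha * theta**2 / ((alpha - 1)**2 * (alpha - 2))  # requires alpha > 2

def spp_limited_ev(alpha, theta, x):
    # E[X ∧ x] = alpha*theta/(alpha-1) - theta^alpha / ((alpha-1) x^(alpha-1))
    return alpha * theta / (alpha - 1) - theta**alpha / ((alpha - 1) * x**(alpha - 1))

def spp_survival(alpha, theta, x):
    return (theta / x) ** alpha

alpha, theta = 2.5, 9
print(spp_mean(alpha, theta))            # 15.0
print(spp_variance(alpha, theta))        # 180.0
print(spp_limited_ev(alpha, theta, 20))  # about 13.19
print(spp_survival(alpha, theta, 30))    # about 0.049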

Probability density function of a Single Parameter Pareto with θ = 1000 and α = 2:

[Graph: f(x) starts at 0.002 at x = 1000 and declines toward zero by x = 5000.]

Problems:
Use the following information for the next nine questions:
X has the probability density function:

f(x) = 607.5 x^{-3.5}, x ≥ 9.

23.1 (1 point) Which of the following is the mean of X?


A. less than 12
B. at least 12 but less than 14
C. at least 14 but less than 16
D. at least 16 but less than 18
E. at least 18
23.2 (1 point) Which of the following is the median of X?
A. less than 12
B. at least 12 but less than 14
C. at least 14 but less than 16
D. at least 16 but less than 18
E. at least 18
23.3 (1 point) Which of the of following is the mode of X?
A. less than 10
B. at least 10 but less than 11
C. at least 11 but less than 12
D. at least 12 but less than 13
E. at least 13
23.4 (1 point) What is the chance that X is greater than 30?
A. less than 1%
B. at least 1% but less than 2%
C. at least 2% but less than 3%
D. at least 3% but less than 4%
E. at least 4%
23.5 (2 points) What is the variance of X?
A. less than 170
B. at least 170 but less than 190
C. at least 190 but less than 210
D. at least 210 but less than 230
E. at least 230


23.6 (1 point) What is the coefficient of variation of X?


A. less than 1
B. at least 1 but less than 2
C. at least 2 but less than 3
D. at least 3
E. Can not be determined
23.7 (2 points) What is the skewness of X?
A. less than 0
B. at least 0 but less than 2
C. at least 2 but less than 4
D. at least 4
E. Can not be determined
23.8 (3 points) What is the Limited Expected Value at 20?
A. less than 8
B. at least 8 but less than 10
C. at least 10 but less than 12
D. at least 12 but less than 14
E. at least 14
23.9 (2 points) What is the Excess Ratio at 20?
A. less than 9%
B. at least 9% but less than 10%
C. at least 10% but less than 11%
D. at least 11% but less than 12%
E. at least 12%

23.10 (3 points) The large losses for Pickwick Insurance are given by X:
f(x) = 607.5 x^{-3.5}, x ≥ 9.
Pickwick Insurance expects 65 such large losses per year.
Pickwick Insurance reinsures the layer of loss from 20 to 30 with Global Reinsurance.
How much does Global Reinsurance expect to pay per year for losses from Pickwick Insurance?
A. less than 20
B. at least 20 but less than 30
C. at least 30 but less than 40
D. at least 40 but less than 50
E. at least 50


23.11 (2 points) You are modeling the distribution of the diameters of those meteors that have
diameters greater than 1 meter and that hit the atmosphere of the Earth.
If you use a Single Parameter Pareto Distribution for the model, what are the possible reasonable
values of α and θ?
23.12 (2 points) X follows a Single Parameter Pareto Distribution.
What is the expected value of ln(X/θ)?
A. 1/(α - 1)        B. 1/α        C. α/(α - 1)        D. θ/α        E. θ/(α - 1)

23.13 (2 points) You are given the following graph of a Single Parameter Pareto Distribution.
[Graph: the density starts at 0.06 at x = 50 and declines toward zero; x-axis shown from 50 to 200.]
What is the variance of this Single Parameter Pareto Distribution?
A. less than 1600
B. at least 1600 but less than 1800
C. at least 1800 but less than 2000
D. at least 2000 but less than 2200
E. at least 2200



23.14 (3 points) The Pareto principle, named after economist Vilfredo Pareto, states that for many
phenomena, 80% of the consequences stem from 20% of the causes.
If the size of loss follows a Single Parameter Pareto Distribution, for what value of α is it the case that
80% of the aggregate losses are expected to come from the largest 20% of the loss events?
A. less than 1.1
B. at least 1.1 but less than 1.2
C. at least 1.2 but less than 1.3
D. at least 1.3 but less than 1.4
E. at least 1.4
23.15 (4, 5/86, Q.60) (1 point) Given a Single Parameter Pareto distribution
F(x; c, α) = 1 - (c/x)^α, for x > c, for a random variable x representing large losses.
Which of the distribution functions shown below represents the distribution function of x truncated
from below at d, d > c?
A. 1 - {(c - d)/x}^α, x > c - d
B. 1 - {(c + d)/x}^α, x > c + d
C. 1 - (d/x)^α, x > d
D. 1 - {(d - c)/x}^α, x > d - c
E. 1 - {d/(x - d)}^α, x > d

23.16 (4, 5/89, Q.25) (1 point) The distribution function of the random variable X is
F(x) = 1 - 1/x², x ≥ 1.
Which of the following are true about the mean, median, and mode of X?
A. mode < mean < median
B. mode < median < mean
C. mean < mode < median
D. median < mean and the mode is undefined
E. None of the above
23.17 (4, 5/90, Q.30) (1 point) Losses, denoted by T, have the probability density function:
f(T) = 4T^{-5} for 1 ≤ T < ∞.
What is the coefficient of variation of T?
A. 1/8        B. 1/4        C. √2/4        D. 3/4        E. 3√2/4
23.18 (4, 5/90, Q.31) (1 point) Losses, denoted by T, have the probability density function:
f(T) = 4T^{-5} for 1 ≤ T < ∞.
What is the coefficient of skewness of T?
A. 5        B. 5√2        C. 20/27        D. 5/9        E. 5√2/9


23.19 (4, 5/90, Q.33) (1 point) Losses, denoted by T, have the probability density function:
f(T) = 4T^{-5} for 1 ≤ T < ∞.
What is the actual 95th percentile of T?
A. less than 2.25
B. at least 2.25 but less than 2.50
C. at least 2.50 but less than 2.75
D. at least 2.75 but less than 3.00
E. at least 3.00
23.20 (4B, 5/92, Q.15) (2 points) Determine the coefficient of variation of the claim severity
distribution f(x) = (5/2) x-7/2 , x > 1.
A. Less than 0.70
B. At least 0.70 but less than 0.85
C. At least 0.85 but less than 1.00
D. At least 1.00 but less than 1.15
E. At least 1.15
23.21 (1, 5/00, Q.34) (1.9 points) An insurance policy reimburses a loss up to a benefit limit of 10.
The policyholder's loss, Y, follows a distribution with density function: f(y) = 2/y³, y > 1.
What is the expected value of the benefit paid under the insurance policy?
(A) 1.0
(B) 1.3
(C) 1.8
(D) 1.9
(E) 2.0
23.22 (1, 11/00, Q.25) (1.9 points) A manufacturer's annual losses follow a distribution with density
function f(x) = 2.5 (0.6^{2.5}) / x^{3.5}, x > 0.6.
To cover its losses, the manufacturer purchases an insurance policy with an annual
deductible of 2.
What is the mean of the manufacturers annual losses not paid by the insurance policy?
(A) 0.84
(B) 0.88
(C) 0.93
(D) 0.95
(E) 1.00
23.23 (1, 5/01, Q.39) (1.9 points) An insurance company insures a large number of homes.
The insured value, X, of a randomly selected home is assumed to follow a distribution with density
function f(x) = 3x-4, x > 1. Given that a randomly selected home is insured for at least 1.5, what is
the probability that it is insured for less than 2?
(A) 0.578
(B) 0.684
(C) 0.704
(D) 0.829
(E) 0.875


23.24 (3, 11/01, Q.37 & 2009 Sample Q.103) (2.5 points)
For watches produced by a certain manufacturer:
(i) Lifetimes follow a single-parameter Pareto distribution with α > 1 and θ = 4.
(ii) The expected lifetime of a watch is 8 years.
Calculate the probability that the lifetime of a watch is at least 6 years.
(A) 0.44
(B) 0.50
(C) 0.56
(D) 0.61
(E) 0.67
23.25 (1, 5/03, Q.22) (2.5 points) An insurer's annual weather-related loss, X, is a random variable
with density function f(x) = 2.5 (200^{2.5})/x^{3.5}, x > 200.
Calculate the difference between the 30th and 70th percentiles of X.
(A) 35
(B) 93
(C) 124
(D) 231
(E) 298


Solutions to Problems:
23.1. C. Single Parameter Pareto Distribution with θ = 9 and α = 2.5.
Mean = {α/(α - 1)}θ = (2.5/1.5)(9) = 15.
Alternately, one can integrate x f(x) from 9 to ∞.
23.2. A. Single Parameter Pareto Distribution with θ = 9 and α = 2.5.
F(x) = 1 - (x/θ)^{-α} = 1 - (x/9)^{-2.5}. At the median we want F(x) = 0.5: 0.5 = (x/9)^{-2.5}.
Therefore, x = (9)(0.5^{-1/2.5}) = 11.9.
23.3. A. The mode of the Single Parameter Pareto Distribution is always θ, which in this case is 9.
(The density decreases for x > θ and thus attains its maximum at x = θ.)
23.4. E. F(x) = 1 - (x/θ)^{-α} = 1 - (x/9)^{-2.5}. 1 - F(30) = (30/9)^{-2.5} = 0.049.
23.5. B. Single Parameter Pareto Distribution with θ = 9 and α = 2.5.
Variance = {α / [(α - 2)(α - 1)²]} θ² = (9²)(2.5) / [(0.5)(1.5²)] = 180.
Alternately, one can compute the second moment as the integral of x² f(x) from 9 to ∞:
∫_9^∞ x² f(x) dx = ∫_9^∞ 607.5 x^{-1.5} dx = (607.5/0.5)(9^{-0.5}) = 1215/3 = 405.
Thus the variance is 405 - 15² = 405 - 225 = 180.


23.6. A. From the solutions to previous questions, the mean is 15 and the variance is 180, so the
coefficient of variation is: √180 / 15 = 0.894.
Alternately, the coefficient of variation is 1/√(α(α - 2)) = 1/√((2.5)(2.5 - 2)) = 0.894.
23.7. E. Single Parameter Pareto Distribution with θ = 9 and α = 2.5.
The skewness only exists for α > 3; therefore in this case the skewness does not exist.
(If one tries to calculate the third moment by taking the integral of x³ f(x) from 9 to ∞, one gets infinity
due to evaluating x^{0.5} at ∞.)


23.8. D. Single Parameter Pareto Distribution with θ = 9 and α = 2.5.
E[X ∧ x] = θ[{α - (x/θ)^{1-α}} / (α - 1)].
E[X ∧ 20] = 9[{2.5 - (20/9)^{-1.5}}/1.5] = 13.19.
Alternately, one can compute the integral of x f(x) from θ to 20:
∫_9^20 x f(x) dx = ∫_9^20 607.5 x^{-2.5} dx = (607.5/1.5)(9^{-1.5} - 20^{-1.5}) = 405(0.03704 - 0.01118) = 10.473.
E[X ∧ 20] is the sum of the above integral plus 20{1 - F(20)} = 10.473 + 20(9/20)^{2.5} =
10.473 + 2.717 = 13.19.
23.9. E. Single Parameter Pareto Distribution with θ = 9 and α = 2.5.
R(x) = Excess Ratio = (1/α)(x/θ)^{1-α}. R(20) = (1/2.5)(20/9)^{-1.5} = 12.1%.
Alternately, one can compute the integral of 1 - F(x) = S(x) from 20 to ∞:
∫_20^∞ (x/9)^{-2.5} dx = 9^{2.5} ∫_20^∞ x^{-2.5} dx = (-162) x^{-1.5} ]_20^∞ = (-162)(0 - 0.01118) = 1.811.
This integral is the losses excess of 20. R(20) is the ratio of the above integral to the mean.
Mean = {α/(α - 1)}θ = (2.5/1.5)(9) = 15. Thus R(20) = 1.811/15 = 12.1%.
Alternately, using previous solutions, R(20) = 1 - E[X ∧ 20]/E[X] = 1 - 13.19/15 = 12.1%.
23.10. E. Single Parameter Pareto Distribution with θ = 9 and α = 2.5.
E[X ∧ x] = θ[{α - (x/θ)^{1-α}} / (α - 1)].
E[X ∧ 30] = 9[{2.5 - (30/9)^{-1.5}}/1.5] = 14.01.
E[X ∧ 20] = 9[{2.5 - (20/9)^{-1.5}}/1.5] = 13.19.
65 large losses are expected per year, so the Reinsurer expects in the layer from 20 to 30:
65{E[X ∧ 30] - E[X ∧ 20]} = (65)(14.01 - 13.19) = 53.
Comment: The Reinsurer only expects to make payments on (65)S(20) = (65)(20/9)^{-2.5} = 8.8
losses per year. Of these, (65)S(30) = (65)(30/9)^{-2.5} = 3.2 are expected to exceed the upper limit
of the layer; on such claims the reinsurer pays the width of the layer, or 10.
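A short numeric check of this layer calculation (my own Python sketch, reusing the limited expected value formula from the summary above):

def spp_limited_ev(alpha, theta, x):
    return alpha * theta / (alpha - 1) - theta**alpha / ((alpha - 1) * x**(alpha - 1))

alpha, theta, claims_per_year = 2.5, 9.0, 65
layer = claims_per_year * (spp_limited_ev(alpha, theta, 30) - spp_limited_ev(alpha, theta, 20))
print(layer)   # about 53, the expected annual payment for the layer from 20 to 30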


23.11. Since the data is truncated from below at 1 meter, one takes θ = 1 meter.
The volume of a meteor is proportional to d³. Therefore, assuming some reasonable average
density, the mass of a meteor is also proportional to d³. The average mass of meteors hitting the
Earth should be finite. Therefore, the distribution of d should have a finite third moment.
Therefore, α > 3.
Comment: Beyond what you are likely to be asked on your exam. However, it is important to know
that the Single Parameter Pareto does not have all of its moments.
23.12. B. Let y = ln(x/θ). x = θe^y. dx = θe^y dy.
E[ln(X/θ)] = ∫_θ^∞ ln(x/θ) αθ^α/x^{α+1} dx = ∫_0^∞ y αθ^α/(θe^y)^{α+1} θe^y dy = α ∫_0^∞ y e^{-αy} dy
= αΓ(2)/α² = 1/α.


23.13. C. For the Single Parameter Pareto, f(x) = αθ^α/x^{α+1}, x > θ.
Since the graph starts at 50, θ = 50.
f(θ) = α/θ. Therefore, 0.06 = α/50. ⇒ α = (50)(0.06) = 3.
E[X^n] = αθ^n/(α - n), α > n. E[X] = (3)(50)/2 = 75. E[X²] = (3)(50²)/1 = 7500.
Var[X] = 7500 - 75² = 1875.


23.14. B. The scale parameter will drop out of the analysis, so for convenience take θ = 1.
We want 20% of the aggregate losses to come from the smallest 80% of the loss events.
E[X] = αθ/(α - 1) = α/(α - 1).
The 80th percentile is such that 0.2 = S(x) = 1/x^α. ⇒ x = 5^{1/α}.
Dollars from those losses of size less than x = 5^{1/α} is:
E[X ∧ x] - x S(x) = α/(α - 1) - 5^{1/α}/{5(α - 1)} - 5^{1/α}(0.2) = {α/(α - 1)}(1 - 5^{1/α}/5).
The portion of the total dollars from those losses of size less than the 80th percentile is the above
divided by the mean: 1 - 5^{1/α}/5.
Thus we want: 0.2 = 1 - 5^{1/α}/5. ⇒ 4 = 5^{1/α}. ⇒ α = ln(5)/ln(4) = 1.161.
Comment: For θ = 1 and α = 1.161, E[X] = 1.161/0.161 = 7.211,
the 80th percentile is: 5^{1/1.161} = 4, and
E[X ∧ 4] = 1.161/0.161 - 1/{(0.161)(4^{0.161})} = 2.242.
Dollars from those losses of size less than 4 is: 2.242 - (0.2)(4) = 1.442. 1.442/7.211 = 0.20.
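The closed form α = ln(5)/ln(4) is easy to confirm directly; as a sketch (mine, assuming θ = 1 as in the solution), the following Python lines recompute the share of dollars coming from the smallest 80% of losses:

import math

alpha = math.log(5) / math.log(4)           # about 1.161
x80 = 5 ** (1 / alpha)                      # 80th percentile; equals 4
mean = alpha / (alpha - 1)
lev = alpha / (alpha - 1) - 1 / ((alpha - 1) * x80 ** (alpha - 1))   # E[X ∧ x80]
share_small = (lev - x80 * 0.2) / mean      # dollars from the smallest 80% of losses
print(alpha, share_small)                   # about 1.161 and 0.20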
23.15. C. The distribution function for the data truncated from below at d is:
G(x) = {F(x) - F(d)}/{1 - F(d)} for x > d. In this case G(x) = {(c/d)^α - (c/x)^α} / (c/d)^α
= 1 - (d/x)^α for x > d.
23.16. B. The density f(x) = 2x^{-3}, x ≥ 1. Since the density declines for all x ≥ 1, it has its maximum
at x = 1. The mode is 1. The mean is the integral from 1 to ∞ of x f(x):
∫_1^∞ 2x^{-2} dx = -2/x ]_1^∞ = 2. Thus the mean = 2.
The median is such that F(x) = 0.5. Thus 0.5 = 1 - 1/x². ⇒ median = √2 = 1.414.
Comment: A Single Parameter Pareto Distribution, with α = 2 and θ = 1.
Mean = {α/(α - 1)}θ = (2/1)(1) = 2. Mode = θ = 1. Median = θ2^{1/α} = 2^{1/2} = 1.414.
For a continuous distribution with positive skewness, such as the Single Parameter Pareto
Distribution, typically: mean > median > mode (alphabetical order.)

23.17. C. mean = ∫_1^∞ x f(x) dx = ∫_1^∞ x(4x^{-5}) dx = 4∫_1^∞ x^{-4} dx = -4x^{-3}/3 ]_1^∞ = 4/3.
second moment = ∫_1^∞ x² f(x) dx = ∫_1^∞ x²(4x^{-5}) dx = 4∫_1^∞ x^{-3} dx = -4x^{-2}/2 ]_1^∞ = 2.
Thus the variance is: 2 - (4/3)² = 2/9. The standard deviation is: √2 / 3.
Coefficient of Variation = Standard Deviation / Mean = (√2/3) / (4/3) = √2 / 4.
Comment: A Single Parameter Pareto Distribution with α = 4 and θ = 1.
The CV = {α(α - 2)}^{-0.5} = 8^{-0.5} = √2 / 4.
23.18. B. third moment = ∫_1^∞ x³ f(x) dx = ∫_1^∞ x³(4x^{-5}) dx = 4∫_1^∞ x^{-2} dx = -4x^{-1} ]_1^∞ = 4.
Thus the skewness = {E[X³] - 3 E[X²] E[X] + 2 E[X]³} / STDDEV³ =
[4 - {(3)(4/3)(2)} + {(2)(4/3)³}] / (√2/3)³ = {128/27 - 4} / {2√2/27} = 20/(2√2) = 5√2.
23.19. A. f(T) = 4T^{-5} for T ≥ 1, so that taking the integral, F(T) = 1 - T^{-4} for T ≥ 1.
At the 95th percentile, 0.95 = F(T) = 1 - T^{-4}. Therefore T = (1/0.05)^{1/4} = 2.115.
23.20. C. The mean is ∫_1^∞ x f(x) dx = ∫_1^∞ x(5/2)x^{-7/2} dx = -{(5/2)/(3/2)} x^{-3/2} ]_1^∞ = 5/3.
The 2nd moment is ∫_1^∞ x²(5/2)x^{-7/2} dx = -{(5/2)/(1/2)} x^{-1/2} ]_1^∞ = 5.
Thus the variance = 5 - (5/3)² = 2.22. The coefficient of variation is: √2.22 / (5/3) = 0.894.
Comment: This is a Single Parameter Pareto Distribution, with parameters θ = 1 and α = 5/2 = 2.5.
It has coefficient of variation equal to: 1/√(α(α - 2)) = 1/√((2.5)(2.5 - 2)) = 0.894.


23.21. D. The density of a Single Parameter Pareto Distribution is: αθ^α / x^{α+1}, x > θ.
This is a Single Parameter Pareto Distribution with α = 2 and θ = 1.
E[X ∧ x] = αθ/(α - 1) - θ^α/{(α - 1)x^{α-1}}. E[X ∧ 10] = 2 - 1/10 = 1.9.
23.22. C. The mean annual losses not paid by the insurance policy is E[X ∧ 2].
This is a Single Parameter Pareto Distribution with α = 2.5 and θ = 0.6.
E[X ∧ x] = αθ/(α - 1) - θ^α/{(α - 1)x^{α-1}}. E[X ∧ 2] = (2.5)(0.6)/1.5 - (0.6^{2.5})/{(1.5)(2^{1.5})} = 0.9343.
23.23. A. F(x) = 1 - 1/x³, x > 1.
Prob[X < 2 | X > 1.5] = {F(2) - F(1.5)}/S(1.5) = (7/8 - 0.7037)/0.2963 = 0.578.
23.24. A. For the Single Parameter Pareto Distribution, E[X] = αθ/(α - 1).
Therefore, 8 = 4α/(α - 1). ⇒ α = 2. S(x) = (θ/x)^α. S(6) = (4/6)² = 4/9 = 0.444.
23.25. B. F(x) = 1 - (200/x)^{2.5}, x > 200.
For the 30th percentile: F(x) = 0.3. ⇒ x = 200(1/0.7)^{1/2.5} = 230.7.
For the 70th percentile: F(x) = 0.7. ⇒ x = 200(1/0.3)^{1/2.5} = 323.7.
The difference is: 323.7 - 230.7 = 93.


Section 24, Common Two Parameter Distributions


For the exam it is important for the student to become familiar with the material in Appendix A of
Loss Models.102 Here are the four most commonly used Distributions with Two Parameters:
Pareto, Gamma, LogNormal, and Weibull.103
Pareto:
α is a shape parameter and θ is a scale parameter. Notice the factor of θ^n in the moments. The
Pareto is a heavy-tailed distribution. Higher moments may not exist.
The coefficient of variation (when it exists) is always greater than one; the standard deviation is
always greater than the mean.104 The skewness for the Pareto is always greater than twice
the coefficient of variation.

F(x) = 1 - {θ/(θ + x)}^α = 1 - (1 + x/θ)^{-α}        f(x) = αθ^α/(θ + x)^{α+1} = (α/θ)(1 + x/θ)^{-(α+1)}

Mean = θ/(α - 1), α > 1

E[X²] = 2θ²/{(α - 1)(α - 2)}, α > 2        Variance = αθ²/{(α - 1)²(α - 2)}, α > 2

E[X^n] = n! θ^n / ∏_{i=1}^{n} (α - i) = n! θ^n / {(α - 1)...(α - n)}, α > n

Coefficient of Variation = √(α/(α - 2)), α > 2        Skewness = 2{(α + 1)/(α - 3)} √((α - 2)/α), α > 3

Mode = 0.        Median = θ(2^{1/α} - 1).

E[X ∧ x] = {θ/(α - 1)} [1 - {θ/(θ + x)}^{α-1}], α > 1

Loss Elimination Ratio = 1 - (1 + x/θ)^{1-α}, α > 1        Excess Ratio = (1 + x/θ)^{1-α}, α > 1

Mean Excess Loss = (θ + x)/(α - 1), α > 1

102 There are a few other distributions used by actuaries than those listed there, and the distributions are sometimes
parameterized in a different manner.
103 In my opinion. See a subsequent section for additional two parameter distributions in Loss Models.
104 This fact is also equivalent to the fact that for the Pareto E[X²] > 2 E[X]².

Here's a graph of the density function of a Pareto Distribution with α = 3 and θ = 60:

[Graph: f(x) starts at 0.05 at x = 0 and declines toward zero, shown for x from 0 to 100.]

Exercise: Losses prior to any deductible follow a Pareto Distribution with parameters α = 1.7 and
θ = 30. A policy has a deductible of size 10.
What is the distribution of non-zero payments under that policy?
[Solution: After truncating and shifting by d, G(x) = 1 - S(x + d)/S(d) = 1 - S(x + 10)/S(10) =
1 - {30/(30 + x + 10)}^{1.7} / {30/(30 + 10)}^{1.7} = 1 - {40/(40 + x)}^{1.7}.
Comment: This is a Pareto Distribution with α = 1.7 and θ = 40.]
If losses prior to any deductible follow a Pareto Distribution with parameters α and θ, then after
truncating and shifting from below by a deductible of size d:
G(x) = 1 - S(x + d)/S(d) = 1 - {θ/(θ + x + d)}^α / {θ/(θ + d)}^α = 1 - {(θ + d)/(θ + d + x)}^α.
If losses prior to any deductible follow a Pareto Distribution with parameters α and θ,
then after truncating and shifting from below by a deductible of size d, one gets another
Pareto Distribution, but with parameters α and θ + d.105
105 The form of an Exponential Distribution is also preserved under truncation and shifting from below. While for the
Exponential the θ parameter remains the same, for the Pareto the θ parameter becomes θ + d.

Exercise: Losses prior to any deductible follow a Pareto Distribution with parameters α = 1.7 and
θ = 30. What is the mean non-zero payment under a policy that has a deductible of size 10?
[Solution: The non-zero payments follow a Pareto Distribution with α = 1.7 and θ = 40,
with a mean of 40/(1.7 - 1) = 57.1. Alternately, the mean of the data truncated and shifted from below
is the mean excess loss for the original Pareto Distribution.
e(x) = (θ + x)/(α - 1). e(10) = (30 + 10)/(1.7 - 1) = 57.1.]
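A small simulation sketch in Python (mine; the seed, sample size, and inversion formula are my own choices, the inversion following directly from S(x) = {θ/(θ + x)}^α) illustrating that non-zero payments after a deductible of 10 behave like a Pareto with α = 1.7 and θ = 40, whose mean is about 57.1:

import random

random.seed(1)
alpha, theta, deductible = 1.7, 30.0, 10.0

def pareto_loss():
    # inversion: if U is uniform(0,1), then theta*(U**(-1/alpha) - 1) is Pareto(alpha, theta)
    u = random.random()
    return theta * (u ** (-1.0 / alpha) - 1.0)

payments = []
while len(payments) < 200_000:
    x = pareto_loss()
    if x > deductible:
        payments.append(x - deductible)   # non-zero payment, truncated and shifted

# sample mean converges slowly since alpha < 2 (infinite variance), but should be near 40/0.7 = 57.1
print(sum(payments) / len(payments))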
Gamma:106 107
α is a shape parameter and θ is a scale parameter.108 Note the factors of θ in the moments.
For α = 1, the Gamma is an Exponential Distribution.
The Gamma always has well-defined moments and is thus not as heavy-tailed as other distributions
such as the Pareto.
The sum of two independent random variables each of which follows a Gamma distribution with the
same scale parameter is also a Gamma distribution; it has a shape parameter equal to the sum of
the two shape parameters and the same scale parameter. Specifically, the sum of n independent
identically distributed variables which are Gamma with parameters α and θ is a Gamma distribution
with parameters nα and θ.
The Gamma is infinitely divisible; if X follows a Gamma, then given any n > 1 we can find a
random variable Y which also follows a Gamma, such that adding up n independent versions of
Y gives X. Take n independent copies of a Gamma with parameters α/n and θ. Their sum is a
Gamma with parameters α and θ.
For a positive integral shape parameter, α = m, the Gamma distribution is the sum of m
independent variables each of which follows an Exponential distribution. Thus for α = 1,
we get an Exponential. The sum of two independent, identically distributed Exponential variables
follows a Gamma Distribution with α = 2. As α approaches infinity the Gamma approaches a
Normal distribution by the Central Limit Theorem. The Gamma has variance equal to αθ², the sum of
α identical independent exponential distributions each with variance θ².
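A brief Python sketch (mine, with an arbitrary seed and sample size) illustrating the additivity: the sum of independent Gammas with the same θ is Gamma with the shape parameters added, so its sample mean and variance should be near (α1 + α2)θ and (α1 + α2)θ².

import random

random.seed(1)
a1, a2, theta = 3.0, 5.0, 50.0
n = 100_000
sums = [random.gammavariate(a1, theta) + random.gammavariate(a2, theta) for _ in range(n)]
m = sum(sums) / n
v = sum((s - m) ** 2 for s in sums) / (n - 1)
print(m, v)   # near (3+5)*50 = 400 and (3+5)*50^2 = 20,000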
106 The incomplete Gamma Function, which underlies the Gamma Distribution, is covered in Mahler's Guide to
Frequency Distributions.
107 The Gamma Distribution is sometimes called a Pearson Type III Distribution.
108 In Actuarial Mathematics θ is replaced by 1/β.

For α = ν/2, the Gamma is a Chi-Square distribution with ν degrees of freedom.109 A Chi-Square
distribution with ν degrees of freedom is in turn the sum of squares of ν independent unit
Normal Distributions. Thus the Gamma is a sum of ν = 2α squares of unit Normal Distributions.
Also note that the series given in Theorem A.1 in Appendix A of Loss Models for the
Incomplete Gamma function, for α = m, is the sum of Poisson probabilities for the number of
events greater than or equal to m.
F(x) = Γ(α; x/θ)        f(x) = (x/θ)^α e^{-x/θ} / {x Γ(α)} = x^{α-1} e^{-x/θ} / {θ^α Γ(α)}

Mean = αθ        Variance = αθ²

E[X^k] = θ^k α(α + 1)...(α + k - 1), for k a positive integer.        E[X^k] = θ^k Γ(α + k)/Γ(α), k > -α.

Mode = θ(α - 1), α > 1. Mode = 0, α ≤ 1.

Points of inflection: θ(α - 1 ± √(α - 1)), α > 2;   θ(α - 1 + √(α - 1)), 2 ≥ α > 1.

Coefficient of Variation = 1/√α        Skewness = 2/√α = 2CV.        Kurtosis = 3 + 6/α = 3 + 6CV².

The skewness for the Gamma distribution is always twice the coefficient of variation. Thus the
Gamma is likely to fit well only to data sets for which this is true.
The Kurtosis of a Gamma is always greater than 3, the kurtosis of a Normal Distribution.
As α goes to infinity, the Kurtosis of a Gamma goes to 3, the kurtosis of a Normal, since the Gamma
approaches a Normal. For a Gamma: 2 Kurtosis - 3 Skewness² = 6.

109 The Chi-Square with ν degrees of freedom is a Gamma with α = ν/2, θ = 2, mean ν, and variance 2ν.
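These relationships are easy to verify numerically; here is a sketch of mine using scipy (if it is available), where scipy.stats.gamma takes the shape α as "a" and θ as "scale", and reports excess kurtosis:

from scipy.stats import gamma

alpha, theta = 3.0, 10.0
dist = gamma(a=alpha, scale=theta)
mean, var, skew, kurt_excess = dist.stats(moments="mvsk")

print(mean, var)                       # alpha*theta = 30, alpha*theta^2 = 300
print(skew, 2 / alpha ** 0.5)          # both equal 2/sqrt(alpha)
print(kurt_excess + 3, 3 + 6 / alpha)  # scipy reports excess kurtosis, so add 3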

Here's a graph of the density function of a Gamma Distribution with α = 3 and θ = 10:

[Graph: the density rises from 0 at x = 0 to its peak at the mode of 20 and then declines; shown for sizes 0 to 100.]

For α = 3, the Gamma is a peaked curve, skewed to the right.110
Note that while I've only shown x ≤ 100, the density is positive for all x > 0. Note that for α ≤ 1,
rather than a peaked curve, we get a curve with a mode of zero.111
Note that for very large alpha, one would get a curve much less skewed to the right.

110 This general description applies to the densities of most Loss Distributions.
111 For alpha = 1, one gets an Exponential Distribution, with mode of zero.

LogGamma:112
If ln[X] follows a Gamma Distribution, then X follows a LogGamma Distribution.
The distribution function of the LogGamma, Γ[α; ln(x)/θ], is just that of the Gamma, Γ(α; x/θ),
with ln(x) in place of x.
In order to derive the density of the LogGamma, one can differentiate the distribution function, but
must remember to use the chain rule and take into account the change of variables.
Let y = ln(x). F(x) = Γ(α; ln(x)/θ) = Γ(α; y/θ).
Therefore, f(x) = dF/dx = (dF/dy)(dy/dx) = (density function of the Gamma in terms of y)(1/x) =
{y^{α-1} e^{-y/θ} / (θ^α Γ(α))} / x = {ln(x)}^{α-1} e^{-ln(x)/θ} / {θ^α Γ(α) x} =
{ln(x)}^{α-1} / {x^{1 + 1/θ} θ^α Γ(α)}, x > 1.
α is a shape parameter and θ is not exactly a scale parameter. For very large α the distribution
approaches a LogNormal distribution, (just as the Gamma approaches the Normal distribution.)
For α = 1, ln[X] follows an Exponential Distribution, and one gets a Single Parameter Pareto Distribution.
If one were to graph the size of loss distribution, but have the x-axis (the size of loss) on a
logarithmic scale, then the size of loss distribution would look much less skewed. If ln(x) followed a
Gamma, then x itself follows a LogGamma distribution. The LogGamma is much more skewed than
the Gamma distribution.
The product of independent LogGamma variables with the same θ parameter is a LogGamma with
the same θ parameter and the sum of the individual α parameters.113

112 The LogGamma is not in Appendix A of Loss Models, and is extremely unlikely to be asked about on your exam.
For more information on the LogGamma see for example Loss Distributions by Hogg & Klugman.
113 This follows from the fact that the sum of two independent random variables each of which follows a Gamma
distribution with the same scale parameter is also a Gamma distribution; it has a shape parameter equal to the sum of
the two shape parameters and the same scale parameter.

LogNormal:
If one were to graph the size of loss distribution, but have the x-axis (the size of loss) on a
logarithmic scale, then the size of loss distribution would be much less skewed. If ln(x) follows a
(symmetric) Normal, then x itself follows a LogNormal.114
The product of a series of independent LogNormal variables is also LogNormal.115
The only condition necessary to produce a LogNormal Distribution is that the amount of an
observed value be the product of a large number of factors, each of which is independent of the
size of any other factor.116
Please note that μ is not the mean of the LogNormal nor is σ the standard deviation. Rather μ is the
mean and σ is the standard deviation of the Normal Distribution of the logs of the claim sizes. σ is a
shape parameter; note the way the CV and skewness only depend on σ.
As parameterized in Loss Models, the LogNormal Distribution does not have a scale parameter.
However, we can rewrite the Distribution Function:
F(x) = Φ[{ln(x) - μ}/σ] = Φ[{ln(x) - ln(e^μ)}/σ] = Φ[{ln(x/e^μ)}/σ].
Thus since everywhere x appears in the distribution function it is divided by e^μ, e^μ would be
the scale parameter for the LogNormal. Thus if reparameterized in this way, the LogNormal
Distribution would have a scale parameter. Note the way that (e^μ)^n appears in the formula for
the moments of the LogNormal, another sign that e^μ would be the scale parameter, if one
parameterized the distribution differently.
The LogNormal Distribution can also be used to model stock prices.117

F(x) = Φ[{ln(x) - μ}/σ]        f(x) = exp[-(ln(x) - μ)²/(2σ²)] / {xσ√(2π)}

Mean = exp[μ + σ²/2]        E[X^n] = exp[nμ + n²σ²/2]        Second moment = exp[2μ + 2σ²]

114 The LogNormal is less skewed than the LogGamma distribution, (because the Normal distribution is less skewed
than the Gamma distribution.)
115 Since the sum of independent Normal variables is also a Normal.
116 Quoted from "Sampling Theory in Casualty Insurance," by Arthur L. Bailey, PCAS 1942 and 1943.
117 See Derivative Markets by McDonald, not on the syllabus of this exam.

Variance = exp(2μ + σ²) {exp(σ²) - 1}        Coefficient of Variation = √(exp(σ²) - 1)

Skewness = {exp(3σ²) - 3 exp(σ²) + 2} / {exp(σ²) - 1}^{1.5} = (3 + CV²) CV.

Mode = exp(μ - σ²)        Median = exp(μ)
The relationships between the Gamma, Normal, LogGamma, and LogNormal Distributions are
shown below:118

LogGamma —(y = ln(x))→ Gamma        LogNormal —(y = ln(x))→ Normal
Exercise: A LogNormal Distribution has parameters μ = 5 and σ = 1.5.
What are the mean and variance?
[Solution: Mean = exp(μ + 0.5σ²) = exp(5 + (1.5²)/2) = 457.145.
Second Moment = exp(2μ + 2σ²) = exp[10 + (2)(1.5²)] = 1,982,759.
Variance = 1,982,759 - 457.145² = 1,773,777.]
The formula for the moments of a LogNormal Distribution follows from the formula for its mean.
If X is LogNormal with parameters μ and σ, then ln(X) is Normal with the same parameters.
Therefore, n ln(X) is Normal with parameters nμ and nσ.
Therefore, exp[n ln(X)] = X^n is LogNormal with parameters nμ and nσ.
Therefore, E[X^n] = exp[nμ + (nσ)²/2] = exp[nμ + n²σ²/2].
Exercise: For a LogNormal Distribution with μ = 8.0064 and σ = 0.6368, what are the mean, median
and mode?
[Solution: Mean = exp(μ + 0.5σ²) = 3674. Median = exp(μ) = 3000. Mode = exp(μ - σ²) = 2000.]

118 A summary of the Normal Distribution appears in Mahler's Guide to Frequency Distributions.

Here is a graph of this LogNormal Distribution, with μ = 8.0064 and σ = 0.6368:

[Graph: the density peaks at the mode of 2000; the median of 3000 and the mean of 3674 are marked to its right.]

The LogNormal is a heavy-tailed distribution, yet all of its (positive) moments exist.119
Its mode (place where the density is largest) is less than its median (place where the distribution
function is 50%), which in turn is less than its mean (average).
As σ increases, the LogNormal gets heavier-tailed.
For a LogNormal Distribution with σ = 2, what is the ratio of the mean to the median?
[Solution: Mean / Median = exp(μ + σ²/2) / exp(μ) = exp(σ²/2) = e² = 7.39.]
For a LogNormal Distribution with σ = 2, what is the ratio of the median to the mode?
[Solution: Median / Mode = exp(μ) / exp(μ - σ²) = exp(σ²) = e⁴ = 54.6.]
For a LogNormal Distribution with σ = 2, what is the probability of a loss of size less than the mean?
[Solution: Mean = exp(μ + 0.5σ²). F(mean) = Φ[(ln(mean) - μ)/σ] = Φ[(μ + 0.5σ² - μ)/σ] =
Φ[0.5σ] = Φ[1] = 84.13%.]

119 See the section on tails of distributions.
The LogNormal is the distribution with the heaviest tail such that all of its moments exist.
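A small Python sketch (mine, using only the standard library) of these LogNormal facts for the μ = 8.0064, σ = 0.6368 example above:

import math
from statistics import NormalDist

mu, sigma = 8.0064, 0.6368
mean = math.exp(mu + sigma**2 / 2)     # about 3674
median = math.exp(mu)                  # about 3000
mode = math.exp(mu - sigma**2)         # about 2000

# probability a loss is below the mean: Phi[(ln(mean) - mu)/sigma] = Phi[sigma/2]
p_below_mean = NormalDist().cdf((math.log(mean) - mu) / sigma)
print(mean, median, mode, p_below_mean)   # Phi[0.3184] is about 0.62 for this sigma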

Weibull:
τ is a shape parameter, while θ is a scale parameter.
τ = 1 is the Exponential Distribution.

F(x) = 1 - exp[-(x/θ)^τ]        f(x) = τ(x/θ)^τ exp[-(x/θ)^τ] / x = τ x^{τ-1} exp[-(x/θ)^τ] / θ^τ.

The mode of a Weibull is: θ{(τ - 1)/τ}^{1/τ} for τ > 1, and 0 for τ ≤ 1.

The Weibull Distribution is a generalization of the Exponential. One applies a power transformation
to the size of loss and gets a new more general distribution with two parameters from the
Exponential Distribution with one parameter. So where x/θ appears in the Exponential Distribution,
(x/θ)^τ appears in the Weibull Distribution.

S(x) = exp[-(x/θ)^τ]. Note that for large τ, the righthand tail can decrease very quickly since x is taken
to a power in the exponential.
Here is a graph of a Weibull Distribution with θ = 100 and τ = 2 (solid), compared to an Exponential
Distribution with θ = 100 (dashed):

[Graph: the two densities plotted for x from 0 to 500; the Exponential starts at 0.01 and declines, while the Weibull rises to a peak and then falls off more quickly.]

For sufficiently large τ the Weibull has a negative skewness.
For τ less than 1 the Mean Excess Loss increases, but for τ greater than 1 the Mean Excess Loss
decreases. For τ = 1 you get the Exponential Distribution, with constant Mean Excess Loss.
For large x the Mean Excess Loss is proportional to x^{1-τ}.
The mean of the Weibull is θΓ(1 + 1/τ). For τ = 1, we have an Exponential with mean θΓ(2) = θ.
For example, for τ = 1/3, the mean is: θΓ(4) = 6θ.
For example, for τ = 2, the mean is: θΓ(3/2) = θ√π/2 = 0.8862θ.

Here is a graph of the mean divided by θ, as a function of the shape parameter τ of the Weibull
Distribution:

[Graph: mean/θ plotted as a function of τ, on a scale from about 0.9 to 1.3.]

The τth moment of a Weibull with shape parameter τ is: θ^τ Γ(1 + τ/τ) = θ^τ Γ(2) = θ^τ.
For example, the third moment of a Weibull Distribution with τ = 3 is θ³.
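A quick Python check (mine) of the Weibull mean formula θΓ(1 + 1/τ), using math.gamma from the standard library:

import math

def weibull_mean(theta, tau):
    return theta * math.gamma(1 + 1 / tau)

print(weibull_mean(100, 1))     # 100: tau = 1 is the Exponential
print(weibull_mean(100, 2))     # about 88.62 = 100*sqrt(pi)/2
print(weibull_mean(100, 1/3))   # 600 = 100*Gamma(4)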

Problems:
24.1 (1 point) Which of the following statements are true?
1. The mean of a LogNormal distribution exists for all values of σ > 0.
2. The mean of a Pareto distribution exists for all values of α > 1.
3. The variance of a Weibull distribution exists for all values of τ > 1.
A. 1        B. 2        C. 1, 2        D. 2, 3        E. 1, 2, 3

24.2 (1 point) Given a Gamma Distribution with a coefficient of variation of 0.5,
what is the value of the parameter α?
A. 1        B. 2        C. 3        D. 4        E. Cannot be determined.
24.3 (2 points) Given a Gamma Distribution with a coefficient of variation of 0.5 and skewness of 1,
what is the value of the parameter θ?
A. 1        B. 2        C. 3        D. 4        E. Cannot be determined.

24.4 (1 point) Given a Weibull Distribution with parameters θ = 100,000 and τ = 0.2,
what is the survival function at 25,000?
A. 43%
B. 44%
C. 45%

D. 46%

E. 47%

24.5 (1 point) Given a Weibull Distribution with parameters θ = 100,000 and τ = 0.2,
what is the mean?
A. less than 10 million
B. at least 10 million but less than 15 million
C. at least 15 million but less than 20 million
D. at least 20 million but less than 25 million
E. at least 25 million
24.6 (1 point) Given a Weibull Distribution with parameters θ = 100,000 and τ = 0.2,
what is the median?
A. less than 10 thousand
B. at least 10 thousand but less than 15 thousand
C. at least 15 thousand but less than 20 thousand
D. at least 20 thousand but less than 25 thousand
E. at least 25 thousand

24.7 (3 points) X follows a Gamma Distribution with α = 4 and θ = 10.
Y follows a Pareto Distribution with α = 3 and θ = 10.
X and Y are independent and Z = XY. What is the variance of Z?
A. 120,000
B. 140,000
C. 160,000
D. 180,000

E. 200,000

24.8 (1 point) Given a Pareto Distribution with parameters = 2.5 and = 34 million,
what is the distribution function at 20 million?
A. less than 60%
B. at least 60% but less than 65%
C. at least 65% but less than 70%
D. at least 70% but less than 75%
E. at least 75%
24.9 (1 point) Given a Pareto Distribution with parameters = 2.5 and = 34 million,
what is the mean?
A. less than 10 million
B. at least 10 million but less than 15 million
C. at least 15 million but less than 20 million
D. at least 20 million but less than 25 million
E. at least 25 million
24.10 (1 point) Given a Pareto Distribution with parameters = 2.5 and = 34 million,
what is the median?
A. less than 10 million
B. at least 10 million but less than 15 million
C. at least 15 million but less than 20 million
D. at least 20 million but less than 25 million
E. at least 25 million
24.11 (2 points) Given a Pareto Distribution with parameters = 2.5 and = 34 million,
what is the standard deviation?
A. less than 50 million
B. at least 50 million but less than 55 million
C. at least 55 million but less than 60 million
D. at least 60 million but less than 65 million
E. at least 65 million

24.12 (2 points) The times of reporting of claims follows a Weibull Distribution with = 1.5 and
= 4.
If 172 claims have been reported by time 5, estimate how many additional claims will be reported in
the future.
A. 56
B. 58
C. 60
D. 62
E. 64
24.13 (1 point) X1 , X2 , X3 , X4 are a sample of size four from an Exponential Distribution with mean
100. What is the mode of X1 + X2 + X3 + X4 ?
A. 0

B. 100

C. 200

D. 300

E. 400

24.14 (2 points) You are given the following:
V is distributed according to a Gamma Distribution with parameters α = 3 and θ = 50.
W is distributed according to a Gamma Distribution with parameters α = 5 and θ = 50.
X is distributed according to a Gamma Distribution with parameters α = 9 and θ = 50.
V, W, and X are independent.
Which of the following is the distribution of Y = V + W + X?
A. Gamma with α = 17 and θ = 50
B. Gamma with α = 17 and θ = 150
C. Gamma with α = 135 and θ = 50
D. Gamma with α = 135 and θ = 150
E. None of the above.


24.15 (3 points) A large sample of claims has an observed average claim size of $2,000 with a
variance of 5 million. Assuming the claim severity distribution to be LogNormal, estimate the
probability that a particular claim exceeds $3,500.
A. less than 0.14
B. at least 0.14 but less than 0.18
C. at least 0.18 but less than 0.22
D. at least 0.22 but less than 0.26
E. at least 0.26
24.16 (1 point) Which of the following are Exponential Distributions?
1. The Gamma Distribution as α approaches infinity.
2. The Gamma Distribution with α = 1.
3. The Weibull Distribution with τ = 1.
A. 1, 2

B. 1, 3

C. 2, 3

D. 1, 2, 3

E. None of A, B, C, or D

24.17 (2 points) Claims are assumed to follow a Gamma Distribution, with = 5 and = 1000.
What is the probability that a claim exceeds 8,000? (Use the Normal Approximation.)
A. less than 4%
B. at least 4% but less than 6%
C. at least 6% but less than 8%
D. at least 8% but less than 10%
E. at least 10%
24.18 (1 point) What is the mode of a Pareto Distribution, with = 2 and = 800?
A.
B.
C.
D.
E.

less than 700


at least 700 but less than 800
at least 800 but less than 900
at least 900 but less than 1000
at least 1000

24.19 (2 points) The claim sizes at first report follow a LogNormal Distribution, with = 10 and
= 2.5.
The amount by which a claim develops from first report to ultimate also follows a LogNormal
Distribution, but with = 0.1 and = 0.5.
Assume that there are no new claims reported after first report, and that the distribution of
development factors is independent of size of claim.
What is the probability that a claim chosen at random is more than 1 million at ultimate?
A. less than 4%
B. at least 4% but less than 6%
C. at least 6% but less than 8%
D. at least 8% but less than 10%
E. at least 10%
24.20 (2 points) X follows a Gamma Distribution, with = 5 and = 1/1000.
What is the expected value of 1/X?
A. less than 200
B. at least 200 but less than 220
C. at least 220 but less than 240
D. at least 240 but less than 260
E. at least 260

24.21 (2 points) X follows a LogNormal Distribution, with = 6 and = 2.5.
What is the expected value of 1/X2 ?
A. less than 1.4
B. at least 1.4 but less than 1.6
C. at least 1.6 but less than 1.8
D. at least 1.8 but less than 2.0
E. at least 2.0
24.22 (1 point) What is the mean of a Pareto Distribution, with = 4 and = 3000?
A.
B.
C.
D.
E.

less than 700


at least 700 but less than 800
at least 800 but less than 900
at least 900 but less than 1000
at least 1000

24.23 (2 points) What is the coefficient of variation of a Pareto Distribution, with = 4 and = 3000?
A.
B.
C.
D.
E.

less than 1.4


at least 1.4 but less than 1.6
at least 1.6 but less than 1.8
at least 1.8 but less than 2.0
at least 2.0

24.24 (2 points) What is the coefficient of skewness of a Pareto Distribution, with = 4 and
= 3000?
A.
B.
C.
D.
E.

less than 1
at least 1 but less than 3
at least 3 but less than 5
at least 5 but less than 7
at least 7

24.25 (3 points) Y is the sum of 90 independent values drawn from a Weibull Distribution
with = 1/2. Using the Normal Approximation, estimate the probability that Y > 1.2 E[Y].
A. 5%

B. 10%

C. 15%

D. 20%

E. 25%

24.26 (2 points) For a LogNormal Distribution, the ratio of the 99th percentile to the 95th percentile
is 3.4. Determine the parameter.
A. 1.8

B. 1.9

C. 2.0

D. 2.1

E. 2.2

24.27 (3 points) You are given:
A claimant receives payments at a rate of 1 paid continuously while disabled.
Payments start immediately.
The force of interest is δ.
The length of disability follows a Gamma distribution with parameters α and θ.
At the time of disability, what is the actuarial present value of these payments?
1 - ( + ) -
1 - (1 + )-

(A)
(B)
(C)
(D)

1 +
(1 + )
(E) None of A, B, C, or D.
24.28 (1 point) What is the median of a LogNormal Distribution, with = 4.2 and = 1.8?
A.
B.
C.
D.
E.

less than 70
at least 70 but less than 75
at least 75 but less than 80
at least 80 but less than 85
at least 85

24.29 (2 points) For a LogNormal Distribution, with = 4.2 and = 1.8, what is the probability that a
claim is less than the mean?
A. less than 70%
B. at least 70% but less than 75%
C. at least 75% but less than 80%
D. at least 80% but less than 85%
E. at least 85%
24.30 (3 points) Calculate the fourth central moment of a Gamma Distribution with parameters α
and θ.
A. 4

B. 4 6

C. 4{3 2 +6}

D. 4{3 + 3 2 +6}

E. 4{4 - 33 + 3 2 +6}

24.31 (1 point) What is the kurtosis of a Gamma Distribution with parameters α and θ?
Hint: Use the solution to the previous question.
A. 3

B. 3 + 6/

C. 3 + 6/

D. 3 + 6/ + 32/2

E. None of A, B, C, or D.

24.32 (1 point) For a LogNormal Distribution, with = 3 and = 0.4, what is the mean?
A. 18

B. 19

C. 20

D. 21

E. 22

24.33 (1 point) For a LogNormal Distribution, with = 3 and = 0.4, what is the median?
A. less than 18
B. at least 18 but less than 19
C. at least 19 but less than 20
D. at least 20 but less than 21
E. at least 21
24.34 (1 point) For a LogNormal Distribution, with = 3 and = 0.4, what is the mode?
A. less than 18
B. at least 18 but less than 19
C. at least 19 but less than 20
D. at least 20 but less than 21
E. at least 21
24.35 (1 point) For a LogNormal Distribution, with = 3 and = 0.4, what is the survival function at
50?
A. 1.1%

B. 1.2%

C. 1.3%

D. 1.4%

E. 1.5%

24.36 (2 points) You are given:

Future lifetimes follow a Gamma distribution with parameters α and θ.
The force of interest is δ.

A whole life insurance policy pays 1 upon death.


What is the actuarial present value of this insurance?
1

(A)
(B)
(C)

( + )
(1 + )
( + )

(D)

(1 + )

(E) None of A, B, C, or D.
24.37 (2 points) Prior to the application of any deductible, losses follow a Pareto Distribution with
= 3.2 and = 135. If there is a deductible of 25, what is the density of non-zero payments at 60?
A.
B.
C.
D.
E.

less than 0.0045


at least 0.0045 but less than 0.0050
at least 0.0050 but less than 0.0055
at least 0.0055 but less than 0.0060
at least 0.0060

Use the following information for the next three questions:
X is Normally distributed with mean 4 and standard deviation 0.8.
24.38 (1 point) What is the mean of eX?
A. 69

B. 71

C. 73

D. 75

E. 77

24.39 (2 points) What is the standard deviation of eX?


A. 69
B. 71
C. 73
D. 75
E. 77
24.40 (2 points) What is the 10th percentile of eX?
A. 20
B. 25
C. 30
D. 35

E. 40

24.41 (2 points) A company has two electric generators.


The time until failure for each generator follows an exponential distribution with mean 10.
The company will begin using the second generator immediately after the first one fails.
What is the probability that both generators have failed by time 30?
A. 70%
B. 75%
C. 80%
D. 85%
E. 90%
24.42 (3 points) For a LogNormal Distribution, with parameters μ and σ, what is the value of the
ratio (mean - mode) / (mean - median)?
A. {1 - exp(-σ²)} / {1 - exp(-0.5σ²)}
B. {1 - exp(-1.5σ²)} / {1 - exp(-0.5σ²)}
C. {1 - exp(-1.5σ²)} / {1 - exp(-σ²)}
D. {1 - exp(-0.5σ²)} / {1 - exp(-σ²)}
E. {1 - exp(-σ²)} / {exp(-σ²) - exp(-0.5σ²)}

24.43 (3 points) You are given:


Future lifetimes follow a Weibull distribution with τ = 2 and θ = 30.

The force of interest is .04.

A whole life insurance policy pays $1 million upon death.


Calculate the actuarial present value of this insurance.

Hint: ∫_b^∞ exp(-x²) dx = √π {1 - Φ(b√2)}.

(A) $325,000

(B) $350,000

(C) $375,000

(D) $400,000

(E) $425,000

24.44 (3 points) The following three LogNormal Distributions have been fit to three different sets of
claim size data for professional liability insurance:
Physicians:   μ = 7.8616,   σ² = 3.1311
Surgeons:    μ = 8.0562,   σ² = 2.8601
Hospitals:    μ = 7.4799,   σ² = 3.1988
Compare their means and coefficients of variation.


24.45 (2 points) The data represented by the following histogram is most likely to follow which of
the following distributions?
A. Normal

B. Exponential

C. Gamma, α > 1

D. Pareto

E. Single Parameter Pareto

24.46 (3 points) Prior to the application of any deductible, losses follow a Pareto Distribution with
= 2.5 and = 47. There is a deductible of 10. What is the variance of amount paid by the insurer
for one loss, including the possibility that the amount paid is zero?
A. 4400
B. 4500
C. 4600
D. 4700
E. 4800
24.47 (3 points) Size of loss is Exponential with mean 4. Three losses occur.
What is the probability that the sum of these three losses is greater than 20?
A. less than 8%
B. at least 8% but less than 10%
C. at least 10% but less than 12%
D. at least 12% but less than 14%
E. at least 14%

Use the following information for the next 4 questions:
Prior to the effect of any deductible, the losses follow a Pareto Distribution with parameters
α = 2.5 and θ = 24.
24.48 (2 points) If the insured has an ordinary deductible of size 15, what is the average payment
by the insurer per non-zero payment?
A. 18
B. 20
C. 22

D. 24

E. 26

24.49 (2 points) If the insured has a franchise deductible of size 15, what is the average payment
by the insurer per non-zero payment?
A. 37
B. 39
C. 41
D. 43
E. 45
24.50 (2 points) If the insured has an ordinary deductible of size 10, what is the average payment
by the insurer per loss?
A. 9.5
B. 10.0
C. 10.5
D. 11.0
E. 11.5
24.51 (2 points) If the insured has an franchise deductible of size 10, and there are 73 losses
expected per year, what are the insurer's expected annual payments?
A. 850
B. 900
C. 950
D. 1000
E. 1050
24.52 (1 point) Given a Weibull Distribution with parameters = 1000 and = 1.5, what is the
mode?
A. less than 470
B. at least 470 but less than 480
C. at least 480 but less than 490
D. at least 490 but less than 500
E. at least 500
24.53 (2 points) Size of loss is LogNormal with = 7 and = 1.6.
One has a sample of 10 independent losses: X1 , X2 , ..., X10.
Let Y be their geometric average, Y = (X1 X2 ... X10)1/10.
Determine the expected value of Y.
A. 1100
B. 1150
C. 1200

D. 1250

E. 1300

24.54 (2 points) Determine the variance of a Weibull Distribution with parameters θ = 9 and τ = 4.
Some values of the gamma function are: Γ(0.25) = 3.62561, Γ(0.5) = 1.77245,
Γ(0.75) = 1.22542, Γ(1) = 1, Γ(1.25) = 0.90640, Γ(1.5) = 0.88623, Γ(1.75) = 0.91906, Γ(2) = 1.
A. 3

B. 5

C. 7

D. 9

E. 11

24.55 (4 points) Demonstrate that for a Gamma Distribution with α ≥ 1,
(mean - mode)/(standard dev.) = (skewness)(kurtosis + 3)/(10 kurtosis - 12 skewness2 - 18).
24.56 (4 points) For a three-year term insurance on a randomly chosen member of a population:
(i) 1/4 of the population are smokers and 3/4 are nonsmokers.
(ii) The future lifetimes follow a Weibull distribution with:
τ = 2 and θ = 15 for smokers
τ = 2 and θ = 20 for nonsmokers
(iii) The death benefit is 100,000 payable at the end of the year of death.
(iv) i = 0.06
Calculate the actuarial present value of this insurance.
(A) 2000
(B) 2100
(C) 2200
(D) 2300
(E) 2400
24.57 (2 points) X1 , X2 , X3 is a sample of size three from a Gamma Distribution with = 5 and
mean of 100. What is the mode of X1 + X2 + X3 ?
A. 200

B. 220

C. 240

D. 260

E. 280

24.58 (3 points) X follows a LogNormal Distribution, with μ = 3 and σ² = 2.
Y also follows a LogNormal Distribution, but with μ = 4 and σ² = 1.5.
X and Y are independent. Z = XY. What is the standard deviation of Z?
A. 35,000 B. 36,000 C. 37,000 D. 38,000 E. 39,000
24.59 (2 points) You are given the following:
The Tan Teen Insurance Company is set up solely to jointly insure 7 independent lives.

Each life has a future lifetime which follows a Weibull Distribution with = 2 and = 30.
Tan Teen starts with assets of 5, which grow continuously at 10% per year.
Thus the assets at time t are: 5(1.10)t.

Tan Teen pays 50 upon the death of the last survivor.


Calculate the probability of ruin of Tan Teen.
A. Less than 0.3%
B. At least 0.3% but less than 0.4%
C. At least 0.4% but less than 0.5%
D. At least 0.5% but less than 0.6%
E. At least 0.6%

24.60 (3 points) For medical malpractice insurance, the size of economic damages follows a
LogNormal Distribution with a coefficient of variation of 5.75.
What is the probability that an economic damage exceeds the mean?
A. 11%

B. 13%

C. 15%

D. 17%

E. 19%

24.61 (2 points) You are given:


(i) The frequency distribution for the number of losses for a policy with no deductible is
Binomial with m = 50 and q = 0.3.
(ii) Loss amounts for this policy follow the Pareto distribution with = 2000 and = 4.
Determine the expected number of payments when a deductible of 500 is applied.
(A) Less than 5
(B) At least 5, but less than 7
(C) At least 7, but less than 9
(D) At least 9, but less than 11
(E) At least 11
24.62 (2 points) The ratio of the median to the mode of a LogNormal Distribution is 5.4.
What is the second parameter, , of this LogNormal?
A. 0.9

B. 1.0

C. 1.1

D. 1.2

E. 1.3

24.63 (2 points) Define the quartiles as the 25th, 50th, and 75th percentiles.
Define the interquartile range as the difference between the third and first quartiles,
in other words as the 75th percentile minus the 25th percentile.
Determine the interquartile range for a Pareto Distribution with = 2 and = 1000.
(A) Less than 825
(B) At least 825, but less than 850
(C) At least 850, but less than 875
(D) At least 875, but less than 900
(E) At least 900

24.64 (3 points) A random sample of size 20 is drawn from a LogNormal Distribution with = 3 and
= 2. Determine E[ X 2 ].
(A) 50,000

(B) 60,000

(C) 70,000

(D) 80,000

(E) 90,000

24.65 (2, 5/83, Q.7) (1.5 point) W has a LogNormal Distribution with parameters
μ = 1 and σ = 2. Which of the following random variables has a uniform distribution on [0, 1]?
A. ln[(Φ[W] - 1)/4]   B. Φ[(ln[W] - 1)/4]   C. Φ[(ln[W] - 1)/2]   D. Φ[ln[(W - 1)/2]]   E. ln[(Φ[W] - 1)/2]

24.66 (2, 5/85, Q.46) (1.5 point) Let X be a continuous random variable with density function
f(x) = ax2 e-bx for x 0, where a > 0 and b > 0. What is the mode of X?
A. 0
B. 2
C. 2/b
D. b/2
E.
24.67 (4, 5/86, Q.49) (1 point) Which of the following statements are true?
1. If X is normally distributed, then ln(X) is lognormally distributed.
2. The tail of a Pareto distribution does not approach zero as fast as does the tail of
a lognormal distribution.
3. The mean of a Pareto distribution exists for all values of its parameters.
A. 1
B. 2
C. 3
D. 1, 2
E. 1, 3
24.68 (4, 5/87, Q.48) (1 point) Which of the following are true?
1. The sum of 25 independent identically distributed random variables from
a markedly skewed distribution has an approximate normal distribution.
2. The sum of two independent normal random variables is a normal random variable.
3. A random variable X is lognormally distributed if X = ln(Z) where Z is normally distributed.
A. 1
B. 2
C. 3
D. 2, 3
E. None of the above.
24.69 (160, 5/88, Q.4) (2.1 points) A survival function is defined by:
f(t) = k (t/2) e-t/; t > 0, > 0. Determine k.
(A) 1 / 4

(B) 1 / 2

(C) 1

(D) 2

(E) 4

24.70 (4, 5/88, Q.47) (1 point) Which of the following statements are true?
1. The LogNormal distribution is symmetric.
2. The tail of the Pareto distribution fades to zero more slowly than does that of a LogNormal,
for large enough x.
3. An important application of the binomial distribution is in connection with the distribution
of claim frequencies when the risks are not homogeneous.
A. 1
B. 2
C. 1, 2
D. 2, 3
E. 1, 2, 3

24.71 (4, 5/89, Q.42) (1 point) Which of the following are true?
1. If the independent random variables X and Y are Poisson variables,
then Z = X + Y is a Poisson random variable.
2. If X1 ,,....Xn are n independent unit normal variables, then the sum Z = X1 +..+ Xn is
normally distributed with mean = n and standard deviation = n.
3. If X is normally distributed then Y = lnX has a LogNormal distribution.
A. 1
B. 3
C. 1, 2
D. 1, 3
E. 1, 2, 3
24.72 (4, 5/89, Q.44) (2 points) The severities of individual claims are Pareto distributed, with
parameters = 8/3 and = 8,000. Using the Central Limit Theorem, what is the probability that the
sum of 100 independent claims will exceed 600,000?
A. Less than 0.025
B. At least 0.025, but less than 0.075
C. At least 0.075, but less than 0.125
D. At least 0.125, but less than 0.175
E. 0.175 or more
24.73 (4, 5/89, Q.56) (1 point) The random variable X has a Pareto distribution F(x) = 1 - {100/(100+x)}², for x > 0. Which of the following distribution functions represents the distribution function of X truncated from below at 100?
A. 1 - {200/(100+x)}², x > 100
B. 1 - (100/x)², x > 100
C. 1 - 1/(x - 100)², x > 100
D. 1 - {200/(200+x)}², x > 0
E. 1 - {100/(x - 100)}², x > 0

24.74 (4, 5/90, Q.50) (2 points) The underlying claim severity distribution for the ABC Insurance Company is lognormal with parameters μ = 7 and σ² = 10. The company only records losses that are less than $50,000. Let X be the random variable representing all losses with cumulative distribution function FX(x) and Y be the random variable representing the company's recorded losses with cumulative distribution function FY(x).
Then for x ≤ $50,000, FY(x) = A FX(x), where A is the necessary adjustment factor.
In what range does A fall?
A. A < 0.70   B. 0.70 ≤ A < 0.90   C. 0.90 ≤ A < 1.10   D. 1.10 ≤ A < 1.30   E. 1.30 ≤ A
24.75 (4B, 11/92, Q.17) (1 point) You are given the following:
X1 and X2 are independent, identically distributed random variables.
X = X1 + X2.
Which of the following are true?
1. If X1, X2 have Poisson distributions with mean λ, then X has a Poisson distribution with mean 2λ.
2. If X1, X2 have gamma distributions with parameters α and θ, then X has a gamma distribution with parameters 2α and 2θ.
3. If X1, X2 have standard normal distributions, then X has a normal distribution with mean 0 and variance 2.
A. 1 only   B. 2 only   C. 1, 3 only   D. 2, 3 only   E. 1, 2, 3

24.76 (4B, 11/92, Q.31) (2 points) The severity distribution of individual claims is gamma with parameters α = 5 and θ = 1000. Use the Central Limit Theorem to determine the probability that
the sum of 100 independent claims exceeds $525,000.
A. Less than 0.05
B. At least 0.05 but less than 0.10
C. At least 0.10 but less than 0.15
D. At least 0.15 but less than 0.20
E. At least 0.20

24.77 (4B, 11/93, Q.19) (2 points) A random variable X is distributed lognormally with parameters μ = 0 and σ = 1. Determine the probability that X lies within one standard deviation of the mean.
A. Less than 0.65
B. At least 0.65, but less than 0.75
C. At least 0.75, but less than 0.85
D. At least 0.85, but less than 0.95
E. At least 0.95
24.78 (4B, 11/93, Q.28) (3 points) You are given the following:
X1, X2, and X3 are random variables representing the amount of an individual claim.
The first and second moments for X1, X2, and X3 are:
E[X1] = 1.    E[X1²] = 1.5.
E[X2] = 0.5.  E[X2²] = 0.5.
E[X3] = 0.5.  E[X3²] = 1.5.
For which of the random variables X1, X2, and X3 is it appropriate to use a Pareto distribution?
A. X1   B. X2   C. X3   D. X1, X3   E. None of A, B, C or D

24.79 (4B, 5/94, Q.6) (2 points) You are given the following:
Losses follow a Weibull distribution with parameters θ = 20 and τ = 1.0.
A random sample of losses is collected, but the sample data is truncated from below by a deductible of 10.
Determine the probability that an observation from the sample data is at most 25.
A. Less than 0.50
B. At least 0.50, but less than 0.60
C. At least 0.60, but less than 0.70
D. At least 0.70, but less than 0.80
E. At least 0.80
24.80 (4B, 11/94, Q.1) (1 point) You are given the following:
X is a random variable representing size of loss.
Y = ln(X) is a random variable having a normal distribution with a mean of 6.503 and standard
deviation of 1.500. Determine the probability that X is greater than $1,000.
A. Less than 0.300
B. At least 0.300, but less than 0.325
C. At least 0.325, but less than 0.350
D. At least 0.350, but less than 0.375
E. At least 0.375

24.81 (4B, 11/94, Q.10) (2 points) You are given the following:
A portfolio consists of 10 independent risks.
The distribution of annual losses (x is in $ millions) for each risk is given by a Gamma Distribution
f(x) = x^(α-1) e^(-x/θ) / {θ^α Γ(α)}, x > 0, with θ = 1 and α = 0.1.
Determine the probability that the portfolio has aggregate annual losses greater than $1.0 million.
A. Less than 20%
B. At least 20%, but less than 40%
C. At least 40%, but less than 60%
D. At least 60%, but less than 80%
E. At least 80%
24.82 (4B, 11/94, Q.17) (2 points) You are given the following:
Losses follow a Weibull distribution with parameters θ = 20 and τ = 1.0. For each loss that occurs,
the insurer's payment is equal to the amount of the loss truncated and shifted by a deductible of 10.
If the insurer makes a payment, what is the probability that the insurer's payment is less than or equal
to 25?
A. Less than 0.65
B. At least 0.65, but less than 0.70
C. At least 0.70, but less than 0.75
D. At least 0.75, but less than 0.80
E. At least 0.80
24.83 (4B, 11/95, Q.8) (2 points) Losses follow a Weibull distribution, with parameters θ (unknown) and τ = 0.5.
Determine the ratio of the mean to the median.
A. Less than 2.0
B. At least 2.0, but less than 3.0
C. At least 3.0, but less than 4.0
D. At least 4.0
E. Cannot be determined from the given information.
24.84 (4B, 11/96, Q.8) (1 point) The random variable Y is the sum of two independent and
identically distributed random variables, X1 and X2 . Which of the following statements are true?
1. If X1 and X2 have Poisson distributions, then Y must have a Poisson distribution.
2. If X1 and X2 have gamma distributions, then Y must have a gamma distribution.
3. If X1 and X2 have lognormal distributions, then Y must have a lognormal distribution.
A. 1   B. 2   C. 3   D. 1, 2   E. 1, 3

Use the following information for the next two questions:
A portfolio consists of 16 independent risks.
For each risk, losses follow a Gamma distribution, with parameters θ = 250 and α = 1.

24.85 (4B, 5/97, Q.6) (2 points) Without using the Central Limit Theorem, determine the probability that the aggregate losses for the entire portfolio will exceed 6,000.
A. 1 - Γ(1; 1)   B. 1 - Γ(1; 24)   C. 1 - Γ(1; 384)   D. 1 - Γ(16; 24)   E. 1 - Γ(16; 384)

24.86 (4B, 5/97, Q.7) (2 points) Using the Central Limit Theorem, determine the approximate
probability that the aggregate losses for the entire portfolio will exceed 6,000.
A. Less than 0.0125
B. At least 0.0125, but less than 0.0250
C. At least 0.0250, but less than 0.0375
D. At least 0.0375, but less than 0.0500
E. At least 0.0500
24.87 (4B, 5/97, Q.16) (1 point) You are given the following:
The random variable X has a Weibull distribution, with parameters θ = 625 and τ = 0.5.
Z is defined to be 0.25X.
Determine the correlation coefficient of X and Z.
A. 0.00   B. 0.25   C. 0.50   D. 0.75   E. 1.00

24.88 (4B, 11/98, Q.27) (2 points) Determine the skewness of a gamma distribution with a
coefficient of variation of 1. Hint: The skewness of a distribution is defined to be the third central
moment divided by the cube of the standard deviation.
A. 0
B. 1
C. 2
D. 4
E. 6
24.89 (4B, 5/99. Q.1) (1 point) Which of the following inequalities is true for a Pareto distribution
with a finite mean?
A. Mean < Median < Mode
B. Mean < Mode < Median
C. Median < Mode < Mean
D. Mode < Mean < Median
E. Mode < Median < Mean
24.90 (Course 160 Sample Exam #1, 1999, Q.1) (1.9 points)
For a laser operated gene splicer, you are given:
(i) It has a Weibull survival model with parameters θ = √2 and τ = 2.
(ii) It was operational at time t = 1.
(iii) It failed prior to time t = 4.
Calculate the probability that the splicer failed between times t = 2 and t = 3.
(A) 0.2046   (B) 0.2047   (C) 0.2048   (D) 0.2049   (E) 0.2050

24.91 (1, 5/00, Q.7) (1.9 points) An insurance company's monthly claims are modeled by a continuous, positive random variable X, whose probability density function is proportional to (1 + x)^(-4), where 0 < x < ∞.
Determine the company's expected monthly claims.
(A) 1/6   (B) 1/3   (C) 1/2   (D) 1   (E) 3

24.92 (3, 5/00, Q.8) (2.5 points) For a two-year term insurance on a randomly chosen member of
a population:
(i) 1/3 of the population are smokers and 2/3 are nonsmokers.
(ii) The future lifetimes follow a Weibull distribution with:
τ = 2 and θ = 1.5 for smokers,
τ = 2 and θ = 2.0 for nonsmokers.
(iii) The death benefit is 100,000 payable at the end of the year of death.
(iv) i = 0.05
Calculate the actuarial present value of this insurance.
(A) 64,100 (B) 64,300 (C) 64,600 (D) 64,900 (E) 65,100
24.93 (3, 5/01, Q.24) (2.5 points) For a disability insurance claim:
(i) The claimant will receive payments at the rate of 20,000 per year, payable
continuously as long as she remains disabled.
(ii) The length of the payment period in years is a random variable with the gamma
distribution with parameters α = 2 and θ = 1.
(iii) Payments begin immediately.
(iv) δ = 0.05
Calculate the actuarial present value of the disability payments at the time of disability.
(A) 36,400 (B) 37,200 (C) 38,100 (D) 39,200 (E) 40,000
24.94 (1, 11/01, Q.35) (1.9 points) Auto claim amounts, in thousands, are modeled by a random variable with density function f(x) = x e^(-x) for x ≥ 0.
The company expects to pay 100 claims if there is no deductible.
How many claims does the company expect to pay if the company decides to introduce a deductible of 1000?
(A) 26   (B) 37   (C) 50   (D) 63   (E) 74

24.95 (CAS3, 5/04, Q.21) (2.5 points) Auto liability losses for a group of insureds (Group R) follow a Pareto distribution with α = 2 and θ = 2,000.
Losses from a second group (Group S) follow a Pareto distribution with α = 2 and θ = 3,000.
Group R has an ordinary deductible of 500, while Group S has a franchise deductible of 200.
Calculate the amount that the expected cost per payment for Group S exceeds that for
Group R.
A. Less than 350
B. At least 350, but less than 650
C. At least 650, but less than 950
D. At least 950, but less than 1,250
E. At least 1,250
24.96 (CAS3, 11/04, Q.25) (2.5 points)
Let X be the random variable representing the aggregate losses for an insured.
X follows a gamma distribution with mean of $1 million and coefficient of variation 1.
An insurance policy pays for aggregate losses that exceed twice the expected value of X.
Calculate the expected loss for the policy.
A. Less than $100,000
B. At least $100,000, but less than $200,000
C. At least $200,000, but less than $300,000
D. At least $300,000, but less than $400,000
E. At least $400,000
24.97 (CAS3, 5/05, Q.35) (2.5 points) An insurance company offers two types of policies,
Type Q and Type R. Type Q has no deductible, but a policy limit of 3,000.
Type R has no limit, but an ordinary deductible of d.
Losses follow a Pareto distribution with θ = 2,000 and α = 3.
Calculate the deductible, d, such that both policy types have the same expected cost per loss.
A. Less than 50
B. At least 50, but less than 100
C. At least 100, but less than 150
D. At least 150, but less than 200
E. 200 or more
24.98 (SOA M, 11/05, Q.8) (2.5 points) A Mars probe has two batteries. Once a battery is
activated, its future lifetime is exponential with mean 1 year. The first battery is activated when the
probe lands on Mars. The second battery is activated when the first fails. Battery lifetimes after
activation are independent. The probe transmits data until both batteries have failed.
Calculate the probability that the probe is transmitting data three years after landing.
(A) 0.05
(B) 0.10
(C) 0.15
(D) 0.20
(E) 0.25

24.99 (CAS3, 5/06, Q.25) (2.5 points)
Calculate the skewness of a Pareto distribution with α = 4 and θ = 1,000.
A. Less than 2
B. At least 2, but less than 4
C. At least 4, but less than 6
D. At least 6, but less than 8
E. At least 8
24.100 (CAS3, 5/06, Q.36) (2.5 points)
The following information is available for a collective risk model:
X is a random variable representing the size of each loss.
X follows a Gamma Distribution with α = 2 and θ = 100.
N is a random variable representing the number of claims.
S is a random variable representing aggregate losses.
S = X1 + ... + XN.
Calculate the mode of S when N = 5.
A. Less than 950
B. At least 950 but less than 1050
C. At least 1050 but less than 1150
D. At least 1150 but less than 1250
E. At least 1250
24.101 (4, 5/07, Q.39) (2.5 points) You are given:
(i) The frequency distribution for the number of losses for a policy with no deductible is negative binomial with r = 3 and β = 5.
(ii) Loss amounts for this policy follow the Weibull distribution with θ = 1000 and τ = 0.3.
Determine the expected number of payments when a deductible of 200 is applied.
(A) Less than 5
(B) At least 5, but less than 7
(C) At least 7, but less than 9
(D) At least 9, but less than 11
(E) At least 11
24.102 (2 points) In the previous question, 4, 5/07, Q.39, determine the variance of the number of
payments when a deductible of 200 is applied.
A. 20
B. 30
C. 40
D. 50
E. 60

Solutions to Problems:
24.1. E. 1. True. All of the moments of a LogNormal exist. 2. True. (For the Pareto, the nth moment exists only if α > n.) 3. True. All of the moments of a Weibull exist.
24.2. D. For the Gamma Distribution, the mean is αθ, while the variance is αθ².
Thus the coefficient of variation is: (variance^0.5)/mean = (αθ²)^0.5/(αθ) = 1/α^0.5.
Thus for the Gamma Distribution, α = 1/CV². Thus α = 1/(0.5)² = 4.
24.3. E. Both the coefficient of variation and the skewness do not depend on the scale parameter θ. Therefore θ can not be determined from the given information.
For the Gamma Distribution, the coefficient of variation = 1/√α and Skewness = 2/√α.
24.4. E. S(25,000) = exp[-(25,000/100,000)^0.2] = 0.469.
24.5. B. The mean of the Weibull is θΓ(1 + 1/τ) = (100,000)Γ(1 + 1/0.2) = (100,000)Γ(6) = (5!)(100,000) = (120)(100,000) = 12 million.
24.6. C. The median of the Weibull is such that 0.5 = F(m) = 1 - exp(-(m/θ)^τ).
Thus, -(m/θ)^τ = ln 0.5. m = θ(-ln 0.5)^(1/τ) = (100,000)(0.693147)^5 = 16,000.
Comment: Note that the median for this Weibull is much smaller than the mean, a symptom of a distribution skewed to the right (positively skewed.)

24.7. C. In general for X and Y two independent variables:
E[XY] = E[X]E[Y] and E[X²Y²] = E[X²]E[Y²].
VAR[XY] = E[(XY)²] - E[XY]² = E[X²Y²] - {E[X]E[Y]}² = E[X²]E[Y²] - E[X]²E[Y]².
For the Pareto Distribution the moments are: E[Y^n] = n! θ^n / {(α-1)(α-2)...(α-n)}, α > n.
Therefore, E[Y] = θ/(α-1) = 10/2 = 5, and E[Y²] = 2θ²/{(α-1)(α-2)} = (2)(10²)/{(2)(1)} = 100.
For the Gamma Distribution the moments are: E[X^n] = θ^n α(α+1)...(α+n-1).
Therefore, E[X] = αθ = 40, and E[X²] = α(α+1)θ² = (4)(5)(10²) = 2000.
Therefore, VAR[Z] = (2000)(100) - {(40)(5)}² = 160,000.
24.8. C. F(20 million) = 1 - {34/(34 + 20)}^2.5 = 68.5%.
24.9. D. The mean of the Pareto is θ/(α-1) = 34 million/1.5 = 22.667 million.
24.10. B. The median of the Pareto is such that 0.5 = F(m) = 1 - {θ/(θ+m)}^α.
Thus (θ+m)/θ = 0.5^(-1/α), and m = θ{0.5^(-1/α) - 1} = (34 million){0.5^(-1/2.5) - 1} = 10.86 million.
Comment: Note the connection to Simulation by the Method of Inversion and to fitting via Percentile Matching.
24.11. B. For the Pareto Distribution the moments are: E[X^n] = n! θ^n / {(α-1)(α-2)...(α-n)}, α > n.
Therefore putting everything in units of 1 million, E[X] = θ/(α-1) = 34/1.5 = 22.67 and
E[X²] = 2θ²/{(α-1)(α-2)} = (2)(34²)/{(1.5)(0.5)} = 3083. Thus VAR[X] = 3083 - (22.67)² = 2569.
Therefore, the standard deviation of X is √2569 = 50.7 (million).
Alternately, for the Pareto the variance = αθ²/{(α-2)(α-1)²} = (34²)(10^12)(2.5)/{(0.5)(1.5²)} = 2569 x 10^12. Therefore, the standard deviation is: √(2569 x 10^12) = 50.7 million.
Comment: Note that the coefficient of variation = standard deviation/mean = 50.7/22.67 = 2.24 ≈ √5 = √(2.5/0.5) = √{α/(α-2)}.

24.12. A. F(5) = 1 - exp[-(5/4)^1.5] = 0.753. Expect 172/0.753 = 228 claims ultimately reported.
Expect 228 - 172 = 56 claims reported in the future.
Comment: We are dealing with time rather than size of loss.
F(5) = the expected percentage of claims reported by time 5.
F(5) (expected total number of claims) = expected number of claims reported by time 5.
expected total number of claims ≈ (number of claims reported by time 5)/F(5).
24.13. D. X1 + X2 + X3 + X4 is a Gamma Distribution with α = 4 and θ = 100.
As shown in the Appendix attached to the exam, for α > 1 its mode is: θ(α - 1) = 300.
24.14. A. The sum of independent random variables each of which follows a Gamma distribution with the same scale parameter, is also a Gamma distribution; it has a shape parameter equal to the sum of the shape parameters and the same scale parameter.
Thus Y is Gamma with α = 3 + 5 + 9 = 17, and θ = 50.
24.15. B. Mean of LogNormal is exp(μ + 0.5σ²). Second Moment of the LogNormal is exp(2μ + 2σ²). Therefore the variance is: exp(2μ + 2σ²) - exp(2μ + σ²) = exp(2μ + σ²){exp(σ²) - 1}.
CV² = Variance/Mean² = exp(σ²) - 1. σ = {ln(1 + CV²)}^0.5 = {ln(1 + 5/4)}^0.5 = 0.9005.
μ = ln(Mean) - 0.5σ² = ln(2000) - 0.5(0.9005²) = 7.1955.
Chance that a claim exceeds $3,500 is 1 - F(3500) = 1 - Φ({ln(3500) - μ}/σ) =
1 - Φ({8.1605 - 7.1955}/0.9005) = 1 - Φ(1.07) = 1 - 0.8577 = 0.1423.
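For readers who like to double check this sort of arithmetic numerically, here is a short Python sketch (my addition, not part of the original solution) that mirrors the steps above: back out σ and μ from the LogNormal's mean and coefficient of variation, then compute the probability that a claim exceeds 3,500. The mean of 2,000 and CV² of 5/4 are the values used in the solution.

```python
from math import log, sqrt, erf

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

mean, cv2 = 2000.0, 5.0 / 4.0          # given mean and CV^2 of the LogNormal
sigma = sqrt(log(1.0 + cv2))           # sigma = sqrt(ln(1 + CV^2))
mu = log(mean) - 0.5 * sigma**2        # mu = ln(mean) - sigma^2 / 2

p_exceed = 1.0 - phi((log(3500.0) - mu) / sigma)
print(round(sigma, 4), round(mu, 4), round(p_exceed, 3))   # 0.9005, 7.1955, about 0.142
```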
24.16. C. The Gamma Distribution as α approaches infinity is a Normal Distribution.
The Gamma Distribution with α = 1 is an Exponential Distribution.
The Weibull Distribution with τ = 1 is an Exponential Distribution.
24.17. D. For the Gamma Distribution: Mean = αθ = 5000, Variance = αθ² = 5 million.
Thus the Standard Deviation is √(5 million) = 2236. Thus the chance of a claim exceeding 8,000 is approximately: 1 - Φ((8000 - 5000)/2236) = 1 - Φ(1.34) = 1 - 0.9099 = 9.01%.
Comment: When applying the Normal Approximation to a continuous distribution, there is no continuity correction such as is applied when approximating a discrete distribution.
(In any case, here it would make no significant difference.) The Gamma approaches a Normal as α approaches infinity. In this case, the exact answer is given via an incomplete Gamma Function:
1 - Γ(α; x/θ) = 1 - Γ(5; 8) = 1 - 0.900368 = 9.9632%, gotten via computer.
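As a numerical aside (mine, not the original text), both the normal approximation and the exact incomplete-Gamma value quoted in the comment can be reproduced in Python, assuming scipy is available; scipy.special.gammainc is the regularized lower incomplete gamma Γ(α; x).

```python
from math import sqrt, erf
from scipy.special import gammainc   # regularized lower incomplete gamma

alpha, theta = 5.0, 1000.0           # Gamma parameters from the problem
mean, sd = alpha * theta, sqrt(alpha) * theta

# Normal approximation to Prob[X > 8000]
approx = 1.0 - 0.5 * (1.0 + erf((8000.0 - mean) / (sd * sqrt(2.0))))
# Exact: 1 - Gamma(alpha; x/theta)
exact = 1.0 - gammainc(alpha, 8000.0 / theta)
print(round(approx, 4), round(exact, 4))   # about 0.090 versus about 0.100
```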

24.18. A. The mode of any Pareto Distribution is 0.
24.19. C. Let X be the losses at first report, let Y be the loss development factor and let
Z = XY be the losses at ultimate. Ln(X) and ln(Y) are two independent Normal variables.
Therefore, ln(Z) = ln(XY) = ln(X) + ln(Y) is a Normal variable. Therefore, Z is a LogNormal variable.
ln(Z) = ln(X) + ln(Y) has mean equal to the sum of the means of ln(X) and ln(Y) and variance equal to the sum of the variances of ln(X) and ln(Y). Therefore ln(Z) has parameters μ = 10 + 0.1 = 10.1 and σ² = 2.5² + 0.5² = 6.5, and therefore so does Z.
For a LogNormal Distribution F(z) = Φ[{ln(z) - μ}/σ]. In this case with μ = 10.1 and σ = 2.55,
F(1,000,000) = Φ[{ln(1,000,000) - 10.1}/2.55] = Φ[1.46] = 0.9279.
1 - F(1,000,000) = 1 - 0.9279 = 0.0721.
Comment: The product of two independent LogNormal variables is also LogNormal.
Note that the variances add, not the σ parameters themselves.

24.20. D. E[1/X] = ∫0 to ∞ f(x)/x dx = {1/(θ^α Γ(α))} ∫ x^(α-2) e^(-x/θ) dx =
{1/(θ^α Γ(α))} Γ(α-1) θ^(α-1) = 1/{θ(α-1)} = 1000/(5-1) = 250.
Alternately, the moments of the Gamma Distribution are E[X^n] = θ^n Γ(α+n)/Γ(α).
This formula works for n positive or negative. Therefore for n = -1, α = 5 and θ = 1/1000:
E[1/X] = 1000 Γ(4)/Γ(5) = 1000 (3!)/(4!) = 1000/4 = 250.
Alternately, if X follows a Gamma, then Z = 1/X has Distribution F(z) = 1 - Γ(α; 1/(θz)) =
1 - Γ(5; 1000/z), which is an Inverse Gamma, with scale parameter 1000 and α = 5.
The Inverse Gamma has Mean = θ/(α-1) = 1000/(5-1) = 250.
Comment: Note that theta for the Gamma of 1/1000 becomes one over theta for the Inverse Gamma, which has theta equal to 1000.

24.21. C. The moments of the LogNormal Distribution are E[X^n] = exp[nμ + 0.5n²σ²].
Therefore for n = -2, with μ = 6 and σ = 2.5, E[1/X²] = exp[-12 + (2)(2.5²)] = e^0.5 = 1.65.
Alternately, if lnX follows a Normal, then if Z = 1/X, lnZ = -lnX also follows a Normal (but with mean of -6 and standard deviation of 2.5.) Therefore, Z follows a LogNormal with μ = -6 and σ = 2.5.
Then one can apply the formula for moments of the LogNormal in order to get the second moment of Z: E[1/X²] = E[Z²] = exp[-12 + (2)(2.5²)] = e^0.5 = 1.65.
Alternately, if Y = 1/X², then lnY = -2lnX also follows a Normal but with mean = (-2)(6) = -12 and standard deviation equal to (|-2|)(2.5) = (2)(2.5) = 5. Thus Y follows a LogNormal with
μ = -12 and σ = 5. Thus E[1/X²] = E[Y] = exp[μ + 0.5σ²] = exp[-12 + (1/2)(5²)] = e^0.5 = 1.65.
24.22. E. For a Pareto, mean = θ/(α-1) = 3000/(4-1) = 1000.
24.23. B. For a Pareto, coefficient of variation = √{α/(α-2)} = √2 = 1.414.
Alternately, for the Pareto Distribution the moments are: E[X^n] = n! θ^n / {(α-1)(α-2)...(α-n)}, α > n.
E[X] = θ/(α-1) = 1000. E[X²] = 2θ²/{(α-1)(α-2)} = 3,000,000. σ = √(E[X²] - E[X]²) = 1414.2.
coefficient of variation = σ/E[X] = 1414.2/1000 = 1.414.
Comment: The coefficient of variation does not depend on the scale parameter θ.
24.24. E. For a Pareto, skewness = 2{(α+1)/(α-3)} √{(α-2)/α} = (2)(5)√(2/4) = 7.071.
Alternately, for the Pareto Distribution the moments are: E[X^n] = n! θ^n / {(α-1)(α-2)...(α-n)}, α > n.
E[X] = θ/(α-1) = 1000. E[X²] = 2θ²/{(α-1)(α-2)} = 3,000,000. σ = √(E[X²] - E[X]²) = 1414.2.
E[X³] = 6θ³/{(α-1)(α-2)(α-3)} = 27,000,000,000.
skewness = {E[X³] - 3E[X]E[X²] + 2E[X]³}/σ³ = 7.07.
Comment: The skewness does not depend on the scale parameter θ. Notice the large positive skewness, which is typical for a heavier-tailed distribution such as the Pareto, when its skewness exists (for α > 3.)

24.25. D. The mean of the Weibull is: θΓ(1 + 1/τ) = θΓ(1 + 2) = θΓ(3) = 2!θ = 2θ.
The second moment of the Weibull is: θ²Γ(1 + 2/τ) = θ²Γ(1 + 4) = θ²Γ(5) = 4!θ² = 24θ².
The variance of the Weibull is: 24θ² - (2θ)² = 20θ².
Y has a mean of: (90)(2θ) = 180θ, and a variance of: (90)(20θ²) = 1800θ².
Prob[Y > 1.2 E[Y]] = 1 - Φ[(216θ - 180θ)/√(1800θ²)] = 1 - Φ[0.85] = 19.77%.
24.26. A. The 99th percentile is: exp[μ + 2.326σ]. The 95th percentile is: exp[μ + 1.645σ].
3.4 = exp[μ + 2.326σ]/exp[μ + 1.645σ] = exp[0.681σ]. σ = ln(3.4)/0.681 = 1.797.
24.27. B. Given a disability of length t, the present value of an annuity certain is:
(1 - e^(-δt))/δ. The expected present value is the average of this over all t:
∫0 to ∞ {(1 - e^(-δt))/δ} f(t) dt = (1/δ){∫ f(t) dt - ∫ e^(-δt) t^(α-1) e^(-t/θ) / (θ^α Γ(α)) dt}
= (1/δ){1 - (1/(θ^α Γ(α))) ∫0 to ∞ e^(-t(δ + 1/θ)) t^(α-1) dt} = (1/δ){1 - (1/(θ^α Γ(α))) Γ(α)(δ + 1/θ)^(-α)}
= (1/δ){1 - (1 + δθ)^(-α)} = {1 - (1 + δθ)^(-α)}/δ.
Comment: Similar to 3, 5/01, Q. 24.
I used the fact that ∫0 to ∞ t^(α-1) e^(-t/θ) dt = Γ(α) θ^α, since the Gamma density integrates to 1 over its support.

24.28. A. The median is where the Distribution Function is 0.5. Φ[{ln(x) - μ}/σ] = 0.5.
Therefore, {ln(x) - μ}/σ = 0. x = e^μ = e^4.2 = 66.7.

24.29. D. The mean is exp[μ + 0.5σ²]. The Distribution Function at the mean is:
Φ[{ln(exp[μ + 0.5σ²]) - μ}/σ] = Φ[{(μ + 0.5σ²) - μ}/σ] = Φ[σ/2] = Φ[0.9] = 0.8159.
Comment: For a heavier-tailed distribution, there's only a small chance that a claim is greater than the mean; a few large claims contribute a great deal to the mean.
The mean is: exp[μ + 0.5σ²] = exp[4.2 + (1/2)(1.8²)] = exp[5.82] = 336.972.
F(336.972) = Φ[{ln(336.972) - 4.2}/1.8] = Φ[0.9] = 0.8159.
24.30. C. For the Gamma Distribution, the moments about the origin are:
E[X^n] = θ^n Γ(α+n)/Γ(α). E[X] = αθ. E[X²] = α(α+1)θ². E[X³] = α(α+1)(α+2)θ³.
E[X⁴] = α(α+1)(α+2)(α+3)θ⁴.
Fourth Central Moment = E[X⁴] - 4E[X]E[X³] + 6E[X]²E[X²] - 3E[X]⁴ =
αθ⁴{(α+1)(α+2)(α+3) - 4α(α+1)(α+2) + 6α²(α+1) - 3α³} =
αθ⁴{α³ + 6α² + 11α + 6 - 4α³ - 12α² - 8α + 6α³ + 6α² - 3α³} = θ⁴{3α² + 6α}.
24.31. B. From the previous solution, for the Gamma Distribution,
Fourth Central Moment = θ⁴{3α² + 6α}. Variance = αθ².
Kurtosis = Fourth Central Moment/Variance² = 3 + 6/α.
Comment: Note that the scale parameter, θ, does not appear in the kurtosis, which is a dimensionless quantity. Also note that the kurtosis of a Gamma is always larger than that of a Normal Distribution, which is 3. The Gamma has a heavier tail than the Normal.
24.32. E. For the LogNormal, the mean is: exp(μ + 0.5σ²) = exp(3.08) = 21.8.
24.33. D. The median is that point where F(x) = 0.5.
Thus Φ[{ln(x) - μ}/σ] = 0.5. 0 = {ln(x) - μ}/σ.
Thus ln(x) = μ, or the median = e^μ = e^3 = 20.1.
24.34. A. The mode is that point where f(x) is a maximum. For the LogNormal:
f(x) = exp[-0.5({ln(x) - μ}/σ)²] / {xσ√(2π)}.
f'(x) = -f(x)/x - f(x)({ln(x) - μ}/σ²)/x.
Thus f'(x) = 0 for ({ln(x) - μ}/σ²) = -1. mode = exp(μ - σ²) = exp(2.84) = 17.1.

24.35. A. S(50) = 1 - Φ[(ln(50) - 3)/0.4] = 1 - Φ[2.28] = 1 - 0.9887 = 1.13%.
24.36. E. The probability of death at time t is the density of the Gamma Distribution:
f(t) = e^(-t/θ) t^(α-1) / {θ^α Γ(α)}. The present value of a payment of one at time t is e^(-δt).
Therefore, the actuarial present value of this insurance is:
∫0 to ∞ e^(-δt) e^(-t/θ) t^(α-1) / (θ^α Γ(α)) dt = {1/(θ^α Γ(α))} ∫0 to ∞ e^(-(δ + 1/θ)t) t^(α-1) dt
= {1/(θ^α Γ(α))} Γ(α) (δ + 1/θ)^(-α) = (1 + δθ)^(-α).
Comment: The Gamma Distribution is too heavy-tailed to be a good model of future lifetimes.
24.37. C. After truncating and shifting from below by a deductible of size d, one gets another Pareto Distribution, but with parameters α and θ + d, in this case 3.2 and 135 + 25 = 160.
This has density of: αθ^α (θ + x)^(-(α + 1)) = (3.2)(160^3.2)(160 + x)^(-4.2).
Plugging in x = 60 one gets: (3.2)(160^3.2)(160 + 60)^(-4.2) = 0.00525.
Alternately, after truncating and shifting by 25, G(x) = 1 - S(x+25)/S(25) =
1 - {(135/(135 + x + 25))^3.2}/{(135/(135 + 25))^3.2} = 1 - (160/(160 + x))^3.2.
This is a Pareto Distribution with α = 3.2 and θ = 160. Proceed as before.
Alternately, after truncating and shifting by 25, g(x) = f(x+25)/S(25) =
(3.2)(135^3.2)(135 + x + 25)^(-4.2) / {(135/(135 + 25))^3.2} = (3.2)(160^3.2)(160 + x)^(-4.2).
g(60) = (3.2)(160^3.2)(220)^(-4.2) = 0.00525.
24.38. D. & 24.39. B. e^X is LogNormal with μ = 4 and σ = 0.8.
Mean of LogNormal = E[e^X] = exp[μ + σ²/2] = exp[4 + 0.8²/2] = 75.189.
E[(e^X)²] = E[e^(2X)] = Second moment of LogNormal = exp[2μ + 2σ²] = exp[(2)(4 + 0.8²)] = 10,721.4.
Variance of LogNormal = 10,721.4 - 75.189² = 5068.0.
Standard Deviation of LogNormal = √5068.0 = 71.2.
Alternately, for the LogNormal, 1 + CV² = E[X²]/E[X]² = exp[σ²] = exp[0.8²] = 1.8965.
CV = 0.9468. Standard Deviation of LogNormal = (0.9468)(75.189) = 71.2.

24.40. A. e^X is LogNormal with μ = 4 and σ = 0.8.
0.1 = Φ[{ln(x) - 4}/0.8]. -1.282 = {ln(x) - 4}/0.8. x = 19.58.
Alternately, the 10th percentile of the Normal Distribution is: 4 - (1.282)(0.8) = 2.9744.
Thus, the 10th percentile of e^X is: e^2.9744 = 19.58.
24.41. C. The sum of two identically distributed Exponential Distributions is a Gamma Distribution with α = 2. The density of a Gamma Distribution with α = 2 and θ = 10 is:
f(t) = (t/10)² e^(-t/10) / {t Γ(2)} = 0.01 t e^(-t/10).
Prob[T ≤ 30] = ∫0 to 30 0.01 t e^(-t/10) dt = [-(t/10)e^(-t/10) - e^(-t/10)] evaluated from t = 0 to t = 30 = 1 - 4e^(-3) = 80.1%.
Alternately, Prob[T ≤ 30] = Γ[2; 30/10] = 1 - (30/10)e^(-30/10) - e^(-30/10) = 80.1%.


24.42. B. For the LogNormal: mean is exp[μ + 0.5σ²], median = e^μ, mode = exp(μ - σ²).
(mean - mode)/(mean - median) = {exp(μ + 0.5σ²) - exp(μ - σ²)}/{exp(μ + 0.5σ²) - e^μ} =
{1 - exp(-1.5σ²)}/{1 - exp(-0.5σ²)}.
Comment: Note that for the LogNormal, mean > median > mode (alphabetical order). This is typical for a continuous distribution with positive skewness. (The situation is reversed for negative skewness.) Also note that the median is closer to the mean than to the mode (just as it is in the dictionary.) Also, note that as σ goes to zero, this ratio goes to 1.5/0.5 = 3. (For curves with mild skewness, it is reasonable to approximate this ratio by 3, according to Kendall's Advanced Theory of Statistics.)

24.43. D. The probability of death at time t is the density of the Weibull Distribution:
f(t) = τ(t/θ)^τ exp(-(t/θ)^τ)/t = 2t exp(-(t/30)²)/30².
The present value of a payment of one at time t is e^(-δt) = e^(-0.04t).
Therefore, the actuarial present value of this insurance in millions of dollars is:
∫0 to ∞ e^(-0.04t) t exp(-(t/30)²)/450 dt = (1/450) ∫0 to ∞ t exp(-{(t/30)² + 0.04t + 0.6² - 0.36}) dt
= (1/450) e^0.36 ∫0 to ∞ t exp(-(t/30 + 0.6)²) dt = (1/450) e^0.36 ∫0.6 to ∞ (30x - 18) exp(-x²) (30) dx
= e^0.36 {∫0.6 to ∞ 2x exp(-x²) dx - (6/5) ∫0.6 to ∞ exp(-x²) dx}
= e^0.36 {[-exp(-x²)] evaluated from x = 0.6 to x = ∞} - e^0.36 (6/5) √π {1 - Φ(0.6√2)}
= 1 - e^0.36 (6/5) √π {1 - Φ(0.8485)} = 1 - (1.433)(1.2)(1.772)(1 - 0.8019) = 1 - 0.604 = 0.396.
Where I've used the change of variables x = (t/30) + 0.6 and made use of the hint.
Thus the actuarial present value of this insurance is: ($1 million)(0.396) = $396,000.
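The change-of-variables argument above can be double checked by brute-force numerical integration. The following Python sketch (my own check, not from the original) integrates e^(-0.04t) f(t) for the Weibull with τ = 2 and θ = 30 using a simple midpoint rule.

```python
from math import exp

def f(t, theta=30.0, tau=2.0):
    """Weibull density with tau = 2, theta = 30."""
    return (tau / t) * (t / theta)**tau * exp(-(t / theta)**tau) if t > 0 else 0.0

delta, n, T = 0.04, 200_000, 300.0     # force of interest; integration grid out to 10*theta
h = T / n
apv = sum(exp(-delta * (i + 0.5) * h) * f((i + 0.5) * h) * h for i in range(n))
print(round(apv, 3))                   # about 0.396, i.e. $396,000 on a $1 million benefit
```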
24.44. Mean = exp[μ + σ²/2]. Physicians: exp[7.8616 + 3.1311/2] = 12,421.
Surgeons: exp[8.0562 + 2.8601/2] = 13,177. Hospitals: exp[7.4799 + 3.1988/2] = 8772.
Hospitals have the smallest mean, while Surgeons have the largest mean.
1 + CV² = E[X²]/E[X]². CV = √(E[X²]/E[X]² - 1) = √(exp[2μ + 2σ²]/exp[μ + σ²/2]² - 1) = √(exp[σ²] - 1).
Physicians: √(exp[3.1311] - 1) = 4.680. Surgeons: √(exp[2.8601] - 1) = 4.057.
Hospitals: √(exp[3.1988] - 1) = 4.848.
Surgeons have the smallest CV, while Hospitals have the largest CV.
Comment: Taken from Table 4 of Sheldon Rosenberg's discussion of "On the Theory of Increased Limits and Excess of Loss Pricing," PCAS 1977. Based on data from Policy Year 1972.

24.45. C. The Normal distribution is symmetric. The Exponential and Pareto each have a mode of zero. The Single Parameter Pareto has support x > θ > 0 and mode = θ.
If it were a Single Parameter Pareto, then the biggest density (the mode) would be where the support of the density starts. This is not the case for the given histogram.
Thus none of these are similar to the histogram.
The Gamma for α > 1 has a mode > 0.
Comment: This histogram is from a Gamma Distribution with α = 4.
24.46. E. After truncating and shifting from below, one gets a Pareto Distribution with
α = 2.5 and θ = 47 + 10 = 57. Thus the nonzero payments are Pareto with α = 2.5 and θ = 57.
This has mean: θ/(α - 1) = 57/1.5 = 38, second moment: 2θ²/{(α - 1)(α - 2)} = 8664, and variance:
8664 - 38² = 7220. The probability of a nonzero payment is the probability that a loss is greater than the deductible of 10; for the original Pareto, S(10) = {47/(47+10)}^2.5 = 0.617.
Thus the payments of the insurer can be thought of as an aggregate distribution, with Bernoulli frequency with mean 0.617 and Pareto severity with α = 2.5 and θ = 57.
The variance of this aggregate distribution is: (Mean Freq.)(Var. Sev.) + (Mean Sev.)²(Var. Freq.) =
(0.617)(7220) + (38²){(0.617)(1 - 0.617)} = 4796.
Comment: Similar to 3, 11/00, Q.21.
24.47. D. The sum of three identically distributed Exponential Distributions is a Gamma Distribution with α = 3. The density of a Gamma Distribution with α = 3 and θ = 4 is:
f(x) = (x/4)³ e^(-x/4) / {x Γ(3)} = x² e^(-x/4)/128.
Prob[X > 20] = ∫20 to ∞ x² e^(-x/4)/128 dx = [-(x/4)² e^(-x/4)/2 - (x/4) e^(-x/4) - e^(-x/4)] evaluated from x = 20 to x = ∞ = 18.5e^(-5) = 12.5%.
Alternately, Prob[X > 20] = 1 - Γ[3; 20/4] = (5²/2)e^(-5) + 5e^(-5) + e^(-5) = 18.5e^(-5) = 12.5%.
24.48. E. For an ordinary deductible, the average payment per non-zero payment is:
{E[X] - E[X ∧ d]}/S(d) = {E[X] - E[X ∧ 15]}/S(15) =
{(24/(2.5-1)) - (24/(2.5-1))(1 - {1 + 15/24}^(-1.5))} {1 + 15/24}^2.5 = 26.

24.49. C. For a franchise deductible, one has data truncated from below, and the average payment per non-zero payment is: {E[X] - E[X ∧ d] + S(d)d}/S(d) = e(d) + d =
(15+24)/(2.5-1) + 15 = 41.
Alternately, the numerator is the integral from 15 to infinity of x f(x), which is:
E[X] - E[X ∧ 15] + S(15)(15) = 16 - 8.276 + (0.2971)(15) = 12.180.
The denominator is S(15) = 0.2971.
Thus the average payment per non-zero payment is: 12.180/0.2971 = 41.00.
Alternately, each non-zero payment is 15 more than with an ordinary deductible, thus using the previous solution: 15 + 26 = 41.
Comment: For the Pareto Distribution e(d) = (d+θ)/(α-1).
24.50. A. For an ordinary deductible, the average payment per loss is: E[X] - E[X ∧ d] =
E[X] - E[X ∧ 10] = (24/(2.5-1)) - (24/(2.5-1))(1 - {1 + 10/24}^(-1.5)) = 16{1 + 10/24}^(-1.5) = 9.49.


24.51. D. For a franchise deductible, the average payment per loss is the same as that for an ordinary deductible, except d is added to each nonzero payment.
Average payment per loss = E[X] - E[X ∧ 10] + (10)(Probability of nonzero payment) =
9.49 + (10){24/(24 + 10)}^2.5 = 13.68. (13.68)(73) = 999.
24.52. C. As shown in the Appendix attached to the exam, for τ > 1 its mode is:
θ{(τ - 1)/τ}^(1/τ) = (1000)(0.5/1.5)^(1/1.5) = 481.
24.53. D. lnY = (X1 + X2 + ... + X10)/10, which is the average of 10 independent, identically distributed Normals, which is therefore another Normal with the same mean of 7 and a standard deviation of 1.6/√10. Therefore Y is LogNormal with μ = 7 and σ = 1.6/√10.
Using the formula for the mean of a LogNormal, E[Y] = exp[7 + (1.6/√10)²/2] = e^7.128 = 1246.
Comment: In general, the expected value of the Geometric average of n independent, identically distributed LogNormals is: exp[μ + σ²/(2n)].
As n → ∞, this approaches the median of e^μ.
Note that while the expected value of lnY is μ, it is not true that E[Y] = exp[E[lnY]] = e^μ.
24.54. B. E[X] = (9)Γ[1 + 1/4] = (9)(0.90640) = 8.1576.
E[X²] = (9²)Γ[1 + 2/4] = (81)(0.88623) = 71.7846.
Variance = 71.7846 - 8.1576² = 5.238.

24.55. mean = αθ. mode = θ(α - 1). standard deviation = θ√α.
(mean - mode)/(standard deviation) = {αθ - θ(α - 1)}/(θ√α) = 1/√α.
Skewness = 2/√α. Kurtosis = 3 + 6/α.
(skewness)(kurtosis + 3)/(10 kurtosis - 12 skewness² - 18) =
(2/√α)(3 + 6/α + 3)/(30 + 60/α - 48/α - 18) = (2/√α)(6 + 6/α)/(12 + 12/α) = 1/√α.
Comment: This is true in general for any of the members of the Pearson family of distributions.
See equation 6.6 in Volume I of Kendall's Advanced Theory of Statistics.
24.56. D. The survival function for the Weibull is: S(t) = exp(-(t/θ)^τ).
τ = 2 and θ = 15 for smokers: S(t) = exp(-(t/15)²). S(1) = 0.9956. S(2) = 0.9824. S(3) = 0.9608.
τ = 2 and θ = 20 for nonsmokers: S(t) = exp(-(t/20)²). S(1) = 0.9975. S(2) = 0.9900. S(3) = 0.9778.
Assume for example a total of 400,000 people alive initially.
Then, one fourth or 100,000 are smokers, and three fourths or 300,000 are nonsmokers.

Year | Smoker S(t) | # Smokers Surviving | # Smoker Deaths | Nonsmoker S(t) | # Nonsmokers Surviving | # Nonsmoker Deaths | Total # Deaths
  0  |   1.0000    |       100,000       |                 |     1.0000     |        300,000         |                    |
  1  |   0.9956    |        99,557       |        443      |     0.9975     |        299,251         |         749        |     1,193
  2  |   0.9824    |        98,238       |      1,319      |     0.9900     |        297,015         |       2,236        |     3,555
  3  |   0.9608    |        96,079       |      2,159      |     0.9778     |        293,325         |       3,690        |     5,849

For example, the number of smokers who survive through year one is 99,557, while the number who survive through year two is 98,238.
Therefore, 99,557 - 98,238 = 1,319 smokers are expected to die during year two.
For 400,000 insureds, the actuarial present value of the payments is:
(100,000){1193/1.06 + 3555/1.06² + 5849/1.06³} = 920,034,223.
The actuarial present value of this insurance is: 920,034,223 / 400,000 = 2300.
Comment: Similar to 3, 5/00, Q.8.
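If you want to reproduce the table above programmatically, the following Python sketch (mine, not part of the original) rebuilds the expected deaths by year and the resulting actuarial present value, using the same assumptions as the solution: Weibull survival with τ = 2, θ = 15 for smokers and θ = 20 for nonsmokers, a 100,000/300,000 split of lives, a 100,000 death benefit paid at the end of the year of death, and i = 6%.

```python
from math import exp

def S(t, theta):
    """Weibull survival function with tau = 2."""
    return exp(-(t / theta)**2)

lives = {"smoker": (100_000, 15.0), "nonsmoker": (300_000, 20.0)}
v = 1.0 / 1.06                                    # annual discount factor, i = 6%
apv_total = 0.0
for year in (1, 2, 3):
    deaths = sum(n * (S(year - 1, th) - S(year, th)) for n, th in lives.values())
    apv_total += 100_000 * deaths * v**year       # benefit paid at end of year of death
print(round(apv_total / 400_000))                  # APV per insured, about 2300
```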
24.57. E. θ = Mean/α = 100/5 = 20. The sum is Gamma with α = (5)(3) = 15 and θ = 20.
Mode = θ(α - 1) = (20)(14) = 280.
Comment: Similar to CAS3, 5/06, Q.36.

24.58. B. Ln(X) and ln(Y) are two independent Normal variables.
Therefore, ln(Z) = ln(XY) = ln(X) + ln(Y) is a Normal variable. Therefore, Z is a LogNormal variable.
ln(Z) = ln(X) + ln(Y) has mean equal to the sum of the means of ln(X) and ln(Y) and variance equal to the sum of the variances of ln(X) and ln(Y). Therefore, ln(Z) has parameters
μ = 3 + 4 = 7 and σ² = 2 + 1.5 = 3.5, and therefore so does Z. For the LogNormal Distribution,
variance = exp(2μ + σ²)(exp(σ²) - 1). Thus VAR[Z] = exp(14 + 3.5)(exp(3.5) - 1) =
(39.82 million)(32.12) = 1.279 billion. Standard deviation of Z is √(1.279 billion) = 35,763.
Alternately, in general for X and Y two independent variables: E[XY] = E[X]E[Y] and E[X²Y²] =
E[X²]E[Y²]. VAR[XY] = E[(XY)²] - E[XY]² = E[X²Y²] - {E[X]E[Y]}² = E[X²]E[Y²] - E[X]²E[Y]².
E[X] = exp(μ + 0.5σ²) = e^4, E[Y] = e^4.75, E[X²] = exp(2μ + 2σ²) = e^10, E[Y²] = e^11.
Therefore, Var[Z] = Var[XY] = (e^10)(e^11) - (e^4)²(e^4.75)² = e^21 - e^17.5 = 1.279 billion.
Therefore, the standard deviation of Z is: √(1.279 billion) = 35,763.
Comment: In general the product of independent LogNormal variables is a LogNormal with the sum of the individual μ's and σ²'s.
24.59. D. If the payment is made before the assets have grown to 50, then there is ruin.
50 = 5(1.10)^t implies t = 24.16. The probability each person has died by time 24.16 is given by the Weibull Distribution: 1 - exp[-(24.16/30)²] = 0.477.
The probability that all 7 persons are dead by time 24.16 is: 0.477^7 = 0.56%.
Comment: Similar to 3, 5/00, Q.6.
24.60. D. 1 + CV² = E[X²]/E[X]² = exp[2μ + 2σ²]/exp[μ + σ²/2]² = exp[σ²].
Thus, 1 + 5.75² = exp[σ²]. σ = 1.88.
For a LogNormal Distribution, the probability that a value is greater than the mean is:
1 - F[exp[μ + σ²/2]] = 1 - Φ[(ln[exp[μ + σ²/2]] - μ)/σ] = 1 - Φ[σ/2] = 1 - Φ[0.94] = 17.36%.

24.61. B. For the Pareto, S(500) = {2000/(2000 + 500)}^4 = 0.4096.
The mean number of losses is: mq = (50)(0.3) = 15.
The expected number of (non-zero) payments is: (0.4096)(15) = 6.144.
Comment: Similar to 4, 5/07, Q.39. As discussed in Mahler's Guide to Frequency Distributions,
the number of (non-zero) payments is Binomial with m = 50 and q = (0.4096)(0.3) = 0.12288.
Therefore, the variance of the number of payments is: (50)(0.12288)(1 - 0.12288) = 5.39.
24.62. E. The Median is where the distribution function is 0.5.
0.5 = Φ[(ln(x) - μ)/σ]. 0 = (ln(x) - μ)/σ. x = exp[μ].
As shown in Appendix A attached to the exam, Mode = exp[μ - σ²].
Therefore, Median/Mode = exp[μ]/exp[μ - σ²] = exp[σ²] = 5.4. σ = 1.30.
24.63. B. For the Pareto Distribution, VaR_p[X] = θ{(1-p)^(-1/α) - 1}.
Q0.25 = VaR0.25[X] = θ{(0.75)^(-1/α) - 1} = (1000){(0.75)^(-1/2) - 1} = 154.7.
Q0.75 = VaR0.75[X] = θ{(0.25)^(-1/α) - 1} = (1000){(0.25)^(-1/2) - 1} = 1000.
Interquartile range = Q0.75 - Q0.25 = 1000 - 154.7 = 845.3.
Comment: For a Pareto Distribution with θ = 1000, here is the interquartile range as a function of α:
[Figure: graph of the interquartile range (vertical axis, roughly 1000 to 4000) versus alpha (horizontal axis).]

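The closed form for the Pareto VaR used above translates directly into code. Here is a small Python sketch (my addition) computing VaR_p and the interquartile range; the α = 2, θ = 1000 values are those of the problem.

```python
def pareto_var(p, alpha, theta):
    """VaR_p for a two-parameter Pareto: theta * ((1 - p)**(-1/alpha) - 1)."""
    return theta * ((1.0 - p)**(-1.0 / alpha) - 1.0)

alpha, theta = 2.0, 1000.0
q25 = pareto_var(0.25, alpha, theta)
q75 = pareto_var(0.75, alpha, theta)
print(round(q25, 1), round(q75, 1), round(q75 - q25, 1))   # 154.7, 1000.0, 845.3
```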
24.64. D. E[X] = exp[3 + 2²/2] = 148.41. E[X²] = exp[(2)(3) + (2)(2²)] = 1,202,604.
Var[X] = 1,202,604 - 148.41² = 1,180,579.
E[X̄] = E[X] = 148.41. Var[X̄] = Var[X]/20 = 1,180,579/20 = 59,029.
E[X̄²] = Var[X̄] + E[X̄]² = 59,029 + 148.41² = 81,054.
Comment: Similar to 4, 11/06, Q.26.
24.65. C. ln(W) has a Normal Distribution with parameters μ = 1 and σ = 2.
(ln(W) - 1)/2 has a Normal Distribution with parameters μ = 0 and σ = 1; in other words,
(ln(W) - 1)/2 follows a standard unit Normal Distribution, with distribution function Φ.
Therefore, Φ[(ln(W) - 1)/2] is uniform on [0, 1].
Comment: This forms the basis of simulating a LogNormal Distribution.
If X follows the distribution F, then F(X) is uniform on [0, 1].
24.66. C. Find where the density is a maximum. 0 = f'(x) = 2ax e^(-bx) - abx² e^(-bx).
⇒ 2 = bx. x = 2/b.
Comment: A Gamma Distribution with α = 3 and θ = 1/b. For α > 1, Mode = θ(α - 1) = 2/b.
In order to integrate to one, the density must go to zero as x goes to infinity.
Therefore, the mode is never infinity.
24.67. B. 1. False. A random variable Y is lognormally distributed if ln(Y) is normally distributed. If X is normally distributed, then exp(X) is lognormally distributed. 2. True. The LogNormal has a lighter tail than the Pareto. One way to see this is that all the moments of the LogNormal exist, while the moments of the Pareto only exist for α > n. Another is that the mean residual life of the LogNormal goes to infinity less than linearly, while that of the Pareto increases linearly.
3. False. The mean of a Pareto only exists for α > 1.
24.68. B. 1. Assume one is summing n independent identically distributed random variables.
According to the Central Limit Theorem as n approaches infinity, this sum approaches a Normal
Distribution. Precisely how large n must be in order for the Normal Approximation to be reasonable
depends on the shape of the distribution and the definition of what is reasonable. Precisely how
large n must be would depend on details about the distribution, however a large skewness would
require a larger n. Statement 1 is not true. 2. True. 3. False. A random variable X is lognormally
distributed if ln(X) = Z, where Z is normally distributed.
24.69. C. This is a Gamma Distribution with α = 2 and θ = ω. k = 1/Γ(α) = 1/Γ(2) = 1/1! = 1.

24.70. B. 1. False. The Normal is symmetric while the LogNormal is skewed to the right.
2. True. The LogNormal has a lighter tail than the Pareto. One way to see this is that the mean
residual life of the LogNormal goes to infinity less than linearly, while that of the Pareto increases
linearly. 3. False. This is an important application of the Negative Binomial Distribution. The Negative
Binomial is the mixed distribution for the Gamma-Poisson process.
24.71. A. 1. T. 2. F. While the sum of independent Normal distributions is also Normal, Statement #2 is false for two reasons. First, since each unit Normal has variance of 1, the sum of n independent unit Normals has a variance of n and standard deviation of √n. Second, since each unit Normal has mean of 0, the sum of n independent unit Normals has a mean of 0. 3. F. If ln(Y) is normally distributed, then Y has a LogNormal Distribution. Equivalently, exp(X) is LogNormally distributed if X is Normally distributed.
24.72. C. The variance (for a single claim) of the Pareto Distribution is:
αθ² / {(α-2)(α-1)²} = 92,160,000. The sum of 100 independent claims has 100 times this variance or 9,216,000,000. The standard deviation is therefore √9,216,000,000 = 96,000.
The mean of a single claim from the Pareto is θ/(α-1) = 4800; for 100 claims the mean is 480,000.
Thus standardizing the variable 600,000 corresponds to:
(600,000 - 480,000)/96,000 = 1.25. Thus the chance of the sum being greater than 600,000 is approximately: 1 - Φ(1.25) = 1 - 0.8944 = 0.1056.
Comment: The second moment is: 2θ² / {(α-2)(α-1)} = 115,200,000.
Thus the variance = 115,200,000 - 4800² = 92,160,000.
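A short Python sketch of the same normal-approximation recipe (again a check of mine, not part of the original): compute the Pareto mean and variance from α and θ, scale up to 100 claims, and standardize.

```python
from math import sqrt, erf

alpha, theta, n = 8.0 / 3.0, 8000.0, 100
mean1 = theta / (alpha - 1.0)                                        # single-claim mean
var1 = 2.0 * theta**2 / ((alpha - 1.0) * (alpha - 2.0)) - mean1**2   # single-claim variance

mean_n, sd_n = n * mean1, sqrt(n * var1)
z = (600_000 - mean_n) / sd_n
prob = 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))
print(round(mean1), round(var1), round(z, 2), round(prob, 4))   # 4800, 92160000, 1.25, ~0.106
```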
24.73. A. If G(x) is the distribution function truncated from below at d, then
G(x) = (F(x) - F(d))/(1 - F(d)) for x > d. In this case, G(x) = (F(x) - F(100))/(1 - F(100)), for x > 100.
G(x) = {(100/200)² - (100/(100+x))²}/{(100/200)²} = 1 - (200/(100+x))², for x > 100.
24.74. D. For Truncation from above at a limit L, FY(x) = FX(x)/FX(L), for x ≤ L.
Thus in this case with L = 50000, A = 1/FX(50000) = 1/Φ[{ln(50000) - μ}/σ] =
1/Φ[{10.82 - 7}/√10] = 1/Φ[1.208] = 1/0.8865 = 1.128.

24.75. C. 1. True. 2. False. X has a Gamma Distribution with parameters 2α and θ.
3. True. The Standard Normal has a mean of 0 and a variance of 1. The means add and variances add. Thus X has mean of 0 + 0 = 0 and variance 1 + 1 = 2.

24.76. C. For the Gamma Distribution the mean is αθ and the variance is αθ². Thus we have per claim a mean of 5000 and a variance of 5,000,000. For the sum of 100 independent claims, one has 100 times the mean and variance. Thus the sum has a mean of 500,000 and a variance of 500 million. Thus the sum has a standard deviation of 22,361. $525,000 exceeds this mean by
25,000/22,361 = 1.118 standard deviations. Thus the chance of the sum exceeding 525,000 is approximately: 1 - Φ(1.12) = 1 - 0.8686 = 0.1314.
Comment: Alternately one can use the fact that the sum of 100 identical independent Gamma Distributions has a Gamma Distribution with parameters 100α and θ.
Note that when applying the Normal Approximation to a continuous distribution such as the Gamma, there is no need for the continuity correction that is applied in the case of a discrete frequency distribution.
24.77. D. For this LogNormal Distribution the moments are E[X] = exp(μ + 0.5σ²) = e^0.5 = 1.649.
E[X²] = exp(2μ + 2σ²) = e². Thus the standard deviation is (e² - e)^0.5 = 2.161. Thus the interval within one standard deviation of the mean is 1.649 ± 2.161. But for the LogNormal distribution x > 0, so we are interested in the probability of x < (1.649 + 2.161) = 3.810.
F(3.810) = Φ((ln(3.810) - μ)/σ) = Φ(1.34) = 0.9099.
Comment: Note that this differs from the probability of being within one standard deviation of the mean for the normal distribution, which is 0.682. Also note that the parameter σ is not the standard deviation of the LogNormal Distribution. Finally, note that the LogNormal Distribution is not symmetric, and thus one has to compute the distribution function separately at each of the two points. F(1.649 - 2.161) = 0 in this case, since the support of the LogNormal Distribution is x > 0, so that F(x) = 0 for x ≤ 0. If instead we were asked the probability of being within a half of a standard deviation of the mean, this would be:
F(1.649 + 1.080) - F(1.649 - 1.080) = Φ(1.00) - Φ(-0.56) = 0.8413 - (1 - 0.7123) = 0.5536.
24.78. C. When α > 2 and therefore the first two moments exist for the Pareto:
E[X²]/E[X]² = {2θ²/((α-1)(α-2))}/{θ/(α-1)}² = 2(α-1)/(α-2) > 2. Thus we check the ratio
E[X²]/E[X]², to see whether it is greater than 2. For X1 this ratio is 1.5. For X2 this ratio is 2.
For X3 this ratio is: (1.5)/0.5² = 6.
Thus since this ratio is only greater than 2 for X3, we conclude only for X3 could one use a Pareto.
Comment: This is an initial test that is useful in some real world applications.
Note that 1 + CV² = E[X²]/E[X]². Thus E[X²]/E[X]² > 2 is equivalent to CV² > 1.
Thus this fact is equivalent to the fact that for the Pareto distribution, the coefficient of variation (when it exists) is always greater than 1; the standard deviation is always greater than the mean.

24.79. B. The original distribution function: F(x) = 1 - e^(-0.05x).
The data truncated from below at 10, has distribution function:
{F(x) - F(10)}/{1 - F(10)} = {e^(-0.5) - e^(-0.05x)}/e^(-0.5) = 1 - e^(0.5 - 0.05x), x > 10.
For x = 25, this has a value of 1 - e^(0.5 - 1.25) = 0.53.
Comment: The Weibull Distribution for a shape parameter of 1 is an Exponential.
24.80. E. X > 1000 corresponds to Y > 6.908. Converting to the Standard Normal distribution:
(6.908 - 6.503)/1.5 = 0.270. Using the Standard Normal Table, the chance of being less than or equal to 0.27 is 0.6064, so the chance of being more is 1 - 0.6064 = 0.3936.
Comment: S(1000) for a LogNormal Distribution, with μ = 6.503 and σ = 1.500.
24.81. B. The sum of 10 independent identically distributed Gammas is a Gamma, with the same scale parameter and 10 times the original shape parameter. Thus the new Gamma has shape parameter α = (0.1)(10) = 1, while θ = 1. A Gamma with a shape parameter of 1 is an exponential distribution. F(x) = 1 - exp(-x). 1 - F(1) = exp(-1) = 36.8%.
24.82. C. If the insurer makes a payment, the probability that the insurer's payment is less than or equal to 25, is in terms of the original Weibull Distribution:
{F(35) - F(10)}/{1 - F(10)} = {(1 - e^(-1.75)) - (1 - e^(-0.5))}/{1 - (1 - e^(-0.5))} = 0.713.
24.83. D. The mean of the Weibull for τ = 0.5 is: θΓ(1 + 1/0.5) = θΓ(3) = 2!θ = 2θ.
The median of the Weibull for τ = 0.5 is such that 0.5 = F(m) = 1 - exp(-(m/θ)^0.5).
Thus, -(m/θ)^0.5 = ln 0.5. m = θ(-ln 0.5)² = 0.4805θ. mean/median = 2θ/0.4805θ = 4.162.
Comment: The ratio is independent of θ, since both the median and the mean are multiplied by the scale parameter θ.
24.84. D. 1. True. 2. True. 3. False.
Comment: The sum of independent Normal Distributions is a Normal. The product of independent
LogNormal Distributions is a LogNormal.
24.85. D. The sum of 16 independent risks each with a Gamma Distribution is again a Gamma Distribution, with the same scale parameter and new shape parameter 16α.
In this case, the aggregate losses have a Gamma Distribution with θ = 250 and α = (16)(1) = 16.
Thus the tail probability at 6000 is:
1 - F(6000) = 1 - Γ(α; 6000/θ) = 1 - Γ(16; 6000/250) = 1 - Γ(16; 24).
Comment: 1 - Γ(16; 24) = 0.0344.

24.86. B. The aggregate losses have a Gamma Distribution with θ = 250 and α = 16. The mean is αθ = 4000 and variance is αθ² = 1,000,000. Alternately, each risk has a mean of 1/0.004 = 250, and a variance of 1/0.004² = 62,500. The means and variances add for the sum of independent risks.
Thus the aggregate losses have a mean of (16)(250) = 4000 and a variance of (16)(62,500) = 1,000,000.
In any case, the standard deviation is 1000 and the survival function at 6000 is approximately:
1 - Φ[(6000 - 4000)/1000] = 1 - Φ(2) = 1 - 0.9773 = 0.0227.
Comment: Note that in approximating the continuous Gamma Distribution no continuity correction is required. Note that the result from using the Normal Approximation here is less than the exact result of 0.0344. Due to the skewness of the Gamma Distribution, the Normal Approximation underestimated the tail probability; the Gamma has a heavier tail than the Normal.
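The contrast between the exact Gamma tail and its normal approximation noted in the comment is easy to reproduce. Here is a Python sketch (mine; it assumes scipy is available for the regularized incomplete gamma):

```python
from math import sqrt, erf
from scipy.special import gammainc    # regularized lower incomplete gamma

alpha, theta = 16.0, 250.0            # aggregate losses for the 16 risks
mean, sd = alpha * theta, sqrt(alpha) * theta

exact = 1.0 - gammainc(alpha, 6000.0 / theta)                       # 1 - Gamma(16; 24)
approx = 1.0 - 0.5 * (1.0 + erf((6000.0 - mean) / (sd * sqrt(2.0))))
print(round(exact, 4), round(approx, 4))   # ~0.0344 versus ~0.0228: the Gamma tail is heavier
```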
24.87. E. Var[Z] = Var[0.25X] = 0.25² Var[X].
Covar[X, Z] = Covar[X, 0.25X] = 0.25 Covar[X, X] = 0.25 Var[X].
Therefore, Corr[X, Z] = 0.25 Var[X] / √(Var[X] (0.25² Var[X])) = 1.
Comment: Two variables that are proportional with a positive proportionality constant are perfectly correlated and have a correlation of one.
24.88. C. The Gamma Distribution has skewness twice its coefficient of variation.
Thus the skewness is (2)(1) = 2.
Comment: The given Gamma is an Exponential with α = 1, CV = 1, and skewness = 2.
The skewness would be calculated as follows.
For the Gamma, E[X] = αθ, E[X²] = α(α+1)θ², E[X³] = α(α+1)(α+2)θ³.
The variance is: E[X²] - E[X]² = α(α+1)θ² - α²θ² = αθ².
Thus the CV = √(αθ²)/(αθ) = 1/√α.
The third central moment = E[X³] - 3E[X]E[X²] + 2E[X]³ =
α(α+1)(α+2)θ³ - 3{αθ}{α(α+1)θ²} + 2α³θ³ = 2αθ³.
Thus the skewness = 2αθ³/{αθ²}^1.5 = 2/√α = twice the CV.

24.89. E. The mean of the Pareto is θ/(α-1), for α > 1. The mode is zero, since f(x) =
αθ^α (θ + x)^(-(α+1)), x > 0, which decreases as x increases, so that the density is largest at zero. The median is where F(x) = 0.5. Therefore, 0.5 = 1 - (θ/(θ+x))^α, thus median = θ(2^(1/α) - 1).
Mean: θ/(α-1).   Median: θ(2^(1/α) - 1).   Mode: 0.
Thus the mode is smallest. On the exam, just pick values for α and θ and see what happens.
For example, for α = 3 and θ = 10, the mean is 10/(3-1) = 5 while the median is smaller at
10(2^(1/3) - 1) = 2.6. One can show that for the Pareto the mean is greater than the median, since
1/(α-1) > 2^(1/α) - 1, for α > 1. This inequality is equivalent to:
1 + 1/(α-1) = α/(α-1) > 2^(1/α), which is equivalent to 1 - 1/α < 2^(-1/α).
Let x = 1/α; then this inequality is equivalent to: 1 < x + (1/2)^x, for 1 > x > 0.
At x = 0, the right hand expression is 1; its derivative is: 1 + ln(1/2)(1/2)^x > 0.
Thus the right hand expression is indeed greater than 1 for x > 0.
This in turn implies that the mean is greater than the median.
Comment: For a continuous distribution with positive skewness typically: mean > median > mode (alphabetical order.) Since the mean is an average of the claim sizes, it is more heavily impacted by the rare large claim than the median; therefore, the Pareto, with a heavy tail, has its mean greater than its median.
24.90. D. S(t) = exp[-(t/√2)²] = exp[-t²/2].
Prob[2 < t < 3 | 1 < t < 4] = {S(2) - S(3)}/{S(1) - S(4)} = (e^(-2) - e^(-4.5))/(e^(-0.5) - e^(-8)) = 0.205.
24.91. C. The density of a Pareto Distribution is f(x) = αθ^α (θ + x)^(-(α + 1)), 0 < x < ∞.
Thus this density is a Pareto Distribution with α = 3 and θ = 1.
It has mean: θ/(α-1) = 1/(3 - 1) = 1/2.

24.92. C. The survival function for the Weibull is: S(t) = exp(-(t/θ)^τ).
τ = 2 and θ = 1.5 for smokers: S(t) = exp(-(t/1.5)²). S(1) = e^(-0.4444) = 0.6412. S(2) = e^(-1.7778) = 0.1690.
τ = 2 and θ = 2.0 for nonsmokers: S(t) = exp(-(t/2)²). S(1) = e^(-0.25) = 0.7788. S(2) = e^(-1) = 0.3679.
Assume for example a total of 30,000 people alive initially.
Then, one third or 10,000 are smokers and two thirds or 20,000 are nonsmokers.

Year | Smoker S(t) | # Smokers Surviving | # Smoker Deaths | Nonsmoker S(t) | # Nonsmokers Surviving | # Nonsmoker Deaths | Total # Deaths
  0  |   1.0000    |        10,000       |                 |     1.0000     |         20,000         |                    |
  1  |   0.6412    |         6,412       |      3,588      |     0.7788     |         15,576         |       4,424        |     8,012
  2  |   0.1690    |         1,690       |      4,722      |     0.3679     |          7,358         |       8,218        |    12,940

For example, the number of nonsmokers who survive through year one is 15,576, while the number who survive through year two is 7,358.
Therefore, 15,576 - 7,358 = 8,218 nonsmokers are expected to die during year two.
For 30,000 insureds, the actuarial present value of the payments is:
(100,000){8012/1.05 + 12940/1.05²} = 1,936,743,764.
Therefore, the actuarial present value of this insurance is: 1,936,743,764 / 30,000 = 64,558.
Comment: Overall: S(1) = (1/3)(0.6412) + (2/3)(0.7788) = 0.7329 = (6412 + 15576)/30000.

24.93. B. Assuming the payments are made for a period of time t, the present value is an annuity certain: 20000(1 - e^(-δt))/δ = 20000(1 - e^(-0.05t))/0.05.
For the Gamma Distribution, f(t) = t^(α-1) e^(-t/θ) / {θ^α Γ(α)} = t^(2-1) e^(-t/1) / {1² Γ(2)} = te^(-t).
The actuarial present value is:
∫ f(t)(Present value | t) dt = ∫ 20000 f(t)(1 - e^(-0.05t))/0.05 dt = 400000(1 - ∫ f(t) e^(-0.05t) dt)
= 400000(1 - ∫0 to ∞ te^(-t) e^(-0.05t) dt) = 400000(1 - ∫0 to ∞ te^(-1.05t) dt) = 400000(1 - 1.05^(-2)) = 37,188.
Comment: I have used the fact about Gamma type integrals: ∫0 to ∞ t^(α-1) e^(-t/θ) dt = Γ(α)θ^α, for α = 2 and θ = 1/1.05.
For a length of disability with a Gamma Distribution and rate of payment of B, the actuarial present value is: B{1 - (1 + δθ)^(-α)}/δ. In this case, B = 20000, α = 2, θ = 1, and δ = 0.05, and the actuarial present value is: (20000){1 - (1 + (1)(0.05))^(-2)}/0.05 = 37,188.
The mean length of disability is: αθ = 2. 20000 times an annuity certain of length 2 has present
value: 20000(1 - e^(-(0.05)(2)))/0.05 = 38,065 ≈ 37,188.
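The closed form quoted in the comment, APV = B{1 - (1 + δθ)^(-α)}/δ for a Gamma-distributed length of disability, is a one-liner to evaluate. A Python sketch (my addition) with the problem's values:

```python
def disability_apv(B, alpha, theta, delta):
    """APV of a continuous payment of B per year while disabled, when the length of
    disability is Gamma(alpha, theta) and delta is the force of interest:
    B * (1 - (1 + delta*theta)**(-alpha)) / delta."""
    return B * (1.0 - (1.0 + delta * theta)**(-alpha)) / delta

print(round(disability_apv(20_000, alpha=2, theta=1, delta=0.05)))   # 37188
```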
24.94. E. S(1) = ∫1 to ∞ xe^(-x) dx = [-xe^(-x) - e^(-x)] evaluated from x = 1 to x = ∞ = 2e^(-1) = 0.7358.
After the introduction of the deductible, expect to pay: (100)(0.7358) = 74 claims.
Comment: A Gamma Distribution with α = 2 and θ = 1. F(1) = Γ[2; 1/1] = 1 - e^(-1) - 1e^(-1) = 0.2642.
24.95. C. For ordinary deductible d, expected payment per payment = (E[X] - E[X ∧ d])/S(d) =
({θ/(α-1)} - {θ/(α-1)}{1 - (θ/(θ+d))^(α-1)}) / (θ/(θ+d))^α = {θ/(α-1)}(θ/(θ+d))^(α-1) / (θ/(θ+d))^α
= (θ+d)/(α-1), which is θ + d for α = 2.
For Group R, the expected payment per payment is: 2000 + 500 = 2500.
For Group S, the expected payment per payment with an ordinary deductible of 200 would be:
3000 + 200 = 3200. However, with a franchise deductible each payment is 200 more, for an average payment per payment of: 3200 + 200 = 3400. 3400 - 2500 = 900.
Comment: For an ordinary deductible d, the expected payment per payment is e(d).
For the Pareto distribution, e(x) = (θ + x)/(α - 1). In this case, e(d) = (θ + d)/(2 - 1) = θ + d.

24.96. B. For a Gamma Distribution, CV = √(αθ²)/(αθ) = 1/√α. CV = 1 ⇒ α = 1.
Therefore, this is an Exponential Distribution, with θ = 1 million.
E[(X - 2 million)+] = E[X] - E[X ∧ 2 million] = 1 million - (1 million)(1 - exp[-(2/1)]) = $135,335.
Comment: One can also compute E[(X - 2 million)+] by integrating the survival function from
2 million to infinity, or by remembering that for an Exponential Distribution, R(x) = e^(-x/θ).
The fact that this is an aggregate distribution does not change the mathematics of using an Exponential Distribution. It would be uncommon for aggregate losses to follow an Exponential.
24.97. D. Expected cost per loss for policy Q is:
E[X ∧ 3000] = {2000/(3 - 1)}{1 - (2000/(2000 + 3000))²} = 840.
Expected cost per loss for policy R is: E[X] - E[X ∧ d] =
2000/(3 - 1) - {2000/(3 - 1)}{1 - (2000/(2000 + d))²} = 1000{2000/(2000 + d)}².
Set the two expected costs equal:
1000{2000/(2000 + d)}² = 840. 2000 + d = 2000√(1000/840) = 2182. d = 182.
24.98. D. The sum of two independent, identically distributed Exponentials is a Gamma Distribution, with α = 2 and the same θ. Thus the time the probe continues to transmit has a Gamma Distribution with α = 2 and θ = 1. This Gamma has density f(t) = te^(-t).
S(t) = ∫t to ∞ se^(-s) ds = e^(-t) + te^(-t). S(3) = 4e^(-3) = 0.199.
Alternately, independent, identically distributed exponential interarrival times implies a Poisson Process. Therefore, if we had an infinite number of batteries, over three years the number of failures is Poisson with mean 3. We are interested in the probability of fewer than 2 failures, which is:
e^(-3) + 3e^(-3) = 4e^(-3) = 0.199.
Comment: One could work out the convolution from first principles by doing the appropriate integral:
f*f(t) = ∫0 to t f(x) f(t - x) dx. Alternately, the distribution function of the sum is ∫0 to t F(x) f(t - x) dx.

24.99. D. E[X] = θ/(α - 1) = 333.333. E[X^2] = 2θ^2/{(α - 1)(α - 2)} = 333,333.
Var(X) = 333,333 - 333.333^2 = 222,222.
E[X^3] = 6θ^3/{(α - 1)(α - 2)(α - 3)} = 1,000,000,000.
Third Central Moment is: E[X^3] - 3E[X]E[X^2] + 2E[X]^3 =
1,000,000,000 - (3)(333.333)(333,333) + 2(333.333^3) = 740,741,185.
Skewness = 740,741,185/222,222^1.5 = 7.071.
Alternately, for the Pareto, skewness is: 2{(α + 1)/(α - 3)} √((α - 2)/α) = {(2)(5)/1} √(2/4) = 7.071.
Comment: Since the skewness does not depend on the scale parameter, one could take θ = 1.
E[X] = 1/3. E[X^2] = 1/3. Var(X) = 2/9. E[X^3] = 1.
Skewness = {1 - 3(1/3)(1/3) + 2(1/3)^3} / (2/9)^1.5 = 7.071.
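A minimal Python sketch (my own illustration) re-doing this skewness calculation from the first three Pareto moments with α = 4 and θ = 1000:

m1 = 1000 / 3
m2 = 2 * 1000**2 / (3 * 2)
m3 = 6 * 1000**3 / (3 * 2 * 1)
var = m2 - m1**2
third_central = m3 - 3 * m1 * m2 + 2 * m1**3
print(third_central / var**1.5)   # about 7.071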
24.100. A. The sum of 5 independent, identically distributed Gamma Distributions is another
Gamma, with the same θ = 100, and α = (5)(2) = 10. Mode = θ(α - 1) = (100)(10 - 1) = 900.
24.101. C. For the Weibull, S(200) = exp[-(200/1000)^0.3] = 0.5395.
The mean number of losses is: rβ = (3)(5) = 15.
The expected number of (non-zero) payments is: (0.5395)(15) = 8.09.
24.102. B. The number of (non-zero) payments is Negative Binomial with r = 3
and β = (0.5395)(5) = 2.6975.
Therefore, the variance of the number of payments is: (3)(2.6975)(1 + 2.6975) = 29.92.
Comment: See Mahler's Guide to Frequency Distributions.


Section 25, Other Two Parameter Distributions120

In Loss Models, there are other Distributions with 2 parameters, which are much less important to
know than the common distributions discussed previously.
The Inverse Gamma is the most important of these other distributions.
With the exception of the Inverse Gaussian, all of the remaining distributions have scale parameter θ,
one shape parameter, and are heavy-tailed, with only some moments that exist.121
Just as with the Pareto Distribution discussed previously, the LogLogistic, Inverse Pareto,
ParaLogistic and Inverse ParaLogistic are special cases of the 4 parameter Transformed Beta
Distribution, discussed subsequently.
The Inverse Gamma and Inverse Weibull are special cases of the 3 parameter Inverse
Transformed Gamma, discussed subsequently.

LogLogistic:122
The LogLogistic Distribution is a special case of a Burr Distribution, for α = 1.
It has scale parameter θ and shape parameter γ.
F(x) = (x/θ)^γ / {1 + (x/θ)^γ} = 1 / {1 + (θ/x)^γ}.
f(x) = γ x^(γ-1) / {θ^γ [1 + (x/θ)^γ]^2}.
The nth moment only exists for n < γ.

Inverse Pareto:123
If X follows a Pareto with parameters α and θ = 1, then θ/X follows an Inverse Pareto with
parameters τ = α and θ. The Inverse Pareto is so heavy-tailed that it has no (finite) mean nor higher
moments. It has scale parameter θ and shape parameter τ.
F(x) = {x/(x + θ)}^τ = (1 + θ/x)^(-τ).
f(x) = τ θ x^(τ-1) / (x + θ)^(τ+1).

120 See the previous section for what I believe are more commonly used two parameter distributions.
121 While their names sound similar, the Inverse Gaussian and Inverse Gamma are completely different distributions.
122 See Appendix A of Loss Models.
123 See Appendix A of Loss Models.


Inverse Gaussian:
The Inverse Gaussian distribution has a tail behavior not very different than a Gamma Distribution.
It has two parameters μ and θ, neither of which is exactly either a shape or a scale parameter.

f(x) = √(θ/(2π)) x^(-1.5) exp[-θ(x/μ - 1)^2 / (2x)] = √(θ/(2π)) x^(-1.5) exp[-(θ/2)(x/μ^2 - 2/μ + 1/x)].

F(x) = Φ[√θ (x^(1/2)/μ - x^(-1/2))] + e^(2θ/μ) Φ[-√θ (x^(1/2)/μ + x^(-1/2))].

Mean = μ.    Variance = μ^3/θ.
Coefficient of Variation = √(μ/θ).    Skewness = 3√(μ/θ) = 3CV.    Kurtosis = 3 + 15μ/θ = 3 + 15CV^2.

Thus the skewness for the Inverse Gaussian distribution is always three times the coefficient of
variation.124 Thus the Inverse Gaussian is likely to fit well only to data sets for which this is true.
Multiplying a variable that follows an Inverse Gaussian by a constant gives another Inverse
Gaussian.125
The Inverse Gaussian is infinitely divisible.126 If X follows an Inverse Gaussian, then given any
n > 1 we can find a random variable Y which also follows an Inverse Gaussian, such that
adding up n independent versions of Y gives X.
124 For the Gamma, the skewness was twice the coefficient of variation.
125 Thus the Inverse Gaussian is preserved under Uniform Inflation. It is a scale distribution. See Loss Models, Definition 4.2.
126 The Gamma Distribution is also infinitely divisible.
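A minimal Python sketch (my own illustration, assuming the μ/θ parameterization above and that scipy is available) evaluating the Inverse Gaussian distribution function from its two Normal terms:

from scipy.stats import norm
import math

def invgauss_cdf(x, mu, theta):
    a = math.sqrt(theta) * (math.sqrt(x) / mu - 1 / math.sqrt(x))
    b = math.sqrt(theta) * (math.sqrt(x) / mu + 1 / math.sqrt(x))
    return norm.cdf(a) + math.exp(2 * theta / mu) * norm.cdf(-b)

print(invgauss_cdf(9, mu=7, theta=4))   # about 0.777, as in Solution 25.15 later in this section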


If X follows an Inverse Gaussian with parameters μ1 and θ1, and Y follows an Inverse Gaussian
with parameters μ2 and θ2, and if μ1^2/θ1 = μ2^2/θ2, then for X and Y independent, X+Y also
follows an Inverse Gaussian, but with parameters μ3 = μ1 + μ2 and θ3 such that
μ3^2/θ3 = μ1^2/θ1 = μ2^2/θ2.127
Thus an Inverse Gaussian can be thought of as a sum of other independent identically
distributed Inverse Gaussian distributions. Therefore, keeping μ^2/θ fixed, as μ gets larger
and larger, the Inverse Gaussian is the sum of more and more independent identical copies of a
given Inverse Gaussian. Thus keeping μ^2/θ fixed and letting μ go to infinity, an
Inverse Gaussian approaches a Normal distribution.
It can be shown that if X follows an Inverse Gaussian with parameters μ and θ, and if
Y = θ(X - μ)^2 / (μ^2 X), then Y follows a Chi-Square Distribution with one degree of freedom.128
In other words, Y has the same distribution as the square of a unit Normal Distribution.
Density of the Inverse Gaussian Distribution:
Exercise: Determine the derivative with respect to x of: Φ[√θ (x^(1/2)/μ - x^(-1/2))].
[Solution: Φ[y] is the cumulative distribution function of a Standard Normal.
Its derivative with respect to y is the density of a Standard Normal.
dΦ[y]/dy = φ(y) = (1/√(2π)) exp[-y^2/2].
Therefore, dΦ[y]/dx = (dy/dx)(1/√(2π)) exp[-y^2/2].
dΦ[√θ (x^(1/2)/μ - x^(-1/2))]/dx =
√θ {(1/2)x^(-1/2)/μ + (1/2)x^(-3/2)}(1/√(2π)) exp[-θ(x^(1/2)/μ - x^(-1/2))^2/2] =
(1/2)√θ (1/√(2π)) (x^(-1/2)/μ + x^(-3/2)) exp[-(θ/2)(x/μ^2 - 2/μ + 1/x)].]
127 One can verify that the means and the variances add, as they must for the sum of any two independent variables.
See Insurance Risk Models by Panjer & Willmot, page 115. This is a somewhat more complicated version of the similar
result for a Gamma. The sum of two independent Gammas with the same scale parameter is a Gamma with the same
scale parameter and the sum of the shape parameters.
128 See for example page 116 of Insurance Risk Models by Panjer and Willmot, or page 412 of Volume 1 of Kendall's
Advanced Theory of Statistics, by Stuart and Ord.


Exercise: Determine the derivative with respect to x of: e^(2θ/μ) Φ[-√θ (x^(1/2)/μ + x^(-1/2))].
[Solution: dΦ[y]/dx = (dy/dx)(1/√(2π)) exp[-y^2/2].
d{e^(2θ/μ) Φ[-√θ (x^(1/2)/μ + x^(-1/2))]}/dx =
e^(2θ/μ) √θ {(-1/2)x^(-1/2)/μ + (1/2)x^(-3/2)}(1/√(2π)) exp[-θ(x^(1/2)/μ + x^(-1/2))^2/2] =
(1/2)√θ (1/√(2π)) (-x^(-1/2)/μ + x^(-3/2)) e^(2θ/μ) exp[-(θ/2)(x/μ^2 + 2/μ + 1/x)] =
(1/2)√θ (1/√(2π)) (-x^(-1/2)/μ + x^(-3/2)) exp[-(θ/2)(x/μ^2 - 2/μ + 1/x)].]
Exercise: Given the Distribution Function
F(x) = Φ[√θ (x^(1/2)/μ - x^(-1/2))] + e^(2θ/μ) Φ[-√θ (x^(1/2)/μ + x^(-1/2))], determine the density function f(x).
[Solution: dF/dx = (1/2)√θ (1/√(2π)) (x^(-1/2)/μ + x^(-3/2)) exp[-(θ/2)(x/μ^2 - 2/μ + 1/x)] +
(1/2)√θ (1/√(2π)) (-x^(-1/2)/μ + x^(-3/2)) exp[-(θ/2)(x/μ^2 - 2/μ + 1/x)] =
√θ (1/√(2π)) x^(-3/2) exp[-(θ/2)(x/μ^2 - 2/μ + 1/x)] = √(θ/(2π)) x^(-1.5) exp[-θ(x/μ - 1)^2/(2x)].]
Thus we have been able to confirm that the stated Distribution Function of an Inverse Gaussian
Distribution does in fact correspond to the stated density function of an Inverse Gaussian.
Moments of the Inverse Gaussian Distribution:
Exercise: For an Inverse Gaussian, set up the integral in order to compute the nth moment.
[Solution: E[X^n] = ∫_0^∞ f(x) x^n dx = √(θ/(2π)) ∫_0^∞ x^(n - 1.5) exp[-(θ/2)(x/μ^2 - 2/μ + 1/x)] dx
= e^(θ/μ) √(θ/(2π)) ∫_0^∞ x^(n - 1.5) exp[-θx/(2μ^2) - θ/(2x)] dx.]
This integral is in a form that gives a modified Bessel Function of the third kind:129
∫_0^∞ x^(ν - 1) exp[-α x^p - β x^(-p)] dx = (2/p) (β/α)^(ν/(2p)) K_(ν/p)(2√(αβ)).
129 See Insurance Risk Models by Panjer and Willmot, or formula 3.478.4 in Table of Integrals, Series, and Products,
by Gradshteyn and Ryzhik.


Therefore, using the above formula with p = 1, ν = n - 1/2, α = θ/(2μ^2), and β = θ/2,
E[X^n] = e^(θ/μ) √(θ/(2π)) (2)(μ^2)^((n - 0.5)/2) K_(n - 1/2)(2√(θ^2/(4μ^2))) = e^(θ/μ) √(2θ/(πμ)) μ^n K_(n - 1/2)(θ/μ).130
Exercise: What is the 4th moment of an Inverse Gaussian with parameters μ = 5 and θ = 7?
[Solution: E[X^4] = e^(θ/μ) √(2θ/(πμ)) μ^4 K_3.5(θ/μ) = e^1.4 √(2.8/π) 5^4 K_3.5(1.4) =
(4.0552)(0.94407)(625)(4.80757) = 11,503.3.
Comment: Where one has to look up in a table or use a software package to compute the value of
K_3.5(1.4) = 4.80757, the modified Bessel Function of the third kind.131]
ParaLogistic:132
The ParaLogistic Distribution is a special case of the Burr Distribution, with its two shape parameters
equal, γ = α. This is unlike most of the other named special cases, in which a parameter is set equal
to one. This general idea of setting two shape parameters equal can be used to produce additional
special cases. The only other time this is done in Loss Models is to produce the Inverse
ParaLogistic Distribution.
The ParaLogistic Distribution has scale parameter θ and shape parameter α.
F(x) = 1 - 1/{1 + (x/θ)^α}^α.
f(x) = α^2 x^(α - 1) / {θ^α [1 + (x/θ)^α]^(α + 1)}.
The nth moment only exists for n < α^2. This follows from the fact that moments of the
Transformed Beta Distribution only exist for n < αγ, with in this case γ = α.

130 The formula for the cumulants of the Inverse Gaussian Distribution is more tractable than that for the moments.
The nth cumulant for n ≥ 2 is: μ^(2n-1) (2n-3)! / {θ^(n-1) (n-2)! 2^(n-2)}.
See for example, Kendall's Advanced Theory of Statistics, Volume I, by Stuart and Ord. Using this formula for the
cumulants, one can obtain the formulas for the skewness and the kurtosis listed above.
131 I used Mathematica.
132 See Appendix A of Loss Models.


Inverse ParaLogistic:133
The Inverse ParaLogistic Distribution is a special case of the Inverse Burr Distribution, with its two
shape parameters equal, γ = τ.
If X follows a ParaLogistic with parameters α and θ = 1, then θ/X follows an Inverse ParaLogistic with
parameters τ = α and θ.
The Inverse ParaLogistic Distribution has scale parameter θ and shape parameter τ.
F(x) = [(x/θ)^τ / {1 + (x/θ)^τ}]^τ = {1 + (θ/x)^τ}^(-τ).
f(x) = τ^2 (x/θ)^(τ^2) / (x {1 + (x/θ)^τ}^(τ + 1)).
The nth moment only exists for n < τ.

Inverse Weibull:134
If X follows a Weibull Distribution with parameters θ = 1 and τ, then θ/X follows an Inverse Weibull with
parameters θ and τ. The Inverse Weibull is heavier-tailed than the Weibull; the moments of the
Inverse Weibull only exist for n < τ, while the Weibull has all of its (positive) moments exist. The
Inverse Weibull has scale parameter θ and shape parameter τ. The Inverse Weibull Distribution is a
special case of the Inverse Transformed Gamma Distribution with α = 1.
F(x) = exp[-(θ/x)^τ].
f(x) = τ (θ/x)^τ exp[-(θ/x)^τ] / x.
133 See Appendix A of Loss Models.
134 See Appendix A in Loss Models.


Inverse Gamma:135 136
If X follows a Gamma Distribution with parameters α and θ = 1, then θ/X follows an Inverse Gamma
Distribution with parameters α and θ. (Thus this distribution is no more complicated conceptually
than the Gamma Distribution.) α is the shape parameter and θ is the scale parameter. The Inverse
Gamma Distribution is a special case of the Inverse Transformed Gamma Distribution with τ = 1.
The Distribution Function is: F(x) = 1 - Γ(α; θ/x), and the density function is: f(x) = θ^α e^(-θ/x) / {x^(α+1) Γ[α]}.
If X follows an Inverse Gamma Distribution with parameters α and θ, then 1/X has
distribution function Γ(α; θx), which is a Gamma Distribution with parameters α and 1/θ.
Note that the density has a negative power of x times an exponential of 1/x.
This is how one recognizes an Inverse Gamma density.137
The scale parameter θ is divided by x in the exponential.
The negative power of x has an absolute value one more than the shape parameter α.
Exercise: A probability density function is proportional to e^(-11/x) x^(-2.5).
What distribution is this?
[Solution: This is an Inverse Gamma Distribution with α = 1.5 and θ = 11.
Comment: The proportionality constant in front of the density is 11^1.5 / Γ(1.5) = 36.48/0.8862 =
41.16. There is no requirement that α be an integer. If α is non-integral then one needs access to a
software package that computes the (complete) Gamma Function.]
The Distribution Function is related to that of a Gamma Distribution, Γ(α; x/θ):
if x/θ follows an Inverse Gamma Distribution with scale parameter of one, then θ/x follows a Gamma
Distribution with a scale parameter of one.
The Inverse Gamma is heavy-tailed, as can be seen by the lack of the existence of certain
moments.138 The nth moment of an Inverse Gamma only exists for n < α.
135 See Appendix A of Loss Models.
136 An Inverse Gamma Distribution is sometimes called a Pearson Type V Distribution.
137 The Gamma density has an exponential of x times x to a power.
138 In the extreme tail its behavior is similar to that of a Pareto distribution with the same shape parameter α.
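A minimal Python sketch (my own illustration, assuming scipy is available) checking the proportionality constant of 41.16 and that the resulting Inverse Gamma density integrates to one:

from scipy.special import gamma
from scipy import integrate
import numpy as np

alpha, theta = 1.5, 11.0
print(theta**alpha / gamma(alpha))                        # 41.16...
pdf = lambda x: theta**alpha / gamma(alpha) * np.exp(-theta / x) * x**(-alpha - 1)
print(integrate.quad(pdf, 0, np.inf)[0])                  # 1.0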


Note that the Inverse Gamma density function integrates to unity from zero to infinity.139
∫_0^∞ e^(-θ/x) / x^(α + 1) dx = Γ(α) / θ^α, α > 0.
This fact will be useful for working with the Inverse Gamma Distribution. For example, one can
compute the moments of the Inverse Gamma Distribution:
E[X^n] = ∫_0^∞ x^n f(x) dx = ∫_0^∞ x^n e^(-θ/x) θ^α / {Γ(α) x^(α + 1)} dx = {θ^α/Γ(α)} ∫_0^∞ e^(-θ/x) x^(-(α + 1 - n)) dx
= {θ^α/Γ(α)} Γ(α - n)/θ^(α - n) = θ^n Γ(α - n) / Γ(α), α - n > 0.
Alternately, the moments of the Inverse Gamma also follow from the moments of the Gamma
Distribution, which are E[X^n] = θ^n Γ(α + n) / Γ(α).140 If X follows a Gamma with unity scale parameter,
then Z = θ/X has an Inverse Gamma Distribution, with parameters α and θ.
Thus the Inverse Gamma has moments: E[Z^n] = E[(θ/X)^n] = θ^n E[X^(-n)] = θ^n Γ(α - n) / Γ(α), α > n.
Specifically, the mean of the Inverse Gamma = E[θ/X] = θ Γ(α - 1) / Γ(α) = θ/(α - 1).
As will be discussed in a subsequent section, the Inverse Gamma Distribution can be used to mix
together Exponential Distributions.
For the Inverse Gamma Distribution:
{Skewness (Kurtosis + 3)}^2 = 4 (4 Kurtosis - 3 Skewness^2) (2 Kurtosis - 3 Skewness^2 - 6).141
139 This follows from substituting y = 1/x in the definition of the Gamma Function. Remember it via the fact that all
probability density functions integrate to unity over their support.
140 This formula works for n positive or negative, provided n > -α.
141 See page 216 of Volume I of Kendall's Advanced Theory of Statistics.
Type V of the Pearson system of distributions is the Inverse Gamma.

Inverse Gamma Distribution

Support: x > 0        Parameters: α > 0 (shape parameter), θ > 0 (scale parameter)

D. f.:    F(x) = 1 - Γ(α; θ/x)

P. d. f.: f(x) = θ^α e^(-θ/x) / {x^(α + 1) Γ[α]} = (θ/x)^α e^(-θ/x) / {x Γ[α]}

Moments: E[X^k] = θ^k / {(α - 1)(α - 2)...(α - k)} = θ^k / ∏_(i=1)^(k) (α - i), α > k

Mean = θ/(α - 1), α > 1            Second Moment = θ^2 / {(α - 1)(α - 2)}, α > 2

Variance = θ^2 / {(α - 1)^2 (α - 2)}, α > 2            Mode = θ/(α + 1)

Coefficient of Variation = Standard Deviation / Mean = 1/√(α - 2), α > 2

Skewness = 4√(α - 2) / (α - 3), α > 3

Kurtosis = 3(α - 2)(α + 5) / {(α - 3)(α - 4)}, α > 4

Limited Expected Value: E[X ∧ x] = {θ/(α - 1)}{1 - Γ[α - 1; θ/x]} + x Γ[α; θ/x], α > 1

R(x) = Excess Ratio = Γ[α - 1; θ/x] - (α - 1)(x/θ) Γ[α; θ/x]

e(x) = Mean Excess Loss = {θ/(α - 1)} Γ[α - 1; θ/x] / Γ[α; θ/x] - x, α > 1

X ~ Gamma(α, 1)  ⇒  θ/X ~ Inverse Gamma(α, θ).
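A minimal Python sketch (my own illustration, assuming scipy is available) comparing a few of the summary formulas above with scipy's invgamma, which uses the same shape α and scale θ:

from scipy.stats import invgamma

alpha, theta = 5.0, 10.0
dist = invgamma(a=alpha, scale=theta)
print(dist.mean(), theta / (alpha - 1))                         # both 2.5
print(dist.var(),  theta**2 / ((alpha - 1)**2 * (alpha - 2)))   # both 2.0833...
print(dist.cdf(4), 1 - dist.sf(4))                              # F(x), consistent with 1 - Gamma(alpha; theta/x)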

Problems:
Use the following information for the next 4 questions:
You have an Inverse Gamma Distribution with parameters α = 5 and θ = 10.
25.1 (1 point) What is the density function at x = 0.7?
A. less than 0.01
B. at least 0.01 but less than 0.02
C. at least 0.02 but less than 0.03
D. at least 0.03 but less than 0.04
E. at least 0.04
25.2 (1 point) What is the mean?
A. less than 2.0
B. at least 2.0 but less than 2.2
C. at least 2.2 but less than 2.4
D. at least 2.4 but less than 2.6
E. at least 2.6
25.3 (1 point) What is the variance?
A. less than 2.0
B. at least 2.0 but less than 2.2
C. at least 2.2 but less than 2.4
D. at least 2.4 but less than 2.6
E. at least 2.6
25.4 (1 point) What is the mode?

25.5 (1 point) What is the integral from zero to infinity of 1,000,000 e-15/x x-7?
A. less than 8
B. at least 8 but less than 9
C. at least 9 but less than 10
D. at least 10 but less than 11
E. at least 11



25.6 (1 point) Sizes of loss are assumed to follow a LogLogistic Distribution, with γ = 4 and
θ = 1000. What is the probability that a loss exceeds 1500?
A. less than 13%
B. at least 13% but less than 14%
C. at least 14% but less than 15%
D. at least 15% but less than 16%
E. at least 16%

25.7 (1 point) Losses follow a LogLogistic Distribution with γ = 4 and θ = 1000. What is the mode?
25.8 (1 point) For an Inverse Pareto Distribution, with τ = 3 and θ = 10,
what is the probability density function at 7?
A. less than 1.7%
B. at least 1.7% but less than 1.8%
C. at least 1.8% but less than 1.9%
D. at least 1.9% but less than 2.0%
E. at least 2.0%
25.9 (1 point) For an Inverse Pareto Distribution, with τ = 6 and θ = 80, what is the mode?
25.10 (2 points) For a ParaLogistic Distribution, with α = 4 and θ = 100, what is the mean?
You may use the facts that: Γ(1/4) = 3.6256, Γ(1/2) = 1.7725, and Γ(3/4) = 1.2254.
A. 59    B. 61    C. 63    D. 65    E. 67

25.11 (1 point) For an Inverse ParaLogistic Distribution, with τ = 2.3 and θ = 720, what is F(2000)?
A. less than 82%
B. at least 82% but less than 83%
C. at least 83% but less than 84%
D. at least 84% but less than 85%
E. at least 85%
25.12 (2 points) For an Inverse Weibull Distribution, with τ = 5 and θ = 20, what is the variance?
You may use: Γ(0.2) = 4.59084, Γ(0.4) = 2.21816, Γ(0.6) = 1.48919, and Γ(0.8) = 1.16423.
A. less than 55
B. at least 55 but less than 60
C. at least 60 but less than 65
D. at least 65 but less than 70
E. at least 70


25.13 (1 point) For an Inverse Weibull Distribution, with τ = 5 and θ = 20, what is the mode?
25.14 (2 points) f(x) = 12,500 e^(-10/x) / (3 x^6), x > 0.
You take a sample of size 20.
Using the Normal Approximation, determine the probability that the sample mean is greater than 3.
A. 2%    B. 3%    C. 4%    D. 5%    E. 6%
25.15 (2 points) What is the Distribution Function at 9 of an Inverse Gaussian Distribution,
with parameters μ = 7 and θ = 4?
A. less than 70%
B. at least 70% but less than 75%
C. at least 75% but less than 80%
D. at least 80% but less than 85%
E. at least 85%
25.16 (2 points) What is the Probability Density Function at 16 of an Inverse Gaussian Distribution,
with parameters μ = 5 and θ = 7?
A. less than 0.0040
B. at least 0.0040 but less than 0.0045
C. at least 0.0045 but less than 0.0050
D. at least 0.0050 but less than 0.0055
E. at least 0.0055


Use the following information for the next two questions.
Let W = 0.7X + 0.3Y, where X and Y are independent random variables.
X follows a Gamma distribution with α = 3 and θ = 10.
Y follows an Inverse Gamma distribution with α = 12 and θ = 300.
25.17 (2 points) What is the mean of W?
A. less than 25
B. at least 26 but less than 27
C. at least 27 but less than 28
D. at least 28 but less than 29
E. at least 29
25.18 (3 points) What is the variance of W?
A. less than 145
B. at least 145 but less than 150
C. at least 150 but less than 155
D. at least 155 but less than 160
E. at least 160


Solutions to Problems:
25.1. C. f(x) = θ^α e^(-θ/x) / {Γ(α) x^(α+1)}. f(0.7) = 10^5 e^(-10/0.7) / {Γ(5) 0.7^6} = 0.0221.
Comment: Using the formulas in Appendix A of Loss Models, f(x) = (θ/x)^α e^(-θ/x) / {x Γ(α)}.
f(0.7) = (10/0.7)^5 e^(-10/0.7) / {0.7 Γ(5)} = 0.0221.
25.2. D. Mean = θ/(α - 1) = 10/(5 - 1) = 2.5.
25.3. B. Variance = θ^2 / {(α - 1)^2 (α - 2)} = 10^2 / {4^2 · 3} = 2.0833.
Alternately, the second moment is: θ^2 / {(α - 1)(α - 2)} = 10^2 / {(4)(3)} = 8.333.
Thus the variance = 8.333 - 2.5^2 = 2.083.
25.4. Mode = θ/(α + 1) = 10/6 = 1.67.
25.5. D. ∫_0^∞ x^(-(α + 1)) e^(-θ/x) dx = Γ(α)/θ^α.
Letting θ = 15 and α = 6, the integral from zero to infinity of e^(-15/x) x^(-7) is:
Γ(6)/15^6 = 120/11,390,625 = 0.0000105. Thus the integral of 1 million times that is: 10.5.
Comment: e^(-15/x) x^(-7) is proportional to the density of an Inverse Gamma Distribution with θ = 15 and
α = 6. Thus its integral from zero to infinity is the inverse of the constant in front of the Inverse
Gamma density, since the density itself must integrate to unity.
Alternately, one could let y = 15/x and convert the integral to a complete Gamma Function.
25.6. E. F(x) = (x/θ)^γ / (1 + (x/θ)^γ). S(x) = 1/(1 + (x/θ)^γ) = 1/(1 + 1.5^4) = 0.165.
25.7. For γ > 1, mode = θ{(γ - 1)/(γ + 1)}^(1/γ) = 1000 (3/5)^(1/4) = 880.
25.8. B. f(x) = τ θ x^(τ - 1)/(x + θ)^(τ + 1). f(7) = (3)(10)(7^2)/(7 + 10)^4 = 0.0176.
25.9. For τ > 1, mode = θ(τ - 1)/2 = 80(5/2) = 200.


25.10. E. Γ(1.25) = (1/4)Γ(1/4) = (1/4)(3.6256) = 0.9064.
Γ(3.75) = (2.75)(1.75)(0.75)Γ(0.75) = (3.6094)(1.2254) = 4.423.
E[X] = θ Γ(1 + 1/α) Γ(α - 1/α)/Γ(α) = 100 Γ(1.25) Γ(3.75)/Γ(4) = (100)(0.9064)(4.423)/6 = 66.8.
25.11. A. F(x) = ((x/θ)^τ/(1 + (x/θ)^τ))^τ = (1 + (θ/x)^τ)^(-τ). F(2000) = (1 + (720/2000)^2.3)^(-2.3) = 0.811.
25.12. A. E[X] = θ Γ(1 - 1/τ) = 20 Γ(0.8) = (20)(1.16423) = 23.285.
E[X^2] = θ^2 Γ(1 - 2/τ) = 400 Γ(0.6) = (400)(1.48919) = 595.68.
Variance = 595.68 - 23.285^2 = 53.5.
25.13. Mode = θ{τ/(τ + 1)}^(1/τ) = (20)(5/6)^(1/5) = 19.3.
25.14. E. The given density is that of an Inverse Gamma, with α = 5 and θ = 10.
Mean = θ/(α - 1) = 10/4 = 2.5.
Second Moment = θ^2 / {(α - 1)(α - 2)} = 100/12.
Variance = 25/3 - 2.5^2 = 2.0833.
Thus the variance of the sample mean is: 2.0833/20 = 0.10417.
S(3) ≈ 1 - Φ[(3 - 2.5)/√0.10417] = 1 - Φ[1.55] = 6.06%.
25.15. C. For the Inverse Gaussian, F(x) = Φ[√θ (x^(1/2)/μ - x^(-1/2))] + e^(2θ/μ) Φ[-√θ (x^(1/2)/μ + x^(-1/2))].
With μ = 7 and θ = 4, F(9) = Φ[2(3/7 - 1/3)] + e^(8/7) Φ[-2(3/7 + 1/3)] =
Φ[0.19] + (3.1357)Φ[-1.52] = 0.5753 + (3.1357)(0.0643) = 0.777.
Comment: Using the formulas in Appendix A of Loss Models, z = (x - μ)/μ = (9 - 7)/7 = 2/7.
y = (x + μ)/μ = (9 + 7)/7 = 16/7. F(x) = Φ[z √(θ/x)] + e^(2θ/μ) Φ[-y √(θ/x)].
F(9) = Φ[(2/7)√(4/9)] + e^(8/7) Φ[-(16/7)√(4/9)] = Φ[0.19] + (3.1357)Φ[-1.52] = 0.777.
25.16. E. For the Inverse Gaussian, f(x) = √(θ/(2π)) x^(-1.5) exp(-θ(x/μ - 1)^2 / (2x)).
With μ = 5 and θ = 7, f(16) = √(7/(2π)) 16^(-1.5) exp(-7(11/5)^2/32) = (1.0556)(1/64) e^(-1.0588) = 0.0057.


25.17. E. Mean of Gamma is αθ = 30. Mean of Inverse Gamma = θ/(α - 1) = 300/11 = 27.3.
Mean of W = 0.7(30) + 0.3(27.3) = 29.2.
25.18. C. Second moment for Gamma is α(α + 1)θ^2 = 1200.
Variance of Gamma = 1200 - 30^2 = 300 = αθ^2.
Second moment for Inverse Gamma is θ^2 / {(α - 1)(α - 2)} = 818.2.
Variance of Inverse Gamma = 818.2 - 27.3^2 = 72.9.
Therefore, the Variance of W = (0.7^2)(300) + (0.3^2)(72.9) = 153.6.
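A minimal Python sketch (my own check, not part of the original solution) re-doing Solutions 25.17 and 25.18 for W = 0.7X + 0.3Y:

a_g, t_g = 3, 10          # Gamma alpha, theta
a_i, t_i = 12, 300        # Inverse Gamma alpha, theta

mean_g, var_g = a_g * t_g, a_g * t_g**2
mean_i = t_i / (a_i - 1)
var_i = t_i**2 / ((a_i - 1) * (a_i - 2)) - mean_i**2

print(0.7 * mean_g + 0.3 * mean_i)        # about 29.2
print(0.7**2 * var_g + 0.3**2 * var_i)    # about 153.7 (153.6 with the rounding above); answer C either way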


Section 26, Three Parameter Distributions


There are five Three Parameter Distributions in Loss Models: Transformed Gamma, Inverse
Transformed Gamma, Burr, Inverse Burr, and Generalized Pareto, each a generalization of one or
more of the two parameter distributions. The extra parameter provides extra flexibility, which
potentially allows a closer fit to data.142 You are unlikely to be asked questions involving the 3
parameter distributions.
Transformed Gamma Distribution:143
Transformed Gamma with τ = 1 is the Gamma.
Transformed Gamma with α = 1 is the Weibull.
F(x) = Γ[α; (x/θ)^τ].
f(x) = τ x^(τα - 1) exp[-(x/θ)^τ] / {θ^(τα) Γ(α)}.
θ is the scale parameter for the Transformed Gamma. α is a shape parameter in the same way it is
for the Gamma. τ is a shape parameter, as for the Weibull.
The relationships between the Exponential, Weibull, Gamma, and Transformed Gamma
Distributions are shown below:
[Diagram: Exponential →(power transformation)→ Weibull; the Exponential is the α = 1 case of the
Gamma, and the Weibull is the α = 1 case of the Transformed Gamma; Gamma →(power
transformation)→ Transformed Gamma.]
Mean = θ Γ(α + 1/τ) / Γ(α).    Variance = θ^2 {Γ(α)Γ(α + 2/τ) - Γ^2(α + 1/τ)} / Γ(α)^2.
142 The potential additional accuracy comes at the cost of extra complexity.
143 See Appendix A and Figure 5.3 in Loss Models.


Therefore: 1 + CV^2 = Γ(α)Γ(α + 2/τ) / Γ^2(α + 1/τ).
It turns out that: (Skewness)(CV^3) + 3CV^2 + 1 = Γ(α)^2 Γ(α + 3/τ) / Γ^3(α + 1/τ).144
The Transformed Gamma Distribution is defined in terms of the Incomplete Gamma Function,
F(x) = Γ[α; (x/θ)^τ]. Thus using the change of variables y = (x/θ)^τ, the density of a Transformed
Gamma Distribution can be derived from that for a Gamma Distribution.
Exercise: Derive the density of the Transformed Gamma Distribution from that of the Gamma
Distribution.
[Solution: Let y = (x/θ)^τ. If y follows a Gamma Distribution with parameters α and 1, then x follows a
Transformed Gamma Distribution with parameters α, θ, and τ.
If y follows a Gamma Distribution with parameters α and 1, then f(y) = y^(α - 1) e^(-y) / Γ(α).
Then the density of x is given by: f(y)(dy/dx) = {(x/θ)^(τ(α - 1)) exp(-(x/θ)^τ)/Γ(α)} {τ x^(τ - 1)/θ^τ} =
τ x^(τα - 1) exp[-(x/θ)^τ] / {θ^(τα) Γ(α)}, as shown in Appendix A of Loss Models.]
Exercise: What is the mean of a Transformed Gamma Distribution?
[Solution: ∫_0^∞ x f(x) dx = ∫_0^∞ x τ x^(τα - 1) exp[-(x/θ)^τ] dx / {θ^(τα) Γ(α)} = ∫_0^∞ τ x^(τα) exp[-(x/θ)^τ] dx / {θ^(τα) Γ(α)}.
Let y = (x/θ)^τ, and thus x = θ y^(1/τ), dx = (θ/τ) y^(1/τ - 1) dy; then the integral for the first moment is:
∫_0^∞ τ (θ y^(1/τ))^(τα) exp[-y] {(θ/τ) y^(1/τ - 1) dy} / {θ^(τα) Γ(α)} = θ ∫_0^∞ y^(α + 1/τ - 1) exp[-y] dy / Γ(α) = θ Γ(α + 1/τ) / Γ(α).]
144 See Venter, "Transformed Beta and Gamma Distributions and Aggregate Losses," PCAS 1983. These relations
can be used to apply the method of moments to the Transformed Gamma Distribution. Also an appropriate mixing of
Transformed Gammas via a Gamma produces a Transformed Beta Distribution. The mixing of Exponentials via a
Gamma to produce a Pareto is just a special case of this more general result.


Exercise: What is the nth moment of a Transformed Gamma Distribution?
[Solution: ∫_0^∞ x^n f(x) dx = ∫_0^∞ x^n τ x^(τα - 1) exp[-(x/θ)^τ] dx / {θ^(τα) Γ(α)} = ∫_0^∞ τ x^(τα + n - 1) exp[-(x/θ)^τ] dx / {θ^(τα) Γ(α)}.
Let y = (x/θ)^τ, and thus x = θ y^(1/τ), dx = (θ/τ) y^(1/τ - 1) dy; then the integral for the nth moment is:
∫_0^∞ τ (θ y^(1/τ))^(τα + n - 1) exp[-y] {(θ/τ) y^(1/τ - 1) dy} / {θ^(τα) Γ(α)} = θ^n ∫_0^∞ y^(α + n/τ - 1) exp[-y] dy / Γ(α)
= θ^n Γ(α + n/τ) / Γ(α). This is the formula shown in Appendix A of Loss Models.]
Exercise: What is the 3rd moment of a Transformed Gamma Distribution with
α = 5, θ = 2.5, and τ = 1.5?
[Solution: θ^n Γ(α + n/τ)/Γ(α) = 2.5^3 Γ(5 + 3/1.5)/Γ(5) = 2.5^3 {Γ(7)/Γ(5)} = 2.5^3 (6)(5) = 468.75.]
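A minimal Python sketch (my own check, assuming scipy is available) confirming this third moment both from the closed form and by numerical integration of the Transformed Gamma density:

from scipy.special import gamma
from scipy import integrate
import numpy as np

alpha, theta, tau = 5.0, 2.5, 1.5
print(theta**3 * gamma(alpha + 3 / tau) / gamma(alpha))        # 468.75

pdf = lambda x: tau * x**(tau * alpha - 1) * np.exp(-(x / theta)**tau) / (theta**(tau * alpha) * gamma(alpha))
print(integrate.quad(lambda x: x**3 * pdf(x), 0, np.inf)[0])   # 468.75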
Limit of Transformed Gamma Distributions:
One can obtain a LogNormal Distribution as an appropriate limit of Transformed Gamma
Distributions.145 First note that Γ[α; y] is a Gamma Distribution with scale parameter of 1, and thus
mean and variance of α. As α gets large, this Gamma Distribution approaches a Normal
Distribution. Thus for large α, Γ[α; y] ≈ Φ[(y - α)/√α].
Now take a limit of Transformed Gamma Distributions as τ goes to zero, while
α = (1 + τμ)/(τ^2 σ^2) and θ^τ = τ^2 σ^2, where μ and σ are selected constants.146
As τ goes to zero, both α and y go to infinity.
For each Transformed Gamma we have F(x) = Γ[α; y], with y = (x/θ)^τ = x^τ/(τ^2 σ^2).
(y - α)/√α = {x^τ/(τ^2 σ^2) - (1 + τμ)/(τ^2 σ^2)} / {√(1 + τμ)/(τσ)} = {x^τ/√(1 + τμ) - √(1 + τμ)}/(τσ). As tau goes
to zero both the numerator and denominator go to zero. To get the limit we use L'Hospital's rule and
differentiate both the numerator and denominator with respect to τ:
limit as τ → 0 of {x^τ/√(1 + τμ) - √(1 + τμ)}/(τσ) = limit as τ → 0 of {ln(x) x^τ/√(1 + τμ) - μ x^τ/{2(1 + τμ)^1.5} - μ/{2√(1 + τμ)}}/σ
= {ln(x) - μ/2 - μ/2}/σ = {ln(x) - μ}/σ.
145 See Section 5.3.3 of Loss Models.
146 Mu and sigma will turn out to be the parameters of the limiting LogNormal Distribution.


Thus as alpha gets big and tau gets small we have
Γ[α; y] ≈ Φ[(y - α)/√α] ≈ Φ[{ln(x) - μ}/σ],
which is a LogNormal Distribution. Thus one can obtain a LogNormal Distribution as an appropriate
limit of Transformed Gamma Distributions.147
Inverse Transformed Gamma:148
If x follows a Transformed Gamma Distribution, then 1/x follows an Inverse
Transformed Gamma Distribution.
Inverse Transformed Gamma with τ = 1 is the Inverse Gamma.
Inverse Transformed Gamma with α = 1 is the Inverse Weibull.
F(x) = 1 - Γ[α; (θ/x)^τ].
f(x) = τ (θ/x)^(τα) exp[-(θ/x)^τ] / {x Γ(α)}.
θ is the scale parameter for the Inverse Transformed Gamma. α is a shape parameter in the same
way it is for the Inverse Gamma. τ is a shape parameter, as for the Inverse Weibull.
The relationships between the Inverse Exponential, Inverse Weibull, Inverse Gamma, and Inverse
Transformed Gamma Distributions are shown below:
[Diagram: Inverse Exponential →(power transformation)→ Inverse Weibull; the Inverse Exponential
is the α = 1 case of the Inverse Gamma, and the Inverse Weibull is the α = 1 case of the Inverse
Transformed Gamma; Inverse Gamma →(power transformation)→ Inverse Transformed Gamma.]
The Inverse Transformed Gamma, and its special cases, are heavy-tailed distributions.
The nth moment only exists if n < ατ.
147 One can also obtain a LogNormal Distribution as an appropriate limit of Inverse Transformed Gamma Distributions.
148 See Appendix A and Figure 5.3 of Loss Models.


The mean excess loss increases approximately linearly for large x.
In the same way that the Transformed Gamma distribution is a generalization of the Gamma
Distribution, the Inverse Transformed Gamma is a generalization of the Inverse Gamma.
[Diagram: Gamma →(y = 1/x)→ Inverse Gamma; Gamma →(y = x^τ)→ Transformed Gamma;
Inverse Gamma →(y = x^τ)→ Inverse Transformed Gamma; Transformed Gamma →(y = 1/x)→
Inverse Transformed Gamma.]
Generalized Pareto:149
Generalized Pareto with τ = 1 is the Pareto.
Generalized Pareto with α = 1 is the Inverse Pareto.
F(x) = β[τ, α; x/(θ + x)].
f(x) = Γ[α + τ] θ^α x^(τ - 1) / {Γ[α] Γ[τ] (θ + x)^(α + τ)}.
θ is the scale parameter for the Generalized Pareto. α is a shape parameter in the same way it is for
the Pareto. τ is an additional shape parameter. While the Pareto may be obtained by mixing
Exponential Distributions via a Gamma Distribution, the Generalized Pareto can be obtained by
mixing Gamma Distributions via a Gamma Distribution. Thus in the same way that a Gamma is a
generalization of the Exponential, so is the Generalized Pareto of the Pareto.
If x follows a Generalized Pareto, then so does 1/x.150
Specifically, using the fact that β(a, b; x) = 1 - β(b, a; 1 - x), 1 - F(1/x) = 1 - β[τ, α; (1/x)/(θ + 1/x)] =
1 - β[τ, α; (1/θ)/(1/θ + x)] = β[α, τ; 1 - (1/θ)/(1/θ + x)] = β[α, τ; x/(1/θ + x)].
Thus if x follows a Generalized Pareto with parameters α, θ, τ, then 1/x follows
a Generalized Pareto, but with parameters τ, 1/θ, α.
149 As discussed in Mahler's Guide to Statistics, for CAS Exam 3L, the F-Distribution is a special case of the
Generalized Pareto. The Generalized Pareto is sometimes called a Pearson Type VI Distribution.
150 Which is why there is not an Inverse Generalized Pareto Distribution.


Burr Distribution:
Burr with γ = 1 is the Pareto.
Burr with α = 1 is the LogLogistic.
Burr with γ = α is the ParaLogistic.
F(x) = 1 - [1/{1 + (x/θ)^γ}]^α.
f(x) = αγ x^(γ - 1) / {θ^γ [1 + (x/θ)^γ]^(α + 1)}.
θ is the scale parameter for the Burr distribution. α is a shape parameter in the same way it is for the
Pareto. γ is an additional shape parameter. The Burr is obtained from the Pareto by introducing a
power transformation; if x follows a Pareto Distribution, then x^(1/γ) follows a Burr Distribution. If x follows
a Burr Distribution with parameters α, θ, and γ, then (x/θ)^γ follows a Pareto Distribution with shape
parameter of α and scale parameter of 1.
While the Pareto may be obtained by mixing Exponential Distributions via a Gamma Distribution,
the Burr can be obtained by mixing Weibull Distributions via a Gamma Distribution. Thus in the
same way that a Weibull is a generalization of the Exponential, so is the Burr of the Pareto.
Inverse Burr:
If x follows a Burr Distribution, then 1/x follows an Inverse Burr Distribution.
Inverse Burr with γ = 1 is the Inverse Pareto.
Inverse Burr with τ = 1 is the LogLogistic. Inverse Burr with γ = τ is the Inverse ParaLogistic.
F(x) = [(x/θ)^γ / {1 + (x/θ)^γ}]^τ = {1 + (θ/x)^γ}^(-τ).
f(x) = τγ (x/θ)^(γτ) / (x {1 + (x/θ)^γ}^(τ + 1)).
θ is the scale parameter for the Inverse Burr distribution. τ is a shape parameter in the same way it is
for the Inverse Pareto. γ is an additional shape parameter. The Inverse Burr is obtained from the
Inverse Pareto by introducing a power transformation; if x follows an Inverse Pareto Distribution,
then x^(1/γ) follows an Inverse Burr Distribution.


While the Inverse Pareto may be obtained by mixing Inverse Exponential Distributions via a
Gamma Distribution, the Inverse Burr can be obtained by mixing Inverse Weibull Distributions via a
Gamma Distribution. Thus in the same way that an Inverse Weibull is a generalization of the
Inverse Exponential, so is the Inverse Burr of the Inverse Pareto.
Log-t Distribution:151
The log-t distribution has the same relationship to the t distribution, as does the LogNormal
Distribution to the Standard Normal Distribution.152
If Y has a t distribution with r degrees of freedom, then exp(Yσ + μ) has a log-t distribution, with
parameters r, μ and σ.
In other words, if X has a log-t distribution with parameters r, μ and σ, then (ln(X) - μ)/σ has a
t distribution, with r degrees of freedom.
Exercise: What is the distribution function at 1.725 for a t distribution with 20 degrees of freedom?
[Solution: Using a t table, at 1.725 there is a total of 10% in the left and right hand tails. Therefore, at
1.725 the distribution function is 95%.]
Exercise: What is the distribution function at 11.59 for a log-t distribution with parameters
r = 20, μ = -1, and σ = 2?
[Solution: {ln(11.59) - (-1)}/2 = 1.725. Thus we want the distribution function at 1.725 of a
t distribution with 20 degrees of freedom. This is 95%.]
As with the t-distribution, the distribution function of the log-t distribution can be written in terms
of incomplete beta functions, while the density involves complete gamma functions.
Since the t-distribution is heavier tailed than the Normal Distribution, the log-t distribution is heavier
tailed than the LogNormal Distribution. In fact, none of the moments of the log-t distribution exist!
151 See Appendix A in Loss Models.
152 The (Student's) t distribution is discussed briefly in the next section.
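A minimal Python sketch (my own illustration, assuming scipy is available) of the log-t exercise above, evaluating F(11.59) for r = 20, μ = -1, σ = 2 via the t distribution:

from scipy.stats import t
import math

r, mu, sigma = 20, -1.0, 2.0
x = 11.59
print(t(df=r).cdf((math.log(x) - mu) / sigma))   # about 0.95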


Problems:
26.1 (1 point) The Generalized Pareto Distribution is a generalization of which of the following
Distributions?
A. Pareto
B. Gamma

C. Weibull

D. LogNormal

E. Inverse Gaussian

26.2 (2 points) For f(x) = 10^(-10) x^9 exp[-0.01x^2] / 12,
what is the value of the integral from zero to infinity of x^2 f(x) dx?
Hint: For the Transformed Gamma distribution, f(x) = τ (x/θ)^(τα) exp(-(x/θ)^τ) / {x Γ(α)}, and
E[X^n] = θ^n Γ(α + n/τ) / Γ(α).
A. less than 450
B. at least 450 but less than 470
C. at least 470 but less than 490
D. at least 490 but less than 510
E. at least 510
26.3 (1 point) Match the Distributions.
1. Transformed Gamma with α = 1.        a. Pareto
2. Transformed Gamma with τ = 1.        b. Weibull
3. Burr with γ = 1.                     c. Gamma
A. 1a, 2b, 3c    B. 1a, 2c, 3b    C. 1b, 2a, 3c    D. 1b, 2c, 3a    E. 1c, 2b, 3a
26.4 (2 points) You are given the following:
Claim sizes follow a Burr Distribution with parameters α = 3, θ = 32, and γ = 2.
You observe 11 claims.
The number of claims and claim sizes are independent.
Determine the probability that the smallest of these claims is greater than 7.
A. less than 20%
B. at least 20% but less than 25%
C. at least 25% but less than 30%
D. at least 30% but less than 35%
E. at least 35%


26.5 (2 points) Which of the following is an expression for the variance of a Generalized Pareto
Distribution, with α > 2?
A. θ^2 τ / {(α - τ)^2 (α - 2)}
B. θ^2 τ (τ + α - 1) / {(α - 1)^2 (α - 2)}
C. θ^2 τ (τ + α - 1) / {(α - τ)(α - 1)(α - 2)}
D. θ^2 τ / {(α - τ)^2 (α - (τ + 1))}
E. None of the above.
26.6 (1 point) Claim sizes follow an Inverse Burr Distribution with parameters τ = 3, θ = 100, and
γ = 4. Determine F(183).
A. less than 60%
B. at least 60% but less than 65%
C. at least 65% but less than 70%
D. at least 70% but less than 75%
E. at least 75%
26.7 (1 point) What is the second moment of an Inverse Transformed Gamma Distribution with
parameters α = 5, θ = 15, τ = 2?
A. less than 20
B. at least 20 but less than 30
C. at least 30 but less than 40
D. at least 40 but less than 50
E. at least 50
26.8 (1 point) Match the Distributions.
1. Inverse Transformed Gamma with α = 1.    a. LogLogistic
2. Generalized Pareto with α = 1.           b. Inverse Weibull
3. Inverse Burr with τ = 1.                 c. Inverse Pareto
A. 1a, 2b, 3c    B. 1a, 2c, 3b    C. 1b, 2a, 3c    D. 1b, 2c, 3a    E. 1c, 2b, 3a


26.9 (1 point) What is the mode of a Transformed Gamma distribution, with α = 4.85, θ = 813, and
τ = 0.301?
A. less than 2000
B. at least 2000 but less than 2500
C. at least 2500 but less than 3000
D. at least 3000 but less than 3500
E. at least 3500
26.10 (1 point) What is the mode of a Generalized Pareto distribution, with α = 2.5, θ = 100, and
τ = 1.3?
26.11 (1 point) What is the mode of an Inverse Burr distribution, with τ = 0.8, θ = 1000, and
γ = 1.5?
26.12 (1 point) What is the mode of an Inverse Transformed Gamma distribution, with α = 3.7,
θ = 200, and τ = 0.6?
26.13 (1 point) What is the median of a Burr distribution, with α = 1.85, θ = 273,700, and γ = 0.97?
A. less than 115,000
B. at least 115,000 but less than 120,000
C. at least 120,000 but less than 125,000
D. at least 125,000 but less than 130,000
E. at least 130,000
26.14 (2 points) X follows a Burr Distribution with parameters α = 6, θ = 20, and γ = 0.5.
Let X̄ be the average of a random sample of size 200.
Use the Central Limit Theorem to find c such that Prob(X̄ ≤ c) = 0.90.
A. 2.4    B. 2.5    C. 2.6    D. 2.7    E. 2.8
26.15 (4B, 5/97, Q.24) (2 points) The random variable X has the density function
f(x) = 4x/(1 + x^2)^3, 0 < x < ∞.
Determine the mode of X.
A. 0
B. Greater than 0, but less than 0.25
C. At least 0.25, but less than 0.50
D. At least 0.50, but less than 0.75
E. At least 0.75


Solutions to Problems:
26.1. A. The Generalized Pareto Distribution is a generalization of the Pareto Distribution, with three
rather than two parameters.
Comment: Questions can't get any easier than this! For τ = 1, the Generalized Pareto is a Pareto.
For α = 1, the Generalized Pareto is an Inverse Pareto.
26.2. D. This is a Transformed Gamma distribution, with α = 5, θ = 10, and τ = 2.
The integral is the second moment, which for the Transformed Gamma is:
θ^2 Γ(α + 2/τ) / Γ(α) = 100 Γ(5 + 1)/Γ(5) = (100)(5) = 500.
26.3. D. 1. Transformed Gamma with α = 1 is a Weibull. 2. Transformed Gamma with
τ = 1 is a Gamma. 3. Burr with γ = 1 is a Pareto.
26.4. B. S(7) = 1 - F(7) = (1/(1 + (7/32)^2))^3 = 0.8692.
The smallest of these 11 claims is greater than 7, if and only if all 11 claims are greater than 7.
The chance of this is: S(7)^11 = 0.8692^11 = 21.4%.
Comment: This is an example of an order statistic.
26.5. B. The Generalized Pareto Distribution has, for α > n, moments: θ^n Γ(α - n)Γ(τ + n)/{Γ(α)Γ(τ)}.
The first moment is: θ Γ(α - 1)Γ(τ + 1)/{Γ(α)Γ(τ)} = θτ/(α - 1).
The second moment is: θ^2 Γ(α - 2)Γ(τ + 2)/{Γ(α)Γ(τ)} = θ^2 τ(τ + 1)/{(α - 1)(α - 2)}.
Thus the variance is: θ^2 τ(τ + 1)/{(α - 1)(α - 2)} - {θτ/(α - 1)}^2 =
{θ^2 τ/{(α - 1)^2 (α - 2)}}{(τ + 1)(α - 1) - τ(α - 2)} = θ^2 τ(τ + α - 1)/{(α - 1)^2 (α - 2)}.
Comment: For τ = 1, this is a Pareto, with variance: θ^2 α/{(α - 2)(α - 1)^2}, for α > 2.
26.6. E. F(x) = (1 + (θ/x)^γ)^(-τ). F(183) = (1 + 0.5464^4)^(-3) = 0.774.
26.7. E. E[X^2] = θ^2 Γ(α - 2/τ)/Γ(α) = 225 Γ(4)/Γ(5) = 225/4 = 56.25.
26.8. D. 1. b, 2. c, 3. a.


26.9. D. For τα = 1.46 > 1, the mode = θ{(τα - 1)/τ}^(1/τ) = (813)(0.45985/0.301)^(1/0.301) = 3323.
Comment: f(x) = τ x^(τα - 1) exp(-(x/θ)^τ) / {θ^(τα) Γ(α)}.
ln f(x) = ln(τ) + (τα - 1)ln(x) - τα ln(θ) - (x/θ)^τ - ln(Γ(α)).
The mode is where the density is largest, and therefore where ln f(x) is largest.
d ln f(x)/dx = (τα - 1)/x - τ x^(τ - 1)/θ^τ. Setting d ln f(x)/dx = 0, (τα - 1)/x = τ x^(τ - 1)/θ^τ.
Therefore, x = θ{(τα - 1)/τ}^(1/τ) = (813)(1.528)^3.322 = 3323.
26.10. For τ > 1, mode = θ(τ - 1)/(α + 1) = (100)(0.3)/3.5 = 8.57.
Comment: The formula for the mode is shown in Appendix A attached to the exam.
26.11. For τγ > 1, mode = θ{(τγ - 1)/(γ + 1)}^(1/γ) = (1000)(0.2/2.5)^(1/1.5) = 185.7.
26.12. Mode = θ{τ/(τα + 1)}^(1/τ) = (200)(0.6/3.22)^(1/0.6) = 12.16.
26.13. C. F(x) = 1 - (1/(1 + (x/θ)^γ))^α. Therefore, 0.5 = 1 - (1/(1 + (x/273,700)^0.97))^1.85.
1.455 = 1 + x^0.97/188,000. Therefore, x = 121.5 thousand.
26.14. E. E[X] = θ Γ[1 + 1/γ] Γ[α - 1/γ] / Γ[α] = (20) Γ[3] Γ[4] / Γ[6] = (20)(2!)(3!)/(5!) = 2.
E[X^2] = θ^2 Γ[1 + 2/γ] Γ[α - 2/γ] / Γ[α] = (20^2) Γ[5] Γ[2] / Γ[6] = (20^2)(4!)(1!)/(5!) = 80.
Var[X] = 80 - 2^2 = 76. Var[X̄] = 76/200 = 0.38. E[X̄] = E[X] = 2.
Using the Normal Approximation, the 90th percentile is at: 2 + 1.282√0.38 = 2.790.
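A minimal Python sketch (my own check, assuming scipy is available) re-doing Solution 26.14: the Burr moments via gamma functions, then the Normal Approximation percentile for the sample mean:

from scipy.special import gamma as G
from scipy.stats import norm

alpha, theta, gam, n = 6, 20, 0.5, 200
EX  = theta    * G(1 + 1/gam) * G(alpha - 1/gam) / G(alpha)   # 2
EX2 = theta**2 * G(1 + 2/gam) * G(alpha - 2/gam) / G(alpha)   # 80
var_mean = (EX2 - EX**2) / n
print(EX + norm.ppf(0.90) * var_mean**0.5)                    # about 2.79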


26.15. C. The mode is where the density is at a maximum. We check the end points, but
f(0) = 0 and f(∞) = 0, so the maximum is in between, where f'(x) = 0.
f'(x) = {4(1 + x^2)^3 - (4x)(6x)(1 + x^2)^2}/(1 + x^2)^6. Setting the derivative equal to zero,
4(1 + x^2)^3 - (4x)(6x)(1 + x^2)^2 = 0. Therefore 4(1 + x^2) = 24x^2. x^2 = 1/5. x = 1/√5 = 0.447.
Comment: One can confirm that the mode is approximately 0.45 numerically:
x       f(x)
0       0.000
0.05    0.199
0.10    0.388
0.15    0.561
0.20    0.711
0.25    0.834
0.30    0.927
0.35    0.990
0.40    1.025
0.45    1.035
0.50    1.024
0.55    0.996
0.60    0.954
This is a Burr Distribution with α = 2, θ = 1, and γ = 2. As shown in Appendix A of Loss Models,
the mode is, for γ > 1: θ{(γ - 1)/(γα + 1)}^(1/γ) = (1/5)^(1/2).


Section 27, Beta Function and Distribution


The quantity x^(a-1)(1-x)^(b-1), for a > 0, b > 0, has a finite integral from 0 to 1. This integral is called the
(complete) Beta Function. The value of this integral clearly depends on the choices of the
parameters a and b.153 This integral is: (a-1)!(b-1)!/(a+b-1)! = Γ(a)Γ(b)/Γ(a+b).
The Complete Beta Function is a combination of three Complete Gamma Functions:
β[a, b] = ∫_0^1 x^(a-1) (1-x)^(b-1) dx = (a-1)!(b-1)!/(a+b-1)! = Γ(a)Γ(b)/Γ(a+b).
Note that β(a, b) = β(b, a).
Exercise: What is the integral from zero to 1 of x^3 (1-x)^6?
[Solution: β(4, 7) = Γ(4)Γ(7)/Γ(4+7) = 3! 6!/10! = 1/840 = 0.001190.]
One can turn the complete Beta Function into a distribution on the interval [0, 1] in a manner similar to
how the Gamma Distribution was created from the (complete) Gamma Function on [0, ∞). Let:
F(x) = {(a+b-1)!/((a-1)!(b-1)!)} ∫_0^x t^(a-1)(1-t)^(b-1) dt = {Γ(a+b)/(Γ(a)Γ(b))} ∫_0^x t^(a-1)(1-t)^(b-1) dt.
In other words, F(x) = ∫_0^x t^(a-1)(1-t)^(b-1) dt / β[a, b] ≡ β(a, b; x), and
f(x) = {(a+b-1)!/((a-1)!(b-1)!)} x^(a-1)(1-x)^(b-1), 0 ≤ x ≤ 1.
Then the distribution function is zero at x = 0 and one at x = 1. The latter follows from the fact that we
have divided by the value of the integral from 0 to 1 of t^(a-1)(1-t)^(b-1).
This corresponds to the form of the incomplete Beta function shown in Appendix A of Loss
Models: F(x) = β(a, b; x), 0 ≤ x ≤ 1.154 This two parameter distribution is a special case of what Loss
Models calls the Beta distribution, for θ = 1; the Beta Distribution in Loss Models is
F(x) = β(a, b; x/θ), 0 ≤ x ≤ θ.
153 The results have been tabulated and this function is widely used in many applications.
See for example the Handbook of Mathematical Functions, by Abramowitz, et. al.
154 This distribution is sometimes called a Beta Distribution of the first kind, or a Pearson Type I Distribution.


The following relationship is sometimes useful: β(a, b; x) = 1 - β(b, a; 1 - x).
β(a, b; x) has mean: a/(a + b), second moment: a(a + 1)/{(a + b)(a + b + 1)},
and variance: ab/{(a + b)^2 (a + b + 1)}.
The mean is between zero and one; for b < a the mean is greater than 0.5.
For a fixed ratio of a/b, the mean is constant, and for a and b large β(a, b; x) approaches a Normal
Distribution. As a or b gets larger the variance decreases. For either a or b extremely large, virtually all
the probability is concentrated at the mean.
Here are various Beta Distributions (with θ = 1):
[Graphs of four Beta densities: a = 1 and b = 5; a = 2 and b = 4; a = 5 and b = 1; a = 4 and b = 2.]
For a > b the Beta Distribution is skewed to the left. For a < b it is skewed to the right.
For a = b it is symmetric. For a ≤ 1, the Mode = 0. For b ≤ 1, the Mode = 1.
β(a, b; x), the Beta distribution for θ = 1, is closely connected to the Binomial Distribution. The Binomial
parameter q varies from zero to one, the same domain as the Incomplete Beta Function.


The Beta density is proportional to the chance of success to the power a-1, times the chance of
failure to the power b-1. The constant in front of the Beta density is (a+b-1) times the binomial
coefficient for (a+b-2) and a-1. The Incomplete Beta Function is a conjugate prior distribution for the
Binomial. The Incomplete Beta Function for integer parameters can be used to compute the
sum of terms from the Binomial Distribution.155
Summary of Beta Distribution:

Support: 0 ≤ x ≤ θ     Parameters: a > 0 (shape parameter), b > 0 (shape parameter),
θ > 0 (similar to a scale parameter, determines the support)

F(x) = β(a, b; x/θ) = {(a+b-1)! / ((a-1)! (b-1)!)} ∫_0^(x/θ) t^(a-1) (1-t)^(b-1) dt.

f(x) = {1/β(a, b)} (x/θ)^a (1 - x/θ)^(b-1) / x = {Γ(a+b)/(Γ(a)Γ(b))} (x/θ)^a (1 - x/θ)^(b-1) / x
= {(a+b-1)! / ((a-1)! (b-1)!)} (x/θ)^(a-1) (1 - x/θ)^(b-1) / θ, 0 ≤ x ≤ θ.

For a = 1, b = 1, the Beta Distribution is the uniform distribution on [0, θ].

E[X^n] = θ^n Γ(a+b) Γ(a+n) / {Γ(a+b+n) Γ(a)} = θ^n (a+b-1)! (a+n-1)! / {(a+b+n-1)! (a-1)!}
= θ^n a(a+1)...(a+n-1) / {(a+b)(a+b+1)...(a+b+n-1)}.

Mean = θ a/(a + b)            E[X^2] = θ^2 a(a+1)/{(a+b)(a+b+1)}

Variance = θ^2 ab / {(a+b)^2 (a+b+1)}

Coefficient of Variation = Standard Deviation / Mean = √(b / {a(a+b+1)})

Skewness = 2(b - a) √(a+b+1) / {(a+b+2) √(ab)}

Mode = θ(a - 1)/(a + b - 2), for a > 1 and b > 1

Limited Expected Value = E[X ∧ x] = {θa/(a+b)} β(a+1, b; x/θ) + x {1 - β(a, b; x/θ)}.

155 See Mahler's Guide to Frequency Distributions. On the exam you should either compute the sum of binomial
terms directly or via the Normal Approximation. Note that the use of the Beta Distribution is an exact result, not an
approximation. See for example the Handbook of Mathematical Functions, by Abramowitz, et. al.
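A minimal Python sketch (my own illustration, assuming scipy is available) checking two Beta (a = 3, b = 2, θ = 1) quantities used in problem 27.8 below: β(2, 3; 0.2) = S(0.8), and the limited expected value E[X ∧ 0.8] from the formula above versus integrating the survival function:

from scipy.stats import beta
from scipy import integrate

a, b = 3, 2
print(beta(2, 3).cdf(0.2), beta(a, b).sf(0.8))                        # both 0.1808
lev = integrate.quad(lambda x: beta(a, b).sf(x), 0, 0.8)[0]           # E[X ∧ 0.8] as integral of S(x)
formula = (a / (a + b)) * beta(a + 1, b).cdf(0.8) + 0.8 * beta(a, b).sf(0.8)
print(lev, formula)                                                   # both about 0.587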


Beta Distribution for a = 3, b = 3, and θ = 1:
[Graph of this symmetric density on (0, 1), peaking at x = 0.5.]
Since the density of the Beta Distribution integrates to one over its support:
∫_0^θ (t/θ)^(a-1) (1 - t/θ)^(b-1) dt/θ = β(a, b). This is sometimes called a Beta type integral.
Uniform Distribution:
The Uniform Distribution from 0 to θ is a Beta Distribution with a = 1 and b = 1.
Specifically, DeMoivre's Law is a Beta Distribution with a = 1, b = 1, and θ = ω.
The future lifetime of a life aged x under DeMoivre's Law is a Beta Distribution with a = 1,
b = 1, and θ = ω - x.
Modified DeMoivre's Law:
The Modified DeMoivre's Law has S(x) = (1 - x/ω)^α, 0 ≤ x ≤ ω. α = 1 is DeMoivre's Law.
The Modified DeMoivre's Law is a Beta Distribution with a = 1, b = α, and θ = ω.
The future lifetime of a life aged x under the Modified DeMoivre's Law is a Beta Distribution with
a = 1, b = α, and θ = ω - x.
Generalized Beta Distribution:
Appendix A of Loss Models also shows the Generalized Beta Distribution, which is obtained from
the Beta Distribution via a power transformation. F(x) = β(a, b; (x/θ)^τ). For τ = 1, the Generalized
Beta Distribution reduces to the Beta Distribution.


Student's t Distribution:
The Student's t Distribution from Statistics can be written in terms of the Incomplete Beta Function.
If U is a Unit Normal variable and χ^2 follows a chi-square distribution with ν degrees of freedom, then
U/√(χ^2/ν) follows a Student's t distribution, with ν degrees of freedom.156
For parameter ν, the density is:
f(x) = {Γ((ν + 1)/2) / (Γ(ν/2) √(νπ))} (1 + x^2/ν)^(-(ν + 1)/2) = (1 + x^2/ν)^(-(ν + 1)/2) / {√ν β(ν/2, 1/2)}, -∞ < x < ∞.
The Distribution Function is:157
F(x) = β[ν/2, 1/2; ν/(ν + x^2)] / 2 for x ≤ 0,
F(x) = 1 - β[ν/2, 1/2; ν/(ν + x^2)] / 2 for x ≥ 0.
F-Distribution:
The F-Distribution (variance ratio distribution) from Statistics can be written in terms of the
Incomplete Beta Function.158 If χ1^2 follows a chi-square distribution with ν1 degrees of freedom and
χ2^2 follows a chi-square distribution with ν2 degrees, then
(χ1^2/ν1) / (χ2^2/ν2) follows an F-distribution, with ν1 and ν2 degrees of freedom.159
The Distribution Function is:160
F(x) = β[ν1/2, ν2/2; ν1 x/(ν2 + ν1 x)] = 1 - β[ν2/2, ν1/2; ν2/(ν2 + ν1 x)], x > 0.
Conversely one can get the Incomplete Beta Function from the F-Distribution, provided 2a and 2b
are integral:
β(a, b; x) = F[bx/{a(1 - x)}], where F is the F-Distribution with 2a and 2b degrees of freedom.
156 A chi-square distribution with ν degrees of freedom is the sum of squares of ν independent Unit Normals.
157 See Section 26.7 of the Handbook of Mathematical Functions by Abramowitz, et. al.
158 The F-Distribution is a form of what is sometimes called a Beta Distribution of the Second Kind.
159 For example, χ1^2 could be the estimated variance from a sample of ν1 drawn from a Normal Distribution and χ2^2
could be the estimated variance from an independent sample of ν2 drawn from a Normal Distribution with the same
variance as the first. The variance-ratio test uses the F-Distribution to test the hypothesis that the two Normal
Distributions have the same variance. See the syllabus of CAS Exam 3L.
160 See Section 26.6 of the Handbook of Mathematical Functions by Abramowitz, et. al.


Relation of the Beta to the Gamma Distribution:
The Complete Beta Function is a combination of three Complete Gamma Functions:
β[a, b] = ∫_0^1 x^(a-1) (1 - x)^(b-1) dx = (a-1)!(b-1)!/(a+b-1)! = Γ(a)Γ(b)/Γ(a+b).
In addition, there are other connections between the Gamma and Beta. If X is a random draw
from a Gamma Distribution with shape parameter α and scale parameter θ, and Y is a random
draw from a Gamma Distribution with shape parameter β and the same scale parameter θ, then
Z = X/(X + Y) is a random draw from a Beta Distribution with parameters a = α, b = β, and
scale parameter 1.
In addition, the Gamma Distribution can be obtained as the limit of an appropriately chosen
sequence of Beta Distributions.
In order to demonstrate this relationship we'll use Stirling's formula for the Gamma of large
arguments, as well as the fact that the limit as z → ∞ of (1 + c/z)^z is e^c.161
Let β(a, b; x) be a Beta Distribution. Let y be a chosen constant. Let x go to zero such that at the
same time the second parameter b of the Beta Distribution goes to infinity while the
relationship b = 1 + y/x holds. Then the integral that enters into the definition of β(a, b; x) is:
∫_0^x t^(a-1) (1 - t)^(b-1) dt = ∫_0^x t^(a-1) (1 - t)^(y/x) dt.
Change variables by taking s = ty/x; then since t = sx/y, the above integral is:
∫_0^y (sx/y)^(a-1) (1 - sx/y)^(y/x) (x/y) ds = (x/y)^a ∫_0^y s^(a-1) {1 - s/(y/x)}^(y/x) ds ≈ (x/y)^a ∫_0^y s^(a-1) e^(-s) ds,
since for small x, y/x is large and therefore {1 - s/(y/x)}^(y/x) ≈ e^(-s).
Meanwhile, the constant in front of the integral that enters into the definition of β(a, b; x) is:
Γ(a + b) / {Γ(a)Γ(b)} = {1/Γ(a)} {Γ(a + 1 + y/x) / Γ(1 + y/x)}.
161 The latter fact follows from taking the limit of ln{(1 + c/z)^z} = z ln(1 + c/z) ≈ z(c/z - (c/z)^2/2) = c - c^2/(2z) → c.


For very small x the argument of the Complete Gamma Function is very large.
Stirling's Formula says for large z: Γ(z) ≈ e^(-z) z^(z - 0.5) √(2π).
Thus for small x:
Γ(a + 1 + y/x) / Γ(1 + y/x) ≈ e^(-(a + 1 + y/x)) (a + 1 + y/x)^(a + 0.5 + y/x) √(2π) / {e^(-(1 + y/x)) (1 + y/x)^(0.5 + y/x) √(2π)}
= e^(-a) (a + 1 + y/x)^a {(a + 1 + y/x)/(1 + y/x)}^(0.5 + y/x) ≈ e^(-a) (y/x)^a {1 + ax/y}^(y/x)
≈ e^(-a) (y/x)^a e^a = (y/x)^a.
Thus for large b, Γ(a + b)/{Γ(a)Γ(b)} ≈ (y/x)^a / Γ(a).
Putting the two pieces of the Incomplete Beta Function together:
β(a, b; x) = Γ(a + b)/{Γ(a)Γ(b)} ∫_0^x t^(a-1) (1 - t)^(b-1) dt ≈ {(y/x)^a / Γ(a)} (x/y)^a ∫_0^y s^(a-1) e^(-s) ds
= {1/Γ(a)} ∫_0^y s^(a-1) e^(-s) ds = Γ(a; y).
Thus the Incomplete Gamma Function Γ(a; y) has been obtained as a limit of an appropriate
sequence of Incomplete Beta Functions β(a, b; x), with b = 1 + y/x, as x goes to zero.


Problems:
Use the following information for the next four questions:
X follows a Beta Distribution with a = 3, b = 8, and θ = 1.
27.1 (1 point) What is the density function at x = 0.6 ?
A. 0.05
B. 0.1
C. 0.2
D. 0.3

E. 0.4

27.2 (1 point) What is the mode?


A. less than 0.20
B. at least 0.20 but less than 0.25
C. at least 0.25 but less than 0.30
D. at least 0.30 but less than 0.35
E. at least 0.35
27.3 (1 point) What is the mean?
A. less than 0.20
B. at least 0.20 but less than 0.25
C. at least 0.25 but less than 0.30
D. at least 0.30 but less than 0.35
E. at least 0.35
27.4 (1 point) What is the variance?
A. less than 0.02
B. at least 0.02 but less than 0.03
C. at least 0.03 but less than 0.04
D. at least 0 .04 but less than 0.05
E. at least 0.05

27.5 (2 points) For a Beta Distribution with parameters a = 0.5, b = 0.5, and θ = 1, what is the
mode?
A. 1/4

B. 1/3

C. 1/5

D. 3/4

E. None of A, B, C, or D

27.6 (2 points) For a Generalized Beta Distribution with τ = 2, determine the second moment.
A. θ^2 a/(a + b)
B. θ^2 b/(a + b)
C. θ^2 ab/{(a + b)(a + b + 1)}
D. θ^2 a(a + 1)/{(a + b)(a + b + 1)}
E. θ^2 b(b + 1)/{(a + b)(a + b + 1)}


Use the following information for the next 2 questions:

While partially disabled, an injured worker's lost earnings capacity is defined
based on the wage he could now earn as:
(preinjury wage - postinjury wage)/(preinjury wage) = lost wages / preinjury wage.
Assume an injured worker's lost earnings capacity is distributed via a Beta Distribution
with parameters a = 3, b = 2, and θ = 1.
While partially disabled, Workers Compensation weekly benefits are 70% times
the worker's lost wages, limited to 56% of the worker's preinjury wage.
27.7 (1 point) What is the average lost earnings capacity of injured workers?
A. less than 0.53
B. at least 0.53 but less than 0.56
C. at least 0.56 but less than 0.59
D. at least 0.59 but less than 0.62
E. at least 0.62
27.8 (3 points) Where β(a, b; x) is the Incomplete Beta Function as defined in Appendix A of Loss
Models, what is the average ratio of weekly partial disability benefits to preinjury weekly wage?
Hint: Use the formula for E[X ∧ x] for the Beta Distribution, in Appendix A of Loss Models.
A. 0.42 β(4, 2; 0.2) + 0.56 β(2, 3; 0.8)
B. 0.56 β(4, 2; 0.8) + 0.42 β(2, 3; 0.2)
C. 0.42 β(4, 2; 0.8) + 0.56 β(2, 3; 0.2)
D. 0.56 β(4, 2; 0.2) + 0.42 β(2, 3; 0.8)
E. None of the above.

27.9 (2 points) On an exam, the grades of students are distributed via a Beta Distribution with
a = 4, b = 1, and θ = 100.
A grade of 70 or more passes.
What is the average grade of a student who passes this exam?
A. 82
B. 84
C. 86
D. 88
E. 90


27.10 (2, 5/83, Q.12) (1.5 points) Let X have the density function
f(x) = Γ(α + β) x^(β-1) (1 - x)^(α-1) / {Γ(α) Γ(β)}, for 0 < x < 1, where α > 0 and β > 0.
If α = 6 and β = 5, what is the expected value of (1 - X)^(-4)?
A. 42    B. 63    C. 210    D. 252    E. 315


Solutions to Problems:
27.1. C. f(x) = {(a+b-1)! / ((a-1)! (b-1)!)} (x/θ)^(a-1) {1 - (x/θ)}^(b-1)/θ = {10! / (2! 7!)} 0.6^2 · 0.4^7 = 0.212.
27.2. B. f(x) = {10! / (2! 7!)} x^2 (1-x)^7 = 360 x^2 (1-x)^7. f'(x) = 720x(1-x)^7 - 2520x^2(1-x)^6.
Setting f'(x) = 0: 720(1 - x) = 2520x. x = 0.222.
Comment: For a > 1 and b > 1, the mode is: θ(a - 1)/(a + b - 2) = (1)(3 - 1)/(3 + 8 - 2) = 2/9.
[Graph of the density of this Beta Distribution, peaking near x = 0.22.]
27.3. C. Mean = a/(a+b) = 3/11 = 0.2727.
27.4. A. Second moment = θ^2 a(a+1)/{(a+b)(a+b+1)} = (3)(4)/{(11)(12)} = 0.09091.
Variance = 0.09091 - 0.2727^2 = 0.0165.


27.5. E. f(x) is proportional to: x^(-1/2) (1 - x)^(-1/2).
f'(x) is proportional to: (-1/2)x^(-3/2)(1 - x)^(-1/2) + (1/2)x^(-1/2)(1 - x)^(-3/2).
Setting this equal to zero, and solving, x = 1/2. At x = 1/2, x^(-1/2)(1 - x)^(-1/2) = 2.
As x → 0, x^(-1/2)(1 - x)^(-1/2) → ∞. As x → 1, x^(-1/2)(1 - x)^(-1/2) → ∞.
Thus checking the endpoints of the support, the modes are at 0 and 1.
Comment: 1/2 is a minimum for this density,
f(x) = {Γ(1/2 + 1/2)/(Γ(1/2)Γ(1/2))} x^(-1/2)(1 - x)^(-1/2) = x^(-1/2)(1 - x)^(-1/2)/π, 0 < x < 1:
[Graph of this U-shaped density, with a minimum at x = 1/2 and rising toward infinity at 0 and 1.]
27.6. A. E[X^k] = θ^k Γ[a + b] Γ[a + k/τ] / {Γ[a] Γ[a + b + k/τ]}.
E[X^2] = θ^2 Γ[a + b] Γ[a + 2/2] / {Γ[a] Γ[a + b + 2/2]} = θ^2 Γ[a + b] Γ[a + 1] / {Γ[a] Γ[a + b + 1]}
= θ^2 a/(a + b).
27.7. D. The mean of a Beta Distribution is a/(a+b) = 3/(3+2) = 0.6.

27.8. C. Let y be the injured worker's lost earnings capacity. Let w be the worker's
pre-injury wage. Then the lost wages are yw. Thus the benefits are 0.7yw, limited to 0.56w.
Thus the ratio of weekly partial disability benefits to pre-injury weekly wage, r, is: 0.7y limited to 0.56.
For y ≥ 0.56/0.7 = 0.8, r = 0.56; for y ≤ 0.8, r = 0.7y. To get the average r, we integrate with
respect to f(y)dy for y = 0 to 1, dividing into two pieces depending on whether y is less than or
greater than 0.8:
∫_0^0.8 (0.7y) f(y) dy + ∫_0.8^1 (0.56) f(y) dy = 0.7 {∫_0^0.8 y f(y) dy + 0.8 S(0.8)} = 0.7 E[Y ∧ 0.8].
Thus the average ratio r has been written in terms of the Limited Expected Value; for the Beta
Distribution, E[X ∧ x] = θ{a/(a+b)} β(a+1, b; x/θ) + x{1 - β(a, b; x/θ)}.
Thus for θ = 1, a = 3 and b = 2: 0.7 E[X ∧ 0.8] = 0.7{(3/(3+2)) β(3+1, 2; 0.8) + 0.8(1 - β(3, 2; 0.8))} =
0.42β(4, 2; 0.8) + 0.56β(2, 3; 0.2).
Comments: 0.42β(4, 2; 0.8) + 0.56β(2, 3; 0.2) = (0.42)(0.7373) + (0.56)(0.1808) = 0.411.
Since β(a, b; 1-x) = 1 - β(b, a; x), the solution could also be written as:
0.42β(4, 2; 0.8) + 0.56{1 - β(3, 2; 0.8)}.
27.9. D. f(x) = x³ / 25,000,000.
S(70) = ∫_70^100 x³ dx / 25,000,000 = 0.7599.
∫_70^100 x f(x) dx = ∫_70^100 x⁴ dx / 25,000,000 = 66.5544.
The average grade of a student who passes this exam is: 66.5544 / 0.7599 = 87.58.
Comment: The average grade of all students is: (100)(4)/(4 + 1) = 80.
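For those who like to verify such calculations numerically, here is a short Python sketch (not part of the original solution); it simply redoes the two integrals in 27.9 with scipy, which is assumed to be available.
# Numerical check of 27.9: average grade among students scoring 70 or more,
# for a Beta Distribution with a = 4, b = 1, theta = 100.
from scipy import integrate

f = lambda x: 4 * x**3 / 100**4                        # density of this Beta Distribution
prob_pass, _ = integrate.quad(f, 70, 100)              # S(70)
partial_mean, _ = integrate.quad(lambda x: x * f(x), 70, 100)
print(prob_pass, partial_mean, partial_mean / prob_pass)   # approximately 0.7599, 66.55, 87.58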

27.10. A. E[(1 - X)^(-4)] = {Γ(11) / (Γ(5) Γ(6))} ∫_0^1 x⁴ (1 - x)⁵ (1 - x)^(-4) dx = {Γ(11) / (Γ(5) Γ(6))} ∫_0^1 x⁴ (1 - x) dx =
{Γ(11) / (Γ(5) Γ(6))} {Γ(5) Γ(2)/Γ(7)} = (10)(9)(8)(7)/{(5)(4)(3)(2)} = 42.
Alternately, let Y = 1 - X. Then Y has a Beta Distribution with a = 6, b = 5, and θ = 1.
E[(1 - X)^(-4)] = E[Y^(-4)] = Γ(6 + 5)Γ(6 - 4)/{Γ(6)Γ(6 + 5 - 4)} = Γ(11)Γ(2)/{Γ(6)Γ(7)} = (10!)(1!)/{(5!)(6!)}
= (10)(9)(8)(7) / {(5)(4)(3)(2)} = 42.
Comment: X has a Beta Distribution with a = 5, b = 6, and θ = 1.

Section 28, Transformed Beta Distribution


You are extremely unlikely to be asked questions on your exam involving the 4 parameter
Transformed Beta Distribution.
Define the Transformed Beta Distribution on (0, ∞) in terms of the Incomplete Beta Function:
F(x) = β[τ, α; x^γ / (x^γ + θ^γ)] = 1 - β[α, τ; θ^γ / (x^γ + θ^γ)].
f(x) = {Γ(α+τ) / (Γ(α) Γ(τ))} γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α+τ).
As shown on page 70 and in Appendix A of Loss Models, the Pareto, Generalized Pareto, Burr,
LogLogistic, and other distributions are special cases of the Transformed Beta Distribution.
In this parameterization, θ acts as a scale parameter, since everywhere x appears in F(x) it is divided
by θ. γ is a power transformation parameter as in the Loglogistic and Burr Distributions.
α is a shape parameter as in the Pareto. τ is another shape parameter as in the Generalized Pareto
or Inverse Pareto. With four parameters the Transformed Beta Distribution has a lot of flexibility to fit
different data sets.
For 0 < x < ∞, x^γ / (x^γ + θ^γ) is between 0 and 1, the domain of the Incomplete Beta Function.
Thus this transformation allows one to get a size of loss distribution from the Incomplete Beta
Function, which only has a domain from zero to one.
The moments of the Transformed Beta Distribution are:
E[X^n] = θ^n Γ(τ + n/γ) Γ(α - n/γ) / {Γ(α) Γ(τ)}, for αγ > n > -τγ.
Only some moments exist; if αγ > 1 the mean exists. If αγ > 2 the second moment exists, etc.
Mean = θ Γ(τ + 1/γ) Γ(α - 1/γ) / {Γ(α) Γ(τ)}, αγ > 1.
Provided αγ > 1, the mean excess loss exists and increases to infinity approximately linearly;
for large x, e(x) ≈ x / (αγ - 1). This tail behavior carries over to special cases such as: the Burr,
Generalized Pareto, Pareto, LogLogistic, ParaLogistic, Inverse Burr, Inverse Pareto, and Inverse
ParaLogistic. All have mean excess losses, when they exist, that increase approximately linearly for
large x.162
162 The mean excess loss of a Pareto, when it exists, is linear in x. e(x) = (x+θ)/(α-1).
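As a quick numerical illustration of the moment formula above (not from the text), the moments can be evaluated directly with gamma functions in Python; the parameter values below are arbitrary, chosen only for the example.
# Illustrative check of E[X^n] = theta^n Gamma(tau + n/gamma) Gamma(alpha - n/gamma) / {Gamma(alpha) Gamma(tau)},
# valid for -tau*gamma < n < alpha*gamma.
from math import gamma

def transformed_beta_moment(n, alpha, gamma_, tau, theta):
    assert -tau * gamma_ < n < alpha * gamma_, "moment does not exist"
    return theta**n * gamma(tau + n / gamma_) * gamma(alpha - n / gamma_) / (gamma(alpha) * gamma(tau))

# Example with assumed values alpha = 3, gamma = 2, tau = 4, theta = 10: the mean and second moment.
print(transformed_beta_moment(1, 3, 2, 4, 10), transformed_beta_moment(2, 3, 2, 4, 10))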

Special Cases of the Transformed Beta:163


By setting any one of the three shape parameters equal to one,164 the Transformed Beta becomes
a three parameter distribution.
For τ = 1, the Transformed Beta is a Burr.
For γ = 1, the Transformed Beta is a Generalized Pareto.
For α = 1, the Transformed Beta is an Inverse Burr.
In turn, one can fix one of the remaining shape parameters in one of these three parameter
distributions and obtain a two parameter distribution, each of which is also a special case of the
Transformed Beta Distribution.
For γ = 1, the Burr is a Pareto.
For α = 1, the Burr is a LogLogistic.
For α = γ, the Burr is a ParaLogistic.
For τ = 1, the Generalized Pareto is a Pareto.
For α = 1, the Generalized Pareto is an Inverse Pareto.
For γ = 1, the Inverse Burr is an Inverse Pareto.
For τ = 1, the Inverse Burr is a LogLogistic.
For τ = γ, the Inverse Burr is an Inverse ParaLogistic.
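One of these reductions can be checked numerically in a couple of lines. The Python sketch below (not from the text, parameter values arbitrary) compares the survival function of a Burr with γ = 1 to that of a Pareto with the same α and θ.
# Check that a Burr with gamma = 1 reduces to a Pareto:
# Burr S(x) = {1/(1 + (x/theta)^gamma)}^alpha; with gamma = 1 this is {theta/(theta + x)}^alpha.
import numpy as np

def burr_sf(x, alpha, gamma_, theta):
    return (1.0 / (1.0 + (x / theta)**gamma_))**alpha

def pareto_sf(x, alpha, theta):
    return (theta / (theta + x))**alpha

x = np.array([10.0, 100.0, 1000.0])
print(burr_sf(x, alpha=2.0, gamma_=1.0, theta=50.0))
print(pareto_sf(x, alpha=2.0, theta=50.0))   # identical values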
Limiting Cases of the Transformed Beta Distribution:165
By taking appropriate limits one can obtain additional distributions from the Transformed Beta
Distribution. Examples include the Transformed Gamma, Inverse Transformed Gamma and the
LogNormal Distributions.
The first example is that the Transformed Gamma Distribution is a limiting case of the Transformed
Beta. In order to demonstrate this relationship we'll use Stirling's formula for the Gamma function of large
arguments, which says that for large z: Γ(z) ≈ e^(-z) z^(z-0.5) √(2π).
163 See Figure 5.2 in Loss Models.
164 One can fix any of the parameters at any positive value, but for example the three parameter distribution that
results from fixing γ = 2 does not have a name, because it does not come up as often in applications. Fixing the scale
parameter is much less common for practical applications than fixing one or more shape parameters.
165 See Figure 5.4 in Loss Models.

We'll also use the fact that the limit as z → ∞ of (1 + c/z)^z is e^c.
The latter fact follows from taking the limit of ln{(1 + c/z)^z} = z ln(1 + c/z) ≈ z (c/z - (c/z)²/2) =
c - c²/(2z) → c.
Exercise: Use Stirling's Formula to approximate Γ(a+b)/{Γ(a)Γ(b)}, for very large a.
[Solution: Stirling's Formula says for large z: Γ(z) ≈ e^(-z) z^(z-0.5) √(2π). Thus for large a:
Γ(a+b)/Γ(a) ≈ e^(-(a+b)) (a+b)^(a+b-0.5) √(2π) / {e^(-a) a^(a-0.5) √(2π)} = e^(-b) (1+b/a)^a (a+b)^b √(a/(a + b))
≈ e^(-b) e^b a^b = a^b. Thus for large a, Γ(a+b)/{Γ(a)Γ(b)} ≈ a^b / Γ(b).]
A Transformed Beta Distribution with parameters α, γ, τ, and θ has density
f(x) = {Γ(α+τ) / (Γ(α) Γ(τ))} γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α + τ). We'll take limits of the Gamma Functions in front
and of the rest of the density as two separate pieces, and then put the results together.
Set q = θ / α^(1/γ) and let α go to infinity, while holding q constant.166
Given the chosen constant q, then θ = q α^(1/γ).
Then the density of the Transformed Beta, other than the Gamma Functions, is:
γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α + τ) = γ q^(-γτ) α^(-τ) x^(γτ-1) {1 + (x/q)^γ/α}^-(α + τ) =
γ q^(-γτ) α^(-τ) x^(γτ-1) {1 + (x/q)^γ/α}^-α {1 + (x/q)^γ/α}^-τ → γ q^(-γτ) α^(-τ) x^(γτ-1) / exp[(x/q)^γ] =
γ q^(-γτ) α^(-τ) x^(γτ-1) exp[-(x/q)^γ].
Where I've used the fact that the limit as z → ∞ of (1 + c/z)^z is e^c, with z = α and c = (x/q)^γ.
Meanwhile, the Gammas in front of the density are: Γ(α+τ)/{Γ(α)Γ(τ)}.
As shown in the above exercise, for large α, Γ(α+τ)/{Γ(α)Γ(τ)} ≈ α^τ / Γ(τ).
Putting the two pieces of the density of the Transformed Beta Distribution together:
f(x) = {Γ(α+τ) / (Γ(α) Γ(τ))} γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α + τ)
→ {α^τ / Γ(τ)} γ q^(-γτ) α^(-τ) x^(γτ-1) exp[-(x/q)^γ] = γ q^(-γτ) x^(γτ-1) exp[-(x/q)^γ] / Γ(τ).
This is the density of a Transformed Gamma Distribution, with scale parameter q, with what is
normally the parameter α given as τ, and what is normally the parameter τ given as γ.
166

q will turn out to be the scale parameter of the limiting Transformed Gamma Distribution.

Thus the Transformed Gamma Distribution has been obtained as a limit of a series of Transformed
Beta Distributions, with q = θ/α^(1/γ), letting α go to infinity while holding q constant. Then q is the
scale parameter of the limiting Transformed Gamma. The α parameter of the limiting Transformed
Gamma is the τ parameter of the Transformed Beta. The τ parameter of the limiting Transformed
Gamma is the γ parameter of the Transformed Beta.167
In terms of Distribution Functions:
lim (α→∞) β[τ, α; x^γ / (x^γ + q^γ α)] = Γ[τ; (x/q)^γ].

For the Transformed Beta Distribution, and its special cases, as alpha approaches infinity, in the limit
we get a Transformed Gamma, and its special cases:
Transformed Beta       →  Transformed Gamma
Generalized Pareto     →  Gamma
Burr                   →  Weibull
Pareto                 →  Exponential
Note that in each case, since we have taken the limit as alpha approaches infinity, the limiting distribution
has one fewer shape parameter than the distribution whose limit we are taking.
Exercise: What is the limit of a Pareto Distribution, as α goes to infinity while θ = 100α?
[Solution: This is a special case of the limit of a Transformed Beta. Using the above result, in this case,
the limit is an Exponential Distribution with scale parameter 100.
Alternately, for the Pareto, S(x) = 1/{1 + (x/θ)}^α = 1/{1 + (x/100)/α}^α. As α approaches infinity,
S(x) approaches 1/exp(x/100) = e^(-x/100). This is an Exponential Distribution with mean 100.]
Exercise: What is the limit of a Burr Distribution, with γ = 3, as α goes to infinity while θ = 25α^(1/3)?
[Solution: This is a special case of the limit of a Transformed Beta. Using the above result, in this case,
the limit is a Weibull Distribution with scale parameter 25 and τ = 3. Alternately, for the Burr with
γ = 3, S(x) = 1/{1 + (x/θ)³}^α = 1/{1 + (x/25)³/α}^α. As α approaches infinity, S(x) approaches
1/exp((x/25)³) = exp(-(x/25)³). This is a Weibull Distribution with scale parameter 25 and τ = 3.]
167

The fact that the tau parameter of the limiting Transformed Gamma Distribution is not the same tau as that of the
Transformed Beta Distribution is due to the manner in which Loss Models has chosen to parametrize both
distributions.

Similarly, for the Transformed Beta Distribution, and its special cases, as tau approaches infinity,
in the limit we get an Inverse Transformed Gamma, and its special cases:
Transformed Beta       →  Inverse Transformed Gamma
Generalized Pareto     →  Inverse Gamma
Inverse Burr           →  Inverse Weibull
Inverse Pareto         →  Inverse Exponential

A Transformed Beta Distribution with parameters α, γ, τ, and θ has density
f(x) = {Γ(α+τ) / (Γ(α) Γ(τ))} γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α + τ). Set q = θ τ^(1/γ) and let τ go to infinity, while holding
q constant. Given the chosen constant q, then θ = q / τ^(1/γ).
Then the density of the Transformed Beta, other than the Gamma Functions, is:
γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α + τ) = γ θ^(γα) x^-(γα+1) {1 + (θ/x)^γ}^-α {1 + (θ/x)^γ}^-τ =
γ q^(γα) τ^(-α) x^-(γα+1) {1 + (q/x)^γ/τ}^-α {1 + (q/x)^γ/τ}^-τ
→ γ q^(γα) τ^(-α) x^-(γα+1) (1) exp[-(q/x)^γ] = γ q^(γα) τ^(-α) x^-(γα+1) exp[-(q/x)^γ].168
As shown previously in an exercise, for large τ, Γ(α+τ)/{Γ(α)Γ(τ)} ≈ τ^α / Γ(α).
Putting the two pieces of the density of the Transformed Beta Distribution together:
f(x) = {Γ(α+τ) / (Γ(α) Γ(τ))} γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α + τ)
→ {τ^α / Γ(α)} γ q^(γα) τ^(-α) x^-(γα+1) exp[-(q/x)^γ] = γ q^(γα) x^-(γα+1) exp[-(q/x)^γ] / Γ(α).
This is the density of an Inverse Transformed Gamma Distribution, with scale parameter q, with the
usual α parameter, and what is normally the parameter τ given as γ.
Thus the Inverse Transformed Gamma Distribution has been obtained as a limit of a series of
Transformed Beta Distributions, with q = θ τ^(1/γ), letting τ go to infinity while holding q constant. Then q
is the scale parameter of the limiting Inverse Transformed Gamma. The α parameter of the limiting
Inverse Transformed Gamma is the α parameter of the Transformed Beta. The τ parameter of the
limiting Inverse Transformed Gamma is the γ parameter of the Transformed Beta.
In terms of Distribution Functions: lim (τ→∞) β[τ, α; x^γ / (x^γ + q^γ/τ)] = 1 - Γ[α; (q/x)^γ].

168 Where I've used the fact that the limit as z → ∞ of (1 + c/z)^(-z) is e^(-c), with z = τ and c = (q/x)^γ.

Note that this result could have been obtained from the previous one:
lim (α→∞) β[τ, α; x^γ / (x^γ + q^γ α)] = Γ[τ; (x/q)^γ]. Since β[a, b; x] = 1 - β[b, a; 1 - x],
lim (τ→∞) β[τ, α; x^γ / (x^γ + q^γ/τ)] = lim (τ→∞) 1 - β[α, τ; (q^γ/τ) / (x^γ + q^γ/τ)] =
lim (τ→∞) 1 - β[α, τ; q^γ / (q^γ + τ x^γ)] = 1 - Γ[α; (q/x)^γ].

Exercise: What is the limit of a Generalized Pareto Distribution, with α = 7, as τ goes to infinity while
θ = 33/τ?
[Solution: This is a special case of the limit of a Transformed Beta. Using the above result, in this case,
the limit is an Inverse Gamma Distribution with scale parameter 33 and α = 7. Alternately, for the
Generalized Pareto Distribution with α = 7 and θ = 33/τ,
f(x) = {Γ(7+τ)/(Γ(τ)Γ(7))} θ^(-τ) x^(τ-1) {1 + (x/θ)}^-(7 + τ) = {Γ(7+τ)/(Γ(τ)Γ(7))} (33/τ)^(-τ) x^(τ-1) {1 + (xτ/33)}^-(7 + τ) =
{Γ(7+τ)/(Γ(τ)Γ(7))} x^(-1) (xτ/33)^(-7) {1 + (33/x)/τ}^(-7) {1 + (33/x)/τ}^(-τ). As τ approaches infinity, this approaches:
{τ^7/Γ(7)} x^(-1) (xτ/33)^(-7) exp(-33/x) = 33^7 x^(-8) exp(-33/x) / Γ(7).
This is the density of an Inverse Gamma Distribution with scale parameter 33 and α = 7.]
Exercise: What is the limit of an Inverse Burr Distribution, with γ = 4, as τ goes to infinity while
θ = 13/τ^(1/4)?
[Solution: This is a special case of the limit of a Transformed Beta. Using the above result, in this case,
the limit is an Inverse Weibull Distribution with scale parameter 13 and τ = 4. Alternately, for the
Inverse Burr with γ = 4, F(x) = 1/{1 + (θ/x)⁴}^τ = 1/{1 + (13/x)⁴/τ}^τ. As τ approaches infinity, F(x)
approaches 1/exp((13/x)⁴) = exp(-(13/x)⁴). This is an Inverse Weibull Distribution with scale
parameter 13 and τ = 4.]

Problems:
28.1 (1 point) Which of the following are special cases of the Transformed Beta Distribution?
1. Beta Distribution
2. ParaLogistic Distribution
3. Inverse Gaussian Distribution
A. None of 1, 2 or 3
B. 1
C. 2
D. 3
E. None of A, B, C, or D
28.2 (1 point) Which of the following can be obtained as limits of Transformed Beta Distributions?
1. Weibull Distribution
2. Inverse Gamma Distribution
3. Single Parameter Pareto Distribution
A. None of 1, 2 or 3
B. 1
C. 2
D. 3
E. None of A, B, C, or D
28.3 (2 points) Calculate the density function at 14, f(14), for a Transformed Beta Distribution with
α = 3, θ = 10, τ = 4 and γ = 6.
Hint: f(x) = {Γ(α+τ) / (Γ(α) Γ(τ))} γ (x/θ)^(γτ) {1 + (x/θ)^γ}^-(α + τ) / x.
A. less than 0.02
B. at least 0.02 but less than 0.03
C. at least 0.03 but less than 0.04
D. at least 0.04 but less than 0.05
E. at least 0.05
28.4 (1 point) Match the Distributions:
1. Pareto             a. Transformed Beta with α = 1 and τ = 1
2. LogLogistic        b. Transformed Beta with τ = 1 and γ = 1
3. Inverse Pareto     c. Transformed Beta with α = 1 and γ = 1
A. 1a, 2b, 3c    B. 1a, 2c, 3b    C. 1b, 2a, 3c    D. 1b, 2c, 3a    E. 1c, 2b, 3a

28.5 (2 points) What is the limit of Inverse Pareto Distributions, as τ goes to infinity while θ = 7/τ?
A. An Exponential Distribution, with scale parameter 7.
B. An Exponential Distribution, with scale parameter 1/7.
C. An Inverse Exponential Distribution, with scale parameter 7.
D. An Inverse Exponential Distribution, with scale parameter 1/7.
E. None of the above.

Solutions to Problems:
28.1. C. 1. While the Transformed Beta Distribution can be derived from the Beta Distribution,
the Beta has support [0,1], while the Transformed Beta Distribution has support x > 0.
The Beta is not a special case of a Transformed Beta.
2. The ParaLogistic Distribution is a special case of a Transformed Beta, with τ = 1 and α = γ.
3. The Inverse Gaussian Distribution is not a special case of a Transformed Beta Distribution.
28.2. E. 1. Yes. By taking appropriate limits of Burr Distributions (Transformed Beta Distributions
with τ = 1), one can obtain a Weibull Distribution. 2. Yes. By taking appropriate limits of
Generalized Pareto Distributions (Transformed Beta Distributions with γ = 1), one can obtain an
Inverse Gamma Distribution. 3. No. The Single Parameter Pareto has support x > θ, while the
Transformed Beta Distribution has support x > 0.
28.3. B. f(x) = {Γ(α+τ)/(Γ(α)Γ(τ))} γ (x/θ)^(γτ) {1 + (x/θ)^γ}^-(α + τ) / x.
f(14) = {Γ(7)/(Γ(4)Γ(3))} (6) (1.4)²⁴ {1 + (1.4)⁶}^(-7) / 14 = (60)(6)(3214.2)/{(3,284,565)(14)} = 0.025.
28.4. C. 1b, 2a, 3c.
28.5. C. This is a special case of the limit of a Transformed Beta.
The limit is an Inverse Transformed Gamma with α = 1 and τ = 1 and scale parameter 7.
That is an Inverse Exponential with scale parameter 7.
Alternately, for the Inverse Pareto, F(x) = {x/(x + θ)}^τ = (1 + θ/x)^(-τ) = (1 + (7/x)/τ)^(-τ).
As τ approaches infinity, F(x) approaches 1/exp(7/x) = exp(-7/x).
This is an Inverse Exponential with scale parameter 7.

Section 29, Producing Additional Distributions


Given a light-tailed distribution, one can produce a more heavy-tailed distribution by looking at the
inverse of x. Let G(x) = 1 - F(1/x).169 For example, if F is a Gamma Distribution, then G is an
Inverse Gamma Distribution. For a Gamma Distribution with θ = 1, F(x) = Γ[α; x].
Letting y = θ/x, G(y) = 1 - F(x) = 1 - F(θ/y) = 1 - Γ[α; θ/y], the Inverse Gamma Distribution.
Given a Distribution, one can produce another distribution by adding up independent identical
copies. For example, adding up α independent Exponential Distributions gives a Gamma
Distribution. As α approaches infinity one approaches a very light-tailed Normal Distribution.
One can get a more heavy-tailed distribution by the change of variables y = ln(x).
Let G(x) = F[ln(x)]. For example, if F is the Normal Distribution, then G is the heavier-tailed
LogNormal Distribution. Loss Models refers to this as "exponentiating", since if y = ln(x),
then x = ey.
One can get new distributions by the change of variables y = x^(1/τ). Loss Models refers to this as
"raising to a power". Let G be the distribution of Y = X^(1/τ), τ > 0; then G(y) = F(y^τ). For example, if F is the
Exponential Distribution with mean θ, then G is a Weibull Distribution, with scale parameter θ^(1/τ) and
shape parameter τ. For τ > 1 the Weibull Distribution has a lighter tail than the Exponential Distribution.
For τ < 1 the Weibull Distribution has a heavier tail than the Exponential Distribution.
For τ > 0, Loss Models refers to the new distribution as transformed, for example Transformed
Gamma versus Gamma.170
If τ < 0, then G(y) = 1 - F(y^τ).
For τ < 0, this is called the inverse transformed distribution, such as the Inverse Transformed
Gamma versus the Gamma. This can be usefully thought of as two separate changes of variables:
raising to a positive power and inverting.
For the special case τ = -1, this is the inverse distribution as discussed previously, such as the
Inverse Gamma versus the Gamma.

169 We need to subtract from one, so that G(0) = 0 and G(∞) = 1.
170 However, some distributions retain their special names. For example the Weibull is not called the transformed
Exponential, nor is the Burr called the transformed Pareto.

Exercise: X is Exponential with mean 10. Determine the form of the distribution of Y = X³.
[Solution: F(x) = 1 - exp[-x/10]. Y = X³. X = Y^(1/3).
FY(y) = FX[y^(1/3)] = 1 - exp[-y^(1/3)/10] = 1 - exp[-(y/1000)^(1/3)].
A Weibull Distribution with θ = 1000 and τ = 1/3.
Alternately, FY(y) = Prob[Y ≤ y] = Prob[X³ ≤ y] = Prob[X ≤ y^(1/3)] = 1 - exp[-y^(1/3)/10].
FY(y) = 1 - exp[-(y/1000)^(1/3)]. A Weibull Distribution with θ = 1000 and τ = 1/3.
Alternately, f(x) = exp[-x/10]/10.
fY(y) = fX[y^(1/3)] |dx/dy| = {exp[-y^(1/3)/10]/10} (1/3)y^(-2/3) = (1/3) (y/1000)^(1/3) exp[-(y/1000)^(1/3)] / y.
A Weibull Distribution with θ = 1000 and τ = 1/3.
Comment: A change of variables as in calculus class.]
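This change of variables is also easy to confirm by simulation. The Python sketch below (not from the text; the seed and the checkpoint y = 8000 are arbitrary) compares a simulated survival probability for Y = X³ with the Weibull formula.
# If X is Exponential with mean 10, then Y = X^3 should match a Weibull with theta = 1000 and tau = 1/3.
import numpy as np

rng = np.random.default_rng(seed=1)
x = rng.exponential(scale=10.0, size=1_000_000)
y = x**3

print((y > 8000).mean())                      # simulated S(8000)
print(np.exp(-(8000 / 1000)**(1/3)))          # exp(-(y/1000)^(1/3)) = exp(-2), about 0.1353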
One can get additional distributions as a mixture of distributions. As will be discussed in a
subsequent section, the Pareto can be obtained as a mixture of an Exponential by an Inverse
Gamma.171 Usually such mixing produces a heavier tail; the Pareto has a heavier tail than an
Exponential. The Negative Binomial which can be obtained as a mixture of Poissons via a Gamma
has a heavier tail than the Poisson. Loss Models refers to this as Mixing. Another method of getting
new distributions is to weight together two or more existing distributions. Such mixed distributions,
referred to by Loss Models as n-point or two-point mixtures, are discussed in a subsequent
section.172
One can get additional distributions as a ratio of independent variables each of which follows a
known distribution. For example an F-distribution is a ratio of Chi-Squares.173 As a special
case, the Pareto can be obtained as a ratio of an Exponential variable and a Gamma
variable.174 The Beta Distribution can be obtained as a combination of two Gammas.175
Generally the distributions so obtained have heavier tails.

171 Loss Models Example 5.4 shows that mixing an exponential via an inverse exponential yields a Pareto
Distribution. This is just a special case of the Inverse Gamma-Exponential, with mixed distribution a Pareto.
Example 5.6 shows that mixing an Inverse Weibull via a Transformed Gamma with the same τ parameter gives an
Inverse Burr Distribution.
172 Loss Models, Section 4.2.3.
173 The F-Distribution from Statistics is related to the Generalized Pareto Distribution.
174 See p. 47, Loss Distributions by Hogg & Klugman.
175 If X is a random draw from a Gamma Distribution with shape parameter a and scale parameter θ, and Y is a random
draw from a Gamma Distribution with shape parameter b and scale parameter θ,
then Z = X / (X+Y) is a random draw from a Beta Distribution with parameters a and b.

Finally, one can introduce a scale parameter. If one had the distribution F(x) = 1 - e^(-x), one can create
a family of distributions by substituting x/θ everywhere x appears. θ is now a scale parameter.
F(x) = 1 - e^(-x/θ). Introducing a scale parameter does not affect either the tail behavior or the shape of
the distribution. Loss Models refers to this as "multiplying by a constant".
Exercise: X is uniform from 0 to 0.1. Y = √(10/X) - 10. Determine the distribution of Y.
[Solution: F(x) = 10x, 0 ≤ x ≤ 0.1. X = 10/(Y + 10)².
x = 0 ⟺ y = ∞. x = 0.1 ⟺ y = 0. Small Y ⟺ Large X. Need to take 1 - F(x).
FY(y) = 1 - FX[10/(y + 10)²] = 1 - 10²/(y + 10)². A Pareto Distribution with α = 2 and θ = 10.
Alternately, FY(y) = Prob[Y ≤ y] = Prob[√(10/X) - 10 ≤ y] = Prob[√(10/X) ≤ y + 10] =
Prob[10/X ≤ (y + 10)²] = Prob[X ≥ 10/(y + 10)²] = 1 - Prob[X ≤ 10/(y + 10)²] =
1 - (10){10/(y + 10)²} = 1 - 10²/(y + 10)².
FY(y) = 1 - 10²/(y + 10)², a Pareto Distribution with α = 2 and θ = 10.
Alternately, f(x) = 10, 0 ≤ x ≤ 0.1.
fY(y) = fX[10/(y + 10)²] |dx/dy| = (10) 20/(y + 10)³ = (2) 10²/(y + 10)³.
A Pareto Distribution with α = 2 and θ = 10.]

Percentiles:
A one-to-one monotone transformation, such as ln(x), e^x, or x², preserves the percentiles, including
the median. For example, the median of a Normal Distribution is μ, which implies that the median of a
LogNormal Distribution is e^μ.

Problems:
29.1 (4 points) X follows a Standard Normal Distribution, with mean zero and standard deviation
of 1. Y = 1/X.
(a) (1 point) What is the density of y?
(b) (2 points) Graph the density of y.
(c) (1 point) What is E[Y]?
29.2 (1 point) X follows an Exponential Distribution with mean 1. Let Y = θ exp[X/α].
What is the distribution of Y?
29.3 (3 points) X follows a Weibull Distribution with parameters τ and θ. Let Y = -ln[X].
What are the algebraic forms of the distribution and density functions of Y?
29.4 (3 points) ln(X) follows a LogNormal Distribution with μ = 1.3 and σ = 0.4.
Determine the density function of X.
29.5 (3 points) X follows a Standard Normal Distribution, with mean 0 and standard deviation of 1.
Let Y = X^(2/τ), for τ > 0. Determine the form of the Distribution of Y.
29.6 (2 points) Let X follow the density f(x) = e^(-x) / (1 + e^(-x))², -∞ < x < ∞.
Let Y = θ e^(X/γ), for θ > 0, γ > 0. Determine the form of the distribution of Y.
29.7 (2 points) Let X be a uniform distribution from 0 to 1.
Let Y = θ ln[(1 - p) / (1 - p^X)], for θ > 0, 1 > p > 0. Determine the form of the distribution function of Y.
29.8 (2 points) X follows an Exponential Distribution with hazard rate μ. Let Y = exp[-δX], δ > 0.
What is the distribution of Y?
29.9 (3 points) X is uniform on (0, 1). Given X = x, Y is uniform on (0, √x).
(a) What is the unconditional distribution of Y?
(b) Determine E[Y].

29.10 (1 point) X follows an Exponential Distribution with mean 10. Let Y = 1/X.
What is the distribution of Y?

29.11 (4 points) You are given the following:


The random variable X has a Normal Distribution, with mean zero and standard deviation σ.
The random variable Y also has a Normal Distribution, with mean zero and standard deviation σ.
X and Y are independent.
R² = X² + Y².
Determine the form of the distribution of R.
Hint: The sum of squares of ν independent Standard Normals is a Chi-Square Distribution with ν
degrees of freedom, which is a Gamma Distribution with α = ν/2 and θ = 2.
29.12 (2, 5/90, Q.36) (1.7 points) If Y is uniformly distributed on the interval (0, 1) and if
Z = -a ln(1 - Y) for some a > 0, then to which of the following families of distributions does Z belong?
A. Pareto
B. LogNormal
C. Normal
D. Exponential
E. Uniform
29.13 (4B, 11/97, Q.11) (2 points) You are given the following:
The random variable X has a Pareto distribution, with parameters θ and α.
Y is defined to be ln(1 + X/θ).
Determine the form of the distribution of Y.
A. Negative Binomial    B. Exponential    C. Pareto    D. Lognormal    E. Normal

29.14 (IOA 101, 4/00, Q.13) (3.75 points) Suppose that the distribution of a physical coefficient,
X, can be modeled using a uniform distribution on (0, 1).
A researcher is interested in the distribution of Y, an adjusted form of the reciprocal of the coefficient,
where Y = (1/X) - 1.
(i) (2.25 points) Determine the probability density function of Y.
(ii) (1.5 points) Determine the mean of Y.
29.15 (1, 11/01, Q.13) (1.9 points) An actuary models the lifetime of a device using the random
variable Y = 10X0.8, where X is an exponential random variable with mean 1 year.
Determine the probability density function f(y), for y > 0, of the random variable Y.
(A) 10 y^0.8 exp[-8y^(-0.2)]
(B) 8 y^(-0.2) exp[-10y^0.8]
(C) 8 y^(-0.2) exp[-(0.1y)^1.25]
(D) (0.1y)^1.25 exp[-1.25(0.1y)^0.25]
(E) 0.125 (0.1y)^0.25 exp[-(0.1y)^1.25]

29.16 (CAS3, 11/05, Q.19) (2.5 points)
Claim size, X, follows a Pareto distribution with parameters α and θ.
A transformed distribution, Y, is created such that Y = X^(1/τ).
Which of the following is the probability density function of Y?
A. τ y^(τ-1) / (y^τ + θ)^(α+1)
B. α θ^α τ y^(τ-1) / (y^τ + θ)^(α+1)
C. α τ / (y^τ + θ)^(α+1)
D. α (y/θ)^τ / {y [1 + (y/θ)^τ]^(α+1)}
E. α θ^α / (y^τ + θ)^(α+1)
29.17 (CAS3, 5/06, Q.27) (2.5 points)
The following information is available regarding the random variables X and Y:
X follows a Pareto Distribution with α = 2 and θ = 100.
Y = ln[1 + (X/θ)]
Calculate the variance of Y.
A. Less than 0.10
B. At least 0.10, but less than 0.20
C. At least 0.20, but less than 0.30
D. At least 0.30, but less than 0.40
E. At least 0.40
Solutions to Problems:
29.1. (a) f(x) = exp[-0.5x²]/√(2π). x = 1/y. dx/dy = -1/y².
g(y) = f(x) |dx/dy| = {exp[-0.5/y²]/y²} / √(2π), -∞ < y < ∞.
(b) This density is zero at zero, is symmetric, and has maxima at ±1/√2:
[Figure: graph of this density for y from -6 to 6, symmetric about zero, equal to zero at y = 0, with maxima of about 0.29 at y = ±1/√2.]
(c) E[Y] = 2 ∫_0^∞ y g(y) dy = 2 ∫_0^∞ {exp[-0.5/y²]/y} / √(2π) dy. Now as y → ∞, exp[-0.5/y²] → e⁰ = 1.
Therefore, for large values of y, the integrand is basically 1/y, which has no finite integral since
ln[∞] = ∞. Therefore, the first moment of Y does not exist.
Comment: This is an Inverse Normal Distribution, none of whose positive moments exist.
29.2. F(x) = 1 - exp[-x]. y = θ exp[x/α]. exp[x/α] = y/θ. exp[x] = (y/θ)^α.
Substituting into F(x), F(y) = 1 - (θ/y)^α, a Single Parameter Pareto Distribution.
Comment: While x goes from 0 to ∞, y goes from θ exp[0/α] = θ to ∞.
In general, if ln[Y] follows a Gamma, then Y follows what is called a LogGamma.
A special case is when ln[Y] is Exponential. Then Y follows a LogExponential, which is
just a Single Parameter Pareto Distribution with θ = 1.

29.3. F(x) = 1 - exp[-(x/θ)^τ], x > 0. y = -ln[x]. x = e^(-y). x = 0 ⟺ y = ∞, and x = ∞ ⟺ y = -∞.
Since large x corresponds to small y, we need to substitute into S(x) rather than F(x).
Substituting into S(x), F(y) = exp[-(e^(-y)/θ)^τ] = exp[-e^(-τy)/θ^τ], -∞ < y < ∞.
Differentiating, f(y) = exp[-e^(-τy)/θ^τ] τ e^(-τy)/θ^τ, -∞ < y < ∞.
Alternately, for the Weibull Distribution, f(x) = τ x^(τ-1) exp(-(x/θ)^τ) / θ^τ.
f(y) = f(x) |dx/dy| = {τ e^(-y(τ-1)) exp(-(e^(-y)/θ)^τ) / θ^τ} e^(-y) = exp[-e^(-τy)/θ^τ] τ e^(-τy)/θ^τ, -∞ < y < ∞.
Comment: This distribution is sometimes called the Gumbel Distribution.
For τ = 1 and θ = 1, F(y) = exp[-e^(-y)], and f(y) = exp[-y - e^(-y)], -∞ < y < ∞, which is a form of what is
called the Extreme Value Distribution, the Fisher-Tippett Type I Distribution, or the Doubly
Exponential Distribution.
29.4. The LogNormal Distribution has support starting at 0, so we want ln(x) > 0: x > 1.
F(x) = LogNormal Distribution at ln(x): Φ[{ln(ln(x)) - 1.3}/0.4].
f(x) = φ[{ln(ln(x)) - 1.3}/0.4] d ln(ln(x))/dx =
{exp[-{ln(ln(x)) - 1.3}² / (2 × 0.4²)] / (0.4 √(2π))} / {x ln(x)} =
exp[-3.125{ln(ln(x)) - 1.3}²] / {0.4 x ln(x) √(2π)}, x > 1.
Comment: Beyond what you are likely to be asked on your exam. Just as the LogNormal
Distribution has a much heavier righthand tail than the Normal Distribution, the
LogLogNormal Distribution has a much heavier righthand tail than the LogNormal
Distribution.
29.5. f(x) = exp[-x²/2]/√(2π). X = Y^(τ/2).
Since x is symmetric around zero, but x² ≥ 0, we need to double the density of x.
g(y) = 2 f(x) |dx/dy| = 2{exp[-y^τ/2]/√(2π)}(τ/2)y^(τ/2 - 1) = τ y^(τ/2 - 1) exp[-y^τ/2] / √(2π).
The density of a Transformed Gamma Distribution is: f(x) = τ x^(τα-1) exp[-(x/θ)^τ] / {θ^(τα) Γ(α)}.
Matching parameters, α = 1/2 and θ^τ = 2. The density of y is a Transformed Gamma
Distribution with parameters α = 1/2, τ, and θ = 2^(1/τ).
Comment: Γ(α) θ^(τα) = Γ(1/2) 2^(1/2) = √π √2 = √(2π).
If τ = 1, then Y has a Gamma Distribution with α = 1/2 and θ = 2,
which is a Chi-Square Distribution with one degree of freedom.

29.6. By integration, F(x) = 1/(1 + e^(-x)) = e^x / (1 + e^x), -∞ < x < ∞.
e^x = (y/θ)^γ. Therefore, FY(y) = (y/θ)^γ / {1 + (y/θ)^γ}, y > 0. This is a Loglogistic Distribution.
Comment: The original distribution is called a Logistic Distribution. The Loglogistic has a similar
relationship to the Logistic Distribution, as the LogNormal has to the Normal.

29.7. y/θ = ln[(1 - p)/(1 - p^x)]. e^(y/θ) = (1 - p)/(1 - p^x). (1 - p) e^(-y/θ) = 1 - p^x. p^x = 1 - (1 - p) e^(-y/θ).
x ln(p) = ln[1 - (1 - p) e^(-y/θ)]. x = ln[1 - (1 - p) e^(-y/θ)] / ln(p).
For x = 1, y = 0, while as x approaches zero, y approaches infinity.
Since X is uniform, FX(x) = x. FY(y) = 1 - ln[1 - (1 - p) e^(-y/θ)] / ln(p), y > 0.
Comment: The density is fY(y) = -(1 - p) e^(-y/θ) / {θ [1 - (1 - p) e^(-y/θ)] ln(p)}, y > 0.

The distribution of Y is called an Exponential-Logarithmic Distribution.


As p approaches 1, the distribution of Y approaches an Exponential Distribution.
The Exponential-Logarithmic Distribution has a declining hazard rate.
If frequency follows a Logarithmic Distribution, and severity is Exponential, then the minimum
of the claim sizes follows an Exponential-Logarithmic Distribution.
Here is a graph comparing the density of an Exponential with mean 100
and an Exponential-Logarithmic Distribution with p = 0.2 and θ = 100:
[Figure: densities over sizes 0 to 300; the Exponential-Logarithmic density starts higher, at about 0.025 at size 0 versus 0.010 for the Exponential, and declines more steeply at first.]

29.8. F(x) = 1 - exp[-μx]. X = -ln[Y]/δ.
When x is big y is small and vice-versa. As x goes from zero to infinity, y goes from 1 to 0.
Therefore, we get the distribution function of Y by plugging into the survival function of X:
F(y) = exp[-μ(-ln[y]/δ)] = y^(μ/δ), 0 < y < 1. f(y) = (μ/δ) y^(μ/δ - 1), 0 < y < 1.
Y follows a Beta Distribution, with parameters θ = 1, a = μ/δ, and b = 1.
Comment: If X is the future lifetime, and δ is the force of interest, then Y is the present value of a life
insurance that pays 1. The actuarial present value of this insurance is:
E[Y] = a/(a + b) = (μ/δ)/(μ/δ + 1) = μ/(μ + δ).
As discussed in Life Contingencies, if the distribution of future lifetimes is Exponential with hazard
rate μ, then μ/(μ + δ) is the actuarial present value of a life insurance that pays 1.
29.9. a. F[y | x] = y/√x for 0 ≤ y ≤ √x. F[y | x] = 1 for y > √x.
In other words, F[y | x] = 1 for 0 ≤ x ≤ y², and F[y | x] = y/√x for y² ≤ x ≤ 1.
Thus, F[y] = ∫_0^(y²) 1 dx + ∫_(y²)^1 (y/√x) dx = y² + [2y√x]_(x=y²)^(x=1) = y² + 2y - 2y² = 2y - y², 0 ≤ y ≤ 1.
b. S(y) = 1 + y² - 2y. E[Y] = ∫_0^1 (1 + y² - 2y) dy = 1 + 1/3 - 1 = 1/3.
Alternately, f(y) = 2 - 2y, 0 ≤ y ≤ 1.
This is a Beta Distribution with a = 1, b = 2, and θ = 1.
E[Y] = a / (a + b) = 1/3.
29.10. F(x) = 1 - e^(-x/10). Let G be the distribution function of Y.
G(y) = 1 - F(x) = 1 - F(1/y) = exp[-0.1/y].
This is an Inverse Exponential Distribution with θ = 0.1.
Comment: We need to subtract from one, so that G(0) = 0 and G(∞) = 1.

29.11. For σ = 1, R² is the sum of the squares of two unit Normals, and thus a Chi-Square with 2 degrees of
freedom, which is an Exponential Distribution with θ = 2.
Now R = (R²)^(1/2), so we have a power transformation, and thus R is Weibull with τ = 2.
Specifically, the survival function of R is: S(r) = survival function of R² at r² = exp[-r²/2].
Now if σ ≠ 1, we just have a scale transformation, and r is divided by σ wherever r appears in the
survival function:
S(r) = exp[-r²/(2σ²)] = exp[-(r/(σ√2))²].
R follows a Weibull Distribution with τ = 2 and θ = σ√2.

Comment: This is called a Rayleigh Distribution.


29.12. D. F(z) = Prob[Z ≤ z] = Prob[-a ln(1 - Y) ≤ z] = Prob[ln(1 - Y) ≥ -z/a] =
Prob[1 - Y ≥ e^(-z/a)] = Prob[Y ≤ 1 - e^(-z/a)] = 1 - e^(-z/a). An Exponential Distribution with θ = a.
Comment: For Y uniform on [0, 1], Prob[Y ≤ y] = y.
This is the basis of one way to simulate an Exponential Distribution.
Z = -a ln(Y) also follows an Exponential Distribution with θ = a, which is the basis of another way
to simulate an Exponential Distribution.
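Here is a minimal Python sketch (not from the exam solution) of both simulation recipes; the value a = 100, the seed, and the sample size are arbitrary choices for illustration.
# Inverse transform simulation of an Exponential, as in 29.12, with an assumed a = 100.
import numpy as np

rng = np.random.default_rng(seed=2)
y = rng.uniform(size=1_000_000)
z1 = -100 * np.log(1 - y)   # Z = -a ln(1 - Y)
z2 = -100 * np.log(y)       # Z = -a ln(Y); also Exponential, since Y and 1 - Y are both uniform

print(z1.mean(), z2.mean())   # both are approximately 100, the mean of an Exponential with theta = a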
29.13. B. If y = ln(1 + x/θ), then dy/dx = (1/θ) / (1 + x/θ) = 1/(θ + x). Note that e^y = 1 + x/θ.
g(y) = f(x) / |dy/dx| = {αθ^α (θ + x)^-(α + 1)} / {1/(θ + x)} = α(1 + x/θ)^(-α) = α(e^y)^(-α) = αe^(-αy).
Thus y is distributed as per an Exponential.
Comment: See for example page 107 of Insurance Risk Models by Panjer and Willmot, not on the
Syllabus.
29.14. (i) F(x) = x, 0 < x < 1. y = 1/x - 1. x = 1/(1 + y). F(y) = 1 - 1/(1 + y), 0 < y < ∞.
f(y) = 1/(1 + y)², 0 < y < ∞.
Alternately, f(x) = 1, 0 < x < 1. dy/dx = -1/x². f(y) = f(x)/|dy/dx| = x² = 1/(1 + y)².
When x is 0, y is ∞, and when x = 1, y is 0. f(y) = 1/(1 + y)², 0 < y < ∞.
(ii) Y follows a Pareto Distribution with α = 1 (and θ = 1), and therefore the mean does not exist.
Alternately, E[Y] is the integral from 0 to ∞ of y/(1 + y)², which does not exist, since for large y the
integrand acts like 1/y.

29.15. E. S(x) = exp[-x]. y = 10x^0.8. x = (y/10)^1.25. S(y) = exp[-(y/10)^1.25].
f(y) = 1.25 y^0.25 exp[-(y/10)^1.25] / 10^1.25 = 0.125 (0.1y)^0.25 exp[-(0.1y)^1.25].
Comment: Y follows a Weibull Distribution with τ = 1.25 and θ = 10.
29.16. B. Y = X^(1/τ). x = y^τ. dx/dy = τ y^(τ-1).
f(x) = αθ^α/(x + θ)^(α+1).
f(y) = dF/dy = (dF/dx)(dx/dy) = {αθ^α/(x + θ)^(α+1)} τ y^(τ-1) = {αθ^α/(y^τ + θ)^(α+1)} τ y^(τ-1)
= αθ^α τ y^(τ-1)/(y^τ + θ)^(α+1).
Alternately, F(x) = 1 - {θ/(x + θ)}^α. x = y^τ. F(y) = 1 - {θ/(y^τ + θ)}^α.
Differentiating with respect to y, f(y) = αθ^α τ y^(τ-1)/(y^τ + θ)^(α+1).
Comment: Basically, just a change of variables from calculus. The result is a Burr Distribution, but
with a somewhat different treatment of the scale parameter than in Loss Models.
If τ = 1, one should just get the density of the original Pareto. This is not the case for choices A and
C, eliminating them. While it is not obvious, choice D does pass this test.
29.17. C. Y = ln(1 + (X/θ)). X = θ(e^Y - 1) = 100(e^Y - 1).
F(x) = 1 - {100/(100 + x)}². F(y) = 1 - {100/(100 + 100(e^y - 1))}² = 1 - e^(-2y).
Thus Y follows an Exponential Distribution with θ = 1/2, and variance θ² = 1/4.

Section 30, Tails of Loss Distributions


Actuaries are often interested in the behavior of a size of loss distribution as the size of claim gets
very large. The question of interest is how quickly the right-hand tail probability, as quantified in the
survival function S(x) = 1 - F(x), goes to zero as x approaches infinity. If the tail probability goes to
zero slowly, one describes that as a "heavy-tailed distribution."
For example, for the Pareto distribution S(x) = {θ/(θ + x)}^α, which goes to zero as per x^(-α).
If the tail probability goes to zero quickly, then one describes the distribution as "light-tailed".
For the Exponential Distribution, S(x) = e^(-x/θ), which goes to zero very quickly as x → ∞.
The heavier tailed distribution will have both its density and its survival function go to zero more
slowly as x approaches infinity. For example, for a Pareto, f(x) = αθ^α / (θ + x)^(α+1), which goes to zero
more slowly than the density of an Exponential Distribution, f(x) = e^(-x/θ)/θ.
For example, here is a comparison starting at 300 of the Survival Function of an Exponential
Distribution with θ = 100 versus that of a Pareto Distribution with α = 3, θ = 200, and mean of 100:

[Figure: S(x) for x from 300 to 1000; the Pareto survival function lies above that of the Exponential, starting near 0.06 and declining slowly, while the Exponential survival function drops rapidly toward zero.]

The Pareto with a heavier righthand tail has its Survival Function go to zero more slowly as x
approaches infinity, than the Exponential. The Exponential has less probability in its righthand tail
than the Pareto. The Exponential has a lighter righthand tail than the Pareto.

Exercise: Compare S(1000) for the Exponential Distribution with θ = 100 versus that of a
Pareto Distribution with α = 3, θ = 200, and mean 100.
[Solution: For the Exponential, S(1000) = e^(-1000/100) = 0.00454%.
For the Pareto, S(1000) = (200/1200)³ = 0.46296%.
Comment: The Pareto Distribution has a much higher probability of a loss of size greater than 1000
than does the Exponential Distribution.]
Exercise: What are the mean and second moment of a Pareto Distribution with parameters
α = 3 and θ = 10?
[Solution: The mean is: θ/(α - 1) = 10/2 = 5. The second moment is:
2θ² / {(α - 1)(α - 2)} = 200 / 2 = 100.]
Exercise: What are the mean and second moment of a LogNormal Distribution with parameters
μ = 0.9163 and σ = 1.1774?
[Solution: The mean is: exp(μ + σ²/2) = exp(1.6094) = 5.
The second moment is: exp(2μ + 2σ²) = exp(4.605) = 100.]
Thus a Pareto Distribution with parameters α = 3 and θ = 10 and a LogNormal Distribution with
parameters μ = 0.9163 and σ = 1.1774 have the same mean and second moment, and therefore
the same variance. However, while their first two moments match, the Pareto has a heavier tail. This
can be seen by calculating the density functions for some large values of x.
Exercise: What are f(10), f(100), f(1000) and f(10,000) for a Pareto Distribution with parameters
α = 3 and θ = 10?
[Solution: For a Pareto, f(x) = αθ^α / (θ + x)^(α+1). So that f(10) = 3000/20⁴ = 0.01875,
f(100) = 2.05 x 10^-5, f(1000) = 2.88 x 10^-9, f(10,000) = 2.99 x 10^-13.]

Exercise: What are f(10), f(100), f(1000) and f(10,000) for a LogNormal Distribution with
parameters μ = 0.9163 and σ = 1.1774?
[Solution: For a LogNormal, f(x) = exp[-(ln(x) - μ)² / (2σ²)] / {xσ√(2π)}, so that
f(10) = 0.0169, f(100) = 2.50 x 10^-5, f(1000) = 8.07 x 10^-10, f(10,000) = 5.68 x 10^-16.]
x          Pareto (α = 3, θ = 10) Density    LogNormal (μ = 0.9163, σ = 1.1774) Density
10         1.87e-2                           1.69e-2
100        2.05e-5                           2.50e-5
1000       2.88e-9                           8.07e-10
10000      2.99e-13                          5.68e-16

While at 10 and 100 the two densities are similar, by the time we get to 1000, the LogNormal
Density has started to go to zero more quickly. This LogNormal has a lighter tail than this Pareto.
In general any LogNormal has a lighter tail than any Pareto Distribution.
For the LogNormal, ln f(x) = -0.5({ln(x) - μ}/σ)² - ln(x) - ln(σ) - ln(2π)/2.
For very large x this is approximately: -0.5 ln(x)²/σ².
For the Pareto, ln f(x) = ln(α) + α ln(θ) - (α+1) ln(θ + x).
For very large x this is approximately: -(α+1) ln(x).
Since the square of ln(x) eventually gets much bigger than ln(x), the log density of the Lognormal
(eventually) goes to minus infinity faster than that of the Pareto. In other words, for very large x, the
density of the Lognormal goes to zero more quickly than the Pareto. The LogNormal is lighter-tailed
than the Pareto.
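The density comparison in the table above is easy to reproduce. The Python sketch below (not from the text) evaluates both densities at the same points, using scipy's parameterization of the LogNormal, which is assumed to be available.
# Pareto (alpha = 3, theta = 10) versus LogNormal (mu = 0.9163, sigma = 1.1774):
# same first two moments, very different right-hand tails.
import numpy as np
from scipy import stats

alpha, theta = 3.0, 10.0
mu, sigma = 0.9163, 1.1774

def pareto_pdf(x):
    return alpha * theta**alpha / (theta + x)**(alpha + 1)

lognormal = stats.lognorm(s=sigma, scale=np.exp(mu))   # scipy's s = sigma, scale = exp(mu)

for x in [10, 100, 1000, 10000]:
    print(x, pareto_pdf(x), lognormal.pdf(x))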
There are a number of methods by which one can distinguish which distribution or empirical data set
has the heavier tail. Light-tailed distributions have more moments that exist. For example, the
Gamma Distribution has all of its (positive) moments exist. Heavy-tailed distributions do not have
higher moments exist. For example, for the Pareto, only those moments for n < α exist.
In general, computing the nth moment involves integrating x^n f(x) with upper limit of infinity. Thus if
f(x) goes to zero as x^(-m) as x approaches infinity, then the integrand is x^(n-m) for large x; thus the moment
only exists if n - m < -1, in other words if m > n + 1. The nth moment will only exist if f(x) goes to zero
faster than x^-(n+1).

For example, the Burr Distribution has f(x) = αγ(x/θ)^γ / {x [1 + (x/θ)^γ]^(α+1)}, which for large x goes to zero as per
x^(γ-1) x^(-γ(α+1)) = x^-(αγ+1), so the nth moment exists only if αγ > n.
For example, a Burr Distribution with α = 2.2 and γ = 0.8 has a first moment but fails to have a
second moment, since αγ = 1.76 < 2.
If it exists, the larger the coefficient of variation, the heavier-tailed the distribution. For example, for the
Pareto with α > 2, the Coefficient of Variation = √(α / (α - 2)), which increases as α approaches 2.
Thus as α decreases, the tail of the Pareto gets heavier.


Skewness:
Similarly, when it exists, the larger the skewness, the heavier the tail of the distribution. The Normal
Distribution is symmetric and thus has a skewness of zero. For the common size of loss distributions,
the skewness is usually positive when it exists.
The Gamma, Pareto and LogNormal all have positive skewness. For small τ the Weibull has
positive skewness, but has negative skewness for large enough τ.
The Gamma Distribution has skewness of 2/√α, which is always positive.
The skewness of the Pareto Distribution does not exist for α ≤ 3.
For α > 3, the Pareto skewness is: 2{(α + 1)/(α - 3)} √((α - 2)/α) > 0.

For the LogNormal Distribution the skewness = {exp(3σ²) - 3 exp(σ²) + 2} / {exp(σ²) - 1}^1.5.

The denominator is positive, since exp( 2) > 1 for 2 > 0.


The numerator is positive since it can be written as y3 - 3 y + 2, for y = exp( 2) > 1.
The derivative is 3y2 - 3 > 0 for y > 1.
At y =1 this denominator is zero, thus for y >1 this denominator is positive.
Thus the skewness of the LogNormal is positive.

For the Weibull Distribution the skewness is:
{Γ(1 + 3/τ) - 3Γ(1 + 1/τ)Γ(1 + 2/τ) + 2(Γ(1 + 1/τ))³} / {Γ(1 + 2/τ) - (Γ(1 + 1/τ))²}^1.5.
Note that the skewness depends on the shape parameter τ but not on the scale parameter θ.
For large τ the skewness is negative, approaching approximately -1.14 as τ goes to infinity.
The Weibull has positive skewness for τ < 3.6 and a negative skewness for τ > 3.6.
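The sign change near τ = 3.6 can be seen by evaluating the skewness formula above. The Python sketch below (not from the text) does this for a few arbitrary values of τ; the scale parameter cancels out.
# Weibull skewness as a function of tau, from the formula above.
from math import gamma

def weibull_skewness(tau):
    m1, m2, m3 = gamma(1 + 1/tau), gamma(1 + 2/tau), gamma(1 + 3/tau)
    return (m3 - 3*m1*m2 + 2*m1**3) / (m2 - m1**2)**1.5

for tau in [0.5, 1.0, 2.0, 3.6, 5.0, 20.0]:
    print(tau, round(weibull_skewness(tau), 3))
# Positive for small tau (2 at tau = 1, the Exponential), crossing zero near tau = 3.6, negative beyond.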
Mean Excess Loss (Mean Residual Lives):
Heavy-tailed distributions have mean excess losses (mean residual lives), e(x) that increase to
infinity as x approaches infinity.176 For example, for the Pareto the mean excess loss increases
linearly.
Light-tailed distributions have mean excess losses (mean residual lives) that increase slowly or
decrease as x approaches infinity. For example, the Exponential Distribution has a constant mean
excess loss, e(x) = θ.
Hazard Rate (Force of Mortality):177
The hazard rate / force of mortality is defined as:
h(x) = f(x) / S(x).
If the force of mortality is large, then the chance of being alive at large ages very quickly goes to
zero. If the hazard rate (force of mortality) is large, then the density drops off quickly to zero. Thus if
the hazard rate is increasing, the tail is light. Conversely, if the hazard rate decreases as x
approaches infinity, then the tail is heavy.
The hazard rate for an Exponential Distribution is constant, h(x) = 1/θ.
Relation of the Tail to the Exponential    Hazard Rate    Mean Residual Life
Heavier                                    Decreasing     Increasing
Lighter                                    Increasing     Decreasing

176 Mean Excess Losses (Mean Residual Lives) are discussed further in a subsequent section.
177 Hazard rates are discussed further in a subsequent section.

Heavier vs. Lighter Tails:


Heavier or lighter tail is a comparative concept; there are no strict definitions of heavy-tailed and
light-tailed:
Heavier Tailed                          Lighter Tailed
f(x) goes to zero more slowly           f(x) goes to zero more quickly
Few Moments exist                       All (positive) moments exist
Larger Coefficient of Variation (178)   Smaller Coefficient of Variation
Higher Skewness (179)                   Lower Skewness (180)
e(x) Increases to Infinity (181)        e(x) goes to a constant (182)
Decreasing Hazard Rate                  Increasing Hazard Rate

178 Very heavy tailed distributions may not even have a (finite) coefficient of variation.
179 Very heavy tailed distributions may not even have a (finite) skewness.
180 Very light tailed distributions may have a negative skewness.
181 The faster the mean excess loss increases to infinity, the more heavy the tail.
182 For very light-tailed distributions (such as the Weibull with τ > 1) the mean excess loss may go to zero as x
approaches infinity.

Here is a list of loss distributions, arranged in increasing heaviness of the tail:183

Distribution                 Mean Excess Loss (Mean Residual Life)            Do All Positive Moments Exist
Normal                       decreases to zero approximately as 1/x           Yes
Weibull for τ > 1            decreases to zero less quickly than 1/x          Yes
Trans. Gamma for τ > 1       decreases to zero less quickly than 1/x          Yes
Gamma for α > 1 (184)        decreases to a constant                          Yes
Exponential                  constant                                         Yes
Gamma for α < 1              increases to a constant                          Yes
Inverse Gaussian             increases to a constant                          Yes
Weibull for τ < 1            increases to infinity less than linearly         Yes
Trans. Gamma for τ < 1       increases to infinity less than linearly         Yes
LogNormal                    increases to infinity just less than linearly    Yes (185)
Pareto                       increases to infinity linearly                   No
Single Parameter Pareto      increases to infinity linearly                   No
Burr                         increases to infinity linearly                   No
Generalized Pareto           increases to infinity linearly                   No
Inverse Gamma                increases to infinity linearly                   No
Inverse Trans. Gamma         increases to infinity linearly                   No

183 The Pareto, Single Parameter Pareto, Burr, Generalized Pareto, Inverse Transformed Gamma and Inverse Gamma
all have tails that are not very different. The Gamma and Inverse Gaussian have tails that are not very different. The
Weibull and Transformed Gamma have tails that are not very different.
184 The Gamma Distribution with α < 1 is heavier tailed than the Exponential (α = 1).
The Gamma Distribution with α > 1 is lighter tailed than the Exponential (α = 1).
One way to remember which one is heavier than an Exponential is that as α → ∞, the Gamma Distribution is a sum of
many independent identically distributed Exponentials, which approaches a Normal Distribution.
The Normal Distribution is lighter tailed, and therefore so is a Gamma Distribution for α > 1.
185 While the moments exist for the LogNormal, the Moment Generating Function does not.

Comparing Tails:
There is an analytic technique one can use to more precisely compare the tails of distributions.
One takes the limit as x approaches infinity of the ratios of the densities.186
Exercise: What is the limit as x approaches infinity of the ratio of the density of a Pareto Distribution
with parameters α and θ to the density of a Burr Distribution with parameters α, θ, and γ?
[Solution: For the Pareto, f(x) = αθ^α (θ + x)^-(α + 1).
For the Burr (using g to distinguish it from the Pareto),
g(x) = αγ(x/θ)^γ {1 + (x/θ)^γ}^-(α + 1) / x.
lim (x→∞) f(x)/g(x) = lim (x→∞) αθ^α (θ + x)^-(α + 1) x / {αγ(x/θ)^γ [1 + (x/θ)^γ]^-(α + 1)} =
lim (x→∞) θ^(α(1-γ)) x^(α(γ-1)) / γ.
For γ > 1 the limit is infinity. For γ < 1 the limit is zero.
For γ = 1 the limit is one; for γ = 1, the Burr Distribution is a Pareto.]
Let f(x) and g(x) be the two densities; then if:
lim (x→∞) f(x)/g(x) = ∞, f has a heavier tail than g.
lim (x→∞) f(x)/g(x) = 0, f has a lighter tail than g.
lim (x→∞) f(x)/g(x) = a positive constant, f has a similar tail to g.
Exercise: Compare the tails of a Pareto Distribution with parameters α and θ, and a Burr Distribution
with parameters α, θ, and γ.
[Solution: The comparison depends on γ, the second shape parameter of the Burr.
For γ > 1, the Pareto has a heavier tail than the Burr.
For γ < 1, the Pareto has a lighter tail than the Burr.
For γ = 1, the Burr is equal to the Pareto, thus they have similar, in fact identical, tails.]

186

See Loss Models, Section 3.4.2.

Note, Loss Models uses the notation f(x) ~ g(x), x → ∞, when lim (x→∞) f(x)/g(x) = 1.
Two distributions have similar tails if f(x) ~ c g(x), x → ∞, for some constant c > 0.
Instead of taking the limit of the ratio of densities, one can equivalently take the limit of the ratio of the
survival functions.187
Exercise: What is the limit as x approaches infinity of the ratio of the Survival Function of a Pareto
Distribution with parameters α and θ to the Survival Function of a Burr Distribution with parameters
α, θ, and γ?
[Solution: For the Pareto, S(x) = θ^α (θ + x)^-α = (1 + x/θ)^-α.
For the Burr (using T to distinguish it from the Pareto), T(x) = {1 + (x/θ)^γ}^-α.
lim (x→∞) S(x)/T(x) = lim (x→∞) {(1 + (x/θ)^γ) / (1 + x/θ)}^α = lim (x→∞) {(x/θ)^(γ-1)}^α = lim (x→∞) (x/θ)^(α(γ-1)).
For γ > 1 the limit is infinity. For γ < 1 the limit is zero. For γ = 1 the limit is one; for γ = 1, the Burr
Distribution is a Pareto.]
Therefore the comparison of the tails of the Burr and Pareto depends on the value of γ, the second
shape parameter of the Burr. For γ > 1 the Burr has a lighter tail than the Pareto.
For γ < 1 the Burr has a heavier tail than the Pareto. For γ = 1, the Burr is equal to the Pareto, thus
they have similar, in fact identical, tails.
This makes sense, since for γ > 1, x^γ increases more quickly than x. Thus a Burr with γ = 2 has x² in
the denominator of its survival function, where the Pareto only has x. Thus the survival function of a
Burr with γ = 2 goes to zero more quickly than the Pareto, indicating it is lighter-tailed than the Pareto.
The reverse is true if γ = 1/2. Then the Burr has √x in the denominator of its survival function, where
the Pareto only has x.


These same technique also can be used to compare the tails of distributions from the same family.

187

The derivative of the survival function is minus the density. Since as x approaches infinity, S(x) approaches zero,
one can apply L'Hospital's Rule. Let the two densities be f and g. Let the two survival functions be S and T. Limit as x
approaches infinity of S(x)/T(x) = limit x approaches infinity of S'(x)/T'(x) =
limit x approaches infinity of - f(x)/(- g(x)) = limit x approaches infinity of f(x)/g(x).

Exercise: The first Distribution is a Gamma with parameters α and θ. The second Distribution is a
Gamma with parameters a and q. Which distribution has the heavier tail?
[Solution: The density of the first Gamma is: f1(x) ~ x^(α-1) exp(-x/θ).
The density of the second Gamma is: f2(x) ~ x^(a-1) exp(-x/q). f1(x)/f2(x) ~ x^(α-a) exp(x(1/q - 1/θ)).
If 1/q - 1/θ > 0, then the limit of f1(x)/f2(x) as x approaches infinity is ∞.
If 1/q - 1/θ < 0, then the limit of f1(x)/f2(x) as x approaches infinity is 0.
If 1/q - 1/θ = 0, then the limit of f1(x)/f2(x) as x approaches infinity is ∞ if α > a and 0 if a > α.
Thus we have that: If θ > q, then the first Gamma is heavier-tailed.
If θ < q, then the second Gamma is heavier-tailed.
If θ = q and α > a, then the first Gamma is heavier-tailed.
If θ = q and α < a, then the second Gamma is heavier-tailed.
Multiplicative constants such as Γ(α) or θ^α, which appear in the density, have been ignored since
they will not affect whether the limit of the ratio of densities goes to zero or infinity.]
Thus we see that the tails of two Gammas, while not very different, are not precisely similar.188
Whichever Gamma has the larger scale parameter is heavier-tailed. If they have the same scale
parameter, whichever Gamma has the larger shape parameter is heavier-tailed.189
Inverse Gaussian Distribution vs. Gamma Distribution:
The skewness of the Inverse Gaussian Distribution, 3√(μ/θ), is always three times its coefficient of
variation, √(μ/θ). In contrast, the Gamma Distribution has its skewness, 2/√α, always twice its
coefficient of variation, 1/√α. Thus if a Gamma and Inverse Gaussian have the same mean and
variance, then the Inverse Gaussian has a larger skewness; if a Gamma and Inverse Gaussian have
the same mean and variance, then the Inverse Gaussian has a heavier tail.
A data set for which a Gamma is a good candidate usually also has an Inverse Gaussian as a
good candidate. The fits of the two types of curves differ largely based on the relative
magnitude of the skewness of the data set compared to its coefficient of variation. For data sets
with less volume, there may be no way statistically to distinguish the fits.
188

Using the precise mathematical definitions in Loss Models. Casualty actuaries rarely use this concept to compare
the tails of two Gammas. It would be more common to compare a Gamma to let's say a LogNormal. (A LogNormal
Distribution has a significantly heavier-tail than a Gamma Distribution.)
189
If they have the same scale parameters and the same shape parameters, then the two Gammas are identical and
have the same tail.

Tails of the Transformed Beta Distribution:


Since many other distributions are special cases of the Transformed Beta
Distribution,190 it is useful to know its tail behavior. The density of the Transformed Beta Distribution is:
f(x) = {Γ(α+τ) / (Γ(α) Γ(τ))} γ x^(γτ-1) θ^(-γτ) {1 + (x/θ)^γ}^-(α + τ).
For large x the density acts as x^(γτ-1) / x^(γ(α + τ)) = 1/x^(γα + 1). If we multiply by x^n we get
x^(n - γα - 1); if we then integrate to infinity, we get a finite answer provided n - γα - 1 < -1.
Thus the nth moment exists for αγ > n.
The larger the product αγ, the more moments exist and the lighter the (righthand) tail. The shape
parameter α is that of a Pareto. The shape parameter γ is the power transform by which the Burr is
obtained from the Pareto. Their product, αγ, determines the (righthand) tail behavior of the
Transformed Beta Distribution and its special cases. Provided αγ > 1, the mean excess loss exists
and increases to infinity approximately linearly; for large x, e(x) ≈ x / (αγ - 1).
This tail behavior carries over to special cases such as: the Burr, Generalized Pareto, Pareto,
LogLogistic, ParaLogistic, Inverse Burr, Inverse Pareto, and Inverse ParaLogistic.
All have mean excess losses, when they exist, that increase approximately linearly for large x.191
One can examine the behavior of the left hand tail,192 as x approaches zero, in a similar manner.
For small x the density acts as x^(γτ-1). If we multiply by x^(-n) we get x^(γτ-1-n); if we then integrate to
zero, we get a finite answer provided γτ - 1 - n > -1. Thus the negative nth moment exists for γτ > n.
Thus the behavior of the left hand tail is determined by the product γτ of the two shape parameters of
the Inverse Burr Distribution.
Thus we see that of the three shape parameters of the Transformed Beta, τ (one more than the
power to which x is taken in the Incomplete Beta Function, i.e., the first parameter of the Incomplete
Beta Function) affects the left hand tail, α (the shape parameter of the Pareto) affects the righthand
tail, and γ (the power transform parameter of the Burr and LogLogistic) affects both tails.

190 See Figure 5.2 in Loss Models.
191 The mean excess loss of a Pareto, when it exists, is linear in x. e(x) = (x+θ)/(α-1).
192 Since casualty actuaries are chiefly concerned with the behavior of loss distributions in the righthand tail, as x
approaches infinity, assume that unless specified otherwise, "tail behavior" refers to the behavior in the righthand
tail as x approaches infinity.

An Example of Distributions fit to Hurricane Data:


Hogg & Klugman in Loss Distributions show the results of fitting different distributions to a set of
hurricane data.193 The hurricane data set is truncated from below at $5 million and consists of 35
storms with total losses adjusted to 1981 levels of more than $5 million.194 This serves as a good
example of the importance of the tails of the distribution to practical applications. Here are the
parameters of different distributions fit by Hogg & Klugman via maximum likelihood, as well as their
means, coefficients of variation (when they exist) and their skewnesses (when they exist):
Distribution     Parameters                                          Mean          Coefficient     Skewness
Type                                                                 ($ million)   of Variation
Weibull          θ = 88,588,730    τ = 0.51907                       166           2.12            6.11
LogNormal        μ = 17.953        σ = 1.6028                        226           3.47            52.26
Pareto           α = 1.1569        θ = 73,674,000                    470           N.D.            N.D.
Burr             α = 3.7697        θ = 585,453,983   γ = 0.65994     197           3.75            N.D.
Gen. Pareto      α = 2.8330        θ = 862,660,000   τ = 0.33292     157           2.79            N.D.

(N.D. indicates that the quantity does not exist for the fitted parameters.)

It is interesting to compare the tails of the different distributions by comparing the estimated
probabilities of a storm greater than $1 billion or $5 billion:
Distribution     Probability      Probability      Probability      Estimated Annual Frequency of
Type             of storm         of storm         of storm         Hurricanes Greater than
                 > 5 million      > 1 billion      > 5 billion      $1 billion       $5 billion
Weibull          79.86%           2.9637%          0.0300%          4.0591%          0.0410%
LogNormal        94.26%           4.1959%          0.3142%          4.8686%          0.3646%
Pareto           92.68%           4.5069%          0.7475%          5.3185%          0.8821%
Burr             85.28%           3.5529%          0.2122%          4.5567%          0.2722%
Gen. Pareto      43.56%           0.0142%          0.0002%          0.0358%          0.0004%

The lighter-tailed Weibull produces a much lower estimate of the chance of a huge hurricane than a
heavier-tailed distribution such as the Pareto. The estimate from the LogNormal, which is
heavier-tailed than a Weibull, but lighter-tailed than a Pareto, is somewhere in between.

193 Loss Distributions was formerly on the Part 4B exam syllabus.
194 The data is shown in Table 4.1 of Loss Distributions, and Table 13.8 of Loss Models.
In millions of dollars, the trended hurricane sizes are: 6.766, 7.123, 10.562, 14.474, 15.351, 16.983, 18.383,
19.030, 25.304, 29.112, 30.146, 33.727, 40.596, 41.409, 47.905, 49.397, 52.600, 59.917, 63.123, 77.809,
102.942, 103.217, 123.680, 140.136, 192.013, 198.446, 227.338, 329.511, 361.200, 421.680, 513.586,
545.778, 750.389, 863.881, 1638.000.


There were 35 hurricanes greater than 5 million in constant 1981 dollars observed in 32 years.
Thus one could estimate the frequency of such hurricanes as 35/32 = 1.09 per year. Then using the
curves fit to the data truncated from below one could estimate the frequency of hurricanes greater
than size x as: (1.09)S(x) / S(5 million). For example, for the Pareto Distribution the estimated
annual frequency of hurricanes greater than 5 billion in 1981 dollars is: (35/32)(0.7475%)/(92.68%) =
0.8821%. This is a mean return time of: 1/0.8821% = 113 years. The return times estimated using
the other curves are much longer:
Distribution     Mean Return Time (Years)
Type             of storm > $1 billion     of storm > $5 billion
Weibull          25                        2,438
LogNormal        21                        274
Pareto           19                        113
Burr             22                        367
Gen. Pareto      2,796                     229,068
It is interesting to note that even the most heavy-tailed of these curves would seem with
twenty-twenty hindsight to have underestimated the chance of large hurricanes such as
Hurricane Andrew.195 The small amount of data does not allow a good estimate of the extreme
tail; the observation of just one very large hurricane would have significantly changed the
results. Also due to changing or cyclic weather patterns and the increase in homes near the
coast, this may just not be an appropriate technique to apply to this particular problem. The
preferred technique currently is to simulate possible hurricanes using meteorological data and
estimate the likely damage using exposure data on the location and characteristics of insured
homes combined with engineering and physics data.196

195

The losses in Hogg & Klugman are adjusted to 1981 levels. Hurricane Andrew in 8/92 with nearly $16 billion in
insured losses, probably exceeded $7 billion dollars in loss in 1981 dollars. It is generally believed that Hurricanes
that produce such severe losses have a much shorter average return time than a century.
196
See for example, "A Formal Approach to Catastrophe Risk Assessment in Management", by Karen M. Clark,
PCAS 1986, or "Use of Computer Models to Estimate Loss Costs," by Michael A. Walters and Francois Morin,
PCAS 1997.


Coefficient of Variation versus Skewness, Two Parameter Distributions:


For the following two parameter distributions: Pareto, LogNormal, Gamma and Weibull, the
Coefficient of Variation and Skewness depend on a single shape parameter.
Values are tabulated below:
Shape        Pareto             LogNormal               Gamma              Weibull
Parameter    C.V.     Skew      C.V.      Skew          C.V.     Skew      C.V.     Skew
0.2          N.A.     N.A.      0.202     0.056         2.236    4.472     15.843   190.1
0.4          N.A.     N.A.      0.417     0.355         1.581    3.162     3.141    11.35
0.6          N.A.     N.A.      0.658     1.207         1.291    2.582     1.758    4.593
0.8          N.A.     N.A.      0.947     3.399         1.118    2.236     1.261    2.815
1.0          N.A.     N.A.      1.311     9.282         1.000    2.000     1.000    2.000
1.2          N.A.     N.A.      1.795     26.840        0.913    1.826     0.837    1.521
1.4          N.A.     N.A.      2.470     87.219        0.845    1.690     0.724    1.198
1.6          N.A.     N.A.      3.455     331           0.791    1.581     0.640    0.962
1.8          N.A.     N.A.      4.953     1503          0.745    1.491     0.575    0.779
2.0          N.A.     N.A.      7.321     8208          0.707    1.414     0.523    0.631
2.2          3.317    N.A.      11.201    53948         0.674    1.348     0.480    0.509
2.4          2.449    N.A.      17.786    426061        0.645    1.291     0.444    0.405
2.6          2.082    N.A.      29.354    4036409       0.620    1.240     0.413    0.315
2.8          1.871    N.A.      50.391    4.6e+7        0.598    1.195     0.387    0.237
3.0          1.732    N.A.      90.012    6.2e+8        0.577    1.155     0.363    0.168
3.2          1.633    25.720    167       1.0e+10       0.559    1.118     0.343    0.106
3.4          1.558    14.117    324       2.0e+11       0.542    1.085     0.325    0.051
3.6          1.500    10.222    652       4.6e+12       0.527    1.054     0.309    0.001
3.8          1.453    8.259     1366      1.3e+14       0.513    1.026     0.294    -0.045
4.0          1.414    7.071     2981      4.3e+15       0.500    1.000     0.281    -0.087
5            1.291    4.648     268337    2.7e+24       0.447    0.894     0.229    -0.254
6            1.225    3.810     6.6e+7    1.5e+35       0.408    0.816     0.194    -0.373
7            1.183    3.381     4.4e+10   7.6e+47       0.378    0.756     0.168    -0.463
8            1.155    3.118     7.9e+13   3.5e+62       0.354    0.707     0.148    -0.534
9            1.134    2.940     3.9e+17   1.4e+79       0.333    0.667     0.133    -0.591
10           1.118    2.811     5.2e+21   5.2e+97       0.316    0.632     0.120    -0.638

The shape parameters tabulated are: α for the Pareto, σ for the LogNormal, α for the Gamma, and τ for the Weibull.


As mentioned previously, for the Gamma Distribution the skewness is twice the CV:197
[Graph: Skewness versus CV for the Gamma Distribution: the line skewness = 2 CV, with the α < 1 portion above and the α > 1 portion below the Exponential, marked as the point E at CV = 1, skewness = 2.]
For α > 1 the Gamma Distribution is lighter-tailed than an Exponential, and has CV < 1 and
skewness < 2. Conversely, for α < 1 the Gamma Distribution is heavier-tailed than an Exponential,
and has CV > 1 and skewness > 2. The Exponential Distribution (α = 1), shown above as E, has
CV = 1 and skewness = 2.
For the Pareto Distribution the skewness is more than twice the CV, when they exist.
For the Pareto, the CV > 1 and the skewness > 2:
[Graph: Skewness versus CV for the Pareto (upper curve) and the Gamma (lower line), with the Exponential marked as the point E at CV = 1, skewness = 2.]

197 For the Inverse Gaussian, the Skewness is three times the CV.


As α goes to infinity, the Pareto approaches the Exponential which has CV = 1 and
skewness = 2. As α approaches 3, the skewness approaches infinity.
Here is a similar graph for the LogNormal Distribution versus the Gamma Distribution:
[Graph: Skewness versus CV for the LogNormal (upper curve) and the Gamma (lower line).]

For the LogNormal, as σ approaches zero, the coefficient of variation and skewness each approach
zero. For σ = 1, CV = 1.311 and the skewness = 9.282.
As σ approaches infinity both the skewness and CV approach infinity.
Finally, here is a similar graph for the Weibull Distribution versus the Gamma Distribution:
[Graph: Skewness versus CV for the Weibull and the Gamma, with the Exponential marked as the point E at CV = 1, skewness = 2; the Weibull branch with τ < 1 lies above E, and the branch with τ > 1 lies below E.]


For τ > 1 the Weibull Distribution is lighter-tailed than an Exponential, and has CV < 1 and
skewness < 2. Conversely, for τ < 1 the Weibull Distribution is heavier-tailed than an Exponential,
and has CV > 1 and skewness > 2. The Exponential Distribution (τ = 1), shown above as E, has
CV = 1 and skewness = 2.
The CV is positive by definition. The skewness is positive for curves skewed to the right
and negative for curves skewed to the left. The Pareto, LogNormal and Gamma all have
positive skewness. The Weibull has positive skewness for τ < 3.60235 and negative skewness
for τ > 3.60235.
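These CV and skewness values can be recomputed from the first three moments. The following is a short Python sketch of mine (not from the readings); the moment formula used for the Weibull is the one in Appendix A of Loss Models, and the function names are illustrative.

from math import gamma as G, sqrt

def weibull_cv_skew(tau, theta=1.0):
    m = [theta**n * G(1 + n / tau) for n in (1, 2, 3)]        # first three raw moments
    var = m[1] - m[0]**2
    skew = (m[2] - 3 * m[0] * m[1] + 2 * m[0]**3) / var**1.5
    return sqrt(var) / m[0], skew

def gamma_cv_skew(alpha):
    return 1 / sqrt(alpha), 2 / sqrt(alpha)

print(weibull_cv_skew(1.0))        # (1.0, 2.0): the Exponential
print(weibull_cv_skew(3.60235))    # skewness is essentially zero here
print(gamma_cv_skew(0.2))          # (2.236, 4.472), matching the first row of the table

Note that the CV and skewness do not depend on the scale parameter, which is why a single shape parameter determines each column of the table.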

Existence of Moment Generating Functions:198


The moment generating function for a continuous loss distribution is given by:199

M(t) = ∫₀^∞ f(x) e^(xt) dx = E[e^(xt)].

For example, for the Gamma Distribution:

M(t) = (1 − θt)^(−α), for t < 1/θ.

The moments of the function can be obtained as the derivatives of the moment generating function
at zero. Thus if the Moment Generating Function exists (within an interval around zero) then so do all
the moments. However the converse is not true.
The moment generating function, when it exists, can be written as a power series in t:200

M(t) = Σ from n = 0 to ∞ of E[X^n] t^n / n!.

198 See Mahler's Guide to Aggregate Distributions. See also Definition 12.2.2 in Actuarial Mathematics or Definition 3.9 in Loss Models.
199 With support from zero to infinity. In general the integral goes over the support of the probability distribution.
200 This is just the usual Taylor Series, substituting in the moments for the derivatives at zero of the Moment
Generating Function.


In order for the moment generating function to converge (in an interval around zero), the moments
E[X^n] may not grow too quickly as n gets large. This is yet another way to distinguish lighter and
heavier tailed distributions. Those with Moment Generating Functions are lighter-tailed than those
without Moment Generating Functions.
Thus the Weibull for τ > 1, whose m.g.f. exists, is lighter-tailed than the Weibull with τ < 1, whose
m.g.f. does not. The Transformed Gamma has the same behavior as the Weibull; for τ > 1 the
Moment Generating Function exists and the distribution is lighter-tailed than for τ < 1, for which the
Moment Generating Function does not exist. (For τ = 1, one gets a Gamma, for which the
Moment Generating Function exists.) The LogNormal Distribution has its moments increase
rapidly and thus it does not have a Moment Generating Function. The LogNormal is the heaviest-tailed of those distributions which have all their moments.
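One can see this behavior numerically. The sketch below is mine, not from the readings: it approximates E[e^(tX)] by a midpoint sum, checks the Gamma against its closed-form mgf, and shows that the corresponding LogNormal integral keeps growing as the upper limit of integration increases (i.e., it diverges). Function names and the particular parameter values are illustrative.

from math import exp, log, gamma as G, pi, sqrt

def gamma_pdf(x, alpha, theta):
    return x**(alpha - 1) * exp(-x / theta) / (theta**alpha * G(alpha))

def lognormal_pdf(x, mu, sigma):
    return exp(-0.5 * ((log(x) - mu) / sigma)**2) / (x * sigma * sqrt(2 * pi))

def mgf_numeric(pdf, t, upper, steps=100_000):
    dx = upper / steps
    return sum(exp(t * (i + 0.5) * dx) * pdf((i + 0.5) * dx) * dx for i in range(steps))

alpha, theta, t = 3.0, 2.0, 0.25                            # t < 1/theta = 0.5
print(mgf_numeric(lambda x: gamma_pdf(x, alpha, theta), t, upper=200.0))
print((1 - theta * t) ** -alpha)                            # closed form: 8.0

# The LogNormal "mgf" integral grows without bound as the upper limit increases:
for upper in (50, 200, 500):
    print(upper, mgf_numeric(lambda x: lognormal_pdf(x, 0.0, 1.0), 1.0, upper))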


Problems:
30.1 (2 points) You are given the following information on three (3) size of loss distributions:
Distribution     Coefficient of Variation     Skewness
I                2                            3
II               1.22                         3.81
III              1                            2
Which of these three loss distributions can not be a Gamma Distribution?
A. I
B. II
C. III
D. I, II, III
E. None of A,B,C, or D
30.2 (1 points) Which of the following distributions always have positive skewness?
1. Weibull
2. Normal
3. Gamma
A. None of 1, 2, or 3
B. 1 C. 2 D. 3 E. None of A, B, C, or D
30.3 (2 points) Which of the following statements is true?
1. For the Pareto Distribution, the standard deviation (when it exists) is always greater than the mean.
2. For the Pareto Distribution, the skewness (when it exists) is always greater than twice the
coefficient of variation.
3. For the LogNormal distribution, f(x) goes to zero more quickly as x approaches infinity than for the
Transformed Gamma distribution.
Hint: For the Transformed Gamma distribution, f(x) = τ (x/θ)^(τα) exp[−(x/θ)^τ] / {x Γ(α)}.
A. 1, 2

B. 1, 3

C. 2, 3

D. 1, 2, 3

E. None of A, B, C, or D

30.4 (2 points) Rank the tails of the following three distributions, from lightest to heaviest:
1. Weibull with τ = 0.5 and θ = 10.
2. Weibull with τ = 1 and θ = 100.
3. Weibull with τ = 2 and θ = 1000.
A. 1, 2, 3

B. 2,1, 3

C. 1, 3, 2

D. 3, 2, 1

E. None of A, B, C or D

30.5 (3 points) Rank the tails of the following three distributions, from lightest to heaviest:
1. Gamma with α = 0.7 and θ = 10.
2. Inverse Gaussian with μ = 3 and θ = 4.
3. Inverse Gaussian with μ = 5 and θ = 2.
A. 1, 2, 3

B. 2, 1, 3

C. 1, 3, 2

D. 3, 2, 1

E. None of A, B, C or D


30.6 (1 point) Rank the tails of the following three distributions, from lightest to heaviest:
1. Exponential
2. Lognormal
3. Single Parameter Pareto
A. 1, 2, 3
B. 2, 1, 3
C. 1, 3, 2

D. 3, 2, 1

E. None of A, B, C or D

30.7 (1 point) The Inverse Exponential Distribution has a righthand tail similar to which of the
following distributions?
A. Lognormal
B. Pareto with α = 1
C. Pareto with α = 2
D. Weibull with τ < 1
E. Weibull with τ > 1

30.8 (3 points) You are given the following:


Claim sizes for Risk A follow an Exponential distribution, with mean 400.
Claim sizes for Risk B follow a Gamma distribution, with parameters α = 2 and θ = 200.
r is the ratio of the proportion of Risk B's claims (in number) that exceed d to the
proportion of Risk A's claims (in number) that exceed d.
Determine the limit of r as d goes to infinity.
A. 0
B. 1/2
C. 1
D. 2
E. ∞
30.9 (4B, 11/92, Q.2) (1 point) Which of the following are true?
1. The random variable X has a lognormal distribution with parameters μ and σ,
if Y = e^X has a normal distribution with mean μ and standard deviation σ.
2. The lognormal and Pareto distributions are positively skewed.
3. The lognormal distribution generally has greater probability in the tail than the Pareto distribution.
A. 1 only
B. 2 only
C. 1, 3 only D. 2, 3 only E. 1, 2, 3
30.10 (4B, 11/93, Q.21) (1 point) Which of the following statements are true for statistical
distributions?
1. Linear combinations of independent normal random variables are also normal.
2. The lognormal distribution is often useful as a model for claim size distribution because it is
positively skewed.
3. The Pareto probability density function tapers away to zero much more slowly than
the lognormal probability density function.
A. 1
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3


30.11 (4B, 11/99, Q.19) (2 points) You are given the following:
Claim sizes for Risk A follow a Pareto distribution, with parameters θ = 10,000 and α = 2.
Claim sizes for Risk B follow a Burr distribution, F(x) = 1 − {1/(1 + (x/θ)^γ)}^α,
with parameters θ = 141.42, γ = 2, and α = 2.
r is the ratio of the proportion of Risk A's claims (in number) that exceed d to
the proportion of Risk B's claims (in number) that exceed d.
Determine the limit of r as d goes to infinity.
A. 0
B. 1
C. 2
D. 4
E. ∞
30.12 (CAS3, 11/03, Q.16) (2.5 points)
Which of the following is/are true, based on the existence of moments test?
I. The Loglogistic Distribution has a heavier tail than the Gamma Distribution.
ll. The Paralogistic Distribution has a heavier tail than the Lognormal Distribution.
Ill. The Inverse Exponential has a heavier tail than the Exponential Distribution.
A. I only
B. I and II only
C. I and Ill only
D. II and III only
E. I, ll, and Ill


Solutions to Problems:
30.1. E. Distributions I and II cannot be a Gamma, for which the skewness is twice the coefficient of variation.
30.2. D. The Normal is symmetric, so it has skewness of zero. The Gamma has skewness of
2/√α > 0. The Weibull has either a positive or negative skewness, depending on the value of τ.
30.3. A. 1. True. This is the same as saying the CV > 1 for the Pareto. 2. True.
3. False. For the LogNormal, ln f(x) = −0.5 ({ln(x) − μ}/σ)² − ln(x) − ln(σ) − ln(2π)/2. For very large x this
is approximately: −0.5 ln(x)²/σ². For the Transformed Gamma, ln f(x) =
ln(τ) + (τα − 1) ln(x) − τα ln(θ) − (x/θ)^τ − ln Γ(α). For very large x this is approximately: −(x/θ)^τ.
Thus the log density of the LogNormal goes to minus infinity more slowly than that of the
Transformed Gamma. Therefore the density function of the LogNormal goes to zero less quickly as
x approaches infinity than that of the Transformed Gamma.
The LogNormal has a heavier tail than the Transformed Gamma.


30.4. D. The three survival functions are: S1(x) = exp[−(x/10)^0.5], S2(x) = exp(−x/100),
S3(x) = exp[−(x/1000)²].
S1(x)/S2(x) = exp[x/100 − (x/10)^0.5]. The limit as x approaches infinity of S1(x)/S2(x) is ∞, since x increases
more quickly than √x. Thus the first Weibull is heavier-tailed than the second.
Similarly, S2(x)/S3(x) = exp[(x/1000)² − x/100], and the limit as x approaches infinity of S2(x)/S3(x) is ∞, since x²
increases more quickly than x. Thus the second Weibull is heavier-tailed than the third. Alternately, just calculate the
densities or log densities for an extremely large value of x, for example 1 billion = 10^9.
(The log densities are more convenient to work with; the ordering of the densities and the log
densities are the same.)
For the Weibull, f(x) = τ (x/θ)^τ exp(−(x/θ)^τ) / x. ln f(x) = ln(τ) + τ ln(x/θ) − ln(x) − (x/θ)^τ.
For the first Weibull, ln f(1 billion) = ln(0.5) + (0.5) ln(100 million) − ln(1 billion) − √(100 million) ≅ −10,000.
For the second Weibull, ln f(1 billion) = ln(1) + (1) ln(10 million) − ln(1 billion) − 10 million ≅ −10,000,000.
For the third Weibull, ln f(1 billion) = ln(2) + (2) ln(1 million) − ln(1 billion) − (1 million)² ≅ −1,000,000,000,000.
Thus f(1 billion) is much larger for the first Weibull than the second Weibull, while f(1 billion) is much larger
for the second Weibull than the third Weibull. Thus the third Weibull has the lightest tail, while the first
Weibull has the heaviest tail.
Comment: For the Weibull, the smaller the shape parameter τ, the heavier the tail. The values of the
scale parameter θ have no effect on the heaviness of the tail. However by changing the scale, the
third Weibull with θ = 1000 does take longer before its density falls below the others than if it instead
had θ = 1. The (right) tail behavior refers to the behavior as x approaches infinity, thus how long it
takes the density to get smaller does not affect which has a lighter tail. While the third Weibull might
be lighter-tailed, for some practical applications with a maximum covered loss you may be
uninterested in the large values of x at which its density is smaller than the others.


30.5. B. The three density functions are:
f1(x) = 10^(−0.7) x^(−0.3) exp(−x/10) / Γ(0.7),
f2(x) = √(4/(2π)) x^(−1.5) exp(−4 (x/3 − 1)² / (2x)) = √(2/π) x^(−1.5) exp(−2x/9 + 4/3 − 2/x),
f3(x) = √(2/(2π)) x^(−1.5) exp(−2 (x/5 − 1)² / (2x)) = (1/√π) x^(−1.5) exp(−x/25 + 2/5 − 1/x).
We will take the limit as x approaches infinity of the ratios of these densities, ignoring any annoying
multiplicative constants such as 10^(−0.7)/Γ(0.7) or √(2/π) e^(4/3).
f1(x)/f2(x) ~ x^(−0.3) exp(−x/10) / {x^(−1.5) exp(−2x/9 − 2/x)} = x^1.2 exp(0.122x + 2/x).
The limit as x approaches infinity of f1(x)/f2(x) is ∞.
Thus the Gamma is heavier-tailed than the first Inverse Gaussian with μ = 3 and θ = 4.
f1(x)/f3(x) ~ x^(−0.3) exp(−x/10) / {x^(−1.5) exp(−x/25 − 1/x)} = x^1.2 exp(−0.06x + 1/x).
The limit as x approaches infinity of f1(x)/f3(x) is 0, since exp(−0.06x) goes to zero very quickly.
Thus the Gamma is lighter-tailed than the second Inverse Gaussian with μ = 5 and θ = 2.
Thus the second Inverse Gaussian has the heaviest tail, followed by the Gamma, followed by the
first Inverse Gaussian.
Comment: In general the Inverse Gaussian and the Gamma have somewhat similar tails; they both
have their mean residual lives go to a constant as x approaches infinity. Which is heavier-tailed
depends on the particular parameters of the distributions. Assume we have a Gamma with
shape parameter α and scale parameter β (using beta rather than theta, which is also a
parameter of the Inverse Gaussian), and an Inverse Gaussian with parameters μ and θ.
Then the density of the Gamma is f1(x) ~ x^(α−1) exp(−x/β),
and the density of the Inverse Gaussian is f2(x) ~ x^(−1.5) exp[−θx/(2μ²) − θ/(2x)].
f1(x)/f2(x) ~ x^(α+0.5) exp[x {θ/(2μ²) − 1/β} + θ/(2x)].
If θ/(2μ²) > 1/β, then the limit as x approaches infinity of f1(x)/f2(x) is ∞, and the Gamma is
heavier-tailed than the Inverse Gaussian.
If θ/(2μ²) < 1/β, then the limit as x approaches infinity of f1(x)/f2(x) is 0, and the Gamma is
lighter-tailed than the Inverse Gaussian.
If θ/(2μ²) = 1/β, then f1(x)/f2(x) ~ x^(α+0.5) exp[θ/(2x)], and the limit as x approaches infinity of f1(x)/f2(x) is
∞, and the Gamma is heavier-tailed than the Inverse Gaussian.
30.6. A. The Single Parameter Pareto does not have all of its moments, and thus is heavier tailed
than the other two. The Lognormal has an increasing mean excess loss, while that for the Exponential
is constant. Thus the Lognormal is heavier tailed than the Exponential.
Comment: The Single Parameter Pareto has a tail similar to that of the Pareto.


30.7. B. The Inverse Exponential does not have a mean, and neither does the Pareto for
α = 1. More specifically, the density of the Inverse Exponential is: θ e^(−θ/x)/x², which is approximately
θ/x² for large x, while the density of the Pareto for α = 1 is: θ/(x+θ)², which is also approximately θ/x²
for large x.
Comment: The Inverse Gamma Distribution has a similar tail to the Pareto Distribution for the
same shape parameter α. The Inverse Exponential is the Inverse Gamma for α = 1.
30.8. A. SA(d) = e^(−d/400). fB(x) = x e^(−x/200)/40,000.
SB(d) = ∫ from d to ∞ of x e^(−x/200)/40,000 dx = (1/40,000)(−200x e^(−x/200) − 40,000 e^(−x/200)), evaluated
from x = d to x = ∞, = (1 + d/200) e^(−d/200).
r = SB(d)/SA(d) = (1 + d/200) e^(−d/200)/e^(−d/400) = (1 + d/200)/e^(0.0025d). As d goes to infinity the denominator increases faster
than the numerator; thus as d goes to infinity, r goes to zero.
Comment: Similar to 4B, 11/99, Q.19.
30.9. B. 1. False. Ln(X) has a Normal distribution if X has a LogNormal distribution.
2. True. The skewness of the Pareto does not exist for α ≤ 3.
For α > 3, the Pareto skewness is: 2{(α+1)/(α−3)} √((α − 2)/α) > 0.
LogNormal Skewness = {exp(3σ²) − 3 exp(σ²) + 2} / {exp(σ²) − 1}^1.5. The denominator is
positive, since exp(σ²) > 1 for σ² > 0. The numerator is positive since it can be written as:
y³ − 3y + 2, for y = exp(σ²) > 1. (The derivative of y³ − 3y + 2 is 3y² − 3, which is positive for
y > 1. At y = 1, y³ − 3y + 2 is zero, thus for y > 1 it is positive.)
Since the numerator and denominator are both positive, so is the skewness.
3. False. The Pareto is heavier-tailed than the Lognormal distribution. This can be seen by a
comparison of the mean residual lives. That of the lognormal increases less than linearly, while the
mean residual life of the Pareto increases linearly. Another way to see this is that all of the moments
of the LogNormal distribution exist, while higher moments of the Pareto distribution do not exist.
Comments: The LogNormal and the Pareto distributions are both heavy-tailed, and heavy-tailed
distributions have positive skewness (are skewed to the right). These are statements that practicing
actuaries should know.
30.10. E. 1. True. 2. True. 3. True.
Comment: Statement 3 is another way of saying that the Pareto has a heavier tail than the
LogNormal.


30.11. E. SA(d) = {10,000/(10,000+d)}². SB(d) = {1/(1+(d/141.42)²)}² = {20,000/(20,000+d²)}².
r = SA(d)/SB(d) = {(20,000+d²)/(2(10,000+d))}². As d goes to infinity the numerator increases
faster than the denominator; thus as d goes to infinity, r goes to infinity.
Comment: For γ > 1, the Burr Distribution has a lighter tail than the Pareto Distribution, while for γ < 1,
the Burr Distribution would have a heavier tail than the Pareto Distribution with the same α.
30.12. E. I. The Loglogistic does not have all its moments, while the Gamma does.

The Loglogistic Distribution has a heavier tail than the Gamma Distribution.
II. The Paralogistic does not have all its moments, while the Lognormal does.

The Paralogistic Distribution has a heavier tail than the Lognormal Distribution.
III. The Inverse Exponential does not have all its moments, while the Exponential does.

The Inverse Exponential Distribution has a heavier tail than the Exponential Distribution.


Section 31, Limited Expected Values


As discussed previously, the Limited Expected Value E[X ∧ x] is the average size of loss with all
losses limited to a given maximum size x. Thus the Limited Expected Value, E[X ∧ x], is the mean of
the data censored from above at x.
The Limited Expected Value is closely related to other important quantities: the Loss Elimination
Ratio, the Mean Excess Loss, and the Excess (Pure Premium) Ratio. The Limited Expected Value
can be used to price Increased Limit Factors. The ratio of losses expected for an increased
limit L, compared to a basic limit B, is E[X ∧ L] / E[X ∧ B].
The Limited Expected Value is generally the sum of two pieces. Each loss of size less than or equal
to u contributes its own size, while each loss greater than u contributes just u to the average.
For a discrete distribution:

E[X ∧ u] = Σ over xi ≤ u of xi Prob[X = xi] + u Prob[X > u].

For a continuous distribution:

E[X ∧ u] = ∫₀^u t f(t) dt + u S(u).

Rather than calculating this integral, make use of Appendix A of Loss Models, which has formulas for
the limited expected value for each distribution.201
For example, the formula for the Limited Expected Value of the Pareto is:202

E[X ∧ x] = {θ/(α−1)} {1 − (θ/(θ+x))^(α−1)},   α ≠ 1.

Exercise: For a Pareto with α = 4 and θ = 1000, compute E[X], E[X ∧ 500] and E[X ∧ 5000].
[Solution: E[X] = θ/(α−1) = 1000/(4−1) = 333.33. E[X ∧ x] = (1000/3) {1 − (1000/(1000 + x))^3}.
E[X ∧ 500] = 234.57. E[X ∧ 5000] = 331.79.]


201 In some cases the formula for the Limited Expected Value (Limited Expected First Moment) is not given. In those
cases, one takes k = 1 in the formula for the Limited Expected Moments.
202 For α = 1, E[X ∧ x] = −θ ln(θ/(θ+x)).
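A quick numerical check of the Pareto exercise above can be done in a couple of lines of Python; this is a sketch of mine, not from the readings, and the function name is illustrative.

def pareto_lev(x, alpha, theta):
    # E[X ^ x] for a Pareto with alpha != 1, per Appendix A of Loss Models
    return theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

alpha, theta = 4, 1000
print(pareto_lev(500, alpha, theta))     # about 234.57
print(pareto_lev(5000, alpha, theta))    # about 331.79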


Here are the formulas for the limited expected value for some distributions:

Distribution                Limited Expected Value, E[X ∧ x]

Exponential                 θ (1 − e^(−x/θ))

Pareto                      {θ/(α−1)} {1 − (θ/(θ+x))^(α−1)},   α ≠ 1

LogNormal                   exp(μ + σ²/2) Φ[(ln(x) − μ − σ²)/σ] + x {1 − Φ[(ln(x) − μ)/σ]}

Gamma                       θ α Γ[α+1; x/θ] + x {1 − Γ[α; x/θ]}

Weibull                     θ Γ(1 + 1/τ) Γ[1 + 1/τ; (x/θ)^τ] + x exp(−(x/θ)^τ)

Single Parameter Pareto     αθ/(α−1) − θ^α / {(α−1) x^(α−1)},   x ≥ θ

Exercise: For a LogNormal Distribution with μ = 9.28 and σ = 0.916, determine E[X ∧ 25,000].
[Solution: E[X ∧ x] = exp(μ + σ²/2) Φ[(ln x − μ − σ²)/σ] + x {1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 25,000] = exp(9.6995) Φ[0.01] + 25,000 {1 − Φ[0.92]}
= (16,310)(0.5040) + (25,000)(1 − 0.8212) = 12,705.]
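The LogNormal limited expected value can also be evaluated directly, using the standard normal cdf from the Python standard library. This short sketch is mine, not from the readings; note that computing Φ exactly rather than from a rounded normal table gives about 12,650 rather than the 12,705 shown above.

from math import exp, log
from statistics import NormalDist

PHI = NormalDist().cdf

def lognormal_lev(x, mu, sigma):
    return (exp(mu + sigma**2 / 2) * PHI((log(x) - mu - sigma**2) / sigma)
            + x * (1 - PHI((log(x) - mu) / sigma)))

print(lognormal_lev(25_000, 9.28, 0.916))   # about 12,650 (exact Phi, no table rounding)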
Relationship to the LER, Excess Ratio, and Mean Excess Loss:
The following relationships hold between the mean, the Limited Expected Value E[X ∧ x], the
Excess Ratio R(x), the Mean Excess Loss e(x), and the Loss Elimination Ratio LER(x):
mean = E[X ∧ ∞].
e(x) = {mean − E[X ∧ x]} / S(x).
R(x) = 1 − {E[X ∧ x] / mean} = 1 − LER(x).
R(x) = e(x) S(x) / mean.
LER(x) = E[X ∧ x] / mean.


Layer Average Severity:


The Limited Expected Value can be useful when dealing with layers of loss. For example, suppose
we are estimating the expected value (per loss) of the layer of loss greater than $1 million but less
than $5 million.203 Note here we are taking the average over all losses, including those that are too
small to contribute to the layer. This Layer Average Severity is equal to the Expected Value
Limited to $5 million minus the Expected Value Limited to $1 million.
Layer Average Severity = E[X ∧ top of Layer] − E[X ∧ bottom of Layer].
The Layer Average Severity is the insurer's average payment per loss to an insured, when there is
a deductible of size equal to the bottom of the layer and a maximum covered loss equal to the top of
the layer. Loss Models refers to this as the expected payment per loss.
expected payment per loss = average amount paid per loss =
E[X ∧ Maximum Covered Loss] − E[X ∧ Deductible Amount].
Exercise: Losses follow a Pareto with α = 4 and θ = 1000. There is a deductible of 500 and a
maximum covered loss of 5000. What is the insurer's average payment per loss?
[Solution: From previous solutions: E[X ∧ 500] = 234.57. E[X ∧ 5000] = 331.79.
The Layer Average Severity = E[X ∧ 5000] − E[X ∧ 500] = 331.79 − 234.57 = 97.22.]
Each small loss, x ≤ d, contributes nothing to a layer from d to u.
Each medium size loss, d < x ≤ u, contributes x − d to a layer from d to u.
Each large loss, u < x, contributes u − d to a layer from d to u.

Therefore, Layer Average Severity = ∫ from d to u of (t − d) f(t) dt + S(u) (u − d).

Average Non-zero Payment:


Besides the average amount paid per loss to the insured, one can also calculate the average
amount paid per non-zero payment by the insurer. Loss Models refers to this as the expected
payment per payment.204
With a deductible, there are many instances where the insured suffers a small loss, but the insurer
makes no payment. Therefore, if the denominator only includes those situations where the insurer
makes a non-zero payment, the average will be bigger. The average payment per payment is
greater than or equal to the average payment per loss.
203 This might be useful for pricing a reinsurance contract.
204 See pages 180-183 of Loss Models.


Exercise: Losses follow a Pareto with α = 4 and θ = 1000.
There is a deductible of 500, and a maximum covered loss of 5000.
What is the average payment per non-zero payment by the insurer?
[Solution: From the previous solution the average payment per loss to the insured is 97.22.
However, the insurer only makes a payment S(500) = 19.75% of the time the insured has a loss.
Thus the average per non-zero payment by the insurer is: 97.22 / 0.19753 = 492.2.]
expected payment per payment =
{E[X ∧ Maximum Covered Loss] − E[X ∧ Deductible]} / S(Deductible).
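Here is a short Python sketch of mine (not from the readings) that reproduces the two averages just computed for the Pareto with a 500 deductible and a 5000 maximum covered loss; function names are illustrative.

def pareto_lev(x, alpha, theta):
    return theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

def pareto_survival(x, alpha, theta):
    return (theta / (theta + x)) ** alpha

alpha, theta, d, u = 4, 1000, 500, 5000
per_loss = pareto_lev(u, alpha, theta) - pareto_lev(d, alpha, theta)
per_payment = per_loss / pareto_survival(d, alpha, theta)
print(per_loss, per_payment)   # roughly 97.2 and 492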

Coinsurance:
Sometimes, the insurer will only pay a percentage of the amount it would otherwise pay.205
As discussed previously, this is referred to as a coinsurance clause. For example with a 90%
coinsurance factor, after the application of any maximum covered loss and/or deductible, the insurer
would only pay 90% of what it would pay in the absence of the coinsurance clause.
Exercise: Losses follow a Pareto with α = 4 and θ = 1000. There is a deductible of 500, a maximum
covered loss of 5000, and a coinsurance factor of 80%. What is the insurer's average payment per
loss to the insured?
[Solution: From a previous solution, in the absence of the coinsurance factor the average payment is
97.22. With the coinsurance clause each payment is multiplied by 0.8, so the average is:
(0.8)(97.22) = 77.78.]
In general each payment is multiplied by the coinsurance factor, thus so is the average. This is just a
special case of multiplying a variable by a constant. The nth moment is multiplied by the constant to
the nth power. The variance is therefore multiplied by the square of the coinsurance factor.
Exercise: Losses follow a Pareto with α = 4 and θ = 1000.
There is a deductible of 500, a maximum covered loss of 5000, and a coinsurance factor of 80%.
What is the average payment per non-zero payment by the insurer?
[Solution: From a previous solution the average payment per loss to the insured is 77.78.
However, the insurer only makes a payment S(500) = 19.75% of the time the insured has a loss.
Thus the average per non-zero payment by the insurer is: 77.78 / 0.19753 = 393.7.]

205

For example, coinsurance clauses are sometimes used in Health Insurance, Homeowners Insurance, or
Reinsurance.


Formulas for Average Payments:

These are the general formulas:206
Given Deductible Amount d, Maximum Covered Loss u, and coinsurance factor c, the
average payment per (non-zero) payment by the insurer is:
c {E[X ∧ u] − E[X ∧ d]} / S(d).
(When there is no maximum covered loss, so that u = ∞, this per-payment average is c e(d).)
Given Deductible Amount d, Maximum Covered Loss u, and coinsurance factor c, the
insurer's average payment per loss to the insured is:
c {E[X ∧ u] − E[X ∧ d]}.
The insurer's average payment per loss to the insured is the Layer of Loss between the Deductible
Amount d and the Maximum Covered Loss u, E[X ∧ u] − E[X ∧ d], all multiplied by the coinsurance
factor. The average per non-zero payment by the insurer is the insurer's average payment per loss
to the insured divided by the ratio of the number of non-zero payments by the insurer to the
number of losses by the insured, S(d).
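These two general formulas translate directly into code. The sketch below is mine, not from the readings; it works for any severity distribution once its limited expected value and survival functions are supplied, and the Pareto used in the example is the one from the earlier exercises.

def avg_payment_per_loss(lev, c, d, u):
    return c * (lev(u) - lev(d))

def avg_payment_per_payment(lev, survival, c, d, u):
    return avg_payment_per_loss(lev, c, d, u) / survival(d)

# Example: Pareto with alpha = 4, theta = 1000; c = 80%, d = 500, u = 5000.
alpha, theta = 4, 1000
lev = lambda x: theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))
surv = lambda x: (theta / (theta + x)) ** alpha
print(avg_payment_per_loss(lev, 0.8, 500, 5000))              # about 77.8
print(avg_payment_per_payment(lev, surv, 0.8, 500, 5000))     # about 394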
Limited Expected Value as an Integral of the Survival Function:
The Limited Expected Value can be written as an Integral of the Survival Function, S(x) = 1 - F(x).
E[X ∧ x] = ∫₀^x t f(t) dt + x S(x).

Using integration by parts and the fact that the integral of f(x) is −S(x):207

E[X ∧ x] = {−x S(x) + ∫₀^x S(t) dt} + x S(x) = ∫₀^x S(t) dt.

Thus the Limited Expected Value can be written as an integral of the Survival Function
from 0 to the limit, for a distribution with support starting at zero.208

206 See Theorem 8.7 in Loss Models. More general formulas that include the effects of inflation will be discussed in a
subsequent section.
207 Note that the derivative of S(x) is dS(x)/dx = d(1 − F(x))/dx = −f(x). Thus an indefinite integral of f(x) is
−S(x) = F(x) − 1. (There is always an arbitrary constant in an indefinite integral.)
208 Thus this formula does not apply to the Single Parameter Pareto Distribution. For the Single Parameter Pareto
Distribution with support starting at θ, E[X ∧ x] = θ + the integral from θ to x of S(t). More generally, E[X ∧ x] is the
sum of the integral from −∞ to 0 of −F(t) and the integral from 0 to x of S(t). See Equation 3.9 in Loss Models.

E[X ∧ x] = ∫₀^x S(t) dt.

Since the mean is E[X ∧ ∞], it follows that the mean can be written as an integral of the Survival
Function from 0 to infinity,209 for a distribution with support starting at zero.210

E[X] = ∫₀^∞ S(t) dt.

The losses in the Layer from a to b are given as a difference of Limited Expected Values:

E[X ∧ b] − E[X ∧ a] = ∫ from a to b of S(t) dt.

Thus the Losses in a Layer can be written as an integral of the Survival Function from
the bottom of the Layer to the top of the Layer.211
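A quick numerical sanity check of E[X ∧ x] = ∫₀^x S(t) dt, using an Exponential with θ = 10,000, is shown below. This sketch is mine, not from the readings.

from math import exp

theta, x = 10_000.0, 25_000.0
closed_form = theta * (1 - exp(-x / theta))          # E[X ^ x] for the Exponential

steps = 100_000
dt = x / steps
integral_of_survival = sum(exp(-((i + 0.5) * dt) / theta) for i in range(steps)) * dt

print(closed_form, integral_of_survival)             # both about 9,179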
Expected Amount by Which Aggregate Claims are Less than a Given Value:
The amount by which x is less than y is defined as (y − X)+ : 0 if x > y, and y − x if x ≤ y.
For example, (10 − 2)+ = 8, while (10 − 15)+ = 0.

x      10 − X     (10 − X)+     X ∧ 10     (10 − X)+ + (X ∧ 10)
2      8          8             2          10
7      3          3             7          10
15     −5         0             10         10

So we see that (10 − x)+ + (x ∧ 10) = 10, regardless of x.

209 See formula 3.5.2 in Actuarial Mathematics.
210 Do not apply this formula to a Single Parameter Pareto Distribution. For a continuous distribution with support on
(a, b), the mean is: a + the integral from a to b of S(x). For the Single Parameter Pareto Distribution with support
(θ, ∞), E[X] = θ + the integral from θ to ∞ of S(x).
211 These are the key ideas behind Lee Diagrams, discussed subsequently.


In general, (y − X)+ + (X ∧ y) = y. E[(y − X)+] + E[X ∧ y] = y. E[(y − X)+] = y − E[X ∧ y].

More generally, the expected amount by which losses are less than y is:
E[(y − X)+] = ∫₀^y (y − x) f(x) dx = y ∫₀^y f(x) dx − ∫₀^y x f(x) dx = y F(y) − {E[X ∧ y] − y S(y)} = y − E[X ∧ y].

Therefore, the expected amount by which losses are less than y is:
E[(y − X)+] = y − E[X ∧ y].
This can also be seen via a Lee Diagram, as discussed in a subsequent section.
The expected amount by which aggregate losses are less than a given amount is sometimes
called the savings.212
For example, assume policyholder dividends are 1/3 of the amount by which that policyholder's
aggregate annual claims are less than 1000. Let L be aggregate annual claims. Then:
Policyholder Dividend = (1000 − L)/3 if L < 1000, and 0 if L ≥ 1000.
Then the expected policyholder dividend is one third times the average amount by which aggregate
claims are less than 1000.
Therefore, the expected dividend is: (1000 − E[L ∧ 1000])/3.
Exercise: The aggregate annual claims for a policyholder follow the following discrete distribution:
Prob[X = 200] = 30%, Prob[X = 500] = 40%, Prob[X = 2000] = 20%, Prob[X = 5000] = 10%.
Policyholder dividends are 1/4 of the amount by which that policyholder's aggregate annual claims
are less than 1000 (no dividend is paid if annual claims exceed 1000).
What is the expected policyholder dividend?
[Solution: E[X ∧ 1000] = (0.3)(200) + (0.4)(500) + (0.3)(1000) = 560.
Therefore, the expected amount by which aggregate annual claims are less than 1000 is:
1000 − E[X ∧ 1000] = 1000 − 560 = 440. Expected policyholder dividend is: 440/4 = 110.
Alternately, if aggregate claims are 200, then the dividend is: (1000 − 200)/4 = 200.
If aggregate claims are 500, then the dividend is: (1000 − 500)/4 = 125.
If aggregate claims are 2000 or 5000, then no dividend is paid.
Expected dividend (including those cases where no dividend is paid) is:
(0.3)(200) + (0.4)(125) = 110.]
212

Insurance Savings as used in Retrospective Rating is discussed for example in Gillam and Snader, Fundamentals
of Individual Risk Rating.


Use the formula, E[(y − X)+] = y − E[X ∧ y], if the distribution is continuous rather than discrete.
Exercise: Assume aggregate annual claims for a policyholder are LogNormally distributed, with
μ = 4 and σ = 2.5.213 Policyholder dividends are 1/3 of the amount by which that policyholder's
aggregate annual claims are less than 1000. No dividend is paid if annual claims exceed 1000.
What are the expected policyholder dividends?
[Solution: For the LogNormal distribution,
E[X ∧ x] = exp(μ + σ²/2) Φ[(ln x − μ − σ²)/σ] + x {1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 1000] = exp(7.125) Φ(−1.337) + (1000){1 − Φ(1.163)} =
(1242.6)(0.0906) + (1000)(1 − 0.8776) = 235.
Therefore, the expected dividend is (1000 − E[X ∧ 1000])/3 = 255.]
Sometimes, the dividend or bonus is stated in terms of the loss ratio, which is losses divided by
premiums. In this case, the same technique can be used to determine the average dividend or
bonus.
Exercise: An insurance agent will receive a bonus if his loss ratio is less than 75%.
The agent will receive a percentage of earned premium equal to 1/5 of the difference between 75%
and his loss ratio. The agent receives no bonus if his loss ratio is greater than 75%. His earned
premium is 10 million. His incurred losses are distributed according to a Pareto distribution with
α = 2.5 and θ = 12 million. Calculate the expected value of his bonus.
[Solution: A loss ratio of 75% corresponds to (0.75)(10 million) = $7.5 million in losses.
If his losses are L, his loss ratio is L/10 million. If L < 7.5 million, his bonus is:
(1/5)(0.75 − L/10 million)(10 million) = (1/5)(7.5 million − L).
Therefore, his bonus is 1/5 the amount by which his losses are less than $7.5 million.
For the Pareto distribution, E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ+x))^(α−1)}.
Therefore, E[X ∧ 7.5 million] = (12 million/1.5){1 − (12/19.5)^1.5} = 4.138 million.
Therefore, the expected bonus is: (1/5){7.5 million − 4.138 million} = 672 thousand.]
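The bonus calculation is easy to verify numerically; the short sketch below is mine, not from the readings, and simply reuses the Pareto limited expected value formula.

alpha, theta = 2.5, 12_000_000
cap = 7_500_000                      # losses corresponding to a 75% loss ratio on 10 million of premium

lev_cap = theta / (alpha - 1) * (1 - (theta / (theta + cap)) ** (alpha - 1))
expected_bonus = (cap - lev_cap) / 5
print(lev_cap, expected_bonus)       # about 4.14 million and 672 thousand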
E[(X-d)+] versus E[(d - X)+]:
E[(X − d)+] = E[X] − E[X ∧ d], is the expected losses excess of d.
E[(d − X)+] = d − E[X ∧ d], is the expected amount by which losses are less than d.
Therefore, E[(X − d)+] − E[(d − X)+] = E[X] − d = E[X − d].
213

Note that we are applying the mathematical concept of a limited expected value to the distribution of aggregate
losses in the same manner as was done to a distribution of sizes of loss. Aggregate distributions are discussed
further in Mahlers Guide to Aggregate Distributions.


In fact, (X − d)+ − (d − X)+ = (X − d if X ≥ d) − (d − X if X < d) = X − d.

Exercise: For a Poisson distribution, determine E[(N − 1)+].
[Solution: E[N ∧ 1] = 0 f(0) + 1 Prob[N ≥ 1] = Prob[N ≥ 1].
E[(N − 1)+] = E[N] − E[N ∧ 1] = λ − Prob[N ≥ 1] = λ + e^(−λ) − 1.
Alternately, E[(N − 1)+] = E[(1 − N)+] + E[N] − 1 = Prob[N = 0] + λ − 1 = e^(−λ) + λ − 1.]
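The same identity, E[(N − j)+] = E[N] − E[N ∧ j], can be checked by direct summation of a Poisson probability function. The sketch below is mine, not from the readings; the function names are illustrative.

from math import exp, factorial

def poisson_pmf(n, lam):
    return exp(-lam) * lam ** n / factorial(n)

def poisson_limited_mean(j, lam, tail=50):
    # E[N ^ j]; summing to n = 49 is more than enough for small lambda
    return sum(min(n, j) * poisson_pmf(n, lam) for n in range(tail))

lam = 2.5
print(lam - poisson_limited_mean(1, lam))     # E[(N - 1)+] by the identity
print(lam + exp(-lam) - 1)                    # the closed form derived above; the two agree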
Exercise: In baseball a team bats in an inning until it makes 3 outs. In the fifth inning of today's game,
each batter for the Bad News Bears has a 45% chance of walking and a 55% chance of striking out,
independent of any other batter. What is the expected number of runs the Bad News Bears will
score in the fifth inning?
(If there are three men on base, a walk forces in a run. Assume no wild pitches, passed balls, etc.
Assume nobody steals any bases, is picked off base, etc.)
[Solution: Treat a walk as a failure for the defense, and striking out as a success for the defense.
An inning ends when there are three successes. The number of walks (failures) is Negative Binomial
with r = 3 and β = (chance of failure)/(chance of success) = 0.45/0.55 = 9/11.
f(0) = 1/(1 + β)^r = (11/20)³ = 0.1664. f(1) = rβ/(1 + β)^(r+1) = (3)(9/11)(11/20)⁴ = 0.2246.
f(2) = {(r)(r+1)/2} β²/(1 + β)^(r+2) = (3)(4/2)(9/11)²(11/20)⁵ = 0.2021.
E[N ∧ 3] = 0 f(0) + 1 f(1) + 2 f(2) + 3 {1 − f(0) − f(1) − f(2)} = 0.2246 + (2)(0.2021) + (3)(0.4069) =
1.8495. If there are 3 or fewer walks in the inning, they score no runs. With a total of 4 walks they
score 1 run, with a total of 5 walks they score 2 runs, etc. Number of runs scored = (N − 3)+.
Expected number of runs scored = E[(N − 3)+] = E[N] − E[N ∧ 3] = (3)(9/11) − 1.8495 = 0.605.]
Average Size of Losses in an Interval:
As discussed previously, the Limited Expected Value is generally the sum of two pieces.
Each loss of size less than x contributes its own size, while each loss greater than or equal to x
contributes just x to the average:

E[X ∧ x] = ∫₀^x y f(y) dy + x S(x).

This formula can be rewritten to put the integral in terms of the limited expected value E[X ∧ x] and
the survival function S(x), both of which are given in Appendix A of Loss Models:

∫₀^x y f(y) dy = E[X ∧ x] − x S(x).


This integral represents the dollars of loss on losses of size 0 to x.


Dividing by the probability of such claims, F(x), would give the average size of such losses.
Dividing instead by the mean would give the percentage of losses represented by those losses.
The dollars of loss represented by the losses in an interval from a to b is just the difference of two
integrals of the type we have been discussing:

∫ from a to b of y f(y) dy = ∫₀^b y f(y) dy − ∫₀^a y f(y) dy = {E[X ∧ b] − b S(b)} − {E[X ∧ a] − a S(a)}.

Dividing by F(b) − F(a) would give the average size of loss for losses in this interval.

Average Size of Losses in the Interval [a, b] = ({E[X ∧ b] − b S(b)} − {E[X ∧ a] − a S(a)}) / {F(b) − F(a)}.

Exercise: For a LogNormal Distribution, with parameters μ = 8 and σ = 3, what is the average size of
those losses with sizes between $1 million and $5 million?
[Solution: For the LogNormal: F(5 million) = Φ[{ln(5 million) − μ}/σ] = Φ[{ln(5 million) − 8}/3] =
Φ[(15.425 − 8)/3] = Φ[2.475] = 0.9933.
F(1 million) = Φ[{ln(1 million) − 8}/3] = Φ[1.939] = 0.9737.
E[X ∧ 5 million] = exp(μ + σ²/2) Φ[(ln(5 mil) − μ − σ²)/σ] + (5 mil){1 − Φ[(ln(5 mil) − μ)/σ]} =
(268,337) Φ[−0.525] + (5,000,000)(1 − Φ[2.475]) =
(268,337)(0.2998) + (5,000,000)(0.0067) = 113,679.
E[X ∧ 1 million] = exp(μ + σ²/2) Φ[(ln(1 mil) − μ − σ²)/σ] + (1 mil){1 − Φ[(ln(1 mil) − μ)/σ]} =
(268,337) Φ[−1.061] + (1,000,000)(1 − Φ[1.939]) =
(268,337)(0.1444) + (1,000,000)(0.0263) = 65,048.
Thus, the average size of loss for those losses of size between $1 million and $5 million is:
({E[X ∧ 5m] − (5m) S(5m)} − {E[X ∧ 1m] − (1m) S(1m)}) / {F(5m) − F(1m)} =
({113,679 − (5,000,000)(0.0067)} − {65,048 − (1,000,000)(0.0263)}) / (0.9933 − 0.9737) =
41,700/0.0196 = $2.13 million.
Comment: Note that the average size of loss is not at the midpoint of the interval, which is
$3 million. In the case of the LogNormal, E[X ∧ x] − x S(x) = exp(μ + σ²/2) Φ[(ln(x) − μ − σ²)/σ].
Thus, the mean loss size for the interval a to b is:
exp(μ + σ²/2) {Φ[(ln b − μ − σ²)/σ] − Φ[(ln a − μ − σ²)/σ]} / {Φ[(ln b − μ)/σ] − Φ[(ln a − μ)/σ]},
which would have saved some computation in this case.]
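The simplified LogNormal form in the comment above can be checked quickly in Python. This sketch is mine, not from the readings; it uses the standard normal cdf from the standard library, with the same μ = 8 and σ = 3.

from math import exp, log
from statistics import NormalDist

PHI = NormalDist().cdf
mu, sigma = 8.0, 3.0

def lev_minus_x_survival(x):
    # E[X ^ x] - x S(x) = exp(mu + sigma^2/2) * Phi[(ln x - mu - sigma^2)/sigma]
    return exp(mu + sigma**2 / 2) * PHI((log(x) - mu - sigma**2) / sigma)

def cdf(x):
    return PHI((log(x) - mu) / sigma)

a, b = 1e6, 5e6
avg = (lev_minus_x_survival(b) - lev_minus_x_survival(a)) / (cdf(b) - cdf(a))
print(avg)   # about 2.1 million, not the interval midpoint of 3 million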


Dividing instead by the mean would give the percentage of dollars of total losses represented by
those claims.
Proportion of Total Losses from Losses in the Interval [a, b] =
({E[X ∧ b] − b S(b)} − {E[X ∧ a] − a S(a)}) / E[X].
Exercise: For a LogNormal Distribution, with parameters μ = 8 and σ = 3, what percentage of the total
losses are from those losses with sizes between $1 million and $5 million?
[Solution: E[X] = exp(μ + σ²/2) = e^12.5 = 268,337.
From a previous solution, {E[X ∧ 5m] − (5m) S(5m)} − {E[X ∧ 1m] − (1m) S(1m)} = 41,700.
41,700/268,337 = 15.5%.
Comment: In the case of the LogNormal, the percentage of losses from losses of size a to b =
exp(μ + σ²/2){Φ[(ln b − μ − σ²)/σ] − Φ[(ln a − μ − σ²)/σ]} / exp(μ + σ²/2) =
Φ[(ln b − μ − σ²)/σ] − Φ[(ln a − μ − σ²)/σ].]
Questions about the losses in an interval have to be distinguished from those about layers of loss.
For example, the losses in the layer from $100,000 and $1 million are part of the dollars from losses
of size greater than $100,000. Each loss of size between $100,000 and $1 million contributes its
size minus $100,000 to this layer, while those of size greater than $1 million contribute the width of
the layer, $900,000, to this layer.214

214

See the earlier section on Layers of Loss.


Payments Subject to a Minimum:215


Assume a disabled worker is paid his weekly wage, subject to a minimum payment of 300.216
Let X be a worker's weekly wage. Then, while he is unable to work, he is paid Max[X, 300].
Min[X, 300] + Max[X, 300] = X + 300.
Therefore, E[Max[X, 300]] = 300 + E[X] − E[Min[X, 300]] = 300 + E[X] − E[X ∧ 300].

Let Y = amount the worker is paid = Max[X, 300].
Then Y − 300 = 0 if X ≤ 300, and Y − 300 = X − 300 if X > 300.
Therefore, E[Y − 300] = E[(X − 300)+] = E[X] − E[X ∧ 300].
E[Y] = 300 + E[X] − E[X ∧ 300], matching the previous result.
Exercise: Weekly wages are distributed as follows:
200 @ 20%, 300 @ 30%, 400 @ 30%, 500 @ 10%, 1000 @ 10%.
Determine the average weekly payment to a worker who is disabled.
[Solution: E[X] = (20%)(200) + (30%)(300) + (30%)(400) + (10%)(500) + (10%)(1000) = 400.
E[X ∧ 300] = (20%)(200) + (30%)(300) + (30%)(300) + (10%)(300) + (10%)(300) = 280.
300 + E[X] − E[X ∧ 300] = 300 + 400 − 280 = 420.
Alternately, one can list all of the possibilities:
Wage      Payment     Probability
200       300         20%
300       300         30%
400       400         30%
500       500         10%
1000      1000        10%
(20%)(300) + (30%)(300) + (30%)(400) + (10%)(500) + (10%)(1000) = 420.]
Another way to look at this is that the average payment is:
mean wage + (the average amount by which the wage is less than 300) =
E[X] + (300 − E[X ∧ 300]) = 300 + E[X] − E[X ∧ 300], matching the previous result.

In general, E[Max[X, a]] = a + E[X] − E[X ∧ a].

215 See for example, SOA M, 11/06, Q. 20.
216 This is a very simplified version of benefits under Workers Compensation.
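A minimal check of E[Max[X, a]] = a + E[X] − E[X ∧ a], using the discrete wage distribution of the exercise above, is shown below. This sketch is mine, not from the readings.

wages = {200: 0.20, 300: 0.30, 400: 0.30, 500: 0.10, 1000: 0.10}
a = 300

mean = sum(w * p for w, p in wages.items())
lev_a = sum(min(w, a) * p for w, p in wages.items())
direct = sum(max(w, a) * p for w, p in wages.items())

print(a + mean - lev_a, direct)    # both 420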


Payments Subject to both a Minimum and a Maximum:217


Assume a disabled worker is paid his weekly wage, subject to a minimum payment of 300, and a
maximum payment of 700.218 Let X be a worker's weekly wage. Then, while he is unable to work, he
is paid Min[Max[X, 300], 700].
Let Y = amount the worker is paid = Min[Max[X, 300], 700].
Then Y − 300 = 0 if X ≤ 300, Y − 300 = X − 300 if 300 < X < 700, and Y − 300 = 400 if X ≥ 700.
Therefore, E[Y − 300] = the layer from 300 to 700 = E[X ∧ 700] − E[X ∧ 300].
E[Y] = 300 + E[X ∧ 700] − E[X ∧ 300].

Exercise: Weekly wages are distributed as follows:
200 @ 20%, 300 @ 30%, 400 @ 30%, 500 @ 10%, 1000 @ 10%.
Determine the average weekly payment to a worker who is disabled.
[Solution: E[X ∧ 300] = (20%)(200) + (80%)(300) = 280.
E[X ∧ 700] = (20%)(200) + (30%)(300) + (30%)(400) + (10%)(500) + (10%)(700) = 370.
300 + E[X ∧ 700] − E[X ∧ 300] = 300 + 370 − 280 = 390.
Alternately, one can list all of the possibilities:
Wage      Payment     Probability
200       300         20%
300       300         30%
400       400         30%
500       500         10%
1000      700         10%
(20%)(300) + (30%)(300) + (30%)(400) + (10%)(500) + (10%)(700) = 390.]
Another way to arrive at the same result is that the average payment is:
mean wage + (average amount by which the wage is less than 300) − (layer above 700) =
E[X] + (300 − E[X ∧ 300]) − (E[X] − E[X ∧ 700]) = 300 + E[X ∧ 700] − E[X ∧ 300], matching the
previous result.

In general, E[Min[Max[X, a], b]] = a + E[X ∧ b] − E[X ∧ a].

We note that if b = ∞, in other words the payments are not subject to a maximum, this reduces to
the result previously discussed for that case, E[Max[X, a]] = a + E[X] − E[X ∧ a].
If instead a = 0, in other words the payment is not subject to a minimum, this reduces to
E[Min[X, b]] = E[X ∧ b], which is the definition of the limited expected value.
217

This mathematics is a simplified version of the premium calculation under a Retrospectively Rated Policy.
See for example, Individual Risk Rating by Margaret Tiller Sherwood, in Foundations of Casualty Actuarial Science.
218
This is a simplified version of benefits under Workers Compensation.
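A similar one-line check of E[Min[Max[X, a], b]] = a + E[X ∧ b] − E[X ∧ a], again using the same discrete wage distribution, is below. This sketch is mine, not from the readings.

wages = {200: 0.20, 300: 0.30, 400: 0.30, 500: 0.10, 1000: 0.10}
a, b = 300, 700

lev = lambda u: sum(min(w, u) * p for w, p in wages.items())
direct = sum(min(max(w, a), b) * p for w, p in wages.items())

print(a + lev(b) - lev(a), direct)   # both 390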


Problems:
31.1 (1 point) You are given the following:

The size of loss distribution is given by


f(x) = 2e-2x, x > 0

Under a basic limits policy, individual losses are capped at 1.

The expected annual claim frequency is 13.


What are the expected annual total loss payments on a basic limits policy?
A. less than 5.0
B. at least 5.0 but less than 5.5
C. at least 5.5 but less than 6.0
D. at least 6.0 but less than 6.5
E. at least 6.5
Use the following information for the next 7 questions.
Assume the unlimited losses follow a LogNormal Distribution with parameters μ = 7 and σ = 3.
Assume an average of 200 losses per year.
31.2 (1 point) What is the total cost expected per year?
A. less than $19 million
B. at least $19 million but less than $20 million
C. at least $20 million but less than $21 million
D. at least $21 million but less than $22 million
E. at least $22 million
31.3 (2 points) If the insurer pays no more than $1 million per loss, what is the insurers total cost
expected per year?
A. less than $7 million
B. at least $7 million but less than $8 million
C. at least $8 million but less than $9 million
D. at least $9 million but less than $10 million
E. at least $10 million
31.4 (2 points) If the insurer pays no more than $5 million per loss, what is the insurers total cost
expected per year?
A. less than $7 million
B. at least $7 million but less than $8 million
C. at least $8 million but less than $9 million
D. at least $9 million but less than $10 million
E. at least $10 million


31.5 (1 point) What are the dollars in the layer from $1 million to $5 million expected per year?
A. less than $3.5 million
B. at least $3.5 million but less than $3.7 million
C. at least $3.7 million but less than $3.9 million
D. at least $3.9 million but less than $4.1 million
E. at least $4.1 million
31.6 (1 point) What are the total dollars excess of $5 million per loss expected per year?
A. less than $7 million
B. at least $7 million but less than $8 million
C. at least $8 million but less than $9 million
D. at least $9 million but less than $10 million
E. at least $10 million
31.7 (2 points) What is the average size of loss for those losses between $1 million and $5 million
in size?
A. less than $1.4 million
B. at least $1.4 million but less than $1.7 million
C. at least $1.7 million but less than $2.0 million
D. at least $2.0 million but less than $2.3 million
E. at least $2.3 million
31.8 (1 point) What is the expected total cost per year of those losses between
$1 million and $5 million in size?
A. less than $3.5 million
B. at least $3.5 million but less than $3.7 million
C. at least $3.7 million but less than $3.9 million
D. at least $3.9 million but less than $4.1 million
E. at least $4.1 million

31.9 (2 points) A Pareto Distribution with parameters α = 2.5 and θ = $15,000 appears to be a
good fit to liability claims. What is the expected average size of loss for a policy issued with a
$250,000 limit of liability?
A. less than 9200
B. at least 9200 but less than 9400
C. at least 9400 but less than 9600
D. at least 9600 but less than 9800
E. at least 9800


Use the following information for the next 4 questions:

The weekly wages for workers in a state follow a Pareto Distribution with α = 4 and θ = 1800.

Injured workers are paid weekly benefits equal to 2/3 of their pre-injury average
weekly wage, but subject to a maximum benefit of the state average weekly wage
and a minimum benefit of 1/4 of the state average weekly wage.

Injured workers have the same wage distribution as all workers.

The duration of payments is independent of the workers wage.

31.10 (1 point) What is the state average weekly wage?


A. less than $500
B. at least $500 but less than $530
C. at least $530 but less than $560
D. at least $560 but less than $590
E. at least $590
31.11 (2 points) For a Pareto Distribution with parameters α = 4 and θ = 1800, what is E[X ∧ 900]?
A. less than $400
B. at least $400 but less than $430
C. at least $430 but less than $460
D. at least $460 but less than $490
E. at least $490
31.12 (2 points) For a Pareto Distribution with parameters α = 4 and θ = 1800, what is E[X ∧ 225]?
A. less than $100
B. at least $100 but less than $130
C. at least $130 but less than $160
D. at least $160 but less than $190
E. at least $190
31.13 (3 points) What is the average weekly benefit received by injured workers?
A. less than $300
B. at least $300 but less than $320
C. at least $320 but less than $340
D. at least $340 but less than $360
E. at least $360
Hint: Use the solutions to the previous three questions.


Use the following information for the next 15 questions:


Losses follow an Exponential Distribution with θ = 10,000.
31.14 (1 point) What is the average loss?
A. less than 8500
B. at least 8500 but less than 9000
C. at least 9000 but less than 9500
D. at least 9500 but less than 10,000
E. at least 10,000
31.15 (1 point) Assuming a 25,000 policy limit, what is the average payment by the insurer?
A. less than 9000
B. at least 9000 but less than 9100
C. at least 9100 but less than 9200
D. at least 9200 but less than 9300
E. at least 9300
31.16 (1 point) Assuming a 1000 deductible (with no maximum covered loss),
what is the average payment per loss?
A. less than 9000
B. at least 9000 but less than 9100
C. at least 9100 but less than 9200
D. at least 9200 but less than 9300
E. at least 9300
31.17 (1 point) Assuming a 1000 deductible (with no maximum covered loss),
what is the average payment per non-zero payment by the insurer?
A. less than 8500
B. at least 8500 but less than 9000
C. at least 9000 but less than 9500
D. at least 9500 but less than 10,000
E. at least 10,000
31.18 (1 point) Assuming a 1000 deductible and a 25,000 maximum covered loss,
what is the average payment per loss?
A. less than 8500
B. at least 8500 but less than 9000
C. at least 9000 but less than 9500
D. at least 9500 but less than 10,000
E. at least 10,000


31.19 (1 point) Assuming a 1000 deductible and a 25,000 maximum covered loss,
what is the average payment per (non-zero) payment by the insurer?
A. less than 9000
B. at least 9000 but less than 9100
C. at least 9100 but less than 9200
D. at least 9200 but less than 9300
E. at least 9300
31.20 (1 point) Assuming a 75% coinsurance factor (with no deductible or maximum covered loss),
what is the average payment by the insurer?
A. less than 6700
B. at least 6700 but less than 6800
C. at least 6800 but less than 6900
D. at least 6900 but less than 7000
E. at least 7000
31.21 (1 point) Assuming a 75% coinsurance factor and a 1000 deductible (with no maximum
covered loss), what is the average payment per loss?
A. less than 6700
B. at least 6700 but less than 6800
C. at least 6800 but less than 6900
D. at least 6900 but less than 7000
E. at least 7000
31.22 (1 point) Assuming a 75% coinsurance factor, a 1000 deductible and a 25,000 maximum
covered loss, what is the average payment per non-zero payment by the insurer?
A. less than 6700
B. at least 6700 but less than 6800
C. at least 6800 but less than 6900
D. at least 6900 but less than 7000
E. at least 7000
31.23 (2 points) What is the average size of the losses in the interval from 1000 to 25000?
Assume no deductible, no maximum covered loss, and no coinsurance factor.
A. less than 7500
B. at least 7500 but less than 8000
C. at least 8000 but less than 8500
D. at least 8500 but less than 9000
E. at least 9000


31.24 (2 points) What is the proportion of total dollars of loss from the losses in the interval from
1000 to 25000? Assume no deductible, no maximum covered loss, and no coinsurance factor.
A. less than 74%
B. at least 74% but less than 76%
C. at least 76% but less than 78%
D. at least 78% but less than 80%
E. at least 80%
31.25 (3 points) Assuming a 1000 deductible, what is the average size of the insurers payments
for those payments greater than 500 and at most 4000?
A. less than 2100
B. at least 2100 but less than 2130
C. at least 2130 but less than 2160
D. at least 2160 but less than 2190
E. at least 2190
31.26 (3 points) Assuming a 75% coinsurance factor, and a 1000 deductible, what is the average
size of the insurers payments for those payments greater than 500 and at most 4000?
A. less than 2100
B. at least 2100 but less than 2130
C. at least 2130 but less than 2160
D. at least 2160 but less than 2190
E. at least 2190
31.27 (4 points) Assuming a 75% coinsurance factor, a 1000 deductible and a 25,000 maximum
covered loss, what is the average size of the insurers payments for those payments greater than
15,000 and at most 19,000?
A. less than 17,400
B. at least 17,400 but less than 17,500
C. at least 17,500 but less than 17,600
D. at least 17,600 but less than 17,700
E. at least 17,700
31.28 (1 point) Assuming a 75% coinsurance factor, a 1000 deductible and a 25,000 maximum
covered loss, what is the mean of the insurers payments per loss?
A. less than 4000
B. at least 4000 but less than 5000
C. at least 5000 but less than 6000
D. at least 6000 but less than 7000
E. at least 7000


31.29 (3 points) You are given the following information about a policyholder:

His loss ratio is calculated as incurred losses divided by earned premium.


He will receive a policyholder dividend as a percentage of earned premium equal to
1/4 of the difference between 60% and his loss ratio.
He receives no policyholder dividend if his loss ratio is greater than 60%.

His earned premium is 40,000.


His incurred losses are distributed via a LogNormal Distribution, with μ = 6 and σ = 3.
Calculate the expected value of his policyholder dividend.
(A) 4800
(B) 5000
(C) 5200
(D) 5400
(E) 5600
Use the following information for the next two questions:
In the state of Minnehaha, each town is responsible for its snow removal.

However, a state fund shares the cost if a town has a lot of snow during a winter.
In exchange, a town is required to pay into this state fund when it has a winter with
a small amount of snow.
Let x be the number of inches of snow a town has during a winter.

If x < 20, then the town pays the state fund c(20 - x), where c varies by town.
If x > 50, then the state fund pays the town c(x - 50).
c = 1000 for the town of Frostbite Falls.
31.30 (3 points) The number of inches of snow the town of Frostbite Falls has per winter is
equally likely to be: 8, 10, 16, 21, 35, 57, 70, or 90.
What is the expected net amount the state fund pays Frostbite Falls (expected amount state fund
pays town minus expected amount town pays the state fund) per winter?
A. 3000
B. 3500
C. 4000
D. 4500
E. 5000
31.31 (5 points) The number of inches of snow the town of Frostbite Falls has per winter is
LogNormal, with μ = 2.4 and σ = 1.5.
What is the expected net amount the state fund pays Frostbite Falls (expected amount state fund
pays town minus expected amount town pays the state fund) per winter?
A. 7000
B. 7500
C. 8000
D. 8500
E. 9000

31.32 (2 points) N follows a Poisson Distribution, with λ = 2.5. Determine E[(N - 3)+].
A. 0.2   B. 0.3   C. 0.4   D. 0.5   E. 0.6


31.33 (2 points) The lifetime of batteries is Exponential with mean 6. Batteries are sold for $100
each. If a battery lasts less than 2 years, the manufacturer will pay the purchaser the pro rata share of
the purchase price. For example if the battery lasts only 1.5 years, the manufacturer will pay the
purchaser (100)(2 - 1.5)/2 = 25.
What is the expected amount paid by the manufacturer per battery sold?
(A) 11
(B) 13
(C) 15
(D) 17
(E) 19
31.34 (4 points) XYZ Insurance Company writes insurance in a state with a catastrophe fund for
hurricanes. For any hurricane on which XYZ has more than $30 million in losses in this state, the
Catastrophe Fund will pay XYZ 75% of its hurricane losses above $30 million, subject to a
maximum payment from the fund of $90 million.
The amount XYZ pays in this state on a hurricane that hits this state is distributed via a LogNormal
Distribution, with μ = 15 and σ = 2. What is the expected value of the amount XYZ will receive from
the Catastrophe Fund due to the next hurricane to hit this state?
(A) 4 million (B) 5 million (C) 6 million (D) 7 million (E) 8 million

Use the following information for the next two questions:
• Losses follow a Pareto Distribution, with parameters α = 5 and θ = 40,000.
• Three losses are expected each year.
• For each loss less than or equal to 5,000, the insurer makes no payment.
31.35 (2 points) You are given the following:
• For each loss greater than 5,000, the insurer pays the amount of the loss up to the maximum
covered loss of 25,000, less a 5000 deductible.
(Thus for a loss of 7000 the insurer pays 2000; for a loss of 80,000 the insurer pays 20,000.)
Determine the insurer's expected annual payments.
A. Less than 7,500
B. At least 7,500, but less than 12,500
C. At least 12,500, but less than 17,500
D. At least 17,500, but less than 22,500
E. At least 22,500
31.36 (2 points) For each loss greater than 5,000, the insurer pays the entire amount of the loss up
to the maximum covered loss of 25,000.
Determine the insurer's expected annual payments.
A. Less than 7,500
B. At least 7,500, but less than 12,500
C. At least 12,500, but less than 17,500
D. At least 17,500, but less than 22,500
E. At least 22,500


31.37 (2 points) Losses follow an Exponential Distribution with θ = 20,000.


Calculate the percent of expected losses within the layer 5,000 to 50,000.
A. Less than 50%
B. At least 50%, but less than 55%
C. At least 55%, but less than 60%
D. At least 60%, but less than 65%
E. At least 65%
31.38 (4 points) Losses follow a LogNormal Distribution with μ = 9.4 and σ = 1.
Calculate the percent of expected losses within the layer 5,000 to 50,000.
A. Less than 50%
B. At least 50%, but less than 55%
C. At least 55%, but less than 60%
D. At least 60%, but less than 65%
E. At least 65%
31.39 (3 points) Losses follow a Pareto Distribution with α = 3 and θ = 40,000.
Calculate the percent of expected losses within the layer 5,000 to 50,000.
A. Less than 50%
B. At least 50%, but less than 55%
C. At least 55%, but less than 60%
D. At least 60%, but less than 65%
E. At least 65%
31.40 (3 points) N follows a Geometric Distribution, with β = 2.5. Determine E[(N - 3)+].
A. 0.9   B. 1.0   C. 1.1   D. 1.2   E. 1.3

31.41 (3 points) Losses follow a Pareto Distribution with α = 3 and θ = 12,000.


Policy A has a deductible of 3000. Policy B has a maximum covered loss of u.
The average payment per loss under Policy A is equal to that under Policy B. Determine u.
A. 4000
B. 5000
C. 6000
D. 7000
E. 8000
31.42 (1 point) X is 5 with probability 80% and 25 with probability 20%.
If E[(y - X)+ ] = 8, determine y.
A. 10   B. 15   C. 20   D. 25   E. 30

31.43 (2 points) X is Exponential with θ = 2. Y is equal to 1 - X if X < 1, and Y is 0 if X ≥ 1.
What is the expected value of Y?
A. 0.15   B. 0.17   C. 0.19   D. 0.21   E. 0.23


31.44 (3 points) Let R be the weekly wage for a worker compared to the statewide average.
R follows a LogNormal Distribution with σ = 0.4. Determine the percentage of overall wages earned
by workers whose weekly wage is less than twice the statewide average.
A. 88%
B. 90%
C. 92%
D. 94%
E. 96%
31.45 (2 points) You observe the following 35 losses: 6, 7, 11, 14, 15, 17, 18, 19, 25, 29, 30, 34,
40, 41, 48, 49, 53, 60, 63, 78, 85, 103, 124, 140, 192, 198, 227, 330, 361, 421, 514, 546, 750,
864, 1638. What is the (empirical) Limited Expected Value at 50?
A. less than 38
B. at least 38 but less than 39
C. at least 39 but less than 40
D. at least 40 but less than 41
E. at least 41
31.46 (2 points) Alexs pay is based on the annual profit made by his employer.
Alex is paid 2% of the profit, subject to a minimum payment of 100.
The annual profits for Alexs company, X, follow a distribution F(x).
Which of the following represents Alexs expected payment?
A. 100F(100) + E[X]/50 - E[X ∧ 100]
B. 100F(5000) + E[X]/50 - E[X ∧ 5000]/50
C. 100 + .02(E[X] - E[X ∧ 5000])
D. .02(E[X ∧ 5000] - E[X ∧ 100]) + 100S(5000)
E. None of A, B, C, or D
31.47 (2 points) In the previous question, assume F(x) = 1 - {20,000/(20,000 + x)}3 .
Determine Alexs expected payment.
A. 200
B. 230
C. 260
D. 290
E. 320
31.48 (2 points) The size of losses follows a Gamma distribution with parameters α = 3, θ = 100.
What is the limited expected value at 500, E[X ∧ 500]?
Hint: Use Theorem A.1 in Appendix A of Loss Models:
Γ(n; x) = 1 - Σ_{j=0}^{n-1} x^j e^(-x) / j!, for n a positive integer.

A. less than 275


B. at least 275 but less than 280
C. at least 280 but less than 285
D. at least 285 but less than 290
E. at least 290


31.49 (2 points) Donald Adams owns the Get Smart Insurance Agency.
Let L be the annual losses from the insurance policies that Dons agency writes for the Control
Insurance Company. L follows a Single Parameter Pareto distribution with α = 3 and θ = 100,000.
Don gets a bonus from the Control Insurance Company calculated as (170,000 - L)/4 if this quantity
is positive and 0 otherwise. Calculate Dons expected bonus.
A Less than 10,000
B. At least 10,000, but less than 12,000
C. At least 12,000, but less than 14,000
D. At least 14,000, but less than 16,000
E. At least 16,000
31.50 (1 point) In the previous question, calculate the expected value of Dons bonus conditional on
his bonus being positive.
31.51 (1 point) X follows the density f(x), with support from 0 to infinity.
∫_0^1000 f(x) dx = 0.87175.   ∫_0^1000 x f(x) dx = 350.61.
Determine E[X ∧ 1000].


A. Less than 480
B. At least 480, but less than 490
C. At least 490, but less than 500
D. At least 500, but less than 510
E. At least 510
31.52 (3 points) The size of loss is modeled by a two parameter Pareto distribution with θ = 5000
and α = 3. An insurance has the following provisions:
(i) It pays 75% of the first 2000 of any loss.
(ii) It pays 90% of any portion of a loss that is greater than 10,000.
Calculate the average payment per loss.
A Less than 1050
B. At least 1050, but less than 1100
C. At least 1100, but less than 1150
D. At least 1150, but less than 1200
E. At least 1200


31.53 (3 points) The mean number of minutes used per month by owners of cell phones varies
between owners via a Single Parameter Pareto Distribution with α = 1.5 and θ = 20.
The Telly Savalas Phone Company is planning to sell a new unlimited calling plan.
Only those whose current average usage is greater than the overall average will sign up for the plan.
In addition, those who sign up will use on average 50% more minutes than currently.
What is the expected number of minutes used per month under the new plan?
A. 150
B. 180
C. 210
D. 240
E. 270
31.54 (2 points) Define the first moment distribution, G(x), as the percentage of total loss dollars that
come from those losses of size less than x.
If the size of loss distribution follows a LogNormal Distribution, with parameters μ and σ, determine
the form of the first moment distribution.
31.55 (3 points) Define the quartiles as the 25th, 50th, and 75th percentiles.
Define the trimmed mean as the average of those values between the first (lower) quartile and the
third (upper) quartile.
Determine the trimmed mean for an Exponential Distribution.
31.56 (2 points) For a Pareto Distribution with = 1, derive the formula for the Limited Expected
Value that is shown in Appendix A of Loss Models, attached to the exam.
31.57 (4 points) The value of a Property Claims Service (PCS) index is determined by the
catastrophe losses for the insurance industry in a certain region of the country over a certain period of
time. Each $100 million of catastrophe losses corresponds to one point on the index.
A 100/150 call spread would pay: (200) {(S - 100)+ - (S - 150)+},
where S is the value of the PCS index at expiration, and x+ = x if x ≥ 0, and x+ = 0 if x < 0.
You assume that the catastrophe losses in a certain region follow a LogNormal Distribution with
parameters μ = 20 and σ = 2.
What is the expected payment on a 100/150 call spread on the PCS Index for this region?
A. 200
B. 300
C. 400
D. 500
E. 600


31.58 (3 points) Define the quantile Q_α to be such that F[Q_α] = α.
For α between 0 and 1/2, compute the Windsorized mean by:
1. Replace all values below Q_α by Q_α.
2. Replace all values above Q_{1-α} by Q_{1-α}.
3. Take the average.
Determine the algebraic form of the Windsorized mean for an Exponential Distribution.
31.59 (4 points) Define π_p as the pth percentile.
Define the trimmed mean as the average of those values between π_{1-p} and π_p.
For p = 95%, determine the algebraic form of the trimmed mean for a Pareto Distribution.
31.60 (4, 5/88, Q.61) (3 points) Losses for a given line of insurance are distributed according to
the probability density function f(x) = 0.015 - .0001x, 0 < x < 100.
An insurer has issued policies each with a deductible of 10 for this line.
On these policies, what is the average expected payment by the insurer per non-zero payment by
the insurer?
A. Less than 30
B. At least 30, but less than 35
C. At least 35, but less than 40
D. At least 40, but less than 45
E. 45 or more
31.61 (4, 5/90, Q.53) (2 points) Loss Models defines two functions:
1. the limited expected value function, E[X ∧ x], and
2. the Mean Excess Loss function, e(x).
If F(x) = Pr{X ≤ x} and the expected value of X is denoted by E[X], then which of the following
equations expresses the relationship between E[X ∧ x] and e(x)?
A. E[X ∧ x] = E[X] - e(x) / {1 - F(x)}
B. E[X ∧ x] = E[X] - e(x)
C. E[X ∧ x] = E[X] - e(x)(1 - F(x))
D. E[X ∧ x] = E[X](1 - F(x)) - e(x)
E. None of the above


31.62 (4B, 11/93, Q.16) (1 point) Which of the following statements are true regarding loss
distribution models?
1. For small samples, method of moments estimators have smaller variances than
maximum likelihood estimators.
2. The limited expected value function evaluated at any point d > 0 equals
E[X ∧ d] = ∫_0^d x f_X(x) dx + d{1 - F_X(d)}, where f_X(x) and F_X(x) are the probability density
and distribution functions, respectively, of the loss random variable X.


3. A consideration in model selection is agreement between the empirical and fitted
limited expected value functions.
A. 2
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3
31.63 (4B, 5/95, Q.22) (2 points) You are given the following:
Losses follow a Pareto distribution, with parameters θ = 1000 and α = 2.
10 losses are expected each year.
The number of losses and the individual loss amounts are independent.
For each loss that occurs, the insurer's payment is equal to the entire amount of the
loss if the loss is greater than 100. The insurer makes no payment if the loss is less
than or equal to 100.
Determine the insurer's expected annual payments.
A. Less than 8,000
B. At least 8,000, but less than 9,000
C. At least 9,000, but less than 9,500
D. At least 9,500, but less than 9,900
E. At least 9,900
31.64 (4B, 11/98, Q.28) (2 points) You are given the following:

Losses follow a lognormal distribution, with parameters μ = 10 and σ = 1.

One loss is expected each year.


For each loss less than or equal to 50,000, the insurer makes no payment.
For each loss greater than 50,000, the insurer pays the entire amount of the
loss up to the maximum covered loss of 100,000.
Determine the insurer's expected annual payments.
A. Less than 7,500
B. At least 7,500, but less than 12,500
C. At least 12,500, but less than 17,500
D. At least 17,500, but less than 22,500
E. At least 22,500


31.65 (3, 5/00, Q.25) (2.5 points) An insurance agent will receive a bonus if his loss ratio is less
than 70%. You are given:
(i) His loss ratio is calculated as incurred losses divided by earned premium on his block
of business.
(ii) The agent will receive a percentage of earned premium equal to 1/3 of the difference
between 70% and his loss ratio.
(iii) The agent receives no bonus if his loss ratio is greater than 70%.
(iv) His earned premium is 500,000.
(v) His incurred losses are distributed according to the Pareto distribution:
F(x) = 1 - {600,000 / (x + 600,000)}3 , x > 0.
Calculate the expected value of his bonus.
(A) 16,700 (B) 31,500 (C) 48,300 (D) 50,000 (E) 56,600
31.66 (3, 11/00, Q.27 & 2009 Sample Q.116) (2.5 points) Total hospital claims for a health plan
were previously modeled by a two-parameter Pareto distribution with α = 2 and θ = 500.
The health plan begins to provide financial incentives to physicians by paying a bonus of 50% of
the amount by which total hospital claims are less than 500.
No bonus is paid if total claims exceed 500.
Total hospital claims for the health plan are now modeled by a new Pareto distribution
with α = 2 and θ = K. The expected claims plus the expected bonus under the revised
model equals expected claims under the previous model.
Calculate K.
(A) 250
(B) 300
(C) 350
(D) 400
(E) 450
31.67 (3, 11/02, Q.37 & 2009 Sample Q.96) (2.5 points) Insurance agent Hunt N. Quotum will
receive no annual bonus if the ratio of incurred losses to earned premiums for his book of business is
60% or more for the year. If the ratio is less than 60%, Hunts bonus will be a percentage of his
earned premium equal to 15% of the difference between his ratio and 60%. Hunts annual earned
premium is 800,000. Incurred losses are distributed according to the Pareto distribution,
with θ = 500,000 and α = 2. Calculate the expected value of Hunt's bonus.
(A) 13,000   (B) 17,000   (C) 24,000   (D) 29,000   (E) 35,000

31.68 (1 point) In the previous question, (3, 11/02, Q. 37), calculate the expected value of Hunts
bonus, given that Hunt receives a (positive) bonus.
(A) 46,000 (B) 48,000 (C) 50,000 (D) 52,000 (E) 54,000
31.69 (CAS3, 11/03, Q.21) (2.5 points)
The cumulative loss distribution for a risk is F(x) = 1 - 10^6 / (x + 10^3)^2.
Calculate the percent of expected losses within the layer 1,000 to 10,000.
A. 10%
B. 12%
C. 17%
D. 34%
E. 41%


31.70 (SOA3, 11/03, Q.3 & 2009 Sample Q.84) (2.5 points) A health plan implements an
incentive to physicians to control hospitalization under which the physicians will be paid a bonus B
equal to c times the amount by which total hospital claims are under 400 (0 ≤ c ≤ 1). The effect the
incentive plan will have on underlying hospital claims is modeled by assuming that the new total
hospital claims will follow a two-parameter Pareto distribution with α = 2 and θ = 300. E(B) = 100.
Calculate c.
(A) 0.44   (B) 0.48   (C) 0.52   (D) 0.56   (E) 0.60

31.71 (SOA3, 11/04, Q.7 & 2009 Sample Q.123) (2.5 points) Annual prescription drug costs
are modeled by a two-parameter Pareto distribution with θ = 2000 and α = 2.
A prescription drug plan pays annual drug costs for an insured member subject to the
following provisions:
(i) The insured pays 100% of costs up to the ordinary annual deductible of 250.
(ii) The insured then pays 25% of the costs between 250 and 2250.
(iii) The insured pays 100% of the costs above 2250 until the insured has paid 3600 in total.
(iv) The insured then pays 5% of the remaining costs.
Determine the expected annual plan payment.
(A) 1120
(B) 1140
(C) 1160
(D) 1180
(E) 1200
31.72 (CAS3, 11/05, Q.22) (2.5 points) An insurance agent gets a bonus based on the
underlying losses, L, from his book of business.
L follows a Pareto distribution with parameters α = 3 and θ = 600,000.
His bonus, B, is calculated as (650,000 - L)/3 if this quantity is positive and 0 otherwise.
Calculate his expected bonus.
A Less than 100,000
B. At least 100,000, but less than 120,000
C. At least 120,000, but less than 140,000
D. At least 140,000, but less than 160,000
E. At least 160,000
31.73 (SOA M, 11/05, Q.14) (2.5 points) You are given:
(i) T is the future lifetime random variable.
(ii) μ(t) = μ, t ≥ 0.
(iii) Var[T] = 100.
Calculate E[T ∧ 10].
(A) 2.6   (B) 5.4   (C) 6.3   (D) 9.5   (E) 10.0


31.74 (CAS3, 5/06, Q.37) (2.5 points) Between 9 am and 3 pm Big National Bank employs 2
tellers to service customer transactions. The time it takes Teller X to complete each transaction
follows an exponential distribution with a mean of 10 minutes. Transaction times for Teller Y follow an
exponential distribution with a mean of 15 minutes. Both Teller X and Teller Y are continuously busy
while the bank is open.
On average every third customer transaction is a deposit and the amount of the deposit follows a
Pareto distribution with parameters α = 3 and θ = $5000. Each transaction that involves a deposit of
at least $7500 is handled by the branch manager.
Calculate the expected total deposits made through the tellers each day.
A. Less than $31,000
B. At least $31,000, but less than $32,500
C. At least $32,500, but less than $35,000
D. At least $35,000, but less than $37,500
E. At least $37,500
31.75 (SOA M, 11/06, Q.20 & 2009 Sample Q.281) (2.5 points)
For a special investment product, you are given:
(i) All deposits are credited with 75% of the annual equity index return, subject to a minimum
guaranteed crediting rate of 3%.
(ii) The annual equity index return is normally distributed with a mean of 8%
and a standard deviation of 16%.
(iii) For a random variable X which has a normal distribution with mean μ and standard deviation σ,
you are given the following limited expected values:
E[X ∧ 3%]:             σ = 12%     σ = 16%
            μ = 6%      -0.43%      -1.99%
            μ = 8%       0.31%      -1.19%
E[X ∧ 4%]:             σ = 12%     σ = 16%
            μ = 6%       0.15%      -1.43%
            μ = 8%       0.95%      -0.58%
Calculate the expected annual crediting rate.
(A) 8.9%   (B) 9.4%   (C) 10.7%   (D) 11.0%   (E) 11.6%


31.76 (SOA M, 11/06, Q.31 & 2009 Sample Q.286) (2.5 points) Michael is a professional
stuntman who performs dangerous motorcycle jumps at extreme sports events around the world.
The annual cost of repairs to his motorcycle is modeled by a two parameter Pareto distribution
with θ = 5000 and α = 2.
An insurance reimburses Michaels motorcycle repair costs subject to the following provisions:
(i) Michael pays an annual ordinary deductible of 1000 each year.
(ii) Michael pays 20% of repair costs between 1000 and 6000 each year.
(iii) Michael pays 100% of the annual repair costs above 6000 until Michael has paid 10,000 in
out-of-pocket repair costs each year.
(iv) Michael pays 10% of the remaining repair costs each year.
Calculate the expected annual insurance reimbursement.
(A) 2300
(B) 2500
(C) 2700
(D) 2900
(E) 3100


Solutions to Problems:
31.1. C. The distribution is an Exponential Distribution with θ = 1/2.
For the Exponential Distribution E[X ∧ x] = θ(1 - e^(-x/θ)).
The average size of the capped losses is: E[X ∧ 1] = (1/2)(1 - e^(-2)) = 0.432.
Thus the expected annual total loss payments on a basic limits policy are: (13)(0.432) = 5.62.
Alternately, one can use the relation between the mean excess loss and the Limited Expected
Value: e(x) = {mean - E[X ∧ x]} / {1 - F(x)}, therefore E[X ∧ x] = mean - e(x){1 - F(x)}. For the
Exponential Distribution, the mean excess loss is a constant = θ = mean.
Therefore E[X ∧ x] = mean - e(x){1 - F(x)} = θ(1 - e^(-x/θ)). Proceed as before.
31.2. B. mean = exp(μ + σ^2/2) = 98,716. Therefore, with 200 claims expected per year, the
expected total cost per year is: (200)(98,716) = $19.74 million.
31.3. A. E[X ∧ x] = exp(μ + σ^2/2) Φ[(lnx - μ - σ^2)/σ] + x {1 - Φ[(lnx - μ)/σ]}.
E[X ∧ 1 million] = exp(7 + 9/2) Φ[(ln(1000000) - 7 - 9)/3] + (1,000,000){1 - Φ[(ln(1000000) - 7)/3]}
= (98,716) Φ[-0.73] + (1,000,000)(1 - Φ[2.27]) = (98,716)(1 - 0.7673) + (1,000,000)(1 - 0.9884)
= 22,971 + 11,600 = 34,571.
With a limit of $1 million per claim and 200 claims expected per year, the expected total cost per
year is: 200 E[X ∧ 1 million] = (200)(34,571) = $6.91 million.
31.4. E. E[X ∧ 5 million] = exp(7 + 9/2) Φ[(ln(5000000) - 7 - 9)/3] +
(5,000,000){1 - Φ[(ln(5000000) - 7)/3]} = (98,716) Φ[-0.19] + (5,000,000)(1 - Φ[2.81])
= (98,716)(1 - 0.5753) + (5,000,000)(1 - 0.9975) = 41,925 + 12,500 = 54,425.
200 E[X ∧ 5 million] = $10.88 million.
31.5. D. The dollars in the layer from $1 million to $5 million is the difference between the dollars
limited to $5 million and the dollars limited to $1 million. Using the answers to the two previous
questions: $10.88 million - $6.91 million = $3.97 million.
Comment: In terms of the limited expected values and the expected number of losses N, the
dollars in the layer from $1 million to $5 million equals: N{E[X ∧ 5 million] - E[X ∧ 1 million]}.
In this case N = 200.


31.6. C. The dollars excess of $5 million per loss is the difference between the total cost and the
cost limited to $5 million per loss. Using the answers to two prior questions:
$19.74 million - $10.88 million = $8.86 million.
Comment: The dollars excess of $5 million per loss equals:
N{E[X] - E[X ∧ 5 million]} = N{mean - E[X ∧ 5 million]}. In this case N = 200 losses.
31.7. D. First calculate the dollars of loss on these losses per total number of losses:
{E[X ∧ 5 million] - 5 million S(5 million)} - {E[X ∧ 1 million] - 1 million S(1 million)} =
{54,425 - (5 million)(1 - 0.9975)} - {34,517 - (1 million)(1 - 0.9884)} = 41,925 - 22,917 = $19,008.
Then divide by the probability of a loss being of this size:
F(5 million) - F(1 million) = Φ[(ln(5000000) - 7)/3] - Φ[(ln(1000000) - 7)/3] =
Φ[2.81] - Φ[2.27] = 0.9975 - 0.9884 = 0.0091. $19,008 / 0.0091 = $2.09 million.
31.8. C. Either one can calculate the expected number of losses of this size per year as
(200){F(5 million) - F(1 million)} = (200){0.9975 - 0.9884} = 1.8 and multiply by the average size
calculated in the previous question. (1.8)($2.09 million) = $3.8 million. Alternately, one can multiply
the expected number of losses per year times the dollars on these losses per loss calculated in a
previous question: (200)($19,008) = $3.8 million.
31.9. E. For the Pareto Distribution, E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)}.
E[X ∧ 250000] = {15000/1.5}{1 - (15000/(15000+250000))^(2.5-1)} = 10000(0.9865) = 9865.
31.10. E. The mean of a Pareto is: θ/(α-1) = 1800/3 = 600.
31.11. B. For the Pareto Distribution, E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)}.
E[X ∧ 900] = {1800/3}{1 - (1800/(1800+900))^3} = 422.22.
31.12. D. For the Pareto Distribution, E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)}.
E[X ∧ 225] = {1800/3}{1 - (1800/(1800+225))^3} = 178.60.


31.13. B. The average weekly wage is $600, from a previous solution. Thus the maximum benefit
is $600, while the minimum benefit is $600/4 = $150. These correspond to pre-injury wages of
$600/(2/3) = $900 and $150/(2/3) = $225 respectively. (If a worker's pre-injury wage is more than
$900 his benefit is only $600. If his pre-injury wage is less than $225, his benefit is still $150.)
Let x be the worker's pre-injury wage; then the worker's benefits are:
$150 if x ≤ $225, 2x/3 if $225 ≤ x ≤ $900, $600 if x ≥ $900.
Thus the average benefit is made up of three terms (low, medium, and high wages):
150 F(225) + (2/3) ∫_225^900 x f(x) dx + 600 S(900).
∫_225^900 x f(x) dx = ∫_0^900 x f(x) dx - ∫_0^225 x f(x) dx =
E[X ∧ 900] - 900 S(900) - {E[X ∧ 225] - 225 S(225)}.
Thus the average benefit is:
150 F(225) + 150 S(225) + 600 S(900) - 600 S(900) + (2/3)(E[X ∧ 900] - E[X ∧ 225]) =
150 + (2/3)(E[X ∧ 900] - E[X ∧ 225]) = 150 + (2/3)(422.22 - 178.60) = 312.41.
Alternately, the benefits can be described as:
150 + (2/3)(layer of wages between 225 and 900) = 150 + (2/3)(E[X ∧ 900] - E[X ∧ 225]).
Comment: Extremely unlikely to be asked on the exam. Relates to the calculation of Law
Amendment Factors used in Workers Compensation Ratemaking. Geometrically oriented
students may benefit by reviewing the subsection on payments subject to both a minimum and
a maximum in the subsequent section on Lee Diagrams.
31.14. E. E[X] = θ = 10,000.
31.15. C. E[X ∧ 25000] = 10,000(1 - e^(-25000/10000)) = 9179.
31.16. B. E[X ∧ x] = θ(1 - e^(-x/θ)) = 10,000(1 - e^(-1000/10000)) = 952.
E[X] - E[X ∧ 1000] = 10,000 - 952 = 9048.
31.17. E. {E[X] - E[X ∧ 1000]}/S(1000) = (10,000 - 952)/0.9048 = 10,000.
Alternately, the average size of the data truncated and shifted from below is the mean excess loss.
For the Exponential e(x) = θ = 10,000.
Comment: For the Exponential, {E[X] - E[X ∧ x]}/S(x) = θ. Thus for the Exponential Distribution, in
the absence of any maximum covered loss, the average size of the insurer's payment per non-zero
payment by the insurer does not depend on the deductible amount; the mean excess loss is
constant for the Exponential Distribution.
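As a quick numerical check of 31.14 through 31.17, here is a minimal Python sketch (assuming, as in these solutions, an Exponential severity with θ = 10,000; the helper name lev_expon is just for illustration):

import math

theta = 10000.0   # Exponential mean, as used in these solutions

def lev_expon(x, theta):
    # E[X ^ x] = theta * (1 - exp(-x/theta)) for the Exponential Distribution
    return theta * (1.0 - math.exp(-x / theta))

print(lev_expon(25000, theta))                                      # 31.15: about 9179
print(theta - lev_expon(1000, theta))                               # 31.16: about 9048
print((theta - lev_expon(1000, theta)) / math.exp(-1000 / theta))   # 31.17: exactly theta = 10,000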


31.18. A. E[X ∧ 25000] = 10,000(1 - e^(-25000/10000)) = 9179.
E[X ∧ 25000] - E[X ∧ 1000] = 9179 - 952 = 8227.
31.19. B. {E[X ∧ 25000] - E[X ∧ 1000]}/S(1000) = (9179 - 952)/0.9048 = 9093.
31.20. E. Each payment is 75% of the insured's loss, so the average is:
(0.75)E[X] = (0.75)(10,000) = 7500.
31.21. B. Each payment is 75% of what it would have been without any coinsurance, so the
average is (0.75)(E[X] - E[X ∧ 1000]) = (0.75)(10,000 - 952) = 6786.
31.22. C. Each payment is 75% of what it would have been without any coinsurance, so the
average is (0.75)(E[X ∧ 25000] - E[X ∧ 1000])/S(1000) = (0.75)(9179 - 952)/0.9048 = 6819.
31.23. D. Average Size of Losses in the Interval [1000, 25000] =
{E[X ∧ 25000] - 25000 S(25000) - (E[X ∧ 1000] - 1000 S(1000))} / {F(25000) - F(1000)} =
{9179 - 25000(0.08208) - (952 - (1000)(0.9048))} / (0.9179 - 0.0952) = 7080/0.8227 = 8606.
31.24. A. {E[X ∧ 25000] - 25000 S(25000) - (E[X ∧ 1000] - 1000 S(1000))}/E[X] =
{9179 - 25000(0.08208) - (952 - (1000)(0.9048))} / 10,000 = 7080/10,000 = 70.8%.
31.25. C. The payments from 500 to 4000 correspond to losses of size between 1500 and
5000. These losses have average size:
{E[X ∧ 5000] - 5000 S(5000) - (E[X ∧ 1500] - 1500 S(1500))} / {F(5000) - F(1500)} =
{3935 - 5000(0.6065) - (1393 - (1500)(0.8607))} / (0.3935 - 0.1393) = 3148.
The average size of the payments is 1000 less: 3148 - 1000 = 2148.
31.26. B. The payments from 500 to 4000 correspond to losses of size between
1000 + (500/0.75) = 1667 and 1000 + (4000/0.75) = 6333. These losses have average size:
{E[X ∧ 6333] - 6333 S(6333) - (E[X ∧ 1667] - 1667 S(1667))} / {F(6333) - F(1667)} =
{4692 - 6333(0.5308) - (1535 - (1667)(0.8465))} / (0.4692 - 0.1535) = 3820. The average size of the
payments is 1000 less and then multiplied by 0.75: (3820 - 1000)(0.75) = 2115.


31.27. B. The most the insurer will pay is: (0.75)(25,000 - 1000) = 18,000.
For any loss of size greater than or equal to 25,000 the insurer pays 18,000.
Let X be the size of loss.
Then the payment is 18,000 if X ≥ 25,000, and (0.75)(X - 1000) if 25,000 > X > 1000.
A payment of 15,000 corresponds to a loss of: (15,000/0.75) + 1000 = 21,000.
Thus the dollars of payments greater than 15,000 and at most 19,000 is the payments on losses of
size greater than 21,000, which we split into two pieces:
∫_21000^25000 0.75(x - 1000) f(x) dx + 18,000 S(25000) =
0.75 ∫_21000^25000 x f(x) dx - 750{F(25000) - F(21000)} + 18,000 S(25000) =
0.75{E[X ∧ 25000] - 25000 S(25000) - (E[X ∧ 21000] - 21000 S(21000))}
+ 750{S(25000) - S(21000)} + 18,000 S(25000) =
0.75 E[X ∧ 25000] - 0.75 E[X ∧ 21000] + 15,750 S(21000) - 18,750 S(25000) - 750 S(21000)
+ 750 S(25000) + 18,000 S(25000) =
0.75 E[X ∧ 25000] - 0.75 E[X ∧ 21000] + 15,000 S(21000) =
0.75(9179.2) - (0.75)(8775.4) + 15000(0.12246) = 2139.8.
In order to get the average size we need to divide the payments by the percentage of the number
of losses represented by losses greater than 21,000, S(21,000) = 0.12246:
2139.8/0.12246 = 17,473.
Comment: Long and difficult. In this case it may be easier to calculate the integral of x f(x) dx, rather
than put it in terms of the Limited Expected Values and Survival Functions.
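Following that suggestion, a short Python sketch (again assuming an Exponential severity with θ = 10,000, and scipy for the numerical integration) reproduces the answer directly:

from math import exp
from scipy.integrate import quad

theta = 10000.0                                      # Exponential severity assumed in these solutions
f = lambda x: exp(-x / theta) / theta                # density
payment = lambda x: min(0.75 * max(x - 1000.0, 0.0), 18000.0)

# Payments above 15,000 come only from losses above 21,000.
dollars, _ = quad(lambda x: payment(x) * f(x), 21000, float("inf"))
print(dollars / exp(-21000 / theta))                 # average such payment, about 17,473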
31.28. D. 0.75{E[X ∧ 25000] - E[X ∧ 1000]} = (0.75)(9179 - 952) = 6170.


31.29. B. A loss ratio of 60% corresponds to (0.6)(40000) = $24,000 in losses.
If his losses are x, and x < 24,000, then he gets a dividend of (1/4)(24,000 - x).
The expected dividend is:
(1/4) ∫_0^24000 (24000 - x) f(x) dx = (1/4){24000 F(24000) - (E[X ∧ 24000] - 24000 S(24000))}
= (1/4){24000 - E[X ∧ 24000]}. For a LogNormal Distribution,
E[X ∧ x] = exp(μ + σ^2/2) Φ[(lnx - μ - σ^2)/σ] + x {1 - Φ[(lnx - μ)/σ]}. Therefore,
E[X ∧ 24000] = exp(6 + 9/2) Φ[(ln24000 - 6 - 9)/3] + 24000 {1 - Φ[(ln24000 - 6)/3]} =
(36,316) Φ[-1.64] + (24,000){1 - Φ[1.36]} = (36,316)(1 - 0.9495) + (24,000)(1 - 0.9131) = 3920.
Therefore, his expected dividend is: (1/4)(24000 - 3920) = $5020.
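A numerical check of this dividend calculation in Python (scipy assumed; the helper name lev_lognormal is just for illustration):

from math import exp, log
from scipy.stats import norm

mu, sigma = 6.0, 3.0   # LogNormal parameters from the problem

def lev_lognormal(x, mu, sigma):
    # E[X ^ x] for the LogNormal, per the formula quoted above
    return (exp(mu + sigma**2 / 2) * norm.cdf((log(x) - mu - sigma**2) / sigma)
            + x * (1.0 - norm.cdf((log(x) - mu) / sigma)))

print(0.25 * (24000 - lev_lognormal(24000, mu, sigma)))   # expected dividend, about 5020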
31.30. E. If x < 20, then Frostbite Falls pays the state fund 1000(20 - x).
The expected amount by which x is less than 20, (the savings at 20), is: 20 - E[X ∧ 20].
E[X ∧ 20] = (8 + 10 + 16 + 100)/8 = 16.75.
Therefore, the expected amount paid by the town to the state fund per winter is:
(1000)(20 - E[X ∧ 20]) = 3250.
If x > 50, then the state fund pays Frostbite Falls 1000(x - 50). The expected amount by which x is
more than 50, (the inches of snow excess of 50), is: E[X] - E[X ∧ 50].
E[X] = (8 + 10 + 16 + 21 + 35 + 57 + 70 + 90)/8 = 38.375.
E[X ∧ 50] = (8 + 10 + 16 + 21 + 35 + 150)/8 = 30.
Therefore, the expected amount paid by the state fund to the town per winter is:
(1000)(E[X] - E[X ∧ 50]) = (1000)(38.375 - 30) = 8375.
Expected amount state fund pays town minus expected amount town pays the state fund is:
8375 - 3250 = 5125.
Alternately, one can list what happens in each possible situation:
Snow (inches):         8        10       16      21     35      57      70      90
Paid by State:    -12,000  -10,000   -4,000      0      0    7,000  20,000  40,000
Average Paid by State: 5,125.
Comment: (12 + 10 + 4)/8 = 3.250 = 20 - E[X ∧ 20]. (7 + 20 + 40)/8 = 8.375 = E[X] - E[X ∧ 50].
A very simplified example of retrospective rating. See for example, Individual Risk Rating, by
Margaret Tiller Sherwood in Foundations of Casualty Actuarial Science.


31.31. A. If x < 20, then Frostbite Falls pays the state fund 1000(20 - x). The expected amount by
which x is less than 20, (the savings at 20), is: 20 - E[X ∧ 20].
E[X ∧ 20] = exp(μ + σ^2/2) Φ[(ln20 - μ - σ^2)/σ] + 20{1 - Φ[(ln20 - μ)/σ]} =
(33.954) Φ[-1.10] + (20){1 - Φ[0.40]} = (33.954)(0.1357) + (20)(1 - 0.6554) = 11.50.
Therefore, the expected amount paid by the town to the state fund per winter is:
(1000)(20 - E[X ∧ 20]) = 8500.
If x > 50, then the state fund pays Frostbite Falls 1000(x - 50). The expected amount by which x is
more than 50, (the inches of snow excess of 50), is: E[X] - E[X ∧ 50].
E[X] = exp(μ + σ^2/2) = 33.954.
E[X ∧ 50] = exp(μ + σ^2/2) Φ[(ln50 - μ - σ^2)/σ] + 50{1 - Φ[(ln50 - μ)/σ]} =
(33.954) Φ[-0.49] + (50){1 - Φ[1.01]} = (33.954)(0.3121) + (50)(1 - 0.8438) = 18.41.
Therefore, the expected amount paid by the state fund to the town per winter is:
(1000)(E[X] - E[X ∧ 50]) = (1000)(33.954 - 18.41) = 15,544.
Expected amount state fund pays town minus expected amount town pays the state fund is:
15,544 - 8500 = 7,044.
Comment: In the following Lee Diagram, other than the constant c = 1000, the expected amount
paid by the town to the state (when there is little snow) corresponds to Area A, below a horizontal
line at 20 and above the curve. Other than the constant c = 1000, the expected amount paid by the
state to the town (when there is a lot of snow) corresponds to Area B, above a horizontal line at 50
and below the curve.
[Lee Diagram: inches of snow (0 to 200) plotted against cumulative probability (0 to about 0.8),
showing Area A below the horizontal line at 20 and above the curve, and Area B above the
horizontal line at 50 and below the curve.]


31.32. C. E[(N-3)+] = E[N] - E[N ∧ 3] = λ - (Prob[N = 1] + 2 Prob[N = 2] + 3 Prob[N ≥ 3]) =
λ - λe^(-λ) - λ^2 e^(-λ) - (3)(1 - e^(-λ) - λe^(-λ) - λ^2 e^(-λ)/2) = λ + 3e^(-λ) + 2λe^(-λ) + λ^2 e^(-λ)/2 - 3 =
2.5 + 3e^(-2.5) + 2(2.5)e^(-2.5) + (2.5^2)e^(-2.5)/2 - 3 = 0.413.
Alternately, E[(N-3)+] = E[(3-N)+] + E[N] - 3 = 3 Prob[N = 0] + 2 Prob[N = 1] + Prob[N = 2] + λ - 3 =
λ + 3e^(-λ) + 2λe^(-λ) + λ^2 e^(-λ)/2 - 3 = 0.413.
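A direct numerical check in Python, summing (n - 3) times the Poisson probabilities:

from math import exp, factorial

lam = 2.5
pmf = lambda n: exp(-lam) * lam**n / factorial(n)

print(sum((n - 3) * pmf(n) for n in range(4, 200)))   # E[(N - 3)+], about 0.413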
31.33. C. The expected amount by which lifetimes are less than 2 is:
2 - E[X ∧ 2] = 2 - (6)(1 - e^(-2/6)) = 0.2992.
The expected amount paid per battery is: (100)(0.2992/2) = 14.96.
31.34. B. For the LogNormal Distribution,
E[X ∧ x] = exp(μ + σ^2/2) Φ[(lnx - μ - σ^2)/σ] + x {1 - Φ[(lnx - μ)/σ]}.
E[X ∧ 30 million] =
exp(15 + 2^2/2) Φ[(ln30000000 - 15 - 2^2)/2] + 30000000 {1 - Φ[(ln30000000 - 15)/2]} =
(24,154,953) Φ[-0.89] + (30,000,000){1 - Φ[1.11]} =
(24,154,953)(0.1867) + (30,000,000)(1 - 0.8665) = 8.51 million.
E[X ∧ 150 million] =
exp(15 + 2^2/2) Φ[(ln150000000 - 15 - 2^2)/2] + 150000000 {1 - Φ[(ln150000000 - 15)/2]} =
(24,154,953) Φ[-0.09] + (150,000,000){1 - Φ[1.91]} =
(24,154,953)(0.4641) + (150,000,000)(1 - 0.9719) = 15.43 million.
The maximum payment of $90 million corresponds to a loss by XYZ of: 30 + 90/0.75 =
150 million. Therefore the average payment to XYZ per hurricane is:
0.75 (E[X ∧ 150 million] - E[X ∧ 30 million]) = (0.75)(15.43 - 8.51) = 5.2 million.
Comment: The portion of hurricanes on which XYZ receives non-zero payments is:
S(30 million) = 1 - Φ[(ln30000000 - 15)/2] = 1 - Φ[1.11] = 0.1335.
Therefore, the average payment per nonzero payment is: (0.75)(15.43 - 8.51)/0.1335 =
38.9 million. A very simplified version of the Florida Hurricane Catastrophe Fund.
31.35. C. Per loss, the insurer would pay the layer from 5,000 to 25,000, which is:
E[X ∧ 25,000] - E[X ∧ 5,000]. For the Pareto: E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)} =
10000 {1 - (40000/(40000+x))^4}. E[X ∧ 25,000] = 10000 {1 - (40/65)^4} = 8566.
E[X ∧ 5,000] = 10000 {1 - (40/45)^4} = 3757. E[X ∧ 25,000] - E[X ∧ 5,000] = 8566 - 3757 = 4809.
Three losses are expected per year, thus the insurer's expected payment is: (3)(4809) = 14,427.


31.36. E. Without the feature that the insurer pays the entire loss (up to 25,000) for each loss
greater than 5,000, the insurer would pay the layer from 5,000 to 25,000, which is:
E[X ∧ 25,000] - E[X ∧ 5,000]. As calculated in the solution to the previous question,
E[X ∧ 25,000] - E[X ∧ 5,000] = 8566 - 3757 = 4809. However, that extra provision adds 5,000 per
large loss, or 5,000(1 - F(5000)) = 5,000(θ/(θ+5000))^α = 5000(40/45)^5 = 2775. Thus per loss the
insurer pays: 5,000(1 - F(5000)) + E[X ∧ 25,000] - E[X ∧ 5,000] = 2775 + 4809 = 7584.
There are three losses expected per year, thus the insurer's expected payment is:
(3)(7584) = 22,752.
31.37. E. The expected losses within the layer 5,000 to 50,000 is:
∫_5000^50000 S(x) dx = ∫_5000^50000 e^(-x/20000) dx = 13,934.
The percent of expected losses within the layer 5,000 to 50,000 is: 13,934/20,000 = 69.7%.
Alternately, for the Exponential Distribution, LER(x) = 1 - e^(-x/θ).
LER(50000) - LER(5000) = e^(-5000/20000) - e^(-50000/20000) = e^(-0.25) - e^(-2.5) = 69.7%.
31.38. D. E[X] = exp(μ + σ^2/2) = e^9.9 = 19,930.
E[X ∧ 5000] = exp(μ + σ^2/2) Φ[(ln5000 - μ - σ^2)/σ] + 5000{1 - Φ[(ln5000 - μ)/σ]} =
(19,930) Φ[-1.88] + (5000){1 - Φ[-0.88]} = (19,930)(0.0301) + (5000)(0.8106) = 4653.
E[X ∧ 50000] = exp(μ + σ^2/2) Φ[(ln50000 - μ - σ^2)/σ] + 50000{1 - Φ[(ln50000 - μ)/σ]} =
(19,930) Φ[0.42] + (50000){1 - Φ[1.42]} = (19,930)(0.6628) + (50000)(1 - 0.9222) = 17,100.
The percent of expected losses within the layer 5,000 to 50,000 is:
(E[X ∧ 50,000] - E[X ∧ 5000])/E[X] = (17,100 - 4653)/19,930 = 62.5%.
31.39. C. E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ + x))^(α-1)} = 20000{1 - (40000/(40000 + x))^2}.
E[X ∧ 5000] = 4198. E[X ∧ 50,000] = 16,049. E[X] = θ/(α-1) = 20,000.
The percent of expected losses within the layer 5,000 to 50,000 is:
(E[X ∧ 50,000] - E[X ∧ 5000])/E[X] = (16,049 - 4198)/20,000 = 59.3%.
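These layer percentages can also be checked by integrating the survival function numerically; a minimal Python sketch (scipy assumed) for 31.37 and 31.39:

from math import exp
from scipy.integrate import quad

def layer_percent(survival):
    # integral of S(x) over the layer 5,000 to 50,000, divided by the integral over (0, infinity)
    layer, _ = quad(survival, 5000, 50000)
    total, _ = quad(survival, 0, float("inf"))
    return layer / total

print(layer_percent(lambda x: exp(-x / 20000)))            # 31.37: about 69.7%
print(layer_percent(lambda x: (40000 / (40000 + x))**3))   # 31.39: about 59.3%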


31.40. A. E[(N - 3)+] = E[N] - E[N ∧ 3] = β - (Prob[N = 1] + 2 Prob[N = 2] + 3 Prob[N ≥ 3]) =
β - β/(1+β)^2 - 2β^2/(1+β)^3 - 3β^3/(1+β)^3 = {β(1+β)^3 - β(1+β) - 2β^2 - 3β^3}/(1+β)^3 = β^4/(1+β)^3
= 2.5^4/3.5^3 = 0.911.
Alternately, E[(N-3)+] = E[(3-N)+] + E[N] - 3 = 3 Prob[N = 0] + 2 Prob[N = 1] + Prob[N = 2] + β - 3 =
β + 3/(1+β) + 2β/(1+β)^2 + β^2/(1+β)^3 - 3 = 2.5 + 3/3.5 + (2)(2.5)/3.5^2 + 2.5^2/3.5^3 - 3 = 0.911.
Alternately, the Geometric shares the memoryless property of the Exponential:
E[(N-3)+]/Prob[N ≥ 3] = E[N] = β. E[(N-3)+] = β Prob[N ≥ 3] = β β^3/(1+β)^3 = β^4/(1+β)^3 = 0.911.
Comment: For integral j, for the Geometric, E[(N - j)+] = β^(j+1)/(1+β)^j.
31.41. E. For Policy A the average payment per loss is: E[X] - E[X ∧ 3000] =
θ/(α-1) - {θ/(α-1)}{1 - (θ/(θ+3000))^(α-1)} = 6000(12/15)^2 = 6000(0.64).
For Policy B the average payment per loss is: E[X ∧ u] = {θ/(α-1)}{1 - (θ/(θ+u))^(α-1)} =
6000{1 - (12000/(12000+u))^2}. Setting this equal to 6000(0.64):
6000(0.64) = 6000{1 - (12000/(12000+u))^2}. ⇒ (12000/(12000+u))^2 = 0.36. ⇒ u = 8000.
31.42. B. E[(25 - X)+] = (25 - 5)(80%) + (0)(20%) = 16 > 8. Thus y must be less than 25.
Therefore, E[(y - X)+] = (0.8)(y - 5) = 8. ⇒ y = 15.
31.43. D. E[Y] = E[(1 - X)+] = 1 - E[X ∧ 1] = 1 - θ(1 - e^(-1/θ)) = 1 - 2(1 - e^(-1/2)) = 0.213.
Alternately, E[Y] = ∫_0^1 (1 - x) e^(-x/2)/2 dx = [-e^(-x/2) + x e^(-x/2) + 2e^(-x/2)] from x = 0 to x = 1
= 2e^(-1/2) - 1 = 0.213.

31.44. D. Since by definition E[R] = 1, the LogNormal Distribution has mean of 1.
exp[μ + σ^2/2] = 1. ⇒ μ = -σ^2/2 = -0.08.
Percentage of overall wages earned by workers with R < 2 is:
{E[X ∧ 2] - 2S(2)}/E[X] = Φ[(ln2 - μ - σ^2)/σ] = Φ[(ln2 + 0.08 - 0.4^2)/0.4] = Φ[1.53] = 93.7%.
Comment: Such wage tables are used to price the impact of changes in the laws governing
Workers Compensation Benefits.


31.45. B. Each loss below 50 is counted as its size, while each of the 19 losses ≥ 50 counts as 50.
E[X ∧ 50] =
{6 + 7 + 11 + 14 + 15 + 17 + 18 + 19 + 25 + 29 + 30 + 34 + 40 + 41 + 48 + 49 + (19)(50)} / 35 =
(403 + 950)/35 = 38.66.
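The same empirical limited expected value in a couple of lines of Python:

losses = [6, 7, 11, 14, 15, 17, 18, 19, 25, 29, 30, 34, 40, 41, 48, 49, 53, 60,
          63, 78, 85, 103, 124, 140, 192, 198, 227, 330, 361, 421, 514, 546,
          750, 864, 1638]

# Empirical E[X ^ 50]: cap each observed loss at 50, then average.
print(sum(min(x, 50) for x in losses) / len(losses))   # about 38.66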
31.46. C. 100/2% = 5000. If the profit is less than 5000, then Alex gets 100.
Thus we want: E[Max[0.02X, 100]] = 0.02 E[Max[X, 5000]].
E[Max[X, 5000]] = 5000 F(5000) + ∫_5000^∞ x f(x) dx = 5000 F(5000) + ∫_0^∞ x f(x) dx - ∫_0^5000 x f(x) dx
= 5000 F(5000) + E[X] - {E[X ∧ 5000] - 5000 S(5000)} = 5000 + E[X] - E[X ∧ 5000].
0.02 E[Max[X, 5000]] = 100 + 0.02(E[X] - E[X ∧ 5000]).
Alternately, let Y = max[X, 5000]. Y - 5000 = 0 if X ≤ 5000, and Y - 5000 = X - 5000 if X > 5000.
Therefore, E[Y - 5000] = E[(X - 5000)+] = E[X] - E[X ∧ 5000]. ⇒ E[Y] = 5000 + E[X] - E[X ∧ 5000].
Expected value of Alex's pay is: 0.02 E[Y] = 100 + 0.02(E[X] - E[X ∧ 5000]).
Comment: Similar to SOA M, 11/06, Q.20.

31.47. B. F is a Pareto Distribution with α = 3 and θ = 20,000.
E[X] = 20,000/(3 - 1) = 10,000.
E[X ∧ 5000] = (10,000)(1 - {20,000/(20,000 + 5000)}^2) = 3600.
Alex's expected payment is: 100 + 0.02(E[X] - E[X ∧ 5000]) = 100 + (0.02)(10,000 - 3600) = 228.
31.48. C. Γ[3; 5] = 1 - e^(-5)(1 + 5 + 5^2/2) = 0.875. Γ[4; 5] = 1 - e^(-5)(1 + 5 + 5^2/2 + 5^3/6) = 0.735.
For the Gamma Distribution, E[X ∧ 500] = αθ Γ[α+1; 500/θ] + 500 {1 - Γ[α; 500/θ]} =
300 Γ[4; 5] + 500{1 - Γ[3; 5]} = (300)(0.735) + (500)(1 - 0.875) = 283.
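The incomplete Gamma function Γ(α; x) used here is available in scipy as the regularized lower incomplete gamma function, so the answer can be checked numerically:

from scipy.special import gammainc   # regularized incomplete gamma, i.e. Gamma(a; x)

alpha, theta, limit = 3.0, 100.0, 500.0
lev = (alpha * theta * gammainc(alpha + 1, limit / theta)
       + limit * (1.0 - gammainc(alpha, limit / theta)))
print(lev)   # E[X ^ 500], about 283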


31.49. A. For the Single Parameter Pareto, E[X ∧ x] = αθ/(α - 1) - θ^α/{(α - 1) x^(α-1)}.
E[L ∧ 170,000] = (3)(100,000)/(3 - 1) - 100,000^3/{(3 - 1)(170,000)^(3-1)} = 132,699.
E[(170,000 - L)+] = 170,000 - E[L ∧ 170,000] = 170,000 - 132,699 = 37,301.
E[Bonus] = E[(170,000 - L)+/4] = 37,301/4 = 9325.
Comment: Similar to CAS3, 11/05, Q.22.


31.50. His bonus is positive when L < 170,000.
F(170,000) = 1 - (100,000/170,000)^3 = 0.79646.
E[Bonus | Bonus > 0] = E[Bonus] / Prob[Bonus > 0] = E[Bonus] / F(170,000) =
9325 / 0.79646 = 11,708.

31.51. A. E[X ∧ 1000] = ∫_0^1000 x f(x) dx + 1000 S(1000) = 350.61 + (1000)(1 - 0.87175) = 478.86.
Comment: Based on a LogNormal Distribution with μ = 6.0 and σ = 0.8.
31.52. D. For this Pareto Distribution, E[X ∧ x] = (5000/2){1 - 5000^2/(5000 + x)^2}.
E[X ∧ 2000] = 1224. E[X ∧ 10000] = 2222. E[X] = θ/(α-1) = 2500.
The average payment per loss is:
(75%) E[X ∧ 2000] + (90%)(E[X] - E[X ∧ 10000]) = (75%)(1224) + (90%)(2500 - 2222) = 1168.
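A short Python check of this average payment per loss (the helper name lev_pareto is just for illustration):

alpha, theta = 3.0, 5000.0

def lev_pareto(x, alpha, theta):
    # E[X ^ x] for the two-parameter Pareto, alpha not equal to 1
    return theta / (alpha - 1) * (1.0 - (theta / (theta + x))**(alpha - 1))

mean = theta / (alpha - 1)
print(0.75 * lev_pareto(2000, alpha, theta)
      + 0.90 * (mean - lev_pareto(10000, alpha, theta)))   # about 1168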

31.53. E. The mean of the Single Parameter Pareto is: αθ/(α - 1) = (1.5)(20)/(1.5 - 1) = 60.
Thus we want the average size of loss for those losses of size greater than 60.
E[X ∧ x] = αθ/(α - 1) - θ^α/{x^(α-1) (α - 1)}.
E[X ∧ 60] = (1.5)(20)/(1.5 - 1) - 20^1.5/{60^(1.5-1) (1.5 - 1)} = 36.906.
Average size of loss for those losses of size greater than 60 is:
{E[X] - (E[X ∧ 60] - 60 S(60))}/S(60) = (60 - 36.906)/(20/60)^1.5 + 60 = 180.
Taking into account the 50% increase: (1.5)(180) = 270.
Alternately, the average size of those losses of size greater than 60 is:
∫_60^∞ x f(x) dx / S(60) = ∫_60^∞ x (1.5)(20^1.5) x^(-2.5) dx / (20/60)^1.5 = (1.5)(60^1.5) ∫_60^∞ x^(-1.5) dx
= (1.5)(60^1.5)(2)(60^(-0.5)) = 180. Taking into account the 50% increase: (1.5)(180) = 270.
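The conditional mean can also be verified by numerical integration; a minimal Python sketch (scipy assumed):

from scipy.integrate import quad

alpha, theta = 1.5, 20.0
f = lambda x: alpha * theta**alpha / x**(alpha + 1)   # Single Parameter Pareto density, x > theta

overall_mean = alpha * theta / (alpha - 1)            # 60
prob, _ = quad(f, overall_mean, float("inf"))         # S(60)
dollars, _ = quad(lambda x: x * f(x), overall_mean, float("inf"))
print(1.5 * dollars / prob)                           # about 270, including the 50% increase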


31.54. The contribution of the small losses, those losses of size less than x, is: E[X ∧ x] - x S(x).
The percentage of loss dollars from those losses of size less than x is: {E[X ∧ x] - x S(x)}/E[X].
For the LogNormal Distribution, E[X ∧ x] - x S(x) = exp[μ + σ^2/2] Φ[(ln[x] - μ - σ^2)/σ].
Thus for the LogNormal Distribution, G(x) = {E[X ∧ x] - x S(x)}/E[X] = Φ[(ln[x] - μ - σ^2)/σ].
G(x) is also LogNormal with parameters: μ + σ^2, and σ.


31.55. 0.25 = 1 - exp[-Q_0.25/θ]. ⇒ Q_0.25 = θ ln[4/3].
0.75 = 1 - exp[-Q_0.75/θ]. ⇒ Q_0.75 = θ ln[4].
∫ x exp[-x/θ]/θ dx from x = θ ln[4/3] to x = θ ln[4] = [-x exp[-x/θ] - θ exp[-x/θ]] from x = θ ln[4/3] to x = θ ln[4]
= θ ln[4/3] (3/4) + 3θ/4 - θ ln[4]/4 - θ/4
= θ{1/2 + ln[4]/2 - ln[3] (3/4)}.
One half of the total probability is between the first and third quartile.
Trimmed Mean = θ{1/2 + ln[4]/2 - ln[3] (3/4)} / (1/2) = θ{1 + ln[4] - ln[3] (3/2)} = 0.7384 θ.
Alternately, E[X ∧ x] = θ(1 - e^(-x/θ)).
E[X ∧ Q_0.25] = 0.25 θ. E[X ∧ Q_0.75] = 0.75 θ.
The average size of those losses of size between Q_0.25 and Q_0.75 is:
[{E[X ∧ Q_0.75] - Q_0.75 S(Q_0.75)} - {E[X ∧ Q_0.25] - Q_0.25 S(Q_0.25)}] / [F(Q_0.75) - F(Q_0.25)] =
[{0.75θ - (θ ln[4])(0.25)} - {0.25θ - (θ ln[4/3])(0.75)}] / [0.75 - 0.25] = θ{0.5 + 0.5 ln[4] - 0.75 ln[3]} / (1/2)
= θ{1 + ln[4] - ln[3] (3/2)} = 0.7384 θ.
Comment: Here the trimmed mean excludes 25% probability in each tail.
One could instead, for example, exclude 10% probability in each tail.
The trimmed mean could be applied to a small set of data in order to estimate the mean of the
distribution from which the data was drawn. For a symmetric distribution such as a Normal
Distribution, the trimmed mean would be an unbiased estimator of the mean. If instead you
assumed the data was from a skewed distribution such as an Exponential, then the trimmed mean
would be a biased estimator of the mean. If the data was drawn from an Exponential, then the
trimmed mean divided by 0.7384 would be an unbiased estimator of the mean.
The trimmed mean would be a robust estimator; it would not be significantly affected by unusual
values in the sample. In contrast, the sample mean can be significantly affected by one unusually
large value in the sample.

31.56. For α = 1, S(x) = θ/(θ + x).
E[X ∧ x] = ∫_0^x S(t) dt = ∫_0^x θ/(θ + t) dt = θ ln(θ + t) evaluated from t = 0 to t = x
= θ{ln(θ + x) - ln(θ)} = -θ ln[θ/(θ + x)].
Comment: The mean only exists if α > 1. However, since the values entering its computation are
limited, the limited expected value exists as long as α > 0.
31.57. D. (S - 100)+ - (S - 150)+ is the amount in the layer from 100 to 150 on the index.
This is 1/(100 million) times the layer from 10 billion to 15 billion on catastrophe losses.
(100 million times 150 is 15 billion.)
Thus the payment on the spread is 1/500,000 times the layer from 10 billion to 15 billion on
catastrophe losses.
For the LogNormal, E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln(x) - μ - σ^2)/σ] + x {1 - Φ[(ln(x) - μ)/σ]}.
E[X ∧ 10 billion] =
exp[20 + 2^2/2] Φ[(ln[10 billion] - 20 - 2^2)/2] + (10 billion){1 - Φ[(ln[10 billion] - 20)/2]} =
(3.5849 billion) Φ[-0.49] + (10 billion){1 - Φ[1.51]} =
(3.5849 billion)(0.3121) + (10 billion)(1 - 0.9345) = 1.774 billion.
E[X ∧ 15 billion] =
exp[20 + 2^2/2] Φ[(ln[15 billion] - 20 - 2^2)/2] + (15 billion){1 - Φ[(ln[15 billion] - 20)/2]} =
(3.5849 billion) Φ[-0.28] + (15 billion){1 - Φ[1.72]} =
(3.5849 billion)(0.3897) + (15 billion)(1 - 0.9573) = 2.038 billion.
{E[X ∧ 15 billion] - E[X ∧ 10 billion]} / 500,000 = (2.038 billion - 1.774 billion)/500,000 = 528.
Comment: Not intended as a realistic model of catastrophe losses.
Catastrophe losses would be from hurricanes, earthquakes, etc.
An insurer could hedge its catastrophe risk by buying a lot of these or similar call spreads. An insurer
who owned many of these call spreads would be paid money in the event of a lot of catastrophe
losses in this region for the insurance industry. This should offset to some extent the insurer's own
losses due to these catastrophes, in a manner somewhat similar to reinsurance.
528 is the amount expected to be paid by someone who sold one of these calls (in other words
owned a put). The probability of paying anything is low, but this person who sold a call could pay
up to a maximum of: (200)(50) = 10,000.


31.58. The small values each contribute Q_α. Their total contribution is α Q_α.
The large values each contribute Q_{1-α}. Their total contribution is α Q_{1-α}.
The medium values each contribute their value x.
Their total contribution is: ∫ x f(x) dx from Q_α to Q_{1-α} =
E[X ∧ Q_{1-α}] - Q_{1-α} S(Q_{1-α}) - {E[X ∧ Q_α] - Q_α S(Q_α)} =
E[X ∧ Q_{1-α}] - α Q_{1-α} - {E[X ∧ Q_α] - (1-α) Q_α}.
Thus adding up the three contributions, the Windsorized mean is:
α Q_α + α Q_{1-α} + E[X ∧ Q_{1-α}] - α Q_{1-α} - {E[X ∧ Q_α] - (1-α) Q_α} =
E[X ∧ Q_{1-α}] - E[X ∧ Q_α] + Q_α.
For the Exponential, Q_α = -θ ln(1 - α). Q_{1-α} = -θ ln(α). E[X ∧ x] = θ(1 - e^(-x/θ)).
E[X ∧ Q_α] = αθ. E[X ∧ Q_{1-α}] = (1-α)θ.
Thus the Windsorized mean is: (1-α)θ - αθ - θ ln(1 - α) = θ{1 - 2α - ln(1-α)}.
Comment: The trimmed mean excludes probability in each tail.
In contrast, the Windsorized mean substitutes for extreme values the corresponding quantile.
The Windsorized mean could be applied to a small set of data in order to estimate the mean of the
distribution from which the data was drawn.
For example if α = 10%, then all values below the 10th percentile are replaced by the 10th
percentile, and all values above the 90th percentile are replaced by the 90th percentile, prior to
taking an average. For a symmetric distribution such as a Normal Distribution, the Windsorized mean
would be an unbiased estimator of the mean. If instead you assumed the data was from a skewed
distribution such as an Exponential, then the Windsorized mean would be a biased estimator of the
mean.
The Windsorized mean would be a robust estimator; it would not be significantly affected by
unusual values in the sample. In contrast, the sample mean can be significantly affected by one
unusually large value in the sample.
For the Exponential, here is a graph of the Windsorized mean divided by the mean, in other words
the Windsorized mean for θ = 1, as a function of alpha:

[Graph: the Windsorized mean (for θ = 1) as a function of alpha, declining from 1.0 at alpha = 0
toward about 0.69 at alpha = 0.5.]
As alpha increases, we are substituting for more of the values in the tails.

2013-4-2,

Loss Distributions, 31 Limited Expected Values

HCM 10/8/12,

Page 459

31.59. Using the formula for VaR for the Pareto, π_p = θ{(1-p)^(-1/α) - 1}.
π_0.05 = θ{0.95^(-1/α) - 1}. π_0.95 = θ{0.05^(-1/α) - 1}.
E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ + x))^(α-1)}, α ≠ 1.
E[X ∧ π_0.05] = {θ/(α-1)}{1 - 0.95^(1 - 1/α)}. E[X ∧ π_0.95] = {θ/(α-1)}{1 - 0.05^(1 - 1/α)}.
The trimmed mean, the average size of those losses of size between π_0.05 and π_0.95, is:
[{E[X ∧ π_0.95] - π_0.95 S(π_0.95)} - {E[X ∧ π_0.05] - π_0.05 S(π_0.05)}] / [F(π_0.95) - F(π_0.05)] =
[{θ/(α-1)}(0.95^(1-1/α) - 0.05^(1-1/α)) + 0.95 π_0.05 - 0.05 π_0.95] / 0.9 =
θ[{1/(α-1)}(0.95^(1-1/α) - 0.05^(1-1/α)) + 0.95^(1-1/α) - 0.05^(1-1/α) - 0.9] / 0.9 =
θ{α (0.95^(1-1/α) - 0.05^(1-1/α)) / [(0.9)(α-1)] - 1}, α ≠ 1.
For α = 1, E[X ∧ x] = -θ ln[θ/(θ+x)].
π_0.05 = θ{0.95^(-1) - 1} = θ/19. π_0.95 = θ{0.05^(-1) - 1} = 19θ.
E[X ∧ π_0.05] = θ ln(20/19). E[X ∧ π_0.95] = θ ln(20).
Therefore, the trimmed mean is: θ{ln(20) - ln(20/19) + 0.95/19 - (0.05)(19)} / 0.9 = 2.2716 θ.
Comment: Here the trimmed mean excludes 5% probability in each tail.
One could instead for example exclude 10% probability in each tail.
Even though we have excluded an equal probability in each tail, for the positively skewed Pareto
Distribution, the trimmed mean is less than the mean.
As α approaches 1, the mean approaches infinity, while the trimmed mean approaches 2.2716 θ.
Here is a graph of the ratio of the trimmed mean to the mean:


[Graph: the ratio of the trimmed mean to the mean, plotted against alpha from about 2 to 10;
the ratio increases with alpha.]
For example, for α = 3, the trimmed mean is 0.384436 θ, while the mean is θ/2;
for α = 3 the ratio of the trimmed mean to the mean is 0.768872.
31.60. C. For a loss of size x, the insurer pays 0 if x < 10, and x - 10 if 100 ≥ x ≥ 10.
(There are no losses greater than 100.) The average payment, excluding from the average small
losses on which the insurer makes no payment, is:
∫_10^100 (x-10) f(x) dx / ∫_10^100 f(x) dx = ∫_10^100 (x-10)(0.015 - 0.0001x) dx / ∫_10^100 (0.015 - 0.0001x) dx
= 32.4 / 0.855 = 37.9.
Alternately, S(10) = ∫_10^100 f(x) dx = ∫_10^100 (0.015 - 0.0001x) dx = 0.855.
E[X] = ∫_0^100 x f(x) dx = ∫_0^100 x(0.015 - 0.0001x) dx = 41.67.
E[X ∧ 10] = ∫_0^10 x f(x) dx + 10 S(10) = ∫_0^10 x(0.015 - 0.0001x) dx + 10 S(10) = 0.72 + 8.55 = 9.27.
Average payment per payment is: (E[X] - E[X ∧ 10])/S(10) = (41.67 - 9.27)/0.855 = 37.9.


31.61. C. e(x) = (losses excess of x)/(claims excess of x) = (E[X] - E[X ∧ x]) / S(x).
Therefore, E[X ∧ x] = E[X] - e(x){1 - F(x)}.
31.62. D. 1. False (not true). For small samples, either of the two methods may have smaller
variance. For large samples, the method of maximum likelihood has the smallest variance. 2. True.
This is the definition of the Limited Expected Value. 3. True.
31.63. E. Expected amount paid per loss = ∫_100^∞ x f(x) dx = ∫_0^∞ x f(x) dx - ∫_0^100 x f(x) dx =
Mean - {E[X ∧ 100] - 100(1 - F(100))}. 1 - F(100) = (θ/(θ+100))^2 = (1000/1100)^2 = 0.8264.
E[X ∧ 100] = {θ/(α-1)}{1 - (θ/(θ+100))^(α-1)} = {1000/(2-1)}{1 - (1000/1100)^(2-1)} = 90.90.
Mean = θ/(α-1) = 1000. Therefore, Expected amount paid per loss =
1000 - {90.90 - 82.64} = 991.74. Expect 10 losses per year, so the average cost per year is:
(10)(991.74) = $9917. Alternately, the expected cost per year of 10 losses is:
10 ∫_100^∞ x f(x) dx = (10)(2)(1000^2) ∫_100^∞ x (1000+x)^(-3) dx =
10^7 [-x (1000+x)^(-2)] from x = 100 to x = ∞ + 10^7 ∫_100^∞ (1000+x)^(-2) dx = 10^7 {100/1100^2 + 1/1100} = 9917.
Alternately, the average severity per loss > $100 is: 100 + e(100) = 100 + (θ+100)/(α-1) =
100 + 1100 = $1200. Expected number of losses > $100 = 10(1 - F(100)) = 8.2645.
Expected annual payment = $1200(8.2645) = $9917.
Comment: This is the franchise deductible.
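A numerical check of the expected annual payments (scipy assumed):

from scipy.integrate import quad

alpha, theta = 2.0, 1000.0
f = lambda x: alpha * theta**alpha / (theta + x)**(alpha + 1)   # Pareto density

per_loss, _ = quad(lambda x: x * f(x), 100, float("inf"))       # full loss paid when the loss exceeds 100
print(10 * per_loss)                                            # expected annual payments, about 9917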


31.64. C. For the LogNormal: E[X ∧ x] = exp(μ + σ^2/2) Φ[(lnx - μ - σ^2)/σ] + x {1 - Φ[(lnx - μ)/σ]}.
E[X ∧ 100,000] = exp(10 + 1^2/2) Φ[(ln100000 - 10 - 1^2)/1] + 100000{1 - Φ[(ln100000 - 10)/1]}
= e^10.5 Φ(0.51) + 100000(1 - Φ(1.51)) = 36,316(0.6950) + 100,000(1 - 0.9345) = 31,790.
E[X ∧ x] - x S(x) = exp(μ + σ^2/2) Φ[(lnx - μ - σ^2)/σ]. E[X ∧ 50,000] - 50,000 S(50000) =
e^10.5 Φ[(ln50000 - 10 - 1^2)/1] = 36,316 Φ(-0.18) = 36,316(0.4286) = 15,565.
Without the feature that the insurer pays the entire loss (up to $100,000) for each loss greater than
$50,000, the insurer would pay the layer from 50,000 to 100,000, which is
E[X ∧ 100,000] - E[X ∧ 50,000]. That extra provision adds 50,000 per large loss, or
50,000 S(50000). Thus the insurer pays: 50,000 S(50000) + E[X ∧ 100,000] - E[X ∧ 50,000] =
E[X ∧ 100,000] - {E[X ∧ 50,000] - 50,000 S(50000)} = 31,790 - 15,565 = 16,225.
Alternately, the insurer pays for all dollars of loss in the layer less than $100,000, except it pays
nothing for losses of size less than $50,000. The former is: E[X ∧ 100,000];
the latter is: E[X ∧ 50,000] - 50,000 S(50000). Thus the insurer pays:
E[X ∧ 100,000] - {E[X ∧ 50,000] - 50,000 S(50000)}. Proceed as above.
Alternately, the insurer pays all dollars for losses greater than $50,000, except it pays nothing in the
layer above $100,000. The former is:
E[X] - {E[X ∧ 50,000] - 50,000 S(50000)}; the latter is: E[X] - E[X ∧ 100,000].
Thus subtracting the two values the insurer pays:
E[X ∧ 100,000] - {E[X ∧ 50,000] - 50,000 S(50000)}. Proceed as above.
Alternately, the insurer pays all dollars for losses greater than $50,000 and less than $100,000, and
pays $100,000 per loss greater than $100,000. The former is:
{E[X ∧ 100,000] - 100,000(1 - F(100000))} - {E[X ∧ 50,000] - 50,000 S(50000)}; the latter is:
100,000 S(100,000). Thus adding the two contributions the insurer pays:
E[X ∧ 100,000] - {E[X ∧ 50,000] - 50,000 S(50000)}. Proceed as above.


31.65. E. A loss ratio of 70% corresponds to (0.7)(500000) = $350,000 in losses.
If the losses are x, and x < 350,000, then the agent gets a bonus of (1/3)(350,000 - x).
On the other hand, if x ≥ 350,000, then the bonus is zero.
Therefore the expected bonus is:
(1/3) ∫_0^350000 (350000 - x) f(x) dx = (1/3)(350000) ∫_0^350000 f(x) dx - (1/3) ∫_0^350000 x f(x) dx =
(1/3)(350000) F(350000) - (1/3){E[X ∧ 350000] - 350000 S(350000)} =
(1/3){350000 - E[X ∧ 350000]}.
The distribution of losses is Pareto with α = 3 and θ = 600,000. Therefore,
E[X ∧ 350000] = {θ/(α-1)}{1 - (θ/(θ + 350000))^(α-1)} = (600000/2)(1 - (600/950)^2) = 180,332.
Therefore, the expected bonus is: (1/3)(350000 - 180,332) = 56,556.
Alternately, the expected amount by which losses are less than y is: y - E[L ∧ y].
Therefore, expected bonus = (1/3)(expected amount by which losses are less than 350000) =
E[(350000 - L)+]/3 = (1/3)(350000 - E[L ∧ 350000]). Proceed as before.
Alternately, his losses must be less than 350,000 to receive a bonus.
S(350,000) = (600/(350 + 600))^3 = 0.25193 = probability that he receives no bonus.
The mean of the "small" losses (< 350,000) = {E[L ∧ 350000] - 350000 S(350000)}/F(350000)
= (180,332 - (350,000)(0.25193))/(1 - 0.25193) = 123,192.
123,192 / 500,000 = 24.638%, is the expected loss ratio when he gets a bonus.
Therefore, the expected bonus when he gets a bonus is: 500,000(70% - 24.638%)/3 = 75,603.
His expected overall bonus is: (1 - 0.25193)(75,603) + (0.25193)(0) = 56,556.
Comment: Note that since if x ≥ 350,000 the bonus is zero, we only integrate from zero to
350,000. Therefore, it is not the case that E[Bonus] = (1/3)(350,000 - E[X]).
31.66. C. Let total dollars of claims be A. Let B = the Bonus.
Then B = (500 − A)/2 if A < 500, and 0 if A ≥ 500. Let y = A if A < 500, and 500 if A ≥ 500.
Then E[y] = E[A ∧ 500]. 2B + y = 500, regardless of A. Therefore 2E[B] + E[y] = 500.
Therefore E[B] = (500 − E[A ∧ 500])/2 = 250 − E[A ∧ 500]/2.
For the Pareto Distribution, E[X] = θ/(α−1), and E[X ∧ x] = {θ/(α−1)}{1 − (θ/(x+θ))^(α−1)}.
For the revised model, E[A ∧ 500] = K{1 − K/(500+K)} = 500K / (500 + K).
Thus for the revised model, E[B] = 250 − 250K/(500 + K) = 125,000/(500 + K).
Expected aggregate claims under the revised model are: K/(2−1) = K.
Expected aggregate claims under the previous model are: 500/(2−1) = 500.
So we are given that: K + 125,000/(500 + K) = 500.
⇒ 500K + K² + 125,000 = 250,000 + 500K. ⇒ K² = 125,000. ⇒ K = 353.
Comment: The expected amount by which claims are less than 500 is:
E[(500 − A)+] = 500 − E[A ∧ 500].

31.67. E. A loss ratio of 60% corresponds to: (60%)(800,000) = 480,000 in losses.
For the Pareto distribution, E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ+x))^(α−1)}.
E[X ∧ 480000] = {500000/(2−1)}{1 − (500000/(480000 + 500000))^(2−1)} = 244,898.
If losses are less than 480,000, a bonus is paid.
Bonus = (15%)(amount by which losses are less than 480,000).
Expected bonus = (15%)E[(480000 − L)+] =
(15%){480,000 − E[L ∧ 480000]} = (15%)(480000 − 244898) = 35,265.

31.68. B. From the previous solution, his expected bonus is: 35,265.
He gets a bonus when the aggregate losses are less than 480,000.
The probability of this is: F(480,000) = 1 - {500/(500 + 480)}2 = .73969.
Expected value of Hunt's bonus, given that Hunt receives a (positive) bonus, is:
35,265/.73969 = 47,675.
Comment: This question asks about an analog to the expected payment per
(non-zero) payment, while the exam question asks about an analog to the expected payment per
loss. In this question we only consider situations where the bonus is positive, while the exam
question includes those situations where the bonus is zero.
31.69. E. The expected losses within the layer 1,000 to 10,000 are:
∫_1000^10000 S(x) dx = ∫_1000^10000 10^6/(x + 10³)² dx = −10^6/(x + 10³)]_(x=1000)^(x=10000)
= 10^6(1/2000 − 1/11000) = 1000(1/2 − 1/11).
E[X] = ∫_0^∞ S(x) dx = ∫_0^∞ 10^6/(x + 10³)² dx = −10^6/(x + 10³)]_(x=0)^(x=∞) = 1000.
Therefore the percent of expected losses within the layer 1,000 to 10,000 is:
1000(1/2 − 1/11)/1000 = 1/2 − 1/11 = 40.9%.
Alternately, this is a Pareto Distribution with α = 2 and θ = 1000.
E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ + x))^(α−1)} = 1000{1 − 1000/(1000 + x)} = 1000x/(1000 + x).
E[X ∧ 1000] = 500. E[X ∧ 10,000] = 909. E[X] = θ/(α−1) = 1000.
The percent of expected losses within the layer 1,000 to 10,000 is:
(E[X ∧ 10,000] − E[X ∧ 1000])/E[X] = (909 − 500)/1000 = 40.9%.
31.70. A. E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ+x))^(α−1)}. E[X ∧ 400] = 300(1 − 3/7) = 171.43.
100 = E[B] = c(400 − E[X ∧ 400]) = c(400 − 171.43) = 228.57c. ⇒ c = 100/228.57 = 0.4375.

31.71. C. At 5100 in loss, the insured pays: 250 + (25%)(2250 − 250) + (5100 − 2250) = 3600.
For annual losses > 5100, the insured pays 5% of the amount > 5100.
The insurer pays: 75% of the layer from 250 to 2250, 0% of the layer from 2250 to 5100,
and 95% of the layer from 5100 to ∞.
E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ+x))^(α−1)} = (2000){1 − 2000/(2000 + x)} = 2000x/(2000 + x).
E[X ∧ 250] = (2000)(250)/(2000 + 250) = 222.
E[X ∧ 2250] = (2000)(2250)/(2000 + 2250) = 1059.
E[X ∧ 5100] = (2000)(5100)/(2000 + 5100) = 1437.
E[X] = θ/(α−1) = 2000/(2 − 1) = 2000.
The expected annual plan payment is:
(75%)(E[X ∧ 2250] − E[X ∧ 250]) + (95%)(E[X] − E[X ∧ 5100]) =
(75%)(1059 − 222) + (95%)(2000 − 1437) = 1163.
Comment: Provisions are similar to those in the 2006 Medicare Prescription Drug Program.
Here is a detailed breakdown of the layers of loss:

Layer           Expected Losses in Layer    Insured Share    Insurer Share
5100 to ∞       563                         5%               95%
2250 to 5100    378                         100%             0%
250 to 2250     837                         25%              75%
0 to 250        222                         100%             0%
Total           2000

E[X] − E[X ∧ 5100] = 2000 − 1437 = 563.
E[X ∧ 5100] − E[X ∧ 2250] = 1437 − 1059 = 378.
E[X ∧ 2250] − E[X ∧ 250] = 1059 − 222 = 837. E[X ∧ 250] = 222.
For example, for an annual loss of 1000, the insured pays: 250 + (25%)(1000 − 250) = 437.5,
and the insurer pays: (75%)(1000 − 250) = 562.5.
For an annual loss of 4000, the insured pays: 250 + (25%)(2250 − 250) + (4000 − 2250) = 2500, and
the insurer pays: (75%)(2250 − 250) = 1500.
For an annual loss of 8000, the insured pays: 250 + (25%)(2250 − 250) + (5100 − 2250) +
(5%)(8000 − 5100) = 3745, and the insurer pays: (75%)(2250 − 250) + (95%)(8000 − 5100) = 4255.
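A quick Python check of this expected annual plan payment (a sketch only; the helper name is mine):

theta = 2000.0  # Pareto with alpha = 2, so E[X ∧ x] = theta x/(theta + x) and E[X] = theta

def lev(x):
    return theta * x / (theta + x)

payment = 0.75 * (lev(2250) - lev(250)) + 0.95 * (theta - lev(5100))
print(round(payment))  # about 1163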
31.72. C. For this Pareto,
E[L ∧ 650000] = {600000/(3 − 1)}{1 − (600000/(650000 + 600000))²} = 230,880.
E[(650,000 − L)+] = 650,000 − E[L ∧ 650000] = 650,000 − 230,880 = 419,120.
E[Bonus] = E[(650,000 − L)+/3] = 419,120/3 = 139,707.

31.73. C. A constant force of mortality corresponds to an Exponential Distribution.
Variance = θ² = 100. ⇒ θ = 10.
E[T ∧ 10] = θ(1 − e^(−10/θ)) = (10)(1 − e^(−1)) = 6.32.

31.74. B. Teller X completes on average 6 transactions per hour, while Teller Y completes on
average 4 transactions per hour. (6)(6 + 4) = 60 transactions by tellers expected in total.
1/3 of all transactions are deposits, and therefore we expect 20 deposits.
Expected number of deposits handled by tellers: 20 F(7500).
Average size of those deposits of size less than 7500 is:
{E[X ∧ 7500] − 7500 S(7500)}/F(7500).
Expected total deposits made through the tellers each day:
20{E[X ∧ 7500] − 7500 S(7500)} = 20{(5000/2)(1 − (5/12.5)²) − 7500(5/12.5)³} =
(20){2100 − (7500)(0.064)} = 32,400.
Comment: While the above is the intended solution of the CAS, it is not what I would have done to
solve this poorly worded exam question.
Let y be the total number of deposits expected per day.
Then we expect S(7500)y deposits to be handled by the manager, and F(7500)y deposits to be
handled by the tellers. Expect 60 − F(7500)y non-deposits to be handled by the tellers.
1/3 of all transactions are deposits, presumably including those handled by the manager.
{60 + S(7500)y}/3 = y. ⇒ y = 60/{3 − S(7500)}.
Expected number of deposits handled by tellers: F(7500)y = 60 F(7500)/{3 − S(7500)}.
Multiply by the average size of those deposits of size less than 7500:
(60 F(7500)/{3 − S(7500)}) {E[X ∧ 7500] − 7500 S(7500)}/F(7500) =
60{E[X ∧ 7500] − 7500 S(7500)}/(3 − S(7500)) = (60){2100 − (7500)(0.064)}/(3 − 0.064) = 33,106.
This results in a different answer than the intended solution.

31.75. B. 3%/75% = 4%. If the index return is less than 4%, then the depositor gets 3%.
Thus we want: E[Max[0.75X, 3%]] = 75% E[Max[X, 4%]].
E[Max[X, 4%]] = 4F(4) + ∫_4^∞ x f(x) dx = 4F(4) + ∫_−∞^∞ x f(x) dx − ∫_−∞^4 x f(x) dx
= 4F(4) + E[X] − {E[X ∧ 4] − 4S(4)} = 4 + E[X] − E[X ∧ 4] = 4 + 8 − (−0.58) = 12.58.
75% E[Max[X, 4%]] = (75%)(12.58%) = 9.43%.
Alternately, let Y = Max[X, 4]. Then Y − 4 = 0 if X ≤ 4, and Y − 4 = X − 4 if X > 4.
Therefore, E[Y − 4] = E[(X − 4)+] = E[X] − E[X ∧ 4].
E[Y] = 4 + E[X] − E[X ∧ 4] = 4 + 8 − (−0.58) = 12.58. (75%)(12.58%) = 9.43%.
Alternately, as discussed in Mahler's Guide to Risk Measures, for the Normal Distribution:
TVaR_p[X] = μ + σ φ[z_p] / (1 − p). We are interested in the tail value at risk for a 4% interest rate.
For the Normal with mean 8% and standard deviation 16%, 4% corresponds to:
z_p = (4% − 8%) / 16% = −0.25. p = Φ(−0.25) = 1 − 0.5987 = 0.4013.
Therefore, TVaR = 0.08 + 0.16 {exp[−(−0.25)²/2] / √(2π)} / 0.5987 = 0.1833.
Now 40.13% of the time the return on the equity index is less than 4%, while the remaining 59.87%
of the time the return is greater than 4%.
Therefore, the expected annual crediting rate is:
(75%) {(40.13%)(4%) + (59.87%)(0.1833)} = 9.43%.
Given the table of limited expected values, this alternate solution is harder.
Comment: In general, Min[X, 4] + Max[X, 4] = X + 4.
Therefore, E[Max[X, 4]] = E[X] + 4 − E[X ∧ 4].
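The first solution can also be reproduced directly; this is only a Python sketch (the limited expected value of a Normal is computed from E[X ∧ d] = μΦ(z) − σφ(z) + d{1 − Φ(z)} with z = (d − μ)/σ, rather than read from the table given in the question):

from math import exp, sqrt, pi, erf

mu, sigma, d = 8.0, 16.0, 4.0
z = (d - mu) / sigma
Phi = 0.5 * (1 + erf(z / sqrt(2)))       # P[X <= 4]
phi = exp(-z * z / 2) / sqrt(2 * pi)     # standard normal density at z
lev4 = mu * Phi - sigma * phi + d * (1 - Phi)   # E[X ∧ 4] for the Normal
crediting = 0.75 * (d + mu - lev4)              # 75% E[Max[X, 4]]
print(round(lev4, 2), round(crediting, 2))      # about -0.58 and 9.44 (9.43 above, using the rounded -0.58)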

31.76. C. Let X be such that Michael has just paid 10,000 in out-of-pocket repair costs:
10000 = 1000 + (20%)(6000 − 1000) + (X − 6000). ⇒ X = 14,000.
Thus the insurance pays 80% of the layer from 1000 to 6000, plus 90% of the layer above 14,000.
For this Pareto Distribution, E[X ∧ x] = 5000{1 − 5000/(5000 + x)} = 5000x/(5000 + x).
E[X ∧ 1000] = 833. E[X ∧ 6000] = 2727. E[X ∧ 14000] = 3684. E[X] = θ/(α−1) = 5000.
Expected annual payment by the insurer is:
80%(E[X ∧ 6000] − E[X ∧ 1000]) + 90%(E[X] − E[X ∧ 14000]) =
80%(2727 − 833) + 90%(5000 − 3684) = 2700.
Comment: Similar to SOA3, 11/04, Q.7.
Here is a detailed breakdown of the layers of loss:

Layer             Expected Losses in Layer    Michael's Share    Insurer Share
14,000 to ∞       5000 − 3684 = 1316          10%                90%
6000 to 14,000    3684 − 2727 = 957           100%               0%
1000 to 6000      2727 − 833 = 1894           20%                80%
0 to 1000         833                         100%               0%
Total             5000

Section 32, Limited Higher Moments

One can get limited higher moments in a manner parallel to the limited expected value.
Just as the limited expected value at u, E[X ∧ u], is the first moment of data limited to u,
the limited second moment, E[(X ∧ u)²], is the second moment of the data limited to u.
First limit the losses, then square, then take the expected value.

Exercise: Prob[X = 2] = 70%, and Prob[X = 9] = 30%. Determine E[X ∧ 5] and E[(X ∧ 5)²].
[Solution: E[X ∧ 5] = (70%)(2) + (30%)(5) = 2.9. E[(X ∧ 5)²] = (70%)(2²) + (30%)(5²) = 10.3.
Comment: Var[X ∧ 5] = 10.3 − 2.9² = 1.89.]

As with the limited expected value, one can write the limited second moment as a contribution of
small losses plus a contribution of large losses:
E[(X ∧ u)²] = ∫_0^u t² f(t) dt + S(u) u².
The losses of size larger than u each contribute u², while the losses of size u or less each contribute
their size squared. E[(X ∧ u)²] can be computed by integration in the same manner as the moments
and Limited Expected Values. As shown in Appendix A attached to the exam, here are the
formulas for the limited higher moments for some distributions:219

Distribution                E[(X ∧ x)^n]
Exponential                 n! θ^n Γ(n+1; x/θ) + x^n e^(−x/θ)
Pareto                      {n! θ^n Γ(α−n) / Γ(α)} β[n+1, α−n; x/(θ+x)] + x^n {θ/(θ+x)}^α
Gamma                       {θ^n Γ(α+n) Γ(α+n; x/θ) / Γ(α)} + x^n {1 − Γ(α; x/θ)}
LogNormal                   exp[nμ + n²σ²/2] Φ[(ln(x) − μ − nσ²)/σ] + x^n {1 − Φ[(ln(x) − μ)/σ]}
Weibull                     θ^n Γ(1 + n/τ) Γ(1 + n/τ; (x/θ)^τ) + x^n exp[−(x/θ)^τ]
Single Parameter Pareto     αθ^n/(α − n) − nθ^α/{(α − n) x^(α−n)}, x ≥ θ

219 The formula for the limited moments of the Pareto involving Incomplete Beta Functions reduces to the formula
shown subsequently for n = 2. However, it requires integration by parts and a lot of algebraic manipulation.

One obtains the Limited Expected Value by setting n = 1, while one obtains the limited second
moment for n = 2.220

Distribution                E[(X ∧ x)²]
Exponential                 2θ² − 2θ²e^(−x/θ) − 2θxe^(−x/θ)
Single Parameter Pareto     αθ²/(α − 2) − 2θ^α/{(α − 2) x^(α−2)}, x ≥ θ
Pareto                      {2θ²/[(α − 1)(α − 2)]} {1 − (1 + x/θ)^(1−α) (1 + (α−1)x/θ)}
LogNormal                   exp[2μ + 2σ²] Φ[(ln(x) − μ − 2σ²)/σ] + x² {1 − Φ[(ln(x) − μ)/σ]}

Exercise: For a LogNormal Distribution with μ = 7 and σ = 0.5, what is E[(X ∧ 1000)²]?
[Solution: E[(X ∧ 1000)²] = e^14.5 Φ[{ln(1000) − 7.5}/0.5] + 1000²{1 − Φ[{ln(1000) − 7}/0.5]}
= 1,982,759 Φ[−1.184] + 1,000,000{1 − Φ[−0.184]}
= 1,982,759(0.1182) + 1,000,000(1 − 0.4270) = 807,362.]
Generally, E[(X ∧ u)²] is less than E[X²]. For low censorship points u or more skewed distributions
the difference can be quite substantial. For example, in the above exercise,
E[X²] = exp[2μ + 2σ²] = e^14.5 = 1,982,759,
while E[(X ∧ 1000)²] = 807,362.

Gamma Distribution:
For the Gamma Distribution: E[(X ∧ x)²] = θ²α(α+1)Γ(α+2; x/θ) + x²{1 − Γ(α; x/θ)}.
Using Theorem A.1 in Appendix A of Loss Models,
Γ(3; x/θ) = 1 − e^(−x/θ) − (x/θ)e^(−x/θ) − (x/θ)²e^(−x/θ)/2. Also Γ(1; x/θ) = 1 − e^(−x/θ).
Thus for the Exponential, which is a Gamma with α = 1, E[(X ∧ x)²] = 2θ²Γ(3; x/θ) + x²{1 − Γ(1; x/θ)} =
2θ²{1 − e^(−x/θ) − (x/θ)e^(−x/θ) − (x/θ)²e^(−x/θ)/2} + x²e^(−x/θ) = 2θ² − 2θ²e^(−x/θ) − 2θxe^(−x/θ).

220 The limited second moments of an Exponential and Pareto are not shown in Loss Models in these forms, but as
shown below these formulas are correct.

Exercise: For an Exponential Distribution with θ = 10, what is E[(X ∧ 30)²]?
[Solution: E[(X ∧ x)²] = 2θ² − 2θ²e^(−x/θ) − 2θxe^(−x/θ). E[(X ∧ 30)²] = 200 − 200e^(−3) − 600e^(−3) = 160.2.]
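The closed-form results in the last two exercises can be checked by brute-force numerical integration of E[(X ∧ u)²] = ∫_0^u t² f(t) dt + u² S(u); the following Python sketch (a midpoint Riemann sum, adequate here but not exact) is purely illustrative:

from math import exp, log, sqrt, erf, pi

def limited_second_moment(pdf, sf, u, n=200000):
    # E[(X ∧ u)^2] = integral_0^u t^2 f(t) dt + u^2 S(u), via a midpoint sum
    h = u / n
    total = sum(((i + 0.5) * h) ** 2 * pdf((i + 0.5) * h) for i in range(n)) * h
    return total + u * u * sf(u)

# Exponential with theta = 10:
theta = 10.0
print(limited_second_moment(lambda t: exp(-t / theta) / theta,
                            lambda t: exp(-t / theta), 30.0))        # about 160.2

# LogNormal with mu = 7, sigma = 0.5:
mu, sigma = 7.0, 0.5
Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
pdf = lambda t: exp(-((log(t) - mu) / sigma) ** 2 / 2) / (t * sigma * sqrt(2 * pi))
sf = lambda t: 1 - Phi((log(t) - mu) / sigma)
print(limited_second_moment(pdf, sf, 1000.0))                        # about 807,000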
Second Limited Moment in Terms of Other Quantities of Interest:

It is sometimes useful to write E[(X ∧ u)²] in terms of the Survival Function, the Excess Ratio R(x), or
the Loss Elimination Ratio LER(x), as follows. Using integration by parts and the fact that an
antiderivative of f(t) is −S(t):

E[(X ∧ u)²] = ∫_0^u t² f(t) dt + S(u)u² = −t²S(t)]_(t=0)^(t=u) + ∫_0^u 2t S(t) dt + S(u)u² = ∫_0^u 2t S(t) dt.

In particular, for u = ∞, one can write the second moment as twice the integral of the survival function
times x:221

E[X²] = 2 ∫_0^∞ t S(t) dt.

More generally, E[(X ∧ u)^n] = ∫_0^u n t^(n−1) S(t) dt.222

For u = ∞, E[X^n] = ∫_0^∞ n t^(n−1) S(t) dt.

Using integration by parts and the fact that ∫_0^x S(t) dt = E[X] LER(x):

E[(X ∧ u)²] = 2 ∫_0^u t S(t) dt = 2 E[X] {u LER(u) − ∫_0^u LER(t) dt} = 2 E[X] {∫_0^u R(t) dt − u R(u)}.

221 See formula 3.5.3 in Actuarial Mathematics. Recall that the mean can be written as an integral of the survival
function. One can proceed in the same manner to get higher moments in terms of integrals of the survival function
times a power of x.
222 The form shown here is true for distributions with support x > 0. More generally, the nth limited moment is the
sum of an integral from −∞ to 0 of −n t^(n−1) F(t) and an integral from 0 to u of n t^(n−1) S(t). See Equation 3.9 in Loss Models.

So for example, for the Pareto distribution: E[X] = θ/(α−1), and R(x) = {θ/(θ+x)}^(α−1).

∫_0^u R(t) dt = −θ^(α−1)/{(α−2)(θ+t)^(α−2)}]_(t=0)^(t=u) = {θ^(α−1)/(α−2)} {θ^(2−α) − (θ+u)^(2−α)}.

E[(X ∧ u)²] = {2θ^α / [(α−1)(α−2)]} {θ^(2−α) − (θ+u)^(2−α) − (α−2)u(θ+u)^(1−α)}.

E[(X ∧ u)²] = {2θ² / [(α−1)(α−2)]} {1 − (1 + u/θ)^(2−α) − (α−2)(u/θ)(1 + u/θ)^(1−α)}.

E[(X ∧ u)²] = E[X²] {1 − (1 + u/θ)^(1−α) [1 + (α−1)u/θ]}.

Letting u go to infinity, it follows that: E[X²] = 2 E[X] ∫_0^∞ R(x) dx, or ∫_0^∞ R(x) dx = E[X²]/(2 E[X]).

Now the excess ratio R(x) is: R(x) = (expected excess losses)/mean = ∫_x^∞ S(y) dy / E[X].

Therefore, ∫_0^∞ ∫_x^∞ S(y) dy dx = E[X] ∫_0^∞ R(x) dx = E[X] E[X²]/(2 E[X]) = E[X²]/2.


I will use the above result to show that the variance is equal to ∫_0^∞ [∫_x^∞ S(y) dy]² f(x)/S(x)² dx.

Using integration by parts, let u = [∫_x^∞ S(y) dy]² and dv = f(x)/S(x)² dx.
du = 2 [∫_x^∞ S(y) dy] (−S(x)) dx. v = 1/S(x).

Therefore, ∫_0^∞ [∫_x^∞ S(y) dy]² f(x)/S(x)² dx = [∫_x^∞ S(y) dy]² (1/S(x))]_(x=0)^(x=∞) + 2 ∫_0^∞ ∫_x^∞ S(y) dy dx.

Now as was shown previously, E[X²] = 2 ∫_0^∞ t S(t) dt.
Therefore, if there is a finite second moment, ∫_0^∞ t S(t) dt must be finite.
If in the extreme righthand tail S(t) ~ 1/t², then this integral would be infinite.
Therefore, in the extreme righthand tail, S(t) must go down faster than 1/t².

If in the extreme righthand tail S(t) ~ 1/t^(2+ε), then ∫_x^∞ S(y) dy / √S(x) ~ (1/x^(1+ε)) x^(1+ε/2) = 1/x^(ε/2).

Therefore, if in the extreme righthand tail S(t) goes down faster than 1/t²,
then ∫_x^∞ S(y) dy / √S(x) approaches zero as x approaches infinity.

Therefore, as x approaches infinity, [∫_x^∞ S(y) dy]² / S(x) approaches zero.

Thus, ∫_0^∞ [∫_x^∞ S(y) dy]² f(x)/S(x)² dx = −E[X]²/S(0) + 2 E[X²]/2 = E[X²] − E[X]² = Var[X].

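This identity can also be confirmed numerically; the Python sketch below (an assumption on my part: truncating the outer integral at 50 means is accurate enough) uses an Exponential with θ = 8, for which ∫_x^∞ S(y) dy = θe^(−x/θ) and Var[X] = θ² = 64:

from math import exp

theta = 8.0
f = lambda x: exp(-x / theta) / theta
S = lambda x: exp(-x / theta)
tail = lambda x: theta * exp(-x / theta)   # integral_x^inf S(y) dy for the Exponential

n, top = 200000, 400.0
h = top / n
total = sum(tail((i + 0.5) * h) ** 2 * f((i + 0.5) * h) / S((i + 0.5) * h) ** 2
            for i in range(n)) * h
print(round(total, 3), theta ** 2)   # both about 64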
Pareto Distribution:

As discussed previously, ∫_0^∞ R(x) dx = E[X²]/(2 E[X]).223
For example, for the Pareto Distribution, R(x) = {θ/(θ+x)}^(α−1).
∫_0^∞ R(x) dx = θ^(α−1) ∫_0^∞ (θ + x)^(1−α) dx = θ^(α−1) θ^(2−α)/(α − 2) = θ/(α − 2) =
{2θ²/[(α−1)(α−2)]} / {2θ/(α−1)} = E[X²]/(2 E[X]).
As discussed previously, for the Pareto Distribution,
E[(X ∧ x)²] = E[X²] {1 − (1 + x/θ)^(1−α) (1 + (α−1)x/θ)}.224
Exercise: For a Pareto with α = 4 and θ = 1000, compute E[X²], E[(X ∧ 500)²] and E[(X ∧ 5000)²].
[Solution: E[X²] = 2θ²/{(α−1)(α−2)} = 333,333,
E[(X ∧ 500)²] = E[X²] {1 − (1 + 500/θ)^(1−α)(1 + (α−1)500/θ)} = 86,420, and
E[(X ∧ 5000)²] = E[X²] {1 − (1 + 5000/θ)^(1−α)(1 + (α−1)5000/θ)} = 308,642.]
The limited higher moments can also be used to calculate the variance, coefficient of variation, and
skewness of losses subject to a maximum covered loss.
Exercise: For a Pareto Distribution with α = 4 and θ = 1000, and for a maximum covered loss of
5000, compute the variance and coefficient of variation (per single loss).
[Solution: From previous solutions, E[X ∧ 5000] = 331.79 and E[(X ∧ 5000)²] = 308,642.
Thus the variance is: 308,642 − 331.79² = 198,557. Thus the CV is: √198,557 / 331.79 = 1.34.]

223 Where R(x) is the excess ratio, and the distribution has support starting at zero.
224 While this formula was derived above, it is not in the Appendix attached to the exam.
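The Pareto formulas above are easy to code; a short Python sketch (helper names are mine) reproducing the exercises:

alpha, theta = 4.0, 1000.0

def pareto_lev(x):
    # E[X ∧ x] = {theta/(alpha-1)} {1 - (theta/(theta+x))^(alpha-1)}
    return theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

def pareto_lim2(x):
    # E[(X ∧ x)^2] = E[X^2] {1 - (1 + x/theta)^(1-alpha) (1 + (alpha-1) x/theta)}
    ex2 = 2 * theta ** 2 / ((alpha - 1) * (alpha - 2))
    return ex2 * (1 - (1 + x / theta) ** (1 - alpha) * (1 + (alpha - 1) * x / theta))

print(round(2 * theta ** 2 / ((alpha - 1) * (alpha - 2))))   # 333333
print(round(pareto_lim2(500)), round(pareto_lim2(5000)))     # 86420, 308642
mean = pareto_lev(5000)
var = pareto_lim2(5000) - mean ** 2
print(round(var), round(var ** 0.5 / mean, 2))               # about 198557 and 1.34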

Second Moment of a Layer of Loss:

Also, one can use the Limited Second Moment to calculate the second moment of a layer of loss.225
The second moment of the layer from d to u is:226
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d {E[X ∧ u] − E[X ∧ d]}.
Exercise: For a Pareto with α = 4 and θ = 1000, compute the second moment of the layer from 500
to 5000.
[Solution: From the solutions to previous exercises, E[X ∧ 500] = 234.57, E[X ∧ 5000] = 331.79,
E[(X ∧ 500)²] = 86,420, and E[(X ∧ 5000)²] = 308,642.
The second moment of the layer from 500 to 5000 is:
E[(X ∧ 5000)²] − E[(X ∧ 500)²] − 2(500){E[X ∧ 5000] − E[X ∧ 500]} =
308,642 − 86,420 − (1000)(331.79 − 234.57) = 125,002.
Comment: Note this is the second moment per loss, including those losses that do not penetrate
the layer, in the same way that E[X ∧ 5000] − E[X ∧ 500] is the first moment of the layer per loss.]
Derivation of the Second Moment of a Layer of Loss:

For the layer from d to u, the medium size losses contribute x − d, while the large losses contribute
the width of the interval, u − d.
Therefore, the second moment of the layer from d to u is:
∫_d^u (x − d)² f(x) dx + (u − d)² S(u) = ∫_d^u (x² − 2dx + d²) f(x) dx + u²S(u) − 2duS(u) + d²S(u) =
∫_0^u x² f(x) dx + u²S(u) − {∫_0^d x² f(x) dx + d²S(d)} + d²S(d) − 2d ∫_d^u x f(x) dx
+ d²{F(u) − F(d)} − 2duS(u) + d²S(u) =
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d{E[X ∧ u] − uS(u) − (E[X ∧ d] − dS(d))}
+ d²{S(d) + F(u) − F(d)} − 2duS(u) + d²S(u) =
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d{E[X ∧ u] − E[X ∧ d]}
+ d{2uS(u) − 2dS(d) + dS(d) + dF(u) − dF(d) − 2uS(u) + dS(u)} =
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d{E[X ∧ u] − E[X ∧ d]} + d{d(F(u) + S(u)) − d(F(d) + S(d))} =
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d{E[X ∧ u] − E[X ∧ d]} + d(d − d) =
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d{E[X ∧ u] − E[X ∧ d]}.

225 Recall that the expected value of a Layer of Loss is the difference of the Limited Expected Value at the top of the
layer and the Limited Expected Value at the bottom of the layer. For second and higher moments the relationships
are more complicated.
226 See Theorem 8.8 in Loss Models. Here we are referring to the second moment of the per loss variable; similar to
the average payment per loss, we are including those times a small loss contributes zero to the layer.
Variance of a Layer of Loss:
Given the first and second moments of a layer of loss, one can compute the variance and the
coefficient of variation of a layer of loss.
Exercise: For a Pareto with α = 4 and θ = 1000, compute the variance of the losses in the layer from
500 to 5000.
[Solution: From the previous exercise, the second moment is 125,002 and the mean is
E[X ∧ 5000] − E[X ∧ 500] = 331.79 − 234.57 = 97.22.
The variance of the layer from 500 to 5000 is: 125,002 − 97.22² = 115,550.]
Exercise: For a Pareto with α = 4 and θ = 1000, compute the coefficient of variation of the losses in
the layer from 500 to 5000.
[Solution: From the previous exercise, the variance of the layer from 500 to 5000 is:
125,002 − 97.22² = 115,550 and the mean is 97.22. Thus the CV = 115,550^0.5 / 97.22 = 3.5.]
Using the formulas for the first and second moments of a layer of loss,
the variance of the layer of losses from d to u is:
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d {E[X ∧ u] − E[X ∧ d]} − {E[X ∧ u] − E[X ∧ d]}².
Since the average payment per loss under a maximum covered loss of u and a deductible of d is
the layer from d to u, this provides a formula for the variance of the average payment per loss under
a maximum covered loss of u and a deductible of d.
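A self-contained Python check of the layer calculations above (the small differences from 125,002 and 115,550 come from rounding in the hand computation):

alpha, theta = 4.0, 1000.0
lev = lambda x: theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))
lim2 = lambda x: (2 * theta ** 2 / ((alpha - 1) * (alpha - 2))
                  * (1 - (1 + x / theta) ** (1 - alpha) * (1 + (alpha - 1) * x / theta)))

d, u = 500.0, 5000.0
m1 = lev(u) - lev(d)                  # first moment of the layer, per loss
m2 = lim2(u) - lim2(d) - 2 * d * m1   # second moment of the layer, per loss
print(round(m2), round(m2 - m1 ** 2), round((m2 - m1 ** 2) ** 0.5 / m1, 1))  # about 125000, 115548, 3.5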

Exercise: Assume losses are given by a LogNormal Distribution with μ = 8 and σ = 0.7.
An insured has a deductible of 1000, and a maximum covered loss of 10,000.
What is the expected amount paid per loss?
[Solution: For the LogNormal Distribution:
E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x − μ − σ²)/σ] + x {1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 1000] = e^8.245 Φ[{ln(1000) − 8 − 0.49}/0.7] + 1000{1 − Φ[{ln(1000) − 8}/0.7]}
= 3808.54 Φ[−2.260] + 1000{1 − Φ[−1.560]}
= 3808.54(0.0119) + 1000(1 − 0.0594) = 986.
E[X ∧ 10000] = e^8.245 Φ[{ln(10000) − 8 − 0.49}/0.7] + 10,000{1 − Φ[{ln(10000) − 8}/0.7]}
= 3808.54 Φ[1.029] + 10000{1 − Φ[1.729]} = 3808.54(0.8483) + 10000(1 − 0.9581) = 3650.
E[X ∧ 10000] − E[X ∧ 1000] = 3650 − 986 = 2664.]
Exercise: Assume losses are given by a LogNormal Distribution with μ = 8 and σ = 0.7.
An insured has a deductible of 1000, and a maximum covered loss of 10,000.
What is the variance of the amount paid per loss?
[Solution: For the LogNormal Distribution:
E[(X ∧ x)²] = exp[2μ + 2σ²]Φ[{ln(x) − (μ + 2σ²)}/σ] + x²{1 − Φ[{ln(x) − μ}/σ]}.
E[(X ∧ 1000)²] = e^16.98 Φ[{ln(1000) − 8.98}/0.7] + 1000²{1 − Φ[{ln(1000) − 8}/0.7]}
= 23,676,652 Φ[−2.960] + 1,000,000{1 − Φ[−1.560]}
= 23,676,652(0.0015) + 1,000,000(1 − 0.0594) = 976,115.
E[(X ∧ 10000)²] = e^16.98 Φ[{ln(10000) − 8.98}/0.7] + 10000²{1 − Φ[{ln(10000) − 8}/0.7]}
= 23,676,652 Φ[0.329] + 100,000,000{1 − Φ[1.729]}
= 23,676,652(0.6289) + 100,000,000(1 − 0.9581) = 19,080,246.
E[(X ∧ u)²] − E[(X ∧ d)²] − 2d{E[X ∧ u] − E[X ∧ d]} − {E[X ∧ u] − E[X ∧ d]}²
= 19,080,246 − 976,115 − (2000)(2664) − 2664² = 5.68 million.
Comment: It would take too long to perform all of these computations for a single exam question!]
If one has a coinsurance factor of c, then each payment is multiplied by c; therefore the variance is
multiplied by c².
Exercise: Assume losses are given by a LogNormal Distribution with μ = 8 and σ = 0.7.
An insured has a deductible of 1000, a maximum covered loss of 10,000, and a coinsurance factor
of 80%. What is the variance of the amount paid per loss?
[Solution: (0.8²)(5.68 million) = 3.64 million.]
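Since this computation is long by hand, here is a compact Python sketch of it (exact normal CDF values are used, so the result can differ slightly from the table-based figures above):

from math import exp, log, sqrt, erf

mu, sigma, d, u = 8.0, 0.7, 1000.0, 10000.0
Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
lev = lambda x: (exp(mu + sigma ** 2 / 2) * Phi((log(x) - mu - sigma ** 2) / sigma)
                 + x * (1 - Phi((log(x) - mu) / sigma)))
lim2 = lambda x: (exp(2 * mu + 2 * sigma ** 2) * Phi((log(x) - mu - 2 * sigma ** 2) / sigma)
                  + x * x * (1 - Phi((log(x) - mu) / sigma)))

per_loss_mean = lev(u) - lev(d)
per_loss_var = lim2(u) - lim2(d) - 2 * d * per_loss_mean - per_loss_mean ** 2
print(round(per_loss_mean), round(per_loss_var / 1e6, 2))   # about 2664 and 5.68
print(round(0.8 ** 2 * per_loss_var / 1e6, 2))              # about 3.63 with 80% coinsurance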

Variance of Non-Zero Payments:

Exercise: For a Pareto with α = 4 and θ = 1000, compute the average non-zero payment given a
deductible of 500 and a maximum covered loss of 5000.
[Solution: (E[X ∧ 5000] − E[X ∧ 500]) / S(500) = (331.79 − 234.57) / 0.1975 = 492.2.]
One can compute the second moment of the non-zero payments in a manner similar to the second
moment of the payments per loss. Given a deductible of d and a maximum covered loss of u, the
2nd moment of the non-zero payments is:227
∫_d^u (x − d)² {f(x)/S(d)} dx + (u − d)² S(u)/S(d) = (2nd moment of the payments per loss) / S(d) =
{E[(X ∧ u)²] − E[(X ∧ d)²] − 2d(E[X ∧ u] − E[X ∧ d])} / S(d).
So just as with the first moment, the second moment of the non-zero payments has S(d) in the
denominator. If one has a coinsurance factor of c, then the second moment is multiplied by c².
Exercise: For a Pareto with α = 4 and θ = 1000, compute the second moment of the non-zero
payments given a deductible of 500 and a maximum covered loss of 5000.
[Solution: {E[(X ∧ 5000)²] − E[(X ∧ 500)²] − 2(500){E[X ∧ 5000] − E[X ∧ 500]}}/S(500) =
{308,642 − 86,420 − (1000)(331.79 − 234.57)} / (1000/1500)^4 = 125,002/0.1975309 = 632,823.]
Thus, given a deductible of d and a maximum covered loss of u, the variance of the non-zero
payments is: {E[(X ∧ u)²] − E[(X ∧ d)²] − 2d(E[X ∧ u] − E[X ∧ d])} / S(d) − {(E[X ∧ u] − E[X ∧ d]) / S(d)}².
Exercise: For a Pareto with α = 4 and θ = 1000, compute the variance of the non-zero payments
given a deductible of 500 and a maximum covered loss of 5000.
[Solution: From the solutions to previous exercises, variance = 632,823 − 492.2² = 390,562.]
If one has a coinsurance factor of c, then each payment is multiplied by c; therefore the variance is
multiplied by c².
Exercise: For a Pareto with α = 4 and θ = 1000, compute the variance of the non-zero payments
given a deductible of 500, a maximum covered loss of 5000, and a coinsurance factor of 85%.
[Solution: (0.85²)(390,562) = 282,181.]

227 Note that the density is f(x)/S(d) from d to u, and there is a point mass of S(u)/S(d) at u.
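The non-zero-payment exercises above can likewise be scripted; a brief Python sketch (exact values, so slightly different from the rounded figures above):

alpha, theta, d, u, c = 4.0, 1000.0, 500.0, 5000.0, 0.85
S = lambda x: (theta / (theta + x)) ** alpha
lev = lambda x: theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))
lim2 = lambda x: (2 * theta ** 2 / ((alpha - 1) * (alpha - 2))
                  * (1 - (1 + x / theta) ** (1 - alpha) * (1 + (alpha - 1) * x / theta)))

mean_pp = (lev(u) - lev(d)) / S(d)                                   # about 492.2
second_pp = (lim2(u) - lim2(d) - 2 * d * (lev(u) - lev(d))) / S(d)   # about 632,800
var_pp = second_pp - mean_pp ** 2                                    # about 390,560
print(round(mean_pp, 1), round(second_pp), round(var_pp), round(c * c * var_pp))  # last about 282,180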

Problems:
Use the following information for the next 7 questions.
Assume the unlimited losses follow a LogNormal Distribution with parameters μ = 10 and σ = 1.5.
Assume an average of 10,000 losses per year. For simplicity, ignore any variation in costs due to
variations in the number of losses per year.
32.1 (2 points) What is the coefficient of variation of the total cost expected per year?
A. less than 0.014
B. at least 0.014 but less than 0.018
C. at least 0.018 but less than 0.022
D. at least 0.022 but less than 0.026
E. at least 0.026
32.2 (3 points) If the insurer pays no more than $250,000 per loss,
what is the coefficient of variation of the insurers total cost expected per year?
A. less than 0.014
B. at least 0.014 but less than 0.018
C. at least 0.018 but less than 0.022
D. at least 0.022 but less than 0.026
E. at least 0.026
32.3 (3 points) If the insurer pays no more than $1,000,000 per loss,
what is the coefficient of variation of the insurers total cost expected per year?
A. less than 0.014
B. at least 0.014 but less than 0.018
C. at least 0.018 but less than 0.022
D. at least 0.022 but less than 0.026
E. at least 0.026
32.4 (1 point) If the insurer pays the layer from $250,000 to $1 million per loss,
what are the insurers total costs expected per year?
A. less than $135 million
B. at least $135 million but less than $140 million
C. at least $140 million but less than $145 million
D. at least $145 million but less than $150 million
E. at least $150 million

32.5 (3 points) If the insurer pays the layer from $250,000 to $1 million per loss,
what is the coefficient of variation of the insurers total cost expected per year?
A. less than 0.03
B. at least 0.03 but less than 0.05
C. at least 0.05 but less than 0.07
D. at least 0.07 but less than 0.09
E. at least 0.09
32.6 (3 points) What is the coefficient of skewness of the total cost expected per year?
A. less than 0.15
B. at least 0.15 but less than 0.20
C. at least 0.20 but less than 0.25
D. at least 0.25 but less than 0.30
E. at least 0.30
32.7 (3 points) If the insurer pays no more than $250,000 per loss, what is the coefficient of
skewness of the insurers total cost expected per year?
A. less than 0.015
B. at least 0.015 but less than 0.020
C. at least 0.020 but less than 0.025
D. at least 0.025 but less than 0.030
E. at least 0.030

Use the following information for the next 7 questions


Losses follow an Exponential Distribution with θ = 10,000.
32.8 (1 point) What is the variance of losses?
A. less than 105 million
B. at least 105 million but less than 110 million
C. at least 110 million but less than 115 million
D. at least 115 million but less than 120 million
E. at least 120 million
32.9 (2 points) Assuming a 25,000 policy limit, what is the variance of payments by the insurer?
A. less than 35 million
B. at least 40 million but less than 45 million
C. at least 45 million but less than 50 million
D. at least 50 million but less than 55 million
E. at least 55 million

32.10 (3 points) Assuming a 1000 deductible (with no maximum covered loss),


what is the variance of the payments per loss?
A. less than 95 million
B. at least 95 million but less than 100 million
C. at least 100 million but less than 105 million
D. at least 105 million but less than 110 million
E. at least 110 million
32.11 (2 points) Assuming a 1000 deductible (with no maximum covered loss),
what is the variance of non-zero payments by the insurer?
A. less than 95 million
B. at least 95 million but less than 100 million
C. at least 100 million but less than 105 million
D. at least 105 million but less than 110 million
E. at least 110 million
32.12 (3 points) Assuming a 1000 deductible and a 25,000 maximum covered loss,
what is the variance of the payments per loss?
A. less than 55 million
B. at least 55 million but less than 56 million
C. at least 56 million but less than 57 million
D. at least 57 million but less than 58 million
E. at least 58 million
32.13 (3 points) Assuming a 1000 deductible and a 25,000 maximum covered loss,
what is the variance of the non-zero payments by the insurer?
A. less than 53 million
B. at least 53 million but less than 54 million
C. at least 54 million but less than 55 million
D. at least 55 million but less than 56 million
E. at least 56 million
32.14 (2 points) Assuming a 75% coinsurance factor, a 1000 deductible and a 25,000 maximum
covered loss, what is the variance of the insurers payments per loss?
A. less than 15 million
B. at least 15 million but less than 20 million
C. at least 20 million but less than 25 million
D. at least 25 million but less than 30 million
E. at least 30 million

32.15 (2 points) Let X be the result of rolling a fair six-sided die, with the numbers 1 through 6 on its
faces. Calculate Var(X ∧ 4).
(A) 1.1  (B) 1.2  (C) 1.3  (D) 1.4  (E) 1.5

32.16 (3 points) The size of loss distribution has the following characteristics:
(i) E[X] = 245.
(ii) E[X ∧ 100] = 85.
(iii) S(100) = 0.65.
(iv) E[X² | X > 100] = 250,000.
There is an ordinary deductible of 100 per loss.
Calculate the second moment of the payment per loss.
(A) 116,000  (B) 118,000  (C) 120,000  (D) 122,000  (E) 124,000

Use the following information for the next four questions:

Losses follow a LogNormal Distribution with parameters μ = 9.7 and σ = 0.8.


The insured has a deductible of 10,000, maximum covered loss of 50,000, and a
coinsurance factor of 90%.

32.17 (3 points) What is the average payment per loss?


A. less than 7,000
B. at least 7,000 but less than 8,000
C. at least 8,000 but less than 9,000
D. at least 9,000 but less than 10,000
E. at least 10,000
32.18 (2 points) What is E[(X ∧ 50,000)²]?
A. less than 600 million
B. at least 600 million but less than 700 million
C. at least 700 million but less than 800 million
D. at least 800 million but less than 900 million
E. at least 900 million
32.19 (2 points) What is E[(X ∧ 10,000)²]?
A. less than 80 million
B. at least 80 million but less than 90 million
C. at least 90 million but less than 100 million
D. at least 100 million but less than 110 million
E. at least 110 million

32.20 (2 points) What is the variance of the payment per loss?


A. less than 110 million
B. at least 110 million but less than 120 million
C. at least 120 million but less than 130 million
D. at least 130 million but less than 140 million
E. at least 140 million

32.21 (3 points) You are given:
Claim Size    Number of Claims
0-25          30
25-50         32
50-100        20
100-200       8
Assume a uniform distribution of claim sizes within each interval.
Estimate E[(X ∧ 80)²].
(A) 2300  (B) 2400  (C) 2500  (D) 2600  (E) 2700

Use the following information for the next 5 questions:

Losses are uniform from 0 to 40.


32.22 (1 point) What is E[X ∧ 10]?
A. 8.5  B. 8.75  C. 9.0  D. 9.25  E. 9.5

32.23 (1 point) What is E[X ∧ 25]?
A. 15  B. 16  C. 17  D. 18  E. 19

32.24 (2 points) What is E[(X ∧ 10)²]?
A. 79  B. 80  C. 81  D. 82  E. 83

32.25 (2 points) What is E[(X ∧ 25)²]?
A. 360  B. 365  C. 370  D. 375  E. 380

32.26 (2 points) What is the variance of the layer of loss from 10 to 25?
A. 37  B. 39  C. 41  D. 43  E. 45


Use the following information for the next 7 questions.


Assume the following discrete size of loss distribution:
10 50%
50 30%
100 20%
32.27 (2 points) What is the coefficient of variation of this size of loss distribution?
A. less than 0.6
B. at least 0.6 but less than 0.8
C. at least 0.8 but less than 1.0
D. at least 1.0 but less than 1.2
E. at least 1.2
32.28 (3 points) What is the coefficient of skewness of this size of loss distribution?
A. less than 0
B. at least 0 but less than 0.2
C. at least 0.2 but less than 0.4
D. at least 0.4 but less than 0.6
E. at least 0.6
32.29 (2 points) If the insurer pays no more than 25 per loss,
what is the coefficient of variation of the distribution of the size of payments?
A. less than 0.6
B. at least 0.6 but less than 0.8
C. at least 0.8 but less than 1.0
D. at least 1.0 but less than 1.2
E. at least 1.2
32.30 (2 points) If the insurer pays no more than 60 per loss,
what is the coefficient of variation of the distribution of the size of payments?
A. less than 0.6
B. at least 0.6 but less than 0.8
C. at least 0.8 but less than 1.0
D. at least 1.0 but less than 1.2
E. at least 1.2


32.31 (3 points) If the insurer pays no more than 60 per loss,


what is the coefficient of skewness of the distribution of the size of payments?
A. less than 0
B. at least 0 but less than 0.2
C. at least 0.2 but less than 0.4
D. at least 0.4 but less than 0.6
E. at least 0.6
32.32 (1 point) If the insurer pays the layer from 30 to 70 per loss,
what is the insurers expected payment per loss?
A. 10
B. 12
C. 14
D. 16
E. 18
32.33 (2 points) If the insurer pays the layer from 30 to 70 per loss,
what is the coefficient of variation of the insurers payments per loss?
A. less than 0.6
B. at least 0.6 but less than 0.8
C. at least 0.8 but less than 1.0
D. at least 1.0 but less than 1.2
E. at least 1.2
32.34 (2 points) X follows the density f(x), with support from 0 to infinity.
∫_0^500 f(x) dx = 0.685.  ∫_0^500 x f(x) dx = 217.  ∫_0^500 x² f(x) dx = 76,616.
Determine the variance of the limited loss variable with u = 500, X ∧ 500.
A. 14,000  B. 15,000  C. 16,000  D. 17,000  E. 18,000

32.35 (4 points) The size of loss follows a Single Parameter Pareto Distribution
with α = 3 and θ = 200.
A policy has a deductible of 250, a maximum covered loss of 1000,
and a coinsurance of 80%.
Determine the variance of YP, the per payment variable.
A. 12,000  B. 13,000  C. 14,000  D. 15,000  E. 16,000

32.36 (3 points)
The loss severity random variable X follows an exponential distribution with mean θ.
Determine the coefficient of variation of the excess loss variable Y = max(X − d, 0).


32.37 (21 points) Let U be the losses in the layer from a to b.
Let V be the losses in the layer from c to d. a < b ≤ c < d.
(i) (3 points) Determine an expression for the covariance of U and V in terms of
Limited Expected Values.
(ii) (2 points) For an Exponential Distribution with mean of 8, determine the covariance of
the losses in the layer from 0 to 10 and the losses in the layer from 10 to infinity.
(iii) (3 points) For an Exponential Distribution with mean of 8, determine the variance of
the losses in the layer from 0 to 10.
Hint: For the Exponential, E[(X ∧ x)²] = 2θ² − 2θ²e^(−x/θ) − 2θxe^(−x/θ).
(iv) (3 points) For an Exponential Distribution with mean of 8, determine the variance of
the losses in the layer from 10 to infinity.
(v) (1 point) For an Exponential Distribution with mean of 8, determine the correlation of
the losses in the layer from 0 to 10 and the losses in the layer from 10 to infinity.
(vi) (2 points) For a Pareto Distribution with α = 3 and θ = 16, determine the covariance of
the losses in the layer from 0 to 10 and the losses in the layer from 10 to infinity.
(vii) (3 points) For a Pareto Distribution with α = 3 and θ = 16, determine the variance of
the losses in the layer from 0 to 10.
Hint: For the Pareto, E[(X ∧ x)²] = {2θ²/[(α − 1)(α − 2)]} {1 − (1 + x/θ)^(1−α) (1 + (α−1)x/θ)}.
(viii) (3 points) For a Pareto Distribution with α = 3 and θ = 16, determine the variance of
the losses in the layer from 10 to infinity.
(ix) (1 point) For a Pareto Distribution with α = 3 and θ = 16, determine the correlation of
the losses in the layer from 0 to 10 and the losses in the layer from 10 to infinity.
32.38 (14 points) Let X be the price of a stock at time 1.
X is distributed via a LogNormal Distribution with μ = 4 and σ = 0.3.
Let Y be the payoff on a one-year 70-strike European Call on this stock.
Y = 0 if X ≤ 70, and Y = X − 70 if X > 70.
(i) (1 point) What is the mean of X?
(ii) (2 points) What is the variance of X?
(iii) (3 points) What is the mean of Y?
(iv) (4 points) What is the variance of Y?
(v) (3 points) What is the covariance of X and Y?
(vi) (1 point) What is the correlation of X and Y?

Use the following information for the next 2 questions:

Limit        Limited Expected Value    Limited Second Moment
100,000      55,556                    4444 million
250,000      80,247                    12,346 million
500,000      91,837                    20,408 million
1,000,000    97,222                    27,778 million

32.39 (2 points) Determine the coefficient of variation of the layer of loss from 100,000 to 500,000.
(A) Less than 2
(B) At least 2, but less than 3
(C) At least 3, but less than 4
(D) At least 4, but less than 5
(E) At least 5
32.40 (2 points) Determine the coefficient of variation of the layer of loss from 250,000 to 1 million.
(A) Less than 2
(B) At least 2, but less than 3
(C) At least 3, but less than 4
(D) At least 4, but less than 5
(E) At least 5

Use the following information for the next 2 questions:


Losses are uniform from 2 to 20.

There is a deductible of 5.
32.41 (1 point) Determine the variance of YP, the per-payment variable.
A. 17
B. 18
C. 19
D. 20
E. 21
32.42 (3 points) Determine the variance of YL , the per-loss variable.
A. 35
B. 37
C. 39
D. 41
E. 43

32.43 (3 points) The loss severity random variable X follows the Pareto distribution with
α = 5 and θ = 400.
Determine the coefficient of variation of the excess loss variable Y = max(X - 300, 0).
(A) 6.5
(B) 7.0
(C) 7.5
(D) 8.0
(E) 8.5

32.44 (3, 11/00, Q.21 & 2009 Sample Q.115) (2.5 points)
A claim severity distribution is exponential with mean 1000.
An insurance company will pay the amount of each claim in excess of a deductible of 100.
Calculate the variance of the amount paid by the insurance company for one claim,
including the possibility that the amount paid is 0.
(A) 810,000 (B) 860,000 (C) 900,000 (D) 990,000 (E) 1,000,000
32.45 (1, 5/01, Q.35) (1.9 points) The warranty on a machine specifies that it will be replaced at
failure or age 4, whichever occurs first.
The machines age at failure, X, has density function 1/5 for 0 < x < 5.
Let Y be the age of the machine at the time of replacement.
Determine the variance of Y.
(A) 1.3
(B) 1.4
(C) 1.7
(D) 2.1
(E) 7.5
32.46 (4, 11/03, Q.37 & 2009 Sample Q.28) (2.5 points) You are given:
Claim Size (X)    Number of Claims
(0, 25]           25
(25, 50]          28
(50, 100]         15
(100, 200]        6
Assume a uniform distribution of claim sizes within each interval.
Estimate E[X²] − E[(X ∧ 150)²].
(A) Less than 200
(B) At least 200, but less than 300
(C) At least 300, but less than 400
(D) At least 400, but less than 500
(E) At least 500
32.47 (SOA3, 11/03, Q.28) (2.5 points) For (x):
(i) K is the curtate future lifetime random variable.
(ii) q_{x+k} = 0.1(k + 1), k = 0, 1, 2, ..., 9
Calculate Var(K ∧ 3).
(A) 1.1  (B) 1.2  (C) 1.3  (D) 1.4  (E) 1.5

32.48 (4, 5/07, Q.13) (2.5 points)


The loss severity random variable X follows the exponential distribution with mean 10,000.
Determine the coefficient of variation of the excess loss variable Y = max(X - 30,000, 0).
(A) 1.0
(B) 3.0
(C) 6.3
(D) 9.0
(E) 39.2

Solutions to Problems:
32.1. E. For the sum of N independent losses, both the variance and the mean are N times those for
a single loss. The standard deviation is multiplied by √N.
Thus the CV, which is the ratio of the standard deviation to the mean, is divided by √N.
Per loss, mean = exp(μ + σ²/2) = e^11.125, and second moment is exp(2μ + 2σ²) = e^24.5.
Therefore for a single loss, CV = √(E[X²]/E[X]² − 1) = √(e^24.5/e^22.25 − 1) = √(e^2.25 − 1) = 2.91.
For 10,000 losses we divide by √10,000 = 100; thus the CV for the total losses is 0.0291.

32.2. A. E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x − μ − σ²)/σ] + x{1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 250,000] = exp(11.125)Φ[(ln(250,000) − 10 − 2.25)/1.5] +
(250,000){1 − Φ[(ln(250,000) − 10)/1.5]} = (67,846)Φ[0.12] + (250,000)(1 − Φ[1.62])
= (67,846)(0.5478) + (250,000)(1 − 0.9474) = 50.3 thousand.
E[(X ∧ L)²] = exp[2μ + 2σ²]Φ[{ln(L) − (μ + 2σ²)}/σ] + L²{1 − Φ[{ln(L) − μ}/σ]}.
E[(X ∧ 250,000)²] = exp(24.5)Φ[−1.38] + 6.25 × 10^10 {1 − Φ[1.62]} =
(4.367 × 10^10)(0.0838) + (6.25 × 10^10)(1 − 0.9474) = 6.95 × 10^9.
Therefore for a single loss, Coefficient of Variation =
√(E[(X ∧ 250,000)²]/E[X ∧ 250,000]² − 1) = √(6.95 × 10^9 / 2.53 × 10^9 − 1) = 1.32.
For 10,000 losses we divide by √10,000 = 100; thus the CV is 0.0132.

32.3. C. E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x − μ − σ²)/σ] + x{1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 1,000,000] = exp(11.125)Φ[(ln(1,000,000) − 10 − 2.25)/1.5] +
(1,000,000){1 − Φ[(ln(1,000,000) − 10)/1.5]}
= (67,846)Φ[1.04] + (1,000,000)(1 − Φ[2.54]) = (67,846)(0.8508) + (1,000,000)(1 − 0.9945)
= 63.2 thousand.
E[(X ∧ L)²] = exp[2μ + 2σ²]Φ[{ln(L) − (μ + 2σ²)}/σ] + L²{1 − Φ[{ln(L) − μ}/σ]}.
E[(X ∧ 1,000,000)²] = exp(24.5)Φ[−0.46] + (10^12){1 − Φ[2.54]} =
(4.367 × 10^10)(0.3228) + (10^12)(1 − 0.9945) = 1.960 × 10^10.
Therefore Coefficient of Variation = √(E[(X ∧ 1,000,000)²]/E[X ∧ 1,000,000]² − 1) =
√(1.960 × 10^10 / 3.99 × 10^9 − 1) = 1.98.
For 10,000 losses we divide by √10,000 = 100; thus the CV is 0.0198.
Comment: The Coefficient of Variation of the limited losses is less than that of the unlimited losses.
The CV of the losses limited to 250,000 is lower than that of the losses limited to $1 million.

32.4. A. Since the insurer expects 10,000 losses per year, the expected dollars in the layer from
250,000 to $1 million are:
10,000{E[X ∧ 1 million] − E[X ∧ 250,000]} = 10,000(63.2 thousand − 50.3 thousand) =
129 million.

32.5. C. The mean for the layer is: E[X ∧ 1 million] − E[X ∧ 250,000] =
63.2 thousand − 50.3 thousand = 13.1 thousand. The second moment for the layer is:
E[(X ∧ 1 million)²] − E[(X ∧ 250,000)²] − 2(250,000)(E[X ∧ 1 million] − E[X ∧ 250,000]) =
1.960 × 10^10 − 6.95 × 10^9 − 6.55 × 10^9 = 6.10 × 10^9.
Therefore for a single loss, Coefficient of Variation = √(6.10 × 10^9 / 1.72 × 10^8 − 1) = 5.9.
For 10,000 losses we divide by √10,000 = 100; thus the CV is 0.059.
Comment: The CV of a layer depends on how high the layer is, the width of the layer, as well as the
loss distribution. A higher layer usually has a larger CV.

32.6. E. For the sum of N independent losses, both the variance and the third central moment
are N times those for a single loss. (For the sum of independent random variables, second and third
central moments each add.) Thus the skewness, which is the ratio of the third central moment to the
variance to the 3/2 power, is divided by √N.
Per loss, mean = exp(μ + σ²/2) = e^11.125, second moment is: exp(2μ + 2σ²) = e^24.5, and third
moment is: exp(3μ + 4.5σ²) = e^40.125. Therefore, the variance is: e^24.5 − e^22.25 = 3.907 × 10^10.
The third central moment is: e^40.125 − 3e^24.5 e^11.125 + 2e^33.375 = 2.585 × 10^17.
Thus for one loss the skewness is: 2.585 × 10^17 / (3.907 × 10^10)^1.5 = 33.5. For 10,000 losses we
divide by √10,000 = 100; thus the skewness for the total losses is 0.335.

32.7. B. E[X ∧ 250,000] = exp(11.125)Φ[(ln(250,000) − 10 − 2.25)/1.5] +
(250,000){1 − Φ[(ln(250,000) − 10)/1.5]} = 50.3 thousand.
E[(X ∧ 250,000)²] = exp(24.5)Φ[−1.38] + 6.25 × 10^10 {1 − Φ[1.62]} = 6.95 × 10^9.
Thus the variance of a limited loss is 6.95 × 10^9 − 2.53 × 10^9 = 4.42 × 10^9.
E[(X ∧ L)³] = exp[3μ + 4.5σ²]Φ[{ln(L) − (μ + 3σ²)}/σ] + L³{1 − Φ[{ln(L) − μ}/σ]}.
E[(X ∧ 250,000)³] = e^40.125 Φ[−2.88] + 1.5625 × 10^16 {1 − Φ[1.62]} = 1.355 × 10^15.
The third central moment is:
1.355 × 10^15 − 3(50.3 thousand)(6.95 × 10^9) + 2(50.3 thousand)³ = 5.6 × 10^14.
Thus for one loss the skewness is: 5.6 × 10^14 / (4.42 × 10^9)^1.5 = 1.9.
For 10,000 losses we divide by √10,000 = 100; thus the skewness for the total losses is 0.019.
Comment: The skewness of the limited losses is much smaller than that of the unlimited losses. Rare
very large losses have a large impact on the skewness of the unlimited losses.

32.8. A. θ² = 100 million.

32.9. E. The second moment is E[(X ∧ x)²] = 2θ²Γ(3; x/θ) + x²e^(−x/θ).
E[(X ∧ 25000)²] = 200,000,000 Γ(3; 2.5) + 625,000,000e^(−2.5) =
200 million{1 − e^(−2.5)(1 + 2.5 + 2.5²/2)} + 51.30 million = 142.54 million.
Variance = 142.54 million − 9179² = 58.28 million.

32.10. B. The second moment is: E[X²] − E[(X ∧ 1000)²] − (2)(1000){E[X] − E[X ∧ 1000]}.
E[X] − E[X ∧ 1000] = 10,000 − 952 = 9048. E[X²] = 2θ² = 200 million.
E[(X ∧ 1000)²] = 200,000,000 Γ(3; 0.1) + 1,000,000e^(−0.1) =
200 million{1 − e^(−0.1)(1 + 0.1 + 0.1²/2)} + 1 million × e^(−0.1) = 0.936 million.
The second moment is: 200 million − 0.936 million − (2000)(9048) = 180.97 million.
Variance = 180.97 million − 9048² = 99.1 million.

32.11. C. The second moment is: ∫_1000^∞ (x − 1000)² {f(x)/S(1000)} dx
= (2nd moment of payment per loss)/S(1000) = 180.97 million / e^(−0.1) = 180.97 million / 0.9048 =
200.00 million.
Variance = 200.00 million − 10,000² = 100 million.
Comment: Due to the memoryless property of the Exponential, the variance is the same as in the
absence of a deductible.

32.12. D. E[X ∧ 25000] = 9179. E[X ∧ 1000] = 952.
E[(X ∧ 25000)²] = 142.54 million. E[(X ∧ 1000)²] = 0.936 million.
Second moment of the layer of loss =
E[(X ∧ 25000)²] − E[(X ∧ 1000)²] − (2)(1000){E[X ∧ 25000] − E[X ∧ 1000]} =
142.54 million − 0.936 million − (2000)(9179 − 952) = 125.15 million.
Variance = 125.15 million − (9179 − 952)² = 57.46 million.

32.13. D. The second moment is: ∫_1000^25000 (x − 1000)² {f(x)/S(1000)} dx + 24000² S(25000)/S(1000)
= (2nd moment of payment per loss)/S(1000)
= 125.15 million / e^(−0.1) = 125.15 million / 0.9048 = 138.32 million.
The mean is: 9093. Variance = 138.32 million − 9093² = 55.63 million.

32.14. E. The second moment is:
0.75² {E[(X ∧ 25000)²] − E[(X ∧ 1000)²] − (2)(1000){E[X ∧ 25000] − E[X ∧ 1000]}}
= 0.5625{142.54 million − 0.936 million − (2000)(9179 − 952)} = 70.40 million.
Variance = 70.40 million − 6170² = 32.33 million.

32.15. C. E[X ∧ 4] = (1/6)(1) + (1/6)(2) + (1/6)(3) + (3/6)(4) = 3.
E[(X ∧ 4)²] = (1/6)(1²) + (1/6)(2²) + (1/6)(3²) + (3/6)(4²) = 10.33.
Var(X ∧ 4) = 10.33 − 3² = 1.33.

32.16. E. E[X² | X > 100] = ∫_100^∞ x² f(x) dx / S(100). ⇒ ∫_100^∞ x² f(x) dx = S(100) E[X² | X > 100]
= (0.65)(250,000) = 162,500.
∫_100^∞ x f(x) dx = ∫_0^∞ x f(x) dx − ∫_0^100 x f(x) dx = E[X] − {E[X ∧ 100] − 100 S(100)}
= 245 − {85 − (100)(0.65)} = 225.
∫_100^∞ f(x) dx = S(100) = 0.65.
With a deductible of 100 per loss, the second moment of the payment per loss is:
∫_100^∞ (x − 100)² f(x) dx = ∫_100^∞ x² f(x) dx − 200 ∫_100^∞ x f(x) dx + 10,000 ∫_100^∞ f(x) dx
= 162,500 − (200)(225) + (10,000)(0.65) = 124,000.
Comment: Similar to SOA M, 5/05, Q.17, in Mahler's Guide to Aggregate Distributions.
The variance of the payment per loss is: 124,000 − (245 − 85)² = 98,400.
With a deductible of d, the second moment of the payment per loss is:
E[X² | X > d] S(d) − 2d(E[X] − {E[X ∧ d] − dS(d)}) + d² S(d) =
E[X² | X > d] S(d) − 2d E[X] + 2d E[X ∧ d] − d² S(d).

32.17. E. E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x − μ − σ²)/σ] + x{1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 50000] = exp(10.02)Φ[(ln(50,000) − 9.7 − 0.64)/0.8] + (50,000){1 − Φ[(ln(50,000) − 9.7)/0.8]} =
(22,471)Φ[0.60] + (50,000){1 − Φ[1.40]} = (22,471)(0.7257) + (50,000)(1 − 0.9192) = 20,347.
E[X ∧ 10000] = exp(10.02)Φ[(ln(10,000) − 9.7 − 0.64)/0.8] + (10,000){1 − Φ[(ln(10,000) − 9.7)/0.8]} =
(22,471)Φ[−1.41] + (10,000){1 − Φ[−0.61]} = (22,471)(0.0793) + (10,000)(0.7291) = 9073.
The average payment per loss is:
(0.9)(E[X ∧ 50000] − E[X ∧ 10000]) = (0.9)(20,347 − 9073) = 10,147.

32.18. B. E[(X ∧ x)²] = exp[2μ + 2σ²]Φ[{ln(x) − (μ + 2σ²)}/σ] + x²{1 − Φ[{ln(x) − μ}/σ]}.
E[(X ∧ 50,000)²] = exp[20.68]Φ[{ln(50000) − 10.98}/0.8] + 50000²{1 − Φ[{ln(50000) − 9.7}/0.8]} =
957,656,776 Φ[−0.20] + 2,500,000,000{1 − Φ[1.40]} =
(957,656,776)(0.4207) + (2,500,000,000)(1 − 0.9192) = 604.9 million.

32.19. B. E[(X ∧ 10,000)²] =
exp[20.68]Φ[{ln(10000) − 10.98}/0.8] + 10000²{1 − Φ[{ln(10000) − 9.7}/0.8]} =
957,656,776 Φ[−2.21] + 100,000,000{1 − Φ[−0.61]} =
(957,656,776)(0.0136) + (100,000,000)(0.7291) = 85.9 million.

32.20. D. c²{E[(X ∧ L)²] − E[(X ∧ d)²] − 2d{E[X ∧ L] − E[X ∧ d]} − {E[X ∧ L] − E[X ∧ d]}²}
= 0.9²{E[(X ∧ 50000)²] − E[(X ∧ 10000)²] − 2(10000){E[X ∧ 50000] − E[X ∧ 10000]}
− {E[X ∧ 50000] − E[X ∧ 10000]}²} =
0.81{604.9 million − 85.9 million − 20,000(20,347 − 9073) − (20,347 − 9073)²} =
(0.81)(166.4 million) = 134.8 million.

32.21. A. For a uniform distribution on (a, b), E[X²] = (b³ − a³)/{3(b − a)}.
For those intervals above 80, E[(X ∧ 80)²] = 80² = 6400.
We need to divide the interval from 50 to 100 into two pieces.
For losses uniform on 50 to 100, 2/5 are expected to be greater than 80, so we assign
(2/5)(20) = 8 to the interval 80 to 100 and the remaining 12 to the interval 50 to 80.

Bottom of Interval    Top of Interval    Number of Claims    Expected 2nd Moment Limited to 80
0                     25                 30                  208
25                    50                 32                  1458
50                    80                 12                  4300
80                    100                8                   6400
100                   200                8                   6400
Average                                                      2299

32.22. B., 32.23. C., 32.24. E., 32.25. B., 32.26. C.
E[X ∧ 10] = ∫_0^10 x/40 dx + (3/4)(10) = 8.75.
E[X ∧ 25] = ∫_0^25 x/40 dx + (15/40)(25) = 17.1875.
E[(X ∧ 10)²] = ∫_0^10 x²/40 dx + (3/4)(10²) = 83.333.
E[(X ∧ 25)²] = ∫_0^25 x²/40 dx + (15/40)(25²) = 364.583.
Layer Average Severity is: E[X ∧ 25] − E[X ∧ 10] = 17.1875 − 8.75 = 8.4375.
2nd moment of the layer = E[(X ∧ 25)²] − E[(X ∧ 10)²] − (2)(10)(E[X ∧ 25] − E[X ∧ 10]) =
364.583 − 83.333 − (2)(10)(8.4375) = 112.5. Variance of the layer = 112.5 − 8.4375² = 41.31.
Alternately, the contribution to the layer from each small loss is 0, from each medium loss is x − 10,
and from each large loss is 15. Thus the second moment of the layer is:
∫_10^25 (x − 10)²/40 dx + (15/40)(15²) = 28.125 + 84.375 = 112.5. Proceed as before.

32.27. C. Mean = (50%)(10) + (30%)(50) + (20%)(100) = 40.
Second Moment = (50%)(10²) + (30%)(50²) + (20%)(100²) = 2800.
Variance = 2800 − 40² = 1200.
Coefficient of variation = √1200 / 40 = 0.866.
32.28. E. Third Central Moment = (50%)(10 − 40)³ + (30%)(50 − 40)³ + (20%)(100 − 40)³ = 30,000.
Skewness = Third Central Moment / Variance^1.5 = 30,000/1200^1.5 = 0.722.
32.29. A. Mean = (50%)(10) + (30%)(25) + (20%)(25) = 17.5.
Second Moment = (50%)(10²) + (30%)(25²) + (20%)(25²) = 362.5.
Variance = 362.5 − 17.5² = 56.25.
Coefficient of variation = √56.25 / 17.5 = 0.429.
32.30. B. Mean = (50%)(10) + (30%)(50) + (20%)(60) = 32.
Second Moment = (50%)(10²) + (30%)(50²) + (20%)(60²) = 1520.
Variance = 1520 − 32² = 496.
Coefficient of variation = √496 / 32 = 0.696.
32.31. B. Third Central Moment = (50%)(10 − 32)³ + (30%)(50 − 32)³ + (20%)(60 − 32)³ = 816.
Skewness = Third Central Moment / Variance^1.5 = 816/496^1.5 = 0.074.
32.32. C. (50%)(0) + (30%)(20) + (20%)(40) = 14.
32.33. D. Second Moment = (50%)(0²) + (30%)(20²) + (20%)(40²) = 440.
Variance = 440 − 14² = 244. Coefficient of variation = √244 / 14 = 1.116.
32.34. B. E[X ∧ 500] = ∫_0^500 x f(x) dx + 500 S(500) = 217 + (500)(1 − 0.685) = 374.5.
E[(X ∧ 500)²] = ∫_0^500 x² f(x) dx + 500² S(500) = 76,616 + (500²)(1 − 0.685) = 155,366.
Var[X ∧ 500] = E[(X ∧ 500)²] − E[X ∧ 500]² = 155,366 − 374.5² = 15,116.
Comment: Based on a Gamma Distribution with α = 4.3 and θ = 100.

32.35. C. From the Tables attached to the exam, for the Single Parameter Pareto, for x ≥ θ:
E[X ∧ x] = αθ/(α − 1) − θ^α/{(α − 1)x^(α−1)}.
E[(X ∧ x)²] = αθ²/(α − 2) − 2θ^α/{(α − 2)x^(α−2)}.
Thus E[X ∧ 250] = (3)(200)/2 − 200³/{(2)(250²)} = 236.
E[X ∧ 1000] = (3)(200)/2 − 200³/{(2)(1000²)} = 296.
S(250) = (200/250)³ = 0.512.
Thus the mean payment per payment is: (80%)(296 − 236)/0.512 = 93.75.
E[(X ∧ 250)²] = (3)(200²)/1 − (2)(200³)/250 = 56,000.
E[(X ∧ 1000)²] = (3)(200²)/1 − (2)(200³)/1000 = 104,000.
Thus the second moment of the non-zero payments is:
(80%)²{104,000 − 56,000 − (2)(250)(296 − 236)}/0.512 = 22,500.
Thus the variance of the non-zero payments is: 22,500 − 93.75² = 13,711.

32.36. The probability that a loss exceeds d is: e^(−d/θ).
The non-zero payments excess of a deductible d are the same as the original Exponential.
Zero payments contribute nothing to the aggregate amount paid.
One can think of Y = (X − d)+ as the aggregate that results from a Bernoulli frequency with q = e^(−d/θ),
and an Exponential severity with mean θ.
This has a variance of: (mean of Bernoulli)(var. of Expon.) + (mean of Expon.)²(var. of Bernoulli)
= (q)(θ²) + (θ²)(q)(1 − q) = (θ²)(q)(2 − q).
The mean aggregate is: qθ.
Coefficient of variation is: √{(θ²)(q)(2 − q)} / (qθ) = √(2/q − 1) = √(2e^(d/θ) − 1).
Comment: Similar to 4, 5/07, Q.13.
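A quick Python check of this closed form (using the θ = 10,000 and d = 30,000 of question 32.48, for which the published answer is 6.3):

from math import exp, sqrt

theta, d = 10000.0, 30000.0
q = exp(-d / theta)
mean, second = theta * q, 2 * theta ** 2 * q      # E[(X-d)+] and E[(X-d)+^2] for the Exponential
print(round(sqrt(second / mean ** 2 - 1), 1), round(sqrt(2 * exp(d / theta) - 1), 1))  # both 6.3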

32.37. (i) E[U] = E[X b] - E[X a]. E[V] = E[X d] - E[X c].
For x < c, V = 0. For c < x < d, U = b - a, and V = x - c. For d < x, U = b - a, and V = d - c.
Therefore, E[UV] =

(b a)(x c)f(x)dx + (b-a)(d-c) S(d) = (b - a){

(x c)f(x) dx

+ (d-c) S(d)}

= (b - a)(expected losses in the layer from c to d) = (b - a){E[X d] - E[X c]}.


Cov[U, V] = E[UV] - E[U]E[V] = {b - a - E[X b] + E[X a]}{E[X d] - E[X c]}.
Cov[U, V] = (width of the lower interval minus the layer average severity of the lower interval)
(layer average severity of the upper interval).
Cov[U, V] = {E[(b - X)+] - E[(a - X)+]}{E[X d] - E[X c]}.
(ii) E[X 10] = 8(1 - e-10/8) = 5.708.
Using the result from part (i) with a = 0, b = c = 10, and d = :
Covariance = {10 - E[X 10]}{E[X] - E[X 10]} = (10 - 5.708)(8 - 5.708) = 9.84.
(iii) Mean of the layer from 0 to 10 is: E[X 10] = 5.708.
E[(X 10)2 ] = 2(82 ) - 2(82 ) e-10/8 - 2(8)(10) e-10/8 = 45.487.
Second Moment of the layer from 0 to 10 is:
E[(X

10)2 ] = 2(82 ) - 2(82 ) e-10/8 - 2(8)(10) e-10/8 = 45.487.

Variance of the layer from 0 to 10 is: 45.487 - 5.7082 = 12.906.


(iv) Mean of the layer from 10 to is: E[X] - E[X 10] = 8 - 5.708 = 2.292.
E[X2 ] = 22 = 2(82 ) = 128.
Second Moment of the layer from 10 to is:
E[X2 ] - E[(X

10)2 ] - (2)(10)(E[X] - E[X

10]) = 128 - 45.487 - (20)(2.292) = 36.673.

Variance of the layer from 10 to is: 36.673 - 2.2922 = 31.420.


(v) Correlation of the layer from 0 to 10 and the layer from 10 to infinity is:
9.84/ (12.906)(31.420) = 0.489.
(vi) E[X

10] = {/(-1)}{1 - (/(+x))1} = (16/2){1 - (16/26)2 } = 4.970.

Using the result from part (i) with a = 0, b = c = 10, and d = :


Covariance = {10 - E[X 10]}{E[X] - E[X 10]} = (10 - 4.970)(8 - 4.970) = 15.24.
(vii) Mean of the layer from 0 to 10 is: E[X 10] = 4.970.
Second Moment of the layer from 0 to 10 is:
E[(X

10)2 ] = 2(162 ){1 - (1 + 10/16)-2(1 + (2)(10)/16)}/{(2)(1)} = 37.870.

Variance of the layer from 0 to 10 is: 37.870 - 4.9702 = 13.169.


(viii) Mean of the layer from 10 to is: E[X] - E[X 10] = 8 - 4.970 = 3.030.
E[X2 ] = 2(162 )/{(2)(1)} = 256.
Second Moment of the layer from 10 to is:
E[X2 ] - E[(X

10)2 ] - (2)(10)(E[X] - E[X

10]) = 256 - 37.870 - (20)(3.030) = 157.530.


Variance of the layer from 10 to is: 157.530 - 3.0302 = 148.35.


(ix) Correlation of the layer from 0 to 10 and the layer from 10 to infinity is:
15.24/ (13.169)(148.35) = 0.345.
Comment: Since a larger loss contributes more to both layers than a smaller loss, the losses in the
layers are positively correlated.
32.38. (i) E[X] = exp[4 + 0.32 /2] = 57.11.
(ii) E[X2 ] = exp[(2)(4) + (2)(0.32 )] = 3568.85.
VAR[X] = 3568.85 - 57.112 = 307.3.
(iii) E[Y] = 0 Prob[X < 70] + E[X - 70 | X > 70] Prob[X > 70] = E[X - 70 | X > 70] Prob[X > 70] =
e(70) S(70) = E[X] - E[X 70 ].
E[X 70 ] = exp[4 + 0.32 /2] [(ln(70) - 4 - 0.32 )/0.3] + 70 {1 - [(ln(70) - 4)/0.3] =
57.11 [0.53] + (70){1 - [0.83]} = (57.11)(0.7019) + (70)(1 - 0.7967) = 54.32.
Thus E[Y] = E[X] - E[X 70 ] = 57.11 - 54.32 = 2.79.
(iv) E[Y2 ] = 0 Prob[X < 70] + E[(X - 70)2 | X > 70] Prob[X > 70] =
(second moment of the layer from 70 to infinity) Prob[X > 70] =
E[X2 ] - E[(X 70)2 ] - (2)(70){E[X] - E[X 70 ]}.
For LogNormal Distribution the second limited moment is:
ln(x) 22
ln(x)
E[(X x)2 ] = exp[2 + 22]
+ x2 {1 -
}.

E[(X 70)2 ] = exp[(2)(4) + (2)(0.32 )] [(ln(70) - 4 - (2)(0.32 ))/0.3] + 702 {1 - [(ln(70) - 4)/0.3]}
= 3568.85 [0.23] + 4900 {1 - [0.83]} = (3568.85)(0.5910) + (4900)(1 - 0.7967) = 3105.36.
Thus, E[Y2 ] = E[X2 ] - E[(X 70)2 ] - (2)(70){E[X] - E[X 70 ]} =
3568.85 - 3105.36 - (140)(57.11 - 54.32) = 72.89.
VAR[Y] = E[Y2 ] - E[Y]2 = 72.89 - 2.792 = 65.11.
(v) E[XY] = E[X (X - 70) | X > 70] Prob[X > 70] =
E[(70)(X - 70) - (X - 70) (X - 70) | X > 70] Prob[X > 70] =
70 E[X - 70 | X > 70] Prob[X > 70] - E[Y2 | X > 70] Prob[X > 70] =
70 E[Y] + E[Y2 ] = (70) (2.79) + 72.89 = 268.19.
Cov[X, Y] = E[XY] - E[X] E[Y] = 268.19 - (57.11)(2.79) = 108.85.
Cov[X, Y]
108.85
(vi) Corr[X, Y] =
=
= 0.77.
VAR[X] VAR[Y]
(307.3) (65.11)


32.39. B. The first moment of the layer is: 91,837 - 55,556 = 36,281.
The second moment of the layer is:
20,408 million - 4444 million - (2)(100,000)(36,281) = 8707.8 million.
Variance of the layer is: 8707.8 million - 36,2812 = 7392 million.
Standard Deviation of the layer is: 85,977.
CV of the layer is: 85,977/36,281 = 2.37.
Comment: Based on a Pareto Distribution with = 3 and = 200,000.
32.40. D. The first moment of the layer is: 97,222 - 80,247 = 16,975.
The second moment of the layer is:
27,778 million - 12,346 million - (2)(250,000)(16,975) = 6944.5 million.
1 + CV2 = 6944.5 million / 16,9752 = 24.100. CV = 4.81.
32.41. C. The non-zero payments are uniform from 5 to 20, with variance:
(20 - 5)2 / 12 = 18.75.
32.42. B. The non-zero payments are uniform from 5 to 20,
with mean: 12.5, variance: (20 - 5)2 / 12 = 18.75,
and second moment: 18.75 + 12.52 = 175.
The probability of a non-zero payment is: 15/18 = 5/6.
Thus YL is a two-point mixture of a uniform distribution from 5 to 20 and a distribution that is always
zero, with weights 5/6 and 1/6.
The mean of the mixture is: (5/6)(12.5) + (1/6)(0) = 10.417.
The second moment of the mixture is: (5/6)(175) + (1/6)(02 ) = 145.83.
The variance of this mixture is: 145.83 - 10.4172 = 37.3.
Alternately, YL can be thought of as a compound distribution, with Bernoulli frequency with mean 5/6
and Uniform severity from 5 to 20.
The variance of this compound distribution is:
(Mean Freq.)(Var. Sev.) + (Mean Sev.)2 (Var. Freq.) =
(5/6)(18.75) + (12.5)2 {(5/6)(1/6)} = 37.3.


32.43. A. The probability that a loss exceeds 300 is: (4/7)⁵ = 0.06093.
The losses truncated and shifted from below at 300 are also a Pareto Distribution, but with
α = 5 and θ = 400 + 300 = 700.
One can think of Y = (X - 300)+ as the aggregate that results from a Bernoulli frequency with
q = 0.06093, and a Pareto severity with α = 5 and θ = 700.
This Pareto has mean: 700/4 = 175, second moment: (2)(700²)/{(4)(3)} = 81,667,
and variance: 81,667 - 175² = 51,042.
This has a variance of: (mean of Bernoulli)(var. of Pareto) + (mean of Pareto)²(var. of Bernoulli)
= (0.06093)(51,042) + (175²)(0.06093)(1 - 0.06093) = 4862.
Mean aggregate is: (0.06093)(175) = 10.663.
Coefficient of variation is: √4862 / 10.663 = 6.54.
Alternately, this is mathematically equivalent to a two point mixture, with 0.06093 weight to a Pareto
with α = 5 and θ = 700 (the non-zero payments) and (1 - 0.06093) weight to a distribution that is
always zero.
The mean is: (0.06093)(175) + (1 - 0.06093)(0) = 10.663.
The second moment is the weighted average of the two second moments:
(0.06093)(81,667) + (1 - 0.06093)(0) = 4976.
Therefore, 1 + CV² = 4976/10.663² = 43.76. ⇒ CV = 6.54.
Comment: Similar to 4, 5/07, Q.13, which involves an Exponential rather than a Pareto.
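The compound-Bernoulli arithmetic above is easy to reproduce numerically. The following is a minimal
Python sketch added here as a check (not part of the original question); the names alpha, theta, and d
are labels chosen for this illustration of the Pareto parameters and the deductible.

# Check of solution 32.43: payments (X - 300)+ as a Bernoulli frequency times a Pareto severity.
from math import sqrt

alpha, theta, d = 5.0, 400.0, 300.0
q = (theta / (theta + d)) ** alpha                    # probability a loss exceeds the deductible
theta_shift = theta + d                               # payments per payment: Pareto(alpha, theta + d)
mean_sev = theta_shift / (alpha - 1)                  # 175
second_sev = 2 * theta_shift**2 / ((alpha - 1) * (alpha - 2))   # 81,667
var_sev = second_sev - mean_sev**2                    # 51,042

mean_agg = q * mean_sev                               # 10.663
var_agg = q * var_sev + mean_sev**2 * q * (1 - q)     # 4862
print(sqrt(var_agg) / mean_agg)                       # CV of (X - 300)+, about 6.54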


32.44. D. An Exponential distribution truncated and shifted from below is the same Exponential
Distribution, due to the memoryless property of the Exponential. Thus the nonzero payments are
Exponential with mean 1000. The probability of a nonzero payment is the probability that a loss is
greater than the deductible of 100; S(100) = e-100/1000 = 0.90484.
Thus the payments of the insurer can be thought of as a compound distribution, with Bernoulli
frequency with mean 0.90484 and Exponential severity with mean 1000. The variance of this
compound distribution is:
(Mean Freq.)(Var. Sev.) + (Mean Sev.)2 (Var. Freq.) =
(.90484)(10002 ) + (1000)2 {(.90484)(.09516)} = 990,945.
Equivalently, the payments of the insurer in this case are a two point mixture of an Exponential with
mean 1000 and a distribution that is always zero, with weights .90484 and .09516. This has a first
moment of: (1000)(.90484) + (.09516)(0) = 904.84, and a second moment of:
{(2)(10002 )}(.90484) + (.09516)(02 ) = 1,809,680.
Thus the variance is: 1809680 - 904.842 = 990,945.
Alternately, for the Exponential Distribution, E[X] = = 1000, and E[X2 ] = 22 = 2 million.
For the Exponential Distribution, E[X x] = (1 - e-x/).
E[X

100] = 1000(1 - e-100/1000) = 95.16.

For the Exponential, E[(X


E[(X

x)n ] = n! n (n+1; x/) + xn e-x/.

100)2 ] = (2)10002 (3; 100/1000) + 1002 e-100/1000.

According to Theorem A.1 in Loss Models, for integral , the incomplete Gamma function
(; y) is 1 minus the first densities of a Poisson Distribution with mean y.
(3; y) = 1 - e-y(1 + y + y2 /2). (3; .1) = 1 - e-.1(1 + .1 + .12 /2) = .0001546.
Therefore, E[(X 100)2 ] = (2 million)(.0001546) + 10000e-.1 = 9357.
The first moment of the layer from 100 to is: E[X] - E[X 100] = 1000 - 95.16 = 904.84.
The second moment of the layer from 100 to is:
E[X2 ] - E[(X 100)2 ] - (2)(100)(E[X] - E[X 100]) =
2,000,000 - 9357 - (200)(904.84) = 1,809,675.
Therefore, the variance of the layer from 100 to is: 1,809,675 - 904.842 = 990,940.
Alternately, one can work directly with the integrals, using integration by parts.
The first moment of the layer from 100 to is:

(x -100)e-x/1000/1000 dx = xe-x/1000/1000 dx - (1/10) e-x/1000 dx =


100

100

100


x=

-xe-x/1000 - 1000e-x/1000] - 100e-.1 = 100e-.1 + 1000e-.1 - 100e-.1 = 904.84.


x = 100

The second moment of the layer from 100 to is:

(x -100)2 e-x/1000/1000 dx =
100

x2e-x/1000/1000 dx - xe-x/1000/5 dx + 10e-x/1000 dx

100

100

100

x=

x=

= -x2 e-x/1000 - 2000xe-x/1000 - 2,000,000e-x/1000] + 200xe-x/1000 + 200,000e-x/1000]


x = 100

x = 100

+ 10,000e-.1 = e-.1{10,000 + 200,000 + 2,000,000 - 20,000 - 200,000 + 10,000} =


2,000,000e-.1 = 1,809,675.
Therefore, the variance of the layer from 100 to is: 1,809,675 - 904.842 = 990,940.
Comment: Very long and difficult, unless one uses the memoryless property of the Exponential
Distribution.
32.45. C. E[Y] = (2)(4/5) + 4(1/5) = 2.4.
4

E[Y2 ] = x2 /5 dx + 42 (1/5) = 64/15 + 16/5 = 7.4667.


0

Var[Y] = 7.4667 - 2.42 = 1.7067.

32.46. C. For X ≤ 150, X = X ∧ 150.
So the only contribution to E[X²] - E[(X ∧ 150)²] comes from any losses of size > 150.
With losses uniform on (100, 200], we expect 3 claims greater than 150, out of a total of 74.
Uniform on (150, 200], E[X²] = variance + mean² = 50²/12 + 175² = 30,833.
On (150, 200], each loss is at least 150, and therefore E[(X ∧ 150)²] = 150² = 22,500.
E[X²] - E[(X ∧ 150)²] = Prob[X > 150] E[X² - (X ∧ 150)² | X uniform on (150, 200]] =
(3/74)(30,833 - 22,500) = 338.
32.47. A. qx = 0.1. qx+1 = 0.2. qx+2 = 0.3.
Prob[K = 0] = Prob[Die 1st Year] = qx = 0.1.
Prob[K = 1] = P[Alive @ x+1] P[Die 2nd Year | Alive @ x+1] = (1 - qx) qx+1 = (0.9)(0.2) = 0.18.
Prob[K = 2] = P[Alive @ x+2] P[Die 3rd Year | Alive @ x+2] =
(1 - qx)(1 - qx+1) qx+2 = (0.9)(0.8)(0.3) = 0.216.
Prob[K ≥ 3] = 1 - (0.1 + 0.18 + 0.216) = 0.504.

K       Prob      K ∧ 3    (K ∧ 3)²
0       0.1       0        0
1       0.18      1        1
2       0.216     2        4
3       0.504     3        9
Avg.              2.124    5.580

E[K ∧ 3] = (0.1)(0) + (0.18)(1) + (0.216)(2) + (0.504)(3) = 2.124.
E[(K ∧ 3)²] = (0.1)(0) + (0.18)(1) + (0.216)(4) + (0.504)(9) = 5.580.
Var[K ∧ 3] = E[(K ∧ 3)²] - E[K ∧ 3]² = 5.580 - 2.124² = 1.069.


32.48. C. The probability that a loss exceeds 30,000 is: e^(-30000/10000) = 0.049787.
The losses truncated and shifted from below at 30,000 are the same as the original Exponential.
One can think of Y = (X - 30000)+ as the aggregate that results from a Bernoulli frequency with
q = 0.049787, and an Exponential severity with mean 10,000.
This has a variance of: (mean of Bernoulli)(var. of Expon.) + (mean of Expon.)²(var. of Bernoulli)
= (0.049787)(10000²) + (10000²)(0.049787)(1 - 0.049787) = 9,710,096.
Mean aggregate is: (0.049787)(10000) = 497.87.
Coefficient of variation is: √9,710,096 / 497.87 = 6.26.
Alternately, this is mathematically equivalent to a two point mixture, with 0.049787 weight to an
Exponential with mean 10,000 (the non-zero payments) and (1 - 0.049787) weight to a distribution
that is always zero.
The mean is: (0.049787)(10,000) + (1 - 0.049787)(0) = 497.87.
The second moment is the weighted average of the two second moments:
(0.049787)(2)(10,000²) + (1 - 0.049787)(0) = 9,957,414.
Therefore, 1 + CV² = 9,957,414/497.87² = 40.17. ⇒ CV = 6.26.
Alternately, E[Y²] = E[(X - 30000)+² | X > 30000] Prob[X > 30000]
= (Second moment of an Exponential Distribution with θ = 10000) e^(-30000/10000)
= (2)(10000²)(0.049787) = 9,957,414. Proceed as before.
Alternately, E[Y] = ∫_30000^∞ (x - 30000) exp[-x/10000]/10000 dx
= ∫_0^∞ y exp[-(y + 30000)/10000]/10000 dy = e⁻³ ∫_0^∞ y exp[-y/10000]/10000 dy
= e⁻³ (10000) = 497.87.
E[Y²] = ∫_30000^∞ (x - 30000)² exp[-x/10000]/10000 dx
= ∫_0^∞ y² exp[-(y + 30000)/10000]/10000 dy = e⁻³ ∫_0^∞ y² exp[-y/10000]/10000 dy
= e⁻³ (2)(10000²) = 9,957,414. Proceed as before.
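As a further sanity check of this solution, one can simulate the per-loss payments directly. The rough
Monte Carlo sketch below is an added illustration, not part of the original solution; the seed and
sample size are arbitrary choices.

# Simulation check of solution 32.48: CV of (X - 30,000)+ for an Exponential with mean 10,000.
import numpy as np

rng = np.random.default_rng(seed=1)              # seed chosen arbitrarily for reproducibility
x = rng.exponential(scale=10000, size=2_000_000)
y = np.maximum(x - 30000, 0)                     # per-loss payments under the 30,000 deductible
print(y.std() / y.mean())                        # roughly 6.26, matching sqrt(2 e^3 - 1)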


Section 33, Mean Excess Loss

As discussed previously, the Mean Excess Loss or Mean Residual Life (complete expectation of
life), e(x), is defined as the mean value of those losses greater than size x, where each loss is
reduced by x. Thus one only includes those losses greater than size x, and only that part of each
such loss greater than x.

e(x) = E[X - x | X > x] = ∫_x^∞ (t - x) f(t) dt / S(x) = ∫_x^∞ t f(t) dt / S(x) - x.

The Mean Excess Loss at d, e(d) = average payment per payment with a deductible d.
The Mean Excess Loss is the mean of the loss distribution truncated and shifted at x:
e(x) = (average size of those losses greater in size than x) - x.
Therefore, the average size of those losses greater in size than x = e(x) + x.
On the exam, usually the easiest way to compute the Mean Excess Loss for a distribution is to use
the formulas for the Limited Expected Value in Appendix A of Loss Models, and the identity:
e(x) = {E[X] - E[X ∧ x]} / S(x).
Therefore, e(0) = mean, provided the distribution has support x > 0.228
Exercise: E[X ∧ $1 million] = $234,109. E[X] = $342,222. S($1 million) = 0.06119.
Determine e($1 million).
[Solution: e($1 million) = (342,222 - 234,109) / 0.06119 = $1.767 million.]

228 Thus e(0) = mean, with the notable exception of the Single Parameter Pareto.


Formulas for the Mean Excess Loss for Various Distributions:

Distribution                 Mean Excess Loss, e(x)

Exponential                  θ

Pareto                       (θ + x)/(α - 1), α > 1

LogNormal                    exp(μ + σ²/2) {1 - Φ[(ln(x) - μ - σ²)/σ]} / {1 - Φ[(ln(x) - μ)/σ]} - x

Gamma                        αθ {1 - Γ(α + 1; x/θ)} / {1 - Γ(α; x/θ)} - x

Weibull                      θ Γ(1 + 1/τ) {1 - Γ(1 + 1/τ; (x/θ)^τ)} exp[(x/θ)^τ] - x

Single Parameter Pareto      x/(α - 1), α > 1


Inverse Gaussian:
x
1
x

x
1
x

[
[

Burr

]
]

x
+ e2 / + 1
x
x
- e2 / + 1
x

[
[

]
]

- x

{( 1/)(1+1/) / ()}{[ 1/ , 1+1/ ; 1/(1+(x/))]}(1+(x/)), > 1

Trans. Gamma

{(+(1/)) /()}{1 - (+(1/) ; (x/)) } / {1-[ ; (x/)]} - x

Gen. Pareto

{ / (-1)}[1, +1; /(+x)] / [ , ; /(+x)]}, > 1

Normal

2[(x )/]/{1 [(x )/]} + - x

It should be noted that for heavier-tailed distributions, just as with the mean, the Mean Excess Loss
only exists for certain values of the parameters. Otherwise it is infinite.


For example, for the Pareto for α ≤ 1, the mean excess loss is infinite or does not exist.
The Exponential distribution is the only distribution with a constant Mean Excess Loss.
If F(x) represents the distribution of the ages of death, then e(x) is the (remaining) life expectancy of
a person of age x. A constant Mean Excess Loss is independent of age and is equivalent to a force
of mortality that is independent of age.
Exercise: For a Pareto with α = 4 and θ = 1000, determine e(800).
[Solution: E[X] = θ/(α-1) = 333.3333. E[X ∧ 800] = {θ/(α-1)}{1 - (θ/(θ + 800))^(α-1)} = 276.1774.
S(800) = {θ/(θ + 800)}^α = (1/1.8)⁴ = 0.09526.
e(800) = (333.3333 - 276.1774)/(0.09526) = 600.
Alternately, e(x) = (θ + x)/(α - 1) = (1000 + 800)/(4 - 1) = 600.]
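This exercise is easy to verify numerically. The short Python sketch below is an added check, not part
of the original exercise; it computes e(800) both from E[X], E[X ∧ x], and S(x), and from the
closed-form (θ + x)/(α - 1).

# Check of the exercise above for a Pareto with alpha = 4, theta = 1000.
alpha, theta, x = 4.0, 1000.0, 800.0
mean = theta / (alpha - 1)
lim_ev = mean * (1 - (theta / (theta + x)) ** (alpha - 1))   # E[X ^ 800] = 276.18
surv = (theta / (theta + x)) ** alpha                        # S(800) = 0.09526
print((mean - lim_ev) / surv)        # 600.0
print((theta + x) / (alpha - 1))     # 600.0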
Mean Excess Loss in terms of the Survival Function:
The Mean Excess Loss can be written in terms of the survival function, S(x) = 1 - F(x).
By definition, e(x) is the ratio of loss dollars excess of x divided by S(x).

e(x) = ∫_x^∞ (t - x) f(t) dt / S(x) = {∫_x^∞ t f(t) dt - S(x) x} / S(x).

Using integration by parts and the fact that an antiderivative of f(x) is -S(x):229

e(x) = {S(x) x + ∫_x^∞ S(t) dt - S(x) x} / S(x).

e(x) = ∫_x^∞ S(t) dt / S(x).

So the Mean Excess Loss at x is the integral of the survival function from x to infinity divided by the
survival function at x.230

229 Note that the derivative of S(x) is d(1 - F(x))/dx = -f(x). Remember there is an arbitrary constant for indefinite
integrals. Thus the indefinite integral of f(x) can be taken as either F(x) or -S(x) = F(x) - 1.
230 The Mean Excess Loss as defined here is the same as the complete expectation of life as defined in Life
Contingencies. The formula given here is equivalent to formula 3.5.2 in Actuarial Mathematics by Bowers et. al.,
pp. 62-63. With _s p_x = S(x+s)/S(x): e̊_x = ∫_0^∞ _s p_x ds.


For example, for the Pareto Distribution, S(x) = θ^α (θ + x)^(-α).
Therefore, e(x) = {θ^α (θ + x)^(1-α) / (α - 1)} / {θ^α (θ + x)^(-α)} = (θ + x)/(α - 1).
This matches the formula given above for the Mean Excess Loss of the Pareto Distribution.
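The same identity can be checked by numerical integration of the survival function. The sketch below
is an added illustration (it assumes scipy is available), using the same Pareto with α = 4 and
θ = 1000 as the exercise above.

# Check that e(x) = (integral of S from x to infinity)/S(x) equals (theta + x)/(alpha - 1).
from scipy.integrate import quad

alpha, theta, x = 4.0, 1000.0, 800.0
S = lambda t: (theta / (theta + t)) ** alpha
tail_integral, _ = quad(S, x, float("inf"))
print(tail_integral / S(x))          # about 600.0
print((theta + x) / (alpha - 1))     # 600.0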
Behavior of e(x) in the Righthand Tail:
Here is a table of the behavior of the Mean Excess Loss as the loss size approaches infinity for
some distributions:

Distribution          Behavior of e(x) as x → ∞                        For Extremely Large x

Exponential           constant                                         e(x) = θ
Single Par. Pareto    increases linearly                               e(x) = x/(α - 1)
Pareto                increases linearly                               e(x) = (θ + x)/(α - 1)
LogNormal             increases to infinity less than linearly        e(x) ≈ x σ²/ln(x)
Gamma, α > 1          decreases towards a horizontal asymptote        e(x) → θ
Gamma, α < 1          increases towards a horizontal asymptote231     e(x) → θ
Inverse Gaussian      increases to a constant                          e(x) → 2μ²/θ
Weibull, τ > 1        decreases to zero                                e(x) ≈ x^(1-τ) θ^τ / τ
Weibull, τ < 1        increases to infinity less than linearly        e(x) ≈ x^(1-τ) θ^τ / τ
Trans. Gamma, τ > 1   decreases to zero                                e(x) ≈ x^(1-τ) θ^τ / τ
Trans. Gamma, τ < 1   increases to infinity less than linearly        e(x) ≈ x^(1-τ) θ^τ / τ
Burr                  increases to infinity approximately linearly    e(x) ≈ x/(αγ - 1)
Gen. Pareto           increases to infinity approximately linearly    e(x) ≈ x/(α - 1)
Inv. Trans. Gamma     increases to infinity approximately linearly    e(x) ≈ x/(ατ - 1)
Normal                decreases to zero approximately as 1/x          e(x) ≈ σ²/(x - μ)

231 For the Gamma Distribution for large x, e(x) ≈ θ + (α - 1)θ²/x.


Recall that the mean and thus the Mean Excess Loss fails to exist for: the Pareto with α ≤ 1, the
Inverse Transformed Gamma with ατ ≤ 1, the Generalized Pareto with α ≤ 1, and the Burr with αγ ≤ 1.
Also the Gamma with α = 1 and the Weibull with τ = 1 are Exponential distributions, and thus have
constant Mean Excess Loss.
The Transformed Gamma with τ = 1 is a Gamma distribution, and thus in this case the behavior of
the Mean Excess Loss depends on whether alpha is greater than, less than, or equal to one.
For the LogNormal, e(x) approaches its asymptotic behavior very slowly. Thus the formula
derived above, e(x) ≈ x / {(ln(x) - μ)/σ² - 1}, will provide a somewhat better approximation than
the formula e(x) ≈ x σ²/ln(x), until one reaches truly immense loss sizes.
Those curves with heavier tails have the Mean Excess Loss increase with x. Comparing the Mean
Excess Loss provides useful information on the fit of the curves to the data. Small differences in
the tail of the distributions that may not have been evident in the graphs of the Distribution Function,
are made evident by graphing the Mean Excess Loss.
I have found the Mean Excess Loss particularly useful at distinguishing between the tails of the
different distributions when using them to estimate Excess Ratios.


Below is shown for various Gamma distributions the behavior of the Mean Excess Loss as the loss
size increases. For α = 1, the Exponential Distribution has a constant mean excess loss equal to θ,
in this case 1000. For α > 1, the mean excess loss decreases to θ.
For a Gamma Distribution with α < 1, the mean excess loss increases to θ.
The tail of a Gamma Distribution is similar to that of an Exponential Distribution with the same θ.
[Figure: e(x) versus loss size for a Gamma Distribution with alpha > 1 (decreasing towards 1000)
and a Gamma Distribution with alpha < 1 (increasing towards 1000).]
For the Weibull with τ = 1, the Exponential Distribution has a constant mean excess loss equal to θ,
in this case 1000. For τ > 1, the mean excess loss decreases to 0.
For a Weibull Distribution with τ < 1, the mean excess loss increases to infinity less than linearly.
[Figure: e(x) versus loss size for a Weibull Distribution with tau < 1 (increasing) and a Weibull
Distribution with tau > 1 (decreasing).]


The Pareto and LogNormal Distributions each have heavy tails. However, the Pareto Distribution has
its mean excess loss increase linearly, while that of the LogNormal increases slightly less than
linearly. Thus the Pareto has a heavier (righthand) tail than the LogNormal, which in turn has a heavier
tail than the Weibull.232
[Figure: e(x) versus loss size for the Pareto, LogNormal, and Weibull (tau < 1); for large loss sizes
the Pareto curve is highest and the Weibull curve is lowest.]
All three distributions have mean residual lives that increase to infinity. Note that it takes a while for
the mean residual life of the Pareto to become larger than that of the LogNormal.233

232 The mean excess losses are graphed for a Weibull Distribution with θ = 500 and τ = 1/2, a LogNormal Distribution
with μ = 5.5 and σ = 1.678, and a Pareto Distribution with α = 2 and θ = 1000.
All three distributions have a mean of 1000.
233 In this case, the Pareto has a larger mean residual life for loss sizes 15 times the mean and greater.


The mean residual life of an Inverse Gaussian increases to a constant; e(x) → 2μ²/θ.
Thus the Inverse Gaussian has a tail that is somewhat similar to a Gamma Distribution.
Here is e(x) for an Inverse Gaussian with μ = 1000 and θ = 500:
[Figure: e(x) increasing with loss size towards its limit of 2μ²/θ = 4000.]
Here is e(x) for an Inverse Gaussian with μ = 1000 and θ = 2500:
[Figure: e(x) first decreasing and then increasing with loss size.]
In this case, e(x) initially decreases and then increases towards: (2)(1000²)/2500 = 800.


Determining the Tail Behavior of the Mean Excess Loss:

The fact that the Mean Excess Loss is the integral of S(t) from x to infinity divided by S(x), is the
basis for a method of determining the behavior of e(x) as x approaches infinity. One applies
L'Hospital's Rule twice.

lim e(x) = lim ∫_x^∞ S(t) dt / S(x) = lim S(x)/f(x) = lim -f(x)/f′(x).

For example for the Gamma distribution:
f(x) = x^(α-1) e^(-x/θ) / (θ^α Γ(α)).
f′(x) = (α-1) x^(α-2) e^(-x/θ) / (θ^α Γ(α)) - x^(α-1) e^(-x/θ) / (θ^(α+1) Γ(α)).
lim e(x) = lim -f(x)/f′(x) = lim 1 / {1/θ - (α - 1)/x} = θ.

When the Mean Excess Loss increases to infinity, it may be useful to look at the limit of x/e(x). Again
one applies L'Hospital's Rule twice.

lim x/e(x) = lim x S(x) / ∫_x^∞ S(t) dt = lim {-f(x) x + S(x)} / {-S(x)}
= lim {-f(x) - x f′(x) - f(x)} / f(x) = lim {-x f′(x)/f(x)} - 2.

For example for the LogNormal distribution:
f(x) = ψ/x, where ψ = exp[-0.5 ({ln(x) - μ}/σ)²] / {σ √(2π)}.
f′(x) = -ψ/x² - {(ln(x) - μ)/(x σ²)} (ψ/x).
lim x/e(x) = lim -x f′(x)/f(x) - 2 = lim {1 + (ln(x) - μ)/σ²} - 2 ≈ ln(x)/σ².
Thus for the LogNormal distribution the Mean Excess Loss increases to infinity, but a little less
quickly than linearly: e(x) ≈ x / {(ln(x) - μ)/σ² - 1} ≈ x σ²/ln(x).
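The slow approach to this asymptote can be seen numerically. The following Python sketch is an
added illustration (it assumes scipy.stats.norm is available); it evaluates the exact LogNormal e(x)
from the formula in the table above and compares it to both approximations, using the same
LogNormal with μ = 5.5 and σ = 1.678 graphed earlier.

# Exact LogNormal e(x) versus its two tail approximations.
from math import exp, log
from scipy.stats import norm

mu, sigma = 5.5, 1.678
mean = exp(mu + sigma**2 / 2)
for x in [1e4, 1e6, 1e8]:
    z = (log(x) - mu) / sigma
    e = mean * norm.sf(z - sigma) / norm.sf(z) - x     # exact e(x) for the LogNormal
    print(x, e, x / ((log(x) - mu) / sigma**2 - 1), x * sigma**2 / log(x))

Even at a loss size of 100 million, the simpler approximation x σ²/ln(x) is still well below the exact
value, while x / {(ln(x) - μ)/σ² - 1} is much closer, consistent with the comment above.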


As another example, for the Burr distribution:
f(x) = αγ x^(γ-1) θ^(-γ) (1 + (x/θ)^γ)^(-(α+1)).
f′(x) = (γ-1) f(x)/x - (α+1) γ x^(γ-1) θ^(-γ) f(x) / (1 + (x/θ)^γ).
lim x/e(x) = lim -x f′(x)/f(x) - 2 = lim -(γ-1) + (α+1) γ (x/θ)^γ / (1 + (x/θ)^γ) - 2
= -γ + 1 + (α+1)γ - 2 = αγ - 1.
The Mean Excess Loss for the Burr Distribution increases to infinity approximately linearly:
e(x) ≈ x/(αγ - 1), provided αγ > 1.
Moments, CV, and the Mean Excess Loss:
When the relevant moments are finite and the distribution has support x > 0, then one can compute
the moments of the distribution in terms of the mean excess loss, e(x).234
We have E[X] = e(0).235 We will now show how to write the second moment in terms of an integral
of the mean excess loss and the survival function.
As shown in a previous section:

E[X²] = 2 ∫_0^∞ S(t) t dt.

Note that the integral of S(t)/E[X] from x to infinity is the excess ratio, R(x), and thus
R′(x) = -S(x)/E[X]. Using this fact and integration by parts:

E[X²]/E[X] = 2 ∫_0^∞ t S(t)/E[X] dt = -2t R(t)]_(t=0)^∞ + 2 ∫_0^∞ R(t) dt.

For a finite second moment, t R(t) goes to zero as t goes to infinity, therefore:236

E[X²] = 2 E[X] ∫_0^∞ R(t) dt = 2 ∫_0^∞ S(t) e(t) dt.

234 Since e(x) determines the distribution, it follows that e(x) determines the moments if they exist.
235 The numerator of e(0) is the losses excess of zero, i.e. all the losses.
The denominator of e(0) is the number of losses larger than 0, i.e., the total number of losses.
The support of the distribution has been assumed to be x > 0.
236 I have used the fact that E[X] R(t) = S(t) e(t). Both are the losses excess of t.


Exercise: For a Pareto Distribution, what is the integral from zero to infinity of S(x) e(x)?
[Solution: S(x) = (1 + x/θ)^(-α). e(x) = (x + θ)/(α - 1) = θ(1 + x/θ)/(α - 1).
S(x) e(x) = θ (1 + x/θ)^(1-α) / (α - 1).

∫_0^∞ S(t) e(t) dt = {θ/(α-1)} ∫_0^∞ (1 + t/θ)^(1-α) dt = {θ/(α-1)} {θ/(α-2)} = θ²/{(α-1)(α-2)}.

Comment: The integral is one half of the second moment for a Pareto, consistent with the above
result.]
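This result can also be confirmed by direct numerical integration. The sketch below is an added check,
assuming scipy is available, for a Pareto with α = 4 and θ = 1000.

# Check that the integral of S(t)e(t) equals theta^2/{(alpha-1)(alpha-2)}, half the second moment.
from scipy.integrate import quad

alpha, theta = 4.0, 1000.0
S = lambda t: (theta / (theta + t)) ** alpha
e = lambda t: (theta + t) / (alpha - 1)
integral, _ = quad(lambda t: S(t) * e(t), 0, float("inf"))
print(integral)                                    # about 166,667
print(theta**2 / ((alpha - 1) * (alpha - 2)))      # 166,667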
Assume that the first two moments are finite and the distribution has support x > 0, and
e(x) > e(0) = E[X] for all x. Then:

E[X²] = 2 ∫_0^∞ S(t) e(t) dt > 2 ∫_0^∞ S(t) E[X] dt = 2 E[X] ∫_0^∞ S(t) dt = 2 E[X] E[X].

E[X²] > 2E²[X]. ⇒ E[X²]/E²[X] > 2. ⇒ CV² = E[X²]/E²[X] - 1 > 1. ⇒ CV > 1.
When the first two moments are finite and the distribution has support x > 0, then
if e(x) > e(0) = E[X] for all x, then the coefficient of variation is greater than one.237
Note that e(x) > e(0), if e(x) is monotonically increasing.
Examples where this result applies are the Gamma with α < 1, the Weibull with τ < 1, the Transformed
Gamma with α < 1 and τ < 1, and the Pareto.238 In each case CV > 1. Note that each of these
distributions is heavier-tailed than an Exponential, which has CV = 1.
While in the tail e(x) for the LogNormal approaches infinity, it is not necessarily true for the
LogNormal that e(x) > e(0) for all x. The mean excess loss of the LogNormal can decrease before it
finally increases as per x/ln(x) in the tail.
For example, here is a graph of the Mean Excess Loss for a LogNormal with μ = 1 and σ = 0.5:

237 See Section 3.4.5 in Loss Models.
238 For α > 2, so that the CV of the Pareto exists.


[Figure: e(x) for the LogNormal with μ = 1 and σ = 0.5, which decreases at first and then increases.]

In any case, the CV of the LogNormal is: √(exp(σ²) - 1). Thus for the LogNormal,
CV < 1 for σ < √(ln 2) ≈ 0.82, while CV > 1 for σ > √(ln 2) ≈ 0.82.
When the first two moments are finite and the distribution has support x > 0, then
if e(x) < e(0) = E[X] for all x, then CV < 1. Note that e(x) < e(0), if e(x) is monotonically decreasing.
Examples where this result applies are the Gamma with α > 1, the Weibull with τ > 1, and the
Transformed Gamma with α > 1 and τ > 1. In each case CV < 1. Note that each of these distributions is
lighter-tailed than an Exponential, which has CV = 1.
One can get similar results to those above for higher moments.
Exercise: Assuming the relevant moments are finite and the distribution has support x > 0,
express the integral from zero to infinity of S(x) xⁿ, in terms of moments.
[Solution: One applies integration by parts and the fact that dS(x)/dx = -f(x):

∫_0^∞ S(t) tⁿ dt = S(t) t^(n+1)/(n+1)]_(t=0)^∞ + ∫_0^∞ f(t) t^(n+1)/(n+1) dt = E[X^(n+1)]/(n+1).

Where I've used the fact that if the (n+1)st moment is finite, then S(x) x^(n+1) must go to zero as x
approaches infinity.
Comment: For n = 0 one gets the result that the mean is the integral of the survival function
from zero to infinity. For n = 1 one gets the result used above, that the integral of x S(x) from
zero to infinity is half of the second moment.]
Exercise: Assuming the relevant moments are finite and the distribution has support x > 0,
express the integral from zero to infinity of S(x) e(x) xⁿ, in terms of moments.
[Solution: S(x) e(x) = R(x) E[X]. Then one applies integration by parts, differentiating R(t) and
integrating tⁿ. Since the integral of S(t)/E[X] from x to infinity is R(x), the derivative of R(x) is
-S(x)/E[X].

∫_0^∞ S(t) e(t) tⁿ dt = E[X] ∫_0^∞ R(t) tⁿ dt
= E[X] {R(t) t^(n+1)/(n+1)]_(t=0)^∞ + ∫_0^∞ (S(t)/E[X]) t^(n+1)/(n+1) dt}
= (1/(n+1)) ∫_0^∞ S(t) t^(n+1) dt = (1/(n+1)) E[X^(n+2)]/(n+2) = E[X^(n+2)]/{(n+1)(n+2)}.

Where I've used the result of the previous exercise and the fact that if the (n+2)nd moment is
finite, then R(x) x^(n+1) must go to zero as x approaches infinity.]
Thus we can express moments, when they exist, either as integrals of S(t) e(t) times powers of t,
integrals of R(t) times powers of t, or as integrals of S(t) times powers of t.
Assuming the relevant moments are finite and the distribution has support x > 0, then
if e(x) > e(0) = E[X] for all x, we have for n ≥ 1:

E[X^(n+1)]/{n(n+1)} = ∫_0^∞ S(t) e(t) t^(n-1) dt > E[X] ∫_0^∞ S(t) t^(n-1) dt = E[X] E[Xⁿ]/n.

Thus if e(x) > e(0) for all x, E[X^(n+1)] > (n+1) E[X] E[Xⁿ], n ≥ 1.
For n = 1 we get a previous result: E[X²] > 2E²[X].
For n = 2 we get: E[X³] > 3E[X]E[X²].
Conversely, if e(x) < e(0) for all x, E[X^(n+1)] < (n+1) E[X] E[Xⁿ], n ≥ 1.


Equilibrium Distribution:
Given that X follows a distribution with survival function S_X, for x > 0, then Loss Models defines the
density of the corresponding equilibrium distribution as:239
g(y) = S_X(y) / E[X], y > 0.
Exercise: Demonstrate that the above is actually a probability density function.
[Solution: S_X(y)/E[X] ≥ 0.
∫_0^∞ (S_X(y)/E[X]) dy = (1/E[X]) ∫_0^∞ S_X(y) dy = E[X]/E[X] = 1.]
Exercise: If severity is Exponential with mean θ = 10, what is the density of the corresponding
equilibrium distribution?
[Solution: g(y) = S_X(y)/E[X] = exp(-y/10)/10.]
In general, if the severity is Exponential, then the corresponding equilibrium distribution is also
Exponential with the same mean.
Exercise: If severity is Pareto, with α = 5 and θ = 1000, what is the corresponding equilibrium
distribution?
[Solution: g(y) = S_X(y)/E[X] = (1 + y/1000)⁻⁵ / 250.
This is the density of another Pareto Distribution, but with α = 4 and θ = 1000.]
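A quick numerical check of this exercise (an added sketch, not part of the original) is to compare
S_X(y)/E[X] with the Pareto(α = 4, θ = 1000) density at a few points:

# The equilibrium density of a Pareto(5, 1000) equals the Pareto(4, 1000) density.
alpha, theta = 5.0, 1000.0
mean = theta / (alpha - 1)                                   # 250
S = lambda y: (theta / (theta + y)) ** alpha
pareto4_pdf = lambda y: 4 * theta**4 / (theta + y) ** 5      # Pareto(alpha = 4, theta = 1000) density
for y in [0.0, 500.0, 2000.0]:
    print(S(y) / mean, pareto4_pdf(y))                       # the two columns agree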
The distribution function of the corresponding equilibrium distribution is the loss elimination ratio of the
severity distribution:

G(y) = ∫_0^y (S_X(t)/E[X]) dt = (1/E[X]) ∫_0^y S_X(t) dt = E[X ∧ y]/E[X] = LER(y).

Therefore the survival function of the corresponding equilibrium distribution is the excess ratio of the
severity distribution.

239 See Equation 3.20 in Loss Models.


For example, if severity is Pareto, the excess ratio is R(x) = {θ/(θ+x)}^(α-1), which is the survival
function of a Pareto with the same scale parameter and a shape parameter one less. Thus if severity is
Pareto (with α > 1), then the corresponding equilibrium distribution is also Pareto,
with the same scale parameter and shape parameter of α - 1.
The mean of the corresponding equilibrium distribution is:240

∫_0^∞ y (S_X(y)/E[X]) dy = (1/E[X]) ∫_0^∞ y S_X(y) dy = E[X²] / {2E[X]}.

The second moment of the corresponding equilibrium distribution is:

∫_0^∞ y² (S_X(y)/E[X]) dy = (1/E[X]) ∫_0^∞ y² S_X(y) dy = E[X³] / {3E[X]}.

Exercise: If severity is Pareto, with α = 5 and θ = 1000, what are the mean and variance of the
corresponding equilibrium distribution?
[Solution: The mean of the Pareto is: 1000/4 = 250. The second moment of the Pareto is:
2(1000²)/{(5-1)(5-2)} = 166,667. The third moment of the Pareto is:
6(1000³)/{(5-1)(5-2)(5-3)} = 250 million. The mean of the corresponding equilibrium
distribution is: E[X²]/{2E[X]} = 166,667/500 = 333.33.
The second moment of the corresponding equilibrium distribution is: E[X³]/{3E[X]} =
250 million/750 = 333,333. Thus the variance of the corresponding equilibrium distribution is:
333,333 - 333.33² = 222,222. Alternately, the corresponding equilibrium distribution is a
Pareto Distribution, but with α = 4 and θ = 1000. This has mean: 1000/3 = 333.33, second
moment: 2(1000²)/{(3)(2)} = 333,333, and variance: 333,333 - 333.33² = 222,222.]
The hazard rate of the corresponding equilibrium distribution is:
(density of the corresponding equilibrium distribution) / (survival function of the corresponding
equilibrium distribution) = {S(x)/E[X]} / R(x) = S(x) / {E[X] R(x)}
= S(x) / (expected losses excess of x) = 1/e(x).
The hazard rate of the corresponding equilibrium distribution is the inverse of the mean excess loss.

240 See Section 3.4.5 in Loss Models.

2013-4-2,

Loss Distributions, 33 Mean Excess Loss

HCM 10/8/12,

Page 522

Curtate Expectation of Life:

The mean excess loss is mathematically equivalent to what is called the complete expectation of life,
e̊_x ≡ e(x).
Exercise: Five individuals live 53.2, 66.3, 70.8, 81.0, and 83.5 years.
What is the observed e(60) = e̊_60 for this group?
[Solution: (6.3 + 10.8 + 21 + 23.5)/4 = 15.4.]
If instead we ignore any fractions of a year lived, then we get what is called the curtate expectation of
life, e_x.
Exercise: Five individuals live 53.2, 66.3, 70.8, 81.0, and 83.5 years.
What is the observed e_60 for this group?
[Solution: (6 + 10 + 21 + 23)/4 = 15.0.]
Since we are ignoring any fractions of a year lived, e_x ≤ e̊_x.
On average we are ignoring about 1/2 year of life; therefore, e_x ≈ e̊_x - 1/2.
Just as we can write e(x) = e̊_x in terms of an integral of the Survival Function:241

e̊_x = ∫_x^∞ S(t) dt / S(x),

one can write the curtate expectation of life in terms of a summation of Survival Functions:242

e_x = Σ_(t=x+1)^∞ S(t) / S(x).        e_0 = Σ_(t=1)^∞ S(t).

Exercise: Determine e_x for an Exponential Distribution with mean θ.
[Solution: e_x = Σ_(t=x+1)^∞ S(t)/S(x) = Σ_(t=1)^∞ e^(-(x+t)/θ) / e^(-x/θ) = Σ_(t=1)^∞ e^(-t/θ)
= e^(-1/θ)/(1 - e^(-1/θ)) = 1/(e^(1/θ) - 1).
Comment: e_x = 1/(e^(1/θ) - 1) ≈ 1/{1/θ + 1/(2θ²)} = θ/{1 + 1/(2θ)} ≈ θ{1 - 1/(2θ)} = θ - 1/2.]

241 See equation 3.5.2 in Actuarial Mathematics, with _t p_x = S(x+t)/S(x).
242 See equation 3.5.7 in Actuarial Mathematics, with _k p_x = S(x+k)/S(x).

For example, for θ = 10, e_x = 1/(e^0.1 - 1) = 9.508.
This compares to e̊_x = θ = 10.
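This comparison is easy to reproduce. The following Python sketch is an added check; the truncation
point of the sum is arbitrary but far enough out that the omitted tail is negligible.

# Curtate expectation for an Exponential with theta = 10: direct sum versus closed form.
from math import exp

theta = 10.0
direct = sum(exp(-t / theta) for t in range(1, 2001))    # terms beyond t = 2000 are negligible
print(direct, 1 / (exp(1 / theta) - 1), theta - 0.5)     # 9.508, 9.508, 9.5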
Exercise: Determine e_0 for a Pareto Distribution with α = 2 and θ = 1.
[Solution: e_0 = S(1) + S(2) + S(3) + ... = (1/2)² + (1/3)² + (1/4)² + (1/5)² + ... = π²/6 - 1 = 0.645.
Comment: e(0) = E[X] = θ/(α - 1) = 1.]


Problems:
33.1 (2 points) Assume you have a Pareto distribution with = 5 and = $1000.
What is the Mean Excess Loss at $2000?
A. less than $500
B. at least $500 but less than $600
C. at least $600 but less than $700
D. at least $700 but less than $800
E. at least $800
33.2 (1 point) Assume you have a distribution F(x) = 1 - e-x/666.
What is the Mean Excess Loss at $10,000?
A. less than $500
B. at least $500 but less than $600
C. at least $600 but less than $700
D. at least $700 but less than $800
E. at least $800
33.3 (3 points) The random variables X and Y have joint density function
f(x, y) = 60,000,000 x exp(-10x2 ) / (100 + y)4 , 0 < x < , 0 < y < .
Determine the Mean Excess Loss function for the marginal distribution of Y evaluated at Y = 1000.
A. less than 200
B. at least 200 but less than 300
C. at least 300 but less than 400
D. at least 400 but less than 500
E. at least 500
33.4 (1 point) Which of the following distributions would be most useful for modeling the age at
death of humans?
A. Gamma B. Inverse Gaussian C. LogNormal D. Pareto E. Weibull
33.5 (1 point) Given the following empirical mean excess losses for 500 claims:
x
0
5
10
15
25
50
100 150 200 250 500 1000
e(x) 15.6 16.7 17.1 17.4 17.6 18.0 18.2 18.3 18.3 18.4 18.5 18.5
Which of the following distributions would be most useful for modeling this data?
A. Gamma with > 1

B. Gamma with < 1

D. Weibull with > 1

E. Weibull with < 1

C. Pareto


33.6 (2 points) You have a Pareto distribution with parameters and .


If e(1000)/e(100) = 2.5, what is ?
A. 100

B. 200

C. 300

D. 400

E. 500

For the following three questions, assume you have a LogNormal distribution with parameters
= 11.6, = 1.60.
33.7 (3 points) What is the Mean Excess Loss at $100,000?
A. less than $500,000
B. at least $500,000 but less than $600,000
C. at least $600,000 but less than $700,000
D. at least $700,000 but less than $800,000
E. at least $800,000
33.8 (1 point) What is the average size of those losses greater than $100,000?
A. less than $500,000
B. at least $500,000 but less than $600,000
C. at least $600,000 but less than $700,000
D. at least $700,000 but less than $800,000
E. at least $800,000
33.9 (2 points) What percent of the total loss dollars are represented by those losses greater than
$100,000?
A. less than 0.91
B. at least 0.91 but less than 0.92
C. at least 0.92 but less than 0.93
D. at least 0.93 but less than 0.94
E. at least 0.94

33.10 (2 points) You observe the following 35 losses: 6, 7, 11, 14, 15, 17, 18, 19, 25, 29, 30, 34,
40, 41, 48, 49, 53, 60, 63, 78, 85, 103, 124, 140, 192, 198, 227, 330, 361, 421, 514, 546, 750,
864, 1638.
What is the (empirical) Mean Excess Loss at 500?
A. less than 350
B. at least 350 but less than 360
C. at least 360 but less than 370
D. at least 370 but less than 380
E. at least 380


Use the following information for the next two questions:

The annual frequency of ground up losses is Negative Binomial with r = 4 and = 1.3.
The sizes of ground up losses follow a Pareto Distribution with = 3 and = 5000.
There is a franchise deductible of 1000.
33.11 (2 points) Determine the insurers average payment per nonzero payment.
(A) 2500
(B) 3000
(C) 3500
(D) 4000
(E) 4500
33.12 (2 points) Determine the insurer's expected annual payments.
(A) 8,000
(B) 9,000
(C) 10,000 (D) 11,000 (E) 12,000
33.13 (3 points) F is a continuous size of loss distribution on (0, ).
LER(x) is the corresponding loss elimination ratio at x.
Which of the following are true?
A. F(x) LER(x) for all x > 0.
B. F(x) LER(x) for all x > c, for some c > 0.
C. F(x) LER(x) for all x > 0, if and only if e(x) e(0).
D. F(x) LER(x) for all x > 0, if and only F is an Exponential Distribution.
E. None of A, B, C or D is true.

33.14 (2 points) The size of loss follows an Exponential Distribution with = 5.


The largest integer contained in each loss is the amount paid for that loss.
For example, a loss of size 3.68 results in a payment of 3.
What is the expected payment?
A. 4.46
B. 4.48
C. 4.50
D. 4.52
E. 4.54
33.15 (2 points) Losses follow a Single Parameter Pareto distribution with = 1000 and > 1.
Determine the ratio of the Mean Excess Loss function at x = 3000 to the Mean Excess Loss function
at x = 2000.
A. 1
B. 4/3
C. 3/2
D. 2
E. Cannot be determined from the given information.
33.16 (3 points) For a Gamma Distribution with = 2, what is the behavior of the mean excess loss
e(x) as x approaches infinity?


33.17 (4, 5/86, Q.58) (2 points) For a certain machine part, the Mean Excess Loss e(x) varies as
follows with the age (x) of the part:
Age x
e(x)
5 months
12.3 months
10
18.6
20
34.3
50
69.1
Which of the following continuous distributions best fits this pattern of Mean Excess Loss?
A. Exponential

B. Gamma ( > 1)

D. Weibull ( > 1)

E. Normal

C. Pareto

33.18 (160, 5/87, Q.1) (2.1 points) You are given the following survival function:
S(x) = (b - x/a)1/2, 0 x k. The median age is 75. Determine e(75).
(A) 8.3
(B) 12.5
(C) 16.7
(D) 20.0
(E) 33.3
33.19 (4B, 5/92, Q.14) (1 point) Which of the following statements are true about the Mean
Excess Loss function e(x)?
1. If e(x) increases linearly as x increases, this suggests that a Pareto model may be appropriate.
2. If e(x) decreases as x increases, this suggests that a Weibull model may be appropriate.
3. If e(x) remains constant as x increases, this suggests that an exponential model may be
appropriate.
A. 1 only
B. 2 only
C. 1 and 3 only
D. 2 and 3 only
E. 1, 2, and 3
33.20 (4B, 5/93, Q.24) (2 points) The underlying distribution function is assumed to be the
following: F(x) = 1 - e-x/10, x 0
Calculate the value of the Mean Excess Loss function e(x), for x = 8.
A. less than 7.00
B. at least 7.00 but less than 9.00
C. at least 9.00 but less than 11.00
D. at least 11.00 but less than 13.00
E. at least 13.00
33.21 (4B, 5/94, Q.4) (2 points) You are given the following information from an unknown size of
loss distribution for random variable X:
Size k ($000s)
1
3
5
7
9
Count of X k
180 118
75
50
34
Sum of X k
990 882 713 576 459
If you are using the empirical Mean Excess Loss function to help you select a distributional family for
fitting the empirical data, which of the following distributional families should you attempt to fit first?
A. Pareto
B. Gamma C. Exponential
D. Weibull E. Lognormal


33.22 (4B, 5/95, Q.21) (3 points) Losses follow a Pareto distribution, with parameters and
> 1. Determine the ratio of the Mean Excess Loss function at x = 2 to the Mean Excess Loss
function at x = .
A. 1/2
B. 1
C. 3/2
D. 2
E. Cannot be determined from the given information.
33.23 (4B, 11/96, Q.22) (2 points) The random variable X has the density function
f(x) = e-x// , 0 < x < , > 0.
Determine e(), the Mean Excess Loss function evaluated at .
B.

A. 1

D. /e

C. 1/

E. e/

33.24 (4B, 5/97, Q.13) (1 point) Which of the following statements are true?
1. Empirical Mean Excess Loss functions are continuous.
2. The Mean Excess Loss function of an exponential distribution is constant.
3. If it exists, the Mean Excess Loss function of a Pareto distribution is decreasing.
A. 2
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3
33.25 (4B, 5/98, Q.3) (3 points) The random variables X and Y have joint density function
f(x, y) = exp(-2x - y/2)
0 < x < , 0 < y < .
Determine the Mean Excess Loss function for the marginal distribution of X evaluated at
X = 4.
A. 1/4
B. 1/2
C. 1
D. 2
E. 4
33.26 (4B, 11/98, Q.6) (2 points) Loss sizes follow a Pareto distribution, with parameters
= 0.5 and = 10,000. Determine the Mean Excess Loss at 10,000.
A. 5,000

B. 10,000

C. 20,000

D. 40,000

E.

33.27 (4B, 11/99, Q.25) (2 points) You are given the following:
The random variable X follows a Pareto distribution, as per Loss Models, with parameters = 100
and = 2.
The mean excess loss function, eX(k), is defined to be E[X - k I X k].
Determine the range of eX(k) over its domain of [0, ).
A. [0, 100]

B. [0, )

C. 100

D. [100, )

E.


33.28 (4B, 11/99, Q.27) (2 points) You are given the following:
The random variable X follows a Pareto distribution, as per Loss Models, with parameters = 100
and = 2 .
The mean excess loss function, eX(k), is defined to be E[X - k I X k].
Z = min(X, 500).
Determine the range of eZ(k) over its domain of [0, 500].
A. [0, 150]

B. [0, )

C. [100, 150]

D. [100, )

E. [150, )

33.29 (SOA3, 11/04, Q.24) (2.5 points) The future lifetime of (0) follows a two-parameter Pareto
distribution with = 50 and = 3.
Calculate e 20 .
(A) 5

(B) 15

(C) 25

(D) 35

(E) 45

33.30 (CAS3, 5/05, Q.4) (2.5 points) Well-Traveled Insurance Company sells a travel insurance
policy that reimburses travelers for any expenses incurred for a planned vacation that is canceled
because of airline bankruptcies. Individual claims follow a Pareto distribution with = 2 and = 500.
Because of financial difficulties in the airline industry, Well-Traveled imposes a limit of $1,000 on each
claim. If a policyholder's planned vacation is canceled due to airline bankruptcies and he or she has
incurred more than $1,000 in expenses, what is the expected non-reimbursed amount of the claim?
A. Less than $500
B. At least $500, but less than $1,000
C. At least $1,000, but less than $1,500
D. At least $1,500, but less than $2,000
E. $2,000 or more
33.31 (SOA M, 5/05, Q.9 & 2009 Sample Q.162) (2.5 points) A loss, X, follows a 2-parameter
Pareto distribution with = 2 and unspecified parameter . You are given:
E[X - 100 | X > 100] = (5/3) E[X - 50 | X > 50].
Calculate E[X - 150 | X > 150].
(A) 150
(B) 175
(C) 200
(D) 225

(E) 250


33.32 (CAS3, 11/05, Q.10) (2.5 points)


You are given the survival function s(x) as described below:

s(x) = 1 - x/40 for 0 x 40.


s(x) is zero elsewhere.
Calculate e 25, the complete expectation of life at age 25.
A. Less than 7.7
B. At least 7.7 , but less than 8.2
C. At least 8.2, but less than 8.7
D. At least 8.7, but less than 9.2
E. At least 9.2
33.33 (CAS3, 5/06, Q.38) (2.5 points) The number of calls arriving at a customer service center
follows a Poisson distribution with = 100 per hour. The length of each call follows an exponential
distribution with an expected length of 4 minutes. There is a $3 charge for the first minute or any
fraction thereof and a charge of $1 per minute for each additional minute or fraction thereof.
Determine the total expected charges in a single hour.
A. Less than $375
B. At least $375, but less than $500
C. At least $500, but less than $625
D. At least $625, but less than $750
E. At least $750


Solutions to Problems:
33.1. D. e(2000) = {mean - E[X ∧ 2000]} / S(2000) = (250 - 246.9) / 0.00411 = $754.
Alternately, for the Pareto, e(x) = (θ + x)/(α - 1) = 3000/4 = $750.
Alternately, a Pareto truncated and shifted at 2000 is another Pareto with α = 5 and
θ = 1000 + 2000 = 3000. e(2000) is the mean of this new Pareto: 3000/(5 - 1) = $750.
Alternately, e(2000) = ∫_0^∞ _t p_2000 dt = ∫_0^∞ S(2000 + t)/S(2000) dt =
∫_0^∞ (2000 + θ)^α / (2000 + θ + t)^α dt = (2000 + θ)/(α - 1) = 3000/4 = $750.
33.2. C. For the exponential distribution the mean excess loss is a constant; it is equal to the mean.
The mean in this case is = $666.
33.3. E. X and Y are independent since the support doesnt depend on x or y and the density can
be factored into a product of terms each just involving x and y.
f(x, y) = 60,000,000 x exp(-10x2 ) / (100 + y)4 = {20 x exp(-10x2 )} {3000000 / (100 + y)4 }.
The former is the density of a Weibull Distribution with = 1/ 10 and = 2. The latter is the density
of a Pareto Distribution with = 3 and = 100. When one integrates from x = 0 to in order to get
the marginal distribution of y, one is left with just a Pareto, since the Weibull integrates to unity and
the Pareto is independent of x. Thus the marginal distribution is just a Pareto, with parameters = 3
and = 100. Thus e(y) = ( + y)/( - 1) = (100 + y)/(3 - 1).
e(1000) = 1100/2 = 550.
33.4. E. Of these distributions, only the Weibull (for >1) has mean residual lives decline to zero.
The Weibull (for >1) has the force of mortality increase as the age approaches infinity, as is
observed for humans. The other distributions have the force of mortality decline or approach a
positive constant as the age increases.
33.5. B. The empirical mean residual lives seem to be increasing towards a limit of about 18.5 as x
approaches infinity. This is the behavior of a Gamma with alpha less than 1.
The other distributions given all exhibit different behaviors than this.

33.6. E. For the Pareto Distribution: e(x) = (E[X] - E[X ∧ x])/S(x) = (x + θ)/(α - 1).
e(1000)/e(100) = (1000 + θ)/(100 + θ) = 2.5. ⇒ θ = 500.
Comment: One can not determine α from the given information.
33.7. C. E[X ∧ x] = exp(μ + σ²/2) Φ[(ln(x) - μ - σ²)/σ] + x {1 - Φ[(ln(x) - μ)/σ]}.
E[X ∧ 100000] = exp(12.88) Φ[-1.65] + (100000){1 - Φ[-0.05]} =
(392,385)(1 - 0.9505) + (100000)(0.5199) = 71,413. For the LogNormal,
E[X] = exp(μ + σ²/2) = exp(12.88) = 392,385. For the LogNormal,
F(x) = Φ[{ln(x) - μ}/σ]. F(100000) = Φ[-0.05] = 1 - 0.5199. Therefore,
e(100000) = {E[X] - E[X ∧ 100000]}/{1 - F(100000)} = (392,385 - 71,413)/0.5199 ≈ $617,000.
Alternately, for the LogNormal distribution,
e(x) = exp(μ + σ²/2){1 - Φ[(ln(x) - μ - σ²)/σ]} / {1 - Φ[(ln(x) - μ)/σ]} - x.
For μ = 11.6, σ = 1.60, e(100000) = exp(12.88)(1 - Φ[-1.65])/{1 - Φ[-0.05]} - 100000 =
(392,385)(0.9505)/(0.5199) - (100000) = $617 thousand.
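Re-doing this calculation with full-precision normal probabilities, rather than the two-decimal table
values, gives essentially the same answer. The sketch below is an added check and assumes
scipy.stats.norm is available.

# Solution 33.7 with unrounded Phi values.
from math import exp, log
from scipy.stats import norm

mu, sigma, x = 11.6, 1.60, 100000.0
mean = exp(mu + sigma**2 / 2)
z = (log(x) - mu) / sigma
e = mean * norm.sf(z - sigma) / norm.sf(z) - x
print(e)      # roughly 615,000, consistent with the $617,000 obtained from the rounded Phi values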
33.8. D. The average size of those claims greater than $100,000 is: $100,000 + e(100000). But from
the previous question e(100000) ≈ $617,000. Therefore, the solution is approximately $717,000.
33.9. E. Use the results from the previous two questions. F(100000) = Φ[-0.05] = 1 - 0.5199.
Thus, S(100000) = 0.5199. E[X] = exp(μ + σ²/2) = exp(12.88) = 392,385.
Percent of the total loss dollars represented by those losses greater than $100,000 =
S(100000)(size of those claims greater than $100,000)/E[X] = (0.5199)(717,000)/392,385 = 0.95.
Alternately, the losses represented by the small losses are:
E[X ∧ 100000] - S(100000)(100000) = 71,413 - 51,990 = 19,423.
Divide by the mean of 392,385 to get 0.049 of the losses from small claims.
Thus the percentage of losses from large claims is: 1 - 0.049 = 0.95.
33.10. C. Each claim above 500 contributes its excess above 500 and then divide by the number
of claims greater than 500. e(500) = {14 + 46+250+364+1138}/ 5 = 362.4.


33.11. D. With a franchise deductible the insurer pays the full value of every large loss and pays
nothing for small losses. Therefore, the Pareto Distribution has been truncated from below at 1000.
The mean of a distribution truncated and shifted from below at 1000 is e(1000) the mean of a
distribution truncated from below at 1000 is: e(1000) + 1000.
For a Pareto Distribution e(x) = (E[X] - E[X

x])/S(x) = (x + )/( - 1).

e(1000) = (1000 + 5000)/(3 - 1) = 3000. e(1000) + 1000 = 4000.


33.12. E. For the Pareto Distribution, S(1000) = {5000/(5000 + 1000)}3 = 0.5787.
Mean frequency = r = (4)(1.3) = 5.2. Expected # of nonzero payments = (0.5787)(5.2) = 3.009.
From the previous solution, average nonzero payment is 4000.
Expected annual payments = (3.009)(4000) = 12,036.
Alternately, with a franchise deductible of 1000 the payment is 1000 more than that for an ordinary
deductible for each large loss, and thus the average payment per loss is:
E[X] - E[X 1000] + 1000S(1000) =
(5000/2) - (5000/2){1 - (5000/6000)2 } + (1000)(5000/6000)3 = 2315.
Expected annual payments = (5.2)(2315) = 12,038.
33.13. C. F(x) - LER(x) = 1 - S(x) - {1 - S(x)e(x)/E[X]} = {S(x)/E[X]}{e(x) - E[X]} =
{S(x)/E[X]}{e(x) - e(0)}. Therefore, F(x) LER(x) e(x) e(0).
Alternately, e(x)/e(0) = {E[(X - x)+]/S(x)}/E[X] = {E[(X - x)+]/E[X]}/S(x) = R(x)/S(x).
Therefore, e(x) e(0). R(x) S(x). LER(x) = 1 - R(x) 1 - F(x) = S(x).
Comment: For an Exponential Distribution, e(x) = e(0) = , and therefore F(x) = LER(x).
For a Pareto Distribution with > 1, e(x) increases linearly, and therefore F(x) > LER(x).
33.14. D. The expected payment is the curtate expectation of life at zero.

e0 = S(t) = S(1) + S(2) + S(3) + ... = e-1/5 + e-2/5 + e-3/5 + ...


t=1

= e-1/5/(1 - e-1/5) = 1/(e0.2 - 1) = 4.517.


Comment: Approximately 1/2 less than the mean of 5.
33.15. C. e(x) = {E[X] - E[X

x]}/S(x) = (/(1) - {/(1) /((1)x1)})/{(/x)}

= x/(-1). e(3000)/e(2000) = 3000/2000 = 3/2.


Comment: Similar to 4B, 5/95, Q.21.


33.16. The value of the scale parameter does not affect the behavior, for simplicity set = 1.
f(x) = x e-x, x > 0.

S(x) =

x f(t) dt = x e-x + e-x.

e(x) =

x S(t) dt / S(x) = (x e-x + 2e-x) / (x e-x + e-x) = 1 + 1/(1+x).

Thus, as x approaches infinity, e(x) decreases to a constant.


Comment: In this case the limit of e(x) is one, while in general it is .
In general, for > 1, e(x) decreases to a constant, while h(x) increases to a constant.
For < 1, e(x) increases to a constant, while h(x) decreases to a constant.
For = 1, we have an Exponential, and e(x) and h(x) are each constant.
For = 2 and = 1, h(x) = f(x) / S(x) = x e-x / (x e-x + e-x) = x / (x + 1).
33.17. C. The mean residual life increases approximately linearly, which indicates a Pareto.
Comment: The Pareto has a mean residual life that increases linearly. The Exponential has a constant
mean residual life. For a Gamma with > 1 the mean residual life decreases towards a horizontal
asymptote. For a Weibull with > 1 the mean residual life decreases to zero. For a Normal
Distribution the mean residual life decreases to zero.
33.18. C. We want S(0) = 1. b = 1. We want S(k) = 0. k = a.
0.5 = S(75) = (1 - 75/a)1/2. a = 100.
100

100

75

75

e(75) = S(x) dx /S(75) = (1 - x / 100)1/ 2 dx / 0.5 = (25/3)/0.5 = 16.7.

33.19. E. 1. T. Mean Residual Life of the Pareto Distribution increases linearly.


2. T. The Weibull Distribution for > 1 has the mean residual life decrease (to zero.)
3. T. The mean residual life for the Exponential Distribution is constant.
33.20. C. For the Exponential Distribution, e(x) = mean = = 10.


33.21. C. The empirical mean residual life is calculated as:
e(k) = ($ excess of k) / (# claims > k) = {($ on claims > k) / (# claims > k)} - k =
(average size of those claims of size greater than k) - k.

Size k ($000)   # claims ≥ k   Sum of X ≥ k   Average size of claims ≥ k   e(k)
1               180            990            5.500                        4.500
3               118            882            7.475                        4.475
5               75             713            9.507                        4.507
7               50             576            11.520                       4.520
9               34             459            13.500                       4.500

Since the mean residual life is approximately constant, one would attempt first to fit an exponential
distribution, since it has a constant mean residual life.
33.22. C. For the Pareto, e(x) = (x + θ)/(α - 1). e(2θ) = 3θ/(α - 1). e(θ) = 2θ/(α - 1).
e(2θ)/e(θ) = 3/2.
Comment: If one doesnt remember the formula for the mean residual life of the Pareto, it is a longer
question. In that case, one can compute: e(x) = (mean - E[X x]) / (1 -F(x)).
33.23. B. The mean residual life of the Exponential is a constant equal to its mean, here .
33.24. A. 1. False. The empirical mean residual life is the ratio of the observed losses excess of
the limit divided by the number of observed claims greater than the limit. While the numerator is
continuous, the denominator is not. For example, assume you observe 3 claims of sizes 2, 6 and
20. Then e(5.999) = {(20-5.999) + (6-5.999)}/2 = 7.001, while e(6.001) = (20-6.001)/1 = 13.999.
The limit of e(x) as x approaches 6 from below is 7, while the limit of e(x) as x approaches 6 from
above is 14. Thus the empirical mean residual life is discontinuous at 6. 2. True. 3. False. For the
Pareto Distribution, the mean residual life increases (linearly).
Comment: A function is continuous at a point x, if and only if the limits approaching x from below and
above both exist and are each equal to the value of the function at x. The empirical mean residual life
is discontinuous at points at which there are observed claims, since so are the Empirical Distribution
Function and the tail probability. In contrast, the empirical Excess Ratio and empirical Limited
Expected Value are continuous. The numerator of the Excess Ratio is the observed losses excess
of the limit; the denominator is the total observed losses. This numerator is continuous, while this
denominator is independent of x. Thus the empirical Excess Ratio is continuous. The numerator of
the empirical Limited Expected Value is the observed losses limited by the limit; the denominator is
the total number of observed claims. This numerator is continuous, while this denominator is
independent of x. Thus the empirical Limited Expected Value is continuous.


33.25. B. The marginal distribution of X is obtained by integrating with respect to y:

f(x) =

exp(-2x - y/2)dy = e-2x e-y/2 dy = e-2x (-2e-y/2) ] = 2e-2x.

y=0

y=0

y=0

Thus the marginal distribution is an Exponential with a mean of 1/2. It has a mean residual life of 1/2,
regardless of x.
33.26. E. The mean excess loss for the Pareto only exists for α > 1.
For α ≤ 1 the relevant integral is infinite.

e(x) = ∫_x^∞ t f(t) dt / S(x) - x.

For a Pareto with α = 0.5, ∫_x^∞ t f(t) dt = ∫_x^∞ t (0.5)(θ^0.5)(θ + t)^(-1.5) dt.

For large t, the integrand is proportional to t t^(-1.5) = t^(-0.5), whose integral approaches infinity as
the upper limit of the integral approaches infinity. (The integral of t^(-0.5) is 2t^(0.5).)
Alternately, e(x) = (E[X] - E[X ∧ x]) / S(x). The limited expected value E[X ∧ x] is finite (it is less than
x), as is S(x). However, for α ≤ 1, the mean E[X] (does not exist or) is infinite. Therefore, so is the
mean excess loss.
Comment: While choice E is the best of those available, in my opinion a better answer might have
been that the mean excess loss does not exist.
33.27. D. For the Pareto Distribution e(k) = (k + θ)/(α - 1) = k + 100.
Therefore, as k goes from zero to infinity, e(k) goes from 100 to infinity.


33.28. A. Z is for data censored at 500, corresponding to a maximum covered loss of 500.
eZ(k) = (dollars of loss excess of k) / S(k) = (E[X ∧ 500] - E[X ∧ k]) / S(k).
E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)}, for the Pareto.
Thus (E[X ∧ 500] - E[X ∧ k])/S(k) = 100{100/(100+k) - 100/600}{(100+k)/100}² =
(100 + k){600 - (100 + k)}/600 = (100 + k)(500 - k)/600.
eZ(0) = 83.33.
eZ(500) = 0.
Setting the derivative equal to zero: (400 - 2k)/600 = 0. ⇒ k = 200. eZ(200) = 150.
Thus the maximum over the interval is 150, while the minimum is 0.
Therefore, as k goes from zero to 500, eZ(k) is in the interval [0, 150].
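A brute-force check of this range (an added sketch, not part of the original solution) is to evaluate
eZ(k) on a grid of values of k:

# eZ(k) for the Pareto with alpha = 2, theta = 100, data censored at 500.
alpha, theta, top = 2.0, 100.0, 500.0
lim_ev = lambda x: (theta / (alpha - 1)) * (1 - (theta / (theta + x)) ** (alpha - 1))
S = lambda x: (theta / (theta + x)) ** alpha
e_z = [(lim_ev(top) - lim_ev(k)) / S(k) for k in [0, 100, 200, 300, 400, 500]]
print([round(v, 2) for v in e_z])    # [83.33, 133.33, 150.0, 133.33, 83.33, 0.0]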
33.29. D. E[X] = /(-1) = 50/(3-1) = 25.
E[X

20] = {/(-1)}{1 - (/(+x))1} = (25){1 - (50/(50 + 20))2 } = 12.245.

S(20) = (50/(50 + 20))3 = 0.3644.


e(20) = (E[X] - E[X 20])/S(20) = (25 - 12.245)/.3644 = 35.
Alternately, for the Pareto, e(x) = (x + )/( - 1). e(20) = (20 + 50)/(3 - 1) = 35.
33.30. D. Given someone incurred more than $1,000 in expenses, the expected
non-reimbursed amount of the claim is the mean residual life at 1000, e(1000).
For the Pareto, e(x) = (x + θ)/(α - 1). e(1000) = (1000 + 500)/(2 - 1) = 1500.
Alternately, (E[X] - E[X ∧ 1000])/S(1000) = {500 - 500(1 - 500/1500)}/(500/1500)² = 1500.
Alternately, a Pareto truncated and shifted from below is another Pareto, with parameters α and
θ + d. Therefore, the unreimbursed amounts follow a Pareto Distribution with parameters α = 2 and
θ = 500 + 1000 = 1500, with mean 1500/(2 - 1) = 1500.
33.31. B. e(d) = E[X - d | X > d] = (E[X] - E[X ∧ d])/S(d) = {θ - θ(1 - θ/(θ + d))}/{θ/(θ + d)}² = θ + d.
The given equation states e(100) = (5/3)e(50). 100 + θ = (5/3)(50 + θ). θ = 25.
E[X - 150 | X > 150] = e(150) = 150 + 25 = 175.
Comment: A Pareto truncated and shifted from below is another Pareto, with parameters α and
θ + d. e(x) = (x + θ)/(α - 1).

33.32. A. e(25) = ∫_25^40 S(x) dx / S(25) = ∫_25^40 (1 - x/40) dx / (1 - 25/40) = 2.8125 / 0.375 = 7.5.
Alternately, the given survival function is a uniform distribution on 0 to 40.
At age 25, the future lifetime is uniform from 0 to 15, with an average of 7.5.
Comment: DeMoivre's Law with ω = 40.
33.33. D. The charge per call of length t is: 3 + 1(if t > 1) + 1(if t > 2) + 1(if t > 3) + 1(if t > 4) + ...
The expected charge per call is: 3 + S(1) + S(2) + S(3) + ... = 3 + e^(-1/4) + e^(-2/4) + e^(-3/4) + ...
= 3 + e^(-1/4)/(1 - e^(-1/4)) = 6.521. (100)(6.521) = 652.1.
Comment: Ignore the possibility that a call lasts exactly an integer, since the Exponential is a
continuous distribution. Then the cost of a call is: 3 + curtate lifetime of a call.
For example, if the call lasted 4.6 minutes, the cost is: 3 + 4 = 7.
The expected cost of a call is: 3 + curtate expected lifetime of a call.
e₀ = Σ_{t=1}^∞ S(t) = Σ_{t=1}^∞ e^(-t/4) = e^(-1/4)/(1 - e^(-1/4)) = 1/(e^(1/4) - 1) = 3.521. (100)(3 + 3.521) = 652.1.
e₀ ≈ e(0) - 1/2 = E[X] - 1/2 = 4 - 1/2 = 3.5. (100)(3 + 3.5) = 650, close to the exact answer.


Section 34, Hazard Rate

The hazard rate, force of mortality, or failure rate, is defined as: h(x) = f(x)/S(x), x ≥ 0.

h(x) can be thought of as the failure rate of machine parts. The hazard rate can also be interpreted as
the force of mortality = probability of death / chance of being alive.²⁴³ For a given age x, h(x) is the
density of the deaths, divided by the number of people still alive.
Exercise: F(x) = 1 - e^(-x/10). What is the hazard rate?
[Solution: h(x) = f(x)/S(x) = (e^(-x/10)/10)/e^(-x/10) = 1/10.]
The hazard rate determines the survival (distribution) function and vice versa.
d ln(S(x))/dx = {dS(x)/dx}/S(x) = -f(x)/S(x) = -h(x).
Thus h(x) = -d ln(S(x))/dx.

S(x) = exp[-∫_0^x h(t) dt].²⁴⁴

S(x) = exp[-H(x)], where H(x) = ∫_0^x h(t) dt.²⁴⁵

Note that h(x) = f(x)/S(x) ≥ 0. S(∞) = exp[-H(∞)] = 0 ⇔ H(∞) = ∫_0^∞ h(t) dt = ∞.
If h(x) ≥ 0, then H(x) is nondecreasing and therefore, S(x) = exp[-H(x)] is nonincreasing.
H(x) usually increases, while S(x) decreases, although H(x) and S(x) can be constant on an interval.
Since H(0) = 0, S(0) = exp[-0] = 1.
A function h(x) defined for x > 0 is a legitimate hazard rate, in other words it corresponds
to a legitimate survival function, if and only if h(x) ≥ 0 and the integral of h(x) from 0 to
infinity is infinite.
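For those who like to verify such relationships numerically, here is a minimal Python sketch (not part of the original text; the function name is mine). It recovers S(x) = exp[-H(x)] by numerically integrating a given hazard rate, using the Pareto-type hazard rate discussed in the exercise below.

```python
import math

def survival_from_hazard(h, x, steps=100000):
    # Numerically accumulate H(x) = integral of h(t) dt from 0 to x (trapezoid rule),
    # then return S(x) = exp(-H(x)).
    dt = x / steps
    H = sum(0.5 * (h(i * dt) + h((i + 1) * dt)) * dt for i in range(steps))
    return math.exp(-H)

# Example: h(t) = 3/(10 + t) should give a Pareto survival function {10/(10 + x)}^3.
h = lambda t: 3.0 / (10.0 + t)
print(survival_from_hazard(h, 5.0))   # approximately 0.29630
print((10.0 / 15.0) ** 3)             # exact value: 0.29630
```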

243 This is equation 3.2.13 in Actuarial Mathematics by Bowers et al.
244 The lower limit of the integral should be the lower end of the support of the distribution.
245 H is called the cumulative hazard rate. See Mahler's Guide to Survival Analysis.


As in Life Contingencies, one can write the distribution function and the density function in terms of
the force of mortality h(t):²⁴⁶

F(x) = 1 - exp[-∫_0^x h(t) dt].   f(x) = h(x) exp[-∫_0^x h(t) dt].

Exercise: h(x) = 1/10. What is the distribution function?
[Solution: F(x) = 1 - e^(-x/10), an Exponential Distribution with θ = 10.]
h constant ⇔ the Exponential Distribution, with constant hazard rate of 1/θ = 1/mean.
The Exponential is the only continuous distribution with a constant hazard rate, and therefore constant
mean excess loss.
The Force of Mortality for various distributions is given below:

Distribution                 Force of Mortality or Hazard Rate     Behavior as x approaches ∞

Exponential                  1/θ                                   h(x) constant

Weibull                      τ x^(τ-1) / θ^τ                       τ < 1: h(x) → 0.  τ > 1: h(x) → ∞.

Pareto                       α / (θ + x)                           h(x) → 0, as x → ∞

Burr²⁴⁷                      α γ x^(γ-1) / (θ^γ + x^γ)             h(x) → 0, as x → ∞

Single Parameter Pareto      α / x                                 h(x) → 0, as x → ∞

Gompertz's Law²⁴⁸            B cˣ                                  h(x) → ∞, as x → ∞

Makeham's Law²⁴⁹             A + B cˣ                              h(x) → ∞, as x → ∞

246 See equation 3.2.14 in Actuarial Mathematics. ₙpₓ = chance of living n years for those who have reached age x =
{1 - F(x+n)} / {1 - F(x)} = S(x+n) / S(x) = exp(- integral from x to x+n of μ_t dt).
247 The Loglogistic is a special case of the Burr with α = 1.
248 As per Life Contingencies. See Actuarial Mathematics Section 3.7.
249 As per Life Contingencies. See Actuarial Mathematics Section 3.7.


Exercise: h(x) = 3/(10 + x). What is the distribution function?
[Solution: F(x) = 1 - exp[-∫_0^x 3/(10 + t) dt] = 1 - exp[-3{ln(10 + x) - ln(10)}] = 1 - {10/(10 + x)}³.
A Pareto Distribution with α = 3 and θ = 10.]

Relationship to Other Items of Interest:

One can obtain the Mean Excess Loss from the Hazard Rate:

S(t)/S(x) = exp(-∫_0^t h(s) ds) / exp(-∫_0^x h(s) ds) = exp(∫_0^x h(s) ds - ∫_0^t h(s) ds) = exp(-∫_x^t h(s) ds).

e(x) = ∫_x^∞ S(t) dt / S(x) = ∫_x^∞ {S(t)/S(x)} dt = ∫_x^∞ exp(-∫_x^t h(s) ds) dt.

e(x) = ∫_x^∞ exp(H(x) - H(t)) dt = exp[H(x)] ∫_x^∞ exp(-H(t)) dt, where H(x) = ∫_0^x h(t) dt.²⁵⁰

Exercise: Given a hazard rate of h(x) = 4/(100 + x), what is the mean excess loss, e(x)?
[Solution: H(x) = ∫_0^x h(t) dt = 4{ln(100 + x) - ln(100)}. The constant -4 ln(100) cancels below, so
e(x) = exp[H(x)] ∫_x^∞ exp(-H(t)) dt = (100 + x)⁴ ∫_x^∞ 1/(100 + t)⁴ dt = (100 + x)⁴ / {3(100 + x)³} = (100 + x)/3.
Comment: This is a Pareto Distribution with α = 4 and θ = 100; e(x) = (θ + x)/(α - 1), h(x) = α/(θ + x).]
250 H is called the cumulative hazard rate and is used in Survival Analysis. S(x) = exp[-H(x)].
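One can check this exercise numerically. Here is a Python sketch (not part of the original text; the function name and the cutoff for the improper integral are mine) that computes e(x) directly from the hazard rate:

```python
import math

def mean_excess_from_hazard(h, x, upper=100000.0, steps=200000):
    # e(x) = integral from x to infinity of S(t)/S(x) dt, where S(t)/S(x) = exp[-(H(t) - H(x))].
    # H(t) - H(x) is accumulated step by step and both integrals use the trapezoid rule.
    dt = (upper - x) / steps
    H_rel, total, s_prev = 0.0, 0.0, 1.0       # S(x)/S(x) = 1 at t = x
    for i in range(steps):
        t0, t1 = x + i * dt, x + (i + 1) * dt
        H_rel += 0.5 * (h(t0) + h(t1)) * dt    # accumulate H(t1) - H(x)
        s_curr = math.exp(-H_rel)              # S(t1)/S(x)
        total += 0.5 * (s_prev + s_curr) * dt
        s_prev = s_curr
    return total

h = lambda t: 4.0 / (100.0 + t)
print(mean_excess_from_hazard(h, 50.0))   # about 50, matching e(x) = (100 + x)/3 at x = 50
```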


One can obtain the Hazard Rate from the Mean Excess Loss as follows:

e(x) = {∫_x^∞ S(t) dt} / S(x).

Thus e′(x) = {-S²(x) + f(x) ∫_x^∞ S(t) dt} / S²(x) = -1 + f(x) e(x) / S(x) = -1 + e(x)h(x).

Thus h(x) = {1 + e′(x)} / e(x).

Exercise: Given a mean excess loss of e(x) = (100 + x)/3, what is the hazard rate, h(x)?
[Solution: e′(x) = 1/3. h(x) = {1 + e′(x)}/e(x) = (4/3){3/(100 + x)} = 4/(100 + x).
Comment: This is a Pareto Distribution with α = 4 and θ = 100.
It has e(x) = (θ + x)/(α - 1) and h(x) = α/(θ + x).]
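A quick numerical check of the identity h(x) = {1 + e′(x)}/e(x), using a central difference for e′(x) (a Python sketch, not part of the original text):

```python
def hazard_from_mean_excess(e, x, dx=1e-5):
    # h(x) = {1 + e'(x)} / e(x), with e'(x) approximated by a central difference.
    e_prime = (e(x + dx) - e(x - dx)) / (2 * dx)
    return (1 + e_prime) / e(x)

e = lambda t: (100.0 + t) / 3.0
print(hazard_from_mean_excess(e, 50.0))   # 0.026667 = 4/(100 + 50)
```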
Finally, one can obtain the Survival Function from the Mean Excess Loss as follows:
h(x) = {1 + e′(x)} / e(x).

H(x) = ∫_0^x h(t) dt = ∫_0^x {1/e(t) + e′(t)/e(t)} dt = ∫_0^x 1/e(t) dt + ln(e(x)/e(0)).

S(x) = exp[-H(x)] = {e(0)/e(x)} exp[-∫_0^x 1/e(t) dt].

For example, for the Pareto, e(x) = (θ + x)/(α - 1).

∫_0^x 1/e(t) dt = (α - 1) ∫_0^x 1/(θ + t) dt = (α - 1) ln{(θ + x)/θ}.

S(x) = {e(0)/e(x)} {θ/(θ + x)}^(α-1) = {θ/(θ + x)} {θ/(θ + x)}^(α-1) = {θ/(θ + x)}^α.
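The same recovery of S(x) from e(x) can be done numerically; here is a Python sketch (not part of the original text; the function name is mine), again using the Pareto example with α = 4 and θ = 100:

```python
import math

def survival_from_mean_excess(e, x, steps=100000):
    # S(x) = {e(0)/e(x)} exp[- integral from 0 to x of dt/e(t)], trapezoid rule for the integral.
    dt = x / steps
    integral = sum(0.5 * (1 / e(i * dt) + 1 / e((i + 1) * dt)) * dt for i in range(steps))
    return e(0.0) / e(x) * math.exp(-integral)

# Pareto with alpha = 4, theta = 100: e(t) = (100 + t)/3, so S(x) should be {100/(100 + x)}^4.
e = lambda t: (100.0 + t) / 3.0
print(survival_from_mean_excess(e, 50.0))   # approximately (100/150)**4 = 0.19753
```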


Tail Behavior of the Hazard Rate:

As was derived above, the limit as x approaches infinity of e(x) is equal to the limit as x approaches
infinity of S(x)/f(x) = 1/h(x).

lim_{x→∞} e(x) = lim_{x→∞} 1/h(x).

Thus an increasing mean excess loss, e(x), is equivalent to a decreasing hazard or failure rate, h(x),
and vice versa.
Since the force of mortality for the Pareto, α/(θ + x), decreases with age, the Mean Excess Loss
increases with age.²⁵¹
The quicker the hazard rate declines, the faster the Mean Excess Loss increases and the heavier the
tail of the distribution.
For the Weibull, if τ > 1 then the hazard rate, τ x^(τ-1)/θ^τ, increases and thus the Mean Excess Loss
decreases.
For the Weibull with τ < 1, the hazard rate decreases and thus the Mean Excess Loss increases.

Lighter Tail:   e(x) decreases  ⇔  h(x) increases.
Heavier Tail:   e(x) increases  ⇔  h(x) decreases.

251 Unlike the situation for mortality of humans. For Gompertz's or Makeham's Law with B > 0 and c > 1, the
force of mortality increases with age, so the Mean Excess Loss decreases with age. For the Pareto, if α ≤ 1, then the
force of mortality is sufficiently small so that there exists no mean; for α ≤ 1 the mean lifetime is infinite.


Problems:
34.1 (1 point) You are given the following three loss distributions.
1. Gamma α = 1.5, θ = 0.5
2. LogNormal μ = 0.1, σ = 0.6
3. Weibull θ = 1.4, τ = 0.8
For which of these distributions does the hazard rate increase?
A. 1 B. 2 C. 3 D. 1,2,3
E. None of A, B, C, or D

Use the following information for the next 5 questions:


e(x) = 72 - 0.8x, 0 < x < 90.
34.2 (2 points) What is the force of mortality at 50?
A. less than 0.003
B. at least 0.003 but less than 0.004
C. at least 0.004 but less than 0.005
D. at least 0.005 but less than 0.006
E. at least 0.006
34.3 (3 points) What is the Survival Function at 60?
A. 72%   B. 74%   C. 76%   D. 78%   E. 80%

34.4 (2 points) What is ₅₀p₃₀?
A. 62%   B. 64%   C. 66%   D. 68%   E. 70%

34.5 (1 point) What is the mean lifetime?


A. 72
B. 74
C. 76
D. 78

E. 80

34.6 (2 points) What is the probability density function at 40?


A. less than 0.002
B. at least 0.002 but less than 0.003
C. at least 0.003 but less than 0.004
D. at least 0.004 but less than 0.005
E. at least 0.005



34.7 (2 points) For a LogNormal distribution with parameters μ = 11.6, σ = 1.60,
what is the hazard rate at $100,000?
A. less than 4 x 10-6
B. at least 4 x 10-6 but less than 5 x 10-6
C. at least 5 x 10-6 but less than 6 x 10-6
D. at least 6 x 10-6 but less than 7 x 10-6
E. at least 7 x 10-6
34.8 (2 points) The hazard rate h(x) = 0.002 + 1.1ˣ/10,000, x > 0. What is S(50)?
(A) 0.76
(B) 0.78
(C) 0.80
(D) 0.82
(E) 0.84
34.9 (1 point) If the hazard rate of a certain machine part is a constant 0.10 for t > 0, what is the Mean
Excess Loss at age 25?
A. less than 10
B. at least 10 but less than 15
C. at least 15 but less than 20
D. at least 20 but less than 25
E. at least 25
34.10 (2 points) Losses follow a Weibull Distribution with θ = 25 and τ = 1.7.
What is the hazard rate at 100?
A. less than 0.05
B. at least 0.05 but less than 0.10
C. at least 0.10 but less than 0.15
D. at least 0.15 but less than 0.20
E. at least 0.20
34.11 (2 points) For a loss distribution where x ≥ 10, you are given:
i) The hazard rate function: h(x) = z/x, for x ≥ 10.
ii) A value of the survival function: S(20) = 0.015625.
Calculate z.
A. 2
B. 3
C. 4
D. 5
E. 6
34.12 (2 points) For a loss distribution where x ≥ 0, you are given:
i) The hazard rate function: h(x) = z x², for x ≥ 0.
ii) A value of the distribution function: F(5) = 0.1175.
Calculate z.
A. 0.002
B. 0.003
C. 0.004
D. 0.005

E. 0.006


Use the following information for the next four questions:


Ground up losses follow a Weibull Distribution with τ = 2 and θ = 10.
34.13 (3 points) There is an ordinary deductible of 5.
What is the hazard rate of the per loss variable?
34.14 (3 points) There is an ordinary deductible of 5.
What is the hazard rate of the per payment variable?
34.15 (3 points) There is a franchise deductible of 5.
What is the hazard rate of the per loss variable?
34.16 (3 points) There is a franchise deductible of 5.
What is the hazard rate of the per payment variable?
34.17 (3 points) X follows a Gamma Distribution with parameters α = 3 and θ.
Determine the form of the hazard rate h(x). What is the behavior of h(x) as x approaches infinity?
34.18 (2 points) The hazard rate h(x) = 4/(100 + x), x > 0. What is S(50)?
(A) 0.18

(B) 0.20

(C) 0.22

(D) 0.24

(E) 0.26

34.19 (2 points) Determine the hazard rate at 300 for a Loglogistic Distribution with γ = 2 and
θ = 100.
(A) 0.005

(B) 0.006

(C) 0.007

(D) 0.008

(E) 0.009

34.20 (1 point) F(x) is a Pareto Distribution.


If the hazard rate h(x) is doubled for all x, what is the new distribution function?
34.21 (2 points) You are using a Weibull Distribution to model the length of time workers remain
unemployed. Briefly discuss the implications of different values of the parameter τ.


34.22 (2 points) Robots can fail due to two independent decrements: Internal and External.
(Internal includes normal wear and tear. External includes accidents.)
Assuming no external events, a robot's time until failure is given by a Pareto Distribution with α = 2
and θ = 10.
Assuming no internal events, a robot's time until failure is given by a Pareto Distribution with α = 4
and θ = 10.
At time = 5, what is the hazard rate of the robots time until failure?
A. 0.35
B. 0.40
C. 0.45
D. 0.50
E. 0.55
34.23 (1 point) F(x) is a Weibull Distribution.
If the hazard rate h(x) is doubled for all x, what is the new distribution function?
34.24 (160, 11/87, Q.7) (2.1 points) Which of the following are true for all values of x > 0?
I. For every exponential survival model h(x) = {S(x) - S(x+1)} / ∫_x^(x+1) S(t) dt.
II. For every survival model f(x) ≤ h(x).
III. For every survival model f(x) ≥ f(x + 1).
(A) I and II only
(B) I and III only
(C) II and III only
(D) I, II and III
(E) The correct answer is not given by (A), (B), (C), or (D).
34.25 (160, 11/87, Q.8) (2.1 points) The force of mortality for a survival distribution is given by:
h(x) = 1/{2(100 - x)}, 0 < x < 100. Determine e(64).
(A) 16

(B)18

(C) 20

(D) 22

(E) 24

34.26 (160, 11/87, Q.15) (2.1 points) For a Weibull distribution as per Loss Models,
the hazard rate at the median age is 0.05. Determine the median age.
(A) ln(2)

(B) ln(20)

(C) 20 ln(2)

(D) 2ln()

(E) 2 ln(20)

34.27 (160, 11/88, Q.2) (2.1 points) A survival model is represented by the following probability
density function: f(t) = (0.1)(25 - t)^(-1/2); 0 ≤ t ≤ 25. Calculate the hazard rate at 20.
(A) 0.05
(B) 0.10
(C) 0.15
(D) 0.20
(E) 0.25


34.28 (160, 11/89, Q.1) (2.1 points) For a survival model, you are given:
(i) The hazard rate is h(t) = 2/(w - t), 0 ≤ t < w.
(ii) T is the random variable denoting time of failure.
Calculate Var(T).
(A) w²/18   (B) w²/12   (C) w²/9   (D) w²/6   (E) w²/3

34.29 (160, 11/89, Q.2) (2.1 points) S(x) = 0.1(100 - x)^(1/2), 0 ≤ x ≤ 100.
Calculate the hazard rate at 84.
(A) 1/32
(B) 1/24
(C) 1/16
(D) 1/8
(E) 1/4
34.30 (160, 5/90, Q.4) (2.1 points) You are given that y is the median age for the survival function
S(x) = 1 - (x/100)², 0 ≤ x ≤ 100. Calculate the hazard rate at y.
(A) 0.013
(B) 0.014
(C) 0.025
(D) 0.026
(E) 0.028
34.31 (Course 160 Sample Exam #1, 1996, Q.2) (1.9 points)
X has a uniform distribution from 0 to 10.
Y = 4X². Calculate the hazard rate of Y at 4.
(A) 0.007
(B) 0.014
(C) 0.021 (D) 0.059

(E) 0.111

34.32 (Course 160 Sample Exam #2, 1996, Q.2) (1.9 points) You are given:
(i) A survival model has a hazard rate h(x) = 1/{3(ω - x)}, 0 ≤ x ≤ ω.
(ii) The median age is 63.
Calculate the mean residual life at 63, e(63).
(A) 4.5
(B) 6.8
(C) 7.9
(D) 9.0

(E) 13.5

34.33 (Course 160 Sample Exam #3, 1997, Q.1) (1.9 points) You are given:
(i) For a Weibull distribution with parameters θ and τ, the median age is 22.
(ii) At the median age, the value of the Hazard Rate Function is 1.26.
Calculate τ.
(A) 37

(B) 38

(C) 39

(D) 40

(E) 41

34.34 (Course 160 Sample Exam #1, 1999, Q.19) (1.9 points)
Losses follow a Loglogistic Distribution, with parameters γ = 3 and θ = 0.1984.
For what value of x is the hazard rate, h(x), a maximum?
(A) 0.18
(B) 0.20
(C) 0.22
(D) 0.25
(E) 0.28


34.35 (Course 160 Sample Exam #1, 1999, Q.20) (1.9 points)
A sample of 10 batteries in continuous use is observed until all batteries fail. You are given:
(i) The times to failure (in hours) are 14.1, 21.3, 23.2, 26.2, 29.8, 31.3, 35.7, 39.4, 39.2, 45.3.
(ii) The composite hazard rate function for these batteries is defined by
h(t) = λ, 0 ≤ t < 27.9,
h(t) = λ + μ(t - 27.9)², t ≥ 27.9.
(iii) S(15) = 0.7634, S(30) = 0.5788.
Calculate the absolute difference between the cumulative hazard rate at 34, H(34), based on the
assumed hazard rate, and the cumulative hazard rate at 34, HO(34), based on the observed data.
(A) 0.03
(B) 0.06
(C) 0.08
D) 0.11
(E) 0.14
34.36 (CAS3, 11/03, Q.19) (2.5 points) For a loss distribution where x ≥ 2, you are given:
i) The hazard rate function: h(x) = z²/(2x), for x ≥ 2.
ii) A value of the distribution function: F(5) = 0.84.
Calculate z.
A. 2 B. 3 C. 4 D. 5 E. 6
34.37 (CAS3, 11/04, Q.7) (2.5 points)
Which of the following formulas could serve as a force of mortality?
1. μₓ = B Cˣ,  B > 0, C > 1
2. μₓ = a(b + x)⁻¹,  a > 0, b > 0
3. μₓ = (1 + x)⁻³,  x ≥ 0
A. 1 only

B. 2 only

C. 3 only

D. 1 and 2 only

E. 1 and 3 only

34.38 (CAS3, 11/04, Q.27) (2.5 points) You are given:
X has density f(x), where f(x) = 500,000/x³, for x > 500 (single-parameter Pareto with α = 2).
Y has density g(y), where g(y) = y e^(-y/500)/250,000 (gamma with α = 2 and θ = 500).
Which of the following are true?
1. X has an increasing mean residual life function.
2. Y has an increasing hazard rate.
3. X has a heavier tail than Y based on the hazard rate test.
A. 1 only.
B. 2 only.
C. 3 only.
D. 2 and 3 only.
Note: I have rewritten this exam question.

E. All of 1, 2, and 3.


34.39 (CAS3, 5/05, Q.30) (2.5 points) Acme Products will offer a warranty on their products for x
years, where x is the largest integer for which there is no more than a 1% probability of product
failure.
Acme introduces a new product with a hazard function for failure at time t of 0.002t.
Calculate the length of the warranty that Acme will offer on this new product.
A. Less than 3 years
B. 3 years
C. 4 years D. 5 years
E. 6 or more years
34.40 (CAS3, 11/05, Q.11) (2.5 points) Individuals with Flapping Gum Disease are known to
have a constant force of mortality μ. Historically, 10% will die within 20 years.
A new, more serious strain of the disease has surfaced with a constant force of mortality equal to 2μ.
Calculate the probability of death in the next 20 years for an individual with this new strain.
A. 17%
B. 18%
C. 19%
D. 20%
E. 21%
34.41 (SOA M, 11/05, Q.13) (2.5 points) The actuarial department for the SharpPoint
Corporation models the lifetime of pencil sharpeners from purchase using a generalized DeMoivre
model with s(x) = (1 - x/ω)^α, for α > 0 and 0 ≤ x ≤ ω.
A senior actuary examining mortality tables for pencil sharpeners has determined that the
original value of α must change. You are given:
(i) The new complete expectation of life at purchase is half what it was previously.
(ii) The new force of mortality for pencil sharpeners is 2.25 times the previous force of
mortality for all durations.
(iii) ω remains the same.
Calculate the original value of α.
(A) 1

(B) 2

(C) 3

(D) 4

(E) 5

34.42 (CAS3, 5/06, Q.10) (2.5 points) The force of mortality is given as:
μ(x) = 2/(110 - x), for 0 ≤ x < 110.
Calculate the expected future lifetime for a life aged 30.
A. Less than 20
B. At least 20, but less than 30
C. At least 30, but less than 40
D. At least 40, but less than 50
E. At least 50


34.43 (CAS3, 5/06, Q.11) (2.5 points) Eastern Digital uses a single machine to manufacture digital
widgets. The machine was purchased 10 years ago and will be used continuously until it fails.
The failure rate of the machine, u(x), is defined as:
u(x) = x²/4000, for x ≤ √4000, where x is the number of years since purchase.
Calculate the probability that the machine will fail between years 12 and 14, given that the machine
has not failed during the first 10 years.
A. Less than 1.5%
B. At least 1.5%, but less than 3.5%
C. At least 3.5%, but less than 5.5%
D. At least 5.5%, but less than 7.5%
E. At least 7.5%
34.44 (CAS3, 5/06, Q.16) (2.5 points) The force of mortality is given as:
μ(x) = 1/(100 - x), for 0 ≤ x < 100.
Calculate the probability that exactly one of the lives (40) and (50) will survive 10 years.
A. 9/30
B. 10/30
C. 19/30
D. 20/30
E. 29/30


Solutions to Problems:
34.1. A. For a Gamma with α > 1, the hazard rate increases (toward a horizontal asymptote given by
an exponential). For α > 1 the Gamma is lighter-tailed than an Exponential. For a LogNormal the
hazard rate decreases. For a Weibull with τ < 1, the hazard rate decreases; for τ < 1 the Weibull is
heavier-tailed than an Exponential. Alternately, the hazard rate increases if and only if the mean
excess loss decreases. For a Gamma with α > 1, the mean excess loss decreases (toward a horizontal
asymptote given by an exponential). For a LogNormal the mean excess loss increases. For a Weibull
with τ < 1, the mean excess loss increases.
34.2. E. h(x) = {1 + e′(x)}/e(x) = (1 - 0.8)/(72 - 0.8x) = 1/(360 - 4x). h(50) = 0.00625.
34.3. C. S(x) = exp[-∫_0^x h(t) dt] = exp[-∫_0^x 1/(360 - 4t) dt] = exp[ln(360 - 4x)/4 - ln(360)/4] =
exp[(1/4) ln(1 - x/90)] = (1 - x/90)^(1/4). S(60) = 0.760.

34.4. B. ₅₀p₃₀ = probability that a life aged 30 lives at least 50 more years = S(80)/S(30) =
(1/9)^(1/4) / (2/3)^(1/4) = 0.639.
34.5. A. mean lifetime = e(0) = 72. Alternately,
E[X] = ∫_0^90 S(t) dt = ∫_0^90 (1 - t/90)^(1/4) dt = [-72(1 - t/90)^(5/4)]_0^90 = 72.

34.6. D. f(x) = -dS(x)/dx = (1 - x/90)^(-3/4)/360. f(40) = 0.0043.


34.7. B. For the LogNormal, F(x) = Φ[{ln(x) - μ}/σ]. F(100,000) = Φ[-0.0544] = 1 - 0.5217.
Therefore, S(100,000) = 0.5217. f(x) = exp[-0.5({ln(x) - μ}/σ)²] / {xσ√(2π)}.
f(100,000) = exp[-0.5({ln(100,000) - 11.6}/1.6)²] / {(100,000)(1.6)√(2π)} = 2.490 x 10⁻⁶.
h(100,000) = f(100,000)/S(100,000) = 2.490 x 10⁻⁶ / 0.5217 = 4.77 x 10⁻⁶.

34.8. C. S(x) = exp[-∫_0^x h(t) dt] = exp[-∫_0^x {0.002 + 1.1ᵗ/10,000} dt] = exp[-0.002x + (1 - 1.1ˣ)/{10,000 ln(1.1)}].
S(50) = exp(-0.1 - 0.1221) = 0.80.
Comment: This is an example of Makeham's Law of mortality.
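A quick numerical check of this survival probability (a Python sketch, not part of the original solution):

```python
import math

# S(50) = exp[-H(50)] with h(x) = 0.002 + 1.1**x / 10000 (Makeham form).
H_50 = 0.002 * 50 + (1.1 ** 50 - 1) / (10000 * math.log(1.1))
print(math.exp(-H_50))   # approximately 0.80
```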
34.9. B. A constant hazard rate implies an Exponential Distribution, with θ = 1/(the hazard rate).
The mean excess loss is θ at all ages. Thus the mean excess loss at age 25 (or any other age) is:
1/0.10 = 10.
Comment: Note, one can write down the equation:
hazard rate = chance of failure / probability of still working = f(x)/{1 - F(x)} = -S′(x)/S(x) = 0.10, and
solve the resulting differential equation: S′(x) = -0.1 S(x), for S(x) = e^(-0.1x) or F(x) = 1 - e^(-0.1x).
34.10. D. h(x) = f(x)/S(x) = {τ(x/θ)^τ exp(-(x/θ)^τ)/x} / exp(-(x/θ)^τ) = τ x^(τ-1)/θ^τ.
h(100) = (1.7)(100^0.7)/(25^1.7) = 0.179.
34.11. E. S(x) = exp(-∫_10^x h(t) dt) = exp[-z{ln(x) - ln(10)}] = (10/x)ᶻ, for x ≥ 10.
0.015625 = S(20) = (1/2)ᶻ. z = 6.
Comment: S(x) = (10/x)⁶, for x ≥ 10. A Single Parameter Pareto, with α = 6 and θ = 10.
Similar to CAS3, 11/03, Q.19.
34.12. B. S(x) = exp(-∫_0^x h(t) dt) = exp[-z x³/3], for x ≥ 0.
0.8825 = S(5) = exp[-z 5³/3]. z = -ln(0.8825)(3/125) = 0.0030.
Comment: S(x) = exp[-(x/10)³], for x ≥ 0. A Weibull Distribution, with θ = 10 and τ = 3.


34.13. For the Weibull, f(x) = τ(x/θ)^τ exp(-(x/θ)^τ)/x = x exp(-(x/10)²)/50.
S(x) = exp(-(x/θ)^τ) = exp(-(x/10)²). h(x) = f(x)/S(x) = x/50.
The per loss variable Y is 0 for x ≤ 5, and is X - 5 for x > 5.
Y has a point mass of F(5) at 0. Thus fY(y) is undefined at zero. fY(y) = fX(y+5) for y > 0.
SY(0) = SX(5). SY(y) = SX(y+5) for y > 0.
hY(y) is undefined at zero.
hY(y) = fY(y)/SY(y) = fX(y+5)/SX(y+5) = hX(y+5) = (y+5)/50 for y > 0.
Comment: Similar to Example 8.1 in Loss Models. Loss Models uses the notation Yᴾ for the per
payment variable and Yᴸ for the per loss variable.
34.14. The per payment variable Y is undefined for x ≤ 5, and is X - 5 for x > 5.
fY(y) = fX(y+5)/SX(5) for y > 0. SY(y) = SX(y+5)/SX(5) for y > 0.
hY(y) = fY(y)/SY(y) = fX(y+5)/SX(y+5) = hX(y+5) = (y+5)/50 for y > 0.
34.15. The per loss variable Y is 0 for x ≤ 5, and is X for x > 5.
Y has a point mass of FX(5) at 0. Thus fY(y) is undefined at zero.
fY(y) = 0 for 0 < y ≤ 5. fY(y) = fX(y) for y > 5. SY(y) = SX(5) for 0 < y ≤ 5. SY(y) = SX(y) for y > 5.
hY(y) is undefined at zero. hY(y) = 0 for 0 < y ≤ 5.
hY(y) = fY(y)/SY(y) = fX(y)/SX(y) = hX(y) = y/50 for y > 5.
34.16. The per payment variable Y is undefined for x ≤ 5, and is X for x > 5.
fY(y) = fX(y)/SX(5) for y > 5. SY(y) = SX(y)/SX(5) for y > 5.
hY(y) = fY(y)/SY(y) = fX(y)/SX(y) = hX(y) = y/50 for y > 5.
Comment: Similar to Example 8.2 in Loss Models.
34.17. f(x) = 0.5 x² e^(-x/θ)/θ³. S(x) = 1 - Γ(3; x/θ) = e^(-x/θ) + (x/θ)e^(-x/θ) + (x/θ)² e^(-x/θ)/2.
h(x) = f(x)/S(x) = x²/(2θ³ + 2θ²x + θx²).
h(x) = 1/(2θ³/x² + 2θ²/x + θ), which increases to 1/θ as x approaches infinity.
Comment: I have used Theorem A.1 in Appendix A of Loss Models, in order to write out the
incomplete Gamma Function for an integer parameter. One can also verify that dS(x)/dx = -f(x).
A Gamma Distribution for α > 1 is lighter tailed than an Exponential (α = 1), and the hazard rate
increases to 1/θ, while the mean excess loss decreases to θ.

34.18. B. S(x) = exp[-∫_0^x h(t) dt] = exp[-∫_0^x 4/(100 + t) dt] = exp[-4{ln(100 + x) - ln(100)}] = {100/(100 + x)}⁴.
S(50) = (100/150)⁴ = 0.198.
Comment: This is a Pareto Distribution, with α = 4 and θ = 100.


34.19. B. F(x) = (x/θ)^γ / {1 + (x/θ)^γ}. F(300) = 3²/(1 + 3²) = 0.9. S(300) = 0.1.
f(x) = γ(x/θ)^γ / (x{1 + (x/θ)^γ}²). f(300) = (2)(3²)/{(300)(1 + 3²)²} = 0.0006.
h(300) = f(300)/S(300) = 0.0006/0.1 = 0.006.
Comment: For the Loglogistic, h(x) = f(x)/S(x) = γ x^(γ-1) / [θ^γ {1 + (x/θ)^γ}].
For γ = 2 and θ = 100: h(x) = 0.0002x / {1 + (x/100)²}.
The hazard rate increases and then decreases:
[Figure: graph of h(x) = 0.0002x/{1 + (x/100)²} for x from 0 to 1000, rising from 0 to a maximum of about 0.010 near x = 100 and then declining.]

34.20. For the Pareto the hazard rate is: h(x) = f(x)/S(x) = α/(θ + x). 2h(x) = 2α/(θ + x).
This is the hazard rate for another Pareto Distribution with parameters 2α and θ.


34.21. If τ = 1, then we have an Exponential with constant hazard rate.
The probability of the period of unemployment ending is independent of how long the worker has
been out of work.
If τ < 1, then we have a decreasing hazard rate. As a worker remains out of work for longer periods of
time, his chance of finding a job declines. This could be due to exhaustion of possible employment
opportunities or the worker becoming discouraged.
If τ > 1, then we have an increasing hazard rate. As a worker remains out of work for longer periods of
time, his chance of going back to work increases. This could be due to worry about exhaustion of
unemployment benefits or the worker becoming more willing to settle for a less than desired job.
Comment: A Weibull with τ < 1 has a heavier righthand tail than an Exponential.
A Weibull with τ > 1 has a lighter righthand tail than an Exponential.

34.22. B. For the Pareto Distribution, h(x) = f(x)/S(x) = α/(θ + x).
Thus h₁(x) = 2/(10 + x), and h₂(x) = 4/(10 + x).
Since the decrements are independent, the hazard rates add, and h(x) = 6/(10 + x).
h(5) = 6/15 = 0.4.
Alternately, the probability of surviving past age x is the product of the probabilities of surviving
both of the independent decrements:
S(x) = S₁(x) S₂(x) = {10/(10 + x)}² {10/(10 + x)}⁴ = {10/(10 + x)}⁶.
This is a Pareto Distribution with α = 6 and θ = 10.
h(x) = 6/(10 + x). h(5) = 6/15 = 0.4.

34.23. For the Weibull the hazard rate is: h(x) = f(x)/S(x) = τ x^(τ-1)/θ^τ.
2h(x) = 2τ x^(τ-1)/θ^τ = τ x^(τ-1)/(θ/2^(1/τ))^τ.
This is the hazard rate for another Weibull Distribution with parameters τ and θ/2^(1/τ).
Comment: If τ = 1 we have an Exponential, and if the hazard rate is doubled then the mean is
halved.

34.24. A. For the Exponential, h(x) = 1/θ and S(x) = e^(-x/θ).
{S(x) - S(x+1)}/∫_x^(x+1) S(t) dt = {e^(-x/θ) - e^(-(x+1)/θ)}/{θe^(-x/θ) - θe^(-(x+1)/θ)} = 1/θ. Statement I is true.
h(x) = f(x)/S(x) ≥ f(x), since S(x) ≤ 1. Therefore, Statement II is true.
While the density must go to zero as x approaches infinity, the density can either increase or
decrease over short periods. Statement III is not true.
Comment: {S(x) - S(x+1)}/∫_x^(x+1) S(t) dt = mₓ = central death rate.
See page 70 of Actuarial Mathematics.


34.25. E. S(x) = exp[-∫_0^x h(t) dt] = exp[ln(100 - x)/2 - ln(100)/2] = √(1 - x/100).
e(64) = ∫_64^100 S(t) dt / S(64) = (200/3)(1 - 64/100)^(3/2) / (1 - 64/100)^(1/2) = 24.

34.26. C. For the Weibull, S(x) = exp(-(x/θ)^τ), and f(x) = τ x^(τ-1) exp(-(x/θ)^τ)/θ^τ.
h(x) = f(x)/S(x) = τ x^(τ-1)/θ^τ.
Let m be the median. S(m) = 0.5. ln S(m) = -(m/θ)^τ = -ln(2). h(m) = 0.05. τ m^(τ-1)/θ^τ = 0.05.
Dividing the two equations: τ/m = 0.05/ln(2). m = 20τ ln(2).
34.27. B. Integrating f(t) from t to 25, S(t) = (0.2)(25 - t)^(1/2).
h(t) = f(t)/S(t) = 1/(50 - 2t). h(20) = 1/10.
34.28. A. h(t) = 2/(w - t). H(t) = ∫_0^t h(x) dx = 2ln(w) - 2ln(w - t).
S(t) = exp[-H(t)] = {(w - t)/w}² = (1 - t/w)². f(t) = 2(1 - t/w)/w = 2/w - 2t/w².
∫_0^w x f(x) dx = w/3. ∫_0^w x² f(x) dx = w²/6. Var(T) = w²/6 - (w/3)² = w²/18.
34.29. A. S(x) = 0.1(100 - x)^(1/2). f(x) = 0.05(100 - x)^(-1/2). h(x) = f(x)/S(x) = 0.5/(100 - x).
h(84) = 0.5/16 = 1/32.
34.30. E. 0.5 = (y/100)². y = 70.71. f(x) = x/5,000.
f(y) = f(70.71) = 70.71/5,000 = 0.01414. h(y) = f(y)/S(y) = 0.01414/0.5 = 0.0283.

34.31. B. Y = 4X². X = √Y / 2.
SX(x) = 1 - x/10, 0 ≤ x ≤ 10. SY(y) = 1 - √y / 20, 0 ≤ y ≤ 400.
fY(y) = 1/(40√y). fY(4) = 1/80. SY(4) = 0.9. hY(4) = (1/80)/0.9 = 1/72 = 0.0139.
34.32. B. H(t) = ∫_0^t h(x) dx = ln(ω)/3 - ln(ω - t)/3.
S(x) = exp[-H(x)] = exp[ln(ω - x)/3 - ln(ω)/3] = (ω - x)^(1/3)/ω^(1/3) = (1 - x/ω)^(1/3).
Median age is 63. 0.5 = S(63) = (1 - 63/ω)^(1/3). ω = 63/0.875 = 72. S(t) = (1 - t/72)^(1/3).
e(63) = ∫_63^72 S(t) dt / S(63) = {(3/4)(72)(1 - 63/72)^(4/3)}/0.5 = 6.75.

34.33. D. 0.5 = S(22) = exp[-(22/θ)^τ]. 0.69315 = (22/θ)^τ.
h(x) = f(x)/S(x) = τ x^(τ-1)/θ^τ. We are given: 1.26 = h(22) = τ 22^(τ-1)/θ^τ.
Dividing the two equations: τ/22 = 1.8178. τ = 40.
Comment: θ = 22.2.
34.34. D. S(x) = 1/(1 + (x/θ)^γ). f(x) = γ x^(γ-1) θ^(-γ) / (1 + (x/θ)^γ)².
h(x) = f(x)/S(x) = γ x^(γ-1) θ^(-γ) / (1 + (x/θ)^γ) = {3x²/(0.1984³)}/{1 + (x/0.1984)³} = 3x²/(0.00781 + x³).
0 = h′(x) = {6x(0.00781 + x³) - (3x²)(3x²)}/(0.00781 + x³)². x³ = 0.01562. x = 0.25.
Comment: Here is a graph of h(x) = 3x²/(0.00781 + x³):
[Figure: the hazard rate rises to a maximum of about 8 near x = 0.25 and then declines, for x from 0 to 3.]


34.35. E. For t < 27.9, H(t) = ∫_0^t h(x) dx = λt. S(t) = e^(-λt). S(15) = 0.7634. λ = 0.018.
For t ≥ 27.9, H(t) = H(27.9) + ∫_27.9^t h(x) dx = (27.9)(0.018) + (t - 27.9)(0.018) + μ(t - 27.9)³/3.
S(t) = exp[-0.018t - μ(t - 27.9)³/3]. 0.5788 = S(30) = exp[-(0.018)(30) - μ(30 - 27.9)³/3].
(0.018)(30) + μ(30 - 27.9)³/3 = 0.5468. μ = 0.00222.
H(34) = -ln S(34) = (0.018)(34) + (0.00222)(34 - 27.9)³/3 = 0.780.
For the observed data, S₀(34) = 4/10 = 0.4. H₀(34) = -ln(0.4) = 0.916.
|H(34) - H₀(34)| = 0.916 - 0.780 = 0.14.
34.36. A. S(x) = exp(-∫_2^x h(t) dt) = exp[-(z²/2){ln(x) - ln(2)}] = (x/2)^(-z²/2), for x ≥ 2.
0.16 = S(5) = 2.5^(-z²/2). ln 0.16 = (-z²/2) ln 2.5. z² = -2 ln 0.16 / ln 2.5 = 4. z = 2.
Comment: S(x) = (2/x)², for x ≥ 2. A Single Parameter Pareto, with α = 2 and θ = 2.


34.37. D. μₓ corresponds to a legitimate survival function, if and only if it is nonnegative and its
integral from 0 to infinity is infinite. All the candidates are nonnegative.
∫_0^∞ B Cˣ dx = [B Cˣ/ln(C)]_0^∞ = ∞, since C > 1.
∫_0^∞ a(b + x)⁻¹ dx = [a ln(b + x)]_0^∞ = ∞.
∫_0^∞ (1 + x)⁻³ dx = [-(1 + x)⁻²/2]_0^∞ = 1/2 ≠ ∞.
Thus 1 and 2 could serve as a force of mortality.
Comment: #1 is Gompertz's Law. #2 is a Pareto Distribution with a = α and b = θ.
The Pareto is not useful for modeling human lives, since its force of mortality decreases to zero as x
approaches infinity.


34.38. E. 1. The Single Parameter Pareto is heavy tailed with an increasing mean residual life.
2. A Gamma with α > 1 is lighter tailed than an Exponential; it has a decreasing mean residual life and
an increasing hazard rate.
3. The Single Parameter Pareto has a heavier tail than a Gamma.
Comments: The mean residual life of a Single Parameter Pareto increases linearly as x goes to
infinity, e(x) = x/(α - 1). The hazard rate of a Single Parameter Pareto goes to zero as x goes to
infinity, h(x) = α/x. For this Gamma Distribution, h(x) = f(x)/S(x) =
{y e^(-y/500)/250,000}/{1 - Γ[2; y/500]} = {y e^(-y/500)/250,000}/{e^(-y/500) + (y/500)e^(-y/500)} =
1/(250,000/y + 500), where I have used Theorem A.1 to write out the incomplete Gamma function
for an integer parameter. h(x) increases to 1/500 = 1/θ as x approaches infinity.
A Single Parameter Pareto, which has a decreasing hazard rate, has a heavier righthand tail than a
Gamma, which has an increasing hazard rate.
A Gamma with α = 1 is an Exponential with constant hazard rate. For α integer, a Gamma is a sum of α
independent, identically distributed Exponentials. Therefore, as α → ∞, the Gamma Distribution
approaches a Normal Distribution. The Normal Distribution is very light-tailed and has an increasing
hazard rate. This is one way to remember that for α > 1, the Gamma Distribution has an increasing
hazard rate. For α < 1, the Gamma Distribution has a decreasing hazard rate.

34.39. B. h(t) = 0.002t. H(t) = ∫_0^t h(s) ds = 0.001t². S(t) = exp[-H(t)] = exp[-0.001t²].
We want F(t) ≤ 1%. F(3) = 1 - exp[-0.009] = 0.009 ≤ 1%, so 3 is OK.
F(4) = 1 - exp[-0.016] = 0.016 > 1%, so 4 is not OK.
Comment: A Weibull Distribution with τ = 2. 99% = S(t) = exp[-0.001t²]. t = 3.17.
34.40. C. For a constant force of mortality (hazard rate) one has an Exponential Distribution. For the
original strain of the disease: 10% = 1 - e^(-20μ). μ = 0.005268.
For the new strain, the probability of death in the next 20 years is:
1 - exp[-(20)(2μ)] = 1 - exp[-(20)(2)(0.005268)] = 1 - e^(-0.21072) = 19.0%.
Alternately, for twice the hazard rate, the survival function is squared.
For the original strain, S(20) = 1 - 0.10 = 0.90.
For the new strain, S(20) = 0.9² = 0.81.
For the new strain, the probability of death in the next 20 years is: 1 - 0.81 = 19%.

34.41. D. e(x) = ∫_x^ω S(t) dt / S(x) = ∫_x^ω (1 - t/ω)^α dt / (1 - x/ω)^α = {ω(1 - x/ω)^(α+1)/(α + 1)}/(1 - x/ω)^α
= (ω - x)/(α + 1). e(0) = ω/(α + 1).
By differentiating, f(x) = -dS(x)/dx = (α/ω)(1 - x/ω)^(α-1).
h(x) = f(x)/S(x) = {(α/ω)(1 - x/ω)^(α-1)}/(1 - x/ω)^α = α/(ω - x).
Let α be the original value and α′ be the new value of this parameter.
From bullet i: ω/(α′ + 1) = 0.5 ω/(α + 1). α′ = 2α + 1.
From bullet ii: α′/(ω - x) = 2.25 α/(ω - x). α′ = 2.25α.
Therefore, 2.25α = 2α + 1. α = 4.
Alternately, H(x) = -ln S(x) = -α ln(1 - x/ω).
h(x) = dH(x)/dx = (α/ω)/(1 - x/ω) = α/(ω - x). Proceed as before.
Comment: If α = 1, then one has DeMoivre's Law, the uniform distribution.
A Modified DeMoivre model has α times the hazard rate of DeMoivre's Law for all ages.
34.42. B. H(x) = ∫_0^x μ(t) dt = ∫_0^x 2/(110 - t) dt = -2{ln(110 - x) - ln(110)}.
S(x) = exp[-H(x)] = {(110 - x)/110}² = (1 - x/110)², for 0 ≤ x < 110.
e(30) = ∫_30^110 S(t) dt / S(30) = ∫_30^110 (1 - t/110)² dt / (1 - 30/110)² = (110/3)(1 - 30/110)³/(1 - 30/110)²
= (110 - 30)/3 = 26.67.
Comment: Generalized DeMoivre's Law with ω = 110 and α = 2. μ(x) = α/(ω - x), 0 ≤ x < ω.
e(x) = (ω - x)/(α + 1) = (110 - x)/3.
The remaining lifetime at age 30 is a Beta Distribution with a = 1, b = α = 2, and θ = ω - 30 = 80.

34.43. E. H(x) = ∫_0^x u(t) dt = ∫_0^x t²/4000 dt = x³/12,000.
S(x) = exp[-H(x)] = exp[-x³/12,000], for 0 ≤ x ≤ √4000.
S(10) = 0.9200. S(12) = 0.8659. S(14) = 0.7956.
Prob[fail between 12 and 14 | survive until 10] = {S(12) - S(14)}/S(10) =
(0.8659 - 0.7956)/0.9200 = 0.0764.
Comment: Without the restriction x ≤ √4000, this would be a Weibull Distribution with τ = 3.

34.44. A. H(x) = ∫_0^x μ(t) dt = ln(100) - ln(100 - x). S(x) = exp[-H(x)] = (100 - x)/100.
Prob[life aged 40 survives at least 10 years] = S(50)/S(40) = 0.5/0.6 = 5/6.
Prob[life aged 50 survives at least 10 years] = S(60)/S(50) = 0.4/0.5 = 4/5.
Prob[exactly one survives 10 years] = (5/6)(1 - 4/5) + (1 - 5/6)(4/5) = 9/30.
Comment: DeMoivre's Law with ω = 100.


Section 35, Loss Elimination Ratios and Excess Ratios


As discussed previously, the Loss Elimination Ratio (LER) is defined as the ratio of the losses
eliminated by a deductible to the total losses prior to imposition of the deductible. The losses
eliminated by a deductible d are E[X ∧ d], the Limited Expected Value at d.²⁵²

LER(x) = E[X ∧ x] / E[X].

The excess ratio R(x) is defined as the ratio of loss dollars excess of x divided by the total loss
dollars.²⁵³ It is the complement of the Loss Elimination Ratio; they sum to unity.

R(x) = (E[X] - E[X ∧ x]) / E[X] = 1 - E[X ∧ x]/E[X] = 1 - LER(x).

Using the formulas in Appendix A of Loss Models for the Limited Expected Value, one can use the
relationship R(x) = 1 - E[X ∧ x]/E[X] to compute the Excess Ratio.
For various distributions, here are the resulting formulas for the excess ratios, R(x):

Distribution              Excess Ratio, R(x)

Exponential               e^(-x/θ)

Pareto                    {θ/(θ + x)}^(α-1), α > 1

LogNormal                 1 - Φ[(ln(x) - μ - σ²)/σ] - {x/exp[μ + σ²/2]}{1 - Φ[(ln(x) - μ)/σ]}

Gamma                     1 - Γ(α+1; x/θ) - x{1 - Γ(α; x/θ)}/(αθ)

Weibull                   1 - Γ[1 + 1/τ; (x/θ)^τ] - (x/θ) exp(-(x/θ)^τ)/Γ[1 + 1/τ]

Single Parameter Pareto   (1/α)(x/θ)^(1-α), α > 1, x > θ

252 The losses eliminated are paid by the insured rather than the insurer. The insured would generally pay less for its
insurance in exchange for accepting a deductible. By estimating the percentage of losses eliminated the insurer can
price how much of a credit to give the insured for selecting various deductibles. How the LER is used to price
deductibles is beyond the scope of this exam, but generally the higher the loss elimination ratio, the greater the
deductible credit.
253 The excess ratio is used by actuaries to price reinsurance, workers compensation excess loss factors, etc.


Recall that the mean and thus the Excess Ratio fails to exist for: Pareto with α ≤ 1,
Generalized Pareto with α ≤ 1, and Burr with αγ ≤ 1. Except where the formula could be simplified,
there is a term in the Excess Ratio which is: -x S(x) / mean.²⁵⁴
Due to the computational length, exam questions involving the computation of Loss Elimination or
Excess Ratios are most likely to involve the Exponential, Pareto, Single Parameter Pareto, or
LogNormal Distributions.
Exercise: Compute the excess ratios at $1 million and $5 million for a Pareto with parameters
α = 1.702 and θ = 240,151.
[Solution: For the Pareto R(x) = {θ/(θ + x)}^(α-1). R($1 million) = (240,151/1,240,151)^0.702 = 0.316.
R($5 million) = (240,151/5,240,151)^0.702 = 0.115.]
Since LER(x) = 1 - R(x), one can use the formulas for the Excess Ratio to get the Loss Elimination
Ratio and vice-versa.
Exercise: Compute the loss elimination ratio at 10,000 for the Pareto with parameters:
α = 1.702 and θ = 240,151.
[Solution: For the Pareto, R(x) = {θ/(θ + x)}^(α-1). Therefore, LER(x) = 1 - {θ/(θ + x)}^(α-1).
LER(10,000) = 1 - (240,151/250,151)^0.702 = 2.8%.
Comment: One could get the same result by using LER(x) = E[X ∧ x] / mean.]
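For those who like to check such computations with a few lines of code, here is a Python sketch (not part of the original text; the function name is mine) reproducing the two exercises above:

```python
def pareto_excess_ratio(x, alpha, theta):
    # R(x) = {theta/(theta + x)}^(alpha - 1), valid for alpha > 1.
    return (theta / (theta + x)) ** (alpha - 1)

alpha, theta = 1.702, 240151.0
print(round(pareto_excess_ratio(1e6, alpha, theta), 3))        # 0.316
print(round(pareto_excess_ratio(5e6, alpha, theta), 3))        # 0.115
print(round(1 - pareto_excess_ratio(10000, alpha, theta), 3))  # LER(10,000) = 0.028
```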
Loss Elimination Ratio and Excess Ratio in Terms of the Survival Function:

As discussed previously, for a distribution with support starting at zero, the Limited Expected Value
can be written as an integral of the Survival Function from 0 to the limit:

E[X ∧ x] = ∫_0^x S(t) dt.

LER(x) = E[X ∧ x] / E[X], therefore:

LER(x) = ∫_0^x S(t) dt / E[X] = ∫_0^x S(t) dt / ∫_0^∞ S(t) dt.

254 This term comes from the second part of -E[X ∧ x] in the numerator, -xS(x). For example for the Gamma
Distribution, the excess ratio has a term -x{1 - Γ(α; x/θ)}/(αθ) = -xS(x)/(αθ) = -xS(x)/mean.


Thus, for a distribution with support starting at zero, the Loss Elimination Ratio is the
integral from zero to the limit of S(x), divided by the mean.
Since R(x) = 1 - LER(x) = (E[X] - E[X ∧ x]) / E[X], the Excess Ratio can be written as:

R(x) = ∫_x^∞ S(t) dt / E[X] = ∫_x^∞ S(t) dt / ∫_0^∞ S(t) dt.

So the excess ratio is the integral of the survival function from the limit to infinity, divided by the mean.²⁵⁵
For example, for the Pareto Distribution, S(x) = θ^α (θ + x)^(-α). So that:
R(x) = {θ^α (θ + x)^(1-α)/(α - 1)} / {θ/(α - 1)} = {θ/(θ + x)}^(α-1).
This matches the formula given above for the Excess Ratio of the Pareto Distribution.
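The integral-of-the-survival-function form also lends itself to a quick numerical check. Here is a Python sketch (not part of the original text; the function names are mine), computing LER(10,000) for the Pareto used above by numerically integrating S(t):

```python
def pareto_S(t, alpha, theta):
    return (theta / (theta + t)) ** alpha

def ler_numeric(x, S, mean, steps=100000):
    # LER(x) = integral from 0 to x of S(t) dt, divided by the mean (trapezoid rule).
    dt = x / steps
    integral = sum(0.5 * (S(i * dt) + S((i + 1) * dt)) * dt for i in range(steps))
    return integral / mean

alpha, theta = 1.702, 240151.0
mean = theta / (alpha - 1)
print(ler_numeric(10000.0, lambda t: pareto_S(t, alpha, theta), mean))
# approximately 0.028, matching LER(x) = 1 - {theta/(theta + x)}^(alpha - 1)
```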

LER(x) = ∫_0^x S(t) dt / E[X].

Since d LER(x)/dx = S(x)/E[X] ≥ 0, the loss elimination ratio is an increasing function of x.²⁵⁶

d²LER(x)/dx² = -f(x)/E[X]. Since -f(x)/E[X] ≤ 0, the loss elimination ratio is a concave downwards function of x.

The loss elimination ratio as a function of x is increasing, concave downwards, and approaches one
as x approaches infinity.

255 This result is used extremely often by property/casualty actuaries. See for example, "The Mathematics of Excess
of Loss Coverage and Retrospective Rating -- A Graphical Approach," by Y.S. Lee, PCAS LXXV, 1988.
256 If S(x) = 0, in other words there is no possibility of a loss of size greater than x, then the loss elimination ratio is a
constant 1, and therefore, more precisely the loss elimination ratio is nondecreasing.


For example, here is a graph of the loss elimination ratio for a Pareto Distribution with parameters
α = 1.702 and θ = 240,151:²⁵⁷
[Figure: LER(x) plotted against the size of loss in millions; the curve is increasing and concave downwards, reaching roughly 0.8 only at very large sizes.]
Since the loss elimination ratio is increasing and concave downwards, the excess ratio is decreasing
and concave upwards (convex).
For a distribution with support starting at zero:
d LER(x)/dx = S(x)/E[X].   d LER(0)/dx = 1/E[X].   S(x) = {d LER(x)/dx} / {d LER(0)/dx}.
Therefore, the loss elimination ratio function determines the distribution function, as well as vice-versa.
Layers of Loss:
As discussed previously, layers can be thought of in terms of the difference of loss elimination ratios
or the difference of excess ratios in the opposite order.
Exercise: Compute the percent of losses in the layer from $1 million to $5 million for a Pareto
Distribution with parameters α = 1.702 and θ = 240,151.
[Solution: For this Pareto Distribution, R($1 million) - R($5 million) = 0.316 - 0.115 = 0.201.
Thus for this Pareto, 20.1% of the losses are in the layer from $1 million to $5 million.]
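The same layer percentage can be computed as a difference of limited expected values divided by the mean; here is a Python sketch of that check (not part of the original text; the function name is mine, and the limited expected value formula is the Pareto formula from Appendix A of Loss Models):

```python
def pareto_lev(x, alpha, theta):
    # Limited expected value for the Pareto (alpha != 1): E[X ^ x] = theta/(alpha-1) * {1 - (theta/(theta+x))^(alpha-1)}.
    return theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

alpha, theta = 1.702, 240151.0
mean = theta / (alpha - 1)
layer_pct = (pareto_lev(5e6, alpha, theta) - pareto_lev(1e6, alpha, theta)) / mean
print(round(layer_pct, 3))   # 0.201, the same as R(1 million) - R(5 million)
```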
257 As x approaches infinity, the loss elimination ratio approaches one. In this case it approaches the limit slowly.


Problems:
35.1 (2 points) Assume you have a Pareto distribution with α = 5 and θ = $1000.
What is the Loss Elimination Ratio for $500?
A. less than 78%
B. at least 78% but less than 79%
C. at least 79% but less than 80%
D. at least 80% but less than 81%
E. at least 81%
35.2 (2 points) Assume you have a Pareto distribution with α = 5 and θ = $1000.
What is the Excess Ratio for $2000?
A. less than 1%
B. at least 1% but less than 2%
C. at least 2% but less than 3%
D. at least 3% but less than 4%
E. at least 4%
35.3 (3 points) You observe the following 35 losses: 6, 7, 11, 14, 15, 17, 18, 19, 25, 29, 30, 34,
40, 41, 48, 49, 53, 60, 63, 78, 85, 103, 124, 140, 192, 198, 227, 330, 361, 421, 514 ,546, 750,
864, 1638.
What is the (empirical) Loss Elimination Ratio at 50?
A. less than 0.14
B. at least 0.14 but less than 0.16
C. at least 0.16 but less than 0.18
D. at least 0.18 but less than 0.20
E. at least 0.20
35.4 (2 points) The size of losses follows a LogNormal distribution with parameters μ = 11 and
σ = 2.5. What is the Excess Ratio for 100 million?
A. less than 5%
B. at least 5% but less than 10%
C. at least 10% but less than 15%
D. at least 15% but less than 20%
E. at least 20%


35.5 (2 points) The size of losses follows a Gamma distribution with parameters
α = 3, θ = 100,000. What is the excess ratio for 500,000?
Hint: Use Theorem A.1 in Appendix A of Loss Models:
Γ(n; x) = 1 - Σ_{j=0}^{n-1} xʲ e⁻ˣ / j!, for n a positive integer.

A. less than 5.6%


B. at least 5.6% but less than 5.8%
C. at least 5.8% but less than 6.0%
D. at least 6.0% but less than 6.2%
E. at least 6.2%
35.6 (2 points) The size of losses follows a LogNormal distribution with parameters μ = 10, σ = 3.
What is the Loss Elimination Ratio for 7 million?
A. less than 10%
B. at least 10% but less than 15%
C. at least 15% but less than 20%
D. at least 20% but less than 25%
E. at least 25%
Use the following information for the next two questions:
Accident sizes for Risk 1 follow an Exponential distribution, with mean θ.
Accident sizes for Risk 2 follow an Exponential distribution, with mean 1.2θ.
The insurer pays all losses in excess of a deductible of d.
10 accidents are expected for each risk each year.

35.7 (1 point) Determine the expected amount of annual losses paid by the insurer for Risk 1.
A. 10d   B. 10/(θd)   C. 10θ   D. 10θe^(-d/θ)   E. 10e^(-d/θ)

35.8 (1 point) Determine the limit as d goes to infinity of the ratio of the expected amount of annual
losses paid by the insurer for Risk 2 to the expected amount of annual losses paid by the insurer for
Risk 1.
A. 0   B. 1/1.2   C. 1   D. 1.2   E. ∞


35.9 (1 point) You have the following estimates of integrals of the Survival Function.
∫_0^1000 S(x) dx ≈ 400.   ∫_1000^∞ S(x) dx ≈ 2300.
Estimate the Loss Elimination Ratio at 1000.


A. less than 15%
B. at least 15% but less than 16%
C. at least 16% but less than 17%
D. at least 17% but less than 18%
E. at least 18%
35.10 (4 points) For a LogNormal distribution with coefficient of variation equal to 3, what is the Loss
Elimination Ratio at twice the mean?
A. less than 50%
B. at least 50% but less than 55%
C. at least 55% but less than 60%
D. at least 60% but less than 65%
E. at least 65%
35.11 (3 points) The loss elimination ratio at x, for 0 ≤ x ≤ 1, is:
{ln[a + bˣ] - ln[a + 1]} / {ln[a + b] - ln[a + 1]}, 1 > b > 0, a > 0.

Determine the form of the distribution function.


35.12 (4, 5/86, Q.59) (2 points) Assume that losses follow the probability density function
f(x) = x/18 for 0 ≤ x ≤ 6, with f(x) = 0 otherwise.
What is the loss elimination ratio (LER) for a deductible of 2?
A. Less than .35
B. At least .35, but less than .40
C. At least .40, but less than .45
D. At least .45, but less than .50
E. .50 or more.
35.13 (4, 5/87, Q.57) (2 points) Losses are distributed with a probability density function
f(x) = 2/x³, 1 < x < ∞. What is the expected loss eliminated by a deductible of d = 5?
A. Less than .5
B. At least .5, but less than 1
C. At least 1, but less than 1.5
D. At least 1.5, but less than 2
E. 2 or more.


35.14 (4B, 5/92, Q.25) (2 points) You are given the following information:
Deductible: $250
Expected size of loss with no deductible: $2,500
Probability of a loss exceeding deductible: 0.95
Mean Excess Loss value of the deductible: $2,375
Determine the loss elimination ratio.
A. Less than 0.035
B. At least 0.035 but less than 0.070
C. At least 0.070 but less than 0.105
D. At least 0.105 but less than 0.140
E. At least 0.140

35.15 (4B, 11/92, Q.18) (2 points) You are given the following information:
Deductible, d: $500
Expected value limited to d, E[X ∧ d]: $465
Probability of a loss exceeding deductible, 1 - F(d): 0.86
Mean Excess Loss value of the deductible, e(d): $5,250
Determine the loss elimination ratio.
A. Less than 0.035
B. At least 0.035 but less than 0.055
C. At least 0.055 but less than 0.075
D. At least 0.075 but less than 0.095
E. At least 0.095

35.16 (4B, 5/94, Q.10) (2 points) You are given the following:

The amount of a single loss has a Pareto distribution with parameters α = 2 and θ = 2000.

Calculate the Loss Elimination Ratio (LER) for a $500 deductible.


A. Less than 0.18
B. At least 0.18, but less than 0.23
C. At least 0.23, but less than 0.28
D. At least 0.28, but less than 0.33
E. At least 0.33


35.17 (4B, 5/96, Q.9 & Course 3 Sample Exam, Q.17) (2 points)
You are given the following:

Losses follow a lognormal distribution, with parameters μ = 7 and σ = 2.


There is a deductible of 2,000.
10 losses are expected each year.

The number of losses and the individual loss amounts are independent.
Determine the loss elimination ratio (LER) for the deductible.
A. Less than 0.10
B. At least 0.10, but less than 0.15
C. At least 0.15, but less than 0.20
D. At least 0.20, but less than 0.25
E. At least 0.25
35.18 (4B, 11/96, Q.13) (2 points) You are given the following:

Losses follow a Pareto distribution, with parameters θ = k and α = 2, where k is a constant.

There is a deductible of 2k.


What is the loss elimination ratio (LER)?
A. 1/3
B. 1/2
C. 2/3

D. 4/5

E. 1

35.19 (4B, 5/97, Q.19) (3 points) You are given the following:

Losses follow a distribution with density function


f(x) = (1/1000) e^(-x/1000), 0 < x < ∞.

There is a deductible of 500.

10 losses are expected to exceed the deductible each year.


Determine the amount to which the deductible would have to be raised to double the loss
elimination ratio (LER).
A. Less than 550
B. At least 550, but less than 850
C. At least 850, but less than 1,150
D. At least 1,150, but less than 1,450
E. At least 1,450


Use the following information for the next two questions:


Loss sizes for Risk 1 follow a Pareto distribution, with parameters θ and α, α > 2.
Loss sizes for Risk 2 follow a Pareto distribution, with parameters θ and 0.8α, α > 2.
The insurer pays losses in excess of a deductible of k.
1 loss is expected for each risk each year.
35.20 (4B, 11/97, Q.22) (2 points) Determine the expected amount of annual losses paid by the
insurer for Risk 1.
+k


A.
B.
C.
-1
( +k)
( +k) + 1
D.

+ 1
( -1) ( +k)

E.

( -1) ( +k) 1

35.21 (4B, 11/97, Q.23) (1 point) Determine the limit of the ratio of the expected amount of annual
losses paid by the insurer for Risk 2 to the expected amount of annual losses paid by the insurer for
Risk 1 as k goes to infinity.
A. 0
B. 0.8
C. 1
D. 1.25
E. ∞
35.22 (4B, 5/99, Q.20) (2 points) Losses follow a lognormal distribution, with parameters
μ = 6.9078 and σ = 1.5174. Determine the ratio of the loss elimination ratio (LER) at 10,000 to the
loss elimination ratio (LER) at 1,000.
A. Less than 2
B. At least 2, but less than 4
C. At least 4, but less than 6
D. At least 6, but less than 8
E. At least 8


35.23 (SOA3, 11/03, Q.29 & 2009 Sample Q.87) (2.5 points)
The graph of the density function for losses is:
[Figure: f(x) = 0.010 for 0 ≤ x ≤ 80, then decreasing linearly from 0.010 at x = 80 to 0 at x = 120; the horizontal axis is the loss amount x.]

Calculate the loss elimination ratio for an ordinary deductible of 20.


(A) 0.20
(B) 0.24
(C) 0.28
(D) 0.32
(E) 0.36
35.24 (SOA M, 11/06, Q.29 & 2009 Sample Q.284) (2.5 points)
A risk has a loss amount which has a Poisson distribution with mean 3.
An insurance covers the risk with an ordinary deductible of 2.
An alternative insurance replaces the deductible with coinsurance α, which is the proportion of the
loss paid by the insurance, so that the expected insurance cost remains the same.
Calculate α.
(A) 0.22

(B) 0.27

(C) 0.32

(D) 0.37

(E) 0.42


Solutions to Problems:
35.1. D. LER(x) = E[X ∧ x] / mean. For the Pareto: mean = θ/(α-1) = $250 and
E[X ∧ 500] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)} = $200.6. LER(500) = 200.6/250 = 80.24%.
Comment: E[X x] = 246.925, mean = 250.
35.3. D. E[X 50] = { 6 + 7 + 11+ 14 +15 + 17 + 18 + 19 + 25+ 29+ 30+ 34 + 40 + 41+ 48 +
49 + (19)(50)} /35 = (403 + 950)/35 = 38.66. E[X] = {6 + 7 + 11 +14 + 15 + 17 + 18 +19 + 25 +
29 + 30 + 34 + 40 + 41+ 48 + 49 + 53 + 60 + 63 + 78 + 85 + 103 + 124 + 140 + 192 + 198 +
227 + 330 + 361 + 421 + 514 + 546 + 750 + 864 + 1638}/35 = 7150 /35 = 204.29.
LER(50) = E[X 50] / E[X] = 38.66 / 204.29 = 0.189.
35.4. E. mean = exp( + 2/2) = exp(11 + 2.52 /2) = 1362729.
E{X x] = exp( + 2/2)[(lnx - 2)/] + x {1 [(lnx - )/]}.
E{X 100 million] = 1362729[(18.421 - 11 - 2.52 )/2.5] (100 million){1 - [(18.421 - 11)/2.5]} =
1362729[.47] (100 million){1 - [2.97]} = 1,362,729(.6808) - (100 million){1 - 0.9985) =
1,077,745. Then R(100 million) = 1 - 1,077,745 / 1,362,729 = 1 - .791 = 20.9%.
35.5. B. Γ[3; 5] = 1 - e⁻⁵(1 + 5 + 5²/2) = 0.875.
Γ[4; 5] = 1 - e⁻⁵(1 + 5 + 5²/2 + 5³/6) = 0.735.
For the Gamma, E[X] = αθ = 300,000.
E[X ∧ 500,000] = (αθ)Γ[α+1; 500,000/θ] + 500,000(1 - Γ[α; 500,000/θ]) =
300,000 Γ[4; 5] + 500,000(1 - Γ[3; 5]) = 283,000.
Excess Ratio = 1 - E[X ∧ 500,000]/E[X] = 1 - 283,000/300,000 = 1 - 0.943 = 5.7%.

35.6. D. mean = exp(μ + σ²/2) = exp(10 + 3²/2) = 1,982,759.
E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x - μ - σ²)/σ] + x{1 - Φ[(ln x - μ)/σ]}.
E[X ∧ 7 million] = 1,982,759 Φ[(15.761 - 10 - 3²)/3] + (7 million){1 - Φ[(15.761 - 10)/3]}
= 1,982,759 Φ[-1.08] + (7 million){1 - Φ[1.92]} =
1,982,759(0.1401) + (7 million)(1 - 0.9726) = 469,585.
Then LER(7 million) = E[X ∧ 7 million]/E[X] = 469,585/1,982,759 = 0.237.
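One can reproduce this LogNormal calculation without a normal table by using the error function; here is a Python sketch (not part of the original solution; the function name Phi is mine):

```python
import math

def Phi(z):
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, x = 10.0, 3.0, 7e6
mean = math.exp(mu + sigma**2 / 2)
lev = mean * Phi((math.log(x) - mu - sigma**2) / sigma) + x * (1 - Phi((math.log(x) - mu) / sigma))
print(lev / mean)   # LER(7 million), approximately 0.237
```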


35.7. D. The expected amount paid by the insurer is: 10{E[X] - E[X ∧ d]} =
10{θ - θ(1 - e^(-d/θ))} = 10θe^(-d/θ).
Alternately, per claim the losses excess of the limit d are: e(d)(1 - F(d)) = θe^(-d/θ).
Thus for 10 claims we expect the insurer to pay: 10θe^(-d/θ).
Alternately, per claim the losses excess of the limit d are: R(d)E[X] = e^(-d/θ) θ = θe^(-d/θ).
Thus for 10 claims we expect the insurer to pay: 10θe^(-d/θ).
35.8. E. Using the solution to the previous question, the expected amount paid by the insurer for
Risk 1 is: 10θe^(-d/θ).
Similarly, the expected amount paid by the insurer for Risk 2 is: 12θe^(-d/(1.2θ)).
Therefore, the ratio of the expected amount of annual losses paid by the insurer for Risk 2 to the
expected amount of annual losses paid by the insurer for Risk 1 is:
{12θe^(-d/(1.2θ))} / {10θe^(-d/θ)} = 1.2 e^(0.167d/θ). As d goes to infinity this goes to infinity.
35.9. A. LER(1000) = ∫_0^1000 S(t) dt / ∫_0^∞ S(t) dt ≈ 400/(400 + 2300) = 14.8%.
Comment: The estimated mean is 400 + 2300 = 2700. The estimated limited expected value at
1000 is 400.


35.10. D. 1 + CV² = E[X²]/E[X]² = exp[2μ + 2σ²]/exp[μ + σ²/2]² = exp[σ²].
1 + 3² = exp[σ²]. σ = √(ln(10)) = 1.5174.
LER[x] = E[X ∧ x]/mean = {exp(μ + σ²/2)Φ[(ln x - μ - σ²)/σ] + x{1 - Φ[(ln x - μ)/σ]}} / exp(μ + σ²/2) =
Φ[(ln x - μ)/σ - σ] + (x/mean){1 - Φ[(ln x - μ)/σ]}. x = 2 mean. ln(x) = ln(2) + μ + σ²/2.
(ln x - μ)/σ = ln(2)/σ + σ/2 = 0.69315/1.5174 + 1.5174/2 = 1.2155.
LER[x] = Φ[1.2155 - 1.5174] + (2){1 - Φ[1.2155]} = Φ[-0.30] + 2(1 - Φ[1.22]) =
0.3821 + 2(1 - 0.8888) = 60.5%.
Comment: See Table I in "Estimating Pure Premiums by Layer - An Approach" by Robert J.
Finger, PCAS 1976. Finger calculates excess ratios, which are one minus the loss elimination ratios.
Here is a graph of Loss Elimination Ratios as a function of the ratio to the mean, for LogNormal
Distributions with some different values of the coefficient of variation:
[Figure: LER versus the ratio of the limit to the mean, for LogNormal Distributions with CV = 1, 2, and 3.]

35.11. LER(x) = ∫_0^x S(t) dt / E[X].  d LER(x)/dx = S(x)/E[X].
d LER(0)/dx = 1/E[X].  S(x) = {d LER(x)/dx} / {d LER(0)/dx}.  F(x) = 1 - {d LER(x)/dx} / {d LER(0)/dx}.

d LER(x)/dx = {1/(ln[a + b] - ln[a + 1])} ln[b] bˣ/(a + bˣ).
d LER(0)/dx = {1/(ln[a + b] - ln[a + 1])} ln[b]/(a + 1).

F(x) = 1 - (a + 1) bˣ/(a + bˣ), 0 ≤ x < 1.

S(1⁻) = (a + 1)b/(a + b) > 0.

Thus there is a point mass of probability at 1 of size: (a + 1)b/(a + b).
Comment: Note that F(1⁻) = a(1 - b)/(a + b) < 1.
This is a member of the MBBEFD Distribution Class, not on the syllabus of your exam.
See "Swiss Re Exposure Curves and the MBBEFD Distribution Class," by Stefan Bernegger,
ASTIN Bulletin, Vol. 27, No. 1, May 1997, pp. 99-111, on the syllabus of CAS Exam 8.
Here is a graph of the loss elimination ratio for b = 0.2 and a = 3:
[Figure: LER(x) for 0 ≤ x ≤ 1, increasing and concave downwards, approaching 1 as x approaches 1.]
As it should, the LER is increasing, concave downwards, and approaches 1 as x approaches 1.
Here is a graph of the Survival Function for b = 0.2 and a = 3:
[Figure: S(x) for 0 ≤ x ≤ 1, decreasing from 1, with a jump remaining at x = 1 due to the point mass there.]
There is a point mass of probability at x = 1 of size: (a+1)b / (a + b) = (4)(0.2)/3.2 = 25%.
35.12. D. F(x) = x²/36.
LER(2) = {∫_0^2 x f(x) dx + 2(1 - F(2))} / ∫_0^6 x f(x) dx = {(2³)/54 + 2(1 - 2²/36)} / {(6³)/54} =
1.926/4 = 0.481.
35.13. D. Integrating f(x) from 1 to x, F(x) = 1 - 1/x². A deductible of d eliminates the size of the
loss for small losses and d per large loss. The expected losses eliminated by a deductible of d are:
∫_1^d x f(x) dx + d(1 - F(d)) = ∫_1^d 2x⁻² dx + d(1/d²) = [-2/x]_1^d + 1/d = (2 - 2/d) + 1/d = 2 - 1/d.
For d = 5, the expected losses eliminated are: 2 - 1/5 = 1.8.
Comment: A Single Parameter Pareto with α = 2 and θ = 1.
35.14. C. e(x) = {mean - E[X ∧ x]}/{1 - F(x)}. Therefore: 2375 = (2500 - E[X ∧ x])/0.95.
Thus E[X ∧ x] = 243.75. Then, LER(x) = E[X ∧ x]/E[X] = 243.75/2500 = 0.0975.
Alternately, LER(x) = 1 - e(x){1 - F(x)}/E[X] = 1 - (2375)(0.95)/2500 = 0.0975.
35.15. D. LER(d) = E[X ∧ d]/Mean. e(d) = (Mean − E[X ∧ d])/S(d).
Therefore, 5250 = (Mean − 465)/0.86. Therefore Mean = 4515 + 465 = 4980.
Thus LER(d) = 465/4980 = 0.093.
Comment: One does not use the information that the deductible amount is $500.
However, note that E[X ∧ d] ≤ d, as it should be.


35.16. B. For the Pareto distribution LER(x) = 1 − (1 + x/θ)^(1−α). For α = 2 and
θ = 2000, LER(500) = 1 − 1/1.25 = 0.2.
Alternately, E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ+x))^(α−1)}, and E[X ∧ 500] = (2000)(1 − 0.8) = 400.
The mean is θ/(α−1) = 2000. LER(x) = E[X ∧ x]/mean = 400/2000 = 0.2.
35.17. B. Mean = exp(μ + 0.5σ²) = exp(7 + 2) = e^9 = 8103.
E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x − μ − σ²)/σ] + x{1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 2000] = 8103 Φ[(ln 2000 − 7 − 4)/2] + 2000{1 − Φ[(ln 2000 − 7)/2]} =
8103 Φ(−1.7) + 2000{1 − Φ(0.3)} = (8103)(1 − 0.9554) + (2000)(1 − 0.6179) = 361 + 764 = 1125.
LER(2000) = E[X ∧ 2000]/Mean = 1125/8103 = 0.139.
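As a quick numeric check (not part of the original solution), the following Python sketch evaluates the LogNormal limited expected value formula used above; the function names are just for illustration.

from math import exp, log, sqrt, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def lognormal_lev(x, mu, sigma):
    # E[X ^ x] for a LogNormal, per the formula used in the solution above.
    mean = exp(mu + sigma**2 / 2)
    return (mean * norm_cdf((log(x) - mu - sigma**2) / sigma)
            + x * (1.0 - norm_cdf((log(x) - mu) / sigma)))

mu, sigma = 7.0, 2.0
mean = exp(mu + sigma**2 / 2)            # about 8103
lev = lognormal_lev(2000.0, mu, sigma)   # about 1125
print(mean, lev, lev / mean)             # LER(2000) is about 0.139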
35.18. C. For the Pareto Distribution, the Loss Elimination Ratio is: 1 − (θ/(θ+x))^(α−1).
For θ = k and α = 2, LER(x) = 1 − k/(k+x) = x/(k+x). Thus LER(2k) = 2k/3k = 2/3.
Comment: If one does not remember the formula for the LER for the Pareto, one can use the
formula for the limited expected value and the fact that LER(x) = E[X ∧ x]/E[X].
35.19. E. For the Exponential Distribution: LER(x) = 1 − e^(−x/θ). For θ = 1000,
LER(500) = 1 − e^(−0.5) = 0.3935. In order to double the LER: (2)(0.3935) = 1 − e^(−x/1000).
Thus e^(−x/1000) = 0.213. x = −1000 ln(0.213) = 1546.
Comment: For the Exponential, e(x) = θ, and therefore R(x) = e(x)S(x)/mean =
(θ)(e^(−x/θ))/(θ) = e^(−x/θ). Thus LER(x) = 1 − R(x) = 1 − e^(−x/θ).
35.20. E. Since there is 1 loss expected per risk per year, the expected amount paid by the
insurer is: E[X] − E[X ∧ k] = θ/(α−1) − {θ/(α−1)}{1 − θ^(α−1)/(θ+k)^(α−1)} =
{θ/(α−1)} θ^(α−1)/(θ+k)^(α−1) = θ^α/{(α−1)(θ+k)^(α−1)}.
Alternately, the losses excess of the limit k are e(k)(1 − F(k)) = {(k+θ)/(α−1)} θ^α/(θ+k)^α =
θ^α/{(α−1)(θ+k)^(α−1)}.
Alternately, the losses excess of the limit k are R(k)E[X] =
{θ/(θ+k)}^(α−1){θ/(α−1)} = θ^α/{(α−1)(θ+k)^(α−1)}.


35.21. E. Using the solution to the prior question, but with 0.8α rather than α, the expected amount
of annual losses paid by the insurer for Risk 2 is:
θ^(0.8α)/{(0.8α−1)(θ+k)^(0.8α−1)}. That for Risk 1 is: θ^α/{(α−1)(θ+k)^(α−1)}. The ratio is:
{(α−1)/(0.8α−1)}(θ+k)^(0.2α)/θ^(0.2α). As k goes to infinity, this ratio goes to infinity.
Comment: The loss distribution of Risk 2 has a heavier tail than Risk 1. The pricing of very large
deductibles is very sensitive to the value of the Pareto shape parameter, α.
35.22. B. LER = E[X ∧ x]/E[X]. LER(10,000)/LER(1000) = E[X ∧ 10,000]/E[X ∧ 1000].
E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x − μ − σ²)/σ] + x{1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 10,000] = e^8.059 Φ[0] + 10,000{1 − Φ[1.52]} = (3162)(0.5) + 10,000(1 − 0.9357) = 2224.
E[X ∧ 1000] = e^8.059 Φ[−1.52] + 1000{1 − Φ[0]} = (3162)(0.0643) + 1000(0.5) = 703.
E[X ∧ 10,000]/E[X ∧ 1000] = 2224/703 = 3.16.
35.23. E. F(20) = (20)(0.010) = 0.2. S(20) = 1 − 0.2 = 0.8.
f(x) = 0.01, 0 ≤ x ≤ 80.
From 80 to 120 the density is linear, equal to 0 at 120 and 0.010 at 80:
f(x) = (0.01)(120 − x)/40 = 0.03 − 0.00025x, 80 ≤ x ≤ 120.
E[X] = ∫_0^80 0.01x dx + ∫_80^120 x(0.03 − 0.00025x) dx
= 0.01x²/2 ]_0^80 + 0.03x²/2 ]_80^120 − 0.00025x³/3 ]_80^120 = 50.67.
E[X ∧ 20] = ∫_0^20 0.01x dx + 20 S(20) = 2 + (20)(0.8) = 18.
LER(20) = E[X ∧ 20]/E[X] = 18/50.67 = 35.5%.

35.24. E. E[X ∧ 2] = 0 f(0) + 1 f(1) + 2{1 − f(0) − f(1)} = 2 − 2f(0) − f(1) = 2 − 2e^(−3) − 3e^(−3) = 2 − 5e^(−3).
Loss Elimination Ratio = E[X ∧ 2]/E[X] = (2 − 5e^(−3))/3 = 0.584.
We are told that the requested quantity is: 1 − LER = 1 − 0.584 = 0.416.
Comment: It is set equal to the excess ratio for a deductible of 2.


Section 36, The Effects of Inflation


Inflation is a very important consideration when pricing Health Insurance and Property/Casualty
Insurance. Important ideas include the effect of inflation when there is a maximum covered loss
and/or deductible, in particular the effect on the average payment per loss and the average
payment per payment, the effect on other quantities of interest, and the effect on size of loss
distributions. Memorize the formulas for the average sizes of payment including inflation,
discussed in this section!
On this exam, we deal with the effects of uniform inflation, meaning that a single inflation factor is
applied to all sizes of loss.258 For example, if there is 5% inflation from 1999 to the year 2000, we
assume that a loss of size x in 1999 would have been of size 1.05x if it had instead occurred in the
year 2000.
Effect of a Maximum Covered Loss:
Exercise: You are given the following:
For 1999 the amount of a single loss has the following distribution:
Amount       Probability
$500         20%
$1,000       30%
$5,000       25%
$10,000      15%
$25,000      10%
An insurer pays all losses after applying a $10,000 maximum covered loss to each loss.
Inflation of 5% impacts all losses uniformly from 1999 to 2000.
Assuming no change in the maximum covered loss, what is the inflationary impact on dollars paid by
the insurer in the year 2000 as compared to the dollars the insurer paid in 1999?
[Solution: One computes the average amount paid by the insurer per loss in each year:

Probability   1999 Amount   1999 Insurer   2000 Amount   2000 Insurer
              of Loss       Payment        of Loss       Payment
0.20          500           500            525           525
0.30          1,000         1,000          1,050         1,050
0.25          5,000         5,000          5,250         5,250
0.15          10,000        10,000         10,500        10,000
0.10          25,000        10,000         26,250        10,000

Average       5,650.00      4,150.00       5,932.50      4,232.50

4232.50 / 4150 = 1.020, therefore the insurer's payments increased 2.0%.]
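For readers who prefer to check the arithmetic by computer, here is a minimal Python sketch (not part of the original exercise) reproducing the table above: the expected payment per loss with a fixed $10,000 maximum covered loss, before and after 5% uniform inflation.

losses = [500, 1_000, 5_000, 10_000, 25_000]
probs  = [0.20, 0.30, 0.25, 0.15, 0.10]
limit  = 10_000

def expected_payment(inflation_factor):
    # Apply the inflation factor to each loss, then cap at the fixed limit.
    return sum(p * min(inflation_factor * x, limit) for p, x in zip(probs, losses))

paid_1999 = expected_payment(1.00)   # 4150.0
paid_2000 = expected_payment(1.05)   # 4232.5
print(paid_2000 / paid_1999 - 1)     # about 0.020, i.e. a 2.0% increase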


258

Over a few years inflation can often be assumed to be approximately uniform by size of loss. However, over
longer periods of time the larger losses often increase at a different rate than the smaller losses.

2013-4-2,

Loss Distributions, 36 Inflation

HCM 10/8/12,

Page 582

Inflation on the limited losses is 2%, less than that of the total losses. Prior to the application of the
maximum covered loss, the average loss increased by the overall inflation rate of 5%, from 5650 to
5932.5. In general, for a fixed limit, limited losses increase more slowly than the overall
rate of inflation.
Effect of a Deductible:
Exercise: You are given the following:
For 1999 the amount of a single loss has the following distribution:
Amount       Probability
$500         20%
$1,000       30%
$5,000       25%
$10,000      15%
$25,000      10%
An insurer pays all losses after applying a $1000 deductible to each loss.
Inflation of 5% impacts all losses uniformly from 1999 to 2000.
Assuming no change in the deductible, what is the inflationary impact on dollars paid by the insurer in
the year 2000 as compared to the dollars the insurer paid in 1999?
[Solution: One computes the average amount paid by the insurer per loss in each year:

Probability   1999 Amount   1999 Insurer   2000 Amount   2000 Insurer
              of Loss       Payment        of Loss       Payment
0.20          500           0              525           0
0.30          1,000         0              1,050         50
0.25          5,000         4,000          5,250         4,250
0.15          10,000        9,000          10,500        9,500
0.10          25,000        24,000         26,250        25,250

Average       5,650.00      4,750.00       5,932.50      5,027.50

5027.5 / 4750 = 1.058, therefore the insurer's payments increased 5.8%.]


Inflation on the losses excess of the deductible is 5.8%, greater than that of the total losses. Prior to
the application of the deductible, the average loss increased by the overall inflation rate of 5%, from
5650 to 5932.5. In general, for a fixed deductible, losses paid excess of the deductible
increase more quickly than the overall rate of inflation.
The Loss Elimination Ratio in 1999 is: (5650 - 4750) / 5650 = 15.9%.
The Loss Elimination Ratio in 2000 is: (5932.5 -5027.5) / 5932.5 = 15.3%.
In general, under uniform inflation for a fixed deductible amount the LER declines.
The effect of a fixed deductible decreases over time.


Similarly, under uniform inflation the Excess Ratio over a fixed amount increases.259
If a reinsurer were selling reinsurance excess of a fixed limit such as $1 million, then over time
the losses paid by the reinsurer would be expected to increase faster than the overall rate of
inflation, in some cases much faster.
Limited Losses increase slower than the total losses.
Excess Losses increase faster than total losses.
Limited Losses plus Excess Losses = Total Losses.
Graphical Examples:
Assume for example that losses follow a Pareto Distribution with α = 3 and θ = 5000 in the earlier
year.260 Assume that there is 10% inflation and the same limit in both years. Then the increase in
limited losses as a function of the limit is:
[Graph: rate of increase (inflation %) of limited losses as a function of the limit, for limits up to 50,000.]

As the limit increases, so does the rate of inflation. For no limit the rate is 10%.

259 See 3, 11/00, Q.42.
260 How to work with the Pareto and other continuous size of loss distributions under uniform inflation will be
discussed subsequently.


If instead there were a fixed deductible, then the increase in losses paid excess of the deductible as
a function of the deductible is:
[Graph: rate of increase (inflation %) of losses excess of the deductible as a function of the deductible, for deductibles up to 10,000.]

For no deductible the rate of inflation is 10%. As the size of the deductible increases, the losses
excess of the deductible becomes more excess, and the rate of inflation increases.
Effect of a Maximum Covered Loss and Deductible, Layers of Loss:
Exercise: You are given the following:
For 1999 the amount of a single loss has the following distribution:
Amount       Probability
$500         20%
$1,000       30%
$5,000       25%
$10,000      15%
$25,000      10%
An insurer pays all losses after applying a $10,000 maximum covered loss
and then a $1000 deductible to each loss.
Inflation of 5% impacts all losses uniformly from 1999 to 2000.
Assuming no change in the deductible or maximum covered loss, what is the inflationary impact on
dollars paid by the insurer in the year 2000 as compared to 1999?


[Solution: One computes the average amount paid by the insurer per loss in each year:

Probability   1999 Amount   1999 Insurer   2000 Amount   2000 Insurer
              of Loss       Payment        of Loss       Payment
0.20          500           0              525           0
0.30          1,000         0              1,050         50
0.25          5,000         4,000          5,250         4,250
0.15          10,000        9,000          10,500        9,000
0.10          25,000        9,000          26,250        9,000

Average       5,650.00      3,250.00       5,932.50      3,327.50

3327.5 / 3250 = 1.024, therefore the insurer's payments increased 2.4%.]


In this case, the layer of loss from 1000 to 10,000 increased more slowly than the overall rate of
inflation. However, there were two competing effects. The deductible made the rate of increase
larger, while the maximum covered loss made the rate of increase smaller. Which effect dominates
depends on the particulars of a given situation.
For example, for the ungrouped data in Section 1, the dollars of loss in various layers are as follows
for the original data and the revised data after 50% uniform inflation:
LAYER ($ million)              Dollars of Loss ($000)
Bottom     Top          Original Data    Revised Data    Ratio
0          0.5          24277            30174           1.24
0.5        1            6424             10239           1.59
1          2            4743             8433            1.78
2          3            2441             4320            1.77
3          4            1961             2661            1.36
4          5            802              2000            2.49
5          6            0                1942            infinite
6          7            0                1000            infinite
7          8            0                203             infinite
0          infinity     40648            60972           1.50

We observe that the inflation rate for higher layers is usually higher than the uniform inflation rate, but
not always. For the layer from 3 to 4 million dollars the losses increase by 36%, which is less than
the overall rate of 50%.
A layer (other than the very first) gains dollars as loss sizes increase and are thereby pushed
above the bottom of the layer. For example, a loss of size 0.8 million would contribute nothing to
the layer from 1 to 2 million prior to inflation, while after 50% inflation it would be of size 1.2 million,
and would contribute 0.2 million. In addition, losses which were less than the top of the layer and
more than the bottom of the layer, now contribute more dollars to the layer. For example, a loss of
size 1.1 million would contribute 0.1 million to the layer from 1 to 2 million prior to inflation, while after
50% inflation it would be of size 1.65 million, and would contribute 0.65 million to this layer. Either of
these two types of increases can be very big compared to the dollars that were in the layer prior to
the effects of inflation.


On the other hand, a loss whose size was bigger than the top of a given layer, contributes no more
to that layer no matter how much it grows. For example, a loss of size 3 million would contribute
1 million to the layer from 1 to 2 million prior to inflation, while after 50% inflation it would be of size
4.5 million, and would still contribute 1 million. A loss of size 3 million has already contributed the
width of the layer, and that is all that any single loss can contribute to that layer. So for such losses
there is no increase to this layer.
Thus for an empirical sample of losses, how inflation impacts a particular layer depends on how the
varying effects from the various sizes of losses contribute to the combined effect.

Manners of Expressing the Amount of Inflation:


There are a number of different ways to express the amount of inflation:
1. State the total amount of inflation from the earlier year to the later year.
2. Give a constant annual inflation rate.
3. Give the different amounts of inflation during each annual period between the earlier and
later year.
4. Give the value of some consumer price index in the earlier and later year.
In all cases, you want to determine the total inflation factor, (1+r), to get from the earlier
year to the later year.
For example, from the year 2001 to 2004, inflation might be:
1. A total of 15%; 1 + r = 1.15.
2. 4% per year; 1 + r = (1.04)^3 = 1.125.
3. 7% between 2001 and 2002, 4% between 2002 and 2003, and 5% between 2003 and 2004;
1 + r = (1.07)(1.04)(1.05) = 1.168.
4. The CPI (Consumer Price Index) was 327 in 2001 and is 366 in 2004; 1 + r = 366/327 = 1.119.
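A one-line check of each of the four computations above (a quick Python sketch, not from the original text):

total_15_pct     = 1.15
constant_4_pct   = 1.04 ** 3            # 1.125 to three decimals
year_by_year     = 1.07 * 1.04 * 1.05   # 1.168
from_price_index = 366 / 327            # 1.119
print(total_15_pct, constant_4_pct, year_by_year, from_price_index)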


Moments, etc.:
If one multiplies all of the loss sizes by 1.1, then the mean is also multiplied by 1.1:
E[1.1X] = 1.1 E[X].
Since each loss is multiplied by the inflation factor, (1+r), so are the Mean, Mode and
Median of the distribution. Any percentile of the distribution is also multiplied by (1+r); in fact this
is the definition of inflation uniform by size of loss. Any quantity in dollars is expected to be multiplied
by the inflation factor, 1+r.
If one multiplies all of the loss sizes by 1.1, then the second moment is multiplied by 1.1²:
E[(1.1X)²] = 1.1² E[X²].
E[{(1+r)X}^n] = (1+r)^n E[X^n].
In general, under uniform inflation the nth moment is multiplied by (1+r)^n.
Exercise: In 2003 the mean loss is 100 and the second moment is 50,000. Between 2003 and
2004 there is 5% inflation. What is the variance of the losses in 2004?
[Solution: In 2004, the mean is: (1.05)(100) = 105, and the second moment is: (1.05²)(50,000) =
55,125. Thus in 2004, the variance is: 55,125 − 105² = 44,100.]
The variance in 2003 was 50,000 − 100² = 40,000. The variance increased by a factor of:
44,100/40,000 = 1.1025 = 1.05² = (1+r)².
Var[(1+r)X] = E[{(1+r)X}²] − E[(1+r)X]² = (1+r)² E[X²] − (1+r)² E[X]² = (1+r)² Var[X].
Under uniform inflation, the Variance is multiplied by (1+r)². Any quantity in dollars squared
is expected to be multiplied by the square of the inflation factor, (1+r)².
Since the Variance is multiplied by (1+r)², the Standard Deviation is multiplied by (1+r).
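A quick numeric check of the exercise above (a Python sketch, not from the original text): the mean scales by (1+r), the second moment by (1+r)², and hence so does the variance.

mean_2003, second_2003 = 100.0, 50_000.0
r = 0.05

mean_2004   = (1 + r) * mean_2003           # 105
second_2004 = (1 + r) ** 2 * second_2003    # 55,125
var_2003 = second_2003 - mean_2003 ** 2     # 40,000
var_2004 = second_2004 - mean_2004 ** 2     # 44,100
print(var_2004 / var_2003, (1 + r) ** 2)    # both 1.1025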


CV, Skewness, and Kurtosis:
Exercise: In 2003 the mean loss is 100 and the second moment is 50,000. Between 2003 and
2004 there is 5% inflation. What is the coefficient of variation of the losses in 2004?
[Solution: In 2004, the mean is 105, and the standard deviation is √44,100 = 210.
The Coefficient of Variation is: 210/105 = 2.]
In this case, the CV for 2003 is: √40,000 / 100 = 2. Thus the coefficient of variation remained the
same. CV = standard deviation / mean, and in general both the numerator and denominator are
multiplied by (1+r), and therefore the CV remains the same.
Skewness = (3rd central moment) / standard deviation³.
Both the numerator and denominator are in dollars cubed, and under uniform inflation they are each
multiplied by (1+r)³. Thus the skewness is unaffected by uniform inflation.
Kurtosis = (4th central moment) / standard deviation⁴.
Both the numerator and denominator are in dollars to the fourth power, and under uniform inflation
they are each multiplied by (1+r)⁴. Thus the kurtosis is unaffected by uniform inflation.
The Coefficient of Variation, the Skewness, and the Kurtosis are each unaffected by
uniform inflation. Each is a dimensionless quantity, which helps to describe the shape of a
distribution and is independent of the scale of the distribution.


Limited Expected Values:


As discussed previously, losses limited by a fixed limit increase slower than the rate of inflation. For
example, if the expected value limited to $1 million is $300,000 in the prior year, then after uniform
inflation of 10%, the expected value limited to $1 million is less than $330,000 in the later year.
Exercise: You are given the following:
For 1999 the amount of a single loss has the following distribution:
Amount       Probability
$500         20%
$1,000       30%
$5,000       25%
$10,000      15%
$25,000      10%
Inflation of 5% impacts all losses uniformly from 1999 to 2000.
An insurer pays all losses after applying a maximum covered loss to each loss.
The maximum covered loss in 1999 is $10,000.
The maximum covered loss in 2000 is $10,500, 5% more than that in 1999.
What is the inflationary impact on dollars paid by the insurer in the year 2000 as compared to the
dollars the insurer paid in 1999?
[Solution: One computes the average amount paid by the insurer per loss in each year:

Probability   1999 Amount   1999 Insurer   2000 Amount   2000 Insurer
              of Loss       Payment        of Loss       Payment
0.20          500           500            525           525
0.30          1,000         1,000          1,050         1,050
0.25          5,000         5,000          5,250         5,250
0.15          10,000        10,000         10,500        10,500
0.10          25,000        10,000         26,250        10,500

Average       5,650.00      4,150.00       5,932.50      4,357.50

4357.50 / 4150 = 1.050, therefore the insurer's payments increased 5.0%.]


On exam questions, the maximum covered loss would usually be the same in the two years.
In that case, as discussed previously, the insurers payments would increase at 2%, less than the
overall rate of inflation. In this exercise, instead the maximum covered loss was increased in order to
keep up with inflation. The result was that the insurers payments, the limited expected value,
increased at the overall rate of inflation.
Provided the limit keeps up with inflation, the Limited Expected Value is multiplied by
the inflation factor.261 If we increase the limit at the rate of inflation, then the Limited Expected
Value, which is in dollars, also keeps up with inflation.
261

As discussed previously, if rather than being increased in order to keep up with inflation the limit is kept fixed,
then the limited losses increase slower than the overall rate of inflation.


Exercise: The expected value limited to $1 million is $300,000 in 2007.
There is 10% uniform inflation between 2007 and 2008.
What is the expected value limited to $1.1 million in 2008?
[Solution: Since the limit kept up with inflation, ($300,000)(1.1) = $330,000.]
Proof of the Result for Limited Expected Values:
The Limited Expected Value is affected in two ways by uniform inflation. Each of the losses
entering into its computation is multiplied by (1+r), but in addition the relative effect of the limit has
been changed. Due to the combination of these two effects it turns out that if Z = (1+r)X,
then E[Z ∧ u(1+r)] = (1+r) E[X ∧ u].
In terms of the definition of the Limited Expected Value:
E[Z ∧ u(1+r)] = ∫_0^u(1+r) z f_Z(z) dz + {S_Z(u(1+r))}{u(1+r)}
= (1+r) ∫_0^u x f_X(x) dx + {S_X(u)}{u(1+r)} = (1+r) E[X ∧ u],
where we have applied the change of variables z = (1+r)x, and thus F_Z(u(1+r)) = F_X(u)
and z f_Z(z) dz = (1+r) x f_X(x) dx.
We have shown that E[(1+r)X ∧ u(1+r)] = (1+r) E[X ∧ u]. The left hand side is the Limited
Expected Value in the later year, with a limit of u(1+r); we have adjusted u, the limit in the prior year,
in order to keep up with inflation via the factor 1+r. This yields the Limited Expected Value in the prior
year, except multiplied by the inflation factor to put it in terms of the subsequent year's dollars,
which is the right hand side.
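As a numeric illustration of this identity (not from the original text), here is a short Python sketch using an Exponential, for which E[X ∧ u] = θ(1 − e^(−u/θ)); the inflated losses are Exponential with mean θ(1+r).

from math import exp

def exponential_lev(u, theta):
    # E[X ^ u] for an Exponential with mean theta.
    return theta * (1.0 - exp(-u / theta))

theta, u, r = 1000.0, 2500.0, 0.10
lhs = exponential_lev(u * (1 + r), theta * (1 + r))   # E[(1+r)X ^ u(1+r)]
rhs = (1 + r) * exponential_lev(u, theta)             # (1+r) E[X ^ u]
print(lhs, rhs)                                       # both about 1009.7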


Mean Excess Loss:


Exercise: You are given the following:
For 1999 the amount of a single loss has the following distribution:
Amount       Probability
$500         20%
$1,000       30%
$5,000       25%
$10,000      15%
$25,000      10%
Inflation of 5% impacts all losses uniformly from 1999 to 2000.
Compute the mean excess loss at $3000 in 1999.
Compute the mean excess loss at $3000 in 2000.
Compute the mean excess loss at $3150 in 2000.
[Solution: In 1999, e(3000) = {(2000)(.25) + (7000)(.15) + (22,000)(.1)}/(.25 + .15 + .1) = 7500.
In 2000, e(3000) = {(5250 - 3000)(.25) + (10,500 - 3000)(.15) + (26,250 - 3000)(.1)}/.5 = 8025.
In 2000, e(3150) = {(5250 - 3150)(.25) + (10,500 - 3150)(.15) + (26,250 - 3150)(.1)}/.5 = 7875.]
In this case, if the limit is increased for inflation, from 3000 to (1.05)(3000) = $3150 in 2000, then the
mean excess loss increases by the rate of inflation; (1.05)(7500) = 7875.
The mean excess loss in the later year is multiplied by the inflation factor, provided the
limit has been adjusted to keep up with inflation.
Exercise: The mean excess loss beyond $1 million is $3 million in 2007.
There is 10% uniform inflation between 2007 and 2008.
What is the mean excess loss beyond $1.1 million in 2008?
[Solution: Since the limit kept up with inflation, ($3 million)(1.1) = $3.3 million.]
If the limit is fixed, then the behavior of the mean excess loss depends on the particular size of loss
distribution.262
Proof of the Result for the Mean Excess Loss:
The Mean Excess Loss or Mean Residual Life at L in the prior year is given by:
e_X(L) = {E[X] − E[X ∧ L]} / S_X(L).
Letting Z = (1+r)X, the mean excess loss at L(1+r) in the latter year is given by:
e_Z(L(1+r)) = {E[Z] − E[Z ∧ L(1+r)]} / S_Z(L(1+r)) =
{(1+r)E[X] − (1+r)E[X ∧ L]} / S_X(L) = (1+r) e_X(L).
262

As was discussed in a previous section, different distributions have different behaviors of the mean excess loss
as a function of the limit.

2013-4-2,

Loss Distributions, 36 Inflation

HCM 10/8/12,

Page 592

Loss Elimination Ratio:


As discussed previously, for a fixed deductible, the Loss Elimination Ratio declines under uniform
inflation. For example, if the LER(1000) = 13% in the prior year, then after uniform inflation,
LER(1000) is less than 13% in the latter year.
Exercise: You are given the following:
For 1999 the amount of a single loss has the following distribution:
Amount       Probability
$500         20%
$1,000       30%
$5,000       25%
$10,000      15%
$25,000      10%
Inflation of 5% impacts all losses uniformly from 1999 to 2000.
An insurer pays all losses after applying a deductible to each loss.
The deductible in 1999 is $1000.
The deductible in 2000 is $1050, 5% more than that in 1999.
Compare the loss elimination ratio in the year 2000 to that in the year 1999.
[Solution: One computes the average amount paid by the insurer per loss in each year:

Probability   1999 Amount   1999 Insurer   2000 Amount   2000 Insurer
              of Loss       Payment        of Loss       Payment
0.20          500           0              525           0
0.30          1,000         0              1,050         0
0.25          5,000         4,000          5,250         4,200
0.15          10,000        9,000          10,500        9,450
0.10          25,000        24,000         26,250        25,200

Average       5,650.00      4,750.00       5,932.50      4,987.50

The Loss Elimination Ratio in 1999 is: 1 − 4750/5650 = 15.9%.
The Loss Elimination Ratio in 2000 is: 1 − 4987.5/5932.5 = 15.9%.
Comment: 4987.50 / 4750 = 1.050, therefore the insurer's payments increased 5.0%.]
On exam questions, the deductible would usually be the same in the two years. In that case, as
discussed previously, the loss elimination ratio would decrease from 15.9% to 15.3%. In this
exercise, instead the deductible was increased in order to keep up with inflation. The result was that
the insurer's payments increased at the overall rate of inflation, and the loss elimination ratio stayed
the same.
The Loss Elimination Ratio in the later year is unaffected by uniform inflation, provided
the deductible has been adjusted to keep up with inflation.263
263

As discussed above, for a fixed deductible the Loss Elimination Ratio decreases under uniform inflation.

2013-4-2,

Loss Distributions, 36 Inflation

HCM 10/8/12,

Page 593

Exercise: The Loss Elimination Ratio for a deductible of $1000 is 13% in 2007.
There is 10% uniform inflation between 2007 and 2008.
What is the Loss Elimination ratio for a deductible of $1100 in 2008?
[Solution: Since the deductible keeps up with inflation, the Loss Elimination Ratio is the same in
2008 as in 2007, 13%.]
Since the Excess Ratio is just unity minus the LER, the Excess Ratio in the latter year is unaffected
by uniform inflation, provided the limit has been adjusted to keep up with inflation.264
Proof of the Result for Loss Elimination Ratios:
The Loss Elimination Ratio at d in the prior year is given by LER_X(d) = E[X ∧ d] / E[X].
Letting Z = (1+r)X, the Loss Elimination Ratio at d(1+r) in the latter year is given by:
LER_Z(d(1+r)) = E[Z ∧ d(1+r)] / E[Z] = (1+r)E[X ∧ d] / {(1+r)E[X]} = E[X ∧ d] / E[X] = LER_X(d).

Using Theoretical Loss Distributions:
It would also make sense to use continuous distributions, obtained perhaps from fitting to a data set,
in order to estimate the impact of inflation. We could apply a factor of 1+r to every loss in the data
set and then fit a distribution to the altered data. In most cases, it would be a waste of time fitting new
distributions to the data modified by the uniform effects of inflation. For most size of loss
distributions, after uniform inflation one gets the same type of distribution with the scale parameter
revised by the inflation factor. For example, for a Pareto Distribution with parameters α = 1.702 and
θ = 240,151, under uniform inflation of 50% one would get another Pareto Distribution with
parameters: α = 1.702, θ = (1.5)(240,151) = 360,227.265
Behavior of Specific Distributions under Uniform Inflation of (1+r):
For the Pareto, θ becomes (1+r)θ. The Burr and Generalized Pareto have the same
behavior. Not coincidentally, for these distributions the mean is proportional to θ. As discussed in a
previous section, theta is the scale parameter for these distributions; everywhere x appears in the
Distribution Function it is divided by θ. In general, scale parameters are transformed
under inflation by being multiplied by (1+r). For the Pareto the shape parameter α remains the
same. For the Burr the shape parameters α and γ remain the same. For the Generalized Pareto
the shape parameters α and τ remain the same.
264 As discussed above, for a fixed limit the Excess Ratio increases under uniform inflation.
265 Prior to inflation, this is the Pareto fit by maximum likelihood to the ungrouped data in Section 1.

Similarly, for the Gamma and Weibull, θ becomes (1+r)θ. The Transformed Gamma has the
same behavior. As parameterized in Loss Models, theta is the scale parameter for the Gamma,
Weibull, and Transformed Gamma distributions. For the Gamma the shape parameter α remains the
same. For the Weibull the shape parameter τ remains the same. For the Transformed Gamma the
shape parameters α and τ remain the same. Since the Exponential is a special case of the
Gamma, for the Exponential θ becomes (1+r)θ, under uniform inflation of 1+r.
Exercise: In 2001 losses follow a Gamma Distribution with parameters α = 2 and θ = 100.
There is 10% inflation in total between 2001 and 2004. What is the loss distribution in 2004?
[Solution: Gamma with α = 2 and θ = (1.1)(100) = 110.]
The behavior of the LogNormal under uniform inflation is explained by noting that multiplying each
loss by a factor of (1+r) is the same as adding a constant amount ln(1+r) to the log of each loss.
Adding a constant amount to a Normal distribution gives another Normal Distribution, with the same
variance but with the mean shifted: μ' = μ + ln(1+r), and σ' = σ.
X ~ LogNormal(μ, σ). ln(X) ~ Normal(μ, σ).
ln[(1+r)X] = ln(X) + ln(1+r) ~ Normal(μ, σ) + ln(1+r) = Normal(μ + ln(1+r), σ).
(1+r)X ~ LogNormal(μ + ln(1+r), σ). Thus under uniform inflation for the LogNormal, μ
becomes μ + ln(1+r). The other parameter, σ, remains the same.
The behavior of the LogNormal under uniform inflation can also be explained by the fact that
e^μ is the scale parameter and σ is a shape parameter. Therefore, e^μ is multiplied by (1+r);
e^μ becomes e^μ(1+r) = e^(μ + ln(1+r)). Therefore, μ becomes μ + ln(1+r).
Exercise: In 2001 losses follow a LogNormal Distribution with parameters μ = 5 and σ = 2.
There is 10% inflation in total between 2001 and 2004. What is the loss distribution in 2004?
[Solution: LogNormal with μ = 5 + ln(1.1) = 5.095, and σ = 2.]
Note that in each case, the behavior of the parameters under uniform inflation depends on the
particular way in which the distribution is parameterized. For example, in Loss Models the
Exponential distribution is given as: F(x) = 1 − e^(−x/θ). Thus in this parameterization of the Exponential,
θ acts as a scale parameter, and under uniform inflation θ becomes (1+r)θ. This contrasts with the
parameterization of the Exponential in Actuarial Mathematics, F(x) = 1 − e^(−λx), where 1/λ acts as a
scale parameter, and under uniform inflation λ becomes λ/(1+r).


Conveniently, most of the distributions in Loss Models have a scale parameter, which is multiplied
by (1+r), while the shape parameters are unaffected. Exceptions are the LogNormal Distribution
and the Inverse Gaussian.266
Note that all of the members of Transformed Beta Family all act similarly.267 The scale
parameter is multiplied by the inflation factor and all of the shape parameters remain the
same. All of the members of the Transformed Gamma Family all act similarly.268 The scale
parameter is multiplied by the inflation factor and all of the shape parameters remain the
same.
Distribution                Parameters Prior to Inflation      Parameters After Inflation

Pareto                      α, θ                               α, (1+r)θ
Generalized Pareto          α, θ, τ                            α, (1+r)θ, τ
Burr                        α, θ, γ                            α, (1+r)θ, γ
Inverse Burr                τ, θ, γ                            τ, (1+r)θ, γ
Transformed Beta            α, θ, γ, τ                         α, (1+r)θ, γ, τ
Inverse Pareto              τ, θ                               τ, (1+r)θ
Loglogistic                 γ, θ                               γ, (1+r)θ
Paralogistic                α, θ                               α, (1+r)θ
Inverse Paralogistic        τ, θ                               τ, (1+r)θ
Exponential                 θ                                  (1+r)θ
Gamma                       α, θ                               α, (1+r)θ
Inverse Gamma               α, θ                               α, (1+r)θ
Weibull                     θ, τ                               (1+r)θ, τ
Inverse Weibull             θ, τ                               (1+r)θ, τ
Trans. Gamma                α, θ, τ                            α, (1+r)θ, τ
Inv. Trans. Gamma           α, θ, τ                            α, (1+r)θ, τ
Normal                      μ, σ                               (1+r)μ, (1+r)σ
LogNormal                   μ, σ                               μ + ln(1+r), σ
Inverse Gaussian            μ, θ                               (1+r)μ, (1+r)θ
Single Par. Pareto          α, θ                               α, (1+r)θ
Uniform Distribution        a, b                               a(1+r), b(1+r)
Beta Distribution           a, b, θ                            a, b, (1+r)θ
Generalized Beta Dist.      a, b, θ, τ                         a, b, (1+r)θ, τ

266 This is discussed, along with the behavior under uniform inflation of the LogNormal and Inverse Gaussian, in
Appendix A of Loss Models. However it is not included in the Tables attached to the exam.
267 See Figure 5.2 and Appendix A of Loss Models.
268 See Figure 5.3 and Appendix A of Loss Models.

Note that all the distributions in the above table are preserved under uniform inflation.
After uniform inflation we get the same type of distributions, but some or all of the parameters have
changed. If X following a type of distribution implies that cX, for any c > 0, also follows the same type
of distribution, then that type of distribution is defined as a scale family.
So for example, the Inverse Gaussian is a scale family of distributions, even though it does not have
a scale parameter. If X follows an Inverse Gaussian, then Y = cX also follows an Inverse Gaussian.
Any distribution with a scale parameter is a scale family.
In order to compute the effects of uniform inflation on a loss distribution, one can adjust the
parameters as in the above table. Then one can work with the loss distribution revised by inflation in
the same manner one would work with any loss distribution.
Exercise: Losses prior to inflation follow a Pareto Distribution with parameters α = 1.702 and
θ = 240,151. Losses increase uniformly by 50%.
What are the means prior to and subsequent to inflation?
[Solution: For the Pareto Distribution, E[X] = θ/(α−1).
Prior to inflation, E[X] = 240,151 / 0.702 = 342,095.
After inflation, the parameters are α = 1.702 and θ = (1.5)(240,151) = 360,227.
After inflation, E[X] = 360,227 / 0.702 = 513,143.
Alternately, inflation increases the mean by 50% to: (1.5)(342,095) = 513,143.]

Exercise: Losses prior to inflation follow a Pareto Distribution with parameters α = 1.702 and
θ = 240,151. Losses increase uniformly by 50%.
What are the limited expected values at 1 million prior to and subsequent to inflation?
[Solution: For the Pareto Distribution, E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ+x))^(α−1)}.
Prior to inflation, E[X ∧ 1 million] = (240,151 / 0.702){1 − (240,151/1,240,151)^0.702} = 234,044.
After inflation, E[X ∧ 1 million] = (360,227 / 0.702){1 − (360,227/1,360,227)^0.702} = 311,232.]
Exercise: Losses prior to inflation follow a Pareto Distribution with parameters α = 1.702 and
θ = 240,151. Losses increase uniformly by 50%. Excess Ratio = 1 − LER.
What are the excess ratios at 1 million prior to and subsequent to inflation?
[Solution: Excess ratio = R(x) = (E[X] − E[X ∧ x]) / E[X] = 1 − E[X ∧ x] / E[X].
Prior to inflation, R(1 million) = 1 − 234,044 / 342,095 = 31.6%.
After inflation, R(1 million) = 1 − 311,232 / 513,143 = 39.3%.
Comment: As expected, for a fixed limit the Excess Ratio increases under uniform inflation.
For the Pareto the excess ratio is given by R(x) = (θ/(θ+x))^(α−1).]
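For those who want to verify these Pareto figures numerically, here is a short Python sketch (not part of the original exercises); the function names are just for illustration.

def pareto_mean(alpha, theta):
    return theta / (alpha - 1)

def pareto_lev(x, alpha, theta):
    # E[X ^ x] for the Pareto, per the formula used above.
    return pareto_mean(alpha, theta) * (1.0 - (theta / (theta + x)) ** (alpha - 1))

alpha, theta, r, limit = 1.702, 240_151.0, 0.50, 1_000_000.0
for label, th in [("prior to inflation", theta), ("after inflation", (1 + r) * theta)]:
    mean = pareto_mean(alpha, th)
    lev = pareto_lev(limit, alpha, th)
    print(label, round(mean), round(lev), round(1 - lev / mean, 3))
# The printed means, limited expected values, and excess ratios match the exercises above up to rounding.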

Behavior in General of Distributions under Uniform Inflation of (1+r):
For distributions in general, including those not discussed in Loss Models, one can determine the
behavior under uniform inflation as follows. One makes the change of variables Z = (1+r)X.
For the Distribution Function one just sets F_Z(z) = F_X(x); one substitutes x = z/(1+r).
Alternately, for the density function f_Z(z) = f_X(x) / (1+r).269
For example, for the Normal Distribution f(x) = exp[−(x − μ)² / (2σ²)] / (σ√(2π)).
Under uniform inflation, x = z/(1+r) and
f_Z(z) = f_X(x) / (1+r) = exp[−(z/(1+r) − μ)² / (2σ²)] / ((1+r)σ√(2π))
= exp[−{z − (1+r)μ}² / (2{(1+r)σ}²)] / ((1+r)σ√(2π)).
This is a Normal density function with sigma and mu each multiplied by (1+r). Thus under inflation, for
the Normal μ becomes (1+r)μ and σ becomes (1+r)σ. The location parameter μ has been
multiplied by the inflation factor, as has the scale parameter σ.
269 Under the change of variables you need to divide by dz/dx = 1+r, since f_Z(z) = dF/dz = (dF/dx) / (dz/dx) = f_X(x) / (1+r).

Alternately, the Distribution Function for the Normal is Φ[(x − μ)/σ]. Therefore, F_Z(z) = F_X(x) =
Φ[(x − μ)/σ] = Φ[({z/(1+r)} − μ)/σ] = Φ[{z − (1+r)μ}/{(1+r)σ}]. This is the Distribution Function for a
Normal with sigma and mu each multiplied by (1+r), which matches the previous result.
Exercise: What is the behavior under inflation of the distribution function:
F(x) = x^a / (x^a + b^a), x > 0?
[Solution: Under uniform inflation, F_Z(z) = F_X(x) = x^a / (x^a + b^a) = {z/(1+r)}^a / ({z/(1+r)}^a + b^a) =
z^a / (z^a + {b(1+r)}^a). This is the same type of distribution, where b has become b(1+r).
The scale parameter b has been multiplied by the inflation factor (1+r).
Alternately, one can work with the density function f(x) = a b^a x^(a−1) / (x^a + b^a)² =
(a/b)(x/b)^(a−1) / {1 + (x/b)^a}². Then under uniform inflation, x = z/(1+r) and f_Z(z) = f_X(x) / (1+r) =
(a/b)(x/b)^(a−1) / {(1+r)(1 + (x/b)^a)²} = (a/{b(1+r)})(z/{b(1+r)})^(a−1) / {1 + (z/{b(1+r)})^a}²,
which is the same type of density, where b has become b(1+r), as was shown previously.
Alternately, you can recognize that b is a scale parameter, since F(x) = (x/b)^a / {(x/b)^a + 1}.
Or alternately, you can recognize that this is a Loglogistic Distribution with a = γ and b = θ.]
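A quick numerical spot-check of this change of variables (a Python sketch, not from the original text): for any z, P[(1+r)X ≤ z] computed from the original distribution equals the same form of distribution function with b replaced by b(1+r).

def F(x, a, b):
    return x**a / (x**a + b**a)

a, b, r = 4.0, 100.0, 0.25
for z in (50.0, 150.0, 400.0):
    # P[(1+r)X <= z] = F(z/(1+r); a, b), which should equal F(z; a, b(1+r)).
    print(F(z / (1 + r), a, b), F(z, a, b * (1 + r)))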
Exercise: What is the behavior under uniform inflation of the density function:
f(x) = √(θ/(2π)) exp[−(θ/(2x)){(x − μ)/μ}²] / x^1.5 ?
[Solution: In general one substitutes x = z/(1+r), and for the density function f_Z(z) = f_X(x) / (1+r).
f_Z(z) = f_X(x) / (1+r) = √(θ/(2π)) exp[−(θ(1+r)/(2z)){(z/(1+r) − μ)/μ}²] / {(1+r){z/(1+r)}^1.5}
= √(θ(1+r)/(2π)) exp[−(θ(1+r)/(2z)){(z − (1+r)μ)/((1+r)μ)}²] / z^1.5.
This is of the same form, but with parameters (1+r)μ and (1+r)θ, rather than μ and θ.]
Thus we have shown that under uniform inflation for the Inverse Gaussian Distribution μ and θ
become (1+r)μ and (1+r)θ.


Behavior of the Domain of Distributions:
For all of the distributions discussed so far, the domain has been 0 to ∞. For the Single Parameter
Pareto distribution the domain is x > θ. Under uniform inflation the domain becomes x > (1+r)θ.
In general, the domain [a, b] becomes under uniform inflation [(1+r)a, (1+r)b].
If a = 0, multiplying by 1+r has no effect; if b = ∞, multiplying by 1+r has no effect. So for
distributions like the Gamma, the domain remains (0, ∞) after uniform inflation.
For the Single Parameter Pareto, F(x) = 1 − (θ/x)^α, x > θ; under uniform inflation α is unaffected
and θ becomes (1+r)θ.
The uniform distribution on [a, b] becomes under uniform inflation the uniform
distribution on [a(1+r), b(1+r)].
Working in Either the Earlier or Later Year:
Exercise: Losses prior to inflation follow a Pareto Distribution with parameters α = 1.702 and
θ = 240,151. Losses increase uniformly by 50%. What is the average contribution per loss to the
layer from 1 million to 5 million, both prior to and subsequent to inflation?
[Solution: For the Pareto Distribution, E[X ∧ x] = {θ/(α−1)}{1 − (θ/(θ+x))^(α−1)}.
Prior to inflation, E[X ∧ 5 million] = (240,151 / 0.702){1 − (240,151/5,240,151)^0.702} = 302,896.
After inflation, losses follow a Pareto with α = 1.702 and θ = (1.5)(240,151) = 360,227, and
E[X ∧ 5 million] = (360,227 / 0.702){1 − (360,227/5,360,227)^0.702} = 436,041.
Prior to inflation, the average loss contributes: E[X ∧ 5 million] − E[X ∧ 1 million] =
302,896 − 234,044 = 68,852, to this layer.
After inflation, the average loss contributes: 436,041 − 311,232 = 124,809, to this layer.]
The contribution to this layer has increased by 82%, in this case more than the overall rate of inflation.
There are two alternative ways to solve many problems involving inflation. In the above solution,
one adjusts the size of loss distribution in the earlier year to the later year based on the amount of
inflation. Then one calculates the quantity of interest in the later year. However, there is an alternative,
which many people will prefer. Instead one calculates the quantity of interest in the earlier year at its
deflated value, and then adjusts it to the later year for the effects of inflation. Here's how this alternate
method works for this example.


A limit of 1 million in the later year corresponds to a limit of 1 million / 1.5 = 666,667 in the earlier year.
Similarly, a limit of 5 million in the later year corresponds to 5 million / 1.5 = 3,333,333 in the earlier
year. Using the Pareto in the earlier year, with α = 1.702 and θ = 240,151,
E[X ∧ 666,667] = (240,151 / 0.702){1 − (240,151/906,818)^0.702} = 207,488, and
E[X ∧ 3,333,333] = (240,151 / 0.702){1 − (240,151/3,573,484)^0.702} = 290,694. In terms of the
earlier year dollars, the contribution to the layer is: 290,694 − 207,488 = 83,206.
However, one has to inflate back up to the level of the later year: (1.5)(83,206) = 124,809, matching
the previous solution.
This type of question can also be answered using the formula discussed subsequently for the
average payment per loss. This formula for the average payment per loss is just an application of
the technique of working in the earlier year, by deflating limits and deductibles. However, this
technique of working in the earlier year is more general, and also applies to other quantities of
interest, such as the survival function.
Exercise: Losses in 2003 follow a LogNormal Distribution with parameters μ = 3 and σ = 5.
Between 2003 and 2009 there is a total of 35% inflation.
Determine the percentage of the total number of losses in 2009 that would be expected to exceed
a deductible of 1000.
[Solution: The losses in year 2009 follow a LogNormal Distribution with parameters
μ = 3 + ln(1.35) = 3.300 and σ = 5. Thus in 2009, S(1000) = 1 − F(1000) =
1 − Φ[{ln(1000) − 3.300}/5] = 1 − Φ[0.72] = 1 − 0.7642 = 0.2358.
Alternately, we first deflate to 2003. A deductible of 1000 in 2009 is equivalent to a deductible of
1000/1.35 = 740.74 in 2003. The losses in 2003 follow a LogNormal Distribution with parameters
μ = 3 and σ = 5. Thus in 2003, S(740.74) = 1 − F(740.74) =
1 − Φ[{ln(740.74) − 3}/5] = 1 − Φ[0.72] = 1 − 0.7642 = 0.2358.]
Of course both methods of solution produce the same answer. One can work either in terms of 2003
or 2009 dollars. In this case, the survival function is a dimensionless quantity. However, when
working with quantities in dollars, such as the limited expected value, if one works in the earlier year,
in this case 2003, one has to remember to reinflate the final answer back to the later year, in this case
2009.


Formulas for Average Payments:
The ideas discussed above can be put in terms of formulas:270
Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered
Loss u, and coinsurance factor c, then in terms of the values in the earlier year, the
insurer's average payment per loss in the later year is:
(1+r) c {E[X ∧ u/(1+r)] − E[X ∧ d/(1+r)]}.
Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered
Loss u, and coinsurance factor c, then in terms of the values in the earlier year, the
average payment per (non-zero) payment by the insurer in the later year is:
(1+r) c {E[X ∧ u/(1+r)] − E[X ∧ d/(1+r)]} / S(d/(1+r)).
In each case we have deflated the Maximum Covered Loss and the Deductible back to the earlier
year, computed the average payment in the earlier year, and then reinflated back to the later year.
Important special cases are: d = 0 (no deductible), u = ∞ (no maximum covered loss),
c = 1 (no coinsurance), and r = 0 (no inflation, or prior to the effects of inflation).
For example, assume losses in 2001 follow an Exponential distribution with θ = 1000.
There is a total of 10% inflation between 2001 and 2004. In 2004 there is a deductible of 500, a
maximum covered loss of 5000, and a coinsurance factor of 80%. Then the average payment per
(non-zero) payment in 2004 is computed as follows, using that for the Exponential Distribution,
E[X ∧ x] = θ(1 − e^(−x/θ)).
Take d = 500, u = 5000, c = 0.8, and r = 0.1.
Average payment per (non-zero) payment in 2004 =
(1+r) c (E[X ∧ u/(1+r)] − E[X ∧ d/(1+r)]) / S(d/(1+r)) =
(1.1)(0.8)(E[X ∧ 4545] − E[X ∧ 455]) / S(455) =
(0.88){1000(1 − e^(−4545/1000)) − 1000(1 − e^(−455/1000))} / e^(−455/1000) = (0.88)(989 − 366)/0.634 = 865.
Note that all computations use the original Exponential Distribution in 2001, with θ = 1000.
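Here is a short Python sketch (not part of the original text) reproducing this computation of the 2004 average payment per payment; the function names are just for illustration.

from math import exp

def expo_lev(x, theta):          # E[X ^ x] for an Exponential with mean theta
    return theta * (1.0 - exp(-x / theta))

def expo_survival(x, theta):     # S(x) for an Exponential with mean theta
    return exp(-x / theta)

theta, d, u, c, r = 1000.0, 500.0, 5000.0, 0.80, 0.10
per_payment = ((1 + r) * c * (expo_lev(u / (1 + r), theta) - expo_lev(d / (1 + r), theta))
               / expo_survival(d / (1 + r), theta))
print(per_payment)               # about 865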
270

See Theorem 8.7 in Loss Models.


Exercise: For a LogNormal Distribution with parameters μ = 3 and σ = 5, determine
E[X ∧ 100,000], E[X ∧ 1,000,000], E[X ∧ 74,074], and E[X ∧ 740,740].
[Solution: E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x − μ − σ²)/σ] + x{1 − Φ[(ln x − μ)/σ]}.
E[X ∧ 100,000] = exp(3 + 25/2)Φ[(ln(100,000) − 3 − 25)/5] + (100,000){1 − Φ[{ln(100,000) − 3}/5]} =
5,389,670 Φ[−3.30] + (100,000){1 − Φ[1.70]} = 5,389,670(0.0005) + (100,000)(1 − 0.9554) = 7155.
E[X ∧ 1,000,000] =
exp(3 + 25/2)Φ[(ln(1,000,000) − 3 − 25)/5] + (1,000,000){1 − Φ[{ln(1,000,000) − 3}/5]} =
5,389,670 Φ[−2.84] + (1,000,000){1 − Φ[2.16]} = 5,389,670(0.0023) + (1,000,000)(1 − 0.9846) =
27,796.
E[X ∧ 74,074] = exp(3 + 25/2)Φ[(ln(74,074) − 3 − 25)/5] + (74,074){1 − Φ[{ln(74,074) − 3}/5]} =
5,389,670 Φ[−3.36] + (74,074){1 − Φ[1.64]} = 5,389,670(0.0004) + (74,074)(1 − 0.9495) = 5897.
E[X ∧ 740,740] = exp(3 + 25/2)Φ[(ln(740,740) − 3 − 25)/5] + (740,740){1 − Φ[{ln(740,740) − 3}/5]}
= 5,389,670 Φ[−2.90] + (740,740){1 − Φ[2.10]} = 5,389,670(0.0019) + (740,740)(1 − 0.9821) =
23,500.]
Exercise: Losses in 2003 follow a LogNormal Distribution with parameters μ = 3 and σ = 5.
Between 2003 and 2009 there is a total of 35% inflation.
In 2009 there is a deductible of $100,000 and a maximum covered loss of $1 million. Determine
the increase between 2003 and 2009 in the insurer's average payment per loss to the insured.
[Solution: In 2003, take r = 0, d = 100,000, u = 1 million, and c = 1.
Average payment per loss = E[X ∧ 1 million] − E[X ∧ 100,000] = 27,796 − 7155 = 20,641.
In 2009, take r = 0.35, d = 100,000, u = 1 million, and c = 1.
Average payment per loss = 1.35 (E[X ∧ 1 million/1.35] − E[X ∧ 100,000/1.35]) =
1.35 (E[X ∧ 740,740] − E[X ∧ 74,074]) = 1.35 (23,500 − 5897) = 23,764.
The increase is: 23,764/20,641 − 1 = 15%.
Comment: Using a computer, the exact answer without rounding is: 23,554/20,481 − 1 = 15.0%.
Using the formula in order to get the average payment per loss in 2009 is equivalent to deflating to
2003, working in the year 2003, and then reinflating to the year 2009. The 2009 LogNormal has
parameters μ = 3 + ln(1.35) = 3.300 and σ = 5. For this LogNormal, E[X ∧ 100,000] =
exp(3.3 + 25/2)Φ[(ln(100,000) − 3.3 − 25)/5] + (100,000){1 − Φ[{ln(100,000) − 3.3}/5]} =
7,275,332 Φ[−3.36] + (100,000){1 − Φ[1.64]} = 7,275,332(0.0004) + (100,000)(1 − 0.9495) = 7960.
For this LogNormal, E[X ∧ 1,000,000] =
exp(3.3 + 25/2)Φ[(ln(1,000,000) − 3.3 − 25)/5] + (1,000,000){1 − Φ[{ln(1,000,000) − 3.3}/5]} =
7,275,332 Φ[−2.90] + (1,000,000){1 − Φ[2.10]} = 7,275,332(0.0019) + (1,000,000)(1 − 0.9821) =
31,723. 31,723 − 7960 = 23,763, matching the 23,764 obtained above except for rounding.]
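The "exact answer" in the comment above can be reproduced with a short Python sketch (not part of the original text), evaluating the LogNormal limited expected values without table rounding.

from math import exp, log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def lognormal_lev(x, mu, sigma):
    # E[X ^ x] for a LogNormal, per the formula used above.
    return (exp(mu + sigma**2 / 2) * norm_cdf((log(x) - mu - sigma**2) / sigma)
            + x * (1.0 - norm_cdf((log(x) - mu) / sigma)))

mu, sigma, r, d, u = 3.0, 5.0, 0.35, 100_000.0, 1_000_000.0
paid_2003 = lognormal_lev(u, mu, sigma) - lognormal_lev(d, mu, sigma)
paid_2009 = (1 + r) * (lognormal_lev(u / (1 + r), mu, sigma) - lognormal_lev(d / (1 + r), mu, sigma))
print(paid_2003, paid_2009, paid_2009 / paid_2003 - 1)   # roughly 20,481, 23,554, and a 15% increase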


Exercise: Losses in 2003 follow a LogNormal Distribution with parameters μ = 3 and σ = 5.
Between 2003 and 2009 there is a total of 35% inflation.
In 2009 there is a deductible of $100,000 and a maximum covered loss of $1 million.
Determine the increase between 2003 and 2009 in the insurer's average payment per
(non-zero) payment to the insured.
[Solution: In 2003, take r = 0, d = 100,000, u = 1 million, and c = 1.
Average payment per non-zero payment = (E[X ∧ 1 million] − E[X ∧ 100,000])/S(100,000) =
(27,796 − 7155)/{1 − Φ[{ln(100,000) − 3}/5]} = 20,641/{1 − Φ[1.70]} = 20,641/0.0446 = 462,803.
In 2009, take r = 0.35, d = 100,000, u = 1 million, and c = 1.
Average payment per non-zero payment =
1.35 (E[X ∧ 1 million/1.35] − E[X ∧ 100,000/1.35])/S(100,000/1.35) =
1.35 (E[X ∧ 740,740] − E[X ∧ 74,074])/S(74,074) = 1.35 (23,500 − 5897)/{1 − Φ[1.64]} =
23,764/0.0505 = 470,574. The increase is: 470,574/462,803 − 1 = 1.7%.
Comment: Using a computer, the exact answer without rounding is: 468,852/462,085 − 1 = 1.5%.]
Formulas for Second Moments:271
We have previously discussed second moments of layers. We can incorporate inflation in a manner
similar to the formulas for first moments. However, since the second moment is in dollars squared,
we reinflate back by multiplying by (1+r)². Also we multiply by the coinsurance factor squared.
Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered
Loss u, and coinsurance factor c, then in terms of the values in the earlier year,
the second moment of the insurer's payment per loss in the later year is:
(1+r)² c² {E[(X ∧ u/(1+r))²] − E[(X ∧ d/(1+r))²] − 2(d/(1+r))(E[X ∧ u/(1+r)] − E[X ∧ d/(1+r)])}.
Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered
Loss u, and coinsurance factor c, then in terms of the values in the earlier year,
the second moment of the insurer's payment per (non-zero) payment in the later year is:
(1+r)² c² {E[(X ∧ u/(1+r))²] − E[(X ∧ d/(1+r))²] − 2(d/(1+r))(E[X ∧ u/(1+r)] − E[X ∧ d/(1+r)])} / S(d/(1+r)).
One can combine the formulas for the first and second moments in order to calculate the variance.
271 See Theorem 8.8 in Loss Models. If r = 0, these reduce to formulas previously discussed.


Exercise: Losses in 2005 follow a Single Parameter Pareto Distribution with α = 3 and θ = 200.
Between 2005 and 2010 there is a total of 25% inflation.
In 2010 there is a deductible of 300, a maximum covered loss of 900, and a coinsurance of 90%.
In 2010, determine the variance of Y^P, the per payment variable.
[Solution: From the Tables attached to the exam, for the Single Parameter Pareto, for x ≥ θ:
E[X ∧ x] = αθ/(α−1) − θ^α/{(α−1)x^(α−1)}.
E[(X ∧ x)²] = αθ²/(α−2) − 2θ^α/{(α−2)x^(α−2)}.
Thus E[X ∧ 300/1.25] = E[X ∧ 240] = (3)(200)/2 − 200³/{(2)(240²)} = 230.556.
E[X ∧ 900/1.25] = E[X ∧ 720] = (3)(200)/2 − 200³/{(2)(720²)} = 292.284.
S(300/1.25) = S(240) = (200/240)³ = 0.5787.
Thus the mean payment per payment is: (1.25)(90%)(292.284 − 230.556)/0.5787 = 120.00.
E[(X ∧ 240)²] = (3)(200²)/1 − (2)(200³)/240 = 53,333.
E[(X ∧ 720)²] = (3)(200²)/1 − (2)(200³)/720 = 97,778.
Since the second moment is in dollars squared, we multiply by the square of the coinsurance factor,
and the square of the inflation factor.
Thus the second moment of the non-zero payments is:
(1.25²)(90%)²{97,778 − 53,333 − (2)(240)(292.284 − 230.556)}/0.5787 = 32,402.
Thus the variance of the non-zero payments is: 32,402 − 120.00² = 18,002.
Alternately, work with the 2010 Single Parameter Pareto with α = 3, and θ = (200)(1.25) = 250.
E[X ∧ 300] = (3)(250)/2 − 250³/{(2)(300²)} = 288.194.
E[X ∧ 900] = (3)(250)/2 − 250³/{(2)(900²)} = 365.355.
S(300) = (250/300)³ = 0.5787.
Thus the mean payment per payment is: (90%)(365.355 − 288.194)/0.5787 = 120.00.
E[(X ∧ 300)²] = (3)(250²)/1 − (2)(250³)/300 = 83,333.
E[(X ∧ 900)²] = (3)(250²)/1 − (2)(250³)/900 = 152,778.
Since the second moment is in dollars squared, we multiply by the square of the coinsurance factor.
Thus the second moment of the non-zero payments is:
(90%)²{152,778 − 83,333 − (2)(300)(365.355 − 288.194)}/0.5787 = 32,401.
Thus the variance of the non-zero payments is: 32,401 − 120.00² = 18,001.]
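A short Python sketch (not part of the original exercise) checking the per-payment mean and variance above, working with the deflated limits and the 2005 distribution; the function names are just for illustration.

def spp_lev(x, alpha, theta):      # E[X ^ x] for the Single Parameter Pareto, x >= theta
    return alpha * theta / (alpha - 1) - theta**alpha / ((alpha - 1) * x**(alpha - 1))

def spp_lev2(x, alpha, theta):     # E[(X ^ x)^2] for the Single Parameter Pareto, x >= theta
    return alpha * theta**2 / (alpha - 2) - 2 * theta**alpha / ((alpha - 2) * x**(alpha - 2))

alpha, theta, r, c, d, u = 3.0, 200.0, 0.25, 0.90, 300.0, 900.0
d0, u0 = d / (1 + r), u / (1 + r)              # deflate the 2010 limits back to 2005
surv = (theta / d0) ** alpha                   # S(240), about 0.5787
mean = (1 + r) * c * (spp_lev(u0, alpha, theta) - spp_lev(d0, alpha, theta)) / surv
second = ((1 + r)**2 * c**2 * (spp_lev2(u0, alpha, theta) - spp_lev2(d0, alpha, theta)
          - 2 * d0 * (spp_lev(u0, alpha, theta) - spp_lev(d0, alpha, theta))) / surv)
print(mean, second - mean**2)                  # about 120 and about 18,000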


Mixed Distributions:272
If one has a mixed distribution, then under uniform inflation each of the component distributions acts
as it would under uniform inflation.
Exercise: The size of loss distribution is: F(x) = 0.7{1 − e^(−x/130)} + 0.3{1 − (250/(250+x))²}.
After uniform inflation of 20%, what is the size of loss distribution?
[Solution: After uniform inflation of 20%, we get another Exponential Distribution, but with
θ = (1.2)(130) = 156: 1 − e^(−x/156). After uniform inflation of 20%, we get another Pareto Distribution,
but with α = 2 and θ = (1.2)(250) = 300: 1 − {300/(300+x)}².
Therefore, the mixed distribution becomes: 0.7{1 − e^(−x/156)} + 0.3{1 − (300/(300+x))²}.]

272

Mixed Distributions are discussed in a subsequent section.


Non-Uniform Rates of Inflation by Size of Loss:
On the exam, inflation is assumed to be uniform by size of loss. What would one expect to see if, for
example, large losses were inflating at a higher rate than smaller losses? Then we would expect, for
example, the 90th percentile to increase at a faster rate than the median.273
Exercise: In 2001 the losses follow a Pareto Distribution with parameters α = 3 and θ = 1000.
In 2004 the losses follow a Pareto Distribution with parameters α = 2.5 and θ = 1100.
What is the increase from 2001 to 2004 in the median (50th percentile)?
Also, what is the increase from 2001 to 2004 in the 90th percentile?
[Solution: For the Pareto, at the 90th percentile: 0.9 = 1 − {θ/(θ+x)}^α. x = θ{10^(1/α) − 1}.
In 2001 the 90th percentile is: 1000{10^(1/3) − 1} = 1154.
In 2004 the 90th percentile is: 1100{10^(1/2.5) − 1} = 1663.
For the Pareto, the median is: θ{2^(1/α) − 1}.
In 2001 the median is: 1000{2^(1/3) − 1} = 260.
In 2004 the median is: 1100{2^(1/2.5) − 1} = 351.
The median increased by: (351/260) − 1 = 35.0%,
while the 90th percentile increased by: (1663/1154) − 1 = 44.1%.]
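A quick numeric check of these percentiles (a Python sketch, not from the original exercise), using the inverse of the Pareto distribution function F(x) = 1 − (θ/(θ+x))^α.

def pareto_percentile(p, alpha, theta):
    # Inverse of F(x) = 1 - (theta/(theta+x))^alpha.
    return theta * ((1 - p) ** (-1 / alpha) - 1)

for year, alpha, theta in [("2001", 3.0, 1000.0), ("2004", 2.5, 1100.0)]:
    print(year, round(pareto_percentile(0.5, alpha, theta)), round(pareto_percentile(0.9, alpha, theta)))
# 2001: median about 260, 90th percentile about 1154
# 2004: median about 351, 90th percentile about 1663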
In this case, the 90th percentile increased more than the median did. The shape parameter of the
Pareto decreased, resulting in a heavier-tailed distribution in 2004 than in 2001. If the higher
percentiles increase at a different rate than the lower percentiles, then inflation is not uniform by size
of loss. When inflation is uniform by size of loss, all percentiles increase at the same rate.274

273

If the larger losses are inflating at a lower rate than the smaller losses then the situation is reversed and the higher
percentiles will inflate more slowly than the lower percentiles. Which situation applies may be determined by
graphing selected percentiles over time, with the size of loss on a log scale. In practical applications this analysis
would be complicated by taking into account the impacts of any deductible and/or maximum covered loss.
274
In practical situations, the estimated rates of increase in different percentiles based on data will differ somewhat,
even if the underlying inflation is uniform by size of loss.

2013-4-2,

Loss Distributions, 36 Inflation

HCM 10/8/12,

Page 607

Fixed Exchange Rates of Currency:275


Finally it is useful to note that the mathematics of changes in currency is the same as that for inflation.
Thus if loss sizes are expressed in dollars and you wish to convert to some other currency one
multiplies each loss size by the appropriate exchange rate.
Assuming each loss is paid at (approximately) the same time one can apply (approximately) the
same exchange rate to each loss. This is mathematically identical to applying the same inflation factor
under uniform inflation.
If the exchange rate is 80 yen per dollar, then the Loss Elimination Ratio at 80,000 yen is the same
as that at $1000.
Exercise: The Limited Expected Value at $1000 is $600.
The exchange rate is 80 yen per dollar.
Determine the Limited Expected Value at 80,000 yen.
[Solution: 80,000 yen corresponds to $1000.
The Limited Expected Value at 80,000 yen is: (600)(80) = 48,000 yen.]
The Coefficient of Variation, Skewness, and Kurtosis, which describe the shape of the size of loss
distribution, are unaffected by converting to yen.
Exercise: The size of loss distribution in dollars is Gamma with α = 3 and θ = 2000.
The exchange rate is 0.80 euros per dollar.
Determine the size of loss distribution in euros.
[Solution: Gamma with α = 3 and θ = (0.80)(2000) = 1600.
Comment: The mean in euros is: (3)(1600) = 4800.
The mean in dollars is: (3)(2000) = $6000. (0.8)(6000) = 4800.
0.80 euros per dollar corresponds to 1.25 dollars per euro.
Going from euros to dollars would be mathematically equivalent to 25% inflation.
Going from dollars to euros is mathematically equivalent to deflating back to the earlier year from the
later year with 25% inflation. $6000/1.25 = 4800.]

275

See CAS3, 5/06, Q.26.


Problems:
36.1 (1 point) The size of losses in 1994 follow a Pareto Distribution,
with parameters α = 3, θ = 5000.
Assume that inflation uniformly increases the size of losses between 1994 and 1997 by 20%.
What is the average size of loss in 1997?
A. 2500
B. 3000
C. 3500
D. 4000
E. 4500
36.2 (1 point) The size of losses in 2004 follow an Exponential Distribution: F(x) = 1 − e^(−x/θ), with
θ = 200. Assume that inflation uniformly increases the size of losses between 2004 and 2009 by
3% per year. What is the variance of the loss distribution in 2009?
A. 48,000
B. 50,000
C. 52,000
D. 54,000
E. 56,000

36.3 (2 points) The size of losses in 1992 follows a Burr Distribution, F(x) = 1 − {1/(1 + (x/θ)^γ)}^α,
with parameters α = 2, θ = 19,307, γ = 0.7.
Assume that inflation uniformly increases the size of losses between 1992 and 1996 by 30%.
What is the probability of a loss being greater than 10,000 in 1996?
A. 39%
B. 41%
C. 43%
D. 45%
E. 47%
36.4 (1 point) The size of losses in 1994 follow a Gamma Distribution, with parameters α = 2,
θ = 100. Assume that inflation uniformly increases the size of losses between 1994 and 1996 by
10%. What are the parameters of the loss distribution in 1996?
A. α = 2, θ = 100
B. α = 2, θ = 110
C. α = 2, θ = 90.9
D. α = 2, θ = 82.6
E. None of A, B, C, or D.
36.5 (2 points) The size of losses in 1995 follow a Pareto Distribution, with α = 1.5, θ = 15,000.
Assume that inflation uniformly increases the size of losses between 1995 and 1999 by 25%.
In 1999, what is the average size of the non-zero payments excess of a deductible of 25,000?
A. 72,000
B. 76,000
C. 80,000
D. 84,000
E. 88,000
36.6 (2 points) The size of losses in 1992 follow the density function f(x) = 2.5x^(-2) for 2 < x < 10.
Assume that inflation uniformly increases the size of losses between 1992 and 1996 by 20%.
What is the probability of a loss being greater than 6 in 1996?
A. 23%
B. 25%
C. 27%
D. 29%
E. 31%


Use the following information for the next 4 questions:


A size of loss distribution has been fit to certain data in terms of dollars. The loss sizes have been
converted to yen. Assume the exchange rate is 80 yen per dollar.
36.7 (1 point) In terms of dollars the sizes of loss are given by a Loglogistic, with parameters γ = 4,
and θ = 100.
Which of the following are the parameters of the distribution in terms of yen?
A. γ = 4, and θ = 100
B. γ = 320, and θ = 100
C. γ = 4, and θ = 8000
D. γ = 320, and θ = 8000
E. None of A, B, C, or D.

36.8 (1 point) In terms of dollars the sizes of loss are given by a LogNormal Distribution, with
parameters μ = 10 and σ = 3.
Which of the following are the parameters of the distribution in terms of yen?
A. μ = 10 and σ = 3
B. μ = 800 and σ = 3
C. μ = 10 and σ = 240
D. μ = 800 and σ = 240
E. None of A, B, C, or D.

36.9 (1 point) In terms of dollars the sizes of loss are given by a Weibull Distribution, with
parameters θ = 625 and τ = 0.5.
Which of the following are the parameters of the distribution in terms of yen?
A. θ = 625 and τ = 0.5
B. θ = 69.9 and τ = 0.5
C. θ = 5,590 and τ = 0.5
D. θ = 50,000 and τ = 0.5
E. None of A, B, C, or D.

36.10 (1 point) In terms of dollars the sizes of loss are given by a Paralogistic Distribution, with
α = 4, θ = 100. Which of the following are the parameters of the distribution in terms of yen?
A. α = 4, and θ = 100
B. α = 320, and θ = 100
C. α = 4, and θ = 8000
D. α = 320, and θ = 8000
E. None of A, B, C, or D.

36.11 (1 point) The size of losses in 1994 follows a distribution F(x) = Γ[α; λ ln(x)], x > 1, with
parameters α = 40, λ = 10. Assume that inflation uniformly increases the size of losses between
1994 and 1996 by 10%. What are the parameters of the loss distribution in 1996?
A. α = 40, λ = 10
B. α = 40, λ = 9.1
C. α = 40, λ = 11
D. α = 40, λ = 12.1
E. None of A, B, C, or D.

36.12 (1 point) X1, X2, ... X50, are independent, identically distributed variables, each with an
Exponential Distribution with mean 800. What is the distribution of X̄, their average?


36.13 (1 point) Assume that inflation uniformly increases the size of losses between 1994 and
1996 by 10%. Which of the following statements is true regarding the size of loss distribution?
1. If the skewness in 1994 is 10, then the skewness in 1996 is 13.31.
2. If the 70th percentile in 1994 is $10,000, then the 70th percentile in 1996 is $11,000.
3. If in 1994 the Loss Elimination Ratio for a deductible of $1000 is 10%,
then in 1996 the Loss Elimination Ratio for a deductible of $1100 is 11%.
A. 1
B. 2
C. 3
D. 1, 2, 3
E. None of A, B, C, or D
36.14 (3 points) The size of losses in 1995 follow the density function:
f(x) = 375x² e^(-10x) + 20x³ exp(-20x⁴).
Assume that inflation uniformly increases the size of losses between 1995 and 1999 by 25%.
Which of the following is the density in 1999?
A. 468.75x² e^(-12.5x) + 16x³ exp(-16x⁴)
B. 192x² e^(-8x) + 8.192x³ exp(-8.192x⁴)
C. 468.75x² e^(-12.5x) + 8.192x³ exp(-8.192x⁴)
D. 192x² e^(-8x) + 16x³ exp(-16x⁴)
E. None of the above.

36.15 (3 points) You are given the following:

Losses follow a distribution with density function
f(x) = exp[-0.5 {(ln(x) - 7)/3}²] / {3x √(2π)}, 0 < x < ∞.

There is a deductible of 1000.

173 losses are expected to exceed the deductible each year.

Determine the expected number of losses that would exceed the deductible each year if all loss
amounts increased by 40%, but the deductible remained at 1000.
A. Less than 175
B. At least 175, but less than 180
C. At least 180, but less than 185
D. At least 185, but less than 190
E. At least 190
36.16 (3 points) Losses in the year 2001 have a Pareto Distribution with parameters α = 3 and
θ = 40. Losses are uniformly 6% higher in the year 2002 than in the year 2001. In both 2001 and
2002, an insurance policy has a deductible of 5 and a maximum covered loss of 25.
What is the ratio of expected payments in 2002 over expected payments in the year 2001?
(A) 104% (B) 106% (C) 108% (D) 110% (E) 112%


36.17 (2 points) You are given the following:

1000 observed losses occurring in 1993 for a group of risks have been recorded
and are grouped as follows:
Interval          Number of Losses
(0, 100]          341
(100, 500]        202
(500, 1000]       131
(1000, 5000]      151
(5000, 10000]     146
(10000, ∞)        29

Inflation of 8% per year affects all losses uniformly from 1993 to 2002.
What is the expected proportion of losses for this group of risks that will be greater than 1000 in the
year 2002?
A. 38%
B. 40%
C. 42%
D. 44%
E. 46%

36.18 (2 points) The probability density function of losses in 1996 is:
f(x) = μ exp[-(x - μ)² / (2βx)] / √(2πβx³), x > 0.
Between 1996 and 2001 there is a total of 20% inflation. What is the density function in 2001?
A. Of the same form, but with parameters 1.2μ and β, rather than μ and β.
B. Of the same form, but with parameters μ and 1.2β, rather than μ and β.
C. Of the same form, but with parameters 1.2μ and 1.2β, rather than μ and β.
D. Of the same form, but with parameters μ/1.2 and β, rather than μ and β.
E. Of the same form, but with parameters μ and β/1.2, rather than μ and β.
Use the following information for the next two questions:

The losses in 1998 prior to any deductible follow a Distribution: F(x) = 1 - e^(-x/5000).
Assume that losses increase uniformly by 40% between 1998 and 2007.
In 1998, an insurer pays for losses excess of a 1000 deductible.
36.19 (2 points) If in 2007 this insurer pays for losses excess of a 1000 deductible, what is the
increase between 1998 and 2007 in the dollars of losses that this insurer expects to pay?
A. 44%
B. 46%
C. 48%
D. 50%
E. 52%
36.20 (2 points) If in 2007 this insurer pays for losses excess of a 1400 deductible, what is the
increase between 1998 and 2007 in the dollars of losses that this insurer expects to pay?
A. 38%
B. 40%
C. 42%
D. 44%
E. 46%


36.21 (3 points) You are given the following:

In 1990, losses follow a LogNormal Distribution, with parameters μ = 3 and σ.
Between 1990 and 1999 there is uniform inflation at an annual rate of 4%.
In 1990, 5% of the losses exceed the mean of the losses in 1999.
Determine σ.
A. 0.960 or 2.960
B. 0.645 or 2.645
C. 0.546 or 3.374
D. 0.231 or 3.059
E. None of the above

36.22 (3 points) You are given the following:

The losses in 1995 follow a Weibull Distribution with parameters θ = 1 and τ = 0.3.
A relevant Consumer Price Index (CPI) is 170.3 in 1995 and 206.8 in 2001.
Assume that losses increase uniformly by an amount based on the increase in the CPI.
What is the increase between 1995 and 2001 in the expected number of losses exceeding a 1000
deductible?
A. 45%
B. 48%
C. 51%
D. 54%
E. 57%
36.23 (3 points) You are given the following:

The losses in 1994 follow a LogNormal Distribution, with parameters μ = 3 and σ = 4.


Assume that losses increase by 5% from 1994 to 1995, 3% from 1995 to 1996,
7% from 1996 to 1997, and 6% from 1997 to 1998.

In both 1994 and 1998, an insurer sells policies with a $25,000 maximum covered loss.
What is the increase due to inflation between 1994 and 1998 in the dollars of losses that the insurer
expects to pay?
A. 9%
B. 10%
C. 11%
D. 12%
E. 13%
36.24 (3 points) You are given the following:

The losses in 1994 follow a Distribution: F(x) = 1 - (100000/x)³ for x > $100,000.

Assume that inflation is a total of 20% from 1994 to 1999.

In each year, a reinsurer pays for the layer of loss from $500 thousand to $2 million.

What is the increase due to inflation between 1994 and 1999 in the dollars that the reinsurer expects
to pay?
A. 67%
B. 69%
C. 71%
D. 73%
E. 75%


36.25 (2 points) You are given the following:

The losses in 2001 follow an Inverse Gaussian Distribution,
with parameters μ = 3 and θ = 10.

There is uniform inflation from 2001 to 2009 at an annual rate of 3%.

What is the variance of the distribution of losses in 2009?


A. Less than 2
B. At least 2, but less than 3
C. At least 3, but less than 4
D. At least 4, but less than 5
E. At least 5
36.26 (1 point) In the year 2002 the size of loss distribution is a Pareto with α = 3 and θ = 5000.
During the year 2002 what is the median of those losses of size greater than 10,000?
A. 13,900
B. 14,000
C. 14,100
D. 14,200
E. 14,300
36.27 (2 points) In the year 2002 the size of loss distribution is a Pareto with α = 3 and θ = 5000.
You expect a total of 15% inflation between the years 2002 and 2006.
During the year 2006 what is the median of those losses of size greater than 10,000?
A. 13,900
B. 14,000
C. 14,100
D. 14,200
E. 14,300
36.28 (1 point) The size of losses in 1992 follow the density function f(x) = 1000e^(-1000x).
Assume that inflation uniformly increases the size of losses between 1992 and 1998 by 25%.
Which of the following is the density in 1998?
A. 800e^(-800x)
B. 1250e^(-1250x)
C. 17841 x^0.5 e^(-1000x)
D. 1000e^(-1000x)
E. None of the above.

36.29 (2 points) You are given the following:


For 2003 the amount of a single loss has the following distribution:
Amount     Probability
$1,000     1/6
$2,000     1/3
$5,000     1/3
$10,000    1/6
An insurer pays all losses after applying a $2000 deductible to each loss.
Inflation of 4% per year impacts all claims uniformly from 2003 to 2006.
Assuming no change in the deductible, what is the inflationary impact on losses paid by the insurer in
2006 as compared to the losses the insurer paid in 2003?
A. 9%
B. 12%
C. 15%
D. 18%
E. 21%


36.30 (2 points) You are given the following:

The size of loss distribution in 2007 is LogNormal Distribution with μ = 5 and σ = 0.7.
Assume that losses increase by 4% per year from 2007 to 2010.
What is the second moment of the size of loss distribution in 2010?
A. Less than 70,000
B. At least 70,000, but less than 75,000
C. At least 75,000, but less than 80,000
D. At least 80,000, but less than 85,000
E. At least 85,000
36.31 (3 points) In 2005 sizes of loss follow a distribution F(x), with survival function S(x) and
density f(x). Between 2005 and 2008 there is a total of 10% inflation.
In 2008 there is a deductible of 1000.
Which of the following does not represent the expected payment per loss in 2008?
A. 1.1 E[X] - 1.1 E[X ∧ 909]
B. 1.1 ∫ x f(x) dx - 1000 ∫ f(x) dx, both integrals taken from 909 to ∞
C. 1.1 ∫ S(x) dx, taken from 909 to ∞
D. 1.1 ∫ (x - 1000) f(x) dx, taken from 909 to ∞
E. 1.1 ∫ x f(x) dx, taken from 909 to ∞, plus 1.1 ∫ {x f(x) - S(x)} dx, taken from 0 to 909

36.32 (3 points) For Actuaries Professional Liability insurance, severity follows a Pareto Distribution
with α = 2 and θ = 500,000.
Excess of loss reinsurance covers the layer from R to $1 million.
Annual unlimited ground up inflation is 10% per year.
Determine R, less than $1 million, such that the annual loss trend for the reinsured layer is exactly
equal to the overall inflation rate of 10%.


36.33 (3 points) In 2011, the claim severity distribution is exponential with mean 5000.
In 2013, an insurance company will pay the amount of each claim in excess of a deductible of 1000.
There is a total of 8% inflation between 2011 and 2013.
In 2013, calculate the variance of the amount paid by the insurance company for one claim,
including the possibility that the amount paid is 0.
(A) 24 million
(B) 26 million
(C) 28 million
(D) 30 million
(E) 32 million
36.34 (3 points) In 2005 sizes of loss follow a certain distribution, and you are given the following
selected values of the distribution function and limited expected value:
x       F(x)    Limited Expected Value at x
3000    0.502   2172
3500    0.549   2409
4000    0.590   2624
4500    0.624   2820
5000    0.655   3000
5500    0.681   3166
6000    0.705   3319
6500    0.726   3462
7000    0.744   3594
Between 2005 and 2010 there is a total of 25% inflation.
In both 2005 and 2010 there is a deductible of 5000.
In 2010 the average payment per payment is 15% more than it was in 2005.
Determine E[X] in 2005.
A. 5000
B. 5500
C. 6000
D. 6500
E. 7000

Use the following information for the next 2 questions:

In 2010, losses follow a Pareto Distribution with α = 5 and θ = 40.


There is a total of 25% inflation between 2010 and 2015.
In 2015, there is a deductible of 10.
36.35 (2 points) In 2015, determine the variance of YP, the per-payment variable.
A. 300

B. 325

C. 350

D. 375

E. 400

36.36 (3 points) In 2015, determine the variance of YL , the per-loss variable.


A. 160

B. 180

C. 200

D. 220

E. 240


Use the following information for the next four questions:

Losses in 2002 follow a LogNormal Distribution with parameters μ = 9.7 and σ = 0.8.
In 2007, the insured has a deductible of 10,000, maximum covered loss of 50,000,
and a coinsurance factor of 90%.
Inflation is 3% per year from 2002 to 2007.
36.37 (3 points) In 2007, what is the average payment per loss?
A. less than 12,100
B. at least 12,100 but less than 12,200
C. at least 12,200 but less than 12,300
D. at least 12,300 but less than 12,400
E. at least 12,400
36.38 (1 point) In 2007, what is the average payment per non-zero payment?
A. less than 15,500
B. at least 15,500 but less than 15,600
C. at least 15,600 but less than 15,700
D. at least 15,700 but less than 15,800
E. at least 15,800
36.39 (3 points) In 2007, what is the standard deviation of YL , the per-loss variable?
A. less than 12,100
B. at least 12,100 but less than 12,200
C. at least 12,200 but less than 12,300
D. at least 12,300 but less than 12,400
E. at least 12,400
36.40 (1 point) In 2007, what is the standard deviation of YP, the per-payment variable?
A. less than 12,100
B. at least 12,100 but less than 12,200
C. at least 12,200 but less than 12,300
D. at least 12,300 but less than 12,400
E. at least 12,400


36.41 (3 points) In 2011, losses prior to the effect of a deductible follow a Pareto Distribution
with α = 2 and θ = 250.
There is a deductible of 100 in both 2011 and 2016.
The ratio of the expected aggregate payments in 2016 to 2011 is 1.26.
Determine the total amount of inflation between 2011 and 2016.
A. 19%
B. 20%
C. 21%
D. 22%
E. 23%
36.42 (4, 5/86, Q.61 & 4, 5/87, Q.59) (1 point) Let there be a 10% rate of inflation over the
period of concern. Let X be the uninflated losses and Z be the inflated losses.
Let Fx be the distribution function (d.f.) of X, and fx be the probability density function (p.d.f.) of X.
Similarly, let Fz and fz be the d.f. and p.d.f. of Z. Then which of the following statements are true?
1. fz(Z) = fx(Z / 1.1)
2. If Fx is a Pareto, then Fz is also a Pareto.
3. If Fx is a LogNormal, then Fz is also a LogNormal.
A. 2

B. 3

C. 1, 2

D. 1, 3

E. 2, 3

36.43 (4, 5/89, Q.58) (2 points) The random variable X with distribution function Fx(x) is distributed
according to the Burr distribution, F(x) = 1 - [1/(1 + (x/θ)^γ)]^α,
with parameters α > 0, θ > 0, and γ > 0.
If Z = (1 + r)X where r is an inflation rate over some period of concern, find the parameters for the
distribution function Fz(z) of the random variable z.
A. α, θ, γ
B. α(1+r), θ, γ
C. α, θ(1+r), γ
D. α, θ, γ(1+r)
E. None of the above

36.44 (4, 5/90, Q.37) (2 points) Liability claim severity follows a Pareto distribution with a mean of
$25,000 and parameter α = 3. If inflation increases all claims by 20%, the probability of a claim
exceeding $100,000 increases by:
A. less than 0.02
B. at least 0.02 but less than 0.03
C. at least 0.03 but less than 0.04
D. at least 0.04 but less than 0.05
E. at least 0.05


36.45 (4, 5/91, Q.27) (3 points) The Pareto distribution with parameters θ = 12,500 and α = 2
appears to be a good fit to 1985 policy year liability claims.
Assume that inflation has been a constant 10% per year.
What is the estimated claim severity for a policy issued in 1992 with a $200,000 limit of liability?
A. Less than 22,000
B. At least 22,000 but less than 23,000
C. At least 23,000 but less than 24,000
D. At least 24,000 but less than 25,000
E. At least 25,000
36.46 (4, 5/91, Q.44) (2 points) Inflation often requires one to modify the parameters of a
distribution fitted to historical data. If inflation has been at the same rate for all sizes of loss, which of
the sets of parameters shown in Column C would be correct?
The form of the distributions is as given in the Appendix A of Loss Models.
(A) Distribution Family    (B) Distribution Function    (C) Parameters of z = (1+r)(x)
1. Inverse Gaussian    Φ[√(θ/x) (x/μ - 1)] + e^(2θ/μ) Φ[-√(θ/x) (x/μ + 1)]    μ, (1+r)θ
2. Generalized Pareto    β[τ, α; x/(θ + x)]    α, θ/(1+r), τ
3. Weibull    1 - exp[-(x/θ)^τ]    (1+r)θ, τ
A. 1
B. 2
C. 3
D. 1, 2, 3
E. None of the above

36.47 (4B, 5/92, Q.7) (2 points) The random variable X for claim amounts with distribution function
Fx(x) is distributed according to the Erlang distribution with parameters b and c.
The density function for X is as follows: f(x) = (x/b)^(c-1) e^(-x/b) / {b (c-1)!}; x > 0, b > 0, c > 1.
Inflation of 100r% acts uniformly over a one year period.
Determine the distribution function Fz(Z) of the random variable Z = (1+r)X.
A. Erlang with parameters b and c(1+r)
B. Erlang with parameters b(1+r) and c
C. Erlang with parameters b/(1+r) and c
D. Erlang with parameters b/(1+r) and c/(1+r)
E. No longer an Erlang distribution


36.48 (4B, 11/92, Q.20) (3 points) Claim severity follows a Burr distribution,
F(x) = 1 - [1/(1 + (x/θ)^γ)]^α, with parameters α = 3, γ = 0.5 and θ. The mean is 10,000.
If inflation increases all claims uniformly by 44%, determine the probability of a claim exceeding
$40,000 after inflation.
Hint: The nth moment of a Burr Distribution is: θ^n Γ(1 + n/γ) Γ(α - n/γ) / Γ(α), αγ > n.
A. Less than 0.01
B. At least 0.01 but less than 0.03
C. At least 0.03 but less than 0.05
D. At least 0.05 but less than 0.07
E. At least 0.07
36.49 (4B, 5/93, Q.11) (1 point) You are given the following:
The underlying distribution for 1992 losses is given by a lognormal distribution with
parameters μ = 17.953 and σ = 1.6028.
Inflation of 10% impacts all claims uniformly the next year.
What is the underlying loss distribution after one year of inflation?
A. lognormal with μ = 19.748 and σ = 1.6028
B. lognormal with μ = 18.048 and σ = 1.6028
C. lognormal with μ = 17.953 and σ = 1.7631
D. lognormal with μ = 17.953 and σ = 1.4571
E. no longer a lognormal distribution
36.50 (4B, 5/93, Q.12) (3 points) You are given the following:
The underlying distribution for 1992 losses is given by f(x) = e^(-x), x > 0,
where losses are expressed in millions of dollars.

Inflation of 10% impacts all claims uniformly from 1992 to 1993.

Under a basic limits policy, individual losses are capped at $1.0 (million).
What is the inflation rate from 1992 to 1993 on the capped losses?
A. less than 2%
B. at least 2% but less than 3%
C. at least 3% but less than 4%
D. at least 4% but less than 5%
E. at least 5%


36.51 (4B, 5/93, Q.28) (3 points) You are given the following:
The underlying loss distribution function for a certain line of business in 1991 is:
F(x) = 1 - x^(-5), x > 1.
From 1991 to 1992, 10% inflation impacts all claims uniformly.
Determine the 1992 Loss Elimination Ratio for a deductible of 1.2.
A. Less than 0.850
B. At least 0.850 but less than 0.870
C. At least 0.870 but less than 0.890
D. At least 0.890 but less than 0.910
E. At least 0.910
36.52 (4B, 11/93, Q.5) (3 points) You are given the following:
The underlying distribution for 1993 losses is given by
f(x) = e^(-x), x > 0, where losses are expressed in millions of dollars.
Inflation of 5% impacts all claims uniformly from 1993 to 1994.
Under a basic limits policy, individual losses are capped at $1.0 million in each year.
What is the inflation rate from 1993 to 1994 on the capped losses?
A. Less than 1.5%
B. At least 1.5%, but less than 2.5%
C. At least 2.5%, but less than 3.5%
D. At least 3.5%, but less than 4.5%
E. At least 4.5%
36.53 (4B, 11/93, Q.15) (3 points) You are given the following:
X is the random variable for claim severity with probability distribution function F(x).
During the next year, uniform inflation of r% impacts all claims.
Which of the following are true of the random variable Z = X(1+r), the claim severity one year later?
1. The coefficient of variation for Z equals (1+r) times the coefficient of variation for X.
2. For all values of d > 0, the mean excess loss of Z at d(1+r) equals (1+r) times
the mean excess loss of X at d.
3. For all values of d > 0, the limited expected value of Z at d equals (1+r) times
the limited expected value of X at d.
A. 2
B. 3
C. 2, 3
D. 1, 2, 3
E. None of A, B, C or D


36.54 (4B, 11/93, Q.27) (3 points) You are given the following:
Losses for 1991 are uniformly distributed on [0, 10,000].
Inflation of 5% impacts all losses uniformly from 1991 to 1992 and from 1992 to 1993
(5% each year).
Determine the 1993 Loss Elimination Ratio for a deductible of $500.
A. Less than 0.085
B. At least 0.085, but less than 0.090
C. At least 0.090, but less than 0.095
D. At least 0.095, but less than 0.100
E. At least 0.100
36.55 (4B, 5/94, Q.16) (1 point) You are given the following:
Losses in 1993 follow the density function
f(x) = 3x^(-4), x ≥ 1,
where x = losses in millions of dollars.
Inflation of 10% impacts all claims uniformly from 1993 to 1994.
Determine the probability that losses in 1994 exceed $2.2 million.
A. Less than 0.05
B. At least 0.05, but less than 0.10
C. At least 0.10, but less than 0.15
D. At least 0.15, but less than 0.20
E. At least 0.20
36.56 (4B, 5/94, Q.21) (2 points) You are given the following:
For 1993 the amount of a single claim has the following distribution:
Amount     Probability
$1,000     1/6
$2,000     1/6
$3,000     1/6
$4,000     1/6
$5,000     1/6
$6,000     1/6
An insurer pays all losses AFTER applying a $1,500 deductible to each loss.
Inflation of 5% impacts all claims uniformly from 1993 to 1994.
Assuming no change in the deductible, what is the inflationary impact on losses paid by the insurer in
1994 as compared to the losses the insurer paid in 1993?
A. Less than 5.5%
B. At least 5.5%, but less than 6.5%
C. At least 6.5%, but less than 7.5%
D. At least 7.5%, but less than 8.5%
E. At least 8.5%


36.57 (4B, 5/94, Q.24) (3 points) You are given the following:
X is a random variable for 1993 losses, having the density function f(x) = 0.1e^(-0.1x), x > 0.
Inflation of 10% impacts all losses uniformly from 1993 to 1994.
For 1994, a deductible, d, is applied to all losses.
P is a random variable representing payments of losses truncated and shifted by
the deductible amount.
Determine the value of the cumulative distribution function at p = 5, FP(5), in 1994.

A. 1 - e^(-0.1(5+d)/1.1)
B. {e^(-0.1(5/1.1)) - e^(-0.1(5+d)/1.1)} / {1 - e^(-0.1(5/1.1))}
C. 0
D. At least 0.25, but less than 0.35
E. At least 0.35, but less than 0.45

36.58 (4B, 11/94, Q.8) (3 points) You are given the following:
In 1993, an insurance company's underlying loss distribution for an individual claim amount is a
lognormal distribution with parameters μ = 10.0 and σ² = 5.0.
From 1993 to 1994, an inflation rate of 10% impacts all claims uniformly.
In 1994, the insurance company purchases excess-of-loss reinsurance that caps the insurers loss at
$2,000,000 for any individual claim. Determine the insurers 1994 expected net claim amount for a
single claim after application of the $2,000,000 reinsurance cap.
A. Less than $150,000
B. At least $150,000, but less than $175,000
C. At least $175,000, but less than $200,000
D. At least $200,000, but less than $225,000
E. At least $225,000
36.59 (4B, 11/94, Q.28) (2 points) You are given the following:
In 1993, the claim amounts for a certain line of business were normally distributed with mean
μ = 1000 and variance σ² = 10,000: f(x) = exp[-(x - μ)²/(2σ²)] / (σ√(2π)).
Inflation of 5% impacted all claims uniformly from 1993 to 1994. What is the distribution for claim
amounts in 1994?
A. No longer a normal distribution
B. Normal with μ = 1000.0 and σ = 102.5
C. Normal with μ = 1000.0 and σ = 105.0
D. Normal with μ = 1050.0 and σ = 102.5
E. Normal with μ = 1050.0 and σ = 105.0


36.60 (Course 160 Sample Exam #3, 1994, Q.2) (1.9 points) You are given:
(i) The random variable X has an exponential distribution.
(ii) px = 0.95, for all x.
(iii) Y = 2X.
(iv) fY(Y) is the probability density function of the random variable Y.
Calculate fY(1).
(A) 0.000

(B) 0.025

(C) 0.050

(D) 0.075

(E) 0.100

36.61 (4B, 5/95, Q.6) (3 points) You are given the following:
For 1994, loss sizes follow a uniform distribution on [0, 2500].
In 1994, the insurer pays 100% of all losses.
Inflation of 3.0% impacts all losses uniformly from 1994 to 1995.
In 1995, a deductible of $100 is applied to all losses.
Determine the Loss Elimination Ratio (L.E.R.) of the $100 deductible on 1995 losses.
A. Less than 7.3%
B. At least 7.3%, but less than 7.5%
C. At least 7.5%, but less than 7.7%
D. At least 7.7%, but less than 7.9%
E. At least 7.9%
36.62 (4B, 5/95, Q.23) (2 points) You are given the following:
Losses follow a Pareto distribution, with parameters θ = 1000 and α = 2.
10 losses are expected each year.
The number of losses and the individual loss amounts are independent.
For each loss that occurs, the insurer's payment is equal to the entire amount of the
loss if the loss is greater than 100. The insurer makes no payment if the loss is less
than or equal to 100.
Determine the insurer's expected number of annual payments if all loss amounts increased uniformly
by 10%.
A. Less than 7.9
B. At least 7.9, but less than 8.1
C. At least 8.1, but less than 8.3
D. At least 8.3, but less than 8.5
E. At least 8.5


36.63 (4B, 11/95, Q.6) (2 points) You are given the following:

In 1994, losses follow a Pareto distribution, with parameters θ = 500 and α = 1.5.

Inflation of 5% impacts all losses uniformly from 1994 to 1995.


What is the median of the portion of the 1995 loss distribution above 200?
A. Less than 600
B. At least 600, but less than 620
C. At least 620, but less than 640
D. At least 640, but less than 660
E. At least 660
36.64 (4B, 5/96, Q.10 & Course 3 Sample Exam, Q.18) (2 points)
You are given the following:

Losses follow a lognormal distribution, with parameters μ = 7 and σ = 2.


There is a deductible of 2,000.
10 losses are expected each year.

The number of losses and the individual loss amounts are independent.
Determine the expected number of annual losses that exceed the deductible if all loss amounts
increased uniformly by 20%, but the deductible remained the same.
A. Less than 4.0
B. At least 4.0, but less than 5.0
C. At least 5.0, but less than 6.0
D. At least 6.0, but less than 7.0
E. At least 7.0
36.65 (4B, 11/96, Q.1) (1 point) Using the information in the following table, determine the total
amount of losses from 1994 and 1995 in 1996 dollars.
Year    Actual Losses    Cost Index
1994    10,000,000       0.8
1995    9,000,000        0.9
1996    ---              1.0
A. Less than 16,000,000
B. At least 16,000,000, but less than 18,000,000
C. At least 18,000,000, but less than 20,000,000
D. At least 20,000,000, but less than 22,000,000
E. At least 22,000,000


36.66 (4B, 11/96, Q.14) (2 points) You are given the following:

Losses follow a Pareto distribution, with parameters θ = k and α = 2, where k is a constant.

There is a deductible of 2k.


Over a period of time, inflation has uniformly affected all losses, causing them to double, but the
deductible remains the same. What is the new loss elimination ratio (LER)?
A. 1/6
B. 1/3
C. 2/5
D. 1/2
E. 2/3
36.67 (4B, 11/96, Q.25) (1 point)
The random variable X has a lognormal distribution, with parameters μ and σ.
If the random variable Y is equal to 1.10X what is the distribution of Y?
A. Lognormal with parameters 1.10μ and σ
B. Lognormal with parameters μ and 1.10σ
C. Lognormal with parameters μ + ln(1.10) and σ
D. Lognormal with parameters μ and σ + ln(1.10)
E. Not lognormal
36.68 (4B, 5/97, Q.17) (2 points) You are given the following:
The random variable X has a Weibull distribution, with parameters θ = 625 and τ = 0.5.
Z is defined to be 0.25X.
Determine the distribution of Z.
A. Weibull with parameters θ = 10,000 and τ = 0.5
B. Weibull with parameters θ = 2500 and τ = 0.5
C. Weibull with parameters θ = 156.25 and τ = 0.5
D. Weibull with parameters θ = 39.06 and τ = 0.5
E. Not Weibull
36.69 (4B, 5/97, Q.20) (2 points) You are given the following:

Losses follow a distribution with density function f(x) = (1/1000) e^(-x/1000), 0 < x < ∞.
There is a deductible of 500.

10 losses are expected to exceed the deductible each year.


Determine the expected number of losses that would exceed the deductible each year if all loss
amounts doubled, but the deductible remained at 500.
A. Less than 10
B. At least 10, but less than 12
C. At least 12, but less than 14
D. At least 14, but less than 16
E. At least 16


36.70 (4B, 11/97, Q.4) (1 point) You are given the following:
The random variable X has a distribution that is a mixture of a Burr distribution,
F(x) = 1 - [1/(1 + (x/θ)^γ)]^α, with parameters θ = 1,000, α = 1 and γ = 2,
and a Pareto distribution, with parameters θ = 1,000 and α = 1.
Each of the two distributions in the mixture has equal weight.
Y is defined to be 1.10 X, which is also a mixture of a Burr distribution and a Pareto distribution.
Determine for the Burr distribution in this mixture.
A. Less than 32
B. At least 32, but less than 33
C. At least 33, but less than 34
D. At least 34, but less than 35
E. At least 35
36.71 (4B, 11/97, Q.26) (3 points) You are given the following:

In 1996, losses follow a lognormal distribution, with parameters μ and σ.
In 1997, losses follow a lognormal distribution with parameters μ + ln(k) and σ,
where k is greater than 1.
In 1996, 100p% of the losses exceed the mean of the losses in 1997.
Determine σ.
Note: zp is the 100pth percentile of a normal distribution with mean 0 and variance 1.
A. √(2 ln k)
B. -zp + √(zp² - 2 ln k)
C. zp + √(zp² - 2 ln k)
D. -zp ± √(zp² - 2 ln k)
E. zp ± √(zp² - 2 ln k)


36.72 (4B, 5/98, Q.25) (2 points) You are given the following:
100 observed claims occurring in 1995 for a group of risks have been recorded and are
grouped as follows:
Interval        Number of Claims
(0, 250)        36
[250, 300)      6
[300, 350)      3
[350, 400)      5
[400, 450)      5
[450, 500)      0
[500, 600)      5
[600, 700)      5
[700, 800)      6
[800, 900)      1
[900, 1000)     3
[1000, ∞)       25
Inflation of 10% per year affects all claims uniformly from 1995 to 1998.
Using the above information, determine a range for the expected proportion of claims for this group
of risks that will be greater than 500 in 1998.
A. Between 35% and 40%
B. Between 40% and 45%
C. Between 45% and 50%
D. Between 50% and 55%
E. Between 55% and 60%
36.73 (4B, 11/98, Q.13) (2 points) You are given the following:

Losses follow a distribution (prior to the application of any deductible) with


cumulative distribution function and limited expected values as follows:
Loss Size (x)    F(x)    E[X ∧ x]
10,000           0.60    6,000
15,000           0.70    7,700
22,500           0.80    9,500
∞                1.00    20,000

There is a deductible of 15,000 per loss and no maximum covered loss.

The insurer makes a nonzero payment p.


After several years of inflation, all losses have increased in size by 50%, but the deductible has
remained the same. Determine the expected value of p.
A. Less than 15,000
B. At least 15,000, but less than 30,000
C. At least 30,000, but less than 45,000
D. At least 45,000, but less than 60,000
E. At least 60,000


36.74 (4B, 5/99, Q.17) (2 points) You are given are following:

In 1998, claim sizes follow a Pareto distribution, with parameters θ (unknown) and α = 2.

Inflation of 6% affects all claims uniformly from 1998 to 1999.


r is the ratio of the proportion of claims that exceed d in 1999
to the proportion of claims that exceed d in 1998.
Determine the limit of r as d goes to infinity.
A. Less than 1.05
B. At least 1.05, but less than 1.10
C. At least 1.10, but less than 1.15
D. At least 1.15, but less than 1.20
E. At least 1.20

36.75 (4B, 5/99, Q.21) (2 points) Losses follow a lognormal distribution, with parameters
μ = 6.9078 and σ = 1.5174. Determine the percentage increase in the number of losses that
exceed 1,000 that would result if all losses increased in value by 10%.
A. Less than 2%
B. At least 2%, but less than 4%
C. At least 4%, but less than 6%
D. At least 6%, but less than 8%
E. At least 8%
36.76 (CAS6, 5/99, Q.39) (2 points) Use the information shown below to determine the one-year
severity trend for the loss amounts in the following three layers of loss:
$0 - $50
$50 - $100
$100 - $200

Losses occur in multiples of $40, with equal probability, up to $200, i.e., if a loss occurs,
it has an equal chance of being $40, $80, $120, $160, or $200.

For the next year, the severity trend will uniformly increase all losses by 10%.
36.77 (4B, 11/99, Q.26) (1 point) You are given the following:
The random variable X follows a Pareto distribution, as per Loss Models, with parameters
θ = 100 and α = 2.
The mean excess loss function, eX(k), is defined to be E[X - k | X ≥ k].
Y = 1.10 X.
Determine the range of the function eY(k)/eX(k) over its domain of [0, ∞).
A. (1, 1.10]
B. (1, ∞)
C. 1.10
D. [1.10, ∞)

E.


36.78 (CAS9, 11/99, Q.38) (1.5 points) Assume a ground-up claim frequency of 0.05.
Based on the following claim size distribution, answer the following questions. Show all work.
Claim size (d)    FX(d)    E[X ∧ d]
$909              0.075    $870
$1,000            0.090    $945
$1,100            0.100    $1,040
Unlimited         1.000    $10,000
a. (1 point) For a $1,000 franchise deductible, what is the frequency of payments and the average
payment per payment?
b. (0.5 point) Assuming a constant annual inflation rate of 10% across all loss amounts, what is the
pure premium one year later if there is a $1,000 franchise deductible?
36.79 (Course 151 Sample Exam #1, Q.7) (1.7 points)
For a certain insurance, individual losses in 1994 were uniformly distributed over (0,1000).
A deductible of 100 is applied to each loss.
In 1995, individual losses have increased 5%, and are still uniformly distributed.
A deductible of 100 is still applied to each loss.
Determine the percentage increase in the standard deviation of amount paid.
(A) 5.00% (B) 5.25% (C) 5.50% (D) 5.75% (E) 6.00%
36.80 (Course 1 Sample Exam, Q.17) (1.9 points) An actuary is reviewing a study she
performed on the size of claims made ten years ago under homeowners insurance policies.
In her study, she concluded that the size of claims followed an exponential distribution and that the
probability that a claim would be less than $1,000 was 0.250.
The actuary feels that the conclusions she reached in her study are still valid today with one
exception: every claim made today would be twice the size of a similar claim made ten years ago as
a result of inflation.
Calculate the probability that the size of a claim made today is less than $1,000.
A. 0.063
B. 0.125
C. 0.134
D. 0.163
E. 0.250
36.81 (3, 5/00, Q.30) (2.5 points) X is a random variable for a loss.
Losses in the year 2000 have a distribution such that:
E[X ∧ d] = -0.025d² + 1.475d - 2.25, d = 10, 11, 12,..., 26
Losses are uniformly 10% higher in 2001.
An insurance policy reimburses 100% of losses subject to a deductible of 11 up to a maximum
reimbursement of 11.
Calculate the ratio of expected reimbursements in 2001 over expected reimbursements in the year
2000.
(A) 110.0% (B) 110.5% (C) 111.0% (D) 111.5% (E) 112.0%


Use the following information for the next two questions:


An insurer has excess-of-loss reinsurance on auto insurance. You are given:
(i) Total expected losses in the year 2001 are 10,000,000.
(ii) In the year 2001 individual losses have a Pareto distribution with
F(x) = 1 - {2000/(2000 + x)}², x > 0.
(iii) Reinsurance will pay the excess of each loss over 3000.
(iv) Each year, the reinsurer is paid a ceded premium, Cyear , equal to 110% of
the expected losses covered by the reinsurance.
(v) Individual losses increase 5% each year due to inflation.
(vi) The frequency distribution does not change.
36.82 (3, 11/00, Q.41 & 2009 Sample Q.119) (1.25 points) Calculate C2001.
(A) 2,200,000

(B) 3,300,000

(C) 4,400,000

(D) 5,500,000

(E) 6,600,000

36.83 (3, 11/00, Q.42 & 2009 Sample Q.120) (1.25 points) Calculate C2002 / C2001.
(A) 1.04
(B) 1.05
(C) 1.06
(D) 1.07
(E) 1.08

36.84 (3, 11/01, Q.6 & 2009 Sample Q.97) (2.5 points) A group dental policy has a negative
binomial claim count distribution with mean 300 and variance 800.
Ground-up severity is given by the following table:
Severity    Probability
40          0.25
80          0.25
120         0.25
200         0.25
You expect severity to increase 50% with no change in frequency.
You decide to impose a per claim deductible of 100.
Calculate the expected total claim payment after these changes.
(A) Less than 18,000
(B) At least 18,000, but less than 20,000
(C) At least 20,000, but less than 22,000
(D) At least 22,000, but less than 24,000
(E) At least 24,000


36.85 (CAS3, 5/04, Q.17) (2.5 points) Payfast Auto insures sub-standard drivers.

Each driver has the same non-zero probability of having an accident.


Each accident does damage that is exponentially distributed with θ = 200.
There is a $100 per accident deductible and insureds only "report" claims that are larger
than the deductible.

Next year each individual accident will cost 20% more.


Next year Payfast will insure 10% more drivers.
What will be the percentage increase in the number of reported claims next year?
A. Less than 15%
B. At least 15%, but less than 20%
C. At least 20%, but less than 25%
D. At least 25%, but less than 30%
E. At least 30%
36.86 (CAS3, 5/04, Q.29) (2.5 points) Claim sizes this year are described by a
2-parameter Pareto distribution with parameters θ = 1,500 and α = 4. What is the expected claim
size per loss next year after 20% inflation and the introduction of a $100 deductible?
A. Less than $490
B. At least $490, but less than $500
C. At least $500, but less than $510
D. At least $510, but less than $520
E. At least $520
36.87 (CAS3, 5/04, Q.34) (2.5 points) Claim severities are modeled using a continuous
distribution and inflation impacts claims uniformly at an annual rate of i.
Which of the following are true statements regarding the distribution of claim severities after the effect
of inflation?
1. An Exponential distribution will have scale parameter (1+i)θ.
2. A 2-parameter Pareto distribution will have scale parameters (1+i)α and (1+i)θ.
3. A Paralogistic distribution will have scale parameter θ/(1+i).
A. 1 only

B. 3 only

C. 1 and 2 only

D. 2 and 3 only

E. 1, 2, and 3


36.88 (CAS3, 11/04, Q.33) (2.5 points)


Losses for a line of insurance follow a Pareto distribution with θ = 2,000 and α = 2.
An insurer sells policies that pay 100% of each loss up to $5,000. The next year the insurer changes
the policy terms so that it will pay 80% of each loss after applying a $100 deductible.
The $5,000 limit continues to apply to the original loss amount. That is, the insurer will pay 80% of
the loss amount between $100 and $5,000. Inflation will be 4%.
Calculate the decrease in the insurer's expected payment per loss.
A. Less than 23%
B. At least 23%, but less than 24%
C. At least 24%, but less than 25%
D. At least 25%, but less than 26%
E. At least 26%
36.89 (SOA3, 11/04, Q.18 & 2009 Sample Q.127) (2.5 points)
Losses in 2003 follow a two-parameter Pareto distribution with α = 2 and θ = 5.
Losses in 2004 are uniformly 20% higher than in 2003.
An insurance covers each loss subject to an ordinary deductible of 10.
Calculate the Loss Elimination Ratio in 2004.
(A) 5/9
(B) 5/8
(C) 2/3
(D) 3/4
(E) 4/5
36.90 (CAS3, 11/05, Q.21) (2.5 points) Losses during the current year follow a Pareto distribution
with α = 2 and θ = 400,000. Annual inflation is 10%.
Calculate the ratio of the expected proportion of claims that will exceed $750,000 next year to the
proportion of claims that exceed $750,000 this year.
A Less than 1.105
B. At least 1.105, but less than 1.115
C. At least 1.115, but less than 1.125
D. At least 1.125, but less than 1.135
E. At least 1.135
36.91 (CAS3, 11/05, Q.33) (2.5 points)
In year 2005, claim amounts have the following Pareto distribution: F(x) = 1 - {800/(800 + x)}³.
The annual inflation rate is 8%. A franchise deductible of 300 will be implemented in 2006.
Calculate the loss elimination ratio of the franchise deductible.
A. Less than 0.15
B. At least 0.15, but less than 0.20
C. At least 0.20, but less than 0.25
D. At least 0.25, but less than 0.30
E. At least 0.30


36.92 (SOA M, 11/05, Q.28 & 2009 Sample Q.209) (2.5 points)
In 2005 a risk has a two-parameter Pareto distribution with α = 2 and θ = 3000.
In 2006 losses inflate by 20%. An insurance on the risk has a deductible of 600 in each year.
Pi, the premium in year i, equals 1.2 times the expected claims.
The risk is reinsured with a deductible that stays the same in each year.
Ri, the reinsurance premium in year i, equals 1.1 times the expected reinsured claims.
R2005/P2005 = 0.55.

Calculate R2006/P2006.

(A) 0.46
(B) 0.52
(C) 0.55
(D) 0.58
(E) 0.66

36.93 (CAS3, 5/06, Q.26) (2.5 points) The aggregate losses of Eiffel Auto Insurance are denoted
in euro currency and follow a Lognormal distribution with μ = 8 and σ = 2.
Given that 1 euro = 1.3 dollars, which set of lognormal parameters describes the distribution of
Eiffel's losses in dollars?
A. μ = 6.15, σ = 2.26
B. μ = 7.74, σ = 2.00
C. μ = 8.00, σ = 2.60
D. μ = 8.26, σ = 2.00
E. μ = 10.40, σ = 2.60

36.94 (CAS3, 5/06, Q.39) (2.5 points) Prior to the application of any deductible, aggregate claim
counts during 2005 followed a Poisson distribution with λ = 14. Similarly, individual claim sizes
followed a Pareto distribution with α = 3 and θ = 1000. Annual severity inflation is 10%.
If all policies have a $250 ordinary deductible in 2005 and 2006, calculate the expected increase in
the number of claims that will exceed the deductible in 2006.
A. Fewer than 0.41 claims
B. At least 0.41, but fewer than 0.45
C. At least 0.45, but fewer than 0.49
D. At least 0.49, but fewer than 0.53
E. At least 0.53
36.95 (CAS3, 11/06, Q.30) (2.5 points) An insurance company offers two policies.
Policy R has no deductible and no limit. Policy S has a deductible of $500 and a limit of $3,000; that
is, the company will pay the loss amount between $500 and $3,000.
In year t, severity follows a Pareto distribution with parameters α = 4 and θ = 3,000.
The annual inflation rate is 6%.
Calculate the difference in expected cost per loss between policies R and S in year t+4.
A. Less than $500
B. At least $500, but less than $550
C. At least $550, but less than $600
D. At least $600, but less than $650
E. At least $650


36.96 (CAS5, 5/07, Q.46) (2.0 points) You are given the following information:
Claim
Ground-up Uncensored Loss Amount
A
$35,000
B
125,000
C
180,000
D
206,000
E
97,000
If all claims experience an annual ground-up severity trend of 8.0%, calculate the effective trend in the
layer from $100,000 to $200,000 ($100,000 in excess of $100,000.) Show all work.


Solutions to Problems:
36.1. B. For the Pareto, the new theta is the old theta multiplied by the inflation factor of 1.2.
Thus the new theta = (1.2)(5000) = 6000. Alpha is unaffected.
The average size of claim for the Pareto is: θ/(α - 1). In 1997, this is: 6000/(3-1) = 3000.
Alternately, the mean in 1994 is 5000 / (3-1) = 2500. The mean increases by the inflation factor of
1.2; therefore the mean in 1997 is (1.2)(2500) = 3000.
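As an illustrative sketch in Python (values taken from 36.1; the function name is my own), one can confirm that multiplying θ by the inflation factor multiplies the Pareto mean by the same factor:

```python
# Pareto mean theta/(alpha - 1), before and after 20% uniform inflation.
def pareto_mean(alpha, theta):
    return theta / (alpha - 1)

mean_1994 = pareto_mean(3, 5000)        # 2500
mean_1997 = pareto_mean(3, 1.2 * 5000)  # 3000, the same as 1.2 * mean_1994
print(mean_1994, mean_1997, 1.2 * mean_1994)
```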
36.2. D. The inflation factor is (1.03)⁵ = 1.1593. For the Exponential, the new θ is the old θ
multiplied by the inflation factor. Thus the new θ is: (200)(1.1593) = 231.86.
The variance for the Exponential Distribution is: θ², which in 2009 is: 231.86² = 53,758.
Alternately, the variance in 2004 is: 200² = 40,000. The variance increases by the square of the
inflation factor; therefore the variance in 2009 is: (1.1593²)(40,000) = 53,759.
36.3. C. For a Burr, the new theta = θ(1+r) = 19307(1.3) = 25,099. (Alpha and gamma are
unaffected.) Thus in 1996, 1 - F(10,000) = {1/(1 + (10000/25099)^0.7)}² = 43.0%.
Alternately, $10,000 in 1996 corresponds to $10,000 / 1.3 = $7692 in 1992.
Then in 1992, 1 - F(7692) = {1/(1 + (7692/19307)^0.7)}² = 43.0%.
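The two approaches in this solution can be checked with a short Python sketch (parameters taken from 36.3; the helper function is my own):

```python
# Burr survival probability two ways: inflate theta by 1.3, or deflate the 10,000 limit.
def burr_survival(x, alpha, theta, gamma):
    return (1.0 / (1.0 + (x / theta) ** gamma)) ** alpha

s_inflated_theta = burr_survival(10_000, 2, 1.3 * 19_307, 0.7)
s_deflated_x = burr_survival(10_000 / 1.3, 2, 19_307, 0.7)
print(round(s_inflated_theta, 4), round(s_deflated_x, 4))  # both about 0.430
```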
36.4. B. For the Gamma Distribution, θ is multiplied by the inflation factor of 1.1, while α is
unaffected. Thus the parameters in 1996 are: α = 2, θ = 110.
36.5. E. For the Pareto, the new theta is the old theta multiplied by the inflation factor of 1.25.
Thus the new theta = (1.25)(15000) = 18750. Alpha is unaffected. The average size of claim for
data truncated and shifted at 25,000 in 1999 is the mean excess loss, e(25000), in 1999.
For the Pareto e(x) = (x + θ) / (α - 1).
In 1999, e(25000) = (25000 + 18750) / (1.5 - 1) = 87,500.
Alternately, $25,000 in 1999 corresponds to $25,000 / 1.25 = $20,000 in 1995.
The average size of claim for data truncated and shifted at 20,000 in 1995 is the mean excess loss,
e(20000), in 1995. For the Pareto e(x) = (x + θ) / (α - 1).
In 1995, e(20,000) = (20000 + 15000) / (1.5 - 1) = 70,000. However, we need to inflate this back
up to get the average size in 1999 dollars: (70,000)(1.25) = 87,500.
Comment: The alternate solution uses the fact that the effect of a deductible keeps up with inflation
provided the limit keeps up with inflation, or equivalently if the limit keeps up with inflation, then the
mean excess loss increases by the inflation rate.
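A brief Python sketch of this check (parameters from 36.5; the function name is mine) shows that inflating θ and evaluating e(25,000) matches inflating the 1995 mean excess loss at the deflated deductible:

```python
# Pareto mean excess loss e(x) = (x + theta)/(alpha - 1), before and after 25% inflation.
def pareto_mean_excess(x, alpha, theta):
    return (x + theta) / (alpha - 1)

e_1999 = pareto_mean_excess(25_000, 1.5, 1.25 * 15_000)
e_1995 = pareto_mean_excess(25_000 / 1.25, 1.5, 15_000)
print(e_1999, 1.25 * e_1995)  # 87500.0  87500.0
```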


36.6. B. Integrating the density function, FX(x) = (2.5)(1/2 - 1/x), 2<x<10.


FZ(z) = FX(x) = (2.5)(1/2 - 1/x) = (2.5)(1/2 - 1.2/z).
Alternately, FZ(z) = FX(z/(1+r)) = FX(z/1.2) = (2.5)(1/2 - 1.2/z).
FZ(6) = (2.5)(1/2 - 1.2/6 ) = 0.75. 1 - FZ(6) = 0.25.
Alternately, 6 in 1996 is 6 /1.2 = 5 in terms of 1992.
In 1992, FX(5) = (2.5)(1/2 - 1/5 ) = 0.75. 1 - FX(5) = 0.25.
Comment: Note that the domain becomes [2.4, 12] in 1996.
36.7. C. For the Loglogistic, θ is multiplied by 80 (the inflation factor), while the other parameter is
unaffected.
36.8. E. For the LogNormal Distribution, μ has ln(80) added to it, while σ is unaffected.
New μ = 10 + ln(80) = 14.38 and σ = 3.
36.9. D. For the Weibull Distribution, θ is multiplied by 80, while τ is unaffected.
New θ = (625)(80) = 50,000.
36.10. C. For the Paralogistic, θ is multiplied by 80 (the inflation factor), while the other parameter
is unaffected.
36.11. E. In 1996 x becomes 1.1x. z = 1.1x. ln(z) = ln(1.1x) = ln(x) + ln(1.1).
Thus in 1996 the distribution function is F(z) = Γ[α; λ ln(x)] = Γ[α; λ{ln(z) - ln(1.1)}].
This is not of the same form, so the answer is none of the above.
Comment: This is called the LogGamma Distribution. If ln(x) follows a Gamma Distribution, then x
follows a LogGamma Distribution. Under uniform inflation, ln(x) becomes
ln(x) + ln(1+r). If you add a constant amount to a Gamma distribution, then you no longer have a
Gamma distribution. Which is why under uniform inflation a LogGamma distribution is not
reproduced.
36.12. The sum of 50 independent, identically distributed Exponentials each with θ = 800 is
Gamma with α = 50 and θ = 800. The average is 1/50 times the sum, and has a Gamma
Distribution with α = 50 and θ = 800/50 = 16.


36.13. B. 1. F. The skewness is unaffected by uniform inflation. (The numerator of the skewness is
the third central moment which would be multiplied by 1.1³.) 2. T. Since each claim size is
increased by 10%, the place where 70% of the claims are less and 30% are more is also increased
by 10%. Under uniform inflation, each percentile is increased by the inflation factor.
3. F. Under uniform inflation, if the deductible increases to keep up with inflation, then the Loss
Elimination Ratio is unaffected. So in 1996 the Loss Elimination Ratio at $1100 is 10% not 11%.
36.14. B. This is a mixed Gamma-Weibull Distribution.
The Gamma has parameters α = 3 and θ = 1/10, and density:
x^(α-1) e^(-x/θ) / {θ^α Γ(α)} = 500x² e^(-10x).
The Weibull has parameters θ = 1/(20^(1/4)) and τ = 4, and density:
τ(x/θ)^τ exp(-(x/θ)^τ) / x = 80x³ exp(-20x⁴). In the mixed distribution, the Gamma is given a weight of
0.75, while the Weibull is given a weight of 0.25. Note that (0.75)(500) = 375, and (0.25)(80) = 20.
Under uniform inflation of 25%, the Gamma has parameters:
α = 3 and θ = (1/10)(1.25) = 1/8, and density: x^(α-1) e^(-x/θ) / {θ^α Γ(α)} = 256x² e^(-8x).
Under uniform inflation of 25%, the Weibull has parameters: θ = 1.25/(20^(1/4)), so that
θ^(-4) = 20/(1.25)⁴ = 8.192, and τ = 4, and density: τ(x/θ)^τ exp(-(x/θ)^τ) / x =
32.768x³ exp(-8.192x⁴). Therefore, the mixed distribution after inflation has a density of:
(0.75){256x² e^(-8x)} + (0.25){32.768x³ exp(-8.192x⁴)} = 192x² e^(-8x) + 8.192x³ exp(-8.192x⁴).
36.15. D. This is a LogNormal Distribution with parameters (prior to inflation) of μ = 7 and
σ = 3. Thus posterior to inflation of 40%, one has a LogNormal Distribution with parameters of
μ = 7 + ln(1.4) = 7.336 and σ = 3. For the LogNormal, S(x) = 1 - Φ[(ln(x) - μ)/σ]. Prior to inflation,
S(1000) = 1 - Φ[(ln(1000) - 7)/3] = 1 - Φ(-0.03) = Φ(0.03) = 0.5120.
After inflation, S(1000) = 1 - Φ[(ln(1000) - 7.336)/3] = 1 - Φ(-0.14) = Φ(0.14) = 0.5557.
Prior to inflation, 173 losses are expected to exceed the deductible each year.
The survival function increased from 0.5120 to 0.5557 after inflation.
Thus after inflation one expects to exceed the deductible per year:
173(0.5557)/0.5120 = 187.8 claims.
Alternately, a limit of 1000 after inflation is equivalent to 1000/1.4 = 714.29 prior to inflation. Thus the
tail probability after inflation at 1000 is the same as the tail probability at 714.29 prior to inflation.
Prior to inflation, 1 - F(714.29) = 1 - Φ[(ln(714.3) - 7)/3] = 1 - Φ(-0.14) = Φ(0.14) = 0.5557.
Proceed as before.
Comment: The expected number of claims over a fixed deductible increases under uniform inflation.
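As a sketch (using the LogNormal parameters from 36.15 and the standard normal CDF built from math.erf; exact output differs slightly from the rounded normal-table values above):

```python
# Expected count over a fixed 1000 deductible grows with the ratio of survival functions.
from math import erf, log, sqrt

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def lognormal_survival(x, mu, sigma):
    return 1.0 - phi((log(x) - mu) / sigma)

s_old = lognormal_survival(1000, 7.0, 3.0)
s_new = lognormal_survival(1000, 7.0 + log(1.4), 3.0)
print(round(173 * s_new / s_old, 1))  # about 188
```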

36.16. A. For the Pareto Distribution,
E[X ∧ x] = {θ/(α-1)} {1 - (θ/(θ+x))^(α-1)} = 20{1 - (1 + x/40)^(-2)}.
E[X ∧ 5] = 20{1 - (1 + 5/40)^(-2)} = 4.198. E[X ∧ 25] = 20{1 - (1 + 25/40)^(-2)} = 12.426.
E[X ∧ 5/1.06] = 20{1 - (1 + 4.717/40)^(-2)} = 3.997.
E[X ∧ 25/1.06] = 20{1 - (1 + 23.585/40)^(-2)} = 12.085.
In 2001 the expected payments are: E[X ∧ 25] - E[X ∧ 5] = 12.426 - 4.198 = 8.228.
A deductible of 5 and maximum covered loss of 25 in the year 2002, when deflated back to the
year 2001, correspond to a deductible of: 5/1.06 = 4.717, and a maximum covered loss of:
25/1.06 = 23.585. Therefore, reinflating back to the year 2002, the expected payments in the year
2002 are: (1.06)(E[X ∧ 23.585] - E[X ∧ 4.717]) = (1.06)(12.085 - 3.997) = 8.573.
The ratio of expected payments in 2002 over the expected payments in the year 2001 is:
8.573/8.228 = 1.042.
Alternately, the insurer's average payment per loss is: (1+r) c (E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)]).
c = 100%, u = 25, d = 5. r = 0.06 for the year 2002 and r = 0 for the year 2001.
Then proceed as previously.
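A Python sketch of this calculation (parameters from 36.16; the helper function is mine):

```python
# Pareto limited expected values, with the 2002 deductible and limit deflated to 2001
# and the resulting layer reinflated by 1.06.
def pareto_lev(x, alpha, theta):
    return (theta / (alpha - 1)) * (1.0 - (theta / (theta + x)) ** (alpha - 1))

paid_2001 = pareto_lev(25, 3, 40) - pareto_lev(5, 3, 40)
paid_2002 = 1.06 * (pareto_lev(25 / 1.06, 3, 40) - pareto_lev(5 / 1.06, 3, 40))
print(round(paid_2002 / paid_2001, 3))  # about 1.042
```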
36.17. E. Inflation is 8% per year for 9 years, thus the inflation factor is 1.08⁹ = 1.999.
Thus 1000 in the year 2002 is equivalent to 1000/1.999 = 500 in 1993. There are 457 claims
excess of 500 in 1993; this is 457/1000 = 45.7%.
Comment: Note the substantial increase in the proportion of claims over a fixed limit. In 1993 there
are 32.6% of the claims excess of 1000, while in 2002 there are 45.7%.
36.18. C. In general one substitutes x = z / (1+r), and for the density function
fZ(z) = fX(x) / (1+r). (Recall that under change of variables you need to divide by dz/dx = 1 + r,
since dF/dz = (dF/dx) / (dz/dx).) In this case, 1 + r = 1.2.
Thus, fZ(z) = fX(x) / 1.2 = μ exp[-(x - μ)² / (2βx)] / {1.2 √(2πβx³)} =
μ exp[-({z/1.2} - μ)² / (2β{z/1.2})] / {1.2 √(2πβ{z/1.2}³)} =
(1.2μ) exp[-(z - 1.2μ)² / {2(1.2β)z}] / √(2π(1.2β)z³).
This is of the same form, but with parameters 1.2μ and 1.2β, rather than μ and β.
Comment: This is an Inverse Gaussian Distribution. Let θ = μ²/β and one has the parameterization
in Loss Models, with parameters μ and θ. Since under uniform inflation, for the Inverse Gaussian each
of μ and θ are multiplied by the inflation factor, so is β = μ²/θ.


36.19. C. This is an Exponential Distribution with θ = 5000. New θ = (5000)(1.4) = 7000.
For the Exponential Distribution, E[X ∧ x] = θ(1 - e^(-x/θ)). The mean is θ.
The losses excess of a limit are proportional to E[X] - E[X ∧ x] = θe^(-x/θ).
In 1998 this is for x = 1000: 5000e^(-1000/5000) = 4094.
In 2007 this is for x = 1000: 7000e^(-1000/7000) = 6068.
The increase is: (6068/4094) - 1 = 48.2%.
Comment: In general excess losses over a fixed limit increase faster than the rate of inflation. Note
that E[X] - E[X ∧ x] = S(x)e(x) = R(x)E[X] = θe^(-x/θ).
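A short Python sketch of this comparison (values from 36.19; the function name is mine):

```python
# For an Exponential, losses excess of a fixed limit d are proportional to theta*exp(-d/theta);
# with theta rising from 5000 to 7000, the excess of 1000 grows faster than the 40% inflation.
from math import exp

def excess(theta, d):
    return theta * exp(-d / theta)

print(round(excess(7000, 1000) / excess(5000, 1000) - 1.0, 3))  # about 0.482
```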
36.20. B. This is an Exponential Distribution with θ = 5000.
Therefore, the new theta = (5000)(1.4) = 7000.
For the Exponential Distribution, E[X ∧ x] = θ(1 - e^(-x/θ)). The mean is θ.
The losses excess of a limit are proportional to E[X] - E[X ∧ x] = θe^(-x/θ).
In 1998 this is for x = 1000: 5000e^(-1000/5000) = 5000e^(-0.2).
In 2007 this is for x = 1400: 7000e^(-1400/7000) = 7000e^(-0.2).
The increase is: (7000/5000) - 1 = 40.0%.
Comment: If the limit keeps up with inflation, then excess losses increase at the rate of inflation.
36.21. D. The inflation factor from 1990 to 1999 is: (1.04)⁹ = 1.423. Thus the parameters of the
1999 LogNormal are: 3 + ln(1.423) and σ. Therefore, the mean of the 1999 LogNormal is:
Mean99 = exp(3 + ln(1.423) + σ²/2) = 1.423 exp(3 + σ²/2) = 1.423 Mean90.
Therefore, ln(Mean99) = 3 + ln(1.423) + σ²/2. F90(Mean99) = Φ[(ln(Mean99) - μ)/σ] =
Φ[(3 + ln 1.423 + σ²/2 - 3)/σ] = Φ[(ln 1.423 + σ²/2)/σ].
We are given that in 1990 5% of the losses exceed the mean of the losses in 1999. Thus,
F90(Mean99) = 0.95. Therefore, Φ[(ln 1.423 + σ²/2)/σ] = 0.95.
Φ(1.645) = 0.95. (ln 1.423 + σ²/2)/σ = 1.645. σ²/2 - 1.645σ + ln 1.423 = 0.
σ = 1.645 ± √(1.645² - 2 ln 1.423) = 1.645 ± √2.000 = 0.231 or 3.059.
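The quadratic can be checked with a few lines of Python (a sketch using the values from 36.21):

```python
# Solve sigma^2/2 - 1.645*sigma + ln(1.423) = 0 via the quadratic formula.
from math import log, sqrt

c = log(1.04 ** 9)                        # ln of the 1990-to-1999 inflation factor
disc = sqrt(1.645 ** 2 - 2.0 * c)
roots = (1.645 - disc, 1.645 + disc)
print(tuple(round(r, 3) for r in roots))  # approximately (0.231, 3.059)
```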


36.22. E. The inflation factor from 1995 to 2001 is 206.8/170.3 = 1.214.
For the Weibull Distribution, θ is multiplied by 1 + r, while τ is unaffected.
Thus in 2001 the new θ is: (1)(1.214) = 1.214.
The survival function of the Weibull is S(x) = exp(-(x/θ)^τ).
In 1995, S(1000) = exp(-1000^0.3) = 0.000355.
In 2001, S(1000) = exp(-(1000/1.214)^0.3) = 0.000556.
The ratio of survival functions is: 0.000556/0.000355 = 1.57 or a 57% increase in the expected
number of claims excess of the deductible.
Comment: Generally for the Weibull, the ratio of the survival functions at x is:
exp[-(x/{θ(1+r)})^τ] / exp[-(x/θ)^τ] = exp[(x/θ)^τ {1 - 1/(1+r)^τ}].
36.23. B. The inflation factor from 1994 to 1998 is: (1.05)(1.03)(1.07)(1.06) = 1.2266.
For the LogNormal Distribution, μ has ln(1+r) added, while σ is unaffected.
Thus in 1998 the new μ is: 3 + ln(1.2266) = 3.2042.
The Limited Expected Value for the LogNormal Distribution is:
E[X ∧ x] = exp(μ + σ²/2) Φ[(ln x - μ - σ²)/σ] + x {1 - Φ[(ln x - μ)/σ]}.
In 1994, E[X ∧ 25000] = e^11 Φ[(ln(25000) - 19)/4] + 25000{1 - Φ[(ln(25000) - 3)/4]} =
59874 Φ[-2.22] + 25000{1 - Φ[1.78]} = 59,874(1 - 0.9868) + 25,000(1 - 0.9625) = 1728.
In 1998, E[X ∧ 25000] =
e^11.2042 Φ[(ln(25000) - 19.2042)/4] + 25000{1 - Φ[(ln(25000) - 3.2042)/4]} =
73438 Φ[-2.27] + 25000{1 - Φ[1.73]} = 73438(1 - 0.9884) + 25,000(1 - 0.9582) = 1897.
The ratio of Limited Expected Values is: 1897/1728 = 1.098 or a 9.8% increase in the expected
dollars of claims between 1994 and 1998.
Alternately, in 1994, E[X ∧ 20382] = e^11 Φ[(ln(20382) - 19)/4] + 20382{1 - Φ[(ln(20382) - 3)/4]} =
59874 Φ[-2.27] + 20382{1 - Φ[1.73]} = 59,874(1 - 0.9884) + 20382(1 - 0.9582) = 1547.
In 1998 the average payment per loss is:
(1+r) c (E[X ∧ L/(1+r)] - E[X ∧ d/(1+r)]) = 1.2266 E[X ∧ 25000/1.2266] =
1.2266 E[X ∧ 20382] = (1.2266)(1547) = 1898. Proceed as before.
Comment: For a fixed limit, basic limit losses increase at less than the overall rate of inflation.
Here unlimited losses increase 22.7%, but limited losses increase only 9.8%.
When using the formula for the average payment per loss, use the original LogNormal for 1994.


36.24. D. In general a layer of loss is proportional to the integral of the survival function.
In 1994, S(x) = 10^15 x^-3. The integral from 500,000 to 2,000,000 of S(x) is:
10^15 (500000^-2 - 2000000^-2)/2 = 1875.
In 1999, the Distribution Function is gotten by substituting x = z/1.2.
F(z) = 1 - (100000/(z/1.2))^3 = 1 - (120000/z)^3 for z > $120,000.
Thus in 1999, the integral from 500,000 to 2,000,000 of the survival function is:
(120000)^3 (500000^-2 - 2000000^-2)/2 = 3240.
3240 / 1875 = 1.728, representing a 72.8% increase.
Alternately, this is a Single Parameter Pareto Distribution, with α = 3 and θ = 100,000.
Under uniform inflation of 20%, θ becomes 120,000 while α is unaffected.
E[X ∧ x] = θ[α - (x/θ)^(1-α)] / (α - 1) for α > 1.
Then the layer from 500,000 to 2,000,000 is proportional to:
E[X ∧ 2000000] - E[X ∧ 500000] = {θ/(α-1)}{(500000/θ)^(1-α) - (2000000/θ)^(1-α)}.
In 1994, E[X ∧ 2000000] - E[X ∧ 500000] = (100000/2){5^-2 - 20^-2} = 1875.
In 1999, E[X ∧ 2000000] - E[X ∧ 500000] = (120000/2){4.1667^-2 - 16.6667^-2} = 3240.
3240 / 1875 = 1.728, representing a 72.8% increase.
Comment: As shown in A Practical Guide to the Single Parameter Pareto Distribution, by Stephen
W. Philbrick, PCAS LXXII, 1985, pp. 44, for the Single Parameter Pareto Distribution, a layer of
losses is multiplied by (1+r)^α. In this case 1.2^3 = 1.728.
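A numerical sketch of the first approach (Python; the helper name is mine, assuming the Single Parameter Pareto survival function S(x) = (θ/x)^α for x ≥ θ):

    def layer_single_parameter_pareto(theta, alpha, a, b):
        # integral of S(x) = (theta / x)^alpha from a to b, for theta <= a <= b and alpha > 1
        return theta**alpha * (a**(1 - alpha) - b**(1 - alpha)) / (alpha - 1)

    before = layer_single_parameter_pareto(100000, 3, 500000, 2000000)   # 1875
    after = layer_single_parameter_pareto(120000, 3, 500000, 2000000)    # 3240
    print(before, after, after / before)   # the ratio is 1.728 = 1.2^3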
36.25. D. The total inflation factor is (1.03)^8 = 1.2668.
Under uniform inflation both parameters of the Inverse Gaussian are multiplied by 1 + r = 1.2668.
Thus in 2009 the parameters are: μ = 3(1.2668) = 3.8003 and θ = 10(1.2668) = 12.668.
Thus the variance in 2009 is: μ³/θ = 3.8003³/12.668 = 4.33.
Alternately, the variance in 2001 is: μ³/θ = 3³/10 = 2.7. Under uniform inflation, the variance is
multiplied by (1+r)². Thus in 2009 the variance is: (2.7)(1.2668²) = 4.33.
36.26. A. S(10000) = (5000/(5000 + 10000))^3 = 1/27.
Thus we want x such that S(x) = (1/2)(1/27) = 1/54.
(5000/(5000 + x))^3 = 1/54. x = 5000(54^(1/3) - 1) = 13,899.


36.27. C. During the year 2006, the losses are Pareto with α = 3 and θ = (1.15)(5000) = 5750.
S(10000) = {5750/(5750 + 10000)}^3 = 0.04866.
Thus we want x such that S(x) = (1/2)(0.04866) = 0.02433.
(5750/(5750 + x))^3 = 0.02433. x = 5750(1/0.02433^(1/3) - 1) = 14,094.
36.28. A. The new theta = (1/1000)(1.25) = 1/800. Thus in 1998 the density is:
e^(-x/θ)/θ = 800e^(-800x).
36.29. E. The inflation factor is 1.04^3 = 1.1249.

Probability   2003 Loss Amount   2003 Insurer Payment   2006 Loss Amount   2006 Insurer Payment
  0.1667            1000                    0                 1124.9                   0.0
  0.3333            2000                    0                 2249.7                 249.7
  0.3333            5000                 3000                 5624.3                3624.3
  0.1667           10000                 8000                11248.6                9248.6
Average           4166.7               2333.3                 4686.9                2832.8

2832.8 / 2333.3 = 1.214, therefore the insurer's expected payments increased 21.4%.
Comment: Similar to 4B, 5/94, Q.21.
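The table can be reproduced with a few lines of code; a minimal sketch (Python, assuming a 2000 deductible, which is what the payment column implies):

    losses_2003 = [1000, 2000, 5000, 10000]
    probs = [1/6, 1/3, 1/3, 1/6]
    deductible, factor = 2000, 1.04**3

    pay_2003 = sum(p * max(x - deductible, 0) for p, x in zip(probs, losses_2003))
    pay_2006 = sum(p * max(x * factor - deductible, 0) for p, x in zip(probs, losses_2003))
    print(pay_2003, pay_2006, pay_2006 / pay_2003)   # about 2333, 2833, 1.214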
36.30. B. The second moment of the LogNormal in 2007 is exp[(2)(5) + (2)(0.7²)] = 58,689.
The second moment increases by the square of the inflation factor: (1.04³)²(58,689) = 74,260.
Alternately, the LogNormal in 2010 has parameters of: μ = 5 + ln[1.04³] = 5.1177, and σ = 0.7.
The second moment of the LogNormal in 2010 is exp[(2)(5.1177) + (2)(0.7²)] = 74,265.


36.31. D. Deflating, the 1000 deductible in 2008 is equivalent to a deductible of 1000/1.1 = 909 in
2005. Work in 2005 and then reinflate back up to the 2008 level by multiplying by 1.1.
Average payment per loss is: 1.1{E[X] - E[X ∧ 909]} = 1.1{∫[0,∞] x f(x) dx - ∫[0,909] x f(x) dx - 909 S(909)}
= 1.1 ∫[909,∞] x f(x) dx - 1000 ∫[909,∞] f(x) dx.
Average payment per loss is: 1.1{E[X] - E[X ∧ 909]} = 1.1{∫[0,∞] S(x) dx - ∫[0,909] S(x) dx} = 1.1 ∫[909,∞] S(x) dx.
Average payment per loss is: 1.1 E[(X - 909)+] = 1.1 ∫[909,∞] (x - 909) f(x) dx.
Average payment per loss is: 1.1{E[X] - E[X ∧ 909]} = 1.1 ∫[0,∞] x f(x) dx - 1.1 ∫[0,909] S(x) dx
= 1.1 ∫[909,∞] x f(x) dx + 1.1 ∫[0,909] {x f(x) - S(x)} dx.


36.32. For convenience put everything in millions of dollars.
Prior to inflation, E[X ∧ x] = 0.5 - 0.5²/(0.5 + x).
Thus prior to inflation the average payment per loss is:
E[X ∧ 1] - E[X ∧ R] = 0.5²/(0.5 + R) - 0.5²/(0.5 + 1) = 0.5²/(0.5 + R) - 0.166667.
After inflation the average payment per loss is:
1.1(E[X ∧ 1/1.1] - E[X ∧ R/1.1]) =
(1.1)(0.5²)/(0.5 + R/1.1) - (1.1)(0.5²)/(0.5 + 1/1.1) = 0.55²/(0.55 + R) - 0.195161.
Setting the ratio of the two average payments per loss equal to 1.1:
(1.1){0.5²/(0.5 + R) - 0.166667} = 0.55²/(0.55 + R) - 0.195161.
0.011827(0.5 + R)(0.55 + R) + (1.1)(0.5²)(0.55 + R) - 0.55²(0.5 + R) = 0.
0.011827R² - 0.015082R + 0.0032524 = 0.
R = {0.015082 ± √(0.015082² - (4)(0.011827)(0.0032524))} / {(2)(0.011827)} = 0.6376 ± 0.3627.
R = 0.275 or 1.000. R = $275,000.
Comment: A rewritten version of CAS9, 11/99, Q.39.
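As a check on the algebra, the quadratic can be solved numerically (a minimal Python sketch):

    import math

    a, b, c = 0.011827, -0.015082, 0.0032524
    disc = math.sqrt(b * b - 4 * a * c)
    print([(-b + s * disc) / (2 * a) for s in (1, -1)])   # about 1.000 and 0.275 (millions), so R = $275,000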
36.33. C. During 2013, the losses follow an Exponential with mean: (1.08)(5000) = 5400.
An Exponential distribution truncated and shifted from below is the same Exponential Distribution,
due to the memoryless property of the Exponential. Thus the nonzero payments are Exponential
with mean 5400. The probability of a nonzero payment is the probability that a loss is greater than
the deductible of 1000; S(1000) = e^(-1000/5400) = 0.8310. Thus the payments of the insurer
can be thought of as a compound distribution, with Bernoulli frequency with mean 0.8310 and
Exponential severity with mean 5400. The variance of this compound distribution is:
(Mean Freq.)(Var. Sev.) + (Mean Sev.)²(Var. Freq.) =
(0.8310)(5400²) + (5400)²{(0.8310)(1 - 0.8310)} = 28.3 million.
Equivalently, the payments of the insurer in this case are a two point mixture of an Exponential with
mean 5400 and a distribution that is always zero, with weights 0.8310 and 0.1690.
This has a first moment of: (5400)(0.8310) + (0)(0.1690) = 4487.4,
and a second moment of: {(2)(5400²)}(0.8310) + (0²)(0.1690) = 48,463,920.
Thus the variance is: 48,463,920 - 4487.4² = 28.3 million.
Comment: Similar to 3, 11/00, Q.21, which does not include inflation.
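A short numerical sketch of the compound-variance calculation (Python; the variable names are mine):

    import math

    theta = 5400                      # Exponential mean during 2013
    q = math.exp(-1000 / theta)       # probability a loss exceeds the 1000 deductible
    mean_sev, var_sev = theta, theta**2

    # variance of a compound distribution with Bernoulli frequency and Exponential severity
    var_payment = q * var_sev + mean_sev**2 * q * (1 - q)
    print(var_payment / 1e6)          # about 28.3 (million)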


36.34. E. In 2005, the average payment per payment is:
(E[X] - E[X ∧ 5000])/S(5000) = (E[X] - 3000)/(1 - 0.655) = 2.8986E[X] - 8695.7.
After 25% inflation, the average payment per payment is:
1.25(E[X] - E[X ∧ 5000/1.25])/S(5000/1.25) = 1.25(E[X] - E[X ∧ 4000])/(1 - F(4000))
= 1.25(E[X] - 2624)/(1 - 0.590) = 3.0488E[X] - 8000.
Set 1.15(2.8986E[X] - 8695.7) = 3.0488E[X] - 8000. E[X] = 2000/0.2846 = 7027.
36.35. D. In 2015, the losses are Pareto with α = 5 and θ = (1.25)(40) = 50.
With a deductible of 10, the non-zero payments are Pareto with α = 5 and θ = 50 + 10 = 60.
The mean of this Pareto is: 60/4 = 15.
The second moment of this Pareto is: (2)(60²)/{(5 - 1)(5 - 2)} = 600.
The variance of this Pareto is: 600 - 15² = 375.


36.36. C. The non-zero payments are Pareto with α = 5 and θ = 50 + 10 = 60,
with mean: 15, second moment: 600, and variance: 600 - 15² = 375.
The probability of a non-zero payment is the survival function at 10 of the original Pareto:
{50/(50 + 10)}^5 = 0.4019.
Thus YL is a two-point mixture of a Pareto distribution with α = 5 and θ = 60, and a distribution that is
always zero, with weights 0.4019 and 0.5981.
The mean of the mixture is: (0.4019)(15) + (0.5981)(0) = 6.029.
The second moment of the mixture is: (0.4019)(600) + (0.5981)(0²) = 241.14.
The variance of this mixture is: 241.14 - 6.029² = 205.
Alternately, YL can be thought of as a compound distribution,
with Bernoulli frequency with mean 0.4019 and Pareto severity with α = 5 and θ = 60.
The variance of this compound distribution is:
(Mean Freq.)(Var. Sev.) + (Mean Sev.)²(Var. Freq.) =
(0.4019)(375) + (15)²{(0.4019)(0.5981)} = 205.


36.37. B. Inflation factor = (1+r) = 1.03^5 = 1.1593. Coinsurance factor = c = 0.90.
Maximum Covered Loss = u = 50,000. Deductible amount = d = 10,000.
u/(1+r) = 43,130. d/(1+r) = 8626.
E[X ∧ x] = exp(μ + σ²/2)Φ[(ln x - μ - σ²)/σ] + x{1 - Φ[(ln x - μ)/σ]}.
E[X ∧ u/(1+r)] = E[X ∧ 43,130] =
exp(10.02)Φ[(ln(43,130) - 9.7 - 0.64)/0.8] + (43,130){1 - Φ[(ln(43,130) - 9.7)/0.8]} =
(22,471)Φ[0.41] + (43,130){1 - Φ[1.21]} = (22,471)(0.6591) + (43,130)(1 - 0.8869) = 19,689.
E[X ∧ d/(1+r)] = E[X ∧ 8626] =
exp(10.02)Φ[(ln(8626) - 9.7 - 0.64)/0.8] + (8626){1 - Φ[(ln(8626) - 9.7)/0.8]} =
(22,471)Φ[-1.60] + (8626){1 - Φ[-0.80]} = (22,471)(0.0548) + (8626)(0.7881) = 8030.
The average payment per loss is: (1+r) c (E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)]) =
(1.1593)(0.9)(19,689 - 8030) = 12,165.
Comment: In 2002, the average payment per loss is: (0.9)(E[X ∧ 50000] - E[X ∧ 10000]) =
(0.9)(20,345 - 9073) = 10,145. Thus it increased from 2002 to 2007 by: 12,165/10,145 - 1 =
19.9%. The maximum covered loss would cause the increase to be less than the rate of inflation of
15.9%, while the deductible would cause it to be greater. In this case the deductible had a bigger
impact than the maximum covered loss on the rate of increase. When using the formula for the
average payment per loss, use the parameters of the original LogNormal for 2002. This formula is
equivalent to deflating the 2007 values back to 2002, working in 2002, and then reinflating back up
to 2007. One could instead inflate the LogNormal to 2007 and work in 2007.
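The LogNormal limited expected value lends itself to a short function; the sketch below (Python, standard library only, using exact Normal probabilities rather than rounded table values) reproduces the average payment per loss.

    from math import erf, exp, log, sqrt

    def norm_cdf(x):
        return 0.5 * (1 + erf(x / sqrt(2)))

    def lognormal_lev(x, mu, sigma):
        # E[X ∧ x] for a LogNormal Distribution
        return exp(mu + sigma**2 / 2) * norm_cdf((log(x) - mu - sigma**2) / sigma) \
               + x * (1 - norm_cdf((log(x) - mu) / sigma))

    mu, sigma = 9.7, 0.8
    infl, c, u, d = 1.03**5, 0.90, 50000, 10000
    per_loss = infl * c * (lognormal_lev(u / infl, mu, sigma) - lognormal_lev(d / infl, mu, sigma))
    print(per_loss)   # about 12,165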
36.38. A. Inflation factor = (1+r) = 1.03^5 = 1.1593.
Coinsurance factor = c = 0.90.
Maximum Covered Loss = u = 50,000. Deductible amount = d = 10,000.
u/(1+r) = 43,130. d/(1+r) = 8626. S(d/(1+r)) = 1 - Φ[(ln(8626) - 9.7)/0.8] = 1 - Φ[-0.80] = 0.7881.
The average payment per non-zero payment is:
(1+r)c(E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)])/S(d/(1+r)) = (1.1593)(0.9)(19,689 - 8030)/0.7881 = 15,435.
Comment: The average payment per non-zero payment is:
average payment per loss / S(d/(1+r)) = 12,165/0.7881 = 15,436.

36.39. D. E[(X ∧ x)²] = exp[2μ + 2σ²]Φ[(ln(x) - μ - 2σ²)/σ] + x²{1 - Φ[(ln(x) - μ)/σ]}.

E[(X ∧ u/(1+r))²] = E[(X ∧ 43,130)²] =
exp(20.68)Φ[{ln(43,130) - 9.7 - (2)(0.8²)}/0.8] + (43,130)²{1 - Φ[(ln(43,130) - 9.7)/0.8]} =
e^20.68 Φ[-0.39] + (43,130²){1 - Φ[1.21]} = (e^20.68)(0.3483) + (43,130²)(1 - 0.8869) =
543,940,124.
E[(X ∧ d/(1+r))²] = E[(X ∧ 8626)²] =
exp(20.68)Φ[{ln(8626) - 9.7 - (2)(0.8²)}/0.8] + (8626²){1 - Φ[(ln(8626) - 9.7)/0.8]} =
(e^20.68)Φ[-2.40] + (8626²){1 - Φ[-0.80]} = (e^20.68)(0.0082) + (8626²)(0.7881) = 66,493,633.
From previous solutions: E[X ∧ 43,130] = 19,689, E[X ∧ 8626] = 8030, S(8626) = 0.7881.
Thus the second moment of the per-loss variable is:
(1.1593²)(90%²){543,940,124 - 66,493,633 - (2)(8626)(19,689 - 8030)} = 300,791,874.
From a previous solution, the average payment per loss is 12,165.
Thus the variance of the per-loss variable is: 300,791,874 - 12,165² = 152,804,649.
The standard deviation of the per-loss variable is 12,361.
Comment: One could instead inflate the LogNormal to 2007 and work in 2007.
The 2007 LogNormal has parameters μ = 9.7 + ln[1.03^5] = 9.848, and σ = 0.8.
36.40. A. The second moment of the per-payment variable is:
(second moment of the per-loss variable) / S(d/(1+r)) = 300,791,874 / 0.7881 = 381,667,141.
From a previous solution, the average payment per payment is 15,435.
Thus the variance of the per-payment variable is: 381,667,141 - 15,435² = 143,427,916.
The standard deviation of the per-payment variable is 11,976.


36.41. B. In 2016 the ground up losses are Pareto with α = 2 and θ = (1+r)250.
The average payment per loss is: E[X] - E[X ∧ 100].
For a Pareto with α = 2: E[X] - E[X ∧ 100] = θ - θ{1 - θ/(θ + 100)} = θ²/(θ + 100).
In 2011, E[X] - E[X ∧ 100] = 250²/(250 + 100) = 178.57.
In 2016, E[X] - E[X ∧ 100] = {250(1+r)}²/{250(1+r) + 100}.
Setting this ratio equal to 1.26:
(1.26)(178.57) = {250(1+r)}²/{250(1+r) + 100}.
(225)(250)(1+r) + 22,500 = 62,500(1 + 2r + r²).
r² + 1.1r - 0.26 = 0. r = 0.2, taking the positive root of the quadratic equation.
In other words, there is a total of 20% inflation between 2011 and 2016.
Comment: The mean ground up loss increases by 20%, but the losses excess of the deductible
increase at a faster rate of 26%.
The average payment per loss in 2011: 250²/(250 + 100) = 178.57.
The average payment per loss in 2016: 300²/(300 + 100) = 225.00.
Their ratio is: 225.00/178.57 = 1.260.
The average payment per payment is: e(100) = (θ + 100)/(α - 1) = θ + 100.
In 2011, e(100) = 250 + 100 = 350. In 2016, e(100) = (1.2)(250) + 100 = 400.
Their ratio is: 400/350 = 1.143.
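A numerical sketch of the search for r (Python; the helper name is mine), using E[X] - E[X ∧ d] = θ²/(θ + d) for a Pareto with α = 2:

    def excess_per_loss_pareto2(theta, d):
        # E[X] - E[X ∧ d] = theta^2 / (theta + d) when alpha = 2
        return theta**2 / (theta + d)

    base = excess_per_loss_pareto2(250, 100)               # 178.57 in 2011
    for r in (0.15, 0.20, 0.25):
        ratio = excess_per_loss_pareto2(250 * (1 + r), 100) / base
        print(r, round(ratio, 3))                           # r = 0.20 gives the stated 1.26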
36.42. E. 1. False. FZ(Z) = FX(Z / 1.1) so that fZ(Z) = fX(Z / 1.1) / 1.1. 2. True. 3. True.
36.43. C. For the Burr distribution θ is transformed by inflation to (1+r)θ.
This follows from the fact that θ is the scale parameter for the Burr distribution.
The shape parameters α and γ remain the same.
From first principles, one makes the change of variables Z = (1+r)X. For the Distribution Function
one just sets FZ(z) = FX(x); one substitutes for x = z/(1+r).
FZ(z) = FX(x) = 1 - {1/(1 + (x/θ)^γ)}^α = 1 - {1/(1 + (z/((1+r)θ))^γ)}^α.
This is a Burr Distribution with parameters: α, (1+r)θ, and γ.


36.44. A. The mean of a Pareto is θ/(α - 1).
Therefore, θ = (α - 1)(mean) = (3 - 1)(25,000) = 50,000.
Prior to the impact of inflation: 1 - F(100000) = {50000/(50000 + 100000)}^3 = 1/3³ = 0.0370.
Under uniform inflation for the Pareto, θ is multiplied by 1.2 and α is unchanged.
Thus the new θ is (50000)(1.2) = 60000.
Thus after inflation: 1 - F(100000) = {θ/(θ + 100000)}^3 = (6/16)³ = 0.0527.
The increase is 0.0527 - 0.0370 = 0.0157.
36.45. A. Inflation has been 10% per year for 7 years. Thus the inflation factor is 1.1^7 = 1.949.
Under uniform inflation, the Pareto θ has increased to θ(1+r), while α remains the same.
Thus in 1992 the Pareto Distribution has parameters: 2 and (12500)(1.949) = 24,363.
For the Pareto E[X ∧ x] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)}. Thus in 1992,
E[X ∧ 200000] = {24363/(2-1)}{1 - (24363/(24363 + 200000))^(2-1)} = 21,717.
Alternately, the 200,000 limit in 1992 corresponds to a 200,000/1.949 = 102,617 limit in 1985.
In 1985, E[X ∧ 102,617] = {12500/(2-1)}{1 - (12500/(12500 + 102,617))^(2-1)} = 11,143.
In order to inflate to 1992, multiply by 1.949: (1.949)(11,143) = 21,718.
36.46. C. 1. False. For the Inverse Gaussian, both μ and θ are multiplied by 1+r.
2. False. θ becomes θ(1+r). (The Generalized Pareto acts like the Pareto under inflation. The scale
parameter is multiplied by the inflation factor.) 3. True.
36.47. B. Since b divides x everywhere that x appears in the density function, b is a scale
parameter. Therefore, under uniform inflation we get an Erlang Distribution with b multiplied by (1+r).
Alternately, one can substitute for x = z/(1+r).
For the density function fZ(z) = fX(x)/(1+r). (Recall that under a change of variables you need to
divide by dz/dx = 1+r, since dF/dy = (dF/dx)/(dy/dx).)
Thus f(z) = {z/((1+r)b)}^(c-1) e^(-z/((1+r)b)) / {(1+r) b (c-1)!}, which is an Erlang Distribution with b multiplied
by (1+r) and with c unchanged.
Comment: The Erlang Distribution is a special case of the Gamma Distribution, with c integral.
c ↔ α, and b ↔ θ.


36.48. D. The mean of a Burr distribution is: θΓ(1 + 1/γ)Γ(α - 1/γ)/Γ(α) =
θΓ(1 + 2)Γ(3 - 2)/Γ(3) = θΓ(3)Γ(1)/Γ(3) = θ. Under uniform inflation the mean increases from 10,000 to
(10,000)(1.44) = 14,400. After inflation, the chance that a claim exceeds $40,000 is: S(40000) =
{1/(1 + (40000/θ)^γ)}^α = {1/(1 + (40000/14400)^0.5)}^3 = 0.0527.
Alternately, one can compute the chance of exceeding 40000/1.44 = 27778 prior to inflation:
S(27778) = {1/(1 + (27778/10000)^0.5)}^3 = 0.0527.
36.49. B. Under uniform inflation for the LogNormal we get another LogNormal,
but μ becomes μ + ln(1+r) while σ stays the same.
Thus in this case μ' = 17.953 + ln(1.1) = 18.048, while σ remains 1.6028.
36.50. C. For the Exponential Distribution E[X ∧ x] = θ(1 - e^(-x/θ)). During 1992 the distribution is an
Exponential Distribution with θ = 1 and the average value of the capped losses is
E[X ∧ 1] = 1 - e^(-1) = 0.6321. During 1993 the distribution is an Exponential Distribution with
θ = 1.1. Thus in 1993, E[X ∧ 1] = 1.1{1 - e^(-1/1.1)} = 0.6568.
The increase in capped losses between 1993 and 1992 is 0.6568/0.6321 = 1.039.
Comments: The rate of inflation of 3.9% for the capped losses with a fixed limit is less than the
overall rate of inflation of 10%.


36.51. B. Prior to inflation in 1991, F(x) = 1 - x^-5, x > 1. After inflation in 1992,
F(x) = 1 - (x/1.1)^-5, x > 1.1. f(x) = 5(1.1^5)x^-6. LER(1.2) = E[X ∧ 1.2]/E[X].
E[X ∧ 1.2] = ∫[1.1,1.2] x f(x) dx + (1.2)S(1.2) = (1.1^5)(5/4){(1.1^-4) - (1.2^-4)} + (1.2)(1.2/1.1)^-5
= 0.404 + 0.777 = 1.181.
E[X] = ∫[1.1,∞] x f(x) dx = 5(1.1^5) ∫[1.1,∞] x^-5 dx = (1.1^5)(5/4)(1.1^-4) = 1.375.
LER(1.2) = E[X ∧ 1.2]/E[X] = 1.181/1.375 = 0.859.
Comment: Remember that under uniform inflation the domain of the Distribution Function also
changes; in 1992 x > 1.1. This is a Single Parameter Pareto with α = 5 and θ = 1.1.
E[X ∧ x] = θ[α - (x/θ)^(1-α)]/(α - 1). E[X ∧ 1.2] = 1.1[{5 - (1.2/1.1)^(1-5)}/(5 - 1)] = 1.181.
E[X] = θα/(α - 1) = 1.1(5/4) = 1.375.
LER(x) = 1 - (1/α)(x/θ)^(1-α). LER(1.2) = 1 - (1/5)(1.2/1.1)^-4 = 1 - (0.2)(0.7061) = 0.859.
Note that one could instead deflate the 1.2 deductible in 1992 to a 1.2/1.1 = 1.0909 deductible in
1991 and then work with the 1991 distribution function.
36.52. B. The distribution for the 1993 losses is an exponential distribution F(x) = 1 - e^(-x).
In order to convert into 1994 dollars, the parameter of 1 is multiplied by 1 plus the inflation rate of
5%; thus the revised parameter is 1.05. The capped losses, which are given by the Limited
Expected Value, are for the exponential: E[X ∧ x] = θ(1 - e^(-x/θ)). Thus in 1993 the losses capped at
1 ($ million) are E[X ∧ 1] = (1)(1 - e^(-1)) = 0.6321. In 1994 with θ = 1.05,
E[X ∧ 1] = (1.05)(1 - e^(-0.9524)) = 0.6449.
The increase in capped losses is: 0.6449/0.6321 = 1.019, or 1.9% inflation.
Alternately, rather than working with the 1994 distribution one can translate everything back to 1993
dollars and use the 1993 distribution. In 1993 dollars the 1994 limit of 1 is only 1/1.05 = 0.9524.
Thus the capped losses in 1994 are, in 1993 dollars, E[X ∧ 0.9524] = 1 - e^(-0.9524).
In 1994 dollars the 1994 capped losses are therefore 1.05 E[X ∧ 0.9524] = 0.6449.
The solution is therefore 0.6449/0.6321 = 1.019, or 1.9% inflation.


36.53. A. Statement 1 is false. In fact the coefficient of variation as well as the skewness are
dimensionless quantities which are unaffected by a change in scale and are therefore unchanged
under uniform inflation. Specifically in this case the new mean is the prior mean times (1 + r), the new
variance is the prior variance times (1+r)2 .
Therefore, the new coefficient of variation = new standard deviation / new mean =
(1+ r) prior standard deviation / (1 + r) prior mean = prior standard deviation / prior mean =
prior coefficient of variation.
Statement 3 is false. In fact, E[Z ∧ d(1+r)] = (1+r)E[X ∧ d]. The left hand side is the Limited
Expected Value in the later year, with a limit of d(1+r); we have adjusted d, the limit in the prior year,
in order to keep up with inflation via the factor 1+r. This yields the Limited Expected Value in the prior
year, except multiplied by the inflation factor to put it in terms of the subsequent year dollars, which is
the right hand side. For example, if the expected value limited to $1 million is $300,000 in the prior
year, then after uniform inflation of 10%, the expected value limited to $1.1 million is $330,000 in the
later year. In terms of the definition of the Limited Expected Value:
E[Z ∧ d(1+r)] = ∫[0,d(1+r)] z fZ(z) dz + SZ(d(1+r)) d(1+r) = ∫[0,d] (1+r)x fX(x) dx + SX(d) d(1+r) = (1+r)E[X ∧ d],
where we have applied the change of variables z = (1+r)x, and thus FZ(d(1+r)) = FX(d)
and fX(x) dx = fZ(z) dz.
Statement 2 is true. The mean residual life at d in the prior year is given by eX(d) =
{mean of X - E[X ∧ d]}/{1 - FX(d)}. Similarly, the mean residual life at d(1+r) in the later year is
given by eZ(d(1+r)) = {mean of Z - E[Z ∧ d(1+r)]}/{1 - FZ(d(1+r))} =
{(1+r)(mean of X) - (1+r)E[X ∧ d]}/{1 - FX(d)} = (1+r)eX(d). Thus the mean residual life in the later
year is multiplied by the inflation factor of (1+r), provided the limit has been adjusted to keep up
with inflation. For example, if the mean residual life beyond $1 million is $3 million in the prior year,
then after uniform inflation of 10%, the mean residual life beyond $1.1 million is $3.3 million in the
subsequent year.


36.54. B. Losses uniform on [0, 10000] in 1991 become
uniform on: [0, 1.05²(10000)] = [0, 11025] in 1993.
LER(500) = {∫[0,500] x f(x) dx + (1 - F(500))(500)} / ∫[0,11025] x f(x) dx.
We have f(x) = 1/11025 for 0 ≤ x ≤ 11025. F(500) = 500/11025 = 0.04535.
Thus, LER(500) = {(1/11025)(500²)/2 + (1 - 0.04535)(500)} / (11025/2) = 0.0886.
Alternately, the LER(500) in 1993 is the LER(500/1.1025) = LER(453.51) in 1991.
In 1991: E[X ∧ 453.51] = ∫[0,453.51] x/10000 dx + S(453.51)(453.51) =
10.28 + 432.96 = 443.24. Mean in 1991 = 10000/2 = 5000.
In 1991: LER(453.51) = E[X ∧ 453.51]/mean = 443.24/5000 = 0.0886.
36.55. C. F(x) = 1 - x^-3, x ≥ 1, in 1993 dollars.
A loss exceeding $2.2 million in 1994 dollars is equivalent to a loss exceeding
$2.2 million / 1.1 = $2 million in 1993 dollars.
The probability of the latter is: 1 - F(2) = 2^-3 = 1/8 = 0.125.
Alternately, the distribution function in 1994 dollars is: G(x) = 1 - (x/1.1)^-3, x ≥ 1.1.
Therefore, 1 - G(2.2) = (2.2/1.1)^-3 = 1/8 = 0.125.
Comment: Single Parameter Pareto Distribution.
36.56. D.
Probability   1993 Loss Amount   1993 Insurer Payment   1994 Loss Amount   1994 Insurer Payment
  0.1667            1000                    0                  1050                    0
  0.1667            2000                  500                  2100                  600
  0.1667            3000                 1500                  3150                 1650
  0.1667            4000                 2500                  4200                 2700
  0.1667            5000                 3500                  5250                 3750
  0.1667            6000                 4500                  6300                 4800
Average           3500.00              2083.33               3675.00                 2250
2250 / 2083 = 1.080, therefore the insurer's payments increased 8%.


Comment: Inflation on the losses excess of the deductible is greater than that of the ground up
losses.


36.57. E. The distribution for the 1993 losses is an exponential distribution F(x) = 1 - e^(-0.1x). In order
to convert into 1994 dollars, the parameter of 1/0.1 is multiplied by 1 plus the inflation rate of 10%;
thus the revised parameter is 1.1/0.1 = 1/0.0909. Thus the 1994 distribution function is
G(x) = 1 - e^(-0.0909x), where x is in 1994 dollars. The next step is to write down (in 1994 dollars) the
Truncated and Shifted distribution function for a deductible of d:
FP(x) = {G(x+d) - G(d)}/{1 - G(d)} = {e^(-0.0909d) - e^(-0.0909(x+d))}/e^(-0.0909d) = 1 - e^(-0.0909x).
FP(5) = 1 - e^(-(0.0909)(5)) = 0.3653.
Alternately, $5 in 1994 dollars corresponds to $5/1.1 = $4.545 in 1993 dollars.
In 1993, the Truncated and Shifted distribution function for a deductible of d is:
G(x) = {F(x+d) - F(d)}/{1 - F(d)} = {e^(-0.1d) - e^(-0.1(x+d))}/e^(-0.1d) = 1 - e^(-0.1x).
G(4.545) = 1 - e^(-0.1(4.545)) = 0.3653.
Comment: Involves two separate questions: how to adjust for the effects of inflation and how to
adjust for the effects of truncated and shifted data. Note that for the exponential distribution, after
truncating and shifting the new distribution function does not depend on the deductible amount d.
36.58. B. Under uniform inflation, the parameters of a LogNormal become:
μ' = μ + ln(1.1) = 10 + 0.09531 = 10.09531, σ' = σ = √5.
Using the formula for the limited expected value of the LogNormal: E[X ∧ $2,000,000] =
exp(10.09531 + 5/2)Φ[(ln(2,000,000) - 10.09531 - 5)/√5] +
(2,000,000){1 - Φ[(ln(2,000,000) - 10.09531)/√5]} = 295,171 Φ[-0.26] + (2,000,000)(1 - Φ[1.97]) =
(295,171)(0.3974) + (2,000,000)(0.0244) = $166 thousand.
Alternately, using the original LogNormal Distribution, the average payment per loss in 1994 is:
1.1 E1993[X ∧ 2 million / 1.1] = 1.1 E1993[X ∧ 1,818,182] =
1.1{exp(10 + 5/2)Φ[(ln(1,818,182) - 10 - 5)/√5] + (1,818,182){1 - Φ[(ln(1,818,182) - 10)/√5]}} =
1.1{268,337 Φ[-0.26] + (1,818,182)(1 - Φ[1.97])} =
(1.1){(268,337)(0.3974) + (1,818,182)(0.0244)} = $166 thousand.
36.59. E. One can put y = 1.05x, where y is the claim size in 1994 and x is the claim size in 1993.
Then let g(y) be the p.d.f. for y; g(y)dy = f(x)dx = f(y/1.05) dy/1.05.
g(y) = exp(-0.5[{(y/1.05) - 1000}/100]²) / {√(2π)(100)(1.05)} = exp(-0.5{(y - 1050)/105}²) / {√(2π)(105)}.
This is again a Normal Distribution with both μ and σ multiplied by the inflation factor of 1.05.
Comment: As is true in general, under uniform inflation, both the mean and the standard deviation
have been multiplied by the inflation factor of 1.05. Assuming you remember that a Normal
Distribution is reproduced under uniform inflation, you can use this general result to arrive at the
solution to this particular problem, since for the Normal, μ is the mean and σ is the standard deviation.


36.60. B. 0.95 = p0 = S(1) = e^(-1/θ). θ = -1/ln(0.95) = 19.5.
Y is also Exponential with twice the mean of X: (2)(19.5) = 39. fY(1) = e^(-1/39)/39 = 0.0250.
36.61. C. If the losses are uniformly distributed on [0, 2500] in 1994 then they are uniform on
[0, 2575] in 1995. (Each boundary is multiplied by the inflation factor of 1.03.)
LER(100) = {∫[0,100] x f(x) dx + (1 - F(100))(100)} / ∫[0,2575] x f(x) dx =
{∫[0,100] x(1/2575) dx + (1 - 100/2575)(100)} / ∫[0,2575] x(1/2575) dx =
{(1/2575)(100²)/2 + 100 - (1/2575)(100²)} / (2575/2) = 7.62%.
Alternately, $100 in 1995 is equivalent to 100/1.03 = $97.09 in 1994. In 1994:
LER(97.09) = {∫[0,97.09] x f(x) dx + (1 - F(97.09))(97.09)} / ∫[0,2500] x f(x) dx =
{∫[0,97.09] x(1/2500) dx + (1 - 97.09/2500)(97.09)} / ∫[0,2500] x(1/2500) dx =
{(1/2500)(97.09²)/2 + 97.09 - (1/2500)(97.09²)} / (2500/2) = 7.62%.


36.62. D. Under uniform inflation the parameters of the Pareto become 2 and 1000(1.1) = 1100.
The expected number of insurer payments is 10 losses per year times the percent of losses
greater than 100: 10{1 - F(100)} = 10{1100/(1100 + 100)}² = 8.40.
Alternately, after inflation the $100 deductible is equivalent to 100/1.1 = 90.91. For the original
Pareto with α = 2 and θ = 1000, 10{1 - F(90.91)} = 10{1000/1090.91}² = 8.40.


36.63. C. Under uniform inflation the scale parameter of the Pareto is multiplied by the inflation
factor, while the shape parameter remains the same. Therefore the size of loss distribution in 1995
has parameters: θ = (500)(1.05) = 525, α = 1.5.
F(x) = 1 - {525/(525 + x)}^1.5. The distribution function of the data truncated from below at 200 is:
G(x) = {F(x) - F(200)}/{1 - F(200)} =
{(525/(525 + 200))^1.5 - (525/(525 + x))^1.5} / (525/(525 + 200))^1.5 = 1 - {725/(525 + x)}^1.5.
At the median m of the distribution truncated from below, G(m) = 0.5.
Therefore, 1 - {725/(525 + m)}^1.5 = 0.5. {725/(525 + m)}^1.5 = 0.5.
Thus (525 + m)/725 = 2^(1/1.5) = 1.587. Solving, m = (725)(1.587) - 525 = 626.
36.64. B. After 20% uniform inflation, the parameters of the LogNormal are:
μ' = μ + ln(1+r) = 7 + ln(1.2) = 7.18, while σ is unchanged at 2.
F(2000) = Φ[{ln(2000) - 7.18}/2] = Φ[0.21] = 0.5832.
Thus the expected number of claims per year greater than 2000 is:
10{1 - F(2000)} = (10)(1 - 0.5832) = 4.17.
Alternately, one can deflate the deductible amount of 2000, which is then 2000/1.2 = 1667, and use
the original LogNormal Distribution.
The expected number of claims per year greater than 1667 in the original year is:
10(1 - F(1667)) = (10)(1 - Φ[{ln(1667) - 7}/2]) = (10)(1 - Φ[0.21]) = (10)(1 - 0.5832) = 4.17.
Comment: Prior to inflation, the expected number of claims per year greater than 2000 is:
10(1 - F(2000)) = (10)(1 - Φ[{ln(2000) - 7}/2]) = (10)(1 - Φ[0.30]) = 3.82.
36.65. E. (10 million / 0.8) + (9 million / 0.9) = 22.5 million.
Comment: Prior to working with observed losses, they are commonly brought to one common level
of inflation.
36.66. D. For the Pareto Distribution, LER(x) = E[X ∧ x]/E[X] = 1 - {θ/(θ + x)}^(α-1).
In the later year, losses have doubled, so the scale parameter of the Pareto has doubled, so θ = 2k,
rather than k.
For θ = 2k and α = 2, LER(x) = 1 - 2k/(2k + x) = x/(2k + x). Thus LER(2k) = 2k/4k = 1/2.


36.67. C. The behavior of the LogNormal under uniform inflation is explained by noting that
multiplying each claim by a factor of 1.1 is the same as adding a constant amount ln(1.1) to the log of
each claim. (For the LogNormal, the log of the sizes of the claims follow a Normal distribution.)
Adding a constant amount to a normal distribution, gives another normal distribution, with the same
variance but with the mean shifted. Thus under uniform inflation for the LogNormal, μ becomes
μ + ln(1.1). The parameter σ remains the same.
36.68. C. An inflation factor of 0.25 applied to a Weibull Distribution gives another Weibull with
scale parameter: (0.25)θ = (0.25)(625) = 156.25, while the shape parameter is unaffected.
Thus Z is a Weibull with parameters θ = 156.25 and τ = 0.5.
36.69. C. For the Exponential Distribution, under uniform inflation θ is multiplied by the inflation
factor. In this case, the inflation factor is 2, so the new theta is (1000)(2) = 2000.
Prior to inflation the percent of losses that exceed the deductible of 500 is:
e^(-500/1000) = e^(-0.5) = 0.6065.
After inflation the percent of losses that exceed the deductible of 500 is: e^(-500/2000) = e^(-0.25) = 0.7788.
Thus the number of losses that exceed the deductible increased by a factor of 0.7788/0.6065 =
1.284. Since there were 10 losses expected prior to inflation, there are (10)(1.284) = 12.8 claims
expected to exceed the 500 deductible after inflation.
Comment: One can also do this question by deflating the 500 deductible to 250. Prior to inflation,
S(250) = e^(-250/1000) = e^(-0.25) and S(500) = e^(-500/1000) = e^(-0.5). Thus if 10 claims are expected to
exceed 500, then there are a total of 10/e^(-0.5) claims. Thus the number of claims expected to exceed 250
is: (10e^0.5)(e^(-0.25)) = 10e^0.25 = 12.8.
36.70. D. Under uniform inflation, for the Burr, theta is multiplied by (1+r), thus theta becomes:
1.1√1000 = 34.8.
Comment: For a mixed distribution, under uniform inflation each of the individual distributions is
transformed just as it would be if an individual distribution. In this case, the Pareto has new
parameters α = 1 and θ = (1000)(1.1) = 1100, while the Burr has new parameters α = 1,
θ = (1.1²)(1000) = 1210, and γ = 2. The weights applied to the distributions remain the same.


36.71. B. The mean of the 1997 LogNormal is: exp((μ + ln k) + σ²/2).
F96[Mean97] = Φ[(ln(Mean97) - μ)/σ] = Φ[(μ + ln k + σ²/2 - μ)/σ] = Φ[(ln k + σ²/2)/σ].
Since we are given that in 1996 100p% of the losses exceed the mean of the losses in 1997,
1 - F96[Mean97] = p. Thus F96[Mean97] = 1 - p.
Thus Φ[(ln k + σ²/2)/σ] = 1 - p. Since the Normal Distribution is symmetric,
Φ[-(ln k + σ²/2)/σ] = p. Thus by the definition of zp, -(ln k + σ²/2)/σ = zp.
Therefore, σ²/2 + zpσ + ln k = 0. σ = -zp ± √(zp² - 2 ln k).
Comment: The 1997 distribution is the result of applying uniform inflation, with an inflation factor of k,
to the 1996 distribution. Thus the mean of the 1997 distribution is: k exp(μ + σ²/2), k times the
mean of the 1996 distribution. One could take for example p = 0.05, in which case zp = -1.645, and
then solve for σ in that particular case.
36.72. D. At 10% per year for three years, the inflation factor is 1.1³ = 1.331. Thus greater than
500 in 1998 corresponds to greater than 500/1.331 = 376 in 1995. At least 45 and at most 50
claims are less than 376 in 1995. Therefore, between 50% and 55% of the total of 100 claims are
greater than 376 in 1995. Therefore, between 50% and 55% of the total of 100 claims are greater
than 500 in 1998.
Comment: One could linearly interpolate that about 52% or 53% of the claims are greater than 500
in 1998.
36.73. D. Deflate the 15,000 deductible in the later year back to the prior year:
15,000/1.5 = 10,000. In the prior year, the average non-zero payment is:
(E[X] - E[X ∧ 10000])/S(10000) = (20,000 - 6000)/(1 - 0.6) = 14,000/0.4 = 35,000.
Inflating to the subsequent year: (1.5)(35,000) = 52,500.
Comment: If the limit keeps up with inflation, so does the mean residual life.
36.74. C. In 1999, one has a Pareto with parameters 2 and 1.06θ.
S1999(d) = {1.06θ/(1.06θ + d)}². S1998(d) = {θ/(θ + d)}².
r = S1999(d)/S1998(d) = 1.06²{(θ + d)/(1.06θ + d)}² = 1.1236{(1 + θ/d)/(1 + 1.06θ/d)}².
As d goes to infinity, r goes to 1.1236.
Comment: Alternately, S1999(d) = S1998(d/1.06) = {θ/(θ + d/1.06)}².


36.75. C. After 10% inflation, the survival function at 1000 is what it was originally at 1000/1.1 =
909.09. S(1000) = 1 - F(1000) = 1 - Φ[{ln(1000) - 6.9078}/1.5174] = 1 - Φ[0] = 0.5.
S(909.09) = 1 - Φ[{ln(909.09) - 6.9078}/1.5174] = 1 - Φ[-0.06] = 0.5239.
S(909.09)/S(1000) = 0.5239/0.5 = 1.048. An increase of 4.8%.
Comment: After inflation one has a LogNormal with μ = 6.9078 + ln(1.1), σ = 1.5174.
36.76. Assume for simplicity that the expected frequency is 5. One loss of each size.

Loss    Contribution to   Contribution to   Contribution to   Contribution to   Total
           Layer 0-50       Layer 50-100     Layer 100-200      Layer 200-∞
 40            40                 0                 0                 0            40
 80            50                30                 0                 0            80
120            50                50                20                 0           120
160            50                50                60                 0           160
200            50                50               100                 0           200
Total         240               180               180                 0           600

For the next year, increase each size of loss by 10%:

Loss    Contribution to   Contribution to   Contribution to   Contribution to   Total
           Layer 0-50       Layer 50-100     Layer 100-200      Layer 200-∞
 44            44                 0                 0                 0            44
 88            50                38                 0                 0            88
132            50                50                32                 0           132
176            50                50                76                 0           176
220            50                50               100                20           220
Total         244               188               208                20           660

Trend for layer from 0 to 50 is: 244/240 - 1 = 1.7%.


Trend for layer from 50 to 100 is: 188/180 - 1 = 4.4%.
Trend for layer from 100 to 200 is: 208/180 - 1 = 15.6%.
Comment: The limited losses in the layer from 0 to 50 increase slower than the overall rate of
inflation 10%, while the excess losses in the layer from 200 to ∞ increase faster. The losses in
middle layers, such as 50 to 100 and 100 to 200, can increase either slower or faster than the overall
rate of inflation, depending on the particulars of the situation.
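The two tables are easy to regenerate in code; a minimal sketch (Python; the helper name is mine):

    def layer(loss, low, high):
        # dollars of a single loss falling in the layer from low to high
        return min(max(loss - low, 0), high - low)

    losses = [40, 80, 120, 160, 200]
    for low, high in [(0, 50), (50, 100), (100, 200)]:
        before = sum(layer(x, low, high) for x in losses)
        after = sum(layer(1.1 * x, low, high) for x in losses)
        print(low, high, round(after / before - 1, 3))   # 0.017, 0.044, 0.156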
36.77. A. Y follows a Pareto Distribution with parameters: α = 2 and θ = (1.10)(100) = 110.
Thus eY(k) = (k + θ)/(α - 1) = k + 110. eX(k) = k + 100.
eY(k)/eX(k) = (k + 110)/(k + 100) = 1 + 10/(k + 100).
Therefore, as k goes from zero to infinity, eY(k)/eX(k) goes from 1.1 to 1.


36.78. a. The frequency of claims exceeding the franchise deductible is:
{1 - F(1000)}(0.05) = (1 - 0.09)(0.05) = 4.55%.
The average payment per payment for an ordinary deductible of 1000 is:
(E[X] - E[X ∧ 1000])/{1 - F(1000)} = (10,000 - 945)/(1 - 0.09) = 9950.55.
The average payment per payment for a franchise deductible of 1000 is 1000 more: $10,950.55.
b. After inflation, the average payment per payment for an ordinary deductible of 1000 is:
(1.1)(E[X] - E[X ∧ 909])/{1 - F(909)} = (1.1)(10,000 - 870)/(1 - 0.075) = 10,857.30.
The average payment per payment for a franchise deductible of 1000 is 1000 more: $11,857.30.
Subsequent to inflation, the frequency of payments is: (1 - 0.075)(0.05) = 4.625%,
and the pure premium is: (4.625%)($11,857.30) = $548.40.
Comment: Prior to inflation, the pure premium is: (4.55%)($10,950.55) = $498.25.


36.79. B. During 1994, there is a 100/1000 = 10% chance that nothing is paid.
If there is a non-zero payment, it is uniformly distributed on (0, 900).
Thus the mean amount paid is: (90%)(450) = 405.
The second moment of the amount paid is: (90%)(900²)/3 = 243,000.
Thus in 1994, the standard deviation of the amount paid is: √(243,000 - 405²) = 281.02.
In 1995, the losses are uniformly distributed from (0, 1050). During 1995, there is a 100/1050
chance that nothing is paid. If there is a non-zero payment it is uniformly distributed on (0, 950).
Thus the mean amount paid is: (950/1050)(950/2) = 429.76.
The second moment of the amount paid is: (950/1050)(950²)/3 = 272,182.5.
Thus in 1995, the standard deviation of the amount paid is: √(272,182.5 - 429.76²) = 295.78.
% increase in the standard deviation of amount paid is: 295.78/281.02 - 1 = 5.25%.
Alternately, the variance of the average payment per loss under a maximum covered loss of u and
a deductible of d is: E[(X ∧ u)²] - E[(X ∧ d)²] - 2d{E[X ∧ u] - E[X ∧ d]} - {E[X ∧ u] - E[X ∧ d]}².
With no maximum covered loss (u = ∞), this is:
E[X²] - E[(X ∧ d)²] - 2d{E[X] - E[X ∧ d]} - {E[X] - E[X ∧ d]}².
For the uniform distribution on (a, b), the limited moments are, for a ≤ x ≤ b:
E[(X ∧ x)^n] = ∫[a,x] y^n/(b-a) dy + x^n S(x) = (x^(n+1) - a^(n+1))/{(n+1)(b-a)} + x^n(b-x)/(b-a) =
{(n+1)x^n b - a^(n+1) - n x^(n+1)}/{(n+1)(b-a)}.
In 1994, a = 0, b = 1000, and d = 100. E[X ∧ 100] = {(2)(100)(1000) - 100²}/2000 = 95.
E[(X ∧ 100)²] = {(3)(100²)(1000) - (2)100³}/3000 = 9333.33. The variance of the average
payment per loss is: 1000²/3 - 9333.33 - (2)(100)(500 - 95) - (500 - 95)² = 78,975.
Similarly, in 1995, E[X ∧ 100] = {(2)(100)(1050) - 100²}/{(2)(1050)} = 95.238.
E[(X ∧ 100)²] = {(3)(100²)(1050) - (2)100³}/{(3)(1050)} = 9365.08.
In 1995, the variance of the average payment per loss is:
1050²/3 - 9365.08 - (2)(100)(525 - 95.238) - (525 - 95.238)² = 87,487.
% increase in the standard deviation of amount paid is: √(87,487/78,975) - 1 = 5.25%.
Comment: The second moment of the uniform distribution on (a, b) is: (b³ - a³)/{3(b-a)}.
When a = 0, this is b²/3. The amount paid is a mixture of two distributions, one always zero and the
other uniform. For example, in 1994, the amount paid is a 10%-90% mixture of a distribution that is
always zero and a uniform distribution on (0, 900). The second moments of these distributions are
zero and 900²/3 = 270,000. Thus the second moment of the amount paid is:
(10%)(0) + (90%)(270,000) = 243,000.
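A short numerical sketch of the first approach (Python; the function name is mine), treating the per-loss payment as a mixture of a point mass at zero and a uniform distribution:

    def payment_moments_uniform(b, d):
        # per-loss payment (X - d)+ when X is uniform on (0, b) and d is the deductible
        p = (b - d) / b                    # probability of a non-zero payment
        mean = p * (b - d) / 2
        second = p * (b - d)**2 / 3
        return mean, second - mean**2

    m94, v94 = payment_moments_uniform(1000, 100)
    m95, v95 = payment_moments_uniform(1050, 100)
    print((v95 / v94)**0.5 - 1)            # about 0.0525, a 5.25% increase in the standard deviation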


36.80. C. Prior to inflation, 0.250 = F(1000) = 1 - e^(-1000/θ). θ = -1000/ln(0.75) = 3476.
After inflation, θ = (2)(3476) = 6952. F(1000) = 1 - e^(-1000/6952) = 0.134.
36.81. D. E[X ∧ 10] = -0.025(10²) + (1.475)(10) - 2.25 = 10.
E[X ∧ 11] = -0.025(11²) + (1.475)(11) - 2.25 = 10.95.
E[X ∧ 20] = -0.025(20²) + (1.475)(20) - 2.25 = 17.25.
E[X ∧ 22] = -0.025(22²) + (1.475)(22) - 2.25 = 18.10.
In 2000, there is a deductible of 11 and a maximum covered loss of 22, so the expected payments
are: E[X ∧ 22] - E[X ∧ 11] = 18.10 - 10.95 = 7.15.
A deductible of 11 and maximum covered loss of 22 in the year 2001, when deflated back to the
year 2000, correspond to a deductible of 11/1.1 = 10 and a maximum covered loss of 22/1.1 = 20.
Therefore, reinflating back to the year 2001, the expected payments in the year 2001 are:
(1.1)(E[X ∧ 20] - E[X ∧ 10]) = (1.1)(17.25 - 10) = 7.975.
The ratio of expected payments in 2001 over the expected payments in the year 2000 is:
7.975/7.15 = 1.115.
Alternately, the insurer's average payment per loss is: (1+r) c (E[X ∧ L/(1+r)] - E[X ∧ d/(1+r)]).
c = 100%, L = 22, d = 11. r = 0.1 for the year 2001 and r = 0 for the year 2000.
Then proceed as previously.
36.82. C. The expected frequency is: 10,000,000/2000 = 5000.
E[X ∧ x] = {θ/(α-1)}{1 - (θ/(x+θ))^(α-1)} = (2000){1 - 2000/(x + 2000)} = 2000x/(x + 2000).
E[X] - E[X ∧ 3000] = 2000 - 1200 = 800.
Expected losses paid by reinsurer: (5000)(800) = 4 million.
The ceded premium is: (1.1)(4 million) = 4.4 million.
Alternately, the Excess Ratio, R(x) = 1 - E[X ∧ x]/E[X]. For the Pareto, E[X] = θ/(α-1) and
E[X ∧ x] = {θ/(α-1)}{1 - (θ/(x+θ))^(α-1)}. Therefore R(x) = {θ/(x+θ)}^(α-1).
In this case, θ = 2000 and α = 2, so R(x) = 2000/(x+2000). R(3000) = 40%.
The expected excess losses are: (40%)(10,000,000) = 4 million.
The ceded premium is: (1.1)(4 million) = 4.4 million.


36.83. E. In 2001 the Pareto has θ = 2000 and α = 2, so in 2002 the Pareto has parameters
θ = (1.05)(2000) = 2100 and α = 2. In 2002, R(3000) = 2100/(3000 + 2100) = 41.18%.
The expected losses in 2002 are: (1.05)(10 million) = 10.5 million.
The expected excess losses in 2002 are: (41.18%)(10.5 million) = 4.324 million.
The ceded premium in 2002 is: (1.1)(4.324 million) = 4.756 million.
C2002/C2001 = 4.756/4.4 = 1.08.
Alternately, the excess ratio at 3000 in 2002 is the same as the excess ratio at 3000/1.05 = 2857 in
2001. In 2001, R(2857) = 2000/(2857 + 2000) = 41.18%. Proceed as before.
Alternately, the average payment per loss is: (1+r) c (E[X ∧ L/(1+r)] - E[X ∧ d/(1+r)]).
In 2001 this is: E[X] - E[X ∧ 3000]. In 2002 this is: (1.05)(E[X] - E[X ∧ 3000/1.05]).
C2002/C2001 = (avg. payment per loss in 2002)/(average payment per loss in 2001) =
(1.05)(E[X] - E[X ∧ 2857])/(E[X] - E[X ∧ 3000]) = (1.05)(2000 - 1176)/(2000 - 1200) = 1.08.
Comment: Over a fixed limit the excess losses increase more quickly than the overall inflation rate of
5%.
36.84. D. After severity increases by 50%:
Probability   Severity   Payment with 100 deductible
   0.25           60                 0
   0.25          120                20
   0.25          180                80
   0.25          300               200
Average payment per loss: (0 + 20 + 80 + 200)/4 = 75.
Expected total payment = (300)(75) = 22,500.
Comment: Expect 300 payments: 75 @ 0, 75 @ 20, 75 @ 80, and 75 @ 200, for a total of 22,500.
36.85. B. Prior to inflation, S(100) = e^(-100/200) = 0.6065.
After inflation, severity is Exponential with θ = (1.2)(200) = 240.
S(100) = e^(-100/240) = 0.6592.
Percentage increase in the number of reported claims next year:
(1.1)(0.6592/0.6065) - 1 = 19.6%.
36.86. D. After inflation, severity is Pareto with θ = (1.2)(1500) = 1800, and α = 4.
Expected payment per loss: E[X] - E[X ∧ 100] = {θ/(α-1)} - {θ/(α-1)}{1 - (θ/(θ+100))^(α-1)}
= {1800/(4 - 1)}{1800/(1800 + 100)}^(4-1) = 510.16.
Alternately, the average payment per loss in the later year is:
(1+r) c (E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)]) = (1.2)(1)(E[X] - E[X ∧ 100/1.2]) =
1.2{500 - 500(1 - (1500/1583.33)³)} = 510.16.


36.87. A. 1. True.
2. False. Should be α and θ(1+i). In any case, α is a shape not a scale parameter.
3. False. Should be θ(1+i); the scale parameter is multiplied by one plus the rate of inflation.
36.88. B. First year: E[X ∧ 5000] = {2000/(2-1)}{1 - (2000/(2000 + 5000))^(2-1)} = 1429.
Next year, d = 100, u = 5000, c = 80% and r = 4%, and the average payment per loss is:
(1.04)(80%){E[X ∧ 5000/1.04] - E[X ∧ 100/1.04]} = 0.832{E[X ∧ 4807.7] - E[X ∧ 96.15]} =
(0.832)(2000){2000/(2000 + 96.15) - 2000/(2000 + 4807.7)} = 1099.
Reduction is: 1 - 1099/1429 = 23.1%.
36.89. B. In 2004 losses follow a Pareto with α = 2 and θ = (1.2)(5) = 6.
E[X] = 6/(2 - 1) = 6. E[X ∧ 10] = {θ/(α-1)}{1 - (θ/(θ+x))^(α-1)} = (6){1 - 6/(6 + 10)} = 3.75.
LER(10) = E[X ∧ 10]/E[X] = 3.75/6 = 0.625 = 5/8.

36.90. D. Sthis year(750000) = {400000/(400000 + 750000)}² = 0.12098.
After inflation, the losses follow a Pareto distribution with α = 2 and θ = (1.1)(400,000) = 440,000.
Snext year(750000) = {440000/(440000 + 750000)}² = 0.13671.
Snext year(750000)/Sthis year(750000) = 0.13671/0.12098 = 1.130.
Alternately, one can calculate the survival function at 750,000 next year by deflating the 750,000 to
this year: 750000/1.1 = 681,818. Snext year(750000) = Sthis year(681,818) =
{400000/(400000 + 681,818)}² = 0.13671. Proceed as before.
36.91. B. This is a Pareto Distribution with α = 3 and θ = 800.
In 2006, losses follow a Pareto Distribution with α = 3 and θ = (1.08)(800) = 864.
With a franchise deductible, the total amount of those losses of size 300 or less is eliminated, while
the full amount of all losses of size greater than 300 is paid.
Losses eliminated = ∫[0,300] x f(x) dx = E[X ∧ 300] - 300 S(300) =
(864/2){1 - (864/(300 + 864))²} - (300){864/(300 + 864)}³ = 71.30.
Loss elimination ratio = Losses eliminated / mean = 71.30/(864/2) = 16.5%.
Alternately, the expected payment per loss with an ordinary deductible would be:
E[X] - E[X ∧ 300] = (864/2) - (864/2){1 - (864/(300 + 864))²} = 238.02.
With the franchise deductible one pays 300 more on each loss of size exceeding 300 than under
the ordinary deductible: 238.02 + 300 S(300) = 238.02 + (300){864/(300 + 864)}³ = 360.71.
E[X] = 864/2 = 432. Loss Elimination Ratio is: 1 - 360.71/432 = 16.5%.

36.92. D. For a Pareto Distribution, E[X] - E[X ∧ d] = θ/(α-1) - {θ/(α-1)}{1 - (θ/(d + θ))^(α-1)} =
θ^α/{(α-1)(d + θ)^(α-1)}.
The premium in each year is: 1.2(E[X] - E[X ∧ 600]) = 1.2θ^α/{(α-1)(600 + θ)^(α-1)}.
If the reinsurance covers the layer excess of d in ground up loss, then the reinsurance premium is:
1.1(E[X] - E[X ∧ d]) = 1.1θ^α/{(α-1)(d + θ)^(α-1)}.
In 2005, the Pareto has α = 2 and θ = 3000.
R2005 = 0.55 P2005. (1.1)3000²/(d + 3000) = (0.55)(1.2)3000²/3600.
(0.66)(d + 3000) = (1.1)(3600). d = 3000.
In 2006, the losses follow a Pareto Distribution with α = 2 and θ = (3000)(1.2) = 3600.
P2006 = (1.2)3600²/4200. R2006 = (1.1)3600²/6600.
R2006/P2006 = (1.1)(4200)/{(1.2)(6600)} = 0.583.
Comment: The higher layer increases more due to inflation, and therefore the ratio of R/P has to
increase, thereby eliminating choices A, B, and C.
One could have instead described the reinsurance as covering the layer excess of 600 + d in
ground up loss, in which case d = 2400 and one obtains the same final answer.
P2005 = 3000. R2005 = 1650. P2006 = 3703. R2006 = 2160.
36.93. D. The losses in dollars are all 30% bigger.
For example, 10,000 euros ↔ 13,000 dollars.
This is mathematically the same as 30% uniform inflation.
We get another LogNormal with σ = 2 the same, and μ' = μ + ln(1+r) = 8 + ln(1.3) = 8.26.
Comment: The mean in dollars should be 1.3 times the mean in euros. The mean in euros is:
exp[8 + 2²/2] = 22,026. For the choices we get means of: exp[6.15 + 2.26²/2] = 6026,
exp[7.74 + 2²/2] = 16,984, exp[8 + 2.6²/2] = 87,553, exp[8.26 + 2²/2] = 28,567, and
exp[10.4 + 2.6²/2] = 965,113. Eliminating all but choice D.
36.94. A. In 2005, S(250) = (1000/1250)³ = 0.512.
In 2006, the Pareto distribution has α = 3 and θ = (1.1)(1000) = 1100.
In 2006, S(250) = (1100/1350)³ = 0.54097.
Increase in expected number of claims: (14)(0.54097 - 0.512) = 0.406 claims.
Alternately, deflate 250 in 2006 back to 2005: 250/1.1 = 227.27.
In 2005, S(227.27) = (1000/1227.27)³ = 0.54097. Proceed as before.
Comment: We make no use of the fact that frequency is Poisson.


36.95. D. In 4 years, severity is Pareto with parameters α = 4 and θ = (1.06^4)(3000) = 3787.
Under Policy R, the expected cost per loss is the mean: 3787/(4 - 1) = 1262.
Under Policy S, the expected cost per loss is: E[X ∧ 3000] - E[X ∧ 500] =
{θ/(α-1)}{(θ/(θ+500))^(α-1) - (θ/(θ+3000))^(α-1)} = (1262){(3787/4287)³ - (3787/6787)³} = 651.
Difference is: 1262 - 651 = 611.
Alternately, under Policy S, the expected cost per loss, using the Pareto for year t, is:
(1.06^4){E[X ∧ 3000/1.06^4] - E[X ∧ 500/1.06^4]} =
(1.2625){E[X ∧ 2376] - E[X ∧ 396]} =
(1.2625)[{3000/(4-1)}{1 - (3000/5376)^(4-1)} - {3000/(4-1)}{1 - (3000/3396)^(4-1)}] =
(1.2625){826 - 311} = 650.
Difference is: 1262 - 650 = 612.
Comment: This exam question should have said: Policy S has a deductible of $500 and a
maximum covered loss of $3,000.
36.96. Compare the contributions to the layer from 100,000 to 200,000 before and after inflation:

Claim      Loss     Contribution to Layer   Inflated Loss   Contribution to Layer
  A       35,000              0                 37,800                 0
  B      125,000         25,000                135,000            35,000
  C      180,000         80,000                194,400            94,400
  D      206,000        100,000                222,480           100,000
  E       97,000              0                104,760             4,760
Total                   205,000                                  234,160

234,160/205,000 = 1.142. 14.2% effective trend on this layer.


Section 37, Lee Diagrams


A number of important loss distribution concepts can be displayed graphically.
Important concepts will be illustrated using this graphical approach of Lee.276
While this material is not on your exam, graphically oriented students may benefit from looking at this
material. You may find that it helps you to remember formulas that are used on your exam.
Below is shown a conventional graph of a Pareto Distribution with α = 4 and θ = 2400:

Exercise: For a Pareto Distribution with α = 4 and θ = 2400, what is F(1000)?
[Solution: F(x) = 1 - {θ/(θ + x)}^α. F(1000) = 1 - (2400/3400)⁴ = 0.752.]
In the conventional graph, the x-axis corresponds to size of loss, while the y-axis corresponds to
probability. Thus for example, the above graph of a Pareto includes the point (1000, 0.752).
In contrast, Lee Diagrams have the x-axis correspond to probability, while the y-axis
corresponds to size of loss.
276

The Mathematics of Excess of Loss Coverage and Retrospective Rating --- A Graphical Approach, by
Y.S. Lee, PCAS LXXV, 1988. Currently on the syllabus of CAS Exam 9. Lee cites A Practical Guide to the Single
Parameter Pareto Distribution, by Steven Philbrick, PCAS 1985. Philbrick in turn points out the similarity to the
treatment of Insurance Charges and Savings in Fundamentals of Individual Risk Rating and Related Topics, by
Richard Snader.


Here is the Lee Diagram of a Pareto Distribution with α = 4 and θ = 2400:

For example, since F(1000) = .752, the point (.752, 1000) is on the curve. Note the way that the
probability approaches a vertical asymptote of unity as the claim size increases towards infinity.277
Advantages of this representation of Loss Distributions include the intuitively appealing features:278
1. The mean is the area under the curve.279
2. A loss limit is represented by a horizontal line, and excess losses lie above the line.
3. Losses eliminated by a deductible lie below the horizontal line represented by the deductible
amount.
4. After the application of a trend factor, the new loss distribution function lies above the prior
distribution.

277 F(x) → 1, as x → ∞.
278 A Practical Guide to the Single Parameter Pareto Distribution, by Steven Philbrick, PCAS 1985.
279 As discussed below. In a conventional graph, the area under a distribution function is infinite, the area under the
survival function is the mean, and the area under the density is one.


Means:
One can compute the mean of this size of loss distribution as the integral from 0 to ∞
of y f(y)dy. As shown below, this is represented by summing narrow vertical strips, each of width
f(y)dy and each of height y, the size of loss.280
[Lee Diagram of the Pareto: size of loss on the vertical axis (up to 5000), probability F(y) on the horizontal axis (0 to 1), with a typical vertical strip of width f(y)dy.]

Summing over all y, would give the area under the curve. Thus the mean is the area under the
curve.281

280 y = size of loss. x = F(y) = Prob[Y ≤ y]. f(y) = dF/dy = dx/dy. dx = (dx/dy)dy = f(y)dy.
The width of each vertical strip is dx = f(y)dy.
281 In this case, the mean = θ/(α-1) = 2400/3 = 800.


Alternately as shown below, one can compute the area under the curve by summing narrow
horizontal strips, each of height dy and each of width 1 - F(y) = S(y), the survival function.282
Summing over all y, would give the area under the curve.
[Lee Diagram of the Pareto: the same curve, now with a typical horizontal strip of height dy and width S(y).]

Two integrals can be gotten from each other via integration by parts:
∫[0,∞] S(y) dy = ∫[0,∞] y f(y) dy = mean claim size.
Thus the area under the curve can be computed in either of two ways. The mean is either the integral
of the survival function, via summing horizontal strips, or the integral of y f(y), via summing vertical
strips .
In the Pareto example:
∫[0,∞] {2400/(y + 2400)}⁴ dy = 800 = ∫[0,∞] y (4)(2400⁴)(y + 2400)^-5 dy = mean claim size.

282 Each horizontal strip goes from x = F(y) to x = 1, and thus has width 1 - F(y) = S(y).


Mean vs. Median:


The Lee Diagram below, illustrates why for loss distributions skewed to the right, i.e., with positive
coefficient of skewness, the mean is usually greater than the median.283

The area under the curve is the mean; since the width is one, the average height of the curve is the
mean. On the other hand, the median is the height at which the curve reaches a probability of one
half, which in this case is less than the average height of the curve. The diagram would be similar for
most distributions significantly skewed to the right, and the median is less than the mean.

283

For distributions with skewness close to zero, the mean is usually close to the median. (For symmetric
distributions, the mean equals the median.) Almost all loss distributions encountered in practical applications by
casualty actuaries have substantial positive skewness, with the median significantly less than the mean.


Limited Expected Value:


A loss of size less than 1000 contributes its size to the Limited Expected Value at 1000,
E[X 1000]. A loss of size greater than or equal to 1000 contributes 1000 to E[X 1000]. These
contributions to E[X 1000] correspond to vertical strips:
[Lee Diagram: the Pareto curve with the horizontal line y = 1000; a loss of size 500 contributes a vertical strip of height 500 and width f(500)dy, while losses of size 1000 or more contribute strips of height 1000.]

The contribution to E[X ∧ 1000] of the small losses, as a sum of vertical strips, is the integral from 0 to
1000 of y f(y) dy. The contribution to E[X ∧ 1000] of the large losses, is the area of the rectangle of
height 1000 and width S(1000): 1000 S(1000).
These 2 pieces correspond to the 2 terms: E[X ∧ 1000] = ∫[0,1000] y f(y) dy + 1000 S(1000).


Adding up these two types of contributions, E[X ∧ 1000] corresponds to the area under both the
distribution curve and the horizontal line y = 1000:

In general, E[X ∧ L] ↔ the area below the curve and also below the horizontal line at L.
Summing horizontal strips, the Limited Expected Value also is equal to the integral of the survival
function from zero to the limit:
[Lee Diagram: the Pareto curve with the horizontal line y = 1000; the area below both, built up from horizontal strips of height dy and width S(y).]
E[X ∧ 1000] = ∫[0,1000] S(t) dt.


Losses Eliminated:
A loss of size less than 1000 is totally eliminated by a deductible of 1000. For a loss of size greater
than or equal to 1000, 1000 is eliminated by a deductible of size 1000. Summing these
contributions as vertical strips, as shown below the losses eliminated by a 1000 deductible
correspond to the area under both the curve and y = 1000.

Losses Eliminated by 1000 Deductible ↔ area under both the curve and y = 1000 ↔ E[X ∧ 1000].
In general, the losses eliminated by a deductible d ↔ the area below the curve and also
below the horizontal line at d.
Loss Elimination Ratio (LER) = losses eliminated / total losses.
The Loss Elimination Ratio is represented by the ratio of the area under both the curve and
y = 1000 to the total area under the curve.284 One can either calculate the area corresponding to the
losses eliminated by summing of horizontal strips of width S(t) or vertical strips of height t limited to
x. Therefore: LER(x) = ∫[0,x] S(t) dt / E[X] = {∫[0,x] t f(t) dt + x S(x)} / E[X] = E[X ∧ x] / E[X].
284 In the case of a Pareto with α = 4 and θ = 2400, the Loss Elimination Ratio at 1000 is 518.62 / 800 = 64.8%.
In the case of a Pareto with = 4 and = 2400, the Loss Elimination Ratio at 1000 is 518.62 / 800 = 64.8% .


Excess Losses:
A loss of size y > 1000, contributes y - 1000 to the losses excess of a 1000 deductible. Losses of
size less than or equal to 1000 contribute nothing. Summing these contributions, as shown below
the losses excess of a 1000 deductible corresponds to the area under the distribution curve but
above y = 1000.

This area under the distribution curve but above y = 1000, are the losses excess of 1000,
E[(X - 1000)+], the numerator of the Excess Ratio.285 The denominator of the Excess Ratio is the
mean, the whole area under the distribution curve. Thus the Excess Ratio, R(1000) =
E[(X - 1000)+] / E[X], corresponds to the ratio of the area under the curve but above y = 1000, to
the total area under the curve.286
Since the Losses Eliminated by a 1000 deductible and the Losses Excess of 1000 sum to the total
losses, E[(X - 1000)+] = Losses Excess of 1000 = total losses - losses limited to 1000 ↔
E[X] - E[X ∧ 1000]. LER(1000) + R(1000) = 1.
285 In the case of a Pareto with α = 4 and θ = 2400, the area under the curve and above the line y = 1000 is:
θ^α (θ + y)^(1-α)/(α - 1) = 2400⁴/{(3400³)(3)} = 281.38.
286 In the case of a Pareto with α = 4 and θ = 2400, the Excess Ratio at 1000 is: 281.38 / 800 = 35.2% =
(2400/3400)³ = {θ/(θ + x)}^(α-1). R(1000) = 1 - LER(1000) = 1 - 0.648 = 0.352.

E[(X - 1000)+] = ∫[1000,∞] S(t) dt = ∫[1000,∞] (t - 1000) f(t) dt.
The first integral ↔ summing of horizontal strips.
The second integral ↔ summing of vertical strips.
E[(X - 1000)+] ↔ area under the distribution curve but above y = 1000.
In general, the losses excess of u ↔ the area below the curve and also above the
horizontal line at u.
As was done before, one can divide the limited losses into two pieces, Area A and Area B.
Area A = ∫_0^1000 t f(t) dt = 271.
Area B = (1000) S(1000) = (1000)(0.248) = 248.
Area A + Area B = E[X ∧ 1000] = 519.
Area C = E[X] - E[X ∧ 1000] = 800 - 519 = 281.
Excess Ratio at 1000 = C / (A + B + C) = 281/800 = 35%.
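The two kinds of strips can be checked by numerical integration; a minimal sketch (scipy is assumed to be available, and the names are mine):

# Areas A, B, and C of the Lee diagram for the Pareto (alpha = 4, theta = 2400).
from scipy.integrate import quad

alpha, theta = 4.0, 2400.0
density  = lambda t: alpha * theta**alpha / (theta + t) ** (alpha + 1.0)
survival = lambda t: (theta / (theta + t)) ** alpha

area_A, _ = quad(lambda t: t * density(t), 0, 1000)      # vertical strips, ~270.3
area_B    = 1000 * survival(1000)                        # rectangle, ~248.3
area_C, _ = quad(survival, 1000, float("inf"))           # excess losses, ~281.4

print(area_A + area_B)                          # E[X ^ 1000] ~ 518.6
print(area_C / (area_A + area_B + area_C))      # excess ratio at 1000 ~ 0.352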

Mean Excess Loss (Mean Residual Life):
The Mean Excess Loss can be written as the excess losses divided by the survival function:
e(x) = ∫_x^∞ S(t) dt / S(x) = ∫_x^∞ (t - x) f(t) dt / S(x).
For example, the losses excess of a 1000 deductible corresponds to the area under the distribution
curve but above y = 1000:
[Lee diagram: the area labeled "Losses Excess of 1000" lies above the horizontal line at 1000 and below the curve; its width along the horizontal axis is S(1000) = 0.248. Size of loss up to 5000 on the vertical axis, probability from 0.2 to 1.0 on the horizontal axis]

This area under the distribution curve but above y = 1000 is the numerator of the Mean Excess Loss.287 The first integral above corresponds to the summing of horizontal strips. The second integral above corresponds to the summing of vertical strips. The denominator of the Mean Excess Loss is the survival function S(1000) = 0.248, which is the width along the horizontal axis of the area corresponding to the numerator.
Thus the Mean Excess Loss, e(1000), corresponds to the average height of the area under the curve but above y = 1000.288 For example, in this case that average height is e(1000) = 1133.
However, since the curve extends vertically to infinity, it is difficult to use this type of diagram to read off the mean excess loss, particularly for heavy-tailed distributions such as the Pareto.
287 The numerator of the Mean Excess Loss is the same as the numerator of the Excess Ratio. In the case of a Pareto with α = 4 and θ = 2400, the area under the curve and above the line y = 1000 is: θ^α (θ + y)^(1-α) / (α - 1) = 2400^4 / {(3)(3400^3)} = 281.38.
288 In the case of a Pareto with α = 4 and θ = 2400, the Mean Excess Loss at 1000 is 281.38 / 0.2483 = 1133, since S(x) = {θ/(θ+x)}^α = (2400/3400)^4 = 0.2483. Alternately, for the Pareto e(x) = (θ+x)/(α-1) = 3400/3 = 1133.
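A quick numerical check of e(1000) in the same sketch style (names mine):

# Mean excess loss at 1000 for the Pareto (alpha = 4, theta = 2400).
from scipy.integrate import quad

alpha, theta = 4.0, 2400.0
survival = lambda t: (theta / (theta + t)) ** alpha

excess, _ = quad(survival, 1000, float("inf"))   # ~281.4
print(excess / survival(1000))                   # e(1000) ~ 1133
print((theta + 1000) / (alpha - 1))              # closed form: 3400/3 ~ 1133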

Layers of Loss:
Below is shown the effect of imposing a 3000 maximum covered loss:
Amount paid with a 3000 maximum covered loss ↔ the losses censored from above at 3000 ↔ E[X ∧ 3000] ↔ the area under both the curve and the horizontal line y = 3000.
The Layer of Loss between 1000 and 3000 would be those dollars paid in the presence of both a 1000 deductible and a 3000 maximum covered loss.
As shown below, the layer of losses from 1000 to 3000 is the area under the curve but between the horizontal lines y = 1000 and y = 3000 ↔ Area B.

Area A ↔ Losses excess of 3000 ↔ Losses not paid due to 3000 maximum covered loss.
Area C ↔ Losses limited to 1000 ↔ Losses not paid due to 1000 deductible.
Area B ↔ Layer from 1000 to 3000 ↔ Losses paid with 1000 deductible and 3000 maximum covered loss.
Summing horizontal strips, Area B can be thought of as the integral of the survival function from 1000 to 3000:
Layer of losses from 1000 to 3000 = ∫_1000^3000 S(t) dt = E[X ∧ 3000] - E[X ∧ 1000].

This area is also equal to the difference of two limited expected values: the area below the curve and y = 3000, E[X ∧ 3000], minus the area below the curve and y = 1000, E[X ∧ 1000].
In general, the layer from d to u ↔ the area under the curve but also between the horizontal line at d and the horizontal line at u.
Summing vertical strips, this same area can also be expressed as the sum of an integral and the area of a rectangle:
[Lee diagram: vertical strips of height t - 1000 and width f(t) dt, together with a rectangle, making up the layer from 1000 to 3000; size of loss up to 5000 on the vertical axis, probability from 0.2 to 1.0 on the horizontal axis]

Layer of losses from 1000 to 3000 = ∫_1000^3000 (t - 1000) f(t) dt + (3000 - 1000) S(3000).

For the Pareto example, losses excess of 3000 = E[X] - E[X ∧ 3000] =
2400/3 - (2400/3){1 - (2400/(2400 + 3000))^3} = 800 - 729.767 = 70.233.
Losses limited to 1000 = E[X ∧ 1000] = (2400/3){1 - (2400/(2400 + 1000))^3} = 518.624.
Layer from 1000 to 3000 = E[X ∧ 3000] - E[X ∧ 1000] = 729.767 - 518.624 = 211.143.
(Losses limited to 1000) + (Layer from 1000 to 3000) + (Losses excess of 3000) =
518.624 + 211.143 + 70.233 = 800 = Mean total losses.
Alternately, the layer from 1000 to 3000 = ∫_1000^3000 (t - 1000) f(t) dt + (3000 - 1000) S(3000) =
∫_1000^3000 (t - 1000) (4) 2400^4 / (t + 2400)^5 dt + (2000) {2400/(3000 + 2400)}^4 = 133.106 + 78.037 = 211.143.
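Both decompositions of the layer can be verified numerically; a minimal sketch (the Pareto parameters are from the text, the code organization is mine):

# Layer from 1000 to 3000: horizontal strips versus vertical strips plus a rectangle.
from scipy.integrate import quad

alpha, theta = 4.0, 2400.0
density  = lambda t: alpha * theta**alpha / (theta + t) ** (alpha + 1.0)
survival = lambda t: (theta / (theta + t)) ** alpha

horizontal, _ = quad(survival, 1000, 3000)                            # ~211.1
vertical, _   = quad(lambda t: (t - 1000) * density(t), 1000, 3000)   # ~133.1
rectangle     = (3000 - 1000) * survival(3000)                        # ~78.0

print(horizontal, vertical + rectangle)   # both 211.143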

The area below the Pareto distribution curve, which equals the mean of 800, can be divided into four
pieces, where the layer from 1000 to 3000 has been divided into the two terms calculated above:

Exercise: For a Pareto Distribution with α = 4 and θ = 2400, determine the expected losses in the following layers: 0 to 500, 500 to 1000, 1000 to 1500, and 1500 to 2000.
[Solution: E[X ∧ x] = {θ/(α-1)} {1 - (θ/(θ+x))^(α-1)} = 800 {1 - (2400/(2400 + x))^3}.
E[X ∧ 500] = 347. E[X ∧ 1000] = 519. E[X ∧ 1500] = 614. E[X ∧ 2000] = 670.
Expected Losses in layer from 0 to 500: E[X ∧ 500] = 347.
Expected Losses in layer from 500 to 1000: E[X ∧ 1000] - E[X ∧ 500] = 519 - 347 = 172.
Expected Losses in layer from 1000 to 1500: E[X ∧ 1500] - E[X ∧ 1000] = 614 - 519 = 95.
Expected Losses in layer from 1500 to 2000: E[X ∧ 2000] - E[X ∧ 1500] = 670 - 614 = 56.]
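The layer arithmetic in the exercise can be reproduced with a few lines of Python (the loop and list of breakpoints are mine):

# Expected losses by layer, as differences of limited expected values.
alpha, theta = 4.0, 2400.0
lev = lambda x: theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

breaks = [0, 500, 1000, 1500, 2000]
levs = {x: round(lev(x)) for x in breaks}     # 0, 347, 519, 614, 670
for lo, hi in zip(breaks, breaks[1:]):
    print(lo, hi, levs[hi] - levs[lo])        # 347, 172, 95, 56, as in the exercise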
For a given width, lower layers of loss are larger than higher layers of loss.289
This is illustrated in the following Lee Diagram:
[Lee diagram: the area under the curve divided into layers of width 500 labeled A (0 to 500), B (500 to 1000), C (1000 to 1500), and D (1500 to 2000); size of loss up to 3000 on the vertical axis, probability on the horizontal axis]
Area A > Area B > Area C > Area D.

289 Therefore, incremental increased limits factors decrease as the limit increases.
These ideas are discussed in "On the Theory of Increased Limits and Excess of Loss Pricing," by Robert Miccolis, with discussion by Sheldon Rosenberg, PCAS 1977.

Various Formulas for a Layer of Loss:290

In the above diagram, the layer from a to b is: Area F + Area G.
There are various different formulas for the layer from a to b.

Algebraic Expression for Layer from a to b            Corresponding Areas on the Lee Diagram
E[X ∧ b] - E[X ∧ a]                                    (C + D + E + F + G) - (C + D + E)
E[(X - a)+] - E[(X - b)+]                              (F + G + H) - H
∫_a^b (y - a) f(y) dy + (b-a) S(b)                     F + G
∫_a^b y f(y) dy + (b-a) S(b) - a{F(b) - F(a)}          (D + F) + G - D
∫_a^b y f(y) dy + b S(b) - a S(a)                      (D + F) + (E + G) - (D + E)

290 See pages 58-59 of "The Mathematics of Excess of Loss Coverages and Retrospective Rating -- A Graphical Approach," by Y.S. Lee, PCAS LXXV, 1988.
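A sketch checking that the formulas in the table agree, using the Pareto example with a = 1000 and b = 3000 (the checks themselves are mine):

# The various layer formulas all give the same answer.
from scipy.integrate import quad

alpha, theta = 4.0, 2400.0
f = lambda y: alpha * theta**alpha / (theta + y) ** (alpha + 1.0)
S = lambda y: (theta / (theta + y)) ** alpha
F = lambda y: 1.0 - S(y)
lev = lambda x: theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

a, b = 1000.0, 3000.0
v1 = lev(b) - lev(a)
v2 = quad(lambda y: (y - a) * f(y), a, b)[0] + (b - a) * S(b)
v3 = quad(lambda y: y * f(y), a, b)[0] + (b - a) * S(b) - a * (F(b) - F(a))
v4 = quad(lambda y: y * f(y), a, b)[0] + b * S(b) - a * S(a)
print(v1, v2, v3, v4)    # all ~211.14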

Average Sizes of Losses in an Interval:
As shown below, the dollars of loss on losses of size less than 1000 correspond to the area under the curve and to the left of the vertical line x = F(1000) = 0.7517.
Summing up vertical strips, this left-hand area corresponds to the integral of y f(y) from 0 to 1000.
As discussed previously, this area is the contribution of the small losses, one of the two pieces making up E[X ∧ 1000]. The other piece of E[X ∧ 1000] was the contribution of the large losses, 1000 S(1000). Thus the dollars of loss on losses of size less than 1000 can also be calculated as E[X ∧ 1000] - 1000 S(1000), or as the difference of the corresponding areas.
In this case, the area below the curve and to the left of x = F(1000) is:
∫_0^1000 y f(y) dy = E[X ∧ 1000] - 1000 S(1000) = 518.62 - 248.27 = 270.35.291

Therefore, the average size of the losses of size less than 1000 is:
270.35 / F(1000) = 270.35 / 0.75173 = 359.64.
In general, the losses of size a to b ↔ the area below the curve and also between the vertical line at a and the vertical line at b.

291 In the case of a Pareto with α = 4 and θ = 2400, E[X ∧ 1000] = (2400/3){1 - (2400/3400)^3} = 518.62, and S(1000) = (2400/3400)^4 = 0.24827.

As shown below, the dollars of loss on losses of size between 1000 and 2000 correspond to the area under the curve and between the vertical lines x = F(1000) = 0.7517 and x = F(2000) = 0.9115.
Summing up vertical strips, this area corresponds to the integral of y f(y) from 1000 to 2000.
This area between the two vertical lines can also be thought of as the difference between the area to the left of x = F(2000) and that to the left of x = F(1000). In other words, the dollars of loss on losses between 1000 and 2000 are the difference between the dollars of loss on losses of size less than 2000 and the dollars of loss on losses of size less than 1000.
In this case, the area below the curve and to the left of x = F(2000) is:
E[X ∧ 2000] - 2000 S(2000) = 670.17 - (2000)(0.088519) = 493.13.
The area below the curve and to the left of x = F(1000) is:
E[X ∧ 1000] - 1000 S(1000) = 518.62 - 248.27 = 270.35.
The area between the vertical lines is the difference:
∫_1000^2000 y f(y) dy = ∫_0^2000 y f(y) dy - ∫_0^1000 y f(y) dy = 493.13 - 270.35 = 222.78.

The average size of the losses of size between 1000 and 2000 is:
222.78 / {F(2000) -F(1000)} = 222.78 / (0.9115 - 0.7517) = 1394.
The numerator is the area between the vertical lines at F(2000) and F(1000) and below the curve,
while the denominator is the width of this area. The ratio is the average height of this area, the
average size of the losses of size between 1000 and 2000.
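A short sketch reproducing the average severity of losses between 1000 and 2000 (numerical integration; names mine):

# Average size of losses of size between 1000 and 2000 for the Pareto example.
from scipy.integrate import quad

alpha, theta = 4.0, 2400.0
f = lambda y: alpha * theta**alpha / (theta + y) ** (alpha + 1.0)
F = lambda y: 1.0 - (theta / (theta + y)) ** alpha

dollars = quad(lambda y: y * f(y), 1000, 2000)[0]   # ~222.8
width = F(2000) - F(1000)                           # ~0.160
print(dollars / width)                              # ~1394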

Expected Amount by Which Losses are Less than a Given Value:
Assume we take 1000 - x for x ≤ 1000 and 0 for x > 1000. This is the amount by which X is less than 1000, or (1000 - X)+. As shown below for various x < 1000, (1000 - X)+ is the height of a vertical line from the curve to 1000, or from x up to 1000.
The expected amount by which X is less than 1000, E[(1000 - X)+], is Area A below.
Area B = E[X ∧ 1000]. Area A + Area B = area of a rectangle of width 1 and height 1000 = 1000.
E[(1000 - X)+] = the expected amount by which losses are less than 1000 = Area A = 1000 - Area B = 1000 - E[X ∧ 1000].

We can also write Area A as a sum of horizontal strips: ∫_0^1000 F(x) dx.
In general, E[(d - X)+] = ∫_0^d F(x) dx = ∫_0^d {1 - S(x)} dx = d - E[X ∧ d].

E[(X - d)+] versus E[(d - X)+]:


In the following Lee Diagram, Area A = E[(1000 - X)+] and Area C = E[(X - 1000)+].

Area A + Area B = a rectangle of height 1000 and width 1.


Therefore, A + B = 1000.
B = 1000 - A = 1000 - E[(1000 - X)+].
Area B + Area C = area under the curve.
Therefore, B + C = E[X].
B = E[X] - C = E[X] - E[(X - 1000)+].
Therefore, 1000 - E[(1000 - X)+] = E[X] - E[(X - 1000)+].
Therefore, E[(X - 1000)+] - E[(1000 - X)+] = E[X] - 1000 = E[X - 1000].
In general, E[(X-d)+] - E[(d-X)+] = E[X] - d = E[X - d].
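This identity is easy to confirm numerically; a minimal sketch with the Pareto example and d = 1000 (structure mine):

# Verify E[(X-d)+] - E[(d-X)+] = E[X] - d for the Pareto (alpha = 4, theta = 2400).
from scipy.integrate import quad

alpha, theta = 4.0, 2400.0
S = lambda x: (theta / (theta + x)) ** alpha
F = lambda x: 1.0 - S(x)
mean = theta / (alpha - 1)                      # 800

d = 1000.0
excess = quad(S, d, float("inf"))[0]            # E[(X-d)+] ~ 281.4
shortfall = quad(F, 0, d)[0]                    # E[(d-X)+] ~ 481.4
print(excess - shortfall, mean - d)             # both -200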

Payments Subject to a Minimum:
E[Max[X, 1000]] = A + B + C = (A + B) + C = 1000 + (E[X] - E[X ∧ 1000]).
E[Max[X, 1000]] = A + B + C = A + (B + C) = (1000 - E[X ∧ 1000]) + E[X].
Payments Subject to both a Minimum and a Maximum:
E[Min[Max[X, 1000], 3000]] = A + B + C = (A + B) + C = 1000 + (E[X ∧ 3000] - E[X ∧ 1000]).
E[Min[Max[X, 1000], 3000]] = A + B + C = A + (B + C) = (1000 - E[X ∧ 1000]) + E[X ∧ 3000].
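A simulation sketch confirming the last formula under the Pareto assumption (the simulation setup and sample size are mine):

# Monte Carlo check: E[Min[Max[X, 1000], 3000]] = 1000 + E[X ^ 3000] - E[X ^ 1000].
import numpy as np

rng = np.random.default_rng(0)
alpha, theta = 4.0, 2400.0
u = rng.random(1_000_000)
x = theta * (u ** (-1.0 / alpha) - 1.0)    # inverse transform: S(x) = (theta/(theta+x))^alpha

lev = lambda l: theta / (alpha - 1) * (1 - (theta / (theta + l)) ** (alpha - 1))
print(np.minimum(np.maximum(x, 1000.0), 3000.0).mean())   # simulated, ~1211
print(1000.0 + lev(3000.0) - lev(1000.0))                  # formula, 1211.14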

Inflation:
After 50% uniform inflation is applied to the original Pareto distribution, with α = 4 and θ = 2400, the revised distribution is also a Pareto with α = 4 but with θ = (1.5)(2400) = 3600.
Here are the original Pareto (solid) and the Pareto after inflation (dashed):
The increase in the losses due to inflation corresponds to the area between the distribution curves.
The total area under the new curve is: (1.5)(800) = 3600 / (4-1) = 1200. The area under the old curve is 800. The increase in losses is the difference: 1200 - 800 = (0.5)(800) = 400. The increase in losses is 50%, from 800 to 1200.

The losses excess of 1000 are above the horizontal line at 1000. Prior to inflation the excess losses are below the original Pareto (solid), Area E. After inflation the excess losses are below the revised Pareto (dashed), Areas D + E.
Area D represents the increase in excess losses due to inflation, while Area A represents the increase in limited losses due to inflation. Note that the excess losses have increased more quickly (as a percent) than the total losses, while the losses limited to 1000 have increased less quickly (as a percent) than the total losses.
The loss excess of 1000 for a Pareto with α = 4 and θ = 3600 is: 1200 - E[X ∧ 1000] = 1200 - 624.80 = 575.20, Areas D + E above. The loss excess of 1000 for a Pareto with α = 4 and θ = 2400 is: 800 - E[X ∧ 1000] = 800 - 518.62 = 281.38, Area E above. Thus under uniform inflation of 50%, in this case the losses excess of 1000 have increased by 104.4%, from 281.38 to 575.20.
The increase in excess losses is Area D above, 575.20 - 281.38 = 293.82.
The loss limited to 1000 for a Pareto with α = 4 and θ = 3600 is: E[X ∧ 1000] = 624.80. The loss limited to 1000 for a Pareto with α = 4 and θ = 2400 is: E[X ∧ 1000] = 518.62. Thus under uniform inflation of 50%, in this case the losses limited to 1000 have increased by only 20.5%, from 518.62 to 624.80. The increase in limited losses is Area A above, 624.80 - 518.62 = 106.18.
The total losses increase from 800 to 1200; Area A + Area D = 293.82 + 106.18 = 400.
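A sketch of the inflation comparison, computing the limited and excess pieces before and after 50% inflation (layout mine):

# Pareto alpha = 4, theta = 2400 before inflation and theta = 3600 after.
def lev(x, theta, alpha=4.0):
    return theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

for theta in (2400.0, 3600.0):
    limited = lev(1000.0, theta)
    excess = theta / (4.0 - 1.0) - limited      # E[X] - E[X ^ 1000]
    print(theta, round(limited, 2), round(excess, 2))
# Limited losses: 518.62 -> 624.80 (+20.5%); excess losses: 281.38 -> 575.20 (+104.4%).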

Another version of this same Lee Diagram, showing the numerical areas:

% Increase in Limited Losses is: 106/519 = 20% < 50%.


% Increase in Excess Losses is: 294/281 = 105% > 50%.
% Increase in Total Losses is: (106 + 294)/ (519 + 281) = 400/800 = 50%.

In the earlier year, the losses limited to 2000 are below the horizontal line at 2000, and below the
solid Pareto, Area A in the Lee Diagram below.
In the later year, the losses limited to 3000 are below the horizontal line at 3000, and below the
dotted Pareto, Areas A + B + C in the Lee Diagram below.
Every loss in combined Areas A + B + C is exactly 1.5 times the height of a corresponding loss in
Area A.

Showing that E_later year[X ∧ 3000] = 1.5 E_earlier year[X ∧ 3000/1.5] = 1.5 E_earlier year[X ∧ 2000].

Call Options:292
The expected payoff on a European Call on a stock is equal to E[(ST - K)+], where ST is the future
price of the stock at expiration of the option, time T.
Let F(x) be the distribution of the future stock price at time T.293
E[(X - K)+] is the expected losses excess of K, and corresponds to the area above the horizontal
line at height K and also below the curve graphing F(x) in the following Lee Diagram:

As K increases, the area above the horizontal line at height K decreases; in other words, the value of
the call decreases as K increases.

292 Not on the syllabus of this exam. See Mahler's Guide to Financial Economics.
293 A common model is that the future prices of a stock follow a LogNormal Distribution.

For an increase in K of ΔK, the value of the call decreases by Area A in the following Lee Diagram:
[Lee diagram: stock price on the vertical axis with horizontal lines at K and K + ΔK, probability on the horizontal axis; Area A lies between the two lines and below the curve]
The absolute change in the value of the call, Area A, is smaller than a rectangle of height ΔK and width 1 - F(K). Thus Area A is smaller than ΔK {1 - F(K)} ≤ ΔK. Thus a change of ΔK in the strike price results in an absolute change in the value of the call option smaller than ΔK.
The following Lee Diagram shows the effect of raising the strike price by fixed amounts.

The successive absolute changes in the value of the call are represented by Areas A, B, C, and D.
We see that the absolute changes in the value of the call get smaller as the strike price increases.
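The same picture can be reproduced numerically. This is only a sketch: as in footnote 293, it assumes a LogNormal model for the future stock price, and the parameters mu and sigma below are illustrative, not from the text:

# Expected call payoff E[(S_T - K)+] as the integral of the survival function above K.
import numpy as np
from scipy.integrate import quad
from scipy.stats import lognorm

mu, sigma = 4.6, 0.25                       # illustrative LogNormal parameters (median ~ 100)
price = lognorm(s=sigma, scale=np.exp(mu))

def call_value(K):
    return quad(price.sf, K, np.inf)[0]     # area above the horizontal line at K

for K in (80, 90, 100, 110, 120):
    print(K, round(call_value(K), 2))
# The values decrease as K increases, and the successive decreases get smaller.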

Put Options:294
The expected payoff of a European put is E[(K - ST)+], where ST is the future price of the stock at
expiration of the option, time T.
Let F(x) be the distribution of the future stock price at time T.
Then this expected payoff corresponds to Area P below the horizontal line at height K and also
above the curve graphing F(x) in the following Lee Diagram:

As K increases, the area below the horizontal line at height K increases; in other words, the value of
the put increases as K increases.

294 Not on the syllabus of this exam. See Mahler's Guide to Financial Economics.

For an increase in K of ΔK, the value of the put increases by Area A in the following Lee Diagram:
[Lee diagram: stock price on the vertical axis with horizontal lines at K and K + ΔK, probability on the horizontal axis; Area A lies between the two lines and above the curve]
The change in the value of the put, Area A, is smaller than a rectangle of height ΔK and width F(K). Thus Area A is smaller than ΔK F(K) ≤ ΔK. Thus a change of ΔK in the strike price results in a change in the value of the put option smaller than ΔK.
The following Lee Diagram shows the effect of raising the strike price by fixed amounts.

The successive changes in the value of the put are represented by Areas A, B, C, and D.
We see that the changes in the value of the put get larger as the strike price increases.

Tail Value at Risk (TVaR):295

The Tail Value at Risk of a loss distribution is defined as: TVaR_p ≡ E[X | X > π_p],
where the percentile π_p is such that F(π_p) = p.
Exercise: For a Pareto Distribution with α = 4 and θ = 2400, determine π_0.90.
[Solution: 0.90 = 1 - {2400/(2400 + x)}^4. x = 1868.]
TVaR_0.90 is the average size of those losses of size greater than π_0.90 = 1868.
The denominator of TVaR_0.90 is: 1 - 0.90 = 0.10.
The numerator of TVaR_0.90 is Area A + Area B in the following Lee Diagram:
Therefore, TVaR_0.90 is the average height of Areas A + B.
Area A has height π_0.90 = 1868.
Area B is the expected losses excess of π_0.90 = 1868.
The average height of Area B is the mean excess loss, e(1868) = e(π_0.90).
Therefore, TVaR_0.90 = π_0.90 + e(π_0.90). In general, TVaR_p = π_p + e(π_p).

295 See Mahler's Guide to Risk Measures.
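A quick check of TVaR_0.90 = π_0.90 + e(π_0.90) for the Pareto example (sketch; names mine):

# TVaR at p = 0.90 for the Pareto (alpha = 4, theta = 2400).
alpha, theta = 4.0, 2400.0
p = 0.90

pi_p = theta * ((1 - p) ** (-1.0 / alpha) - 1.0)   # 90th percentile ~ 1868
e_pi = (theta + pi_p) / (alpha - 1.0)              # mean excess loss ~ 1423
print(pi_p, pi_p + e_pi)                           # TVaR_0.90 ~ 3291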

Problems:
Use the following information for the next 26 questions:
The size of loss distribution F(x), with corresponding survival function S(x) and density f(x), is shown
in the following diagram, with probability along the horizontal axis and size of loss along the vertical
axis. Express each of the stated quantities algebraically in terms of the six labeled areas in the diagram.
[Lee diagram: six labeled areas formed by the curve, the horizontal lines at 2 and 5, and the vertical lines at F(2) and F(5)]

37.1 (1 point) E[X].


37.2 (1 point) Losses from claims of size less than 2.
37.3 (1 point) Portion of total losses in the layer from 0 to 2.
37.4 (1 point) ∫_0^2 x dF(x) + 2{1 - F(2)}.

37.5 (1 point) Portion of total losses from claims of size more than 2.
37.6 (1 point) Portion of total losses in the layer from 0 to 5.
37.7 (1 point) E[X 5].
37.8 (1 point) Portion of total losses from claims of size less than 5.
37.9 (1 point) Portion of total losses in the layer from 2 to 5.
37.10 (1 point) R(2) = excess ratio at 2 = 1 - LER(2).

37.11 (1 point) ∫_5^∞ x dF(x).

37.12 (1 point) Portion of total losses in the layer from 2 to ∞.


37.13 (1 point) LER(5) = loss elimination ratio at 5.
37.14 (1 point) 2(F(5)-F(2)).
37.15 (1 point) Portion of total losses from claims of size between 2 and 5.

37.16 (1 point) ∫_5^∞ S(t) dt.

37.17 (1 point) LER(2) = loss elimination ratio at 2.

37.18 (1 point) ∫_5^∞ (t - 5) f(t) dt.

37.19 (1 point) 2S(5).


37.20 (1 point) R(5) = excess ratio at 5 = 1 - LER(5).
37.21 (1 point) e(5) = mean excess loss at 5.
37.22 (1 point) ∫_2^5 (t - 2) f(t) dt.

37.23 (1 point) Losses in the layer from 5 to ∞.


37.24 (1 point) ∫_2^5 (1 - F(t)) dt.

37.25 (1 point) 3S(5).


37.26 (1 point) e(2) = mean excess loss at 2.

37.27 (2 points) Using Lees The Mathematics of Excess of Loss Coverages and Retrospective
Rating -- A Graphical Approach, show graphically why the limited expected value increases at a
decreasing rate as the limit is increased.
Label all axes and explain your reasoning in a brief paragraph.
37.28 (2 points) Losses follow an Exponential Distribution with mean 500.
Using Lees The Mathematics of Excess of Loss Coverages and Retrospective Rating -A Graphical Approach, draw a graph to show the expected losses from those losses of size 400
to 800. Label all axes.

Use the following information for the next five questions:


Prior to the effects of any maximum covered loss or deductible, losses follow a Weibull Distribution with θ = 300 and τ = 1/2.
Using Lees The Mathematics of Excess of Loss Coverages and Retrospective Rating -- A
Graphical Approach, draw a graph to show the expected payments. Label all axes.
37.29 (1 point) With no deductible and no maximum covered loss.
37.30 (1 point) With a 500 deductible and no maximum covered loss.
37.31 (1 point) With no deductible and a 1500 maximum covered loss.
37.32 (1 point) With a 500 deductible and a 1500 maximum covered loss.
37.33 (1 point) With a 500 franchise deductible and no maximum covered loss.
37.34 (2 points) You are given the following graph of the cumulative loss distribution:
[Graph of the cumulative loss distribution: size of loss up to 2500 on the vertical axis, probability on the horizontal axis with a vertical line at 0.63 (= F(1000)); areas A and B are labeled]
Size of the area labeled A = 377.


Size of the area labeled B = 139.
Calculate the loss elimination ratio at 1000.
A. Less than 0.6
B. At least 0.6, but less than 0.7
C. At least 0.7, but less than 0.8
D. At least 0.8, but less than 0.9
E. At least 0.9

37.35 (3 points) The following graph is of the cumulative loss distribution in 2001:
[Lee diagram of the 2001 cumulative loss distribution: loss sizes 667, 1000, 1500, 3333, 5000, and 7500 marked on the vertical axis, probability on the horizontal axis, with labeled areas including P, Q, R, U, and V]

There is a total of 50% inflation between 2001 and 2008.


A policy in 2008 has a 1000 deductible and a 5000 maximum covered loss.
Which of the following represents the expected size of loss under this policy?
A. 1.5(Q + R + T)
B. 1.5(R + T + U)
C. (P + Q + R)/1.5
D. (Q + R + T)/1.5
E. None of A, B, C, or D.


37.36 (2 points) The size of loss distribution is shown in the following diagram, with probability
along the horizontal axis and size of loss along the vertical axis.
Which of the following represents the expected losses under a policy with a franchise deductible of
2 and a maximum covered loss of 5?
A. +

B. + +

C. + +

D. + + +

E. + + + +

F(2)

F(5) 1

37.37 (2 points) The following graph shows the distribution function, F(x), of loss severities.
[Graph of F(x): loss sizes 4000, 5000, and 6250 marked on the vertical axis, F(x) on the horizontal axis, with labeled areas including P, Q, and T]

A policy has a 5000 maximum covered loss and an 80% coinsurance.


Which of the following represents the expected size of loss under this policy?
A. Q + R
B. R + T
C. Q + R + T
D. P + Q + R + T
E. None of A, B, C, or D.

37.38 (3 points) For a Pareto Distribution with α = 4 and θ = 3, draw a Lee Diagram, showing the curtate expectation of life, e0.

For the next nine questions, use the following graph of the cumulative loss distribution:
[Graph of the cumulative loss distribution: size of loss up to 2000 on the vertical axis, probability on the horizontal axis with vertical lines at 0.323 and 0.762; areas A and B are labeled]
Size of the area labeled A = 107. Size of the area labeled B = 98. The mean size of loss is 750.
Calculate the following items:
37.39 (1 point) The average size of loss for those losses of size less than 500.
37.40 (1 point) The loss elimination ratio at 500.
37.41 (1 point) The average payment per loss with a deductible of 500 and
a maximum covered loss of 1000.
37.42 (1 point) The mean excess loss at 500.
37.43 (1 point) The average size of loss for those losses of size between 500 and 1000.
37.44 (1 point) The loss elimination ratio at 1000.
37.45 (1 point) The average payment per payment with a deductible of 500 and
a maximum covered loss of 1000.
37.46 (1 point) The mean excess loss at 1000.
37.47 (1 point) The average size of loss for those losses of size greater than 1000.

37.48 (3 points) You are given the following graph of cumulative distribution functions.
[Graph: size on the vertical axis with L and L/(1+r) marked, probability on the horizontal axis; the two distribution curves are shown and Area P is labeled]

The thicker curve is the cumulative distribution function for the size of loss in a later year, F(x), with
corresponding Survival Function S(x) and density f(x).
There is total inflation of r between this later year and an earlier year.
The thinner curve is the cumulative distribution function for the size of loss in this earlier year.
Which of the following is an expression for Area P?
A. ∫_{L/(1+r)}^L S(x) dx - S(L) L r.
B. ∫_{L/(1+r)}^L x f(x) dx - L {S(L/(1+r)) - S(L)}.
C. (1+r) ∫_{L/(1+r)}^L S(x) dx - S(L) L r / (1+r).
D. ∫_{L/(1+r)}^L x f(x) dx - L {F(L) - F(L/(1+r))} / (1+r).
E. None of the above.

37.49 (CAS9, 11/99, Q.34) (2 points) Using Lee's "The Mathematics of Excess of Loss
Coverages and Retrospective Rating - A Graphical Approach," answer the following:
a. (0.5 point) If the total limits inflation rate is 6%, describe why the inflation rate for the basic limits
coverage is lower than 6%.
b. (1 point) Use Lee to graphically justify your answer.
c. (0.5 point) What are the two major reasons why the inflation rate in the excess layer is greater
than the total limits inflation rate?

37.50 (5, 5/03, Q.14) (1 point) Given E[X] = ∫_0^∞ x f(x) dx = $152,500,
and the following graph of the cumulative loss distribution, F(x), as a function of the size of loss, x, calculate the excess ratio at $100,000.
Size of the area labeled Y = $12,500.
[Graph: loss size x on the vertical axis with $100,000 marked, F(x) = cumulative claim frequency on the horizontal axis with 0.20 and 1.00 marked; area Y is labeled]

A. Less than 0.3


B. At least 0.3, but less than 0.5
C. At least 0.5, but less than 0.7
D. At least 0.7, but less than 0.9
E. At least 0.9

37.51 (CAS3, 11/03, Q.20) (2.5 points) Let X be the size-of-loss random variable with cumulative distribution function F(x) as shown below:
[Graph of F(x), with the shaded region above the horizontal line at K and below the curve]
Which expression(s) below equal(s) the expected loss in the shaded region?
I. ∫_K^∞ x dF(x)
II. E(X) - ∫_0^K x dF(x) - K[1 - F(K)]
III. ∫_K^∞ [1 - F(x)] dx
A. I only
B. II only
C. III only
D. I and III only
E. II and III only

37.52 (CAS3, 11/03, Q.23) (2.5 points)


F(x) is the cumulative distribution function for the size-of-loss variable, X.
P, Q, R, S, T, and U represent the areas of the respective regions.
What is the expected value of the insurance payment on a policy with a deductible of "DED" and a
limit of "LIM"? (For clarity, that is a policy that pays its first dollar of loss for a loss of
DED + 1 and its last dollar of loss for a loss of LIM.)

A. Q

B. Q+R

C. Q+T

D. Q+R+T+U

E. S+T+U

37.53 (CAS3, 5/04, Q.33) (2.5 points)


F(x) is the cumulative distribution function for the size-of-loss variable, X.
P, Q, R, S, T, and U represent the areas of the respective regions.
What is the expected value of the savings to the insurance company of implementing a franchise
deductible of DED" and a limit of LIM" to a policy that previously had no deductible and no limit?
(For clarity, that is a policy that pays its first dollar of loss for a loss of DED + 1 and its last dollar of
loss for a loss of LIM.)

A. S

B. S+P

C. S+Q+P

D. S+P+R+U

E. S+T+U+P

37.54 (CAS3, 11/04, Q.30) (2.5 points) Let X be a random variable representing an amount of loss. Define the cumulative distribution function F(x) as F(x) = Pr(X ≤ x).
[Graph of F(x): the shaded area is the layer between the horizontal lines at a and b]
Determine which of the following formulas represents the shaded area.
A. ∫_a^b x dF(x) + a - b + aF(b) - bF(a)
B. ∫_a^b x dF(x) + a - b + aF(a) - bF(b)
C. ∫_a^b x dF(x) - a + b + aF(b) - bF(a)
D. ∫_a^b x dF(x) - a + b + aF(a) - bF(b)
E. ∫_a^b x dF(x) - a + b - aF(a) + bF(b)

37.55 (CAS3, 5/06, Q.28) (2.5 points)


The following graph shows the distribution function, F(x), of loss severities in 2005.
[Graph: loss size on the vertical axis with D/1.1, D, and 1.1 D marked, F(x) on the horizontal axis; areas labeled P, Q, R, S, T, and U]
Loss severities are expected to increase 10% in 2006 due to inflation.
A deductible, D, applies to each claim in 2005 and 2006.
Which of the following represents the expected size of loss in 2006?
A. P
B. 1.1P
C. 1.1(P+Q+R)
D. P+Q+R+S+T+U
E. 1.1(P+Q+R+S+T+U)


Solutions to Problems:
37.1. +++++, the mean is the area under the distribution curve.
37.2. , the result of summing vertical strips under the curve from zero to F(2).
37.3. (++) / (+++++) = E[X 2] / E[X].
37.4. E[X 2] is: ++, the area under the curve and the horizontal line at 2.
37.5. (++++) / ( +++++) = 1 - / ( +++++).
37.6. (++++) / ( +++++) = 1 - / ( +++++) = E[X 2] / E[X].
37.7. ++++, the area under the curve and the horizontal line at 5.
37.8. (++)/( +++++), the numerator is the result of summing vertical strips under the
curve from zero to F(5), while the denominator is the total area under the curve.
37.9. (+) / ( +++++) = ( E[X 5] - E[X 2]) / E[X].
37.10. (++) / ( +++++) = (E[X] - E[X 2]) / E[X].
37.11. Losses from claims of size more than 5: ++.
37.12. (++) / ( +++++) = (E[X] - E[X 2]) / E[X].
37.13. (++++) / ( +++++) = 1 - / ( +++++) = E[X 5] / E[X].
37.14. , a rectangle of height 2 and width F(5)-F(2).
37.15. (+) / ( +++++), the numerator is the result of summing vertical strips under the curve
from F(2) to F(5), while the denominator is the total area under the curve.

37.16. = E[X] - E[X 5], the sum of horizontal strips of length S(t) = 1-F(t) between the horizontal
lines at 5 and .
37.17. (++) / ( +++++) = E[X 2] / E[X].
37.18. , the sum of vertical strips of height t-5 between the vertical lines at F(5) and 1.
37.19. , a rectangle of height 2 and width S(5).
37.20. / ( +++++) = (E[X] - E[X 5]) / E[X].
37.21. e(5) = /S(5). But, = 2S(5) and = 3S(5). Thus e(5) = 3/ or 2/.
37.22. , the sum of vertical strips of height t-2 between the vertical lines at F(2) and F(5).
37.23. = E[(X - 5)+] = E[X] - E[X 5].
37.24. + = E[X 5] - E[X 2], the sum of horizontal strips of length 1-F(t) between the horizontal
lines at 2 and 5.
37.25. , a rectangle of height 5 -2 and width 1-F(5) = S(5).
37.26. e(2) = losses excess of 2 / S(2) = (++) /S(2).
But, + = 2S(2). e(2) = 2(++)/(+).

37.27. In the Lee diagram below, E[X ∧ 1000] = Area A.

E[X ∧ 2000] = Area A + Area B. Therefore, Area B = E[X ∧ 2000] - E[X ∧ 1000].
Area B is the increase in the limited expected value due to increasing the limit from 1000 to 2000.
Similarly, Area C is the increase in the limited expected value due to increasing the limit by another 1000. Area C < Area B, and therefore the increase in the limited expected value is less. In general the areas of a given height get smaller as one moves up the diagram, as the curve moves closer to the right-hand asymptote. Therefore, the rate of increase of the limited expected value decreases.
Comment: The diagram was based on a Pareto Distribution with α = 4 and θ = 2400.

37.28. If y = size of loss and x = probability, then for an Exponential with θ = 500,
x = 1 - exp[-y/500]. Therefore, y = -500 ln[1 - x].
Loss of size 400 ↔ probability = 1 - e^-0.8 = 0.551.
Loss of size 800 ↔ probability = 1 - e^-1.6 = 0.798.
Losses of size 400 to 800 correspond to the area below the curve and between vertical lines at 0.551 and 0.798:
[Lee diagram: size of loss on the vertical axis with 400 and 800 marked, probability on the horizontal axis; the area "Losses of size 400 to 800" lies below the curve between the vertical lines at 0.551 and 0.798]

37.29. If y = size of loss and x = probability, then for a Weibull with θ = 300 and τ = 1/2,
x = 1 - exp[-(y/300)^0.5]. Therefore, y = 300 ln[1 - x]^2. Lee Diagram:

Comment: One has to stop graphing at some size of loss, unless one has infinite graph paper!
In this case, I only graphed up to 2500.

37.30. The payments with a 500 deductible and no maximum covered loss are represented by
the area above the line at height 500 and to the right of the curve:

37.31. The payments with no deductible and a 1500 maximum covered loss are represented by
the area below the line at height 1500, and to the right of the curve:

37.32. The payments with a 500 deductible and a 1500 maximum covered loss are represented
by the area above the line at height 500, below the line at height 1500, and to the right of the curve:

37.33. Under a 500 franchise deductible, nothing is paid on a loss of size 500 or less, and the
whole loss is paid for a loss of size greater than 500. The payments with a 500 franchise deductible
and no maximum covered loss are the losses of size greater than 500, represented by the area to
the right of the vertical line at F(500) and below the curve:

37.34. D. Area C is a rectangle with height 1000 and width (1 - 0.63) = 0.37, with area 370.
[Lee diagram: areas A, B, and C, with C the rectangle of height 1000 and width 0.37 to the right of the vertical line at 0.63]
Expected losses limited to 1000 = Area A + Area C = 377 + 370 = 747.
E[X] = Area A + Area B + Area C = 377 + 139 + 370 = 886.
Loss elimination ratio at 1000 = E[X ∧ 1000]/E[X] = 747/886 = 84.3%.
Comment: Similar to 5, 5/03, Q.14.
The Lee Diagram was based on a Weibull Distribution with θ = 1000 and τ = 2.
37.35. B. Deflate 5000 from 2008 to 2001, where it is equivalent to: 5000/1.5 = 3333.
Deflate 1000 from 2008 to 2001, where it is equivalent to: 1000/1.5 = 667. Then the average
expected loss in 2001 is the area between horizontal lines at 667 and 3333, and under the curve:
R+T+U. In order to get the expected size of loss in 2008, reinflate back up to the 2008 level, by
multiplying by 1.5: 1.5(R + T + U).
Comment: Similar to CAS3, 5/06, Q.28.
37.36. D. A policy with a franchise deductible of 2 pays the full amount of all losses of size greater
than 2. This is the area under the curve and to the right of the vertical line at 2:
+ + + + . However, there is also a maximum covered loss of 5, which means the policy does
not pay for the portion of any loss greater than 5, which eliminates area , above the horizontal line at
5. Therefore the expected payments are: + + + .
Comment: Similar to CAS3, 5/04, Q.33.
37.37. E. Prior to the effect of the coinsurance, the expected size of loss is below the line at 5000
and below the curve: R + T. We multiply by 80% before paying under this policy.
The expected size of loss is: 0.8(R + T).

37.38. The curtate expectation of life, e0, is the sum of a series of rectangles, each with height 1, with areas: S(1) = (3/4)^4, S(2) = (3/5)^4, S(3) = (3/6)^4, etc.
The first six of these rectangles are shown below:
[Lee diagram: size of loss on the vertical axis, probability on the horizontal axis; six stacked rectangles of height 1 under the curve, with widths S(1), ..., S(6)]
Comment: e0 < e(0) = E[X] = area under the curve.
The curtate expectation of life is discussed in Actuarial Mathematics.

37.39. We are given Area A = 107 and Area B is 98. We can get the areas of three rectangles.
One rectangle has width: .762 - .323 = .439, height 500, and area: (.439)(500) = 219.5.
Two rectangles have width: 1 - .762 = .238, height 500, and area: (.238)(500) = 119.
The total area under the curve is equal to the mean, given as 750.
Therefore, the area under the curve and above the horizontal line at 1000 is:
750 - (107 + 219.5 + 98 + 119 + 119) = 87.5.
[Lee diagram: the area under the curve divided into pieces with areas 107, 219.5, 119, 98, 119, and 87.5, separated by vertical lines at 0.323 and 0.762 and horizontal lines at 500 and 1000]
The average size of loss for those losses of size less than 500 is:
(dollars from losses of size less than 500)/F(500) = 107/0.323 = 331.
Comment: The Lee Diagram was based on a Gamma Distribution with α = 3 and θ = 250.
37.40. LER(500) = E[X ∧ 500]/E[X] = (107 + 219.5 + 119)/750 = 59.4%.
37.41. The average payment per loss with a deductible of 500 and maximum covered loss of
1000 is: layer from 500 to 1000
the area under the curve and between the horizontal lines at 500 and 1000
98 + 119 = 217.
37.42. e(500) = (losses excess of 500)/S(500) = (98 + 119 + 87.5)/(1 - .323) = 450.
37.43. The average size of loss for those losses of size between 500 and 1000 is:
(dollars from losses of size between 500 and 1000)/{F(1000) - F(500)} =
(219.5 + 98)/(.762 - .323) = 723.

37.44. The loss elimination ratio at 1000 = E[X ∧ 1000]/E[X] = (107 + 219.5 + 119 + 98 + 119)/750 = 88.3%.
Alternately, the excess ratio at 1000 is: 87.5/750 = 11.7%. LER(1000) = 1 - 11.7% = 88.3%.
37.45. The average payment per (non-zero) payment with a deductible of 500 and maximum
covered loss of 1000 is: (average payment per loss)/S(500) = 217/(1 - .323) = 321.
37.46. e(1000) = (losses excess of 1000)/S(1000) = 87.5/(1 - .762) = 368.
37.47. The average size of loss for those losses of size greater than 1000 is:
(dollars from losses of size > 1000)/S(1000) = (87.5 + 119 + 119)/(1 - .762) = 1368.
37.48. D. The rectangle below Area P plus Area P represent those losses of size between L/(1+r) and L.
[Lee diagram: size on the vertical axis with L and L/(1+r) marked, probability on the horizontal axis; Area P and the rectangle below it are shown]
Area P + Rectangle = ∫_{L/(1+r)}^L x f(x) dx.
The rectangle has height: L/(1+r), and width: F(L) - F(L/(1+r)).
Area P = ∫_{L/(1+r)}^L x f(x) dx - {F(L) - F(L/(1+r))} L/(1+r).
Comment: Area P can be written in other ways as shown below:
[Lee diagram: the same figure, now with the rectangle to the left of Area P]
Area P plus the rectangle to the left of Area P represents the layer of loss from L/(1+r) to L.
Area P + Rectangle = ∫_{L/(1+r)}^L S(x) dx.
The rectangle has height: L - L/(1+r), and width: S(L).
Area P = ∫_{L/(1+r)}^L S(x) dx - S(L){L - L/(1+r)} = ∫_{L/(1+r)}^L S(x) dx - S(L) L r/(1+r).
37.49. a. Some losses will hit the basic limit before they get the total increase from inflation, while some were already at the basic limit, so inflation won't increase them.
For example, if the basic limit is $100,000, then a loss of $125,000 will still contribute $100,000 to the basic limit after inflation. A loss of $98,000 will increase to $103,880 after inflation, and would then contribute $100,000 to the basic limit, an increase of only 100/98 - 1 = 2.04% in that contribution, less than 6%.
b. Let L be the basic limit. The solid curve refers to the losses prior to inflation, while the dashed
curve refers to the losses after inflation:

The expected excess losses prior to inflation are Area D.


The increase in the expected excess losses due to inflation is Area C.
The expected basic limit losses prior to inflation are Area B.
The increase in the expected basic limit losses due to inflation is Area A.
Area C is larger compared to Area D, than is Area A compared to Area B.
Therefore, the basic limit losses increase slower due to inflation than do the excess losses.
Since the unlimited ground up losses are the sum of the excess and basic limit losses,
the basic limit losses increase slower due to inflation than the unlimited ground up losses.

Looked at from a somewhat different point of view, those losses in Area M will have their
contributions to the basic limit increase at a rate less than 6%, while those losses in Area N, will not
have their contribution to the basic limit increases at all due to inflation.

c. All losses that were already in the excess layer receive the full increase from inflation. Some
losses that did not contribute to the excess layer will, after inflation, contribute something to the
excess layer.
Comment: In order to allow one to see what is going on, the Lee Diagrams in my solution are based
on much more than 6% inflation.

37.50. B. Area Y + Area A + Area B = E[X] = $152,500.
Area A is a rectangle with height $100,000 and width 0.8, with area $80,000.
Area Y is given as $12,500.
Expected losses excess of $100,000 = Area B = $152,500 - $12,500 - $80,000 = $60,000.
Excess ratio at $100,000 = (Expected losses excess of $100,000)/E[X] = $60,000 / $152,500 = 0.393.
[Graph: loss size x on the vertical axis with $100,000 marked, F(x) = cumulative claim frequency on the horizontal axis with 0.20 and 1.00 marked; Area A is the rectangle below $100,000 and to the right of 0.20, Area B is above $100,000 and below the curve]
Comment: Loss Elimination Ratio at $100,000 is: 1 - 0.393 = 0.607.
Not one of the usual size of loss distributions encountered in casualty actuarial work.

37.51. E. The shaded area represents the losses excess of K = E[(X-K)+] = E[X] - E[X ∧ K] =
E[X] - {∫_0^K x f(x) dx + K S(K)} = ∫_K^∞ x f(x) dx - K S(K) = ∫_K^∞ (x - K) f(x) dx = ∫_K^∞ S(x) dx.
Since S(x) = 1 - F(x) and f(x) dx = dF(x), statements II and III are true.
Statement I is false; it would be true if the integrand were x - K rather than x.
37.52. B. The layer of loss from DED to LIM is the area under the curve and between the
horizontal lines at DED and LIM: Q + R.
37.53. B. Under a franchise deductible one does not pay any losses of size less than DED, but
pays the whole of any loss of size greater than DED. Due to the franchise deductible, one saves S,
the area corresponding to the losses of size less than or equal to DED. Due to the limit, one pays at
most LIM for any loss, so one saves P, the area representing the expected amount excess of LIM.
The expected savings are: S + P.
Comment: Similar to CAS3, 11/03, Q.23, except here there is a franchise deductible rather than an ordinary deductible. The effect of the franchise deductible is to left truncate the data at DED, which removes area S.

37.54. D. Label some of the areas in the Lee Diagram as follows:
D = (b-a) S(b).
E = a {F(b) - F(a)}.
C + E = losses on losses of size between a and b = ∫_a^b x dF(x).
Shaded Area = C + D = (C + E) + D - E = ∫_a^b x dF(x) + (b-a) S(b) - a{F(b) - F(a)} = ∫_a^b x dF(x) - a + b + aF(a) - bF(b).
Alternately, the shaded area is the layer from a to b:
E[X ∧ b] - E[X ∧ a] = {∫_0^b x dF(x) + b S(b)} - {∫_0^a x dF(x) + a S(a)} = ∫_a^b x dF(x) - a + b + aF(a) - bF(b).

37.55. E. Deflate D from 2006 to 2005, where it is equivalent to a deductible of: D/1.1.
Then the average expected loss in 2005 is the area above the deductible, D/1.1, and under the
curve: P+Q+R+S+T+U. In order to get the expected size of loss in 2006, reinflate back up to the
2006 level, by multiplying by 1.1: 1.1(P+Q+R+S+T+U).

Section 38, N-Point Mixtures of Models


Mixing models is a technique that provides a greater variety of loss distributions.
Such mixed distributions are referred to by Loss Models as n-point or two-point mixtures.296
2-point mixtures:
For example, let A be a Pareto Distribution with parameters α = 2.5 and θ = 10,
while B is a LogNormal Distribution with parameters μ = 0.5 and σ = 0.8.
Let p = 0.10, the weight for the Pareto Distribution.
If we let G(x) = pA(x) + (1-p)B(x) = 0.1 A(x) + 0.9 B(x),
then G is a Distribution Function since G(0) = 0 and G(∞) = 1.
G is a mixed Pareto-LogNormal Distribution.
Here's the individual distribution functions, as well as that of this mixed distribution:

Limit    Pareto          LogNormal       Mixed Pareto-LogNormal
         Distribution    Distribution    Distribution
         Function        Function        Function
0.5      0.1148          0.0679          0.0726
1        0.2120          0.2660          0.2606
2.5      0.4276          0.6986          0.6715
5        0.6371          0.9172          0.8892
10       0.8232          0.9879          0.9714
25       0.9564          0.9997          0.9954
50       0.9887          1.0000          0.9989
100      0.9975          1.0000          0.9998

For example, 0.9954 = (0.1)(0.9564) + (0.9)(0.9997). For the mixed distribution, the chance of a claim greater than 25 is: 0.46% = (0.1)(4.36%) + (0.9)(0.03%). The Distribution Function and the Survival Function of the mixture are mixtures of the individual Distribution and Survival Functions.
Also we see from the above table that, for example, the 89th percentile of this mixed Pareto-LogNormal Distribution is a little more than 5, since F(5) = 0.8892.
In general, one can take a weighted average of any two Distribution Functions:
G(x) = p A(x) + (1-p)B(x). Such a Distribution Function H, called a 2-point mixture of models,
will generally have properties that are a mixture of those of A and B.

296

Loss Models, Section 4.2.3.


One can create a very large number of possible combinations by choosing various types of
distributions for A and B. The mixture will have a number of parameters equal to the sum of the
number of parameters of the two distributions A and B, plus one more for p, the weighting
parameter. The mixed Pareto-LogNormal Distribution discussed above has 2 + 2 + 1 = 5
parameters: , , , , and p.
Densities:
The density of the mixture is the derivative of its Distribution Function.
Therefore, the density of the mixture is the mixture of the densities.
Exercise: What is the density at 5 of this mixed Pareto-LogNormal Distribution?

[Solution: The density of the Pareto is: f(x) = α θ^α / (θ + x)^(α+1) = (2.5)(10^2.5)(10 + x)^-3.5.
f(5) = 790.57/15^3.5 = 0.06048.
The density of the LogNormal is: f(x) = exp[-(ln(x) - μ)^2 / (2σ^2)] / {x σ √(2π)} = exp[-(ln(x) - 0.5)^2 / {2(0.8^2)}] / {x (0.8) √(2π)}.
f(5) = exp[-0.5 ({ln(5) - 0.5}/0.8)^2] / {(5)(0.8)√(2π)} = 0.03813.
The density for the mixed distribution is: (0.1)(0.06048) + (0.9)(0.03813) = 0.0404.]
Moments:
Moments of the mixed distribution are the weighted average of the moments of the individual distributions: E_H[X^n] = p E_A[X^n] + (1-p) E_B[X^n].
E_H[X] = p E_A[X] + (1-p) E_B[X].
For example, the mean of the above mixed Pareto-LogNormal Distribution is:
(0.1)(mean of the Pareto) + (0.9)(mean of the LogNormal).
Exercise: What are the first and second moments of the Pareto Distribution with parameters α = 2.5 and θ = 10?
[Solution: For the Pareto, the mean is: θ/(α-1) = 10/1.5 = 6.667,
while the second moment is: 2θ^2 / {(α-1)(α-2)} = 200 / {(1.5)(0.5)} = 266.67.]


Exercise: What are the first and second moments of the LogNormal Distribution with parameters μ = 0.5 and σ = 0.8?
[Solution: For the LogNormal, the mean is: exp[μ + 0.5 σ^2] = e^0.82 = 2.270,
while the second moment is: exp[2μ + 2σ^2] = e^2.28 = 9.777.]
Thus the mean of the mixed Pareto-LogNormal is: (0.1)(6.667) + (0.9)(2.27) = 2.71.
We also note that of the total loss dollars represented by the mixed distribution, (0.1)(6.667) = 0.667 come from the underlying Pareto, while (0.9)(2.27) = 2.04 come from the underlying LogNormal. Thus 0.667/2.71 = 25% of the total losses come from the underlying Pareto, while the remaining 2.04/2.71 = 75% come from the underlying LogNormal.
In general, p EA[X] / {p EA[X] + (1-p) EB[X]} represents the portion of the total losses for the mixed
distribution that come from the first of the individual distributions.
For a 2 point mixture of A and B with weights p and 1-p:
E[X] = E[X | A] Prob[A] + E[X | B] Prob[B] = p (mean of A) + (1-p) (mean of B).
E[X^2] = E[X^2 | A] Prob[A] + E[X^2 | B] Prob[B] = p (2nd moment of A) + (1-p) (2nd moment of B).
E_H[X^2] = p E_A[X^2] + (1-p) E_B[X^2].
The second moment of this mixed distribution is the weighted average of the second moments of the two individual distributions: (0.1)(266.67) + (0.9)(9.777) = 35.47.
The moment of the mixture is the mixture of the moments.
Thus the variance of this mixed distribution is: 35.47 - 2.71^2 = 28.13.
First one gets the moments of the mixture, and then one gets the variance of the mixture.
One does not weight together the individual variances.
Coefficient of Variation:
One can now get the coefficient of variation of this mixture from the mean and variance of this mixture. The Coefficient of Variation of this mixture is: √28.13 / 2.71 = 1.96. The C.V. of the mixed distribution is between the C.V. of the Pareto at 2.24 and that of the LogNormal at 0.95. The mixed distribution has a heavier tail than the LogNormal and a lighter tail than the Pareto.
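A sketch reproducing the mixture's mean, variance, and coefficient of variation (the moment formulas are standard; the code layout is mine):

# 2-point mixture: 10% Pareto(alpha = 2.5, theta = 10), 90% LogNormal(mu = 0.5, sigma = 0.8).
# Moments of the mixture are the mixture of the moments; the variance is not.
import numpy as np

p = 0.10
alpha, theta = 2.5, 10.0
pareto_m1 = theta / (alpha - 1)                          # 6.667
pareto_m2 = 2 * theta**2 / ((alpha - 1) * (alpha - 2))   # 266.67
mu, sigma = 0.5, 0.8
logn_m1 = np.exp(mu + 0.5 * sigma**2)                    # 2.270
logn_m2 = np.exp(2 * mu + 2 * sigma**2)                  # 9.777

m1 = p * pareto_m1 + (1 - p) * logn_m1                   # 2.71
m2 = p * pareto_m2 + (1 - p) * logn_m2                   # 35.47
var = m2 - m1**2                                         # 28.13
print(m1, var, np.sqrt(var) / m1)                        # CV ~ 1.96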

Limited Moments:
Limited Moments of the mixed distribution are the weighted average of the limited moments of the individual distributions: E_H[(X ∧ x)^n] = p E_A[(X ∧ x)^n] + (1-p) E_B[(X ∧ x)^n].
E_H[X ∧ x] = p E_A[X ∧ x] + (1-p) E_B[X ∧ x].
For example, the limited expected value of the mixed Pareto-LogNormal Distribution is:
(0.1)(LEV of the Pareto) + (0.9)(LEV of the LogNormal).
Exercise: What is E[X ∧ 4] for the Pareto Distribution with parameters α = 2.5 and θ = 10?
[Solution: For the Pareto, E[X ∧ x] = {θ/(α-1)} {1 - (θ/(θ+x))^(α-1)}.
E[X ∧ 4] = (10/1.5) {1 - (10/14)^1.5} = 2.642.]
Exercise: Compute E[X ∧ 4] for the LogNormal with parameters μ = 0.5 and σ = 0.8.
[Solution: E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln(x) - μ - σ^2)/σ] + x {1 - Φ[(ln(x) - μ)/σ]}.
E[X ∧ 4] = e^0.82 Φ[0.3079] + 4 {1 - Φ[1.1079]} = (2.2705)(0.6209) + (4){1 - 0.8660} = 1.946.]
Thus for the mixed Pareto-LogNormal, E[X ∧ 4] = (0.1)(2.642) + (0.9)(1.946) = 2.016.
Quantities that are Mixtures:
Thus there are a number of quantities which are mixtures:
The Distribution Function of the mixture is the mixture of the Distribution Functions.
The Survival Function of the mixture is the mixture of the Survival Functions.
The density of the mixture is the mixture of the densities.
The moments of the mixture are the mixture of the moments.
The limited moments of the mixture are the mixture of the limited moments.
E[(X - d)+] = E[X] - E[X ∧ d] is also a mixture.

As discussed already the variance of the mixture is not the mixture of the variances.
Rather, first one gets the moments of the mixture, and then one gets the variance.
The coefficient of variation of the mixture is not the mixture of the coefficients of variation.
Rather, first one computes the moments of the mixture.
Similarly, the skewness of the mixture is not the mixture of the skewnesses.
Rather, first one computes the moments of the mixture. Then one gets the third central moment of
the mixture and divides it by the standard deviation of the mixture cubed.
A number of other quantities of interest, such as the hazard rate, mean residual life, Loss Elimination
Ratio, Excess Ratio, and percent of losses in a layer, have to be computed from their components
for the mixture, as will be discussed.
Hazard Rates:
h(x) = f(x)/S(x). Therefore, in order to get the hazard rate for a mixture, one computes the density
and the survival function for that mixture.
As computed previously, for this mixture f(5) = (0.1)(0.06048) + (0.9)(0.03813) = 0.0404.
As computed previously, for this mixture F(5) = (0.1)(0.6371) + (0.9)(0.9172) = 0.8892.
S(5) = 1 - 0.8892 = 0.1108.
For this mixture, h(5) = 0.0404/0.1108 = 0.3646.
For the Pareto, h(5) = 0.06048 / (1 - 0.6371) = 0.1667.
For the LogNormal, h(5) = 0.03813 / (1 - 0.9172) = 0.4605.
(0.1)(0.1667) + (0.9)(0.4605) = 0.4311 ≠ 0.3646.
The hazard rate of the mixture is not equal to the mixture of the hazard rates.
Excess Ratios:
Here is how one calculates the Excess Ratio for this mixed distribution at 10, RG(10).
The numerator is the loss dollars excess of 10. For the Pareto this is the excess ratio of the Pareto at
10 times the mean for the Pareto: RA(10)EA[X]; for the LogNormal it is: RB(10)EB[X].
Thus the numerator of RG(10) is: pEA[X]RA(10) + (1 - p)EB[X]RB(10).

Exercise: What is the Excess Ratio at 10 of the Pareto Distribution with α = 2.5 and θ = 10?
[Solution: The excess ratio for the Pareto is: R(x) = {θ/(θ + x)}^(α-1).
R_A(10) = {10/(10 + 10)}^(2.5-1) = 0.3536.]
Exercise: What is the Excess Ratio at 10 of the LogNormal Distribution with parameters μ = 0.5 and σ = 0.8?
[Solution: The excess ratio for the LogNormal is:
R(x) = 1 - Φ[(ln(x) - μ - σ^2)/σ] - x {1 - Φ[(ln(x) - μ)/σ]} / exp[μ + σ^2/2].
R_B(10) = 1 - Φ[(ln 10 - 0.5 - 0.8^2)/0.8] - 10 {1 - Φ[(ln 10 - 0.5)/0.8]} / exp(0.5 + 0.8^2/2) = 0.0197.]
Thus for the mixed distribution the excess ratio at 10 is:
R_G(10) = {p E_A[X] R_A(10) + (1-p) E_B[X] R_B(10)} / E_H[X] =
{p E_A[X] R_A(10) + (1-p) E_B[X] R_B(10)} / {p E_A[X] + (1-p) E_B[X]} =
{(0.1)(6.667)(0.3536) + (0.9)(2.27)(0.0198)} / 2.71 = 10.2%.
At each limit, the Excess Ratio for the mixed distribution is a weighted average of the individual excess ratios, with weights: p E_A[X] = (0.1)(6.667), and (1-p) E_B[X] = (0.9)(2.27).
Here are the Excess Ratios computed at different limits:

Limit    Pareto          LogNormal       Mixed Pareto-LogNormal
         Excess Ratio    Excess Ratio    Excess Ratio
1        86.68%          59.96%          66.53%
2.5      71.55%          27.83%          38.59%
5        54.43%           9.64%          20.66%
10       35.36%           1.97%          10.18%
25       15.27%           0.10%           3.83%
50        6.80%           0.00%           1.68%

We note that the excess ratio for the LogNormal declines much more quickly than that of the Pareto. The excess ratio for the mixed distribution is somewhere in between.
In general, the Excess Ratio for the mixed distribution is a weighted average of the individual excess ratios, with weights p E_A[X] and (1-p) E_B[X]:
R_G(x) = {p E_A[X] R_A(x) + (1-p) E_B[X] R_B(x)} / E_H[X]
= {p E_A[X] R_A(x) + (1-p) E_B[X] R_B(x)} / {p E_A[X] + (1-p) E_B[X]}.
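A sketch computing the mixture's excess ratios at several limits from the component limited expected values (scipy's normal CDF is used for the LogNormal pieces; the organization is mine):

# Excess ratios of the 10%/90% Pareto-LogNormal mixture: R(x) = 1 - E[X ^ x] / E[X].
import numpy as np
from scipy.stats import norm

p = 0.10
alpha, theta = 2.5, 10.0
mu, sigma = 0.5, 0.8

def pareto_lev(x):
    return theta / (alpha - 1) * (1 - (theta / (theta + x)) ** (alpha - 1))

def logn_lev(x):
    return (np.exp(mu + sigma**2 / 2) * norm.cdf((np.log(x) - mu - sigma**2) / sigma)
            + x * (1 - norm.cdf((np.log(x) - mu) / sigma)))

mix_mean = p * theta / (alpha - 1) + (1 - p) * np.exp(mu + sigma**2 / 2)
for x in (1, 2.5, 5, 10, 25, 50):
    mix_lev = p * pareto_lev(x) + (1 - p) * logn_lev(x)
    print(x, round(1 - mix_lev / mix_mean, 4))   # 0.6653, 0.3859, 0.2066, 0.1018, ...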

Layers of Losses:
One can compute the percent of total dollars in a layer by taking the difference of the excess ratios.
For example, for the Pareto the layer from 5 to 10 represents: 54.43% - 35.36% = 19.07% of the
Paretos total dollars of loss. For the LogNormal the layer from 5 to 10 represents only:
9.64% - 1.97% = 7.67% of its total dollars of loss. Using the excess ratios computed for the mixed
distribution, for the mixed distribution the layer from 5 to 10 represents: 20.66% - 10.18% =
10.48% of its total dollars of loss.
One can divide the losses for a layer for the mixed distribution into those from the Pareto and those
from the LogNormal.
The contribution from the Pareto to the mixed distribution for the layer from 5 to 10 is:
(0.1)(6.667)(54.43% - 35.36%) = 0.1271.
The contribution from the LogNormal for the mixed distribution to the layer from 5 to 10 is:
(0.9)(2.270)(9.64% - 1.97%) = 0.1566.
The mixed distribution has losses in the layer from 5 to 10 of:
(2.71)(20.66% - 10.18%) = 0.284 = 0.1271 + 0.1566.
Thus for the layer from 5 to 10, about 45% of the losses for the mixed distribution come from the
Pareto while the remaining 55% come from the LogNormal.297
One can perform similar calculations for other layers:
Bottom      Top        Losses in Layer    Losses in Layer      Portion of Losses      Portion of Losses
of Layer    of Layer   Contributed        Contributed          in Layer Contributed   in Layer Contributed
                       by Pareto          by LogNormal         by Pareto              by LogNormal
0           1          0.0888             0.8180                9.8%                  90.2%
1           2.5        0.1008             0.6564               13.3%                  86.7%
2.5         5          0.1141             0.3716               23.5%                  76.5%
5           10         0.1272             0.1567               44.8%                  55.2%
10          25         0.1339             0.0382               77.8%                  22.2%
25          50         0.0565             0.0020               96.7%                   3.3%
50          ∞          0.0454             0.0001               99.8%                   0.2%

We note that for this mixed distribution the losses for lower layers come mainly from the LogNormal Distribution, while those for the higher layers come mainly from the Pareto Distribution. In that sense the LogNormal is modeling the behavior of the smaller claims, while the Pareto is modeling the behavior of the larger claims. This is typical of a mixture of models; a lighter-tailed distribution mostly models the behavior of the smaller losses, while a heavier-tailed distribution mostly models the behavior of the larger losses.

297 0.127 / 0.284 = 45%, and 0.157 / 0.284 = 55%.
Fitting to Data:
One can fit size of loss data to mixed models using the same techniques as for other size of loss distributions. However, due to the generally larger number of parameters there are often practical difficulties. For example, if one attempted to use maximum likelihood to fit a mixed Pareto-LogNormal Distribution to data, one would be fitting 5 parameters. Thus numerical algorithms would be searching through 5-dimensional space and might take a long time. Thus it is important in practical applications to have a good starting point.
Sometimes one fits each of the individual distributions to the given data and uses these results to help pick a starting value. Sometimes, keeping one or more of the parameters fixed while fitting the others will help to determine a good starting place. Often one can use a prior year's result or a result from a similar data set to help choose a starting value. Sometimes one just has to try a few different starting values before finding one that seems to work.
Sometimes, using numerical methods, one has problems getting p, the weight parameter, to stay between 0 and 1 as desired.
One could in such cases reparameterize by letting p = e^b / (e^b + 1), as in the sketch below.
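A minimal sketch of this reparameterization (the variable name b is just illustrative): whatever real value a numerical routine tries for b, the implied weight p stays strictly between 0 and 1.

import math

def weight_from_b(b):
    # Logistic transform: maps any real b to a weight p in (0, 1).
    return math.exp(b) / (math.exp(b) + 1.0)

for b in (-5.0, 0.0, 5.0):
    print(b, weight_from_b(b))  # about 0.0067, 0.5, 0.9933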
n-Point Mixtures:
In general, one can weight together any number of distributions, rather than just two.298
While I have illustrated two-point mixtures, there is no reason why one could not use three-point, four-point, etc. mixtures. The quantities of interest can be calculated in a manner parallel to that used here for the two-point distributions. Also, besides those situations where all the distributions are of different types, some or all of the individual distributions can be of the same type.
For example, the Distribution:
(0.2)(1 - e^(-x/10)) + (0.5)(1 - e^(-x/25)) + (0.3)(1 - e^(-x/100)) = 1 - 0.2e^(-x/10) - 0.5e^(-x/25) - 0.3e^(-x/100),
is a three-point mixture of Exponential Distributions, with means of 10, 25, and 100 respectively.
An n-point mixture of Exponential Distributions would have 2n - 1 parameters:
n means of Exponential Distributions and n - 1 weighting parameters.
For example, a three-point mixture of Exponential Distributions has 3 means and 2 weights, for a total of 5 parameters.
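As a quick illustration of working with such a mixture, here is a minimal Python sketch that computes the moments of the three-point Exponential mixture above as weighted averages of the component moments; the same values reappear in the problems and solutions later in this section.

# Minimal sketch: moments of the three-point Exponential mixture above.
weights = [0.2, 0.5, 0.3]
thetas = [10.0, 25.0, 100.0]

mean = sum(w * t for w, t in zip(weights, thetas))
second_moment = sum(w * 2.0 * t ** 2 for w, t in zip(weights, thetas))
variance = second_moment - mean ** 2

print(mean)                    # 44.5
print(second_moment)           # 6665.0
print(variance ** 0.5 / mean)  # coefficient of variation, about 1.54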

298 Of course one must be careful of introducing too many parameters. The principle of parsimony applies; one should use the minimum number of parameters necessary. One can always improve the fit to a given data set by adding parameters, but the resulting model is often less useful.


Variable Mixtures:
Variable-Component Mixture Distribution: a weighted average of an unknown number of distributions, F(x) = Σ wi Fi(x), with Σ wi = 1.299
Variable Mixture: a weighted average of an unknown number of distributions of the same family but with differing parameters, F(x) = Σ wi Fi(x), with each Fi of the same family and Σ wi = 1.
For example, a variable mixture of Exponentials would have:
F(x) = Σ wi (1 - exp[-x/θi]) = 1 - Σ wi exp[-x/θi], with Σ wi = 1.
The key difference from an n-point mixture of Exponentials is that in the variable mixture, the number of Exponentials weighted together is unknown and is a parameter to be determined.300
Variable-Component Mixture Distributions, and their special case Variable Mixtures, are called semi-parametric, since they share some of the properties of both parametric and nonparametric distributions.301

299 See Definition 4.6 in Loss Models.
300 For example, one can fit a variable mixture of Exponentials via maximum likelihood. See "Modeling Losses with the Mixed Exponential Distribution," by Clive L. Keatinge, PCAS 1999.
301 The Exponential and LogNormal are examples of parametric distributions. The empirical distribution function is an example of a nonparametric distribution.


Summary:
The mixture of models can be useful when more flexibility is desired to fit size of loss data.302 It has
been useful for a number of practical applications.303 One can perform either n-point mixtures or
continuous mixing.304 One can use the same techniques to mix together frequency models.
Sometimes the mixture of models is just a mathematical device with no physical significance.
However, sometimes the mixture directly models a feature of the real world.
For example, sometimes a mixture is useful when a population is divided into two
sub-populations such as smoker and nonsmoker.
Mixtures can also be useful when the data results from different perils.
For example, for Homeowners Insurance it might be useful to fit Theft, Wind, Fire, and Liability
losses each to a separate size of loss distribution. This is an example where one might weight
together 4 different distributions.
Mixtures will come up again in Buhlmann Credibility.305

302 See for example "Methods of Fitting Distributions to Insurance Loss Data," by Charles C. Hewitt and Benjamin Lefkowitz, PCAS 1979.
303 For example, Insurance Services Office used mixed Pareto-Pareto models to calculate Increased Limits Factors. ISO has switched to an n-point mixture of Exponential Distributions. See "Modeling Losses with the Mixed Exponential Distribution," PCAS 1999, by Clive Keatinge. The Massachusetts Workers' Compensation Rating Bureau has used mixed Pareto-Exponential models to calculate Excess Loss Factors. See "Workers' Compensation Excess Ratios, an Alternative Method," PCAS 1998, by Howard C. Mahler.
304 To be discussed in the next section.
305 See "Mahler's Guide to Buhlmann Credibility and Bayesian Analysis."

Problems:
Use the following information for the next 22 questions:
• V follows a Pareto Distribution, with parameters α = 4, θ = 10.
• W follows an Exponential Distribution: F(w) = 1 - e^(-w/0.8).
• Y is a two-point mixture of V and W, with 5% weight to the Pareto Distribution
  and 95% weight to the Exponential Distribution.

38.1 (1 point) For V, what is the chance of a claim greater than 2?


A. less than 49%
B. at least 49% but less than 50%
C. at least 50% but less than 51%
D. at least 51% but less than 52%
E. at least 52%
38.2 (1 point) What is the mean of V?
A. less than 3.1
B. at least 3.1 but less than 3.2
C. at least 3.2 but less than 3.3
D. at least 3.3 but less than 3.4
E. at least 3.4
38.3 (1 point) What is the second moment of V?
A. less than 34
B. at least 34 but less than 35
C. at least 35 but less than 36
D. at least 36 but less than 37
E. at least 37
38.4 (1 point) What is the coefficient of variation of V?
A. less than 1.3
B. at least 1.3 but less than 1.4
C. at least 1.4 but less than 1.5
D. at least 1.5 but less than 1.6
E. at least 1.6

38.5 (1 point) What is the third moment of V?


A. less than 1000
B. at least 1000 but less than 1001
C. at least 1001 but less than 1002
D. at least 1002 but less than 1003
E. at least 1003
38.6 (2 points) What is the skewness of V?
A. less than 7.0
B. at least 7.0 but less than 7.1
C. at least 7.1 but less than 7.2
D. at least 7.2 but less than 7.3
E. at least 7.3
38.7 (2 points) What is the excess ratio at 5 of V?
A. less than 28%
B. at least 28% but less than 29%
C. at least 29% but less than 30%
D. at least 30% but less than 31%
E. at least 31%
38.8 (1 point) For W, what is the chance of a claim greater than 2?
A. less than 8%
B. at least 8% but less than 9%
C. at least 9% but less than 10%
D. at least 10% but less than 11%
E. at least 11%
38.9 (1 point) What is the mean of W?
A. less than 0.5
B. at least 0.5 but less than 0.6
C. at least 0.6 but less than 0.7
D. at least 0.7 but less than 0.8
E. at least 0.8


38.10 (1 point) What is the second moment of W?


A. less than 1.3
B. at least 1.3 but less than 1.4
C. at least 1.4 but less than 1.5
D. at least 1.5 but less than 1.6
E. at least 1.6
38.11 (1 point) What is the coefficient of variation of W?
A. less than 0.7
B. at least 0.7 but less than 0.8
C. at least 0.8 but less than 0.9
D. at least 0.9 but less than 1.00
E. at least 1.0
38.12 (1 point) What is the third moment of W?
A. less than 2.7
B. at least 2.7 but less than 2.8
C. at least 2.8 but less than 2.9
D. at least 2.9 but less than 3.0
E. at least 3.0
38.13 (1 point) What is the skewness of W?
A. less than 1.8
B. at least 1.8 but less than 1.9
C. at least 1.9 but less than 2.0
D. at least 2.0 but less than 2.1
E. at least 2.1
38.14 (1 point) What is the Excess Ratio at 5 of W?
A. less than 0.16%
B. at least 0.16% but less than 0.18%
C. at least 0.18% but less than 0.20%
D. at least 0.20% but less than 0.22%
E. at least 0.22%
38.15 (1 point) For Y, what is the chance of a claim greater than 2?
A. less than 8%
B. at least 8% but less than 9%
C. at least 9% but less than 10%
D. at least 10% but less than 11%
E. at least 11%


38.16 (1 point) What is the mean of Y?


A. less than 0.9
B. at least 0.9 but less than 1.0
C. at least 1.0 but less than 1.1
D. at least 1.1 but less than 1.2
E. at least 1.2
38.17 (1 point) What is the second moment of Y?
A. less than 2.8
B. at least 2.8 but less than 2.9
C. at least 2.9 but less than 3.0
D. at least 3.0 but less than 3.1
E. at least 3.1
38.18 (1 point) What is the coefficient of variation of Y?
A. less than 1.4
B. at least 1.4 but less than 1.5
C. at least 1.5 but less than 1.6
D. at least 1.6 but less than 1.7
E. at least 1.7
38.19 (1 point) What is the third moment of Y?
A. 50
B. 51
C. 52
D. 53

E. 54

38.20 (2 points) What is the skewness of Y?


A. less than 14
B. at least 14 but less than 15
C. at least 15 but less than 16
D. at least 16 but less than 17
E. at least 17
38.21 (2 points) What is the excess ratio at 5 of Y?
A. less than 4%
B. at least 4% but less than 5%
C. at least 5% but less than 6%
D. at least 6% but less than 7%
E. at least 7%
38.22 (4 points) What is the mean excess loss at 2 of Y?
A. 0.8
B. 1.0
C. 1.2
D. 1.4
E. 1.6


Use the following information for the next three questions:


The random variable X has the density function:
f(x) = 0.4 exp(-x/1128)/1128 + 0.6 exp(-x/5915)/5915, 0 < x < ∞.
38.23 (2 points) Determine the variance of X.
(A) 21 million   (B) 23 million   (C) 25 million   (D) 27 million   (E) 29 million
38.24 (2 points) Determine E[X ∧ 2000].
(A) 1250   (B) 1300   (C) 1350   (D) 1400   (E) 1450

38.25 (1 point) Determine E[(X - 2000)+].
(A) 2600   (B) 2650   (C) 2700   (D) 2750   (E) 2800

38.26 (2 points) You are given the following:
The random variable X has a distribution that is a mixture of a Pareto distribution with parameters θ = 1000 and α = 1, and another Pareto distribution, but with parameters θ = 100 and α = 1.
The first Pareto is given a weight of 0.3 and the second Pareto a weight of 0.7.
Determine the 20th percentile of X.
A. Less than 15
B. At least 15, but less than 25
C. At least 25, but less than 35
D. At least 35, but less than 45
E. At least 45
Use the following information for the next two questions:
Medical losses are Poisson with λ = 2.
The size of medical losses is uniform from 0 to 2000.
Dental losses are Poisson with λ = 1.
The size of dental losses is uniform from 0 to 500.
A policy, with an ordinary deductible of 200, covers both medical and dental losses.
38.27 (2 points) Determine the average payment per loss for this policy.
(A) 570
(B) 580
(C) 590
(D) 600
(E) 610
38.28 (1 point) Determine the average payment per payment for this policy.
(A) 700
(B) 710
(C) 720
(D) 730
(E) 740

2013-4-2,

Loss Distributions, 38 N-Point Mixtures

HCM 10/8/12,

Use the following information for the next 14 questions:


F(x) = (0.2)(1 - e^(-x/10)) + (0.5)(1 - e^(-x/25)) + (0.3)(1 - e^(-x/100)).
38.29 (1 point) What is the probability that x is more than 15?
A. 56%
B. 58%
C. 60%
D. 62%
E. 64%
38.30 (1 point) What is the mean?
A. Less than 20
B. At least 20, but less than 30
C. At least 30, but less than 40
D. At least 40, but less than 50
E. At least 50
38.31 (3 points) What is the median?
A. Less than 19
B. At least 19, but less than 20
C. At least 20, but less than 21
D. At least 21, but less than 22
E. At least 22
38.32 (1 point) What is the mode?
(A) 0
(B) 10
(C) 25

(D) 100

(E) None of A, B, C, D

38.33 (1 point) What is the second moment?


A. Less than 3000
B. At least 3000, but less than 4000
C. At least 4000, but less than 5000
D. At least 5000, but less than 6000
E. At least 6000
38.34 (1 point) What is the coefficient of variation?
(A) 0.7   (B) 0.9   (C) 1.1   (D) 1.3   (E) 1.5
38.35 (2 points) What is the third moment?
A. Less than 1.0 million
B. At least 1.0 million, but less than 1.3 million
C. At least 1.3 million, but less than 1.6 million
D. At least 1.6 million, but less than 1.9 million
E. At least 1.9 million


38.36 (2 points) What is the skewness?


A. Less than 2.0
B. At least 2.0, but less than 2.5
C. At least 2.5, but less than 3.0
D. At least 3.0, but less than 3.5
E. At least 3.5
38.37 (2 points) What is the hazard rate at 50, h(50)?
A. Less than 0.020
B. At least 0.020, but less than 0.025
C. At least 0.025, but less than 0.030
D. At least 0.030, but less than 0.035
E. At least 0.035
38.38 (2 points) What is the Limited Expected Value at 20, E[X ∧ 20]?
A. Less than 11
B. At least 11, but less than 13
C. At least 13, but less than 15
D. At least 15, but less than 17
E. At least 17
38.39 (2 points) What is the Loss Elimination Ratio at 15?
A. Less than 26%
B. At least 26%, but less than 29%
C. At least 29%, but less than 32%
D. At least 32%, but less than 35%
E. At least 35%
38.40 (2 points) What is the Excess Ratio at 75?
A. Less than 26%
B. At least 26%, but less than 29%
C. At least 29%, but less than 32%
D. At least 32%, but less than 35%
E. At least 35%


38.41 (4 points) What is the Limited Second Moment at 30, E[(X ∧ 30)²]?
Hint: Use Theorem A.1 in Appendix A of Loss Models:
Γ(n; x) = 1 - Σ (from j = 0 to n-1) of x^j e^(-x) / j!, for n a positive integer.

A. Less than 430


B. At least 440, but less than 450
C. At least 450, but less than 460
D. At least 460, but less than 470
E. At least 470
38.42 (3 points) What is the Mean Excess Loss at 50?
A. Less than 78
B. At least 78, but less than 82
C. At least 82, but less than 86
D. At least 86, but less than 90
E. At least 90

38.43 (3 points) With the aid of a computer, graph a two-point mixture of a Gamma Distribution with α = 4 and θ = 3 and a Gamma Distribution with α = 2 and θ = 10, with 60% weight to the first distribution and 40% weight to the second distribution.
38.44 (2 points) 40% of lives follow DeMoivre's Law with ω = 80.
The other 60% of lives follow DeMoivre's Law with ω = 100.
A life is picked at random.
If the life survives to at least age 70, what is its expected age at death?
A. 80
B. 81
C. 82
D. 83
E. 84
38.45 (2 points) You are the consulting actuary to a group of venture capitalists financing a search for
pirate gold. Its a risky undertaking: with probability 0.80, no treasure will be found, and thus the
outcome is 0. The rewards are high: with probability 0.20 treasure will be found.
The outcome, if treasure is found, is uniformly distributed on [1000, 5000].
Calculate the variance of the distribution of outcomes.
(A) 1.3 million
(B) 1.4 million
(C) 1.5 million
(D) 1.6 million
(E) 1.7 million
38.46 (3 points) With the aid of a computer, graph a two-point mixture of a Gamma Distribution with α = 4 and θ = 3 and a Gamma Distribution with α = 6 and θ = 10, with 30% weight to the first distribution and 70% weight to the second distribution.


38.47 (3 points) The distribution of a loss, X, is a two-point mixture:
(i) With probability 0.7, X has an Exponential distribution with θ = 100.
(ii) With probability 0.3, X has an Exponential distribution with θ = 200.
If a loss is of size greater than 50, what is its expected size?
A. 180
B. 185
C. 190
D. 195
E. 200
38.48 (3 points) In 2002 losses follow the following density:
f(x) = 0.7 exp(-x/1000)/1000 + 0.3 exp(-x/5000)/5000, 0 < x < ∞.
Losses uniformly increase by 8% between 2002 and 2004.
In 2004 a policy has a 3000 maximum covered loss.
In 2004 what is the average payment per loss?
A. 1320
B. 1340
C. 1360
D. 1380
E. 1400
Use the following information for the next 3 questions:
Risk Type    Number of Risks    Size of Loss Distribution
I            600                Single Parameter Pareto, θ = 10, α = 4
II           400                Single Parameter Pareto, θ = 10, α = 3

38.49 (2 points) You independently simulate a single loss for each risk.
Let S be the sum of these 1000 amounts. You repeat this process many times.
What is the variance of S?
A. Less than 42,000
B. At least 42,000, but less than 43,000
C. At least 43,000, but less than 44,000
D. At least 44,000, but less than 45,000
E. At least 45,000
38.50 (2 points) A risk is selected at random from one of the 1000 risks.
You simulate a single loss for this risk.
This risk is replaced, and a new risk is selected at random from one of the 1000 risks.
You simulate a single loss for this new risk.
You repeat this process many times, each time picking a new risk at random.
What is the variance of the outcomes?
A. 42
B. 43
C. 44
D. 45
E. 46
38.51 (2 points) A risk is selected at random from one of the 1000 risks.
You simulate a single loss for this risk. You then simulate another loss for this same risk.
You repeat this process many times. What is the expected variance of the outcomes?
A. 42
B. 43
C. 44
D. 45
E. 46

2013-4-2,

Loss Distributions, 38 N-Point Mixtures

HCM 10/8/12,

Page 751

Use the following information for the next two questions:

Bob is an overworked underwriter.


Applications arrive at his desk.
Each application has a 1/3 chance of being a bad risk and a 2/3 chance of being a good risk.
Since Bob is overworked, each time he gets an application he flips a fair coin.
If it comes up heads, he accepts the application without looking at it.
If the coin comes up tails, he accepts the application if and only if it is a good risk.
The expected profit on a good risk is 300 with variance 10,000.
The expected profit on a bad risk is -100 with variance 90,000.
38.52 (2 points) Calculate the variance of the profit per applicant.
A. 50,000
B. 51,000
C. 52,000
D. 53,000
E. 54,000
38.53 (2 points) Calculate the variance of the profit per applicant that Bob accepts.
A. Less than 50,000
B. At least 50,000, but less than 51,000
C. At least 51,000, but less than 52,000
D. At least 52,000, but less than 53,000
E. At least 53,000

38.54 (4 points) You are given the following information for Homeowners Insurance:

10% of losses are due to Wind.


30% of losses are due to Fire.
20% of losses are due to Liability.
40% of losses are due to All Other Perils.
Losses due to Wind follow a LogNormal distribution with μ = 10 and σ = 0.7.
Losses due to Fire follow a Gamma distribution with α = 2 and θ = 10,000.
Losses due to Liability follow a Pareto distribution with α = 5 and θ = 200,000.
Losses due to All Other Perils follow an Exponential distribution with θ = 5,000.
Determine the standard deviation of the size of loss for Homeowners Insurance.
A. 15,000
B. 20,000
C. 25,000
D. 30,000
E. 35,000


Use the following information for the next 3 questions:


X follows a two-point mixture of LogNormal Distributions.
The first LogNormal is given weight 65%, and has parameters μ = 8 and σ = 0.5.
The second LogNormal is given weight 35%, and has parameters μ = 9 and σ = 0.3.
38.55 (1 point) Determine E[X].
A. Less than 4400
B. At least 4400, but less than 4600
C. At least 4600, but less than 4800
D. At least 4800, but less than 5000
E. At least 5000
38.56 (1 point) Determine E[X2 ].
A. Less than 35 million
B. At least 35 million, but less than 40 million
C. At least 40 million, but less than 45 million
D. At least 45 million, but less than 50 million
E. At least 50 million
38.57 (1 point) Determine 1 / E[1/X].
A. Less than 3200
B. At least 3200, but less than 3400
C. At least 3400, but less than 3600
D. At least 3600, but less than 3800
E. At least 3800

38.58 (4 points) On an exam, the grades of Good students are distributed via a Beta Distribution with a = 6, b = 2 and θ = 100.
On this exam, the grades of Bad students are distributed via a Beta Distribution with a = 3, b = 2 and θ = 100.
3/4 of students are good, while 1/4 of students are bad.
A grade of 65 or more passes.
What is the expected grade of a student who fails this exam?
A. Less than 50
B. At least 50, but less than 51
C. At least 51, but less than 52
D. At least 52, but less than 53
E. At least 53


Use the following information for the next four questions:


R is the annual return on a stock.
At random, half of the time R is a random draw from a Normal Distribution with μ = 8% and σ = 20%.
The other half of the time, R is a random draw from a Normal Distribution with μ = 11% and σ = 30%.
Hint: The third moment of a Normal Distribution is: μ³ + 3μσ².
The fourth moment of a Normal Distribution is: μ⁴ + 6μ²σ² + 3σ⁴.
38.59 (1 point) What is the mean of R?
A. 8.5%
B. 9%
C. 9.5%
D. 10.0%

E. 10.5%

38.60 (2 points) What is the standard deviation of R?


A. Less than 25%
B. At least 25%, but less than 27%
C. At least 27%, but less than 29%
D. At least 29%, but less than 31%
E. 31% or more
38.61 (3 points) What is the skewness of R?
A. Less than -0.10
B. At least -0.10, but less than -0.05
C. At least -0.05, but less than 0.05
D. At least 0.05, but less than 0.10
E. 0.10 or more
38.62 (4 points) What is the kurtosis of R?
A. 3.0
B. 3.2
C. 3.4
D. 3.6

E. 3.8

38.63 (5 points) For a mixture of two distributions with the same coefficient of variation, compare the
coefficient of variation of the mixture with that of the components.
38.64 (2, 5/85, Q.15) (1.5 points) X is a random variable with density function
f(x) = 1.4e^(-2x) + 0.9e^(-3x) for x ≥ 0. Determine E(X).
A. 9/20
B. 5/6
C. 1
D. 230/126

E. 23/10

38.65 (160, 11/86, Q.1) (2.1 points) Three populations have constant forces of mortality 0.01,
0.02, and 0.04, respectively. For a group of newborns, one-third from each population, determine
the complete expectation of future lifetime at age 50.
(A) 8.3
(B) 24.3
(C) 50.0
(D) 58.3
(E) 74.3


38.66 (160, 11/86, Q.4) (2.1 points) For a certain population, you are given:
(i) At any point in time, equal numbers of males and females are born.
(ii) The mean and variance of the lifetime distribution for males at birth are 60 and 200, respectively.
(iii) The mean and variance of the lifetime distribution for females at birth are 80 and 300,
respectively.
Determine the variance of the lifetime distribution for the population.
(A) 150
(B) 200
(C) 250
(D) 300
(E) 350
38.67 (4B, 5/93, Q.20) (1 point) Which of the following statements are true?
1.
With an n-point mixture of models, the large number of parameters that need
to be estimated may be a problem.
2.
Starting with several distributions, the two-point mixture of models leads to
many more pairs of distributions.
3. A potential computational problem with the mixture of models is that estimation of p in the equation F(x) = pF1(x) + (1-p)F2(x) via iterative numerical techniques may lead to a value of p outside the interval from 0 to 1.
A. 1   B. 1, 2   C. 1, 3   D. 2, 3   E. 1, 2, 3

38.68 (4B, 11/97, Q.3) (2 points) You are given the following:
The random variable X has a distribution that is a mixture of a Burr distribution,
F(x) = 1 - {1 / (1 + (x/θ)^γ)}^α, with parameters θ = √1000, α = 1 and γ = 2,
and a Pareto distribution, with parameters θ = 1,000 and α = 1.
Each of the two distributions in the mixture has equal weight.
Determine the median of X.
A. Less than 5
B. At least 5, but less than 50
C. At least 50, but less than 500
D. At least 500, but less than 5,000
E. At least 5,000


38.69 (4B, 5/98, Q.16) (2 points) You are given the following:

X1 is a mixture of a random variable with a uniform distribution on [0,2]

and a random variable with a uniform distribution on [1, 3].


(Each distribution in the mixture has positive weight.)
X2 is the sum of a random variable with a uniform distribution on [0,2]

and a random variable with a uniform distribution on [1, 3].


X3 is a random variable that has a normal distribution that is right censored at 1.

Match X1 , X2 , and X3 with the following descriptions:


1. Continuous distribution function and continuous density function
2. Continuous distribution function and discontinuous density function
3. Discontinuous distribution function
A. X1 :1, X2 :2, X3 :3
B. X1 :1, X2 :3, X3 :2
C. X1 :2, X2 :1, X3 :3
D. X1 :2, X2 :3, X3 :1

E. X1 :3, X2 :1, X3 :2

38.70 (4B, 11/98 Q.8) (2 points) You are given the following:

A portfolio consists of 75 liability risks and 25 property risks.

The risks have identical claim count distributions.

Loss sizes for liability risks follow a Pareto distribution, with parameters θ = 300 and α = 4.
Loss sizes for property risks follow a Pareto distribution, with parameters θ = 1,000 and α = 3.

Determine the variance of the claim size distribution for this portfolio for a single claim.
A. Less than 150,000
B. At least 150,000, but less than 225,000
C. At least 225,000, but less than 300,000
D. At least 300,000, but less than 375,000
E. At least 375,000
38.71 (Course 151 Sample Exam #2, Q.5) (0.8 points)
You are given S = S1 + S2 , where S1 and S2 are independent and have compound Poisson
distributions with the following characteristics:
(i) λ1 = 2 and λ2 = 3
(ii)
x    p1(x)    p2(x)
1    0.6      0.1
2    0.4      0.3
3    0.0      0.5
4    0.0      0.1
Determine the variance of individual claim amounts for S.
(A) 0.83
(B) 0.87
(C) 0.91
(D) 0.95
(E) 0.99


38.72 (Course 1 Sample Exam. Q.24) (1.9 points) An automobile insurance company divides
its policyholders into two groups: good drivers and bad drivers.
For the good drivers, the amount of an average claim is 1400, with a variance of 40,000.
For the bad drivers, the amount of an average claim is 2000, with a variance of 250,000.
Sixty percent of the policyholders are classified as good drivers.
Calculate the variance of the amount of a claim for a policyholder.
A. 124,000 B. 145,000 C. 166,000 D. 210,400 E. 235,000
38.73 (Course 3 Sample Exam, Q.10) An insurance company is negotiating to settle a liability
claim. If a settlement is not reached, the claim will be decided in the courts 3 years from now.
You are given:
There is a 50% probability that the courts will require the insurance company to make
a payment. The amount of the payment, if there is one, has a lognormal distribution
with mean 10 and standard deviation 20.
In either case, if the claim is not settled now, the insurance company will have to pay
5 in legal expenses, which will be paid when the claim is decided, 3 years from now.
The most that the insurance company is willing to pay to settle the claim is the
expected present value of the claim and legal expenses plus 0.02 times the variance
of the present value.
Present values are calculated using i = 0.04.
Calculate the insurance company's maximum settlement value for this claim.
A. 8.89
B. 9.93
C. 12.45
D. 12.89
E. 13.53
38.74 (IOA 101, 9/00, Q.8) (4.5 points) Claims on a certain class of policy are classified as being
of two types, I and II.
Past experience has shown that:
25% of claims are of type I and 75% are of type II;
Type I claim amounts have mean 500 and standard deviation 100;
Type II claim amounts have mean 300 and standard deviation 70.
Calculate the mean and the standard deviation of the claim amounts on this class of policy.
38.75 (1, 5/01, Q.17) (1.9 points) An auto insurance company insures an automobile worth
15,000 for one year under a policy with a 1,000 deductible. During the policy year there is a 0.04
chance of partial damage to the car and a 0.02 chance of a total loss of the car.
If there is partial damage to the car, the amount X of damage (in thousands) follows a distribution with
density function f(x) = 0.5003 e^(-x/2), 0 < x < 15.
What is the expected claim payment?
(A) 320
(B) 328
(C) 352
(D) 380

(E) 540


38.76 (3, 11/01, Q.28 & 2009 Sample Q.100) (2.5 points) The unlimited severity distribution for
claim amounts under an auto liability insurance policy is given by the cumulative distribution:
F(x) = 1 - 0.8e^(-0.02x) - 0.2e^(-0.001x), x ≥ 0.
The insurance policy pays amounts up to a limit of 1000 per claim.
Calculate the expected payment under this policy for one claim.
(A) 57
(B) 108
(C) 166
(D) 205
(E) 240
38.77 (4, 11/02, Q.13) (2.5 points) Losses come from an equally weighted mixture of an
exponential distribution with mean m1 , and an exponential distribution with mean m2 .
Determine the least upper bound for the coefficient of variation of this distribution.
(A) 1   (B) √2   (C) √3   (D) 2   (E) √5

38.78 (SOA3, 11/03, Q.18) (2.5 points) A population has 30% who are smokers with a constant
force of mortality 0.2 and 70% who are non-smokers with a constant force of mortality 0.1.
Calculate the 75th percentile of the distribution of the future lifetime of an individual selected at
random from this population.
(A) 10.7
(B) 11.0
(C) 11.2
(D) 11.6
(E) 11.8
38.79 (CAS3, 11/04, Q.28) (2.5 points) A large retailer of personal computers issues a Warranty
contract with each computer that it sells. The warranty covers any cost to repair or replace a defective
computer within the first 30 days of purchase. 40% of all claims are easily resolved with minor
technical help and do not involve any cost to replace or repair.
If a claim involves some cost to replace or repair, the claim size is distributed as a Weibull with
parameters τ = 1/2 and θ = 30.
Which of the following statements are true?
1. The expected cost of a claim is $60.
2. The survival function at $60 is 0.243.
3. The hazard rate at $60 is 0.012.
A. 1 only.
B. 2 only.
C. 3 only.
D. 1 and 2 only.

E. 2 and 3 only.


38.80 (CAS3, 11/04, Q.29) (2.5 points) High-Roller Insurance Company insures the cost of
injuries to the employees of ACME Dynamite Manufacturing, Inc.

30% of injuries are "Fatal" and the rest are "Permanent Total" (PT).
There are no other injury types.
Fatal injuries follow a log-logistic distribution with θ = 400 and γ = 2.
PT injuries follow a log-logistic distribution with θ = 600 and γ = 2.
There is a $750 deductible per injury.
Calculate the probability that an injury will result in a claim to High-Roller.
A. Less than 30%
B. At least 30%, but less than 35%
C. At least 35%, but less than 40%
D. At least 40%, but less than 45%
E. 45% or more
38.81 (SOA M, 5/05, Q.34 & 2009 Sample Q.169) (2.5 points)
The distribution of a loss, X, is a two-point mixture:
(i) With probability 0.8, X has a two-parameter Pareto distribution with α = 2 and θ = 100.
(ii) With probability 0.2, X has a two-parameter Pareto distribution with α = 4 and θ = 3000.
Calculate Pr(X ≤ 200).
(A) 0.76   (B) 0.79   (C) 0.82   (D) 0.85   (E) 0.88

38.82 (CAS3, 11/05, Q.32) (2.5 points) For a certain insurance company, 60% of claims have a
normal distribution with mean 5,000 and variance 1,000,000.
The remaining 40% have a normal distribution with mean 4,000 and variance 1,000,000.
Calculate the probability that a randomly selected claim exceeds 6,000.
A Less than 0.10
B. At least 0.10, but less than 0.15
C. At least 0.15, but less than 0.20
D. At least 0.20, but less than 0.25
E. At least 0.25
38.83 (SOA M, 11/05, Q.32) (2.5 points) For a group of lives aged 30, containing an equal
number of smokers and non-smokers, you are given:
(i) For non-smokers, μn(x) = 0.08, x ≥ 30.
(ii) For smokers, μs(x) = 0.16, x ≥ 30.
Calculate q80 for a life randomly selected from those surviving to age 80.
(A) 0.078   (B) 0.086   (C) 0.095   (D) 0.104   (E) 0.112


38.84 (CAS3, 11/06, Q.20) (2.5 points)


An insurance company sells hospitalization reimbursement insurance. You are given:

Benefit payment for a standard hospital stay follows a lognormal distribution with μ = 7 and σ = 2.
Benefit payment for a hospital stay due to an accident is twice as much as a standard benefit.
25% of all hospitalizations are for accidental causes.
Calculate the probability that a benefit payment will exceed $15,000.
A. Less than 0.12
B. At least 0.12, but less than 0.14
C. At least 0.14, but less than 0.16
D. At least 0.16, but less than 0.18
E. At least 0.18
38.85 (IOA, CT8, 4/10, Q.9) (7.5 points) An asset is worth 100 at the start of the year and is
funded by a senior loan and a junior loan of 50 each.
The loans are due to be repaid at the end of the year;
the senior one with annual interest at 6% and the junior one with annual interest at 8%.
Interest is paid on the loans only if the asset sustains no losses.
Any losses of up to 50 sustained by the asset reduce the amount returned to the investor in the
junior loan by the amount of the loss. Any losses of more than 50 mean that the investor in the junior
loan gets 0 and the amount returned to the investor in the senior loan is reduced by the excess of
the loss over 50.
The probability that the asset sustains a loss is 0.25. The size of a loss, L, if there is one, follows a
uniform distribution between 0 and 100.
(i) ( 6 points)
(a) Calculate the variance of the distribution of amounts paid back to the investors in the junior loan.
(b) Calculate the variance of the distribution of amounts paid back to the investors in the senior loan.
(ii) (1.5 points) Calculate the probabilities for the investors in the junior and senior loans, that they get
paid back less than the original amounts of their loans.


38.86 (IOA CT8, 9/10, Q.1) (5.25 points) An investor holds an asset that produces a random rate
of return, R, over the course of a year.
The distribution of this rate of return is a mixture of Normal distributions:
R has a Normal distribution with a mean of 0% and standard deviation of 10% with probability 0.8
and a Normal distribution with a mean of 30% and a standard deviation of 10% with a probability of
0.2.
S is the normally distributed random rate of return on another asset that has the same mean and
variance as R.
(i) (2.25 points) Calculate the mean and variance of R.
(ii) (3 points) Calculate the following probabilities for R and for S:
(a) probability of a rate of return less than 0%.
(b) probability of a rate of return less than -10%.


Solutions to Problems:
38.1. A. The chance of a claim greater than 2 is: 1 - F(2) = {θ/(θ+2)}^α = (10/12)^4 = 0.4823.
38.2. D. The mean of a Pareto is: θ/(α-1) = 10/3 = 3.333.
38.3. A. The second moment of a Pareto is: 2θ²/{(α-1)(α-2)} = 200/6 = 33.333.
38.4. C. The variance is: 33.333 - 3.333² = 22.22. Thus the CV = √22.22 / 3.333 = 1.414.
Comment: For the Pareto the CV is: √(α/(α-2)) = √(4/2) = √2 = 1.414.
38.5. B. The third moment of a Pareto is: 6θ³/{(α-1)(α-2)(α-3)} = 6000/6 = 1000.
38.6. B. Skewness = {E[X³] - 3 E[X] E[X²] + 2 E[X]³} / STDDEV³ =
{1000 - (3)(3.333)(33.333) + (2)(3.333)³} / (22.22)^1.5 = 7.07.
Comment: For the Pareto, Skewness = 2{(α+1)/(α-3)} √((α-2)/α) = (2){(5)/(1)} √(2/4) = 7.07.
38.7. C. The excess ratio for the Pareto is: {θ/(θ+x)}^(α-1) = (10/15)^3 = 0.2963.
38.8. B. The chance of a claim greater than 2 is: 1 - F(2) = e^(-2/θ) = e^(-2/0.8) = 0.0821.
38.9. E. The mean of the Exponential Distribution is: θ = 0.8.
38.10. A. The second moment of the Exponential is: 2θ² = 1.28.
38.11. E. The variance = 1.28 - 0.8² = 0.64. Thus the standard deviation = 0.8.
The CV = standard deviation divided by the mean = 0.8 / 0.8 = 1.
38.12. E. The third moment of the exponential is 6θ³ = 3.072.

38.13. D. Skewness = {E[X³] - 3 E[X] E[X²] + 2 E[X]³} / STDDEV³ =
{3.072 - (3)(0.8)(1.28) + (2)(0.8)³} / (0.64)^1.5 = 2.
Comment: The C.V. of the exponential distribution is always 1, while the skewness is always 2.
38.14. C. The excess ratio for the Exponential is: e^(-x/θ) = e^(-5/0.8) = 0.00193.
38.15. D. The chance of a claim greater than 2 is: (0.05)(0.4823) + (0.95)(0.0821) = 0.102.
38.16. B. The mean is a weighted average of the individual means:
(.05)(3.333 ) + (.95)(.8) = 0.9267.
38.17. B. The second moment is a weighted average of the individual second moments:
(.05)(33.333 ) + (.95)(1.28) = 2.883.
38.18. C. The variance is: 2.883 - 0.927² = 2.024. Thus the CV = √2.024 / 0.927 = 1.535.

38.19. D. The third moment is a weighted average of the individual third moments:
(.05)(1000) + (.95)( 3.072) = 52.918.

38.20. D. Skewness = {E[X³] - 3 E[X] E[X²] + 2 E[X]³} / STDDEV³ =
{52.918 - (3)(0.927)(2.883) + (2)(0.927)³} / (2.024)^1.5 = 16.15.


38.21. C. The excess ratio for the mixed distribution is the weighted average of the individual
excess ratios, using as the weights the means times p or 1-p:
{(.05)(3.333)( .2963)+ (.95)(.8)(.00193)} / {(.05)(3.333)+ (.95)(.8)} = 0.05085 / 0.9267 = 0.0549.
Comment: Almost certainly beyond what will be asked on the exam.


38.22. E. For the Pareto, S(2) = {θ/(θ+2)}^α = (10/12)^4 = 0.4823.
For the Exponential, S(2) = exp[-2/θ] = exp[-2/0.8] = 0.0821.
For the mixture, S(2) = (5%)(0.4823) + (95%)(0.0821) = 0.1021.
For the Pareto, the expected losses excess of 2 are:
E[X] - E[X ∧ 2] = {θ/(α-1)} {θ/(θ+2)}^(α-1) = (10/3)(10/12)^3 = 1.9290.
For the Exponential, the expected losses excess of 2 are:
E[X] - E[X ∧ 2] = θ - θ{1 - exp[-2/θ]} = θ exp[-2/θ] = 0.8 exp[-2/0.8] = 0.0657.
For the mixture, the expected losses excess of 2 are: (5%)(1.9290) + (95%)(0.0657) = 0.1589.
For the mixture, e(2) = 0.1589/0.1021 = 1.556.
Comment: For the Exponential, e(2) = θ = 0.8.
For the Pareto, e(2) = (2 + θ)/(α - 1) = (2 + 10)/(4 - 1) = 4.
The mean excess loss of the mixture is not equal to the mixture of the mean excess losses:
(5%)(4) + (95%)(0.8) = 0.96 ≠ 1.556.
[In the original, a graph of the mean excess loss of the mixture, as a function of the limit, appears here.]
38.23. D. The mean of each Exponential is: θ.
The second moment of each Exponential is: 2θ².
The mean and second moment of the mixed distribution are the weighted average of those of the individual distributions. Therefore, the mixed distribution has mean:
0.4 θ1 + 0.6 θ2 = (0.4)(1128) + (0.6)(5915) = 4000, and
second moment: 2(0.4 θ1² + 0.6 θ2²) = 2{(0.4)(1128²) + (0.6)(5915²)} = 43,002,577.
Variance = 43,002,577 - 4000² = 27.0 million.
38.24. D. For the Exponential Distribution, E[X ∧ x] = θ(1 - e^(-x/θ)).
The given distribution is a 40%-60% mixture of two Exponentials, with means 1128 and 5915.
Therefore, the limited expected value at 2000 is a weighted average of the LEVs for the individual Exponentials.
E[X ∧ 2000] = (0.4){(1128)(1 - e^(-2000/1128))} + (0.6){(5915)(1 - e^(-2000/5915))} = 1393.
Comment: Similar to 3, 11/01, Q.28.
38.25. A. E[(X - 2000)+] = E[X] - E[X ∧ 2000] = 4000 - 1393 = 2607.
Alternately, for each Exponential, E[(X - 2000)+] = ∫ from 2000 to ∞ of S(x) dx = ∫ from 2000 to ∞ of e^(-x/θ) dx = θ e^(-2000/θ).
For θ = 1128, E[(X - 2000)+] = 1128 e^(-2000/1128) = 191.6.
For θ = 5915, E[(X - 2000)+] = 5915 e^(-2000/5915) = 4218.0.
For the mixture, E[(X - 2000)+] = (0.4)(191.6) + (0.6)(4218.0) = 2607.
38.26. D. Let F be the mixed distribution, then:
F(x) = (.3){1-1000/(1000+x)} + (.7){1-100/(100+x)}
The 20th percentile is that x such that F(x) = .2.
Thus the 20th percentile of the mixed distribution is the value of x such that:
.2 = (.3){1-1000/(1000+x)} + (.7){1-100/(100+x)}.
Thus 0.8 = 300/(1000+x) + 70/(100+x). Thus 0.8x² + 510x - 20000 = 0.
Thus x = {-510 + √(510² + (4)(0.8)(20,000))} / {(2)(0.8)} = 37.1.

38.27. A. E[X] = (2/3)(1000) + (1/3)(250) = 750.
E[X ∧ 200] = (2/3){(0.1)(100) + (0.9)(200)} + (1/3){(0.4)(100) + (0.6)(200)} = (2/3)(190) + (1/3)(160) = 180.
E[(X - 200)+] = E[X] - E[X ∧ 200] = 750 - 180 = 570.
Alternately, for each uniform from 0 to b, E[(X - 200)+] = ∫ from 200 to b of S(x) dx = ∫ from 200 to b of (1 - x/b) dx = (b - 200) - (b/2 - 20000/b) = b/2 + 20000/b - 200.
For b = 2000, E[(X - 200)+] = 810. For b = 500, E[(X - 200)+] = 90.
For the mixture, E[(X - 200)+] = (2/3)(810) + (1/3)(90) = 570.
Comment: Mathematically the same as a mixture of two uniform distributions, with weight 2/(2 + 1) = 2/3 to the first uniform distribution.


38.28. B. For the mixture, S(200) = (2/3)(.9) + (1/3)(.6) = .8.


Average payment per payment = E[(X - 200)+]/S(200) = 570/.8 = 712.5.
Alternately, nonzero payments for medical have mean frequency of: (.9)(2) = 1.8.
Nonzero payments for medical are uniform from 0 to 1800 with mean 900.
Nonzero payments for dental have mean frequency of: (.6)(1) = .6.
Nonzero payments for dental are uniform from 0 to 300 with mean 150.
{(1.8)(900) + (.6)(150)} / (1.8 + .6) = 712.5.
38.29. B. S(x) = 1 - F(x) = 0.2 e^(-x/10) + 0.5 e^(-x/25) + 0.3 e^(-x/100).
S(15) = 0.2 e^(-15/10) + 0.5 e^(-15/25) + 0.3 e^(-15/100) = 57.7%.
38.30. D. The mean of the mixed distribution is a weighted average of the mean of each
Exponential Distribution: (0.2)(10) + (0.5)(25) + (0.3)(100) = 44.5.
Comment: This is a 3-point mixture of Exponential Distributions, with means of 10, 25, and 100
respectively.
38.31. B. Set F(x) = 1 - 0.2e^(-x/10) - 0.5e^(-x/25) - 0.3e^(-x/100) = 0.5.
One can calculate the distribution function at the endpoints of the intervals and determine that the median is between 19 and 20. (Solving numerically, median = 19.81.)
x     1 - Exp(-x/10)    1 - Exp(-x/25)    1 - Exp(-x/100)    Mixed Distribution
19    0.850             0.532             0.173              0.488
20    0.865             0.551             0.181              0.503
21    0.878             0.568             0.189              0.516
22    0.889             0.585             0.197              0.530
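As a cross-check, the median can also be found with any numerical root-finder; a minimal sketch using scipy's Brent method:

from math import exp
from scipy.optimize import brentq

def F(x):
    # Distribution function of the three-point Exponential mixture.
    return 1 - 0.2 * exp(-x / 10) - 0.5 * exp(-x / 25) - 0.3 * exp(-x / 100)

# The table above brackets the median between 19 and 20.
print(brentq(lambda x: F(x) - 0.5, 19, 20))  # about 19.81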


38.32. A. The mode of every Exponential Distribution is 0, thus so is that of the Mixed Exponential.
Comment: If the individual distributions of a mixture have different modes, then in general it would be difficult to calculate the mode algebraically. One could do so by graphing the mixed density and seeing where it reaches a maximum. [In the original, a graph of the mixed density appears here, declining from about 0.04 at x = 0 as x increases to 140.]
38.33. E. Each Exponential Distribution has a second moment of 2θ².
The second moment of the mixture is a weighted average of the individual second moments:
(0.2)(2)(10²) + (0.5)(2)(25²) + (0.3)(2)(100²) = 6665.
38.34. E. Variance = 6665 - 44.5² = 4684.75. CV = √4684.75 / 44.5 = 1.54.
Comment: Note that while the CV of every Exponential is 1, the CV of a mixed exponential is always greater than one.
38.35. D. Each Exponential Distribution has a third moment of 6θ³.
The third moment of the mixture is a weighted average of the individual third moments:
(0.2)(6)(10³) + (0.5)(6)(25³) + (0.3)(6)(100³) = 1,848,078.
38.36. E. Skewness = {1,848,078 - (3)(44.50)(6665) + (2)(44.5³)} / 4684.75^1.5 = 3.54.
Comment: Note that while the skewness of every Exponential is 2, the skewness of a mixed exponential is always greater than 2.
38.37. A. f(x) = (0.2)(e^(-x/10)/10) + (0.5)(e^(-x/25)/25) + (0.3)(e^(-x/100)/100). f(50) = 0.00466.
S(x) = 0.2e^(-x/10) + 0.5e^(-x/25) + 0.3e^(-x/100). S(50) = 0.25097.
h(50) = f(50)/S(50) = 0.00466 / 0.25097 = 0.0186.
Comment: Note that while the hazard rate of each Exponential Distribution is independent of x, that is not true for the Mixed Exponential Distribution.


38.38. C. For each individual Exponential, the Limited Expected Value is: θ(1 - e^(-x/θ)).
The Limited Expected Value of the mixture is a weighted average of the individual Limited Expected Values: (0.2)(10)(1 - e^(-20/10)) + (0.5)(25)(1 - e^(-20/25)) + (0.3)(100)(1 - e^(-20/100)) = 14.05.
38.39. A. E[X ∧ 15] = (0.2)(10)(1 - e^(-15/10)) + (0.5)(25)(1 - e^(-15/25)) + (0.3)(100)(1 - e^(-15/100)) = 11.37.
E[X] = 44.5. LER(15) = E[X ∧ 15] / E[X] = 11.37 / 44.5 = 25.6%.
38.40. D. E[X ∧ 75] = (0.2)(10)(1 - e^(-75/10)) + (0.5)(25)(1 - e^(-75/25)) + (0.3)(100)(1 - e^(-75/100)) = 29.71.
E[X] = 44.5. R(75) = 1 - E[X ∧ 75] / E[X] = 1 - 29.71 / 44.5 = 33.2%.
38.41. D. For each individual Exponential, the Limited Second Moment is:
2θ² Γ(3; x/θ) + x² e^(-x/θ). Using Theorem A.1 in Appendix A of Loss Models,
Γ(3; x/θ) = 1 - e^(-x/θ) - (x/θ)e^(-x/θ) - (x/θ)² e^(-x/θ)/2.
Thus E[(X ∧ x)²] = 2θ² Γ(3; x/θ) + x² e^(-x/θ) = 2θ² - 2θ²e^(-x/θ) - 2θx e^(-x/θ) = 2θ{θ - (θ+x)e^(-x/θ)}.
For θ = 10, E[(X ∧ 30)²] = 20{10 - 40e^(-3)} = 160.17.
For θ = 25, E[(X ∧ 30)²] = 50{25 - 55e^(-1.2)} = 421.72.
For θ = 100, E[(X ∧ 30)²] = 200{100 - 130e^(-0.3)} = 738.72.
The Limited Second Moment of the mixture is a weighted average of the individual Limited Second Moments: (0.2)(160.17) + (0.5)(421.72) + (0.3)(738.72) = 464.51.
Comment: Difficult. One can compute the integral in the limited second moment of the Exponential Distribution by repeated use of integration by parts.
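As a numerical check of the closed form used above, a minimal sketch: the limited second moment of each Exponential can also be computed by quadrature.

from math import exp
from scipy.integrate import quad

def limited_second_moment(theta, d):
    # E[(X ∧ d)^2] = integral of t^2 f(t) from 0 to d, plus d^2 S(d).
    integral, _ = quad(lambda t: t ** 2 * exp(-t / theta) / theta, 0, d)
    return integral + d ** 2 * exp(-d / theta)

for theta in (10, 25, 100):
    closed_form = 2 * theta * (theta - (theta + 30) * exp(-30 / theta))
    print(theta, limited_second_moment(theta, 30), closed_form)
# Both agree: 160.17, 421.72, and 738.72; the 0.2/0.5/0.3 weighted average is 464.51.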
38.42. B. E[X ∧ 50] = (0.2)(10)(1 - e^(-50/10)) + (0.5)(25)(1 - e^(-50/25)) + (0.3)(100)(1 - e^(-50/100)) = 24.60.
E[X] = 44.5. S(x) = 0.2e^(-x/10) + 0.5e^(-x/25) + 0.3e^(-x/100). S(50) = 0.25097.
e(50) = (E[X] - E[X ∧ 50]) / S(50) = (44.5 - 24.60)/0.25097 = 79.3.
Comment: Note that while for each Exponential Distribution e(x) is independent of x, that is not true for the Mixed Exponential Distribution. For the Mixed Distribution, the Mean Excess Loss increases with x, towards the largest mean of the individual Exponentials.
For example, in this case e(200) = 99.7.
The tail behavior of the mixed exponential is that of the individual exponential with the largest mean.


38.43. In this case, the mixed distribution is unimodal.
[In the original, a graph of the mixed density appears here, with density up to about 0.05 and x from 0 to 50.]
38.44. D. Prob[ω = 80 | surviving to at least 70] =
Prob[surviving to at least 70 | ω = 80] Prob[ω = 80] / Prob[surviving to at least 70] =
(10/80)(0.4) / {(10/80)(0.4) + (30/100)(0.6)} = 21.7%.
Prob[ω = 100 | surviving to at least 70] = (30/100)(0.6) / {(10/80)(0.4) + (30/100)(0.6)} = 78.3%.
Therefore, the expected age at death is: (21.7%)(70 + 80)/2 + (78.3%)(70 + 100)/2 = 82.8.
Comment: DeMoivre's Law is uniform from 0 to ω.
38.45. E. This can be thought of as a two-point mixture between a severity that is always zero and a uniform distribution on [1000, 5000].
Mean = (80%)(0) + (20%)(3000) = 600.
2nd moment of the uniform on [1000, 5000] is: (5000³ - 1000³)/{(3)(5000 - 1000)} = 10,333,333.
Second moment of mixture = (80%)(0) + (20%)(10,333,333) = 2,066,667.
Variance of mixture = 2,066,667 - 600² = 1,706,667.
Alternately, this can be thought of as a Bernoulli Frequency with q = 0.2 and a uniform severity.
Variance of Aggregate = (mean freq.)(var. of severity) + (mean severity)²(variance of freq.) = (0.2)(4000²/12) + 3000²(0.2)(0.8) = 1,706,667.
Comment: This can also be thought of as a two-component splice between a point mass at 0 of 80% and a uniform distribution with weight 20%.


38.46. f(x) = x^(α-1) e^(-x/θ) / {θ^α Γ(α)}. With α = 4 and θ = 3, f(x) = x³ e^(-x/3) / 486.
[In the original, a graph of this first Gamma density appears here.]
With α = 6 and θ = 10, f(x) = x^5 e^(-x/10) / 120,000,000.
[In the original, a graph of this second Gamma density appears here.]
With 30% weight to the first distribution and 70% weight to the second distribution:
[In the original, a graph of the mixed density appears here; it is bimodal.]
Comment: In this case, the mixed distribution is bimodal.
Two-point mixtures of distributions, each unimodal, can be either unimodal or bimodal.
In this example, with 3% weight to the first Gamma and 97% weight to the second Gamma, the mixture would have been unimodal.

38.47. B. E[X | X > 50] = ∫ from 50 to ∞ of {0.7 f1(x) + 0.3 f2(x)} x dx / ∫ from 50 to ∞ of {0.7 f1(x) + 0.3 f2(x)} dx
= {0.7 ∫ x f1(x) dx + 0.3 ∫ x f2(x) dx} / {0.7 ∫ f1(x) dx + 0.3 ∫ f2(x) dx}, with all integrals from 50 to ∞,
= {(0.7) S1(50) (50 + e1(50)) + (0.3) S2(50) (50 + e2(50))} / {(0.7) S1(50) + (0.3) S2(50)}
= {(0.7) e^(-50/100) (50 + 100) + (0.3) e^(-50/200) (50 + 200)} / {(0.7) e^(-50/100) + (0.3) e^(-50/200)}
= 122.096 / 0.6582 = 185.5.
Alternately, for the mixture, S(50) = (0.7) e^(-50/100) + (0.3) e^(-50/200) = 0.6582.
E[X] = (0.7)(100) + (0.3)(200) = 130.
E[X ∧ 50] = (0.7){(100)(1 - e^(-50/100))} + (0.3){(200)(1 - e^(-50/200))} = 40.815.
Average size of those losses greater than 50 is:
{E[X] - (E[X ∧ 50] - 50 S(50))}/S(50) = {130 - (40.815 - (50)(0.6582))}/0.6582 = 185.5.
Comment: e(x) = ∫ from x to ∞ of f(t)(t - x) dt / S(x). Thus e(x) S(x) = ∫ from x to ∞ of t f(t) dt - x S(x), and so ∫ from x to ∞ of t f(t) dt = S(x){x + e(x)}.

38.48. E. In 2002 the losses are a 70%-30% mixture of two Exponentials with means 1000 and 5000.
In 2004 the losses are a 70%-30% mixture of two Exponentials with means 1080 and 5400.
For the Exponential, E[X ∧ x] = θ(1 - e^(-x/θ)).
For θ = 1080, E[X ∧ 3000] = (1080)(1 - e^(-3000/1080)) = 1012.8.
For θ = 5400, E[X ∧ 3000] = (5400)(1 - e^(-3000/5400)) = 2301.7.
In 2004 the average payment per loss is: (0.7)(1012.8) + (0.3)(2301.7) = 1399.5.
38.49. C. For Type I, the Single Parameter Pareto has mean αθ/(α - 1) = (4)(10)/3 = 13.3333,
second moment αθ²/(α - 2) = (4)(100)/2 = 200, and variance 200 - 13.3333² = 22.222.
For Type II, the Single Parameter Pareto has mean αθ/(α - 1) = (3)(10)/2 = 15,
second moment αθ²/(α - 2) = (3)(100)/1 = 300, and variance 300 - 15² = 75.
The sum of 600 risks of Type I and 400 risks of Type II has variance:
(600)(22.222) + (400)(75) = 43,333.
Comment: We know exactly how many of each type we have, rather than picking a certain number of risks at random.


38.50. C. This is a 60%-40% mixture of the two Single Parameter Pareto Distributions.
The mixture has mean: (0.6)(13.3333) + (0.4)(15) = 14.
The mixture has second moment: (0.6)(200) + (0.4)(300) = 240.
The mixture has variance: 240 - 14² = 44.
38.51. B. If the risk is of Type I, then the variance of outcomes is 22.222.
If the risk is of Type II, then the variance of outcomes is 75.
The expected value of this variance is: (0.6)(22.222) + (0.4)(75) = 43.333.
Comment: This is the Expected Value of the Process Variance.
The Variance of the Hypothetical Means is: (0.6)(13.3333 - 14)² + (0.4)(15 - 14)² = 0.667.
Expected Value of the Process Variance + Variance of the Hypothetical Means = 43.333 + 0.667 = 44 = Total Variance. Note that the variance of the sum of one loss from each of the 1000 risks is: (1000)(43.333) = 43,333, the solution to a previous question.
38.52. A. Profit per applicant is a mixed distribution, with 50% weight to heads and 50% weight to tails. These are each in turn mixed distributions.
Heads is 2/3 weight to good and 1/3 weight to bad,
with mean: (2/3)(300) + (1/3)(-100) = 166.67,
and with second moment: (2/3)(10000 + 300²) + (1/3)(90000 + 100²) = 100,000.
Tails is 2/3 weight to good and 1/3 weight to zero,
with mean: (2/3)(300) + (1/3)(0) = 200,
and with second moment: (2/3)(10000 + 300²) + (1/3)(0²) = 66,667.
The overall mean profit is: (50%)(166.67) + (50%)(200) = 183.33.
The overall second moment of profit is: (50%)(100,000) + (50%)(66,667) = 83,333.
The variance of the profit per applicant is: 83,333 - 183.33² = 49,722.
Comment: Information taken from 3, 11/02, Q.15.
38.53. C. Of the original applicants: (50%)(2/3) = 1/3 are heads and good, (50%)(1/3) = 1/6 are heads and bad, (50%)(2/3) = 1/3 are tails and good, (50%)(1/3) = 1/6 are tails and bad. Bob accepts the first three types, 5/6 of the total. Thus profit per accepted applicant is a mixed distribution, with (1/3 + 1/3)/(5/6) = 80% weight to good and (1/6)/(5/6) = 20% weight to bad.
The mean profit is: (80%)(300) + (20%)(-100) = 220.
The second moment of profit is: (80%)(10000 + 300²) + (20%)(90000 + 100²) = 100,000.
The variance of the profit per accepted applicant is: 100,000 - 220² = 51,600.


38.54. E. The LogNormal has mean: exp[10 + 0.7²/2] = 28,141,
and second moment: exp[(2)(10) + (2)(0.7²)] = 1,292,701,433.
The Gamma has mean: (2)(10,000) = 20,000,
and second moment: (2)(2+1)(10,000²) = 600,000,000.
The Pareto has mean: 200,000/(5 - 1) = 50,000,
and second moment: (2)(200,000²)/{(5 - 1)(5 - 2)} = 6,666,666,667.
The Exponential has mean: 5000, and second moment: (2)(5000²) = 50,000,000.
The mixed distribution has mean:
(0.1)(28,141) + (0.3)(20,000) + (0.2)(50,000) + (0.4)(5000) = 20,814.
The mixed distribution has second moment:
(0.1)(1,292,701,433) + (0.3)(600,000,000) + (0.2)(6,666,666,667) + (0.4)(50,000,000) = 1662.60 million.
Standard deviation of the mixed distribution: √(1662.60 million - 20,814²) = 35,062.

Comment: Not intended to be a realistic model of Homeowners Insurance. For example, the size
of loss distribution would depend on the value of the insured home. For wind and fire, there
would be point masses of probability at the value of the insured home. The mix of losses by
peril would depend on the location of the insured home.
38.55. E. E[X] = (0.65)exp[8 + 0.5²/2] + (0.35)exp[9 + 0.3²/2] = (0.65)(3378) + (0.35)(8476) = 5162.
38.56. B. E[X²] = (0.65)exp[(2)(8) + (2)(0.5²)] + (0.35)exp[(2)(9) + (2)(0.3²)] =
(0.65)(14,650,719) + (0.35)(78,609,255) = 37,036,207.
38.57. C. For the LogNormal, E[X^(-1)] = exp[-μ + σ²/2].
E[X^(-1)] = (0.65)exp[-8 + 0.5²/2] + (0.35)exp[-9 + 0.3²/2] = (0.65)(0.0003801) + (0.35)(0.00012909) = 0.00029225. 1/E[1/X] = 1/0.00029225 = 3422.


38.58. A. For good students, f(x) = {(6 + 2 - 1)! / [(6-1)! (2-1)!]} (x/100)^6 (1 - x/100)^(2-1) / x
= 42 x^5 (1 - x/100) / 10^12, 0 ≤ x ≤ 100.
F(65) = (42/10^12) ∫ from 0 to 65 of (x^5 - x^6/100) dx = 0.2338.
∫ from 0 to 65 of x f(x) dx = (42/10^12) ∫ from 0 to 65 of (x^6 - x^7/100) dx = 12.685.
Average grade for those good students who fail: 12.685/0.2338 = 54.26.
For bad students, f(x) = {(3 + 2 - 1)! / [(3-1)! (2-1)!]} (x/100)^3 (1 - x/100)^(2-1) / x
= 12 x² (1 - x/100) / 10^6, 0 ≤ x ≤ 100.
F(65) = (12/10^6) ∫ from 0 to 65 of (x² - x³/100) dx = 0.5630.
∫ from 0 to 65 of x f(x) dx = (12/10^6) ∫ from 0 to 65 of (x³ - x^4/100) dx = 25.705.
Average grade for those bad students who fail: 25.705/0.5630 = 45.66.
Prob[Good | failed] = Prob[fail | Good] Prob[Good] / Prob[fail] =
(0.2338)(0.75)/{(0.2338)(0.75) + (0.5630)(0.25)} = 55.47%.
Expected grade of a student who fails: (0.5547)(54.26) + (1 - 0.5547)(45.66) = 50.4.
Comment: The distribution of grades is given as continuous, so we integrate from 0 to 65.
38.59. C. & 38.60. B. & 38.61. D. & 38.62. C. Mean = (0.5)(8%) + (0.5)(11%) = 9.5%.
For each Normal, its second moment is equal to: μ² + σ².
Second Moment of R is: (0.5)(0.2² + 0.08²) + (0.5)(0.3² + 0.11²) = 0.07425.
Variance of R is: 0.07425 - 0.095² = 0.0652. Standard deviation: √0.0652 = 0.255.
Third Moment of the first Normal is: 0.08³ + (3)(0.08)(0.2²) = 0.01011.
Third Moment of the second Normal is: 0.11³ + (3)(0.11)(0.3²) = 0.03103.
Third Moment of R is: (0.5)(0.01011) + (0.5)(0.03103) = 0.02057.
Third Central Moment of R is: 0.02057 - (3)(0.095)(0.07425) + (2)(0.095³) = 0.001124.
Skewness of R is: 0.001124/0.0652^1.5 = 0.0675.
Fourth Moment of the first Normal is: 0.08^4 + (6)(0.08²)(0.2²) + (3)(0.2^4) = 0.006377.
Fourth Moment of the second Normal is: 0.11^4 + (6)(0.11²)(0.3²) + (3)(0.3^4) = 0.030980.
Fourth Moment of R is: (0.5)(0.006377) + (0.5)(0.030980) = 0.01868.
Fourth Central Moment of R is: 0.01868 - (4)(0.095)(0.02057) + (6)(0.095²)(0.07425) - (3)(0.095^4) = 0.01464.
Kurtosis of R is: 0.01464/0.0652² = 3.44.
Comment: Note that each Normal has a kurtosis of 3, yet the mixture has a kurtosis greater than 3.
Mixtures tend to have heavier tails.
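As a numerical cross-check, a minimal sketch computing the same skewness and kurtosis directly from the mixed raw moments:

# Minimal check of the mixture's skewness and kurtosis from raw moments.
params = [(0.5, 0.08, 0.20), (0.5, 0.11, 0.30)]  # (weight, mu, sigma)

def raw_moment(k):
    # k-th raw moment of the mixture, using the Normal moment formulas from the hint.
    total = 0.0
    for w, m, s in params:
        if k == 1: total += w * m
        elif k == 2: total += w * (m**2 + s**2)
        elif k == 3: total += w * (m**3 + 3 * m * s**2)
        elif k == 4: total += w * (m**4 + 6 * m**2 * s**2 + 3 * s**4)
    return total

m1, m2, m3, m4 = (raw_moment(k) for k in (1, 2, 3, 4))
variance = m2 - m1**2
skewness = (m3 - 3 * m1 * m2 + 2 * m1**3) / variance**1.5
kurtosis = (m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4) / variance**2
print(skewness, kurtosis)  # about 0.068 and 3.44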


38.63. Let the CV of each component be c. Let μ₁/μ₂ = r.
Then the mean of the mixture is: p μ₁ + (1-p) μ₂ = μ₂ {r p + 1 - p}.
σ₁ = c μ₁ = c r μ₂.  σ₂ = c μ₂.
The second moment of the mixture is: p{σ₁² + μ₁²} + (1-p){σ₂² + μ₂²} =
p{c² r² μ₂² + r² μ₂²} + (1-p){c² μ₂² + μ₂²} = μ₂² (1 + c²)(pr² + 1 - p).
For the mixture: 1 + CV² = E[X²]/E[X]² = μ₂² (1 + c²)(pr² + 1 - p) / (μ₂² {r p + 1 - p}²)
= (1 + c²)(pr² + 1 - p)/{r p + 1 - p}².
The relationship of the CV of the mixture to that of each component, c, depends on the ratio:
(1 + CV²)/(1 + c²) = (pr² + 1 - p)/{r p + 1 - p}².
If this ratio is one, then CV = c. If this key ratio is greater than one, then CV > c.
If p = 0 or p = 1, then this key ratio is one. In this case we really do not have a mixture.
For r = 1, this key ratio is one. Thus if the two components have the same mean, then the CV of the
mixture is equal to the CV of each component.
For p fixed, 0 < p < 1, take the derivative with respect to r of this key ratio:
{2pr(rp + 1 - p)² - (pr² + 1 - p)(2)(rp + 1 - p)p}/{r p + 1 - p}⁴
= 2p{r(rp + 1 - p) - (pr² + 1 - p)}/{r p + 1 - p}³ = 2p(1-p)(r - 1)/{r p + 1 - p}³.
Since p > 0, 1 - p > 0, and the denominator of this derivative is positive, the sign of the derivative
depends on r - 1.
For r < 1 this derivative is negative, and for r > 1 this derivative is positive.
Thus for p fixed, the minimum of the key ratio occurs for r = 1.
Thus if the two components have the same mean, then the CV of the mixture is equal to
the CV of each component.
However, if the two components have different means, in other words r ≠ 1, then the CV
of the mixture is greater than the CV of each component.
Alternately, the variance of the mixture = EPV + VHM.
If the means of the two components are equal, then the VHM = 0, and the variance of the mixture =
EPV. If the means of the two components are not equal, then the VHM > 0, and the variance of the
mixture is greater than the EPV.
The EPV = p σ₁² + (1 - p) σ₂² = p c² r² μ₂² + (1-p) c² μ₂² = c² μ₂² (pr² + 1 - p).


Thus if the means of the two components differ, in other words if r ≠ 1,
CV² of mixture = (Variance of mixture)/(mean of mixture)² > EPV/(mean of mixture)² =
c² μ₂² (pr² + 1 - p) / (μ₂² {r p + 1 - p}²).
Thus for r ≠ 1,
(CV of mixture)²/c² > (pr² + 1 - p)/{r p + 1 - p}².
As before, we can show that this key ratio is greater than one when r ≠ 1.
Thus if the two components have different means, in other words r ≠ 1, then the CV of the
mixture is greater than the CV of each component.
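The behavior of the key ratio is easy to see numerically; this short Python sketch (an illustration only, with arbitrarily chosen p and r values) confirms the ratio equals 1 at r = 1 and exceeds 1 otherwise.

def key_ratio(p, r):
    # (1 + CV^2 of mixture)/(1 + c^2); it does not depend on the common component CV c.
    return (p * r**2 + 1 - p) / (r * p + 1 - p)**2

for r in [0.25, 0.5, 1.0, 2.0, 4.0]:
    print(r, key_ratio(0.3, r))   # equals 1 only at r = 1, otherwise > 1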
38.64. A. f(x) = (.7)(2e^(-2x)) + (.3)(3e^(-3x)), a 70%-30% mixture of two Exponentials with means 1/2
and 1/3. E[X] = (.7)(1/2) + (.3)(1/3) = .45 = 9/20.
38.65. E. A mixture of three Exponentials with means 100, 50, and 25, and equal weights.
For each Exponential, E[X] - E[X ∧ 50] = θ e^(-50/θ), and S(50) = e^(-50/θ).
e(50) = (E[X] - E[X ∧ 50])/S(50) = (100e^(-0.5)/3 + 50e^(-1)/3 + 25e^(-2)/3) / (e^(-0.5)/3 + e^(-1)/3 + e^(-2)/3)
= 27.4768/0.369915 = 74.28.
38.66. E. Mean = (60 + 80)/2 = 70. Second Moment = {(200 + 60²) + (300 + 80²)}/2 = 5250.
Variance = 5250 - 70² = 350.
Alternately, Expected Value of the Process Variance = (200 + 300)/2 = 250.
Variance of the Hypothetical Means = {(60 - 70)² + (80 - 70)²}/2 = 100.
Total Variance = EPV + VHM = 250 + 100 = 350.
38.67. E. 1. True. 2. True. 3. True.
38.68. C. Let F be a mixed distribution: then F(x) = pA(x) + (1-p)B(x). The median is that x such
that F(x) = .5. Thus the median of the mixed distribution is the value of x such that:
.5 = pA(x) + (1-p)B(x). In this case, p = .5, A is a Burr and B is a Pareto. Substituting into the
equation for the median: .5 = (.5){1 - 1000/(1000 + x²)} + (.5){1 - 1000/(1000 + x)}.
Thus 1 = 1000/(1000 + x²) + 1000/(1000 + x). Thus x³ = 1,000,000. Thus x = 100.
Comment: Check: for x = 100, (.5){1 - 1000/(1000 + 100²)} + (.5){1 - 1000/(1000 + 100)} =
(.5)(1 - 1000/11000) + (.5)(1 - 1000/1100) = (.5)(.9091) + (.5)(.0909) = .4545 + .0455 = .5.
Note that the median of the Burr is 31.62, while the median of the Pareto is 1000. The (weighted)
average of the medians is: (.5)(31.62) + (.5)(1000) = 515.8, which is not equal to the median of the
mixed distribution.


38.69. C. The uniform distribution on [0,2] has density function:
0 for x < 0, 1/2 for 0 < x < 2, and 0 for 2 < x.
The uniform distribution on [1,3] has density function:
0 for x < 1, 1/2 for 1 < x < 3, and 0 for 3 < x.
X1 has a continuous distribution function and discontinuous density function. For example, let the
weights be 1/3 and 2/3. Then the density function is:
1/6 for 0 < x < 1, 1/2 for 1 < x < 2, and 1/3 for 2 < x < 3.
X2 has a continuous distribution function and continuous density function.
It has a triangle density function: (x-1)/4 for 1 < x < 3, and (5-x)/4 for 3 < x < 5.
X3 has a discontinuous distribution function. At the censorship point of 1 the distribution jumps up
from F(1⁻), its value just below the censorship point, to 1. Generally censorship leads to a jump discontinuity in the Distribution Function at
the censorship point, provided the survival function of the original distribution at the censorship point
is positive.
Comment: Convolution is discussed in Mahlers Guide to Aggregate Distributions.


38.70. C. For mixed distributions the moments are weighted averages of the moments of the
individual distributions. For a Pareto the first moment is θ/(α-1), which in these cases are 100 and
500. For a Pareto the second moment is 2θ²/{(α-1)(α-2)}, which in these cases are 30,000 and
1,000,000.
Thus the first moment of the mixed distribution is: (.75)(100) + (.25)(500) = 200.
The second moment of the mixed distribution is: (.75)(30,000) + (.25)(1,000,000) = 272,500.
The variance of the mixed distribution is: 272,500 - 200² = 232,500.
Alternately, the variance of a Pareto is αθ²/{(α-1)²(α-2)}.
Thus the process variances are: Var[X | Liability] = 4(300²)/{(3²)(2)} = 20,000 and
Var[X | Property] = 3(1000²)/{(2²)(1)} = 750,000.
Thus the Expected Value of the Process Variance = (.75)(20,000) + (.25)(750,000) = 202,500.
Since the mean of a Pareto is θ/(α-1), the hypothetical means are:
E[X | Liability] = 300/3 = 100 and E[X | Property] = 1000/2 = 500.
The overall mean is: (.75)(100) + (.25)(500) = 200.
Thus the Variance of the Hypothetical Means is: (.75)(100-200)² + (.25)(500-200)² = 30,000.
Thus for the whole portfolio the Total Variance = EPV + VHM = 202,500 + 30,000 = 232,500.
Type of     A Priori Chance of     Hypothetical   Second       Process     Square of
Risk        This Type of Risk      Mean           Moment       Variance    Hypothetical Mean
Liability        0.750                100            30,000       20,000        10,000
Property         0.250                500         1,000,000      750,000       250,000
Overall                               200                        202,500        70,000

VHM = 70,000 - 200² = 30,000. Total Variance = 202,500 + 30,000 = 232,500.

Comment: The variance of a mixed distribution is not the weighted average of the individual
variances. The statement that the risks have identical claim count distributions, allows one to weight
the two distributions with weights 75% and 25%. For example, if instead, liability risks had twice the
mean claim frequency of property risks, then (2)(.75)/{(2)(.75) + (1)(.25)} = 85.7% of the claims
would come from liability risks. Therefore, in that case one would instead weight the two distributions
together using weights of 85.7% and 14.3%, as follows.
Type of     A Priori Chance of     Relative Claim   Chance of a Claim from   Hypothetical   Second       Process     Square of
Risk        This Type of Risk      Frequency        this Type of Risk        Mean           Moment       Variance    Hypothetical Mean
Liability        0.750                  2                 0.857                 100            30,000       20,000        10,000
Property         0.250                  1                 0.143                 500         1,000,000      750,000       250,000
Overall                                                                          157                       124,286        44,286

VHM = 44,286 - 157² = 19,637. Total Variance = 124,286 + 19,637 = 143,923.
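Both decompositions are quick to verify in code; the following Python sketch (mine, not from the study guide; the function name is illustrative) applies Total Variance = EPV + VHM under each set of weights.

def mixture_variance(weights, means, process_vars):
    # Total variance of a mixture = EPV + VHM.
    epv = sum(w * pv for w, pv in zip(weights, process_vars))
    overall_mean = sum(w * m for w, m in zip(weights, means))
    vhm = sum(w * m**2 for w, m in zip(weights, means)) - overall_mean**2
    return epv + vhm

means, pvars = [100, 500], [20_000, 750_000]
print(mixture_variance([0.75, 0.25], means, pvars))   # 232,500
print(mixture_variance([6/7, 1/7], means, pvars))     # ≈ 143,878 with exact weights;
# the table above rounds the overall mean to 157, giving 143,923.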


38.71. A. This question asks for the variance of individual claim amounts.
For a claim picked at random, it has a 2/(2+3) = 40% chance of coming from the first severity
distribution and a 60% chance of coming from the second severity distribution.
This is mathematically the same as a two point mixture.
The first distribution has a mean 1.4 and a second moment 2.2, while the second distribution has a
mean of 2.6 and a second moment of 7.4. The mixed distribution has a weighted average of the
moments; it has a mean of: (2/5)(1.4) + (3/5)(2.6) = 2.12,
and a second moment of: (2/5)(2.2) + (3/5)(7.4) = 5.32.
The variance of the mixed severity distribution is: 5.32 - 2.12² = .83.
Alternately, one computes the combined severity distribution, by weighting the individual
distribution together, using their mean frequencies of 2 and 3 as weights.
  x     p1(x)    p2(x)    combined severity    first moment    second moment
  1      0.6      0.1           0.30                0.30             0.30
  2      0.4      0.3           0.34                0.68             1.36
  3      0        0.5           0.30                0.90             2.70
  4      0        0.1           0.06                0.24             0.96

                                Total               2.12             5.32

The variance of the combined severity is: 5.32 - 2.12² = .83.


Comment: In Mahlers Guide to Aggregate Distributions, the very similar Course 151 Sample
Exam #2, Q.4 asks instead for the variance of S. The variance of each compound Poisson is its
mean frequency times the second moment of its severity. Since the two compound Poissons are
independent, their variances add to get the variance of S.
38.72. D. Mean of the mixture is: (0.6)(1400) + (0.4)(2000) = 1640.
Second moment of the mixture is: (0.6)(40,000 + 1400²) + (0.4)(250,000 + 2000²) = 2,900,000.
Variance of the mixture is: 2,900,000 - 1640² = 210,400.
Alternately, Expected Value of the Process Variance = (0.6)(40,000) + (0.4)(250,000) = 124,000.
Variance of the Hypothetical Means = (0.6)(1400 - 1640)² + (0.4)(2000 - 1640)² = 86,400.
Total Variance = EPV + VHM = 124,000 + 86,400 = 210,400.


38.73. C. Since all payments take place 3 years from now and we used an interest rate of 4%,
present values are taken by dividing by 1.04³ = 1.125.
The mean claim payment is (50%)(0) + (50%)(10) = 5.
The insurer's mean payment plus legal expense is: 5 + 5 = 10,
with present value: 10/1.125 = 8.89.
The payment of 5 in claims expense is fixed, so that it does not affect the variance.
The second moment of the LogNormal is: variance + mean² = 20² + 10² = 500.
The claims payment is a 50%-50% mixture of zero and a LogNormal Distribution. Therefore its
second moment is a weighted average of the second moments: (.5)(0) + (.5)(500) = 250.
Thus the variance of the claim payments is: 250 - 5² = 225. The variance of the present value is:
225/1.125² = 177.78.
Therefore, the expected present value of the claim and legal expenses plus 0.02 times the variance
of the present value is: 8.89 + (.02)(177.78) = 12.45.
Comment: Since the time until payment and the interest rate are both fixed, the present values are
easy to take. The present value is gotten by dividing by 1.125, so the variance of the present value
is divided by 1.125². There is no interest rate risk or timing risk in this simplified example. We do not
make any use of the fact that the amount of payment specifically follows a LogNormal Distribution.
38.74. Overall mean is: (25%)(500) + (75%)(300) = 350.
Second moment for Type I is: 100² + 500² = 260,000.
Second moment for Type II is: 70² + 300² = 94,900.
Overall second moment is: (25%)(260,000) + (75%)(94,900) = 136,175.
Overall variance is: 136,175 - 350² = 13,675.
Overall standard deviation is: √13,675 = 116.9.
Alternately, E[Var | Type] = (25%)(100²) + (75%)(70²) = 6175.
Var[Mean | Type] = (.25)(500 - 350)² + (.75)(300 - 350)² = 7500.
Overall variance is: E[Var | Type] + Var[Mean | Type] = 6175 + 7500 = 13,675.
Overall standard deviation is: √13,675 = 116.9.


38.75. B. Put into thousands, the density is: f(x) = 0.0005003 e^(-x/2000), 0 < x < 15000.
Expected payment is: (.02)(15000 - 1000) + (.04) ∫ from 1000 to 15000 of (x - 1000)(0.0005003 e^(-x/2000)) dx
= 280 + 0.000020012 ∫ from 1000 to 15000 of x e^(-x/2000) dx - 0.020012 ∫ from 1000 to 15000 of e^(-x/2000) dx
= 280 + 0.000020012 [-2000x e^(-x/2000) - 2000² e^(-x/2000)] evaluated from x = 1000 to x = 15000
  - 0.020012 [-2000 e^(-x/2000)] evaluated from x = 1000 to x = 15000
= 280 + 23.9 + 48.5 - 24.3 = 328.
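As a check on the integration, here is a small Python sketch (not part of the original solution) that evaluates the expected payment numerically with the same density and coverage terms.

from math import exp

def midpoint(f, a, b, n=200000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: 0.0005003 * exp(-x / 2000)   # density on (0, 15000)
expected = 0.02 * (15000 - 1000) + 0.04 * midpoint(lambda x: (x - 1000) * f(x), 1000, 15000)
print(expected)   # ≈ 328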


38.76. C. For the Exponential Distribution E[X

x] = (1 - e-x/).

The given distribution is a 80%-20% mixture of two Exponentials, with means 50 and 1000.
F(x) = .8(1 - e-x/50) + .2(1 - e-x/1000). Therefore, the limited expected value at 1000 is a weighted
average of the LEVs for the individual Exponentials.
E[X 1000] = (.8){(50)(1 - e-1000/50)} + (.2){(1000)(1 - e-1000/1000)} = 166.4.
Alternately, E[X 1000] =
1000

1000

S(x) dx = 0.8e-0.02x + 0.2e-0.001x dx = 40(1 - e-20) + (200)(1 - e-1) = 166.4.


0

38.77. C. E[X] = (m1 + m2)/2. The second moment of a mixture is the mixture of the second
moments: E[X²] = (2m1² + 2m2²)/2 = m1² + m2².
1 + CV² = E[X²]/E[X]² = 4(m1² + m2²)/(m1 + m2)² = 4{1 - 2m1m2/(m1 + m2)²} ≤ 4.
CV² ≤ 3. CV ≤ √3.
Comment: The CV is largest when m1 and m2 are significantly different. If m1 = m2 = m, then the CV is:
√[(4){1 - 2m²/(2m)²} - 1] = 1; we would have a single Exponential Distribution with CV = 1.
If we let r = m2/m1, then CV² = 4(m1² + m2²)/(m1 + m2)² - 1 = 4(1 + r²)/(1 + r)² - 1.
This is maximized as either r → 0 or r → ∞, and CV² → 3, or CV → √3.


38.78. D. The future lifetime of smokers is Exponential with mean: 1/.2 = 5.
The future lifetime of non-smokers is Exponential: F(t) = 1 - e^(-.1t).
The future lifetime for an individual selected at random is a mixed Exponential:
F(t) = (.3)(1 - e^(-.2t)) + (.7)(1 - e^(-.1t)) = 1 - .3e^(-.2t) - .7e^(-.1t). Want .75 = F(t) = 1 - .3e^(-.2t) - .7e^(-.1t).
Let y = e^(-.1t). Then .3y² + .7y - .25 = 0. y = 0.315. t = 11.56.
Comment: Constant force of mortality ⇔ Exponential Distribution with mean 1/μ.
38.79. C. The mean of the Weibull is: 30 Γ[1 + 1/(1/2)] = 30 Γ(3) = (30)(2!) = 60.
Thus the expected cost of a claim is: (40%)(0) + (60%)(60) = 36. #1 is false.
The survival function at $60 is: (60%)(Survival Function at 60 for the Weibull) =
(60%)exp[-(60/30)^(1/2)] = 0.1459. #2 is false.
The density at 60 is: (40%)(0) + (60%){exp[-(60/30)^(1/2)] (1/2)(60/30)^(1/2)/60} = .001719.
h(60) = .001719/.1459 = 0.0118. #3 is true.
Comment: For the Weibull, h(x) = τ x^(τ-1)/θ^τ. h(60) = (1/2)(60^(-1/2))/30^(1/2) = 0.0118.
For the mixture, h(60) = {(60%)(f(60) of the Weibull)}/{(60%)(S(60) of the Weibull)} =
hazard rate at 60 for the Weibull. Since in this case, one of the components of the mixture is zero, it
does not affect the hazard rate. For example, if one mixed a Weibull and a Pareto, they would each
affect the numerator and denominator of the hazard rate of the mixture.
38.80. B. S(750) = (30%){1/(1 + (750/400)²)} + (70%){1/(1 + (750/600)²)} = .3396.
Comment: The Survival Function of the mixture is the mixture of the Survival Functions.
For the loglogistic distribution, F(x) = (x/θ)^γ/{1 + (x/θ)^γ}. S(x) = 1/{1 + (x/θ)^γ}.
38.81. A. For the first Pareto, F(200) = 1 - {100/(100 + 200)}² = .8889.
For the second Pareto, F(200) = 1 - {3000/(3000 + 200)}⁴ = .2275.
Pr(X ≤ 200) = (0.8)(0.8889) + (0.2)(0.2275) = 0.757.
38.82. B. F(6000) = (0.6)Φ[(6000 - 5000)/√1,000,000] + (0.4)Φ[(6000 - 4000)/√1,000,000] =
(0.6)Φ[1] + (0.4)Φ[2] = (0.6)(0.8413) + (0.4)(0.9772) = 0.8957.
S(6000) = 1 - 0.8957 = 10.43%.
Comment: A two-point mixture of Normal Distributions.


38.83. A. Nonsmokers have an Exponential Survival function beyond age 30:
S(x)/S(30) = exp[-.08(x - 30)]. Similarly, for smokers, S(x)/S(30) = exp[-.16(x - 30)].
Since we have a 50-50 mixture starting at age 30:
S(x)/S(30) = .5 exp[-.08(x - 30)] + .5 exp[-.16(x - 30)].
S(80)/S(30) = .5 exp[-(.08)(50)] + .5 exp[-(.16)(50)] = .009326.
S(81)/S(30) = .5 exp[-(.08)(51)] + .5 exp[-(.16)(51)] = .008597.
p80 = S(81)/S(80) = {S(81)/S(30)}/{S(80)/S(30)} = .008597/.009326 = .9218.
q80 = 1 - .9218 = 0.0782.
Alternately, assume that there are for example originally 2,000,000 individuals, 1,000,000
nonsmokers and 1,000,000 smokers, alive at age 30.
Then the expected nonsmokers alive at age 80 is: 1,000,000 exp[-(.08)(80 - 30)] = 18,316.
The expected nonsmokers alive at age 81 is: 1,000,000 exp[-(.08)(81 - 30)] = 16,907.
The expected smokers alive at age 80 is: 1,000,000 exp[-(.16)(80 - 30)] = 335.
The expected smokers alive at age 81 is: 1,000,000 exp[-(.16)(81 - 30)] = 286.
In total we expect 18,316 + 335 = 18,651 alive at age 80 and 16,907 + 286 = 17,193 alive at age 81.
Therefore, q80 = (18,651 - 17,193)/18,651 = 0.0782.
Comment: Since the expected number of smokers alive by age 80 is so small, q80 is close to that
for nonsmokers: 1 - exp[-(.08)(51)]/exp[-(.08)(50)] = 1 - e^(-.08) = 0.0769.
38.84. A. For the first LogNormal, F(15000) = Φ[(ln(15000) - 7)/2] = Φ[1.31] = .9049.
The second LogNormal has μ = 7 + ln2 = 7.693 and σ = 2, and
F(15000) = Φ[(ln(15000) - 7.693)/2] = Φ[0.96] = .8315.
For the mixed distribution: S(15000) = 1 - {(.75)(.9049) + (.25)(.8315)} = 0.113.
Alternately, Prob[accident ≤ 15000] = Prob[standard stay ≤ 15000/2 = 7500] = Φ[(ln(7500) - 7)/2] =
Φ[0.96] = .8315. Proceed as before.


38.85. (i) (a) Let J be the amount paid back to the investor in the junior loan.
If the asset does not sustain a loss then they get paid the 50 plus 8% interest or 54.
If a loss is sustained they get no interest.
If the loss is more than 50 they get paid nothing.
If the loss is less than 50, they get paid 50 minus the loss.
Let U(0, 50) be uniform from 0 to 50.
J = 54 with probability 75%, 0 with probability 12.5%, and 50 - U(0, 50) with probability 12.5%,
which is the same as: J = 54 with probability 75%, 0 with probability 12.5%, and U(0, 50) with probability 12.5%.
J is a mixture.
E[J] = (0.75)(54) + (0.125)(0) + (0.125)(25) = 43.625.
E[J²] = (0.75)(54²) + (0.125)(0²) + (0.125)(50²/12 + 25²) = 2291.
Var[J] = 2291 - 43.625² = 388.
(b) Let S be the amount paid back to the investor in the senior loan.
If the asset does not sustain a loss then they get paid the 50 plus 6% interest or 53.
If a loss is sustained they get no interest.
If the loss is less than 50, they get paid 50.
If the loss L is more than 50 they get paid: 50 - (L - 50) = 100 - L.
S = 53 with probability 75%, 50 with probability 12.5%, and 100 - U(50, 100) with probability 12.5%,
which is the same as: S = 53 with probability 75%, 50 with probability 12.5%, and U(0, 50) with probability 12.5%.
S is a mixture.
E[S] = (0.75)(53) + (0.125)(50) + (0.125)(25) = 49.125.
E[S²] = (0.75)(53²) + (0.125)(50²) + (0.125)(50²/12 + 25²) = 2523.
Var[S] = 2523 - 49.125² = 110.
(ii) Prob(J < 50) = 0.25.
Prob(S < 50) = 0.125.
Comment: The variance of a uniform distribution from 0 to 50 is 50²/12.
Thus the second moment of a uniform distribution from 0 to 50 is: 50²/12 + 25².


38.86. (a) E[R] = (0.8)(0) + (0.2)(30%) = 6%.
E[R²] = (0.8)(0.1² + 0²) + (0.2)(0.1² + 0.3²) = 0.028.
Var[R] = 0.028 - 0.06² = 0.0244.
(b) Prob[R < 0] = 0.8 Φ[(0 - 0)/0.1] + 0.2 Φ[(0 - 0.3)/0.1] = (0.8)Φ[0] + (0.2)Φ[-3] =
(0.8)(0.5) + (0.2)(0.0013) = 0.40026.
Prob[S < 0] = Φ[(0 - 0.06)/√0.0244] = Φ[-0.38] = 0.3520.
Prob[R < -0.1] = 0.8 Φ[(-0.1 - 0)/0.1] + 0.2 Φ[(-0.1 - 0.3)/0.1] = (0.8)Φ[-1] + (0.2)Φ[-4] =
(0.8)(0.1587) + (0.2)(0) = 0.12696.
Prob[S < -0.1] = Φ[(-0.1 - 0.06)/√0.0244] = Φ[-1.02] = 0.1539.
Comment: Even though R and S have the same mean and variance they have different
probabilities in the lefthand tail. F(0) is greater for R than S, while F(-0.1) is greater for S than R.
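The tail comparison in the comment is easy to reproduce; the Python sketch below (my own illustration, using the error function for the Normal CDF) computes F(0) and F(-0.1) for both the mixture R and the matching single Normal S.

from math import erf, sqrt

Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))

def F_mixture(x):   # 80%/20% mixture of Normal(0, 0.1^2) and Normal(0.3, 0.1^2)
    return 0.8 * Phi((x - 0.0) / 0.1) + 0.2 * Phi((x - 0.3) / 0.1)

def F_single(x):    # single Normal with the same mean 6% and variance 0.0244
    return Phi((x - 0.06) / sqrt(0.0244))

for x in (0.0, -0.1):
    print(x, F_mixture(x), F_single(x))
# F(0): ≈ 0.400 vs ≈ 0.350; F(-0.1): ≈ 0.127 vs ≈ 0.153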


Section 39, Continuous Mixtures of Models306

Discrete mixtures can be extended to a continuous case such as the Inverse Gamma - Exponential
situation, to be discussed below. Instead of an n-point mixture, one can take a continuous mixture of
severity distributions.
Mixture Distribution ⇔ Continuous Mixture of Models.
Continuous mixtures can be performed of either frequency distributions307 or loss distributions.
For example assume that each individual's future lifetime is exponentially distributed with mean 1/λ,
and over the population, λ is uniformly distributed over (0.05, 0.15).
u(λ) = 1/0.1 = 10, 0.05 ≤ λ ≤ 0.15.
Then the probability that a person picked at random lives more than 20 years is:
S(20) = ∫ from 0.05 to 0.15 of S(20; λ) u(λ) dλ = ∫ from 0.05 to 0.15 of e^(-20λ) (1/0.10) dλ = (10/20)(e^(-1) - e^(-3)) = 15.9%.
The density at 20 of this mixture distribution is:
∫ from 0.05 to 0.15 of f(20; λ) u(λ) dλ = ∫ from 0.05 to 0.15 of λ e^(-20λ) 10 dλ
= (10)(-λ e^(-20λ)/20 - e^(-20λ)/400) evaluated from λ = 0.05 to λ = 0.15 = 0.0134.
In general, one takes a mixture of the density functions for specific values of the parameter δ,
g(x) = ∫ f(x; δ) π(δ) dδ,
via some mixing distribution π(δ).
For example, in the case where the severity is Exponential and the mixing distribution of their means
is Inverse Gamma, we get the Inverse Gamma - Exponential process.
306 See Section 5.2.4 of Loss Models.
307 See the sections on Mixed Frequency Distributions and the Gamma-Poisson in Mahler's Guide to Frequency
Distributions.
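The 15.9% figure can be verified by direct numerical integration; this Python sketch (an illustration only) averages the conditional survival probabilities over the uniform distribution of λ.

from math import exp

n, lo, hi = 100000, 0.05, 0.15
# Average S(20 | lambda) = exp(-20*lambda) over lambda ~ Uniform(0.05, 0.15).
S20 = sum(exp(-20 * (lo + (i + 0.5) * (hi - lo) / n)) for i in range(n)) / n
print(S20)   # ≈ 0.159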


Inverse Gamma-Exponential:
The sizes of loss for a particular policyholder are assumed to be Exponential with mean δ. Given δ,
the distribution function of the size of loss is 1 - e^(-x/δ), while the density of the size of loss distribution
is (1/δ)e^(-x/δ). The mean of this Exponential is δ and its variance is δ².
Note that I have used δ rather than θ, so as to not confuse the scale parameter of the Exponential
with that of the Inverse Gamma, which is θ.
So for example, the density of a loss being of size 8 is (1/δ)e^(-8/δ). If δ = 2 this density is:
(1/2)e^(-4) = 0.009, while if δ = 20 this density is: (1/20)e^(-0.4) = 0.034.
Assume that the values of δ across a portfolio of policyholders are given by an Inverse Gamma
distribution with α = 6 and θ = 15, with probability density function:
π(δ) = θ^α e^(-θ/δ) / {Γ[α] δ^(α+1)} = 94921.875 e^(-15/δ) / δ⁷, 0 < δ < ∞.308
Note that this distribution has a mean of: θ/(α - 1) = 15/(6 - 1) = 3.
If we have a policyholder and do not know its expected mean severity, in order to get the density
of the next loss being of size 8, one would weight together the densities of having a loss of size 8
given δ, using the a priori probabilities of δ: π(δ) = 94921.875 e^(-15/δ)/δ⁷, and integrating from zero
to infinity:
g(8) = ∫₀^∞ {e^(-8/δ)/δ} π(δ) dδ = ∫₀^∞ {e^(-8/δ)/δ} {94921.875 e^(-15/δ)/δ⁷} dδ = 94921.875 ∫₀^∞ e^(-23/δ)/δ⁸ dδ
= 94921.875 (6!)/23⁷ = 0.0201.
Where we have used the fact that the density of the Inverse Gamma Distribution integrates to unity
over its support and therefore:
∫₀^∞ e^(-θ/x)/x^(α+1) dx = Γ[α]/θ^α = (α - 1)!/θ^α.
308 The Inverse Gamma Distribution has density: f(x) = θ^α e^(-θ/x) / {Γ(α) x^(α+1)}.
In this case, the constant in front is: θ^α / Γ(α) = 15⁶ / Γ(6) = 11390625/120 = 94921.875.


More generally, if the distribution of Exponential means δ is given by an Inverse Gamma distribution
π(δ) = θ^α e^(-θ/δ) / {Γ[α] δ^(α+1)}, then we compute the density of having a claim of size x by integrating from zero to
infinity:
g(x) = ∫₀^∞ {e^(-x/δ)/δ} π(δ) dδ = ∫₀^∞ {e^(-x/δ)/δ} {θ^α e^(-θ/δ)/(Γ[α] δ^(α+1))} dδ = {θ^α/Γ[α]} ∫₀^∞ e^(-(θ + x)/δ)/δ^(α+2) dδ
= {θ^α/Γ[α]} Γ[α + 1]/(θ + x)^(α+1) = α θ^α/(θ + x)^(α+1).309
Thus the (prior) mixed distribution is in the form of the Pareto distribution. Note that the shape
parameter and scale parameter of the mixed Pareto distribution are the same as those of the
Inverse Gamma distribution. For the specific example: α = 6 and θ = 15. Thus the mixed Pareto has
g(x) = 6(15⁶)(15 + x)⁻⁷. g(8) = 6(15⁶)(23)⁻⁷ = 0.0201, matching the previous result.
For the Inverse Gamma-Exponential the (prior) mixed distribution is always a Pareto,
with α = shape parameter of the (prior) Inverse Gamma
and θ = scale parameter of the (prior) Inverse Gamma.310
Note that for the particular case we get a mixed Pareto distribution with parameters of α = 6 and
θ = 15, which has a mean of 15/(6 - 1) = 3, which matches the result obtained above. Note that the
formulas for the mean of an Inverse Gamma and a Pareto are both θ/(α - 1).
Exercise: Each insured has an Exponential severity with mean δ. The values of δ are distributed via
an Inverse Gamma with parameters α = 2.3 and θ = 1200. An insured is picked at random.
What is the probability that its next claim will be greater than 1000?
[Solution: The mixed distribution is a Pareto with parameters α = 2.3 and θ = 1200.
S(1000) = {θ/(θ + x)}^α = {1200/(1000 + 1200)}^2.3 = 24.8%.]
309 Both the Exponential and the Inverse Gamma have terms involving powers of e^(-1/δ) and 1/δ.
310 See Example 5.4 in Loss Models. See also 4B, 11/93, Q.26.
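The closed-form Pareto can be confirmed numerically; the following Python sketch (mine, not from the text) integrates the Exponential density against the Inverse Gamma mixing density for the α = 6, θ = 15 example and compares it with the Pareto density at x = 8.

from math import exp, factorial

alpha, theta, x = 6, 15, 8
const = theta**alpha / factorial(alpha - 1)          # theta^alpha / Gamma(alpha)

def integrand(d):
    # Exponential density at x given mean d, times Inverse Gamma density of d.
    return (exp(-x / d) / d) * const * exp(-theta / d) / d**(alpha + 1)

n, hi = 200000, 200.0                                # truncate the upper limit; the tail is negligible
h = hi / n
numeric = sum(integrand((i + 0.5) * h) for i in range(n)) * h
pareto = alpha * theta**alpha / (theta + x)**(alpha + 1)
print(numeric, pareto)                               # both ≈ 0.0201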


Hazard Rates of Exponentials Distributed via a Gamma:311
If the hazard rate of the Exponential, λ, is distributed via a Gamma(α, θ), then the mean 1/λ is
distributed via an Inverse Gamma(α, 1/θ), and therefore the mixed distribution is Pareto.
If the Gamma has parameters α and θ, then the mixed Pareto has parameters α and 1/θ.
311 See for example, SOA M, 11/05, Q.17.


Relationship of Inverse Gamma-Exponential to the Gamma-Poisson:
If δ, the mean of each Exponential, follows an Inverse Gamma Distribution with parameters α and θ,
F(δ) = 1 - Γ[α; θ/δ].
If λ = 1/δ, then F(λ) = Γ[α; θλ], and λ follows a Gamma with parameters α and 1/θ.
This is mathematically the same as Exponential interarrival times each with mean 1/λ, or a Poisson
Process with intensity λ.
Prob[X > x] ⇔ Prob[Waiting time to 1st claim > x] = Prob[no claims by time x].
From time 0 to x we have a Poisson Frequency with mean λx. λx has a Gamma Distribution with
parameters α and x/θ. This is mathematically a Gamma-Poisson, with mixed distribution that is
Negative Binomial with r = α and β = x/θ.
Prob[X > x] ⇔ Prob[no claims by time x] = f(0) = 1/(1 + x/θ)^α = θ^α/(θ + x)^α.
This is the survival function at x of a Pareto Distribution, with parameters α and θ, as obtained
previously.
Exercise: Each insured has an Exponential severity with mean δ.
The values of δ are distributed via an Inverse Gamma with parameters α = 2.3 and θ = 1200.
An insured is picked at random.
What is the probability that the sum of his next 3 claims will be greater than 6000?
[Solution: Prob[sum of 3 claims > 6000] ⇔ Prob[Waiting time to 3rd claim > 6000] =
Prob[at most 2 claims by time 6000].
The mixed distribution is Negative Binomial with r = α = 2.3, and β = x/θ = 6000/1200 = 5.
Prob[at most 2 claims by time 6000] = f(0) + f(1) + f(2)
= 1/6^2.3 + (2.3)(5)/6^3.3 + {(2.3)(3.3)/2}(5²)/6^4.3 = 9.01%.
Alternately, the sum of 3 independent Exponential Claims is a Gamma with α = 3 and θ = δ.
As listed subsequently, the mixture of a Gamma by an Inverse Gamma is a Generalized Pareto
Distribution with parameters α = 2.3, θ = 1200, and τ = 3. F(x) = β[τ, α; x/(x + θ)].
S(6000) = 1 - β[3, 2.3; 6000/(6000 + 1200)] = 1 - β[3, 2.3; 1/1.2] = β[2.3, 3; 1 - 1/1.2] =
β[2.3, 3; 1/6]. Using a computer, β[2.3, 3; 1/6] = 9.01%.
Comment: As shown in Mahler's Guide to Frequency Distributions, the distribution function of
a Negative Binomial is: F(x) = β[r, x+1; 1/(1+β)]. In this case, F(2) = β[2.3, 3; 1/6]. Thus, one
can compute β[a, b; x], for b integer, as a sum of Negative Binomial densities.]
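The 9.01% in the exercise can be reproduced directly from the Negative Binomial densities; here is a small Python sketch (illustrative only) with r = 2.3 and β = 5.

r, beta = 2.3, 5.0

def nb_density(k):
    # Negative Binomial: f(k) = [r(r+1)...(r+k-1)/k!] beta^k / (1+beta)^(r+k)
    coef = 1.0
    for j in range(k):
        coef *= (r + j) / (j + 1)
    return coef * beta**k / (1 + beta)**(r + k)

print(sum(nb_density(k) for k in range(3)))   # ≈ 0.0901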


Moments of Mixed Distributions:
The nth moment of a mixed distribution is the mixture of the nth moments for specific values of the
parameter δ:
E[X^n] = E_δ[E[X^n | δ]].
Exercise: What is the mean for a mixture of Exponentials, mixed on the mean δ?
[Solution: For a given value of δ, the mean of an Exponential Distribution is δ. We need to weight
these first moments together via the density of delta, π(δ):
∫ δ π(δ) dδ = mean of π(δ), the distribution of δ.]
Thus the mean of a mixture of Exponentials is the mean of the mixing distribution. This result will hold
whenever the parameter being mixed is the mean, as it was in the case of the Exponential.
For the case of a mixture of Exponentials via an Inverse Gamma Distribution with parameters α and
θ, the mean of the mixed distribution is that of the Inverse Gamma, θ/(α - 1).
Exercise: What is the Second Moment of Exponentials, mixed on the mean δ?
[Solution: For a given value of δ, the second moment of an Exponential Distribution is 2δ².
We need to weight these second moments together via the density of delta, π(δ):
∫ 2δ² π(δ) dδ = 2(second moment of π(δ), the distribution of δ).]
Exercise: What is the variance of Exponentials mixed on the mean δ via an Inverse Gamma
Distribution, as per Loss Models, with parameters α and θ?
[Solution: The second moment of the mixed distribution is:
2(second moment of the Inverse Gamma) = 2θ²/{(α - 1)(α - 2)}.
The mean of the mixed distribution is the mean of the Inverse Gamma: θ/(α - 1).
Thus the variance of the mixed distribution is: 2θ²/{(α - 1)(α - 2)} - θ²/(α - 1)² = αθ²/{(α - 1)²(α - 2)}.
Comment: The mixed distribution is a Pareto and this is indeed its variance.]


Normal-Normal:
The sizes of claims a particular policyholder makes are assumed to be Normal with mean m and
known fixed variance s².312
Given m, the distribution function of the size of loss is: Φ[(x - m)/s],
while the density of the size of loss distribution is: φ[(x - m)/s]/s = exp[-(x - m)²/(2s²)] / (s√(2π)).
So for example if s = 3, then the probability density of a claim being of size 8 is:
exp[-(8 - m)²/18] / (3√(2π)).
If m = 2 this density is: exp[-2]/(3√(2π)) = 0.018, while if m = 20 this density is: exp[-8]/(3√(2π)) = 0.000045.
Assume that the values of m are given by another Normal Distribution with mean 7 and standard
deviation of 2, with probability density function:313
π(m) = exp[-(m - 7)²/8] / (2√(2π)), -∞ < m < ∞.
Note that 7, the mean of this distribution, is the a priori mean claim severity.
312 Note I've used roman letters for parameters of the Normal likelihood, in order to distinguish from those of the
Normal distribution of parameters discussed below.
313 There is a very small but positive chance that the mean severity will be negative.


Below is displayed this distribution of hypothetical mean severities:314
[Figure: the Normal density of the hypothetical mean severities, centered at a mean severity of 7; density on the vertical axis, mean severity on the horizontal axis.]

If we have a risk and do not know what type it is, in order to get the chance of the next claim being of
size 8, one would weight together the chances of having a claim of size 8 given m:
exp[-(8 - m)²/18] / (3√(2π)), using the a priori probabilities of m:
π(m) = exp[-(m - 7)²/8] / (2√(2π)), and integrating from minus infinity to infinity:
∫ {exp[-(8 - m)²/18] / (3√(2π))} π(m) dm = ∫ exp[-(8 - m)²/18] exp[-(m - 7)²/8] / (6 ⋅ 2π) dm
= {1/(6 ⋅ 2π)} ∫ exp[-{(8 - m)²/18 + (m - 7)²/8}] dm = {1/(6 ⋅ 2π)} ∫ exp[-{13m² - 190m + 697}/72] dm
= {1/(6 ⋅ 2π)} ∫ exp[-{m² - (190/13)m + (95/13)² + 697/13 - (95/13)²}/(72/13)] dm
= {exp[-(36/13²)/(72/13)] / (6 ⋅ 2π)} ∫ exp[-(m - 95/13)²/{(2)(6/√13)²}] dm
= {exp[-1/26] / (6 ⋅ 2π)} √(2π) (6/√13) = exp[-1/26] / (√13 √(2π)) = 0.1065.
Where we have used the fact that a Normal Density integrates to unity:315
∫ exp[-(m - 95/13)²/{(2)(6/√13)²}] / {(6/√13)√(2π)} dm = 1.
More generally, for the Normal-Normal, the mixed distribution is another Normal, with
mean equal to that of the Normal distribution of parameters, and variance equal to the
sum of the variances of the Normal distribution of parameters and the Normal
likelihood.316
For the specific case dealt with previously: s = 3, μ = 7, and σ = 2, the mixed distribution is a Normal
Distribution with a mean of 7 and variance of: 3² + 2² = 13.
Thus the chance of having a claim of size x is: exp[-(x - 7)²/26] / (√13 √(2π)).
For x = 8 this chance is: exp[-1/26] / (√13 √(2π)) = 0.1065.
This is the same result as calculated above.
314 Note that there is a small probability that a hypothetical mean is negative. When this situation is discussed further
in Mahler's Guide to Conjugate Priors, this will be called the prior distribution of hypothetical mean severities.
315 With mean of 95/13 and standard deviation of 6/√13.
316 The Expected Value of the Process Variance is the variance of the Normal Likelihood, the Variance of the
Hypothetical Means is the variance of the Normal distribution of parameters, and the total variance is the variance of
the mixed distribution. Thus this relationship follows from the general fact that the total variance is the sum of the
EPV and VHM. See Mahler's Guide to Buhlmann Credibility.
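The Normal-Normal result is easy to confirm numerically; this Python sketch (not from the text) integrates the product of the two Normal densities over m and compares it with the closed-form mixed Normal density at x = 8.

from math import exp, pi, sqrt

s, mu, sigma, x = 3.0, 7.0, 2.0, 8.0

def integrand(m):
    like = exp(-(x - m)**2 / (2 * s**2)) / (s * sqrt(2 * pi))      # Normal likelihood
    prior = exp(-(m - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))  # Normal prior on m
    return like * prior

n, lo, hi = 200000, -40.0, 60.0
h = (hi - lo) / n
numeric = sum(integrand(lo + (i + 0.5) * h) for i in range(n)) * h
closed = exp(-(x - mu)**2 / (2 * (s**2 + sigma**2))) / sqrt(2 * pi * (s**2 + sigma**2))
print(numeric, closed)   # both ≈ 0.1065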


Derivation of the Mixed Distribution for the Normal-Normal:
The sizes of loss for a particular policyholder are assumed to be Normal with mean m and known fixed
variance s².
Given m, the density of the size of loss distribution is: φ[(x - m)/s]/s = exp[-(x - m)²/(2s²)] / (s√(2π)).
The distribution of hypothetical means m is given by another Normal Distribution with mean μ and
variance σ²: π(m) = exp[-(m - μ)²/(2σ²)] / (σ√(2π)), -∞ < m < ∞.
We compute the mixed density at x, the chance of having a claim of size x, by integrating from
minus infinity to infinity:
g(x) = ∫ {exp[-(x - m)²/(2s²)] / (s√(2π))} π(m) dm = {1/(2π s σ)} ∫ exp[-{(x - m)²/(2s²) + (m - μ)²/(2σ²)}] dm
= {1/(2π s σ)} ∫ exp[-{(s² + σ²)m² - 2(xσ² + μs²)m + x²σ² + μ²s²}/(2s²σ²)] dm.
Let σ′² = s²σ²/(s² + σ²), μ′ = (xσ² + μs²)/(s² + σ²), and κ = (x²σ² + μ²s²)/(s² + σ²).
Then the above integral is equal to:
{1/(2π s σ)} ∫ exp[-{m² - 2μ′m + κ}/(2σ′²)] dm = {1/(2π s σ)} ∫ exp[-{(m - μ′)² + κ - μ′²}/(2σ′²)] dm
= {exp[-(κ - μ′²)/(2σ′²)] / (2π s σ)} ∫ exp[-(m - μ′)²/(2σ′²)] dm = {exp[-(κ - μ′²)/(2σ′²)] / (2π s σ)} σ′√(2π)
= exp[-(κ - μ′²)/(2σ′²)] / (√(s² + σ²) √(2π)).
Where we have used the fact that a Normal Density integrates to unity:317
∫ exp[-(m - μ′)²/(2σ′²)] / (σ′√(2π)) dm = 1.
Note that κ - μ′² = {(x²σ² + μ²s²)(s² + σ²) - (xσ² + μs²)²}/(s² + σ²)²
= {x²σ²s² - 2xμσ²s² + μ²σ²s²}/(s² + σ²)² = (x - μ)² σ² s²/(s² + σ²)².
Thus, (κ - μ′²)/σ′² = {(x - μ)² σ² s²/(s² + σ²)²} {(s² + σ²)/(s²σ²)} = (x - μ)²/(s² + σ²).
Thus the mixed distribution can be put back in terms of x, s, μ, and σ:
exp[-(κ - μ′²)/(2σ′²)] / (√(s² + σ²) √(2π)) = exp[-(x - μ)²/{2(s² + σ²)}] / (√(s² + σ²) √(2π)).
This is a Normal Distribution with mean μ and variance s² + σ².
Thus if the likelihood is a Normal Distribution with variance s² (fixed and known), and the distribution
of the hypothetical means of the likelihood is also a Normal, but with mean μ and variance σ², then
the mixed distribution is yet a third Normal Distribution with mean μ and variance s² + σ². The mean
of the likelihood is what is varying among the insureds in the portfolio. Therefore, the mean of the
mixed distribution is equal to that of the prior distribution, in this case μ.
317 With mean of μ′ and standard deviation of σ′.


Other Mixtures:
There are many other examples of continuous mixtures of severity distributions. Here are some
examples.318 In each case the scale parameter is being mixed, with the other parameters in the
severity distribution held fixed.

Severity                                   Mixing Distribution                       Mixed Distribution
Exponential                                Inverse Gamma: α, θ                       Pareto: α, θ
Inverse Exponential                        Gamma: α, θ                               Inverse Pareto: τ = α, θ
Weibull, τ = t                             Inverse Transformed Gamma: α, θ, τ = t    Burr: α, θ, γ = t
Inverse Weibull, τ = t                     Transformed Gamma: α, θ, τ = t            Inverse Burr: τ = α, θ, γ = t
Gamma, α = a                               Inverse Gamma: α, θ                       Generalized Pareto: α, θ, τ = a
Inverse Gamma, α = a                       Exponential: θ                            Pareto: α = a, θ
Inverse Gamma, α = a                       Gamma: α, θ                               Generalized Pareto: α = a, θ, τ = α
Transformed Gamma, α = a, τ = t            Inverse Transformed Gamma: α, θ, τ = t    Transformed Beta: α, θ, γ = t, τ = a
Inverse Transformed Gamma, α = a, τ = t    Transformed Gamma: α, θ, τ = t            Transformed Beta: α = a, θ, γ = t, τ = α

318 See the problems for illustrations of some of these additional examples.
Example 5.6 in Loss Models shows that mixing an Inverse Weibull via a Transformed Gamma gives an Inverse Burr.


For example, assume that the amount of an individual claim has an Inverse Gamma distribution with
shape parameter α fixed and scale parameter q (rather than θ, to avoid later confusion).
The parameter q is distributed via an Exponential Distribution with mean θ.
For the Inverse Gamma, f(x | q) = q^α e^(-q/x) / {Γ[α] x^(α+1)}. For the Exponential, π(q) = e^(-q/θ)/θ.
f(x) = ∫₀^∞ f(x | q) π(q) dq = ∫₀^∞ q^α e^(-q/x) e^(-q/θ) / {θ Γ[α] x^(α+1)} dq = ∫₀^∞ q^α e^(-q(1/x + 1/θ)) dq / {θ Γ[α] x^(α+1)}
= {Γ[α+1]/(1/x + 1/θ)^(α+1)} / {θ Γ[α] x^(α+1)} = α θ^α/(x + θ)^(α+1).
This is the density of a Pareto Distribution with parameters α and θ.
This is an example of an Exponential-Inverse Gamma, an Inverse Gamma Severity with shape
parameter α, with its scale parameter mixed via an Exponential.319
The mixture is a Pareto Distribution, with shape parameter equal to that of the Inverse Gamma
severity, and scale parameter equal to the mean of the Exponential mixing distribution.
Exercise: The severity for each insured is an Inverse Gamma Distribution with parameters
α = 3 and q. Over the portfolio, q varies via an Exponential Distribution with mean 500.
What is the severity distribution for the portfolio as a whole?
[Solution: The mixed distribution is a Pareto Distribution with parameters α = 3, θ = 500.]
Exercise: In the previous exercise, what is the probability that a claim picked at random will be
greater than 400?
[Solution: S(400) = {500/(400 + 500)}³ = 17.1%.]
Exercise: In the previous exercise, what is the expected size of a claim picked at random?
[Solution: Mean of a Pareto with α = 3 and θ = 500 is: 500/(3 - 1) = 250.
Alternately, the mean of each Inverse Gamma is: E[X | q] = q/(3 - 1) = q/2.
E[X] = E_q[E[X | q]] = E_q[q/2] = E_q[q]/2 = (mean of the Exponential Dist.)/2 = 500/2 = 250.]
Exercise: The severity for each insured is a Transformed Gamma Distribution with parameters
α = 3.9, q, and τ = 5. Over the portfolio, q varies via an Inverse Transformed Gamma
Distribution with parameters α = 2.4, θ = 17, and τ = 5.
What is the severity distribution for the portfolio as a whole?
[Solution: Using the above chart, the mixed distribution is a Transformed Beta Distribution with
parameters α = 2.4, θ = 17, γ = 5, and τ = 3.9.]
319 This differs from the more common Inverse Gamma-Exponential discussed previously, in which we have an
Exponential severity, whose mean is mixed via the Inverse Gamma.
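The first exercise above can be checked by simulation; the sketch below (my own, with an arbitrary seed and illustrative variable names) draws q from the Exponential with mean 500, then a claim from the Inverse Gamma with α = 3 and scale q, and estimates S(400).

import random

random.seed(1)
alpha, exp_mean, trials = 3, 500.0, 200_000
count = 0
for _ in range(trials):
    q = random.expovariate(1 / exp_mean)        # Exponential scale parameter, mean 500
    x = q / random.gammavariate(alpha, 1.0)     # Inverse Gamma(alpha, q) claim
    if x > 400:
        count += 1
print(count / trials)                           # ≈ 0.171 = (500/900)^3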


Frailty Models:320
A particular type of mixed model are called Frailty models. They involve a particular form of the
hazard rate.
Recall that the hazard rate, h(x) = f(x)/S(x). Also S(x) = exp[-H(x)], where H(x) = ∫₀ˣ h(t) dt.
Assume h(x | λ) = λ a(x), where λ is a parameter which varies across the portfolio.
a(x) is some function of x, and let A(x) = ∫ a(x) dx. Then H(x | λ) = λ A(x).
S(x | λ) = exp[-λ A(x)].
S(x) = E_λ[S(x | λ)] = E_λ[exp[-λ A(x)]] = M_λ[-A(x)], where M_λ is the moment generating function of
the distribution of λ.321 322
For an Exponential Distribution, the hazard rate is constant and equal to one over the mean.
Thus if each individual has an Exponential Distribution, a(x) = 1, and λ = 1/θ.
A(x) = x, and S(x) = M_λ[-x].
We have already discussed mixtures like this. For example, λ could be distributed uniformly from
0 to 2.323 In that case, the general mathematical structure does not help very much.
However, let us assume each individual is Exponential and that λ is Gamma Distributed with
parameters α and θ. The Gamma has moment generating function M(t) = (1 - θt)^(-α).324
Therefore, S(x) = (1 + θx)^(-α). This is a Pareto Distribution with shape parameter α and scale parameter 1/θ.325
This is mathematically equivalent to the Inverse Gamma-Exponential discussed previously.
If λ is Gamma Distributed with parameters α and θ, then the means of the Exponentials,
1/λ, are distributed via an Inverse Gamma with parameters α and 1/θ. The mixed distribution is
Pareto with parameters α and 1/θ.
320 Section 5.2.5 in Loss Models.
321 The definition of the moment generating function is M_y(t) = E_y[exp[yt]].
See Mahler's Guide to Aggregate Distributions.
322 The survival function of the mixture is the mixture of the survival functions.
323 See 3, 5/01, Q.28.
324 See Appendix A in the tables attached to the exam.
325 For the Pareto, S(x) = (1 + x/θ)^(-α).


Exercise: What is the hazard rate for a Weibull Distribution?
[Solution: h(x) = f(x)/S(x) = {τ(x/θ)^τ exp(-(x/θ)^τ)/x} / exp(-(x/θ)^τ) = τ x^(τ-1)/θ^τ.]
Therefore, we can put the Weibull for fixed τ into the form of a frailty model; h(x | λ) = λ a(x), by taking
a(x) = τ x^(τ-1) and λ = θ^(-τ). A(x) = x^τ.
Therefore, if each insured is Weibull with fixed τ, with λ = θ^(-τ), then S(x) = M_λ[-A(x)] = M_λ[-x^τ].
Exercise: Each insured has a Weibull Distribution with τ fixed. λ = θ^(-τ) is Gamma distributed with
parameters α and θ. What is the form of the mixed distribution?326
[Solution: The Gamma has moment generating function M(t) = (1 - θt)^(-α).327 Therefore,
S(x) = (1 + θ x^τ)^(-α). This is a Burr Distribution with parameters α, scale parameter 1/θ^(1/τ), and γ = τ.
Comment: The Burr Distribution has S(x) = {1 + (x/θ)^γ}^(-α).]
If in this case α = 1, then λ has an Exponential Distribution, and the mixed distribution is a Loglogistic,
a special case of the Burr for α = 1.328 If instead τ = 1, then as discussed previously the mixed
distribution is a Pareto, a special case of the Burr.
In general for a frailty model, f(x | λ) = -dS(x | λ)/dx = λ a(x) exp[-λ A(x)].
Therefore, f(x) = E_λ[λ a(x) exp[-λ A(x)]] = a(x) E_λ[λ exp[-λ A(x)]] = a(x) M_λ′[-A(x)], where M_λ′ is the
derivative of the moment generating function of the distribution of λ.329 330
For example, in the previous exercise, M(t) = (1 - θt)^(-α), and M′(t) = αθ(1 - θt)^(-(α+1)).
f(x) = a(x) M′[-A(x)] = τ x^(τ-1) αθ (1 + θ x^τ)^(-(α+1)).
The density of a Burr Distribution is: αγ(x/θ)^γ {1 + (x/θ)^γ}^(-(α+1)) / x.
This is indeed the density of a Burr Distribution with parameters α, scale parameter 1/θ^(1/τ), and γ = τ.
For a frailty model, h(x) = f(x)/S(x) = a(x) M_λ′[-A(x)] / M_λ[-A(x)].331
Defining the cumulant generating function as c_λ(t) = ln M_λ(t) = ln E[e^(tλ)], then
h(x) = a(x) c_λ′(-A(x)), where c_λ′ is the derivative of the cumulant generating function.
For example in the previous exercise, M(t) = (1 - θt)^(-α), and c(t) = ln M(t) = -α ln(1 - θt).
c′(t) = αθ/(1 - θt). h(x) = a(x) c′(-A(x)) = τ x^(τ-1) αθ/(1 + θ x^τ).
Exercise: What is the hazard rate for a Burr Distribution?
[Solution: h(x) = f(x)/S(x) = [αγ(x/θ)^γ {1 + (x/θ)^γ}^(-(α+1))/x] / {1 + (x/θ)^γ}^(-α) =
αγ(x/θ)^γ / [x {1 + (x/θ)^γ}].]
Thus the above h(x) is indeed the hazard rate of a Burr Distribution with parameters α,
scale parameter 1/θ^(1/τ), and γ = τ.
326 See Example 5.7 in Loss Models. This result is mathematically equivalent to mixing the scale parameter of a
Weibull via an Inverse Transformed Gamma, resulting in a Burr, one of the examples listed previously.
327 See Appendix A in the tables attached to the exam.
328 See Exercise 5.13 in Loss Models.
329 E_y[y exp[yt]] = M_y′(t). See Mahler's Guide to Aggregate Distributions.
330 The density function of the mixture is the mixture of the density functions.
331 The hazard rate of the mixture is not the mixture of the hazard rates.
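The frailty algebra can be spot-checked numerically; the following Python sketch (with illustrative, made-up values of α, θ, τ, and x) compares E[exp(-λ x^τ)], computed by integrating over the Gamma distribution of λ, with the closed-form survival function of the resulting Burr.

from math import exp, gamma

a, th, tau, x = 2.0, 0.5, 3.0, 1.2      # Gamma(alpha=a, theta=th) for lambda; Weibull power tau

def gamma_density(lam):
    return lam**(a - 1) * exp(-lam / th) / (gamma(a) * th**a)

n, hi = 200000, 60.0
h = hi / n
numeric = sum(exp(-(i + 0.5) * h * x**tau) * gamma_density((i + 0.5) * h) for i in range(n)) * h
burr = (1 + th * x**tau)**(-a)           # S(x) of the Burr with scale th^(-1/tau) and gamma = tau
print(numeric, burr)                     # the two agree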


Problems:
Use the following information for the next 3 questions:
Assume that the size of claims for an individual insured is given by an Exponential Distribution:
f(x) = e^(-x/δ)/δ, with mean δ and variance δ².
Also assume that the parameter δ varies for the different insureds, with δ following an
Inverse Gamma distribution: g(δ) = θ^α e^(-θ/δ) / {Γ[α] δ^(α+1)}, for 0 < δ < ∞.
39.1 (2 points) An insured is picked at random and is observed until it has a loss.
What is the probability density function at 400 for the size of this loss?
A. 1/(θ + 400)   B. θ/(θ + 400)   C. α/(θ + 400)   D. θ^α/(θ + 400)^(α+1)   E. α θ^α/(θ + 400)^(α+1)
39.2 (2 points) What is the unconditional mean severity?
A. θ/(α - 1)   B. θ/α   C. θ/(α + 1)   D. αθ/(α - 1)   E. None of A, B, C, or D

39.3 (3 points) What is the unconditional variance?
A. θ²/(α - 1)   B. 2θ²/(α - 1)   C. θ²/{(α - 1)(α - 2)}   D. αθ²/{(α - 1)²(α - 2)}   E. 2θ²/{(α - 1)²(α - 2)}

39.4 (3 points) The severity distribution of each risk in a portfolio is given by a Weibull Distribution,
with parameters τ = 1/3 and θ, with θ varying over the portfolio via an Inverse Transformed Gamma
Distribution: g(θ) = (7^2.5/3) exp[-7/θ^(1/3)] / {θ^(11/6) Γ(2.5)}.
What is the mixed distribution?
A. Burr   B. Generalized Pareto   C. Inverse Burr   D. LogLogistic   E. ParaLogistic


39.5 (3 points) You are given the following:
The amount of an individual claim, X, follows an exponential distribution function
with probability density function: f(x | λ) = λ e^(-λx), x, λ > 0.
The parameter λ follows a Gamma distribution with probability density function
p(λ) = (4/3) λ⁴ e^(-2λ), λ > 0.
Determine the unconditional probability that x > 7.
A. 0.00038   B. 0.00042   C. 0.00046   D. 0.00050   E. 0.00054

39.6 (2 points) Consider the following frailty model:
h(x | λ) = λ a(x).
a(x) = 4x³.
λ follows an Exponential Distribution with mean 0.007.
Determine S(6).
A. 7%   B. 8%   C. 9%   D. 10%   E. 11%

39.7 (3 points) The future lifetimes of a certain population consisting of 1000 people are modeled as
follows:
(i) Each individual's future lifetime is exponentially distributed with constant hazard rate λ.
(ii) Over the population, λ is uniformly distributed over (0.01, 0.11).
For this population, all of whom are alive at time 0, calculate the number of deaths expected
between times 3 and 5.
(A) 75   (B) 80   (C) 85   (D) 90   (E) 95
39.8 (2 points) You are given the following:
The number of miles that an individual car is driven during a year is given by an
Exponential Distribution with mean δ.
δ differs between cars.
δ is distributed via an Inverse Gamma Distribution with parameters α = 3 and θ = 25,000.
What is the probability that a car chosen at random will be driven more than 20,000 miles during the
next year?
(A) 9%   (B) 11%   (C) 13%   (D) 15%   (E) 17%


39.9 (2 points) You are given the following:
The IQs of actuaries are normally distributed with mean 135 and standard deviation 10.
Each actuary's score on an IQ test is normally distributed around his true IQ,
with standard deviation of 15.
What is the probability that Abbie the actuary scores between 145 and 155 on his IQ test?
A. 10%   B. 12%   C. 14%   D. 16%   E. 18%
39.10 (2 points) Consider the following frailty model:
S(x | λ) = e^(-λx).
λ follows a Gamma Distribution with α = 3 and θ = 0.01.
Determine S(250).
A. Less than 3%
B. At least 3%, but less than 4%
C. At least 4%, but less than 5%
D. At least 5%, but less than 6%
E. At least 6%
39.11 (3 points) The severity distribution of each risk in a portfolio is given by an Inverse Weibull
Distribution, F(x) = exp[-(q/x)⁴], with q varying over the portfolio via a Transformed Gamma
Distribution with parameters α = 1.3, θ = 11, and τ = 4.
What is the probability that the next loss will be of size less than 10?
Hint: For a Transformed Gamma Distribution, f(x) = τ (x/θ)^(ατ) exp[-(x/θ)^τ] / {x Γ(α)}.
A. Less than 32%
B. At least 32%, but less than 34%
C. At least 34%, but less than 36%
D. At least 36%, but less than 38%
E. At least 38%


39.12 (3 points) You are given the following:
The amount of an individual loss in 2002 follows an exponential distribution
with mean $3000.
Between 2002 and 2007, losses will be multiplied by an inflation factor.
You are uncertain of what the inflation factor between 2002 and 2007 will be,
but you estimate that it will be a random draw from an Inverse Gamma Distribution with
parameters α = 4 and θ = 3.5.
Estimate the probability that a loss in 2007 exceeds $5500.
A. Less than 18%
B. At least 18%, but less than 19%
C. At least 19%, but less than 20%
D. At least 20%, but less than 21%
E. At least 21%
Use the information on the following frailty model for the next two questions:
Each insured has a survival function that is Exponential with hazard rate λ.
The hazard rate λ varies across the portfolio via
an Inverse Gaussian Distribution with μ = 0.015 and θ = 0.005.
39.13 (3 points) Determine S(65).
A. 56%
B. 58%
C. 60%

D. 62%

E. 64%

39.14 (2 points) For the mixture what is the hazard rate at 40?
A. 0.0060
B. 0.0070 C. 0.0080 D. 0.0090
E. 0.0100

39.15 (4 points) You are given the following:


The amount of an individual claim has an Inverse Gamma distribution with shape parameter α = 4
and scale parameter q (rather than θ, to avoid later confusion).
The parameter q is distributed via an Exponential Distribution with mean 100.
What is the probability that a claim picked at random will be of size greater than 15?
A. Less than 50%
B. At least 50%, but less than 55%
C. At least 55%, but less than 60%
D. At least 60%, but less than 65%
E. At least 65%


39.16 (3 points) You are given the following:


X is a Normal Distribution with mean zero and variance v.
v is distributed via an Inverse Gamma Distribution with α = 10 and θ = 10.
Determine the form of the mixed distribution.
39.17 (2 points) You are given the following:
The amount of an individual claim has an exponential distribution given by:
p(y) = (1/θ) e^(-y/θ), y > 0, θ > 0.
The parameter θ has a probability density function given by:
f(θ) = (4000/θ⁴) e^(-20/θ), θ > 0.

Determine the variance of the claim severity distribution.


A. 150
B. 200
C. 250
D. 300
E. 350
39.18 (3 points) Consider the following frailty model:
h(x | λ) = λ a(x).
a(x) = 0.004/(1 + 0.008x).
λ follows a Gamma Distribution with α = 6 and θ = 1.
Determine S(11).
A. 70%
B. 72%

C. 74%

D. 76%

E. 78%

39.19 (3 points) You are given the following:


The amount of an individual loss this year, follows an Exponential Distribution
with mean $8000.
Between this year and next year, losses will be multiplied by an inflation factor.
The inflation factor follows an Inverse Gamma Distribution with parameters
α = 2.5 and θ = 1.6.
Estimate the probability that a loss next year exceeds $10,000.
A. Less than 21%
B. At least 21%, but less than 22%
C. At least 22%, but less than 23%
D. At least 23%, but less than 24%
E. At least 24%


39.20 (2 points) Severity is LogNormal with parameters μ and 0.3.
μ varies across the portfolio via a Normal Distribution with parameters 5 and 0.4.
What is the probability that a loss chosen at random exceeds 200?
(A) 25%
(B) 27%
(C) 29%
(D) 31%
(E) 33%
Use the following information for the next two questions:
For each class, sizes of loss are Exponential with mean δ.
Across a group of classes, δ varies via an Inverse Gamma Distribution with parameters
α = 3 and θ = 1000.
39.21 (2 points) For a class picked at random, what is the expected value of the loss elimination
ratio at 500?
A. 50%
B. 55%
C. 60%
D. 65%
E. 70%
39.22 (6 points) What is the correlation across classes of the loss elimination ratio at 500 and the
loss elimination ratio at 200?
A. Less than 96%
B. At least 96%, but less than 97%
C. At least 97%, but less than 98%
D. At least 98%, but less than 99%
E. At least 99%

39.23 (4B, 5/93, Q.19) (2 points) You are given the following:
The amount of an individual claim has an exponential distribution given by:
p(y) = (1/θ) e^(-y/θ), y > 0, θ > 0.
The parameter θ has a probability density function given by: f(θ) = (400/θ³) e^(-20/θ), θ > 0.
Determine the mean of the claim severity distribution.
A. 10
B. 20
C. 200
D. 2000

E. 4000


39.24 (4B, 11/93, Q.26) (3 points) You are given the following:
The amount of an individual claim, Y, follows an exponential distribution function
with probability density function f(y | θ) = (1/θ) e^(-y/θ), y, θ > 0.
The conditional mean and variance of Y given θ are E[Y | θ] = θ and Var[Y | θ] = θ².
The mean claim amount, θ, follows an Inverse Gamma distribution with density function
p(θ) = 4e^(-2/θ)/θ⁴, θ > 0.
Determine the unconditional density of Y at y = 3.
A. Less than 0.01
B. At least 0.01, but less than 0.02
C. At least 0.02, but less than 0.04
D. At least 0.04, but less than 0.08
E. At least 0.08
39.25 (3, 5/00, Q.17) (2.5 points) The future lifetimes of a certain population can be modeled as
follows:
(i) Each individual's future lifetime is exponentially distributed with constant hazard rate λ.
(ii) Over the population, λ is uniformly distributed over (1, 11).
Calculate the probability of surviving to time 0.5, for an individual randomly selected at time 0.
(A) 0.05
(B) 0.06
(C) 0.09
(D) 0.11
(E) 0.12
39.26 (3, 5/01, Q.28) (2.5 points) For a population of individuals, you are given:
(i) Each individual has a constant force of mortality.
(ii) The forces of mortality are uniformly distributed over the interval (0, 2).
Calculate the probability that an individual drawn at random from this population dies
within one year.
(A) 0.37
(B) 0.43
(C) 0.50
(D) 0.57
(E) 0.63
39.27 (SOA M, 5/05, Q.10 & 2009 Sample Q.163) The scores on the final exam in Ms. B's Latin
class have a normal distribution with mean θ and standard deviation equal to 8.
θ is a random variable with a normal distribution with mean equal to 75 and standard deviation equal
to 6.
Each year, Ms. B chooses a student at random and pays the student 1 times the student's
score. However, if the student fails the exam (score ≤ 65), then there is no payment.
Calculate the conditional probability that the payment is less than 90, given that there is a
payment.
(A) 0.77
(B) 0.85
(C) 0.88
(D) 0.92
(E) 1.00


39.28 (SOA M, 11/05, Q.17 & 2009 Sample Q.204) (2.5 points)
The length of time, in years, that a person will remember an actuarial statistic is modeled by an
exponential distribution with mean 1/Y.
In a certain population, Y has a gamma distribution with α = θ = 2.
Calculate the probability that a person drawn at random from this population will remember
an actuarial statistic less than 1/2 year.
(A) 0.125
(B) 0.250
(C) 0.500
(D) 0.750
(E) 0.875
39.29 (SOA M, 11/05, Q.20) (2.5 points) For a group of lives age x, you are given:
(i) Each member of the group has a constant force of mortality that is drawn from the
uniform distribution on [0.01, 0.02].
(ii) δ = 0.01.
For a member selected at random from this group, calculate the actuarial present value of a
continuous lifetime annuity of 1 per year.
(A) 40.0
(B) 40.5
(C) 41.1
(D) 41.7
(E) 42.3


Solutions to Problems:
39.1. E. The conditional probability of a loss of size 400 given δ is: e^(-400/δ)/δ.
The unconditional probability can be obtained by integrating the conditional probabilities versus the
distribution of δ:
f(400) = ∫₀^∞ f(400 | δ) g(δ) dδ = ∫₀^∞ {e^(-400/δ)/δ} {θ^α δ^(-(α+1)) e^(-θ/δ)/Γ(α)} dδ
= {θ^α/Γ(α)} ∫₀^∞ δ^(-(α+2)) e^(-(400+θ)/δ) dδ = {θ^α/Γ(α)} Γ(α+1)/(400 + θ)^(α+1) = α θ^α/(θ + 400)^(α+1).
39.2. A. The conditional mean given δ is: δ. The unconditional mean can be obtained by integrating
the conditional means versus the distribution of δ:
E[X] = ∫ E[X | δ] g(δ) dδ = ∫₀^∞ δ θ^α δ^(-(α+1)) e^(-θ/δ)/Γ(α) dδ = {θ^α/Γ(α)} ∫₀^∞ δ^(-α) e^(-θ/δ) dδ
= {θ^α/Γ(α)} Γ(α-1)/θ^(α-1) = θ/(α - 1).
Comment: The mean of a Pareto Distribution; the mixed distribution is a Pareto with scale parameter
θ and shape parameter α.

39.3. D. The conditional mean given δ is δ. The conditional variance given δ is δ².
Thus the conditional second moment given δ is: δ² + δ² = 2δ².
The unconditional second moment can be obtained by integrating the conditional second moments
versus the distribution of δ:
E[X²] = ∫ E[X² | δ] g(δ) dδ = ∫₀^∞ {2δ²} θ^α δ^(-(α+1)) e^(-θ/δ)/Γ(α) dδ
= {2θ^α/Γ(α)} ∫₀^∞ δ^(-(α-1)) e^(-θ/δ) dδ = {2θ^α/Γ(α)} Γ(α-2)/θ^(α-2) = 2θ²/{(α-1)(α-2)}.
Since the mean is θ/(α-1), the variance is: 2θ²/{(α-1)(α-2)} - θ²/(α-1)² =
(θ²/{(α-1)²(α-2)}){2(α-1) - (α-2)} = αθ²/{(α-2)(α-1)²}.
Comment: The variance of a Pareto Distribution.


39.4. A. The Weibull has density: τ(x/θ)^τ exp(-(x/θ)^τ)/x = x^(-2/3) θ^(-1/3) exp(-x^(1/3)/θ^(1/3))/3.
The density of the mixed distribution is obtained by integrating the Weibull density times g(θ):
∫₀^∞ f(x; θ) g(θ) dθ = ∫₀^∞ {x^(-2/3) θ^(-1/3) exp(-x^(1/3)/θ^(1/3))/3} {(7^2.5/3) exp(-7/θ^(1/3)) / (θ^(11/6) Γ(2.5))} dθ
= {7^2.5 x^(-2/3)/(9 Γ(2.5))} ∫₀^∞ θ^(-13/6) exp(-(7 + x^(1/3))/θ^(1/3)) dθ.
Make the change of variables, y = (7 + x^(1/3))/θ^(1/3). θ = (7 + x^(1/3))³ y^(-3). dθ = -3(7 + x^(1/3))³ y^(-4) dy.
= {7^2.5 x^(-2/3)/(3 Γ(2.5))} (7 + x^(1/3))^(-13/2) ∫₀^∞ y^(13/2) exp(-y) (7 + x^(1/3))³ y^(-4) dy
= {7^2.5 x^(-2/3) (7 + x^(1/3))^(-7/2)/(3 Γ(2.5))} ∫₀^∞ y^(5/2) exp(-y) dy = {7^2.5 x^(-2/3) (7 + x^(1/3))^(-7/2)/(3 Γ(2.5))} Γ(3.5)
= (2.5)(1/3)(1/7) x^(-2/3) {1 + (x/343)^(1/3)}^(-7/2).
This is the density of a Burr Distribution, αγ(x/θ)^γ {1 + (x/θ)^γ}^(-(α+1))/x, with parameters:
α = 2.5, θ = 343 = 7³, and γ = 1/3.
Comment: In general, if one mixes a Weibull with τ = t fixed, with its scale parameter varying via an
Inverse Transformed Gamma Distribution, with parameters α, θ, and τ = t,
then the mixed distribution is a Burr with parameters α, θ, and γ = t.
This Inverse Transformed Gamma Distribution has parameters: α = 2.5, θ = 343 = 7³, and τ = 1/3.


39.5. E. Although it is not obvious, this is the Exponential - Inverse Gamma.
The mean of the exponential is δ = 1/λ, and δ follows an Inverse Gamma.
Since |dλ/dδ| = 1/δ², g(δ) = p(λ) |dλ/dδ| = (4/3) δ⁻⁴ e^(-2/δ)/δ² = (4/3) δ⁻⁶ e^(-2/δ).
This Inverse Gamma has parameters α = 5 and θ = 2. The (prior) marginal distribution is a Pareto,
with α = 5 and θ = 2. Therefore 1 - F(x) = {2/(2 + x)}⁵. 1 - F(7) = (2/9)⁵ = 0.00054.
Alternately, one can compute the unconditional distribution function at x = 7 via integration:
1 - F(7) = ∫₀^∞ {1 - F(7 | λ)} p(λ) dλ = ∫₀^∞ exp(-7λ)(4/3) λ⁴ e^(-2λ) dλ = (4/3) ∫₀^∞ λ⁴ e^(-9λ) dλ.
This is the same type of integral as the Gamma Function, thus:
1 - F(7) = (4/3) Γ(5)/9⁵ = (4/3)(24)/9⁵ = 0.00054.
Alternately, one can compute the unconditional distribution function at x via integration:
1 - F(x) = ∫₀^∞ {1 - F(x | λ)} p(λ) dλ = ∫₀^∞ exp(-λx)(4/3) λ⁴ e^(-2λ) dλ = (4/3) ∫₀^∞ λ⁴ e^(-(2+x)λ) dλ.
This is the same type of integral as the Gamma Function, thus: 1 - F(x) = (4/3) Γ(5)/(2 + x)⁵.
Therefore, 1 - F(7) = (4/3)(24)/9⁵ = 32/9⁵ = 0.00054.
Comment: If one recognizes this as a Pareto with scale parameter of 2 and shape parameter of 5,
then one can determine the constant by looking in Appendix A of Loss Models without doing the
Gamma integral.


39.6. D. A(x) = ∫₀ˣ a(t) dt = x⁴.
S(x) = M_λ[-A(x)].
The moment generating function of this Exponential Distribution is: M(t) = 1/(1 - 0.007t).
Thus S(x) = 1/(1 + 0.007x⁴).
Thus, S(6) = 1/(1 + 0.007(6⁴)) = 9.9%.
Comment: See Exercise 5.13 in Loss Models. The mixture is Loglogistic.
For the Weibull as per Loss Models, h(x) = τ x^(τ-1)/θ^τ.
Therefore, we can put the Weibull for fixed τ into the form of a frailty model;
h(x | λ) = λ a(x), by taking a(x) = τ x^(τ-1) and λ = θ^(-τ). A(x) = x^τ.
Here τ = 4, a(x) = 4x³, and λ = θ⁻⁴.
Thus for a given value of theta or lambda, S(x | λ) = exp[-(x/θ)⁴] = exp[-λx⁴].
Therefore, S(6 | λ) = exp[-λ6⁴] = exp[-1296λ].
Thus, S(6) = ∫₀^∞ exp[-1296λ] exp[-λ/0.007]/0.007 dλ = ∫₀^∞ exp[-1438.86λ] dλ / 0.007
= (1/1438.86)/0.007 = 9.9%.
39.7. D. The hazard rate for an Exponential is one over its mean. Therefore, the survival function is
S(t; λ) = e^(-λt). Mixing over the different values of λ:
S(t) = ∫_0.01^0.11 S(t; λ) f(λ) dλ = ∫_0.01^0.11 e^(-λt) (1/0.1) dλ = (-10/t)e^(-λt) ]_0.01^0.11 = (10/t)(e^(-0.01t) - e^(-0.11t)).
S(3) = (10/3)(e^(-0.03) - e^(-0.33)) = 0.8384. S(5) = (10/5)(e^(-0.05) - e^(-0.55)) = 0.7486.
The number of deaths expected between time 3 and time 5 is:
(1000)(S(3) - S(5)) = (1000)(0.8384 - 0.7486) = 89.8.
Comment: Similar to 3, 5/00, Q.17.
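The mixed survival function in 39.7 can also be checked by direct numerical integration; the short sketch below (an illustration, not part of the original solution) integrates e^(-λt) against the uniform density of λ on (0.01, 0.11).

```python
import numpy as np
from scipy import integrate

# S(t) for the uniform mixture of Exponential hazard rates in 39.7.
def S(t):
    val, _ = integrate.quad(lambda lam: np.exp(-lam * t) / 0.1, 0.01, 0.11)
    return val

print(S(3), S(5))               # about 0.8384 and 0.7486
print(1000 * (S(3) - S(5)))     # about 89.8 expected deaths between times 3 and 5
```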


39.8. E. For this Inverse Gamma-Exponential, the mixed distribution is a Pareto with α = 3 and
θ = 25,000. S(20000) = {25000/(25000 + 20000)}³ = (5/9)³ = 17.1%.
39.9. D. If the severity is Normal with fixed variance s², and the mixing distribution of their means is
also Normal with mean μ and variance σ², then the mixed distribution is another Normal, with mean μ
and variance: s² + σ².
In this case, the mixed distribution is Normal with mean 135 and variance: 15² + 10² = 325.
Prob[145 ≤ score ≤ 155] = Φ[(155 - 135)/√325] - Φ[(145 - 135)/√325] = Φ[1.109] - Φ[0.555] =
0.8663 - 0.7106 = 0.156.
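Since the mixture in 39.9 is itself Normal, the probability can be confirmed with any Normal CDF routine; a minimal sketch:

```python
from scipy.stats import norm

# Scores are Normal(mu, 15) with mu ~ Normal(135, 10), so the mixture is
# Normal(135, sqrt(15^2 + 10^2)).
sd = (15**2 + 10**2) ** 0.5
p = norm.cdf(155, loc=135, scale=sd) - norm.cdf(145, loc=135, scale=sd)
print(round(p, 3))   # about 0.156
```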
39.10. A. λ is the hazard rate of each Exponential, one over the mean. We are mixing via a
Gamma. Therefore, the mixed distribution is Pareto with parameters α = 3 and θ = 1/0.01 = 100.
(This is mathematically the same as the Inverse Gamma - Exponential.)
Thus, S(250) = {100/(100 + 250)}³ = 2.3%.
Alternately, h(x | λ) = λ a(x). For the Exponential, h(x) = λ.
Thus, this is a frailty model with a(x) = 1 and A(x) = x.
S(x) = M_Λ[-A(x)].
The moment generating function of this Gamma Distribution is: M(t) = {1/(1 - 0.01t)}³.
Thus S(x) = {1/(1 + 0.01x)}³ = {100/(100 + x)}³.
Thus, S(250) = {100/(100 + 250)}³ = 2.3%.


39.11. A. The Inverse Weibull has density: 4 x^(-5) q⁴ exp(-(q/x)⁴).
The density of q is that of the Transformed Gamma Distribution:
τ(q/θ)^(τα) exp(-(q/θ)^τ) / {q Γ(α)} = 4 q^4.2 11^(-5.2) exp(-(q/11)⁴) / Γ(1.3).
The density of the mixed distribution is obtained by integrating the Inverse Weibull density times
the density of q:
∫_0^∞ 4x^(-5) q⁴ exp(-(q/x)⁴) 4 q^4.2 11^(-5.2) exp(-(q/11)⁴) / Γ(1.3) dq
= {(16) 11^(-5.2) x^(-5) / Γ(1.3)} ∫_0^∞ q^8.2 exp(-q⁴(11^(-4) + x^(-4))) dq.
Make the change of variables, y = q⁴(11^(-4) + x^(-4)). q = (11^(-4) + x^(-4))^(-1/4) y^(1/4).
dq = (1/4)(11^(-4) + x^(-4))^(-1/4) y^(-3/4) dy.
= {(16) 11^(-5.2) x^(-5) / Γ(1.3)} (11^(-4) + x^(-4))^(-2.05) ∫_0^∞ y^2.05 exp(-y) (1/4)(11^(-4) + x^(-4))^(-1/4) y^(-3/4) dy
= {(4) 11^(-5.2) x^(-5) (11^(-4) + x^(-4))^(-2.3) / Γ(1.3)} ∫_0^∞ y^1.3 exp(-y) dy
= {(4) 11^(-5.2) x^(-5) (11^(-4) + x^(-4))^(-2.3) / Γ(1.3)} Γ(2.3) = (4)(1.3) 11^(-5.2) x^(-5) (11^(-4) + x^(-4))^(-2.3)
= (4)(1.3) 11^(-5.2) x^(-5) x^9.2 {1 + (x/11)⁴}^(-2.3) = (4)(1.3) (x/11)^5.2 {1 + (x/11)⁴}^(-2.3) /x.
This is the density of an Inverse Burr Distribution, τγ(x/θ)^(γτ) {1+(x/θ)^γ}^(-(τ + 1)) /x,
with parameters τ = 1.3, θ = 11, and γ = 4. Therefore, the mixed distribution is:
F(x) = {(x/θ)^γ/(1+(x/θ)^γ)}^τ = {1+(11/x)⁴}^(-1.3). F(10) = {1 + (11/10)⁴}^(-1.3) = 31.0%.
Comment: In general, if one mixes an Inverse Weibull with τ = t fixed, with its scale parameter
varying via a Transformed Gamma Distribution, with parameters α, θ, and τ = t, then the mixed
distribution is an Inverse Burr with parameters τ = α, θ, and γ = t.
For each Inverse Weibull, S(10) = exp(-(q/10)⁴). One could instead average S(10) for each
Inverse Weibull, in order to get S(10) for the mixed distribution:
∫_0^∞ exp(-(q/10)⁴) 4 q^4.2 11^(-5.2) exp(-(q/11)⁴) / Γ(1.3) dq
= {4 (11^(-5.2)) / Γ(1.3)} ∫_0^∞ exp(-0.0001683 q⁴) q^4.2 dq
= {(11^(-5.2)) / Γ(1.3)} ∫_0^∞ exp(-0.0001683 y) y^0.3 dy = {(11^(-5.2))/Γ(1.3)} Γ(1.3)(0.0001683)^(-1.3) = 31.0%.

39.12. B. Let the inflation factor be y. Then given y, in the year 2007 the losses have an
Exponential Distribution with mean 3000y. Let z = 3000y. Then since y follows an Inverse Gamma
with parameters α = 4 and scale parameter θ = 3.5, z follows an Inverse Gamma with parameters
α = 4 and θ = (3000)(3.5) = 10,500. Thus in the year 2007, we have a mixture of Exponentials each
with mean z, with z following an Inverse Gamma. This is the (same mathematics as the) Inverse
Gamma-Exponential.
For the Inverse Gamma-Exponential the mixed distribution is a Pareto, with α = shape parameter of
the Inverse Gamma and θ = scale parameter of the Inverse Gamma.
In this case the mixed distribution is a Pareto with α = 4 and θ = 10,500.
For this Pareto, S(5500) = {1 + (5500/10500)}^(-4) = 18.5%.
Comment: This is an example of parameter uncertainty. We assume that the loss distribution in
year 2007 will also be an Exponential, we just are currently uncertain of its parameter.
39.13. B. h(x | λ) = λ a(x). For an Exponential a(x) = 1, and A(x) = ∫_0^x a(t) dt = x.
The moment generating function of this Inverse Gaussian Distribution is:
M(t) = exp[(θ/μ)(1 - √(1 - 2tμ²/θ))] = exp[(1/3)(1 - √(1 - 0.09t))].
S(x) = M_Λ[-A(x)] = exp[(1/3)(1 - √(1 + 0.09x))].
Thus, S(65) = exp[(1/3)(1 - √(1 + (0.09)(65)))] = 58%.

39.14. B. From the previous solution, S(x) = exp[(1/3)(1 - √(1 + 0.09x))].
S(40) = exp[(1/3)(1 - √(1 + (0.09)(40)))] = 0.683.
Differentiating, f(x) = exp[(1/3)(1 - √(1 + 0.09x))] (1/3)(0.09)(1/2)/√(1 + 0.09x).
f(40) = exp[(1/3)(1 - √(1 + (0.09)(40)))] (0.015)/√(1 + (0.09)(40)) = 0.00478.
h(40) = f(40)/S(40) = 0.00478/0.683 = 0.0070.
Alternately, for a frailty model, h(x) = a(x) d ln M_Λ(t)/dt, evaluated at t = -A(x).
M(t) = exp[(θ/μ)(1 - √(1 - 2tμ²/θ))] = exp[(1/3)(1 - √(1 - 0.09t))].
ln M(t) = (1/3)(1 - √(1 - 0.09t)).
d ln M(t)/dt = (1/3)(0.09)(1/2)/√(1 - 0.09t).
h(x | λ) = λ a(x). For an Exponential a(x) = 1, and A(x) = ∫_0^x a(t) dt = x.
h(x) = (1)(1/3)(0.09)(1/2)/√(1 + 0.09x).
h(40) = 0.015/√(1 + (0.09)(40)) = 0.0070.

39.15. C. For the Inverse Gamma, f(x | q) = q^α e^(-q/x) / {Γ[α] x^(α+1)} = q⁴ e^(-q/x) / {6 x⁵}.
For the Exponential, u(q) = e^(-q/100)/100.
f(x) = ∫_0^∞ f(x | q) u(q) dq = ∫_0^∞ {q⁴ e^(-q/x)/(6x⁵)} {e^(-q/100)/100} dq = ∫_0^∞ q⁴ e^(-q(1/x + 1/100)) dq / (600x⁵)
= {Γ[5]/(1/x + 1/100)⁵} / (600x⁵) = {(24)(100⁵)/(x + 100)⁵} / 600 = (4)(100⁴)/(x + 100)⁵.
This is the density of a Pareto Distribution with parameters α = 4 and θ = 100.
Therefore, F(x) = 1 - {θ/(x+θ)}^α = 1 - {100/(x + 100)}⁴. S(15) = (100/115)⁴ = 57.2%.
Comment: An example of an Exponential-Inverse Gamma.


39.16. For a Normal Distribution with mean zero and variance v:
f(x | v) = exp[-x²/(2v)]/√(2πv).
An Inverse Gamma with α = 10 and θ = 10 has density:
g(v) = 10^10 e^(-10/v) v^(-11) / Γ(10), v > 0.
The mixed distribution has density:
∫_0^∞ {exp[-x²/(2v)]/√(2πv)} 10^10 e^(-10/v) v^(-11) / Γ(10) dv = {10^10 / (9! √(2π))} ∫_0^∞ exp[-(10 + x²/2)/v] v^(-11.5) dv
= {10^10 / (Γ(10)√(2π))} Γ(10.5) / (10 + x²/2)^10.5 = (1 + x²/20)^(-10.5) Γ(10.5) / {Γ(10) Γ(1/2) √20}.
This is a Student's t distribution with 20 degrees of freedom.
Comment: Difficult! Since the Inverse Gamma Density integrates to one over its support,
∫_0^∞ exp[-θ/x] x^(-(α+1)) dx = Γ(α) / θ^α. Also, Γ(1/2) = √π.
A Student's t distribution with ν degrees of freedom has density:
f(x) = 1/{(1 + x²/ν)^((ν+1)/2) β[ν/2, 1/2] ν^0.5}, where β[ν/2, 1/2] = Γ(1/2)Γ(ν/2)/Γ((ν+1)/2).
39.17. D. The mixed distribution is a Pareto with shape parameter α = 3 and
scale parameter θ = 20, with variance: (2)(20²)/{(3-1)(3-2)} - {20/(3-1)}² = 400 - 100 = 300.
Alternately, Var[X | λ] = Variance[Exponential Distribution with mean λ] = λ².
f(λ) is an Inverse Gamma Distribution, with θ = 20 and α = 3.
E[Var[X | λ]] = E[λ²] = 2nd moment of Inverse Gamma = 20²/{(3-1)(3-2)} = 200.
Var[E[X | λ]] = Var[λ] = variance of Inverse Gamma =
2nd moment of Inverse Gamma - (mean of Inverse Gamma)² = 200 - {20/(3-1)}² = 100.
Var[X] = E[Var[X | λ]] + Var[E[X | λ]] = 200 + 100 = 300.


39.18. E. A(x) = ∫_0^x a(t) dt = √(1 + 0.008x) - 1.
The moment generating function of a Gamma Distribution with α = 6 and θ = 1 is:
M(t) = {1/(1 - t)}⁶.
S(x) = M_Λ[-A(x)] = {1/(1 + A(x))}⁶ = {1/√(1 + 0.008x)}⁶ = {1/(1 + 0.008x)}³.
Thus, S(11) = {1/(1 + (0.008)(11))}³ = 77.6%.
Comment: See Exercise 5.14 in Loss Models.
The mixture is a Pareto Distribution with α = 3, and θ = 1/0.008 = 125.
39.19. D. Let the inflation factor be y. Then given y, in the next year the losses have an
Exponential Distribution with mean 8000y. Let z = 8000y. Then since y follows an Inverse Gamma
with parameters α = 2.5 and scale parameter θ = 1.6, z follows an Inverse Gamma with parameters
α = 2.5 and θ = (8000)(1.6) = 12,800. Thus next year, we have a mixture of Exponentials each with
mean z, with z following an Inverse Gamma. This is the (same mathematics as the) Inverse Gamma-
Exponential. For the Inverse Gamma-Exponential the mixed distribution is a Pareto, with α = shape
parameter of the Inverse Gamma, and θ = scale parameter of the Inverse Gamma.
In this case the mixed distribution is a Pareto with α = 2.5 and θ = 12,800.
For this Pareto Distribution, S(10,000) = {1 + (10000/12800)}^(-2.5) = 23.6%.
39.20. B. ln[x] follows a Normal with parameters μ and 0.3.
Therefore, we are mixing a Normal with fixed variance via another Normal.
Therefore, the mixture of ln[x] is Normal with parameters 5, and √(0.3² + 0.4²) = 0.5.
Thus the mixture of x is LogNormal with parameters 5 and 0.5.
S(200) = 1 - Φ[{ln(200) - 5}/0.5] = 1 - Φ[0.60] = 27.43%.

2013-4-2,

Loss Distributions, 39 Continuous Mixtures

HCM 10/8/12,

Page 820

39.21. E. & 39.22. B.
For the Exponential, the loss elimination ratio is equal to the distribution function: LER(x) = 1 - e^(-x/δ).
The mixed distribution of the size of loss is Pareto with the same parameters, α = 3 and θ = 1000.
E[LER(x)] = E[1 - e^(-x/δ)] = E[F(x; δ)] = ∫ f(δ) F(x; δ) dδ
= distribution function of the mixture = distribution function of the Pareto = 1 - {1000/(x + 1000)}³.
E[LER(500)] = 1 - (10/15)³ = 0.70370.
E[LER(200)] = 1 - (10/12)³ = 0.42130.
E[LER(x) LER(y)] = E[(1 - e^(-x/δ))(1 - e^(-y/δ))] = E[1 - e^(-x/δ) - e^(-y/δ) + e^(-(x+y)/δ)]
= 1 - E[S(x; δ)] - E[S(y; δ)] + E[S(x + y; δ)]
= 1 - {1000/(x + 1000)}³ - {1000/(y + 1000)}³ + {1000/(x + y + 1000)}³.
E[LER(200) LER(500)] = 1 - (10/12)³ - (10/15)³ + (10/17)³ = 0.32854.
Cov[LER(200), LER(500)] = E[LER(200) LER(500)] - E[LER(200)] E[LER(500)] =
0.32854 - (0.42130)(0.70370) = 0.03207.
E[LER(x)²] = E[LER(x) LER(x)] = 1 - 2{1000/(x + 1000)}³ + {1000/(2x + 1000)}³.
E[LER(200)²] = 1 - (2)(10/12)³ + (10/14)³ = 0.20702.
Var[LER(200)] = 0.20702 - 0.42130² = 0.02953.
E[LER(500)²] = 1 - (2)(10/15)³ + (10/20)³ = 0.53241.
Var[LER(500)] = 0.53241 - 0.70370² = 0.03722.
Corr[LER(200), LER(500)] = 0.03207/√{(0.02953)(0.03722)} = 96.73%.
Comment: The loss elimination ratios for deductibles of somewhat similar sizes are highly correlated
across classes.
For a practical example for excess ratios, see Tables 3 and 4 in NCCI's 2007 Hazard Group
Mapping, by John P. Robertson, Variance, Vol. 3, Issue 2, 2009, not on the syllabus of this exam.

39.23. B. f(δ) is an Inverse Gamma Distribution, with θ = 20 and α = 2.
p(y) is an Exponential Distribution with E[Y | δ] = δ.
Therefore the mean severity = E[E[Y | δ]] = E[δ] = ∫ δ f(δ) dδ =
mean of Inverse Gamma = θ/(α-1) = 20/(2-1) = 20.
Alternately, the mixed distribution is a Pareto with shape parameter α = 2,
and scale parameter θ = 20. Therefore this Pareto has mean 20/(2-1) = 20.
Comment: One can do the relevant integral via the substitution x = 1/δ, dx = -dδ/δ²:
∫ δ f(δ) dδ = ∫ δ (400/δ³) e^(-20/δ) dδ = ∫ 400 e^(-20/δ) dδ/δ² = ∫ 400 e^(-20x) dx = 400/20 = 20.

39.24. C. This is an Exponential mixed via an Inverse Gamma. The Inverse Gamma has
parameters α = 3 and θ = 2. Therefore the (prior) mixed distribution is a Pareto, with α = 3 and θ = 2.
Thus f(x) = (3)(2³)(2 + x)^(-4). f(3) = (3)(8)/5⁴ = 0.0384.
Alternately, one can compute the unconditional density at y = 3 via integration:
f(3) = ∫_0^∞ f(3 | δ) p(δ) dδ = ∫_0^∞ (1/δ)exp(-3/δ)(4/δ⁴)exp(-2/δ) dδ = ∫_0^∞ 4δ^(-5) exp(-5/δ) dδ.
Let x = 5/δ and dx = (-5/δ²)dδ in the integral:
f(3) = (4/5⁴) ∫_0^∞ x³ exp(-x) dx = (4/625) Γ(4) = (4/625)(3!) = (4/625)(6) = 0.0384.
Alternately, one can compute the unconditional density at y via integration:
f(y) = ∫_0^∞ (1/δ)exp(-y/δ)(4/δ⁴)exp(-2/δ) dδ = ∫_0^∞ 4δ^(-5) exp(-(2+y)/δ) dδ.
Let x = (2+y)/δ and dx = -{(2+y)/δ²}dδ in the integral:
f(y) = {4/(2+y)⁴} ∫_0^∞ x³ exp(-x) dx = {4/(2+y)⁴} Γ(4) = {4/(2+y)⁴}(3!) = 24(2+y)^(-4).
f(3) = 0.0384.
Comment: If one recognizes this as a Pareto with θ = 2 and α = 3, then one can determine the
constant by looking in Appendix A of Loss Models, rather than doing the Gamma integral.


39.25. E. The hazard rate λ for an Exponential is one over its mean. The mean is 1/λ, not λ. The
survival function is S(t; λ) = e^(-λt). S(0.5; λ) = e^(-0.5λ). Mixing over the different values of λ:
S(0.5) = ∫_1^11 S(0.5; λ) f(λ) dλ = ∫_1^11 e^(-0.5λ) (1/10) dλ = (-1/5)e^(-0.5λ) ]_1^11 = (e^(-0.5) - e^(-5.5))/5 = (0.607 - 0.004)/5 = 0.12.
Comment: The mean future lifetime given λ is 1/λ. The overall mean future lifetime is:
∫_1^11 (1/λ) f(λ) dλ = ∫_1^11 (1/λ)(1/10) dλ = ln(λ)/10 ]_1^11 = 0.24.

39.26. D. For a constant force of mortality, μ, the distribution function is Exponential:
F(t | μ) = 1 - e^(-μt). F(1 | μ) = 1 - e^(-μ).
The forces of mortality are uniformly distributed over the interval (0, 2). π(μ) = 1/2, 0 ≤ μ ≤ 2.
Taking the average over the values of μ:
F(1) = ∫_0^2 F(1 | μ) π(μ) dμ = ∫_0^2 (1 - e^(-μ))/2 dμ = 1 - (1 - e^(-2))(1/2) = 0.568.
Alternately, one can work with the means θ = 1/μ, which is harder.
μ is uniform from 0 to 2. The distribution function of μ is: F(μ) = μ/2, 0 ≤ μ ≤ 2.
The distribution function of θ is:
F(θ) = 1 - F_μ(1/θ) = 1 - 1/(2θ), θ ≥ 1/2.
The density function of θ is: 1/(2θ²), θ ≥ 1/2.
Given θ, the probability of death by time 1 is: 1 - e^(-1/θ).
Taking the average over the values of θ:
F(1) = ∫_{1/2}^∞ {1 - e^(-1/θ)}/(2θ²) dθ = 1 - ∫_{1/2}^∞ e^(-1/θ)/(2θ²) dθ = 1 - (1/2) e^(-1/θ) ]_{θ=1/2}^{θ=∞}
= 1 - (1 - e^(-2))/2 = 0.568.
Comment: F(1 | θ = 1) = 1 - e^(-1) = 0.632. Thus choices A and B are unlikely to be correct.


39.27. D. If the severity is Normal with fixed variance s², and the mixing distribution of their means
is also Normal with mean μ and variance σ², then the mixed distribution is another Normal, with mean μ
and variance: s² + σ².
In this case, the mixed distribution is Normal with mean 75 and variance: 8² + 6² = 100.
Prob[there is a payment] = Prob[Score > 65] = 1 - Φ[(65 - 75)/10] = 1 - Φ[-1] = 0.8413.
Prob[90 > Score > 65] = Φ[(90 - 75)/10] - Φ[(65 - 75)/10] = Φ[1.5] - Φ[-1] = 0.9332 - 0.1587 = 0.7745.
Prob[payment < 90 | payment > 0] = Prob[90 > Score > 65 | Score > 65] =
Prob[90 > Score > 65]/Prob[Score > 65] = 0.7745/0.8413 = 0.9206.
39.28. D. Y is Gamma with α = θ = 2. Therefore, F(y) = Γ[2; y/2].
Let the mean of each Exponential Distribution be δ = 1/y. Then F(δ) = 1 - Γ[2; (1/2)/δ].
Therefore, δ has an Inverse Gamma Distribution with α = 2 and θ = 1/2.
This is an Inverse Gamma - Exponential with mixed distribution a Pareto with α = 2 and θ = 1/2.
F(x) = 1 - {θ/(x + θ)}^α = 1 - {0.5/(x + 0.5)}². F(1/2) = 1 - (0.5/1)² = 0.75.
Alternately, Prob[T > t | y] = e^(-yt). Prob[T > t] = ∫ e^(-yt) f(y) dy = M_Y[-t].
The moment generating function of a Gamma Distribution is: 1/(1 - θt)^α.
Therefore, the moment generating function of Y is: 1/(1 - 2t)².
Prob[T > 1/2] = M_Y[-1/2] = 1/2² = 1/4. Prob[T < 1/2] = 1 - 1/4 = 3/4.
Alternately, f(y) = y e^(-y/2)/(Γ(2) 2²) = y e^(-y/2)/4. Therefore, the mixed distribution is:
F(x) = ∫_0^∞ (1 - e^(-xy)) y e^(-y/2)/4 dy = 1 - (1/4)∫_0^∞ y e^(-y(x + 0.5)) dy = 1 - 0.25/(x + 0.5)².
F(1/2) = 1 - 0.25 = 0.75.
Alternately, the length of time until the forgetting is analogous to the time until the first claim.
This time is Exponential with mean 1/Y and is mathematically the same as a Poisson Process with
intensity Y. Since Y has a Gamma Distribution, this is mathematically the same as a
Gamma-Poisson. Remembering less than 1/2 year is analogous to at least one claim by time 1/2.
Over 1/2 year, Y has a Gamma Distribution with α = 2 and instead θ = 2/2 = 1.
The mixed distribution is Negative Binomial, with r = α = 2 and β = θ = 1.
1 - f(0) = 1 - 1/(1 + 1)² = 3/4.


39.29. B. The present value of a continuous annuity of length t is: (1 - e^(-δt))/δ.
Given constant force of mortality μ, the lifetimes are exponential with density f(t) = μe^(-μt).
For fixed μ, APV = ∫_0^∞ {(1 - e^(-δt))/δ} μe^(-μt) dt = {1 - μ/(μ + δ)}/δ = 1/(μ + δ) = 1/(μ + 0.01).
μ in turn is uniform from 0.01 to 0.02 with density 100.
Mixing over μ, Actuarial Present Value = ∫_0.01^0.02 100/(μ + 0.01) dμ = 100 ln(3/2) = 40.55.


Section 40, Spliced Models332


A spliced model allows one to have different behaviors for different sizes of loss. For example as
discussed below, one could splice together an Exponential Distribution for small losses and a Pareto
Distribution for large losses. This would differ from a two-point mixture of an Exponential and Pareto,
in which each distribution would contribute its density to all sizes of loss.
A Simple Example of a Splice:
Assume f(x) = 0.01 for 0 < x < 10, and f(x) = 0.009 e^0.1 e^(-x/100) for x > 10.
Exercise: Show that this f(x) is a density.
[Solution: f(x) ≥ 0.
∫_0^∞ f(x) dx = ∫_0^10 0.01 dx + ∫_10^∞ 0.009 e^0.1 e^(-x/100) dx = 0.1 + 0.9 e^0.1 e^(-10/100) = 1.]
Here is a graph of this density:
[Graph of the density: constant at 0.01 up to 10, then decaying exponentially, shown for sizes 0 to 200.]
We note that this density is discontinuous at 10.


This is an example of a 2-component spliced model. From 0 to 10 it is proportional to a uniform
density and above 10 it is proportional to an Exponential density.
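A brief numerical check of this simple splice (an illustration, not from the text) confirms that the density integrates to one and that 10% of the probability lies below the breakpoint:

```python
import numpy as np
from scipy import integrate

# The simple spliced density: uniform-proportional on (0, 10), Exponential-proportional above 10.
def f(x):
    return 0.01 if x < 10 else 0.009 * np.exp(0.1) * np.exp(-x / 100.0)

below, _ = integrate.quad(f, 0, 10)
above, _ = integrate.quad(f, 10, np.inf)
print(below, above, below + above)   # 0.1, 0.9, 1.0 -- f is a density
```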
332

See Section 5.2.6 of Loss Models.


Two-Component Splices:
In general a 2-component spliced model would have density:
f(x) = w1 f1(x) on (a1, b1) and f(x) = w2 f2(x) on (a2, b2), where f1(x) is a density with support
(a1, b1), f2(x) is a density with support (a2, b2), and w1 + w2 = 1.
In the example, f1(x) = 1/10 on (0, 10), f2(x) = e^0.1 e^(-x/100)/100 on (10, ∞), w1 = 0.1, and w2 = 0.9.
f1 is the uniform distribution on (0, 10). f2 is proportional to an Exponential with θ = 100.
In order to make f2 a density on (10, ∞), we have divided by S(10) = e^(-10/100) = e^(-0.1).333
A Splice of an Exponential and a Pareto:
Assume an Exponential with θ = 50.
On the interval (0, 100) this would have probability F(100) = 1 - e^(-2) = 0.8647.
In order to turn this into a density on (0, 100), we would divide this Exponential density by 0.8647:
(e^(-x/50)/50) / 0.8647 = 0.02313 e^(-x/50).
This integrates to one from 0 to 100.
Assume a Pareto Distribution with α = 3 and θ = 200, with density (3)(200³)/(200 + x)⁴ =
0.015/(1 + x/200)⁴.
On the interval (100, ∞) this would have probability S(100) = {θ/(θ + x)}^α = (200/300)³ = 8/27.
In order to turn this into a density on (100, ∞), we would multiply by 27/8:
(27/8) {0.015/(1 + x/200)⁴} = 0.050625/(1 + x/200)⁴.
This integrates to one from 100 to ∞.
So we would have f1(x) = 0.02313 e^(-x/50) on (0, 100), f2(x) = 0.050625/(1 + x/200)⁴ on (100, ∞).
We could use any weights w1 and w2 as long as they add to one, so that the spliced density will
integrate to one.
If we took for example, w1 = 70% and w2 = 30%, then the spliced density would be:
(0.7)(0.02313 e^(-x/50)) = 0.01619 e^(-x/50) on (0, 100), and
(0.3)(0.050625/(1 + x/200)⁴) = 0.0151875/(1 + x/200)⁴ on (100, ∞).
333 This is how one alters the density in the case of truncation from below.


This 2-component spliced density looks as follows:
[Graph of the spliced density, plotted for sizes 0 to 400.]
It is not continuous at 100.


This spliced density is: 0.01619 e^(-x/50) on (0, 100), and 0.01519/(1 + x/200)⁴ on (100, ∞)
= (0.7)(0.02313 e^(-x/50)) on (0, 100), and (0.3)(0.050625/(1 + x/200)⁴) on (100, ∞)
= (0.7)(e^(-x/50)/50)/(1 - e^(-100/50)) on (0, 100), and (0.3){(3)(200³)/(200 + x)⁴}/{(200/300)³} on (100, ∞)
= (0.8096)(Exponential[50]) on (0, 100), and (1.0125)(Pareto[3, 200]) on (100, ∞).
Exercise: What is the distribution function of this splice at 80?
[Solution: (0.8096)(Exponential Distribution function at 80) = (0.8096)(1 - e^(-80/50)) = 0.6461.]
Exercise: What is the survival function of the splice at 300?
[Solution: (1.0125)(Pareto Survival function at 300) = (1.0125){200/(200 + 300)}³ = 0.0648.]
In general, it is easier to work with the distribution function of the first component of the splice below
the breakpoint and the survival function of the second component above the breakpoint.
Note that at the breakpoint of 100, the distribution function of the splice is:
(0.8096)(Exponential Distribution function at 100) = (0.8096)(1 - e^(-100/50)) = 0.700 =
1 - (1.0125)(Pareto Survival function at 100) = 1 - (1.0125){200/(200 + 100)}³.


Assume we had originally written the splice as:334
c1 Exponential[50] on (0, 100), and c2 Pareto[3, 200] on (100, ∞).
Since we chose weights of 70% & 30%, we want:
70% = c1 ∫_0^100 Exponential[50] dx = c1(1 - e^(-100/50)).
c1 = 70% / (1 - e^(-100/50)) = 0.8096.
30% = c2 ∫_100^∞ Pareto[3, 200] dx = c2 {200/(200 + 100)}³ = 0.29630 c2.
c2 = 30% / 0.29630 = 1.0125.
Therefore, as shown previously, the spliced density can be written as:
(0.8096)(Exponential[50]) on (0, 100), and (1.0125)(Pareto[3, 200]) on (100, ∞).
[Graph of the spliced density written in this form, plotted for sizes 0 to 400.]
334 As discussed, while this is mathematically equivalent, this is not the manner in which the splice would be written in
Loss Models, which uses f1, f2, w1, and w2.


Continuity:
With appropriate values of the weights, a splice will be continuous at its breakpoint.
Exercise: Choose w1 and w2 so that the above spliced density would be continuous at 100.
[Solution: f1(100) = 0.02313 e^(-100/50) = 0.00313. f2(100) = 0.050625/(1 + 100/200)⁴ = 0.01.
In order to be continuous at 100, we need w1 f1(100) = w2 f2(100) = (1 - w1) f2(100).
w1 = f2(100)/{f1(100) + f2(100)} = 0.01/(0.00313 + 0.01) = 0.762. w2 = 1 - w1 = 0.238.]
If we take f(x): (0.762)(0.02313 e^(-x/50)) = 0.01763 e^(-x/50) on (0, 100), and
(0.238){0.050625/(1 + x/200)⁴} = 0.01205/(1 + x/200)⁴ on (100, ∞), then f(x) is continuous:
[Graph of the continuous spliced density, plotted for sizes 0 to 400.]

The density of an Exponential Distribution with mean 50 is: e^(-x/50)/50.
The density of a Pareto Distribution with α = 3 and θ = 200 is: (3)(200³)/(200 + x)⁴ = (3/200)/(1 + x/200)⁴.
(0.01763)(50) = 0.8815. (0.01205)(200/3) = 0.8033.
Therefore, similar to the noncontinuous splice, we could rewrite this continuous splice as:
(0.8815)(Exponential[50]) on (0, 100), and (0.8033)(Pareto[3, 200]) on (100, ∞).
In general, a 2-component spliced density will be continuous at the breakpoint b,
provided the weights are inversely proportional to the component densities at the breakpoint:
w1 = f2(b)/{f1(b) + f2(b)}, w2 = f1(b)/{f1(b) + f2(b)}.335
335 While this spliced density will be continuous at the breakpoint, it will not be differentiable at the breakpoint.
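The continuity weights can be computed directly from this formula; the sketch below (an illustration, not from the text) applies it to the Exponential[50] / Pareto[3, 200] splice with breakpoint 100:

```python
import numpy as np

# Continuity weights for the Exponential[50] / Pareto[3, 200] splice at b = 100.
b, theta_exp, alpha, theta_par = 100.0, 50.0, 3.0, 200.0

# f1 and f2 are the component densities renormalized to (0, b) and (b, infinity).
f1 = (np.exp(-b / theta_exp) / theta_exp) / (1 - np.exp(-b / theta_exp))
f2 = (alpha * theta_par**alpha / (theta_par + b)**(alpha + 1)) / (theta_par / (theta_par + b))**alpha

w1 = f2 / (f1 + f2)
print(round(w1, 3), round(1 - w1, 3))   # about 0.762 and 0.238
```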


Moments:
One could compute the moments of a spliced density by integrating xⁿ f(x).
For example, the mean of this continuous 2-component spliced density is:
∫_0^100 x (0.01763 e^(-x/50)) dx + ∫_100^∞ x {0.01205/(1 + x/200)⁴} dx.
Exercise: Compute the integral ∫_0^100 x (0.01763 e^(-x/50)) dx.
[Solution: ∫_0^100 x (0.01763 e^(-x/50)) dx = 0.01763 {-50x e^(-x/50) - 50² e^(-x/50)} ]_{x=0}^{x=100} = 26.18.
Alternately, as discussed previously, the first component of the continuous splice is:
(0.8815) Exponential[50], on (0, 100).
Now ∫_0^100 x f_exp(x) dx = E[X ∧ 100] - 100 S_exp(100) = 50(1 - e^(-100/50)) - 100 e^(-100/50) = 29.700.
Thus ∫_0^100 x (0.01763 e^(-x/50)) dx = 0.8815 ∫_0^100 x f_exp(x) dx = (0.8815)(29.700) = 26.18.
Comment: ∫ x e^(-x/θ) dx = -θx e^(-x/θ) - θ² e^(-x/θ).]

Exercise: Compute the integral ∫_100^∞ x {0.01205/(1 + x/200)⁴} dx.

[Solution: One can use integration by parts.
∫_100^∞ x {0.01205/(1 + x/200)⁴} dx = 0.01205 {-x (200/3)/(1 + x/200)³ ]_{x=100}^{x=∞} + ∫_100^∞ (200/3)/(1 + x/200)³ dx}
= (0.01205)(20,000/3)(1/1.5³ + 1/1.5²) = 59.51.
Alternately, as discussed previously, the second component of the continuous splice is:
(0.8033) Pareto[3, 200], on (100, ∞).
Now ∫_100^∞ x f_Pareto(x) dx = ∫_100^∞ (x - 100) f_Pareto(x) dx + ∫_100^∞ 100 f_Pareto(x) dx =
e_Pareto(100) S_Pareto(100) + 100 S_Pareto(100) = {(100 + 200)/(3 - 1) + 100} {200/(200 + 100)}³ = 74.074.
Thus ∫_100^∞ x {0.01205/(1 + x/200)⁴} dx = 0.8033 ∫_100^∞ x f_Pareto(x) dx = (0.8033)(74.074) = 59.51.
Comment: e(x) = ∫_x^∞ (t - x) f(t) dt / S(x). For a Pareto Distribution, e(x) = (x + θ)/(α - 1).]
Thus the mean of this continuous splice is:
∫_0^100 x (0.01763 e^(-x/50)) dx + ∫_100^∞ x {0.01205/(1 + x/200)⁴} dx = 26.18 + 59.51 = 85.69.


More generally, assume we have a splice which is w1 h1(x) on (0, b) and w2 h2(x) on (b, ∞),
where h1(x) = f1(x)/F1(b) and h2(x) = f2(x)/S2(b). Then the mean of this spliced density is:
∫_0^b x w1 h1(x) dx + ∫_b^∞ x w2 h2(x) dx = {w1/F1(b)} ∫_0^b x f1(x) dx + {w2/S2(b)} ∫_b^∞ x f2(x) dx
= {w1/F1(b)} {E[X1 ∧ b] - b S1(b)} + {w2/S2(b)} {E[X2] - ∫_0^b x f2(x) dx}
= {w1/F1(b)} {E[X1 ∧ b] + b F1(b) - b} + {w2/S2(b)} {E[X2] + b S2(b) - E[X2 ∧ b]}
= {w1/F1(b)} {E[X1 ∧ b] - b} + b w1 + b w2 + {w2/S2(b)} {E[X2] - E[X2 ∧ b]}
= b + {w1/F1(b)} {E[X1 ∧ b] - b} + {w2/S2(b)} {E[X2] - E[X2 ∧ b]}.
For the example of the continuous splice, b = 100, F1(100) = 1 - e^(-100/50) = 0.8647,
E[X1 ∧ b] = 50(1 - e^(-100/50)) = 43.235,336 S2(b) = (200/300)³ = 8/27, E[X2] = 200/(3-1) = 100,
E[X2 ∧ b] = 100(1 - (2/3)²) = 55.556.337 Therefore, for w1 = 0.762 and w2 = 0.238, the mean is:
100 + (0.762/0.8647)(43.235 - 100) + (0.238)(27/8)(100 - 55.556) = 85.68,
matching the previous result subject to rounding.
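This formula is easy to code; a minimal sketch (an illustration, not from the text) reproducing the 85.68 figure:

```python
import numpy as np

# Mean of the continuous Exponential[50] / Pareto[3, 200] splice, b = 100, w1 = 0.762, w2 = 0.238.
b, w1, w2 = 100.0, 0.762, 0.238
theta1 = 50.0                    # Exponential component
alpha, theta2 = 3.0, 200.0       # Pareto component

F1_b = 1 - np.exp(-b / theta1)
E1_lim = theta1 * (1 - np.exp(-b / theta1))                                        # E[X1 ^ b]
S2_b = (theta2 / (theta2 + b)) ** alpha
E2 = theta2 / (alpha - 1)                                                          # E[X2]
E2_lim = (theta2 / (alpha - 1)) * (1 - (theta2 / (theta2 + b)) ** (alpha - 1))     # E[X2 ^ b]

mean = b + (w1 / F1_b) * (E1_lim - b) + (w2 / S2_b) * (E2 - E2_lim)
print(round(mean, 2))   # about 85.68
```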
n-Component Splices:
In addition to 2-component splices, one can have 3-components, 4-components, etc.
In a three component splice there are three intervals and the spliced density is:
w1 f1 (x) on (a1 , b1 ), w2 f2 (x) on (a2 , b2 ), and w3 f3 (x) on (a3 , b3 ), where f1 (x) is a density with
support (a1 , b1 ), f2 (x) is a density with support (a2 , b2 ), f3 (x) is a density with support (a3 , b3 ), and
w1 + w2 + w3 = 1.
Previously, when working with grouped data we had discussed assuming a uniform distribution on
each interval. This is an example of an n-component splice, with n equal to the number of intervals
for the grouped data, and with each component of the splice uniform.
336 For the Exponential Distribution, E[X ∧ d] = θ(1 - e^(-d/θ)).
337 For the Pareto Distribution, E[X ∧ d] = {θ/(α-1)}{1 - (θ/(d+θ))^(α-1)}.

Using the Empirical Distribution:


One common use of splicing, is to use the Empirical Distribution function (or a smoothed version of
it) for small losses, and some parametric distribution to model large losses.338
For example, take the ungrouped data in Section 1. We could model the losses of size less than
100,000 using the Empirical Distribution Function, and use a Pareto Distribution to model the losses
of size greater than 100,000. There are 57 out of 130 losses of size less than 100,000.
Therefore, the Empirical Distribution Function at 100,000 is: 57/130 = 0.4385.
A Pareto Distribution with α = 2 and θ = 298,977 has F(100000) = 1 - (298,977/398,977)² =
0.4385, matching the Empirical Distribution Function.
Thus one could splice together this Pareto Distribution from 100,000 to ∞, and the Empirical
Distribution Function from 0 to 100,000. Here is what this spliced survival function looks like:
[Graph of the spliced survival function: the Empirical Distribution below 100,000 and the Pareto tail above, plotted out to 900,000.]
338 A variation of this technique is used in "Workers Compensation Excess Ratios, an Alternative Method," by
Howard C. Mahler, PCAS 1998.


Using a Kernel Smoothed Density:339


Rather than use the Empirical Distribution, one could use a kernel smoothed version of the Empirical
Distribution. For example, one could splice together the same Pareto Distribution above 100,000,
and below 100,000 the kernel smoothed density for the ungrouped data in Section 1, using a
uniform kernel with a bandwidth of 5000.
Here is one million times this spliced density:
[Graph of the spliced density: the kernel smoothed density below 100,000 and the Pareto above, plotted out to 900,000.]
339 Kernel Smoothing is discussed in Mahler's Guide to Fitting Loss Distributions.

Problems:
Use the following information for the next 11 questions:
f(x) = 0.12 for x ≤ 5, and f(x) = 0.06595 e^(-x/10) for x > 5.
40.1 (1 point) What is the Distribution Function at 10?
A. less than 0.60
B. at least 0.60 but less than 0.65
C. at least 0.65 but less than 0.70
D. at least 0.70 but less than 0.75
E. at least 0.75
40.2 (3 points) What is the mean?
A. less than 5
B. at least 5 but less than 6
C. at least 6 but less than 7
D. at least 7 but less than 8
E. at least 8
40.3 (4 points) What is the variance?
A. less than 70
B. at least 70 but less than 75
C. at least 75 but less than 80
D. at least 80 but less than 85
E. at least 85
40.4 (5 points) What is the skewness?
A. less than 2.6
B. at least 2.6 but less than 2.7
C. at least 2.7 but less than 2.8
D. at least 2.8 but less than 2.9
E. at least 2.9
40.5 (3 points) What is E[X 3]?
A. less than 2.2
B. at least 2.2 but less than 2.3
C. at least 2.3 but less than 2.4
D. at least 2.4 but less than 2.5
E. at least 2.5


40.6 (3 points) What is the loss elimination ratio at 5?


A. less than 40%
B. at least 40% but less than 45%
C. at least 45% but less than 50%
D. at least 50% but less than 55%
E. at least 55%
40.7 (3 points) What is E[(X-20)+]?
A. less than 0.75
B. at least 0.75 but less than 0.80
C. at least 0.80 but less than 0.85
D. at least 0.85 but less than 0.90
E. at least 0.90
40.8 (2 points) What is e(20)?
A.7
B. 8
C. 9

D. 10

E. 11

40.9 (1 point) What is the median?


A. 4.2
B. 4.4
C. 4.6

D. 4.8

E. 5.0

40.10 (2 points) What is the 90th percentile?


A. 18
B. 19
C. 20
D. 21

E. 22

40.11 (3 points) The size of loss for the Peregrin Insurance Company follows the given f(x).
The average annual frequency is 138.
Peregrin Insurance buys reinsurance from the Meriadoc Reinsurance Company for 5 excess of 15.
How much does Meriadoc expect to pay per year for losses from Peregrin Insurance?
A. 60
B. 70
C. 80
D. 90
E. 100


Use the following information for the next 3 questions:


One has a two component splice, which is proportional to an Exponential Distribution with mean 3
for loss sizes less than 5, and is proportional to a Pareto Distribution with α = 4 and θ = 60 for loss
sizes greater than 5. The splice is continuous at 5.
40.12 (2 points) What is the distribution function of the splice at 2?
A. less than 20%
B. at least 20% but less than 25%
C. at least 25% but less than 30%
D. at least 30% but less than 35%
E. at least 35%
40.13 (2 points) What is the survival function of the splice at 10?
A. less than 35%
B. at least 35% but less than 40%
C. at least 40% but less than 45%
D. at least 45% but less than 50%
E. at least 50%
40.14 (3 points) What is the mean of this splice?
A. less than 14.0
B. at least 14.0 but less than 14.5
C. at least 14.5 but less than 15.0
D. at least 15.0 but less than 15.5
E. at least 15.5

40.15 (4 points) In 2008, the size of monthly pension payments for a group of retired municipal
employees follows a Single Parameter Pareto Distribution, with α = 2 and θ = $1000.
The city announces that for 2009, there will be a 5% cost of living adjustment (COLA.)
However. the COLA will only apply to the first $2000 in monthly payments.
What is the probability density function of the size of monthly pension payments in 2009?


40.16 (3 points) You are given the following grouped data:
Range      # of claims      loss
0-1         6300            3000
1-2         2350            3500
2-3          850            2000
3-4          320            1000
4-5          110             500
over 5        70             500
Total     10,000          10,500
What is the mean of a 2-component splice between the empirical distribution below 4 and an
Exponential with θ = 1.5?
A. less than 1.045
B. at least 1.045 but less than 1.050
C. at least 1.050 but less than 1.055
D. at least 1.055 but less than 1.060
E. at least 1.060
Use the following information for the next 2 questions:
f(x) = 617,400 / {218 (10 + x)⁴}, 0 < x ≤ 4,
f(x) = 3920 / {25 (10 + x)³}, x > 4.
40.17 (2 points) Determine the probability that X is greater than 2.


A. 54%
B. 56%
C. 58%
D. 60%
E. 62%
40.18 (4 points) Determine E[X].
A. 4
B. 5
C. 6

D. 7

E. 8

40.19 (SOA M, 11/05, Q.35 & 2009 Sample Q.211) (2.5 points)
An actuary for a medical device manufacturer initially models the failure time for a particular device with
an exponential distribution with mean 4 years.
This distribution is replaced with a spliced model whose density function:
(i) is uniform over [0, 3]
(ii) is proportional to the initial modeled density function after 3 years
(iii) is continuous
Calculate the probability of failure in the first 3 years under the revised distribution.
(A) 0.43
(B) 0.45
(C) 0.47
(D) 0.49
(E) 0.51


40.20 (CAS3, 11/06, Q.18) (2.5 points) A loss distribution is a two-component spliced model
using a Weibull distribution with θ1 = 1,500 and τ = 1 for losses up to $4,000, and a Pareto
distribution with θ2 = 12,000 and α = 2 for losses $4,000 and greater.
The probability that losses are less than $4,000 is 0.60.
Calculate the probability that losses are less than $25,000.
A. Less than 0.900
B. At least 0.900, but less than 0.925
C. At least 0.925, but less than 0.950
D. At least 0.950, but less than 0.975
E. At least 0.975


Solutions to Problems:
40.1. E. F(10) = ∫_0^5 0.12 dx + ∫_5^10 0.06595 e^(-x/10) dx = (5)(0.12) + 0.6595(e^(-5/10) - e^(-10/10)) = 0.757.
Alternately, S(10) = ∫_10^∞ 0.06595 e^(-x/10) dx = 0.6595 e^(-10/10) = 0.243. F(10) = 1 - 0.243 = 0.757.
40.2. D. mean = ∫_0^5 0.12 x dx + ∫_5^∞ 0.06595 e^(-x/10) x dx
= 0.06x² ]_0^5 - 0.06595{10x e^(-x/10) + 100 e^(-x/10)} ]_5^∞ = 1.5 + 6.0 = 7.5.

40.3. C. 2nd moment = ∫_0^5 0.12 x² dx + ∫_5^∞ 0.06595 e^(-x/10) x² dx
= 0.04x³ ]_0^5 - 0.06595{10x² e^(-x/10) + 200x e^(-x/10) + 2000 e^(-x/10)} ]_5^∞ = 5 + 130 = 135.
Variance = 135 - 7.5² = 78.75.


40.4. A. 3rd moment = ∫_0^5 0.12 x³ dx + ∫_5^∞ 0.06595 e^(-x/10) x³ dx
= 0.03x⁴ ]_0^5 - 0.06595{10x³ e^(-x/10) + 300x² e^(-x/10) + 6000x e^(-x/10) + 60000 e^(-x/10)} ]_5^∞
= 18.75 + 3950 = 3968.75.
Skewness = {3968.75 - (3)(7.5)(135) + (2)(7.5³)}/78.75^1.5 = 2.54.

40.5. D. E[X ∧ 3] = ∫_0^3 0.12 x dx + 3S(3) = 0.06x² ]_0^3 + (3){1 - (0.12)(3)} = 0.54 + 1.92 = 2.46.
40.6. C. E[X ∧ 5] = ∫_0^5 0.12 x dx + 5S(5) = 0.06x² ]_0^5 + (5){1 - (0.12)(5)} = 1.5 + 2 = 3.5.
E[X ∧ 5]/E[X] = 3.5/7.5 = 46.7%.
Alternately, since for x > 5, the density is proportional to an Exponential,
f(x) = 0.06595 e^(-x/10) for x > 5, so S(x) = 0.6595 e^(-x/10) for x > 5.
The layer from 5 to infinity is: ∫_5^∞ 0.6595 e^(-x/10) dx = 4.00.
The loss elimination ratio at 5 is: 1 - 4/7.5 = 46.7%.
Comment: 1 - (0.12)(5) = S(5) = 0.6595 e^(-5/10) = 0.400.
40.7. D. E[X ∧ 20] = ∫_0^5 0.12 x dx + ∫_5^20 0.06595 e^(-x/10) x dx + 20S(20)
= 0.06x² ]_0^5 - 0.06595{10x e^(-x/10) + 100 e^(-x/10)} ]_5^20 + (20)(0.06595)(10e^(-2))
= 1.5 + 3.323 + 1.785 = 6.608. E[(X-20)+] = E[X] - E[X ∧ 20] = 7.5 - 6.608 = 0.892.
Alternately, E[(X-20)+] = ∫_20^∞ 0.06595 e^(-x/10) (x - 20) dx = -0.06595{10(x - 20) e^(-x/10) + 100 e^(-x/10)} ]_20^∞ = 0.8925.
Alternately, E[(X-20)+] = ∫_20^∞ S(x) dx = ∫_20^∞ 0.6595 e^(-x/10) dx = -6.595 e^(-x/10) ]_20^∞ = 0.8925.
40.8. D. Beyond 5 the density is proportional to an Exponential density with mean 10, and
therefore, beyond 5, the mean residual life is a constant 10.
Alternately, S(20) = (0.06595)(10e^(-2)) = 0.08925. e(20) = E[(X-20)+]/S(20) = 0.8925/0.08925 = 10.0.


40.9. A. f(x) = 0.12 for x ≤ 5. F(5) = 0.60. median = 0.5/0.12 = 4.167.
40.10. B. f(x) = 0.12 for x ≤ 5. F(5) = 0.6. f(x) = 0.06595 e^(-x/10) for x > 5.
F(x) = 0.6 + ∫_5^x 0.06595 e^(-t/10) dt = 0.6 + 0.6595 e^(-5/10) - 0.6595 e^(-x/10), x > 5.
Require that: 0.9 = 0.6 + 0.6595 e^(-5/10) - 0.6595 e^(-x/10). e^(-x/10) = 0.15164. x = 18.86.


40.11. C. E[X ∧ 20] - E[X ∧ 15] = ∫_15^20 S(x) dx = ∫_15^20 0.6595 e^(-x/10) dx = -6.595 e^(-x/10) ]_15^20 = 0.579.
Meriadoc reinsures the layer from 15 to 20, so it expects to pay: (138)(0.579) = 79.9.
40.12. C. Let the splice be: a(Exponential), x < 5, and b(Pareto), x > 5.
The splice must integrate to unity from 0 to ∞:
1 = a(Exponential Distribution at 5) + b(1 - Pareto Distribution at 5)
1 = a(1 - e^(-5/3)) + b(60/65)⁴. 1 = 0.8111a + 0.7260b.
The density of the Exponential is: e^(-x/3)/3. f(5) = e^(-5/3)/3 = 0.06296.
The density of the Pareto is: (4)(60⁴)/(x + 60)⁵. f(5) = (4)(60⁴)/(65⁵) = 0.04468.
Also in order for the splice to be continuous at 5:
a(Exponential density at 5) = b(Pareto density at 5). a(0.06296) = b(0.04468).
b = 1.4091a. 1 = 0.8111a + 0.7260(1.4091a). a = 0.545.
The distribution function at 2 is: 0.545(Exponential Distribution at 2) = 0.545(1 - e^(-2/3)) = 0.265.

2013-4-2,

Loss Distributions, 40 Splices

HCM 10/8/12,

Page 844

40.13. C. Continuing the previous solution, b = 1.4091a = 0.768.
The survival function at 10 is: 0.768(Pareto survival function at 10) = 0.768(60/70)⁴ = 0.415.
Alternately, the distribution function at 10 is:
0.545 ∫_0^5 (e^(-x/3)/3) dx + 0.768 ∫_5^10 {(4)(60⁴)/(x + 60)⁵} dx
= 0.545(Exponential distribution function at 5) +
0.768(Pareto distribution function at 10 - Pareto distribution function at 5)
= 0.545(1 - e^(-5/3)) + 0.768{(1 - (60/70)⁴) - (1 - (60/65)⁴)}
= (0.545)(0.811) + (0.768)(0.186) = 0.585.
Therefore, the survival function at 10 is: 1 - 0.585 = 0.415.
40.14. E. mean = 0.545 ∫_0^5 x (e^(-x/3)/3) dx + 0.768 ∫_5^∞ x {(4)(60⁴)/(x + 60)⁵} dx.
The first integral is for an Exponential Distribution: E[X ∧ 5] - 5S(5) = 3(1 - e^(-5/3)) - 5e^(-5/3) = 1.49.
The second integral is for a Pareto Distribution: E[X] - ∫_0^5 x f_Pareto(x) dx
= E[X] - {E[X ∧ 5] - 5S(5)} = E[X] - E[X ∧ 5] + 5S(5) = (60/3)(60/65)³ + (5)(60/65)⁴ = 19.36.
Thus the mean of the splice is: (0.545)(1.49) + (0.768)(19.36) = 15.68.
Alternately, the second integral is for a Pareto:
∫_5^∞ x f_Pareto(x) dx = ∫_5^∞ (x - 5) f_Pareto(x) dx + ∫_5^∞ 5 f_Pareto(x) dx = S_Pareto(5) e_Pareto(5) + 5 S_Pareto(5)
= (60/65)⁴ {(5 + 60)/(4 - 1)} + (5)(60/65)⁴ = 19.36. Proceed as before.
Comment: e(x) = ∫_x^∞ (t - x) f(t) dt / S(x). For a Pareto Distribution, e(x) = (x + θ)/(α - 1).


40.15. In 2008, S(2000) = (1000/2000)² = 1/4.
For a Single Parameter Pareto Distribution, f(x) = αθ^α / x^(α+1), x > θ.
For those whose payments are less than $2000 per month, the payment is multiplied by 1.05.
Thus in 2009 they follow a Single Parameter Pareto with α = 2 and θ = $1050.
The density is proportional to: f(x) = 2(1050²)/x³, x > 1050.
For those whose payments are $2000 or more per month, their payment is increased by
(2000)(5%) = 100. S(x) = {1000/(x - 100)}², x > 2100.
The density is proportional to: f(x) = 2(1000²)/(x - 100)³.
In 2009, the density is a splice, with 3/4 weight to the first component and 1/4 weight to the second
component. Someone with $2000 in 2008 will get $2100 in 2009; $2100 is the breakpoint of the splice.
f(x) = 2(1050²)/x³, x > 1050, would integrate from 1050 to 2100 to the distribution function at 2100 of
a Single Parameter Pareto with α = 2 and θ = $1050: 1 - (1050/2100)² = 3/4.
This is the desired weight for the first component, so this is OK.
f(x) = 2(1000²)/(x - 100)³ would integrate from 2100 to ∞ to: (1000²)/(2100 - 100)² = 1/4.
This is the desired weight for the second component, so this is OK.
The probability density function of the size of monthly pension payments in 2009 is a splice:
f(x) = 2(1050²)/x³, 1050 < x < 2100, and f(x) = 2(1000²)/(x - 100)³, x > 2100.
Comment: Coming from the left, f(2100) = 2(1050²)/2100³ = 1/4200.
Coming from the right, f(2100) = 2(1000²)/(2100 - 100)³ = 1/4000. Thus the density of this splice
is not (quite) continuous at the breakpoint of 2100.
[Graph of this spliced density, from 1050 out past 2100.]


40.16. B. The empirical survival function at 4 is: (110 + 70)/10000 = 0.018.
Above 4 the splice is proportional to an Exponential with survival function e^(-x/1.5).
Let w be the weight applied to this Exponential. Matching S(4), set 0.018 = w e^(-4/1.5). w = 0.259.
The contribution to the mean from the losses of size less than 4 is:
(3000 + 3500 + 2000 + 1000)/10,000 = 0.9500.
The contribution to the mean from the losses of size greater than 4 is:
∫_4^∞ 0.259 x e^(-x/1.5)/1.5 dx = -0.259(x e^(-x/1.5) + 1.5 e^(-x/1.5)) ]_4^∞ = 0.0990.
mean = 0.9500 + 0.0990 = 1.0490.


40.17. D. F(3) = ∫_0^3 617,400/{218(10 + x)⁴} dx = (617,400/218) {1/((3)(10³)) - 1/((3)(12³))} = 0.3977.
S(3) = 1 - 0.3977 = 0.6023.
Comment: Via integration, one can determine that the first component of the splice has a total
probability of 3/5, while the second component of the splice has a total probability of 2/5.


40.18. E. One can use integration by parts.
∫_0^4 x/(10 + x)⁴ dx = -x/{3(10 + x)³} ]_{x=0}^{x=4} + ∫_0^4 1/{3(10 + x)³} dx
= -4/{(3)(14³)} + 1/{(2)(3)(10²)} - 1/{(2)(3)(14²)} = 0.00033042.
Thus ∫_0^4 x (617,400)/{218(10 + x)⁴} dx = (617,400/218)(0.00033042) = 0.9358.
∫_4^∞ x/(10 + x)³ dx = -x/{2(10 + x)²} ]_{x=4}^{x=∞} + ∫_4^∞ 1/{2(10 + x)²} dx = 4/{(2)(14²)} + 1/{(2)(14)} = 9/196.
Therefore, ∫_4^∞ x (3920)/{25(10 + x)³} dx = (3920/25)(9/196) = 7.2000.
Thus, E[X] = ∫_0^4 x (617,400)/{218(10 + x)⁴} dx + ∫_4^∞ x (3920)/{25(10 + x)³} dx = 0.9358 + 7.2000 = 8.1358.
Alternately, each component of the splice is proportional to the density of a Pareto Distribution.
The density of a Pareto with α = 3 and θ = 10 is: (3)(10³)/(10 + x)⁴.
Thus the first component of the splice is: (617,400/218)(1/3000) Pareto[3, 10] = (2058/2180) Pareto[3, 10].
Now ∫_0^4 x f_Pareto(x) dx = E[X ∧ 4] - 4 S(4) = (10/2){1 - (10/14)²} - (4)(10/14)³ = 0.99125.
Therefore, ∫_0^4 x (617,400)/{218(10 + x)⁴} dx = (2058/2180)(0.99125) = 0.9358.
The density of a Pareto with α = 2 and θ = 10 is: (2)(10²)/(10 + x)³.
Thus the second component of the splice is: (3920/25)(1/200) Pareto[2, 10] = 0.784 Pareto[2, 10].
Now ∫_4^∞ x f_Pareto(x) dx = ∫_4^∞ (x - 4) f_Pareto(x) dx + ∫_4^∞ 4 f_Pareto(x) dx = e(4) S(4) + 4 S(4)
= (14/1)(10/14)² + (4)(10/14)² = 9.1837.
Therefore, ∫_4^∞ x (3920)/{25(10 + x)³} dx = (0.784)(9.1837) = 7.2000.
Thus, E[X] = 0.9358 + 7.2000 = 8.1358.
Comment: e(x) = ∫_x^∞ (t - x) f(t) dt / S(x). For a Pareto Distribution, e(x) = (x + θ)/(α - 1).

40.19. A. A uniform on [0, 3] has density of 1/3.
On the interval 3 to ∞, we want something proportional to an Exponential with θ = 4.
From 3 to ∞ this Exponential density would integrate to S(3) = e^(-3/4).
Therefore, something proportional that would integrate to one is: 0.25 e^(-x/4)/e^(-3/4) = 0.25 e^(-(x-3)/4).
Thus the density of the splice is: w(1/3) from 0 to 3, and (1 - w)(0.25 e^(-(x-3)/4)) from 3 to ∞.
In order to be continuous, the two densities must match at 3:
w(1/3) = (1 - w)(0.25) e^(-(3-3)/4). 4w = 3(1 - w). w = 3/7 = 0.429.
Probability of failure in the first 3 years is the integral of the splice from 0 to 3: w = 0.429.


40.20. C. For the Pareto, S(4000) = {12/(12 + 4)}² = 9/16.
The portion of the splice above $4000 totals 1 - 0.6 = 40% probability.
Therefore, the portion of the splice above 4000 is: (0.4) Pareto[2, 12,000] / (9/16).
For the Pareto, S(25000) = {12/(12 + 25)}² = 0.1052.
Therefore, for the splice the probability that losses are greater than $25,000 is:
(0.4)(0.1052)/(9/16) = 0.0748. 1 - 0.0748 = 0.9252.
Comment: A Weibull with τ = 1 is an Exponential.
It is easier to calculate S(25,000), and then F(25,000) = 1 - S(25,000).
When working above the breakpoint of $4000, work with the Pareto.
If working below the breakpoint of $4000, work with the Weibull.
It might have been better if the exam question had read instead:
"A loss distribution is a two-component spliced model using a density proportional to a Weibull
distribution with θ1 = 1,500 and τ = 1 for losses up to $4,000, and a density proportional to a Pareto
distribution with θ2 = 12,000 and α = 2 for losses $4,000 and greater."
The density above 4000 is proportional to a Pareto.
The original Pareto integrates to 9/16 from 4000 to infinity.
In order to get the density from 4000 to infinity to integrate to the desired 40%, we need to multiply
the density of the original Pareto by 40%/(9/16).
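A two-line numerical sketch of this rescaled-tail calculation (an illustration, not part of the original solution):

```python
# The Pareto piece above 4000 is rescaled so that it carries 40% of the probability.
theta, alpha = 12_000.0, 2.0
pareto_S = lambda x: (theta / (theta + x)) ** alpha

S_25000 = 0.4 * pareto_S(25_000) / pareto_S(4_000)
print(round(1 - S_25000, 4))   # about 0.9252
```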


Section 41, Relationship to Life Contingencies


Many of the ideas discussed with respect to Loss Distributions apply to Life Contingencies and
vice-versa. For example, as discussed previously, the mean residual life (complete expectation of
life) and the mean excess loss are mathematically equivalent. Similarly, as discussed previously, the
hazard rate and force of mortality are two names for the same thing.
One can relate the notation used in Life Contingencies to that used in Loss Distributions.
ps and qs:340
The probability of survival past time 70 + 10 = 80, given survival past time 70, is 10p 70.
The probability of failing at or before time 70 + 10 = 80, given survival past time 70, is 10q70.
10p70 + 10q70 = 1.

In general, y-xpx ≡ Prob[Survival past y | Survival past x] = S(y)/S(x).
y-xqx ≡ Prob[Not Surviving past y | Survival past x] = {S(x) - S(y)}/S(x) = 1 - y-xpx.
Also px ≡ 1px = Prob[Survival past x+1 | Survival past x] = S(x+1)/S(x).
qx ≡ 1qx = Prob[Death within one year | Survival past x] = 1 - S(x+1)/S(x).
Exercise: Estimate 100p 50 and 300q100, given the following 10 values:
22, 35, 52, 69, 86, 90, 111, 254, 362, 746.
[Solution: S(50) = 8/10. S(150) = 3/10. 100p 50 = S(150)/S(50) = (3/10)/(8/10) = 3/8.
S(100) = 4/10. S(400) = 1/10. 300q100 = 1 - S(400)/S(100) = 3/4.]
t|uqx ≡ Prob[x+t < time of death ≤ x+t+u | Survival past x] = {S(x+t) - S(x+t+u)}/S(x).
Note that t is the time delay, while u is the length of the interval whose probability we measure.
Exercise: In the previous exercise, estimate 100|200q70.
[Solution: 100|200q70 = {S(170) - S(370)}/S(70) = {(3/10) - (1/10)}/(6/10) = 1/3.]
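These empirical estimates are simple to reproduce in code; a minimal sketch using the ten sample values from the text:

```python
import numpy as np

data = np.array([22, 35, 52, 69, 86, 90, 111, 254, 362, 746])
S = lambda x: np.mean(data > x)          # empirical survival function

print(S(150) / S(50))                    # 100p50 = 3/8
print(1 - S(400) / S(100))               # 300q100 = 3/4
print((S(170) - S(370)) / S(70))         # 100|200q70 = 1/3
```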
340 See Section 3.2.2 of Actuarial Mathematics.


Variance of ps and qs:341


With the 10 values: 22, 35, 52, 69, 86, 90, 111, 254, 362, 746, the estimate of
100p50 = S(150)/S(50) = (3/10)/(8/10) = 3/8 = (number > 150)/(number > 50).
Conditional on having 8 values greater than 50, the number of values greater than 150 is Binomial
with m = 8 and q = 100p50, and variance: 8 100p50 (1 - 100p50) = 8 100p50 100q50.
However, given 8 values greater than 50, 100p50 = (number > 150)/8.
Thus, Var[100p50 | S(50) = 8/10] = 8 100p50 100q50 / 8² = 100p50 100q50 / 8 = (3/8)(5/8)/8 = (3)(5)/8³.
Let nx ≡ number of values greater than x.
Then by the above reasoning, y-xpx = ny/nx, and Var[y-xpx | nx] = ny(nx - ny)/nx³.
Since y-xqx = 1 - y-xpx, Var[y-xqx | nx] = Var[y-xpx | nx] = ny(nx - ny)/nx³.
Exercise: Estimate Var[30q70 | 6 values greater than 70], given the following 10 values:
22, 35, 52, 69, 86, 90, 111, 254, 362, 746.
[Solution: 30p 70 = S(100)/S(70) = (4/10)/(6/10) = 2/3. 30q70 = 1/3.
Var[30q70 | 6 values greater than 70] = 30p 70 30q70 /6 = (2/3)(1/3)/6 = 1/27.
Alternately, Var[30q70 | n70 = 6] = n100(n70 - n100)/n703 = (4)(6 - 4)/63 = 1/27.]
Central Death Rate:342
The central death rate,
mx = (Probability of dying from age x to age x + 1) / (expected years lived from x to x + 1)
= {S(x) - S(x+1)} / ∫_x^(x+1) S(t) dt
= (Probability of loss of size x to x + 1) / (layer of loss from x to x + 1).
nmx = (Probability of dying from age x to age x + n) / (expected years lived from x to x + n)
= {S(x) - S(x+n)} / ∫_x^(x+n) S(t) dt
= (Probability of loss of size x to x + n) / (layer of loss from x to x + n).


341 See Example 14.5 in Loss Models.
342 See page 70 of Actuarial Mathematics.

Problems:
Use the following information for the next 5 questions:
Mortality follows a Weibull Distribution with θ = 70 and τ = 4.
41.1 (1 point) Determine q60.
A. 0.028

B. 0.030

C. 0.032

D. 0.034

E. 0.036

D. 0.96

E. 0.98

D. 0.48

E. 0.50

D. 0.34

E. 0.36

D. 0.24

E. 0.26

41.2 (1 point) Determine p80.


A. 0.90

B. 0.92

C. 0.94

41.3 (1 point) Determine 10q65.


A. 0.42

B. 0.44

C. 0.46

41.4 (1 point) Determine 13p 74.


A. 0.28

B. 0.30

C. 0.32

41.5 (2 points) Determine 10|5q62.


A. 0.18

B. 0.20

C. 0.22

41.6 (165, 5/87, Q.9) (2.1 points) Mortality follows a Weibull Distribution with parameters θ and τ.
q0 = 0.09516. q1 = 0.25918. Determine q2.
A. 0.37

B. 0.39

C. 0.41

D. 0.43

E. 0.45

41.7 (CAS3, 11/07, Q.30) (2.5 points) Survival follows a Weibull Distribution.
Given the following:
μ(x) = kx², k > 0, x ≥ 0 defines the hazard rate function.
3q2 = 0.68963.
Calculate 2|q2.


Solutions to Problems:
41.1. E. q60 = 1 - p60 = 1 - S(61)/S(60) = 1 - exp[-(61/70)⁴]/exp[-(60/70)⁴] = 1 - e^(-0.03689) = 0.0362.
41.2. B. p80 = S(81)/S(80) = exp[-(81/70)⁴]/exp[-(80/70)⁴] = e^(-0.08691) = 0.917.
41.3. B. 10q65 = 1 - 10p65 = 1 - S(75)/S(65) = 1 - exp[-(75/70)⁴]/exp[-(65/70)⁴] = 1 - e^(-0.5743) = 0.437.
41.4. C. 13p74 = S(87)/S(74) = exp[-(87/70)⁴]/exp[-(74/70)⁴] = e^(-1.1372) = 0.321.
41.5. A. 10|5q62 = {S(72) - S(77)}/S(62) = {exp[-(72/70)⁴] - exp[-(77/70)⁴]}/exp[-(62/70)⁴] =
(0.32652 - 0.23129)/0.54041 = 0.176.
41.6. B. S(x) = exp[-(x/θ)^τ].
q0 = {S(0) - S(1)}/S(0) = 1 - exp[-1/θ^τ]. exp[-1/θ^τ] = 1 - 0.09516 = 0.90484.
q1 = {S(1) - S(2)}/S(1) = 1 - exp[-(2/θ)^τ]/exp[-1/θ^τ].
exp[-(2/θ)^τ]/exp[-1/θ^τ] = 1 - 0.25918 = 0.74082. exp[-(2/θ)^τ] = (0.90484)(0.74082) = 0.67032.
Therefore, 1/θ^τ = -ln(0.90484) = 0.100, and (2/θ)^τ = -ln(0.67032) = 0.400.
Dividing the two equations: 2^τ = 4. τ = 2. θ = √10.
S(x) = exp[-x²/10]. q2 = 1 - S(3)/S(2) = 1 - e^(-0.9)/e^(-0.4) = 1 - e^(-0.5) = 0.3935.
41.7. D. H(x) = ∫ h(x) dx = kx³/3. S(x) = exp[-H(x)] = exp[-kx³/3].
0.68963 = 3q2 = 1 - S(5)/S(2) = 1 - exp[-125k/3]/exp[-8k/3].
0.31037 = exp[-117k/3]. k = 0.03. S(x) = exp[-x³/100].
2|q2 = {S(4) - S(5)}/S(2) = (e^(-0.64) - e^(-1.25))/e^(-0.08) = 0.261.


Section 42, Gini Coefficient343


The Gini Coefficient or coefficient of concentration is a concept that comes up for example in
economics, when looking at the distribution of incomes. This section will discuss the Gini coefficient
and relate it to the Relative Mean Difference.
The Gini coefficient is a measure of inequality. For example if all of the individuals in a group have the
same income, then the Gini coefficient is zero. As incomes of the individuals in a group became
more and more unequal, the Gini coefficient would increase towards a value of 1. The Gini coefficient
has found application in many different fields of study.
Mean Difference:
Define the mean difference as the average absolute difference between two random draws from a
distribution.
Mean Difference = ∫∫ |x - y| f(x) f(y) dx dy,
where the double integral is taken over the support of f.

For example, for a uniform distribution from 0 to 10:
Mean Difference = ∫_0^10 ∫_0^10 |x - y| (1/10)(1/10) dx dy
= (1/100) ∫_0^10 ∫_{x=y}^{10} (x - y) dx dy + (1/100) ∫_0^10 ∫_{x=0}^{y} (y - x) dx dy
= (1/100) ∫_0^10 {50 - 10y + y²/2} dy + (1/100) ∫_0^10 {y² - y²/2} dy
= (1/100)(500 + 1000/6 - 500) + (1/100)(1000/6) = 10/3.
In a similar manner, in general for the continuous uniform distribution, the mean difference is:
(width)/3.344
343 Not on the syllabus of your exam.
344 For a sample of size two from a uniform, the expected value of the minimum is the bottom of the interval plus
(width)/3, while the expected value of the maximum is the top of the interval minus (width)/3. Thus the expected absolute
difference is (width)/3. This is discussed in order statistics, on the Syllabus of Exam CAS 3L.


Exercise: Compute the mean difference for an Exponential Distribution.
[Solution: Mean difference = ∫∫ |x - y| (e^(-x/θ)/θ)(e^(-y/θ)/θ) dx dy
= (1/θ²) ∫_0^∞ e^(-y/θ) ∫_{x=0}^{y} (y - x) e^(-x/θ) dx dy + (1/θ²) ∫_0^∞ e^(-y/θ) ∫_{x=y}^{∞} (x - y) e^(-x/θ) dx dy
= (1/θ) ∫_0^∞ e^(-y/θ) {y(1 - e^(-y/θ)) + θe^(-y/θ) + ye^(-y/θ) - θ} dy + (1/θ) ∫_0^∞ e^(-y/θ) {θe^(-y/θ) + ye^(-y/θ) - ye^(-y/θ)} dy
= ∫_0^∞ {y e^(-y/θ)/θ + 2e^(-2y/θ) - e^(-y/θ)} dy = θ + θ - θ = θ.
Alternately, by symmetry the contributions from when x > y and when y > x must be equal.
Thus, the mean difference is: (2)(1/θ²) ∫_0^∞ e^(-y/θ) ∫_{x=0}^{y} (y - x) e^(-x/θ) dx dy
= (2/θ) ∫_0^∞ e^(-y/θ) {y(1 - e^(-y/θ)) + θe^(-y/θ) + ye^(-y/θ) - θ} dy
= 2 ∫_0^∞ {y e^(-y/θ)/θ + e^(-2y/θ) - e^(-y/θ)} dy = (2)(θ + θ/2 - θ) = θ.
Comment: ∫ x e^(-x/θ) dx = -θx e^(-x/θ) - θ² e^(-x/θ).
For a sample of size two from an Exponential Distribution, the expected value of the minimum
is θ/2, while the expected value of the maximum is 3θ/2.
Therefore, the expected value of the difference is θ.]
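A quick simulation sketch (an illustration, not from the text) of the result that the mean difference of an Exponential equals its mean θ:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 10.0
x = rng.exponential(theta, 1_000_000)
y = rng.exponential(theta, 1_000_000)
print(np.mean(np.abs(x - y)))   # about 10 = theta
```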


Mean Relative Difference:
The mean relative difference of a distribution is defined as: (mean difference)/(mean).
For the uniform distribution, the mean relative difference is: {(width)/3}/{(width)/2} = 2/3.
For the Exponential Distribution, the mean relative difference is: θ/θ = 1.

Exercise: Derive the form of the Mean Relative Difference for a Pareto Distribution.
Hint: ∫ x/(x + θ)^a dx = -{x(a-1) + θ}/{(x + θ)^(a-1) (a-1)(a-2)}.
[Solution: For α > 1, E[X] = θ/(α-1). f(x) = αθ^α/(θ + x)^(α+1).
Mean difference = ∫∫ |x - y| {αθ^α/(θ + x)^(α+1)} {αθ^α/(θ + y)^(α+1)} dx dy.
By symmetry the contributions from when x > y and when y > x must be equal.
Therefore, mean difference = 2α²θ^(2α) ∫_{y=0}^{∞} ∫_{x=y}^{∞} (x - y)/{(θ + x)^(α+1) (θ + y)^(α+1)} dx dy.
Now using the hint: ∫_{x=y}^{∞} x/(θ + x)^(α+1) dx = {yα + θ}/{α(α-1)(θ + y)^α}.
∫_{x=y}^{∞} 1/(θ + x)^(α+1) dx = 1/{α(θ + y)^α}.
Therefore, ∫_{x=y}^{∞} (x - y)/(θ + x)^(α+1) dx = {yα + θ}/{α(α-1)(θ + y)^α} - y/{α(θ + y)^α} = 1/{α(α-1)(θ + y)^(α-1)}.
Thus, mean difference = {2αθ^(2α)/(α-1)} ∫_0^∞ 1/(θ + y)^(2α) dy = {2αθ^(2α)/(α-1)} {1/((2α-1)θ^(2α-1))} = 2αθ/{(α-1)(2α-1)}.
E[X] = θ/(α-1). Thus, the mean relative difference is: 2α/(2α - 1), α > 1.]

Lorenz Curve:
Assume that the incomes in a country follow a distribution function F(x).345
Then F(x) is the percentage of people with incomes less than x.
The income earned by such people is: ∫_0^x t f(t) dt = E[X ∧ x] - x S(x) = ∫_0^x S(t) dt - x S(x).
The percentage of total income earned by such people is:
{∫_0^x y f(y) dy} / E[X] = {E[X ∧ x] - x S(x)} / E[X].
Define G(x) ≡ {∫_0^x y f(y) dy} / E[X] = {E[X ∧ x] - x S(x)} / E[X].346
For example, assume an Exponential Distribution.
Then F(x) = 1 - e^(-x/θ).
G(x) = {E[X ∧ x] - x S(x)}/E[X] = {θ(1 - e^(-x/θ)) - x e^(-x/θ)}/θ = 1 - e^(-x/θ) - (x/θ) e^(-x/θ).
Let t = F(x) = 1 - e^(-x/θ). Therefore, x/θ = -ln(1 - t).347
Then, G(t) = t - {-ln(1-t)}(1-t) = t + (1-t) ln(1-t).
345 Of course, the mathematics applies regardless of what is being modeled.
The distribution of incomes is just the most common context.
346 This is not standard notation. I have just used G to have some notation.
347 This is just the VaR formula for the Exponential Distribution.

2013-4-2,

Loss Distributions, 42 Gini Coefficient

HCM 10/8/12,

Page 858

Then we can graph G as a function of F:
[Graph of G(x) versus F(x), both running from 0 to 1.]
This curve is referred to as the Lorenz curve or the concentration curve.
Since F(0) = 0 = G(0) and F(∞) = 1 = G(∞), the Lorenz curve passes through the points (0, 0) and
(1, 1). Usually one would also include in the graph the 45° reference line
connecting (0, 0) and (1, 1), as shown below:
[Graph of the Lorenz curve (percent of income versus percent of people) together with the 45° reference line.]


G(t) = G[F(x)] = {∫_0^x y f(y) dy} / E[X].
dG/dt = (dG/dx)/(dF/dx) = {x f(x)/E[X]} / f(x) = x/E[X] > 0.
d²G/dt² = {1/E[X]} (dx/dF) = 1/{E[X] f(x)} > 0.
Thus, in the above graph, as well as in general, the Lorenz curve is increasing and concave up.
The Lorenz curve is below the 45° reference line, except at the endpoints when they are equal.
The vertical distance between the Lorenz curve and the 45° comparison line is: F - G.
Thus, this vertical distance is a maximum when: 0 = dF/dF - dG/dF.
dG/dF = 1. x/E[X] = 1. x = E[X].
Thus the vertical distance between the Lorenz curve and the 45° comparison line is a maximum at
the mean income.
Exercise: If incomes follow an Exponential Distribution, what is this maximum vertical distance
between the Lorenz curve and the 45° comparison line?
[Solution: The maximum occurs when x = θ.
F(x) = 1 - e^(-x/θ). From previously, G(x) = 1 - e^(-x/θ) - (x/θ) e^(-x/θ).
F - G = (x/θ) e^(-x/θ). At x = θ, this is: e^(-1) = 0.3679.]


Exercise: Determine the form of the Lorenz Curve, if the distribution of incomes follows a Pareto
Distribution, with α > 1.

[Solution: F(x) = 1 - {θ/(θ + x)}^α.  E[X] = θ/(α - 1).  E[X ∧ x] = {θ/(α - 1)} [1 - {θ/(θ + x)}^(α-1)].

G(x) = {E[X ∧ x] - x S(x)} / E[X] = [{θ/(α - 1)} {1 - (θ/(θ + x))^(α-1)} - x S(x)] / {θ/(α - 1)}
= 1 - {θ/(θ + x)}^(α-1) - (α - 1)(x/θ) S(x).

Let t = F(x) = 1 - {θ/(θ + x)}^α.  ⇒ {θ/(θ + x)}^α = S(x) = 1 - t.  Also, x/θ = (1 - t)^(-1/α) - 1.348

Therefore, G(t) = 1 - (1 - t)^((α-1)/α) - (α - 1){(1 - t)^(-1/α) - 1}(1 - t) = t + α - αt - α(1 - t)^(1 - 1/α), 0 ≤ t ≤ 1.
Comment: G(0) = α - α = 0.  G(1) = 1 + α - α - 0 = 1.]
Here is a graph comparing the Lorenz curves for Paretos with α = 2 and α = 5:

[Figure: Lorenz curves, % of income versus % of people, for Pareto Distributions with α = 5 and α = 2; the α = 2 curve lies below the α = 5 curve.]

348 This is just the VaR formula for the Pareto Distribution.


The Pareto with α = 2 has a heavier righthand tail than the Pareto with α = 5. If incomes follow a
Pareto with α = 2, then there are more extremely high incomes compared to the mean, than if
incomes follow a Pareto with α = 5. In other words, if α = 2, then income is more concentrated in the
high income individuals than if α = 5.349
The Lorenz curve for α = 2 is below that for α = 5. In general, the lower curve corresponds to a
higher concentration of income. In other words, a higher concentration of income corresponds to a
smaller area under the Lorenz curve. Equivalently, a higher concentration of income corresponds to a
larger area between the Lorenz curve and the 45° reference line.
Gini Coefficient:
This correspondence between areas on the graph of the Lorenz curve and the concentration of income is
the idea behind the Gini Coefficient.
Let us label the areas in the graph of a Lorenz Curve, in this case for an Exponential Distribution:

[Figure: Lorenz curve for an Exponential Distribution, % of income versus % of people, with area A between the 45° reference line and the Lorenz curve, and area B between the Lorenz curve and the horizontal axis.]

Gini Coefficient = Area A / (Area A + Area B).

349 An Exponential Distribution has a lighter righthand tail than either Pareto. Thus if income followed an Exponential,
it would be less concentrated than if it followed any Pareto.


However, Area A + Area B add up to a triangle with area 1/2.

Therefore, Gini Coefficient = Area A / (Area A + Area B) = 2A = 1 - 2B.

For the Exponential Distribution, the Lorenz curve was: G(t) = t + (1 - t) ln(1 - t).

Thus, Area B = area under Lorenz curve = ∫₀¹ {t + (1 - t) ln(1 - t)} dt = 1/2 + ∫₀¹ s ln(s) ds.

Applying integration by parts,

∫₀¹ s ln(s) ds = (s²/2) ln(s) ]_{s=0}^{s=1} - ∫₀¹ (s²/2)(1/s) ds = 0 - 1/4 = -1/4.

Thus Area B = 1/2 - 1/4 = 1/4.
Therefore, for the Exponential Distribution, the Gini Coefficient is: 1 - (2)(1/4) = 1/2.
Recall that for the Exponential Distribution, the mean relative difference was 1.
As will be shown subsequently, in general, Gini Coefficient = (mean relative difference)/2.
Therefore, for the Uniform Distribution, the Gini Coefficient is: (1/2)(2/3) = 1/3.
Similarly, for the Pareto Distribution, the Gini Coefficient is: (1/2){2α/(2α - 1)} = α/(2α - 1), α > 1.
We note that the Uniform with the lightest righthand tail of the three has the smallest Gini coefficient,
while the Pareto with the heaviest righthand tail of the three has the largest Gini coefficient.
Among Pareto Distributions, the smaller alpha, the heavier the righthand tail, and the larger the
Gini Coefficient.350
The more concentrated the income is among the higher earners, the larger the Gini coefficient.

350 As alpha approaches one, the Gini coefficient approaches one.
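
The Gini coefficients just quoted can be confirmed numerically from the Lorenz curves, using Gini = 1 - 2(area under the Lorenz curve). The sketch below is illustrative only; the Pareto α value is an arbitrary choice.

```python
import numpy as np
from scipy.integrate import quad

def gini_from_lorenz(G):
    """Gini coefficient = 1 - 2 * (area under the Lorenz curve)."""
    area, _ = quad(G, 0.0, 1.0)
    return 1.0 - 2.0 * area

# Exponential: G(t) = t + (1 - t) ln(1 - t); expect 1/2.
print(gini_from_lorenz(lambda t: t + (1 - t) * np.log1p(-t)))

# Two-parameter Pareto, alpha = 2: G(t) = t + alpha{(1 - t) - (1 - t)^(1 - 1/alpha)};
# expect alpha/(2 alpha - 1) = 2/3.
a = 2.0
print(gini_from_lorenz(lambda t: t + a * ((1 - t) - (1 - t) ** (1 - 1 / a))))
```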


LogNormal Distribution:
For the LogNormal Distribution: E[X] = exp[μ + σ²/2].

E[X ∧ x] = exp(μ + σ²/2) Φ[{ln(x) - μ - σ²}/σ] + x {1 - Φ[{ln(x) - μ}/σ]} = E[X] Φ[{ln(x) - μ - σ²}/σ] + x S(x).

Therefore, G(x) = {E[X ∧ x] - x S(x)} / E[X] = Φ[{ln(x) - μ - σ²}/σ] = Φ[{ln(x) - μ}/σ - σ].

Let t = F(x) = Φ[{ln(x) - μ}/σ].

Then the Lorenz Curve is: G(t) = Φ[Φ⁻¹[t] - σ].

For example, here is a graph of the Lorenz curves for LogNormal Distributions with σ = 1 and σ = 2:

[Figure: Lorenz curves, % of income versus % of people, for LogNormal Distributions with σ = 1 and σ = 2; the σ = 2 curve lies below the σ = 1 curve.]


As derived subsequently, for a LogNormal Distribution, the Gini Coefficient is: 2Φ[σ/√2] - 1.
Here is a graph of the Gini Coefficient as a function of sigma:

[Figure: Gini coefficient of the LogNormal, 2Φ(σ/√2) - 1, as a function of σ, increasing from 0 toward 1.]

As sigma increases, the LogNormal has a heavier tail, and the Gini Coefficient increases towards 1.
The mean relative difference is twice the Gini Coefficient: 4Φ[σ/√2] - 2.
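
A minimal sketch (not part of the original text) for evaluating the LogNormal Gini coefficient; the σ values are arbitrary illustrative choices.

```python
from scipy.stats import norm

def lognormal_gini(sigma):
    """Gini coefficient of a LogNormal: 2 * Phi(sigma / sqrt(2)) - 1 (mu does not matter)."""
    return 2.0 * norm.cdf(sigma / 2.0 ** 0.5) - 1.0

for sigma in (0.5, 1.0, 2.0):
    print(sigma, round(lognormal_gini(sigma), 4))
# 0.5 -> 0.2763, 1.0 -> 0.5205, 2.0 -> 0.8427
```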

Derivation of the Gini Coefficient for the LogNormal Distribution:

In order to compute the Gini Coefficient, we need to compute area B.

[Figure: Lorenz curve, % of income versus % of people, with area A between the 45° reference line and the Lorenz curve, and area B below the Lorenz curve.]

B = ∫₀¹ G(t) dt = ∫₀¹ Φ[Φ⁻¹[t] - σ] dt.

Let y = Φ⁻¹[t]. Then t = Φ[y]. dt = φ[y] dy.

B = ∫_{-∞}^{∞} Φ[y - σ] φ[y] dy.

Now B is some function of σ.

B(σ) = ∫_{-∞}^{∞} Φ[y - σ] φ[y] dy.

B(0) = ∫_{-∞}^{∞} Φ[y] φ[y] dy = Φ[y]²/2 ]_{y=-∞}^{y=∞} = 1/2.


B(σ) = ∫_{-∞}^{∞} Φ[y - σ] φ[y] dy. Taking the derivative of B with respect to sigma:

B′(σ) = -∫_{-∞}^{∞} φ[y - σ] φ[y] dy = -(1/2π) ∫_{-∞}^{∞} exp[-(y - σ)²/2] exp[-y²/2] dy

= -(1/2π) exp[-σ²/2] ∫_{-∞}^{∞} exp[-(2y² - 2yσ)/2] dy

= -(1/2π) exp[-σ²/4] ∫_{-∞}^{∞} exp[-{(√2 y)² - 2(√2 y)(σ/√2) + (σ/√2)²}/2] dy

= -(1/2π) exp[-σ²/4] ∫_{-∞}^{∞} exp[-{√2 y - σ/√2}²/2] dy.

Let x = √2 y - σ/√2.  dy = dx/√2.

B′(σ) = -{exp[-σ²/4] / (√2 √(2π))} ∫_{-∞}^{∞} exp[-x²/2]/√(2π) dx = -{1/(2√π)} exp[-σ²/4].351

Now assume that B(σ) = c - Φ[σ/√2], for some constant c.

Then B′(σ) = -φ[σ/√2]/√2 = -{1/√(2π)} exp[-(σ/√2)²/2] / √2 = -{1/(2√π)} exp[-σ²/4], matching above.

Therefore, we have shown that B(σ) = c - Φ[σ/√2].

However, B(0) = 1/2. ⇒ 1/2 = c - 1/2. ⇒ c = 1. ⇒ B(σ) = 1 - Φ[σ/√2]. 352
Thus the Gini Coefficient is: 1 - 2B = 2Φ[σ/√2] - 1.

351 Where I have used the fact that the density of the Standard Normal integrates to one over its support from -∞ to ∞.
352 In general, ∫_{-∞}^{∞} Φ[a + by] φ[y] dy = Φ[a / √(1 + b²)].
For a list of similar integrals, see http://en.wikipedia.org/wiki/List_of_integrals_of_Gaussian_functions


Proof of the Relationship Between the Gini Index and the Mean Relative Difference:353
I will prove that: Gini Coefficient = (mean relative difference) / 2.
As a first step, let us look at a graph of the Lorenz Curve with areas labeled:

[Figure: Lorenz curve, % of income versus % of people, with area C above the 45° reference line, area A between the 45° line and the Lorenz curve, and area B below the Lorenz curve.]

A + B = 1/2 = C.
B is the area under the Lorenz curve: ∫ G dF.
Area B is the area between the Lorenz curve and the horizontal axis.
We can instead look at: C + A = area between the Lorenz curve and the vertical axis = ∫ F dG.

∫ F dG - ∫ G dF = C + A - B = 1/2 + A - (1/2 - A) = 2A.
Area A = (1/2) {∫ F dG - ∫ G dF}.
Therefore, we have that:

Gini Coefficient = Area A / (Area A + Area B) = 2A = ∫ F dG - ∫ G dF.

353 Based on Section 2.25 of Volume I of Kendall's Advanced Theory of Statistics, not on the syllabus.


Recall that G(x) = {∫₀^x y f(y) dy} / E[X].  ⇒ dG = x f(x) dx / E[X].

Therefore, Gini Coefficient = ∫ F dG - ∫ G dF = (1/E[X]) ∫₀^∞ F(s) s f(s) ds - ∫₀^∞ G(s) f(s) ds

= (1/E[X]) ∫₀^∞ {∫₀^s f(t) dt} s f(s) ds - (1/E[X]) ∫₀^∞ {∫₀^s t f(t) dt} f(s) ds

= (1/E[X]) ∫₀^∞ ∫₀^s (s - t) f(t) dt f(s) ds.

∫₀^∞ ∫₀^s (s - t) f(t) dt f(s) ds is the contribution to the mean difference from when s > t.
By symmetry it is equal to the contribution to the mean difference from when t > s.

Therefore, 2 ∫₀^∞ ∫₀^s (s - t) f(t) dt f(s) ds = mean difference.

Gini Coefficient = {(mean difference)/2} / E[X] = (mean relative difference) / 2.


Problems:
42.1 (15 points) The distribution of incomes follows a Single Parameter Pareto Distribution, α > 1.
a. (3 points) Determine the mean relative difference.
b. (3 points) Determine the form of the Lorenz curve.
c. (3 points) With the aid of a computer, draw and compare the Lorenz curves for α = 1.5 and α = 3.
d. (3 points) Use the form of the Lorenz curve to compute the Gini coefficient.
e. (3 points) If the Gini coefficient is 0.47, what percent of total income is earned
by the top 1% of earners?

42.2 (5 points) For a Gamma Distribution with α = 2, determine the mean relative difference.
Hint: Calculate the contribution to the mean difference from when x < y.

∫ x e^(-x/θ) dx = -xθ e^(-x/θ) - θ² e^(-x/θ).
∫ x² e^(-x/θ) dx = -x²θ e^(-x/θ) - 2xθ² e^(-x/θ) - 2θ³ e^(-x/θ).


Solutions to Problems:

42.1. a. f(x) = αθ^α / x^(α+1), x > θ.

The contribution to the mean difference from when x > y is:

∫_θ^∞ ∫_y^∞ (x - y) {αθ^α/x^(α+1)} dx {αθ^α/y^(α+1)} dy.

∫_y^∞ (x - y)/x^(α+1) dx = ∫_y^∞ x^(-α) dx - y ∫_y^∞ x^(-α-1) dx = {1/(α - 1) - 1/α} y^(1-α) = 1 / {α(α - 1) y^(α-1)}.

Therefore, the contribution is: α²θ^(2α) ∫_θ^∞ {1/(α(α - 1) y^(α-1))} {1/y^(α+1)} dy = {αθ^(2α)/(α - 1)} ∫_θ^∞ y^(-2α) dy = αθ / {(α - 1)(2α - 1)}.

By symmetry this is equal to the contribution to the mean difference from when y > x.

Therefore, the mean difference is: 2αθ / {(α - 1)(2α - 1)}.

E[X] = αθ/(α - 1), α > 1.

Therefore, the mean relative difference is: 2/(2α - 1), α > 1.

b. G(x) = {∫_θ^x y f(y) dy} / E[X] = {∫_θ^x αθ^α/y^α dy} / {αθ/(α - 1)} = 1 - (θ/x)^(α-1), x > θ.

Now let t = F(x) = 1 - (θ/x)^α, x > θ.  ⇒ θ/x = (1 - t)^(1/α).

Then G(t) = 1 - (1 - t)^(1 - 1/α), 0 ≤ t ≤ 1.


c. For α = 1.5, G(t) = 1 - (1 - t)^(1/3), 0 ≤ t ≤ 1. For α = 3, G(t) = 1 - (1 - t)^(2/3), 0 ≤ t ≤ 1.

Here is a graph of these two Lorenz curves:

[Figure: Lorenz curves, % of income versus % of people, for α = 3 and α = 1.5; the α = 1.5 curve lies below the α = 3 curve.]

The Lorenz curve for α = 1.5 is below that for α = 3.
The incomes are more concentrated for α = 1.5 than for α = 3.

d. The Lorenz curve is: G(t) = 1 - (1 - t)^(1 - 1/α), 0 ≤ t ≤ 1.
Integrating, the area under the Lorenz curve is: B = 1 - 1/(2 - 1/α) = 1 - α/(2α - 1) = (α - 1)/(2α - 1).

Gini coefficient is: 1 - 2B = 1 - 2(α - 1)/(2α - 1) = 1/(2α - 1), α > 1.

Note that the Gini Coefficient = (mean relative difference)/2 = (1/2) {2/(2α - 1)} = 1/(2α - 1).

e. 0.47 = 1/(2α - 1). ⇒ α = 1.564. E[X] = αθ/(α - 1) = 2.773θ.

The 99th percentile is: θ(1 - 0.99)^(-1/1.564) = 19.00θ.

The income earned by the top 1% is:

∫_{19θ}^∞ x {1.564 θ^1.564 / x^2.564} dx = (1.564/0.564) θ^1.564 / (19θ)^0.564 = 0.527θ.

Thus the percentage of total income earned by the top 1% is: 0.527θ / (2.773θ) = 19.0%.
Comment: The mean difference has the same form as for the two-parameter Pareto Distribution; however, since
the means differ, the mean relative difference and the Gini coefficient do not.
The distribution of incomes in the United States has a Gini coefficient of about 0.47.
For a sample of size two from a Single Parameter Pareto Distribution with α > 1, it turns out that:

E[Min] = 2αθ/(2α - 1).   E[Max] = 2α²θ / {(α - 1)(2α - 1)}.

Therefore, the mean difference is: 2α²θ/{(α - 1)(2α - 1)} - 2αθ/(2α - 1) = 2αθ / {(α - 1)(2α - 1)}.

Since E[X] = αθ/(α - 1), the mean relative difference is 2/(2α - 1).
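
The figures in part (e) can be reproduced directly. This is an illustrative spot-check, not part of the original solution; quantities are in units of θ, which cancels from the final share.

```python
# alpha solves 0.47 = 1/(2*alpha - 1) for a Single Parameter Pareto.
alpha = (1.0 / 0.47 + 1.0) / 2.0                               # 1.5638...
x99 = 0.01 ** (-1.0 / alpha)                                   # 99th percentile, in units of theta
income_above = alpha / (alpha - 1.0) * x99 ** (1.0 - alpha)    # integral of x f(x) from x99 to infinity
share = income_above / (alpha / (alpha - 1.0))                 # divide by E[X]
print(round(alpha, 3), round(x99, 2), round(share, 3))         # 1.564, 19.0, 0.19
```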

42.2. f(x) = x^(α-1) e^(-x/θ) / {Γ(α) θ^α} = x e^(-x/θ) / θ².

The contribution to the mean difference from when x < y is:

(1/θ⁴) ∫₀^∞ {∫₀^y (y - x) x e^(-x/θ) dx} y e^(-y/θ) dy = (1/θ⁴) ∫₀^∞ {y ∫₀^y x e^(-x/θ) dx - ∫₀^y x² e^(-x/θ) dx} y e^(-y/θ) dy

= (1/θ⁴) ∫₀^∞ {y(-yθ e^(-y/θ) - θ² e^(-y/θ) + θ²) + y²θ e^(-y/θ) + 2yθ² e^(-y/θ) + 2θ³ e^(-y/θ) - 2θ³} y e^(-y/θ) dy

= (1/θ⁴) ∫₀^∞ {y²θ² e^(-2y/θ) + 2yθ³ e^(-2y/θ)} dy = (1/θ⁴) {2θ² (θ/2)³ + 2θ³ (θ/2)²} = 3θ/4.

(The terms in y²θ² e^(-y/θ) and -2yθ³ e^(-y/θ) integrate to 2θ⁵ and -2θ⁵ and so cancel.)

By symmetry this is equal to the contribution to the mean difference from when x > y.
Therefore, the mean difference is: 3θ/2.  E[X] = αθ = 2θ.
Therefore, the mean relative difference is: (3θ/2) / (2θ) = 3/4.

Comment: The Gini Coefficient is half the mean relative difference or 3/8.
One can show in general that for the Gamma the mean relative difference is 4β(α, α+1; 1/2) - 2,
where β(a, b; x) denotes the Beta distribution function.
Then in turn it can be shown that for alpha integer, the mean relative difference is: the binomial coefficient (2α choose α) divided by 2^(2α-1).
For example, for α = 4, the mean relative difference is: (8 choose 4)/2⁷ = 70/128 = 35/64.
The Gini Coefficient is half the mean relative difference, and is graphed below as a function of alpha:

[Figure: Gini coefficient of the Gamma Distribution as a function of α, decreasing toward 0 as α increases from near 0 to 10.]
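
The Gamma results quoted in the comment can be confirmed numerically. The sketch below is illustrative only (arbitrary seed and sample size), and uses the Beta-distribution-function form of the formula given above.

```python
import numpy as np
from scipy.stats import beta
from math import comb

rng = np.random.default_rng(2)

def gamma_gini(alpha):
    """Gini coefficient of a Gamma: half of 4*beta.cdf(0.5, alpha, alpha + 1) - 2."""
    return (4.0 * beta.cdf(0.5, alpha, alpha + 1) - 2.0) / 2.0

# Closed form for integer alpha: C(2 alpha, alpha) / 2^(2 alpha - 1), halved.
print(gamma_gini(2), comb(4, 2) / 2 ** 3 / 2)    # both 0.375
print(gamma_gini(4), comb(8, 4) / 2 ** 7 / 2)    # both 0.2734375

# Monte Carlo check of the mean relative difference route for alpha = 2: Gini = E|X - Y| / (2 E[X]).
x = rng.gamma(2.0, 1.0, 10**6)
y = rng.gamma(2.0, 1.0, 10**6)
print(np.mean(np.abs(x - y)) / (2.0 * np.mean(x)))   # approximately 0.375
```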


Section 43, Important Ideas & Formulas


Here are what I believe are the most important formulas and ideas from this study guide to know for
the exam.
Statistics Ungrouped Data (Section 2):
Average of X

= 1st moment = E[X].

Average of X2 = 2nd moment about the origin = E[X2 ].


Mean = E[X].
Mode = the value most likely to occur.
Median = the value at which the distribution function is 50% = 50th percentile.
Variance = second central moment = E[(X - E[X])2 ] = E[X2 ] - E[X]2 .
Standard Deviation = √Variance.
Var[kX] = k² Var[X].
For independent random variables the variances add.
The average of n independent, identically distributed variables has a variance of
Var[X] / n.
Var[X+Y] = Var[X] + Var[Y] + 2Cov[X,Y].
Cov[X,Y] = E[XY] - E[X]E[Y].

Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).

Sample Mean = ΣXᵢ / N = X̄.
The sample variance is an unbiased estimator of the variance of the distribution from which a data set
was drawn: Sample variance = Σ(Xᵢ - X̄)² / (N - 1).


Coefficient of Variation and Skewness (Section 3):


Coefficient of Variation (CV) = Standard Deviation / Mean.
1 + CV² = E[X²] / E[X]² = 2nd moment divided by the square of the mean.
Average of X³ = 3rd moment about the origin = E[X³].
Third Central Moment = E[(X - E[X])³] = E[X³] - 3 E[X] E[X²] + 2 E[X]³.

Skewness = γ₁ = E[(X - E[X])³] / StdDev³. A symmetric distribution has zero skewness.

Kurtosis = E[(X - E[X])⁴] / Variance² = {E[X⁴] - 4 E[X] E[X³] + 6 E[X]² E[X²] - 3 E[X]⁴} / Variance².

When computing the empirical coefficient of variation, skewness, or kurtosis, we use the biased
estimate of the variance, with n in the denominator, rather than the sample variance.
Empirical Distribution Function (Section 4):
The Empirical Distribution Function at x: (# of losses ≤ x)/(# of losses).
The Empirical Distribution Function has mean of F(x) and a variance of: F(x){1 - F(x)}/N.
S(x) = 1 - F(x) = the Survival Function.
Limited Losses (Section 5):
X ∧ L ≡ Minimum of X and L = Limited Loss Variable.

The Limited Expected Value at L = E[X ∧ L] = E[Minimum[L, X]].

E[X ∧ L] = ∫₀^L x f(x) dx + L S(L)
= contribution of small losses + contribution of large losses.

mean = E[X ∧ ∞].   E[X ∧ x] ≤ x.   E[X ∧ x] ≤ mean.


Losses Eliminated (Section 6):


N = the total number of accidents or loss events.

Losses Eliminated by a deductible of size d = N ∫₀^d x f(x) dx + N d S(d) = N E[X ∧ d].

Loss Elimination Ratio (LER) = (Losses Eliminated by a deductible of size d) / (Total Losses).

LER(x) = E[X ∧ x] / E[X].

Excess Losses (Section 7):

(X - d)+ ≡ 0 when X ≤ d, X - d when X > d ≡ left censored and shifted variable at d
≡ amounts paid to insured with a deductible of d.
Excess Ratio = R(x) = (Losses Excess of x) / (total losses) = E[(X - x)+] / E[X].
R(x) = 1 - LER(x) = 1 - {E[X ∧ x] / mean}.

Total Losses = Limited Losses + Excess Losses: X = (X ∧ d) + (X - d)+.

E[(X - d)+] = E[X] - E[X ∧ d].

Mean Excess Loss (Section 8):

Excess Loss Variable for d ≡ X - d for X > d, undefined for X ≤ d
≡ the nonzero payments excess of deductible d.
Mean Residual Life or Mean Excess Loss = e(x)
= the average dollars of loss above x on losses of size exceeding x.

e(x) = {E[X] - E[X ∧ x]} / S(x).

e(x) = (average size of those claims of size greater than x) - x.

Failure rate, force of mortality, or hazard rate = h(x) = f(x)/S(x) = -d ln(S(x)) / dx.


Layers of Loss (Section 9):


The percentage of losses in the layer from d to u =

{∫_d^u (x - d) f(x) dx + S(u) (u - d)} / ∫₀^∞ x f(x) dx = {E[X ∧ u] - E[X ∧ d]} / E[X] = LER(u) - LER(d) = R(d) - R(u).

Layer Average Severity (LAS) for the layer from d to u =
The mean losses in the layer from d to u = E[X ∧ u] - E[X ∧ d] =
{LER(u) - LER(d)} E[X] = {R(d) - R(u)} E[X].

Average Size of Losses in an Interval (Section 10):
The average size of loss for those losses of size between a and b is:

∫_a^b x f(x) dx / {F(b) - F(a)} = [{E[X ∧ b] - b S(b)} - {E[X ∧ a] - a S(a)}] / {F(b) - F(a)}.

Proportion of Total Losses from Losses in the Interval [a, b] is:

[{E[X ∧ b] - b S(b)} - {E[X ∧ a] - a S(a)}] / E[X].

Working with Grouped Data (Section 12):
For Grouped Data, if one is given the dollars of loss for claims in each interval,
then one can compute E[X ∧ x], LER(x), R(x), and e(x), provided x is an endpoint of an interval.

Uniform Distribution (Section 13):


Support: a ≤ x ≤ b        Parameters: None

D. f.: F(x) = (x - a) / (b - a)        P. d. f.: f(x) = 1 / (b - a)

Moments: E[Xⁿ] = (b^(n+1) - a^(n+1)) / {(b - a)(n + 1)}

Mean = (b + a)/2        Variance = (b - a)²/12


Statistics of Grouped Data (Section 14):


One can estimate moments of Grouped Data by assuming the losses are uniformly distributed on
each interval and then weighting together the moments for each interval by the number of claims
observed in each interval.
Policy Provisions (Section 15):
An ordinary deductible is a provision which states that when the loss is less than or
equal to the deductible, there is no payment and when the loss exceeds the deductible,
the amount paid is the loss less the deductible.
The Maximum Covered Loss is the size of loss above which no additional payments are
made.
A coinsurance factor is the proportion of any loss that is paid by the insurer after any other
modifications (such as deductibles or limits) have been applied.
A coinsurance is a provision which states that a coinsurance factor is to be applied.
The order of operations is:
1. Limit the size of loss to the maximum covered loss.
2. Subtract the deductible. If the result is negative, set the payment equal to zero.
3. Multiply by the coinsurance factor.
A policy limit is maximum possible payment on a single claim.
Policy Limit = c(u - d). Maximum Covered Loss = u = d + (Policy Limit)/c.
With no deductible and no coinsurance, the policy limit is the maximum covered loss.
Under a franchise deductible the insurer pays nothing if the loss is less than the deductible
amount, but ignores the deductible if the loss is > the deductible amount.
Name
ground-up loss

Description
Losses prior to the impact of any deductible or maximum covered loss;
the full economic value of the loss suffered by the insured
regardless of how much the insurer is required to pay
in light of any deductible, maximum covered loss, coinsurance, etc.


Truncated Data (Section 16):


Ground-up, unlimited losses have distribution function F(x).
G(x) is what one would see after the effects of either a deductible or maximum covered loss.
Left Truncated ≡ Truncation from Below at d
≡ deductible d & record size of loss when size > d.
G(x) = {F(x) - F(d)}/S(d), x > d.   1 - G(x) = S(x)/S(d), x > d.   g(x) = f(x)/S(d), x > d.
x ≡ the size of loss.

Truncation & Shifting from Below at d
≡ deductible d & record non-zero payment ≡ amount paid per (non-zero) payment.
G(x) = {F(x + d) - F(d)}/S(d), x > 0.   g(x) = f(x + d)/S(d), x > 0.
x ≡ the size of (non-zero) payment.   x + d ≡ the size of loss.

When data is truncated from above at the value L, claims of size greater than L are not in the
reported data base. G(x) = F(x)/F(L), x ≤ L.   g(x) = f(x)/F(L), x ≤ L.

Censored Data (Section 17):
Right Censored ≡ Censored from Above at u
≡ Maximum Covered Loss u & don't know exact size of loss, when ≥ u.
G(x) = F(x), x < u;   G(x) = 1, x = u.
g(x) = f(x), x < u;   point mass of probability S(u) at x = u.

The revised Distribution Function and density under censoring from above at u and truncation from
below at d is:
G(x) = {F(x) - F(d)}/S(d), d < x < u;   G(x) = 1, x = u.
g(x) = f(x)/S(d), d < x < u;   point mass of probability S(u)/S(d) at x = u.


Left Censored and Shifted at d ≡ (X - d)+ ≡ losses excess of d
≡ 0 when X ≤ d, X - d when X > d ≡ amounts paid to insured with a deductible of d
≡ payments per loss, including when the insured is paid nothing due to the deductible
≡ amount paid per loss.
G(0) = F(d);   G(x) = F(x + d), x > 0.
g(0) ≡ point mass of F(d);   g(x) = f(x + d), x > 0.

Average Sizes (Section 18):

Type of Data — Average Size
Ground-up, Total Limits — E[X]
Censored from Above at u — E[X ∧ u]
Truncated from Below at d — e(d) + d = {E[X] - E[X ∧ d]}/S(d) + d
Truncated and Shifted from Below at d — e(d) = {E[X] - E[X ∧ d]}/S(d)
Left Censored and Shifted — E[(X - d)+] = E[X] - E[X ∧ d]
Censored from Above at u and Truncated and Shifted from Below at d — {E[X ∧ u] - E[X ∧ d]}/S(d)

With Maximum Covered Loss of u and an (ordinary) deductible of d, the average amount
paid by the insurer per loss is: E[X ∧ u] - E[X ∧ d].
With Maximum Covered Loss of u and an (ordinary) deductible of d, the average amount
paid by the insurer per non-zero payment to the insured is:
{E[X ∧ u] - E[X ∧ d]} / S(d); with u = ∞ this reduces to e(d).
A coinsurance factor of c multiplies the average payment, either per loss or per non-zero payment,
by c.
Percentiles (Section 19):
For a continuous distribution, the 100pth percentile is the first value at which F(x) = p.
For a discrete distribution, take the 100pth percentile as the first value at which F(x) p.
Definitions (Section 20):
A loss event or claim is an incident in which an insured or group of insureds suffers damages which
are potentially covered by their insurance contract.
The loss is the dollar amount of damage suffered by an insured or group of insureds as a result of a
loss event. The loss may be zero.


A payment event is an incident in which an insured or group of insureds receives a payment as a


result of a loss event covered by their insurance contract.
The amount paid is the actual dollar amount paid to the policyholder(s) as a result of a loss event or a
payment event. If it is as the result of a loss event, the amount paid may be zero.
A loss distribution is the probability distribution of either the loss or the amount paid from a loss
event or of the amount paid from a payment event.
The severity can be either the loss or amount paid random variable.
The exposure base is the basic unit of measurement upon which premiums are determined.
The frequency is the number of losses or number of payments random variable.
Parameters of Distributions (Section 21):
For a given type of distribution, in addition to the size of loss x, F(x) depends on
what are called parameters. The numerical values of the parameter(s) distinguish among the
members of a parametric family of distributions.
It is useful to group families of distributions based on how many parameters they have.
A scale parameter is a parameter which divides x everywhere it appears in the distribution function.
A scale parameter will appear to the nth power in the formula for the nth moment of the distribution.
A shape parameter affects the shape of the distribution and appears in the coefficient of variation
and the skewness.
Exponential Distribution (Section 22):
Support: x > 0        Parameters: θ > 0 (θ is a scale parameter)

F(x) = 1 - e^(-x/θ)        f(x) = e^(-x/θ)/θ

Mean = θ        Variance = θ²

Coefficient of Variation = 1.   Skewness = 2.   2nd moment = 2θ².

e(x) = Mean Excess Loss = θ.

When an Exponential Distribution is truncated and shifted from below,
one gets the same Exponential Distribution, due to its memoryless
property.


Single Parameter Pareto Distribution (Section 23):


Support: x > θ        Parameters: α > 0 (shape parameter)

F(x) = 1 - (θ/x)^α        f(x) = αθ^α / x^(α+1)

Mean = αθ/(α - 1), α > 1.

Common Two Parameter Distributions (Section 24):


Pareto: α is a shape parameter and θ is a scale parameter.
The Pareto is a heavy-tailed distribution. Higher moments may not exist.

F(x) = 1 - {θ/(θ + x)}^α = 1 - (1 + x/θ)^(-α)        f(x) = αθ^α / (θ + x)^(α+1)

Mean = θ/(α - 1), α > 1.        E[Xⁿ] = n! θⁿ / {(α - 1)...(α - n)}, α > n.

Mean Excess Loss = (θ + x)/(α - 1), α > 1.

If losses prior to any deductible follow a Pareto Distribution with parameters α and θ, then after
truncating and shifting from below by a deductible of size d, one gets another Pareto Distribution,
but with parameters α and θ + d.
Gamma: α is a shape parameter and θ is a scale parameter. Note the factors of θ in the moments.
For α = 1 you get the Exponential.
The sum of n independent identically distributed variables which are Gamma with parameters α and θ
is a Gamma distribution with parameters nα and θ. For α = a positive integer, the Gamma
distribution is the sum of α independent variables each of which follows an Exponential distribution.

F(x) = Γ(α; x/θ)        f(x) = x^(α-1) e^(-x/θ) / {Γ(α) θ^α}

Mean = αθ        Variance = αθ²        E[Xⁿ] = θⁿ α(α + 1)...(α + n - 1).

The skewness for the Gamma distribution is always twice the coefficient of variation.
LogNormal: If ln(x) follows a Normal, then x itself follows a LogNormal.

F(x) = Φ[{ln(x) - μ}/σ]        f(x) = exp[-{ln(x) - μ}²/(2σ²)] / {xσ√(2π)}

Mean = exp[μ + 0.5σ²]        Second Moment = exp[2μ + 2σ²]


Weibull: τ is a shape parameter, while θ is a scale parameter.

F(x) = 1 - exp[-(x/θ)^τ]        f(x) = τ (x/θ)^τ exp[-(x/θ)^τ] / x

For τ = 1 you get the Exponential Distribution.

Other Two Parameter Distributions (Section 25):

Inverse Gaussian:        Mean = μ        Variance = μ³/θ

F(x) = Φ[(x/μ - 1)√(θ/x)] + e^(2θ/μ) Φ[-(x/μ + 1)√(θ/x)]        f(x) = √{θ/(2πx³)} exp[-θ(x/μ - 1)²/(2x)]

LogLogistic: F(x) = (x/θ)^γ / {1 + (x/θ)^γ}.        f(x) = γ(x/θ)^γ / {x [1 + (x/θ)^γ]²}.

Inverse Gamma: If X follows a Gamma Distribution with parameters α and 1, then θ/X follows an
Inverse Gamma Distribution with parameters α and θ. α is the shape parameter and θ is the scale
parameter. The Inverse Gamma is heavy-tailed.

F(x) = 1 - Γ(α; θ/x)        f(x) = θ^α e^(-θ/x) / {Γ(α) x^(α+1)}

Mean = θ/(α - 1), α > 1.        E[Xⁿ] = θⁿ / {(α - 1)...(α - n)}, α > n.

Producing Additional Distributions (Section 29):

Introduce a scale parameter by "multiplying by a constant".
Let G(x) = 1 - F(1/x). One gets the Inverse Gamma from the Gamma.
Let G(x) = F(ln(x)). One gets the LogNormal from the Normal by exponentiating.
Add up independent identical copies. One gets the Gamma from the Exponential.
Let G(x) = F(x^τ). One gets a Weibull from the Exponential by "raising to a power."
One can get a new distribution as a continuous mixture of distributions.
The Pareto can be obtained as a mixture of Exponentials via an Inverse Gamma.
Another method of getting new distributions is via two-point or n-point mixtures.


Tails of Loss Distributions (Section 30):


If S(x) goes to zero slowly as x approaches ∞, this is a "heavy-tailed distribution."
The righthand tail is thick.
If S(x) goes to zero quickly as x approaches ∞, this is a "light-tailed distribution."
The righthand tail is thin.
The Pareto Distribution is heavy-tailed.
The Exponential distribution is light-tailed.
The Pareto Distribution is heavier-tailed than the LogNormal Distribution.
The Gamma, Pareto and LogNormal all have positive skewness.

Heavier Tailed: f(x) goes to zero more slowly; few moments exist; larger Coefficient of Variation;
higher skewness; e(x) increases to infinity; decreasing hazard rate.
Lighter Tailed: f(x) goes to zero more quickly; all (positive) moments exist; smaller Coefficient of
Variation; lower skewness; e(x) goes to a constant; increasing hazard rate.

Here is a list of some loss distributions, arranged in increasing heaviness of the tail:
Distribution — Mean Excess Loss — All Moments Exist
Weibull for τ > 1 — decreases to zero less quickly than 1/x — Yes
Gamma for α > 1 — decreases to a constant — Yes
Exponential — constant — Yes
Gamma for α < 1 — increases to a constant — Yes
Inverse Gaussian — increases to a constant — Yes
Weibull for τ < 1 — increases to infinity less than linearly — Yes
LogNormal — increases to infinity just less than linearly — Yes
Pareto — increases to infinity linearly — No

Let f(x) and g(x) be the two densities; then if:
lim_{x→∞} f(x)/g(x) = ∞, f has a heavier tail than g.
lim_{x→∞} f(x)/g(x) = 0, f has a lighter tail than g.
lim_{x→∞} f(x)/g(x) = a positive constant, f has a similar tail to g.


Limited Expected Values (Section 31):


E[X ∧ x] = ∫₀^x t f(t) dt + x S(x).

Rather than calculating this integral, make use of Appendix A of Loss Models, which has formulas for
the limited expected value for each distribution.
mean = E[X ∧ ∞].

e(x) = {mean - E[X ∧ x]} / S(x).

LER(x) = E[X ∧ x] / mean.
Layer Average Severity = E[X ∧ top of Layer] - E[X ∧ bottom of layer].
Expected Losses Excess of d: E[(X - d)+] = E[X] - E[X ∧ d].
Given Deductible Amount d, Maximum Covered Loss u, and coinsurance factor c, then
the average payment per non-zero payment by the insurer is:
c {E[X ∧ u] - E[X ∧ d]} / S(d).
Given Deductible Amount d, Maximum Covered Loss u, and coinsurance factor c, then
the insurer's average payment per loss to the insured is:
c (E[X ∧ u] - E[X ∧ d]).

E[X ∧ x] = ∫₀^x S(t) dt.        E[X] = ∫₀^∞ S(t) dt.

The Losses in a Layer can be written as an integral of the Survival Function from the bottom of the
Layer to the top of the Layer:

E[X ∧ b] - E[X ∧ a] = ∫_a^b S(t) dt.

The expected amount by which losses are less than d is: E[(d - X)+] = d - E[X ∧ d].

E[Max[X, a]] = a + E[X] - E[X ∧ a].

E[Min[Max[X, a], b]] = a + E[X ∧ b] - E[X ∧ a].
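
As an illustrative sketch (not from the study guide), the per-loss and per-payment averages above can be computed for a Pareto using its Appendix A limited expected value; the parameter values, deductible, limit, and coinsurance below are arbitrary choices.

```python
def pareto_lev(x, alpha, theta):
    """E[X ^ x] for a two-parameter Pareto: theta/(alpha-1) * {1 - (theta/(theta+x))^(alpha-1)}."""
    return theta / (alpha - 1.0) * (1.0 - (theta / (theta + x)) ** (alpha - 1.0))

def pareto_survival(x, alpha, theta):
    return (theta / (theta + x)) ** alpha

# Hypothetical policy: d = 1000, u = 25,000, c = 0.8; Pareto alpha = 3, theta = 10,000.
alpha, theta, d, u, c = 3.0, 10_000.0, 1_000.0, 25_000.0, 0.8
per_loss = c * (pareto_lev(u, alpha, theta) - pareto_lev(d, alpha, theta))
per_payment = per_loss / pareto_survival(d, alpha, theta)
print(per_loss, per_payment)
```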


Limited Higher Moments (Section 32):


E[(X ∧ u)²] = ∫₀^u t² f(t) dt + S(u) u².

The second moment of the average payment per loss under a Maximum Covered Loss u and a
deductible of d = the second moment of the layer from d to u is:
E[(X ∧ u)²] - E[(X ∧ d)²] - 2d {E[X ∧ u] - E[X ∧ d]}.
Given a deductible of d and a Maximum Covered Loss u, the second moment of the non-zero
payments is: (2nd moment of the payments per loss)/S(d).
If one has a coinsurance factor of c, then each payment is multiplied by c; therefore the second
moment and the variance are each multiplied by c².

Mean Excess Loss (Mean Residual Life) (Section 33):

e(x) = E[X - x | X > x] = ∫_x^∞ (t - x) f(t) dt / S(x) = ∫_x^∞ t f(t) dt / S(x) - x = ∫_x^∞ S(t) dt / S(x).

e(x) = {mean - E[X ∧ x]} / S(x).

e(d) = average payment per payment with a deductible d.
It should be noted that for heavier-tailed distributions, just as with the mean, the Mean Excess Loss
only exists for certain values of the parameters. Otherwise it is infinite.

Distribution — Behavior of e(x) as x → ∞
Exponential — constant
Pareto — increases linearly
LogNormal — increases to infinity less than linearly
Gamma, α > 1 — decreases towards a horizontal asymptote
Gamma, α < 1 — increases towards a horizontal asymptote
Weibull, τ > 1 — decreases to zero
Weibull, τ < 1 — increases to infinity less than linearly


Hazard Rate (Section 34):


The Hazard Rate, force of mortality, or failure rate, is defined as:
h(x) = f(x)/S(x), x ≥ 0.
h(x) = -d ln(S(x)) / dx.

S(x) = exp[-H(x)], where H(x) = ∫₀^x h(t) dt.

h(x) defined for x > 0 is a legitimate hazard rate, if and only if h(x) ≥ 0 and the integral of h(x) from 0 to
infinity is infinite.
For the Exponential, h(x) = 1/θ = constant.

lim_{x→∞} e(x) = lim_{x→∞} 1/h(x).

Loss Elimination Ratios and Excess Ratios (Section 35):

Loss Elimination Ratio = LER(x) = E[X ∧ x] / mean = 1 - R(x).
Excess Ratio = R(x) = (mean - E[X ∧ x]) / mean = 1 - {E[X ∧ x] / mean} = 1 - LER(x).

LER(x) = ∫₀^x S(t) dt / E[X] = ∫₀^x S(t) dt / ∫₀^∞ S(t) dt.

The percent of losses in a layer can be computed either as the difference in Loss Elimination Ratios
or the difference of Excess Ratios in the opposite order.


The Effects of Inflation (Section 36):


Uniform Inflation Every size of loss increases by a factor of 1+r.
Under uniform inflation, for a fixed limit the excess ratio increases and for a fixed deductible amount
the loss elimination ratio declines.
In order to keep up with inflation either the deductible or the limit must be increased at the rate of
inflation, rather than being held constant.
Under uniform inflation the dollars limited by a fixed limit increase slower than the overall
rate of inflation. Under uniform inflation the dollars excess of a fixed limit increase faster
than the overall rate of inflation.
Limited Losses plus Excess Losses = Total Losses.
Common ways to express the amount of inflation:
1. State the total amount of inflation from the earlier year to the later year.
2. Give a constant annual inflation rate.
3. Give the different amounts of inflation during each annual period between the earlier and later year.
4. Give the value of some consumer price index in the earlier and later year.
In all cases you want to determine the total inflation factor, (1+r), to get from the earlier year to the
later year.
The Mean, Mode, Median, and the Standard Deviation are each multiplied by (1+r).
The Variance is multiplied by (1+r)2 . The nth moment is multiplied by (1+r)n .
The Coefficient of Variation, the Skewness, and the Kurtosis are each unaffected by
uniform inflation.
Provided the limit keeps up with inflation, the Limited Expected Value, in dollars,
is multiplied by the inflation factor.
In the later year, the mean excess loss, in dollars, is multiplied by the inflation factor,
provided the limit has been adjusted to keep up with inflation.
The Loss Elimination Ratio, dimensionless, is unaffected by uniform inflation, provided
the deductible has been adjusted to keep up with inflation.
In the later year, the Excess Ratio, dimensionless, is unaffected by uniform inflation, provided the
limit has been adjusted to keep up with inflation.
Most of the size of loss distributions are scale families; under uniform inflation one gets the same
type of distribution. If there is a scale parameter, it is revised by the inflation factor.
For the Pareto, Single Parameter Pareto, Gamma, Weibull, and Exponential
Distributions, θ becomes (1+r)θ.
Under uniform inflation for the LogNormal, μ becomes μ + ln(1+r).


For distributions in general, one can determine the behavior under uniform inflation as follows.
One makes the change of variables Z = (1+r) X.
For the Distribution Function one just sets FZ(z) = FX(x); one substitutes for x = z / (1+r).
Alternately, for the density function fZ(z) = fX(x) / (1+r).
The domain [a, b] becomes under uniform inflation [(1+r)a, (1+r)b].
The uniform distribution on [a, b] becomes under uniform inflation the uniform
distribution on [a(1+r), b(1+r)].
There are two alternative ways to solve many problems involving uniform inflation:
1. Adjust the size of loss distribution in the earlier year to the later year based on the amount of
inflation. Then calculate the quantity of interest in the later year.
2. Calculate the quantity of interest in the earlier year at its deflated value, and then adjust it to
the later year for the effects of inflation.
Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered
Loss u, and coinsurance factor c, then in terms of the values in the earlier year, the
insurer's average payment per loss in the later year is:

(1+r) c {E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)]}.

Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered
Loss u, and coinsurance factor c, then in terms of the values in the earlier year, the
average payment per (non-zero) payment by the insurer in the later year is:

(1+r) c {E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)]} / S(d/(1+r)).


Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered Loss u,
and coinsurance factor c, then in terms of the values in the earlier year,
the second moment of the insurer's payment per loss in the later year is:

(1+r)² c² {E[(X ∧ u/(1+r))²] - E[(X ∧ d/(1+r))²] - 2{d/(1+r)}(E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)])}.

Given uniform inflation, with inflation factor of 1+r, Deductible Amount d, Maximum Covered Loss u,
and coinsurance factor c, then in terms of the values in the earlier year,
the second moment of the (non-zero) payments by the insurer in the later year is:

(1+r)² c² {E[(X ∧ u/(1+r))²] - E[(X ∧ d/(1+r))²] - 2{d/(1+r)}(E[X ∧ u/(1+r)] - E[X ∧ d/(1+r)])} / S(d/(1+r)).
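
A minimal sketch (not from the study guide) applying the first-moment inflation formulas to an Exponential severity, where E[X ∧ x] = θ(1 - e^(-x/θ)); the earlier-year parameter, policy terms, and inflation rate are arbitrary illustrative choices.

```python
import math

def exp_lev(x, theta):
    """E[X ^ x] for an Exponential: theta(1 - e^(-x/theta))."""
    return theta * (1.0 - math.exp(-x / theta))

# Hypothetical earlier-year values: theta = 5000, d = 1000, u = 10,000, c = 0.9, inflation r = 10%.
theta, d, u, c, r = 5_000.0, 1_000.0, 10_000.0, 0.9, 0.10

per_loss_later = (1 + r) * c * (exp_lev(u / (1 + r), theta) - exp_lev(d / (1 + r), theta))
per_payment_later = per_loss_later / math.exp(-(d / (1 + r)) / theta)
print(per_loss_later, per_payment_later)
```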

If one has a mixed distribution, then under uniform inflation each of the component distributions acts
as it would under uniform inflation.
Lee Diagrams (Section 37):
Put the size of loss on the y-axis and probability on the x-axis.
The mean is the area under the curve.
Layers of loss correspond to horizontal strips.
Restricting attention to only certain sizes of loss corresponds to vertical strips.


N-Point Mixtures of Models (Section 38):


Mixing models is a technique that provides a greater variety of loss distributions.
One can take a weighted average of any two Distribution Functions:
G(x) = pA(x) + (1-p)B(x). This is called a 2-point mixture of models.
The Distribution Function of the mixture is the mixture of the Distribution Functions.
The Survival Function of the mixture is the mixture of the Survival Functions.
The density of the mixture is the mixture of the densities.

The mean of the mixture is the mixture of the means.


The moment of the mixture is the mixture of the moments:
E_H[Xⁿ] = p E_A[Xⁿ] + (1-p) E_B[Xⁿ].
Limited Moments of the mixed distribution are the weighted average of the limited moments of the
individual distributions: E_H[X ∧ x] = p E_A[X ∧ x] + (1-p) E_B[X ∧ x].
In general, one can weight together any number of distributions, rather than just
two. These are called n-point mixtures.
Sometimes the mixture of models is just a mathematical device with no physical significance.
However, it can also be useful when the data results from different perils.
Variable Mixture ≡ weighted average of an unknown number of distributions of the same family but differing
parameters: F(x) = Σ wᵢ Fᵢ(x), with each Fᵢ of the same family & Σ wᵢ = 1.


Continuous Mixtures of Models (Section 39):


Mixture Distribution ≡ Continuous Mixture of Models.
One takes a mixture of the density functions for specific values of the parameter via some mixing
distribution u: g(x) = ∫ f(x; δ) u(δ) dδ.

The nth moment of a mixed distribution is the mixture of the nth moments for specific values of the
parameter: E[Xⁿ] = E[E[Xⁿ | δ]].
If the severity is Exponential and the mixing distribution of their means is Inverse Gamma,
then the mixed distribution is a Pareto, with
α = shape parameter of the Inverse Gamma and θ = scale parameter of the Inverse Gamma.
If the hazard rate of the Exponential, λ, is distributed via a Gamma(α, θ), then the mean 1/λ is
distributed via an Inverse Gamma(α, 1/θ), and therefore the mixed distribution is Pareto.
If the Gamma has parameters α and θ, then the mixed Pareto has parameters α and 1/θ.
If the severity is Normal with fixed variance s², and the mixing distribution of their means is also
Normal with mean μ and variance σ², then the mixed distribution is another Normal,
with mean μ and variance: s² + σ².
In a Frailty Model, the hazard rate is of the form: h(x | λ) = λ a(x), where λ is a parameter which varies
across the portfolio, and a(x) is some function of x.

Let A(x) = ∫₀^x a(t) dt.  Then S(x) = M_λ[-A(x)].

For an Exponential Distribution: a(x) = 1, and A(x) = x.
For a Weibull Distribution: a(x) = γ x^(γ-1), and A(x) = x^γ.


Spliced Models (Section 40):


A 2-component spliced model has: f(x) = w₁ f₁(x) on (a₁, b₁) and f(x) = w₂ f₂(x) on (a₂, b₂),
where f₁(x) is a density with support (a₁, b₁), f₂(x) is a density with support (a₂, b₂),
and w₁ + w₂ = 1.
A 2-component spliced density will be continuous at the breakpoint b,
provided the weights are inversely proportional to the component densities at the breakpoint:
w₁ = f₂(b) / {f₁(b) + f₂(b)},  w₂ = f₁(b) / {f₁(b) + f₂(b)}.
Relationship to Life Contingencies (Section 41):

_(y-x)p_x ≡ Prob[Survival past y | Survival past x] = S(y)/S(x).

_(y-x)q_x ≡ Prob[Not Surviving past y | Survival past x] = {S(x) - S(y)}/S(x) = 1 - _(y-x)p_x.

p_x ≡ ₁p_x = Prob[Survival past x+1 | Survival past x] = S(x+1)/S(x).

q_x ≡ ₁q_x = Prob[Death within one year | Survival past x] = 1 - S(x+1)/S(x).

_(t|u)q_x ≡ Prob[x+t < time of death ≤ x+t+u | Survival past x] = {S(x+t) - S(x+t+u)}/S(x).

Mahler's Guide to
Aggregate Distributions
Aggregate Distributions
Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-3


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching


Mahler's Guide to Aggregate Distributions


Copyright 2013 by Howard C. Mahler.
The Aggregate Distribution concepts in Loss Models are demonstrated.
Information in bold or sections whose title is in bold are more important for passing the exam.
Larger bold type indicates it is extremely important.
Information presented in italics (and sections whose titles are in italics) should not be needed to
directly answer exam questions and should be skipped on first reading. It is provided to aid the
reader's overall understanding of the subject, and to be useful in practical applications.
Highly Recommended problems are double underlined.
Recommended problems are underlined.
Solutions to problems are given at the end of each section.1
Section #    Pages      Section Name
1            3-28       Introduction
2            29-59      Convolutions
3            60-74      Using Convolutions
4            75-108     Generating Functions
5            109-198    Moments of Aggregate Losses
6            199-210    Individual Risk Model
7            211-242    Recursive Method / Panjer Algorithm
8            243-252    Recursive Method / Panjer Algorithm, Advanced
9            253-274    Discretization
10           275-286    Analytic Results
11           287-323    Stop Loss Premiums
12           324-329    Important Formulas & Ideas

Note that problems include both some written by me and some from past exams. The latter are copyright by the
CAS and/or SOA and are reproduced here solely to aid students in studying for exams. The solutions and
comments are solely the responsibility of the author; the CAS/SOA bear no responsibility for their accuracy. While
some of the comments may seem critical of certain questions, this is intended solely to aid you in studying and in no
way is intended as a criticism of the many volunteers who work extremely long and hard to produce quality exams. In
some cases Ive rewritten past exam questions in order to match the notation in the current Syllabus. In some cases
the material covered is preliminary to the current Syllabus; you will be assumed to know it in order to answer exam
questions, but it will not be specifically tested.


Past Exam Questions by Section of this Study Aid2

[Table: chart of past exam questions (Course 3 Sample and 5/00 through 11/03, CAS 3 and SOA 3/M 11/03 through 5/07) broken down by section of this study aid.]

The CAS/SOA did not release the 5/02 and 5/03 exams.
The SOA did not release its 5/04 and 5/06 exams.
From 5/00 to 5/03, the Course 3 Exam was jointly administered by the CAS and SOA.
Starting in 11/03, the CAS and SOA gave separate exams.
The CAS/SOA did not release the 11/07 and subsequent exams 4/C.

2 Excluding any questions that are no longer on the syllabus.


Section 1, Introduction
Aggregate losses are the total dollars of loss. Aggregate losses are the product of: the number of
exposures, frequency per exposure, and severity.
Aggregate Losses =
(# of Exposures) (# of Claims / # of Exposures) ($ of Loss / # of Claims) =

(Exposures) (Frequency) (Severity).


If one is not given the frequency per exposure, but is rather just given the frequency for the whole
number of exposures,3 whatever they are for the particular situation, then
Aggregate Losses = (Frequency) (Severity).
Definitions:
The Aggregate Loss is the total dollars of loss for an insured or set of an insureds. If not stated
otherwise, the period of time is one year.
For example, during 1999 the MT Trucking Company may have had $152,000 in aggregate losses
on its commercial automobile collision insurance policy. All of the trucking firms insured by the
Fly-by-Night Insurance Company may have had $16.1 million dollars in aggregate losses for
collision. The dollars of aggregate losses are determined by how many losses there are and the
severity of each one.
Exercise: During 1998 MT Trucking suffered three collision losses for $8,000, $13,500, and
$22,000. What are its aggregate losses?
[Solution: $8,000 + $13,500 + $22,000 = $43,500.]
The Aggregate Payment is the total dollars paid by an insurer on an insurance policy or set of
insurance policies. If not stated otherwise, the period of time is one year.
Exercise: During 1998 MT Trucking suffered three collision losses for $8,000, $13,500, and
$22,000. MT Trucking has a $10,000 per claim deductible on its policy with the Fly-by-Night
Insurance Company. What are the aggregate payments by Fly-by-Night?
[Solution: $0 + $3,500 + $12,000 = $15,500.]

For example, the expected number of claims from a large commercial insured is 27.3 per year, or the expected number
of Homeowners claims expected by XYZ Insurer in the State of Florigia is 12,310.


Loss Models uses many different terms for each of the important concepts:
aggregate losses, frequency and severity.4
aggregate losses ≡ aggregate loss random variable ≡ total loss random variable ≡ aggregate
payments ≡ total payments ≡ S.
frequency ≡ frequency distribution ≡ number of claims ≡ claim count distribution
≡ claim count random variable ≡ N.
severity ≡ severity distribution ≡ single loss random variable
≡ individual loss random variable ≡ loss random variable ≡ X.
Collective Risk Model:
There are two different types of risk models discussed in Loss Models in order to calculate
aggregate losses or payments.
The collective risk model adds up the individual losses.5 Frequency is independent of severity
and the sizes of loss are independent, identically distributed variables. Exam questions almost
always involve the collective risk model.
1. Conditional on the number of losses, the sizes of loss are independent, identically
distributed variables.
2. The size of loss distribution is independent of the number of losses.
3. The distribution of the number of claims is independent of the sizes of loss.
For example, one might look at the aggregate losses incurred this year by Few States Insurance
Company on all of its Private Passenger Automobile Bodily Injury Liability policies in the State of
West Carolina. Under a collective risk model one might model the number of losses via a Negative
Binomial and the size of loss via a Weibull Distribution. In such a model one is not modeling what
happens on each individual policy.6
If we have 1000 independent, identical policies, then the mean of the sum of the aggregate loss is
1000 times the mean of the aggregate loss for one policy. If we have 1000 independent, identical
policies, then the variance of the sum of the aggregate losses is 1000 times the variance of the
aggregate loss for one policy.
4

This does not seem to add any value for the reader.
See Definition 9.1 in Loss Models.
6
In any given year, almost all of these policies would have no Bodily Injury Liability claim.
5


Individual Risk Model:


In contrast the individual risk model adds up the amount paid on each insurance policy.7
The amounts paid on the different insurance policies are assumed to be independent of each other.8
For example, we have 10 life insurance policies with death benefit of $100,000, and 5 with a death
benefit of $250,000. Each policy could have a different mortality rate. Then one adds the modeled
payments on these polices. This is an example of an individual risk model.
Advantages of Analyzing Frequency and Severity Separately:9
Loss Models lists seven advantages of separately analyzing frequency and severity, which allow a
more accurate and flexible model of aggregate losses:
1. The number of claims changes as the volume of business changes.
Aggregate frequency may change due to changes in exposures. For example, if exposures are in
car years, the expected frequency per car year might stay the same, while the number of car years
insured increases somewhat. Then the expected total frequency would increase.
For example, if the frequency per car year were 3%, and the insured car years increased from
100,000 to 110,000, then the expected number of losses would increase from 3000 to 3300.10
2. The effects of inflation can be incorporated.
3. One can adjust the severity distribution for changes in deductibles,
maximum covered loss, etc.

See Definition 9.2 in Loss Models.


Unless specifically told to, do not assume that the amount of loss on the different policies are identically distributed.
For example, the different policies might represent the different employees under a group life or health contract.
Each employee might have different amounts of coverage and/or frequencies.
9
See page 201 of Loss Models.
10
In spite of what Loss Models says, there is no significant advantage to looking at frequency and severity separately
in this case. One could just multiply expected aggregate losses by 10% in this example. Nor is this example likely to
be a big concern financially, if as usual the premiums collected increase in proportion to the increase in the number
of car years. Nor is this situation relatively hard to keep track of and/or predict. It should be noted that insurers make
significant efforts to keep track of the volume of business they are insuring. In most cases it is directly related to
collecting appropriate premiums.
Of far greater concern is when the expected frequency per car year changes significantly. For example, if the
expected frequency per car year went from 3.0% to 3.3%, this would also increase the expected total number of
losses, but without any automatic increase in premiums collected. This might have occurred when speed limits were
increased from 55 m.p.h. or when lawyers were first allowed to advertise on T.V. Being able to separately adjust
historical frequency and severity for the expected impacts of such changes would be an advantage.
8


4. One can adjust frequency for changes in deductibles.


5. One can appropriately combine data from policies with different deductibles and maximum
covered losses into a single severity distribution.
6. One can create consistent models for the insurer, insured, and reinsurer.
7. One can analyze the tail of the aggregate losses by separately analyzing the tails of the
frequency and severity.11
A separate analysis allows an actuary to estimate the parameters of the frequency and severity from
separate sources of information.12
Deductibles and Maximum Covered Losses:13
Just as when dealing with loss distributions, aggregate losses may represent somewhat different
mathematical and real world quantities. They may relate to the total economic loss, i.e., no deductible
and no maximum covered loss. They may relate to the amount paid by an insurer after deductibles
and/or maximum covered losses.14 They may relate to the amount paid by the insured due to a
deductible and/or other policy modifications. They may relate to the amount the insurer pays net of
reinsurance. They may relate to the amount paid by a reinsurer.
In order to get the aggregate losses in these different situations, one has to adjust the severity
distribution and then add up the number of payments or losses.15 One can either add up the
non-zero payments or one can add up all the payments, including zero payments. One needs to
be careful to use the corresponding frequency and severity distributions.
For example, assume that losses are Poisson with λ = 3, and severity is Exponential with
θ = 2000. Frequency and severity are independent.

11

See Mahlers Guide to Frequency Distributions and Mahler's Guide to Loss Distributions. For most casualty
lines, the tail behavior of the aggregate losses is determined by that of the severity distribution rather than the
frequency distribution.
12
This can be either an advantage or a disadvantage. Using inconsistent data sources or models may produce a
nonsensical estimate of aggregate losses.
13
See Section 8.6 of Loss Models and Mahlers Guide to Loss Distributions.
14
Of course in practical applications, we may have a combination of different deductibles, maximum covered losses,
coinsurance clauses, reinsurance agreements, etc. However, the concept is still the same.
15
See Mahlers Guide to Loss Distributions. Policy provisions that deals directly with the aggregate losses, such as
the aggregate deductible for stop loss insurance to be discussed subsequently, must be applied to the aggregate
losses at the end.


If there is a deductible of 1000, then the (non-zero) payments are also Exponential with θ = 2000.16
S(1000) = e^(-1000/2000) = 60.65%; this is the proportion of losses that result in (non-zero) payments.
Therefore the aggregate payments can be modeled as either:
1. Number of (non-zero) payments is Poisson with λ = (0.6065)(3) = 1.8195 and the size of
(non-zero) payments are also Exponential with θ = 2000.17
2. Number of losses is Poisson with λ = 3 and the size of payments per loss is a two-point
mixture, with 39.35% chance of zero and 60.65% chance of an Exponential with θ = 2000.
Average aggregate payments are: (1.8195)(2000) = 3639 = (3){(0)(0.3935) + (2000)(0.6065)}.
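
The equivalence of the two models is easy to confirm in a few lines; this is an illustrative spot-check of the example above, not part of the original text.

```python
import math

# Deductible 1000, Poisson lambda = 3, Exponential theta = 2000.
lam, theta, d = 3.0, 2000.0, 1000.0
p = math.exp(-d / theta)                     # probability a loss produces a non-zero payment: 0.6065

# Model 1: thinned Poisson of non-zero payments, each Exponential with mean theta (memoryless).
model1 = (lam * p) * theta

# Model 2: all losses, with a two-point mixture severity of 0 and an Exponential with mean theta.
model2 = lam * ((1 - p) * 0.0 + p * theta)

print(model1, model2)                        # both about 3639
```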
Exercise: Frequency is Negative Binomial with β = 4 and r = 3. Severity is Pareto with
α = 2 and θ = 10,000. Frequency and severity are independent. What are the average aggregate
losses?
The insurer buys $25,000 per claim reinsurance; the reinsurer will pay the portion of each claim
greater than $25,000.
What are the insurer's aggregate annual losses after reinsurance?
How would one model the insurer's aggregate losses after reinsurance?
[Solution: Average frequency = (4)(3) = 12. Average severity = 10000/(2 - 1) = 10,000.
Average aggregate losses (prior to reinsurance) = (12)(10,000) = 120,000.
After reinsurance, the average severity is:
E[X ∧ 25000] = {10000/(2 - 1)}{1 - (1 + 25000/10000)^-(2-1)} = 7143.
Average aggregate losses, after reinsurance = (12)(7143) = 85,716.
After reinsurance, the frequency distribution is the same, while the severity distribution is censored at
25,000:
G(x) = 1 - {1 + (x/10000)}^-2, for x < 25000; G(25,000) = 1.]

16
17

Due to the memoryless property of the Exponential Distribution. See Mahlers Guide to Loss Distributions.
When one thins a Poisson, one gets another Poisson. See Mahlers Guide to Frequency Distributions.

2013-4-3,

Aggregate Distributions 1 Introduction,

HCM 10/23/12,

Page 8

Exercise: Frequency is Negative Binomial with β = 4 and r = 3. Severity is Pareto with
α = 2 and θ = 10,000. Frequency and severity are independent. The insurer buys $25,000 per
claim reinsurance. What is the average aggregate loss for the reinsurer?
How would one model the reinsurer's aggregate losses?
[Solution: From the previous exercise, the reinsurer pays on average 120,000 - 85,716 = 34,284.
Alternately, the distribution function of the Pareto at $25,000 is: 1 - (1 + 2.5)^(-2) = 0.918367.
Thus only 8.1633% of the insurer's losses lead to a non-zero payment by the reinsurer.
Thus the reinsurer sees a frequency distribution for non-zero payments which is Negative Binomial
with β = (4)(8.1633%) = 0.32653, r = 3, and mean (0.32653)(3) = 0.97959.18
The severity distribution for non-zero payments by the reinsurer is truncated and shifted from below
at 25,000. G(x) = {F(x+25000) - F(25000)}/S(25000) =
{(1+2.5)^(-2) - (1 + (x+25000)/10000)^(-2)}/(1+2.5)^(-2) = 1 - {3.5 + (x/10000)}^(-2) (3.5²) =
1 - (1 + (x/35,000))^(-2).
Thus the distribution of the reinsurer's non-zero payments is Pareto with α = 2 and θ = 35,000, and
mean 35000/(2-1) = 35,000.19
Thus the reinsurer's expected aggregate losses are: (0.97959)(35,000) = 34,286.
Alternately, we can model the reinsurer including its zero payments. In that case, the frequency
distribution is the original Negative Binomial with β = 4 and r = 3. The severity would be a
91.8367% and 8.1633% mixture of zero and a Pareto with α = 2 and θ = 35,000;
G(0) = 0.918367, and G(x) = 1 - (0.081633){1 + (x/35000)}^(-2), for x > 0.]
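Here is a quick numerical cross-check of these two exercises (my own Python sketch, not from the guide); the function name pareto_lev is my own, and the small differences from the text are rounding.

def pareto_lev(x, alpha, theta):
    """Limited expected value E[X ^ x] for a two-parameter Pareto."""
    return (theta / (alpha - 1)) * (1 - (theta / (theta + x)) ** (alpha - 1))

alpha, theta = 2.0, 10000.0
mean_freq = 3 * 4                       # Negative Binomial mean: r * beta = (3)(4) = 12
mean_sev = theta / (alpha - 1)          # unlimited Pareto mean = 10,000

gross = mean_freq * mean_sev                               # 120,000
insurer = mean_freq * pareto_lev(25000, alpha, theta)      # about 85,714
print(gross, insurer, gross - insurer)                     # ceded to the reinsurer: about 34,286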
Model Choices:
Severity distributions that are members of scale families have the advantage that they are easy to
adjust for inflation and/or changes in currency.20 Infinitely divisible frequency distributions have the
advantages that they are easy to adjust for changes in level of exposures and/or time period.21
Loss Models therefore recommends the use of infinitely divisible frequency distributions,
unless there is a specific reason not to do so.22

18 If one thins a Negative Binomial variable one gets the same form, but with β multiplied by the thinning factor.
See Mahler's Guide to Frequency Distributions.
19 The mean excess loss for a Pareto is e(x) = (θ+x)/(α-1). e(25000) = (10,000 + 25,000)/(2-1) = 35,000.
See Mahler's Guide to Loss Distributions.
20 See Mahler's Guide to Loss Distributions. If X is a member of a scale family, then for any c > 0, cX is also a
member of that family.
21 See Mahler's Guide to Frequency Distributions. Infinitely divisible frequency distributions include the Poisson
and Negative Binomial. Compound distributions with a primary distribution that is infinitely divisible are also infinitely
divisible.
22 For example, the Binomial may be appropriate when there is some maximum possible number of claims.


Problems:
1.1 (1 point) According to Loss Models, which of the following are advantages of the separation of
the aggregate loss process into frequency and severity?
1. Allows the actuary to estimate the parameters of the frequency and severity from separate
sources of information.
2. Allows the actuary to adjust for inflation.
3. Allows the actuary to adjust for the effects of deductibles.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3
E. None of A,B, C, or D.
Use the following information for the next two questions:
• Frequency is Poisson with λ = 130, prior to the effect of any deductible.
• Loss amounts have a LogNormal Distribution with μ = 6.5 and σ = 1.3, prior to the
effect of any deductible.
• Frequency and loss amounts are independent.

1.2 (1 point) Calculate the expected aggregate amount paid by the insurer.
(A) 160,000   (B) 170,000   (C) 180,000   (D) 190,000   (E) 200,000

1.3 (3 points) Calculate the expected aggregate amount paid by the insurer, if there is a deductible
of 500 per loss.
(A) 140,000   (B) 150,000   (C) 160,000   (D) 170,000   (E) 180,000

1.4 (1 point) According to Loss Models, which of the following are advantages of the use of infinitely
divisible frequency distribution?
1. The type of distribution selected does not depend on whether one is working with months
or years.
2. Allows the actuary to retain the same type of distribution after adjusting for inflation.
3. Allows the actuary to adjust for changes in exposure levels while using the same type of
distribution.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3
E. None of A,B, C, or D.


1.5 (1 point) Aggregate losses for a portfolio of policies are modeled as follows:
(i) The number of losses before any coverage modifications follows a distribution
with mean 30.
(ii) The severity of each loss before any coverage modifications is uniformly distributed
between 0 and 1000.
The insurer would like to model the impact of imposing an ordinary deductible of 100 on each loss
and reimbursing only 80% of each loss in excess of the deductible.
It is assumed that the coverage modifications will not affect the loss distribution.
The insurer models its claims with modified frequency and severity distributions.
The modified claim amount is uniformly distributed on the interval [0, 720].
Determine the mean of the modified frequency distribution.
(A) 3
(B) 21.6
(C) 24
(D) 27
(E) 30
1.6 (3 points) The amounts of loss have a Pareto Distribution with α = 4 and θ = 3000, prior to any
maximum covered loss or deductible. Frequency is Negative Binomial with r = 32 and β = 0.5, prior
to any maximum covered loss or deductible. If there is a 1000 deductible and 5000 maximum
covered loss, what is the expected aggregate amount paid by the insurer?
(A) 6000
(B) 6500
(C) 7000
(D) 7500
(E) 8000
1.7 (3 points) The Boxborough Box Company owns three factories.
It buys insurance to protect itself against major repair costs.
Profit = 45 less the sum of insurance premiums and retained major repair costs.
The Boxborough Box Company will pay a dividend equal to half of the profit, if it is positive.
You are given:
(i) Major repair costs at the factories are independent.
(ii) The distribution of major repair costs for each factory is:
     k     Prob(k)
     0     0.6
     20    0.3
     50    0.1
(iii) At each factory, the insurance policy pays the major repair costs in excess of that factory's
ordinary deductible of 10.
(iv) The insurance premium is 25.
Calculate the expected dividend.
(A) 3.9
(B) 4.0
(C) 4.1
(D) 4.2
(E) 4.3


1.8 (2 points) Lucky Tom always gives money to one beggar on his walk to work and money to
another beggar on his walk home from work.
There is a 3/4 chance he gives a beggar $1 and a 1/4 chance he gives a beggar $10.
However, 1/8 of the time Tom has to stay late at work and takes a cab home rather than walk.
What is the 90th percentile of the amount of money Tom gives away on a work day?
A. 1
B. 2
C. 10
D. 11
E. 20
1.9 (2 points) A restaurant has tables that seat two people. 30% of the time one person eats at a
table, while 70% of the time two people eat at a table. After eating their meal, tips are left.
50% of the time the tip at a table is $1 per person, 40% of the time the tip at a table is $2 per
person, and 10% of the time the tip at a table is $3 per person.
What is the 70th percentile of the distribution of tips left at a table?
A. 2
B. 3
C. 4
D. 5
E. 6
1.10 (2 points) A coffee shop has tables that seat two people. 30% of the time one person sits at
a table, while 70% of the time two people sit at a table. There are a variety of beverages which cost
either $1, $2, or $3. Each person buys a beverage, independently of anyone else. 50% of the
time the beverage costs $1, 40% of the time the beverage costs $2, and 10% of the time the
beverage costs $3.
Determine the probability that the total cost of beverages at a table is either $2, $3, or $4.
A. 78%
B. 79%
C. 80%
D. 81%
E. 82%
Use the following information for the next four questions:
The losses for the Mockingbird Tequila Company have a Poisson frequency distribution with λ = 5
and a Weibull severity distribution with τ = 1/2 and θ = 50,000.
The Mockingbird Tequila Company buys insurance from the Atticus Insurance Company, with a
deductible of $5000, maximum covered loss of $250,000, and coinsurance factor of 90%.
The Atticus Insurance Company buys reinsurance from the Finch Reinsurance Company.
Finch will pay Atticus for the portion of any payment in excess of $100,000.
1.11 (3 points) Construct a model for the aggregate payments retained by the Mockingbird Tequila
Company.
1.12 (3 points) Construct a model for the aggregate payments made by Atticus Insurance
Company to the Mockingbird Tequila Company, prior to the impact of reinsurance.
1.13 (3 points) Construct a model for the aggregate payments made by the Finch Reinsurance
Company to the Atticus Insurance Company.
1.14 (3 points) Construct a model for the aggregate payments made by the Atticus Insurance
Company to the Mockingbird Tequila Company net of reinsurance.


1.15 (2 points) The number of claims is Binomial with m = 2 and q = 0.1.
The size of claims is Normal with μ = 1500 and σ = 400.
Frequency and severity are independent.
Determine the probability that the aggregate loss is greater than 2000.
A. Less than 1.0%
B. At least 1.0%, but less than 1.5%
C. At least 1.5%, but less than 2.0%
D. At least 2.0%, but less than 2.5%
E. 2.5% or more
1.16 (5A, 11/95, Q.21) (1 point) Which of the following assumptions are made in the collective risk
model?
1. The individual claim amounts are identically distributed random variables.
2. The distribution of the aggregate losses generated by the portfolio is continuous.
3. The number of claims and the individual claim amounts are mutually independent.
A. 1
B. 2
C. 1, 3
D. 2, 3
E. 1, 2, 3
1.17 (Course 151 Sample Exam #1, Q.15) (1.7 points) An insurer issues a portfolio of 100
automobile insurance policies. Of these 100 policies, one-half have a deductible of 10 and the other
half have a deductible of zero. The insurance policy pays the amount of damage in excess of the
deductible subject to a maximum of 125 per accident. Assume:
(i) the number of automobile accidents per year per policy has a Poisson distribution
with mean 0.03
(ii) given that an accident occurs, the amount of vehicle damage has the distribution:
     x      p(x)
     30     1/3
     150    1/3
     200    1/3
Compute the total amount of claims the insurer expects to pay in a single year.
(A) 270
(B) 275
(C) 280
(D) 285
(E) 290
1.18 (Course 151 Sample Exam #1, Q.19) (2.5 points) SA and SB are independent random
variables and each has a compound Poisson distribution. You are given:
(i) λA = 3, λB = 1
(ii) The severity distribution of SA is: pA(1) = 1.0.
(iii) The severity distribution of SB is: pB(1) = pB(2) = 0.5.
(iv) S = SA + SB
Determine FS(2).
(A) 0.12   (B) 0.14   (C) 0.16   (D) 0.18   (E) 0.20


1.19 (Course 151 Sample Exam #2, Q.21) (1.7 points) Aggregate claims, X, is uniformly
distributed over (0, 20). Complete insurance is available with a premium of 11.6.
If X is less than k, a dividend of k - X is payable.
Determine k such that the expected cost of this insurance is equal to the expected claims without
insurance.
(A) 2
(B) 4
(C) 6
(D) 8
(E) 10
1.20 (Course 151 Sample Exam #3, Q.18) (1.7 points)
Aggregate claims has a compound Poisson distribution with:
(i) = 1.0
(ii) severity distribution: p(1) = p(2) = 0.5
For a premium of 4.0, an insurer will pay total claims and a dividend equal to the excess of 75% of
premium over claims. Determine the expected dividend.
(A) 1.5
(B) 1.7
(C) 2.0
(D) 2.5
(E) 2.7
1.21 (5A, 11/96, Q.37) (2 points) Claims arising from a particular insurance policy have a
compound Poisson distribution. The expected number of claims is five.
The claim amount density function is given by P(X = 1,000) = 0.8 and P(X = 5,000) = 0.2
Compute the probability that losses from this policy will total 6,000.
1.22 (5A, 11/99, Q.24) (1 point)
Which of the following are true regarding collective risk models?
A. If we combine insurance portfolios, where the aggregate claims of each of the portfolios have
compound Poisson Distributions and are mutually independent, then the aggregate claims for
the combined portfolio will also have a compound Poisson Distribution.
B. When the variance of the number of claims exceeds its mean, the Poisson distribution is
appropriate.
C. If the claim amount distribution is continuous, it can be concluded that the distribution of the
aggregate claims is continuous.
D. A Normal Distribution is usually the best approximation to the aggregate claim distribution.
E. All of the above are false.


1.23 (Course 1 Sample Exam, Q.10) (1.9 points) An insurance policy covers the two
employees of ABC Company, Bob and Carol. The policy will reimburse ABC for no more than
one loss per employee in a year. It reimburses the full amount of the loss up to an annual
company-wide maximum of 8000. The probability of an employee incurring a loss in a year is 40%.
The probability that an employee incurs a loss is independent of the other employees losses.
The amount of each loss is uniformly distributed on [1000, 5000].
Given that Bob has incurred a loss in excess of 2000, determine the probability that losses will
exceed reimbursements.
A. 1/20
B. 1/15
C. 1/10
D. 1/8
E. 1/6
Note: The original exam question has been rewritten.
1.24 (IOA 101, 4/01, Q.8) (5.25 points) Consider two independent lives A and B.
The probabilities that A and B die within a specified period are 0.1 and 0.2 respectively.
If A dies you lose 50,000, whether or not B dies.
If B dies you lose 30,000, whether or not A dies.
(i) (3 points) Calculate the mean and standard deviation of your total losses in the period.
(ii) (2.25 points) Calculate your expected loss within the period, given that one, and only one,
of A and B dies.
1.25 (3, 5/01, Q.26 & 2009 Sample Q.109) (2.5 points) A company insures a fleet of vehicles.
Aggregate losses have a compound Poisson distribution.
The expected number of losses is 20.
Loss amounts, regardless of vehicle type, have an exponential distribution with θ = 200.
In order to reduce the cost of the insurance, two modifications are to be made:
(i) a certain type of vehicle will not be insured. It is estimated that this will
reduce loss frequency by 20%.
(ii) a deductible of 100 per loss will be imposed.
Calculate the expected aggregate amount paid by the insurer after the modifications.
(A) 1600
(B) 1940
(C) 2520
(D) 3200
(E) 3880
1.26 (1, 11/01, Q.16) (1.9 points) Let S denote the total annual claim amount for an insured.
There is a probability of 1/2 that S = 0.
There is a probability of 1/3 that S is exponentially distributed with mean 5.
There is a probability of 1/6 that S is exponentially distributed with mean 8.
Determine the probability that 4 < S < 8.
(A) 0.04
(B) 0.08
(C) 0.12
(D) 0.24
(E) 0.25
Note: This past exam question has been rewritten.


1.27 (3, 11/01, Q.36 & 2009 Sample Q.102) (2.5 points) WidgetsRUs owns two factories. It
buys insurance to protect itself against major repair costs. Profit equals revenues, less the sum of
insurance premiums, retained major repair costs, and all other expenses.
WidgetsRUs will pay a dividend equal to the profit, if it is positive.
You are given:
(i) Combined revenue for the two factories is 3.
(ii) Major repair costs at the factories are independent.
(iii) The distribution of major repair costs for each factory is
     k    Prob(k)
     0    0.4
     1    0.3
     2    0.2
     3    0.1
(iv) At each factory, the insurance policy pays the major repair costs in excess of that factory's
ordinary deductible of 1. The insurance premium is 110% of the expected claims.
(v) All other expenses are 15% of revenues.
Calculate the expected dividend.
(A) 0.43
(B) 0.47
(C) 0.51
(D) 0.55
(E) 0.59
1.28 (SOA3, 11/03, Q.19 & 2009 Sample Q.86) (2.5 points)
Aggregate losses for a portfolio of policies are modeled as follows:
(i) The number of losses before any coverage modifications follows a Poisson
distribution with mean λ.
(ii) The severity of each loss before any coverage modifications is uniformly distributed
between 0 and b.
The insurer would like to model the impact of imposing an ordinary deductible, d (0 < d < b), on each
loss and reimbursing only a percentage, c (0 < c < 1), of each loss in excess of the
deductible.
It is assumed that the coverage modifications will not affect the loss distribution.
The insurer models its claims with modified frequency and severity distributions.
The modified claim amount is uniformly distributed on the interval [0, c(b - d)].
Determine the mean of the modified frequency distribution.
(A) λ   (B) cλ   (C) λd/b   (D) λ(b-d)/b   (E) cλ(b-d)/b

1.29 (SOA3, 11/04, Q.17 & 2009 Sample Q.126) (2.5 points)
The number of annual losses has a Poisson distribution with a mean of 5.
The size of each loss has a two-parameter Pareto distribution with θ = 10 and α = 2.5.
An insurance for the losses has an ordinary deductible of 5 per loss.
Calculate the expected value of the aggregate annual payments for this insurance.
(A) 8
(B) 13
(C) 18
(D) 23
(E) 28


1.30 (CAS3, 5/05, Q.6) (2.5 points) For a portfolio of 2,500 policies, claim frequency is 10% per
year and severity is distributed uniformly between 0 and 1,000. Each policy is independent and has
no deductible. Calculate the reduction in expected annual aggregate payments, if a deductible of
$200 per claim is imposed on the portfolio of policies.
A. Less than $46,000
B. At least $46,000, but less than $47,000
C. At least $47,000, but less than $48,000
D. At least $48,000, but less than $49,000
E. $49,000 or more
1.31 (SOA M, 11/06, Q.40 & 2009 Sample Q.289) (2.5 points)
A compound Poisson distribution has λ = 5 and claim amount distribution as follows:
     x       p(x)
     100     0.80
     500     0.16
     1000    0.04
Calculate the probability that aggregate claims will be exactly 600.
(A) 0.022
(B) 0.038
(C) 0.049
(D) 0.060
(E) 0.070


Solutions to Problems:
1.1. D. 1. True. 2. True. May also adjust for expected changes in frequency. 3. True.
1.2. E. E[X] = exp(μ + σ²/2) = exp(6.5 + 1.3²/2) = 1548.
Mean aggregate loss = (130)(1548) = 201,240.
1.3. B. E[X] = exp(μ + σ²/2) = e^7.345 = 1548.
E[X ∧ 500] = exp(μ + σ²/2) Φ[(ln500 - μ - σ²)/σ] + 500{1 - Φ[(ln500 - μ)/σ]} =
1548 Φ[-1.52] + 500{1 - Φ[-0.22]} = (1548)(0.0643) + (500)(0.5871) = 393.
The mean payment per loss is: E[X] - E[X ∧ 500] = 1548 - 393 = 1155.
Mean aggregate loss = (130)(1155) = 150,150.
Alternately, the nonzero payments are Poisson with mean: (130)S(500).
The average payment per nonzero payment is: (E[X] - E[X ∧ 500])/S(500).
Therefore, the mean aggregate loss is: (130)S(500)(E[X] - E[X ∧ 500])/S(500) =
(130)(E[X] - E[X ∧ 500]) = (130)(1155) = 150,150.
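The following short Python sketch (my own addition, not from the guide) reproduces the numbers in solution 1.3 from the LogNormal limited expected value formula; the helper Phi is implemented with math.erf.

from math import exp, log, sqrt, erf

def Phi(x):
    """Standard Normal CDF."""
    return 0.5 * (1 + erf(x / sqrt(2)))

mu, sigma, lam, d = 6.5, 1.3, 130, 500
EX = exp(mu + sigma**2 / 2)                                  # about 1548
EX_d = EX * Phi((log(d) - mu - sigma**2) / sigma) + d * (1 - Phi((log(d) - mu) / sigma))
print(EX, EX_d, lam * (EX - EX_d))                           # roughly 1548, 393, 150,000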
1.4. B. 1. True. For example, if the frequency over one year is Negative Binomial with parameters
β = 0.3 and r = 2, then (assuming the months are independent) the frequency is Negative Binomial
over one month, with parameters β = 0.3 and r = 2/12. This follows from the form of the Probability
Generating Function, which is: P(z) = {1 - β(z-1)}^(-r).
2. False. The frequency distribution has no effect on this. 3. True.
1.5. D. For the uniform distribution on (0, 1000), S(100) = 90%.
The distribution of non-zero payments has mean: (90%)(30) = 27.
Comment: Similar to SOA3, 11/03 Q.19.
The mean aggregate payment after modifications is: (27)(720/2) = 9720.


1.6. A. The mean frequency is: (32)(0.5) = 16.
The mean payment per loss is: E[X ∧ 5000] - E[X ∧ 1000]
= {θ/(α-1)}{1 - (θ/(θ+5000))^(α-1)} - {θ/(α-1)}{1 - (θ/(θ+1000))^(α-1)} =
(3000/3){(3000/4000)³ - (3000/8000)³} = (1000)(0.4219 - 0.0527) = 369.2.
Mean aggregate loss = (16)(369.2) = 5907.
Alternately, the nonzero payments have mean frequency: (16)S(1000).
The average payment per nonzero payment is: (E[X ∧ 5000] - E[X ∧ 1000])/S(1000).
Therefore, the mean aggregate loss is: (16)S(1000)(E[X ∧ 5000] - E[X ∧ 1000])/S(1000) =
(16)(E[X ∧ 5000] - E[X ∧ 1000]) = (16)(947.3 - 578.1) = 5907.
1.7. E. Each factory has retained costs that are 0 (60% of the time) and 10 (40% of the time).
Profit = 45 - 25 - retained costs = 20 - retained costs.
Prob[retained cost = 0] = Prob[all 3 factories have 0 retained costs] = 0.6³ = 0.216.
Prob[retained cost = 10] = Prob[2 factories have 0 retained costs and other has 10] =
(3)(0.6²)(0.4) = 0.432.
     Probability   Retained Cost   Profit   Dividend
     0.216         0               20       10
     0.432         10              10       5
     0.288         20              0        0
     0.064         30              -10      0
     Average       12              8        4.32
Comment: Similar to 3, 11/01, Q.36.

1.8. D. There is a 7/8 chance he donates to two beggars and a 1/8 chance he donates to one
beggar. Prob[Agg = 1] = (1/8)(3/4) = 3/32. Prob[Agg = 2] = (7/8)(3/4)² = 63/128.
Prob[Agg = 10] = (1/8)(1/4) = 1/32. Prob[Agg = 11] = (7/8)(2)(3/4)(1/4) = 42/128.
Prob[Agg = 20] = (7/8)(1/4)² = 7/128. Prob[Agg ≤ 10] = 79/128 = 61.7% < 90%.
Prob[Agg ≤ 11] = 121/128 = 94.5% ≥ 90%. The 90th percentile is 11.
Comment: The 90th percentile is the first value such that F ≥ 90%.
1.9. C. Prob[Agg = 1] = (30%)(50%) = 15%.
Prob[Agg = 2] = (30%)(40%) + (70%)(50%) = 47%.
Prob[Agg = 3] = (30%)(10%) = 3%.
Prob[Agg = 4] = (70%)(40%) = 28%.
Prob[Agg = 6] = (70%)(10%) = 7%.
Prob[Agg ≤ 3] = 65% < 70%.
Prob[Agg ≤ 4] = 93% ≥ 70%. The 70th percentile is 4.


1.10. B. Prob[Agg = 2] = Prob[1 person] Prob[drink = $2] + Prob[2 persons] Prob[both at $1] =
(30%)(40%) + (70%)(50%)² = 29.5%.
Prob[Agg = 3] = Prob[1 person] Prob[drink = $3] + Prob[2 persons] Prob[one at $1 and one at $2]
= (30%)(10%) + (70%)(2)(50%)(40%) = 31%.
Prob[Agg = 4] = Prob[2 persons] Prob[each at $2] + Prob[2 persons] Prob[one at $1 and one at $3]
= (70%)(40%)² + (70%)(2)(50%)(10%) = 18.2%.
Prob[Agg = 2] + Prob[Agg = 3] + Prob[Agg = 4] = 29.5% + 31% + 18.2% = 78.7%.
Comment: Prob[Agg = 1] = (30%)(50%) = 15%.
Prob[Agg = 5] = (70%)(2)(40%)(10%) = 5.6%. Prob[Agg = 6] = (70%)(10%)² = 0.7%.
1.11. Mockingbird retains all of any loss less than $5000.
For a loss of size greater than $5000, it retains $5000 plus 10% of the portion above $5000.
Mockingbird retains the portion of any loss above the maximum covered loss of $250,000.
Let X be the size of loss and Y be the amount retained.
Let F be the Weibull Distribution of X and G be the distribution of Y.
y = x, for x ≤ 5000.
y = 5000 + (0.1)(x - 5000) = 4500 + 0.1x, for 5000 ≤ x ≤ 250,000.
Therefore, x = 10y - 45000, for 5000 ≤ y ≤ 29,500.
y = 4500 + (0.1)(250000) + (x - 250000) = x - 220,500, for 250,000 ≤ x.
Therefore, x = y + 220,500, for 29,500 ≤ y.
G(y) = F(y) = 1 - exp[-(y/50000)^(1/2)], for y ≤ 5000.
G(y) = F(10y - 45000) = 1 - exp[-((10y - 45000)/50000)^(1/2)]
= 1 - exp[-(y/5000 - 0.9)^(1/2)], 5000 ≤ y ≤ 29,500.
G(y) = F(y + 220,500) = 1 - exp[-((y + 220,500)/50000)^(1/2)]
= 1 - exp[-(y/50000 + 4.41)^(1/2)], 29,500 ≤ y.
The number of losses is Poisson with λ = 5.


1.12. Let X be the size of loss and Y be the amount paid on that loss.
Let F be the Weibull Distribution of X and G be the distribution of Y.
Atticus Insurance pays nothing for a loss less than $5000. For a loss of size greater than $5000,
Atticus Insurance pays 90% of the portion above $5000.
For a loss of size 250,000, Atticus Insurance pays: (0.9)(250,000 - 5000) = 220,500.
Atticus Insurance pays no more for a loss larger than the maximum covered loss of $250,000.
y = 0, for x ≤ 5000.
y = (0.9)(x - 5000) = 0.9x - 4500, for 5000 ≤ x ≤ 250,000.
Therefore, x = (y + 4500)/0.9, y < 220,500.
y = 220,500, for 250,000 ≤ x.
G(0) = F(5000) = 1 - exp[-(5000/50000)^(1/2)] = 0.2711.
G(y) = F[(y + 4500)/0.9] = 1 - exp[-((y + 4500)/45000)^(1/2)], 0 < y < 220,500.
G(220,500) = 1.
The number of losses is Poisson with λ = 5.
Alternately, let Y be the non-zero payments by Atticus Insurance.
Then G(y) = {F((y + 4500)/0.9) - F(5000)}/S(5000)
= {1 - exp[-((y + 4500)/45000)^(1/2)] - 0.2711}/0.7289
= 1 - exp[-((y + 4500)/45000)^(1/2)]/0.7289, 0 < y < 220,500.
G(220,500) = 1.
The number of non-zero payments is Poisson with λ = (0.7289)(5) = 3.6445.


1.13. Finch Reinsurance pays something when the loss results in a payment by Atticus of more
than $100,000. Solve for the loss that results in a payment of $100,000:
100000 = (0.9)(x - 5000), so x = 116,111.
Let X be the size of loss and Y be the amount paid by Finch Reinsurance.
Let F be the Weibull Distribution of X and G be the distribution of Y.
y = 0, for x ≤ 116,111.
y = (0.9)(x - 116,111) = 0.9x - 104,500, for 116,111 < x ≤ 250,000.
Therefore, x = (y + 104500)/0.9, for 0 < y < 120,500.
y = 120,500, for 250,000 ≤ x.
G(0) = F(116111) = 1 - exp[-(116111/50000)^(1/2)] = 0.7821.
G(y) = F((y + 104500)/0.9) = 1 - exp[-((y + 104500)/45000)^(1/2)], for 0 < y < 120,500.
G(120500) = 1.
The number of losses is Poisson with λ = 5.
Alternately, let Y be the non-zero payments by Finch.
Then G(y) = {F((y + 104500)/0.9) - F(116111)}/S(116111)
= {1 - exp[-((y + 104500)/45000)^(1/2)] - 0.7821}/0.2179
= 1 - exp[-((y + 104500)/45000)^(1/2)]/0.2179, for 0 < y < 120,500.
G(120500) = 1.
The number of non-zero payments by Finch is Poisson with λ = (0.2179)(5) = 1.0895.
1.14. Let X be the size of loss and Y be the amount paid on that loss net of reinsurance.
Let F be the Weibull Distribution of X and G be the distribution of Y.
For a loss greater than 116,111, Atticus Insurance pays 100,000 net of reinsurance.
y = 0, for x ≤ 5000.
y = (0.9)(x - 5000) = 0.9x - 4500, for 5000 ≤ x ≤ 116,111.
y = 100,000, for 116,111 < x.
G(0) = F(5000) = 1 - exp[-(5000/50000)^(1/2)] = 0.2711.
G(y) = F[(y + 4500)/0.9] = 1 - exp[-((y + 4500)/45000)^(1/2)], 0 < y < 100,000.
G(100,000) = 1.
The number of losses is Poisson with λ = 5.
Alternately, let Y be the non-zero payments by Atticus Insurance net of reinsurance.
Then G(y) = {F((y + 4500)/0.9) - F(5000)}/S(5000)
= {1 - exp[-((y + 4500)/45000)^(1/2)] - 0.2711}/0.7289
= 1 - exp[-((y + 4500)/45000)^(1/2)]/0.7289, 0 < y < 100,000.
G(100,000) = 1.
The number of non-zero payments is Poisson with λ = (0.7289)(5) = 3.6445.
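As a sanity check on solutions 1.11-1.14, here is a small Monte Carlo sketch in Python (my own illustration, not part of the guide). It simulates Weibull losses with τ = 1/2 and θ = 50,000 and splits each loss into the piece retained by Mockingbird, the piece paid by Atticus net of reinsurance, and the piece ceded to Finch; the function names and parameter defaults below are my own labels for the policy provisions stated above.

import random
from math import log

def weibull_loss(theta=50000.0, tau=0.5):
    """Inverse transform sample from F(x) = 1 - exp[-(x/theta)^tau]."""
    u = random.random()
    return theta * (-log(1.0 - u)) ** (1.0 / tau)

def split(x, d=5000.0, maxcov=250000.0, c=0.9, att_limit=100000.0):
    insurer_gross = c * max(0.0, min(x, maxcov) - d)   # Atticus, before reinsurance
    retained = x - insurer_gross                       # Mockingbird keeps everything else
    ceded = max(0.0, insurer_gross - att_limit)        # Finch pays the part of each payment above 100,000
    insurer_net = insurer_gross - ceded
    return retained, insurer_net, ceded

random.seed(1)
pieces = [split(weibull_loss()) for _ in range(100000)]
print([sum(col) / len(pieces) for col in zip(*pieces)])   # average retained, net, and ceded per loss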


1.15. E. Prob[N = 0] = 0.9² = 0.81. Prob[N = 1] = (2)(0.1)(0.9) = 0.18.
Prob[N = 2] = 0.1² = 0.01.
Let X be the size of a single loss.
Prob[X > 2000] = 1 - Φ[(2000 - 1500)/400] = 1 - Φ[1.25] = 0.1056.
X + X is also Normal, but with μ = 3000 and σ = 400√2 = 565.685.
Prob[X + X > 2000] = 1 - Φ[(2000 - 3000)/565.685] = 1 - Φ[-1.77] = 0.9616.
Prob[aggregate > 2000] = Prob[N = 1] Prob[X > 2000] + Prob[N = 2] Prob[X + X > 2000] =
(0.18)(0.1056) + (0.01)(0.9616) = 2.86%.
Comment: Do not use the Normal Approximation.
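A short Python check of this calculation (my own sketch, not from the guide), conditioning on the Binomial claim count and using exact Normal probabilities via math.erf:

from math import erf, sqrt

def Phi(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

q, mu, sigma = 0.1, 1500.0, 400.0
pN = {0: (1 - q)**2, 1: 2 * q * (1 - q), 2: q**2}        # Binomial with m = 2
p1 = 1 - Phi((2000 - mu) / sigma)                        # one claim exceeds 2000
p2 = 1 - Phi((2000 - 2 * mu) / (sigma * sqrt(2)))        # sum of two claims exceeds 2000
print(pN[1] * p1 + pN[2] * p2)                           # about 0.0286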
1.16. C. 1. True. 2. False. The aggregate losses are continuous if the severity distribution is
continuous and there is no probability of zero claims. The aggregate losses are discrete if the
severity distribution is discrete. 3. True.
1.17. B. The expected number of accidents is: (.03)(100) = 3. The mean payment with no
deductible is: (30 + 125 + 125)/3 = 93.333. The mean payment with a deductible of 10 is:
(20 + 125 + 125)/3 = 90. The overall mean payment is: (1/2)(93.333) + (1/2)(90) = 91.667.
Therefore, the mean aggregate loss is: (3)(91.667) = 275.
Comment: The maximum amount paid is 125; in one case there is no deductible and a maximum
covered loss of 125, while in the other case there is a deductible of 10 and a maximum covered loss
of 135.


1.18. E. Since pA(1) = 1, SA is just a Poisson Distribution with mean 3.
Prob(SA = 0) = e^-3. Prob(SA = 1) = 3e^-3. Prob(SA = 2) = (9/2)e^-3.
For B the chance of 0 claims is e^-1, the chance of one claim is e^-1, and the chance of two claims is: e^-1/2.
Prob(SB = 0) = Prob(zero claims) = e^-1.
Prob(SB = 1) = Prob(one claim)Prob(claim amount = 1) = e^-1/2.
Prob(SB = 2) = Prob(one claim)Prob(amount = 2) + Prob(two claims)Prob(both amounts = 1) =
e^-1/2 + (e^-1/2)(1/4) = 5e^-1/8.
Prob(S ≤ 2) = Prob(SA = 0)Prob(SB ≤ 2) + Prob(SA = 1)Prob(SB ≤ 1) + Prob(SA = 2)Prob(SB = 0)
= (e^-3)(17e^-1/8) + (3e^-3)(12e^-1/8) + (9e^-3/2)(e^-1) = 89e^-4/8 = 0.2038.
Alternately, the combined process is the sum of two independent compound Poisson processes,
so it in turn is a compound Poisson process. It has claims of sizes
1 and 2. The expected number of claims of size 1 is: (3)(1) + (1)(0.5) = 3.5.
The expected number of claims of size 2 is: (3)(0) + (1)(0.5) = 0.5.
The small claims (those of size 1) and the large claims (those of size 2) form independent Poisson
Processes.
Prob(S ≤ 2) = Prob(no claims of size 1)Prob(0 or 1 claims of size 2) +
Prob(1 claim of size 1)Prob(no claims of size 2) +
Prob(2 claims of size 1)Prob(no claims of size 2) =
(e^-3.5)(e^-0.5 + 0.5e^-0.5) + (3.5e^-3.5)(e^-0.5) + (3.5² e^-3.5/2)(e^-0.5) = 11.125e^-4 = 0.2038.
Comment: One could use either the Panjer algorithm or convolutions in order to compute the
distribution of SB.
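Since the comment mentions convolutions, here is a minimal Python sketch (my own, not from the guide) that builds the compound Poisson distribution of the combined process numerically by convolving the severity with itself and reads off FS(2); the function names are my own, and the truncation at max_s is exact here because every claim size is a positive integer.

from math import exp, factorial

def poisson_pmf(lam, n):
    return exp(-lam) * lam**n / factorial(n)

def compound_poisson(lam, severity, max_s, max_n=50):
    """pmf of S = X1 + ... + XN for N ~ Poisson(lam), severity a dict value -> probability."""
    agg = [0.0] * (max_s + 1)
    conv = [1.0] + [0.0] * max_s            # 0-fold convolution: point mass at 0
    for n in range(max_n + 1):
        p_n = poisson_pmf(lam, n)
        for s in range(max_s + 1):
            agg[s] += p_n * conv[s]
        nxt = [0.0] * (max_s + 1)           # build the (n+1)-fold convolution
        for s in range(max_s + 1):
            for x, px in severity.items():
                if s + x <= max_s:
                    nxt[s + x] += conv[s] * px
        conv = nxt
    return agg

# Combined process from the second solution: sizes 1 and 2, total lambda = 3.5 + 0.5 = 4
pmf = compound_poisson(4.0, {1: 3.5 / 4, 2: 0.5 / 4}, max_s=2)
print(pmf, sum(pmf))     # FS(2) is about 0.2038, matching answer E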
1.19. D. The expected aggregate losses are: (0+20)/2 = 10. The expected dividend is:
∫[0, k] (k - x)(1/20) dx = [-(k - x)²/40] evaluated from x = 0 to x = k, = k²/40.
Setting, as stated, premiums - expected dividends = expected aggregate losses:
11.6 - k²/40 = 10. Therefore, k = 8.


1.20. B. Let A be the aggregate claims. 75% of premiums is 3. If A ≥ 3, the dividend is 0;
if A ≤ 3, the dividend is 3 - A. Thus the dividend is: 3 - Minimum(3, A).
Therefore, the expected dividend is 3 - E[A ∧ 3].
Prob(A = 0) = Prob(0 claims) = e^-1.
Prob(A = 1) = Prob(1 claim)Prob(claim size = 1) = (e^-1)(0.5).
Prob(A = 2) = Prob(1 claim)Prob(claim size = 2) + Prob(2 claims)Prob(claim sizes are both 1) =
(e^-1)(0.5) + (e^-1/2)(0.5²) = 0.625e^-1. Prob(A ≥ 3) = 1 - 2.125e^-1.
Thus E[A ∧ 3] = (0)(e^-1) + (1)(0.5e^-1) + (2)(0.625e^-1) + (3)(1 - 2.125e^-1) = 3 - 4.625e^-1.
Therefore the expected dividend is: 3 - E[A ∧ 3] = 4.625e^-1 = 1.70.
Alternately, if A = 0 the dividend is 3, if A = 1 the dividend is 2, if A = 2 the dividend is 1,
and if A ≥ 3, the dividend is zero. Therefore, the expected dividend is:
(3)(e^-1) + (2)(0.5e^-1) + (1)(0.625e^-1) + (0)(1 - 2.125e^-1) = 4.625e^-1 = 1.70.
1.21. For the aggregate losses to be 6000, either there are two claims with one of size 1000 and
one of size 5000, or there are six claims each of size 1000.
The probability is: (2)(0.8)(0.2)(5² e^-5/2!) + (0.8⁶)(5⁶ e^-5/6!) = 6.53%.
Alternately, thin the Poisson Distribution. The number of claims of size 1000 is Poisson with λ = 4.
The number of claims of size 5000 is Poisson with λ = 1. The number of small and large claims is
independent. For the aggregate losses to be 6000, either there are two claims with one of size
1000 and one of size 5000, or there are six claims each of size 1000.
The probability is: Prob[1 claim of size 1000]Prob[1 claim of size 5000] +
Prob[6 claims of size 1000]Prob[no claim of size 5000] = (4e^-4)(e^-1) + (4⁶ e^-4/6!)(e^-1) = 6.53%.
Comment: One could use either convolution or the Panjer Algorithm (recursive method), but they
would take longer in this case.
1.22. A. The sum of independent Compound Poissons is also Compound Poisson, so
Statement A is True. When the variance of the number of claims equals its mean, the Poisson
distribution is appropriate, so Statement B is False. If there is a chance of no claims, then there is an
extra point mass of probability at zero in the aggregate distribution, and the distribution of aggregate
losses is not continuous at zero, so Statement C is not True. When, as is common, the distribution of
aggregate losses is significantly skewed, the Normal Distribution is not the best approximation, so
Statement D is not True.


1.23. B. The only way that losses can exceed reimbursements is if both employees have a
claim, and the losses sum to more than 8000.
Let x be the size of Bob's claim and y be the size of Carol's claim; then x + y > 8000 ⇔
y > 8000 - x. Prob[y > 8000 - x] = {5000 - (8000 - x)}/4000 = (x - 3000)/4000, x > 3000.
Given Bob has had a claim of size > 2000, f(x) = 1/(5000-2000) = 1/3000.
Prob[losses exceed reimbursements | Bob has a claim > 2000] =
Prob[Carol has a claim] Prob[x + y > 8000] =
0.4 ∫[2000, 5000] (Prob[y > 8000 - x]/3000) dx = (4/30,000) ∫[3000, 5000] {(x - 3000)/4000} dx
= (2000²/2)/30,000,000 = 1/15.


1.24. (i) A has mean: (0.1)(50,000) = 5000, and variance: (50,000²)(0.1)(0.9) = 225 million.
B has mean: (0.2)(30,000) = 6000, and variance: (30,000²)(0.2)(0.8) = 144 million.
The total has mean: 5000 + 6000 = 11,000,
and variance: 225 million + 144 million = 369 million.
The standard deviation of the total is: √(369 million) = 19,209.
(ii) Prob[A and not B] = (0.1)(0.8) = 0.08. Prob[B and not A] = (0.2)(0.9) = 0.18.
{(0.08)(50000) + (0.18)(30000)} / (0.08 + 0.18) = 36,154.
1.25. B. After the modifications, the mean frequency is (80%)(20) = 16.
The mean payment per loss is: E[X] - E[X ∧ 100] = θ - θ(1 - e^(-100/θ)) = θe^(-100/θ) = (200)e^(-100/200) = 121.31.
After the modifications, the mean aggregate loss is: (16)(121.31) = 1941.
Alternately, given a loss, the probability of a non-zero payment given a deductible of size 100 is:
S(100) = e^(-100/200) = 0.6065.
Thus the mean frequency of non-zero payments is: (0.6065)(16) = 9.704.
Due to the memoryless property of the Exponential, the non-zero payments excess of the
deductible are also exponentially distributed with θ = 200.
Thus the mean aggregate loss is: (9.704)(200) = 1941.
1.26. C. Prob(4 < S < 8) = (1/3)Prob[4 < S < 8 | θ = 5] + (1/6)Prob[4 < S < 8 | θ = 8] =
(1/3)(e^(-4/5) - e^(-8/5)) + (1/6)(e^(-4/8) - e^(-8/8)) = 0.122.


1.27. E. E[X] = (0.4)(0) + (0.3)(1) + (0.2)(2) + (0.1)(3) = 1.
E[X ∧ 1] = (0.4)(0) + (0.3)(1) + (0.2)(1) + (0.1)(1) = 0.6.
Expected losses paid by the insurer per factory = E[X] - E[X ∧ 1] = 1 - 0.6 = 0.4.
Insurance Premium = (110%)(2 factories)(0.4 / factory) = 0.88.
Profit = 3 - (0.88 + retained costs + (0.15)(3)) = 1.67 - retained costs.
For each factory independently, the retained costs are either zero 40% of the time, or one 60% of
the time. Therefore, total retained costs are: zero 16% of the time, one 48% of the time, and two
36% of the time.
     Probability   Retained Losses   Profit   Dividend
     16%           0                 1.67     1.67
     48%           1                 0.67     0.67
     36%           2                 -0.33    0
     Average       1.20              0.47     0.589
Expected Dividend = (0.16)(1.67) + (0.48)(0.67) + (0.36)(0) = 0.589.
Alternately, expected losses paid by the insurer per factory = E[(X-1)+] =
(0.4)(0) + (0.3)(0) + (0.2)(2 - 1) + (0.1)(3 - 1) = 0.4. Proceed as before.
Alternately, the dividends are the amount by which the retained costs are less than:
revenues - insurance premiums - all other expenses = 3 - 0.88 - 0.45 = 1.67.
Expected Dividend = expected amount by which retained costs are less than 1.67 =
1.67 - E[retained costs ∧ 1.67].
     Probability   Retained Costs   Retained Costs Limited to 1.67
     16%           0                0
     48%           1                1
     36%           2                1.67
E[retained costs ∧ 1.67] = (0.16)(0) + (0.48)(1) + (0.36)(1.67) = 1.081.
Expected Dividend = 1.67 - 1.081 = 0.589.
Comment: Note that since no dividend is paid when the profit is negative, the average dividend is
not equal to the average profit.
E[Profit] = 1.67 - E[retained costs] = 1.67 - 1.20 = 0.47 ≠ 0.589.


1.28. D. For the uniform distribution on (0, b), S(d) = (b-d)/b, for d < b.
The frequency distribution of non-zero payments is Poisson with mean: λS(d) = λ(b-d)/b.
Comment: The severity distribution has been truncated and shifted from below, which would have
been uniform on [0, b - d], and then all the values were multiplied by c: uniform on [0, c(b-d)].
The mean aggregate payment per year after the modifications is:
λ{(b-d)/b}{c(b-d)/2} = λc(b-d)²/(2b).
For example, assume b = 5000, so that the severity of each loss before any coverage
modifications is uniformly distributed between 0 and 5000, d = 3000, and c = 90%.
Then 40% of the losses exceed the deductible of 3000.
Thus the modified frequency is Poisson with mean: 0.4λ, since (5000 - 3000)/5000 = 0.4.
The modified severity is uniform from 0 to (90%)(5000 - 3000) = 1800.
The mean aggregate payment per year after the modifications is: (0.4λ)(1800/2) = 360λ.

1.29. C. E[X ∧ 5] = {θ/(α-1)}{1 - (θ/(θ+5))^(α-1)} = (10/1.5){1 - (10/15)^1.5} = 3.038.
E[X] = θ/(α-1) = 10/1.5 = 6.667. Expected aggregate: (5)(6.667 - 3.038) = 18.1.
Alternately, the expected number of (nonzero) payments is: (5)S(5) = (5)(10/15)^2.5 = 1.81.
The average payment per (nonzero) payment is for the Pareto Distribution:
e(5) = (5 + θ)/(α - 1) = (5 + 10)/(2.5 - 1) = 10.
Expected aggregate loss is: (10)(1.81) = 18.1.
1.30. A. Prior to the deductible, we expect: (2500)(10%) = 250 losses.
The expected losses eliminated by the deductible is:
250/5 = 50 averaging 100 and (250)(4/5) = 200 at 200. (50)(100) + (200)(200) = 45,000.
Alternately, the losses eliminated are: (expected number of losses) E[X ∧ 200]
= (250){∫[0, 200] (1/1000) x dx + 200(4/5)} = (250){(200²/2)/1000 + 160} = (250)(180) = 45,000.


1.31. D. One can either have six claims each of size 100, or two claims with one of size 100 and
the other of size 500, in either order.
Prob[n = 6] Prob[x = 100]⁶ + Prob[n = 2] 2 Prob[x = 100] Prob[x = 500] =
(5⁶ e^-5/6!)(0.8⁶) + (5² e^-5/2!)(2)(0.8)(0.16) = (0.1462)(0.2621) + (0.0842)(0.2560) = 0.0599.
Alternately, the claims of size 100 are Poisson with λ = (0.8)(5) = 4, the claims of size 500 are
Poisson with λ = (0.16)(5) = 0.8, and the claims of size 1000 are Poisson with λ = (0.04)(5) = 0.2,
and the three processes are independent.
Prob[6 @ 100]Prob[0 @ 500]Prob[0 @ 1000] + Prob[1 @ 100]Prob[1 @ 500]Prob[0 @ 1000] =
(4⁶ e^-4/6!)(e^-0.8)(e^-0.2) + (4e^-4)(0.8e^-0.8)(e^-0.2) = 0.0599.
Comment: One could instead use the Panjer Algorithm, but that would be much longer.


Section 2, Convolutions23
Quite often one has to deal with the sum of two independent variables. One way to do so is via the
so-called convolution formula.24 You are unlikely to be tested directly on convolutions on your exam.
As will be discussed in the next section, convolutions can be useful for computing either aggregate
distributions or compound distributions.
Six-sided Dice:
If one has a variable with density f, then the convolution of f with itself, f*f, is the density of the sum of
two such independent, identically distributed variables.
Exercise: Let f be a distribution which has 1/6 chance of a 1, 2, 3, 4, 5, or 6.
One can think of this as the result of rolling a six-sided die. What is f*f at 4?
[Solution: Prob[X1 + X2 = 4] = Prob[X1 = 1] Prob[X2 = 3] + Prob[X1 = 2] Prob[X2 = 2]
+ Prob[X1 = 3] Prob[X2 = 1] = (1/6)(1/6) + (1/6)(1/6) + (1/6)(1/6) = 3/36.]
One can think of f*f as the sum of the rolls of two six-sided dice.
Then f*f*f = f*3 can be thought of as the sum of the rolls of three six-sided dice.
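As an illustration (my own sketch, not part of the guide), a few lines of Python compute f*f and f*3 for the six-sided die by direct convolution; the variable names are my own.

from itertools import product
from fractions import Fraction

die = {i: Fraction(1, 6) for i in range(1, 7)}

# f*f: sum over all ordered pairs of faces
two_dice = {}
for a, b in product(die, repeat=2):
    two_dice[a + b] = two_dice.get(a + b, 0) + die[a] * die[b]

# f*3: convolve the two-dice density with one more die
three_dice = {}
for s, p in two_dice.items():
    for c, pc in die.items():
        three_dice[s + c] = three_dice.get(s + c, 0) + p * pc

print(two_dice[4])      # 1/12 = 3/36, matching the exercise above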

Adding Variables versus Multiplying by a Constant:


Let X be the result of rolling a six-sided die:
Prob[X = 1] = Prob[X = 2] = Prob[X = 3] = Prob[X = 4] = Prob[X = 5] = Prob[X = 6] = 1/6.
Exercise: What are the mean and variance of X?
[Solution: The mean is 3.5, the second moment is: (1² + 2² + 3² + 4² + 5² + 6²)/6 = 91/6, and the
variance is: 91/6 - 3.5² = 35/12.]
Then X + X is the sum of rolling two dice.
Exercise: What is the distribution of X + X?
[Solution:
Result:  2     3     4     5     6     7     8     9     10    11    12
Prob.:   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36 ]
23 See Section 9.3 of Loss Models. Also see Section 2.3 of Actuarial Mathematics, not on the Syllabus,
or An Introduction to Probability Theory and Its Applications by William Feller.
24 Another way is to work with Moment Generating Functions or other generating functions.


In general, X + X means the sum of two independent, identically distributed variables.


In contrast 2X means twice the value of a single variable.
In this example, 2X is twice the result of rolling a single die.
Exercise: What is the distribution of 2X?
[Solution:
Result:  2    4    6    8    10   12
Prob.:   1/6  1/6  1/6  1/6  1/6  1/6 ]

We note that the distributions of 2X and X + X are different.


For example X + X can be 3, while 2X can not.
Exercise: What are the mean and variance of X + X?
[Solution: Mean = (2)(3.5) = 7. Variance = (2)(35/12) = 35/6.]
The variances of independent variables add.
Exercise: What are the mean and variance of 2X?
[Solution: Mean = (2)(3.5) = 7. Variance = (2²)(35/12) = 35/3.]
Multiplying a variable by a constant, multiplies the variance by the square of that constant.
Note that while X + X and 2X each have mean 2 E[X], they have different variances.
Var[X + X] = 2 Var[X].
Var[2X] = 4 Var[X].
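A quick numerical check of these last two statements (my own sketch, not part of the guide): enumerate the equally likely outcomes of X + X and of 2X for the die and compare their variances.

from itertools import product

die = [1, 2, 3, 4, 5, 6]

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

var_x = variance(die)                                        # 35/12
var_sum = variance([a + b for a, b in product(die, die)])    # 35/6 = 2 Var[X]
var_double = variance([2 * a for a in die])                  # 35/3 = 4 Var[X]
print(var_x, var_sum, var_double)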

Convoluting Discrete Distributions:


Let X have a discrete distribution such that f(0) = 0.7, f(1) = 0.3.
Let Y have a discrete distribution such that g(0) = 0.9, g(1) = 0.1.
If X and Y are independent, we can calculate the distribution of Z = X + Y as follows:
If X is 0 and Y is 0, then Z = 0. This has a probability of (0.7)(0.9) = 0.63.
If X is 0 and Y is 1, then Z = 1. This has a probability of (0.7)(0.1) = 0.07.
If X is 1 and Y is 0, then Z = 1. This has a probability of (0.3)(0.9) = 0.27.
If X is 1 and Y is 1, then Z = 2. This has a probability of (0.3)(0.1) = 0.03.
Thus Z has a 63% chance of being 0, a 34% chance of being 1, and a 3% chance of being 2.


One can put this solution in terms of formulas. Let z be the outcome, Z = X + Y. Let f(x) be the
density of X. Let g(y) be the density of Y. Let h(z) be the density of Z.
Then since y = z - x and x = z - y one can sum over the possible outcomes in either of two ways:
h(z) = Σx f(x)g(z-x) = Σy f(z-y)g(y).
Exercise: Use the above formulas to calculate h(1).
[Solution:
h(1) = Σ[x=0 to 1] f(x)g(1-x) = f(0)g(1) + f(1)g(0) = (0.7)(0.1) + (0.3)(0.9) = 0.34, or
h(1) = Σ[y=0 to 1] f(1-y)g(y) = f(1)g(0) + f(0)g(1) = (0.3)(0.9) + (0.7)(0.1) = 0.34.]

One could arrange this type of calculation in a spreadsheet:
     x    f(x)   1-x   g(1-x)   Product
     0    0.7    1     0.1      0.07
     1    0.3    0     0.9      0.27
     Sum                        0.34

One has to write g in the reverse order, so as to line up the appropriate entries. Then one takes
products and sums them. Let us see how this works for a more complicated case.


Exercise: Let X have a discrete distribution such that f(2) = 0.3, f(3) = 0.4, f(4) = 0.1, and f(5) = 0.2.
Let Y have a discrete distribution such that g(0) = 0.5, g(1) = 0.2, g(2) = 0, and g(3) = 0.3.
X and Y are independent. Z = X + Y. Calculate the density at 4 of Z.
[Solution: Since we want x + y = 4, we put f(3) next to g(1), etc.
     x    f(x)   4-x   g(4-x)   Product
     2    0.3    2     0        0
     3    0.4    1     0.2      0.08
     4    0.1    0     0.5      0.05
     5    0.2    -1    0        0
     Sum                        0.13
Alternately, we can list f(4-x).
     x    g(x)   4-x   f(4-x)   Product
     0    0.5    4     0.1      0.05
     1    0.2    3     0.4      0.08
     2    0      2     0.3      0
     3    0.3    1     0        0
     Sum                        0.13
Alternately, list the possible ways the two variables can add to 4:
X = 2 and Y = 2, with probability: (0.3)(0) = 0,
X = 3 and Y = 1, with probability: (0.4)(0.2) = 0.08,
X = 4 and Y = 0, with probability: (0.1)(0.5) = 0.05.
The density at 4 of X + Y is the sum of these probabilities: 0 + 0.08 + 0.05 = 0.13.
Comment: f*g = g*f.]
In a similar manner we can calculate the whole distribution of Z:
     z       2      3      4      5      6      7      8
     h(z)    0.15   0.26   0.13   0.21   0.16   0.03   0.06

Note that the probabilities sum to one: 0.15 + 0.26 + 0.13 + 0.21 + 0.16 + 0.03 + 0.06 = 1.
This is one good way to check the calculation of a convolution.
One could arrange this whole calculation in spreadsheet form as follows:


The possible sums of X and Y are:
                    Y
     X         0     1     2     3
     2         2     3     4     5
     3         3     4     5     6
     4         4     5     6     7
     5         5     6     7     8
With the corresponding probabilities:
                            Probabilities of Y
     Probabilities of X     0.5     0.2     0      0.3
     0.3                    15%     6%      0%     9%
     0.4                    20%     8%      0%     12%
     0.1                    5%      2%      0%     3%
     0.2                    10%     4%      0%     6%
Then adding up probabilities:
sum = 2: 15%.
sum = 3: 20% + 6% = 26%.
sum = 4: 5% + 8% + 0% = 13%.
sum = 5: 10% + 2% + 0% + 9% = 21%.
sum = 6: 4% + 0% + 12% = 16%.
sum = 7: 0% + 3% = 3%.
sum = 8: 6%.
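The same bookkeeping can be done in a few lines of code. This Python sketch (my own, not from the guide) convolves f and g, reproduces the distribution of Z above, and checks that the probabilities sum to one; the helper name convolve is my own.

def convolve(f, g):
    """Convolution of two discrete densities given as dicts value -> probability."""
    h = {}
    for x, px in f.items():
        for y, py in g.items():
            h[x + y] = h.get(x + y, 0.0) + px * py
    return h

f = {2: 0.3, 3: 0.4, 4: 0.1, 5: 0.2}
g = {0: 0.5, 1: 0.2, 2: 0.0, 3: 0.3}
h = convolve(f, g)
print(sorted(h.items()))     # 2: 0.15, 3: 0.26, 4: 0.13, 5: 0.21, 6: 0.16, 7: 0.03, 8: 0.06
print(sum(h.values()))       # 1.0, a good check on the calculation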

Convoluting Three or More Variables:


If one wants to add up three numbers, one can sum the first two and then add in the third number.
For example 3 + 5 + 12 = (3 + 5) + 12 = 8 + 12 = 20. Similarly, if one wants to add three variables
one can sum the first two and then add in the third variable. In terms of convolutions, one can first
convolute the first two densities and then convolute this result with the third density.
Continuing the previous example, once one has the distribution of Z = X + Y, then we could
compute the densities of X + Y + Y = Z + Y, by performing another convolution. For example,
here is how one could compute the density of X + Y + Y = Z + Y at 6:
     x    h(x)   6-x   g(6-x)   Product
     2    0.15   4     0        0
     3    0.26   3     0.3      0.078
     4    0.13   2     0        0
     5    0.21   1     0.2      0.042
     6    0.16   0     0.5      0.08
     7    0.03   -1    0        0
     8    0.06   -2    0        0
     Sum                        0.2


Notation:
We use the notation h = f * g, for the convolution of f and g.
Repeated convolutions are indicated by powers, in the same manner as is continued multiplication.
f * f = f*2 . f*f*f = f*3 .
Sum of 2 independent, identically distributed variables: f* f = f*2 .
Sum of 3 independent, identically distributed variables: f* f*f = f*3 .
Both Loss Models and Actuarial Mathematics employ the convention that f*0 (x) = 1 if x = 0, and
0 otherwise.25

f*1 (x) = f(x).

Similar notation is used for the distribution of the sum of two independent variables.
If X follows F and Y follows G, then the distribution of X + Y is H = F * G.
H(z) = Σx F(x)g(z - x) = Σy f(z - y)G(y) = Σx f(x)G(z - x) = Σy F(z - y)g(y).
Again repeated convolutions are indicated by a power: F*F*F = F*3.
Both Loss Models and Actuarial Mathematics employ the convention that F*0(x) = 0 if x < 0
and 1 if x ≥ 0; F*0 has a jump discontinuity at 0. F*1(x) = F(x).
Properties of Convolutions:
The convolution operator is commutative and associative.
f*g = g*f.
(f*g)*h = f* (g*h).
Note that the moment generating function of the convolution f* g is the product of the moment
generating functions of f and g. Mf*g = Mf Mg . This follows from the fact that the moment
generating function for a sum of independent variables is the product of the moment generating
functions of each of the variables. Thus if one takes the sum of n independent identically distributed
variables, the Moment Generating Function is taken to the power n.
Similarly, the Probability Generating Function of the convolution f* g is just of the product of the
Probability Generating Functions of f and g. Pf*g = Pf Pg .
25 This will be used when one writes aggregate or compound distributions in terms of convolutions.

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 35

This follows from the fact the Probability Generating Function for a sum of independent variables is
the product of the Probability Generating Functions of each of the variables. Thus if one takes the
sum of n independent identically distributed variables, the Probability Generating Function is taken to
the power n.
Convolution of Continuous Distributions:
The same convolution formulas apply if one is dealing with continuous rather than
discrete distributions, with integration taking the place of summation. One can get the density function
for the sum of two independent variables X + Y:
h(z) = ∫ f(x) g(z-x) dx = ∫ f(z-y) g(y) dy.
Similar formulas apply to get the Distribution Function of X + Y:
H(z) = ∫ f(x)G(z-x) dx = ∫ F(z-y)g(y) dy = ∫ F(x)g(z-x) dx = ∫ f(z-y)G(y) dy.
For example, let X have a uniform distribution on [1,4], while Y has a uniform distribution on [7,12].
If X and Y are independent and Z = X + Y, here is how one can use the convolution formulas to
compute the density of Z.
In this case f(x) = 1/3 for 1 < x < 4 and g(y) = 1/5 for 7 < y < 12, so the density of the sum is:
∫[1, 4] f(x) g(z-x) dx = (1/3) ∫[1, 4] g(z-x) dx = (1/15)Length[{7 < z-x < 12} and {1 < x < 4}].
Length[{7 < z-x < 12} and {1 < x < 4}] = Length[{z-12 < x < z-7} and {1 < x < 4}].
If z < 8, then Length[{z-12 < x < z-7} and {1 < x < 4}] = 0.
If 8 ≤ z ≤ 11, then Length[{z-12 < x < z-7} and {1 < x < 4}] = z - 8.
If 11 ≤ z ≤ 13, then Length[{z-12 < x < z-7} and {1 < x < 4}] = 3.
If 13 ≤ z ≤ 16, then Length[{z-12 < x < z-7} and {1 < x < 4}] = 16 - z.
If 16 < z, then Length[{z-12 < x < z-7} and {1 < x < 4}] = 0.
Thus the density of the sum is:
     0            z ≤ 8
     (z-8)/15     8 ≤ z ≤ 11
     3/15         11 ≤ z ≤ 13
     (16-z)/15    13 ≤ z ≤ 16
     0            16 ≤ z


A graph of this density: [figure omitted; the density rises linearly from 8 to 11, is level from 11 to 13, and declines linearly from 13 to 16, with the horizontal axis marked at 11, 13, and 16.]

For example, the density of the sum at 10 is:
∫[1, 4] f(x) g(10-x) dx = (1/3) ∫[1, 4] g(10-x) dx = (1/15)Length[{7 < 10-x < 12} and {1 < x < 4}] =
(1/15)Length[{1 < x < 3}] = (1/15)(2) = 2/15.
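The piecewise density above is easy to confirm numerically. This Python sketch (my own, not from the guide) estimates the density of X + Y by simulation and compares it to the formula at a few points, including z = 10; the function name density_formula is my own.

import random

def density_formula(z):
    if 8 <= z <= 11:
        return (z - 8) / 15
    if 11 <= z <= 13:
        return 3 / 15
    if 13 <= z <= 16:
        return (16 - z) / 15
    return 0.0

random.seed(0)
n, width = 200000, 0.1
sums = [random.uniform(1, 4) + random.uniform(7, 12) for _ in range(n)]
for z in (10, 12, 15):
    est = sum(1 for s in sums if abs(s - z) < width / 2) / (n * width)
    print(z, est, density_formula(z))   # at z = 10 both are about 2/15 = 0.1333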


Note the convolution is in this case a continuous density function. Generally the convolution will
behave better than the original distributions, so convolution serves as a smoothing operator.26
Exercise: X has density f(x) = e^-x, x > 0. Y has density g(y) = e^-y, y > 0. If X and Y are
independent, use the convolution formula to calculate the density function for their sum.
[Solution: The density of X + Y is a Gamma Distribution with α = 2 and θ = 1.
f*g = ∫[0, z] f(x) g(z-x) dx = ∫[0, z] e^-x e^-(z-x) dx = e^-z ∫[0, z] dx = ze^-z.
Note that the integral extends only over the domain of g, so that z-x > 0 or x < z.]
Exercise: X has a Gamma Distribution with α = 1 and θ = 0.1.
Y has a Gamma Distribution with α = 5 and θ = 0.1.
If X and Y are independent, use the convolution formula to calculate the density function for their sum.
[Solution: The density of X + Y is a Gamma Distribution with α = 1 + 5 = 6 and θ = 0.1.
f*g = ∫[0, z] f(z-y)g(y) dy = ∫[0, z] (10e^(-10(z-y))) (10⁵ y⁴ e^(-10y)/4!) dy
= 10⁶ e^(-10z) ∫[0, z] y⁴/4! dy = 10⁶ e^(-10z) z⁵/5!.
Note that the integral extends only over the domain of f, so that z-y > 0 or y < z.]
26 See An Introduction to Probability Theory and Its Applications by William Feller.


Thus we have shown that adding an independent Exponential to a Gamma with the same scale
parameter, θ, increases the shape parameter of the Gamma, α, by 1. In general, the sum of two
independent Gammas with the same scale parameter θ is another Gamma with the same θ and the
sum of the two alphas.
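A quick simulation check of this fact (my own sketch, not part of the guide): the sum of independent Gamma(α = 2, θ = 7) and Gamma(α = 3, θ = 7) samples should have the mean and variance of a Gamma(α = 5, θ = 7), namely αθ = 35 and αθ² = 245; the parameter values here are arbitrary choices for illustration.

import random

random.seed(0)
# random.gammavariate(alpha, beta) uses beta as the scale parameter theta
samples = [random.gammavariate(2, 7) + random.gammavariate(3, 7) for _ in range(200000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)    # close to 35 and 245, the Gamma(alpha = 5, theta = 7) moments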


Problems:
Use the following information for the next 10 questions:
Let X have density: f(0) = 0.6, f(1) = .3, and f(2) = 0.1.
Let Y have density: g(0) = 0.2, g(1) = .5, and g(2) = 0.3.
2.1 (1 point) What is f*f at 2?
A. Less than 0.15
B. At least 0.15 but less than 0.20
C. At least 0.20 but less than 0.25
D. At least 0.25 but less than 0.30
E. At least 0.30
2.2 (1 point) What is the cumulative distribution function of X + X at 2?
A. Less than 0.80
B. At least 0.80 but less than 0.85
C. At least 0.85 but less than 0.90
D. At least 0.90 but less than 0.95
E. At least 0.95
2.3 (2 points) What is f*3 = f*f*f at 3?
A. Less than 0.12
B. At least 0.12 but less than 0.14
C. At least 0.14 but less than 0.16
D. At least 0.16 but less than 0.18
E. At least 0.18
2.4 (3 points) What is f*4 = f*f*f*f at 5?
A. Less than 0.04
B. At least 0.04 but less than 0.06
C. At least 0.06 but less than 0.08
D. At least 0.08 but less than 0.10
E. At least 0.10
2.5 (1 point) What is g*g at 3?
A. Less than 0.20
B. At least 0.20 but less than 0.25
C. At least 0.25 but less than 0.30
D. At least 0.30 but less than 0.35
E. At least 0.35


2.6 (1 point) What is the cumulative distribution function of Y + Y at 1?


A. Less than 0.10
B. At least 0.10 but less than 0.15
C. At least 0.15 but less than 0.20
D. At least 0.20 but less than 0.25
E. At least 0.25
2.7 (1 point) What is g*3 = g*g*g at 4?
A. Less than 0.25
B. At least 0.25 but less than 0.30
C. At least 0.30 but less than 0.35
D. At least 0.35 but less than 0.40
E. At least 0.40
2.8 (1 point) What is g*4 = g*g*g*g at 3?
A. Less than 0.10
B. At least 0.10 but less than 0.15
C. At least 0.15 but less than 0.20
D. At least 0.20 but less than 0.25
E. At least 0.25
2.9 (1 point) What is f*g at 3?
A. Less than 0.15
B. At least 0.15 but less than 0.20
C. At least 0.20 but less than 0.25
D. At least 0.25 but less than 0.30
E. At least 0.30
2.10 (1 point) What is the cumulative distribution function of X + Y at 2?
A. Less than 0.70
B. At least 0.70 but less than 0.75
C. At least 0.75 but less than 0.80
D. At least 0.80 but less than 0.85
E. At least 0.85


2.11 (1 point) Which of the following are true?
1. If f has a Gamma Distribution with α = 3 and θ = 7,
then f*10 has a Gamma Distribution with α = 30 and θ = 70.
2. If f has a Normal Distribution with μ = 3 and σ = 7,
then f*10 has a Normal Distribution with μ = 30 and σ = 70.
3. If f has a Negative Binomial Distribution with β = 3 and r = 7,
then f*10 has a Negative Binomial Distribution with β = 30 and r = 70.
A. 1, 2   B. 1, 3   C. 2, 3   D. 1, 2, 3   E. None of A, B, C, or D.

2.12 (2 points) The severity distribution is: f(1) = 40%, f(2) = 50% and f(3) = 10%.
There are three claims. What is the chance they sum to 6?
A. Less than 0.24
B. At least 0.24 but less than 0.25
C. At least 0.25 but less than 0.26
D. At least 0.26 but less than 0.27
E. At least 0.27
2.13 (3 points) The waiting time, x, from the date of an accident to the date of its report to an
insurance company is exponential with mean 1.7 years. The waiting time, y, in years, from the
beginning of an accident year to the date of an accident is a random variable with density
f(y) = 0.9 + 0.2y, 0 y 1. Assume x and y are independent. What is the expected portion of the
total number of accidents for an accident year reported to the insurance company by one half year
after the end of the accident year?
A. Less than 0.41
B. At least 0.41 but less than 0.42
C. At least 0.42 but less than 0.43
D. At least 0.43 but less than 0.44
E. At least 0.44
2.14 (2 points) Let f(x) = 0.02(10 - x), 0 ≤ x ≤ 10. What is the density of X + X at 7?
A. Less than 0.07
B. At least 0.07 but less than 0.08
C. At least 0.08 but less than 0.09
D. At least 0.09 but less than 0.10
E. At least 0.10


Use the following information for the next two questions:


X is the sum of a random variable with a uniform distribution on [1, 6] and an independent random
variable with a uniform distribution on [3, 11].
2.15 (1 point) What is the density of X at 14?
A. Less than 0.08
B. At least 0.08 but less than 0.09
C. At least 0.09 but less than 0.10
D. At least 0.10 but less than 0.11
E. At least 0.11
2.16 (2 point) What is the Distribution Function of X at 14?
A. Less than 0.87
B. At least 0.87 but less than 0.88
C. At least 0.88 but less than 0.89
D. At least 0.89 but less than 0.90
E. At least 0.90
2.17 (2 points) X follows an Exponential Distribution with mean 7. Y follows an Exponential
Distribution with mean 17. X and Y are independent. What is the density of Z = X + Y?
A. e^(-z/12)/12
B. ze^(-z/12)/144
C. (e^(-z/17) - e^(-z/7))/10
D. (e^(-z/17) + e^(-z/7))/24
E. None of the above
2.18 (3 points) Tom, Dick, and Harry are actuaries working on the same project.
Each actuary performs his calculations with no intermediate rounding. Each result is a large number,
which the actuary rounds to the nearest integer. If without any rounding Tom's and Dick's results
would sum to Harry's, what is the probability that they do so after rounding?
A. 1/2
B. 3/5
C. 2/3
D. 3/4
E. 7/8
2.19 (2 points) Let X be the results of rolling a 4-sided die (1, 2, 3 or 4), and let Y be the result of
rolling a 6-sided die. X and Y are independent. What is the distribution of X + Y?
2.20 (2 points) The density function for X is f(1) = .2, f(3) = .5, f(4) = .3.
The density function for Y is g(0) = .1, g(2) = .7, g(3) = .2.
X and Y are independent. Z = X + Y. What is the density of Z at 4?
A. 7%
B. 10%
C. 13%
D. 16%
E. 19%


Use the following information for the next three questions:

The Durham Bulls and Toledo Mud Hens baseball teams will play a series of games against each
other.

Each game will be played either in Durham or Toledo.


Each team has a 55% chance of winning and 45% chance of losing any game at home.
The outcome of each game is independent of the outcome of any other game.
In the series, the Durham Bulls will play one more game at home than the Toledo Mud Hens.
2.21 (2 points) If the series consists of 3 games, what is probability that the Durham Bulls win the
series; in other words win more games than their opponents the Toledo Mud Hens?
2.22 (3 points) If the series consists of 5 games, what is probability that the Durham Bulls win the
series; in other words win more games than their opponents the Toledo Mud Hens?
2.23 (5 points) If the series consists of 7 games, what is probability that the Durham Bulls win the
series; in other words win more games than their opponents the Toledo Mud Hens?

2.24 (1 point) Which of the following are true?


1. If f has a Binomial Distribution with m = 2 and q = 0.07,
then f*5 has a Binomial Distribution with m = 10 and q = 0.07.
2. If f has a Pareto Distribution with = 4 and = 8,
then f*5 has a Pareto Distribution with = 40 and = 8.
3. If f has a Poisson Distribution with = 0.2, then f*5 has a Poisson with = 1.
A. 1, 2

B. 1, 3

C. 2, 3

D. 1, 2, 3

E. None of A, B, C, or D.

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 43

2.25 (4B, 5/85, Q.49) (2 points) The waiting time, x, in years, from the date of an
accident to the date of its report to an insurance company is a random variable with probability
density function (p.d.f.) f(x), 0 < x < . The waiting time, y, in years, from the beginning of an
accident year to the date of an accident is a random variable with p.d.f.
g(y), 0 < y < 1. Assuming x and y are independent, which of the following expressions represents
the expected proportion of the total number of accidents for an accident year reported to the
insurance company by the end of the accident year?
F(x), 0 < x < , and G(y), 0 < y < 1 represent respectively the distribution functions of x and y.
1

f(t) G(1-t) dt

A.

0
1

f(t) G(t) dt

B.
0

C.

f(t) g(1-t) dt
0
1

F(t) G(t) dt

D.
0

F(t) G(1-t) dt

E.
0

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 44

2.26 (5A, 11/94, Q.21) (1 point) Let S = X1 + X2 , where X1 and X2 are independent random
variables with distribution functions defined below:
X
F1 (X)
F2 (X)
0
0.3
0.6
1
0.4
0.8
2
0.6
1.0
3
0.7
4
1.0
Calculate Pr(S 2).
A . Less than 0.25
B. At least 0.25, but less than 0.35
C. At least 0.35, but less than 0.45
D. At least 0.45, but less than 0.55
E. Greater than or equal to 0.55
2.27 (5A, 11/94, Q.23) (1 point) X1 , X2 , X3 ,and X4 are independent random variables for a
Gamma distribution with the parameters = 2.2 and = 0.2.
If S = X1 + X2 + X3 + X4 , then what is the distribution function for S?
A. Gamma distribution with the parameters = 8.8 and = 0.8.
B. Gamma distribution with the parameters = 8.8 and = 0.2.
C. Gamma distribution with the parameters = 2.2 and = 0.8.
D. Gamma distribution with the parameters = 2.2 and = 0.2.
E. None of the above
2.28 (5A, 5/95, Q.19) (1 point) Assume S = X1 + X2 + ... + XN, where X1 , X2 , ... XN are
identically distributed and N, X1 , X2 , ... XN are mutually independent random variables.
Which of the following statements are true?
1. If the distribution of the Xis is continuous and the Prob(N = 0) > 0, the distribution of S will be
continuous.
2. If the distribution of the Xis is normal, then the nth convolution of the Xis is normal.
3. If the distribution of the Xis is exponential, then the nth convolution of the Xis is exponential.
A. 1

B. 2

C. 1, 2

D. 2, 3

E. 1, 2, 3

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 45

2.29 (5A, 11/95, Q.19) (1 point) Let S = X1 + X2 , where X1 and X2 are independent random
variables with the following distribution functions:
X
F1 (X)
F2 (X)
0
0.5
0.3
1
0.8
0.6
2
1
1
What is the probability that S > 2?
A. Less than 0.20
B. At least 0.20 but less than 0.40
C. At least 0.40 but less than 0.60
D. At least 0.60 but less than 0.80
E. At least 0.80
2.30 (5A, 11/97, Q.22) (1 point) The following information is given regarding three mutually
independent random variables:
x
f1 (x) f2 (x) f3 (x)
0
0.5
0.2
0.1
1
0.4
0.2
0.9
2
0.1
0.2
3
0.2
4
0.2
If S = x1 + x2 + x3 , calculate the probability that S = 5.
A. Less than 0.10
B. At least 0.10, but less than 0.15
C. At least 0.15, but less than 0.20
D. At least 0.20, but less than 0.25
E. 0.25 or more
2.31 (5A, 11/98, Q.24) (1 point) Assume that S = X1 +X2 +X3 +...+XN where X1 , X2 , X3 , ...XN are
identically distributed and N, X1 , X2 , X3 , ... XN are mutually independent random variables. Which of
the following statements is true?
1. If the distribution of the Xi's is continuous and the Pr[N=0] > 0, the distribution of S will be
continuous.
2. The nth convolution of a normal distribution with parameters and is also normal
with mean n and variance n2.
3. If the individual claim amount distribution is discrete, the distribution of S is also discrete.
A. 1 B. 2 C. 3 D. 1, 2, 3
E. None of A, B, C, or D

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 46

2.32 (5A, 5/99, Q.37) (2 points) ABC Insurance Company writes liability coverage with one
maximum covered loss of $90,000 offered to all insureds. Only two types of losses to the insurer
arise out of this coverage:
(1) total limits plus expenses: $100,000
(2) loss expenses only: $50,000
You are given the following distribution of aggregate losses that applies in years when the insurer
faces 2 claims.
x
f(x)
100,000
90.25%
150,000
9.5%
200,000
0.25%
If, next year, the insurer faces 3 claims, what is the likelihood that the aggregate losses will exceed
$150,000?
2.33 (5A, 11/99, Q.23) (1 point) X1 , X2 , X3 are mutually independent random variables with
probability functions as follows:
x
f1 (X)
f2 (X)
0
1
2
3
S = X1

0.9
0.5
0.1
0.3
0.0
0.2
0.0
0.0
+ X2 + X3 . Find fS(2).

f3 (X)
0.25
0.25
0.25
0.25

A. Less than 0.25


B. At least 0.25 but less than 0.26
C. At least 0.26 but less than 0.27
D. At least 0.27 but less than 0.28
E. At least 0.28
2.34 (Course 151 Sample Exam #1, Q.8) (1.7 points)
Aggregate claims S = X1 + X2 + X3 , where X1 , X2 and X3 are mutually independent random
variables with probability functions as follows:
x
f1 (x)
f2 (x)
f3 (x)
0
0.6
1
0.4
2
0.0
3
0.0
4
0.0
You are given FS(4) = 0.6.
Determine p.
(A) 0.0
(B) 0.1

p
0.3
0.0
0.0
0.7-p

0.0
0.5
0.5
0.0
0.0

(C) 0.2

(D) 0.3

(E) 0.4

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 47

2.35 (Course 151 Sample Exam #3, Q.11) (1.7 points)


S = X1 + X2 + X3 where X1 , X2 and X3 are independent random variables distributed as follows:
x

f1 (X)

f2 (X)

0
0.2
0
1
0.3
0
2
0.5
p
3
0.0
1-p
4
0.0
0
You are given FS(4) = 0.43.
Determine p.
(A) 0.1
(B) 0.2

(C) 0.3

f3 (X)
0.5
0.5
0.0
0.0
0.0

(D) 0.4

(E) 0.5

2.36 (1, 11/01, Q.37) (1.9 points) A device containing two key components fails when, and only
when, both components fail. The lifetimes, T1 and T2 , of these components are independent with
common density function f(t) = e-t, t > 0.
The cost, X, of operating the device until failure is 2T1 + T2 .
Which of the following is the density function of X for x > 0?
(A) e-x/2 - e-x

(B) 2(e-x/2 - e-x)

(C) x2 e-x/2

(D) e-x/2/2

(E) e-x/3/3

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 48

Solutions to Problems:
2.1. C.
x
0
1
2

f(x)
0.6
0.3
0.1

f(2-x)
0.1
0.3
0.6

Product
0.06
0.09
0.06

Sum

0.21

Comment: f*f(0) = f(0) f(0) = (0.6)(0.6) = 0.36.


f*f(1) = f(0) f(1) + f(1) f(0) = (0.6)(0.3) + (0.3)(0.6) = 0.36.
f*f(3) = f(1) f(2) + f(2) f(1) = (0.1)(0.3) + (0.3)(0.1) = 0.06. f*f(4) = f(2)f(2) = (0.1)(0.1) = 0.01.
2.2. D.
x
0
1
2

F(x)
0.6
0.9
1

Sum

f(2-x)
0.1
0.3
0.6

Product
0.06
0.27
0.6

0.93

Comment: Alternately, f*f(0) + f*f(1) + f*f(2) = 0.36 + 0.36 + 0.21 = 0.93.


2.3. B. One can use the fact that f*f*f = (f*f)*f.
x
-1
0
1
2
3
4

f*f(x)

f(3-x)

Product

0.36
0.36
0.21
0.06
0.01

0.1
0.3
0.6

0.036
0.063
0.036

Sum

0.135

2.4. A. One can use the fact that f*f = (f*f)*(f*f).


x
0
1
2
3
4
5
Sum

f*f(x)
0.36
0.36
0.21
0.06
0.01

f*f(5-x)
0.01
0.06
0.21
0.36
0.36

Product
0
0.0036
0.0126
0.0126
0.0036
0.0324

2013-4-3,

Aggregate Distributions 2 Convolutions,

2.5. D.
x
0
1
2
3

g(x)
0.2
0.5
0.3

g(3-x)
0.3
0.5
0.2

Sum

Product
0
0.15
0.15
0
0.3

2.6. D.
x
0
1
2

G(x)
0.2
0.7
1

g(1-x)
0.5
0.2

Sum

Product
0.1
0.14
0
0.24

2.7. B.
x
0
1
2
3
4

g*g(x)
0.04
0.2
0.37
0.3
0.09

g(4-x)

0.3
0.5
0.2

Sum

Product
0
0
0.111
0.15
0.018
0.279

2.8. C. One can use the fact that g*g = (g*g)*(g*g).


x
0
1
2
3
4

g*g(x)
0.04
0.2
0.37
0.3
0.09

g*g(3-x)
0.3
0.37
0.2
0.04

Sum

Product
0.012
0.074
0.074
0.012
0
0.172

2.9. A.
x
0
1
2
Sum

f(x)
0.6
0.3
0.1

g(3-x)
0.3
0.5

Product
0
0.09
0.05
0.14

HCM 10/23/12,

Page 49

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 50

2.10. D. One can calculate the answer in either of two ways.


x
0
1
2

F(x)
0.6
0.9
1

g(2-x)
0.3
0.5
0.2

Product
0.18
0.45
0.2

Sum

0.83

x
0
1
2

G(x)
0.2
0.7
1

f(2-x)
0.1
0.3
0.6

Product
0.02
0.21
0.6

Sum

0.83

2.11. E. 1. False, the sum of 10 independent Gammas is a Gamma with parameters 10 and .
2. False. The variances add, so that the new variance is 102 and the new standard deviation is
10 , not 10. The sum of 10 independent Normals is a Normal with parameters 10 and 10 .
3. False. The sum of 10 independent Negative Binomials is a Negative Binomial with parameters
and 10r.
2.12. B. First one can compute f*f. f*f(2) = 0.16. f*f(3) = 0.40.
f*f(4) = (0.4)(0.1) + (0.5)(0.5) + (0.1)(0.4) = 0.33. f*f(5) = 0.10. f*f(6) = 0.01.
Then use the fact that f*f*f = (f*f)*f.
x
2
3
4
5
6

f*f(x)
0.16
0.4
0.33
0.1
0.01

f(6-x)

Sum

Product
0
0.04
0.165
0.04
0

0.1
0.5
0.4

0.245

Comment: The mean of f is 0.4 + 1 + 0.3 = 1.7. If one computes f*f*f and computes the mean, one
gets 5.1 = (3) (1.7) The mean of the sum of 3 claims is three times the mean of a single claim.
x
3
4
5
6
7
8
9

f*f*f(x)
0.064
0.24
0.348
0.245
0.087
0.015
0.001

Product
0.192
0.96
1.74
1.47
0.609
0.12
0.009

Sum

5.1

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 51

2.13. D. For date of accident 0 < y < 1, the expected portion of accidents reported by time is:
1.5 is: 1 - e-(1.5-y)/1.7. We can integrate over the dates of accident: H(1.5) =
1

G(1.5 - y) f(y) dy = 0 (1 - e - (1.5 - y) / 1.7 ) (0.9 + 0.2y) dy =


1

0 {0.9 - 0.9e - (1.5 - y) / 1.7 + 0.2y - 0.2y e - (1.5 - y) / 1.7} dy


- (1.5 - y) / 1.7

= {0.9y - 1.53e

0.1y2

- 0.34y

e - (1.5 - y) / 1.7

y =1
(1.5
y)
/
1.7
0.0578e
}

y =0

0.9 - 0.5070 + 0.1 - 0.2534 + 0.1915 = 0.4311.


Comment: One could instead use: H(1.5) =

g(x) F(1.5-x) dx.

2.14. E. Now f(7-x) > 0 for 0 7-x < 10, which implies -3 < x 7.
In addition f(x) > 0 when 0 x < 10. Thus f(x)f(7-x) > 0 when 0 x 7.
7

f*f(7) = (0.02)(10 - x) (0.02){10 - (7 - x)} dx = 0.0004

0 30 + 7x - x2 dx =

(0.0004) {210 + (7/2)(49) - (343/3)} = 0.1069.


2.15. A. Let f(y) = 1/5 for 1 < y < 6 and g(z) = 1/8 for 3 < z < 11, then density of the sum is:
6

h(14) =

f(y) g(14 - y) dy = 3 (1/ 5) (1/ 8) dy = 3/40 = 0.075.

Comment: The integrand is zero unless 1 y 6 and 3 14-y 11.


Therefore, we only integrate from y = 3 to y = 6.

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 52

2.16. C. Let F(y) = (y-1)/5 for 1 y 6, F(y) = 1 for y > 6, and g(z) = 1/8 for 3 z 11, then the
distribution of the sum is:
8

11

F(14 - z) g(z) dz z = 3 (1/ 8) dz + 8 (13 - z)/ 40 dz = 5/8 + 21/80 = 0.8875.

H(14) =

Alternately, X has a density function:


(x - 4)/40
5/40
(17-x)/40

4x9
9 x 12
12 x 17.
9

Thus F(14) =

14

4 (x - 4)/ 40 dx + 1/8 dx + 12 (17 - x) / 40 dx = 0.3125 + 0.375 + 0.200 = 0.8875.

Comment: In general, if X is the sum of two uniform variables on (a, b) and (c, d), with
d - c b - a, then X has a density:
(x - (a+b))/{(b-a)(d-c)}, a + c x b + c
(b-a)/{(b-a)(d-c)}, b + c x a + d
(c+d-x)/{(b-a)(d-c)}, a + d x b + d.
2.17. C. f(x) = e-x/7 /7. g(y) = e-y/17 /17. Using the convolution formula:
z

h(z) =

f(t) g(z - t) dt = 0 (e- t / 7

/ 7) (e- (z - t) / 17 / 17) dt = (e-z/17 /119)

0 e- 0.0840336t dt =

(e-z/17 /119) (e-0.0840336z - 1) / (-0.0840336) = ( e-z/17 - e- z / 7) /10.


Alternately, the Moment Generating Functions are: 1/(1-7t) and 1/(1-17t).
Their product is: 1/{(1-7t)(1-17t)} = (17/10)/(1-17t) - (7/10)/(1-7t). This is:
(17/10)(m.g.f. of an Exponential with mean 17) - (7/10)(m.g.f. of an Exponential with mean 7) =
m.g.f of [(17/10 times an Expon. with mean 17)- (7/10 times an Exponential with mean 7)].
Thus Z is (17/10 times an Expon. with mean 17) - (7/10 times an Exponential with mean 7). Density
of Z is: (17/10)(e-y/17 /17) - (7/110)( e-z/7 /7) = ( e-z/17 - e-z/7) /10.
Comment: In general, the sum of two independent Exponentials with different means 1 and 2 , has
a density of: {exp(-z/1 ) - exp(-z/2 )}/(1 - 2 ). If 1 = 2 , then one would instead get a Gamma
Distribution with parameters = 2 and = 1 = 2 . In this case with differing means, the density is
closely approximated by a Gamma Distribution with =2 and = (7+17)/2 =12, but it is not a
Gamma Distribution. The sum of n independent Exponentials with different means i, has
density: in-2 exp(-z/i)/ (i - j), n2, i j.
i

ij

See Example 2.3.3 in Actuarial Mathematics, not on the Syllabus.

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 53

2.18. D. Let x = Toms unrounded result - Toms rounded result.


Then x is uniformly distributed from -0.5 to 0.5.
Let y = Dicks unrounded result - Dicks rounded result.
Then y is uniformly distributed from -0.5 to 0.5.
Then z = x + y has a triangle distribution:
f(z) = 1 + z for -1 z 0, and f(z) = 1 - z for 1 z 0.
F(z) = (1 + z)2 /2 for -1 z 0, and F(z) = 1 - (1 - z)2 /2 for 1 z 0.
The sum of the rounded results equals the rounding of the sums provided z is between -.5 and +.5.
The probability that -0.5 < z < 0.5 is: F(0.5) - F(-0.5) = 7/8 - 1/8 = 3/4.
Alternately, let t = decimal portion of Toms unrounded result and d = decimal portion of Dicks
unrounded result. Divide into cases:
t < 1/2, d < 1/2, and 0 t + d < 1/2; OK; Prob = 1/8.
t < 1/2, d < 1/2, and 1 > t + d 1/2; not OK; Prob = 1/8.
t < 1/2, d 1/2; OK; Prob = 1/4.
t 1/2, d < 1/2; OK; Prob = 1/4.
t 1/2, d 1/2, and 1 t + d < 3/2; not OK; Prob = 1/8.
t 1/2, d 1/2, and 2 t + d 3/2; OK; Prob = 1/8.
Total probability where OK is: 1/8 + 1/4 + 1/4 + 1/8 = 3/4.
2.19. The possible results range from 2 to 10. There are (4)(6) = 24 equally likely results.
For example, a 4 can result from 1, 3; 2, 2; or 3, 1. Thus the chance of a 4 is: 3/24.
A 7 can result from 1, 6; 2; 5; 3, 4; or 4, 3. Thus the chance of a 7 is 4/24.
The probability density function is:
Result
Number of Chances
Probability

2
1
0.042

3
2
0.083

4
3
0.125

5
4
0.167

6
4
0.167

7
4
0.167

8
3
0.125

9
2
0.083

10
1
0.042

2013-4-3,

Aggregate Distributions 2 Convolutions,

2.20. A. h(4) =

HCM 10/23/12,

Page 54

f(x) g(4 - x) = f(1) g(3) + f(3) g(1) + f(4) g(0)


x

= (0.2)(0.2) + (0.5)(0) + (0.3)(0.1) = 0.07.


Or h(4) =

f(4 - y) g(y) = f(4)g(0) + f(2)g(2) + f(1)g(3) = (0.3)(0.1) + (0)(0.7) + (0.2)(0.2) = 0.07.


y

One can arrange this calculation in a spreadsheet:


x

f(x)

g(4-x)

f(x)g(4-x)

1
2
3
4

0.2
0
0.5
0.3

0.2
0.7
0
0.1

0.04
0
0
0.03
0.07

Comment: Similarly, h(5) = f(x)g(5-x) = 0.35.


x

f(x)

1
2
3
4

0.2
0
0.5
0.3

g(5-x)

f(x)g(5-x)

0.2
0.7
0
0.1

0
0
0.35
0
0
0.35

h(6) = f(x)g(6-x) = (0.5)(0.2) + (0.3)(0.7) = 0.31.


The entire convolution of f and g is shown below:
z
h(z)

1
0.02

2
0

3
0.19

4
0.07

5
0.35

6
0.31

7
0.06

Note that being a distribution, h = f*g sums to 1.


2.21. The number of home games the Durham Bulls win is Binomial with m = 2 and q = 0.55:
f(0) = 0.452 = 0.2025. f(1) = (2)(0.55)(0.45) = 0.4950. f(2) = 0.552 = 0.3025.
The number of road games the Durham Bulls win is Binomial with m = 1 and q = 0.45.
Prob[3 wins] = Prob[2 home wins]Prob[1 road win] = (0.3025)(0.45) = 0.1361.
Prob[2 wins] = Prob[2 home wins]Prob[0 road win] + Prob[1 home wins]Prob[1 road win] =
(0.3025)(0.55) + (0.4950)(0.45) = 0.3891.
Prob[at least 2 wins] = 0.1361 + 0.3891 = 52.52%.
Comment: The total number of games they win is the convolution of the two Binomials.
In a three game championship series, if one team won the first 2 games, then the final game
would not be played. However, this does not affect the answer to the question.

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 55

2.22. The number of home games the Durham Bulls win is Binomial with m = 3 and q = .55:
f(0) = 0.453 = 0.0911. f(1) = (3)(0.55)(0.452 ) = 0.3341.
f(2) = (3)(0.552 )(0.45) = 0.4084. f(3) = 0.553 = 0.1664.
The number of road games the Durham Bulls win is Binomial with m = 2 and q = 0.45:
g(0) = 0.552 = 0.3025. g(1) = (2)(0.45)(0.55) = 0.4950. g(2) = 0.452 = 0.2025.
Prob[5 wins] = Prob[3 home wins]Prob[2 road wins] = (0.1664)(0.2025) = 0.0337.
Prob[4 wins] = Prob[3 home wins]Prob[1 road win] + Prob[2 home wins]Prob[2 road wins] =
(0.1664)(0.4950) + (0.4084)(0.2025) = 0.1651.
Prob[3 wins] = Prob[3 home wins]Prob[0 road win] + Prob[2 home wins]Prob[1 road win]
+ Prob[1 home win]Prob[2 road wins] = (0.1664)(0.2025) + (0.4084)(0.4950) + (0.3341)(0.2025)
= 0.3201. Prob[at least 3 wins] = 0.0337 + 0.1651 + 0.3201 = 51.89%.
Comment: The longer the series, the less advantage the Durham Bulls get from the extra home
game.
2.23. The number of home games the Durham Bulls win is Binomial with m = 4 and q = .55:
f(0) = 0.454 = 0.0410. f(1) = (4)(0.55)(0.453 ) = 0.2005. f(2) = (6)(0.552 )0(.452 ) = 0.3675.
f(3) = (4)(.553 )(0.45) = 0.2995. f(4) = 0.554 = 0.0915.
The number of road games the Durham Bulls win is Binomial with m = 3 and q = 0.45:
g(0) = 0.553 = 0.1664. g(1) = (3)(0.552 )(0.45) = 0.4084.
g(2) = (3)(0.55)(0.452 ) = 0.3341. g(3) = 0.453 = 0.0911.
Prob[7 wins] = Prob[4 home wins]Prob[3 road wins] = (0.0915)(0.0911) = 0.0083.
Prob[6 wins] = Prob[4 home wins]Prob[2 road wins] + Prob[3 home wins]Prob[3 road wins] =
(.0915)(0.3341) + (.2995)(0.0911) = 0.0579.
Prob[5 wins] = Prob[4 home wins]Prob[1 road win] + Prob[3 home wins]Prob[2 road wins]
+ Prob[2 home wins]Prob[3 road wins]
= (0.0915)(0.4084) + (0.2995)(0.3341) + (.03675)(0.0911) = 0.1709.
Prob[4 wins] = Prob[4 home wins]Prob[0 road win] + Prob[3 home wins]Prob[1 road win]
+ Prob[2 home wins]Prob[2 road wins] + Prob[1 home wins]Prob[3 road wins]
= (0.0915)(0.1664) + (0.2995)(0.4084)) + (0.3675)(0.3341) + (0.2005)(0.0911) = 0.2786.
Prob[at least 4 wins] = 0.0083 + 0.0579 + 0.1709 + 0.2786 = 51.57%.
Comment: The probabilities for the number of games won by the Durham Bulls are:
0.68%, 5.01%, 15.67%, 27.06%, 27.86% , 17.09%, 5.79%, 0.83%.

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 56

2.24. B. The sum of 5 independent, identically distributed Binomials is another Binomial with q the
same and m multiplied by 5. Statement #1 is true.
The sum of independent, identically distributed Paretos is not another Pareto.
The sum of 5 independent, identically distributed Poissons is another Poisson with multiplied by
5. Statement #3 is true.
2.25. A. We can add up over all possible reporting delays the chance that for a given reporting
delay the accident date is such that the accident will be reported by the end of the accident year. For
a given reporting delay x, the accident will be reported by the end of the accident year (time = 1) if
and only if the accident date is 1 - x. This only holds for x1; if the reporting delay is greater than 1,
then the accident can not be reported before the end of the accident year, regardless of the accident
date. The chance that the accident date is 1-x is: G(1-x). So the chance that we have an reporting
delay of x and that the accident is reported by the end of the accident year is the product: f(x)G(1-x),
since X and Y are given to be independent. Integrating over all reporting delays less than or equal
to 1, we get the chance that the accident is reported by the end of the accident year:
1

0 f(x) G(1- x) dx .
Comment: More generally, if x is the time from the accident date to the reporting date and y is the
time from the beginning of the accident year to the accident date, then the time from the beginning of
the accident year to the reporting date is x + y. An accident is reported by time z from the beginning
of the accident year is x + y z. So the distribution of Z is the distribution of the sum of X and Y. If X
and Y are independent, then the probability density function of their sum is given by the convolution
formula:

f(x) g(z - x) dx = f(z - y) g(y) dy . The distribution function is given by:


z

z=

f(x) g(z - x) dz dx = f(x) G(z - x) dx .

x=- z=-
-
In this case we are asked for the chance that the time between the beginning of the accident year
and the date of reporting is less than 1 (year) so we want the chance that z = x+y 1. Since in this
case 0 < x < , the integral only goes at most from 0 to infinity. Also since in this case 0 < y < 1, we
have 0 < (z -x) < 1 so that (z-1) < x < z. Thus the integral goes at most from z-1 to z. Thus the
chance that the accident is reported by z from the beginning of the accident year is for z > 0:
z

f(x) G(1- x) dx .

max[0, z-1]
Thus the chance that the accident is reported by the end of the accident year (z=1) is:

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 57

0 f(x) G(1- x) dx .
An important actuarial idea, very unlikely to be asked about on this exam.
Alternately to the intended solution, we can add up over all possible accident dates the chance that
for a given accident date the reporting delay is less than or equal to the time until the end of the
accident year. For a given accident date y, the time until the end of the accident year is 1-y. The
chance that the reporting delay is 1-y is: F(1-y). So the chance that we have an accident date of y
and that it is reported by the end of the accident year is the product
g(y)F(1-y), since X and Y are given to be independent. Integrating over all possible accident dates
(0<y<1) we get an alternate form of the chance that the accident is reported by the end of the
accident year:
1

0 g(y) F(1- y) dy .
Convolutions can generally be computed in either of these two alternate forms.
2.26. D. S > 2 when: X2 = 0 and X1 > 2, X2 = 1 and X1 > 1, X2 = 2 and X1 > 0.
This has probability: (0.6)(1 - 0.6) + (0.2)(1 - 0.4) + (0.2)(1 - 0.3) = 0.50. Pr(S 2) = 1 - 0.5 = 0.5.
Alternately, FS(2) = f1 (x)F2 (2-x) = (0.3)(1) + (0.1)(0.8) + (0.2)(0.6) = 0.5.
FS(2) = f1 (2-x)F2 (x) = (0.2)(0.6) + (0.1)(0.8) + (0.3)(0.1) = 0.5.
FS(2) = F1 (x)f2 (2-x) = (0.3)(0.2) + (0.4)(0.2) + (0.6)(0.6) + (0.7)(0) + (1)(0) = 0.5.
FS(2) = F1 (2-x)f2 (x) = (0.6)(0.6) + (0.4)(0.2) + (0.3)(0.2) = 0.5.
2.27. B. The sum of 4 independent, identical Gamma Distributions is another Gamma Distribution
with the same parameter and 4 times the parameter, in this case: = 8.8 and = 0.2.
2.28. B. 1. False. There will be a point mass of probability at zero. 2. True.
3. False. The nth convolution of an Exponential is a Gamma with shape parameter = n.
2.29. B. S > 2 if X1 = 1 and X2 2, or X1 = 2 and X2 1.
This has probability: (0.3)(0.4) + (0.2)(0.7) = 0.26.
Comment: Ive used the formula: (F*G)(z) = f(x) G(z-x).

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 58

2.30. C. The ways in which S can be 5 are: (0, 4, 1), (1, 4, 0), (1, 3, 1), (2, 3, 0), (2, 2, 1) with
probability:
(0.5)(0.2)(0.9) + (0.4)(0.2)(0.1) + (0.4)(0.2)(0.9) + (0.1)(0.2)(0.1) + (0.1)(0.2)(0.9) = 0.19.
Alternately, f1 (x)*f3 (x) = .05 @ 0, .49 @ 1, .37 @ 2, and .09 @ 3.
f1 *f3 *f2 (5) = (0.49)(0.2) + (0.37)(0.2) + (0.09)(0.2) = 0.19.
2.31. E. If there is a chance of no claims, then there is an extra point mass of probability at zero in
the aggregate distribution, and the distribution of aggregate losses is not continuous at zero, so
Statement #1 is False. The sum of n independent Normal Distributions is also Normal, with n times
the mean and n times the variance, so statement #2 is true.
Statement #3 is True.
2.32. Based on the aggregate distribution with two losses, there is a

90.25% = 0.95 chance of a

$50,000 loss and a 0.25% = 0.05 chance of a $100,000 loss. The aggregate distribution with
three claims is that for two claims convoluted with that for one claim; it has density at $150,000 of
(0.95)(0.9025) = 0.857375. With 3 claims the aggregate distribution is $150,000, so the chance
of exceeding $150,000 is: 1 - 0.857375 = 14.2625%.
2.33. A. f1 *f2 is: (0.9)(0.5) = 0.45@0, (0.9)(0.3) + (0.1)(0.5) = 0.32 @1,
(0.9)(0.2) + (0.1)(0.3) = 0.21@2, and (0.1)(0.2) = 0.02 @ 3.
fS = f1 *f2 *f3 is: (0.45)(0.25) = 0.1125 @ 0, (0.45)(0.25) + (0.32)(0.25) = 0.1925 @ 1,
(0.45)(0.25) + (0.32)(0.25) + (0.21)(0.25) = 0.245 @ 2,
(0.45)(0.25) + (0.32)(0.25) + (0.21)(0.25) + (0.02)(0.25) = 0.25 @ 3,
(0.32)(0.25) + (0.21)(0.25) + (0.02)(0.25) = 0.1375 @ 4,
(0.21)(0.25) + (0.02)(0.25) = 0.0575 @ 5, and (0.02)(0.25) = 0.005 @ 6.
2.34. D. We are given that the chance that S is greater than 4 is: 1- 0.6 = 0.4. Since the sum of
severities one and three is either 1, 2 or 3, and since the second severity is either 0, 1, or 4, S is
greater than 4 if and only if the second severity is 4. Thus 0.7 - p = 0.4. p = 0.3.
Alternately, we can compute the distribution function of S, FS, via convolution. First convolute F1 and
f3 . F1 * f3 is: (.6)(.5) = .3 @ 1, (.6)(.5) + (1)(.5) = .8 @ 2, and (1)(.5) = (1)(.5) = 1 @ 3. Next
convolute by f2 . FS = F1 * f3 * f2 is: .3p @ 1, .8p + (.3)(.3) @ 2, p + (.3)(.8) + (0)(.3) @ 3,
p + (.3)(1) +(0)(.8) + 0(.3) @ 4, p + .3 + 0 + 0 + (.7-p)(.3) @ 5, p + .3 + (.7-p)(.8) @ 6,
and 1 @ 7. We are given FS(4) = 0.6, so that 0.6 = p + (.3)(1) +(0)(.8) + 0(.3). Therefore p = .3.
Comment: In order to compute FS, one can do the convolutions in any order.
I did it in the order I found easiest.
In general, FX+Y(z) =

FX(x) fY(z - x) = FY(y) fX(z - y) = FX(z - y) fY(y) = FY(z - x) fX(x) .


x

2013-4-3,

Aggregate Distributions 2 Convolutions,

HCM 10/23/12,

Page 59

2.35. B. f1 *f3 is: (0.2)(0.5) = .1@0, (0.2)(0.5) + (0.3)(0.5) = 0.25 @1,


(0.3)(0.5) + (0.5)(0.5) = 0.4@2, and (0.5)(0.5) = 0.25 @ 3.
fS = f2 *f1 *f3 is: 0.1p @2, 0.1(1 - p) + 0.25p @3, and (0.25)(1-p) + (0.4)(p) @4.
Since S 2, FS(4) = fS(2) + fS(3) +fS(4) =
0.1p + 0.1(1 - p) + 0.25p + (0.25)(1-p) + (0.4)(p) = 0.35 + 0.4p.
Setting FS(4) = 0.43: 0.35 + 0.4p = 0.43. p = 0.2.
Comment: Although it is not needed to solve the problem: f2 *f1 *f3 is:
0.25p + 0.4(1-p) @5, and 0.25(1-p) @6. One can verify that the density of S sums to one.
2.36. A. T1 is Exponential with mean 1. When we multiply by 2, we get another Exponential with
mean 2. Let 2T1 = U. Then U is Exponential with = 2.
Density of U: e-u/2/2. X = U + V, where V = T2 . Density of V is: e-v = e-(x-u).
x

Density of x = (e- u / 2 / 2)

e- (x - u)

du = e-x eu / 2 / 2 du = (e-x)(ex/2 - 1) = e- x / 2 - e- x, x > 0.

Comment: The sum of an Exponential with = 2 and an Exponential with = 1, is not a Gamma
Distribution.

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 60

Section 3, Using Convolutions


Convolutions can be useful for computing either aggregate distributions or compound distributions.27
Aggregate Distributions:
Exercise: Frequency is given by a Poisson with mean 7. Severity is given by an Exponential with
mean 1000. Frequency and Severity are independent.
Write the Distribution Function for the aggregate losses.
[Solution: The chance of n claims is e-7 7n / n!.
If one has n claims, then the Distribution of Aggregate Losses is the sum of n independent
Exponentials or a Gamma with parameter = n and = 1000.
Let FA(x) be the Distribution of Aggregate Losses.
FA(x) = (Probability of n claims)(Aggregate Distribution given n claims) =
(e-7 7n / n!) (n; x/1000).]
We note that each Gamma Distribution was the nth convolution of the Exponential.
Each term of the sum is the density of the frequency distribution at n times the nth convolution of the
severity distribution.
More generally, if frequency is FN, if severity is FX, frequency and severity are independent, and
aggregate losses are FA then:

FA (x) =

fN (n) FX*n (x).

n = 0

fA (x) =

fN (n) fX*n (x).

Recalling that f*0 (0) 1.

n = 0

If one has discrete severity distributions, one can employ these formulas to directly calculate the
distribution of aggregate losses.28
27

The same mathematics applies to aggregate distributions (independent frequency and severity) and compound
distributions.
28
If the severity is continuous, as will be discussed in a subsequent section, then one could approximate it by a
discrete distribution.

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 61

An Example with a Discrete Severity:


Exercise: Let a discrete severity distribution be: f(10) = 0.4, f(20) = 0.5, f(30) = 0.1. What is f*f?
[Solution: List the possible ways the two variables can add to 20:
10 and 10, with probability: (0.4)(0.4) = 0.16. f*f(20) = 0.16.
List the possible ways the two variables can add to 30:
10 and 20, with probability: (0.4)(0.5) = 0.20, or 20 and 10, with probability: (0.5)(0.4) = 0.20.
f*f(30) = 0.20 + 0.20 = 0.40. List the possible ways the two variables can add to 40:
10 and 30, with probability: (0.4)(0.1) = 0.04, or 20 and 20, with probability: (0.5)(0.5) = 0.25,
or 30 and 10, with probability: (0.1)(0.4) = 0.04. f*f(40) = 0.04 + 0.25 + 0.04 = 0.33.
List the possible ways the two variables can add to 50:
20 and 30, with probability: (0.5)(0.1) = 0.05, or 30 and 20, with probability: (0.1)(0.5) = 0.05.
f*f(50) = 0.05 + 0.05 = 0.10. List the possible ways the two variables can add to 60:
30 and 30, with probability: (0.1)(0.1) = 0.01. f*f(60) = 0.01.]
Exercise: For the distribution in the previous exercise, what is f*f*f?
[Solution: One can use the fact that f*f*f = (f*f)*f. f*f*f(30) = 0.064. f*f*f(40) = 0.240.
f*f*f(50) = 0.348. f*f*f(60) = 0.245. f*f*f(70) = 0.087. f*f*f(80) = 0.015. f*f*f(90) = 0.001.
x
10
20
30
40
50
60

f*f(x)

Product

f(50-x)

0.16
0.4
0.33
0.1
0.01

0.016
0.2
0.132
0
0

0.1
0.5
0.4

Sum

0.348

x
10
20
30
40
50
60

f*f(x)

Product

f(60-x)

0.16
0.4
0.33
0.1
0.01

0
0.04
0.165
0.04
0

0.1
0.5
0.4

Sum

0.245

x
10
20
30
40
50
60

f*f(x)

Product

f(70-x)

0.16
0.4
0.33
0.1
0.01

0
0
0.033
0.05
0.004

0.1
0.5
0.4

Sum

0.087

50-x
40
30
20
10
0

60-x
50
40
30
20
10
0
70-x
60
50
40
30
20
10

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 62

Exercise: Let frequency be Binomial with parameters m = 3 and q = 0.2.


Let the severity have a discrete distribution such that f(10) = 0.4, f(20) = 0.5, f(30) = 0.1.
Calculate the distribution of aggregate losses, using the convolutions calculated in the previous
exercises.
[Solution: Recall that f*0 (0) 1.
n
Binomial

0
0.512

1
0.384

2
0.096

3
0.008

f*0

f*f

f*f*f

0
10
20
30
40
50
60
70
80
90

Sum

0.4
0.5
0.1

0.16
0.40
0.33
0.10
0.01

Aggregate
Density

0.064
0.240
0.348
0.245
0.087
0.015
0.001

0.512000
0.153600
0.207360
0.077312
0.033600
0.012384
0.002920
0.000696
0.000120
0.000008

The aggregate density at 30 is: (0.384)(0.1) + (0.096)(0.40) + (0.008)(0.064) = 0.077312.


Comment: Since the Binomial Distribution and severity distribution have finite support, so does the
aggregate distribution. In this case the aggregate losses can only take on the values 0 through 90.]
In general, when the frequency distribution is Binomial, there are only a finite number of terms in the
sum used to get the aggregate density via convolutions:
m

fA(x) =

m!
qn (1- q)m - n fX* n (x) .
n! (mn)!

n=0

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 63

Density of a Compound Distribution in terms of Convolutions:


For example, assume the number of taxicabs that arrive per minute at the Heartbreak Hotel is
Poisson with mean 1.3. In addition, assume that the number of passengers dropped off at the hotel
by each taxicab is Binomial with q = 0.4 and m = 5. The number of passengers dropped off by
each taxicab is independent of the number of taxicabs that arrive and is independent of the number
of passengers dropped off by any other taxicab. Then the aggregate number of passengers
dropped off per minute at the Heartbreak Hotel is a compound Poisson-Binomial distribution, with
parameters = 1.3, q = 0.4, m = 5.
Let the primary distribution be p and the secondary distribution be s and let c be the compound
distribution. Then we can write the density of c, in terms of a weighted average convolutions of s.
For example, assume we have 4 taxis. Then the distribution of the number of people is given by
the sum of 4 independent variables each distributed as per the secondary distribution, s.
This sum is distributed as the four-fold convolution of s: s* s * s* s = s* 4 .
The chance of having four taxis is the density of the primary distribution at 4: p(4).
Thus this possibility contributes p(4)s* 4 to the compound distribution c.
The possibility of n taxis contributes p(n)s* n to the compound distribution c.
Therefore, the compound distribution is the sum of such terms:29

c(x) =

p(n)s*n (x)
n =0

Compound Distribution =
Sum over n of: (Density of primary distribution at n)(n-fold convolution of secondary distribution)

29

See page 207 in Loss Models. The same formula holds for the distribution of aggregate losses, where severity
takes the place of the secondary distribution.

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 64

Exercise: What is the four-fold convolution of a Binomial distribution, with parameters q = 0.4, m = 5.
[Solution: The sum of 4 independent Binomials each with parameters q = 0.4, m = 5 is a Binomial
with parameters q = 0.4, m = (4)(5) = 20.]
The n-fold convolution of a Binomial distribution, with parameters q = 0.4, m = 5 is a Binomial with
(5n)!
parameters q = 0.4, m = 5n. It has density at x of:
0.4x 0.65n-x.
x! (5n - x)!
Exercise: Write a formula for the density of a compound Poisson-Binomial distribution, with
parameters = 1.3, q = 0.4, m = 5.
[Solution:
c(x) = p(n)s*

n (x)

n=0

e- 1.3 1.3n
(5n)!
0.4 x 0.6n - x .]
(x!) (5n - x)!
n!

One could perform this calculation in a spreadsheet as follows:

2013-4-3,
n
Poisson

x
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
Sum

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 65

0
0.27253

1
0.35429

2
0.23029

3
0.099792

4
0.032432

5
0.008432

Binomial
m=0

Binomial
m=5

Binomial
m=10

Binomial
m=15

Binomial
m=20

Binomial
m=25

0.07776
0.25920
0.34560
0.23040
0.07680
0.01024

0.00605
0.04031
0.12093
0.21499
0.25082
0.20066
0.11148
0.04247
0.01062
0.00157
0.00010

0.000470
0.004702
0.021942
0.063388
0.126776
0.185938
0.206598
0.177084
0.118056
0.061214
0.024486
0.007420
0.001649
0.000254
0.000024
0.000001

0.000037
0.000487
0.003087
0.012350
0.034991
0.074647
0.124412
0.165882
0.179706
0.159738
0.117142
0.070995
0.035497
0.014563
0.004854
0.001294
0.000270
0.000042
0.000005
0.000000
0.000000

0.000003
0.000047
0.000379
0.001937
0.007104
0.019891
0.044203
0.079986
0.119980
0.151086
0.161158
0.146507
0.113950
0.075967
0.043410
0.021222
0.008843
0.003121
0.000925
0.000227
0.000045
0.000007
0.000001
0.000000
0.000000
0.000000

Compound
PoissonBinomial
0.301522089
0.101600875
0.152585482
0.137881306
0.098817323
0.070981212
0.050696418
0.033505760
0.021065986
0.012925619
0.007625757
0.004278394
0.002276687
0.001138213
0.000525897
0.000221047
0.000083312
0.000027689
0.000007950
0.000001926
0.000000383
0.000000061
0.000000007
0.000000001
0.000000000
0.000000000

0.997769395

For example, the density at 2 of the compound distribution is calculated as:


(0.27253)(0) + (0.35429)(0.34560) + (0.23029)(0.12093) + (0.099792)(0.021942) +
(0.032432)(0.003087) + (0.008432)(0.000379) = 0.1526.
Thus, there is a 15.26% chance that two passengers will be dropped off at the Heartbreak Hotel
during the next minute. Note that by not including the chance of more than 5 taxicabs in our
spreadsheet, we have allowed the calculation to fit in a finite sized spreadsheet, but have also left
out some possibilities.30

30

As can be seen, the computed compound densities only add to .998 < 1. The approximate compound densities
at x < 10 are fairly accurate; for larger x one would need a bigger spreadsheet.

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 66

Practical Issues:
When one has a frequency with infinite support and a discrete severity, while these calculations of
the aggregate distribution via convolutions are straightforward to perform on a computer, they can
get rather lengthy.31 Also if the severity distribution has a positive density at zero, then each
summation contains an infinite number of terms.32
When the frequency or primary distribution is a member of the (a, b, 0) class or (a, b, 1) class,
aggregate and compound distributions can also be computed via the Panjer Algorithm (Recursive
Method), to be discussed in a subsequent section. The Panjer Algorithm avoids some of these
practical issues.33

31

As stated in Section 9.5 of Loss Models, in order to compute the aggregate distribution up to n using
convolutions, the number of calculations goes up as n3 .
32
One can get around this difficulty when the frequency distribution can be thinned.
33
As stated in Section 9.5 of Loss Models, in order to compute the aggregate distribution up to n using the Panjer
Algorithm, the number of calculations goes up as n2 .

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 67

Problems:
3.1 (3 points) There are either one, two or three claims, with probabilities of 60%, 30% and 10%,
respectively.
Each claim is of size $100 or $200, with probabilities of 80% and 20% respectively, independent of
the size of any other claim.
Frequency and severity are independent.
Calculate the aggregate distribution.
3.2 (2 points) The number of claims in a period has a Geometric distribution with mean 2.
The amount of each claim X follows P(X = x) = 0.50, x = 1, 2.
The number of claims and the claim amounts are independent.
S is the aggregate claim amount in the period.
Calculate Fs(3).
(A) 0.66

(B) 0.67

(C) 0.68

(D) 0.69

(E) 0.70

3.3 (3 points) The number of accidents per year follows a Binomial distribution with m = 2 and
q = 0.7. The number of claims per accident is Geometric with = 1.
The number of claims for each accident is independent of the number of claims for any other accident
and of the total number of accidents.
Calculate the probability of 2 or fewer claims in a year.
A. Less than 80%
B. At least 80% but less than 82%
C. At least 82% but less than 84%
D. At least 84% but less than 86%
E. At least 86%
3.4 (3 points) For a certain company, losses follow a Poisson frequency distribution with mean 2 per
year, and the amount of a loss is 1, 2, or 3, each with probability 1/3.
Loss amounts are independent of the number of losses, and of each other.
What is the probability of 4 in annual aggregate losses?
A. 7%
B. 8%
C. 9%
D. 10%
E. 11%
3.5 (8 points) The number of accidents is either 0, 1, 2, or 3 with probabilities 50%, 20%, 20%, and
10% respectively.
The number of claims per accident is 0, 1, 2 or 3 with probabilities 30%, 40%, 20%, and 10%
respectively.
Calculate the distribution of the total number of claims.

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 68

3.6 (5A, 11/94, Q.37) (2 points) Let N = number of claims and S = X1 + X2 + ... + XN.
Suppose S has a compound Poisson distribution with Poisson parameter = 0.6.
The only possible individual claim amounts are $2,000, $5,000, and $10,000 with probabilities 0.6,
0.3, and 0.1, respectively. Calculate Prob[S $7000 | N 2].
3.7 (CAS3, 5/04, Q.37) (2.5 points)
An insurance portfolio produces N claims with the following distribution:
n
P(N = n)
0
0.1
1
0.5
2
0.4
Individual claim amounts have the following distribution:
x
fX(x)
0
0.7
10
0.2
20
0.1
Individual claim amounts and claim counts are independent.
Calculate the probability that the ratio of aggregate claim amounts to expected aggregate claim
amounts will exceed 4.
A. Less than 3%
B. At least 3%, but less than 7%
C. At least 7%, but less than 11%
D. At least 11%, but less than 15%
E. At least 15%

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 69

Solutions to Problems:
3.1. The severity distribution is: f(100) = .8 and f(200) = .2.
f*f is: 200@64%, 300@32%, 400@4%, since theYpossible sums of two claims are:
100
200

100

200

200
300

300
400

with the corresponding probabilities:

0.8
0.2

0.8

0.2

64%
16%

16%
4%

f*f *f = f*(f*f) is: 300@51.6%, 400@38.4%, 500@9.6%, 600@0.8%,


since the possible sums of three claims are:
Y
100
200

200

300

400

300
400

400
500

500
600

0.64

0.32

0.04

51.2%
12.8%

25.6%
6.4%

3.2%
0.8%

with the corresponding probabilities:


0.8
0.2

The aggregate distribution is Prob(N = n) f*n .


n

0
0.00

1
0.60

2
0.30

3
0.10

f*0

f*f

f*f*f

0
100
200
300
400
500
600

Sum

0.8
0.2

0.64
0.32
0.04

Aggregate
Distribution

0.512
0.384
0.096
0.008

0.0000
0.4800
0.3120
0.1472
0.0504
0.0096
0.0008

1.0000

For example, the probability that the aggregate distribution is 300 is:
(.3)(.32) + (.1)(.512) = 14.72%.
The aggregate distribution is:
100@48%, 200@31.2%, 300@14.72%, 400@5.04%, 500@.96%, 600@.08%.
Comment: One could instead use semi-organized reasoning. For example, the aggregate can be
300 if either one has 2 claims of sizes 100 and 200, or one has 3 claims each of size 100.
This has probability of: (30%)(2)(80%)(20%) + (10%)(80%)(80%)(80%) = 14.72%.

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 70

3.2. C. For the Geometric with = 2: f(0) = 1/3, f(1) = 2f(0)/3 = 2/9,
f(2) = 2f(1)/3 = 4/27, f(3) = 2f(2)/3 = 8/81.
The ways in which the aggregate is 3:
0 claims: 1/3 = .3333. 1 claim: 2/9 = .2222.
2 claims of sizes 1 & 1, 1 & 2, or 2 & 1: (3/4)(4/27) = 1/9 = .1111.
3 claims of sizes 1 & 1 & 1: (1/8)(8/81) =1/81 = .0123.
Distribution of aggregate at 3 is: .3333 + .2222 + .1111 + .0123 = 0.679.
Alternately, using convolutions:
n
Geometric

0.3333

0.2222

0.1481

0.0988

f*0

f*f

f*f*f

0
1
2
3

0.50
0.50

0.250
0.500

0.1250

Aggregate
Distribution

0.3333
0.1111
0.1481
0.0864

Distribution of aggregate at 3 is: .3333 + .1111 + .1481 + .0864 = 0.679.


Comment: Similar to but easier than 3, 11/02, Q.36.
One could also use the Panjer Algorithm (Recursive Method).
3.3. A. For the Binomial with m = 2 and q = .7: f(0) = .32 = .09, f(1) = (2)(.3)(.7) = .42,
f(2) = 0.72 = .49.
For a Geometric with = 1, f(0) = 1/2, f(1) = 1/4, f(2) = 1/8.
The number of claims with 2 accidents is the sum of two independent Geometrics, which is a
Negative Binomial with r = 2 and =1, with:
f(0) = 1/(1+ )r = 1/4. f(1) = r/(1+ )r+1 = 1/4. f(2) = {r(r+1)/2}2 /(1+ )r+2 = 3/16.
Using convolutions:
n
Poisson

0.09

0.42

0.49

f*0

f*f

0
1
2

0.5000
0.2500
0.1250

0.2500
0.2500
0.1875

Compound
Distribution
0.4225
0.2275
0.1444

Prob[2 or fewer claims] = .4225 + .2275 + .1444 = 0.7944.


Comment: One could instead use the Panjer Algorithm (Recursive Method).

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

3.4. E. For the Poisson with = 2: f(0) = e-2 = 0.1353, f(1) = 2f(0) = 0.2707,
f(2) = 2f(1)/2 = 0.2707, f(3) = 2f(2)/3 = 0.1804, f(4) = 2f(3)/4 = 0.0902.
Using convolutions:
n
Poisson

0.1353

0.2707

0.2707

0.1804

0.0902

f*0

f*f

f*f*f

f*f*f*f

0
1
2
3
4

0.3333
0.3333
0.3333

0.1111
0.2222
0.3333

0.0370
0.1111

0.0123

Aggregate
Distribution
0.1353
0.0902
0.1203
0.1571
0.1114

Prob[Aggregate = 4] = Prob[2 claims]Prob[2 claims sum to 4] +


Prob[3 claims]Prob[3 claims sum to 4] + Prob[4 claims]Prob[4 claims sum to 4] =
(0.2707)(0.3333) + (0.1804)(0.1111) + (0.0902)(0.0123) = 0.1114.
Comment: One could instead use the Panjer Algorithm (Recursive Method).

Page 71

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 72

3.5. The possible sums of the numbers of claims for 2 accidents is:
0
1
2
3

0
1
2
3

1
2
3
4

2
3
4
5

3
4
5
6

0.3

0.4

0.2

0.1

9%
12%
6%
3%

12%
16%
8%
4%

6%
8%
4%
2%

3%
4%
2%
1%

With the corresponding


Probabilities ofprobabilities:
0.3
0.4
0.2
0.1

f*f = 0@9%, 1@24%, 2@28%,3@22%, 4@12%, 5@4%, 6@1%.


The possible sums of the numbers of claims for 3 accidents is:
0
1
2
3
4
5
6

0
1
2
3
4
5
6

1
2
3
4
5
6
7

2
3
4
5
6
7
8

3
4
5
6
7
8
9

0.3

0.4

0.2

0.1

2.7%
7.2%
8.4%
6.6%
3.6%
1.2%
0.3%

3.6%
9.6%
11.2%
8.8%
4.8%
1.6%
0.4%

1.8%
4.8%
5.6%
4.4%
2.4%
0.8%
0.2%

0.9%
2.4%
2.8%
2.2%
1.2%
0.4%
0.1%

With the corresponding


Probabilities ofprobabilities:
0.09
0.24
0.28
0.22
0.12
0.04
0.01

f*f*f = 0@2.7%, 1@10.8%, 2@19.8%,3@23.5%, 4@20.4%, 5@13.2%, 6@6.5%, 7@2.4%,


8@0.6, 9@0.1%.

2013-4-3,
n

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Compound
Distribution

0.5

0.2

0.2

0.1

f*0

f*f

f*f*f

0
1
2
3
4
5
6
7
8

0.30
0.40
0.20
0.10

0.09
0.24
0.28
0.22
0.12
0.04
0.01

0.027
0.108
0.198
0.235
0.204
0.132
0.065
0.024
0.006
0.001

0.5807
0.1388
0.1158
0.0875
0.0444
0.0212
0.0085
0.0024
0.0006
0.0001

9
sum

For example, (0.2)(0.1) + (0.2)(0.22) + (0.1)(0.235) = 0.0875.

Page 73

2013-4-3,

Aggregate Distributions 3 Using Convolutions,

HCM 10/23/12,

Page 74

3.6. f*f(4) = Prob[1st claim = 2]Prob[2nd claim = 2] = 0.62 = 0.36.


f*f(7) = Prob[1st claim = 2]Prob[2nd claim = 5] + Prob[1st claim = 5]Prob[2nd claim = 2] =
(2)(.6)(.3) = .36. f*f(10) = Prob[1st claim = 5]Prob[2nd claim = 5] = 0.32 = 0.09.
The aggregate distribution is Prob(N = n) f*n .
Given that N 2, we need only calculate the first three terms of that sum.
n
Poisson

0
0.5488

1
0.3293

2
0.0988

f*0

f*f

0
1
2
3
4
5
6
7
8
9
10

Sum

0.6
0.36
0.3
0.36

0.1

0.09

Aggregate
Distribution
0.5488
0.0000
0.1976
0.0000
0.0356
0.0988
0.0000
0.0356
0.0000
0.0000
0.0418
0.9581

The probability that N 2 and the aggregate losses are less 7 is:
0.5488 + 0.1976 + 0.0356 + 0.0988 + 0.0356 = 0.9164.
The probability that N 2 is 0.5488 + 0.3293 + 0.0988 = 0.9769. Thus Prob[S $7000 | N 2]
= Prob[ S $7000 and N 2] / Prob[N 2] = 0.9164 / 0.9769 = 0.938.
Comment: If there are more than 3 claims, the aggregate losses are > 7. The chance of three claims
all of size 2 is (e-.6 0.63 / 6)(0.63 ) = 0.0043. Thus the unconditional probability that S 7 is
0.9164 + 0.0043 = 0.9207.
3.7. A. Mean Frequency is 1.3. Mean severity is 4. Mean Aggregate is: (1.3)(4) = 5.2.
Prob[Agg > (4)(5.2) = 20.4] = Prob[Agg 30].
The aggregate is 30 if there are two claims of sizes: 10 and 20, 20 and 10, or 20 and 20.
Prob[Agg 30] = (0.4) {(2)(0.2)(0.1) + 0.12 } = 2%.

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 75

Section 4, Generating Functions34


There are a number of different generating functions, with similar properties. On this exam, the
the Probability Generating Function (p.g.f.)35 is used for working with frequency distributions.
On this exam, the Moment Generating Function (m.g.f.) and the Probability Generating
Function are used for working with aggregate distributions. Other generating functions include: the
Characteristic Function, the Laplace Transform, and the Cumulant Generating Function.
Name

Symbol

Formula

Probability Generating Function

P X(t)

E[tx ] = MX(ln(t))

Moment Generating Function

M X(t)

E[et x] = PX( et )

Characteristic Function

X(t)

E[eitx] = MX(it)

Laplace Transform

LX(t)

E[e-tx]

Cumulant Generating Function

X(t)

ln MX(t) = ln E[etx]

Moment Generating Functions:36


The moment generating function is defined as MX(t) = E[et x].
The moment generating function for a continuous loss distribution with support from 0 to is given
by:37

M(t) = E[ext] =

f(x)ext dx .

x =0
34

See Section 3.3 of Loss Models . Also see page 38 of Actuarial Mathematics, not on the Syllabus.
See Mahlers Guide to Frequency Distributions.
36
Moment Generating Functions are used in the study of Aggregate Distributions and Continuous Time Ruin
Theory. Continuous Time Ruin Theory is not on the syllabus.
See either Chapter 13 of Actuarial Mathematics or Chapter 11 of Loss Models.
37
In general the integral goes over the support of the probability distribution. In the case of discrete distributions,
one substitutes summation for integration.
35

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 76

For example for the Exponential distribution:

M(t) =

f(x)ext dx = {e-x/ /}ext dx = (1/)ex(t1/) /(t - 1/) ] = 1/(1 - t),

x=0

x=0

for t < 1/.

x=0

Exercise: What is the moment generating function for a uniform distribution on [3, 8]?
[Solution: M(0) = E[ex0] = E[1] = 1. M(t) = E[ext] =
8

(1/5)ext dx = (1/5)ext /t
x=3

x=8

= (e8t - e3t)/5t, for t 0.]

x=3

The Moment Generating Functions of severity distributions, when they exist, are given
in Appendix A of Loss Models. The Probability Generating Functions of frequency
distributions are given in Appendix B of Loss Models.
M(t) = P(et ) .

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 77

Table of Moment Generating Functions


Distribution38

Parameters

Support of M.G.F.

(ebt - eat)/ t(b-a) 39

Uniform on [a, b]
Normal

Moment Generating Function

exp[t + 2t2 /2]

1/(1 - t)

t < 1/

Gamma

(1 - t)

t < 1/

Inverse Gaussian

Bernoulli

Binomial

q, m

Exponential

Poisson
Geometric
Negative Binomial

r,

exp[( / ) (1 -

1 - 2t2 / )]

t < / 22

qet + 1 - q
(qet + 1 - q)m
exp[(et - 1)]
1/{1 - (et - 1)}

et < (1+)/

{1 - (et - 1)}-r

et < (1+)/

Exercise: Assume X is Normally Distributed with parameters = 2 and = 3. What is E[etX]?


[Solution: If X is Normally distributed with parameters = 2 and = 3, then tX is Normally distributed
with parameters = 2t and = 3t. etX is LogNormally distributed with = 2t and = 3t.
E[etX] = mean of a LogNormal Distribution = exp[2t + (3t)2 /2] = exp[2t + 4.5t2 ]. ]
X is Normal(, ) tX is Normal(t, t) etX is LogNormal(t, t).
X is Normal(, ) MX(t) E[ext] = mean of LogNormal(t, t) = exp[t + 2t2 /2].40
The Moment Generating Function of a Normal Distribution is: M(t) = exp[t + t2 2 /2].
38

As per Loss Models.


M(0) = 1.
40
The mean of a LogNormal Distribution is: exp[(first parameter) + (second parameter)2 /2].
39

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 78

Discrete Distributions:
For a discrete distribution, we substitute summation for integration. For example, for the Poisson
Distribution the m.g.f. is determined as follows:

M(t) = E[ext] = (e x / x!) etx = e (et)x / x! = e exp[et] = exp[(et -1)].


x=0

x=0

Exercise: Severity is 300 with probability 60% and 700 with probability 40%.
What is the moment generating function for severity?
[Solution: M(t) = E[ext] = 0.6e300t + 0.4e700t.]
Relation of Moment and Probability Generating Functions:
M(t) = E[ext] = E[(et)x] = P(et). Thus one can write the Moment Generating Function in terms of the
Probability Generating Function, M(t) = P(et ).41 For example, for the Poisson Distribution,
P(t) = exp[(t-1)]. Therefore, M(t) = P(et) = exp[(et -1)].
On the other hand, if one knows the Moment Generating Function, one can get the Probability
Generating Function as: P(t) = M(ln(t)).
Exercise: What is the Moment Generating Function of a Negative Binomial Distribution as per
Loss Models?
[Solution: As shown in Appendix B.2.1.4 of Loss Models, for the Negative Binomial Distribution:
P(t) = {1 - (t - 1)}-r. Thus M(t) = P(et) = {1 - (et - 1)}-r.
Comment: Instead, one could calculate E[ext] for a Geometric Distribution as 1/{1 - (et - 1)}, and
since a Negative Binomial is a sum of r independent Geometrics, M(t) = {1 - (et - 1)}-r.]

41

The Probability Generating Functions of frequency distributions are given in Appendix B of Loss Models. The
Moment Generating Functions of severity distributions, when they exist, are given in Appendix A of Loss Models.

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 79

Properties of Moment Generating Functions:


The Moment Generating Function has useful properties. For example, for X and Y independent
variables, the moment generating function of their sum is:
MX+Y(t) = E[e(X+Y) t] = E[eXt eYt] = E[eXt]E[eYt] = MX(t)MY(t).
The moment generating function of the sum of two independent variables is the product
of their moment generating functions:
M X+Y(t) = MX(t) MY(t)
Exercise: X and Y are each Exponential with mean 25. X and Y are independent.
What is the m.g.f. of their sum?
[Solution: X and Y each have m.g.f.: 1/(1 - 25t). Thus the m.g.f. of their sum is: 1/(1 - 25t)2 .
Comment: This is the m.g.f. of a Gamma Distribution with = 25 and = 2.]
Exercise: X follows an Inverse Gaussian Distribution with = 10 and = 8.
Y follows an Inverse Gaussian Distribution with = 5 and = 2. X and Y are independent.
What is the m.g.f. of their sum?
[Solution: The m.g.f. of an Inverse Gaussian Distribution is:
M(t) = exp[(/){1 M Y(t) = exp[.4{1 -

1 - 22 t / }]. Therefore, MX(t) = exp[0.8{1 1 - 25t }]. MX+Y(t) = MX(t) MY(t) = exp[1.2{1 -

1 - 25t }] and
1 - 25t }].

Comment: This is the m.g.f. of another Inverse Gaussian with = 15 and = 18.]
Since the moment generating function of the sum of two independent variables is the product of their
moment generating functions, the Moment Generating Function converts convolution into
multiplication:
M f * g = Mf M g .
The Moment Generating Function for a sum of independent variables is the product of
the Moment Generating Functions of each of the variables. Of particular importance for
working with aggregate losses, the sum of n independent, identically distributed variables
has the Moment Generating Function taken to the power n.
The m.g.f. of f*n is the nth power of the m.g.f. of f.42
42

Using characteristic functions rather than Moment Generating Functions, this is the key idea behind the
Heckman-Meyers algorithm not on the Syllabus. The Robertson algorithm, not on the Syllabus, relies on the similar
properties of the Fast Fourier Transform. See Section 9.8 of Loss Models.

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 80

For example, MX+X+X(t) = MX(t)3 . Thus the Moment Generating Function for a sum of 3
independent, identical, Exponential variables is M(t) = {1/(1 - t)}3 , for t < 1/, the moment
generating function of a Gamma Distribution with = 3.
Exercise: What is the m.g.f. for the sum of 5 independent Exponential Distributions, each with mean
17?
[Solution: M(t) = {1/(1 - 17t)}5 .]
Adding a constant to a variable, multiplies the m.g.f. by e to the power of that constant times t:
M X+b (t) = E[e(x+b)t] = ebt E[ext] = ebt MX(t).
Multiplying a constant times a variable, gives an m.g.f that is the original m.g.f at t times that constant:
M cX(t) = E[ecxt] = MX(ct).
Exercise: There is uniform inflation of 4% between 2001 and 2002. What is the m.g.f. of the severity
distribution in 2002 in terms of that in 2001?
[Solution: y = 1.04x, therefore, M2002(t) = M2001(1.04t).]
For example, if losses in 2001 follow a Gamma Distribution with = 2 and = 1000, then in 2001
M(t) = (1 - 1000t)-2. If there is uniform inflation of 4% between 2001 and 2002, then in 2002 the
m.g.f. is: {1 - 1000(1.04)t}-2 = (1 - 1040t)-2, which is that of a Gamma Distribution with = 2 and
= 1040.
Exercise: What is the m.g.f. for the average of 5 independent Exponential Distributions, each with
mean 17?
[Solution: Their average is their sum multiplied by 1/5. MY/5(t) = MY(t/5).
Therefore, the m.g.f. of the average is the m.g.f. of the sum at t/5: {1/(1 - 17t/5)}5 .]
In general, the Moment Generating Function of the average of n independent identically distributed
variables is the nth power of the Moment Generating Function of t/n. For example, the average of n
independent Geometric Distribution each with parameter , each with Moment Generating Function:
(1 - (et - 1))-1, has m.g.f.: (1 - (et/n - 1))-n.
In addition, the moment generating function determines the distribution, and vice-versa.
Therefore, one can take limits of a distribution by instead taking limits of the Moment Generating
Function.

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 81

Exercise: Use moment generating functions to take the limit of Negative Binomial Distributions, such
that r = 7 as 0.
[Solution: The moment generating function of a Negative Binomial is:
(1 - (et-1))-r, which for r = 7/ is: (1 - (et - 1))-7/.
ln ((1 - (et - 1))-7/) = -(7/) ln(1 - (et - 1)) -(7/)(- (et - 1)) = 7(et - 1).
Thus the limit as 0 of the m.g.f. is exp[7(et - 1)], which is the m.g.f. of a Poisson with mean 7.
Thus the limit of these Negative Binomials is a Poisson with the same mean.]
Moments and the Moment Generating Function:
The moments of the function can be obtained as the derivatives of the moment generating function
at zero, by reversing the order of integration and the taking of the derivative.

M(0) = f(x)xdx = E[X].


M(s) = f(x)x2 exsdx.
M(0) =f(x)x2 dx = E[X2 ].

M(s) = f(x)xexsdx.

M(0) = E[X0 ] = 1
M (0) = E[X]
M (0) = E[X2 ]
M(0) = E[X3 ]
M ( n )(0) = E[Xn ]
For example, for the Gamma Distribution: M(t) = (1 - t).
M(t) = (1 - t)(+1). M(0) = = mean.
M(t) = 2 (+1)(1 - t)(+2). M(0) = 2 (+1) = second moment.
M(t) = 3 (+1)(+2)(1 - t)(+3). M(0) = 3 (+1)(+2) = third moment.

2013-4-3,

Aggregate Distributions 4 Generating Functions,

HCM 10/23/12,

Page 82

Exercise: A distribution has m.g.f. M(t) = exp[11t + 27t2 ].


What are the mean and variance of this distribution?
[Solution: M′(t) = M(t)(11 + 54t). Mean = M′(0) = 11.
M″(t) = M′(t)(11 + 54t) + 54M(t). 2nd moment = M″(0) = (11)(11) + 54 = 175.
Variance = 175 - 11² = 54.]
Moment Generating Function as a Power Series:
One way to remember the relationship between the m.g.f. and the moments is to expand the
exponential into a power series:
M(t) = E[e^(xt)] = E[Σn=0 to ∞ of (xt)^n / n!] = Σn=0 to ∞ of E[X^n] t^n / n! = Σn=0 to ∞ of (nth moment) t^n / n!.
So the nth moment of the distribution is the term multiplying t^n/n! in the power series representation
of its m.g.f., M(t).
For example, the power series for 1/(1 - y) is: Σn=0 to ∞ of y^n,
while the m.g.f. of an Exponential Distribution is: M(t) = 1/(1 - θt). Therefore,
M(t) = 1/(1 - θt) = Σn=0 to ∞ of (θt)^n = Σn=0 to ∞ of n! θ^n t^n / n!.
Therefore, the nth moment of an Exponential is n! θ^n.


When one differentiates the power series for M(t) n times, the terms with powers of t below n vanish,
d^n(t^n/n!)/dt^n = 1, and the remaining terms all still have powers of t, which vanish when we set
t = 0. Therefore,
M^(n)(0) = d^n/dt^n [Σi=0 to ∞ of E[X^i] t^i / i!], evaluated at t equal to zero, = nth moment.


The Moments of a Negative Binomial Distribution:


As discussed previously, from the probability generating function of a Negative Binomial Distribution,
its moment generating function is M(t) = P(e^t) = 1/{1 - β(e^t - 1)}^r.
Exercise: Using its moment generating function, determine the first four moments of a Negative
Binomial Distribution.
[Solution: M′(t) = rβe^t/{1 - β(e^t - 1)}^(r+1). First Moment = M′(0) = rβ.
M″(t) = M′(t) + r(r+1)β²e^(2t)/{1 - β(e^t - 1)}^(r+2). Second Moment = M″(0) = rβ + r(r+1)β².
M‴(t) = M″(t) + 2r(r+1)β²e^(2t)/{1 - β(e^t - 1)}^(r+2) + r(r+1)(r+2)β³e^(3t)/{1 - β(e^t - 1)}^(r+3).
Third Moment = M‴(0) = rβ + 3r(r+1)β² + r(r+1)(r+2)β³.
M⁗(t) = M‴(t) + 4r(r+1)β²e^(2t)/{1 - β(e^t - 1)}^(r+2) + 2r(r+1)(r+2)β³e^(3t)/{1 - β(e^t - 1)}^(r+3) +
3r(r+1)(r+2)β³e^(3t)/{1 - β(e^t - 1)}^(r+3) + r(r+1)(r+2)(r+3)β⁴e^(4t)/{1 - β(e^t - 1)}^(r+4).
Fourth Moment = M⁗(0) = rβ + 3r(r+1)β² + r(r+1)(r+2)β³ + 4r(r+1)β² + 2r(r+1)(r+2)β³
+ 3r(r+1)(r+2)β³ + r(r+1)(r+2)(r+3)β⁴ = rβ + 7r(r+1)β² + 6r(r+1)(r+2)β³ + r(r+1)(r+2)(r+3)β⁴.]
Exercise: Determine the CV, skewness, and kurtosis of a Negative Binomial Distribution.
[Solution: Variance = rβ + r(r+1)β² - (rβ)² = rβ(1 + β).
Coefficient of Variation = √(rβ(1 + β)) / (rβ) = √{(1 + β) / (rβ)}.
Third Central Moment = E[X³] - 3E[X]E[X²] + 2E[X]³ =
rβ + 3r(r+1)β² + r(r+1)(r+2)β³ - 3(rβ){rβ + r(r+1)β²} + 2(rβ)³ =
rβ(1 + 3β + 2β²) = rβ(1 + β)(1 + 2β).
Skewness = rβ(1 + β)(1 + 2β)/{rβ(1 + β)}^1.5 = (1 + 2β) / √(rβ(1 + β)).
Fourth Central Moment = E[X⁴] - 4E[X]E[X³] + 6E[X]²E[X²] - 3E[X]⁴ =
rβ + 7r(r+1)β² + 6r(r+1)(r+2)β³ + r(r+1)(r+2)(r+3)β⁴
- 4(rβ){rβ + 3r(r+1)β² + r(r+1)(r+2)β³} + 6(rβ)²{rβ + r(r+1)β²} - 3(rβ)⁴ =
rβ{1 + 7β + 12β² + 6β³ + 3rβ(1 + β)²}.
Kurtosis = rβ{1 + 7β + 12β² + 6β³ + 3rβ(1 + β)²}/{rβ(1 + β)}² = 3 + {6β² + 6β + 1}/{β(1 + β)r}.]
Therefore, for the Negative Binomial Distribution:
Skewness = {3 Variance - 2 Mean + 2(Variance - Mean)²/Mean} / Variance^1.5.43
43
See equation 6.7.8 in Risk Theory by Panjer and Willmot.
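The Negative Binomial skewness result above is easy to check numerically. The sketch below assumes SciPy is available and uses r = 3 and β = 1.2 for illustration (scipy.stats.nbinom takes n = r and p = 1/(1 + β)):

from scipy.stats import nbinom

r, beta = 3.0, 1.2
mean, var, skew = nbinom.stats(r, 1/(1 + beta), moments='mvs')
print(float(skew))                              # skewness from scipy
print((1 + 2*beta) / (r*beta*(1 + beta))**0.5)  # (1 + 2β)/√(rβ(1+β)); should match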


Calculating the Moment Generating Function of an Inverse Gaussian Distribution:

Exercise: What is the integral ∫0 to ∞ of x^(-3/2) exp[-(a²x + b²/x)] dx?
Hint: An Inverse Gaussian density integrates to unity from zero to infinity.
[Solution: The density of an Inverse Gaussian with parameters μ and θ is:
f(x) = √(θ/(2π)) x^(-1.5) exp[-θ(x/μ - 1)²/(2x)] = √(θ/(2π)) x^(-1.5) exp[-θx/(2μ²) + θ/μ - θ/(2x)].
Let a² = θ/(2μ²) and b² = θ/2; then θ = 2b² and μ = b/a.
Then f(x) = (b/√π) x^(-1.5) exp[-a²x + 2ba - b²/x] = e^(2ba) (b/√π) x^(-1.5) exp[-a²x - b²/x].
Since this integrates to unity: (e^(2ba) b/√π) ∫0 to ∞ of x^(-1.5) exp[-a²x - b²/x] dx = 1.
Therefore, ∫0 to ∞ of x^(-3/2) exp[-(a²x + b²/x)] dx = √π e^(-2ba) / b.
This is a special case of a Modified Bessel Function of the Third Kind, K-0.5. See for example,
Appendix C of Insurance Risk Models by Panjer & Willmot.]
Exercise: Calculate the Moment Generating Function for an Inverse Gaussian Distribution with
parameters μ and θ. Hint: Use the result of the previous exercise.
[Solution: The Moment Generating Function is the expected value of e^(zx).
M(z) = ∫0 to ∞ of e^(zx) f(x) dx = e^(θ/μ) √(θ/(2π)) ∫0 to ∞ of x^(-1.5) exp[-{θ/(2μ²) - z}x - θ/(2x)] dx
= e^(θ/μ) √(θ/(2π)) {√π e^(-2ba) / b} =
e^(θ/μ) exp[-(θ/μ)√(1 - 2zμ²/θ)] = exp[(θ/μ){1 - √(1 - 2zμ²/θ)}],
where we have used the result of the previous exercise with a² = θ/(2μ²) - z and b² = θ/2.
The former requires that z ≤ θ/(2μ²).
Note that ba = √(b²a²) = √{(θ/2)(θ/(2μ²) - z)} = {θ/(2μ)} √(1 - 2zμ²/θ).]
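This closed form can be checked against simulation. A rough Monte Carlo sketch follows (NumPy and SciPy are assumed; scipy.stats.invgauss(mu/theta, scale=theta) matches the μ, θ parameterization used here), with illustrative values μ = 3, θ = 30, and z = 0.5:

import numpy as np
from scipy import stats

mu, theta, z = 3.0, 30.0, 0.5
formula = np.exp((theta/mu) * (1 - np.sqrt(1 - 2*z*mu**2/theta)))
x = stats.invgauss(mu/theta, scale=theta).rvs(size=1_000_000, random_state=0)
print(formula, np.exp(z*x).mean())   # the two values should be close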


An Example, Calculating the Skewness of an Inverse Gaussian Distribution:


Let's see how the Moment Generating Function could be used to determine the skewness of an
Inverse Gaussian Distribution.
We can use the Moment Generating Function of the Inverse Gaussian to calculate its moments. 44
M(t) = exp[(θ/μ){1 - √(1 - 2tμ²/θ)}]. M(0) = 1.
M′(t) = M(t) μ/√(1 - 2tμ²/θ). M′(0) = μ = mean.
M″(t) = M′(t) μ/√(1 - 2tμ²/θ) + M(t) (μ³/θ)(1 - 2tμ²/θ)^(-3/2).
M″(0) = μ² + μ³/θ = second moment. Variance = μ³/θ.
M‴(t) = M″(t) μ/√(1 - 2tμ²/θ) + 2M′(t) (μ³/θ)(1 - 2tμ²/θ)^(-3/2) + M(t) (3μ⁵/θ²)(1 - 2tμ²/θ)^(-5/2).
M‴(0) = μ(μ² + μ³/θ) + 2μ(μ³/θ) + 3μ⁵/θ² = μ³ + 3(μ⁴/θ)(1 + μ/θ) = third moment.
Exercise: What is the coefficient of variation of an Inverse Gaussian Distribution with parameters μ and θ?
[Solution: √Variance / Mean = (μ^1.5/θ^0.5)/μ = √(μ/θ).]
Exercise: What is the skewness of an Inverse Gaussian Distribution with parameters μ and θ?
[Solution: Third Central Moment = μ³ + 3(μ⁴/θ)(1 + μ/θ) - 3μ(μ² + μ³/θ) + 2μ³ = 3μ⁵/θ².
Skewness = Third Central Moment / (Variance)^1.5 = (3μ⁵/θ²)/(μ³/θ)^1.5 = 3√(μ/θ).]
Thus we see that the skewness of the Inverse Gaussian Distribution, 3√(μ/θ), is always three times
its coefficient of variation, √(μ/θ). In contrast, for the Gamma Distribution the skewness, 2/√α, is
always twice its coefficient of variation, 1/√α.
Existence of Moment Generating Functions:
Moment Generating Functions only exist for distributions all of whose moments exist. Thus for
example, the Pareto does not have all of its moments, so its Moment Generating Function does
not exist. If a distribution is short-tailed enough for all of its moments to exist, then its moment
generating function may or may not exist. While all of the moments of the LogNormal Distribution
exist, its Moment Generating Function does not exist.45 The Moment Generating Function of the
Transformed Gamma, or its special case the Weibull, only exists if τ ≥ 1.46
44
One can instead use the cumulant generating function, ln M(z), to get the cumulants.
See for example, Kendall's Advanced Theory of Statistics, page 412.
45
The LogNormal is the heaviest-tailed distribution all of whose moments exist.
46
While if τ > 1 the m.g.f. of the Transformed Gamma exists, the m.g.f. is not a well-known function.


The moments of the distribution can be obtained as the derivatives of the moment generating function
at zero. Thus if the Moment Generating Function exists (within an interval around zero) then so do all
the moments. However, the converse is not true.
As discussed previously, the moment generating function, when it exists, can be written as a power
series in t, where E[Xⁿ] is the nth moment about the origin of the distribution:
M(t) = Σn=0 to ∞ of E[Xⁿ] tⁿ / n!.

In order for the moment generating function to converge in an interval around zero, the moments
E[Xⁿ] may not grow too quickly as n gets large.
For example, the LogNormal Distribution has moments: E[Xⁿ] = exp[nμ + 0.5 n²σ²] =
exp[nμ] exp[0.5 n²σ²]. Thus in this case E[Xⁿ] tⁿ/n! = exp[nμ] exp[0.5 n²σ²] tⁿ/n!. Using Stirling's
Formula, for large n, n! increases approximately as: e⁻ⁿ nⁿ. Thus E[Xⁿ] tⁿ/n! increases
approximately as: exp[nμ] exp[0.5 n²σ²] tⁿ/(e⁻ⁿ nⁿ), which behaves like exp[0.5 n²σ²] (t/n)ⁿ, up to factors
that grow only exponentially in n. Thus as n increases, ln[E[Xⁿ] tⁿ/n!] increases approximately as:
0.5 n²σ² + n ln(t) - n ln(n) = n{0.5 nσ² + ln(t) - ln(n)}. Since 0.5 nσ² increases more
quickly than ln(n), this expression approaches infinity as n approaches infinity. Thus so does
E[Xⁿ] tⁿ/n!. Since the terms of the power series go to infinity, the sum does not converge.47
Thus for the LogNormal Distribution the Moment Generating Function fails to exist.
In general, the Moment Generating Function of a distribution exists if and only if the distribution
has a tail which is "exponentially bounded."48 A distribution is exponentially bounded if for some
K > 0, c > 0 and for all x: 1 - F(x) ≤ Ke^(-cx). In other words, the survival function has to decline at
least exponentially.
For example, for the Weibull Distribution the survival function is exp[-(x/θ)^τ]. For τ > 1 this survival
function declines faster than exponentially, and thus the Weibull is exponentially bounded.
For τ < 1 this survival function declines more slowly than exponentially, and thus the Weibull is not
exponentially bounded.49 Thus the Weibull for τ > 1 has a Moment Generating Function,
while for τ < 1 it does not.50
47
In fact, in order for the power series to converge the terms have to decline faster than 1/n.
48
See page 186 of Adventures in Stochastic Processes by Sidney L. Resnick.
49
For τ = 1 one has an Exponential Distribution, which is exponentially bounded and whose m.g.f. exists.
50
The Transformed Gamma has the same behavior as the Weibull: for τ > 1 the Moment Generating Function exists
and the distribution is lighter-tailed than for τ < 1, for which the Moment Generating Function does not exist. For a
Transformed Gamma with τ = 1, one gets a Gamma, for which the m.g.f. exists.


Characteristic Function:
The Characteristic Function is defined as φX(z) = E[e^(izX)] = MX(iz) = PX(e^(iz)).51
The Characteristic Function has the advantage that it exists for all z.
However, Characteristic Functions involve complex variables:
φX(z) = E[e^(izX)] = E[cos(zX) + i sin(zX)] = E[cos(zX)] + i E[sin(zX)].


One can obtain most of the same useful results using either Moment Generating Functions,
Probability Generating Functions, or Characteristic Functions.
Cumulant Generating Function:
The Cumulant Generating Function is defined as the natural log of the Moment Generating Function.52
ψX(t) = ln MX(t) = ln E[e^(tX)]. The cumulants are then obtained from the derivatives of the Cumulant
Generating Function at zero. The first cumulant is the mean: ψ′(0) = E[X].
The 2nd and 3rd cumulants are equal to the 2nd and 3rd central moments.53
Thus one can obtain the variance as ψ″(0): ψ″(0) = Var[X].54
d²(ln MX(t)) / dt² | t=0 = Var[X].
d³(ln MX(t)) / dt³ | t=0 = 3rd central moment of X.
Cumulants of independent variables add. Thus for X and Y independent the 2nd and 3rd central
moments add.55
Exercise: What is the cumulant generating function of an Inverse Gaussian Distribution?
[Solution: M(t) = exp[(θ/μ){1 - √(1 - 2μ²t/θ)}]. ψ(t) = ln M(t) = (θ/μ){1 - √(1 - 2μ²t/θ)}.]
Exercise: Use the cumulant generating function to determine the variance of an Inverse Gaussian
Distribution.
[Solution: ψ′(t) = (θ/μ)(μ²/θ)(1 - 2tμ²/θ)^(-0.5) = μ(1 - 2tμ²/θ)^(-0.5). Mean = ψ′(0) = μ.
ψ″(t) = (μ³/θ)(1 - 2tμ²/θ)^(-1.5). Variance = ψ″(0) = μ³/θ.]

51
See Definition 6.18 and Theorem 6.19 in Loss Models, not on the syllabus.
52
See Kendall's Advanced Theory of Statistics Volume 1, by Stuart and Ord, or Practical Risk Theory for Actuaries,
by Daykin, Pentikainen and Pesonen.
53
The fourth and higher cumulants are not equal to the central moments.
54
See pages 387 and 403 of Actuarial Mathematics.
55
The 4th central moment and higher central moments do not add.


Exercise: Use the cumulant generating function to determine the skewness of an Inverse Gaussian
Distribution.
[Solution: ψ‴(t) = (μ³/θ)(3μ²/θ)(1 - 2tμ²/θ)^(-2.5). Third Central Moment = ψ‴(0) = 3μ⁵/θ².
Skewness = Third Central Moment / Variance^1.5 = (3μ⁵/θ²)/(μ³/θ)^1.5 = 3√(μ/θ).]
Aggregate Distributions:
Generating functions are useful for working with the distribution of aggregate losses, when frequency
and severity are independent.
Let A be Aggregate Losses, X be severity, and N be frequency. Then the Moment Generating
Function of the Aggregate Losses can be written in terms of the p.g.f. of the frequency and the m.g.f. of
the severity:
MA(t) = E[exp[tA]] = Σn E[exp[tA] | N = n] Prob(N = n) = Σn {E[exp[tX1]] ... E[exp[tXn]]} Prob(N = n)
= Σn MX(t)^n Prob(N = n) = EN[MX(t)^N] = PN(MX(t)).
MA(t) = PN(MX(t)) = MN(ln(MX(t))).


Exercise: Frequency is given by a Poisson with mean 7. Frequency and severity are independent.
What is the Moment Generating Function for the aggregate losses?
[Solution: As shown in Appendix B.2.1.1 of Loss Models, PN(z) = e^(λ(z-1)) = exp[7(z - 1)].
MA(t) = PN(MX(t)) = exp[7(MX(t) - 1)].]
In general, for any Compound Poisson distribution, MA(t) = exp[λ(MX(t) - 1)].
Exercise: Frequency is given by a Poisson with mean 7.
Severity is given by an Exponential with mean 1000. Frequency and severity are independent.
What is the Moment Generating Function for the aggregate losses?
[Solution: For the Exponential, MX(t) = 1/(1 - θt) = 1/(1 - 1000t), t < 0.001.
MA(t) = exp[λ(MX(t) - 1)] = exp[7{1/(1 - 1000t) - 1}] = e^(7000t/(1 - 1000t)), t < 0.001.]
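As a rough check of this compound Poisson m.g.f., the following Monte Carlo sketch (NumPy assumed) compares the formula with a simulated value of E[e^(tA)] at an illustrative t = 0.0002:

import numpy as np

rng = np.random.default_rng(0)
lam, theta, t = 7, 1000.0, 0.0002
formula = np.exp(lam * (1/(1 - theta*t) - 1))
n = rng.poisson(lam, size=100_000)
agg = np.array([rng.exponential(theta, k).sum() for k in n])
print(formula, np.exp(t*agg).mean())   # the two values should be close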
The p.g.f. of the Negative Binomial Distribution is: {1 - β(z - 1)}^-r. Thus for any Compound Negative
Binomial distribution, MA(t) = {1 - β(MX(t) - 1)}^-r, for MX(t) < 1 + 1/β.


The probability generating function of the Aggregate Losses can be written in terms of the p.g.f. of
the frequency and p.g.f. of the severity:56
PA(t) = PN(PX(t)).
Exercise: What is the p.g.f of an Exponential distribution?
[Solution: The m.g.f. of the Exponential distribution is: 1/(1 - θt). In general P(t) = M(ln(t)).
Thus for an Exponential distribution P(t) = 1/(1 - θ ln(t)).]
Exercise: Frequency is given by a Poisson with mean 7.
Severity is given by an Exponential with mean 1000. Frequency and severity are independent.
What is the Probability Generating Function for the aggregate losses?
[Solution: The p.g.f. of the Exponential distribution is: P(t) = 1/(1 - 1000 ln(t)), while that for the Poisson
is: P(z) = e^(7(z-1)). Thus the p.g.f. of aggregate losses is:
exp[7{1/(1 - 1000 ln(t)) - 1}] = exp[7000 ln(t)/(1 - 1000 ln(t))] = t^(7000/(1 - 1000 ln(t))).]
Recall that a compound frequency distribution is mathematically equivalent to an aggregate
distribution. Therefore, for a compound frequency distribution, PN(z) = P1 (P2 (z)).
Exercise: What is the p.g.f of a compound Geometric-Poisson frequency distribution?
[Solution: For the Geometric primary distribution: P1(t) = 1/{1 - β(t - 1)}. For the Poisson secondary
distribution: P2(t) = exp[λ(t - 1)]. PN(t) = P1(P2(t)) = 1/{1 - β(exp[λ(t - 1)] - 1)}.]
Exercise: Frequency is a compound Geometric-Poisson distribution, with β = 3 and λ = 7.
Severity is Exponential with mean 10. Frequency and severity are independent.
What is the p.g.f. of aggregate losses?
[Solution: The p.g.f. for the frequency is: PN(t) = 1/(1 - 3(exp[7(t - 1)] - 1)).
The p.g.f. for the Exponential severity is: PX(t) = 1/(1 - 10 ln(t)).
PA(t) = PN(PX(t)) = 1/(1 - 3(exp[7{1/(1 - 10 ln(t)) - 1}] - 1)) =
1/{1 - 3(exp[70 ln(t)/(1 - 10 ln(t))] - 1)} = 1/(4 - 3t^(70/(1 - 10 ln(t)))).]
In general, for a compound frequency distribution and an independent severity:
PA(t) = PN[PX(t)] = P1 [P2 (PX(t))].

56

This is the same result as for compound frequency distributions; the mathematics are identical. See Mahlers
Guide to Frequency Distributions.


The Laplace Transform for the aggregate distribution is: LA(z) = E[e^(-zA)] = PN[LX(z)].57
The Characteristic Function for the aggregate distribution is: φA(z) = E[e^(izA)] = PN[φX(z)].
Mixtures:
Exercise: Assume one has a two point mixture of distributions:
H(x) = pF(x) + (1-p)G(x). What is the Moment Generating Function of H?
[Solution: MH(t) = EH[ext] = pEF[ext] + (1-p)EG[ext] = pMF(t) + (1-p)MG(t).]
Thus the Moment Generating Function of a mixture is a mixture of the Moment
Generating Functions.58 In particular (1/4) + (3/4)(1/(1 - 40t)) is the m.g.f. of a 2-point mixture of a
point mass at zero and an Exponential distribution with mean 40, with weights 1/4 and 3/4.
Exercise: What is the m.g.f. of a 60%-40% mixture of Exponentials with means of 3 and 7?
[Solution: 0.6/(1 - 3t) + 0.4/(1 - 7t).]
One can apply these same ideas to continuous mixtures. For example, assume the frequency of
each insured is Poisson with parameter λ, with the parameters varying across the portfolio via a
Gamma distribution with parameters α and θ.59 Then the Moment Generating Function of the
frequency distribution for the whole portfolio is the mixture of the individual Moment Generating
Functions.
For a given value of λ, the Poisson has an m.g.f. of exp[λ(e^t - 1)].
The Gamma density of λ is: f(λ) = λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)}.
∫ exp[λ(e^t - 1)] λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)} dλ = {θ^(-α)/Γ(α)} ∫ λ^(α-1) exp[-λ(1 + 1/θ - e^t)] dλ =
{θ^(-α)/Γ(α)} Γ(α) (1 + 1/θ - e^t)^(-α) = (θ + 1 - θe^t)^(-α) = {1 - θ(e^t - 1)}^(-α).
This is the m.g.f. of a Negative Binomial Distribution, with r = α and β = θ. Therefore, the mixture of
Poissons via a Gamma, with parameters α and θ, is a Negative Binomial Distribution, with r = α and
β = θ.60
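A quick Monte Carlo sketch (NumPy assumed) of this Gamma-Poisson mixture, using illustrative values α = 2 and θ = 3, shows the mixed counts matching the mean rβ = 6 and variance rβ(1 + β) = 24 of the corresponding Negative Binomial:

import numpy as np

rng = np.random.default_rng(1)
alpha, theta = 2.0, 3.0
lam = rng.gamma(alpha, theta, size=500_000)     # Gamma-distributed Poisson means
counts = rng.poisson(lam)                       # the mixed frequency
print(counts.mean(), alpha*theta)               # both approximately 6
print(counts.var(), alpha*theta*(1 + theta))    # both approximately 24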

57
See page 205 of Loss Models.
58
This applies equally well to n-point mixtures and continuous mixtures of distributions.
59
This is the well known Gamma-Poisson frequency process.
60
This same result was derived using Probability Generating Functions in Mahler's Guide to Frequency
Distributions.


Policy Modifications:
Let F(x) be the ground up severity distribution.
Let PNumberLoss be the probability generating function of the number of losses.
Assume there is a deductible, d. Then the expected number of (non-zero) payments is less than
the expected number of losses.
The number of (non-zero) payments can be thought of as coming from a compound process.
First one generates a random number of losses. Then each loss has S(d) chance of being a nonzero payment, independent of any other loss. This is mathematically equivalent to a compound
distribution with secondary distribution that is Bernoulli with q = S(d).61
The probability generating function of this Bernoulli is P(z) = 1 + S(d)(z - 1) = F(d) + S(d)z.
Therefore, the probability generating function of this compound situation is:
PNumberPayments(z) = PNumberLoss(F(d) + S(d)z).62
With a deductible, the severity distribution is altered.63
The per loss variable is zero with probability F(d), and GPerLoss(y) = F(y + d) for y > 0.
Therefore, MPerLoss(t) = E[e^(ty)] = F(d)e^(t·0) + ∫0 to ∞ of e^(ty) f(y + d) dy = F(d) + ∫0 to ∞ of e^(ty) f(y + d) dy.
The distribution of the (non-zero) payments has been truncated and shifted from below at d.
GPerPayment(y) = {F(y + d) - F(d)}/S(d) for y > 0. gPerPayment(y) = f(y + d)/S(d) for y > 0.
Therefore, MPerPayment(t) = E[e^(ty)] = ∫0 to ∞ of e^(ty) f(y + d)/S(d) dy.
Therefore, MPerLoss(t) = F(d) + S(d) MPerPayment(t).64


As discussed previously, the aggregate distribution can be thought of either in terms of the per loss
variable or the per payment variable.
61
This same mathematical idea was used in proving thinning results in Mahler's Guide to Frequency Distributions.
62
See Section 8.6 and Equation 9.31 in Loss Models.
63
See Mahler's Guide to Loss Distributions.
64
See Equation 9.30 in Loss Models.


Therefore, MAggregate(t) = PNumberLoss(MPerLoss(t)), and


M Aggregate(t) = PNumberPayments(MPerPayment(t)).
PNumberPayments(MPerPayment(t)) = PNumberLoss[F(d) + S(d)MPerPayment(t)] =
PNumberLoss(MPerLoss(t)), confirming that these two versions of MAggregate(t) are equal.
One can compute the moment generating function after the effects of coverage modifications.
Exercise: Prior to the effects of a deductible, the sizes of loss follow an Exponential distribution with
mean 8000. For a deductible of 1000, determine the moment generating function for the size of the
non-zero payments by the insurer.
[Solution: MPerPayment(t) = E[e^(ty)] = ∫0 to ∞ of e^(ty) f(y + d)/S(d) dy =
∫0 to ∞ of e^(ty) {e^(-(y+1000)/8000)/8000} / e^(-1000/8000) dy =
(1/8000) ∫0 to ∞ of exp[-y(1/8000 - t)] dy = {1/(1/8000 - t)}/8000 = 1/(1 - 8000t), for t < 1/8000.
Comment: Due to the memoryless property of the Exponential, the non-zero payments also follow an
Exponential distribution with mean 8000.]
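The memoryless property used in this comment is easy to see in simulation. A small sketch (NumPy assumed):

import numpy as np

rng = np.random.default_rng(2)
losses = rng.exponential(8000, size=1_000_000)
payments = losses[losses > 1000] - 1000   # non-zero payments after a 1000 deductible
print(payments.mean(), payments.std())    # both approximately 8000, as for an Exponential with mean 8000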
Exercise: Prior to the effects of a deductible, the sizes of loss follow an Exponential distribution with
mean 8000. For a deductible of 1000, determine the moment generating function of the payment
per loss variable.
[Solution: MPerLoss(t) = F(d) + S(d) MPerPayment(t) =
1 - e-1000/8000 + e-1000/8000/(1 - 8000t), for t < 1/8000. ]


Exercise: Prior to the effects of a maximum covered loss, the sizes of loss follow an Exponential
distribution with mean 8000. For a maximum covered loss of 25,000, determine the moment
generating function for the size of payments by the insurer.
[Solution: For the data censored from above at 25,000, there is a density of:
e^(-x/8000)/8000 for x < 25,000,
and a point mass of probability of e^(-25000/8000) = e^(-3.125) at 25,000.
Therefore, M(t) = E[e^(xt)] = ∫0 to 25000 of {e^(-x/8000)/8000} e^(xt) dx + e^(-3.125) e^(25000t) =
(1 - e^(25000t - 3.125))/(1 - 8000t) + e^(25000t - 3.125) = {1 - 8000t e^(25000(t - 1/8000))}/(1 - 8000t).]


Problems:
4.1 (1 point) Let f(2) = 0.3, f(5) = 0.6, f(11) = 0.1. Let M be the moment generating function of f.
What is M(0.4)?
A. Less than 11
B. At least 11 but less than 12
C. At least 12 but less than 13
D. At least 13 but less than 14
E. At least 14
4.2 (2 points) An aggregate loss distribution is defined by
Prob(N = n) = 2n / (n! e2 ), n = 0, 1, 2,... and f(x) = 62.5x2 e-5x, x > 0.
What is the Moment Generating Function of the distribution of aggregate losses?
A. exp[2{1/(1 - 5t)3 - 1}]
B. exp[2{1/(1 - 5t) - 1}]
C. exp[2{1/(1 - t/5)3 - 1}]
D. exp[2{1/(1 - t/5) - 1}]
E. None of A, B, C, or D.
4.3 (2 points) Frequency is given by a Binomial Distribution with m =10 and q = 0.3.
The size of losses are either 100 or 250, with probability 80% and 20% respectively.
What is the Moment Generating Function for Aggregate Losses at 0.01?
A. Less than 1600
B. At least 1600 but less than 1700
C. At least 1700 but less than 1800
D. At least 1800 but less than 1900
E. At least 1900
4.4 (2 points) Y follows a Gamma Distribution with = 3 and = 100. Z = 40 + Y.
What is the Moment Generating Function of Z?
A. (1 - 140t)-3
B. (1 - 140e40t)-3
C. e40t (1 - 100t)-3
D. (1 - 100t e40t)-3
E. None of A, B, C, or D.


Use the following information for the next five questions:
Assume the m.g.f. of a distribution is: M(t) = exp[10{1 - √(1 - 0.6t)}].
4.5 (1 point) What is the mean of this distribution?
A. 2.8
B. 2.9
C. 3.0
D. 3.1
E. 3.2

4.6 (2 points) What is the variance of this distribution?


A. Less than 0.7
B. At least 0.7 but less than 0.8
C. At least 0.8 but less than 0.9
D. At least 0.9 but less than 1.0
E. At least 1.0
4.7 (3 points) What is the skewness of this distribution?
A. Less than 0.7
B. At least 0.7 but less than 0.8
C. At least 0.8 but less than 0.9
D. At least 0.9 but less than 1.0
E. At least 1.0
4.8 (1 point) After uniform inflation of 20%, what is the Moment Generating Function?
A. exp[12{1 - √(1 - 0.6t)}]
B. exp[12{1 - √(1 - 0.72t)}]
C. exp[10{1 - √(1 - 0.6t)}]
D. exp[10{1 - √(1 - 0.72t)}]
E. None of A, B, C, or D.
4.9 (1 point) X and Y each follow this distribution. X and Y are independent.
Z = X + Y. What is the Moment Generating Function of Z?
A. exp[10{1 - √(1 - 0.6t)}]
B. exp[40{1 - √(1 - 0.6t)}]
C. exp[10{1 - √(1 - 1.2t)}]
D. exp[40{1 - √(1 - 1.2t)}]
E. None of A, B, C, or D.


4.10 (3 points) The distribution of aggregate losses is compound Poisson with λ = 5.
The Moment Generating Function of Aggregate Losses is: M(t) = exp[5/(1 - 7t)^3 - 5].
What is the second moment of the severity distribution?
A. Less than 550
B. At least 550 but less than 600
C. At least 600 but less than 650
D. At least 650 but less than 700
E. At least 700
4.11 (2 points) What is the integral: ∫0 to ∞ of x^(-3/2) exp[-(a²x + b²/x)] dx?
Hint: make use of the fact that the density of an Inverse Gaussian Distribution integrates to unity from
zero to infinity.
A. √π e^(-ba) / b
B. √π e^(-2ba) / b
C. √π a e^(-ba) / b
D. √π a e^(-2ba) / b
E. None of A, B, C, or D.
4.12 (2 points) Calculate the Moment Generating Function for an Inverse Gaussian Distribution with
parameters μ and θ. Hint: Use the result of the previous problem.
A. exp[(θ/μ){1 - √(1 - tμ/θ)}], t ≤ θ/μ.
B. exp[(θ/μ){1 - √(1 - 2tμ/θ)}], t ≤ θ/(2μ).
C. exp[(θ/μ){1 - √(1 - tμ²/θ)}], t ≤ θ/μ².
D. exp[(θ/μ){1 - √(1 - 2tμ²/θ)}], t ≤ θ/(2μ²).
E. None of A, B, C, or D.


4.13 (2 points) Frequency is given by a Negative Binomial Distribution with r = 3 and β = 1.2.
The size of losses are uniformly distributed on (8, 23).
What is the Moment Generating Function for Aggregate Losses?
A. t3 / {1.2t - 0.08(e23t - e8t)}3
B. t3 / {2.2t - 0.08(e23t - e8t)}3
C. t3 / {1.2t - (e23t - e8t)/75}3
D. t3 / {2.2t - (e23t - e8t)/75}3
E. None of A, B, C, or D.
4.14 (2 points) The probability of snowfall on any day in January is 20%. If it snows during a day, the
amount of snowfall in inches that day is Gamma distributed with α = 3 and θ = 1.7.
Each day is independent of the others.
What is the Moment Generating Function for the amount of snow during January?
A. 0.2 + 0.8(1 - 1.7t)-93
B. {0.2 + 0.8(1 - 1.7t)-3}31
C. 0.8 + 0.2(1 - 1.7t)-93
D. {0.8 + 0.2(1 - 1.7t)-3}31
E. None of A, B, C, or D.
4.15 (3 points) The number of people in a group arriving at
The Restaurant at the End of the Universe is Logarithmic with β = 3.
The number of groups arriving per hour is Poisson with mean 10.
Determine the distribution of the total number of people arriving in an hour.
4.16 (2, 5/83, Q.35) (1.5 points) Let X be a continuous random variable with density function
be^(-bx) for x > 0, where b > 0.
If M(t) is the moment-generating function of X, then M(-6b) is which of the following?
A. 1/7
B. 1/5
C. 1/(7b)
D. 1/(5b)
E. +
4.17 (2, 5/83, Q.37) (1.5 points) Let X have the probability density function
f(x) = (8/9)^x / 9, for x = 0, 1, 2, . . . What is the moment-generating function of X?
A. 1/(9 - 8et)

B. 9/(9 - 8et)

C. 1/(8et)

D. 9/(8et)

E. 9 - 8et

4.18 (2, 5/85, Q.13) (1.5 points) Let the random variable X have moment-generating function
M(t) = 1 / (1 - t)2 , for t < 1. Find E(X3 ).
A. -24
B. 0
C. 1/4
D. 24
E. Cannot be determined from the information given.


4.19 (4B, 5/85, Q.47) (3 points) Given that M'(t) = r M(t) / [1 - (1 - p)e^t], where M(t) represents the
moment generating function of a distribution. Which of the following represents, respectively, the
mean and variance of this distribution?
A. mean = r/p        variance = r/p²
B. mean = r(1-p)/p   variance = r/p²
C. mean = r/p        variance = r(1-p)/p²
D. mean = r(1-p)/p   variance = r(1-p)/p²
E. None of the above

4.20 (2, 5/88, Q.25) (1.5 points) Let the random variable X have moment generating function
M(t) = exp[3t + t2 ]. What is E(X2 )?
A. 1
B. 2
C. 3

D. 9

E. 11

4.21 (2, 5/90, Q.12 and Course 1 Sample Exam, Q.26) (1.7 points)
Let X be a random variable with moment-generating function M(t) = {(2 + et)/3}9 for - < t < .
What is the variance of X?
A. 2
B. 3
C. 8
D. 9
E. 11
4.22 (5A, 5/95, Q.21) (1 point) Which of the following are true for the moment generating function
M S(t) for the aggregate claims distribution S = X1 + X2 +...+ XN?
1. If the Xis are independent and identically distributed with m.g.f. MX(t), and the
number of claims, N, is fixed, then MS(t) = MX(t)N.
2. If the Xis are independent and identically distributed with m.g.f. MX(t), and the
number of claims, N has m.g.f. MN(t), then MS(t) = MN[exp(MX(t))].
3. If the Xis are independent and identically distributed with m.g.f. MX(t), and N is
Poisson distributed, then MS(t) = exp[(MX(t) - 1)].
A. 1

B. 3

C. 1, 2

D. 1, 3

E. 2, 3

4.23 (2, 2/96, Q.30) (1.7 points) Let X and Y be two independent random variables with moment
generating functions MX(t) = exp[t2 + 2t], MY(t) = exp[3t2 + t].
Determine the moment generating function of X + 2Y.
A. exp[t2 + 2t] + 2exp[3t2 + t]

B. exp[t2 + 2t] + exp[12t2 + 2t]

C. exp[7t2 + 4t]

D. 2exp[4t2 + 3t]

E. exp[13t2 + 4t]


4.24 (5A, 5/96, Q.22) (1 point) Let M(t) denote the moment generating function of a claim amount
distribution. The number of claims distribution is Poisson with a moment generating function
exp[λ(exp(t) - 1)].
What is the moment generating function of the compound Poisson Distribution?
A. λ(M(t) - 1)
B. exp[exp[M(t) - 1]]
C. exp[M(t) - 1]
D. exp[λ[M(t) - 1]]
E. None of A, B, C, D

4.25 (Course 151 Sample Exam #1, Q.9) (1.7 points) For S = X1 + X2 + ... + XN:
(i) X1, X2, ... each has an exponential distribution with mean 1/β.
(ii) the random variables N, X1 , X2 ,... are mutually independent.
(iii) N has a Poisson distribution with mean 1.0.
(iv) MS(1.0) = 3.0.
Determine .
(A) 1.9

(B) 2.0

(C) 2.1

(D) 2.2

(E) 2.3

4.26 (1, 5/00, Q.35) (1.9 points) A company insures homes in three cities, J, K, and L.
Since sufficient distance separates the cities, it is reasonable to assume that the losses occurring in
these cities are independent.
The moment generating functions for the loss distributions of the cities are:
M J(t) = (1 - 2t)-3.

M K(t) = (1 - 2t)-2.5.

M L (t) = (1 - 2t)-4.5.

Let X represent the combined losses from the three cities. Calculate E(X3 ).
(A) 1,320
(B) 2,082
(C) 5,760
(D) 8,000
(E) 10,560
4.27 (IOA 101, 9/00, Q.9) (3.75 points) The size of a claim, X, which arises under a certain type
of insurance contract, is to be modeled using a gamma random variable with parameters α
and θ (both > 0) such that the moment generating function of X is given by
M(t) = (1 - θt)^-α, t < 1/θ.
By using the cumulant generating function of X, or otherwise, show that the coefficient of
skewness of the distribution of X is given by 2/√α.
4.28 (1, 11/00, Q.11) (1.9 points) An actuary determines that the claim size for a certain class of
accidents is a random variable, X, with moment generating function
M X(t) = 1 / (1 - 2500t)4 .
Determine the standard deviation of the claim size for this class of accidents.
(A) 1,340
(B) 5,000
(C) 8,660
(D) 10,000 (E) 11,180


4.29 (1, 11/00, Q.27) (1.9 points) Let X1 , X2 , X3 be a random sample from a discrete distribution
with probability function p(0) = 1/3 and p(1) = 2/3.
Determine the moment generating function, M(t), of Y = X1 X2 X3 .
(A) 19/27 + 8et/27
(B) 1 + 2et
(C) (1/3 + 2et/3)3
(D) 1/27 + 8e3t/27
(E) 1/3 + 2e3t/3
4.30 (IOA 101, 4/01, Q.5) (2.25 points) Show that the probability generating function for a
Binomial Distribution with parameters m and q is P(z) = (1 - q + qz)m.
Deduce the moment generating function.
4.31 (IOA 101, 4/01, Q.6) (2.25 points) Let X have a normal distribution with mean μ and
standard deviation σ, and let the ith cumulant of the distribution of X be denoted κi.
Given that the moment generating function of X is M(t) = exp[μt + σ²t²/2],
determine the values of κ2, κ3, and κ4.
4.32 (IOA 101, 4/01, Q.7) (1.5 points) The number of policies (N) in a portfolio at any one time is
modeled as a Poisson random variable with mean 10.
The number of claims (Xi) arising on a policy is also modeled as a Poisson random variable with
mean 2, independently for each policy and independent of N.
Determine the moment generating function for the total number of claims, Σi=1 to N of Xi,
arising for the portfolio of policies.


4.33 (1, 5/03, Q.39) (2.5 points) X and Y are independent random variables with common
moment generating function M(t) = exp[t2 /2]. Let W = X + Y and Z = Y - X.
Determine the joint moment generating function, M(t1 , t2 ), of W and Z.
(A) exp[2t1 2 + 2t2 2 ]

(B) exp[(t1 - t2 )2 ]

(D) exp[2t1 t2 ]

(E) exp[t1 2 + t2 2 ]

(C) exp[(t1 + t2 )2 ]


Solutions to Problems:
4.1. D. M(t) = E[ext] = .3e2t + .6e5t + .1e11t.
M(.4) = 0.3e.8 + 0.6e2 + 0.1e4.4 = 0.6676 + 4.4334 + 8.1451 = 13.25.
4.2. C. Frequency is Poisson, with λ = 2 and p.g.f.: P(z) = exp[2(z - 1)].
Severity is Gamma with α = 3 and θ = 1/5 and m.g.f.: 1/(1 - t/5)^3.
MA(t) = PN(MX(t)) = exp[2{1/(1 - t/5)^3 - 1}].
4.3. A. The p.g.f. of the Binomial Frequency is: P(z) = (1+.3(z-1))10. The m.g.f. for severity is:
M(t) = .8 e100t + .2e250t. MA(t) = PN(MX(t)) = (1 + .3( .8 e100t + .2e250t - 1))10.
M A(.01) = (1 + .3( .8 e1 + .2e2.5 - 1))10 = 1540.
4.4. C. The m.g.f. of Y is MY(t) = (1-100t)-3. E[ezt] = E[e(y+40)t] = E[eyt e40t] = e40t E[eyt].
Therefore, the m.g.f of Z is: MZ(t) = e40t MY(t) = e4 0 t (1-100t)- 3.
4.5. C. M(t) = exp[10{1 - √(1 - 0.6t)}]. M′(t) = M(t) 3/√(1 - 0.6t). Mean = M′(0) = 3.
4.6. D. M(t) = exp[10{1 - √(1 - 0.6t)}]. M′(t) = M(t) 3/√(1 - 0.6t). Mean = M′(0) = 3.
M″(t) = M′(t) 3/√(1 - 0.6t) + M(t) 0.9/(1 - 0.6t)^1.5.
Second moment = M″(0) = (3)(3) + 0.9 = 9.9. Variance = 9.9 - 3² = 0.9.
4.7. D. M″(t) = M′(t) 3/√(1 - 0.6t) + M(t) 0.9/(1 - 0.6t)^1.5.
M‴(t) = M″(t) 3/√(1 - 0.6t) + 2M′(t)(0.9)(1 - 0.6t)^(-3/2) + M(t)(0.81)(1 - 0.6t)^(-5/2). M″(0) = 9.9.
M‴(0) = (9.9)(3) + (2)(3)(0.9) + (1)(0.81) = 35.91 = third moment. Third central moment =
35.91 - (3)(9.9)(3) + 2(3³) = 0.81. Thus the skewness = 0.81/(0.9^1.5) = 0.949.
Comment: This is an Inverse Gaussian Distribution with μ = 3 and θ = 30, with mean 3,
variance 3³/30 = 0.9, and coefficient of variation: √0.9 / 3 = 0.3162.
The skewness of an Inverse Gaussian is three times the CV:
(3)(0.3162) = 0.949 = 3√(3/30) = 3√(μ/θ).


4.8. D. In general McX(t) = E[e^(tcx)] = MX(ct). In this case we have multiplied x by 1.20, so the new
m.g.f. is exp[10{1 - √(1 - 0.6(1.2t))}] = exp[10{1 - √(1 - 0.72t)}].
Alternately, prior to inflation this is an Inverse Gaussian Distribution with μ = 3 and θ = 30.
Under uniform inflation, both parameters are multiplied by the inflation factor, so after inflation we
have μ = 3.6 and θ = 36. The m.g.f. of an Inverse Gaussian is:
exp[(θ/μ){1 - √(1 - 2μ²t/θ)}]. After inflation this is: exp[10{1 - √(1 - 0.72t)}].
4.9. E. MX+Y(t) = MX(t) MY(t) = exp[10{1 - √(1 - 0.6t)}] exp[10{1 - √(1 - 0.6t)}] =
exp[20{1 - √(1 - 0.6t)}].
Comment: This is the moment generating function of another Inverse Gaussian, but with μ = 6 and
θ = 120, rather than μ = 3 and θ = 30. In general, the sum of two independent identically distributed
Inverse Gaussian Distributions with parameters μ and θ, is another Inverse Gaussian Distribution,
but with parameters 2μ and 4θ.
4.10. B. For a Compound Poisson, MA(t) = exp[λ(MX(t) - 1)]. Thus MX(t) = 1/(1 - 7t)^3.
This is the Moment Generating Function of a Gamma Distribution, with parameters α = 3 and θ = 7.
Thus it has a mean of: (3)(7) = 21, a variance of: 3(7²) = 147,
and second moment of: 147 + 21² = 588.
Alternately, MX′(t) = 21/(1 - 7t)^4. MX″(t) = 588/(1 - 7t)^5.
Second moment of the severity = MX″(0) = 588.
Alternately, MA′(t) = 105 MA(t)(1 - 7t)^-4. MA′(0) = 105 = mean of aggregate losses.
MA″(t) = 2940 MA(t)(1 - 7t)^-5 + 105 MA′(t)(1 - 7t)^-4.
MA″(0) = 2940 + (105)(105) = 13965 = 2nd moment of the aggregate losses.
Variance of the aggregate losses = 13965 - 105² = 2940.
For a compound Poisson, variance of the aggregate losses = λ(2nd moment of severity).
Therefore, 2nd moment of severity = 2940/5 = 588.
Comment: One could use the Cumulant Generating Function, which is defined as the natural log of
the Moment Generating Function. ψ(t) = ln M(t) = 5/(1 - 7t)^3 - 5.
ψ′(t) = 105/(1 - 7t)^4. ψ″(t) = 2940/(1 - 7t)^5. Variance = ψ″(0) = 2940.


4.11. B. The density of an Inverse Gaussian with parameters μ and θ is:
f(x) = √(θ/(2π)) x^(-1.5) exp[-θ(x/μ - 1)²/(2x)] = √(θ/(2π)) x^(-1.5) exp[-θx/(2μ²) + θ/μ - θ/(2x)].
Let a² = θ/(2μ²) and b² = θ/2; then θ = 2b² and μ = b/a.
Then f(x) = (b/√π) x^(-1.5) exp[-a²x + 2ba - b²/x] = e^(2ba) (b/√π) x^(-1.5) exp[-a²x - b²/x].
Since this integrates to unity: e^(2ba) (b/√π) ∫0 to ∞ of x^(-1.5) exp[-a²x - b²/x] dx = 1.
Therefore, ∫0 to ∞ of x^(-3/2) exp[-(a²x + b²/x)] dx = √π e^(-2ba) / b.
Comment: This is a special case of a Modified Bessel Function of the Third Kind, K-0.5.
See for example, Appendix C of Insurance Risk Models by Panjer & Willmot.
4.12. D. The Moment Generating Function is the expected value of e^(tx).
M(t) = ∫0 to ∞ of e^(tx) f(x) dx = e^(θ/μ) √(θ/(2π)) ∫0 to ∞ of x^(-1.5) exp[-{θ/(2μ²) - t}x - θ/(2x)] dx.
Provided t ≤ θ/(2μ²), let a² = θ/(2μ²) - t, and b² = θ/2.
Then the integral is of the type in the previous problem and has a value of: √π e^(-2ba) / b.
Therefore, M(t) = e^(θ/μ) √(θ/(2π)) {√π e^(-2ba) / b} =
e^(θ/μ) exp[-(θ/μ)√(1 - 2tμ²/θ)] = exp[(θ/μ){1 - √(1 - 2tμ²/θ)}].
We required that t ≤ θ/(2μ²), so that a² ≥ 0; M(t) only exists for t ≤ θ/(2μ²).
Comment: Note that ba = √(b²a²) = √{(θ/2)(θ/(2μ²) - t)} = {θ/(2μ)} √(1 - 2tμ²/θ).


4.13. B. The p.g.f. of the Negative Binomial Frequency is:


P(z) = (1 - 1.2(z-1))-3 = (2.2 - 1.2z)-3.
The m.g.f. for the uniform severity is: M(t) = (exp(23t) - exp(8t))/ (15t).
MA(t) = PN(MX(t)) = {2.2 - 1.2(exp(23t) - exp(8t))/(15t)}^-3 =
{2.2 - 0.08(exp(23t) - exp(8t))/t}^-3 = t^3 / {2.2t - 0.08(e^(23t) - e^(8t))}^3.
4.14. D. The frequency for a single day is Bernoulli, with q = .2 and p.g.f. P(z) = .8 + .2z.
The Gamma severity has m.g.f. M(t) = (1- 1.7t)-3. Thus the m.g.f of the aggregate losses for one
day is P(M(t)) = .8 +.2(1- 1.7t)-3. The m.g.f for 31 independent days is the 31st power of that for a
single day, {0.8 + 0.2(1 - 1.7t)- 3}3 1.
Alternately, the frequency for 31 days is Binomial with m = 31, q = .2, and p.g.f.
P(z) = (.8 + .2z)31. Thus the m.g.f. for the aggregate losses is P(M(t)) = (.8 +.2(1- 1.7t)-3)31.
4.15. This is a compound distribution with primary distribution a Poisson and secondary distribution
a Logarithmic. Alternately, it is a Compound Poisson with severity Logarithmic.
PLogarithmic(z) = 1 - ln[1 - β(z - 1)] / ln[1 + β].
PPoisson(z) = exp[λ(z - 1)].
PAggregate(z) = PPoisson[PLogarithmic(z)] = exp[-λ ln[1 - β(z - 1)] / ln[1 + β]]
= {exp[ln[1 - β(z - 1)]]}^(-λ/ln(1+β)) = {1 - β(z - 1)}^(-λ/ln(1+β)).
PNegativeBinomial(z) = {1 - β(z - 1)}^-r.
Thus the aggregate distribution is Negative Binomial, with r = λ/ln(1 + β) = 10/ln(4) = 7.213.
The aggregate number of people is Negative Binomial with r = 7.213 and β = 3.
Comment: Beyond what you should be asked on your exam.
See Example 6.14 in Loss Models, not on the syllabus,
or Section 6.8 in Insurance Risk Models by Panjer and Willmot, not on the syllabus.
The mean of the Logarithmic Distribution is 3/ln(4).
(Mean number of groups)(Average Size of Groups) = (10){3/ln(4)} = 30/ln(4) = {10/ln(4)}(3) =
Mean of the Negative Binomial.
The variance of the Logarithmic Distribution is: 3{4 - 3/ln(4)} / ln(4).
(Mean number of groups)(Variance of Size of Groups)
+ (Average Size of Groups)²(Variance of Number of Groups) =
(10)(3){4 - 3/ln(4)}/ln(4) + {3/ln(4)}²(10) = 120/ln(4) - 90/{ln(4)}² + 90/{ln(4)}² = 120/ln(4) =
{10/ln(4)}(3)(4) = Variance of the Negative Binomial.


4.16. A. An Exponential with θ = 1/b. M(t) = 1/(1 - θt) = 1/(1 - t/b). M(-6b) = 1/(1 + 6) = 1/7.
4.17. A. A Geometric Distribution with β = 8. P(z) = 1/{1 - β(z - 1)} = 1/(9 - 8z).
M(t) = P(e^t) = 1/(9 - 8e^t).
4.18. D. M‴(t) = 24/(1 - t)^5. E[X³] = M‴(0) = 24.
Alternately, this is a Gamma Distribution, with α = 2, θ = 1, and E[X³] = θ³α(α+1)(α+2) = 24.
4.19. C. The moment generating function is always unity at zero; M(t) = E[etX],
M(0) = E[1] = 1. The mean is M(0) = M(0)r/p = r/p. M(t) = d (r M(t) / [1-(1-p)et] / dt =
r{M(t)(1-(1-p)et) + M(t)(1-p)et}/ {1-(1-p)et}2 .
The second moment is M(0) = r{pM(0) + M(0)(1-p)} / p2 = r{r +(1-p)} / p2 .
Therefore, the variance = r{r +(1-p)} / p2 - (r/p)2 = r(1-p)/p2 .
Comment: This is the Moment Generating Function of the number of Bernoulli trials, each with
chance of success p, it takes until r successes. The derivation of the Negative Binomial Distribution
involves the number of failures rather than the number of trials. See Mahlers Guide to Frequency
Distributions. Thus the variable here is:
r + (a Negative Binomial with parameters r and = (1-p)/p).
This variable has mean: r + (r)(1-p)/p = r/p and variance: r(1+) = r (1-p)/p2 .
Note that the m.g.f of this variable is:
M(t) = ert (m.g.f of a Negative Binomial) = ert (1 - (et -1))-r = ert p r(1 - (1-p)(et))-r.
M(t) = rert p r(1 - (1-p)(et))-r + r(1-p)(et)ert p r(1 - (1-p)(et))-(r+1) =
r M(t) + r(1-p)et M(t) /[1- (1-p)et] = rM(t){ 1- (1-p)et + (1-p)et}/[1- (1-p)et] = rM(t)/[1- (1-p)et].
4.20. E. M′(t) = (3 + 2t)M(t). M″(t) = 2M(t) + (3 + 2t)²M(t). E[X²] = M″(0) = 2 + 9 = 11.
Comment: A Normal Distribution with mean 3 and variance 2. E[X²] = 3² + 2 = 11.


4.21. A. M(t) = 9et(2 + et)8 /39 . E[X] = M(0) = 3.


M(t) = 72et(2 + et)7 /39 + 9et(2 + et)8 /39 . E[X2 ] = M(0) = 8 + 3 = 11. Variance = 11 - 32 = 2.
Alternately, lnM(t) = 9 ln(2 + et) - 9 ln(3). d(lnM(t))/dt = 9et/(2 + et).
d2 (lnM(t))/dt2 = 9et/(2 + et) - 9e2t/(2 + et)2 . Var[X] = d2 (ln MX(t)) / dt2 | t =0 = 3 - 1 = 2.
Alternately, M(t) = {(2 + et)/3}9 P(z) = {(2 + z)/3}9 = (1 + (z - 1)/3)9 .
This is the probability generating function of a Binomial Distribution with q = 1/3 and m = 9.
Variance = mq(1-q) = (9)(1/3)(1 - 1/3) = 2.
4.22. D. 1. True. 2. False. MS(t) = PN(MX(t)) = MN(ln(MX(t))).
3. True. For a Poisson distribution, PN(z) = exp((z - 1)).
For a Compound Poisson, MS(t) = PN(MX(t)) = exp((MX(t)-1)).
4.23. E. M 2Y(t) = MY(2t) = exp[12t2 + 2t]. MX+2Y(t) = MX(t)M2Y(t) = exp[13t2 + 4t].
Comment: X is Normal with mean 2 and variance 2.
Y is Normal with mean 1 and variance 6. 2Y is Normal with mean 2 and variance 24.
X + 2Y is Normal with mean 4 and variance 26. For a Normal, M(t) = exp[t + 2t2 /2].
4.24. D. For a Compound Poisson, MA(t) = MN(ln(MX(t))) = exp[λ(MX(t) - 1)].
4.25. A. The moment generating function of aggregate losses can be written in terms of those of
the frequency and severity: MN(ln(MX(t))) = PN(MX(t)). For a Poisson Distribution, the probability
generating function is e^(λ(z-1)). For an Exponential Distribution with mean 1/β, the moment generating
function is 1/(1 - t/β) = β/(β - t). Therefore, the moment generating function of the aggregate losses is:
exp[λ(MX(t) - 1)] = exp[λ{β/(β - t) - 1}] = exp[λt/(β - t)]. In this case λ = 1, so MS(t) = exp[t/(β - t)].
We are given MS(1) = 3, so that 3 = exp[1/(β - 1)]. Therefore, β = 1 + 1/ln(3) = 1.91.
4.26. E. MX(t) = MJ(t) MK(t) ML(t) = (1 - 2t)^-10.
MX′(t) = 20(1 - 2t)^-11. MX″(t) = 440(1 - 2t)^-12. MX‴(t) = 10560(1 - 2t)^-13.
E[X³] = MX‴(0) = 10,560.
Alternately, the three distributions are each Gamma with θ = 2, and α = 3, 2.5, and 4.5.
Therefore, their sum is Gamma with θ = 2, and α = 3 + 2.5 + 4.5 = 10.
E[X³] = θ³ α(α+1)(α+2) = (8)(10)(11)(12) = 10,560.


4.27. ψ(t) = ln M(t) = -α ln(1 - θt). ψ′(t) = αθ/(1 - θt). ψ″(t) = αθ²/(1 - θt)².
ψ‴(t) = 2αθ³/(1 - θt)³.
Var[X] = ψ″(0) = αθ².
Third Central Moment of X = ψ‴(0) = 2αθ³.
Skewness = 2αθ³/(αθ²)^1.5 = 2/√α.
Comment: One could instead get the moments from the Appendix attached to the exam or use the
Moment Generating Function to get the moments. Then the Third Central Moment is:
E[X³] - 3E[X]E[X²] + 2E[X]³.
4.28. B. MX(t) = (1 - 2500t)-4. MX(t) = 10,000 (1 - 2500t)-5. E[X] = MX(0) = 10,000.
M X(t) = 125 million (1 - 2500t)-6. E[X2 ] = MX(0) = 125 million.
Standard deviation = 125 million - 100002 = 5000.
Alternately, ln MX(t) = -4ln(1 - 2500t). d ln MX(t)/ dt = 10000/(1 - 2500t).
d2 ln MX(t)/ dt2 = 25,000,000/(1- 2500t)2 .
Var[X] = d2 ln MX(0)/ dt2 = 25,000,000. Standard deviation = 5000.
Alternately, the distribution is Gamma with θ = 2500 and α = 4.
Variance = αθ² = (4)(2500²). Standard deviation = (2)(2500) = 5000.
4.29. A. Prob[Y = 1] = Prob[X1 = X2 = X3 = 1] = (2/3)3 = 8/27. Prob[Y = 0] = 1 - 8/27 = 19/27.
M Y(t) = E[eyt] = Prob[Y = 0] e0t + Prob[Y = 1] e1t = 19/27 + 8et /27.
4.30. P(z) = E[zn ] = f(n) zn . For a Bernoulli, P(z) = f(0)z0 + f(1)z1 = 1 - q + qz.
The Binomial is the sum of m independent, identically distributed Bernoullis, and therefore has
P(z) = (1 - q + qz)m.
M(t) = P(et) = (1 - q + qet)m.
4.31. The cumulant generating function is ψ(t) = ln M(t) = μt + σ²t²/2.
ψ′(t) = μ + σ²t. κ1 = ψ′(0) = μ = mean.
ψ″(t) = σ². κ2 = ψ″(0) = σ² = variance.
ψ‴(t) = 0. κ3 = ψ‴(0) = 0 = third central moment.
ψ⁗(t) = 0. κ4 = ψ⁗(0) = 0.
Comment: κi is the coefficient of t^i/i! in the cumulant generating function.


4.32. For the Poisson Distribution, P(z) = exp[λ(z - 1)]. M(t) = exp[λ(e^t - 1)].
Pprimary(z) = exp[10(z - 1)]. Msecondary(t) = exp[2(et - 1)].
M compound(t) = Pprimary[Msecondary(t)] = exp[10 {exp[2(et - 1)] - 1}].
Alternately, Pcompound(z) = Pprimary[Psecondary(z)] = exp[10 {exp[2(z - 1)] - 1}].
Then let z = et, in order to get the moment generating function of the compound distribution.
4.33. E. The joint moment generating function of W and Z:
M(t1 , t2 ) E[exp[t1 w + t2 z]] = E[exp[t1 (x+y) + t2 (y-x)]] = E[exp[y(t1 + t2 ) + x(t1 - t2 )]] =
E[exp[y(t1 + t2 )]] E[exp[x(t1 - t2 )]] = MY(t1 + t2 ) MX(t1 - t2 ) = exp[(t1 + t2 )2 /2] exp[(t1 - t2 )2 /2] =
exp[t1 2 /2 + t2 2 /2 + t1 t2 ] exp[t1 2 /2 + t2 2 /2 - t1 t2 ] = exp[t1 2 + t2 2 ].
Comment: Beyond what you should be asked on your exam.
X and Y are two independent unit Normals, each with mean 0 and standard deviation 1.
E[W] = 0. Var[W] = Var[X] + Var[Y] = 1 + 1 = 2. E[Z] = 0. Var[Z] = Var[X] + Var[Y] = 1 + 1 = 2.
Cov[W, Z] = Cov[X + Y, Y - X] = -Var[X] + Var[Y] + Cov[X, Y] - Cov[X, Y] = -1 + 1 = 0.
Corr[W, Z] = 0. W and Z are bivariate Normal, with μW = 0, σW² = 2, μZ = 0, σZ² = 2, ρ = 0.
For a bivariate Normal, M(t1, t2) = exp[μ1t1 + μ2t2 + σ1²t1²/2 + σ2²t2²/2 + ρσ1σ2t1t2].
See for example, Introduction to Probability Models, by Ross.


Section 5, Moments of Aggregate Losses


If one is not given the frequency per exposure, but is rather just given the frequency for the whole
number of exposures, whatever they are for the particular situation, then
Losses = (Frequency) (Severity).
Thus Mean Aggregate Loss = (Mean Frequency) (Mean Severity).
Exercise: The number of claims is given by a Negative Binomial Distribution with
r = 4.3 and β = 3.1. The size of claims is given by a Pareto Distribution with α = 1.7 and
θ = 1400. What is the expected aggregate loss?
[Solution: The mean frequency is rβ = (4.3)(3.1) = 13.33. The mean severity is θ/(α - 1) =
1400 / 0.7 = 2000. The expected aggregate loss is: (13.33)(2000) = 26,660.]
Since they depend on both the number of claims and the size of claims, aggregate losses have
more reasons to vary than do either frequency or severity individually. Random fluctuation occurs
when one rolls dice, spins spinners, picks balls from urns, etc. The observed result varies from time
period to time period due to random chance. This is also true for the aggregate losses observed for
a collection of insureds. The variance of the observed aggregate losses that occurs due to random
fluctuation is referred to as the process variance. That is what will be discussed here.65

Independent Frequency and Severity:


You are given the following:

The number of claims for a single exposure period is given by a Binomial Distribution
with q = 0.3 and m = 2.

The size of the claim will be 50, with probability 80%, or 100, with probability 20%.

Frequency and severity are independent.

65

The process variance is distinguished from the variance of the hypothetical means as discussed in Mahlers Guide
to Buhlmann Credibility.


Exercise: Determine the variance of the aggregate losses.


[Solution: List the possibilities and compute the first two moments:
Situation                      Probability   Aggregate Loss   Square of Aggregate Loss
0 claims                          49.00%            0                    0
1 claim @ 50                      33.60%           50                 2500
1 claim @ 100                      8.40%          100                10000
2 claims @ 50 each                 5.76%          100                10000
2 claims: 1 @ 50 & 1 @ 100         2.88%          150                22500
2 claims @ 100 each                0.36%          200                40000
Overall                          100.00%           36                 3048
For example, the probability of 2 claims is: 0.3² = 9%. We divide this 9% among the possible
claim sizes: 50 and 50 @ (0.8)(0.8) = 64%, 50 and 100 @ (0.8)(0.2) = 16%,
100 and 50 @ (0.2)(0.8) = 16%, 100 and 100 @ (0.2)(0.2) = 4%.
(9%)(64%) = 5.76%, (9%)(16% + 16%) = 2.88%, (9%)(4%) = 0.36%.
One takes the weighted average over all the possibilities. The average Pure Premium is 36.
The second moment of the Pure Premium is 3048.
Therefore, the variance of the pure premium is: 3048 - 36² = 1752.]
In this case, since frequency and severity are independent one can make use of the following
formula:66
(Process) Variance of Aggregate Loss =
(Mean Freq.)(Variance of Severity) + (Mean Severity)²(Variance of Freq.)
σAgg² = μF σX² + μX² σF².
Memorize this formula for the variance of the aggregate losses when frequency and severity are
independent. Note that each of the two terms has a mean and a variance, one from frequency and
one from severity. Each term is in dollars squared; that is one way to remember that the mean
severity (which is in dollars) enters as a square, while the mean frequency (which is not in dollars)
does not.
In the above example, the mean frequency is mq = 0.6 and the variance of the frequency is:
mq(1 - q) = (2)(0.3)(0.7) = 0.42. The average severity is 60 and the variance of the severity is:
(0.8)(10²) + (0.2)(40²) = 400. The process variance of the aggregate losses is:
(0.6)(400) + (60²)(0.42) = 1752, which matches the result calculated previously.
66
See equation 9.9 in Loss Models. Note Loss Models uses S for aggregate losses rather than A, and N for
frequency rather than F. I have used X for severity in order to follow Loss Models. This formula can also be used to
compute the process variance of the pure premium, when frequency and severity are independent.
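This formula can be confirmed by brute-force enumeration of the example above. A short sketch in plain Python (no special libraries assumed) follows:

from itertools import product

freq = {0: 0.49, 1: 0.42, 2: 0.09}   # Binomial m = 2, q = 0.3
sev = {50: 0.8, 100: 0.2}

m1 = m2 = 0.0
for n, pn in freq.items():
    for combo in product(sev.items(), repeat=n):
        p = pn
        total = 0.0
        for size, ps in combo:
            p *= ps
            total += size
        m1 += p * total
        m2 += p * total**2
print(m1, m2 - m1**2)   # approximately 36 and 1752, matching the text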


One can rewrite the formula for the process variance of the aggregate losses in terms of
coefficients of variation by dividing both sides by the square of the mean aggregate loss:67
CVA² = CVF² + CVX²/μF.
In the example above, CVF² = (2)(0.3)(0.7) / {(2)(0.3)}² = 1.167, CVX² = 400 / 60² = 0.111,
and therefore CVA² = 1.167 + 0.111/0.6 = 1.352 = 1752 / 36².
Thus the square of the coefficient of variation of the aggregate losses is the sum of the CV² for
frequency and the CV² for severity divided by the mean frequency.
An Example of Dependent Frequency and Severity:
On both the exam and in practical applications, frequency and severity are usually independent.
However, here is an example in which frequency and severity are dependent.
Assume that you are given the following:

The number of claims for a single exposure period will be either 0, 1, or 2:


Number of Claims Probability
0
60%
1
30%
2
10%

If only one claim is incurred, the size of the claim will be 50, with probability 80%;
or 100, with probability 20%.

If two claims are incurred, the size of each claim, independent of the other, will be 50,
with probability 50%; or 100, with probability 50%.
How would one determine the variance of the aggregate losses?
First list the aggregate losses and probability of each of the possible outcomes.
If there is no claim (60% chance) then the aggregate loss is zero.
If there is one claim, then the aggregate loss is either 50 with (30%)(80%) = 24% chance,
or 100 with (30%)(20%) = 6% chance.
If there are two claims then there are three possibilities. There is a (10%)(25%) = 2.5% chance that
there are two claims each of size 50 with an aggregate loss of 100. There is a (10%)(50%) = 5%
chance that there are two claims one of size 50 and one of size 100 with an aggregate loss of 150.
There is a (10%)(25%) = 2.5% chance that there are two claims each of size 100 with an aggregate
loss of 200.
Next, the first and second moments can be calculated by listing the aggregate losses for all the
possible outcomes and taking the weighted average using the probabilities as weights of either the
aggregate loss or its square.
67

The mean of the aggregate losses is the product of the mean frequency and the mean severity.

Situation                      Probability   Aggregate Loss   Square of Aggregate Loss
0 claims                          60.0%             0                    0
1 claim @ 50                      24.0%            50                 2500
1 claim @ 100                      6.0%           100                10000
2 claims @ 50 each                 2.5%           100                10000
2 claims: 1 @ 50 & 1 @ 100         5.0%           150                22500
2 claims @ 100 each                2.5%           200                40000
Overall                          100.0%            33                 3575

One takes the weighted average over all the possibilities. The average aggregate loss is 33.
The second moment of the aggregate losses is 3575.
Therefore, the variance of the aggregate losses is: 3575 - 33² = 2486.
Note that the frequency and severity are not independent in this case. Rather the severity
distribution depends on the number of claims. For example, the average severity if there is 1 claim
is 60, while the average severity if there are 2 claims is 75.
In general, one can calculate the variance of the aggregate losses in the above manner from the
second and first moments. The variance is: the second moment - (first moment)2 .
The first and second moments can be calculated by listing the aggregate losses for all the possible
outcomes and taking the weighted average, applying the probabilities as weights to either the
aggregate loss or its square. In continuous cases, this will involve taking integrals, rather than sums.
Policies of Different Types:
Let us assume we have a portfolio consisting of two types of policies:

Type
A
B

Number
of Policies
10
20

Mean Aggregate
Loss per Policy
6
9

Variance of Aggregate
Loss per Policy
3
4

Assuming the results of each policy are independent, then the mean aggregate loss for the portfolio
is: (10)(6) + (20)(9) = 240.
The variance of aggregate loss for the portfolio is: (10)(3) + (20)(4) = 110.68
For independent policies, the means and variances of the aggregate losses for each policy add.
The sum of the aggregate losses from two independent policies has the sum of the means and
variances of the aggregate losses for each policy.
68

Since we are given the variance of aggregate losses, there is no need to compute the variance of aggregate
losses from the mean frequency, variance of frequency, mean severity, and variance of severity.


Exercise: Compare the coefficient of variation of aggregate losses in the above example to that if
one had instead 100 policies of Type A and 200 policies of Type B.
[Solution: For the original example, CV = √110 / 240 = 0.0437.
For the new example, CV = √1100 / 2400 = 0.0138.
Comment: Note that as we have more policies, all other things being equal, the coefficient of
variation goes down.]
Exercise: For each of the two cases in the previous exercise, using the Normal Approximation
estimate the probability that the aggregate losses will be at least 5% more than their mean.
[Solution: For the original example, Prob[Agg. > 252] ≅ 1 - Φ[(252 - 240)/√110] = 1 - Φ[1.144] =
12.6%. For the new example, Prob[Agg. > 2520] ≅ 1 - Φ[(2520 - 2400)/√1100] =
1 - Φ[3.618] = 0.015%.]
For a larger portfolio, all else being equal, there is less chance of an extreme outcome in a given year
measured as a percentage of the mean.
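The two Normal approximation figures above can be reproduced directly; a quick sketch (SciPy assumed):

from scipy.stats import norm

print(1 - norm.cdf((252 - 240) / 110**0.5))      # approximately 0.126, the 12.6% above
print(1 - norm.cdf((2520 - 2400) / 1100**0.5))   # approximately 0.00015, the 0.015% above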


Derivation of the Formula for the Process Variance of the Aggregate Losses:
The above formula for the process variance of the aggregate losses for independent frequency and
severity is a special case of the formula that also underlies analysis of variance:
Var(Y) = EZ[VARY(Y | Z)] + VARZ(EY[Y | Z]), where Z and Y are any random variables.
Letting Y be the aggregate losses, A, and Z be the number of claims, N, in the above formula
gives:
Var(Agg) = EN[VARA(A | N)] + VARN(EA[A | N]) = EN[N σX²] + VARN(μX N) =
EN[N] σX² + μX² VARN(N) = μF σX² + μX² σF².
Where we have used the assumption that the frequency and severity are independent and the
facts:
For a fixed number of claims N, the variance of the aggregate losses is the variance of
the sum of N independent identically distributed variables each with variance σX².
(Since frequency and severity are assumed independent, σX² is the same for each
value of N.) Such variances add, so that VARA(A | N) = N σX².
For a fixed number of claims N, for frequency and severity independent, the expected
value of the aggregate losses is N times the mean severity: EA[A | N] = μX N.
Since with respect to N the variance of the severity acts as a constant:
EN[N σX²] = σX² EN[N] = μF σX².
Since with respect to N the mean of the severity acts as a constant:
VARN(μX N) = μX² VARN(N) = μX² σF².

Let's apply this derivation to the previous example. You were given the following:
• For a given risk, the number of claims is given by a Binomial Distribution with
q = 0.3 and m = 2.
• The size of the claim will be 50, with probability 80%, or 100, with probability 20%.
• Frequency and severity are independent.
There are only three possible values of N: N=0, N=1, or N=2. If N = 0 then A = 0. If N = 1 then
either A = 50 with 80% chance or A = 100 with 20% chance. If N = 2 then A = 100 with 64%
chance, A = 150 with 32% chance, or A = 200 with 4% chance.


We then get:

N       Probability    Mean A     Square of Mean    Second Moment    Var of A
                       Given N    of A Given N      of A Given N     Given N
0           49%            0               0                 0             0
1           42%           60           3,600             4,000           400
2            9%          120          14,400            15,200           800
Mean                      36           2,808                             240

For example, given two claims, the second moment of the aggregate losses is:
(64%)(100²) + (32%)(150²) + (4%)(200²) = 15,200. Thus given two claims the variance of the
aggregate losses is: 15,200 - 120² = 800.
Thus E_N[Var(A | N)] = 240, and Var_N(E[A | N]) = 2808 - 36² = 1512. Thus the variance of the
aggregate losses is E_N[Var(A | N)] + Var_N(E[A | N]) = 240 + 1512 = 1752, which matches
the result calculated above. The (total) process variance of the aggregate losses has been split into
two pieces. The first piece, calculated as 240, is the expected value over the possible numbers of
claims of the process variance of the aggregate losses for fixed N. The second piece, calculated as
1512, is the variance over the possible numbers of claims of the mean aggregate loss for fixed N.
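A brief Python sketch of this decomposition for the same example:

```python
# Verify Var(Agg) = E_N[Var(A|N)] + Var_N(E[A|N]) for Binomial(m=2, q=0.3) frequency,
# severity 50 (prob 0.8) or 100 (prob 0.2), with frequency and severity independent.
n_probs = {0: 0.49, 1: 0.42, 2: 0.09}
mean_sev, second_moment_sev = 60.0, 4000.0
var_sev = second_moment_sev - mean_sev**2                      # 400

mean_given_n = {n: n * mean_sev for n in n_probs}              # E[A|N] = N * E[X]
var_given_n = {n: n * var_sev for n in n_probs}                # Var(A|N) = N * Var(X)

expected_var = sum(p * var_given_n[n] for n, p in n_probs.items())          # 240
var_of_mean = (sum(p * mean_given_n[n]**2 for n, p in n_probs.items())
               - sum(p * mean_given_n[n] for n, p in n_probs.items())**2)   # 1512
print(expected_var + var_of_mean)                              # 1752
```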
Poisson Frequency:
Assume you are given the following:
• For a given risk, the number of claims for a year is Poisson with mean 7.
• The size of the claim will be 50, with probability 80%, or 100, with probability 20%.
• Frequency and severity are independent.
Exercise: Determine the variance of the aggregate losses for this risk.
[Solution: μ_F = σ_F² = 7. μ_X = 60. σ_X² = 400.
σ_A² = μ_F σ_X² + μ_X² σ_F² = (7)(400) + (60²)(7) = 28,000.]
In the case of a Poisson Frequency with independent frequency and severity, the formula for the
process variance of the aggregate losses simplifies. Since μ_F = σ_F² = λ:
σ_A² = μ_F σ_X² + μ_X² σ_F² = λ(σ_X² + μ_X²) = λ(2nd moment of the severity).
The variance of a Compound Poisson is: λ(2nd moment of the severity).


The variance of a Compound Poisson is: (2nd moment of severity).


In the example above, the second moment of the severity is: (0.8)(50²) + (0.2)(100²) = 4000.
Thus σ_A² = λ(2nd moment of the severity) = (7)(4000) = 28,000.
As a final example, assume you are given the following:
• For a given risk, the number of claims for a year is Poisson with mean 3645.
• The severity distribution is LogNormal, with parameters μ = 5 and σ = 1.5.
• Frequency and severity are independent.
Exercise: Determine the variance of the aggregate losses for this risk.
[Solution: The second moment of the severity = exp(2μ + 2σ²) = exp(14.5) = 1,982,759.
Thus σ_A² = λ(2nd moment of the severity) = (3645)(1,982,759) = 7.22716 x 10⁹.]
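A quick check of this calculation in Python, using the LogNormal second moment exp(2μ + 2σ²):

```python
import math

# Compound Poisson: variance of aggregate losses = lambda * (second moment of severity).
lam, mu, sigma = 3645, 5.0, 1.5
second_moment_severity = math.exp(2 * mu + 2 * sigma**2)   # about 1,982,759
print(lam * second_moment_severity)                        # about 7.227e9
```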
Formula in Terms of Moments of Severity:
It may sometimes be useful to rewrite the variance of the aggregate loss in terms of the first and
second moments of the severity:
σ_A² = μ_F σ_X² + μ_X² σ_F² = μ_F(E[X²] - E[X]²) + E[X]² σ_F² = μ_F E[X²] + E[X]²(σ_F² - μ_F).
For a Poisson frequency distribution the final term is zero, and σ_A² = λE[X²].
For a Negative Binomial frequency distribution, σ_A² = rβE[X²] + E[X]²rβ².
For a Binomial frequency distribution, σ_A² = mqE[X²] - E[X]²mq².
Normal Approximation:
For frequency and severity independent, for large numbers of expected claims, the observed
aggregate losses are approximately normally distributed. The more skewed the severity
distribution, the higher the expected frequency has to be for the Normal Approximation to produce
worthwhile results.


For example, continuing the example above, the mean Poisson frequency is 3645,
and the mean severity is: exp[μ + 0.5σ²] = exp(6.125) = 457.14.
Thus the mean aggregate losses are: (3645)(457.14) = 1,666,292.
One could ask what the chance is of the observed aggregate losses being between 1.4997 million
and 1.8329 million. Since the variance of the aggregate losses is 7.22716 x 10⁹, the standard
deviation of the aggregate losses is 85,013.
Thus the probability of the observed aggregate losses being within 10% of 1.6663 million is
approximately: Φ[(1.8329 million - 1.6663 million) / 85,013] - Φ[(1.4997 million - 1.6663 million) / 85,013] =
Φ[1.96] - Φ[-1.96] = 0.975 - (1 - 0.975) = 95%.
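For those who like to verify such Normal Approximation calculations with a short script, here is a minimal Python sketch of this computation:

```python
from math import erf, sqrt, exp

def norm_cdf(z):
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

lam = 3645
mean_sev = exp(5 + 0.5 * 1.5**2)              # 457.14
mean_agg = lam * mean_sev                     # about 1,666,292
sd_agg = sqrt(lam * exp(2 * 5 + 2 * 1.5**2))  # about 85,013
prob_within_10pct = (norm_cdf(0.10 * mean_agg / sd_agg)
                     - norm_cdf(-0.10 * mean_agg / sd_agg))
print(round(prob_within_10pct, 3))            # about 0.95
```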
LogNormal Approximation:
When one has other than a large number of expected claims, the distribution of aggregate losses
typically has a significantly positive skewness. Therefore, it makes sense to approximate the
aggregate losses with a distribution that also has a positive skewness.69 Loss Models illustrates
how to use a LogNormal Distribution to approximate the Aggregate Distribution.70 One applies the
method of moments to fit a LogNormal Distribution with the same mean and variance as the
Aggregate Losses.
Exercise: For a given risk, the number of claims for a year is Negative Binomial with
β = 3.2 and r = 14. The severity distribution is Pareto with parameters α = 2.5 and θ = 10.
Frequency and severity are independent.
Determine the mean and variance of the aggregate losses.
[Solution: The mean frequency is: (3.2)(14) = 44.8. The variance of frequency is:
(3.2)(1 + 3.2)(14) = 188.16. The mean severity is: θ/(α - 1) = 10/(2.5 - 1) = 6.667.
The second moment of severity is: 2θ²/{(α - 1)(α - 2)} = 200/{(1.5)(0.5)} = 266.67.
The variance of the severity is: 266.67 - 6.667² = 222.22.
Thus the mean of the aggregate losses is: (44.8)(6.667) = 298.7.
The variance of the aggregate losses is: (44.8)(222.22) + (6.667²)(188.16) = 18,319.]

69 The Normal Distribution, being symmetric, has zero skewness.
70 See Example 9.4 of Loss Models. Actuarial Mathematics, at pages 388-389, not on the Syllabus, demonstrates
how to use a translated Gamma Distribution. "Approximations of the Aggregate Loss Distribution," by Papush,
Patrik, and Podgaits, CAS Forum Winter 2001, recommends that if one uses a 2 parameter distribution, one use the
Gamma Distribution. Loss Models mentions that one could match more than the first two moments by using
distributions with more than two parameters.


Exercise: Fit a LogNormal Distribution to the aggregate losses in the previous exercise, by matching
the first two moments.
[Solution: mean = 298.7. second moment = 18,319 + 298.7² = 107,541.
Matching the mean of the LogNormal and the data: exp(μ + 0.5σ²) = 298.7.
Matching the second moment of the LogNormal and the data: exp(2μ + 2σ²) = 107,541.
Divide the second equation by the square of the first equation:
exp(2μ + 2σ²) / exp(2μ + σ²) = exp(σ²) = 1.205.
σ² = ln(1.205) = 0.1867, so σ = 0.432. μ = ln(298.7) - σ²/2 = 5.606.
Comment: The Method of Moments, as discussed in "Mahler's Guide to Fitting Loss Distributions."]
Then one can use the approximating LogNormal Distribution to answer questions about the
aggregate losses.
Exercise: Using the LogNormal approximation, estimate the probability that the aggregate losses
are less than 500.
[Solution: Φ[(ln(500) - 5.606)/0.432] = Φ[1.41] = 0.9207.]
Exercise: Using the LogNormal approximation, estimate the probability that the aggregate losses
are between 200 and 500.
[Solution: Φ[(ln(500) - 5.606)/0.432] - Φ[(ln(200) - 5.606)/0.432] = Φ[1.41] - Φ[-0.71] =
0.9207 - 0.2389 = 0.6818.]
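Here is a minimal Python sketch of this method of moments fit and of the resulting probability estimate:

```python
import math

def fit_lognormal(mean, variance):
    # Method of moments: exp(mu + sigma^2/2) = mean, exp(2*mu + 2*sigma^2) = second moment.
    second_moment = variance + mean**2
    sigma2 = math.log(second_moment / mean**2)
    return math.log(mean) - sigma2 / 2, math.sqrt(sigma2)    # (mu, sigma)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = fit_lognormal(298.7, 18319)
print(round(mu, 3), round(sigma, 3))                          # about 5.606 and 0.432
print(round(norm_cdf((math.log(500) - mu) / sigma), 4))       # P[Agg < 500], about 0.92
```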
Higher Moments of the Aggregate Losses:
When frequency and severity are independent, just as one can write the variance or coefficient of
variation of the aggregate loss distribution in terms of quantities involving frequency and severity,
one can write higher moments in this manner. For example, the third central moment of the aggregate
losses can be written as:71
third central moment of the aggregate losses =
(mean frequency)(3rd central moment of severity) +
3(variance of frequency)(mean severity)(variance of severity) +
(mean severity)3 (3rd central moment of frequency).
Note that each term is in dollars cubed.
71 See Equation 9.9 in Loss Models. Also, see either Actuarial Mathematics or Practical Risk Theory for Actuaries, by
Daykin, et al. As shown in the latter, one can derive this formula via the cumulant generating function.


This formula can be written in terms of skewnesses as follows:
E[(A - E[A])³] = μ_F σ_X³ γ_X + 3σ_F² μ_X σ_X² + σ_F³ γ_F μ_X³.
Therefore, the skewness of the aggregate losses is:
γ_A = {μ_F σ_X³ γ_X + 3σ_F² μ_X σ_X² + σ_F³ γ_F μ_X³} / σ_A³.
This can also be written in terms of coefficients of variation rather than variances:
γ_A = {(CV_X³ γ_X / μ_F²) + (3 CV_F² CV_X² / μ_F) + CV_F³ γ_F} / CV_A³.

Exercise: If the frequency is Negative Binomial with r = 27 and β = 7/3, then what are the mean,
coefficient of variation and skewness?
[Solution: μ_F = rβ = 63, CV_F = √[(1 + β) / (rβ)] = 0.2300, and γ_F = (1 + 2β) / √[rβ(1 + β)] = 0.391.]

Exercise: If the severity is given by a Pareto Distribution with α = 4 and θ = 3, then what are the
mean, coefficient of variation and skewness?
[Solution: E[X] = θ/(α - 1) = 1. E[X²] = 2θ²/{(α - 1)(α - 2)} = 3. E[X³] = 6θ³/{(α - 1)(α - 2)(α - 3)} = 27.
μ_X = E[X] = 1. Var[X] = E[X²] - E[X]² = 2. CV_X = √2 / 1 = 1.414.
3rd central moment = E[X³] - 3E[X]E[X²] + 2E[X]³ = 20. γ_X = 20/2^1.5 = 7.071.]
Exercise: The frequency is Negative Binomial with r = 27 and β = 7/3.
The severity is Pareto with α = 4 and θ = 3. Frequency and severity are independent.
What are the mean, coefficient of variation and skewness of the aggregate losses?
[Solution: μ_A = μ_F μ_X = 63, CV_A² = CV_F² + CV_X²/μ_F = 0.0846 and therefore CV_A = 0.291, and
γ_A = {(CV_X³ γ_X / μ_F²) + (3 CV_F² CV_X² / μ_F) + CV_F³ γ_F} / CV_A³ =
{(1.414³)(7.071)/63² + 3(1.414²)(0.23²)/63 + (0.23³)(0.391)} / 0.291³ = 0.0148/0.0246 = 0.60.
Note that the variance of the aggregate losses in this case is:
σ_A² = μ_F σ_X² + μ_X² σ_F² = (63)(2) + (1)(210) = 336.
Comment: Actuarial Mathematics in Table 12.5.1, not on the Syllabus, has the following formula for
the third central moment of a Compound Negative Binomial:
r{βE[X³] + 3β²E[X]E[X²] + 2β³E[X]³}. In this case, this formula gives a third central moment of:
(27){(7/3)(27) + (3)(7/3)²(1)(3) + (2)(7/3)³(1³)} = 3710.
Thus the skewness is: 3710 / 336^1.5 = 0.60.]


As the expected number of claims increases, the skewness of the aggregate losses approaches 0, making
the Normal Approximation better. As discussed above, when the skewness of the aggregate
losses is significant, one can approximate the aggregate losses via a LogNormal.72
For the Poisson Distribution the mean = λ, the variance = λ, and the skewness is 1/√λ.73
Therefore, for a Poisson Frequency, σ_F³ γ_F = λ^1.5 / √λ = λ, and the third central moment of the
aggregate losses is: E[(A - E[A])³] = μ_F σ_X³ γ_X + 3σ_F² μ_X σ_X² + σ_F³ γ_F μ_X³ =
λ(third central moment of severity) + 3λ μ_X σ_X² + λ μ_X³ =
λ{E[X³] - 3μ_X E[X²] + 2μ_X³ + 3μ_X(E[X²] - μ_X²) + μ_X³} = λ(third moment of the severity).
The Third Central Moment of a Compound Poisson Distribution is:
(mean frequency)(third moment of the severity).
For a Poisson Frequency, the variance of the aggregate losses is: λ(2nd moment of severity).
Therefore, skewness of a compound Poisson =
(third moment of the severity) / {√λ (2nd moment of severity)^1.5}.
Exercise: Frequency is Poisson with mean 3.1. Severity is discrete with: P[X=100] = 2/3,
P[X=500] = 1/6, and P[X=1000] = 1/6. Frequency and Severity are independent.
What is the skewness of the distribution of aggregate losses?
[Solution: The second moment of the severity is:
(2/3)(100²) + (1/6)(500²) + (1/6)(1000²) = 215,000. The third moment of the severity is:
(2/3)(100³) + (1/6)(500³) + (1/6)(1000³) = 188,166,667.
The skewness of a compound Poisson =
(third moment of the severity) / {√λ (2nd moment of severity)^1.5} =
188,166,667 / {√3.1 (215,000)^1.5} = 1.072.
Comment: Since the skewness is a dimensionless quantity that does not depend on the scale, we
would have gotten the same answer if we had instead worked with a severity distribution with all of
the amounts divided by 100: P[X=1] = 2/3, P[X=5] = 1/6, and P[X=10] = 1/6.]
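A quick Python check of this skewness calculation:

```python
import math

lam = 3.1
sizes_probs = [(100, 2/3), (500, 1/6), (1000, 1/6)]
ex2 = sum(p * x**2 for x, p in sizes_probs)      # 215,000
ex3 = sum(p * x**3 for x, p in sizes_probs)      # 188,166,667
print(round(ex3 / (math.sqrt(lam) * ex2**1.5), 3))   # skewness of a compound Poisson, about 1.072
```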

The Kurtosis of a Compound Poisson Distribution is: 3 + E[X⁴] / {λ E[X²]²}.
72 As discussed in Actuarial Mathematics, when the skewness of the aggregate losses is significant, one can
approximate with a translated Gamma Distribution rather than a Normal Distribution.
73 See "Mahler's Guide to Frequency Distributions."


Per Claim Deductibles:74
If frequency and severity are independent, then the aggregate losses depend on the number of
losses greater than the deductible amount and the size of the losses truncated and shifted by the
deductible amount. If frequency is Poisson with parameter λ, then the losses larger than d are also
Poisson, but with parameter λS(d).75
Exercise: Frequency is Poisson with λ = 10.
If 19.75% of losses are large, what is the frequency distribution of large losses?
[Solution: It is Poisson with λ = (10)(19.75%) = 1.975.]
If frequency is Negative Binomial with parameters β and r, then the losses larger than d are also
Negative Binomial, but with parameters βS(d) and r.76
Exercise: Frequency is Negative Binomial with r = 2.4 and β = 1.1.
If 77.88% of losses are large, what is the frequency distribution of large losses?
[Solution: It is Negative Binomial with r = 2.4 and β = (1.1)(0.7788) = 0.8567.]
One can then look at the non-zero payments by the insurer. Their sizes are distributed as the original
distribution truncated and shifted by d.77 The mean of the non-zero payments = the mean of the
severity distribution truncated and shifted by d: {E[X] - E[X ∧ d]} / S(d).
Exercise: For a Pareto with α = 4 and θ = 1000, compute the mean of the non-zero payments given
a deductible of 500.
[Solution: For the Pareto, E[X] = θ/(α - 1) = 1000/3 = 333.33. The limited expected value is
E[X ∧ 500] = {θ/(α - 1)} {1 - (θ/(θ + 500))^(α-1)} = 234.57. S(500) = 1/(1 + 500/1000)⁴ = 0.1975.
(E[X] - E[X ∧ 500])/S(500) = (333.33 - 234.57)/0.1975 = 500.
Alternately, the mean of the data truncated and shifted at 500 is the mean excess loss at 500.
For the Pareto, e(x) = (x + θ)/(α - 1). e(500) = (500 + 1000)/(4 - 1) = 500.
Alternately, the distribution truncated and shifted (from below) at 500 is
G(x) = {F(x + 500) - F(500)}/S(500) = {(1 + 500/1000)⁻⁴ - (1 + (x + 500)/1000)⁻⁴}/(1 + 500/1000)⁻⁴ =
1 - (1 + x/1500)⁻⁴. This is a Pareto with α = 4 and θ = 1500, and mean 1500/(4 - 1) = 500.]
74 A per claim deductible operates on each loss individually. This should be distinguished from an aggregate
deductible, which applies to the aggregate losses, as discussed below in the section on stop loss premiums.
75 Where S(d) is the survival function of the severity distribution prior to the impact of the deductible.
76 See "Mahler's Guide to Frequency Distributions."
77 See "Mahler's Guide to Loss Distributions."


Thus a Pareto truncated and shifted at d is another Pareto, with parameters α and
θ + d. Therefore the above Pareto distribution truncated and shifted at 500, with parameters 4 and
1000 + 500 = 1500, has a variance of (1500²)(4)/{(4 - 2)(4 - 1)²} = 500,000.
The Exponential distribution has a similar nice property. An Exponential Distribution truncated and
shifted at d is another Exponential with the same mean. Thus if one has an Exponential with
θ = 2000, and one truncates and shifts at 500 (or any other value), one gets another Exponential with
θ = 2000. Thus the mean of the severity truncated and shifted is 2000, and the variance is:
2000² = 4,000,000.
For any severity distribution, given a deductible of d, the variance of the non-zero payments =
the variance of the severity distribution truncated and shifted by d is:78
{E[X²] - E[(X ∧ d)²] - 2d(E[X] - E[X ∧ d])}/S(d) - [{E[X] - E[X ∧ d]}/S(d)]².
Exercise: For a Pareto with α = 4 and θ = 1000, use the above formula to compute the variance of
the non-zero payments given a deductible of 500.
[Solution: E[X²] = (1000²)(2)/{(4 - 1)(4 - 2)} = 333,333.
E[(X ∧ 500)²] = E[X²] {1 - (1 + 500/θ)^(1-α) [1 + (α - 1)500/θ]} = 86,420.
E[X ∧ 500] = 234.57. E[X] = θ/(α - 1) = 1000/3 = 333.33. S(500) = 0.1975.
{E[X²] - E[(X ∧ d)²] - 2d(E[X] - E[X ∧ d])}/S(d) - [{E[X] - E[X ∧ d]}/S(d)]² =
{333,333 - 86,420 - (2)(500)(333.33 - 234.57)}/0.1975 - 500² = 500,000.]
We can then combine the frequency and severity after the effects of the per claim deductible, in
order to work with the aggregate losses.
Exercise: Frequency is Poisson with λ = 10. Severity is Pareto with α = 4 and θ = 1000.
Severity and frequency are independent. There is a per claim deductible of 500.
What are the mean and variance of the aggregate losses excess of the deductible?
[Solution: The frequency of non-zero payments is Poisson with mean (0.1975)(10) = 1.975.
The severity of non-zero payments is Pareto with α = 4 and θ = 1500.
The mean aggregate loss is (1.975)(1500/3) = 987.5.
The variance of this compound Poisson is:
1.975 (2nd moment of Pareto) = (1.975){(2)(1500²) / [(4 - 1)(4 - 2)]} = 1,481,250.]

78 See "Mahler's Guide to Loss Distributions."


Exercise: Frequency is Negative Binomial with r = 2.4 and β = 1.1. Severity is Exponential with
θ = 2000. Severity and frequency are independent. There is a per claim deductible of 500.
What are the mean and variance of the aggregate losses excess of the deductible?
[Solution: S(500) = e^(-500/2000) = 0.7788. The frequency of non-zero payments is: Negative
Binomial with r = 2.4 and β = (1.1)(0.7788) = 0.8567. The mean frequency of non-zero payments
is: (2.4)(0.8567) = 2.056. The variance of the number of non-zero payments is:
(2.4)(0.8567)(1.8567) = 3.818. The severity of non-zero payments is Exponential with θ = 2000.
The mean non-zero payment is 2000.
The variance of the size of non-zero payments is 2000² = 4 million.
The mean of the aggregate losses excess of the deductible is: (2.056)(2000) = 4112.
The variance of the aggregate losses excess of the deductible is:
(2.056)(4 million) + (3.818)(2000²) = 23.5 million.
Comment: See Course 3 Sample Exam, Q.20.]
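Here is a short Python sketch of this recipe for the Negative Binomial / Exponential example just worked; it thins the frequency by S(d) and uses the memoryless property of the Exponential for the non-zero payment severity:

```python
import math

# Negative Binomial frequency (r, beta), Exponential severity (theta), per claim deductible d.
r, beta, theta, d = 2.4, 1.1, 2000.0, 500.0

s_d = math.exp(-d / theta)             # probability a loss exceeds the deductible
beta_thinned = beta * s_d              # non-zero payments: Negative Binomial (r, beta * S(d))
freq_mean = r * beta_thinned
freq_var = r * beta_thinned * (1 + beta_thinned)

sev_mean, sev_var = theta, theta**2    # Exponential truncated and shifted at d is Exponential(theta)

agg_mean = freq_mean * sev_mean
agg_var = freq_mean * sev_var + sev_mean**2 * freq_var
print(round(agg_mean), round(agg_var / 1e6, 1))   # about 4112 and 23.5 (million)
```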
Maximum Covered Losses:79
Assume the severity follows a LogNormal Distribution with parameters μ = 8 and σ = 2. Assume
frequency is Poisson with λ = 100. The mean severity is: exp(μ + σ²/2) = e^10 = 22,026.47, and the
mean aggregate losses are: 2,202,647. The second moment of the severity is: exp(2μ + 2σ²) = e^24
= 26.49 billion. Thus the variance of the aggregate losses is: (100)(26.49 billion) = 2649 billion.
Exercise: If there is a $250,000 maximum covered loss, what are the mean and variance of the
aggregate losses paid by the insurer?
[Solution: E[X ∧ x] = exp(μ + σ²/2)Φ[(lnx - μ - σ²)/σ] + x{1 - Φ[(lnx - μ)/σ]}.
E[X ∧ 250,000] = e^10 Φ[(ln(250,000) - 8 - 4)/2] + (250,000){1 - Φ[(ln(250,000) - 8)/2]} =
(22,026)Φ[0.2146] + (250,000)(1 - Φ[2.2146]) =
(22,026)(0.5850) + (250,000)(1 - 0.9866) = 16,235.
E[(X ∧ x)²] = exp[2μ + 2σ²]Φ[{ln(x) - (μ + 2σ²)}/σ] + x²{1 - Φ[{ln(x) - μ}/σ]}.
E[(X ∧ 250,000)²] = exp(24)Φ[-1.7854] + 62.5 billion {1 - Φ[2.2146]} =
(26.49 billion)(0.03710) + (62.5 billion)(1 - 0.9866) = 1.820 billion.
The frequency is unaffected by the maximum covered loss.
Thus the mean aggregate losses are (100)(16,235) = 1.62 million.
The variance of the aggregate losses is: (100)(1.820 billion) = 182.0 billion.]
79 See "Mahler's Guide to Loss Distributions."


We note how the maximum covered loss reduces both the mean and variance of the aggregate
losses. By cutting off the effect of the heavy tail of the severity, the maximum covered loss has a
significant impact on the variance of the aggregate losses.
In general one can do similar calculations for any severity distribution and frequency distribution.
The moments of the severity are calculated using the limited expected moments, as shown in
Appendix A of Loss Models, while the frequency is unaffected by the maximum covered loss.
Maximum Covered Losses and Deductibles:
If one has both a maximum covered loss u and a per claim deductible d, then the severity is the
layer of loss between d and u, while the frequency is the same as that in the presence of just the
deductible. The first moment of the nonzero payments is:
{E[X ∧ u] - E[X ∧ d]}/S(d), while the second moment of the nonzero payments is:
{E[(X ∧ u)²] - E[(X ∧ d)²] - 2d(E[X ∧ u] - E[X ∧ d])}/S(d).80
Exercise: If in the previous exercise there were a per claim deductible of $50,000 and a maximum
covered loss of $250,000, what would be the mean and variance of the aggregate losses paid by
the insurer?
[Solution: E[X ∧ 50,000] = 10,078. E[X ∧ 250,000] = 16,235.
E[(X ∧ 250,000)²] = 1.820 billion. E[(X ∧ 50,000)²] = 0.323 billion.
S(50,000) = 1 - Φ[(ln(50,000) - 8)/2] = 1 - Φ(1.410) = 1 - 0.9207 = 0.0793.
The first moment of the nonzero payments is: (16,235 - 10,078)/0.0793 = 77,642.
The second moment of the nonzero payments is:
(1.820 billion - 0.323 billion - (2)(50,000)(16,235 - 10,078))/0.0793 = 11.11 billion.
The frequency of nonzero payments is Poisson with λ = (100)(0.0793) = 7.93.
Thus the mean aggregate losses are: (7.93)(77,642) = 616 thousand.
The variance of the aggregate losses is: (7.93)(11.11 billion) = 88.1 billion.]
We can see how such calculations involving aggregate losses in the presence of both a deductible
and a maximum covered loss can quickly become too time-consuming for exam conditions.
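For checking such layer calculations outside of exam conditions, here is a minimal Python sketch that evaluates the LogNormal limited moments and the aggregate mean and variance for the deductible-plus-limit example above:

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_limited_moments(mu, sigma, x):
    # Limited expected value E[X ∧ x] and limited second moment E[(X ∧ x)^2] for a LogNormal.
    e1 = (math.exp(mu + sigma**2 / 2) * norm_cdf((math.log(x) - mu - sigma**2) / sigma)
          + x * (1 - norm_cdf((math.log(x) - mu) / sigma)))
    e2 = (math.exp(2 * mu + 2 * sigma**2) * norm_cdf((math.log(x) - mu - 2 * sigma**2) / sigma)
          + x**2 * (1 - norm_cdf((math.log(x) - mu) / sigma)))
    return e1, e2

mu, sigma, lam, d, u = 8.0, 2.0, 100, 50_000, 250_000
e1_d, e2_d = lognormal_limited_moments(mu, sigma, d)
e1_u, e2_u = lognormal_limited_moments(mu, sigma, u)
s_d = 1 - norm_cdf((math.log(d) - mu) / sigma)

first = (e1_u - e1_d) / s_d                            # mean nonzero payment, about 77,600
second = (e2_u - e2_d - 2 * d * (e1_u - e1_d)) / s_d   # second moment of nonzero payments
print(round(lam * s_d * first))                        # mean aggregate losses, about 616,000
print(lam * s_d * second)                              # variance (compound Poisson), about 88 billion
```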

80 See "Mahler's Guide to Loss Distributions."


Compound Frequency Distributions:81


For example, assume the number of taxicabs that arrive per minute at the Heartbreak Hotel is
Poisson with mean 1.3.
In addition, assume that the number of passengers dropped off at the hotel by each taxicab is
Binomial with q = 0.4 and m = 5.
The number of passengers dropped off by each taxicab is independent of the number of taxicabs
that arrive and is independent of the number of passengers dropped off by any other taxicab.
Then the aggregate number of passengers dropped off per minute at the Heartbreak Hotel is an
example of a compound frequency distribution.
Compound distributions are mathematically equivalent to aggregate distributions,
with a discrete severity distribution. In this example, the Poisson number of taxicabs plays the role of the
frequency, while the Binomial number of passengers per taxicab plays the role of the severity.
Thus although compound distributions are not on the syllabus, on your exam, one could describe
the above situation as a collective risk model with Poisson frequency and Binomial severity.
Aggregate Distribution      Compound Frequency Distribution
Frequency                   Primary (# of cabs)
Severity                    Secondary (# of passengers per cab)

σ_C² = μ_f σ_s² + μ_s² σ_f², where f ↔ frequency or first (primary), and s ↔ severity or secondary.
81 Discussed more extensively in "Mahler's Guide to Frequency Distributions."


Process Covariances:82
Assume a claims process in which frequency and severity are independent of each other, and the
claim sizes are mutually independent random variables with a common distribution.83 Then let each
claim be divided into two pieces in a well-defined manner not dependent on the number of claims.
For convenience, we refer to these two pieces as primary and excess.84
Let:

Tp = Total Primary Losses

Te = Total Excess Losses

Xp = Primary Severity

Xe = Excess Severity

N = Frequency
Then as will be proved subsequently:
COV[Tp , Te ] = E[N] COV[Xp ,Xe ] + VAR[N] E[Xp ] E[Xe ].85
Exercise: Assume severity follows an Exponential Distribution with θ = 10,000.
The first 5000 of each claim is considered primary losses. Xp = Primary Severity = X ∧ 5000.
Excess of 5000 is considered excess losses. Xe = Excess Severity = (X - 5000)+.
Determine the covariance of Xp and Xe.
Hint: ∫ x e^(-x/θ)/θ dx = -x e^(-x/θ) - θ e^(-x/θ).   ∫ x² e^(-x/θ)/θ dx = -x² e^(-x/θ) - 2θx e^(-x/θ) - 2θ² e^(-x/θ).
[Solution: E[Xp] = E[X ∧ 5000] = (10,000)(1 - e^(-5000/10,000)) = 3935.
E[Xe] = 10,000 - 3935 = 6065.
E[Xp Xe] = ∫_0^5000 (x)(0) e^(-x/10,000)/10,000 dx + ∫_5000^∞ x (x - 5000) e^(-x/10,000)/10,000 dx
= ∫_5000^∞ x² e^(-x/10,000)/10,000 dx - 5000 ∫_5000^∞ x e^(-x/10,000)/10,000 dx
= {(5000²) + (2)(5000)(10,000) + (2)(10,000²)} e^(-0.5) - (5000){5000 + 10,000} e^(-0.5) =
151.633 million.
Cov[Xp, Xe] = E[Xp Xe] - E[Xp] E[Xe] = 151.633 million - (3935)(6065) = 127.767 million.]
82 Beyond what you will be asked on your exam.
83 In other words, assume the usual collective risk model.
84 In a single-split experience rating plan, the first $5000 of each claim might be primary and anything over $5000
would contribute to the excess losses.
85 This is a generalization of the formula we had for the process variance of aggregate losses.
See Appendix A of Howard Mahler's discussion of Glenn Meyers' "An Analysis of Experience Rating," PCAS 1987.


Exercise: Assume severity follows an Exponential Distribution with θ = 10,000.
The first 5000 of each claim is primary losses. Excess of 5000 is excess losses.
Frequency is Negative Binomial with r = 3 and β = 0.4.
Determine the covariance of the total primary and total excess losses.
[Solution: COV[Tp, Te] = E[N] COV[Xp, Xe] + VAR[N] E[Xp] E[Xe] =
(3)(0.4)(127.767 million) + (3)(0.4)(1.4)(3935)(6065) = 193.4 million.]
Proof of the Result for Process Covariances:
The total primary losses Tp is the sum of the individual primary portions of claims Xp (i), where
i runs from 1 to N, the number of claims. Similarly, Te is a sum of Xe (i).
Since N is a random variable, both frequency and severity contribute to the covariance of Tp and Te .
To compute the covariance of Tp and Te, begin by calculating E[Tp Te | N=n].
Fix the number of claims n and find E[ {Σ_{i=1}^{n} Xp(i)} {Σ_{i=1}^{n} Xe(i)} ].
Expanding the product yields n² terms of the form Xp(i) Xe(j).
From the definition of covariance, when i = j the expected value of the term is:
E[Xp(i) Xe(i)] = COV[Xp(i), Xe(i)] + E[Xp(i)] E[Xe(i)].
Otherwise, for i ≠ j, X(i) and X(j) are independent and E[Xp(i) Xe(j)] = E[Xp(i)] E[Xe(j)].
Thus E[ {Σ_{i=1}^{n} Xp(i)} {Σ_{i=1}^{n} Xe(i)} ] = n COV[Xp, Xe] + n² E[Xp] E[Xe].
Now, by general considerations of conditional expectations: E[Tp Te] = E_N[ E[Tp Te | N=n] ].
Thus, taking the expected value of the above equation with respect to N gives:
E[Tp Te] = E[N] COV[Xp, Xe] + E[N²] E[Xp] E[Xe]
= E[N] COV[Xp, Xe] + {Var[N] + E[N]²} E[Xp] E[Xe].
COV[Tp, Te] = E[Tp Te] - E[Tp] E[Te]
= E[N] COV[Xp, Xe] + {Var[N] + E[N]²} E[Xp] E[Xe] - E[N] E[Xp] E[N] E[Xe]
= E[N] COV[Xp, Xe] + VAR[N] E[Xp] E[Xe].
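This identity can also be checked by simulation. The sketch below uses a Poisson frequency, an Exponential severity, and a hypothetical proportional split of each claim (30% primary, 70% excess, chosen only for illustration):

```python
import math, random, statistics

random.seed(0)
lam, theta, trials = 2.0, 10_000.0, 100_000

def poisson(mean):
    # Knuth's method, adequate for small means.
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def split_claim(x):
    # Hypothetical split: 30% of each claim is "primary", 70% is "excess".
    return 0.3 * x, 0.7 * x

# Per-claim quantities, estimated by simulation.
pairs = [split_claim(random.expovariate(1 / theta)) for _ in range(trials)]
xp, xe = [p for p, _ in pairs], [e for _, e in pairs]
cov_claim = statistics.covariance(xp, xe)
mean_xp, mean_xe = statistics.fmean(xp), statistics.fmean(xe)

# Annual totals: Poisson number of claims, each claim split into primary and excess.
tp, te = [], []
for _ in range(trials):
    claims = [split_claim(random.expovariate(1 / theta)) for _ in range(poisson(lam))]
    tp.append(sum(p for p, _ in claims))
    te.append(sum(e for _, e in claims))

print(statistics.covariance(tp, te))               # simulated Cov[Tp, Te]
print(lam * cov_claim + lam * mean_xp * mean_xe)   # formula; for Poisson, E[N] = Var[N] = lam
```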


Problems:
5.1 (1 point) You are given the following:

mean frequency = 13

variance of the frequency = 37

mean severity = 300

variance of the severity = 200,000

frequency and severity are independent


What is the variance of the aggregate losses?
A. Less than 5 million
B. At least 5 million but less than 6 million
C. At least 6 million but less than 7 million
D. At least 7 million but less than 8 million
E. At least 8 million
5.2 (2 points) A six-sided die is used to determine whether or not there is a claim. Each side of the
die is marked with either a 0 or a 1, where 0 represents no claim and 1 represents a claim. Two sides
are marked with a 0 and four sides with a 1. In addition, there is a spinner representing claim
severity. The spinner has three areas marked 2, 5 and 14. The probabilities for each claim size are:
Claim Size    Probability
2             20%
5             50%
14            30%
The die is rolled and if a claim occurs, the spinner is spun.
What is the variance for a single trial of this risk process?
A. Less than 24
B. At least 24 but less than 25
C. At least 25 but less than 26
D. At least 26 but less than 27
E. At least 27
5.3 (2 points) You are given the following:
Number of claims for an insured follows a Poisson distribution with mean 0.25.
The amount of a single claim has a uniform distribution on [0, 5000]
Number of claims and claim severity are independent.
Determine the variance of the aggregate losses for this insured.
A. Less than 2.1 million
B. At least 2.1 million but less than 2.2 million
C. At least 2.2 million but less than 2.3 million
D. At least 2.3 million but less than 2.4 million
E. At least 2.4 million


5.4 (3 points) You are given the following:


For a given risk, the number of claims for a single exposure period will be 1,
with probability 4/5; or 2, with probability 1/5.
If only one claim is incurred, the size of the claim will be 50, with probability 3/4;
or 200, with probability 1/4.
If two claims are incurred, the size of each claim, independent of the other, will be 50,
with probability 60%; or 150, with probability 40%.
Determine the variance of the aggregate losses for this risk.
A. Less than 4,000
B. At least 4,000, but less than 4,500
C. At least 4,500, but less than 5,000
D. At least 5,000, but less than 5,500
E. At least 5,500
5.5 (2 points) A large urn has many balls of three different kinds in the following proportions:
Type of Ball:
Proportion
Red
70%
Green $50
20%
Green $200
10%
The risk process is as follows:
1. Set the aggregate losses equal to zero.
2. Draw a ball from the urn.
3. If the ball is Red then Exit, otherwise continue to step 4.
4. If the ball is Green add the amount shown to the aggregate losses and return to step 2.
Determine the process variance of the aggregate losses for a single trial of this risk process.
A. Less than 9,000
B. At least 9,000 but less than 10,000
C. At least 10,000 but less than 11,000
D. At least 11,000 but less than 12,000
E. At least 12,000
5.6 (3 points) Assume there are 3 types of risks. Whether or not there is a claim is determined by
whether a six-sided die comes up with a zero or a one, with a one indicating a claim. If a claim occurs
then its size is determined by a spinner.
Type
Number of die faces with a 1 rather than a 0
Claim Size Spinner
I
2
$100 70%, $200 30%
II
3
$100 50%, $200 50%
III
4
$100 30%, $200 70%
Determine the variance of aggregate annual losses for a portfolio of 300 risks, consisting of 100 risks
of each type.
A. 1.9 million
B. 2.0 million
C. 2.1 million
D. 2.2 million
E. 2.3 million


Use the following information for the next 5 questions:

Number of claims follows a Poisson distribution with mean of 5.


Claim severity is independent of the number of claims and has the following
probability density function: f(x) = 3.5 x^(-4.5), x > 1.

5.7 (2 points) Determine the variance of the aggregate losses.


A. Less than 11.0
B. At least 11.0 but less than 11.3
C. At least 11.3 but less than 11.6
D. At least 11.6 but less than 11.9
E. At least 11.9
5.8 (2 points) Using the Normal Approximation, estimate the probability that the aggregate losses
will exceed 11.
A. 10%
B. 12%
C. 14%
D. 16%
E. 18%
5.9 (2 points) Approximating with a LogNormal Distribution, estimate the probability that the
aggregate losses will exceed 11.
A. Less than 12%
B. At least 12%, but less than 14%
C. At least 14%, but less than 16%
D. At least 16%, but less than 18%
E. At least 18%
5.10 (2 points) Determine the skewness of the aggregate losses.
A. Less than 0.4
B. At least 0.4 but less than 0.6
C. At least 0.6 but less than 0.8
D. At least 0.8 but less than 1.0
E. At least 1.0
5.11 (2 points) Determine the variance of the aggregate losses, if there is a maximum covered loss
of 5.
A. Less than 11.0
B. At least 11.0 but less than 11.3
C. At least 11.3 but less than 11.6
D. At least 11.6 but less than 11.9
E. At least 11.9


Use the following information for the next two questions:

• Number of claims follows a Poisson distribution with mean λ.
• The amount of a single claim has an exponential distribution given by:
f(x) = e^(-x/θ) / θ, x > 0, θ > 0.
• Number of claims and claim severity distributions are independent.

5.12 (2 points) Determine the variance of the aggregate losses.
A. λθ
B. λθ²
C. 2λθ
D. 2λθ²
E. None of A, B, C, or D.

5.13 (2 points ) Determine the skewness of the aggregate losses.


1

3
3
A.
B.
C.
D.
E. None of A, B, C, or D.
2
2
2
2
5.14 (1 point) You are given the following:

The frequency distribution follows the Poisson process with mean 3.

The second moment about the origin for the severity distribution is 200.

Frequency and Severity are independent.

What is the variance of the aggregate losses?


A. 400
B. 450
C. 500
D. 550

E. 600

5.15 (3 points) The number of accidents any particular automobile has during a year is Poisson with
mean 0.03. The damage to an automobile due any single accident is uniformly distributed over the
interval from 0 to 3000. Using the Normal Approximation, what is the minimum number of
independent automobiles that must be insured so that the probability that the aggregate annual
losses exceed 160% of expected is at most 5%?
A. 295
B. 305
C. 315
D. 325
E. 335
5.16 (2 points) You are given the following:

For baseball player Don, the number of official at bats in a season is Poisson with λ = 600.
For Don the probabilities of the following types of hits per official at bat are:
Single 22%, Double 4%, Triple 1%, and Home Run 5%.
Don's contract provides him incentives of: $2000 per single, $4000 per double,
$6000 per triple and $8000 per home run ($2000 per base.)
What is the chance that next year Don will earn at most $700,000 from his incentives?
Use the Normal Approximation.
A. 84%
B. 86%
C. 88%
D. 90%
E. 92%


Use the following information for the next five questions:
There are three types of risks. For each type of risk, the frequency and severity are independent.
Type    Frequency Distribution                Severity Distribution
1       Binomial: m = 8, q = 0.4              Pareto: α = 4, θ = 1000
2       Poisson: λ = 3                        LogNormal: μ = 7, σ = 0.5
3       Negative Binomial: r = 3, β = 2       Gamma: α = 3, θ = 200

5.17 (2 points) For a risk of Type 1, what is the variance of the aggregate losses?
A. Less than 0.5 million
B. At least 0.5 million but less than 0.6 million
C. At least 0.6 million but less than 0.7 million
D. At least 0.7 million but less than 0.8 million
E. At least 0.8 million
5.18 (2 points) For a risk of Type 2, what is the variance of the aggregate losses?
A. Less than 5.7 million
B. At least 5.7 million but less than 5.8 million
C. At least 5.8 million but less than 5.9 million
D. At least 5.9 million but less than 6.0 million
E. At least 6.0 million
5.19 (2 points) For a risk of Type 3, what is the variance of the aggregate losses?
A. Less than 7.0 million
B. At least 7.0 million but less than 7.5 million
C. At least 7.5 million but less than 8.0 million
D. At least 8.0 million but less than 8.5 million
E. At least 8.5 million
5.20 (2 points) Assume one has a portfolio made up of 55 risks of Type 1, 35 risks of Type 2, and
10 risks of Type 3. Each risk in the portfolio is independent of all the others.
For this portfolio, what is the variance of the aggregate losses?
A. 310 million

B. 320 million

C. 330 million

D. 340 million

E. 350 million

5.21 (5 points) For a risk of Type 3, what is the skewness of the aggregate losses?
A. Less than 1.0
B. At least 1.0 but less than 1.1
C. At least 1.1 but less than 1.2
D. At least 1.2 but less than 1.3
E. At least 1.3


Use the following information for the next 6 questions:

The severity distribution is an Exponential distribution with = 5000, prior to the impact of
any deductible or maximum covered loss.

The number of losses follows a Poisson distribution with = 2.4, prior to the impact of any
deductible.

Frequency and severity are independent.


5.22 (1 point) What are the mean aggregate losses excess of a 1000 per claim deductible?
A. 9,600
B. 9,800
C. 10,000
D. 10,200
E. 10,400
5.23 (2 points) What is the standard deviation of the aggregate losses excess of a 1000 per claim
deductible?
A. 8,700
B. 9,000
C. 9,300
D. 9,600
E. 9,900
5.24 (1 point) What are the mean aggregate losses if there is a 10,000 maximum covered loss and
no deductible?
A. Less than 9,800
B. At least 9,800, but less than 10,000
C. At least 10,000, but less than 10,200
D. At least 10,200, but less than 10,400
E. At least 10,400
5.25 (2 points) What is the standard deviation of the aggregate losses if there is a 10,000 maximum
covered loss?
A. Less than 6,000
B. At least 6,000, but less than 7,000
C. At least 7,000, but less than 8,000
D. At least 8,000, but less than 9,000
E. At least 9,000
5.26 (2 points) What are the mean aggregate losses if there is both a 1000 per claim deductible
and a 10,000 maximum covered loss?
A. 7,800
B. 8,000
C. 8,200
D. 8,400
E. 8,600
5.27 (3 points) What is the standard deviation of the aggregate losses if there is both a 1000 per
claim deductible and a 10,000 maximum covered loss?
A. Less than 6,500
B. At least 6,500, but less than 7,000
C. At least 7,000, but less than 7,500
D. At least 7,500, but less than 8,000
E. At least 8,000


Use the following size of loss data from ABC Insurance for the next 6 questions:
Range    Number of Losses
0-1      60
1-3      30
3-5      20
5-10     10
Total    120
Assume a uniform distribution of loss sizes within each interval.
In addition there are 5 losses of size greater than 10: 12, 15, 17, 20, 30.
5.28 (2 points) Calculate the mean.
A. 2.3
B. 2.5
C. 2.7

D. 2.9

E. 3.1

5.29 (3 points) Calculate the variance.


A. less than 15.5
B. at least 15.5 but less than 16.0
C. at least 16.0 but less than 16.5
D. at least 16.5 but less than 17.0
E. at least 17.0
5.30 (2 points) Calculate e(7).
A. less than 6.0
B. at least 6.0 but less than 6.5
C. at least 6.5 but less than 7.0
D. at least 7.0 but less than 7.5
E. at least 7.5
5.31 (2 points) The annual number of losses for ABC Insurance is Poisson with mean 40.
What is the coefficient of variation of its aggregate annual losses?
(A) 0.3
(B) 0.4
(C) 0.5
(D) 0.6
(E) 0.7
5.32 (3 points) The annual number of losses for ABC Insurance is Poisson with mean 40.
ABC Insurance buys reinsurance for the layer 10 excess of 5 (the layer from 5 to 15).
How much does the reinsurer expect to pay per year due to losses by ABC Insurance?
(A) 19
(B) 20
(C) 21
(D) 22
(E) 23
5.33 (3 points) The annual number of losses for ABC Insurance is Poisson with mean 40.
ABC Insurance buys reinsurance for the layer 10 excess of 5. What is the coefficient of variation of
the annual payment by the reinsurer due to losses by ABC Insurance?
(A) 0.6
(B) 0.7
(C) 0.8
(D) 0.9
(E) 1.0


5.34 (3 points) You are given the following:

• The severity distribution is a Pareto distribution with α = 3.2 and θ = 20,000, prior to the
impact of any deductible.
• The number of losses follows a Negative Binomial with r = 4.1 and β = 2.8, prior to the
impact of any deductible.
• Frequency and severity are independent.
• There is a 50,000 per claim deductible.
What is the chance that the aggregate losses excess of the deductible are greater than 15,000?
Use the Normal Approximation.
A. Less than 20%
B. At least 20%, but less than 25%
C. At least 25%, but less than 30%
D. At least 30%, but less than 35%
E. At least 35%
Use the following information for the next two questions:

• The claim frequency for each policy is Poisson.
• The expected mean frequencies differ across the portfolio of policies.
• The mean frequencies are Gamma Distributed across the portfolio with α = 5 and θ = 0.4.
• Claim severity has a mean of 20 and a variance of 300.
• Claim frequency and severity are independent.
5.35 (3 points) If an insurer has sold 200 independent policies, determine the probability that the
aggregate loss for the portfolio will exceed 110% of the expected loss.
Use the Normal Approximation.
A. 7.5%
B. 8.5%
C. 9.5%
D. 10.5%
E. 11.5%
5.36 (2 points) Determine the minimum number of independent policies that would have to be sold
so that the probability that the aggregate loss for the portfolio will exceed 110% of the expected
loss does not exceed 1%. Use the Normal Approximation.
A. 400
B. 500
C. 600
D. 700
E. 800
5.37 (2 points) Frequency has mean 10 and variance 20. Severity has mean 1000 and variance
200,000. Severities are independent of each other and of the number of claims.
Let σ be the standard deviation of the aggregate losses.
Let σ' be the standard deviation of the aggregate losses, given that 8 claims have occurred.
Calculate σ/σ'.
(A) 2.9

(B) 3.1

(C) 3.3

(D) 3.5

(E) 3.7


5.38 (3 points) Use the following information:

An insurer issues a policy that pays for hospital stays.


Each hospital stay results in room charges and other charges.
Total room charges for a hospital stay have mean 5000 and standard deviation 8000.
Total other charges for a hospital stay have mean 2000 and standard deviation 3000.
The correlation between total room charges and total other charges for a hospital stay is 0.6.
The insurer reimburses 100% for room charges and 75% for other charges.
The number of annual admission to the hospital is Binomial with m = 4 and q = 0.1.
Determine the standard deviation of the insurer's annual aggregate payments for this policy.
(A) 5500
(B) 6000
(C) 6500
(D) 7000
(E) 7500
5.39 (3 points) Use the following information:

An insurer issues a policy that pays for loss plus loss adjustment expense.
Losses follow a Gamma Distribution with α = 4 and θ = 1000.
Loss Adjustment Expenses follow a Gamma Distribution with α = 3 and θ = 200.
The correlation between loss and loss adjustment expense is 0.8.
The number of annual claims is Poisson with λ = 0.6.
Determine the standard deviation of the insurer's annual aggregate payments for this policy.
(A) 3800
(B) 3900
(C) 4000
(D) 4100
(E) 4200
5.40 (2 points) For aggregate claims A, you are given:
(i) f_A(x) = Σ from n=0 to ∞ of p*ⁿ(x) 3ⁿ e⁻³ / n!
(ii)
x     p(x)
1     0.5
2     0.3
3     0.2
Determine Var[A].
(A) 7.5   (B) 8.5   (C) 9.5   (D) 10.5   (E) 11.5

5.41 (3 points) Aggregate Losses have a mean of 100 and a variance of 90,000.
Approximating the aggregate distribution by a LogNormal Distribution, estimate the probability that
the aggregate losses are greater than 2000.
(A) 0.1%
(B) 0.2%
(C) 0.3%
(D) 0.4%
(E) 0.5%


5.42 (3 points) Use the following information:

• The number of claims follows a Poisson distribution with a mean of 7.
• Claim severity has a Pareto Distribution with α = 3 and θ = 100.
• Frequency and severity are independent.


Approximating the aggregate losses by a LogNormal Distribution, estimate the probability that the
aggregate losses will exceed 1000.
A. 1%
B. 2%
C. 3%
D. 4%
E. 5%
5.43 (2 points) The number of losses is Poisson with mean λ.
The ground up distribution of size of loss is Exponential with mean θ.
Frequency and severity are independent.
Let B be the variance of aggregate payments if there is a deductible b.
Let C be the variance of aggregate payments if there is a deductible c > b.
Determine the ratio of C/B.
When is this ratio less than one, equal to one, and greater than one?
5.44 (2 points) Frequency is Poisson with λ = 3.
The size of loss distribution is Exponential with θ = 400.
Frequency and severity are independent.
There is an ordinary deductible of 500.
Calculate the variance of the aggregate payments excess of the deductible.
A. Less than 250,000
B. At least 250,000, but less than 260,000
C. At least 260,000, but less than 270,000
D. At least 270,000, but less than 280,000
E. 280,000 or more
5.45 (3 points) The number of losses is Poisson with mean λ.
The ground up distribution of size of loss is Pareto with parameters α > 2 and θ.
Frequency and severity are independent.
Let B be the variance of aggregate payments if there is a deductible b.
Let C be the variance of aggregate payments if there is a deductible c > b.
Determine the ratio of C/B.
When is this ratio less than one, equal to one, and greater than one?


Use the following information for the next six questions:


The distribution of aggregate losses has a mean of 20 and a variance of 100.
5.46 (1 point) Approximate the distribution of aggregate losses by a Normal Distribution with mean
and variance equal to that of the aggregate losses.
Estimate the probability that the aggregate losses are greater than 42.
A. Less than 2.0%
B. At least 2.0%, but less than 2.5%
C. At least 2.5%, but less than 3.0%
D. At least 3.0%, but less than 3.5%
E. At least 3.5%
5.47 (3 points) Approximate the distribution of aggregate losses by a LogNormal Distribution with
mean and variance equal to that of the aggregate losses.
Estimate the probability that the aggregate losses are greater than 42.
A. 1.5%
B. 2.0%
C. 2.5%
D. 3.0%
E. 3.5%
5.48 (3 points) Approximate the distribution of aggregate losses by a Gamma Distribution with
mean and variance equal to that of the aggregate losses.
Estimate the probability that the aggregate losses are greater than 42.
Hint: Γ[n; λ] = 1 - Σ from i=0 to n-1 of e^(-λ) λ^i / i!.

A. Less than 2.0%


B. At least 2.0%, but less than 2.5%
C. At least 2.5%, but less than 3.0%
D. At least 3.0%, but less than 3.5%
E. At least 3.5%
5.49 (4 points) Approximate the distribution of aggregate losses by an Inverse Gaussian
Distribution with mean and variance equal to that of the aggregate losses.
Estimate the probability that the aggregate losses are greater than 42.
Use the following approximation for x > 3: 1 - Φ[x] ≅ {exp[-x²/2] / √(2π)} (1/x - 1/x³ + 3/x⁵ - 15/x⁷).
A. Less than 2.0%
B. At least 2.0%, but less than 2.5%
C. At least 2.5%, but less than 3.0%
D. At least 3.0%, but less than 3.5%
E. At least 3.5%


5.50 (4 points) If Y follows a Poisson Distribution with parameter λ, then for c > 0, cY follows an
Over-dispersed Poisson Distribution with parameters c and λ.
Approximate the distribution of aggregate losses by an Over-dispersed Poisson Distribution with
mean and variance equal to that of the aggregate losses.
Estimate the probability that the aggregate losses are greater than 42.
A. Less than 2.0%
B. At least 2.0%, but less than 2.5%
C. At least 2.5%, but less than 3.0%
D. At least 3.0%, but less than 3.5%
E. At least 3.5%
5.51 (4 points) Approximate the distribution of aggregate losses by an Inverse Gamma
Distribution with mean and variance equal to that of the aggregate losses.
Estimate the probability that the aggregate losses are greater than 42.
Hint: Γ[n; λ] = 1 - Σ from i=0 to n-1 of e^(-λ) λ^i / i!.

A. Less than 2.0%


B. At least 2.0%, but less than 2.5%
C. At least 2.5%, but less than 3.0%
D. At least 3.0%, but less than 3.5%
E. At least 3.5%

5.52 (2 points) The frequency for each insurance policy is Poisson with mean 2.
The cost per loss has mean 5 and standard deviation 12.
The number of losses and their sizes are all mutually independent.
Determine the minimum number of independent policies that would have to be sold so that the
probability that the aggregate loss for the portfolio will exceed 115% of the expected loss does not
exceed 2.5%. Use the Normal Approximation.
(A) 500
(B) 600
(C) 700
(D) 800
(E) 900
5.53 (3 points) Frequency is Negative Binomial with r = 4 and β = 3.
The size of loss distribution is Exponential with θ = 1700.
Frequency and severity are independent.
There is an ordinary deductible of 1000.
Calculate the variance of the aggregate payments excess of the deductible.
A. 40 million
B. 50 million
C. 60 million
D. 70 million

E. 80 million


Use the following information for the next two questions:
An insurance company sold policies as follows:
Number of Policies    Policy Maximum    Probability of Claim Per Policy
10,000                25                3%
15,000                50                5%
You are given:
(i) The claim amount for each policy is uniformly distributed between 0 and the policy maximum.
(ii) The probability of more than one claim per policy is 0.
(iii) Claim occurrences are independent.
5.54 (2 points) What is the variance of aggregate losses?
A. Less than 650,000
B. At least 650,000, but less than 700,000
C. At least 700,000, but less than 750,000
D. At least 750,000, but less than 800,000
E. 800,000 or more
5.55 (1 point) What is the probability that aggregate losses are greater than 24,000?
Use the Normal Approximation.
A. Less than 4%
B. At least 4%, but less than 5%
C. At least 5%, but less than 6%
D. At least 6%, but less than 7%
E. At least 7%

5.56 (3 points) The number of Property Damage Liability claims is Poisson with mean λ.
The size of Property Damage Liability claims has mean 10 and standard deviation 15.
The number of Bodily Injury Liability claims is Poisson with mean λ/3.
The size of Bodily Injury Liability claims has mean 24 and standard deviation 60.
Let P = the 90th percentile of the aggregate distribution of Property Damage Liability.
Let B = the 90th percentile of the aggregate distribution of Bodily Injury Liability.
B/P = 1.061.
Using the Normal Approximation, determine λ.
A. 60

B. 70

C. 80

D. 90

E. 100


5.57 (5 points) Use the following information:

• You are given five years of observed aggregate losses:
Year    Aggregate Loss ($ million)
2006    31
2007    38
2008    36
2009    41
2010    41
• Frequency is Poisson with mean 3000.
• Severity follows a Pareto Distribution.
• Frequency and severity are independent.
• Inflation is 4% per year.
Using the method of moments to fit the aggregate distribution to the data,
estimate the probability that an individual loss will be of size greater than $20,000 in 2012.
A. Less than 5%
B. At least 5%, but less than 10%
C. At least 10%, but less than 15%
D. At least 15%, but less than 20%
E. At least 20%
5.58 (3 points) Let S be the aggregate loss and N be the number of claims.
Given the following information, determine the variance of S.
N    Probability    E[S | N]    E[S² | N]
0    20%            0           0
1    40%            100         50,000
2    30%            250         150,000
3    10%            400         300,000
A. 60,000   B. 65,000   C. 70,000   D. 75,000   E. 80,000

5.59 (3 points) Frequency is Binomial with m = 5 and q = 0.4.
Severity is LogNormal with μ = 6 and σ = 0.3.
Frequency and severity are independent.
Using the Normal Approximation, estimate the probability that the aggregate losses are greater than
150% of their mean.
A. 20%
B. 25%
C. 30%
D. 35%
E. 40%


5.60 (3 points) Use the following information:

• Annual claim occurrences follow a Zero-Modified Negative Binomial Distribution
with p₀^M = 40%, r = 2 and β = 0.4.
• Each claim amount follows a Gamma Distribution with α = 3 and θ = 500.
• Claim occurrences and amounts are independent.
Determine the variance of aggregate annual losses.
A. 3.2 million
B. 3.4 million
C. 3.6 million

D. 3.8 million

E. 4.0 million

5.61 (2 points) You are given six years of aggregate losses:


111, 106, 98, 120, 107, 113.
Use the sample variance together with the Normal Approximation, in order to estimate the
probability that the aggregate losses next year are less than 100.
A. 9%
B. 10%
C. 11%
D. 12%
E. 13%
5.62 (3 points) The number of losses per year has a Poisson distribution with a mean of 0.35.
There are three types of claims:
Type of Claim    Mean Frequency    Mean Severity    Coefficient of Variation of Severity
I                0.20              100              5
II               0.10              200              4
III              0.05              300              3
The number of claims of one type is independent of the number of claims of the other types.
Determine the variance of the distribution of annual aggregate losses.
(A) 150,000
(B) 165,000
(C) 180,000
(D) 195,000
(E) 210,000
5.63 (2 points) The distribution of the number of claims is:
n    f(n)
1    40%
2    30%
3    20%
4    10%
The natural logarithms of the sizes of claims are Normally distributed with mean 6 and variance 0.7.
Determine the variance of the distribution of annual aggregate losses.
(A) 1.0 million
(B) 1.1 million
(C) 1.2 million
(D) 1.3 million
(E) 1.4 million
5.64 (2 points) X has the following distribution:
Prob[X = 0] = 20%, Prob[X = 1] = 30%, Prob[X = 2] = 50%.
Y is the sum of X independent Normal random variables, each with mean 3 and variance 5.
What is the variance of Y?
A. 8
B. 9
C. 10
D. 11
E. 12


5.65 (4 points) For liability insurance, the number of accidents per year is Poisson with mean 10%.
The number of claimants per accident follows a zero-truncated Binomial Distribution with
m = 4 and q = 0.2.
The size of each claim follows a Gamma Distribution with α = 3 and θ = 10,000.
Determine the coefficient of variation of the aggregate annual losses.
A. 3.2
B. 3.4
C. 3.6
D. 3.8
E. 4.0
5.66 (5 points) The Spring & Sommers Company has 2000 employees.
Spring & Sommers provides a generous disability program for its employees.
A disabled employee is paid 2/3 of his or her weekly salary.
The company self-insures the first 5 weeks of any disability and has an insurance policy that will
cover any disability payments beyond 5 weeks.
Occurrences of disability among employees are independent of one another.
Assume that an employee can suffer at most one disability per year.
Disabilities have duration:
1 week            30%
2 weeks           20%
3 weeks           10%
4 weeks           10%
5 or more weeks   30%
There are two types of employees:
Type    Number of Employees    Weekly Salary    Annual Probability of a Disability
1       1500                   600              5%
2       500                    900              8%
Determine the coefficient of variation of the distribution of total annual payments Spring & Sommers
pays its employees for disabilities, excluding any amounts paid by the insurance policy.
(A) 0.10
(B) 0.15
(C) 0.20
(D) 0.25
(E) 0.30
5.67 (3 points) Use the following information:

• Annual claim occurrences follow a Zero-Modified Poisson Distribution with p₀^M = 25% and λ = 0.1.
• Each claim amount follows a LogNormal Distribution with μ = 8 and σ = 0.6.
• Claim occurrences and amounts are independent.
Determine the variance of aggregate annual losses.
A. 6.0 million
B. 6.5 million
C. 7.0 million

D. 7.5 million

E. 8.0 million


5.68 (4, 5/85, Q.35) (2 points) Suppose x is a claim-size variable which is gamma distributed with
probability density function:
f(x) = a^r x^(r-1) e^(-ax) / Γ(r),
where a and r are > 0 and x > 0, mean = r/a, variance = r/a².
Let T = total losses = Σ from i=1 to N of x_i, where N is a positive integer.
Assume the number of claims is independent of their amounts.


If E[N] = μ, Var[N] = σ², which of the following is the variance of T?
A. r(σ²r + μ)/a²
B. σ² + r/a²
C. σ²r/a²
D. r(σ²r + 1)/a²
E. None of the above


5.69 (4, 5/87, Q.44) (1 point) Let X_i be independent, identically distributed claim-size variables
which are gamma-distributed, with parameters α and θ.
Let T = total losses = Σ from i=1 to N of X_i, where N is a positive integer.
Assume the number of claims is independent of their amounts.


If E[N] = m, VAR[N] = 3m, which of the following is the variance of T?
A. 3m + 2

B. m(3 + 1)2

C. m(3a + m)2

D. 3m2

E. None A, B, C, or D.
5.70 (4, 5/90, Q.43) (2 points) Let N be a random variable for the claim count with:
Pr{N = 4} = 1/4
Pr{N = 5} = 1/2
Pr{N = 6} = 1/4
Let X be a random variable for claim severity with probability density function
f(x) = 3x^(-4), for 1 ≤ x < ∞.
Find the coefficient of variation, R, of the aggregate loss distribution, assuming that claim severity and
frequency are independent.
A. R < 0.35
B. 0.35 R < 0.50
C. 0.50 R < 0.65
D. 0.65 R < 0.70
E. 0.70 R


5.71 (Course 151 Sample Exam #1, Q.10) (1.7 points)
For aggregate claims S, you are given:
(i) f_S(x) = Σ from n=0 to ∞ of p*ⁿ(x) C(n+2, n) (0.6)³ (0.4)ⁿ.
(ii)
x     p(x)
1     0.3
2     0.6
3     0.1
Determine Var[S].
(A) 7.5   (B) 8.5   (C) 9.5   (D) 10.5   (E) 11.5

5.72 (Course 151 Sample Exam #1, Q.18) (2.5 points)


The policies of a building insurance company are classified according to the location of the building
insured:
Region    Number of Policies in Region    Claim Amount    Claim Probability
A         20                              300             0.01
B         10                              500             0.02
C         5                               600             0.03
D         15                              500             0.02
E         18                              100             0.01
There is at most one claim per policy and if there is a claim it is for the stated amount.
Using the normal approximation, relative security loadings are computed for each region such that
the probability that the total claims for the region do not exceed the premiums collected from policies
in that region is 0.95.
The relative security loading is defined as: (premiums / expected losses) - 1.
Which region pays the largest relative security loading?
(A) A
(B) B
(C) C
(D) D
(E) E

5.73 (Course 151 Sample Exam #2, Q.13) (1.7 points) For aggregate claims S = Σ_{i=1}^{N} Xi, you are given:
(i) Xi has distribution:
x       1    2
p(x)    p    1-p
(ii) Λ is a Poisson random variable with parameter 1/p
(iii) given Λ = λ, N is Poisson with parameter λ
(iv) the number of claims and claim amounts are mutually independent
(v) Var(S) = 19/2 .
Determine p.
(A) 1/6
(B) 1/5
(C) 1/4
(D) 1/3
(E) 1/2
5.74 (Course 151 Sample Exam #2, Q.14) (1.7 points)
For an insured portfolio, you are given:
(i) the number of claims has a Geometric distribution with β = 1/3
(ii) individual claim amounts can take values of 3, 4 or 5 with equal probability
(iii) the number of claims and claim amounts are independent
(iv) the premium charged equals expected aggregate claims plus the variance of
aggregate claims.
Determine the exact probability that aggregate claims exceeds the premium.
(A) 0.01
(B) 0.03
(C) 0.05
(D) 0.07
(E) 0.09
5.75 (Course 151 Sample Exam #2, Q.16) (1.7 points)
Let S be the aggregate claims for a collection of insurance policies.
You are given:
The size of claims has mean E[X] and second moment E[X²].
G is the premium with relative security loading θ, where θ = (premiums / expected losses) - 1.
S has a compound Poisson distribution with parameter λ.
R = S/G (the loss ratio).
Which of the following is an expression for Var(R)?
(A) {E[X²] / E[X]} {1 / (1+θ)}
(B) E[X²] / {λ E[X]² (1+θ)}
(C) E[X²] θ² / {λ E[X]² (1+θ)}
(D) E[X²] / {λ E[X]² (1+θ)²}
(E) E[X²] θ² / {λ E[X]² (1+θ)²}
5.76 (Course 151 Sample Exam #3, Q.8) (1.7 points)


An insurer has a portfolio of 40 independent policies. For each policy you are given:

The probability of a claim is 1/8 and there is at most one claim per policy.
The benefit amount given that there is a claim has an Inverse Gaussian distribution
with μ = 400 and θ = 8000.

Using the Normal approximation, determine the probability that the total claims for the portfolio are
greater than 2900.
(A) 0.03
(B) 0.06
(C) 0.09
(D) 0.12
(E) 0.15
5.77 (Course 151 Sample Exam #3, Q.10) (1.7 points)
An insurance company is selling policies to individuals with independent future lifetimes and identical
mortality profiles. For each individual, the probability of death by all causes is 0.10 and the
probability of death due to accident is 0.01. Each insurance policy pays the following benefits:
10 for accidental death
1 for non-accidental death
The company wishes to have at least a 95% probability that premiums with a relative security
loading of 0.20 are adequate to cover claims.
The relative security loading is: (premiums / expected losses) - 1.
Using the normal approximation, determine the minimum number of policies that must be sold.
(A) 1793
(B) 1975
(C) 2043
(D) 2545
(E) 2804
5.78 (5A, 11/94, Q.20) (1 point) The probability of a loss in a given period is 0.01.
The probability of more than one loss in a given period is 0. Given that a loss occurs, the damage
is assumed to be uniformly distributed over the interval from 0 to 10,000.
What is the variance of the aggregate loss experience within the given time period?
A. Less than 200,000
B. At least 200,000, but less than 250,000
C. At least 250,000, but less than 300,000
D. At least 300,000, but less than 350,000
E. Greater than or equal to 350,000
5.79 (5A, 5/95, Q.36) (2 points) Suppose S is a compound Poisson distribution of aggregate
claims with a mean number of claims λ = 3 for a collection of insurance policies over a single premium
period. The first and second moments of the individual claim amount distribution are 100 and
15,000 respectively.
The aggregate premium was determined by applying a relative security loading,
(premiums / expected losses) - 1, of 0.1 to the expected aggregate claim amount and by ignoring
expenses. Determine the mean and variance of the loss ratio.

5.80 (5A, 5/96, Q.36) The XYZ Insurance Company insures 500 risks. For each risk, there is a
10% probability of having a claim, but no more than one claim is possible.
The individual claim amount distribution is given by f(x) = 0.001exp(-x/1000), for x > 0.
Assume that the risks are independent.
a. (1.5 points) What are the expectation and standard deviation of
S = X1 + X2 + ... + X500 where Xi is the loss on insured unit i?
b. (1 point) Assuming no expenses, using the Normal Approximation, estimate the premium
per risk necessary so that there is a 95% chance that the premiums are sufficient
to pay the resulting claims.
5.81 (5A, 5/97, Q.39) (2 points) For a one-year term life insurance policy, suppose the insurer
agrees to pay a fixed amount if the insured dies.
You are given the following information regarding the binomial claim distribution for this policy:
E[x] = 30
Var[x] = 29,100.
Calculate the amount of the death payment and the probability that the insured will die within the
next year.
5.82 (5A, 5/98, Q.37) (2 points) For a collection of homeowners policies, assume:
i) S represents the aggregate claim amount for the entire collection of policies.
ii) G is the aggregate premium collected.
iii) G = 1.2 E(S)
iv) The distribution for the number of claims is Poisson with λ = 5.
v) The claim amounts are identically distributed random variables that are uniform over the
interval (0,10).
vi) The number of claims and the claim amounts are mutually independent.
Find the variance of the loss ratio, S/G.
5.83 (5A, 11/98, Q.23) (1 point) The distribution of aggregate claims, S, is compound Poisson
with λ = 3. Individual claim amounts are distributed as follows:
x    p(x)
1    0.40
2    0.20
3    0.40
Which of the following is the closest to the normal approximation of Pr[S > 9]?
A. 8%
B. 11%
C. 14%
D. 17%
E. 20%

5.84 (5A, 5/99, Q.24) (1 point) You are given the following information concerning the claim
severity, X, and the annual aggregate amount of claims, S:
E[X] = 50,000.
Var[X] = 500,000,000.
Var[S] = 30,000,000.
Assume that the claim sizes (X1, X2, ...) are identically distributed random variables, and that the
number of claims and the claim sizes are mutually independent.
Assume that the number of claims (N) follows a Poisson distribution.
What is the likelihood that there will be at least one claim next year?
A. Less than 5%
B. At least 5%, but less than 50%
C. At least 50%, but less than 95%
D. At least 95%
E. Cannot be determined from the above information.
5.85 (5A, 5/99, Q.38) (2.5 points) For a particular line of business, the aggregate claim amount S
follows a compound Poisson distribution. The aggregate number of claims N, has a mean of 350.
The dollar amount of each individual claim, xi, i = 1,..., N is uniformly distributed over the interval from
0 to 1000 . Assume that N and the Xi are mutually independent random variables.
Using the Normal Approximation, calculate the probability that S > 180,000.
5.86 (5A, 11/99, Q.38) (3 points) Use the following information:

• An insurer has a portfolio of 14,000 insured properties as shown below.
  Property Value    Number of Properties
  $20,000           3000
  $35,000           4000
  $60,000           5000
  $75,000           2000
• The annual probability of a claim for each of the insured properties is 0.04.
• Each property is independent of the others.
• Assume only total losses are possible.
• In order to reduce risk, the insurer buys reinsurance with a retention of $30,000 on each
  property. (For example, in the case of a loss of $75,000, the insurer would pay $30,000,
  while the reinsurer would pay $45,000.)
• The annual reinsurance premium is set at 125% of the expected excess annual claims.
Calculate the probability that the total cost (retained claims plus reinsurance cost) of insuring the
properties will exceed $28,650,000 in any year.
Use the Normal Approximation.

5.87 (Course 3 Sample Exam, Q.20) You are given:


An insured's claim severity distribution is described by an exponential distribution:
F(x) = 1 - e^(-x/1000).
The insured's number of claims is described by a negative binomial distribution with
β = 2 and r = 2.
A 500 per claim deductible is in effect.
Calculate the standard deviation of the aggregate losses in excess of the deductible.
A. Less than 2000
B. At least 2000 but less than 3000
C. At least 3000 but less than 4000
D. At least 4000 but less than 5000
E. At least 5000
5.88 (Course 3 Sample Exam, Q.25)
For aggregate losses S = X1 + X2 + . . . + XN, you are given:

N has a Poisson distribution with mean 500.


X1 , X2 , ... have mean 100 and variance 100.
N, X1 , X2 , ... are mutually independent.
You are also given:
For a portfolio of insurance policies, the loss ratio is the ratio of the aggregate losses
to aggregate premiums collected.
The premium collected is 1.1 times the expected aggregate losses.
Using the normal approximation to the compound Poisson distribution, calculate the probability that
the loss ratio exceeds 0.95.
5.89 (IOA 101, 4/00, Q.2) (2.25 points) Insurance policies providing car insurance are such that the
sizes of claims are normally distributed with mean 1,870 and standard deviation 610. In one month
50 claims are made. Assuming that claims are independent, calculate the probability that the total of
the claim sizes is more than 100,000.
5.90 (3, 5/00, Q.16) (2.5 points) You are given:
                      Mean      Standard Deviation
Number of Claims      8         3
Individual Losses     10,000    3,937
Using the normal approximation, determine the probability that the aggregate loss will exceed
150% of the expected loss.
(A) Φ(1.25)    (B) Φ(1.5)    (C) 1 - Φ(1.25)    (D) 1 - Φ(1.5)    (E) 1.5 Φ(1)
5.91 (3, 5/00, Q.19) (2.5 points) An insurance company sold 300 fire insurance policies as follows:
Number of Policies    Policy Maximum    Probability of Claim Per Policy
100                   400               0.05
200                   300               0.06
You are given:
(i) The claim amount for each policy is uniformly distributed between 0 and the policy maximum.
(ii) The probability of more than one claim per policy is 0.
(iii) Claim occurrences are independent.
Calculate the variance of the aggregate claims.
(A) 150,000
(B) 300,000
(C) 450,000
(D) 600,000
(E) 750,000
5.92 (3, 11/00, Q.8 & 2009 Sample Q.113) (2.5 points)
The number of claims, N, made on an insurance portfolio follows the following distribution:
n    Pr(N=n)
0    0.7
2    0.2
3    0.1
If a claim occurs, the benefit is 0 or 10 with probability 0.8 and 0.2, respectively.
The number of claims and the benefit for each claim are independent. Calculate the probability that
aggregate benefits will exceed expected benefits by more than 2 standard deviations.
(A) 0.02
(B) 0.05
(C) 0.07
(D) 0.09
(E) 0.12
5.93 (3, 11/00, Q.32 & 2009 Sample Q.118) (2.5 points) For an individual over 65:
(i) The number of pharmacy claims is a Poisson random variable with mean 25.
(ii) The amount of each pharmacy claim is uniformly distributed between 5 and 95.
(iii) The amounts of the claims and the number of claims are mutually independent.
Determine the probability that aggregate claims for this individual will exceed 2000 using the normal
approximation.
(A) 1 - Φ(1.33)
(B) 1 - Φ(1.66)
(C) 1 - Φ(2.33)
(D) 1 - Φ(2.66)
(E) 1 - Φ(3.33)
5.94 (3, 5/01, Q.29 & 2009 Sample Q.110) (2.5 points)
You are the producer of a television quiz show that gives cash prizes.
The number of prizes, N, and prize amounts, X, have the following distributions:
n    Pr(N = n)        x       Pr(X = x)
1    0.8              0       0.2
2    0.2              100     0.7
                      1000    0.1
Your budget for prizes equals the expected prizes plus the standard deviation of prizes.
Calculate your budget.
(A) 306
(B) 316
(C) 416
(D) 510
(E) 518

5.95 (3, 11/01, Q.7 & 2009 Sample Q.98) (2.5 points) You own a fancy light bulb factory.
Your workforce is a bit clumsy; they keep dropping boxes of light bulbs. The boxes have varying
numbers of light bulbs in them, and when dropped, the entire box is destroyed.
You are given:
Expected number of boxes dropped per month: 50
Variance of the number of boxes dropped per month: 100
Expected value per box: 200
Variance of the value per box: 400
You pay your employees a bonus if the value of light bulbs destroyed in a month is less than
8000.
Assuming independence and using the normal approximation, calculate the probability that you will
pay your employees a bonus next month.
(A) 0.16
(B) 0.19
(C) 0.23
(D) 0.27
(E) 0.31
5.96 (3, 11/02, Q.6 & 2009 Sample Q.91) (2.5 points) The number of auto vandalism claims
reported per month at Sunny Daze Insurance Company (SDIC) has mean 110 and variance 750.
Individual losses have mean 1101 and standard deviation 70.
The number of claims and the amounts of individual losses are independent.
Using the normal approximation, calculate the probability that SDIC's aggregate auto
vandalism losses reported for a month will be less than 100,000.
(A) 0.24
(B) 0.31
(C) 0.36
(D) 0.39
(E) 0.49
5.97 (CAS3, 11/03, Q.24) (2.5 points) Zoom Buy Tire Store, a nationwide chain of retail tire
stores, sells 2,000,000 tires per year of various sizes and models.
Zoom Buy offers the following road hazard warranty:
"If a tire sold by us is irreparably damaged in the first year after purchase, we'll replace it free,
regardless of the cause."
The average annual cost of honoring this warranty is $10,000,000, with a standard deviation of
$40,000.
Individual claim counts follow a binomial distribution, and the average cost to replace a tire is $100.
All tires are equally likely to fail in the first year, and tire failures are independent.
Calculate the standard deviation of the replacement cost per tire.
A. Less than $60
B. At least $60, but less than $65
C. At least $65, but less than $70
D. At least $70, but less than $75
E. At least $75

5.98 (CAS3, 11/03, Q.25) (2.5 points) Daily claim counts are modeled by the negative binomial
distribution with mean 8 and variance 15. Severities have mean 100 and variance 40,000.
Severities are independent of each other and of the number of claims.
Let σ be the standard deviation of a day's aggregate losses.
On a certain day, 13 claims occurred, but you have no knowledge of their severities.
Let σ* be the standard deviation of that day's aggregate losses, given that 13 claims occurred.
Calculate σ*/σ - 1.
A. Less than -7.5%
B. At least -7.5%, but less than 0
C. 0
D. More than 0, but less than 7.5%
E. At least 7.5%
5.99 (SOA3, 11/03, Q.4 & 2009 Sample Q.85) (2.5 points)
Computer maintenance costs for a department are modeled as follows:
(i) The distribution of the number of maintenance calls each machine will need in a year
is Poisson with mean 3.
(ii) The cost for a maintenance call has mean 80 and standard deviation 200.
(iii) The number of maintenance calls and the costs of the maintenance calls are all
mutually independent.
The department must buy a maintenance contract to cover repairs if there is at least a 10%
probability that aggregate maintenance costs in a given year will exceed 120% of the expected
costs. Using the normal approximation for the distribution of the aggregate maintenance costs,
calculate the minimum number of computers needed to avoid purchasing a maintenance contract.
(A) 80
(B) 90
(C) 100
(D) 110
(E) 120
5.100 (SOA3, 11/03, Q.33 & 2009 Sample Q.88) (2.5 points) A towing company provides all
towing services to members of the City Automobile Club. You are given:
(i)
Towing Distance     Towing Cost    Frequency
0-9.99 miles        80             50%
10-29.99 miles      100            40%
30+ miles           160            10%
(ii) The automobile owner must pay 10% of the cost and the remainder is paid by the City
Automobile Club.
(iii) The number of towings has a Poisson distribution with mean of 1000 per year.
(iv) The number of towings and the costs of individual towings are all mutually independent.
Using the normal approximation for the distribution of aggregate towing costs, calculate the
probability that the City Automobile Club pays more than 90,000 in any given year.
(A) 3%
(B) 10%
(C) 50%
(D) 90%
(E) 97%

5.101 (CAS3, 5/04, Q.19) (2.5 points) A company has a machine that occasionally breaks down.
An insurer offers a warranty for this machine. The number of breakdowns and their costs are
independent.
The number of breakdowns each year is given by the following distribution:
# of breakdowns    Probability
0                  50%
1                  20%
2                  20%
3                  10%
The cost of each breakdown is given by the following distribution:
Cost      Probability
1,000     50%
2,000     10%
3,000     10%
5,000     30%
To reduce costs, the insurer imposes a per claim deductible of 1,000.
Compute the standard deviation of the insurer's losses for this year.
A. 1,359
B. 2,280
C. 2,919
D. 3,092
E. 3,434
5.102 (CAS3, 5/04, Q.22) (2.5 points) An actuary determines that claim counts follow a negative
binomial distribution with unknown β and r. It is also determined that individual claim amounts are
independent and identically distributed with mean 700 and variance 1,300.
Aggregate losses have mean 48,000 and variance 80 million.
Calculate the values for β and r.
A. β = 1.20, r = 57.19
B. β = 1.38, r = 49.75
C. β = 2.38, r = 28.83
D. β = 1,663.81, r = 0.04
E. β = 1,664.81, r = 0.04
5.103 (CAS3, 5/04, Q.38) (2.5 points)


You are asked to price a Workers' Compensation policy for a large employer.
The employer wants to buy a policy from your company with an aggregate limit of 150% of total
expected loss. You know the distribution for aggregate claims is Lognormal.
You are also provided with the following:
                             Mean     Standard Deviation
Number of claims             50       12
Amount of individual loss    4,500    3,000
Calculate the probability that the aggregate loss will exceed the aggregate limit.
A. Less than 3.5%
B. At least 3.5%, but less than 4.5%
C. At least 4.5%, but less than 5.5%
D. At least 5.5%, but less than 6.5%
E. At least 6.5%
5.104 (CAS3, 5/04, Q.39) (2.5 points) PQR Re provides reinsurance to Telecom Insurance
Company. PQR agrees to pay Telecom for all losses resulting from "events", subject to a $500
per event deductible.
For providing this coverage, PQR receives a premium of $250.
Use a Poisson distribution with mean equal to 0.15 for the frequency of events.
Event severity is from the following distribution:
Loss      Probability
250       0.10
500       0.25
750       0.30
1,000     0.25
1,250     0.05
1,500     0.05
i = 0%
Using the normal approximation to PQR's annual aggregate losses on this contract, what is the
probability that PQR will payout more than it receives?
A. Less than 12%
B. At least 12%, but less than 13%
C. At least 13%, but less than 14%
D. At least 14%, but less than 15%
E. 15% or more

5.105 (CAS3, 11/04, Q.31) (2.5 points)


The mean annual number of claims is 103 for a group of 10,000 insureds.
The individual losses have an observed mean and standard deviation of 6,382 and 1,781,
respectively. The standard deviation of the aggregate claims is 22,874.
Calculate the standard deviation for the annual number of claims.
A. 1.47
B. 2.17
C. 4.72
D. 21.73
E. 47.23
5.106 (CAS3, 11/04, Q.32) (2.5 points)
An insurance policy provides full coverage for the aggregate losses of the Widget Factory.
The number of claims for the Widget Factory follows a negative binomial distribution with mean 25
and coefficient of variation 1.2. The severity distribution is given by a lognormal distribution with
mean 10,000 and coefficient of variation 3.
To control losses, the insurer proposes that the Widget Factory pay 20% of the cost of each loss.
Calculate the reduction in the 95th percentile of the normal approximation of the insurer's loss.
A. Less than 5%
B. At least 5%, but less than 15%
C. At least 15%, but less than 25%
D. At least 25%, but less than 35%
E. At least 35%
5.107 (SOA3, 11/04, Q.15 & 2009 Sample Q.125) (2.5 points) Two types of insurance claims
are made to an insurance company. For each type, the number of claims follows a Poisson
distribution and the amount of each claim is uniformly distributed as follows:
Type of Claim    Poisson Parameter for Number of Claims    Range of Each Claim Amount
I                12                                        (0, 1)
II               4                                         (0, 5)
The numbers of claims of the two types are independent and the claim amounts and claim
numbers are independent.
Calculate the normal approximation to the probability that the total of claim amounts
exceeds 18.
(A) 0.37
(B) 0.39
(C) 0.41
(D) 0.43
(E) 0.45

5.108 (CAS3, 5/05, Q.8) (2.5 points) An insurance company increases the per claim deductible of
all automobile policies from $300 to $500.
The mean payment and standard deviation of claim severity are shown below.
Deductible    Mean Payment    Standard Deviation
$300          1,000           256
$500          1,500           678
The claims frequency is Poisson distributed both before and after the change of deductible.
The probability of no claim increases by 30%, and the probability of having exactly one claim
decreases by 10%.
Calculate the percentage increase in the variance of the aggregate claims.
A. Less than 30%
B. At least 30%, but less than 50%
C. At least 50%, but less than 70%
D. At least 70%, but less than 90%
E. 90% or more
5.109 (CAS3, 5/05, Q.9) (2.5 points) Annual losses for the New Widget Factory can be modeled
using a Poisson frequency model with mean of 100 and an exponential severity model with mean of
$10,000. An insurance company agrees to provide coverage for that portion of any individual loss
that exceeds $25,000.
Calculate the standard deviation of the insurer's annual aggregate claim payments.
A. Less than $36,000
B. At least $36,000, but less than $37,000
C. At least $37,000, but less than $38,000
D. At least $38,000, but less than $39,000
E. $39,000 or more
5.110 (CAS3, 5/05, Q.40) (2.5 points)
An insurance company has two independent portfolios.
In Portfolio A, claims occur with a Poisson frequency of 2 per week and severities are distributed as
a Pareto with mean 1,000 and standard deviation 2,000.
In Portfolio B, claims occur with a Poisson frequency of 1 per week and severities are distributed as
a log-normal with mean 2,000 and standard deviation 4,000.
Determine the standard deviation of the combined losses for the next week.
A. Less than 5,500
B. At least 5,500, but less than 5,600
C. At least 5,600, but less than 5,700
D. At least 5,700, but less than 5,800
E. 5,800 or more

5.111 (SOA M, 5/05, Q.17 & 2009 Sample Q.164) (2.5 points)
For a collective risk model the number of losses has a Poisson distribution with λ = 20.
The common distribution of the individual losses has the following characteristics:
(i) E[X] = 70
(ii) E[X ∧ 30] = 25
(iii) Pr(X > 30) = 0.75
(iv) E[X² | X > 30] = 9000
An insurance covers aggregate losses subject to an ordinary deductible of 30 per loss.
Calculate the variance of the aggregate payments of the insurance.
(A) 54,000 (B) 67,500 (C) 81,000 (D) 94,500 (E) 108,000
Note: This past exam question has been rewritten.
5.112 (SOA M, 5/05, Q.31 & 2009 Sample Q.167) (2.5 points)
The repair costs for boats in a marina have the following characteristics:
Boat type        Number of    Probability that    Mean of repair         Variance of repair
                 boats        repair is needed    cost given a repair    cost given a repair
Power boats      100          0.3                 300                    10,000
Sailboats        300          0.1                 1000                   400,000
Luxury yachts    50           0.6                 5000                   2,000,000
At most one repair is required per boat each year.
The marina budgets an amount, Y, equal to the aggregate mean repair costs plus the standard
deviation of the aggregate repair costs.
Calculate Y.
(A) 200,000
(B) 210,000
(C) 220,000
(D) 230,000
(E) 240,000
5.113 (SOA M, 5/05, Q.40 & 2009 Sample Q.171) (2.5 points) For aggregate losses, S:
(i) The number of losses has a negative binomial distribution with mean 3 and variance 3.6.
(ii) The common distribution of the independent individual loss amounts is uniform
from 0 to 20.
Calculate the 95th percentile of the distribution of S as approximated by the normal distribution.
(A) 61
(B) 63
(C) 65
(D) 67
(E) 69

5.114 (CAS3, 11/05, Q.30) (2.5 points) On January 1, 2005, Dreamland Insurance sold 10,000
insurance policies that pay $100 for each day 2005 that a policyholder is in the hospital.
The following assumptions were used in pricing the policies:

• The probability that a given policyholder will be hospitalized during the year is 0.05.
• No policyholder will be hospitalized more than one time during the year.
• If a policyholder is hospitalized, the number of days spent in the hospital follows a
  lognormal distribution with μ = 1.039 and σ = 0.833.
Using the normal approximation, calculate the premium per policy such that there is a 90%
probability that total premiums will exceed total losses.
A. Less than 21.20
B. At least 21.20, but less than 21.50
C. At least 21.50, but less than 21.80
D. At least 21.80, but less than 22.10
E. At least 22.10
5.115 (CAS3, 11/05, Q.34) (2.5 points) Claim frequency follows a Poisson process with rate of
10 per year. Claim severity is exponentially distributed with mean 2,000.
The method of moments is used to estimate the parameters of a lognormal distribution for the
aggregate losses. Using the lognormal approximation, calculate the probability that annual
aggregate losses exceed 105% of expected annual losses.
A. Less than 34.5%
B. At least 34.5%, but less than 35.5%
C. At least 35.5%, but less than 36.5%
D. At least 36.5%, but less than 37.5%
E. At least 37.5%

5.116 (SOA M, 11/05, Q.34 & 2009 Sample Q.210) (2.5 points) Each life within a group
medical expense policy has loss amounts which follow a compound Poisson process with λ = 0.16.
Given a loss, the probability that it is for Disease 1 is 1/16.
Loss amount distributions have the following parameters:
                  Mean per loss    Standard Deviation per loss
Disease 1         5                50
Other diseases    10               20
Premiums for a group of 100 independent lives are set at a level such that the probability
(using the normal approximation to the distribution for aggregate losses) that aggregate
losses for the group will exceed aggregate premiums for the group is 0.24.
A vaccine which will eliminate Disease 1 and costs 0.15 per person has been discovered.
Define:
A = the aggregate premium assuming that no one obtains the vaccine, and
B = the aggregate premium assuming that everyone obtains the vaccine and the cost of the
vaccine is a covered loss.
Calculate A/B.
(A) 0.94
(B) 0.97
(C) 1.00
(D) 1.03
(E) 1.06
5.117 (SOA M, 11/05, Q.38 & 2009 Sample Q.212) (2.5 points) For an insurance:
(i) The number of losses per year has a Poisson distribution with λ = 10.
(ii) Loss amounts are uniformly distributed on (0, 10).
(iii) Loss amounts and the number of losses are mutually independent.
(iv) There is an ordinary deductible of 4 per loss.
Calculate the variance of aggregate payments in a year.
(A) 36
(B) 48
(C) 72
(D) 96
(E) 120
5.118 (SOA M, 11/05, Q.40) (2.5 points) Lucky Tom deposits the coins he finds on the way to
work according to a Poisson process with a mean of 22 deposits per month.
5% of the time, Tom deposits coins worth a total of 10.
15% of the time, Tom deposits coins worth a total of 5.
80% of the time, Tom deposits coins worth a total of 1.
The amounts deposited are independent, and are independent of the number of deposits.
Calculate the variance in the total of the monthly deposits.
(A) 180
(B) 210
(C) 240
(D) 270
(E) 300

5.119 (CAS3, 11/06, Q.29) (2.5 points)


Frequency of losses follows a binomial distribution with parameters m = 1,000 and q = 0.3.
Severity follows a Pareto distribution with parameters α = 3 and θ = 500.
Calculate the standard deviation of the aggregate losses.
A. Less than 7,000
B. At least 7,000, but less than 7,500
C. At least 7,500, but less than 8,000
D. At least 8,000, but less than 8,500
E. At least 8,500
5.120 (SOA M, 11/06, Q.21 & 2009 Sample Q.282) (2.5 points)
Aggregate losses are modeled as follows:
(i) The number of losses has a Poisson distribution with λ = 3.
(ii) The amount of each loss has a Burr (Burr Type XII, Singh-Maddala) distribution with α = 3, θ = 2,
and γ = 1.
(iii) The number of losses and the amounts of the losses are mutually independent.
Calculate the variance of aggregate losses.
(A) 12
(B) 14
(C) 16
(D) 18
(E) 20
5.121 (SOA M, 11/06, Q.32 & 2009 Sample Q.287) (2.5 points)
For an aggregate loss distribution S:
(i) The number of claims has a negative binomial distribution with r = 16 and β = 6.
(ii) The claim amounts are uniformly distributed on the interval (0, 8).
(iii) The number of claims and claim amounts are mutually independent.
Using the normal approximation for aggregate losses, calculate the premium such that the
probability that aggregate losses will exceed the premium is 5%.
5.122 (4, 5/07, Q.17) (2.5 points) You are given:
(i) Aggregate losses follow a compound model.
(ii) The claim count random variable has mean 100 and standard deviation 25.
(iii) The single-loss random variable has mean 20,000 and standard deviation 5000.
Determine the normal approximation to the probability that aggregate claims exceed 150% of
expected costs.
(A) 0.023
(B) 0.056
(C) 0.079
(D) 0.092
(E) 0.159

Solutions to Problems:
5.1. B. σ_A² = μ_F σ_S² + μ_S² σ_F² = (13)(200,000) + (300²)(37) = 5,930,000.
5.2. C. Frequency is Bernoulli with q = 2/3, with mean = 2/3 and variance = (2/3)(1/3) = 2/9.
Mean severity = 7.1, variance of severity = 72.1 - 7.1² = 21.69.
Thus σ_A² = μ_F σ_S² + μ_S² σ_F² = (2/3)(21.69) + (7.1²)(2/9) = 25.66.
For the severity the mean and the variance are computed as follows:
Probability    Size of Claim    Square of Size of Claim
20%            2                4
50%            5                25
30%            14               196
Mean           7.1              72.1
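As a quick arithmetic check (not part of the original solution), here is a short Python sketch of the same calculation, using the general formula Var[Agg] = E[N] Var[X] + E[X]² Var[N]; the variable names are my own.

# check of solution 5.2: Bernoulli frequency, discrete severity
probs = [0.20, 0.50, 0.30]
sizes = [2, 5, 14]

mean_sev = sum(p * x for p, x in zip(probs, sizes))        # 7.1
second_sev = sum(p * x**2 for p, x in zip(probs, sizes))   # 72.1
var_sev = second_sev - mean_sev**2                         # 21.69

q = 2/3                                                    # Bernoulli frequency
mean_freq, var_freq = q, q * (1 - q)

var_agg = mean_freq * var_sev + mean_sev**2 * var_freq
print(round(var_agg, 2))                                   # 25.66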
5.3. A. Since the frequency and severity are independent, the variance of the aggregate losses =
(mean frequency)(variance of severity) + (mean severity)²(variance of frequency)
= 0.25 {(variance of severity) + (mean severity)²} = 0.25 (2nd moment of the severity)
= (0.25/5000) ∫_0^5000 x² dx = (0.25/5000)(5000³/3) = 2,083,333.

5.4. E. The average aggregate loss is 106. The second moment of the aggregate losses is
16,940. Therefore, the variance = 16,940 - 106² = 5704.
Situation                       Probability    Aggregate Loss    Square of Aggregate Loss
1 claim @ 50                    60.0%          50                2,500
1 claim @ 200                   20.0%          200               40,000
2 claims @ 50 each              7.2%           100               10,000
2 claims: 1 @ 50 & 1 @ 150      9.6%           200               40,000
2 claims @ 150 each             3.2%           300               90,000
Overall                         100.0%         106               16,940
For example, the chance of 2 claims with one of size 50 and one of size 150 is the chance of having
two claims times the chance given two claims that one will be 50 and the other 150: (.2){(2)(.6)(.4)}
= 9.6%. In that case the aggregate loss is 50 + 150 = 200. One takes the weighted average over all
the possibilities.
Comment: Note that the frequency and severity are not independent.

5.5. A. Severity and frequency are independent. Frequency is Geometric with
β = (1 - 0.7)/0.7 = 0.4286.
The mean frequency is β = 0.4286, and the variance is β(1+β) = 0.6122.
The mean severity is 100.
The variance of the severity = (2/3)(50-100)² + (1/3)(200-100)² = 5000.
σ_A² = μ_F σ_S² + μ_S² σ_F² = (0.4286)(5000) + (100²)(0.6122) = 8265.
Comment: The chance of 0 claims is 0.7 = 1/(1+β), the chance of 1 claim is (0.3)(0.7),
the chance of 2 claims is (0.3²)(0.7), etc.
5.6. A. Use for each type the formula: variance of aggregate = μ_f σ_s² + μ_s² σ_f².
For example for Type I: μ_f = 1/3, σ_f² = (1/3)(2/3), (Bernoulli claim frequency), μ_s = 130,
σ_s² = (70%)(30²) + (30%)(70²) = 2100;
Type I variance of aggregate = (1/3)(2100) + (2/9)(130²) = 4456.
Type of    A Priori Chance    Mean     Variance    Mean        Variance of    Process Variance
Risk       of Risk            Freq.    of Freq.    Severity    Severity       of Agg.
I          0.333              0.333    0.222       130         2100           4456
II         0.333              0.500    0.250       150         2500           6875
III        0.333              0.667    0.222       170         2100           7822
Variance for the portfolio is: (100)(4456) + (100)(6875) + (100)(7822) = 1,915,300.
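A compact Python check of the per-type and portfolio variances above (my sketch; the per-type moments are those shown in the table).

# (mean freq, var freq, mean sev, var sev) for Types I, II, III
types = [
    (1/3, (1/3)*(2/3), 130, 2100),
    (1/2, (1/2)*(1/2), 150, 2500),
    (2/3, (2/3)*(1/3), 170, 2100),
]
per_type = [mf*vs + ms**2*vf for mf, vf, ms, vs in types]
print([round(v) for v in per_type])      # [4456, 6875, 7822]
print(round(100 * sum(per_type)))        # 1,915,278; the text rounds each type first, getting 1,915,300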


5.7. D. Since the frequency and severity are independent, the variance of the aggregate losses =
(mean frequency)(variance of severity) + (mean severity)²(variance of frequency)
= 5 {(variance of severity) + (mean severity)²} = 5 (2nd moment of the severity).
Second moment of the severity is: ∫_1^∞ x² (3.5 x^(-4.5)) dx = 2.333.
Therefore variance of aggregate losses = (5)(2.333) = 11.67.
Comment: The Severity is a Single Parameter Pareto Distribution.
5.8. B. The mean severity is: ∫_1^∞ x (3.5 x^(-4.5)) dx = 1.4.
Thus the mean aggregate loss is (5)(1.4) = 7. From the solution to the prior question, the variance of
the aggregate losses is: 11.667. Thus the standard deviation of the aggregate losses is
√11.667 = 3.416. To apply the Normal Approximation we subtract the mean and divide by the
standard deviation. The probability that the total losses will exceed 11 is approximately:
1 - Φ[(11 - 7)/3.416] = 1 - Φ(1.17) = 12.1%.
5.9. A. Mean = exp(μ + σ²/2) = 7. 2nd moment = exp(2μ + 2σ²) = 11.67 + 7² = 60.67.
Dividing the 2nd equation by the square of the 1st:
exp(2μ + 2σ²) / exp(2μ + σ²) = 60.67/7² ⇒ exp(σ²) = 1.238 ⇒ σ = √ln(1.238) = 0.4621.
μ = ln(7) - σ²/2 = 1.839. 1 - Φ[(ln(11) - 1.839)/0.4621] = 1 - Φ(1.21) = 11.31%.
Comment: Below shown as dots is the aggregate distribution approximated via simulation of
10,000 years, the Normal Approximation shown as the dotted line, and the LogNormal
Approximation shown as the solid line:
[Graph comparing the simulated aggregate density with the Normal and LogNormal approximations.]
Here is a similar graph of the righthand tail:

[Graph of the righthand tails of the three curves, for aggregate losses from about 16 to 24.]
As shown above, the Normal Distribution (dashed) has a lighter righthand tail than the LogNormal
Distribution (solid), with the aggregate distribution (dots) somewhere in between.
For example, S(20) = 0.6% for the LogNormal, while S(20) = 0.007% for the Normal.
For the simulation, S(20) = 0.205%, less than the LogNormal, but more than the Normal.
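For readers who want to reproduce that comparison, here is a minimal simulation sketch, assuming the model of problems 5.7-5.9 (Poisson frequency with mean 5, Single Parameter Pareto severity with α = 3.5, θ = 1); it is my own code, not the simulation used for the graphs.

import numpy as np
from scipy.stats import norm, lognorm

rng = np.random.default_rng(0)
lam, alpha = 5.0, 3.5
n_years = 10_000

counts = rng.poisson(lam, n_years)
# inverse transform: x = (1 - u)^(-1/alpha) is Single Parameter Pareto with theta = 1
agg = np.array([np.sum((1.0 - rng.random(n)) ** (-1/alpha)) for n in counts])

print("simulated S(20):", np.mean(agg > 20))
print("Normal approx S(20):", norm.sf(20, loc=7, scale=11.667**0.5))
print("LogNormal approx S(20):", lognorm.sf(20, s=0.4621, scale=np.exp(1.839)))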
5.10. D. The severity is a Single Parameter Pareto with α = 3.5 and θ = 1.
It has second moment of: (3.5)(1²)/(3.5-2) = 2.333, and third moment of: (3.5)(1³)/(3.5-3) = 7.
The Third Central Moment of a Compound Poisson Distribution is:
(mean frequency)(third moment of the severity) = (5)(7) = 35.
Variance of the aggregate losses is: λ(2nd moment of severity) = (5)(2.333) = 11.67.
Therefore, skewness = 35/(11.67)^1.5 = 0.878.
Alternately, skewness of a compound Poisson =
(third moment of the severity) / {√λ (2nd moment of severity)^1.5} = 7/{√5 (2.333^1.5)} = 0.878.
5.11. B. Since the frequency and severity are independent, and frequency is Poisson with mean 5,
the process variance of the aggregate losses = (5)(2nd moment of the severity).
The severity distribution is Single Parameter Pareto. F(x) = 1 - x^(-3.5), prior to the effects of the
maximum covered loss. The 2nd moment of the severity after the maximum covered loss is:
∫_1^5 x² (3.5 x^(-4.5)) dx + (5²)S(5) = -(3.5/1.5) x^(-1.5) evaluated from x = 1 to x = 5, plus (25)(5^(-3.5))
= 2.125 + 0.089 = 2.214.
Therefore, the variance of the aggregate losses = (5)(2.214) = 11.07.
Comment: For a Single Parameter Pareto Distribution,
E[(X ∧ x)²] = αθ²/(α - 2) - 2θ^α/{(α - 2)x^(α-2)} =
(3.5)(1²)/(3.5 - 2) - (2)(1^3.5)/{(3.5 - 2)(5^(3.5-2))} = 2.333 - 0.119 = 2.214.
5.12. D. σ_A² = μ_F σ_S² + μ_S² σ_F² = λ(θ²) + (θ²)(λ) = 2λθ².
5.13. C. The third moment of the Exponential severity is 6θ³. The Third Central Moment of a
Compound Poisson Distribution is: (mean frequency)(third moment of the severity) = 6λθ³.
From the solution to the previous question, the variance of the aggregate losses is 2λθ².
Therefore, the skewness of the aggregate losses is: 6λθ³ / {2λθ²}^1.5 = 3/√(2λ).
5.14. E. For a Poisson frequency with severity and frequency independent, the process variance
of the aggregate losses = λ(2nd moment of the severity) = (3)(200) = 600.
5.15. E. The mean aggregate loss for N automobiles is: N(0.03)(3000/2) = 45N.
Second moment of the uniform distribution from 0 to 3000 is: 3000²/12 + 1500² = 3,000,000.
Variance of aggregate loss for N automobiles is: N(0.03)(3,000,000) = 90,000N.
Prob[aggregate ≤ 160% of expected] ≅ Φ[(0.6)(45N)/√(90,000N)] = Φ[0.09√N].
We want this probability to be at least 95%. ⇒ 0.09√N ≥ 1.645 ⇒ N ≥ 334.1.
Comment: Similar to SOA3, 11/03, Q.4.
5.16. C. This is a compound Poisson. In units of bases, the mean severity is:
(1)(0.22) + (2)(0.04) + (3)(0.01) + (4)(0.05) = 0.530. (This is Don's expected slugging percentage.)
The second moment of the severity is: (1)(0.22) + (4)(0.04) + (9)(0.01) + (16)(0.05) = 1.27.
Thus the variance of the aggregate losses is (600)(1.27) = 762.
The mean of the aggregate losses is: (600)(0.530) = 318 bases. The chance that Don will have at
most $700,000 in incentives is the chance that he has no more than 350 total bases:
Φ[(350.5 - 318)/√762] = Φ(1.177) = 88.0%.
5.17. E. For the Binomial frequency: mean = mq = 3.2, variance = mq(1-q) = 1.92.
For the Pareto severity: mean = θ/(α-1) = 333.333, second moment = 2θ²/{(α-1)(α-2)} = 333,333,
variance = 333,333 - 333.333² = 222,222.
Since the frequency and severity are independent:
σ_A² = μ_F σ_S² + μ_S² σ_F² = (3.2)(222,222) + (333.333²)(1.92) = 924,443.
5.18. D. The 2nd moment of a LogNormal Distribution is:
exp(2μ + 2σ²) = exp[2(7) + 2(0.5²)] = exp(14.5) = 1,982,759.
Since the frequency is Poisson and the frequency and severity are independent:
σ_A² = (mean frequency)(2nd moment of the severity) = (3)(1,982,759) = 5,948,278.
5.19. B. For the Negative Binomial frequency: mean = rβ = 6, variance = rβ(1+β) = 18.
For the Gamma severity: mean = αθ = 600, variance = αθ² = 120,000.
Since the frequency and severity are independent:
σ_A² = μ_F σ_S² + μ_S² σ_F² = (6)(120,000) + (600²)(18) = 7,200,000.
5.20. C. The variances of independent risks add.
(55)(924,443) + (35)(5,948,278) + (10)(7,200,000) = 331 million.
5.21. D. For the Negative Binomial, mean = rβ = 6, variance = rβ(1+β) = 18,
skewness = (1 + 2β)/√(rβ(1+β)) = 5/√18 = 1.1785.
For the Gamma severity: mean = αθ = 600, variance = αθ² = 120,000,
skewness = 2/√α = 1.1547.
From a previous solution, σ_A² = 7,200,000. Since the frequency and severity are independent:
γ_A = {μ_F σ_X³ γ_X + 3 σ_F² μ_X σ_X² + σ_F³ γ_F μ_X³} / σ_A³ =
{(6)(120,000^1.5)(1.1547) + (3)(18)(600)(120,000) + (18^1.5)(1.1785)(600³)} / 7,200,000^1.5 =
1.222.
Comment: Well beyond what you should be asked on your exam!
The skewness is a dimensionless quantity, which does not depend on the scale. Therefore, we
would have gotten the same answer for the skewness if we had set the scale parameter of the
Gamma, θ = 1, including in the calculation of σ_A².
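A numeric check of that skewness calculation in Python, using the parameters implied by the solution (Negative Binomial with r = 3, β = 2; Gamma with α = 3, θ = 200) and the standard formula for the third central moment of a compound distribution.

import math

# Negative Binomial frequency: r = 3, beta = 2
r, beta = 3, 2
mean_f, var_f = r*beta, r*beta*(1+beta)
mu3_f = r*beta*(1+beta)*(1+2*beta)          # third central moment of the frequency

# Gamma severity: alpha = 3, theta = 200
alpha, theta = 3, 200
mean_x, var_x = alpha*theta, alpha*theta**2
mu3_x = 2*alpha*theta**3                    # third central moment of the severity

var_agg = mean_f*var_x + mean_x**2*var_f                          # 7,200,000
mu3_agg = mean_f*mu3_x + 3*var_f*mean_x*var_x + mu3_f*mean_x**3
print(round(mu3_agg / var_agg**1.5, 3))                           # about 1.222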
5.22. B. S(1000) = e^(-1000/5000) = 0.8187. The frequency of non-zero payments is Poisson with
mean: (2.4)(0.8187) = 1.965. The severity distribution truncated and shifted at 1000 is also an
exponential with mean 5000. The mean aggregate losses excess of the deductible is
(1.965)(5000) = 9825.
5.23. E. The frequency of non-zero payments is Poisson with λ = 1.965. The severity of non-zero
payments is an Exponential distribution with θ = 5000, with second moment 2(5000²).
Thus the variance of the aggregate losses excess of the deductible is: (1.965)(2)(5000²).
The standard deviation is: 5000√3.93 = 9912.
5.24. D. For the Exponential: E[X ∧ x] = θ(1 - e^(-x/θ)).
E[X ∧ 10,000] = (5000)(1 - e^(-10000/5000)) = 4323.
Thus the mean aggregate losses are: (2.4)(4323) = 10,376.
5.25. D. For the Exponential: E[(X ∧ x)²] = 2θ²Γ[3; x/θ] + x²e^(-x/θ). E[(X ∧ 10,000)²] =
2(5000²)Γ[3; 2] + (100 million)(e^(-2)) = (50 million)(0.3233) + 13.52 million = 29.7 million.
Thus the variance of the aggregate losses is: (2.4)(29.7 million) = 71.3 million, for a standard
deviation of 8443.
Comment: Using Theorem A.1 in Loss Models: Γ[3; 2] = 1 - e^(-2){1 + 2 + 2²/2} = 0.3233.
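The limited second moment can be checked numerically; a short Python sketch (scipy's gammainc is the regularized incomplete gamma Γ[α; x] used in the formula above).

import math
from scipy.special import gammainc
from scipy.integrate import quad

theta, d = 5000.0, 10000.0
closed_form = 2*theta**2*gammainc(3, d/theta) + d**2*math.exp(-d/theta)

# brute-force check: integrate min(x, d)^2 against the Exponential density
numeric, _ = quad(lambda x: min(x, d)**2 * math.exp(-x/theta)/theta,
                  0, 50*theta, points=[d])
print(round(closed_form/1e6, 1), round(numeric/1e6, 1))   # both about 29.7 (million)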
5.26. C. S(1000) = e^(-1000/5000) = 0.8187. The frequency of non-zero payments is Poisson with
mean (2.4)(0.8187) = 1.965. The severity distribution truncated and shifted at 1000 is also an
Exponential with mean 5000. The maximum covered loss reduces the maximum payment to:
10,000 - 1,000 = 9,000. For the Exponential: E[X ∧ x] = θ(1 - e^(-x/θ)). Thus, the average non-zero
payment is: E[X ∧ 9,000] = (5000)(1 - e^(-9000/5000)) = 4174. Alternately, the average non-zero
payment is: (E[X ∧ 10,000] - E[X ∧ 1,000])/S(1000) = (4323 - 906)/0.8187 = 4174.
Thus the mean aggregate losses are: (1.965)(4174) = 8202.
5.27. C. The frequency of non-zero payments is Poisson with λ = 1.965. The severity of
non-zero payments is an Exponential distribution with θ = 5000, censored at
10,000 - 1000 = 9000, with second moment E[(X ∧ 9,000)²] =
2(5000²)Γ[3; 9000/5000] + (9000²)e^(-9000/5000) = (50 million)(0.2694) + 13.39 million =
26.86 million. Thus the variance of the aggregate losses is: (1.965)(26.86 million) =
52.78 million, for a standard deviation of 7265.
Comment: Using Theorem A.1 in Loss Models: Γ[3; 1.8] = 1 - e^(-1.8){1 + 1.8 + 1.8²/2} = 0.2694.
5.28. C. For each interval [a,b], the first moment is (a+b)/2.
Lower       Upper       Number       First     Contribution
Endpoint    Endpoint    of Losses    Moment    to the Mean
0           1           60           0.50      30.00
1           3           30           2.00      60.00
3           5           20           4.00      80.00
5           10          10           7.50      75.00
                                               245
Mean = ((60)(0.5) + (30)(2) + (20)(4) + (10)(7.5) + 12 + 15 + 17 + 20 + 30)/125 = 2.71.
5.29. D. For each interval [a,b], the second moment is: (b³ - a³)/(3(b-a)).
Lower       Upper       Number       Second    Contribution
Endpoint    Endpoint    of Losses    Moment    to 2nd Moment
0           1           60           0.33      20.00
1           3           30           4.33      130.00
3           5           20           16.33     326.67
5           10          10           58.33     583.33
                                               1,060
We add to these contributions, those of each of the large losses; 2nd moment =
{(60)(0.33) + (30)(4.33) + (20)(16.33) + (10)(58.33) + 12² + 15² + 17² + 20² + 30²}/125 = 24.14.
Variance = 24.14 - 2.71² = 16.8.
5.30. B. In the interval 5 to 10, 3/5 of the losses are assumed to be of size greater than 7.
There are (3/5)(10) = 6 such losses of average size (7 + 10)/2 = 8.5.
Thus they contribute (6)(8.5 - 7) = 9 to the layer excess of 7.
The 5 large losses contribute: 5 + 8 + 10 + 13 + 23 = 59. e(7) = (9 + 59)/(6 + 5) = 6.2.
5.31. A. Mean aggregate loss is: (40)(2.71) = 108.4.
Variance of aggregate loss is: (40)(2nd moment of severity) = (40)(24.14) = 965.5.
CV of aggregate loss is: √965.5 / 108.4 = 0.29.
5.32. E. In the interval 5 to 10, there are 10 loses of average size 7.5.
Thus they contribute (10)(7.5 - 5) = 25 to the layer from 5 to 15.
The 5 individual large losses contribute: (12 - 5) + (15 - 5) + 10 + 10 + 10 = 47.
The payment per loss is: (25 + 47)/125 = 0.576.
For 40 losses the reinsurer expects to pay: (40)(0.576) = 23.0.

5.33. A. The contributions to this layer from the losses in interval [5, 10], are uniform on
[0, 5]; the second moment is: (5³ - 0³)/(3(5-0)) = 8.333. The second moment of the ceded losses
is: ((10)(8.333) + (12 - 5)² + (15 - 5)² + 10² + 10² + 10²)/125 = 4.259.
Variance of aggregate ceded losses = (4.259)(40) = 170.4.
CV of aggregate ceded losses = √170.4 / 23.0 = 0.57.
5.34. E. S(50000) = {20000/(20000 + 50000)}^3.2 = (2/7)^3.2 = 0.01815.
Therefore, the frequency of non-zero payments is:
Negative Binomial with r = 4.1 and β = (2.8)(0.01815) = 0.05082.
The mean frequency is (4.1)(0.05082) = 0.2084.
The variance of the frequency is: (4.1)(0.05082)(1.0508) = 0.2190.
Truncating and shifting from below produces another Pareto; the severity of non-zero payments is
also Pareto with α = 3.2 and θ = 20000 + 50000 = 70000. This Pareto has mean 70000/2.2 =
31,818 and variance (3.2)(70000²)/((1.2)(2.2²)) = 2700 million.
Thus the variance of the aggregate losses excess of the deductible is:
(0.2084)(2700 million) + (0.2190)(31818²) = 784.3 million. The standard deviation is: 28.0 thousand.
The mean of the aggregate losses excess of the deductible is: (0.2084)(31818) = 6631.
Thus the chance that the aggregate losses excess of the deductible are greater than 15,000 is
approximately: 1 - Φ[(15,000 - 6631)/28,000] = 1 - Φ[0.30] = 38.2%.
5.35. B. We are mixing Poisson frequencies via a Gamma, therefore frequency for the portfolio is a
Negative Binomial with r = α = 5 and β = θ = 0.4, per policy,
with mean: (5)(0.4) = 2, and variance: (5)(0.4)(1.4) = 2.8.
The mean loss per policy is: (2)(20) = 40.
The variance of the loss per policy is: (2)(300) + (20²)(2.8) = 1720.
For 200 independent policies, Mean Aggregate Loss = (200)(40) = 8000.
110% of mean aggregate loss is 8800.
Variance of Aggregate Loss = (200)(1720) = 344,000.
Prob(Aggregate Loss > 1.1 mean) = Prob(Aggregate Loss > 8800) ≅
1 - Φ[(8800 - 8000)/√344,000] = 1 - Φ(1.36) = 8.7%.
5.36. C. For N independent policies, Mean Aggregate Loss = 40N, and
Variance of Aggregate Loss = 1720N.
Prob(Aggregate Loss > 1.1 mean) ≅ 1 - Φ[(0.1)(40N)/√(1720N)] = 1 - Φ(0.09645√N).
We want this probability to be at most 1%. ⇒ 0.09645√N ≥ 2.326 ⇒ N ≥ 582.
Comment: Similar to SOA3, 11/03 Q.4.
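A small Python sketch of the same search (my code), using the per-policy moments from the previous solution.

from scipy.stats import norm

mean_pol, var_pol = 40.0, 1720.0
z = norm.ppf(0.99)            # 2.326

N = 1
while 0.1 * mean_pol * N < z * (var_pol * N) ** 0.5:
    N += 1
print(N)                      # 582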
5.37. E. μ_f = 10. σ_f² = 20. μ_s = 1000. σ_s² = 200,000.
Variance of the aggregate: μ_f σ_s² + μ_s² σ_f² = (10)(200,000) + (1000²)(20) = 22,000,000.
σ = 4690. Now if we know that there have been 8 claims, then the aggregate is the sum of 8
independent, identically distributed severities. Var[Aggregate] = 8 Var[Severity] =
(8)(200,000) = 1,600,000. σ* = √1,600,000 = 1265. σ/σ* = 4690/1265 = 3.7.
Comment: Similar to CAS3, 11/03, Q.25.


5.38. D. Mean severity is: 5000 + (2000)(0.75) = 6500.
Let X = room charges, Y = other charges, Z = payment. Z = X + 0.75Y.
Var[Z] = Var[X + 0.75Y] = Var[X] + 0.75² Var[Y] + (2)(0.75)Cov[X, Y] =
8000² + (0.75²)(3000²) + (2)(0.75){(0.6)(8000)(3000)} = 90.66 million.
Variance of Aggregate = (Mean Freq.)(Variance of Sev.) + (Mean Severity)²(Var. of Freq.)
= (0.4)(90.66 million) + (6500²)(0.36) = 51.47 million.
Standard Deviation of Aggregate = √(51.47 million) = 7174.

5.39. C. Mean loss is: (4)(1000) = 4000. Variance of loss is: (4)(1000²) = 4 million.
Mean loss adjustment expense is: (3)(200) = 600.
Variance of loss adjustment expense is: (3)(200²) = 0.12 million.
Var[Loss + LAE] = Var[Loss] + Var[LAE] + 2Cov[Loss, LAE] =
4 million + 0.12 million + (2)(0.8)√((4 million)(0.12 million)) = 5.2285 million.
Variance of Aggregate = (Mean Freq.)(Variance of Sev.) + (Mean Severity)²(Var. of Freq.)
= (0.6)(5.2285 million) + (4600²)(0.6) = 15.83 million.
Standard Deviation of Aggregate = √(15.83 million) = 3979.
5.40. D. One has to recognize this as a compound Poisson, with p(x) the severity, and frequency
3^n e^(-3)/n!. Frequency is Poisson with λ = 3.
The second moment of the severity is: (0.5)(1²) + (0.3)(2²) + (0.2)(3²) = 3.5.
The variance of aggregate losses is: (3)(3.5) = 10.5.
Comment: Similar to Course 151 Sample Exam #1, Q.10.
5.41. C. Matching the mean of the LogNormal and the aggregate distribution:
exp(μ + 0.5σ²) = 100.
Matching the second moments: exp(2μ + 2σ²) = 90,000 + 100² = 100,000.
Divide the second equation by the square of the first equation:
exp(2μ + 2σ²)/exp(2μ + σ²) = exp(σ²) = 10.
σ = √ln(10) = 1.517. μ = ln(100) - σ²/2 = 3.455.
Prob[agg. > 2000] ≅ 1 - F(2000) = 1 - Φ[(ln(2000) - 3.455)/1.517] = 1 - Φ[2.73] = 0.0032.
5.42. C. The severity has: mean = θ/(α-1) = 50, and second moment = 2θ²/{(α-1)(α-2)} = 10,000.
The mean aggregate loss is: (7)(50) = 350.
Since the frequency and severity are independent, and frequency is Poisson, the variance of the
aggregate losses = (mean frequency)(2nd moment of the severity) = (7)(10,000) = 70,000.
For the LogNormal Distribution the mean is exp[μ + 0.5σ²], while the second moment is
exp[2μ + 2σ²]. Matching the first 2 moments of the aggregate losses to that of the LogNormal
Distribution: exp[μ + 0.5σ²] = 350 and exp[2μ + 2σ²] = 70,000 + 350² = 192,500. We can solve by
dividing the square of the 1st equation into the 2nd equation: exp[σ²] = 192,500/350² = 1.571.
Thus σ = 0.672 and thus μ = 5.632.
Therefore the probability that the total losses will exceed 1000 is approximately:
1 - Φ[(ln(1000) - 5.632)/0.672] = 1 - Φ[1.90] = 2.9%.
5.43. Due to the memoryless property of the Exponential, the payments excess of a deductible
follow the same Exponential Distribution as the ground up losses.
Thus the second moment of (non-zero) payments is 2θ².
The number of (non-zero) payments with a deductible b is Poisson with mean:
λS(b) = λe^(-b/θ).
Therefore, with deductible b, B = variance of aggregate payments = λe^(-b/θ) 2θ².
With deductible c, C = variance of aggregate payments = λe^(-c/θ) 2θ².
C/B = e^((b-c)/θ). Since c > b, this ratio is less than one.
Comment: In the case of an Exponential severity, the variance of aggregate payments decreases
as the deductible increases.
Similar to CAS3, 5/05, Q.9.
5.44. D. Due to the memoryless property of the Exponential, the payments excess of a
deductible follow the same Exponential Distribution as the ground up losses.
Thus the second moment of (non-zero) payments is: (2)(400²) = 320,000.
The number of (non-zero) payments is Poisson with mean: 3e^(-500/400) = 0.85951.
Therefore, variance of aggregate payments = (0.85951)(320,000) = 275,045.
Alternately, for the Exponential Distribution, E[X] = θ = 400, and E[X²] = 2θ² = 320,000.
For the Exponential Distribution, E[X ∧ x] = θ(1 - e^(-x/θ)).
E[X ∧ 500] = 400(1 - e^(-500/400)) = 285.40.
For the Exponential, E[(X ∧ x)^n] = n! θ^n Γ(n+1; x/θ) + x^n e^(-x/θ).
E[(X ∧ 500)²] = (2)(400²)Γ(3; 500/400) + (500²)e^(-500/400).
According to Theorem A.1 in Loss Models, for integral α, the incomplete Gamma function
Γ(α; y) is 1 minus the first α densities of a Poisson Distribution with mean y.
Γ(3; y) = 1 - e^(-y)(1 + y + y²/2). Γ(3; 1.25) = 1 - e^(-1.25)(1 + 1.25 + 1.25²/2) = 0.13153.
Therefore, E[(X ∧ 500)²] = (320,000)(0.13153) + 250,000e^(-1.25) = 113,716.
The first moment of the layer from 500 to ∞ is: E[X] - E[X ∧ 500] = 400 - 285.40 = 114.60.
Second moment of the layer from 500 to ∞ is:
E[X²] - E[(X ∧ 500)²] - (2)(500)(E[X] - E[X ∧ 500]) = 320,000 - 113,716 - (1000)(114.60) = 91,684.
The number of losses is Poisson with mean 3.
Thus the variance of the aggregate payments excess of the deductible is: (3)(91,684) = 275,052.
Alternately, one can work directly with the integrals, using integration by parts.
The second moment of the layer from 500 to ∞ is:
∫_500^∞ (x - 500)² e^(-x/400)/400 dx
= ∫_500^∞ x² e^(-x/400)/400 dx - 2.5 ∫_500^∞ x e^(-x/400) dx + 625 ∫_500^∞ e^(-x/400) dx
= [-(x² + 800x + 320,000) e^(-x/400)] + [(1000x + 400,000) e^(-x/400)] + [-250,000 e^(-x/400)],
each evaluated from x = 500 to x = ∞,
= e^(-1.25) {250,000 + 400,000 + 320,000 - 500,000 - 400,000 + 250,000} = 91,682.
The variance of the aggregate payments excess of the deductible is: (3)(91,682) = 275,046.
Comment: Similar to CAS3, 5/05, Q.9.
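A numeric cross-check of the 91,682 figure by direct integration (my sketch, not part of the original solution).

import math
from scipy.integrate import quad

theta, d = 400.0, 500.0
second_moment_layer, _ = quad(
    lambda x: (x - d)**2 * math.exp(-x/theta)/theta, d, 60*theta)
print(round(second_moment_layer))         # about 91,682
print(round(3 * second_moment_layer))     # variance of aggregate payments, about 275,046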

5.45. The payments excess of a deductible d follow a Pareto Distribution with parameters α and
θ + d. Thus the second moment of (non-zero) payments is 2(θ + d)²/{(α - 1)(α - 2)}.
The number of (non-zero) payments with a deductible b is Poisson with mean:
λS(b) = λ{θ/(θ + b)}^α.
Therefore, with deductible b, B = λ{θ/(θ + b)}^α 2(θ + b)²/{(α - 1)(α - 2)}.
With deductible c, C = variance of aggregate payments = λ{θ/(θ + c)}^α 2(θ + c)²/{(α - 1)(α - 2)}.
C/B = {(θ + b)/(θ + c)}^(α-2). Since α > 2 and c > b, this ratio is less than one.
Comment: As α approaches 2, the ratio C/B approaches one. For α ≤ 2, the second moment of
the Pareto does not exist, and neither does the variance of aggregate payments.
Here the variance of the aggregate payments decreases as the deductible increases.
In CAS3, 5/05, Q.8, the variance of aggregate payments increases as the deductible increases.
5.46. A. S(42) ≅ 1 - Φ[(42 - 20)/10] = 1 - Φ[2.2] = 1 - 0.9861 = 1.39%.
5.47. E. exp[μ + σ²/2] = 20. exp[2μ + 2σ²] = 100 + 20² = 500. exp[σ²] = 500/20² = 1.25.
σ = 0.4724. μ = 2.8842.
S(42) ≅ 1 - Φ[(ln(42) - 2.8842)/0.4724] = 1 - Φ[1.81] = 3.51%.
5.48. D. αθ = 20. αθ² = 100. θ = 5. α = 4.
S(42) ≅ 1 - Γ[4; 42/5] = 1 - Γ[4; 8.4] = e^(-8.4)(1 + 8.4 + 8.4²/2 + 8.4³/6) = 3.23%.
Comment: An example of the method of moments.

5.49. E. μ = 20. μ³/θ = 100. θ = 80.
S(42) ≅ 1 - Φ[(42/20 - 1)√(80/42)] - exp[(2)(80)/20] Φ[-(42/20 + 1)√(80/42)] =
1 - Φ[1.52] - e^8 Φ[-4.278].
Φ[-4.278] = 1 - Φ[4.278] ≅ {exp[-4.278²/2]/√(2π)}(1/4.278 - 1/4.278³ + 3/4.278⁵ - 15/4.278⁷) =
9.423 x 10^-9.
S(42) ≅ 1 - 0.9357 - (2981)(9.423 x 10^-9) = 0.0643 - 0.0281 = 3.62%.
Comment: An example of the method of moments.
5.50. B. The mean of c times a Poisson is cλ. The variance of c times a Poisson is c²λ.
cλ = 20. c²λ = 100. c = 5. λ = 4. 5N > 42. ⇔ N > 42/5 = 8.2.
S(42) ≅ 1 - e^(-4)(1 + 4 + 4²/2 + 4³/6 + 4⁴/4! + 4⁵/5! + 4⁶/6! + 4⁷/7! + 4⁸/8!) = 2.14%.
Comment: Well beyond what you are likely to be asked on your exam! Since Var[cX]/E[cX] =
cVar[X]/E[X], for a c > 1, the Over-dispersed Poisson Distribution has a variance greater than its
mean. See for example "A Primer on the Exponential Family of Distributions," by David R. Clark
and Charles Thayer, CAS 2004 Discussion Paper Program.
5.51. D. θ/(α - 1) = 20. θ²/{(α - 1)(α - 2)} = 100 + 20² = 500. ⇒ (α - 1)/(α - 2) = 500/400 = 1.25.
⇒ α = 6. θ = 100. S(42) ≅ Γ[6; 100/42] = Γ[6; 2.381] =
1 - e^(-2.381)(1 + 2.381 + 2.381²/2 + 2.381³/6 + 2.381⁴/24 + 2.381⁵/120) = 3.45%.
Comment: Beyond what you are likely to be asked on your exam. An example of the method of
moments. Which distribution is used to approximate the Aggregate Distribution can make a very
significant difference!
From lightest to heaviest righthand tail, the approximating distributions are:
Normal, Over-dispersed Poisson, Gamma, Inverse Gaussian, LogNormal, Inverse Gamma.
Here is a table of the inverse of the survival functions for various sizes:
Distribution                1/S(40)    1/S(50)    1/S(60)    1/S(70)    1/S(80)    1/S(90)     1/S(100)
Normal                      43.96      740.8      31,574     3.5e+6     1.0e+9     7.8e+11     1.5e+15
Over-dispersed Poisson      46.81      352.1      3653       50,171     882,744    1.9e+7      5.2e+8
Gamma                       23.60      96.75      436.3      2109       10,736     59,947      312,137
Inverse Gaussian            21.87      68.21      212.4      657.3      2019       6152        18,623
LogNormal                   22.60      67.63      192.0      515.9      1315       3194        7423
Inverse Gamma               23.80      60.37      136.9      283.4      544.0      981.9       1683
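A few of the table entries can be reproduced with the moment-matched approximations (mean 20, variance 100); a short Python sketch of the Normal, Gamma, and LogNormal columns (my code, not the author's).

import math
from scipy.stats import norm, gamma, lognorm

mean, var = 20.0, 100.0
sigma2 = math.log(1 + var/mean**2)          # LogNormal: exp(sigma^2) = 1.25
mu = math.log(mean) - sigma2/2

for x in (40, 60, 100):
    print(x,
          round(1/norm.sf(x, loc=mean, scale=var**0.5), 2),
          round(1/gamma.sf(x, a=4, scale=5), 2),
          round(1/lognorm.sf(x, s=sigma2**0.5, scale=math.exp(mu)), 2))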

5.52. B. Mean aggregate per policy: (2)(5) = 10.
Variance of aggregate per policy: λ(2nd moment of severity) = (2)(12² + 5²) = 338.
For N policies, the mean is: 10N, and the variance is: 338N. 115% of the mean is: 11.5N.
Prob[Aggregate > 115% of mean] ≅ 1 - Φ[(11.5N - 10N)/√(338N)] = 1 - Φ[0.08159√N].
This probability ≤ 2.5%. ⇔ Φ[0.08159√N] ≥ 97.5%.
Φ[1.960] = 0.975. We want 0.08159√N ≥ 1.960. ⇒ N ≥ 577.1.
Comment: Similar to SOA3, 11/03, Q.4.
5.53. D. Due to the memoryless property of the Exponential, the payments excess of a
deductible follow the same Exponential Distribution as the ground up losses.
The size of payments has mean 1700, and variance 1700² = 2.89 million.
For the original Exponential, S(1000) = exp[-1000/1700] = 0.5553.
Thus the number of (non-zero) payments is Negative Binomial with r = 4, and
β = (0.5553)(3) = 1.666.
The number of payments has mean: (4)(1.666) = 6.664, and variance: (4)(1.666)(2.666) = 17.766.
Therefore, the variance of aggregate payments is:
(6.664)(2.89 million) + (1700²)(17.766) = 70.6 million.
5.54. B. & 5.55. A. Mean aggregate = (10000)(0.03)(12.5) + (15000)(0.05)(25) = 22,500.
Policy Type one has a mean severity of 12.5 and a variance of the severity of
(25 - 0)²/12 = 52.083. Policy Type one has a mean frequency of 0.03 and a variance of the
frequency of (0.03)(0.97) = 0.0291. Thus, a single policy of type one has a variance of aggregate
losses of: (0.03)(52.083) + (12.5²)(0.0291) = 6.109.
Policy Type two has a mean severity of 25 and a variance of the severity of
(50 - 0)²/12 = 208.333. Policy Type two has a mean frequency of 0.05 and a variance of the
frequency of (0.05)(0.95) = 0.0475. Thus, a single policy of type two has a variance of aggregate
losses of: (0.05)(208.333) + (25²)(0.0475) = 40.104.
Therefore, the variance of the aggregate losses of 10000 independent policies of type one and
15000 policies of type two is: (10000)(6.109) + (15000)(40.104) = 662,650.
Standard Deviation of aggregate losses is: 814.037.
Prob[Aggregate > 24,000] ≅ 1 - Φ[(24,000 - 22,500)/814.037] = 1 - Φ[1.84] = 3.3%.
Comment: Similar to 3, 5/00, Q.19.
5.56. C. The aggregate distribution of Property Damage Liability has mean 10λ, and variance
λ(15² + 10²) = 325λ. Φ[1.282] = 90%. Therefore, P ≈ 10λ + 1.282√(325λ) = 10λ + 23.11√λ.
The aggregate distribution of Bodily Injury Liability has mean 24λ/3 = 8λ, and variance
(λ/3)(24² + 60²) = 1392λ. Therefore, B ≈ 8λ + 1.282√(1392λ) = 8λ + 47.83√λ.
B/P = {8λ + 47.83√λ}/{10λ + 23.11√λ} = 1.061.
⇒ 8λ + 47.83√λ = 10.61λ + 24.52√λ. ⇒ √λ = 8.93. ⇒ λ = 79.7.


5.57. D. First inflate all of the aggregate losses to the 2012 level:
(1.04⁶)(31,000,000) = 39,224,890.
(1.04⁵)(38,000,000) = 46,232,811.
(1.04⁴)(36,000,000) = 42,114,908.
(1.04³)(41,000,000) = 46,119,424.
(1.04²)(41,000,000) = 44,345,600.
Next we calculate the mean and the second moment of the inflated losses:
Mean = 43.6075 million.
Second Moment = 1908.65 x 10¹².
The mean of the aggregate distribution is: λ(first moment of severity) = 3000 θ/(α - 1).
The variance of the aggregate distribution is:
λ(second moment of severity) = 3000 (2θ²)/{(α - 1)(α - 2)}.
Matching the theoretical and empirical moments:
43.6075 million = 3000 θ/(α - 1). ⇒ θ = 14,539(α - 1).
1908.65 x 10¹² - (43.6075 million)² = 3000 (2θ²)/{(α - 1)(α - 2)}. ⇒ θ² = 1.173 x 10⁹ (α - 1)(α - 2).
Dividing the second equation by the square of the first: 1 = 5.549(α - 2)/(α - 1). ⇒ α = 2.220.
θ = 17,738. S(20,000) = {17,738/(17,738 + 20,000)}^2.220 = 18.7%.
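Here is a small sketch of that method-of-moments fit (a Poisson frequency with λ = 3000 and a Pareto severity; the variable names are my own):

    emp_mean = 43.6075e6                       # inflated empirical mean
    emp_2nd  = 1908.65e12                      # inflated empirical second moment
    lam = 3000.0

    m1 = emp_mean / lam                        # = theta/(alpha - 1)
    m2 = (emp_2nd - emp_mean**2) / (2 * lam)   # = theta^2/{(alpha - 1)(alpha - 2)}
    ratio = m2 / m1**2                         # = (alpha - 1)/(alpha - 2)
    alpha = (2 * ratio - 1) / (ratio - 1)
    theta = m1 * (alpha - 1)
    S_20000 = (theta / (theta + 20000.0)) ** alpha
    print(round(alpha, 3), round(theta), round(S_20000, 3))   # about 2.22, 17,700, 0.187
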

5.58. C. For example, Var_S(S | N = 3) = 300,000 - 160,000 = 140,000.
E_N[Var_S(S | N)] = (20%)(0) + (40%)(40,000) + (30%)(87,500) + (10%)(140,000) = 56,250.

      N    Probability    Mean of S    Square of Mean    Second Moment    Var of S
                           Given N      of S Given N      of S Given N     Given N
      0        20%             0                0                 0              0
      1        40%           100           10,000            50,000         40,000
      2        30%           250           62,500           150,000         87,500
      3        10%           400          160,000           300,000        140,000
    Mean                     155           38,750                           56,250

Var_N(E_S[S | N]) = 38,750 - 155² = 14,725.
Thus the variance of the aggregate losses is:
E_N[Var_S(S | N)] + Var_N(E_S[S | N]) = 56,250 + 14,725 = 70,975.
Comment: We have not assumed that frequency and severity are independent.
The mathematics here is similar to that for the EPV and VHM, used in Buhlmann Credibility.
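The decomposition in this solution is easy to script; the sketch below (my own variable names) rebuilds the table's columns and the total variance:

    probs    = [0.20, 0.40, 0.30, 0.10]        # P[N = n] for n = 0, 1, 2, 3
    mean_S   = [0, 100, 250, 400]              # E[S | N = n]
    second_S = [0, 50000, 150000, 300000]      # E[S^2 | N = n]

    var_S = [m2 - m**2 for m, m2 in zip(mean_S, second_S)]
    epv = sum(p * v for p, v in zip(probs, var_S))                          # E_N[Var(S|N)] = 56,250
    overall_mean = sum(p * m for p, m in zip(probs, mean_S))                # 155
    vhm = sum(p * m * m for p, m in zip(probs, mean_S)) - overall_mean**2   # Var_N(E[S|N]) = 14,725
    print(epv + vhm)                                                        # 70,975
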
5.59. A. The Binomial has a mean of: (5)(.4) = 2, and a variance of: (5)(.4)(.6) = 1.2.
The LogNormal distribution has a mean of: exp[6 + 0.3²/2] = 422, a second moment of:
exp[(2)(6) + (2)(0.3²)] = 194,853, and variance of: 194,853 - 422² = 16,769.
The aggregate losses have a mean of: (2)(422) = 844.
The aggregate losses have a variance of: (2)(16,769) + (422²)(1.2) = 247,239.
Prob[Aggregate > (1.5)(844)] ≈ 1 - Φ[(.5)(844)/√247,239] = 1 - Φ[0.85] = 19.77%.
5.60. B. The density at zero for the non-modified Negative Binomial is: 1/1.4² = 0.5102.
The mean of the zero-modified Negative Binomial is: (1 - 0.4)(0.8) / (1 - 0.5102) = 0.9800.
The second moment of the zero-modified Negative Binomial is:
(1 - 0.4){(2)(0.4)(1.4) + 0.8²} / (1 - 0.5102) = 2.1560.
Thus the variance of the zero-modified Negative Binomial is: 2.1560 - 0.9800² = 1.1956.
The mean of the Gamma is: (3)(500) = 1500.
The variance of the Gamma is: (3)(500²) = 750,000.
Thus the variance of the annual aggregate loss is:
(0.9800)(750,000) + (1500²)(1.1956) = 3,425,100.
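As a check, the zero-modified moments can be computed directly; the sketch below assumes r = 2, β = 0.4, and a modified probability of 0.4 at zero (my own variable names):

    r, beta, p0_M = 2, 0.4, 0.4
    p0 = (1 + beta) ** (-r)                     # unmodified density at zero: 0.5102
    adj = (1 - p0_M) / (1 - p0)
    mean_N = adj * r * beta                     # 0.9800
    second_N = adj * (r * beta * (1 + beta) + (r * beta) ** 2)    # 2.1560
    var_N = second_N - mean_N ** 2              # 1.1956
    mean_X, var_X = 3 * 500, 3 * 500 ** 2       # Gamma severity with alpha = 3, theta = 500
    print(mean_N * var_X + mean_X ** 2 * var_N) # about 3,425,100
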
5.61. C. The sample mean is 109.167. The sample variance is 54.967.
Prob[Aggregate < 100] ≈ Φ[(100 - 109.167)/√54.967] = Φ[-1.24] = 1 - 0.8925 = 10.75%.

5.62. B. By thinning, each type of claim is Poisson.
For each type, variance of aggregate is: λ(second moment of severity) = λ(mean²)(1 + CV²).
Variance of Type I: (0.20)(100²)(1 + 5²) = 52,000.
Variance of Type II: (0.10)(200²)(1 + 4²) = 68,000.
Variance of Type III: (0.05)(300²)(1 + 3²) = 45,000.
The variance of the distribution of annual aggregate losses is:
52,000 + 68,000 + 45,000 = 165,000.
Alternately, severity is a mixture, with weights: 20/35, 10/35, and 5/35.
The second moment of the mixture is the mixture of the second moments:
(4/7)(100²)(1 + 5²) + (2/7)(200²)(1 + 4²) + (1/7)(300²)(1 + 3²) = 471,429.
The variance of the distribution of annual aggregate losses is: (0.35)(471,429) = 165,000.
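A one-line check of the thinning argument (the λ, mean, and CV for each claim type as above; names are mine):

    types = [(0.20, 100, 5), (0.10, 200, 4), (0.05, 300, 3)]   # (lambda, mean severity, CV)
    print(sum(lam * mean**2 * (1 + cv**2) for lam, mean, cv in types))   # 165,000
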
5.63. A. Severity is LogNormal with μ = 6 and σ² = 0.7.
Mean severity is: exp[6 + 0.7/2] = 572.5.
Second moment of severity is: exp[(2)(6) + (2)(0.7)] = 660,003.
Variance of severity is: 660,003 - 572.5² = 332,247.
Mean frequency is: (40%)(1) + (30%)(2) + (20%)(3) + (10%)(4) = 2.
Second Moment of frequency is: (40%)(1²) + (30%)(2²) + (20%)(3²) + (10%)(4²) = 5.
Variance of frequency is: 5 - 2² = 1.
The variance of the distribution of annual aggregate losses is:
(2)(332,247) + (572.5²)(1) = 992,250.
5.64. E. X is the discrete frequency, severity is Normal; Y is the aggregate loss.
E[X] = 1.3. E[X²] = 2.3. Var[X] = 2.3 - 1.3² = 0.61.
Var[Y] = (1.3)(5) + (3²)(0.61) = 11.99.
Alternately, this is a mixture:
with probability 20% Y is 0,
with probability 30% Y is Normal with mean 3 and variance 5,
with probability 50% Y is Normal with mean 6 and variance 10.
Thus E[Y] = (0.2)(0) + (0.3)(3) + (0.5)(6) = 3.9.
E[Y²] = (0.2)(0) + (0.3)(5 + 3²) + (0.5)(10 + 6²) = 27.2.
Var[Y] = 27.2 - 3.9² = 11.99.

5.65. D. The Gamma has a mean of 30,000, and a variance of: (3)(10,000²) = 300 million.
The mean of the zero-truncated Binomial is: (4)(0.2) / (1 - 0.8⁴) = 1.355.
Thus the mean number of claimants is: (0.1)(1.355) = 0.1355.
Thus the mean annual aggregate loss is: (0.1355)(30,000) = 4065.
The second moment of the non-truncated Binomial is: (4)(0.2)(0.8) + {(4)(0.2)}² = 1.28.
The second moment of the zero-truncated Binomial is: 1.28 / (1 - 0.8⁴) = 2.168.
The annual number of claimants follows a compound Poisson zero-truncated Binomial Distribution,
in other words as if there were a Poisson Frequency and a zero-truncated Binomial severity.
Thus the variance of the number of claimants is: (0.1)(2.168) = 0.2168.
Thus the variance of the annual aggregate loss is:
(0.1355)(300 million) + (30,000²)(0.2168) = 235.77 million.
CV of aggregate loss is: √(235.77 million) / 4065 = 3.78.

5.66. A. For the portion paid by Spring & Sommers, the mean disability is:
(0.3)(1) + (0.2)(2) + (0.1)(3) + (0.1)(4) + (0.3)(5) = 2.9 weeks.
Second moment is: (0.3)(1²) + (0.2)(2²) + (0.1)(3²) + (0.1)(4²) + (0.3)(5²) = 11.1.
Variance is: 11.1 - 2.9² = 2.69.
The number of disabilities from Type 1 is Binomial with m = 1500 and q = 5%.
The mean severity is: (2/3)(600)(2.9) = 1160.
The variance of severity is: (400²)(2.69) = 430,400.
For Type 1, the mean aggregate is: (5%)(1500)(1160) = 87,000.
The variance of aggregate is: (5%)(1500)(430,400) + (1160²)(1500)(0.05)(0.95) = 128,154,000.
The number of disabilities from Type 2 is Binomial with m = 500 and q = 8%.
The mean severity is: (2/3)(900)(2.9) = 1740.
The variance of severity is: (600²)(2.69) = 968,400.
For Type 2, the mean aggregate is: (8%)(500)(1740) = 69,600.
The variance of aggregate is: (8%)(500)(968,400) + (1740²)(500)(0.08)(0.92) = 150,151,680.
The total mean aggregate is: 87,000 + 69,600 = 156,600.
The variance of total aggregate is: 128,154,000 + 150,151,680 = 278,305,680.
The coefficient of variation of the distribution of total annual payments is:
√278,305,680 / 156,600 = 0.1065.

5.67. D. The mean of the zero-modified Poisson is: (1 - 0.25)(0.1) / (1 - e^-0.1) = 0.7881.
The second moment of the zero-modified Poisson is: (1 - 0.25)(0.1 + 0.1²) / (1 - e^-0.1) = 0.8669.
Thus the variance of the zero-modified Poisson is: 0.8669 - 0.7881² = 0.2458.
The mean of the LogNormal is: exp[8 + 0.6²/2] = 3569.
The second moment of the LogNormal is: exp[(2)(8) + (2)(0.6²)] = 18,255,921.
Thus the variance of the LogNormal is: 18,255,921 - 3569² = 5,518,160.
Thus the variance of the annual aggregate loss is:
(0.7881)(5,518,160) + (3569²)(0.2458) = 7,479,804.
5.68. D. For frequency independent of severity, the process variance of the aggregate losses is
given by: (Mean Freq.)(Variance of Severity) + (Mean Severity)²(Variance of Freq.)
= λ(r/a²) + (r/a)²(2λ) = λr(2r + 1)/a².
5.69. B. Let X be the claim sizes, then Var[T] = E[N]Var[X] + E[X]²Var[N] =
m(θ²) + (θ)²(3mθ) = m(3θ + 1)θ².
5.70. A. The mean frequency = (1/4)(4) + (1/2)(5) + (1/4)(6) = 5. The second moment of the
frequency = (1/4)(4²) + (1/2)(5²) + (1/4)(6²) = 25.5. The variance of the frequency = 25.5 - 5² = 0.5.
The severity distribution is a Single Parameter Pareto with θ = 1 and α = 3.
mean = αθ/(α - 1) = 3/2. 2nd moment = αθ²/(α - 2) = 3. variance = 3 - 9/4 = 3/4.
The variance of the aggregate losses = (5)(3/4) + (3/2)²(.5) = 4.875.
The mean of the aggregate loss is: (mean frequency)(mean severity) = (5)(3/2) = 7.5.
The coefficient of variation of the aggregate loss = √4.875 / 7.5 = 0.294.

5.71. E. One has to recognize this as a compound Negative Binomial, with p(x) the severity, and
frequency density: C(n+2, n) (0.6)³ (0.4)ⁿ.
Frequency is Negative Binomial with r = 3 and β/(1+β) = 0.4, so that β = 2/3.
The mean frequency is: rβ = 2, and the variance of the frequency is: rβ(1+β) = 10/3.
The mean severity is: (0.3)(1) + (0.6)(2) + (0.1)(3) = 1.8.
The second moment of the severity is: (0.3)(1²) + (0.6)(2²) + (0.1)(3²) = 3.6.
Thus the variance of the severity is: 3.6 - 1.8² = 0.36.
The variance of aggregate losses is: (0.36)(2) + (1.8²)(10/3) = 11.52.
Comment: Where p is the density of the severity, f is the density of the frequency, and frequency
and severity are independent, then the density of aggregate losses is: Σ_{n=0}^∞ p*ⁿ(x) f(n).
You should recognize that this is the convolution form of writing an aggregate distribution.
In the frequency density, you have a geometric decline factor of 0.4. So 0.4 looks like β/(1+β) in a
Geometric Distribution. However, we also have the binomial coefficients in front, which is one way of
writing a Negative Binomial: C(n+2, n) = C(n+2, 2); in general r(r+1)...(r+k-1)/k! = C(r+k-1, k),
which with r = 3 is (3)(4)...(n+2)/n!.
This is the form of the Negative Binomial density in Loss Models, with r = 3.
There are only a few frequency distributions in the Appendix, so when you see something like this,
there are only a few choices to try to match things up.
It is more common for them to just say frequency is Negative Binomial or whatever.
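To illustrate the convolution form in the comment, here is a rough sketch (my own names; the sum over n is truncated) that builds the aggregate density and recovers the variance of 11.52:

    from math import comb

    sev = {1: 0.3, 2: 0.6, 3: 0.1}                 # severity density p(x)
    r, beta = 3, 2.0 / 3.0                         # Negative Binomial primary

    def nb_density(n):
        return comb(n + r - 1, n) * (1 / (1 + beta)) ** r * (beta / (1 + beta)) ** n

    agg = {0: nb_density(0)}                       # contribution of n = 0 claims
    conv = {0: 1.0}                                # p^{*0}: point mass at zero
    for n in range(1, 60):                         # truncate the sum over n
        new = {}
        for x, px in conv.items():
            for y, py in sev.items():
                new[x + y] = new.get(x + y, 0.0) + px * py
        conv = new                                 # now p^{*n}
        for x, px in conv.items():
            agg[x] = agg.get(x, 0.0) + nb_density(n) * px

    mean = sum(x * p for x, p in agg.items())
    var = sum(x * x * p for x, p in agg.items()) - mean ** 2
    print(round(mean, 3), round(var, 3))           # about 3.6 and 11.52
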

5.72. E. In each case the premium is the mean aggregate loss plus 1.645 standard deviations,
since Φ(1.645) = .95.
Thus the relative security loading is 1.645 standard deviations / mean.
Let A be the fixed amount of the claim, let p be the probability of a claim, and N be the number of
policies. Since we are told that each policy has either zero or one claim, the number of claims for N
policies is Binomial with parameters p and N.
Therefore, the mean aggregate losses is: NpA.
The variance of aggregate losses is: N(p)(1-p)A².
Thus the relative security loading is: (1.645)√{N(p)(1-p)A²} / (NpA) = 1.645√{(1-p)/(Np)}.
So the largest relative security loading corresponds to the largest value of (1-p)/(Np).
As shown below this occurs for region E.

    Region      N       p      (1-p)/(Np)    Relative Security Loading
      A        300    0.01       0.330                0.94
      B        500    0.02       0.098                0.51
      C        600    0.03       0.054                0.38
      D        500    0.02       0.098                0.51
      E        100    0.01       0.990                1.64

5.73. E. This is a mixed Poisson-Poisson frequency. For a given value of λ, the first moment of a
Poisson frequency is λ. Thus for the mixed frequency, the first moment is E[λ] = 1/p.
For a given value of λ, the second moment of a Poisson frequency is λ + λ².
Thus for the mixed frequency, the second moment is: E[λ + λ²] = E[λ] + E[λ²] =
1/p + (second moment of a Poisson with mean 1/p) = 1/p + (1/p + 1/p²) = 2/p + 1/p².
Thus the variance of the mixed frequency distribution is: 2/p + 1/p² - 1/p² = 2/p.
The mean severity is: p + (2)(1-p) = 2 - p.
The second moment of the severity is: p + (4)(1-p) = 4 - 3p.
Thus the variance of the severity is: 4 - 3p - (2-p)² = p - p².
Variance of aggregate losses is:
(variance of severity)(mean frequency) + (mean severity)²(variance of frequency) =
(p - p²)(1/p) + (2-p)²(2/p) = p - 7 + 8/p. Setting this equal to the given 19/2: p - 7 + 8/p = 19/2.
Therefore, 2p² - 33p + 16 = 0. p = {33 ± √(33² - (4)(2)(16))}/4 = (33 ± 31)/4 = 1/2 or 16.
However, in order to have a legitimate severity distribution, we must have 0 ≤ p ≤ 1.
Therefore, p = 1/2.

5.74. B. The mean frequency is β = 1/3. The mean severity is 4. Thus the mean aggregate loss is
(1/3)(4) = 4/3. The second moment of the severity is (9 + 16 + 25)/3 = 50/3.
Thus the variance of the severity is: 50/3 - 4² = 2/3.
The variance of the frequency is: β(1+β) = (1/3)(4/3) = 4/9.
Thus the variance of aggregate losses is: (1/3)(2/3) + (4/9)(4²) = 66/9.
Thus the premium is: 4/3 + 66/9 = 78/9 = 8.667.
The aggregate losses do not exceed the premiums if: there are 0 claims, there is 1 claim, and
sometimes when there are 2 claims.
The probability of 0 claims is: 1/(1+β) = .75. The probability of 1 claim is: β/(1+β)² = .1875.
The probability of 2 claims is: β²/(1+β)³ = .046875. If there are two claims, the aggregate losses are
< 8.667 if the claims are of sizes: 3,3; 3,4; 3,5; 4,3; 4,4; 5,3.
This is 6 out of 9 equally likely possibilities, when there are two claims.
Therefore, the probability that the aggregate losses exceed the premiums is:
1 - {0.75 + 0.1875 + (0.046875)(6/9)} = 1 - 0.96875 = 0.03125.
Comment: In spreadsheet form, the probability that the aggregate losses do not exceed the
premiums is calculated as 0.96875:
       A              B                    C                             D
    Number of     Frequency      Probability that Aggregate      Column B
     Claims       Distribution   Losses ≤ Premiums               times Column C
        0           0.75000            1.00000                   0.75000
        1           0.18750            1.00000                   0.18750
        2           0.04688            0.66667                   0.03125
        3           0.01172            0.00000                   0.00000
                                                        Sum      0.96875

5.75. D. G = E[S](1+θ) = λE[X](1+θ). Var[S] = λE[X²].
Var[R] = Var[S/G] = Var[S]/G² = λE[X²]/{λE[X](1+θ)}² = E[X²]/{λE[X]²(1+θ)²}.
Comment: The premium G does not vary, so we can treat G like a constant; G comes out of the
variance as a square. S is a compound Poisson, so its variance is the mean frequency times the
second moment of the severity.
5.76. E. Frequency is Binomial, with mean = (40)(1/8) = 5 and variance = (40)(1/8)(7/8) = 35/8.
The mean severity is μ = 400. The variance of the severity is: μ³/θ = 400³/8000 = 8000.
Thus the mean aggregate loss is: (5)(400) = 2000 and the variance of aggregate losses is:
(400²)(35/8) + (5)(8000) = 740,000. Thus the probability that the total dollars of claims for the
portfolio are greater than 2900 is approximately:
1 - Φ[(2900 - 2000)/√740,000] = 1 - Φ[1.05] = 1 - 0.852 = 0.147.

5.77. B. Mean severity is: (0.9)(1) + (0.1)(10) = 1.9.
Second moment of the severity is: (0.9)(1²) + (0.1)(10²) = 10.9.
Variance of the severity is: 10.9 - 1.9² = 7.29.
Mean frequency is 0.10. Variance of frequency is: (0.10)(0.90) = 0.09.
Mean aggregate loss is: N(0.1)(1.9) = 0.19N.
Variance of aggregate losses is: N{(0.10)(7.29) + (0.09)(1.9²)} = 1.0539N.
A 95% probability corresponds to 1.645 standard deviations greater than the mean,
since Φ(1.645) = 0.95.
Thus, safety loading = 0.2(mean aggregate loss) = 1.645(standard deviations). Thus,
(0.2)(0.19N) = 1.645√(1.0539N). Solving, N = 1.645²(1.0539)/0.038² = 1975 policies.
Comment: If one knows classical credibility, one can do this problem as follows.
P = 95%, but since one performs only a one-sided test in this case, y = 1.645.
k = 20%. The CV² of the severity is: 7.29/1.9² = 2.019.
The standard for full credibility is: (y/k)²(σf²/μf + CVsev²) = (1.645/.2)²(.09/.1 + 2.019) =
(67.65)(2.919) = 197.5 claims. This corresponds to 197.5/.1 = 1975 policies.
5.78. D. Mean Freq = .01. Variance of freq. = (0.01)(0.99) = 0.0099.
Mean Severity = 5000. Variance of severity = (10000 - 0)²/12 = 8,333,333.
Variance of Aggregate Losses = (.01)(8,333,333) + (.0099)(5000²) = 330,833.
5.79. The mean of the aggregate losses = (3)(100) = 300.
Since the frequency is Poisson, the variance of aggregate losses =
(mean frequency)(second moment of the severity) = (3)(15,000) = 45,000.
Premiums = (300)(1.1) = 330. Mean Loss Ratio = 300/330 = 0.91.
Var(Loss Ratio) = Var(Loss/Premium) = Var(Loss)/330² = 45,000/330² = 0.41.
5.80. a. E[S] = (# risks)(mean frequency)(mean severity) = (500)(.1)(1000) = 50,000.
Var[S] = (# risks){(mean frequency)(var. of sev.) + (mean severity)²(var. of freq.)} =
(500){(0.1)(1000²) + (1000²)(0.1)(0.9)} = 95,000,000. StdDev[S] = √95,000,000 = 9746.
b. So that there is a 95% chance that the premiums are sufficient to pay the resulting claims, the
aggregate premiums = mean + 1.645(StdDev) = 50,000 + (1.645)(9746) = 66,033.
Premium per risk = 66,033/500 = 132.
5.81. Let the death benefit be b and the probability of death be q.
Then 30 = E[X] = bq and 29,100 = Var[X] = q(1-q)b².
Thus 29,100/30² = (1-q)/q. ⇒ q = 0.03. b = 1000.

5.82. Premium = 1.2 E[S] = 1.2(5)(0 + 10)/2 = 30.
Var[S] = (mean freq)(second moment of severity) = (5)(10²/3) = 166.67.
Var[Loss Ratio] = Var[S/30] = Var[S]/900 = 166.67/900 = 0.185.
Alternately, G = 1.2 E[N]E[X], Var[S] = E[N]Var[X] + E[X]²Var[N].
Var[S/G] = Var[S]/G² = {E[N]Var[X] + E[X]²Var[N]}/{1.2 E[N]E[X]}² =
{Var[X]/(E[N]E[X]²) + Var[N]/E[N]²}/1.44 = {(100/12)/((5)(25)) + (5)/(25)}/1.44 = 0.185.
5.83. D. Mean aggregate loss = (3){(.4)(1) + (.2)(2) + (.4)(3)} = 6.
For a compound Poisson, variance of aggregate losses =
(mean frequency)(second moment of severity) = (3){(.4)(1²) + (.2)(2²) + (.4)(3²)} = 14.4.
Since the severity is discrete, one should use the continuity correction.
Pr[S > 9] ≈ 1 - Φ[(9.5 - 6)/√14.4] = 1 - Φ(0.92) = 1 - 0.8212 = 17.88%.
5.84. A. Since frequency is Poisson, Var[S] = (mean frequency)(second moment of the severity).
30,000,000 = λ(500,000,000 + 50,000²). ⇒ λ = 1/100.
Prob(N ≥ 1) = 1 - Prob(N = 0) = 1 - e^-λ = 1 - e^-.01 = 0.995%.
5.85. Mean Aggregate Loss = (350)(500) = 175,000. Since frequency is Poisson,
Variance of Aggregate Loss = (350)(2nd moment of severity) = (350)(1000²/3) = 116.67 million.
Prob(S > 180,000) ≈ 1 - Φ[(180,000 - 175,000)/√(116.67 million)] = 1 - Φ[0.46] = 32.3%.
Comment: The second moment of the uniform distribution (a, b) is: (b³ - a³)/{3(b - a)}.

5.86. The expected excess annual claims are:
(0)(.04)(3000) + (5000)(.04)(4000) + (30,000)(.04)(5000) + (45,000)(.04)(2000) = 10.4 million.
Therefore, the reinsurance cost is: (125%)(10.4 million) = 13 million.
The expected retained annual claims are: (20,000)(.04)(3000) + (30,000)(.04)(4000) +
(30,000)(.04)(5000) + (30,000)(.04)(2000) = 15.6 million.
The variance of the retained annual claims is:
(20,000²)(.04)(.96)(3000) + (30,000²)(.04)(.96)(4000) + (30,000²)(.04)(.96)(5000) +
(30,000²)(.04)(.96)(2000) = 4.2624 x 10¹¹.
The total cost (retained claims plus reinsurance cost) of insuring the properties has mean
15.6 million + 13 million = 28.6 million and variance 4.2624 x 10¹¹.
Probability that the total cost exceeds $28,650,000 ≈
1 - Φ[(28.65 million - 28.6 million)/√(4.2624 x 10¹¹)] = 1 - Φ[0.08] = 46.8%.
Comment: The insurer's cost for reinsurance does not depend on the insurer's actual losses in a
year; rather it is fixed and has a variance of zero.
5.87. B. S(500) = e^-500/1000 = 0.6065. The frequency distribution of losses of size greater than
500 is also a Negative Binomial Distribution, but with β = (0.6065)(2) = 1.2131 and r = 2.
Therefore, the frequency of non-zero payments has mean: (2)(1.2131) = 2.4262 and
variance: (2)(1.2131)(1 + 1.2131) = 5.369. When one truncates and shifts an Exponential
Distribution, one gets the same distribution, due to the memoryless property of the Exponential.
Therefore, the severity distribution of payments on losses of size greater than 500 is also an
Exponential Distribution with θ = 1000. The aggregate losses excess of the deductible, which are
the sum of the non-zero payments, have a variance of:
(mean freq.)(var. of sev.) + (mean sev.²)(var. of freq.) = (2.4262)(1000²) + (1000²)(5.369) =
(7.796)(1000²). Thus the standard deviation of total payments is: (1000)√7.796 = 2792.
Comment: The mean of the aggregate losses excess of the deductible is:
(2.4262)(1000) = 2426.
5.88. The expected aggregate losses are (500)(100) = 50,000. Thus the premium is:
(1.1)(50,000) = 55,000. If the loss ratio exceeds 0.95, then the aggregate losses exceed
(0.95)(55,000) = 52,250. The variance of the aggregate losses is: 500(100 + 100²) = 5,050,000.
Thus the chance that the losses exceed 52,250 is about:
1 - Φ[(52,250 - 50,000)/√5,050,000] = 1 - Φ(1.00) = 1 - 0.8413 = 0.159.
Comment: For a Compound Poisson Distribution, the variance of the aggregate losses =
(mean frequency)(2nd moment of the severity).

5.89. Mean aggregate is: (50)(1870) = 93,500.
Variance of aggregate is: (50)(610²) = 18,605,000.
Prob[Agg > 100,000] ≈ 1 - Φ[(100,000 - 93,500)/√18,605,000] = 1 - Φ[1.51] = 0.0655.
Comment: We know how many claims there were, and therefore the variance of frequency is 0.
5.90. C. The mean aggregate losses are: (8)(10,000) = 80,000.
σ_agg² = μ_freq σ_sev² + μ_sev² σ_freq² = (8)(3937²) + (10,000²)(3²) = 1,023,999,752.
The probability that the aggregate losses will exceed (1.5)(80,000) = 120,000 is approximately:
1 - Φ[(120,000 - 80,000)/√1,023,999,752] = 1 - Φ(1.25).
Comment: Short and easy. 1 - Φ(1.25) = 10.6%.


5.91. D. Policy Type one has a mean severity of 200 and a variance of the severity of
(400 - 0)²/12 = 13,333.
Policy Type one has a mean frequency of 0.05 and a variance of the frequency of
(0.05)(0.95) = 0.0475.
Thus, a single policy of type one has a variance of aggregate losses of:
(0.05)(13,333) + (200²)(0.0475) = 2567.
Policy Type two has a mean severity of 150 and a variance of the severity of (300 - 0)²/12 = 7500.
Policy Type two has a mean frequency of 0.06 and a variance of the frequency of
(0.06)(0.94) = 0.0564.
Thus, a single policy of type two has a variance of aggregate losses of:
(0.06)(7500) + (150²)(0.0564) = 1719.
Therefore, the variance of the aggregate losses of 100 independent policies of type one and 200
policies of type two is: (100)(2567) + (200)(1719) = 600,500.
Comment: Frequency is Bernoulli. Severity is uniform.

5.92. E. Mean frequency = (0)(0.7) + (2)(0.2) + (3)(0.1) = 0.7.
Second moment of the frequency = (0²)(0.7) + (2²)(0.2) + (3²)(0.1) = 1.7.
Variance of the frequency = 1.7 - 0.7² = 1.21. Mean severity = (0)(0.8) + (10)(0.2) = 2.
Second moment of the severity = (0²)(0.8) + (10²)(0.2) = 20.
Variance of the severity = 20 - 2² = 16. Mean aggregate loss = (0.7)(2) = 1.4.
Variance of the aggregate losses = (0.7)(16) + (2²)(1.21) = 16.04.
Mean + 2 standard deviations = 1.4 + 2√16.04 = 9.41.
The aggregate benefits are greater than 9.41 if and only if there is at least one non-zero claim.
The probability of no non-zero claims is: 0.7 + (0.2)(0.8²) + (0.1)(0.8³) = 0.8792.
Thus the probability of at least one non-zero claim is: 1 - 0.8792 = 0.1208.
Comment: If one were to inappropriately use the Normal Approximation, the probability that
aggregate benefits will exceed expected benefits by more than 2 standard deviations is:
1 - Φ(2) = 1 - 0.9772 = 0.023. The fact that the magic phrase "use the Normal Approximation" did
not appear in this question might make one think. One usually relies on the Normal Approximation
when the expected number of claims is large. In this case one has very few expected claims.
Therefore, one should not rush to use the Normal Approximation.
5.93. D. For the uniform distribution on [5, 95], E[X] = (5 + 95)/2 = 50,
Var[X] = (95 - 5)²/12 = 675. E[X²] = 675 + 50² = 3175. Therefore, the aggregate claims have
mean of: (25)(50) = 1250 and variance of: (25)(3175) = 79,375.
Thus, Prob(aggregate claims > 2000) ≈ 1 - Φ[(2000 - 1250)/√79,375] = 1 - Φ(2.662).

5.94. E. Mean frequency is: (.8)(1) + (.2)(2) = 1.2.
Variance of the frequency is: (.8)(1²) + (.2)(2²) - 1.2² = 0.16.
Mean severity is: (.2)(0) + (.7)(100) + (.1)(1000) = 170.
Second moment of the severity is: (.2)(0²) + (.7)(100²) + (.1)(1000²) = 107,000.
Variance of the severity is: 107,000 - 170² = 78,100. Mean aggregate loss: (1.2)(170) = 204.
Variance of aggregate loss is: (1.2)(78,100) + (170²)(.16) = 98,344.
Mean plus the standard deviation = 204 + √98,344 = 518.
Comment: The frequency is 1 plus a Bernoulli Distribution with q = 0.2.
Therefore, it has mean: 1 + 0.2 = 1.2, and variance: (0.2)(1 - 0.2) = 0.16.
5.95. A. Mean aggregate loss = (50)(200) = 10,000.
Variance of aggregate loss = (50)(400) + (200²)(100) = 4,020,000.
Prob[aggregate < 8000] ≈ Φ[(8000 - 10,000)/√4,020,000] = Φ[-1.00] = 15.9%.

5.96. A. Mean of aggregate = (110)(1101) = 121,110.
Variance of aggregate = (110)(70²) + (1101²)(750) = 909,689,750.
Prob[aggregate < 100,000] ≈ Φ[(100,000 - 121,110)/√909,689,750] = Φ[-0.70] = 1 - 0.7580 = 0.2420.


5.97. E. The average number of tires repaired per year is: $10,000,000/$100 = 100,000.
There are 2,000,000 tires sold per year, so q = 100,000/2,000,000 = .05.
μf = mq = (2,000,000)(.05) = 100,000. σf² = mq(1-q) = (2,000,000)(.05)(.95) = 95,000.
μs = 100. We are given that the variance of the aggregate is: 40,000² = μf σs² + μs² σf².
1,600,000,000 = 100,000 σs² + (100²)(95,000). ⇒ σs² = 6500. σs = 80.62.
5.98. B. μf = 8. σf² = 15. μs = 100. σs² = 40,000.
Variance of the aggregate: μf σs² + μs² σf² = (8)(40,000) + (100²)(15) = 470,000. σ = 685.57.
Now if we know that there have been 13 claims, then the aggregate is the sum of 13 independent,
identically distributed severities. Var[Aggregate] = 13 Var[Severity] = (13)(40,000) = 520,000.
σ = √520,000 = 721.11. Ratio - 1 = 685.57/721.11 - 1 = -4.93%.
Alternately, if we know that there have been 13 claims, μf = 13, σf² = 0,
and the variance of the aggregate is: (13)(40,000) + (100²)(0) = 520,000. Proceed as before.
5.99. C. Mean aggregate per computer: (3)(80) = 240.
Variance of aggregate per computer: λ(2nd moment of severity) = (3)(200² + 80²) = 139,200.
For N computers, mean is: 240N, and the variance is: 139,200N. 120% of the mean is: 288N.
Prob[Aggregate > 120% of mean] ≈ 1 - Φ[(288N - 240N)/√(139,200N)] = 1 - Φ[0.12865√N].
This probability < 10%. ⇔ Φ[0.12865√N] > 90%.
Φ[1.282] = 0.90. We want 0.12865√N > 1.282. ⇒ N > 99.3.
Alternately, for classical credibility, we might want a probability of 90% of being within 20%.
However, here we are only interested in avoiding +20%, a one-sided rather than two-sided test.
For 10% probability in one tail, for the Standard Normal Distribution, y = 1.282.
n0 = (1.282/.2)² = 41.088. Severity has a CV of: 200/80 = 2.5.
Number of claims needed for full credibility of aggregate losses: (1 + 2.5²)(41.088) = 297.89.
However, the number of computers corresponds to the number of exposures.
Thus we need to divide by the mean frequency of 3: 297.89/3 = 99.3.

5.100. B. Mean severity is: (50%)(80) + (40%)(100) + (10%)(160) = 96.
Second Moment of severity is: (50%)(80²) + (40%)(100²) + (10%)(160²) = 9760.
Mean Aggregate is: (1000)(96) = 96,000. Variance of Aggregate is: (1000)(9760) = 9,760,000.
Prob[Club pays > 90,000] = Prob[Aggregate > 100,000] ≈
1 - Φ[(100,000 - 96,000)/√9,760,000] = 1 - Φ[1.28] = 10.0%.
Comment: One could instead work with the payments, which are 90% of the losses.
5.101. B. Frequency has mean of 0.9, second moment of 1.9, and variance of 1.09.
Severity reduced by the 1000 deductible has mean of:
(50%)(0) + (10%)(1000) + (10%)(2000) + (30%)(4000) = 1500,
second moment of: (50%)(0²) + (10%)(1000²) + (10%)(2000²) + (30%)(4000²) = 5.3 million,
and variance of: 5.3 million - 1500² = 3.05 million.
σ_A² = (.9)(3.05 million) + (1500²)(1.09) = 5,197,500. σ_A = 2280.
Alternately, for the original severity distribution:
E[X] = (50%)(1000) + (10%)(2000) + (10%)(3000) + (30%)(5000) = 2500.
E[X ∧ 1000] = (50%)(1000) + (10%)(1000) + (10%)(1000) + (30%)(1000) = 1000.
E[X²] = (50%)(1000²) + (10%)(2000²) + (10%)(3000²) + (30%)(5000²) = 9.3 million.
E[(X ∧ 1000)²] = (50%)(1000²) + (10%)(1000²) + (10%)(1000²) + (30%)(1000²) = 1 million.
First moment of the layer from 1000 to ∞ is:
E[X] - E[X ∧ 1000] = 2500 - 1000 = 1500.
Second moment of the layer from 1000 to ∞ is:
E[(X ∧ ∞)²] - E[(X ∧ 1000)²] - (2)(1000){E[X ∧ ∞] - E[X ∧ 1000]} =
E[X²] - E[(X ∧ 1000)²] - (2000){E[X] - E[X ∧ 1000]} =
9.3 million - 1 million - (2000)(2500 - 1000) = 5.3 million.
Proceed as before.
5.102. B. Frequency has mean rβ and variance rβ(1+β).
Mean Aggregate = rβ(700). 48,000 = 700rβ. ⇒ rβ = 68.571.
Variance of Aggregate = rβ(1300) + 700² rβ(1+β) = 491,300rβ + 490,000rβ².
80,000,000 = 491,300rβ + 490,000rβ² = (491,300)(68.571) + (490,000)(68.571)β.
⇒ β = 1.378. r = 49.76.

5.103. B. Mean of Aggregate: (50)(4500) = 225,000.
Variance of Aggregate: (50)(3000²) + (4500²)(12²) = 3366 million.
Set Mean of LogNormal equal to that of the aggregate: exp[μ + σ²/2] = 225,000.
Set Second Moment of LogNormal equal to that of the aggregate:
exp[2μ + 2σ²] = 225,000² + 3366 million = 53,991 million.
Divide the second equation by the square of the first: exp[σ²] = 1.0665.
⇒ σ = 0.2537. μ = 12.292.
S((1.5)(225,000)) = S(337,500) = 1 - Φ[(ln(337,500) - 12.292)/0.2537] = 1 - Φ[1.72] = 4.27%.
5.104. A. After the deductible, the severity has a mean of:
(0.35)(0) + (0.3)(250) + (0.25)(500) + (0.05)(750) + (0.05)(1000) = 287.5.
After the deductible, the severity has a second moment of:
(0.35)(0²) + (0.3)(250²) + (0.25)(500²) + (0.05)(750²) + (0.05)(1000²) = 159,375.
Average Aggregate: (0.15)(287.5) = 43.125.
Variance of Aggregate: (0.15)(159,375) = 23,906.
Prob[Aggregate > 250] ≈ 1 - Φ[(250 - 43.125)/√23,906] = 1 - Φ[1.34] = 9.01%.
5.105. B. σ_A² = μ_F σ_X² + μ_X² σ_F². 22,874² = (103)(1781²) + (6282²)σ_F². ⇒ σ_F = 2.197.
Comment: Sometimes the answer given by the exam committee, in this case 2.17, will not match
the exact answer, 2.20 in this case. Very annoying! In this case, you might have checked your work
once, but then put down B as the best choice and move on.
5.106. C. With a coinsurance factor of 80%, each payment is each loss times 0.8.
When we multiply a variable by a constant, the mean and standard deviation are each multiplied by
that constant. The 95th percentile of the normal approximation is:
mean + (1.645)(standard deviation). Thus it is also multiplied by 0.8. The reduction is 20%.
Alternately, before the reduction, μ_A = (25)(10,000) = 250,000, and
σ_A² = μ_F σ_X² + μ_X² σ_F² = (25){(3)(10,000)}² + (10,000)²{(1.2)(25)}² = 112,500 million.
mean + (1.645)(standard deviation) = 250,000 + (1.645)(335,410) = 801,750.
Paying 80% of each loss multiplies the severity by 0.8. The mean of the LogNormal is multiplied by
0.8 and its coefficient of variation is unaffected. μ_A = (25)(8000) = 200,000.
σ_A² = μ_F σ_X² + μ_X² σ_F² = (25){(3)(8000)}² + (8000)²{(1.2)(25)}² = 72,000 million.
mean + (1.645)(standard deviation) = 200,000 + (1.645)(268,328) = 641,400.
Reduction in the estimated 95th percentile: 1 - 641,400/801,750 = 20%.

5.107. A. For Type I, mean: (12)(1/2) = 6, variance: 12(1²)/3 = 4.
For Type II, mean: (4)(2.5) = 10, variance: 4(5²)/3 = 33.33.
Overall mean = (12)(1/2) + (4)(2.5) = 6 + 10 = 16.
Overall variance = 4 + 33.33 = 37.33.
Prob[aggregate > 18] ≈ 1 - Φ((18 - 16)/√37.33) = 1 - Φ(0.33) = 1 - 0.6293 = 0.3707.
Comment: For a Poisson frequency, variance of aggregate = λ(second moment of severity).
The two types of claims are independent, so their variances add.
5.108. D. Let the mean frequencies be b before and c after.
The probability of no claim increases by 30%. ⇒ 1.3e^-b = e^-c.
The probability of having one claim decreases by 10%. ⇒ 0.9be^-b = ce^-c.
Dividing: 0.9/1.3 = c/b.
The expected aggregate before is: (1000² + 256²)b = 1,065,536b.
The expected aggregate after is: (1500² + 678²)c = 2,709,684c.
The ratio of after over before is: 2.543c/b = 2.543(0.9/1.3) = 1.760. ⇒ 76.0% increase.
5.109. E. Due to the memoryless property of the Exponential, the payments excess of a
deductible follow the same Exponential Distribution as the ground up losses.
Thus the second moment of (non-zero) payments is: 2(10,000²) = 200 million.
The number of (non-zero) payments is Poisson with mean: 100e^(-25,000/10,000) = 8.2085.
Therefore, variance of aggregate payments = (8.2085)(200 million) = 1641.7 million.
Standard deviation of aggregate payments = √(1641.7 million) = 40,518.

5.110. A. Var[A] = (2)(1000² + 2000²) = 10 million.
Var[B] = (1)(2000² + 4000²) = 20 million.
The variances of two independent portfolios add.
Var[A] + Var[B] = 10 million + 20 million = 30 million.
Standard deviation of the combined losses is: √(30 million) = 5477.

5.111. B. E[X² | X > 30] = ∫_30^∞ x² f(x) dx / S(30).
⇒ ∫_30^∞ x² f(x) dx = S(30) E[X² | X > 30] = (0.75)(9000) = 6750.
∫_30^∞ x f(x) dx = ∫_0^∞ x f(x) dx - ∫_0^30 x f(x) dx = E[X] - {E[X ∧ 30] - 30 S(30)} = 70 - {25 - (30)(.75)} = 67.5.
∫_30^∞ f(x) dx = S(30) = 0.75.
With a deductible of 30 per loss, the second moment of the payment per loss is:
∫_30^∞ (x - 30)² f(x) dx = ∫_30^∞ x² f(x) dx - 60 ∫_30^∞ x f(x) dx + 900 ∫_30^∞ f(x) dx
= 6750 - (60)(67.5) + (900)(0.75) = 3375.
Since frequency is Poisson, the variance of the aggregate payments is:
λ(second moment of the payment per loss) = (20)(3375) = 67,500.
Alternately, e(30) = (E[X] - E[X ∧ 30])/S(30) = (70 - 25)/0.75 = 60.
(X - 30)² = X² - 60X + 900 = X² - 60(X - 30) - 900.
E[(X - 30)² | X > 30] = E[X² - 60(X - 30) - 900 | X > 30] =
E[X² | X > 30] - 60 E[X - 30 | X > 30] - E[900 | X > 30] = 9000 - 60 e(30) - 900 =
9000 - (60)(60) - 900 = 4500.
The number of losses of size greater than 30 is Poisson with mean: (0.75)(20) = 15.
The variance of the aggregate payments is:
(number of nonzero payments)(second moment of nonzero payments) = (15)(4500) = 67,500.
Comment: Difficult.
In the original exam question, "number of losses, X" should have read "number of losses, N".
5.112. B. Mean = (100)(.3)(300) + (300)(.1)(1000) + (50)(.6)(5000) = 189,000.
Variance per power boat: (.3)(10,000) + (.3)(.7)(300²) = 21,900.
Variance per sailboat: (.1)(400,000) + (.1)(.9)(1000²) = 130,000.
Variance per luxury yacht: (.6)(2,000,000) + (.6)(.4)(5000²) = 7,200,000.
Total Variance: (100)(21,900) + (300)(130,000) + (50)(7,200,000) = 401,190,000.
Mean + standard deviation = 189,000 + √401,190,000 = 209,030.

Comment: Assume that the repair costs for one boat are independent of those of any other boat.

5.113. C. Mean = (3)(10) = 30. Variance = (3)(20²/12) + (10²)(3.6) = 460.
Φ(1.645) = .95. The 95th percentile is approximately: 30 + 1.645√460 = 65.3.

5.114. C. The primary distribution is Binomial with m = 10,000 and q = 0.05, with
mean: (10,000)(.05) = 500, and variance: (10,000)(.05)(1 - .05) = 475.
The secondary distribution is LogNormal with mean: exp[1.039 + .833²/2] = 3.9986,
second moment: exp[2(1.039) + 2(.833²)] = 32.0013,
and variance: 32.0013 - 3.9986² = 16.012.
The mean number of days is: (500)(3.9986) = 1999.3.
The variance of the number of days is: (500)(16.012) + (3.9986²)(475) = 15,601.
The 90th percentile of the Standard Normal Distribution is 1.282.
Thus the 90th percentile of the aggregate number of days is approximately:
1999.3 + 1.282√15,601 = 2159.43.
This corresponds to losses of: ($100)(2159.43) = $215,943.
Since this was for 10,000 policies, this corresponds to a premium per policy of $21.59.
5.115. D. The mean aggregate is: (10)(2000) = 20,000.
Variance of aggregate = λ(second moment of severity) = (10)(2)(2000²) = 80,000,000.
Match the first and second moments of a LogNormal to those of the aggregate:
exp[μ + σ²/2] = 20,000.
exp[2μ + 2σ²] = 80,000,000 + 20,000² = 480,000,000.
Divide the second equation by the square of the first equation:
exp[2μ + 2σ²]/exp[2μ + σ²] = exp[σ²] = 1.2.
⇒ σ = 0.427. μ = 9.812.
105% of the expected annual loss is: (1.05)(20,000) = 21,000.
For the approximating LogNormal, S(21,000) = 1 - Φ[(ln(21,000) - 9.812)/0.427]
= 1 - Φ[0.33] = 37.07%.
Comment: We have fit a LogNormal to the aggregate losses via the method of moments.

5.116. C. The mixed severity has mean: (1/16)(5) + (15/16)(10) = 9.6875.
The mixed severity has second moment: (1/16)(50² + 5²) + (15/16)(20² + 10²) = 626.56.
Thus without the vaccine, for 100 lives the mean of the compound Poisson Process is:
(100)(.16)(9.6875) = 155.0, and the variance is: (100)(.16)(626.56) = 10,025.
Φ[0.71] = 0.7611 ≈ 1 - 0.24. Therefore, we set the aggregate premium for 100 individuals as:
155.0 + (0.71)√10,025 = 226.1.
With the vaccine, the cost for 100 individuals has mean: (100)(0.15) + (100)(0.16)(10)(15/16) = 165,
and variance: (100)(0.16)(15/16)(20² + 10²) = 7500.
Therefore, with the vaccine we set the aggregate premium for 100 individuals as:
165.0 + (0.71)√7500 = 226.5.
A/B = 226.1/226.5 = 0.998.
Alternately, one can thin the original process into two independent Poisson Processes,
that for Disease 1 with λ = .16/16 = .01, and that for other diseases with λ = (.16)(15/16) = .15.
The first process has mean: (0.01)(5) = 0.05, and variance: (0.01)(50² + 5²) = 25.25.
The second process has mean: (0.15)(10) = 1.5, and variance: (0.15)(20² + 10²) = 75.
Without the vaccine, for 100 lives, the aggregate loss has mean: (100)(0.05 + 1.5) = 155, and
variance: (100)(25.25 + 75) = 10,025.
With the vaccine, for 100 lives, the aggregate cost has mean: (100)(.15 + 1.5) = 165,
and variance: (100)(0 + 75) = 7500. Proceed as before.
Comment: The use of the vaccine increases the mean cost, but decreases the variance.
This could result in either an increase or decrease in aggregate premiums, depending on the criterion
used to set premiums, as well as the number of insured lives.
For example, assume instead that the premiums for a group of 100 independent lives are set at a
level such that the probability that aggregate losses for the group will exceed aggregate premiums
for the group is 5%. Then A = 155.0 + (1.645)√10,025 = 319.7,
B = 165.0 + (1.645)√7500 = 307.5, and A/B = 1.04.
5.117. C. The non-zero payments are Poisson with mean: (.6)(10) = 6.
The size of the non-zero payments is uniform from 0 to 6, with mean 3 and variance: 6²/12 = 3.
Variance of aggregate payments is: λ(second moment of severity) = (6)(3 + 3²) = 72.
Alternately, the payments per loss are a mixture of 40% zero and 60% uniform from 0 to 6.
Therefore, the size of the payments per loss has second moment:
(40%)(0) + (60%)(3 + 3²) = 7.2.
Variance of aggregate payments is: λ(second moment of severity) = (10)(7.2) = 72.

5.118. B. 2nd moment of the amount distribution is: (5%)(10²) + (15%)(5²) + (80%)(1²) = 9.55.
Variance of the compound Poisson Process is: (22)(9.55) = 210.1.
5.119. D. The Binomial frequency has mean: (1000)(.3) = 300, and variance: (1000)(.3)(.7) = 210.
The Pareto severity has a mean of: 500/(3 - 1) = 250,
second moment: (2)(500²)/{(3 - 1)(3 - 2)} = 250,000, and variance: 250,000 - 250² = 187,500.
Variance of Aggregate is: (300)(187,500) + (250²)(210) = 69,375,000.
Standard deviation of the aggregate losses is: √69,375,000 = 8329.

5.120. A. E[X²] = θ² Γ[1 + 2/γ] Γ[α - 2/γ]/Γ[α] = 2² Γ[1 + 2/1] Γ[3 - 2/1]/Γ[3] = (4)Γ[3]Γ[1]/Γ[3] = 4.
Since frequency is Poisson, the variance of aggregate is:
λ(second moment of severity) = (3)(4) = 12.
Comment: A Burr Distribution with γ = 1 is a Pareto Distribution.
E[X²] = 2θ²/{(α-1)(α-2)} = 2(2²)/{(3-1)(3-2)} = 4.
5.121. D. The Negative Binomial has mean: rβ = 96, and variance: rβ(1 + β) = 672.
The Uniform severity has mean: 4, and variance: 8²/12 = 16/3.
Mean of Aggregate is: (96)(4) = 384.
Variance of Aggregate is: (96)(16/3) + (4²)(672) = 11,264.
Φ[1.645] = 95%. Premium is: 384 + 1.645√11,264 = 559.
5.122. A. Mean of the aggregate is: (100)(20,000) = 2,000,000.
Variance of the aggregate is: (100)(5000²) + (20,000²)(25²) = 2.525 x 10¹¹.
Prob[Aggregate > (1.5)(2,000,000)] ≈ 1 - Φ[1,000,000/√(2.525 x 10¹¹)] = 1 - Φ[2.0] = 2.3%.

Section 6, Individual Risk Model86

In the individual risk model, the aggregate loss is the sum of the losses from different independent
policies.
Throughout this section we will assume that each policy has at most one claim per year.
Thus frequency is Bernoulli for each policy, with the q parameters varying between policies.

Mean and Variance of the Aggregate:

Often, the claim size will be a fixed amount, bi, for policy i.
In this case, the mean aggregate is the sum of the policy means: Σ qi bi.
The variance of the aggregate is the sum of the policy variances: Σ (1 - qi)qi bi².87
Exercise: There are 300 policies insuring 300 independent lives.
For the first 100 policies, the probability of death this year is 2% and the death benefit is 5.
For the second 100 policies, the probability of death this year is 4% and the death benefit is 20.
For the third 100 policies, the probability of death this year is 6% and the death benefit is 10.
Determine the mean and variance of Aggregate Losses.
[Solution: Mean = Σ ni qi bi = (100)(2%)(5) + (100)(4%)(20) + (100)(6%)(10) = 150.
Variance = Σ ni(1-qi)qi bi² =
(100)(0.98)(0.02)(5²) + (100)(0.96)(0.04)(20²) + (100)(0.94)(0.06)(10²) = 2149.]
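A minimal sketch of this computation (my own variable names):

    groups = [          # (number of lives, probability of death q, death benefit b)
        (100, 0.02, 5),
        (100, 0.04, 20),
        (100, 0.06, 10),
    ]
    mean = sum(n * q * b for n, q, b in groups)                 # 150
    var  = sum(n * q * (1 - q) * b ** 2 for n, q, b in groups)  # 2149
    print(mean, var)
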
If the severity is not constant, assume the severity distribution for policy i has mean μi and
variance σi². In that case, the mean aggregate is the sum of the policy means: Σ qi μi.
The variance of the aggregate is the sum of the policy variances: Σ {qi σi² + (1-qi)qi μi²}.88
For example, assume that for a particular policy the benefit for ordinary death is 5 and for accidental
death is 10, and that 30% of deaths are accidental.89 Then for this policy:
μ = (70%)(5) + (30%)(10) = 6.5, and σ² = (70%)(5²) + (30%)(10²) - 6.5² = 5.25.

86 See Sections 9.11.1 and 9.11.2 in Loss Models.
87 Applying the usual formula for the variance of the aggregate, where the variance of a Bernoulli frequency is
q(1-q) and for a given policy severity is fixed.
88 Applying the usual formula for the variance of the aggregate, where a Bernoulli frequency has mean q and
variance q(1-q).
89 See Example 9.19 in Loss Models.

Parametric Approximation:
As discussed before, one could approximate the Aggregate Distribution by a Normal Distribution,
a LogNormal Distribution, or some other parametric distribution.
Exercise: For the situation in the previous exercise, what is the probability that the aggregate loss will
exceed 170? Use the Normal Approximation.
[Solution: Prob[A > 170] ≈ 1 - Φ[(170 - 150)/√2149] = 1 - Φ[0.43] = 33.36%.
Comment: We usually do not use the continuity correction when working with aggregate
distributions. Here, since everything is in units of 5, a more accurate approximation would be:
1 - Φ[(172.5 - 150)/√2149] = 1 - Φ[0.49] = 31.21%.]
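For those who want to verify such Normal-approximation probabilities numerically, a small sketch (my own helper name):

    from math import erf, sqrt

    def std_normal_cdf(z):
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))

    mean, var = 150.0, 2149.0
    print(1 - std_normal_cdf((170.0 - mean) / sqrt(var)))   # about 0.333
    print(1 - std_normal_cdf((172.5 - mean) / sqrt(var)))   # about 0.31, with continuity correction
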
Direct Calculation of the Aggregate Distribution:
When there are only 3 policies, it is not hard to calculate the aggregate distribution directly.
    Policy    Probability of Death    Death Benefit
      1               2%                    5
      2               4%                   20
      3               6%                   10
For the first policy, there is a 98% chance of an aggregate of 0 and a 2% chance of an aggregate of
5. For the second policy there is a 96% chance of 0 and a 4% chance of 20.
Adding the first two policies, the combined aggregate has:90
(98%)(96%) = 94.08% @0, (2%)(96%) = 1.92% @ 5, (98%)(4%) = 3.92% @ 20, and
(2%)(4%) = 0.08% at 25.
Exercise: Add in the third policy, in order to calculate the aggregate for all three policies.
[Solution: The third policy has a 94% chance of 0 and a 6% chance of 10.
The combined aggregate has: (94.08%)(94%) = 88.4352% @ 0,
(1.92%)(94%) = 1.8048% @ 5, (94.08%)(6%) = 5.6448% @ 10, (1.92%)(6%) = 0.1152% @ 15,
(3.92%)(94%) = 3.6848% @ 20, (0.08%)(94%) = 0.0752% @ 25,
(3.92%)(6%) = 0.2352% @ 30, (0.08%)(6%) = 0.0048% @ 35.]
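The same convolutions can be done mechanically; here is a brief sketch (my own names) that adds one policy at a time:

    policies = [(0.02, 5), (0.04, 20), (0.06, 10)]   # (probability of death, death benefit)

    agg = {0: 1.0}
    for q, b in policies:
        new = {}
        for x, p in agg.items():
            new[x] = new.get(x, 0.0) + p * (1 - q)           # no claim on this policy
            new[x + b] = new.get(x + b, 0.0) + p * q         # claim of size b
        agg = new

    for x in sorted(agg):
        print(x, round(agg[x], 6))
    # 0: 0.884352, 5: 0.018048, 10: 0.056448, 15: 0.001152, 20: 0.036848, ...
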

90 This is the same as convoluting the two aggregate distributions.

One can use this distribution to calculate the mean and variance of aggregate losses:
    Aggregate    Probability    First Moment    Second Moment
        0         88.4352%         0.0000           0.0000
        5          1.8048%         0.0902           0.4512
       10          5.6448%         0.5645           5.6448
       15          0.1152%         0.0173           0.2592
       20          3.6848%         0.7370          14.7392
       25          0.0752%         0.0188           0.4700
       30          0.2352%         0.0706           2.1168
       35          0.0048%         0.0017           0.0588
      Sum                          1.5000          23.7400

The mean is 1.5 and the variance is: 23.74 - 1.5² = 21.49, matching the previous results.
Exercise: What is the probability that the aggregate will exceed 17?
[Solution: Prob[A > 17] = 3.6848% + 0.0752% + 0.2352% + 0.0048% = 4%.
Comment: A > 17 if and only if there is a claim from the second policy.
The exact 4% differs significantly from the result using the Normal Approximation of 0.04%!
The Normal Approximation is poor when the total number of claims expected is only 0.12.]

Problems:
6.1 (2 points) An insurer provides life insurance for the following group of independent lives:
    Number of Lives    Death Benefit    Probability of Death
         2000                1                 0.05
         3000                5                 0.04
         4000               10                 0.02
Using the Normal Approximation, what is the probability that the aggregate losses exceed 110% of
their mean?
(A) 6.5%   (B) 7.0%   (C) 7.5%   (D) 8.0%   (E) 8.5%
Use the following information for the next two questions:
An insurer writes two classes of policies with the following distributions of losses per policy:
    Class    Mean    Variance
      1       10        20
      2       15        40

6.2 (1 point) The insurer will write 10 independent policies, 5 of class one and 5 of class two.
What is the variance of the aggregate losses?
A. 300   B. 320   C. 340   D. 360   E. 380

6.3 (2 points) The insurer will write 10 independent policies.
The number of these policies that are class one is Binomial with m = 10 and q = 0.5.
What is the variance of the aggregate losses?
A. 300   B. 320   C. 340   D. 360   E. 380
6.4 (2 points) An insurer provides life insurance for the following group of 4 independent lives:
    Life    Death Benefit    Probability of Death
     A           10                0.03
     B           25                0.06
     C           50                0.01
     D          100                0.02
What is the coefficient of variation of the aggregate losses?
A. 2.9   B. 3.1   C. 3.3   D. 3.5   E. 3.7

Use the following information for the next two questions:
An insurer provides life insurance for the following group of 400 independent lives:
    Number of Lives    Death Benefit    Probability of Death
         100                10                0.03
         100                25                0.06
         100                50                0.01
         100               100                0.02

6.5 (2 points) What is the coefficient of variation of the aggregate losses?
A. 0.3   B. 0.4   C. 0.5   D. 0.6   E. 0.7

6.6 (2 points) Using the Normal Approximation, what is the probability that the aggregate losses
are less than 300?
A. 20%   B. 21%   C. 22%   D. 23%   E. 24%

6.7 (2 points) An insurer provides life insurance for the following group of 4 independent lives:
    Life    Death Benefit    Probability of Death
     1           10                0.04
     2           10                0.03
     3           20                0.02
     4           30                0.05
What is the probability that the aggregate losses are more than 40?
A. less than 0.09%
B. at least 0.09% but less than 0.10%
C. at least 0.10% but less than 0.11%
D. at least 0.11% but less than 0.12%
E. at least 0.12%

6.8 (Course 151 Sample Exam #1, Q.22) (2.5 points) An insurer provides one year term life
insurance to a group. The benefit is 100 if death is due to accident and 10 otherwise.
The characteristics of the group are:
              Number      Probability of Death    Probability of Death
    Gender    of Lives       (all causes)         (accidental causes)
    Female      100             0.004                   0.0004
    Male        200             0.006                   0.0012
The aggregate claims distribution is approximated using a compound Poisson distribution which
equates the expected number of claims. The premium charged equals the mean plus 10% of the
standard deviation of this compound Poisson distribution.
Determine the relative security loading, (premiums / expected losses) - 1.
(A) 0.10   (B) 0.13   (C) 0.16   (D) 0.19   (E) 0.22
6.9 (Course 151 Sample Exam #2, Q.20) (1.7 points)
An insurer provides life insurance for the following group of independent lives:
    Number of Lives    Death Benefit    Probability of Death
         100                 θ                 0.02
         200                2θ                 0.03
Let S be the total claims. Let w be the variance of the compound Poisson distribution which
approximates the distribution of S by equating the expected number of claims.
Determine the maximum value of θ such that w ≤ 2500.
(A) 6.2   (B) 8.0   (C) 9.8   (D) 11.6   (E) 13.4

6.10 (Course 151 Sample Exam #2, Q.23) (2.5 points)
An insurer has the following portfolio of policies:
    Class    Benefit Amount    Number of Policies    Probability of a Claim
      1             1                  400                   0.02
      2            10                  100                   0.02
There is at most one claim per policy.
The insurer reinsures the amount in excess of R (R > 1) per policy.
The reinsurer has a reinsurance loading of 0.25.
The insurer wants to minimize the probability, as determined by the normal approximation,
that retained claims plus cost of reinsurance exceeds 34. Determine R.
(A) 1.5   (B) 2.0   (C) 2.5   (D) 3.0   (E) 3.5

6.11 (Course 151 Sample Exam #3, Q.9) (1.7 points)
An insurance company has a portfolio of two classes of insureds:
    Class    Benefit    Probability of a Claim    Number of Insureds    Relative Security Loading
      I         5              0.20                       N                      0.10
      II       10              0.10                      2N                      0.05
The relative security loading is defined as: (premiums / expected losses) - 1.
Assume all claims are independent. The total of the premiums equals the 95th percentile of the
normal distribution that approximates the distribution of total claims. Determine N.
(A) 1488   (B) 1538   (C) 1588   (D) 1638   (E) 1688
6.12 (5A, 5/94, Q.37) (2 points)
An insurance company has two classes of insureds as follows:
    Class    Number of Insureds    Probability of 1 Claim    Claim Amount
      1             200                    0.05                  2000
      2             300                    0.01                  1500
There is at most one claim per insured and each insured has only one size of claim.
The insurer wishes to collect an amount equal to the 95th percentile of the distribution of total claims,
where each individual's share is to be proportional to the expected claim amount.
Calculate the relative security loading, (premiums / expected losses) - 1, using the Normal
Approximation.
6.13 (5A, 11/97, Q.39) (2 points) A life insurance company issues 1-year term life contracts for
benefit amounts of $100 and $200 to individuals with probabilities of death of .03 or .09.
The following table gives the number of individuals in each of the four classes.
    Class    Probability    Benefit    Number
      1          .03           100        50
      2          .03           200        40
      3          .09           100        60
      4          .09           200        50
The company wants to collect from this population an amount equal to the 95th percentile of the
distribution of total claims, and it wants each individual's share of this amount to be proportional to
the individual's expected claim. Using the Normal Approximation,
calculate the required relative security loading, (premiums / expected losses) - 1.

6.14 (Course 1 Sample Exam, Q.15) (1.9 points) An insurance company issues insurance
contracts to two classes of independent lives, as shown below.
    Class    Probability of Death    Benefit Amount    Number in Class
      A             0.01                   200               500
      B             0.05                   100               300
The company wants to collect an amount, in total, equal to the 95th percentile of the distribution of
total claims.
The company will collect an amount from each life insured that is proportional to that life's expected
claim. That is, the amount for life j with expected claim E[Xj] would be kE[Xj].
Using the Normal Approximation, calculate k.
A. 1.30   B. 1.32   C. 1.34   D. 1.36   E. 1.38

Solutions to Problems:
6.1. C. Mean = Σ ni qi bi = (2000)(.05)(1) + (3000)(.04)(5) + (4000)(.02)(10) = 1500.
Variance = Σ ni(1-qi)qi bi² = (2000)(.95)(.05)(1²) + (3000)(.96)(.04)(5²) + (4000)(.98)(.02)(10²) =
10,815. Prob[A > (1.1)(1500)] ≈ 1 - Φ[150/√10,815] = 1 - Φ[1.44] = 1 - 0.9251 = 7.49%.
6.2. A. (5)(20) + (5)(40) = 300.
Comment: Since we are given the variance of aggregate losses, there is no need to compute the
variance of aggregate losses from the mean frequency, variance of frequency, mean severity, and
variance of severity.
6.3. D. Using analysis of variance, let n be the number of policies of class 1:
Var[A] = E_n[Var[A | n]] + Var_n[E[A | n]] = E_n[(n)(20) + (10-n)(40)] + Var_n[(n)(10) + (10-n)(15)] =
E_n[400 - 20n] + Var_n[150 - 5n] = 400 - 20E_n[n] + 25Var_n[n] =
400 - (20)(10)(0.5) + (25)(10)(0.5)(1 - 0.5) = 362.5.
Comment: Similar to Exercise 9.72 in Loss Models.
6.4. E. Mean = Σ qi bi = (.03)(10) + (.06)(25) + (.01)(50) + (.02)(100) = 4.3.
Variance = Σ (1-qi)qi bi² = (.97)(.03)(10²) + (.94)(.06)(25²) + (.99)(.01)(50²) + (.98)(.02)(100²) =
258.91. CV = √258.91 / 4.3 = 3.74.
6.5. B. Mean = Σ ni qi bi = 100{(.03)(10) + (.06)(25) + (.01)(50) + (.02)(100)} = 430.
Variance = Σ ni(1-qi)qi bi² = 100{(.97)(.03)(10²) + (.94)(.06)(25²) + (.99)(.01)(50²) +
(.98)(.02)(100²)} = 25,891. CV = √25,891 / 430 = 0.374.
Comment: With 100 policies of each type, the coefficient of variation is 1/10 of what it would have
been with only one policy of each type as in the previous question.
6.6. B. From the previous solution, Mean = 430 and Variance = 25,891.
Prob[Aggregate < 300] ≈ Φ[(300 - 430)/√25,891] = Φ[-0.81] = 1 - 0.7910 = 20.9%.
6.7. C. Prob[A > 40] = Prob[A = 50] + Prob[A = 60] + Prob[A = 70] =
Prob[lives 3 and 4 die] + Prob[lives 1, 2, and 4 die] + Prob[lives 1, 3, and 4 die] +
Prob[lives 2, 3, and 4 die] + Prob[lives 1, 2, 3, and 4 die] =
(.96)(.97)(.02)(.05) + (.04)(.03)(.98)(.05) + (.04)(.97)(.02)(.05) + (.96)(.03)(.02)(.05) +
(.03)(.04)(.02)(.05) = 0.00106.

6.8. B. The expected number of fatal accidents is: (100)(.0004) + (200)(.0012) = 0.28.
The expected number of deaths (all causes) is: (100)(.004) + (200)(.006) = 1.6.
So the expected number of deaths from other than accidents is: 1.6 - 0.28 = 1.32.
Therefore, the mean severity is: {(1.32)(10) + (0.28)(100)}/1.6 = 25.75.
The second moment of the severity is: {(1.32)(10²) + (0.28)(100²)}/1.6 = 1832.5.
The mean aggregate loss is: (1.6)(25.75) = 41.2.
If this were a compound Poisson Distribution, then the variance would be:
(mean frequency)(second moment of the severity) = (1.6)(1832.5) = 2932.
The standard deviation is: √2932 = 54.14.
Thus the premium is: 41.2 + (10%)(54.14) = 46.614.
The relative security loading is: premium/(expected loss) - 1 = 46.614/41.2 - 1 = 13.1%.
6.9. C. Expected number of claims = (100)(.02) + (200)(.03) = 2 + 6 = 8.
Therefore, 2/8 of the claims are expected to have death benefit θ, while 6/8 of the claims are
expected to have death benefit of 2θ.
The second moment of the severity is: (2/8)(θ²) + (6/8)(2θ)² = 13θ²/4.
Thus the variance of aggregate losses is: (8)(13θ²/4) = 26θ².
Setting 2500 = 26θ², θ = 9.8.

6.10. B. Expected number of claims = (400)(.02) + (100)(.02) = 8 + 2 = 10.
For a retention of 10 > R > 1, the expected losses retained are: (8)(1) + (2)(R) = 8 + 2R.
Variance of retained losses = (1²)(400)(.02)(.98) + (R²)(100)(.02)(.98) = 7.84 + 1.96R².
The expected ceded losses are 2(10 - R) = 20 - 2R. Thus the cost of reinsurance is:
1.25(20 - 2R) = 25 - 2.5R. Thus expected retained losses plus reinsurance costs are:
8 + 2R + 25 - 2.5R = 33 - 0.5R. The variance of the retained losses plus reinsurance costs is that of
the retained losses. Therefore, the probability that the retained losses plus reinsurance costs
exceed 34 is approximately: 1 - Φ[(34 - (33 - 0.5R))/√(7.84 + 1.96R²)].
This probability is minimized by maximizing
(34 - (33 - 0.5R))/√(7.84 + 1.96R²) = (1 + 0.5R)/√(7.84 + 1.96R²).
Setting the derivative with respect to R equal to zero:
0 = {(0.5)√(7.84 + 1.96R²) - (1 + 0.5R)(1/2)(2)(1.96R)/√(7.84 + 1.96R²)} / (7.84 + 1.96R²).
Therefore, (0.5)(7.84 + 1.96R²) = (1 + 0.5R)(1.96R). 3.92 + 0.98R² = 1.96R + 0.98R². ⇒ R = 2.
Comment: The graph of (1 + 0.5R)/√(7.84 + 1.96R²), for 1 < R < 10: [graph not reproduced].
The graph of the approximate probability that the retained losses plus reinsurance costs exceed 34,
1 - Φ[(34 - (33 - 0.5R))/√(7.84 + 1.96R²)], for 1 < R < 10: [graph not reproduced].
This probability is minimized for R = 2. However, this probability is insensitive to R, so in this
case this may not be a very practical criterion for selecting the best R.


6.11. A. The 95th percentile of the Normal Distribution implies that the premium =
expected aggregate loss + 1.645 (standard deviations).
The expected aggregate loss is: (5)(.2)N + (10)(.1)(2N) = 3N.
The variance of aggregate losses is: (5²)(.2)(.8)N + (10²)(.1)(.9)(2N) = 22N.
The premiums = (1.1)(5)(.2)N + (1.05)(10)(.1)(2N) = 3.2N.
Setting the premiums equal to expected aggregate loss + 1.645 (standard deviations):
3.2N = 3N + 1.645√(22N). Solving, N = 22(1.645/0.2)² = 1488.
6.12. The mean loss is: (.05)(2000)(200) + (.01)(1500)(300) = 24,500.
The variance of aggregate losses is: (.05)(.95)(2000²)(200) + (.01)(.99)(1500²)(300)
= 44,682,500. The 95th percentile of aggregate losses is approximately:
24,500 + (1.645)√44,682,500 = 24,500 + 10,996.
The relative security loading is: 10,996/24,500 = 45%.
6.13. The mean loss is: (.03)(100)(50) + (.03)(200)(40) + (.09)(100)(60) + (.09)(200)(50) =
1830. The variance of aggregate losses is:
(.03)(.97)(100²)(50) + (.03)(.97)(200²)(40) + (.09)(.91)(100²)(60) + (.09)(.91)(200²)(50) =
274,050. The 95th percentile of aggregate losses is approximately:
1830 + (1.645)√274,050 = 2691. The security loading is: 2691 - 1830 = 861.
The relative security loading is: 861/1830 = 47%.
6.14. E. The mean aggregate is: (.01)(500)(200) + (.05)(300)(100) = 2,500.
The variance of the aggregate is: (.01)(.99)(500)(200²) + (.05)(.95)(300)(100²) = 340,500.
Using the Normal Approximation, the 95th percentile of the aggregate is:
2500 + 1.645√340,500 = 3460.
k = 3460/2500 = 1.384.


Section 7, Recursive Method / Panjer Algorithm


As discussed previously, the same mathematics apply to aggregate distributions (independent
frequency and severity) and compound frequency distributions.
While one could calculate the density function by brute force, tools have been developed to make it
easier to work with aggregate distributions when the primary distribution has certain forms. The
Panjer Algorithm, referred to in Loss Models as the recursive method, is one such technique.91
The first step is to calculate the density at zero of the aggregate or compound distribution.
Density of Aggregate Distribution at Zero:
For an aggregate distribution one can calculate the probability of zero aggregate losses from first
principles. For example, assume frequency follows a Poisson Distribution with λ = 1.3, and severity
has a 60% chance of being zero.
Exercise: What is the density at 0 of the aggregate distribution?
[Solution: There are a number of ways one can have zero aggregate.
One can either have zero claims, or one can have n claims, each with zero severity.
Assuming there were n claims, the chance of each of them having zero severity is 0.6^n.
The chance of having zero claims is the density of the Poisson distribution at 0, e^(-1.3).
Thus the chance of zero aggregate is:
e^(-1.3) + (1.3)e^(-1.3)(0.6) + (1.3²/2!)e^(-1.3)(0.6²) + (1.3³/3!)e^(-1.3)(0.6³) + (1.3⁴/4!)e^(-1.3)(0.6⁴) + ...
= e^(-1.3) Σ_{n=0}^∞ {(1.3)(0.6)}^n / n! = e^(-1.3) exp[(1.3)(0.6)] = exp[-(1.3)(1 - 0.6)] = e^(-0.52) = 0.5945.
Comment: I have used the fact that Σ_{n=0}^∞ x^n / n! = e^x.]

In this exercise, we have computed that there is a 59.45% chance that there is zero aggregate.
Instead one can use the following formula, the first step of the Panjer algorithm: c(0) = Pp (s(0)),
where c is the compound or aggregate density,
s is the density of the severity or secondary distribution,
and Pp is the probability generating function of the frequency or primary distribution.
91 Loss Models points out that the number of computations increases as n², O(n²), rather than n³, O(n³), as for
direct calculation using convolutions.


Exercise: Apply this formula to determine the density at zero of the aggregate distribution.
[Solution: For the Poisson, P(z) = exp[λ(z - 1)] = exp[1.3(z - 1)].
c(0) = Pp(s(0)) = Pp(0.6) = e^(1.3(0.6-1)) = e^(-0.52) = 0.5945205.]
Derivation of the Formula for the Density of Aggregate Distribution at Zero:
Let the frequency or primary distribution be p, the severity or secondary distribution be s, and let c
be the aggregate or compound distribution.
The probability of zero aggregate is:
c(0) = p(0) + p(1)s(0) + p(2)s(0)² + p(3)s(0)³ + ... = Σ_{n=0}^∞ p(n) s(0)^n.
We note that by the definition of the Probability Generating Function, the righthand side of the
above equation is the Probability Generating Function of the primary distribution evaluated at s(0).92
Therefore, the density of the compound distribution at zero is:93
c(0) = Pp(s(0)) = P.G.F. of primary distribution at (density of secondary distribution at zero).
Formulas for the Panjer Algorithm (recursive method):
Let the frequency or primary distribution be p, the severity or secondary distribution be s, and
c be the aggregate or compound distribution. If the primary distribution p is a member of the
(a,b,0) class94, then one can use the Panjer Algorithm (recursive method) in order to iteratively
compute the compound density:95
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x - j), with the starting value c(0) = Pp(s(0)).
92 P(z) = E[z^N] = Σ p(n) z^n.
93 See Theorem 6.14 in Loss Models. This is the source of the values given in Table D.1 in Appendix D of Loss Models.
94 f(x+1)/f(x) = a + b/(x+1), which holds for the Binomial, Poisson, and Negative Binomial Distributions.
95 Formula 9.22 in Loss Models. Note that if the primary distribution is a member of the (a, b, 1) class, then, as discussed
in the next section, there is a modification of this algorithm which applies.
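To make the recursion concrete, here is a minimal sketch in Python (my own illustration, not code from Loss Models; the function name panjer_ab0 is made up for this example). It takes the (a, b, 0) parameters, the probability generating function of the primary distribution, and the secondary densities s(0), s(1), ..., and returns c(0), ..., c(n_max):

def panjer_ab0(a, b, pgf_primary, s, n_max):
    # s is a list of secondary densities [s(0), s(1), ..., s(k)]; densities beyond k are zero.
    c = [0.0] * (n_max + 1)
    c[0] = pgf_primary(s[0])                      # c(0) = Pp(s(0))
    for x in range(1, n_max + 1):
        total = 0.0
        for j in range(1, min(x, len(s) - 1) + 1):
            total += (a + j * b / x) * s[j] * c[x - j]
        c[x] = total / (1.0 - a * s[0])
    return c

Each density c(x) uses only the previously computed values c(0), ..., c(x-1), which is what makes the method recursive.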


Aggregate Distribution Example:


In order to apply the Panjer Algorithm one must have a discrete severity distribution. Thus either the
original severity distribution must be discrete or one must approximate a continuous severity
distribution with a discrete one.96 We will assume for simplicity that the discrete distribution has
support on the nonnegative integers. If not, we can just change units to make it more convenient.
Exercise: Assume the only possible sizes of loss are 0, $1000, $2000, $3000, etc.
How could one change the scale so that the support is the nonnegative integers?
[Solution: One puts everything in units of thousands of dollars instead of dollars. If f(2000) is the
original density at 2000, then it is equal to s(2), the new density at 2. The new densities still sum to
unity and the aggregate distribution will be in units of $1000.]
Let the frequency distribution be p,97 the discrete severity distribution be s,98 and let c be the
aggregate loss distribution.99
Exercise: Let severity have density: s(0) = 60%, s(1) = 10%, s(2) = 25%, and s(3) = 5%.
Frequency is Poisson with λ = 1.3.
Use the Panjer Algorithm to calculate the density at 3 of the aggregate distribution.
[Solution: For the Poisson, a = 0 and b = λ = 1.3.
c(0) = Pp(s(0)) = Pp(0.6) = e^(1.3(0.6-1)) = 0.5945205.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x - j) = (1.3/x) Σ_{j=1}^{x} j s(j) c(x - j).
c(1) = (1.3/1)(1) s(1) c(0) = (1.3/1){(1)(0.1)(0.5945205)} = 0.077288.
c(2) = (1.3/2){(1)s(1)c(1) + (2)s(2)c(0)} = (1.3/2){(1)(0.1)(0.077288) + (2)(0.25)(0.5945205)} = 0.19824.
c(3) = (1.3/3){(1)(0.1)(0.19824) + (2)(0.25)(0.077288) + (3)(0.05)(0.5945205)} = 0.06398.]
By continuing iteratively in this manner, one could calculate the density for any value.100
The Panjer algorithm reduces the amount of work needed a great deal while providing exact results,
provided one retains enough significant digits in the intermediate calculations.
96 There are a number of ways of performing such an approximation, as discussed in a subsequent section.
97 In general, p is the primary distribution, which for this application of the Panjer algorithm is the frequency distribution.
98 In general, s is the secondary distribution, which for this application of the Panjer algorithm is the discrete severity distribution.
99 In general, c is the compound distribution, which for this application of the Panjer algorithm is the aggregate losses.
100 In this case, the aggregate distribution out to 10 is: 0.594521, 0.0772877, 0.198243, 0.06398, 0.0380616,
0.0170385, 0.00657186, 0.00276448, 0.000994199, 0.000356408, 0.000123164.
The chance of the aggregate losses being greater than 10 is: 0.0000587055.
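As a numerical check of this example, here is a short sketch (assuming the illustrative panjer_ab0 helper defined earlier): for a Poisson primary, a = 0, b = λ, and P(z) = exp[λ(z - 1)].

import math

lam = 1.3
s = [0.60, 0.10, 0.25, 0.05]          # severity densities at 0, 1, 2, 3
c = panjer_ab0(a=0.0, b=lam,
               pgf_primary=lambda z: math.exp(lam * (z - 1.0)),
               s=s, n_max=10)
print([round(x, 6) for x in c])       # 0.594521, 0.077288, 0.198243, 0.06398, ..., as in footnote 100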


If the severity density is zero at 0, then c(0) = Pp (s(0)) = Pp (0).


Now Pp(0) = lim_{z→0} Pp(z) = lim_{z→0} E[z^N] = lim_{z→0} {p(0) + Σ_{n=1}^∞ p(n) z^n} = p(0).

Therefore, if s(0) = 0, the probability of zero aggregate losses = c(0) = p(0) = the probability of no
claims. If s(0) > 0, then there is an additional probability of zero aggregate losses, due to the
contribution of situations with claims of size zero.
Thinning the Frequency or Primary Distribution:
The Panjer Algorithm directly handles situations in which there is a positive chance of a zero severity.
In contrast, if one tried to apply convolution to such a situation, one would need to calculate a lot of
convolutions, since one can get zero aggregate even if one has many claims.
One can get around this difficulty by thinning the frequency or primary distribution.101
As in the previous example, let frequency be Poisson with λ = 1.3, and the severity have density:
s(0) = 60%, s(1) = 10%, s(2) = 25%, and s(3) = 5%.
Exercise: What is the distribution of the number of claims with non-zero severity?
[Solution: Poisson with λ = (1.3)(40%) = 0.52.]
Exercise: If the severity distribution is truncated to remove the zeros, what is the resulting
distribution?
[Solution: s(1) = 10%/40% = 25%, s(2) = 25%/40% = 62.5%, and s(3) = 5%/40% = 12.5%.]
Only the claims with non-zero severity contribute to the aggregate.
Therefore, we can compute the aggregate distribution by using the thinned frequency, Poisson with
λ = (1.3)(40%) = 0.52, and the severity distribution truncated to remove the zero claims.
Exercise: Use convolutions to calculate the aggregate distribution up to 3.
[Solution: p(3) = e^(-0.52) 0.52³/6 = 0.01393. (s*s)[3] = (2)(0.25)(0.625) = 0.3125.
(0.30915)(0.12500) + (0.08038)(0.31250) + (0.01393)(0.01562) = 0.06398.
n:                0          1          2          3
Poisson density:  0.59452    0.30915    0.08038    0.01393
x    s*0      s          s*s        s*s*s      Aggregate Density
0    1                                         0.594521
1             0.25000                          0.077288
2             0.62500    0.06250               0.198243
3             0.12500    0.31250    0.01562    0.063980
Comment: Matching the result obtained previously using the Panjer Algorithm.]
101 Thinning is discussed in Mahler's Guide to Frequency Distributions.
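A quick numeric check of the thinning idea (a self-contained sketch of my own; it does not use the Panjer recursion): compute the density at 3 of the aggregate by brute-force convolution of the thinned Poisson (λ = 0.52) with the zero-truncated severity.

import math

lam_thinned = 1.3 * 0.40                        # Poisson mean for the non-zero losses
s_trunc = {1: 0.25, 2: 0.625, 3: 0.125}         # severity with the zeros removed

def prob_sum_equals(target, n, sev):
    # probability that n independent losses with density sev sum to target
    if n == 0:
        return 1.0 if target == 0 else 0.0
    return sum(p * prob_sum_equals(target - x, n - 1, sev) for x, p in sev.items())

agg3 = sum(math.exp(-lam_thinned) * lam_thinned**n / math.factorial(n)
           * prob_sum_equals(3, n, s_trunc) for n in range(0, 4))
print(round(agg3, 5))                           # 0.06398, matching the Panjer Algorithm result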


One can apply this same thinning technique when the frequency is any member of the (a, b, 0)
class. If the original frequency is Binomial, then the non-zero claims are also Binomial with parameters
m and {1 - s(0)}q. If the original frequency is Negative Binomial, then the non-zero claims are also
Negative Binomial with parameters r and {1 - s(0)}β.
Compound Distribution Example:
The mathematics are the same in order to apply the Panjer Algorithm to the compound case.
For example, assume the number of taxicabs that arrive per minute at the Heartbreak Hotel is
Poisson with mean 1.3. In addition, assume that the number of passengers dropped off at the hotel
by each taxicab is Binomial with q = 0.4 and m = 5. The number of passengers dropped off by
each taxicab is independent of the number of taxicabs that arrive and is independent of the number
of passengers dropped off by any other taxicab. Then the aggregate number of passengers
dropped off per minute at the Heartbreak Hotel is a compound Poisson-Binomial distribution, with
parameters: λ = 1.3, q = 0.4, m = 5.
Exercise: Use the Panjer Algorithm to calculate the density at 3 for this example.
[Solution: The densities of the secondary Binomial Distribution are:
j    s(j)         j    s(j)
0    0.07776      3    0.2304
1    0.2592       4    0.0768
2    0.3456       5    0.01024
For the primary Poisson a = 0, b = λ = 1.3, and P(z) = exp[λ(z - 1)] = exp[1.3(z - 1)].
c(0) = Pp(s(0)) = Pp(0.07776) = e^(1.3(0.07776 - 1)) = 0.301522.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x - j) = (1.3/x) Σ_{j=1}^{x} j s(j) c(x - j).
c(1) = (1.3/1)(1) s(1) c(0) = (1.3){(1)(0.2592)(0.301522)} = 0.101601.
c(2) = (1.3/2){(1)(0.2592)(0.101601) + (2)(0.3456)(0.301522)} = 0.152586.
c(3) = (1.3/3){(1)(0.2592)(0.152586) + (2)(0.3456)(0.101601) + (3)(0.2304)(0.301522)} = 0.137882.]
By continuing iteratively in this manner, one could calculate the density for any value. The Panjer
algorithm reduces the amount of work needed a great deal while providing exact results, provided
one retains enough significant digits in the intermediate calculations.102
102 Here are the densities for the compound Poisson-Binomial distribution, with parameters λ = 1.3, q = 0.4, m = 5,
calculated using the Panjer Algorithm, from 0 to 20: 0.301522, 0.101601, 0.152586, 0.137882, 0.0988196,
0.070989, 0.0507183, 0.0335563, 0.0211638, 0.0130872, 0.0078559, 0.00456369, 0.00258682,
0.00143589, 0.000779816, 0.000414857, 0.000216723, 0.000111302, 0.0000562232, 0.0000279619,
0.0000137058.
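The same recursion reproduces this compound Poisson-Binomial example. Here is a sketch (again assuming the illustrative panjer_ab0 helper), with the Binomial(m = 5, q = 0.4) densities built from the binomial formula.

import math

lam, m, q = 1.3, 5, 0.4
s = [math.comb(m, j) * q**j * (1 - q)**(m - j) for j in range(m + 1)]
c = panjer_ab0(a=0.0, b=lam,
               pgf_primary=lambda z: math.exp(lam * (z - 1.0)),
               s=s, n_max=5)
print([round(x, 6) for x in c])       # 0.301522, 0.101601, 0.152586, 0.137882, ...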


Preliminaries to the Proof of the Panjer Algorithm:


In order to prove the Panjer Algorithm / Recursive Method, we will use two results for convolutions.
First, s*n(x) = Prob(sum of n losses is x) = Σ_{i=0}^∞ Prob(first loss is i) Prob(sum of n-1 losses is x - i)
= Σ_{i=0}^∞ s(i) s*(n-1)(x - i).
In other words, s*n = s * s*(n-1). In this case, since the severity density is assumed to have support
equal to the nonnegative integers, s*(n-1)(x - i) is 0 for i > x, so the terms for i > x drop out of the summation:
s*n(x) = Σ_{i=0}^{x} s(i) s*(n-1)(x - i).
Second, assume we have n independent, identically distributed losses, each with distribution s.
Assume we know their sum is x > 0; then by symmetry the conditional expected value of any of
these losses is x/n.
x/n = E[L1 | L1 + L2 + ... + Ln = x] = Σ_{i=0}^∞ i Prob[L1 = i | L1 + L2 + ... + Ln = x]
= Σ_{i=0}^∞ i Prob[L1 = i and L1 + L2 + ... + Ln = x] / Prob[L1 + L2 + ... + Ln = x]
= Σ_{i=0}^∞ i Prob[L1 = i and L2 + ... + Ln = x - i] / s*n(x)
= {1 / s*n(x)} Σ_{i=0}^∞ i s(i) s*(n-1)(x - i) = {1 / s*n(x)} Σ_{i=1}^{x} i s(i) s*(n-1)(x - i).103
Therefore, s*n(x) = (n/x) Σ_{i=1}^{x} i s(i) s*(n-1)(x - i).
103 Note that the term for i = 0 drops out. Since for use in the Panjer Algorithm the severity density is assumed to
have support equal to the nonnegative integers, s*(n-1)(x - i) is 0 for i > x, so the terms for i > x drop out of the summation.


Exercise: Verify the above relationship for n = 2 and x = 5.
[Solution: s*2(5) = probability two losses sum to 5 =
s(0)s(5) + s(1)s(4) + s(2)s(3) + s(3)s(2) + s(4)s(1) + s(5)s(0) = 2{s(0)s(5) + s(1)s(4) + s(2)s(3)}.
(n/x) Σ_{i=1}^{x} i s(i) s*(n-1)(x - i) = (2/5){s(1)s(4) + 2s(2)s(3) + 3s(3)s(2) + 4s(4)s(1) + 5s(5)s(0)}
= 2{s(0)s(5) + s(1)s(4) + s(2)s(3)}.]
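The identity is also easy to check numerically. A small self-contained sketch (the severity values below are arbitrary, chosen only for illustration):

s = [0.1, 0.2, 0.3, 0.2, 0.1, 0.1]    # an arbitrary severity density on 0, 1, ..., 5

direct = sum(s[i] * s[5 - i] for i in range(0, 6))                 # s*2(5) by direct convolution
identity = (2 / 5) * sum(i * s[i] * s[5 - i] for i in range(1, 6))
print(round(direct, 6), round(identity, 6))                        # both 0.18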


Proof of the Panjer Algorithm:
Recall that for a member of the (a, b, 0) class of frequency distributions:
f(n+1) / f(n) = a + {b / (n+1)}. Thus, f(n) = f(n-1){a + b /n}, for n > 0.
For the compound distribution to take on a value x > 0, there must be one or more losses.
As discussed previously, one can write the compound distribution in terms of convolutions:

c(x) = Σ_{n=1}^∞ f(n) s*n(x) = Σ_{n=1}^∞ f(n-1) {a + b/n} s*n(x)
= a Σ_{n=1}^∞ f(n-1) s*n(x) + b Σ_{n=1}^∞ f(n-1) s*n(x)/n
= a Σ_{n=1}^∞ f(n-1) Σ_{i=0}^{x} s(i) s*(n-1)(x - i) + b Σ_{n=1}^∞ f(n-1) (n/x) Σ_{i=1}^{x} i s(i) s*(n-1)(x - i) / n
= a Σ_{i=0}^{x} s(i) Σ_{n=1}^∞ f(n-1) s*(n-1)(x - i) + (b/x) Σ_{i=1}^{x} i s(i) Σ_{n=1}^∞ f(n-1) s*(n-1)(x - i)
= a Σ_{i=0}^{x} s(i) c(x - i) + (b/x) Σ_{i=1}^{x} i s(i) c(x - i)
= a s(0) c(x) + Σ_{i=1}^{x} (a + bi/x) s(i) c(x - i).
Taking the first term to the lefthand side of the equation and solving for c(x):
c(x) = {1/(1 - a s(0))} Σ_{i=1}^{x} (a + i b/x) s(i) c(x - i).


Problems:
Use the following information for the next 11 questions:
• One has a compound Geometric distribution with β = 2.1.
• The discrete severity distribution is as follows:
Size         0      1      2      3      4
Probability  25%    35%    20%    15%    5%

7.1 (1 point) What is the mean aggregate loss?


A. less than 3.0
B. at least 3.0 but less than 3.5
C. at least 3.5 but less than 4.0
D. at least 4.0 but less than 4.5
E. at least 4.5
7.2 (2 points) What is the variance of the aggregate losses?
A. less than 13
B. at least 13 but less than 14
C. at least 14 but less than 15
D. at least 15 but less than 16
E. at least 16
7.3 (1 point) What is the probability that the aggregate losses are zero?
A. less than .37
B. at least .37 but less than .38
C. at least .38 but less than .39
D. at least .39 but less than .40
E. at least .40
7.4 (2 points) What is the probability that the aggregate losses are one?
A. less than 12%
B. at least 12% but less than 13%
C. at least 13% but less than 14%
D. at least 14% but less than 15%
E. at least 15%



7.5 (2 points) What is the probability that the aggregate losses are two?
A. less than 8%
B. at least 8% but less than 9%
C. at least 9% but less than 10%
D. at least 10% but less than 11%
E. at least 11%
7.6 (2 points) What is the probability that the aggregate losses are three?
A. less than 9.2%
B. at least 9.2% but less than 9.3%
C. at least 9.3% but less than 9.4%
D. at least 9.4% but less than 9.5%
E. at least 9.5%
7.7 (2 points) What is the probability that the aggregate losses are four?
A. less than 7.2%
B. at least 7.2% but less than 7.3%
C. at least 7.3% but less than 7.4%
D. at least 7.4% but less than 7.5%
E. at least 7.5%
7.8 (2 points) What is the probability that the aggregate losses are five?
A. less than 4.7%
B. at least 4.7% but less than 4.8%
C. at least 4.8% but less than 4.9%
D. at least 4.9% but less than 5.0%
E. at least 5.0%
7.9 (2 points) What is the probability that the aggregate losses are greater than 5?
Use the Normal Approximation.
A. less than 10%
B. at least 10% but less than 15%
C. at least 15% but less than 20%
D. at least 20% but less than 25%
E. at least 25%


7.10 (2 points) Approximate the distribution of aggregate losses via a LogNormal Distribution, and
estimate the probability that the aggregate losses are greater than 5.
A. less than 10%
B. at least 10% but less than 15%
C. at least 15% but less than 20%
D. at least 20% but less than 25%
E. at least 25%
7.11 (2 points) What is the 70th percentile of the distribution of aggregate losses?
A. 2
B. 3
C. 4
D. 5
E. 6
Use the following information for the next 6 questions:
• Frequency follows a Binomial Distribution with m = 10 and q = 0.3.
• Frequency and Severity are independent.
• The discrete severity distribution is as follows:
Size         0      1      2      3
Probability  20%    50%    20%    10%
7.12 (1 point) What is the probability that the aggregate losses are zero?
A. less than 4%
B. at least 4% but less than 5%
C. at least 5% but less than 6%
D. at least 6% but less than 7%
E. at least 7%
7.13 (2 points) What is the probability that the aggregate losses are one?
A. less than 13%
B. at least 13% but less than 14%
C. at least 14% but less than 15%
D. at least 15% but less than 16%
E. at least 16%
7.14 (2 points) What is the probability that the aggregate losses are two?
A. less than 16%
B. at least 16% but less than 17%
C. at least 17% but less than 18%
D. at least 18% but less than 19%
E. at least 19%


7.15 (2 points) What is the probability that the aggregate losses are three?
A. less than 17%
B. at least 17% but less than 18%
C. at least 18% but less than 19%
D. at least 19% but less than 20%
E. at least 20%
7.16 (2 points) What is the probability that the aggregate losses are four?
A. less than 14%
B. at least 14% but less than 15%
C. at least 15% but less than 16%
D. at least 16% but less than 17%
E. at least 17%
7.17 (2 points) What is the probability that the aggregate losses are five?
A. less than 11%
B. at least 11% but less than 12%
C. at least 12% but less than 13%
D. at least 13% but less than 14%
E. at least 14%

Use the following information for the next 2 questions:
The number of snowstorms each winter in Springfield is Negative Binomial with r = 5 and
β = 3. The probability that a given snowstorm will close Springfield Elementary School for at least
one day is 30%, independent of any other snowstorm.
7.18 (2 points) What is the probability that Springfield Elementary School will not be closed due to
snow next winter?
A. 2%
B. 3%
C. 4%
D. 5%
E. 6%
7.19 (2 points) What is the probability that Springfield Elementary School will be closed by exactly
one snowstorm next winter?
A. 10%
B. 12%
C. 14%
D. 16%
E. 18%


7.20 (2 points) The frequency distribution is a member of the (a, b , 0) class, with
a = 0.75 and b = 3.75. The discrete severity distribution is: 0, 1, 2, 3 or 4 with probabilities of: 15%,
30%, 40%, 10% and 5%, respectively. The probability of the aggregate losses being
6, 7, 8 and 9 are: 0.0695986, 0.0875199, 0.107404, and 0.127617, respectively.
What is the probability of the aggregate losses being 10?
A. less than 14.6%
B. at least 14.6% but less than 14.7%
C. at least 14.7% but less than 14.8%
D. at least 14.8% but less than 14.9%
E. at least 14.9%
Use the following information for the next 2 questions:
The number of hurricanes that form in the Atlantic Ocean each year is Poisson with λ = 11.
The probability that a given such hurricane will hit the continental United States is 15%, independent
of any other hurricane.
7.21 (2 points) What is the probability that no hurricanes hit the continental United States next year?
A. 11%
B. 13%
C. 15%
D. 17%
E. 19%
7.22 (2 points) What is the probability that exactly one hurricane will hit the continental United States
next year?
A. 30%
B. 32%
C. 34%
D. 36%
E. 38%
Use the following information for the next 6 questions:
One has a compound Geometric-Poisson distribution with parameters β = 1.7 and λ = 3.1.
7.23 (1 point) What is the density function at zero?
A. less than 0.36
B. at least 0.36 but less than 0.37
C. at least 0.37 but less than 0.38
D. at least 0.38 but less than 0.39
E. at least 0.39
7.24 (2 points) What is the density function at one?
A. less than 3.3%
B. at least 3.3% but less than 3.4%
C. at least 3.4% but less than 3.5%
D. at least 3.5% but less than 3.6%
E. at least 3.6%


7.25 (2 points) What is the density function at two?


A. less than 5.4%
B. at least 5.4% but less than 5.5%
C. at least 5.5% but less than 5.6%
D. at least 5.6% but less than 5.7%
E. at least 5.7%
7.26 (2 points) What is the density function at three?
A. less than 6.2%
B. at least 6.2% but less than 6.3%
C. at least 6.3% but less than 6.4%
D. at least 6.4% but less than 6.5%
E. at least 6.5%
7.27 (2 points) What is the density function at four?
A. less than 6.0%
B. at least 6.0% but less than 6.1%
C. at least 6.1% but less than 6.2%
D. at least 6.2% but less than 6.3%
E. at least 6.3%
7.28 (2 points) What is the median?
A. 2
B. 3
C. 4

D. 5

E. 6

7.29 (2 points) The frequency distribution is a member of the (a, b , 0) class,


with a = -0.42857 and b = 4.71429.
The discrete severity distribution is: 0, 1, 2, or 3 with probabilities of:
20%, 50%, 20% and 10% respectively.
The probability of the aggregate losses being 10, 11 and 12 are:
0.00792610, 0.00364884, and 0.00157109, respectively.
What is the probability of the aggregate losses being 13?
A. less than 0.03%
B. at least 0.03% but less than 0.04%
C. at least 0.04% but less than 0.05%
D. at least 0.05% but less than 0.06%
E. at least 0.06%



7.30 (3 points) Frequency is given by a Poisson-Binomial compound frequency distribution, as per
Loss Models, with parameters λ = 1.2, m = 4, and q = 0.1.
(Frequency is Poisson with λ = 1.2, and severity is Binomial with m = 4 and q = 0.1.)
What is the density function at 1?
A. less than 0.20
B. at least 0.20 but less than 0.21
C. at least 0.21 but less than 0.22
D. at least 0.22 but less than 0.23
E. at least 0.23
7.31 (4 points) Assume that S has a compound Poisson distribution with λ = 2 and individual claim
amounts that are 20, 30, and 50 with probabilities of 0.5, 0.3 and 0.2, respectively.
Calculate Prob[S > 75].
A. 30%
B. 32%
C. 34%
D. 36%
E. 38%
7.32 (8 points) The number of crises per week faced by the superhero Underdog follows a
Negative Binomial Distribution with r = 0.3 and β = 4. The number of super energy pills he requires
per crisis is distributed as follows: 50% of the time it is 1, 30% of the time it is 2, and 20% of the time
it is 3. What is the minimum number of super energy pills Underdog needs at the beginning of a
week to be 99% certain he will not run out during the week?
Use a computer to help you perform the calculations.
7.33 (5A, 11/95, Q.36) (2 points) Suppose that the aggregate loss S has a compound Poisson
distribution with expected number of claims equal to 3 and the following claim amount distribution:
individual claim amounts can be 1, 2 or 3 with probabilities of 0.6, 0.3, and 0.1, respectively.
Calculate the probability that S = 2.
7.34 (5A, 5/98, Q.36) (2.5 points) Assume that S has a compound Poisson distribution with
λ = 0.6 and individual claim amounts that are 1, 2, and 3 with probabilities of 0.25, 0.35 and 0.40,
respectively. Calculate Prob[S = 1], Prob[S= 2] and Prob[S=3].
7.35 (Course 151 Sample Exam #3, Q.12) (1.7 points) You are given:
(i) S has a compound Poisson distribution with λ = 2.
(ii) individual claim amounts, x, are distributed as follows:
x      p(x)
1      0.4
2      0.6
Determine fS(4).
(A) 0.05   (B) 0.07   (C) 0.10   (D) 0.15   (E) 0.21


Use the following information for the next two questions:
The frequency distribution of the number of losses in a year is geometric-Poisson with geometric
primary parameter β = 3 and Poisson secondary parameter λ = 0.5.
7.36 (Course 3 Sample Exam, Q.41)
Calculate the probability that the total number of losses in a year is at least 4.
7.37 (Course 3 Sample Exam, Q.42) If individual losses are all exactly 100, determine the
expected aggregate losses in excess of 400.
7.38 (3, 11/02, Q.36 & 2009 Sample Q.95) (2.5 points)
The number of claims in a period has a geometric distribution with mean 4.
The amount of each claim X follows P(X = x) = 0.25, x = 1, 2, 3, 4.
The number of claims and the claim amounts are independent.
S is the aggregate claim amount in the period. Calculate Fs(3).
(A) 0.27   (B) 0.29   (C) 0.31   (D) 0.33   (E) 0.35

7.39 (3 points) The number of claims in a period has a geometric distribution with mean 5.
The amount of each claim X follows P(X = x) = 0.2, x = 0, 1, 2, 3, 4.
The number of claims and the claim amounts are independent.
S is the aggregate claim amount in the period. Calculate Fs(3).
(A) 0.27   (B) 0.29   (C) 0.31   (D) 0.33   (E) 0.35


7.40 (CAS3, 5/04, Q.40) (2.5 points) XYZ Re provides reinsurance to Bigskew Insurance
Company. XYZ agrees to pay Bigskew for all losses resulting from events, subject to:

• a $500 deductible per event and
• a $100 annual aggregate deductible
For providing this coverage, XYZ receives a premium of $150.
Use a Poisson distribution with mean equal to 0.15 for the frequency of events.
Event severity is from the following distribution:
Loss      Probability
250       0.10
500       0.25
800       0.30
1,000     0.25
1,250     0.05
1,500     0.05
• i = 0%
What is the actual probability that XYZ will payout more than it receives?
A. 8.9%
B. 9.0%
C. 9.1%
D. 9.2%
E. 9.3%
7.41 (4, 5/07, Q.8) (2.5 points)
Annual aggregate losses for a dental policy follow the compound Poisson distribution with = 3.
The distribution of individual losses is:
Loss   Probability
1      0.4
2      0.3
3      0.2
4      0.1
Calculate the probability that aggregate losses in one year do not exceed 3.
(A) Less than 0.20
(B) At least 0.20, but less than 0.40
(C) At least 0.40, but less than 0.60
(D) At least 0.60, but less than 0.80
(E) At least 0.80


Solutions to Problems:
7.1. A. The mean severity is: (1)(.35) + (2)(.2) + (3)(.15) +(4)(.05) = 1.4.
The mean aggregate losses = (2.1)(1.4) = 2.94.
7.2. D. The second moment of the severity is: (1²)(.35) + (2²)(.2) + (3²)(.15) + (4²)(.05) = 3.3.
Thus the variance of the severity is: 3.3 - 1.4² = 1.34. The mean frequency is 2.1.
The variance of the frequency is: (2.1)(1 + 2.1) = 6.51.
The variance of the aggregate losses is: (2.1)(1.34) + (1.4²)(6.51) = 15.57.
7.3. C. The p.g.f. of the Geometric Distribution is: P(z) = (1 - 2.1(z-1))^(-1).
c(0) = P(s(0)) = P(0.25) = (1 - 2.1(0.25-1))^(-1) = 0.38835.
Alternately, the non-zero losses are Geometric with β = (75%)(2.1) = 1.575.
The only way to get an aggregate of 0 is to have no non-zero losses: 1/2.575 = 0.38835.
Comment: In the alternative solution, we are trying to determine the number of losses of size other
than zero. If one has one or more such loss, then the aggregate losses are positive.
If one has zero such losses, then the aggregate losses are zero.
We can have any number of losses of size zero without affecting the aggregate losses.
7.4. A. For the Geometric Distribution: a = β/(1+β) = 2.1/3.1 = 0.67742 and b = 0.
1/(1 - a s(0)) = 1/(1 - (0.67742)(0.25)) = 1.20388.
Use the Panjer Algorithm,
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 1.20388 Σ_{j=1}^{x} 0.67742 s(j) c(x-j) = 0.81553 Σ_{j=1}^{x} s(j) c(x-j).
c(1) = 0.81553 s(1) c(0) = (0.81553)(0.35)(0.38835) = 0.11085.
Alternately, the non-zero losses are Geometric with β = (75%)(2.1) = 1.575,
and severity distribution: 35/75 = 7/15 @ 1, 4/15 @ 2, 3/15 @ 3, and 1/15 @ 4.
The only way to get an aggregate of 1 is to have one non-zero loss of size 1:
(1.575/2.575²)(7/15) = 0.11085.


7.5. C. Use the Panjer Algorithm, c(2) = 0.81553{s(1)c(1) + s(2)c(0)} =
(0.81553){(0.35)(0.11085) + (0.20)(0.38835)} = 0.09498.
Alternately, the non-zero losses are Geometric with β = (75%)(2.1) = 1.575,
and severity distribution: 35/75 = 7/15 @ 1, 4/15 @ 2, 3/15 @ 3, and 1/15 @ 4.
Ways to get an aggregate of 2:
One non-zero loss of size 2: (1.575/2.575²)(4/15) = 0.06334.
Two non-zero losses, each of size 1: (1.575²/2.575³)(7/15)² = 0.03164.
Total probability: 0.06334 + 0.03164 = 0.09498.
7.6. B. Use the Panjer Algorithm, c(3) = 0.81553{s(1)c(2) + s(2)c(1) + s(3)c(0)} =
(0.81553){(0.35)(0.09498) + (0.20)(0.11085) + (0.15)(0.38835)} = 0.09270.
Alternately, the non-zero losses are Geometric with β = (75%)(2.1) = 1.575,
and severity distribution: 35/75 = 7/15 @ 1, 4/15 @ 2, 3/15 @ 3, and 1/15 @ 4.
Ways to get an aggregate of 3:
One non-zero loss of size 3: (1.575/2.575²)(3/15) = 0.04751.
Two non-zero losses, one of size 1 and one of size 2 in either order: (1.575²/2.575³)(2)(7/15)(4/15) = 0.03616.
Three non-zero losses, each of size 1: (1.575³/2.575⁴)(7/15)³ = 0.00903.
Total probability: 0.04751 + 0.03616 + 0.00903 = 0.09270.
7.7. A. c(4) = 0.81553{s(1)c(3) + s(2)c(2) + s(3)c(1) + s(4)c(0)} =
(0.81553){(0.35)(0.09270) + (0.20)(0.09498) + (0.15)(0.11085) + (0.05)(0.38835)} = 0.07135.
Alternately, the non-zero losses are Geometric with β = (75%)(2.1) = 1.575,
and severity distribution: 35/75 = 7/15 @ 1, 4/15 @ 2, 3/15 @ 3, and 1/15 @ 4.
Ways to get an aggregate of 4:
One non-zero loss of size 4: (1.575/2.575²)(1/15) = 0.01584.
Two non-zero losses, one of size 1 and one of size 3 in either order: (1.575²/2.575³)(2)(7/15)(3/15) = 0.02712.
Two non-zero losses, each of size 2: (1.575²/2.575³)(4/15)² = 0.01033.
Three non-zero losses, two of size 1 and one of size 2 in any order: (1.575³/2.575⁴)(3)(7/15)²(4/15) = 0.01548.
Four non-zero losses, each of size 1: (1.575⁴/2.575⁵)(7/15)⁴ = 0.00258.
Total probability: 0.01584 + 0.02712 + 0.01033 + 0.01548 + 0.00258 = 0.07135.


7.8. E. c(5) = .81553 {s(1) c(4) + s(2)c(3) +s(3)c(2) + s(4)c(1) + s(5)c(0)} =


(.81553){(.35)(.07135) + (.20)(.09270) + (.15)(.09498) + (.05)(.11085) + (0)(.38835)} =
0.05162.
Comment: The aggregate distribution from 0 to 20 is:
0.38835, 0.110849, 0.094983, 0.0926988, 0.0713479, 0.0516245, 0.0415858, 0.0327984,
0.0253694, 0.0197833, 0.0154927, 0.0120898, 0.00943242, 0.00736622, 0.00575178,
0.0044901, 0.00350553, 0.00273696, 0.00213682, 0.00166827, 0.00130247.
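For those following along by computer, here is a sketch reproducing questions 7.3 through 7.8 (assuming the illustrative panjer_ab0 helper from Section 7): for a Geometric primary with β = 2.1, a = β/(1+β), b = 0, and P(z) = 1/{1 - β(z - 1)}.

beta = 2.1
s = [0.25, 0.35, 0.20, 0.15, 0.05]
c = panjer_ab0(a=beta / (1 + beta), b=0.0,
               pgf_primary=lambda z: 1.0 / (1.0 - beta * (z - 1.0)),
               s=s, n_max=5)
print([round(x, 5) for x in c])       # matches the densities listed above: 0.38835, 0.11085, 0.09498, ...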
7.9. E. From previous solutions, the mean and variance of the aggregate losses are: 2.94 and
15.57. Thus the probability that the aggregate losses are greater than 5 is approximately:
1 - Φ[(5.5 - 2.94)/√15.57] = 1 - Φ[0.65] = 25.8%.
Comment: Based on the previous solutions, the exact answer is 1 - 0.80985 = 19.0%.
7.10. B. From previous solutions, the mean and variance of the aggregate losses are: 2.94 and
15.57. The mean of a LogNormal is exp(μ + 0.5σ²). The second moment of a LogNormal is
exp(2μ + 2σ²). Therefore set: exp(μ + 0.5σ²) = 2.94 and exp(2μ + 2σ²) = 15.57 + 2.94².
1 + 15.57/2.94² = exp(2μ + 2σ²)/exp(2μ + σ²) = exp(σ²).
σ = √ln(2.8013) = 1.015. μ = ln(2.94 / exp(0.5(1.015²))) = 0.5634. Since the aggregate losses are
discrete, we apply a continuity correction; more than 5 corresponds to 5.5.
The probability that the aggregate losses are greater than 5 is approximately:
1 - Φ[(ln(5.5) - 0.5634)/1.015] = 1 - Φ[1.12] = 13.1%.
Comment: Based on the previous solutions, the exact answer is 1 - 0.80985 = 19.0%.
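A sketch of the moment-matching computation (my own illustration; the Normal CDF comes from Python's statistics module):

import math
from statistics import NormalDist

mean, var = 2.94, 15.57
sigma2 = math.log(1 + var / mean**2)            # since exp(sigma^2) = 1 + variance/mean^2
sigma = math.sqrt(sigma2)                       # 1.015
mu = math.log(mean) - 0.5 * sigma2              # 0.5634
prob = 1 - NormalDist().cdf((math.log(5.5) - mu) / sigma)
print(round(prob, 3))                           # about 0.131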
7.11. C. c(0) + c(1) + c(2) + c(3) = 0.38835 + 0.110849 + 0.094983 + 0.0926988 = 0.6868808 < 70%.
c(0) + c(1) + c(2) + c(3) + c(4) = 0.38835 + 0.110849 + 0.094983 + 0.0926988 + 0.0713479 = 0.7582287 ≥ 70%.
The 70th percentile is 4, the first value such that the distribution function is ≥ 70%.
7.12. D. P(z) = (1 + 0.3(z-1))^10. c(0) = P(s(0)) = P(0.2) = (1 + 0.3(0.2-1))^10 = 0.06429.
Alternately, the non-zero losses are Binomial with q = (80%)(0.3) = 0.24.
The only way to get an aggregate of 0 is to have no non-zero losses: (1 - 0.24)^10 = 0.06429.


7.13. A. For the Binomial, a = -q/(1-q) = -0.3/0.7 = -0.42857.
b = (m+1)q/(1-q) = 33/7 = 4.71429.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 0.92105 Σ_{j=1}^{x} (-0.42857 + 4.71429 j/x) s(j) c(x-j)
= 0.39474 Σ_{j=1}^{x} (-1 + 11 j/x) s(j) c(x-j).
c(1) = 0.39474(-1 + 11(1/1)) s(1) c(0) = 0.39474(10)(0.5)(0.06429) = 0.12689.
Alternately, the non-zero losses are Binomial with q = (80%)(0.3) = 0.24.
The severity distribution truncated to remove the zero losses is: 5/8 @ 1, 2/8 @ 2, and 1/8 @ 3.
The only way to get an aggregate of 1 is to have 1 non-zero loss of size 1:
10(1 - 0.24)⁹(0.24)(5/8) = 0.12689.
7.14. B. c(2) = 0.39474{(-1 + 11(1/2))s(1)c(1) + (-1 + 11(2/2))s(2)c(0)} =
0.39474{(4.5)(0.5)(0.12689) + (10)(0.2)(0.06429)} = 0.16345.
Alternately, in order to get an aggregate of 2 we have either 1 non-zero loss of size 2 or two
non-zero losses each of size 1: 10(1 - 0.24)⁹(0.24)(2/8) + 45(1 - 0.24)⁸(0.24)²(5/8)² = 0.16345.
7.15. B. c(3) = 0.39474{(-1 + 11(1/3))s(1)c(2) + (-1 + 11(2/3))s(2)c(1) + (-1 + 11(3/3))s(3)c(0)}
= 0.39474{(2.6667)(0.5)(0.16345) + (6.3333)(0.2)(0.12689) + (10)(0.1)(0.06429)} = 0.17485.
Alternately, in order to get an aggregate of 3 we have either 1 non-zero loss of size 3, two non-zero
losses of sizes 1 and 2 in either order, or three non-zero losses each of size 1:
10(1 - 0.24)⁹(0.24)(1/8) + 45(1 - 0.24)⁸(0.24)²(2)(5/8)(2/8) + 120(1 - 0.24)⁷(0.24)³(5/8)³ = 0.17485.
7.16. C. c(4) = .39474{(-1 + 11(1/4))s(1)c(3) + (-1 + 11(2/4))s(2)c(2) +
(-1 + 11(3/4))s(3)c(1) + (-1 + 11(4/4))s(4)c(0)} =
.39474{(1.75)(.5)(.17485) + (4.5)(.2)(.16345) + (7.25)(.1)(.12689) + (10)(0)(.06429)} =
0.15478.
7.17. B. c(5) = .39474{(-1 + 11(1/5))s(1)c(4) + (-1 + 11(2/5))s(2)c(3) +
(-1 + 11(3/5))s(3)c(2) + (-1 + 11(4/5))s(4)c(1) + (-1 + 11(5/5))s(5)c(0)} =
.39474{(1.2)(.5)(.15478) + (3.4)(.2)(.17485) + (5.6)(.1)(.16345)} = 0.11972.
Comment: Note that the terms involving s(4) and s(5) drop out, since s(4) = s(5) = 0.
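A sketch reproducing questions 7.12 through 7.17 (assuming the illustrative panjer_ab0 helper from Section 7): for a Binomial primary, a = -q/(1-q), b = (m+1)q/(1-q), and P(z) = {1 + q(z - 1)}^m.

m, q = 10, 0.3
s = [0.20, 0.50, 0.20, 0.10]
c = panjer_ab0(a=-q / (1 - q), b=(m + 1) * q / (1 - q),
               pgf_primary=lambda z: (1 + q * (z - 1.0))**m,
               s=s, n_max=5)
print([round(x, 5) for x in c])       # 0.06429, 0.12689, 0.16345, 0.17485, 0.15478, 0.11972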


7.18. C. & 7.19. A. We are thinning a Negative Binomial; the snowstorms that close the school are
also Negative Binomial, but with r = 5 and β = (0.3)(3) = 0.9.
f(0) = 1/(1 + 0.9)⁵ = 4.0%. f(1) = (5)(0.9)/(1 + 0.9)⁶ = 9.6%.
Alternately, the number of storms that close the school is a compound Negative Binomial-Bernoulli
Distribution. c(0) = Pp(s(0)) = Pp(0.7) = (1 - (3)(0.7 - 1))^(-5) = 1/1.9⁵ = 4.04%.
a = β/(1+β) = 3/4. b = (r-1)β/(1+β) = 3. Using the recursive method / Panjer Algorithm:
c(1) = {1/(1 - a s(0))}{(a + b)s(1)c(0)} = {1/(1 - (3/4)(0.7))}{(15/4)(0.3)(0.0404)} = 9.57%.
7.20. D. Apply the Panjer Algorithm.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 1.12676 Σ_{j=1}^{x} (0.75 + 3.75 j/x) s(j) c(x-j)
= 0.84507 Σ_{j=1}^{x} (1 + 5 j/x) s(j) c(x-j).
c(10) = 0.84507{(1 + 5(1/10))s(1)c(9) + (1 + 5(2/10))s(2)c(8) + (1 + 5(3/10))s(3)c(7) + (1 + 5(4/10))s(4)c(6)}
= 0.84507{(1.5)(0.3)(0.127617) + (2)(0.4)(0.107404) + (2.5)(0.1)(0.0875199) + (3)(0.05)(0.0695986)} = 0.14845.
Comment: Note that the terms involving s(5), s(6), etc., drop out, since in this case the severity is
such that s(x) = 0 for x > 4. Frequency follows a Negative Binomial Distribution with r = 6 and β = 3.
The aggregate distribution from 0 to 10 is: 0.0049961, 0.0075997, 0.0168763, 0.0250745,
0.0385867, 0.052303, 0.0695986, 0.0875199, 0.107404, 0.127617, 0.148454.
7.21. E. & 7.22. B. We are thinning a Poisson; the hurricanes that hit the continental United States
are also Poisson, but with λ = (0.15)(11) = 1.65.
f(0) = e^(-1.65) = 19.2%. f(1) = 1.65e^(-1.65) = 31.7%.
Alternately, the number of storms that hit the continental United States is a compound
Poisson-Bernoulli Distribution.
c(0) = Pp(s(0)) = Pp(0.85) = exp[(11)(0.85 - 1)] = e^(-1.65) = 19.205%.
a = 0. b = λ = 11. Using the recursive method / Panjer Algorithm:
c(1) = {1/(1 - a s(0))}{(a + b)s(1)c(0)} = {1}{(11)(0.15)(0.19205)} = 31.688%.


7.23. D. For the Primary Geometric, P(z) = 1/{1 - β(z-1)} = 1/(2.7 - 1.7z).
The secondary Poisson has density at zero of e^(-λ) = e^(-3.1). The density of the compound distribution
at zero is the p.g.f. of the primary distribution at e^(-3.1): 1/{2.7 - 1.7e^(-3.1)} = 0.3812.
7.24. C. For the Primary Geometric, a = β/(1+β) = 1.7/2.7 = 0.62963 and b = 0.
The secondary Poisson has density at zero of e^(-λ) = e^(-3.1) = 0.045049.
1/(1 - a s(0)) = 1/{1 - (0.62963)(0.045049)} = 1.02919. Use the Panjer Algorithm,
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 1.02919 Σ_{j=1}^{x} 0.62963 s(j) c(x-j) = 0.64801 Σ_{j=1}^{x} s(j) c(x-j).
c(1) = 0.64801 s(1) c(0) = (0.64801)(0.139653)(0.3812) = 0.03450.
Alternately, the compound distribution is one if and only if the Geometric is n ≥ 1, and of the resulting
n Poissons one is 1 and the rest are 0.
c(1) = Σ_{n=1}^∞ Prob[Geometric = n] n Prob[Poisson = 1] Prob[Poisson = 0]^(n-1) =
Σ_{n=1}^∞ {(1.7/2.7)^n / 2.7} n (3.1e^(-3.1)) (e^(-3.1))^(n-1) = (3.1/2.7) Σ_{n=1}^∞ n (e^(-3.1) 1.7/2.7)^n =
(3.1/2.7){0.0283643 + 0.0016091 + 0.0000685 + 0.0000026 + 0.0000001 + ...} = 0.03450.


Comment: The densities of the secondary Poisson Distribution with λ = 3.1 are:
n      0          1          2          3          4
s(n)   0.045049   0.139653   0.216461   0.223677   0.173350
The formula for the Panjer Algorithm simplifies a little since for the Geometric b = 0.
7.25. D. Use the Panjer Algorithm, c(2) = .64801 {s(1) c(1) + s(2)c(0)} =
(.64801){(.139653)(.03450)+(.216461)(.3812)} = 0.05659.
7.26. E. Use the Panjer Algorithm, c(3) = .64801 {s(1) c(2) + s(2)c(1) +s(3)c(0)} =
(.64801){(.139653)(.05659) +(.216461)(.03450) + (.223677)(.3812) } = 0.06521.


7.27. C. c(4) = .64801 {s(1) c(3) + s(2)c(2) +s(3)c(1) + s(4)c(0)} =


(.64801){(.139653)(.06521) +(.216461)(.05659) + (.223677)(.03450) + (.173350)(.3812) } =
0.06166.
7.28. B. c(0) + c(1) + c(2) = 0.3812 + 0.03450 + 0.05659 = 0.47229 < 50%.
c(0) + c(1) + c(2) + c(3) = 0.3812 + 0.03450 + 0.05659 + 0.06521 = 0.5375 ≥ 50%.
The median is 3, the first value such that the distribution function is ≥ 50%.
7.29. E. Apply the Panjer Algorithm.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 0.92105 Σ_{j=1}^{x} (-0.42857 + 4.71429 j/x) s(j) c(x-j)
= 0.39474 Σ_{j=1}^{x} (-1 + 11 j/x) s(j) c(x-j).
c(13) = 0.39474{(-1 + 11(1/13))s(1)c(12) + (-1 + 11(2/13))s(2)c(11) + (-1 + 11(3/13))s(3)c(10)}
= 0.39474{(-0.15385)(0.5)(0.00157109) + (0.69231)(0.2)(0.00364884) + (1.53846)(0.1)(0.00792610)} = 0.00063307.
Comment: Terms involving s(4), s(5), etc., drop out, since in this case the severity is such that
s(x) = 0, for x > 3. Frequency follows a Binomial Distribution with m = 10 and q = 0.3.


7.30. E. The secondary Binomial has density at zero of (1-q)^m = 0.9⁴ = 0.6561. The density of the
compound distribution at zero is the p.g.f. of the primary Poisson distribution at 0.6561:
exp[1.2(0.6561 - 1)] = 0.66187.
For the Primary Poisson a = 0 and b = λ = 1.2. 1/(1 - a s(0)) = 1. Use the Panjer Algorithm,
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 1.2 Σ_{j=1}^{x} (j/x) s(j) c(x-j).
c(1) = (1.2)(1/1) s(1) c(0) = (1.2){(4)(0.9³)(0.1)}(0.66187) = 0.23160.
Alternately, the p.g.f. of the compound distribution is:
P(z) = exp(1.2({1 + 0.1(z-1)}⁴ - 1)). P(0) = exp(1.2({1 + 0.1(0-1)}⁴ - 1)) = 0.66187.
P'(z) = P(z)(1.2)(4)(0.1)(1 + 0.1(z-1))³.
P'(0) = P(0)(0.48)(1 + 0.1(0-1))³ = (0.66187)(0.48)(0.9³) = 0.23160.
f(n) = (d^n P(z)/dz^n at z=0) / n!, so that f(1) = P'(0) = 0.23160.
Comment: Alternately, think of the Primary Poisson Distribution as the number of accidents, while the
secondary Binomial represents the number of claims on each accident. The only way for the
compound distribution to be one is if all but one accident has zero claims and the remaining
accident has 1 claim.
For example, the chance of 3 accidents is: 1.2³e^(-1.2)/3! = 0.086744.
The chance of an accident having no claims is: 0.9⁴ = 0.6561. The chance of an accident having 1 claim is:
(4)(0.9³)(0.1) = 0.2916. Thus if one has 3 accidents, the chance that 2 accidents are for zero and 1
accident is 1 is: (3)(0.2916)(0.6561²) = 0.37657. Thus the chance that there are 3 accidents and they
sum to 1 is: (0.086744)(0.37657) = 0.03267. Summing over all the possible numbers of accidents
gives a density at one of the compound distribution of 0.23160:
Number of Accidents   Poisson    Chance of all but one at 0 claims and one at 1 claim    Chance of 1 claim in Aggregate
0                     0.30119    0.00000                                                 0.00000
1                     0.36143    0.29160                                                 0.10539
2                     0.21686    0.38264                                                 0.08298
3                     0.08674    0.37657                                                 0.03267
4                     0.02602    0.32943                                                 0.00857
5                     0.00625    0.27017                                                 0.00169
6                     0.00125    0.21271                                                 0.00027
7                     0.00021    0.16282                                                 0.00003
8                     0.00003    0.12209                                                 0.00000
Sum                   1.00000                                                            0.23160


7.31. A. Prob[S ≤ 75] = Prob[N=0] + Prob[N=1] +
Prob[N=2](1 - Prob[30,50 or 50,30 or 50,50]) + Prob[N=3] Prob[3@20, or 2@20 and 1@30] =
e^(-2) + 2e^(-2) + 2e^(-2){1 - (2)(0.3)(0.2) - 0.2²} + (4e^(-2)/3){0.5³ + (3)(0.5²)(0.3)} = 5.1467e^(-2) = 0.6965.
Prob[S > 75] = 1 - .6965 = 30.35%.
Alternately, use the Panjer Algorithm, in units of 10:
For the Poisson Distribution, a = 0 and b = = 2.
c(0) = P(s(0)) = P(0) = e2(0-1) = e-2 = .135335.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (2/x) Σ_{j=1}^{x} j s(j) c(x-j).
c(1) = (2/1)(1) s(1) c(0) = 0.


c(2) = (2/2){(1)s(1)c(1) + (2)s(2)c(0)} = {0 + (2)(.5)(.135335)} = .135335.
c(3) = (2/3){(1)s(1)c(2) + (2)s(2)c(1) + (3)s(3)c(0)} = (2/3){0 + 0 + (3)(.3)(.135335)} = .081201.
c(4) = (2/4){(1)s(1)c(3) + (2)s(2)c(2) + (3)s(3)c(1) + (4)s(4)c(0)} =
0.5{0 + (2)(.5)(.135335) + 0 + 0} = .067668.
c(5) = (2/5){(1)s(1)c(4) + (2)s(2)c(3) + (3)s(3)c(2) + (4)s(4)c(1) + (5)s(5)c(0)} =
0.4{0 + (2)(.5)(.081201) + (3)(.3)(.135335) + 0 + (5)(.2)(.135335)} = .135335.
c(6) = (2/6){(1)s(1)c(5) + (2)s(2)c(4) + (3)s(3)c(3) + (4)s(4)c(2) + (5)s(5)c(1) + (6)s(6)c(0)} =
(1/3){0 + (2)(.5)(.067668) + (3)(.3)(.081201) + 0 + 0 + 0} = .046916.
c(7) = (2/7){(1)s(1)c(6) + (2)s(2)c(5) + (3)s(3)c(4) + (4)s(4)c(3) + (5)s(5)c(2) + (6)s(6)c(1) +
(7)s(7)c(0)} =
(2/7){0 + (2)(.5)(.135335) + (3)(.3)(.067668) + 0 + (5)(.2)(.135335) + 0 + 0} = .094735.
c(0) + c(1) + c(2) + c(3) + c(4) + c(5) + c(6) + c(7) = .696525.
1 - .696525 = 30.35%.
Alternately, use convolutions:
n:                 0        1        2        3
Poisson density:   0.1353   0.2707   0.2707   0.1804
x     p*0    p      p*p     p*p*p    Aggregate Density    Aggregate Distribution
0     1                               0.1353                0.1353
10                                    0.0000                0.1353
20           0.5                      0.1353                0.2707
30           0.3                      0.0812                0.3519
40                  0.25              0.0677                0.4195
50           0.2    0.30              0.1353                0.5549
60                  0.09    0.125     0.0469                0.6018
70                  0.20    0.225     0.0947                0.6965
Sum                 0.84    0.35
1 - 0.6965 = 30.35%.


7.32. For the Negative Binomial Distribution, a = β/(1+β) = 4/5 = 0.8, b = (r - 1)β/(1+β) = -0.56,
and P(z) = 1/{1 - β(z - 1)}^r = 1/{1 - 4(z - 1)}^0.3 = 1/{5 - 4z}^0.3.
c(0) = Pp(s(0)) = Pp(0) = 1/5^0.3 = 0.6170339.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = Σ_{j=1}^{x} (0.8 - 0.56 j/x) s(j) c(x-j).
c(1) = (0.8 - 0.56)s(1)c(0) = (0.24)(0.5)(0.6170339) = 0.0740441.
c(2) = (0.8 - 0.56/2)s(1)c(1) + (0.8 - 0.56(2/2))s(2)c(0) = (0.52)(0.5)(0.0740441) + (0.24)(0.3)(0.6170339)
= 0.0636779.
c(3) = (0.8 - 0.56/3)s(1)c(2) + (0.8 - 0.56(2/3))s(2)c(1) + (0.8 - 0.56(3/3))s(3)c(0)
= (0.613333)(0.5)(0.0636779) + (0.426667)(0.3)(0.0740441) + (0.24)(0.2)(0.6170339) = 0.0586232.
c(4) = (0.8 - 0.56/4)s(1)c(3) + (0.8 - 0.56(2/4))s(2)c(2) + (0.8 - 0.56(3/4))s(3)c(1) + (0.8 - 0.56(4/4))s(4)c(0)
= (0.66)(0.5)(0.0586232) + (0.52)(0.3)(0.0636779) + (0.38)(0.2)(0.0740441) + (0.24)(0)(0.6170339) = 0.0349068.
Continuing in this manner produces the following densities for the compound distribution from zero to
twenty: 0.617034, 0.0740441, 0.0636779, 0.0586232, 0.0349067, 0.0280473, 0.0224297,
0.0173693, 0.0140905, 0.0114694, 0.00937036, 0.00773602, 0.00641437,
0.00534136, 0.00446732, 0.00374844, 0.00315457, 0.00266188, 0.00225134,
0.00190808, 0.0016202.
The corresponding distribution functions from zero to twenty are: 0.617034, 0.691078, 0.754756,
0.813379, 0.848286, 0.876333, 0.898763, 0.916132, 0.930223, 0.941692, 0.951062,
0.958798, 0.965213, 0.970554, 0.975021, 0.97877, 0.981924, 0.984586, 0.986838, 0.988746,
0.990366.
Thus Underdog requires 20 pills to be 99% certain he will not run out during the week.
Comment: This compound distribution has a long righthand tail. Therefore, one would not get the
same result if one used the Normal Approximation. Note that since the secondary distribution has
only three non-zero densities, each recursion involves summing at most three non-zero terms.
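Since the question says to use a computer, here is a sketch of the full calculation (assuming the illustrative panjer_ab0 helper from Section 7): a Negative Binomial primary with r = 0.3 and β = 4, so a = β/(1+β), b = (r - 1)β/(1+β), and P(z) = {1 - β(z - 1)}^(-r).

r, beta = 0.3, 4.0
s = [0.0, 0.5, 0.3, 0.2]              # pills needed per crisis
c = panjer_ab0(a=beta / (1 + beta), b=(r - 1) * beta / (1 + beta),
               pgf_primary=lambda z: (1.0 - beta * (z - 1.0))**(-r),
               s=s, n_max=40)
cdf = 0.0
for x, density in enumerate(c):
    cdf += density
    if cdf >= 0.99:
        print(x)                      # 20 pills
        break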
7.33. Using the Panjer algorithm, for the Poisson a = 0 and b = λ = 3.
c(0) = P(s(0)) = P(0) = e^(3(0-1)) = e^(-3) = 0.04979.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (3/x) Σ_{j=1}^{x} j s(j) c(x-j).
c(1) = (3/1)(1) s(1) c(0) = (3){(1)(0.6)(0.04979)} = 0.08962.
c(2) = (3/2){(1)s(1)c(1) + (2)s(2)c(0)} = (3/2){(1)(0.6)(0.08962) + (2)(0.3)(0.04979)} = 0.1255.
Alternately, if the aggregate losses are 2, then there is either one claim of size 2 or two claims each
of size 1. This has probability: (3e^(-3))(0.3) + (3²e^(-3)/2)(0.6²) = 0.04481 + 0.08066 = 0.1255.


7.34. Using the Panjer algorithm, for the Poisson a = 0 and b = λ = 0.6.
c(0) = P(s(0)) = P(0) = e^(0.6(0-1)) = e^(-0.6) = 0.54881.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (0.6/x) Σ_{j=1}^{x} j s(j) c(x-j).
c(1) = (0.6/1)(1) s(1) c(0) = (0.6){(1)(0.25)(0.54881)} = 0.08232.
c(2) = (0.6/2){(1)s(1)c(1) + (2)s(2)c(0)} = (0.3){(1)(0.25)(0.08232) + (2)(0.35)(0.54881)} = 0.12142.
c(3) = (0.6/3){(1)s(1)c(2) + (2)s(2)c(1) + (3)s(3)c(0)} =
(0.2){(1)(0.25)(0.12142) + (2)(0.35)(0.08232) + (3)(0.40)(0.54881)} = 0.14931.
Comment: For example, one could instead calculate the probability of the aggregate losses being
three as: Prob[1 claim @ 3] + Prob[2 claims of sizes 1 and 2] + Prob[3 claims each @ 1] =
(0.4)(0.6e^(-0.6)) + (2)(0.25)(0.35)(0.6²e^(-0.6)/2) + (0.25³)(0.6³e^(-0.6)/6) = 0.1493.


7.35. D. For the aggregate losses to be 4, one can have either 2 claims each of size 2,
3 claims of which 2 are size 1 and one is size 2 (there are 3 ways to order the claim sizes),
or 4 claims each of size one.
Thus fS(4) = (2²e^(-2)/2)(0.6²) + (2³e^(-2)/6)((3)(0.4²)(0.6)) + (2⁴e^(-2)/24)(0.4⁴) = 1.121e^(-2) = 0.152.
Alternately, use the Panjer Algorithm (recursive method): For the Poisson a = 0 and b = λ = 2.
c(0) = P(s(0)) = P(0) = e^(2(0-1)) = e^(-2) = 0.13534.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (2/x) Σ_{j=1}^{x} j s(j) c(x-j).
c(1) = (2/1)(1) s(1) c(0) = (2/1){(1)(0.4)(0.13534)} = 0.10827.
c(2) = (2/2){(1)s(1)c(1) + (2)s(2)c(0)} = (0.4)(0.10827) + (2)(0.6)(0.13534) = 0.20572.
c(3) = (2/3){s(1)c(2) + 2s(2)c(1) + 3s(3)c(0)} = (2/3){(0.4)(0.20572) + (2)(0.6)(0.10827) + 0} = 0.14147.
c(4) = (2/4){s(1)c(3) + 2s(2)c(2) + 3s(3)c(1) + 4s(4)c(0)} = (2/4){(0.4)(0.14147) + (2)(0.6)(0.20572) + 0 + 0} = 0.1517.
Alternately, weight together convolutions of the severity distribution:
(0.1353)(0) + (0.2707)(0) + (0.2707)(0.36) + (0.1804)(0.288) + (0.0902)(0.0256) = 0.1517.
n:                 0        1        2        3        4
Poisson density:   0.1353   0.2707   0.2707   0.1804   0.0902
x    p*0    p      p*p      p*p*p    p*4      Aggregate Density
0    1                                         0.135335
1           0.4                                0.108268
2           0.6    0.16                        0.205710
3                  0.48     0.064              0.141470
4                  0.36     0.288    0.0256    0.151720
5                           0.432    0.1536
6                           0.216    0.3456
7                                    0.3456
8                                    0.1296
Comment: Since we only want the density at 4, and do not need the densities at 0, 1, 2, and 3 in
order to answer this question, the Panjer Algorithm involves more computation in this case.


7.36. The secondary Poisson has density at zero of e^(-λ) = e^(-0.5) = 0.6065.
The densities of the secondary Poisson Distribution with λ = 0.5 are:
n      0        1        2        3        4        5
s(n)   0.6065   0.3033   0.0758   0.0126   0.0016   0.0002
The density of the compound distribution at zero is the p.g.f. of the primary Geometric distribution,
P(z) = 1/{1 - β(z-1)}, at z = e^(-0.5): 1/{4 - 3e^(-0.5)} = 0.4586.
For the Primary Geometric, a = β/(1+β) = 3/4 = 0.75 and b = 0.
1/(1 - a s(0)) = 1/{1 - (0.75)(0.6065)} = 1.8345.
Use the Panjer Algorithm:
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 1.8345 Σ_{j=1}^{x} 0.75 s(j) c(x-j) = 1.3759 Σ_{j=1}^{x} s(j) c(x-j).
c(1) = 1.3759 s(1) c(0) = (1.3759)(0.3033)(0.4586) = 0.1914.
c(2) = 1.3759{s(1)c(1) + s(2)c(0)} = (1.3759){(0.3033)(0.1914) + (0.0758)(0.4586)} = 0.1277.
c(3) = 1.3759{s(1)c(2) + s(2)c(1) + s(3)c(0)} =
(1.3759){(0.3033)(0.1277) + (0.0758)(0.1914) + (0.0126)(0.4586)} = 0.0812.
The chance of 4 or more claims in a year is 1 - (c(0) + c(1) + c(2) + c(3)) =
1 - (0.4586 + 0.1914 + 0.1277 + 0.0812) = 0.1411.
Comment: Long! Using the Normal Approximation, one would proceed as follows.
The expected number of losses per year is:
(mean of Geometric)(mean of Poisson) = (3)(0.5) = 1.5.
The variance of the compound frequency distribution is:
(mean of Poisson)²(variance of Geometric) + (mean of Geometric)(variance of Poisson)
= λ²β(1+β) + βλ = 3 + 1.5 = 4.5. Thus the chance of more than 3 losses is approximately:
1 - Φ[(3.5 - 1.5)/√4.5] = 1 - Φ[0.94] = 1 - 0.8264 = 0.1736. Due to the skewness of the
compound frequency distribution, the approximation is not particularly good.


7.37. The expected number of losses per year is:


(mean of Geometric)(Mean of Poisson) = (3)(0.5) = 1.5. Thus the expected annual aggregate
losses are: (100)(1.5) = 150. Since each loss is of size 100, if one has 4 or more losses, then the
aggregate losses are greater than 400. Therefore, the expected losses limited to 400:
0f(0) + 100f(1) + 200f(2) + 300f(3) + 400{1-(f(0)+f(1)+f(2)+f(3))} =
100{4 - 4f(0) - 3f(1) -2f(2) - f(3)} = 100{4 - 4(.4586) - 3(.1914) -2(.1277) - .0812} = 125.48.
Therefore, the expected losses excess of 400 are: 150 - 125.48 = 24.52.
Comment: Uses the intermediate results of the previous question. Since severity is constant, this
question is basically about the frequency. The question does not specify that it wants expected
annual excess losses.
7.38. E. For the geometric distribution with β = 4, P(z) = 1/(1 - β(z-1)) = 1/(5 - 4z).
a = β/(1 + β) = 0.8, b = 0. Using the Panjer algorithm, c(0) = Pf(s(0)) = P(0) = 0.2.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = 0.8 Σ_{j=1}^{x} s(j) c(x-j).
c(1) = 0.8 s(1) c(0) = (0.8)(1/4)(0.2) = 0.04.
c(2) = 0.8{s(1)c(1) + s(2)c(0)} = (0.8){(1/4)(0.04) + (1/4)(0.2)} = 0.048.
c(3) = 0.8{s(1)c(2) + s(2)c(1) + s(3)c(0)} = (0.8){(1/4)(0.048) + (1/4)(0.04) + (1/4)(0.2)} = 0.0576.
Distribution of aggregate at 3 is: 0.2 + 0.04 + 0.048 + 0.0576 = 0.3456.
Alternately, one can use semi-organized reasoning.
For the Geometric with β = 4: f(0) = 1/5 = 0.2, f(1) = 0.8f(0) = 0.16,
f(2) = 0.8f(1) = 0.128, f(3) = 0.8f(2) = 0.1024.
The ways in which the aggregate is ≤ 3:
0 claims: 0.2. 1 claim of size ≤ 3: (3/4)(0.16) = 0.12.
2 claims of sizes 1 & 1, 1 & 2, or 2 & 1: (3/16)(0.128) = 0.024.
3 claims of sizes 1 & 1 & 1: (1/64)(0.1024) = 0.0016.
Distribution of aggregate at 3 is: 0.2 + 0.12 + 0.024 + 0.0016 = 0.3456.
Alternately, using convolutions:
n:                    0       1       2       3
Geometric density:    0.200   0.160   0.128   0.102
x    f*0    f       f*f      f*f*f    Aggregate Density
0    1                                0.2000
1           0.25                      0.0400
2           0.25    0.062             0.0480
3           0.25    0.125    0.0156   0.0576
Distribution of aggregate at 3 is: 0.2 + 0.04 + 0.048 + 0.0576 = 0.3456.


7.39. E. One can thin the Geometric Distribution.
The non-zero claims are Geometric with β = (5)(4/5) = 4.
The size distribution for the non-zero claims is: P(X = x) = 0.25, x = 1, 2, 3, 4.
Only the non-zero claims contribute to the aggregate distribution.
Thus this question has the same solution as the previous question, 3, 11/02, Q.36.
Distribution of aggregate at 3 is: 0.3456.
Comment: In general, thinning the frequency to only consider the non-zero claims can simplify the
use of convolutions or semi-organized reasoning. Thinning a Binomial affects q.
Thinning a Poisson affects λ. Thinning a Negative Binomial affects β.
7.40. E. For XYZ to pay out more than it receives, the aggregate has to be > 250 prior to the
application of the aggregate deductible. After the 500 per event deductible, the severity distribution
is: 0 @ 35%, 300 or more @ 65%.
Thus XYZ pays out more than it receives if and only if XYZ makes at least one nonzero payment.
Nonzero payments are Poisson with mean: (65%)(0.15) = 0.0975.
Probability of at least one nonzero payment is: 1 - e^(-0.0975) = 9.29%.
Alternately, for the aggregate distribution after the per event deductible,
using the Panjer Algorithm, c(0) = P(s(0)) = exp[0.15(0.35 - 1)] = 0.9071. 1 - 0.9071 = 9.29%.
Comment: The aggregate deductible applies after the per event deductible is applied.


7.41. B. Some densities of the Poisson frequency are: f(0) = e^(-3) = 0.0498, f(1) = 3e^(-3) = 0.1494,
f(2) = 3²e^(-3)/2 = 0.2240, f(3) = 3³e^(-3)/6 = 0.2240.
Ways in which the aggregate can be less than or equal to 3:
no claims: 0.0498.
1 claim of size less than 4: (0.1494)(0.9) = 0.1345.
2 claims of sizes 1 and 1, 1 and 2, 2 and 1: (0.2240){(0.4)(0.4) + (0.4)(0.3) + (0.3)(0.4)} = 0.0896.
3 claims each of size 1: (0.2240)(0.4³) = 0.0143.
The sum of these probabilities is: 0.0498 + 0.1345 + 0.0896 + 0.0143 = 0.2882.
Alternately, using the Panjer algorithm, for the Poisson a = 0 and b = λ = 3. P(z) = exp[λ(z-1)].
c(0) = P(s(0)) = P(0) = e^(3(0-1)) = e^(-3) = 0.04979.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (3/x) Σ_{j=1}^{x} j s(j) c(x-j).
c(1) = (3/1)(1) s(1) c(0) = (3){(1)(0.4)(0.04979)} = 0.05975.
c(2) = (3/2){(1)s(1)c(1) + (2)s(2)c(0)} = (1.5){(1)(0.4)(0.05975) + (2)(0.3)(0.04979)} = 0.08066.
c(3) = (3/3){(1)s(1)c(2) + (2)s(2)c(1) + (3)s(3)c(0)}
= (1)(0.4)(0.08066) + (2)(0.3)(0.05975) + (3)(0.2)(0.04979) = 0.09799.
The sum of the densities of the aggregate at 0, 1, 2, and 3 is:
0.04979 + 0.05975 + 0.08066 + 0.09799 = 0.2882.


Section 8, Recursive Method / Panjer Algorithm, Advanced104


Additional items related to the Recursive Method / Panjer Algorithm will be discussed.
Aggregate Distribution, when Frequency is a Compound Distribution:105
Assume frequency is a Compound Geometric-Poisson Distribution with = 0.8 and = 1.3.
Let severity have density: s(0) = 60%, s(1) = 10%, s(2) = 25%, and s(3) = 5%.
Then we can use the Panjer Algorithm twice, in order to compute the Aggregate Distribution.
First we use the Panjer Algorithm to calculate the density of an aggregate distribution with a Poisson
frequency with λ = 1.3, and this severity. As computed in the previous section, this aggregate
distribution is:106 0.594521, 0.0772877, 0.198243, 0.06398, ...
These then are used as the secondary distribution in the Panjer algorithm, together with a Geometric
with β = 0.8 as the primary distribution.
For the primary Geometric, P(z) = 1/{1 - β(z-1)} = 1/(1.8 - 0.8z), a = β/(1+β) = 0.8/1.8 = 0.4444444,
and b = 0.
c(0) = Pp (s(0)) = 1/{1.8 - (0.8)(0.594521)} = 0.7550685.
1/{1 - as(0)} = 1/{1- (0.4444444)(0.594521)} = 1.359123.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (1.359123)(0.4444444) Σ_{j=1}^{x} s(j) c(x-j)
= 0.604055 Σ_{j=1}^{x} s(j) c(x-j).
c(1) = 0.604055 s(1) c(0) = (0.604055)(0.0772877)(0.7550685) = 0.035251.


c(2) = 0.604055{s(1)c(1)+s(2)c(0)} =
(0.604055){(0.0772877)(0.035251) +(0.198243)(0.7550685)} = 0.092065.
c(3) = .604055 {s(1)c(2) + s(2)c(1) + s(3)c(0)} =
(0.604055){(0.0772877)(0.092065) + (0.198243)(0.035251) + (0.06398)(0.7550685)} =
0.0377009.
One could calculate c(4), c(5), c(6), .... , in a similar manner.
104 See Section 9.6 of Loss Models.
105 See Section 9.6.1 of Loss Models, not on the syllabus.
106 The densities at 0, 1, 2, and 3 were computed, while the densities from 4 through 10 were displayed.
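Here is a Python sketch of the "apply the Panjer Algorithm twice" idea, using the same parameters as the example above (inner Poisson with λ = 1.3, severity s(0) = 0.60, s(1) = 0.10, s(2) = 0.25, s(3) = 0.05, and an outer Geometric with β = 0.8). The helper function is illustrative only.

import math

def panjer(a, b, c0, sev, x_max):
    # (a, b, 0) Panjer recursion with c(0) = c0 supplied by the caller.
    c = [c0]
    for x in range(1, x_max + 1):
        total = sum((a + j * b / x) * sev[j] * c[x - j]
                    for j in range(1, min(x, len(sev) - 1) + 1))
        c.append(total / (1 - a * sev[0]))
    return c

lam, beta = 1.3, 0.8
sev = [0.60, 0.10, 0.25, 0.05]

# Step 1: compound Poisson with this severity.
inner = panjer(a=0.0, b=lam, c0=math.exp(lam * (sev[0] - 1.0)), sev=sev, x_max=3)
# Step 2: use the step-1 densities as the "severity" for the Geometric primary.
a_geo = beta / (1 + beta)
outer = panjer(a=a_geo, b=0.0, c0=1.0 / (1 + beta - beta * inner[0]), sev=inner, x_max=3)
print(inner)   # compare with 0.594521, 0.0772877, 0.198243, 0.06398 above
print(outer)   # compare with 0.7550685, 0.035251, 0.092065, 0.0377009 above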

Practical Issues:
Loss Models mentions a number of concerns that may arise in practical applications of the recursive
method/ Panjer Algorithm.
Whenever one uses recursive techniques, one has to be concerned about the propagation of
rounding errors. Small errors can compound at each stage and become very significant.107 While the
chance of this occurring can be minimized by keeping as many significant digits as possible, in
general the chance can not be eliminated.
In the case of the Panjer Algorithm, one is particularly concerned about the calculated right hand tail of
the aggregate distribution. In the case of a Poisson or Negative Binomial frequency distribution, the
relative errors in the tail of the aggregate distribution do not grow quickly; the algorithm is numerically
stable.108 However, in the case of a Binomial frequency, in rare cases the errors in the right hand tail
will blow up.109
Exercise: Aggregate losses are compound Poisson with λ = 1000. There is a 5% chance that the
size of a loss is zero. What is the probability that the aggregate losses are 0?
[Solution: P(A=0) = PN(fX(0)) = exp(1000(fX(0) - 1)) = exp(-950) = 2.6 x 10^-413.]
Thus for this case, the probability of the aggregate losses being zero is an extremely small number.
Depending on the computer and software used, e^-950 may not be distinguishable from zero. If this
value is represented as zero, then the results of the Panjer Algorithm would be complete
nonsense.110
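A quick way to see the concern in practice: in IEEE double precision (the ordinary Python float), exp(-950) underflows to exactly zero.

import math
print(math.exp(-700))   # about 1e-304: still representable as a float
print(math.exp(-950))   # 0.0 -- the true value 2.6 x 10^-413 underflows to zero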
Exercise: Assume in the above exercise, the probability of aggregate losses at zero, c(0) is
mistakenly taken as zero. What is the aggregate distribution calculated by the Panjer algorithm?
[Solution: c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j).
Thus c(1) = {1/(1 - as(0))} (a + b/1) s(1) c(0) = 0.


Then, c(2) = {1/(1 - as(0))} {(a + b/2) s(1) c(1) + (a + 2b/2)s(2)c(0)} = 0.
In a similar manner, the whole aggregate distribution would be calculated as zero.]
107 This is a particular concern when one is applying the recursion formula many times. While on a typical exam question one would apply the recursion formula at most 4 times, in a practical application one could apply it thousands of times.
108 See Section 9.6.3 in Loss Models.
109 In this case, the calculated probabilities will alternate sign. Of course the probabilities are actually nonnegative.
110 Of course with such a large expected frequency, it is likely that the Normal, LogNormal or other approximation to the aggregate losses may be a superior technique to using the Panjer algorithm.

So taking c(0) = 0, rather than the correct c(0) = 2.6 x 10^-413, would defeat the whole purpose of
using the Panjer algorithm. Thus we see that it is very important when applying the Panjer Algorithm
to such situations either to carefully distinguish between extremely small numbers and zero, or to be
a little clever in applying the algorithm.
Exercise: Aggregate losses are compound Poisson with λ = 1000. The severity distribution is:
f(0) = 5%, f(1) = 75%, and f(2) = 20%. What are the mean and variance of the aggregate losses?
[Solution: The mean of the severity is 1.15. The second moment of the severity is 1.55.
Therefore, the mean of the aggregate losses is 1150 and the variance of the aggregate losses is:
(1000)(1.55) = 1550.]
Thus in this case the mean of the aggregate losses minus 6 standard deviations is:
1150 - 6√1550 = 914. In general, we expect there to be extremely little probability more than 6
standard deviations below the mean.111
One could take c(x) = 0 for x ≤ 913, and c(914) = 1; basically we start the algorithm at 914. Then
when we apply the algorithm, the distribution of aggregate losses will not sum to unity, since we
arbitrarily chose c(914) = 1. However, at the end we can add up all of the calculated densities and
divide by the sum, in order to normalize the distribution of aggregate losses.
Exercise: Assume the aggregate losses have a mean of 100 and standard deviation of 5.
Explain how you would apply the Panjer algorithm.
[Solution: One assumes there will be very little probability below 100 - (6)(5) = 70. Thus we take
c(x) = 0 for x < 70, and c(70) = 1. Then we apply the Panjer algorithm starting at 70; we calculate,
c(71), c(72), c(73), ... , c(130). Then we sum up the probabilities. Perhaps they sum to 1,617,012.
Then we would divide each of these calculated values by 1,617,012.]
Another way to solve this potential problem, would be first to perform the calculation for
λ = 1000/128 = 7.8125 rather than λ = 1000.112 Let g(x) be the result of performing the Panjer algorithm
with λ = 7.8125. Then the desired distribution of aggregate losses, corresponding to λ = 1000, can
be obtained as the 128-fold convolution g*128. Note that we can power-up the convolutions by successively taking
convolutions. For example, (g*8) * (g*8) = (g*16), and then in turn
(g*16) * (g*16) = (g*32). In this manner we need only perform 7 convolutions in order to get the
2^7 = 128th convolution. This technique relies on the property that the sum of independent, identically
distributed compound Poisson distributions is another compound Poisson distribution.113
111 Φ(-6) = 9.87 x 10^-10. Loss Models in Section 9.6.2, suggests starting at 6 standard deviations below the mean.
112 One would pick some sufficiently large power of 2, such as for example 128.
113 See a Mahler's Guide to Frequency Distributions.

Since the compound Negative Binomial shares the same property, one can apply a similar
technique.
Exercise: Assume you have a Compound Negative Binomial with β = 20 and r = 30.
How might you use the Panjer Algorithm to calculate the distribution of aggregate losses?
[Solution: One could apply the Panjer Algorithm to a Compound Negative Binomial with
β = 20 and r = 30/32 = 0.9375, and then take the 32nd convolution of the result.]
For the Binomial, since the m parameter has to be integer, one has to modify the technique slightly.
Exercise: Assume you have a Compound Binomial with q = 0.6 and m = 592. How might you use
the Panjer Algorithm to calculate the distribution of aggregate losses?
[Solution: One could apply the Panjer Algorithm to a Compound Binomial with
q = 0.6 and m = 1, and then take the 592nd convolution of the result.
Comment: One could get the 2^9 = 512th convolution relatively quickly and then convolute that with
the 80th convolution. In base 2, 592 is written as 1001010000. Therefore, in order to get the
592nd convolution, one would retain the 512th, 64th and 16th convolutions, and convolute
them at the end.]
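Here is a Python sketch of the convolution "doubling" idea, written for an arbitrary discrete density; the three-point density used below is purely illustrative. Following the binary representation of the desired power, only on the order of 2 log2(n) convolutions are needed.

def convolve(f, g):
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def convolution_power(f, n):
    # n-fold convolution of the density f with itself, by repeated squaring.
    result = [1.0]                 # the 0-fold convolution: a point mass at 0
    power = f
    while n > 0:
        if n % 2 == 1:             # this bit of n is set: fold this power in
            result = convolve(result, power)
        n //= 2
        if n > 0:
            power = convolve(power, power)
    return result

g = [0.3, 0.5, 0.2]                # an illustrative discrete density
g592 = convolution_power(g, 592)   # 592 = 512 + 64 + 16
print(round(sum(g592), 6))         # 1.0 -- still a probability distribution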

Panjer Algorithm (Recursive Method) for the (a,b,1) class:


If the frequency distribution or primary distribution, pk, is a member of the (a,b,1) class, then one can
modify the Panjer Algorithm:114 115
c(x) = s(x){p1 - (a + b)p0} / {1 - a s(0)} + {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j),
where p0 = frequency density at zero, and p1 = frequency density at one.


c(0) = Pp(s(0)) = p.g.f. of the frequency distribution evaluated at the density of the severity distribution at zero.
If p is a member of the (a, b, 0) class, then p1 = (a+b)p0 , and the first term of c(x) drops out.
Thus this formula reduces to the previously discussed formula for the (a, b, 0) class.

114 See Theorem 9.8 in Loss Models. While on the syllabus, it is very unlikely that you will be asked about this.
115 The (a, b, 1) class of frequency distributions includes the (a, b, 0) class. For the (a, b, 1) class, the recursion relationship f(x+1)/f(x) = a + b/(x+1) need only hold for x ≥ 1, rather than x ≥ 0.

Exercise: Calculate the density at 1 for a zero-modified Negative Binomial with β = 2,
r = 3, and probability at zero of 22%.
[Solution: Without the modification, f(0) = 1/(1+2)^3 = 0.037037,
and f(1) = (3)(2)/(1+2)^4 = 0.074074. Thus with the zero-modification, the density at one is:
(0.074074)(1 - 0.22)/(1 - 0.037037) = 0.06.]
Exercise: What is the probability generating function for a zero-modified Negative Binomial with
β = 2, r = 3, and probability at zero of 22%?
[Solution: P(z) = 0.22 + (1 - 0.22)(p.g.f. of zero-truncated Negative Binomial) =
0.22 + (0.78){(1 - 2(z-1))^-3 - (1+2)^-3} / {1 - (1+2)^-3} = 0.22 + (0.81){(1 - 2(z-1))^-3 - 0.037037}.]
Exercise: Let severity have density: s(0) = 30%, s(1) = 60%, s(2) = 10%. Aggregate losses are
given by a compound zero-modified Negative Binomial distribution, with parameters
β = 2, r = 3, and the probability at zero for the zero-modified Negative Binomial is 22%.
Use the Panjer algorithm to calculate the density at 0 of the aggregate losses.
[Solution: From the previous exercise, the zero-modified Negative Binomial has p.g.f.
P(z) = 0.22 + (0.81){(1 - 2(z-1))^-3 - 0.037037}.
c(0) = Pp(s(0)) = Pp(0.3) = 0.22 + (0.81){(1 - 2(0.3-1))^-3 - 0.037037} = 0.24859.]
Exercise: Use the Panjer algorithm to calculate the density at 2 of the aggregate losses.
[Solution: For the zero-modified Negative Binomial, a = 2/(1+2) = 2/3 and
b = (3-1)(2)/(1+2) = 4/3.
c(x) = s(x){p1 - (a + b)p0} / {1 - a s(0)} + {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j)
= s(x){0.06 - (2/3 + 4/3)(0.22)}/{1 - (2/3)(0.3)} + {1/(1 - (2/3)(0.3))} Σ_{j=1}^{x} (2/3 + 4j/(3x)) s(j) c(x-j)
= -0.475 s(x) + 0.83333 Σ_{j=1}^{x} (1 + 2j/x) s(j) c(x-j).
c(1) = -0.475 s(1) + (0.83333)(1 + (2)(1)/1) s(1) c(0) =


(-0.475)(0.6) + (0.83333)(3)(0.6)(0.24859) = 0.087885.
c(2) = -0.475 s(2) + (0.83333){(1 + (2)(1)/2)s(1) c(1) + (1 + (2)(2)/2)s(2) c(0)} =
(-0.475)(0.1) + (0.83333){(2)(0.6)(0.087885) + (3)(0.1)(0.24859)} = 0.10253.
Comment: The densities out to 10 are: 0.24859, 0.087885, 0.102533, 0.102533, 0.0939881,
0.0811716, 0.0671683, 0.0538092, 0.0420268, 0.0321601, 0.0241992.]
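As a check on the (a, b, 1) recursion, here is a Python sketch that reproduces the zero-modified Negative Binomial example above (β = 2, r = 3, 22% probability at zero, severity s(0) = 0.30, s(1) = 0.60, s(2) = 0.10). The helper names are mine.

def panjer_ab1(a, b, p0, p1, c0, sev, x_max):
    c = [c0]
    for x in range(1, x_max + 1):
        extra = sev[x] * (p1 - (a + b) * p0) if x < len(sev) else 0.0
        total = sum((a + j * b / x) * sev[j] * c[x - j]
                    for j in range(1, min(x, len(sev) - 1) + 1))
        c.append((extra + total) / (1 - a * sev[0]))
    return c

beta, r, p0_mod = 2.0, 3.0, 0.22
sev = [0.30, 0.60, 0.10]
f0 = (1 + beta) ** -r                                                  # unmodified density at 0
p1_mod = (1 - p0_mod) / (1 - f0) * r * beta / (1 + beta) ** (r + 1)    # zero-modified density at 1
pgf = lambda z: p0_mod + (1 - p0_mod) * ((1 - beta * (z - 1)) ** -r - f0) / (1 - f0)
c = panjer_ab1(a=beta / (1 + beta), b=(r - 1) * beta / (1 + beta),
               p0=p0_mod, p1=p1_mod, c0=pgf(sev[0]), sev=sev, x_max=4)
print([round(v, 6) for v in c])   # compare with 0.24859, 0.087885, 0.102533, ... above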

Here is a graph of the density of the aggregate losses:
[Figure: the density of the aggregate losses, plotted for aggregate amounts from 0 to about 20, with vertical scale from 0 to 0.25.]
Other than the large probability of zero aggregate losses, the aggregate losses look like they could
be approximated by one of the size of loss distributions in Appendix A of Loss Models.
Continuous Severity Distributions:
If one has a continuous severity distribution s(x), and the frequency distribution, p, is a member of
the (a, b, 1) class,116 then one has an integral equation for the distribution of aggregate losses, c,
similar to the Panjer Algorithm:117
c(x) = p1 s(x) + ∫_0^x (a + by/x) s(y) c(x - y) dy.
Loss Models merely states this result without using it. Instead as has been discussed,
Loss Models demonstrates how one can employ the Panjer algorithm using a discrete severity
distribution. One can either have started with a discrete severity distribution, or one can have
approximated a continuous severity distribution by a discrete severity distribution, as will be
discussed in the next section.
116 It also holds for the members of the (a, b, 0) class, which is a subset of the (a, b, 1) class.
117 See Theorem 9.26 in Loss Models. This is a Volterra integral equation of the second kind. See for example Appendix D of Insurance Risk Models, by Panjer and Willmot.

Problems:
Use the following information for the next 6 questions:

Frequency follows a zero-truncated Poisson with λ = 0.8.
For the zero-truncated Poisson, P(z) = (e^(λz) - 1) / (e^λ - 1), a = 0, and b = λ.
Severity is discrete and takes on the following values:
Size:         0     1     2     3     4
Probability:  20%   40%   20%   10%   10%

Frequency and Severity are independent.


8.1 (2 points) What is the probability that the aggregate losses are zero?
A. less than 12%
B. at least 12% but less than 13%
C. at least 13% but less than 14%
D. at least 14% but less than 15%
E. at least 15%
8.2 (2 points) What is the probability that the aggregate losses are one?
A. less than 30%
B. at least 30% but less than 31%
C. at least 31% but less than 32%
D. at least 32% but less than 33%
E. at least 33%
8.3 (2 points) What is the probability that the aggregate losses are two?
A. less than 19%
B. at least 19% but less than 20%
C. at least 20% but less than 21%
D. at least 21% but less than 22%
E. at least 22%

8.4 (2 points) What is the probability that the aggregate losses are three?
A. less than 9%
B. at least 9% but less than 10%
C. at least 10% but less than 11%
D. at least 11% but less than 12%
E. at least 12%
8.5 (3 points) What is the probability that the aggregate losses are four?
A. less than 9%
B. at least 9% but less than 10%
C. at least 10% but less than 11%
D. at least 11% but less than 12%
E. at least 12%
8.6 (3 points) What is the probability that the aggregate losses are five?
A. less than 5%
B. at least 5% but less than 6%
C. at least 6% but less than 7%
D. at least 7% but less than 8%
E. at least 8%

Solutions to Problems:
8.1. D. P(z) = (e^(λz) - 1)/(e^λ - 1) = (e^(0.8z) - 1)/(e^0.8 - 1).
c(0) = P(s(0)) = P(0.2) = (e^((0.8)(0.2)) - 1)/(e^0.8 - 1) = 0.141579.
8.2. B. For the zero-truncated Poisson, a = 0 and b = λ = 0.8.
p(0) = 0. p(1) = 0.8e^-0.8/(1 - e^-0.8) = 0.652773.
c(x) = s(x){p(1) - (a+b)p(0)}/(1 - a s(0)) + {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j)
= s(x){0.652773 - (0.8)(0)}/(1 - (0)(0.2)) + {1/(1 - (0)(0.2))} Σ_{j=1}^{x} (0 + 0.8j/x) s(j) c(x-j)
= 0.652773 s(x) + 0.8 Σ_{j=1}^{x} (j/x) s(j) c(x-j).
c(1) = .652773 s(1) + (.8)(1/1)s(1) c(0) = (.652773)(.4) + (.8)(1)(.4)(.141579) = 0.306414.


8.3. C. c(2) = .652773 s(2) + (.8){(1/2)s(1)c(1) + (2/2)s(2)c(0)} =
(.652773) (.2) + (.8){(1/2)(.4)(.306414) + (1)(.2)(.141579)} = 0.202233.
8.4. E. c(3) = .652773 s(3) + (.8){(1/3)s(1)c(2) + (2/3)s(2)c(1) + (3/3)s(3)c(0)} =
(.652773)(.1) + (.8){(1/3)(.4)(.202233) + (2/3)(.2)(.306414) + (3/3)(.1)(.141579)} =
0.130859.
8.5. E. c(4) = .652773 s(4) + (.8){(1/4)s(1)c(3) + (2/4)s(2)c(2) + (3/4)s(3)c(1) + (4/4)s(4)c(0)} =
(.652773)(.1) + (.8){(1/4)(.4)(.130859) + (2/4)(.2)(.202233) + (3/4)(.1)(.306414) +
(4/4)(.1)(.141579)} = 0.121636.

8.6. A. c(5) = .652773 s(5) + (.8){(1/5)s(1)c(4) + (2/5)s(2)c(3) + (3/5)s(3)c(2) + (4/5)s(4)c(1) +
(5/5)s(5)c(0)} = (.652773) (0) + (.8){(1/5)(.4)(.121636) + (2/5)(.2)(.130859) +
(3/5)(.1)(.202233)+ (4/5)(.1)(.306414) + (5/5)(0)(.141579)} = 0.045477.
Comment: The distribution of aggregate losses from 0 to 15 is: 0.141579, 0.306414, 0.202234,
0.130859, 0.121636, 0.0454774, 0.0249329, 0.0133713, 0.00776193, 0.00303326,
0.00146421, 0.000689169, 0.000325073, 0.000126662, 0.0000556073, 0.0000237919.
Here is a graph:
[Figure: the aggregate densities plotted for values from 0 to about 14, with vertical scale from 0 to 0.3.]


Section 9, Discretization118
With a continuous severity distribution, in order to apply the Recursive Method / Panjer Algorithm,
one would first need to approximate this continuous distribution by a discrete severity distribution.
There are a number of methods one could use to do this.
Method of Rounding:119
Assume severity follows an Exponential distribution with θ = 100.
For example, we could use a discrete distribution g, with support 0, 20, 40, 60, 80, 100, etc.
Then we could take g(0) = F(20/2) = F(10) = 1 - e^(-10/100) = 1 - e^-0.1 = 0.095163.
We could let g(20) = F(30) - F(10) = e^-0.1 - e^-0.3 = 0.164019.120
Exercise: Continuing in this manner what is g(40)?
[Solution: g(40) = F(50) - F(30) = (1 - e^(-50/100)) - (1 - e^(-30/100)) = e^-0.3 - e^-0.5 = 0.134288.]
Graphically, one can think of this procedure as balls dropping from above, with their probability
horizontally following the density of this Exponential Distribution. The method of rounding is like
setting up a bunch of cups each of width the span of 20, centered at 0, 20, 40, 60, etc.


Then the expected percentage of balls falling in each cup is the discrete probability produced by
the method of rounding. This discrete probability is placed at the center of each cup.
118 See Section 9.6.5 of Loss Models.
119 See Section 9.6.5.1 of Loss Models. Also called the method of mass dispersal.
120 Loss Models actually takes g(0) = F(10) - Prob(10), g(20) = (F(30) - Prob(30)) - (F(10) - Prob(10)), etc. This makes no difference for a continuous distribution such as the Exponential. It would make a difference if there happened to be a point mass of probability at either 10 or 30. Loss Models provides no explanation for this choice of including a point mass at 30 in the discretized distribution at 40. It is unclear that this choice is preferable to instead either including a point mass at 30 in the discretized distribution at 20 or splitting it equally between 20 and 40.


We could arrange this calculation in a spreadsheet:


x      F(x+10)    g(x)          x      F(x+10)    g(x)
0      0.095163   0.095163      400    0.983427   0.003669
20     0.259182   0.164019      420    0.986431   0.003004
40     0.393469   0.134288      440    0.988891   0.002460
60     0.503415   0.109945      460    0.990905   0.002014
80     0.593430   0.090016      480    0.992553   0.001649
100    0.667129   0.073699      500    0.993903   0.001350
120    0.727468   0.060339      520    0.995008   0.001105
140    0.776870   0.049402      540    0.995913   0.000905
160    0.817316   0.040447      560    0.996654   0.000741
180    0.850431   0.033115      580    0.997261   0.000607
200    0.877544   0.027112      600    0.997757   0.000497
220    0.899741   0.022198      620    0.998164   0.000407
240    0.917915   0.018174      640    0.998497   0.000333
260    0.932794   0.014879      660    0.998769   0.000273
280    0.944977   0.012182      680    0.998992   0.000223
300    0.954951   0.009974      700    0.999175   0.000183
320    0.963117   0.008166      720    0.999324   0.000150
340    0.969803   0.006686      740    0.999447   0.000122
360    0.975276   0.005474      760    0.999547   0.000100
380    0.979758   0.004482      780    0.999629   0.000082
400    0.983427   0.003669      800    0.999696   0.000067

The discrete distribution g is the result of discretizing the continuous Exponential distribution.
Note that one could continue beyond 800 in the same manner, until the probabilities got sufficiently
small for a particular application.
This is an example of the Method of Rounding.
For the Method of Rounding with span h, construct the discrete distribution g:
g(0) = F(h/2).
g(ih) = F(h(i + 1/2)) - F(h(i - 1/2)).
We have that for example F(30) = G(30) = 1 - e^-0.3. In this example, F and G match at 10, 30, 50,
70, etc. In general, the Distribution Functions match at all of the points halfway between
the support of the discretized distribution obtained from the method of rounding.121
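A minimal Python sketch of the Method of Rounding, applied to the Exponential with θ = 100 and span 20 used in the table above; the function and names are illustrative.

import math

def method_of_rounding(cdf, h, n_points):
    # Discrete densities at 0, h, 2h, ..., (n_points - 1)h.
    g = [cdf(h / 2)]
    for i in range(1, n_points):
        g.append(cdf(h * (i + 0.5)) - cdf(h * (i - 0.5)))
    return g

theta = 100.0
exp_cdf = lambda x: 1 - math.exp(-x / theta)
g = method_of_rounding(exp_cdf, h=20, n_points=41)
print([round(v, 6) for v in g[:4]])                           # [0.095163, 0.164019, 0.134288, 0.109945]
print(round(sum(x * 20 * gx for x, gx in enumerate(g)), 1))   # about 101, as computed later in this section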
In this example, the span was 20, the spacing between the chosen discrete sizes. If one had
instead taken a span of 200, there would have been one tenth as many points and the
approximation would have been worse. If instead one had taken a span of 2, there would have
been 10 times as many points and the approximation would have been much better.

121 This contrasts with the method of local moment matching, to be discussed subsequently.


Since one discretizes in order to simplify calculations, usually one wants to have fewer points, and
thus a larger span. This goal conflicts with the desire to have a good approximation to the continuous
distribution, which requires a smaller span. Thus in practical applications one needs to select a
span that is neither too small nor too large. One can always test whether making the span
smaller would materially affect your results.
One could use this discretized severity distribution obtained from the method of rounding in the
Panjer Algorithm in order to approximate the aggregate distribution.
Exercise: Using the above discretized approximation to the Exponential distribution with
θ = 100, and a Geometric frequency with β = 9, calculate the first four densities of the aggregate
distribution via the Panjer Algorithm.
[Solution: The discretized distribution has span of 20, so we treat 20 as 1, 40 as 2, etc., for
purposes of the Panjer Algorithm.
The p.g.f. of the Geometric Distribution is: P(z) = 1/{1 - 9(z - 1)} = 1/(10 - 9z).
c(0) = P(s(0)) = P(0.095163) = 1/{10 - 9(0.095163)} = 0.10937.
For the Geometric Distribution: a = β/(1 + β) = 9/10 = 0.9 and b = 0.
1/(1 - as(0)) = 1/{1 - (0.9)(0.095163)} = 1.09367.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = (1.09367)(0.9) Σ_{j=1}^{x} s(j) c(x-j) = 0.98430 Σ_{j=1}^{x} s(j) c(x-j)

c(1) = 0.98430 s(1)c(0) = (0.98430)(0.164019)(0.10937) = 0.01766.


c(2) = 0.98430 {s(1)c(1) + s(2)c(0)} =
(0.98430){(0.164019)(0.01766) + (0.134288)(0.10937)} = 0.01731.
c(3) = 0.98430 {s(1)c(2)+ s(2)c(1) + s(3)c(0)} =
(0.98430){(0.164019)(0.01731) + (0.134288)(0.01766) + (0.109945)(0.10937)} = 0.01695.]
Thus the approximate discrete densities of the aggregate distribution at 0, 20, 40, and 60 are:
0.10937, 0.01766, 0.01731, 0.01695.
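Here is a Python sketch combining the two steps: discretize the Exponential (θ = 100) by the method of rounding with span 20, then run the Panjer recursion for the Geometric frequency with β = 9. Small differences in the last digit relative to the values above are just rounding.

import math

theta, beta, h = 100.0, 9.0, 20
sev = [1 - math.exp(-(h / 2) / theta)]                    # g(0) = F(h/2)
sev += [math.exp(-(i - 0.5) * h / theta) - math.exp(-(i + 0.5) * h / theta) for i in range(1, 60)]

a = beta / (1 + beta)
c = [1.0 / (1 + beta - beta * sev[0])]                    # c(0) = P_geometric(s(0))
for x in range(1, 4):
    total = sum(a * sev[j] * c[x - j] for j in range(1, x + 1))
    c.append(total / (1 - a * sev[0]))
print([round(v, 5) for v in c])   # approximately [0.10937, 0.01766, 0.01731, 0.01696]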
Exercise: What is the moment generating function of aggregate losses if the severity is
Exponential with θ = 100 and frequency is Geometric with β = 9?
[Solution: For the Exponential Distribution the m.g.f. is: MX(t) = (1 - 100t)^-1.
The p.g.f. of the Geometric Distribution is: P(z) = 1/{1 - 9(z - 1)} = 1/(10 - 9z).
MA(t) = 1/{10 - 9(1 - 100t)^-1} = (1 - 100t) / (1 - 1000t).]


Note that (1 - 100t)/(1 - 1000t) = 0.1 + (0.9){1/(1 - 1000t)}. This is the weighted average of the
m.g.f. of a point mass at zero and the m.g.f. of an Exponential distribution with mean 1000.
Therefore, the aggregate distribution is a weighted average of a point mass at zero and an
Exponential distribution with mean 1000, using weights 10% and 90%.122
Thus the distribution function of aggregate losses for x > 0 is:
C(x) = 0.1 + 0.9(1 - e^(-x/1000)) = 1 - 0.9e^(-x/1000).
One can create a discrete approximation to this aggregate distribution via the method of
rounding with a span of 20. Here are the first four discrete densities:
g(0) = C(10) = 1 - 0.9 e^-0.01 = 0.10896.
g(20) = C(30) - C(10) = 0.9(e^-0.01 - e^-0.03) = 0.01764.
g(40) = C(50) - C(30) = 0.9(e^-0.03 - e^-0.05) = 0.01729.
g(60) = C(70) - C(50) = 0.9(e^-0.05 - e^-0.07) = 0.01695.
This better discrete approximation to the aggregate distribution is similar to the previous
approximation obtained by applying the Panjer Algorithm using the approximate severity
distribution:

x     Density from the Panjer Algorithm           Density from the Method of Rounding
      applied to the Approximate Severity         applied to the Exact Aggregate Distribution
0     0.10937                                     0.10896
20    0.01766                                     0.01764
40    0.01731                                     0.01729
60    0.01695                                     0.01695

122 This is an example of a general result discussed in my section on Analytic Results.


Exercise: Create a discrete approximation to a Pareto Distribution with θ = 40 and
α = 3, using the method of rounding with a span of 50. Stop at 1000.
[Solution: F(x) = 1 - {40/(40 + x)}^3.
For example, g(50) = F(75) - F(25) = (40/(40 + 25))^3 - (40/(40 + 75))^3 = 0.190964.
g(0) = F(25) = 0.767.
g(50) = F(75) - F(25) = 0.958 - 0.767 = 0.191.
g(100) = F(125) - F(75) = 0.986 - 0.958 = 0.028.
g(150) = F(175) - F(125) = 0.994 - 0.986 = 0.008.
g(200) = F(225) - F(175) = 0.997 - 0.994 = 0.003.
etc.
x      F(x+25)    g(x)          x       F(x+25)    g(x)
0      0.766955   0.766955      550     0.999725   0.000080
50     0.957919   0.190964      600     0.999782   0.000058
100    0.985753   0.027834      650     0.999825   0.000043
150    0.993560   0.007807      700     0.999857   0.000032
200    0.996561   0.003001      750     0.999882   0.000025
250    0.997952   0.001391      800     0.999901   0.000019
300    0.998684   0.000731      850     0.999916   0.000015
350    0.999105   0.000421      900     0.999929   0.000012
400    0.999363   0.000259      950     0.999939   0.000010
450    0.999531   0.000168      1000    0.999947   0.000008
500    0.999645   0.000114
Comment: By stopping at 1000, there is 1 - 0.999947 = 0.000053 of probability not included in


the discrete approximation. One could place this additional probability at some convenient spot.
For example, we could figure out where 1 - F(x) = 0.000053/2. This occurs at x = 1302. Thus
one might put a probability of 0.000053 at 1300.]
The sum of the first n densities that result from the method of rounding is:
F(h/2) + F(3h/2) - F(h/2) + F(5h/2) - F(3h/2) + ... + F(h(n + 1/2)) - F(h(n - 1/2)) = F(h(n + 1/2)).
As n goes to infinity, this sum approaches F(∞) = 1.
Thus the method of rounding includes in the discrete distribution all of the probability.


Average of the Result of the Method of Rounding:


The mean of the discrete distribution that results from the method of rounding is:
0 F(h/2) + h{F(3h/2) - F(h/2)} + 2h{F(5h/2) - F(3h/2)} + 3h{F(7h/2) - F(5h/2)} + ... =
h{S(h/2) - S(3h/2)} + 2h{S(3h/2) - S(5h/2)} + 3h{S(5h/2) - S(7h/2)} + ... =
h{S(h/2) + S(3h/2) + S(5h/2) + S(7h/2) + ...} ≅ ∫_0^∞ S(x) dx = E[X].

Thus the method of rounding produces a discrete distribution with approximately the same mean as
the continuous distribution we are approximating. The smaller the span, h, the better the
approximation will be.
Here is a computation of the mean of the previous result of applying the method of rounding with
span 20 to an Exponential distribution with θ = 100.
x      F(x+10)   g(x)      Extension      x      F(x+10)   g(x)      Extension
0      0.0952    0.0952    0.0000         400    0.98343   0.00367   1.4677
20     0.2592    0.1640    3.2804         420    0.98643   0.00300   1.2617
40     0.3935    0.1343    5.3715         440    0.98889   0.00246   1.0822
60     0.5034    0.1099    6.5967         460    0.99090   0.00201   0.9263
80     0.5934    0.0900    7.2013         480    0.99255   0.00165   0.7914
100    0.6671    0.0737    7.3699         500    0.99390   0.00135   0.6749
120    0.7275    0.0603    7.2407         520    0.99501   0.00111   0.5747
140    0.7769    0.0494    6.9162         540    0.99591   0.00090   0.4886
160    0.8173    0.0404    6.4715         560    0.99665   0.00074   0.4149
180    0.8504    0.0331    5.9607         580    0.99726   0.00061   0.3518
200    0.8775    0.0271    5.4224         600    0.99776   0.00050   0.2979
220    0.8997    0.0222    4.8835         620    0.99816   0.00041   0.2521
240    0.9179    0.0182    4.3617         640    0.99850   0.00033   0.2130
260    0.9328    0.0149    3.8687         660    0.99877   0.00027   0.1799
280    0.9450    0.0122    3.4110         680    0.99899   0.00022   0.1517
300    0.9550    0.0100    2.9922         700    0.99917   0.00018   0.1279
320    0.9631    0.0082    2.6131         720    0.99932   0.00015   0.1077
340    0.9698    0.0067    2.2732         740    0.99945   0.00012   0.0906
360    0.9753    0.0055    1.9706         760    0.99955   0.00010   0.0762
380    0.9798    0.0045    1.7030         780    0.99963   0.00008   0.0640
400    0.9834    0.0037    1.4677         800    0.99970   0.00007   0.0538
                                          Sum                        101.0249003

In this case, the mean of the discrete distribution is 101, compared to 100 for the Exponential.
For a longer-tailed distribution such as a Pareto, the approximation might not be this close.


Method of Local Moment Matching:123


The method of local moment matching is another technique for approximating a continuous
distribution by a discrete distribution with a span of h. In the method of moment matching, the
approximating distribution will have the same lower moments as the original distribution.
In the simplest case, one requires that the means match. In a more complicated version, one could
require that both the first and second moments match.
In order to have the means match, using a span of h, the approximating densities are:124
g(0) = 1 - E[X ∧ h]/h.
g(ih) = {2E[X ∧ ih] - E[X ∧ (i-1)h] - E[X ∧ (i+1)h]} / h.
For example, for an Exponential Distribution, E[X ∧ x] = θ(1 - e^(-x/θ)).
g(ih) = {2E[X ∧ ih] - E[X ∧ (i-1)h] - E[X ∧ (i+1)h]}/h = θ e^(-ih/θ) {e^(h/θ) + e^(-h/θ) - 2}/h.
For an Exponential Distribution with θ = 100 using a span of 20:
g(0) = 1 - E[X ∧ 20]/20 = 1 - (100)(1 - e^-0.2)/20 = 0.093654.
g(ih) = θ e^(-ih/θ) {e^(h/θ) + e^(-h/θ) - 2}/h = (100) e^(-i/5) {e^0.2 + e^-0.2 - 2}/20 = 0.200668 e^(-i/5).
g(20) = 0.200668 e^(-1/5) = 0.164293.
g(40) = 0.200668 e^(-2/5) = 0.134511.
Out to 800, the approximating distribution is:
0.093654, 0.164293, 0.134511, 0.110129, 0.090166, 0.073821, 0.060440, 0.049484,
0.040514, 0.033170, 0.027157, 0.022235, 0.018204, 0.014904, 0.012203, 0.009991,
0.008180, 0.006697, 0.005483, 0.004489, 0.003675, 0.003009, 0.002464, 0.002017,
0.001651, 0.001352, 0.001107, 0.000906, 0.000742, 0.000608, 0.000497, 0.000407,
0.000333, 0.000273, 0.000223, 0.000183, 0.000150, 0.000123, 0.000100, 0.000082,
0.000067.125

In general, calculating the mean matching discrete distribution requires that one calculate the limited
expected value of the original distribution at each of the spanning points.

123 See Section 9.6.5.2 of Loss Models.
124 For a distribution with positive support, for example x > 0. Obviously, this would only be applied to a distribution with a finite mean.
125 While this is close to the method of rounding approximation calculated previously, they differ.
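A minimal Python sketch of the method of matching means, again for the Exponential with θ = 100 and span 20, reproducing the first few densities listed above; the function and names are illustrative.

import math

def match_means(lev, h, n_points):
    # lev(x) = E[X ^ x]; returns densities at 0, h, 2h, ..., (n_points - 1)h.
    g = [1 - lev(h) / h]
    for i in range(1, n_points):
        g.append((2 * lev(i * h) - lev((i - 1) * h) - lev((i + 1) * h)) / h)
    return g

theta, h = 100.0, 20
exp_lev = lambda x: theta * (1 - math.exp(-x / theta))   # limited expected value of the Exponential
g = match_means(exp_lev, h, 41)
print([round(v, 6) for v in g[:3]])                      # [0.093654, 0.164293, 0.134511]
print(round(sum(i * h * gi for i, gi in enumerate(g)), 2))  # close to 100: the means match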


Exercise: Create a discrete approximation to a Pareto Distribution with θ = 40 and α = 3,
matching the mean, with a span of 10. Stop at 400.
[Solution: E[X ∧ x] = {θ/(α-1)}(1 - {θ/(x+θ)}^(α-1)) = 20{1 - (40/(x+40))^2}.

x      LEV(x)    g(x)      Extension      x      LEV(x)    g(x)      Extension
0      0.0000    28.000%   0.0000         200    19.4444
10     7.2000    32.889%   3.2889         210    19.4880   0.049%    0.1035
20     11.1111   15.528%   3.1057         220    19.5266   0.042%    0.0927
30     13.4694   8.277%    2.4830         230    19.5610   0.036%    0.0833
40     15.0000   4.812%    1.9249         240    19.5918   0.031%    0.0751
50     16.0494   2.988%    1.4938         250    19.6195   0.027%    0.0680
60     16.8000   1.952%    1.1715         260    19.6444   0.024%    0.0617
70     17.3554   1.330%    0.9308         270    19.6670   0.021%    0.0562
80     17.7778   0.937%    0.7494         280    19.6875   0.018%    0.0514
90     18.1065   0.679%    0.6110         290    19.7062   0.016%    0.0470
100    18.3673   0.504%    0.5041         300    19.7232   0.014%    0.0432
110    18.5778   0.382%    0.4203         310    19.7388   0.013%    0.0397
120    18.7500   0.295%    0.3539         320    19.7531   0.011%    0.0366
130    18.8927   0.231%    0.3006         330    19.7663   0.010%    0.0338
140    19.0123   0.184%    0.2574         340    19.7784   0.009%    0.0313
150    19.1136   0.148%    0.2220         350    19.7896   0.008%    0.0291
160    19.2000   0.121%    0.1928         360    19.8000   0.008%    0.0270
170    19.2744   0.099%    0.1685         370    19.8096   0.007%    0.0252
180    19.3388   0.082%    0.1480         380    19.8186   0.006%    0.0235
190    19.3951   0.069%    0.1308         390    19.8269   0.006%    0.0219
200    19.4444   0.058%    0.1161         400    19.8347   0.005%    0.0205
210    19.4880                            410    19.8420   Sum       19.5441

For example, g(0) = 1 - E[X ∧ 10]/10 = 1 - 7.2/10 = 28%.
g(10) = {2E[X ∧ 10] - E[X ∧ 20] - E[X ∧ 0]}/10 = {(2)(7.2) - 11.111 - 0}/10 = 32.89%.
g(20) = {2E[X ∧ 20] - E[X ∧ 30] - E[X ∧ 10]}/10 = {(2)(11.111) - 13.469 - 7.2}/10 = 15.53%.
Comment: Summing through 400, the mean of the approximating distribution is 19.544 < 20, the
mean of the Pareto. The Pareto is a long-tailed distribution, and we would need to include values of
the approximating distribution beyond g(400), in order to get closer to the mean.]


Relationship to Layers of the Method of Mean Matching:


Note that the numerator of g(ih) can be written as a difference of layers of loss:126
g(ih) = {(E[X ∧ ih] - E[X ∧ (i-1)h]) - (E[X ∧ (i+1)h] - E[X ∧ ih])} / h = {∫_{ih-h}^{ih} S(x) dx - ∫_{ih}^{ih+h} S(x) dx} / h.

This numerator is nonnegative, since S(x) is a nonincreasing function of x.


Thus all of the approximating discrete densities are nonnegative, when we match the mean.127
h g(ih) = {(E[X ∧ ih] - E[X ∧ (i-1)h]) - (E[X ∧ (i+1)h] - E[X ∧ ih])} = Lay_i - Lay_{i+1},
where Lay_i is the layer from (i-1)h to ih, i = 1, 2, 3, ...


The first four of these successive layers are shown on the following Lee Diagram:128
[Lee Diagram: the vertical axis is Size, marked at h, 2h, 3h, 4h; the horizontal axis is Probability, from 0 to 1; the horizontal bands between 0 and h, h and 2h, 2h and 3h, and 3h and 4h are Layer 1 through Layer 4.]
g(ih) = Lay_i/h - Lay_{i+1}/h = (average width of area i) - (average width of area i+1)
= (average contribution of S(x) to Layer i) - (average contribution of S(x) to Layer i+1).

126 This formula works even when i = 0, since we have assumed S(x) = 1 for x ≤ 0.
127 If one matches the first two moments, this nice property does not necessarily hold.
128 Lee Diagrams are not on the syllabus of this exam. See Mahler's Guide to Loss Distributions.


Demonstration that the Densities Given by the Formulas Do Match the Mean:
Σ_{i=0}^{n} g(ih) = Σ_{i=0}^{n} {∫_{ih-h}^{ih} S(x) dx - ∫_{ih}^{ih+h} S(x) dx} / h
= {∫_{-h}^{0} S(x) dx - ∫_{nh}^{nh+h} S(x) dx} / h = 1 - ∫_{nh}^{nh+h} S(x) dx / h.
The final term goes to zero as n approaches ∞, since S(x) goes to zero as x approaches ∞.
Therefore, the densities of the approximating distribution do sum to 1.
Σ_{i=0}^{n} ih g(ih) = Σ_{i=1}^{n} i {∫_{ih-h}^{ih} S(x) dx - ∫_{ih}^{ih+h} S(x) dx}
= Σ_{i=0}^{n-1} (i+1) ∫_{ih}^{ih+h} S(x) dx - Σ_{i=1}^{n} i ∫_{ih}^{ih+h} S(x) dx
= Σ_{i=0}^{n-1} ∫_{ih}^{ih+h} S(x) dx - n ∫_{nh}^{nh+h} S(x) dx = ∫_0^{nh} S(x) dx - n ∫_{nh}^{nh+h} S(x) dx.
As n approaches infinity, the first term goes to the integral from zero to infinity of the survival function,
which is the mean. Assuming the mean exists, xS(x) goes to zero as x approaches infinity.129
Therefore, the second term goes to zero as n approaches infinity.130 Therefore, the mean of the
discretized distribution, g, matches the mean of the original distribution.
One can rewrite the above as:
Σ_{i=0}^{n} ih g(ih) = E[X ∧ nh] - n{E[X ∧ (nh + h)] - E[X ∧ nh]} = (n+1) E[X ∧ nh] - n E[X ∧ (nh + h)].
For example, when the Pareto was approximated, the sum up to n = 40 was:
(41)E[X ∧ 400] - (40)E[X ∧ 410] = (41)(19.8347) - (40)(19.8420) = 19.54.
Another way of showing that the mean of the approximating discrete distribution matches that of the
original continuous distribution:
Σ_{i=0} ih g(ih) = Σ_{i=1} i (Lay_i - Lay_{i+1}) = Σ_{i=1} i Lay_i - Σ_{i=1} i Lay_{i+1}
= Σ_{i=1} i Lay_i - Σ_{i=1} (i-1) Lay_i = Σ_{i=1} Lay_i = Mean.
129 If S(x) ~ 1/x for large x, then the integral of S(x) to infinity would not exist, and therefore neither would the mean.
130 The second term is n times the layer from nh to nh+h. As n approaches infinity, the layer starting at nh of width h has to go to zero faster than 1/n. Otherwise when we add them up, we get an infinite sum. (The sum of 1/n diverges.) If we got an infinite sum, then the mean would not exist.


Matching the First Two Moments:


According to Loss Models, matching the first two moments for the discretized distribution leads to
more accurate results, when for example calculating stop loss premiums. While the equations for
moment matching shown in Loss Models can be written out for the case of matching the first two
moments and then can be programmed on a computer, this is well beyond the level of calculations
you should be expected to perform on the exam!131
Matching the first two moments, the densities of the approximating distribution are:132
g(0) = ∫_0^{2h} (x^2 - 3hx + 2h^2) f(x) dx / (2h^2).
For i odd, g(ih) = -∫_{ih-h}^{ih+h} {x^2 - 2ihx + (i^2 - 1)h^2} f(x) dx / h^2.
For i even, g(ih) = { ∫_{ih-2h}^{ih} {x^2 - (2i-3)hx + (i-1)(i-2)h^2} f(x) dx + ∫_{ih}^{ih+2h} {x^2 - (2i+3)hx + (i+1)(i+2)h^2} f(x) dx } / (2h^2).

Applying these formulas to an Exponential Distribution with mean 50, using a span of 10,
f(x) = e^(-x/50)/50 and h = 10:
g(0) = ∫_0^{2h} (x^2 - 3hx + 2h^2) f(x) dx / (2h^2) = ∫_0^{20} {(x^2 - 30x + 200) e^(-x/50)/50} dx / 200 = 0.0661987.
g(10) = -∫_{10-h}^{10+h} {x^2 - 2ihx + (i^2 - 1)h^2} f(x) dx / h^2 = -∫_0^{20} {(x^2 - 20x) e^(-x/50)/50} dx / 100 = 0.219203.

131 Loss Models does not show an example of matching the first two moments. Matching the first three moments is even more complicated.
132 Derived from equations 9.28 and 9.29 in Loss Models.

g(20) = { ∫_0^{20} {(x^2 - 10x) e^(-x/50)/50} dx + ∫_{20}^{40} {(x^2 - 70x + 1200) e^(-x/50)/50} dx } / 200 = 0.0886528.
Note these formulas involve various integrals of x^2 f(x) and x f(x). Thus, one needs to be able to
calculate such integrals. For an Exponential Distribution:
∫_a^b x^2 f(x) dx = ∫_a^b x^2 e^(-x/θ)/θ dx = -(x^2 + 2θx + 2θ^2) e^(-x/θ) ]_{x=a}^{x=b}
= (a^2 + 2θa + 2θ^2) e^(-a/θ) - (b^2 + 2θb + 2θ^2) e^(-b/θ).
∫_a^b x f(x) dx = ∫_a^b x e^(-x/θ)/θ dx = -(x + θ) e^(-x/θ) ]_{x=a}^{x=b} = (a + θ) e^(-a/θ) - (b + θ) e^(-b/θ).
The resulting approximating densities at 0, 10, 20, 30, ..., 300 were:


0.0661987, 0.219203, 0.0886528, 0.146936, 0.0594257, 0.0984942, 0.0398343,
0.0660226, 0.0267017, 0.0442563, 0.0178987, 0.0296659, 0.0119979, 0.0198856,
0.0080424, 0.0133297, 0.00539098, 0.00893519, 0.00361368, 0.00598944,
0.00242232, 0.00401484, 0.00162373, 0.00269123, 0.00108842, 0.00180398,
0.00072959, 0.00120925, 0.000489059, 0.000810582, 0.000327826.
Coverage Modifications:
I have previously discussed the effect of deductibles and maximum covered losses on the
aggregate distribution. Once one has the modified frequency and severity distributions, one can
apply the Panjer Algorithm or other technique of estimating the aggregate losses in the usual
manner.
In the case of a continuous severity, one could perform the modification and discretization in either
order. However, Loss Models recommends that you perform the modification first and discretization
second.133

133 See Section 9.7 of Loss Models.
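As a sketch of "modify first, then discretize," the following Python code applies a per-event deductible, coinsurance and maximum covered loss to a Pareto and then uses the method of rounding; the particular policy values (500 deductible, 80% coinsurance, 5000 maximum covered loss, span 200) and Pareto parameters are illustrative assumptions, not from a specific example in this guide.

alpha, theta = 3.0, 1000.0
ded, coins, max_cov = 500.0, 0.80, 5000.0
span = 200
top = coins * (max_cov - ded)                      # largest possible payment

F = lambda x: 1 - (theta / (theta + x)) ** alpha   # ground-up Pareto cdf

def per_loss_cdf(y):
    # Cdf of the payment per loss after the modifications.
    if y >= top:
        return 1.0
    return F(y / coins + ded)                      # invert y = coins * (x - ded)

g = [per_loss_cdf(span / 2)]                       # density at 0 (includes losses below the deductible)
y = span
while y < top:
    g.append(per_loss_cdf(y + span / 2) - per_loss_cdf(y - span / 2))
    y += span
g.append(1.0 - per_loss_cdf(y - span / 2))         # all remaining probability at the maximum payment
print(round(sum(g), 6))                            # 1.0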


Problems:
Use the following information for the next two questions:
One creates a discrete approximation to a Weibull Distribution with
θ = 91 and τ = 1.4, using the method of rounding with a span of 25.
9.1 (1 point) What is the density of this discrete approximation at 150?
A. less than 6.0%
B. at least 6.0% but less than 6.1%
C. at least 6.1% but less than 6.2%
D. at least 6.2% but less than 6.3%
E. at least 6.3%
9.2 (1 point) For this discrete approximation, what is the probability of a loss less than or equal to
75?
A. less than 54%
B. at least 54% but less than 56%
C. at least 56% but less than 58%
D. at least 58% but less than 60%
E. at least 60%

9.3 (2 points) An Exponential Distribution with θ = 70 is approximated using the method of matching
means with a span of 5. What is the density of the approximating distribution at 60?
A. 2.0%
B. 2.5%
C. 3.0%
D. 3.5%
E. 4.0%
9.4 (2 points) A LogNormal Distribution with μ = 8 and σ = 2 is approximated using the method of
rounding with a span of 2000.
What is the density of the approximating distribution at 20,000?
A. 1.3%
B. 1.5%
C. 1.7%
D. 1.9%
E. 2.1%
Use the following information for the next two questions:
A Pareto Distribution with θ = 1000 and α = 2 is approximated using the method of matching means
with a span of 100.
9.5 (2 points) What is the density of the approximating distribution at 500?
A. 4.0%
B. 4.5%
C. 5.0%
D. 5.5%
E. 6.0%
9.6 (2 points) What is the density of the approximating distribution at 0?
A. 8.0%
B. 8.5%
C. 9.0%
D. 9.5%
E. 10.0%


9.7 (1 point) An Exponential Distribution with θ = 300 is approximated using the method of rounding
with a span of 50. What is the density of the approximating distribution at 400?
A. 3.0%
B. 3.5%
C. 4.0%
D. 4.5%
E. 5.0%
Use the following information for the next 6 questions:

Frequency follows a Poisson Distribution with λ = 0.8.
Severity follows an Exponential Distribution with θ = 3.
Frequency and Severity are independent.
The severity distribution is to be approximated via the method of rounding with a span of 1.
9.8 (2 points) What is the probability that the aggregate losses are zero?
A. less than 35%
B. at least 35% but less than 40%
C. at least 40% but less than 45%
D. at least 45% but less than 50%
E. at least 50%
9.9 (2 points) What is the probability that the aggregate losses are one?
A. less than 8%
B. at least 8% but less than 9%
C. at least 9% but less than 10%
D. at least 10% but less than 11%
E. at least 11%
9.10 (2 points) What is the probability that the aggregate losses are two?
A. less than 8%
B. at least 8% but less than 9%
C. at least 9% but less than 10%
D. at least 10% but less than 11%
E. at least 11%
9.11 (2 points) What is the probability that the aggregate losses are three?
A. less than 5%
B. at least 5% but less than 6%
C. at least 6% but less than 7%
D. at least 7% but less than 8%
E. at least 8%


9.12 (3 points) What is the probability that the aggregate losses are four?
A. less than 5%
B. at least 5% but less than 6%
C. at least 6% but less than 7%
D. at least 7% but less than 8%
E. at least 8%
9.13 (3 points) What is the probability that the aggregate losses are five?
A. less than 5%
B. at least 5% but less than 6%
C. at least 6% but less than 7%
D. at least 7% but less than 8%
E. at least 8%
9.14 (3 points) Losses follow a Pareto Distribution with θ = 100 and α = 3.
There is a deductible of 5, coinsurance of 80%, and a maximum covered loss of 100.
The per loss variable is approximated using the method of rounding with a span of 4.
What is the density of the approximating distribution at 40?
A. 2.4%
B. 2.6%
C. 2.8%
D. 3.0%
E. 3.2%
Use the following information for the next two questions:
A Pareto Distribution with θ = 1000 and α = 4 is approximated using the method of rounding with a
span of 100.
9.15 (2 points) What is the density of the approximating distribution at 500?
A. 4.7%
B. 4.9%
C. 5.1%
D. 5.3%
E. 5.5%
9.16 (1 point) What is the density of the approximating distribution at 0?
A. 16.9%
B. 17.1%
C. 17.3%
D. 17.5%
E. 17.7%
9.17 (3 points) A LogNormal Distribution with μ = 7 and σ = 0.5 is approximated using the method
of matching means with a span of 200.
What is the density of the approximating distribution at 2000?
A. 3.5%
B. 3.7%
C. 3.9%
D. 4.1%
E. 4.3%
9.18 (3 points) Losses follow a Pareto Distribution with θ = 100 and α = 3.
There is a deductible of 5, coinsurance of 80%, and a maximum covered loss of 100.
The per payment variable is approximated using the method of rounding with a span of 4.
What is the density of the approximating distribution at 60?
A. 1.7%
B. 1.9%
C. 2.1%
D. 2.3%
E. 2.5%


9.19 (3 points) An Exponential Distribution with θ = 100 is approximated using the method of
matching means with a span of 25.
Let R be the density of the approximating distribution at 0.
Let S be the density of the approximating distribution at 75.
What is R + S?
A. 23%
B. 25%
C. 27%
D. 29%
E. 31%


Solutions to Problems:
9.1. E. F(x) = 1 - exp[-(x/91)^1.4]. g(150) = F(150+12.5) - F(150-12.5) =
exp[-(137.5/91)^1.4] - exp[-(162.5/91)^1.4] = 0.16826 - 0.10521 = 6.305%.
Comment: Here is a table of some values of the approximating distribution:
x      F(x+12.5)   g(x)
0      0.060201    0.060201
25     0.251032    0.190831
50     0.446217    0.195185
75     0.611931    0.165714
100    0.739649    0.127718
125    0.831739    0.092090
150    0.894794    0.063055
175    0.936158    0.041363
200    0.962307    0.026149
225    0.978304    0.015998
250    0.987806    0.009502
275    0.993298    0.005492
300    0.996394    0.003096

9.2. E. The distribution function of the discrete approximating density at 75 is:
g(0) + g(25) + g(50) + g(75)
= F(12.5) + {F(37.5) - F(12.5)} + {F(62.5) - F(37.5)} + {F(87.5) - F(62.5)}
= F(87.5) = 1 - exp[-(87.5/91)^1.4] = 61.2%.
Alternately, the approximating distribution and the Weibull Distribution are equal at the points
midway between the span points of: 50, 75, 100, etc.
The distribution function of the approximating distribution at 75 = the distribution function of the
approximating distribution at 87.5 = the Weibull Distribution at 87.5.
F(x) = 1 - exp[-(x/91)^1.4]. F(87.5) = 1 - exp[-(87.5/91)^1.4] = 61.2%.
Comment: The following diagram might be helpful:
x           0      25     50     75
x+12.5      12.5   37.5   62.5   87.5
F(x+12.5)   6.0%   25.1%  44.6%  61.2%
g(x)        6.0%   19.1%  19.5%  16.6%

9.3. C. E[X ∧ x] = θ(1 - e^(-x/θ)) = 70(1 - e^(-x/70)).
E[X ∧ 55] = 38.0944. E[X ∧ 60] = 40.2939. E[X ∧ 65] = 42.3418.
g(60) = {2E[X ∧ 60] - E[X ∧ 55] - E[X ∧ 65]}/5 = {(2)(40.2939) - 38.0944 - 42.3418}/5 = 3.03%.
9.4. A. g(20000) = F(21000) - F(19000) = Φ[(ln(21000) - 8)/2] - Φ[(ln(19000) - 8)/2] =
Φ[0.98] - Φ[0.93] = 0.8365 - 0.8238 = 0.0127.

9.5. E. E[X ∧ x] = {θ/(α-1)}(1 - {θ/(x+θ)}^(α-1)) = 1000{1 - 1000/(x+1000)} = 1000x/(x+1000).

E[X ∧ 400] = 285.7. E[X ∧ 500] = 333.3. E[X ∧ 600] = 375.
g(500) = {2E[X ∧ 500] - E[X ∧ 400] - E[X ∧ 600]}/100 = {(2)(333.3) - 285.7 - 375}/100 = 5.9%.
9.6. C. E[X ∧ 100] = {1000/(2-1)}(1 - {1000/(1000 + 100)}^(2-1)) = 90.91.
g(0) = 1 - E[X ∧ 100]/100 = 1 - 90.91/100 = 9.09%.
9.7. D. g(400) = F(425) - F(375) = e^(-375/300) - e^(-425/300) = 0.044.
9.8. E. P(z) = e^(λ(z-1)) = e^(0.8(z-1)).
The method of rounding assigns probability to zero of: F(0.5) = 1 - e^(-0.5/3) = 0.153518.
c(0) = P(s(0)) = P(0.153518) = e^(0.8(0.153518-1)) = 0.508045.
9.9. C. The method of rounding assigns probability to 1 of F(1.5) - F(0.5) = e^(-0.5/3) - e^(-1.5/3) =
0.846482 - 0.606531 = 0.239951.
x    F(x+0.5)   s(x)          x    F(x+0.5)   s(x)
0    0.153518   0.153518      7    0.917915   0.032474
1    0.393469   0.239951      8    0.941184   0.023269
2    0.565402   0.171932      9    0.957856   0.016673
3    0.688597   0.123195      10   0.969803   0.011946
4    0.776870   0.088273      11   0.978363   0.008560
5    0.840120   0.063250      12   0.984496   0.006134
6    0.885441   0.045321      13   0.988891   0.004395

For the Poisson, a = 0 and b = λ = 0.8.
c(x) = {1/(1 - a s(0))} Σ_{j=1}^{x} (a + jb/x) s(j) c(x-j) = {1/(1 - (0)(0.153518))} Σ_{j=1}^{x} (0 + 0.8j/x) s(j) c(x-j)
= 0.8 Σ_{j=1}^{x} (j/x) s(j) c(x-j).
c(1) = (.8)(1/1)s(1)c(0) = (.8)(1)(.239951)(.508045) = 0.097525.


9.10. A. c(2) = (.8){(1/2)s(1)c(1) + (2/2)s(2)c(0)} =
(.8){(1/2)(.239951)(.097525) + (1)(.171932)(.508045)} = 0.079240.


9.11. C. c(3) = (.8){(1/3)s(1)c(2) + (2/3)s(2)c(1) + (3/3)s(3)c(0)} =


(.8){(1/3)(.239951)(.079240) + (2/3)(.171932)(.097525) + (3/3)(.123195)(.508045)} =
0.064084.
9.12. B. c(4) = (.8){(1/4)s(1)c(3) + (2/4)s(2)c(2) + (3/4)s(3)c(1) + (4/4)s(4)c(0)} =
(.8){(1/4)(.239951)(.064084) + (2/4)(.171932)(.079240) + (3/4)(.123195)(.097525)+
(4/4)(.088273)(.508045)} = 0.051611.
9.13. A. c(5) = (.8){(1/5)s(1)c(4) + (2/5)s(2)c(3) + (3/5)s(3)c(2) + (4/5)s(4)c(1) + (5/5)s(5)c(0) =
(.8){(1/5)(.239951)(.051611) + (2/5)(.171932)(.064084) + (3/5)(.123195)(.079240) +
(4/5)(.088273)(.097525) + (5/5)(.063250)(.508045)} = 0.0414097.
Comment: The distribution of aggregate losses from 0 to 30 is: 0.508045, 0.0975247, 0.07924,
0.064084, 0.0516111, 0.0414099, 0.033112, 0.0263947, 0.0209802, 0.0166327, 0.0131542,
0.0103797, 0.0081733, 0.00642328, 0.0050387, 0.00394576, 0.00308486, 0.0024081,
0.00187708, 0.00146114, 0.00113589, 0.000881934, 0.000683946, 0.000529806,
0.00040996, 0.000316896, 0.000244715, 0.000188795, 0.00014552, 0.000112066,
0.00008623.

[Figure: the aggregate densities from 0 to 30, plotted on a logarithmic vertical scale running from 0.0001 to 0.1.]


9.14. B. Prior to the policy modifications, F(x) = 1 - {100/(100 + x)}^3.
Let y be the payment per loss.
y = 0 for x ≤ 5.
y = 0.8(x - 5) = 0.8x - 4, for 5 < x ≤ 100.
y = (0.8)(95) = 76, for 100 ≤ x.
Let H(y) be the distribution of the payments per loss.
H(0) = F(5) = 1 - (100/(100 + 5))^3 = 0.1362.
H(y) = F(x) = F((y+4)/0.8) = F(1.25y + 5) = 1 - (100/(105 + 1.25y))^3, for 0 < y < 76.
H(76) = 1.
g(40) = H(42) - H(38) = (100/{105 + (1.25)(38)})^3 - (100/{105 + (1.25)(42)})^3 = 2.6%.
Comment: Note that we apply the modifications first and then discretize.
g(0) = H(2) = 0.1950. Note that at 76 we would include all the remaining probability:
1 - H(74) = {100/(105 + 1.25(74))}^3 = 0.1298.
Here is the whole approximating distribution:
y     H(y+2)   g(y)          y     H(y+2)   g(y)
0     0.1950   0.1950        40    0.7440   0.0260
4     0.2977   0.1026        44    0.7670   0.0229
8     0.3836   0.0859        48    0.7872   0.0203
12    0.4560   0.0724        52    0.8052   0.0180
16    0.5175   0.0615        56    0.8212   0.0160
20    0.5701   0.0526        60    0.8355   0.0143
24    0.6153   0.0452        64    0.8483   0.0128
28    0.6544   0.0391        68    0.8598   0.0115
32    0.6884   0.0340        72    0.8702   0.0104
36    0.7180   0.0297        76             0.1298

9.15. D. F(x) = 1 - {1000/(1000 + x)}^4.
g(500) = F(550) - F(450) = (1000/(1000 + 450))^4 - (1000/(1000 + 550))^4 = 0.053.
9.16. E. g(0) = F(50) = 1 - (1000/(1000 + 50))^4 = 17.7%.

9.17. C. E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln x - μ - σ^2)/σ] + x{1 - Φ[(ln x - μ)/σ]} =
1242.6 Φ[2 ln x - 14.5] + x{1 - Φ[2 ln x - 14]}.
E[X ∧ 1800] = 1242.6 Φ[0.49] + 1800{1 - Φ[0.99]} = (1242.6)(0.6879) + (1800)(1 - 0.8389) = 1144.8.
E[X ∧ 2000] = 1242.6 Φ[0.70] + 2000{1 - Φ[1.20]} = (1242.6)(0.7580) + (2000)(1 - 0.8849) = 1172.1.
E[X ∧ 2200] = 1242.6 Φ[0.89] + 2200{1 - Φ[1.39]} = (1242.6)(0.8133) + (2200)(1 - 0.9177) = 1191.7.
g(2000) = {2E[X ∧ 2000] - E[X ∧ 1800] - E[X ∧ 2200]}/200 = {(2)(1172.1) - 1144.8 - 1191.7}/200 = 3.9%.


9.18. A. Prior to the policy modifications, F(x) = 1 - (100/(100 + x))^3.
Let y be the (non-zero) payment. y is undefined for x ≤ 5.
y = 0.8(x - 5) = 0.8x - 4, for 5 < x ≤ 100. y = (0.8)(95) = 76, for 100 ≤ x.
Let H(y) be the distribution of the non-zero payments.
H(y) = {F(x) - F(5)}/S(5) = {F((y+4)/0.8) - 0.1362}/0.8638 = F(1.25y + 5)/0.8638 - 0.1577 =
{1 - (100/(105 + 1.25y))^3}/0.8638 - 0.1577 = 1 - (100/(105 + 1.25y))^3/0.8638, for 0 < y < 76.
H(76) = 1.
g(60) = H(62) - H(58) = {(100/{105 + (1.25)(58)})^3 - (100/{105 + (1.25)(62)})^3}/0.8638 = 1.7%.
Comment: See the latter portion of Example 9.14 in Loss Models.
Note that we apply the modifications first and then discretize.
g(0) = H(2) = 0.0682. Note that at 76 we would include all the remaining probability:
1 - H(74) = {100/(105 + 1.25(74))}^3/0.8638 = 0.1503.
Here is the whole approximating distribution:
y     H(y+2)   g(y)          y     H(y+2)   g(y)
0     0.0682   0.0682        40    0.7037   0.0301
4     0.1870   0.1188        44    0.7302   0.0265
8     0.2864   0.0994        48    0.7537   0.0234
12    0.3703   0.0839        52    0.7745   0.0208
16    0.4415   0.0712        56    0.7930   0.0185
20    0.5024   0.0609        60    0.8096   0.0166
24    0.5547   0.0523        64    0.8244   0.0148
28    0.5999   0.0452        68    0.8377   0.0133
32    0.6393   0.0393        72    0.8497   0.0120
36    0.6736   0.0343        76             0.1503

9.19. A. E[X ∧ x] = θ(1 - e^(-x/θ)) = 100(1 - e^(-x/100)).
E[X ∧ 25] = 22.120. E[X ∧ 50] = 39.347. E[X ∧ 75] = 52.763. E[X ∧ 100] = 63.212.
g(0) = 1 - E[X ∧ 25]/25 = 1 - 22.120/25 = 0.1152.
g(75) = {2E[X ∧ 75] - E[X ∧ 50] - E[X ∧ 100]}/25 = {(2)(52.763) - 39.347 - 63.212}/25 = 0.1187.
0.1152 + 0.1187 = 0.2339.


Section 10, Analytic Results134


In some special situations the aggregate distribution has a somewhat simpler form.
Closed Under Convolution:
If one adds two independent Gamma Distributions with the same θ, then one gets another Gamma
Distribution with the sum of the α parameters:
Gamma(α1, θ) + Gamma(α2, θ) = Gamma(α1 + α2, θ).
A distribution is closed under convolution, if when one adds independent identically distributed
copies, one gets a member of the same family.
Gamma(α, θ) + Gamma(α, θ) = Gamma(2α, θ).
Thus a Gamma Distribution is closed under convolution.
Distributions that are closed under convolution include: Gamma, Inverse Gaussian, Normal, Binomial,
Poisson, and Negative Binomial.
If severity is Gamma(α, θ), then f_X*n(x) = Gamma(nα, θ), for n ≥ 1.
Gamma(nα, θ) has density: e^(-x/θ) x^(nα-1) / {θ^(nα) Γ(nα)}.
Thus if severity is Gamma, the aggregate distribution can be written in terms of convolutions:
fA(x) = Σ_{n=0}^{∞} fN(n) f_X*n(x) = fN(0){point mass of prob. 1 @ 0} + Σ_{n=1}^{∞} fN(n) e^(-x/θ) x^(nα-1) / {θ^(nα) Γ(nα)}.135
This is particularly useful when α is an integer. α = 1 is an Exponential Distribution.

Exercise: Severity is Exponential. Write a formula for the density of the aggregate distribution.
[Solution: fA(x) = fN(0){point mass of probability 1 @ 0} + Σ_{n=1}^{∞} fN(n) e^(-x/θ) x^(n-1) / {θ^n (n-1)!}.]

134 See Section 9.4 of Loss Models.
135 Where f_X*0(x) has a point mass of probability 1 at zero.
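Here is a Python sketch evaluating this series for a compound Poisson with an Exponential severity (so α = 1 and the n-fold convolution is Gamma(n, θ)); the parameter values λ = 2 and θ = 10 match the setup of problem 10.6 later in this section. The function name and truncation point are mine.

import math

def compound_poisson_exponential_density(x, lam, theta, n_terms=60):
    # Density of the aggregate at x > 0 (the point mass at 0 is exp(-lam)).
    total = 0.0
    for n in range(1, n_terms):
        poisson_n = math.exp(-lam) * lam ** n / math.factorial(n)
        gamma_n = math.exp(-x / theta) * x ** (n - 1) / (theta ** n * math.factorial(n - 1))
        total += poisson_n * gamma_n
    return total

print(round(compound_poisson_exponential_density(30, lam=2.0, theta=10.0), 4))   # about 0.0122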


Geometric-Exponential:136
One interesting special case of the collective risk model has a Geometric frequency and an
Exponential severity.
Exercise: Let frequency be given by a Geometric Distribution with β = 3.
Let severity be given by an Exponential with mean 10.
Frequency and severity are independent.
What is the moment generating function of the aggregate losses?
[Solution: For the Geometric Distribution, P(z) = 1/{1 - β(z - 1)}. For β = 3, P(z) = 1/(4 - 3z).
For the Exponential Distribution, MX(t) = 1/(1 - θt). For θ = 10, MX(t) = 1/(1 - 10t).
MAgg(t) = PN(MX(t)) = 1/{4 - 3/(1 - 10t)} = (1 - 10t)/(4 - 40t - 3) = (1 - 10t)/(1 - 40t).]
Note that (1 - 10t)/(1 - 40t) = {0.25 - 10t + 0.75}/(1 - 40t) = {(1/4)(1 - 40t) + (3/4)}/(1 - 40t)
= (1/4) + (3/4)/(1 - 40t).
This is the weighted average of the m.g.f. of a point mass at zero and the m.g.f. of an Exponential
distribution with mean 40, with weights of 1/4 and 3/4.137 Thus the combination of a Geometric
frequency and an Exponential Severity gives an aggregate loss distribution that is a mixture of a
point mass at zero and an exponential distribution.
In this case, there is a point mass of probability 25% at zero. SA(y) = 0.75 e^(-y/40).
For example, SA(0) = 0.75, and SA(40) = 0.75e^-1 = 0.276.
This aggregate loss distribution is discontinuous at zero.138 This will generally be the case when there
is a chance of zero claims. If instead the frequency distribution has no probability at zero and the
severity is a continuous distribution with support from 0 to ∞, then the aggregate losses will be a
continuous distribution from 0 to ∞.139

136 See Example 9.7 in Loss Models.
137 The m.g.f. is the expected value of exp[xt]. Thus the m.g.f. of a point mass at zero is E[1] = 1. In general, the m.g.f. of a point mass at c is e^(ct). The m.g.f. of an Exponential with θ = 40 is: 1/(1 - 40t).
138 The limit approaching from below zero is not equal to the limit approaching from above zero.
139 The aggregate distribution is discontinuous when the severity distribution is discontinuous.


In general, with a Geometric frequency and an Exponential severity:
MAgg(t) = PN(MX(t)) = 1/[1 - β{MX(t) - 1}] = 1/[1 - β{1/(1 - θt) - 1}] = (1 - θt)/[1 - θt - β{1 - (1 - θt)}]
= (1 - θt)/{1 - θt - βθt} = (1 - θt)/{1 - (1+β)θt} = (1+β)(1 - θt)/[(1+β){1 - (1+β)θt}]
= {1 + β - βθt - θt}/[(1+β){1 - (1+β)θt}] = 1/(1+β) + {β/(1+β)} / {1 - (1+β)θt}.
This is the weighted average of the moment generating function of a point mass at zero and the
moment generating function of an Exponential with mean θ(1+β).
The weights are: 1/(1+β) and β/(1+β).
Therefore:
FAgg(0) = 1/(1+β). FAgg(y) = 1/(1+β) + {β/(1+β)}(1 - e^(-y/{θ(1+β)})) = 1 - {β/(1+β)} e^(-y/{θ(1+β)}).
This mixture is mathematically equivalent to an aggregate situation with a Bernoulli frequency with
q = β/(1+β) and an Exponential Severity with mean θ(1+β).140 In the latter situation there is a
probability of 1/(1+β) of no claim, in which case the aggregate is 0, and a probability of β/(1+β) of
1 claim, and thus the aggregate is an Exponential with mean θ(1+β).

140 This general technique can be applied to a mixture of a point mass at zero and another distribution.
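A Python sketch checking the closed form by simulation, for the earlier example with β = 3 and θ = 10 (a 25% point mass at zero mixed with an Exponential with mean 40). The simulation details are illustrative.

import math, random

beta, theta = 3.0, 10.0
closed_form_cdf = lambda y: 1 - (beta / (1 + beta)) * math.exp(-y / (theta * (1 + beta)))

random.seed(1)
trials, hits, y = 100000, 0, 40.0
for _ in range(trials):
    n = 0                                            # draw a Geometric claim count
    while random.random() < beta / (1 + beta):
        n += 1
    agg = sum(random.expovariate(1 / theta) for _ in range(n))
    hits += (agg <= y)
print(round(hits / trials, 3), round(closed_form_cdf(y), 3))   # both near 0.724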


Negative Binomial-Exponential:141
One can generalize the previous situation to a Negative Binomial frequency with r integer.
The Negative Binomial is a sum of r independent Geometrics each with β. Thus the aggregate is
the sum of r independent situations as before, each of which has a Bernoulli frequency with
q = β/(1+β) and an Exponential Severity with mean θ(1+β). Thus the aggregate is mathematically
the same as a Binomial frequency with m = r and q = β/(1+β) and an Exponential Severity with
mean θ(1+β).
mean (1+).
Exercise: Determine the moment generating function of an aggregate distribution with a Binomial
frequency with m = r and q = β/(1+β) and an Exponential Severity with mean θ(1+β).
[Solution: For a Binomial Distribution, P(z) = {1 + q(z-1)}^m.
For this Binomial Distribution, P(z) = {1 + (β/(1+β))(z-1)}^r = {(1 + β + βz - β)/(1+β)}^r = (1 + βz)^r/(1+β)^r.
For an Exponential Distribution with mean θ, MX(t) = 1/(1 - θt).
For this Exponential Distribution, MX(t) = 1/{1 - θ(1+β)t}.
MAgg(t) = PN(MX(t)) = {1 + β/(1 - θ(1+β)t)}^r / (1+β)^r = {1 - θt - βθt + β}^r / {(1+β)(1 - θt - βθt)}^r
= {(1+β)(1 - θt)}^r / {(1+β)(1 - θt - βθt)}^r = (1 - θt)^r / (1 - θt - βθt)^r.]
141 See Example 9.7 in Loss Models.

2013-4-3,

Aggregate Distributions 10 Analytic Results,

HCM 10/23/12,

Page 279

Exercise: Determine the moment generating function of an aggregate distribution with a
Negative Binomial frequency and an Exponential Severity.
[Solution: For the Negative Binomial Distribution, P(z) = 1/{1 - β(z-1)}^r.
For the Exponential Distribution, MX(t) = 1/(1 - θt).
MAgg(t) = PN(MX(t)) = 1/{1 - β(1/(1 - θt) - 1)}^r = (1 - θt)^r / {1 - θt - β(1 - (1 - θt))}^r
= (1 - θt)^r / {1 - θ(1+β)t}^r.]
We have shown that the moment generating functions are the same; thus proving as stated above,
that with a Negative Binomial frequency with r integer, and an Exponential Severity, the aggregate is
mathematically the same as a Binomial frequency with m = r and q = β/(1+β) and an Exponential
Severity with mean θ(1+β).
In the latter situation, the frequency has finite support, and severity is Exponential, so one can write
the aggregate in terms of convolutions as:
fAgg(x) = fN(0) fX*0(x) + Σ_{n=1}^{m} fN(n) e^(-x/{θ(1+β)}) x^(n-1) / [{θ(1+β)}^n (n-1)!]
= {point mass of prob. 1/(1+β)^r @ 0}
+ Σ_{n=1}^{r} [r! / {n! (r-n)!}] {β^n / (1+β)^r} e^(-x/{θ(1+β)}) x^(n-1) / [{θ(1+β)}^n (n-1)!].
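As a quick numerical illustration (my own sketch, not from the text; the function name and the use of scipy's gamma density for the n-fold convolution of Exponentials are my choices), the following Python snippet evaluates this convolution form of the Negative Binomial-Exponential aggregate density. With r = 3, β = 1.4, and θ = 5 it reproduces the answer to problem 10.5 below.

```python
from math import comb
from scipy.stats import gamma

def nb_exp_aggregate_density(x, r, beta, theta):
    """Density of aggregate losses at x > 0 for Negative Binomial(r, beta) frequency and
    Exponential(theta) severity, via the equivalent Binomial(m=r, q=beta/(1+beta)) frequency
    with Exponential severity of mean theta*(1+beta)."""
    q = beta / (1 + beta)
    scale = theta * (1 + beta)
    dens = 0.0
    for n in range(1, r + 1):                        # the n = 0 term is the point mass at 0
        binom_prob = comb(r, n) * q**n * (1 - q)**(r - n)
        dens += binom_prob * gamma.pdf(x, a=n, scale=scale)  # Gamma(alpha=n) = sum of n Exponentials
    return dens

# Point mass at zero has probability (1 - q)^r = 1/(1+beta)^r.
print(nb_exp_aggregate_density(10, r=3, beta=1.4, theta=5))   # ≈ 0.0263
```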

Problems:
10.1 (1 point) Which of the following distributions is not closed under convolution?
A. Binomial B. Gamma C. Inverse Gaussian
D. Negative Binomial
E. Pareto
10.2 (3 points) Frequency is: Prob[n = 0] = 60%, Prob[n = 1] = 30%, and Prob[n = 2] = 10%.
Severity is Gamma with α = 3 and θ = 10.
Frequency and severity are independent.
Determine the form of the aggregate distribution.
For example, what are the densities of the aggregate distribution at 10, 50, and 100?
10.3 (3 points) Calculate the density at 6 for a Compound Binomial-Poisson frequency distribution
with parameters m = 4, q = 0.6, and λ = 3.
A. 6%   B. 7%   C. 8%   D. 9%   E. 10%
10.4 (3 points) Frequency is Geometric with β = 1.4.
Severity is Exponential with θ = 5.
Frequency and severity are independent.
What is the density of the aggregate distribution at 10?
A. 2.0%   B. 2.5%   C. 3.0%   D. 3.5%   E. 4.0%
10.5 (4 points) Frequency is Negative Binomial with r = 3 and β = 1.4.
Severity is Exponential with θ = 5.
Frequency and severity are independent.
What is the density of the aggregate distribution at 10?
A. 2.2%   B. 2.6%   C. 3.0%   D. 3.4%   E. 3.8%
10.6 (3 points) Frequency is Poisson with λ = 2.
Severity is Exponential with θ = 10.
Frequency and severity are independent.
What is the density of the aggregate distribution at 30?
A. 1.0%   B. 1.2%   C. 1.4%   D. 1.6%   E. 1.8%
10.7 (5A, 11/96, Q.36) The frequency distribution is Geometric with parameter β.
The severity distribution is Exponential with a mean of 1.
Frequency and severity are independent.
(1/2 point) What is the Moment Generating Function of the frequency?
(1/2 point) What is the Moment Generating Function of the severity?
(1 point) What is the Moment Generating Function of the aggregate losses?

10.8 (Course 151 Sample Exam #2, Q.15) (1.7 points)
Aggregate claims has a compound Poisson distribution with λ = ln(4) and individual claim amounts
probability function given by: f(x) = 2^(-x) / {x ln(2)}, x = 1, 2, 3, ...
Which of the following is true about the distribution of aggregate claims?
(A) Binomial with q = 1/2.
(B) Binomial with q = 1/4.
(C) Negative Binomial with r = 2 and β = 1.
(D) Negative Binomial with r = 4 and β = 1.
(E) Negative Binomial with r = 2 and β = 3.

Solutions to Problems:
10.1. E. The sum of two independent Pareto Distributions is not another Pareto Distribution.
10.2. The sum of two of the Gammas is also Gamma, with α = 6 and θ = 10.
Thus the aggregate distribution is:
(0.6)(point mass of prob. 1 @ 0) + (0.3) Gamma(α = 3, θ = 10) + (0.1) Gamma(α = 6, θ = 10).
For y > 0, the density of the aggregate distribution is:
fA(y) = (0.3){y^2 e^(-y/10) / (10^3 Γ(3))} + (0.1){y^5 e^(-y/10) / (10^6 Γ(6))} =
e^(-y/10) {1,500,000 y^2 + 8.3333 y^5} / 10^10.
fA(10) = 0.00555. fA(50) = 0.004281. fA(100) = 0.000446.
Comment: This density integrated from 0 to infinity is 0.4. The remaining 60% of the probability is in
the point mass at zero, corresponding to the probability of zero claims.

10.3. E. When the Binomial primary count is 2, the compound distribution is the sum of two
independent Poisson distributions each with λ = 3, which is Poisson with λ = 6.
Density of the compound at 6 is:
Prob[Binomial = 1] (Density at 6 of a Poisson with λ = 3) +
Prob[Binomial = 2] (Density at 6 of a Poisson with λ = 6) +
Prob[Binomial = 3] (Density at 6 of a Poisson with λ = 9) +
Prob[Binomial = 4] (Density at 6 of a Poisson with λ = 12) =
(0.1536)(0.05041) + (0.3456)(0.16062) + (0.3456)(0.09109) + (0.1296)(0.02548) = 0.0980.
Comment: The Binomial probabilities are:
f(0) = 0.0256, f(1) = 0.1536, f(2) = 0.3456, f(3) = 0.3456, f(4) = 0.1296.
Here is the density of the compound distribution, out to 16:
 x     f*0     λ = 3     λ = 6     λ = 9     λ = 12    Aggregate
 0     1       0.04979   0.00248   0.00012   0.00001   0.034147
 1     0       0.14936   0.01487   0.00111   0.00007   0.028475
 2     0       0.22404   0.04462   0.00500   0.00044   0.051617
 3     0       0.22404   0.08924   0.01499   0.00177   0.070664
 4     0       0.16803   0.13385   0.03374   0.00531   0.084417
 5     0       0.10082   0.16062   0.06073   0.01274   0.093636
 6     0       0.05041   0.16062   0.09109   0.02548   0.098037
 7     0       0.02160   0.13768   0.11712   0.04368   0.097036
 8     0       0.00810   0.10326   0.13176   0.06552   0.090957
 9     0       0.00270   0.06884   0.13176   0.08736   0.081063
10     0       0.00081   0.04130   0.11858   0.10484   0.068967
11     0       0.00022   0.02253   0.09702   0.11437   0.056172
12     0       0.00006   0.01126   0.07277   0.11437   0.043871
13     0       0.00001   0.00520   0.05038   0.10557   0.032891
14     0       0.00000   0.00223   0.03238   0.09049   0.023690
15     0       0.00000   0.00089   0.01943   0.07239   0.016405
16     0       0.00000   0.00033   0.01093   0.05429   0.010929
Sum    1       1.00000   0.99983   0.98889   0.89871   0.982974
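The mixing calculation in this solution is easy to script. Here is a short Python sketch (my own, not from the text; scipy.stats supplies the Binomial and Poisson probabilities) that reproduces the 0.098 density at 6.

```python
from scipy.stats import binom, poisson

def compound_binomial_poisson_pmf(x, m, q, lam):
    """P[aggregate = x] when the primary count is Binomial(m, q) and, given a count n,
    the sum of n secondary Poisson(lam) variables is Poisson(n*lam)."""
    total = binom.pmf(0, m, q) * (1.0 if x == 0 else 0.0)   # n = 0: point mass at zero
    for n in range(1, m + 1):
        total += binom.pmf(n, m, q) * poisson.pmf(x, n * lam)
    return total

print(round(compound_binomial_poisson_pmf(6, m=4, q=0.6, lam=3), 4))   # ≈ 0.0980
```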

10.4. A. In general, with a Geometric frequency and an Exponential severity:
MA(t) = PN(MX(t)) = 1/{1 - β(1/(1 - θt) - 1)} = (1 - θt)/{1 - θ(1+β)t} =
(1+β)(1 - θt)/[(1+β){1 - θ(1+β)t}] = (1 + β - θt - βθt)/[(1+β){1 - θ(1+β)t}] =
{1 - θ(1+β)t}/[(1+β){1 - θ(1+β)t}] + β/[(1+β){1 - θ(1+β)t}] = 1/(1+β) + {β/(1+β)}/{1 - θ(1+β)t}.
This is the weighted average of the moment generating function of a point mass at zero and the
moment generating function of an Exponential with mean θ(1+β).
In this case, the point mass at zero is: 1/(1+β) = 1/2.4 = 5/12, and the Exponential with mean
θ(1+β) = (2.4)(5) = 12 is given weight: β/(1+β) = 1.4/2.4 = 7/12.
Therefore, the density of the aggregate distribution at 10 is: (7/12) e^(-10/12)/12 = 0.0211.
Comment: Similar to Example 9.7 in Loss Models.
10.5. B. The Negative Binomial is the sum of three independent Geometric Distributions with
β = 1.4. In the previous solution, the aggregate was equivalent to a Bernoulli frequency with
q = 7/12 and an Exponential Severity with mean 12.
This is the sum of three independent versions of the previous solution, which is equivalent to a
Binomial frequency with m = 3 and q = 7/12, and an Exponential Severity with mean 12.
For the Binomial, Prob[n = 0] = (5/12)^3 = 0.0723, Prob[n = 1] = (3)(7/12)(5/12)^2 = 0.3038,
Prob[n = 2] = (3)(7/12)^2 (5/12) = 0.4253, Prob[n = 3] = (7/12)^3 = 0.1985.
When n = 1 the aggregate is Exponential with θ = 12, with density e^(-x/12)/12.
When n = 2 the aggregate is Gamma with α = 2 and θ = 12, with density x e^(-x/12)/144.
When n = 3 the aggregate is Gamma with α = 3 and θ = 12, with density x^2 e^(-x/12)/3456.
Therefore, the density of the aggregate distribution at 10 is:
(0.3038) e^(-10/12)/12 + (0.4253)(10) e^(-10/12)/144 + (0.1985)(100) e^(-10/12)/3456 = 0.0606 e^(-10/12)
= 0.0263.
Comment: Similar to Example 9.7 in Loss Models. Beyond what you are likely to be asked.

10.6. B. Since the sum of n Exponentials is a Gamma with α = n, the density of the aggregate at
x > 0 is:
Σ_{n=1}^∞ fN(n) e^(-x/θ) x^(n-1) / {θ^n (n-1)!} = Σ_{n=1}^∞ {e^(-2) 2^n / n!} e^(-x/10) x^(n-1) / {10^n (n-1)!}
= {e^(-(2 + x/10)) / x} Σ_{n=1}^∞ (0.2x)^n / {n! (n-1)!}.
For x = 30 this is: (1/4452.395) Σ_{n=1}^∞ 6^n / {n! (n-1)!} =
(1/4452.395) {6 + 36/2 + 216/12 + 1296/144 + 7776/2880 + 46656/86400 + ...} = 0.0122.
Comment: There is a point mass of probability at zero of: e^(-λ) = e^(-2) = 13.53%.
An example of what is called a Tweedie Distribution, where more generally the severity is
Gamma. The Tweedie distribution is used in Generalized Linear Models. See for example,
A Practitioner's Guide to Generalized Linear Models, by Duncan Anderson, Sholom Feldblum,
Claudine Modlin, Dora Schirmacher, Ernesto Schirmacher and Neeza Thandi,
or A Primer on the Exponential Family of Distributions, by David R. Clark and Charles A.
Thayer, both in the 2004 CAS Discussion Paper Program.
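The series above is also easy to evaluate numerically. The following Python sketch (my own, not from the text; the function name and the truncation at 50 terms are my choices) truncates the sum and reproduces the 0.0122 density at 30.

```python
from math import exp, factorial

def poisson_exp_aggregate_density(x, lam, theta, n_terms=50):
    """Density at x > 0 of aggregate losses with Poisson(lam) frequency and Exponential(theta)
    severity: sum over n of the Poisson probability times the Gamma(n, theta) density."""
    total = 0.0
    for n in range(1, n_terms + 1):
        freq = exp(-lam) * lam**n / factorial(n)
        sev_n = x**(n - 1) * exp(-x / theta) / (theta**n * factorial(n - 1))
        total += freq * sev_n
    return total

print(round(poisson_exp_aggregate_density(30, lam=2, theta=10), 4))  # ≈ 0.0122
# The point mass at zero is exp(-lam) = exp(-2) ≈ 0.1353.
```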
10.7. As shown in Appendix B of Loss Models, for a Geometric frequency P(z) = 1/{1 - β(z-1)}.
For an Exponential with θ = 1, M(t) = 1/(1 - θt) = 1/(1 - t).
For the aggregate losses, MA(t) = PN(MX(t)) = 1/{1 - β(1/(1-t) - 1)} = (1 - t) / (1 - t - βt).

10.8. C. The probability generating function of the aggregate distribution can be written in terms of
the p.g.f. of the frequency and severity: Paggregate(z) = Pfrequency(Pseverity(z)).
The frequency is Poisson, with p.g.f. P(z) = exp[λ(z-1)] = exp[ln(4)(z-1)].
The severity has p.g.f. of P(z) = E[z^x] = (1/ln(2)){z/2 + (z/2)^2/2 + (z/2)^3/3 + (z/2)^4/4 + ...} =
(1/ln(2))(-ln(1 - z/2)) = (1/ln(2)) ln(2/(2 - z)) = (1/ln(2))(ln(2) - ln(2 - z)) = 1 - ln(2 - z)/ln(2).
Paggregate(z) = exp[ln(4)(Pseverity(z) - 1)] = exp[ln(4){-ln(2 - z)/ln(2)}] = exp[-2 ln(2-z)] = (2-z)^(-2).
The p.g.f. of a Negative Binomial is [1 - β(z-1)]^(-r). Comparing probability generating functions, the
aggregate losses are a Negative Binomial with r = 2 and β = 1.
Comments: The severity (or secondary distribution in the compound frequency distribution) is a
Logarithmic distribution as per Appendix B of Loss Models, with β = 1.
Thus it has p.g.f. of P(z) = 1 - ln[1 - β(z-1)]/ln(1+β) = 1 - ln(2-z)/ln(2).
This is a compound Poisson-Logarithmic distribution.
In general, a Compound Poisson-Logarithmic distribution with parameters λ and β is a Negative
Binomial distribution with parameters r = λ/ln(1+β) and β.
In this case r = ln(4)/ln(1+1) = 2 ln(2)/ln(2) = 2.
ln(1 - y) = -y - y^2/2 - y^3/3 - y^4/4 - ..., for |y| < 1, follows from taking a Taylor Series.

Section 11, Stop Loss Premiums


Loss Models discusses stop loss insurance, in which the aggregate losses excess of an
aggregate deductible are being covered.142 The stop loss premium is the expected aggregate
losses excess of an aggregate deductible, or the expected cost for stop loss insurance,
ignoring expenses, taxes, risk loads, etc.
For example, assume Merlin's Mall buys stop loss insurance from Halfmoon Insurance, such that
Halfmoon will pay for any aggregate losses excess of a $100,000 aggregate deductible per year.
Exercise: If Merlin's Mall has aggregate losses of $302,000 in 2003, how much does Halfmoon
Insurance pay?
[Solution: 302,000 - 100,000 = $202,000.
Comment: If instead Merlin's Mall had $75,000 in aggregate losses, Halfmoon would pay nothing.]
In many cases, the stop loss premium just involves the application to somewhat different situations
of mathematical concepts that have already been discussed with respect to a per claim deductible.143
One can have either a continuous or a discrete distribution of aggregate losses.
Discrete Distributions of Aggregate Losses:
Exercise: Assume the aggregate losses in thousands of dollars for Merlin's Mall are approximated
by the following discrete distribution: f(50) = 0.6, f(100) = 0.2, f(150) = 0.1, f(200) = 0.05,
f(250) = 0.03, f(300) = 0.02.
What is the stop loss premium, for a deductible of 100 thousand?
[Solution: For aggregate losses of: 50, 100, 150, 200, 250, and 300, the amounts paid by Halfmoon
Insurance are respectively: 0, 0, 50, 100, 150, 200.
Thus the expected amount paid by Halfmoon is:
(0)(0.6) + (0)(0.2) + (50)(0.1) + (100)(0.05) + (150)(0.03) + (200)(0.02) = 18.5 thousand.]
In general, for any discrete distribution, one can compute the losses excess of d, the stop loss
premium for a deductible of d, by taking a sum of the payments times the density function:
E[(A - d)+] = Σ_{a > d} (a - d) f(a).
142 See Definition 9.3 in Loss Models. Stop loss insurance is mathematically identical to stop loss reinsurance.
Reinsurance is protection insurance companies buy from reinsurers.
143 See Mahler's Guide to Loss Distributions. Those who found Lee Diagrams useful for understanding excess
losses for severity distributions, will probably also find them helpful here.

For example, the stop loss premium for Merlin's Mall with a deductible of 150 thousand is:
(50)(0.05) + (100)(0.03) + (150)(0.02) = 8.5 thousand.
Note that one could arrange this calculation in a spreadsheet as follows:
Aggregate Losses   Probability   Amount Paid by Stop Loss Insurance, Ded. of 150   Amount Paid times Probability
50                 0.6           0                                                 0
100                0.2           0                                                 0
150                0.1           0                                                 0
200                0.05          50                                                2.5
250                0.03          100                                               3
300                0.02          150                                               3
                                                                            Sum    8.5
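For discrete aggregate distributions this whole calculation is a one-line sum. Here is a small Python helper (my own sketch, not from the text; the dictionary representation of the distribution is an assumption) that reproduces the 18.5 and 8.5 figures.

```python
def stop_loss_premium(dist, d):
    """E[(A - d)+]: expected aggregate losses excess of deductible d,
    for a discrete aggregate distribution given as {amount: probability}."""
    return sum((a - d) * p for a, p in dist.items() if a > d)

# Merlin's Mall example (amounts in thousands):
merlins = {50: 0.6, 100: 0.2, 150: 0.1, 200: 0.05, 250: 0.03, 300: 0.02}
print(stop_loss_premium(merlins, 100))   # 18.5
print(stop_loss_premium(merlins, 150))   # 8.5
```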

This technique is generally applicable to stop loss premium calculations involving discrete
distributions of aggregate losses.
Exercise: What is the stop loss premium for Merlins Mall with a deductible of 120 thousand ?
[Solution: (30)(0.1) + (80)(0.05) + (130)(0.03) + (180)(0.02) = 14.5 thousand.
Aggregate
Losses

Probability

Amount Paid by Stop Loss


Insurance, Ded. of 120

Amount paid
times Probability

50
100
150
200
250
300

0.6
0.2
0.1
0.05
0.03
0.02

0
0
30
80
130
180

0
0
3
4
3.9
3.6

Sum

14.5

Note that in this case there is no probability between $100,000 and $150,000. $120,000 is 40% of
the way from $100,000 to $150,000. The stop loss premium for a deductible of $120,000 is
14,500, 40% of the way from the stop loss premium for a deductible of $100,000 to the the stop
loss premium at a deductible of $150,000; 14,500 = (0.6)(18.5) + (0.4)(8.5).
In general, when there is no probability for the aggregate losses in an interval, the stop
loss premium for deductibles in this interval can be gotten by linear interpolation.144
Exercise: What is the stop loss premium for Merlins Mall with a deductible of 140 thousand ?
[Solution: (0.2)(18.5) + (0.8)(8.5) = 10.5 thousand.]

144

See Theorem 9.4 in Loss Models.

Thus for a discrete probability distribution, the excess losses and the excess ratio decline linearly
over intervals in which there is no probability; the slope changes at any point which is part of the
support of the distribution. For continuous distributions, the excess losses and excess ratio decline
faster than linearly; the graphs of the excess losses and excess ratio are concave upwards.145
One can also calculate the stop loss premium as the mean aggregate loss minus the expected
value of the aggregate loss limited to d: E[(A - d)+] = E[A] - E[A ∧ d].
For the prior example, the mean aggregate loss is:
(50)(0.6) + (100)(0.2) + (150)(0.1) + (200)(0.05) + (250)(0.03) + (300)(0.02) = 88.5 thousand.
One would calculate the expected value of the aggregate loss limited to 150 as follows:
Aggregate Losses   Probability   Aggregate Loss Limited to 150   Product
50                 0.6           50                              30
100                0.2           100                             20
150                0.1           150                             15
200                0.05          150                             7.5
250                0.03          150                             4.5
300                0.02          150                             3
                                                          Sum    80
E[(A - 150)+] = E[A] - E[A ∧ 150] = 88.5 - 80 = 8.5.

Recursion Formula:
When one has a discrete distribution with support spaced at regular intervals, then Loss Models
presents a systematic way to calculate the excess losses. As above, assume the aggregate losses
in thousands of dollars for Merlin's Mall are approximated by the following discrete distribution:
f(50) = 0.6, f(100) = 0.2, f(150) = 0.1, f(200) = 0.05, f(250) = 0.03, f(300) = 0.02.
In this case, the density is only positive at a finite number of points, each 50 thousand apart.
Losses excess of 150 thousand = 50,000 Σ_{j=0}^∞ S(150 + 50j)
= 50,000 {S(150) + S(200) + S(250) + S(300)} = 50,000 (0.1 + 0.05 + 0.02 + 0) = 8,500.146
145 Excess losses are the integral of the survival function from x to infinity. With x at the lower limit of integration, the
derivative of the excess losses is -S(x) < 0. The second derivative of the excess losses is f(x) > 0.
146 Which matches the result calculated directly. One could rearrange the numbers that entered into the two
calculations in order to see why the results are equal.

Thus in analogy to the continuous case, where one can write the excess losses as an integral of the
survival function, in the discrete case one can write the excess losses as a sum of survival functions,
times the spacing ΔA:147
E[(A - jΔA)+] = ΔA Σ_{k=0}^∞ S(jΔA + kΔA).
This result can be turned into a recursion formula.
For the above example, losses excess of 150,000 = 50,000 (0.1 + 0.05 + 0.02 + 0) = 8500.
The losses excess of 200,000 = 50,000 (0.05 + 0.02 + 0) = 3500 = 8500 - 5000 =
(losses excess of 150,000) - (50,000)(0.1) = losses excess of 150,000 - ΔA S(150,000).
More generally, we can write the excess losses at the larger deductible, (j+1)ΔA, in terms of those at
the smaller deductible, jΔA, and the Survival Function at the smaller deductible:148
E[(A - (j+1)ΔA)+] = E[(A - jΔA)+] - ΔA S(jΔA).
In other words, in this type of situation, raising the aggregate deductible of the insured by ΔA
eliminates additional losses of ΔA S(jΔA), from the point of view of the insurer.149
This recursion can be very useful if there is some maximum value A can take on. In which case we
could start at the top and work our way down. In the example, we know there are no aggregate
losses excess of 300,000; the stop loss premium for a deductible of 300,000 is zero. Then we
could calculate successively the stop loss premiums for deductibles 250, 200, 150, etc. Then any
other deductibles can be handled via linear interpolation.
However, it is more generally useful to start at a deductible of zero and work one's way up.
The stop loss premium at a deductible of zero is the mean. Usually we would have already
calculated the mean aggregate loss as the product of the mean frequency and the mean severity.
In the example, the mean aggregate loss is:
(50)(0.6) + (100)(0.2) + (150)(0.1) + (200)(0.05) + (250)(0.03) + (300)(0.02) = 88.5 thousand.
Then the stop loss premium at a deductible of 50,000 is: 88.5 - (50)(1) = 38.5 thousand.
The stop loss premium at a deductible of 100 is: 38.5 - (50)(0.4) = 18.5.
147 See Theorem 9.5 in Loss Models. In the corresponding Lee Diagram, the excess losses are a sum of horizontal
rectangles of width S(Ai) and height ΔA.
148 See Corollary 9.6 in Loss Models.
149 Or adds additional losses of ΔA S(jΔA), from the point of view of the insured.

This calculation can be arranged in a spreadsheet as follows:
Deductible   Survival Function   Stop Loss Premium
0            1                   88.5
50           0.4                 38.5
100          0.2                 18.5
150          0.1                 8.5
200          0.05                3.5
250          0.02                1
300          0                   0
Note that the stop loss premium at a deductible of 300 is 0.
In general, the stop loss premium at a deductible of ∞ (or the largest possible aggregate
loss) is zero.
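The recursion is easy to implement. The following Python sketch (my own, not from the text; the dictionary representation and function name are my choices) reproduces the table above, working up from a deductible of zero.

```python
def stop_loss_premiums_by_recursion(dist, step):
    """Stop loss premiums at deductibles 0, step, 2*step, ... for a discrete aggregate
    distribution dist = {amount: probability} with support spaced 'step' apart.
    Uses E[(A - (j+1)*step)+] = E[(A - j*step)+] - step * S(j*step)."""
    max_loss = max(dist)
    premiums = {0: sum(a * p for a, p in dist.items())}   # premium at 0 is the mean
    d = 0
    while d < max_loss:
        surv = sum(p for a, p in dist.items() if a > d)   # S(d)
        premiums[d + step] = premiums[d] - step * surv
        d += step
    return premiums

merlins = {50: 0.6, 100: 0.2, 150: 0.1, 200: 0.05, 250: 0.03, 300: 0.02}
print({d: round(v, 2) for d, v in stop_loss_premiums_by_recursion(merlins, 50).items()})
# {0: 88.5, 50: 38.5, 100: 18.5, 150: 8.5, 200: 3.5, 250: 1.0, 300: 0.0}
```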
Continuous Distributions of Aggregate Losses:
Assume the aggregate annual losses for Halfmoon Insurance are closely approximated by a
LogNormal Distribution with μ = 16 and σ = 1.5.
Exercise: What are the mean aggregate annual losses for Halfmoon?
[Solution: exp(16 + 1.5^2/2) = 27,371,147.]
Exercise: For Halfmoon, what is the probability that the aggregate losses in a year will be larger than
$100,000,000?
[Solution: 1 - Φ[(ln(100,000,000) - 16)/1.5] = 1 - Φ[1.62] = 5.26%.]
Halfmoon Insurance might buy stop loss reinsurance from Global Reinsurance.150
For example, assume Halfmoon buys stop loss reinsurance excess of $100 million.
If Halfmoon's aggregate losses exceed $100 million in any given year, then Global Reinsurance will
pay Halfmoon the amount by which the aggregate losses exceed $100 million.
Exercise: Halfmoon's aggregate losses in 2002 are $273 million.
How much does Global Reinsurance pay Halfmoon?
[Solution: $273 million - $100 million = $173 million.]
150 Also called aggregate excess reinsurance. Stop loss reinsurance is mathematically identical to the
purchase of insurance excess of an aggregate deductible.

Mathematically, the payments by Global Reinsurance are the same as the losses excess of a
deductible or maximum covered loss of $100 million. The expected excess losses are the mean
minus the limited expected value.151 The expected losses retained by Halfmoon are the limited
expected value.
Exercise: What are the expected losses retained by Halfmoon and the expected payments by
Global Reinsurance?
[Solution: For the LogNormal Distribution, the limited expected value is:
E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln x - μ - σ^2)/σ] + x {1 - Φ[(ln x - μ)/σ]}.
E[X ∧ 100,000,000] = (27.37 million) Φ[(ln(100 million) - 16 - 1.5^2)/1.5] +
(100 million) {1 - Φ[(ln(100 million) - 16)/1.5]} =
(27.37 million) Φ[0.11] + (100 million) {1 - Φ[1.61]} =
(27.37 million)(0.5438) + (100 million)(0.0537) = $20.25 million.
Thus Halfmoon retains on average $20.25 million of losses. Global Reinsurance pays on average
E[X] - E[X ∧ 100,000,000] = 27.37 million - 20.25 million = $7.12 million.
Comment: The formula for the limited expected value for the LogNormal Distribution is given in
Appendix A of Loss Models.]
Thus ignoring Global Reinsurance's expenses, etc., the net stop loss premium Global
Reinsurance would charge Halfmoon would be in this case $7.12 million.152
In general, the stop loss premium depends on both the deductible and the distribution of the
aggregate losses. For example, the stop loss premium for a deductible of $200 million would have
been less than that for a deductible of $100 million.
Given the LogNormal Distribution one could calculate variances and higher moments for either the
losses excess of the deductible or below the deductible. One could also do calculations concerning
layers of loss. Mathematically these are the same type of calculations as were performed on
severity distributions.153
151 See Mahler's Guide to Loss Distributions.
152 See Definition 9.3 in Loss Models.
153 See Mahler's Guide to Loss Distributions.
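As a numerical check (my own sketch, not from the text; scipy.stats.norm supplies Φ and the function name is my choice), the LogNormal limited expected value formula above can be coded directly:

```python
from math import exp, log
from scipy.stats import norm

def lognormal_limited_ev(x, mu, sigma):
    """E[X ^ x] for a LogNormal(mu, sigma), using the Loss Models Appendix A form."""
    return (exp(mu + sigma**2 / 2) * norm.cdf((log(x) - mu - sigma**2) / sigma)
            + x * (1 - norm.cdf((log(x) - mu) / sigma)))

mu, sigma = 16.0, 1.5
mean = exp(mu + sigma**2 / 2)                       # ≈ 27.37 million
retained = lognormal_limited_ev(100e6, mu, sigma)   # ≈ 20.25 million
print((mean - retained) / 1e6)                      # stop loss premium ≈ 7.12 (million)
```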

Exercise: What is the variance of the losses retained by Halfmoon?
What is the variance of the payments by Global Reinsurance?
[Solution: For the LogNormal Distribution:
E[(X ∧ x)^2] = exp[2μ + 2σ^2] Φ[{ln(x) - (μ + 2σ^2)}/σ] + x^2 {1 - Φ[{ln(x) - μ}/σ]}.
For μ = 16 and σ = 1.5,
E[(X ∧ 100 million)^2] = 1.122 x 10^15. E[X^2] = exp[2μ + 2σ^2] = 7.109 x 10^15.
E[X ∧ 100 million] = 2.025 x 10^7. E[X] = exp[μ + 0.5σ^2] = 27.37 million.
The variance of Halfmoon's retained losses is:
E[(X ∧ 100 million)^2] - E[X ∧ 100 million]^2 = 1.122 x 10^15 - (2.025 x 10^7)^2 = 7.12 x 10^14.
The second moment of Global's payments is:
E[X^2] - E[(X ∧ 100 m)^2] - 2(100 million){E[X] - E[X ∧ 100 m]} =
7.109 x 10^15 - 1.122 x 10^15 - (2 x 10^8)(2.737 x 10^7 - 2.025 x 10^7) = 4.563 x 10^15.
From the previous solution, the mean of Global's payments is $7.12 million.
Therefore, the variance of Global's payments is: 4.563 x 10^15 - (7.12 x 10^6)^2 = 4.512 x 10^15.]
There is nothing special about the LogNormal Distribution. One could apply the same ideas to the
Uniform, Exponential, or other continuous distributions.
Exercise: Aggregate losses are uniformly distributed on (50, 100).
What is the net stop loss premium for a deductible of 70?
[Solution: Losses excess of 70 = ∫_70^100 (t - 70) f(t) dt = ∫_70^100 (t - 70)/50 dt = 9.
Alternately, for the uniform distribution, E[X ∧ x] = (2xb - a^2 - x^2) / {2(b-a)}, for a ≤ x ≤ b.
E[X ∧ 70] = {2(70)(100) - 50^2 - 70^2} / {2(100 - 50)} = 66.
E[X] = (50 + 100)/2 = 75.
E[X] - E[X ∧ 70] = 75 - 66 = 9.]
For any continuous distribution, F(x), the mean, limited expected value, and therefore the excess
losses can be written as an integral of the survival function S(x) = 1 - F(x).154
E[A] = ∫_0^∞ S(t) dt.
154 See Mahler's Guide to Loss Distributions.

E[A ∧ d] = ∫_0^d S(t) dt.
Losses excess of d = E[A] - E[A ∧ d] = ∫_d^∞ S(t) dt.
Loss Models also uses the notation E[(A - d)+] for the excess losses, where y+ is defined as
0 if y < 0 and y if y ≥ 0.
Losses excess of d = E[(A - d)+] = ∫_d^∞ (t - d) f(t) dt = ∫_d^∞ S(t) dt.
The stop loss premium at 0 is the mean: E[(A - 0)+] = E[A].
The stop loss premium at ∞ is 0: E[(A - ∞)+] = E[0] = 0.
Other Quantities of Interest:
Once one has the distribution of aggregate losses, either discrete or continuous, one can calculate
other quantities than the expected losses excess of an aggregate deductible; i.e., other than the
stop loss premium. Basically any quantity we could calculate for a severity distribution,155 we could
calculate for an aggregate distribution.
For example, one can calculate higher moments. In particular one could calculate the variance of
aggregate losses excess of an aggregate deductible.
155 See Mahler's Guide to Loss Distributions.

Exercise: Assume the aggregate losses in thousands of dollars for Merlin's Mall are approximated
by the following discrete distribution: f(50) = 0.6, f(100) = 0.2, f(150) = 0.1,
f(200) = 0.05, f(250) = 0.03, f(300) = 0.02. Merlin's Mall buys stop loss insurance from Halfmoon
Insurance, such that Halfmoon will pay for any aggregate losses excess of a $100 thousand
deductible per year. What is the variance of payments by Halfmoon?
[Solution: For aggregate losses of: 50, 100, 150, 200, 250, and 300, the amounts paid by Halfmoon
Insurance are respectively: 0, 0, 50, 100, 150, 200.
Thus the expected amount paid by Halfmoon is:
(0)(0.6) + (0)(0.2) + (50)(0.1) + (100)(0.05) + (150)(0.03) + (200)(0.02) = 18.5 thousand.
The second moment is:
(0^2)(0.6) + (0^2)(0.2) + (50^2)(0.1) + (100^2)(0.05) + (150^2)(0.03) + (200^2)(0.02) = 2225 million.
Therefore, the variance is: 2225 million - 342.25 million = 1882.75 million.]
One could calculate the mean and variance of aggregate losses subject to an aggregate limit. The
losses not paid by Halfmoon Insurance due to the aggregate deductible are paid for by the insured,
Merlin's Mall. Thus from Merlin's Mall's point of view, it pays for aggregate losses subject to an
aggregate maximum of $100,000.
Exercise: In the previous exercise, what are the mean and variance of Merlin's Mall's aggregate
losses after the impact of insurance?
[Solution: For aggregate losses of: 50, 100, 150, 200, 250, and 300, the amounts paid by Merlin's Mall
after the effect of insurance are respectively: 50, 100, 100, 100, 100, 100.
Thus the expected amount paid by Merlin's Mall is:
(50)(0.6) + (100)(0.2) + (100)(0.1) + (100)(0.05) + (100)(0.03) + (100)(0.02) = 70 thousand.
The second moment is: (50^2)(0.6) + (100^2)(0.4) = 5500 million.
Therefore, the variance is: 5500 million - 4900 million = 600 million.]
Here is an example of how one can do calculations related to layers of loss.
Assume the aggregate annual losses for Halfmoon Insurance are closely approximated by a
LogNormal Distribution with μ = 16 and σ = 1.5.
If Halfmoon's aggregate losses exceed $100 million in any given year, then Global Reinsurance will
pay Halfmoon the amount by which the aggregate losses exceed $100 million.
However, Global will pay no more than $250 million per year.
Exercise: Halfmoon's aggregate losses in 2002 are $273 million.
How much does Global Reinsurance pay Halfmoon?
[Solution: $273 million - $100 million = $173 million.]

Exercise: Halfmoon's aggregate losses in 2004 are $517 million.
How much does Global Reinsurance pay Halfmoon?
[Solution: $517 million - $100 million = $417 million.
However, Global's payment is limited to $250 million.
Comment: Unless Halfmoon has additional reinsurance, Halfmoon pays $517 - $250 = $267 million
in losses, net of reinsurance.]
Mathematically, the payments by Global Reinsurance are the same as the layer of losses from
$100 to $350 million.
The expected losses for Global are: E[X ∧ 350 million] - E[X ∧ 100 million].156
The expected losses retained by Halfmoon are: E[X] + E[X ∧ 100 million] - E[X ∧ 350 million].
Exercise: What are the expected losses retained by Halfmoon and the expected payments by
Global Reinsurance?
[Solution: For the LogNormal Distribution, the limited expected value is:
E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln x - μ - σ^2)/σ] + x {1 - Φ[(ln x - μ)/σ]}.
For μ = 16 and σ = 1.5:
E[X ∧ 100 million] = $20.25 million. E[X ∧ 350 million] = $25.17 million.
E[X] = exp(μ + σ^2/2) = $27.37 million.
Thus Global Reinsurance pays on average E[X ∧ 350 million] - E[X ∧ 100 million] =
25.17 million - 20.25 million = $4.92 million.
Halfmoon retains on average $27.37 - 4.92 = $22.45 million of losses.]
Similarly, one could calculate the variance of the layers of losses.
The second moment of the layer of loss from d to u is:
E[(X ∧ u)^2] - E[(X ∧ d)^2] - 2d{E[X ∧ u] - E[X ∧ d]}.157
156 See Mahler's Guide to Loss Distributions.
157 See Mahler's Guide to Loss Distributions.

Exercise: What is the variance of payments by Global Reinsurance?
[Solution: For the LogNormal Distribution:
E[(X ∧ x)^2] = exp[2μ + 2σ^2] Φ[{ln(x) - (μ + 2σ^2)}/σ] + x^2 {1 - Φ[{ln(x) - μ}/σ]}.
For μ = 16 and σ = 1.5,
E[(X ∧ 100 million)^2] = 1.122 x 10^15. E[(X ∧ 350 million)^2] = 2.940 x 10^15.
E[X ∧ 100 million] = 2.025 x 10^7. E[X ∧ 350 million] = 2.517 x 10^7.
The second moment of Global's payments is:
E[(X ∧ 350 m)^2] - E[(X ∧ 100 m)^2] - 2(100 million){E[X ∧ 350 m] - E[X ∧ 100 m]} =
2.940 x 10^15 - 1.122 x 10^15 - (2 x 10^8)(2.517 x 10^7 - 2.025 x 10^7) = 8.34 x 10^14.
From the previous solution, the mean of Global's payments is $4.92 million.
Therefore, the variance of Global's payments is: 8.34 x 10^14 - (4.92 x 10^6)^2 = 8.10 x 10^14.
Comment: The formula for the limited moments for the LogNormal Distribution is given in
Appendix A of Loss Models.]
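The layer mean and second moment can be scripted the same way. Here is a hedged Python sketch (my own, not from the text; the general limited moment E[(X ∧ x)^k] for the LogNormal and the function name are my choices) that reproduces the figures above.

```python
from math import exp, log
from scipy.stats import norm

def lognormal_limited_moment(x, mu, sigma, k):
    """E[(X ^ x)^k] for a LogNormal(mu, sigma), k = 1 or 2 (Loss Models Appendix A form)."""
    return (exp(k * mu + k**2 * sigma**2 / 2) * norm.cdf((log(x) - mu - k * sigma**2) / sigma)
            + x**k * (1 - norm.cdf((log(x) - mu) / sigma)))

mu, sigma, d, u = 16.0, 1.5, 100e6, 350e6
layer_mean = lognormal_limited_moment(u, mu, sigma, 1) - lognormal_limited_moment(d, mu, sigma, 1)
layer_2nd = (lognormal_limited_moment(u, mu, sigma, 2) - lognormal_limited_moment(d, mu, sigma, 2)
             - 2 * d * layer_mean)
print(layer_mean / 1e6, layer_2nd - layer_mean**2)   # ≈ 4.92 (million) and ≈ 8.10e14
```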

Problems:
11.1 (1 point) The stop loss premium for a deductible of $1 million is $120,000.
The stop loss premium for a deductible of $1.1 million is $111,000.
Assuming the aggregate losses are very unlikely to be between $1 million and $1.1 million dollars,
what is the stop loss premium for a deductible of $1.08 million?
A. Less than 112,500
B. At least 112,500, but less than 112,600
C. At least 112,600, but less than 112,700
D. At least 112,700, but less than 112,800
E. At least 112,800
11.2 (3 points) The aggregate annual losses have a mean of 13,000 and a standard deviation of
92,000. Approximate the distribution of aggregate losses by a LogNormal Distribution, and then
estimate the stop loss premium for a deductible of 25,000.
A. Less than 7000
B. At least 7000, but less than 7100
C. At least 7100, but less than 7200
D. At least 7200, but less than 7300
E. At least 7300
Use the following information for the next 9 questions:
The aggregate losses have been approximated by the following discrete distribution:
f(0) = 0.3, f(10) = 0.4, f(20) = 0.1, f(30) = 0.08, f(40) = 0.06, f(50) = 0.04, f(60) = 0.02.
11.3 (1 point) What is the mean aggregate loss?
A. Less than 15
B. At least 15, but less than 16
C. At least 16, but less than 17
D. At least 17, but less than 18
E. At least 18
11.4 (1 point) What is the stop loss premium, for a deductible of 10?
A. Less than 4
B. At least 4, but less than 5
C. At least 5, but less than 6
D. At least 6, but less than 7
E. At least 7

11.5 (1 point) What is the stop loss premium, for a deductible of 20?
A. Less than 2
B. At least 2, but less than 3
C. At least 3, but less than 4
D. At least 4, but less than 5
E. At least 5
11.6 (1 point) What is the stop loss premium, for a deductible of 30?
A. Less than 0.5
B. At least 0.5, but less than 1.0
C. At least 1.0, but less than 1.5
D. At least 1.5, but less than 2.0
E. At least 2.0
11.7 (1 point) What is the stop loss premium, for a deductible of 40?
A. Less than 0.25
B. At least 0.25, but less than 0.50
C. At least 0.50, but less than 0.75
D. At least 0.75, but less than 1.00
E. At least 1.00
11.8 (1 point) What is the stop loss premium, for a deductible of 50?
A. Less than 0.1
B. At least 0.1, but less than 0.2
C. At least 0.2, but less than 0.3
D. At least 0.3, but less than 0.4
E. At least 0.4
11.9 (1 point) What is the stop loss premium, for a deductible of 60?
A. Less than 0.1
B. At least 0.1, but less than 0.2
C. At least 0.2, but less than 0.3
D. At least 0.3, but less than 0.4
E. At least 0.4
11.10 (1 point) What is the stop loss premium, for a deductible of 33?
A. Less than 1.4
B. At least 1.4, but less than 1.5
C. At least 1.5, but less than 1.6
D. At least 1.6, but less than 1.7
E. At least 1.7

11.11 (1 point) If the stop loss premium is 3.7, what is the corresponding deductible?
A. Less than 19
B. At least 19, but less than 20
C. At least 20, but less than 21
D. At least 21, but less than 22
E. At least 22
Use the following information for the next three questions:
• Aggregate losses follow an Exponential distribution.
• There is an aggregate deductible of 250.
• An insurer has priced the stop loss insurance assuming a gross premium 30% more
  than the stop loss premium.
11.12 (1 point) What is the stop loss premium if the mean of the Exponential is 100?
A. Less than 8
B. At least 8, but less than 9
C. At least 9, but less than 10
D. At least 10, but less than 11
E. At least 11
11.13 (1 point) What is the stop loss premium if the mean of the Exponential is 110?
A. Less than 12
B. At least 12, but less than 13
C. At least 13, but less than 14
D. At least 14, but less than 15
E. At least 15
11.14 (1 point) If the insurer assumed that the mean of the exponential was 100, but it was actually
110, then what is the ratio of gross premium charged to the (correct) stop loss premium?
A. Less than 94%
B. At least 94%, but less than 95%
C. At least 95%, but less than 96%
D. At least 96%, but less than 97%
E. At least 97%

11.15 (2 points) The stop loss premium at a deductible of 150 is 11.5. The stop loss premium for
a deductible of 180 is 9.1. There is no chance that the aggregate losses are between 140 and 180.
What is the probability that the aggregate losses are less than or equal to 140?
A. Less than 90%
B. 90%
C. 91%
D. 92%
E. More than 92%

11.16 (2 points) The average disability lasts 47 days. The insurer will pay for all days beyond the
first 10. The insurer will only pay for 75% of the cost of the first 10 days. The cost per day is $80.
60% of disabilities are 10 days or less. Assume that those disabilities of 10 days or less are
uniformly distributed from 1 to 10.
What is the expected cost for the insurer per disability?
A. Less than 3580
B. At least 3580, but less than 3600
C. At least 3600, but less than 3620
D. At least 3620, but less than 3640
E. At least 3640
Use the following information for the next 6 questions:
• Aggregate losses for Slippery Rock Insurance have the following distribution:
  f(0) = 47%, f(1) = 10%, f(2) = 20%, f(5) = 13%, f(10) = 6%, f(25) = 3%, f(50) = 1%.
• Slippery Rock Insurance buys aggregate reinsurance from Global Reinsurance.
  Global will pay those aggregate losses in excess of 8 per year.
• Slippery Rock collects premiums equal to 110% of its expected losses prior to
  the impacts of reinsurance.
• Global Reinsurance collects from Slippery Rock Insurance 125% of the losses
  Global Reinsurance expects to pay.
11.17 (1 point) How much premium does Slippery Rock Insurance collect?
A. Less than 3.5
B. At least 3.5, but less than 3.6
C. At least 3.6, but less than 3.7
D. At least 3.7, but less than 3.8
E. At least 3.8
11.18 (1 point) What is the variance of Slippery Rocks aggregate losses, prior to the impact of
reinsurance?
A. Less than 30
B. At least 30, but less than 40
C. At least 40, but less than 50
D. At least 50, but less than 60
E. At least 60

11.19 (1 point) What are the expected aggregate losses for Slippery Rock after the impact of
reinsurance?
A. Less than 1.5
B. At least 1.5, but less than 1.6
C. At least 1.6, but less than 1.7
D. At least 1.7, but less than 1.8
E. At least 1.8
11.20 (1 point) What are the variance of aggregate losses for Slippery Rock after the impact of
reinsurance?
A. Less than 6.5
B. At least 6.5, but less than 6.6
C. At least 6.6, but less than 6.7
D. At least 6.7, but less than 6.8
E. At least 6.8
11.21 (1 point) How much does Slippery Rock pay Global Reinsurance?
A. Less than 1.4
B. At least 1.4, but less than 1.5
C. At least 1.5, but less than 1.6
D. At least 1.6, but less than 1.7
E. At least 1.7
11.22 (1 point) Global Reinsurance in turns buys reinsurance from Cosmos Assurance covering
payments due to Globals contract with Slippery Rock. Cosmos Assurance will reimburse Global
for the portion of its payments in excess of 12. What are Globals expected aggregate losses, after
the impact of its reinsurance with Cosmos?
A. Less than 0.6
B. At least 0.6, but less than 0.7
C. At least 0.7, but less than 0.8
D. At least 0.8, but less than 0.9
E. At least 0.9

11.23 (2 points) The aggregate annual losses follow approximately a LogNormal Distribution with
parameters μ = 9.902 and σ = 1.483.
Estimate the stop loss premium for a deductible of 100,000.
(A) 25,000   (B) 27,000   (C) 29,000   (D) 31,000   (E) 33,000

11.24 (2 points) The aggregate losses for Mercer Trucking are given by a Compound Poisson
Distribution with λ = 3. The mean severity is $10. The net stop loss premium at $25 is $14.2.
The insurer will pay Mercer Trucking a dividend if Mercer Truckings aggregate losses are less than
$25. The dividend will be 30% of the amount by which $25 exceeds Mercer Truckings aggregate
losses. What is the expected value of next years dividend?
A. Less than 2.8
B. At least 2.8, but less than 2.9
C. At least 2.9, but less than 3.0
D. At least 3.0, but less than 3.1
E. At least 3.1
Use the following information for the next 3 questions:
Aggregate Deductible   Stop Loss Premium      Aggregate Deductible   Stop Loss Premium
100,000                2643                   1,000,000              141
150,000                1633                   2,000,000              53
200,000                1070                   3,000,000              26
250,000                750                    4,000,000              15
300,000                563                    5,000,000              10
500,000                293
Assume there is no probability between the given amounts.
11.25 (1 point) A stop loss insurance pays the excess of aggregate losses above 700,000.
Determine the amount the insurer expects to pay.
(A) 200
(B) 210
(C) 220
(D) 230
(E) 240
11.26 (1 point) A stop loss insurance pays the excess of aggregate losses above 250,000
subject to a maximum payment of 750,000. Determine the amount the insurer expects to pay.
(A) 600
(B) 610
(C) 620
(D) 630
(E) 640
11.27 (2 points) A stop loss insurance pays 75% of the excess of aggregate losses above
500,000 subject to a maximum payment of 1,000,000.
Determine the amount the insurer expects to pay.
(A) 140
(B) 150
(C) 160
(D) 170
(E) 180
11.28 (3 points) A manufacturer will buy stop loss insurance with an annual aggregate deductible of
D. If annual aggregate losses are less than D, then the manufacturer will pay its workers a safety
bonus of one third the amount by which annual losses are less than D.
D is chosen so as to minimize the sum of the stop loss premium and the expected bonuses.
What portion of the time will the annual aggregate losses exceed D?
(A) 1/4
(B) 1/3
(C) 2/5
(D) 1/2
(E) 3/5

11.29 (5 points) The Duff Brewery buys Workers' Compensation Insurance.
Duff's annual aggregate losses are LogNormally Distributed with μ = 13.5 and σ = 0.75.
Duff's premiums depend on its actual aggregate annual losses, A.
Premium = 1.05(200,000 + 1.1A), subject to a minimum premium of 500,000 and a maximum
premium of 2,500,000.
What is Duff's average premium?
A. Less than 1.3 million
B. At least 1.3 million, but less than 1.4 million
C. At least 1.4 million, but less than 1.5 million
D. At least 1.5 million, but less than 1.6 million
E. At least 1.6 million
11.30 (Course 151 Sample Exam #1, Q.14) (1.7 points)
Aggregate claims have a compound Poisson distribution with λ = 4, and a severity distribution:
p(1) = 3/4 and p(2) = 1/4.
Determine the stop loss premium at 2.
(A) 3.05   (B) 3.07   (C) 3.09   (D) 3.11   (E) 3.13

11.31 (Course 151 Sample Exam #1, Q.16) (1.7 points) A stop-loss reinsurance pays 80% of
the excess of aggregate claims above 20, subject to a maximum payment of 5.
All claim amounts are non-negative integers.
Let In be the stop loss premium for a deductible of n (and no limit); then you are given:
E[I16] = 3.89   E[I20] = 3.33   E[I24] = 2.84   E[I25] = 2.75   E[I26] = 2.69   E[I27] = 2.65
Determine the total amount of claims the reinsurer expects to pay.


(A) 0.46
(B) 0.49
(C) 0.52
(D) 0.54
(E) 0.56
11.32 (Course 151 Sample Exam #2, Q.8) (0.8 points)
A random loss is uniformly distributed over (0, 80).
Two types of insurance are available.
Type                             Premium
Stop loss with deductible 10     Insurer's expected claim plus 14.6
Complete                         Insurer's expected claim times (1+k)
The two premiums are equal.
Determine k.
(A) 0.07   (B) 0.09   (C) 0.11   (D) 0.13   (E) 0.15

11.33 (Course 151 Sample Exam #2, Q.22) (1.7 points)
Aggregate claims has a compound Negative Binomial distribution with r = 2 and β = 7/3,
and individual claim distribution:
x     p(x)
2     2/3
5     1/3
Determine the stop loss premium at 2.
(A) 11.4   (B) 11.8   (C) 12.2   (D) 12.6   (E) 13.0

11.34 (Course 151 Sample Exam #3, Q.21) (1.7 points)


For aggregate claims S, you are given:
(i) S can only take on positive integer values.
(ii) The stop loss premium at zero is 5/3.
(iii) The stop loss premium at two is 1/6.
(iv) The stop loss premium at three is 0.
Determine fS(1).
(A) 1/6

(B) 7/18

(C) 1/2

(D) 11/18

(E) 5/6

11.35 (5A, 5/94, Q.24) (1 point) Suppose S has a compound Poisson distribution with Poisson
parameter of 2 and E(S) = $200. Net stop-loss premiums with deductibles of $400 and $500 are
$100 and $25, respectively. The premium is $500.
The insurer agrees to pay a dividend equal to the excess of 80% of the premium over the claims.
What is the expected value of the dividend?
A. Less than $200
B. At least $200, but less than $250
C. At least $250, but less than $300
D. At least $300, but less than $350
E. $350 or more
11.36 (5A, 5/94, Q.38) (2 points) Assume that the aggregate claims for an insurer have a
compound Poisson Distribution with lambda = 2. Individual claim amounts are equal to 1, 2, 3 with
probabilities 0.4, 0.3, 0.3, respectively. Calculate the net stop-loss premium for a deductible of 2.
11.37 (CAS9, 11/98, Q.30a) (1 point) Your company has an expected loss ratio of 50%.
You have analyzed year-to-year variation and determined that any particular accident years loss
ratio will be uniformly distributed on the interval 40% to 60%.
If expected losses are $5.0 million on subject premium of $10.0 million, what is the expected value
of losses ceded to an aggregate stop-loss cover with a retention of a 55% loss ratio?

Use the following information for the next two questions:


An aggregate loss distribution has a compound Poisson distribution with expected number
of claims equal to 1.25.
Individual claim amounts can take only the values 1, 2 or 3, with equal probability.
11.38 (Course 3 Sample Exam, Q.14)
Determine the probability that aggregate losses exceed 3.
11.39 (Course 3 Sample Exam, Q.15)
Calculate the expected aggregate losses if an aggregate deductible of 1.6 is applied.
11.40 (3, 5/00, Q.11) (2.5 points) A company provides insurance to a concert hall for losses due
to power failure. You are given:
(i) The number of power failures in a year has a Poisson distribution with mean 1.
(ii) The distribution of ground up losses due to a single power failure is:
x      Probability of x
10     0.3
20     0.3
50     0.4
(iii) The number of power failures and the amounts of losses are independent.
(iv) There is an annual deductible of 30.
Calculate the expected amount of claims paid by the insurer in one year.
(A) 5   (B) 8   (C) 10   (D) 12   (E) 14
11.41 (3 points) In the previous question, calculate the expected amount of claims paid by the
insurer in one year, if there were an annual deductible of 50 rather than 30.
A. 6.5
B. 7.0
C. 7.5
D. 8.0
E. 8.5
11.42 (3, 5/01, Q.19 & 2009 Sample Q.107) (2.5 points)
For a stop-loss insurance on a three person group:
(i) Loss amounts are independent.
(ii) The distribution of loss amount for each person is:
Loss Amount     Probability
0               0.4
1               0.3
2               0.2
3               0.1
(iii) The stop-loss insurance has a deductible of 1 for the group.
Calculate the net stop-loss premium.
(A) 2.00   (B) 2.03   (C) 2.06   (D) 2.09   (E) 2.12

11.43 (3, 5/01, Q.30) (2.5 points)
You are the producer of a television quiz show that gives cash prizes.
The number of prizes, N, and prize amounts, X, have the following distributions:
n     Pr(N = n)          x        Pr(X = x)
1     0.8                0        0.2
2     0.2                100      0.7
                         1000     0.1
You buy stop-loss insurance for prizes with a deductible of 200.
The relative security loading is defined as: (premiums / expected losses) - 1.
The cost of insurance includes a 175% relative security load.
Calculate the cost of the insurance.
(A) 204   (B) 227   (C) 245   (D) 273   (E) 357
11.44 (3, 11/01, Q.18 & 2009 Sample Q.99) (2.5 points) For a certain company, losses follow a
Poisson frequency distribution with mean 2 per year, and the amount of a loss is 1, 2, or 3, each with
probability 1/3. Loss amounts are independent of the number of losses, and of each other.
An insurance policy covers all losses in a year, subject to an annual aggregate deductible
of 2. Calculate the expected claim payments for this insurance policy.
(A) 2.00
(B) 2.36
(C) 2.45
(D) 2.81
(E) 2.96
11.45 (2 points) In the previous question, 3, 11/01, Q.18, let Y be the claim payments for this
insurance policy. Determine E[Y | Y > 0].
(A) 3.5
(B) 3.6
(C) 3.7
(D) 3.8
(E) 3.9
11.46 (3, 11/02, Q.16 & 2009 Sample Q.92) (2.5 points) Prescription drug losses, S, are
modeled assuming the number of claims has a geometric distribution with mean 4, and the amount
of each prescription is 40. Calculate E[(S - 100)+].
(A) 60   (B) 82   (C) 92   (D) 114   (E) 146

11.47 (SOA M, 5/05, Q.18 & 2009 Sample Q.165) (2.5 points) For a collective risk model:
(i) The number of losses has a Poisson distribution with λ = 2.
(ii) The common distribution of the individual losses is:
x     fX(x)
1     0.6
2     0.4
An insurance covers aggregate losses subject to an aggregate deductible of 3.
Calculate the expected aggregate payments of the insurance.
(A) 0.74
(B) 0.79
(C) 0.84
(D) 0.89
(E) 0.94
Comment: I have rewritten slightly this past exam question.
11.48 (2 points) In the previous question, SOA M, 5/05, Q.18, for those cases where the
aggregate payment is positive, what is the expected aggregate payment?
(A) 2.1
(B) 2.3
(C) 2.5
(D) 2.7
(E) 2.9
11.49 (SOA M, 11/05, Q.19 & 2009 Sample Q.206) (2.5 points) In a given week, the number
of projects that require you to work overtime has a geometric distribution with β = 2.
For each project, the distribution of the number of overtime hours in the week is the following:
x      f(x)
5      0.2
10     0.3
20     0.5
The number of projects and number of overtime hours are independent.
You will get paid for overtime hours in excess of 15 hours in the week.
Calculate the expected number of overtime hours for which you will get paid in the week.
(A) 18.5
(B) 18.8
(C) 22.1
(D) 26.2
(E) 28.0
11.50 (SOA M, 11/06, Q.7 & 2009 Sample Q.280) (2.5 points) A compound Poisson claim
distribution has λ = 5 and individual claim amounts distributed as follows:
x     fX(x)
5     0.6
k     0.4     where k > 5
The expected cost of an aggregate stop-loss insurance subject to a deductible of 5 is 28.03.
Calculate k.
(A) 6
(B) 7
(C) 8
(D) 9
(E) 10

Solutions to Problems:
11.1. E. Linearly interpolate: (.2)(120,000) + (.8)(111,000) = 112,800.
Comment: If as is commonly the case, there is some probability that the aggregate losses are
between $1 and $1.1 million, then the stop loss premium at $1.08 million is likely to be somewhat
closer to $111,000 than calculated here.
11.2. E. Set the observed and theoretical first two moments equal:
mean = 13,000 = exp(μ + σ^2/2).
second moment = exp(2μ + 2σ^2) = 92,000^2 + 13,000^2 = 8633 million.
σ^2 = ln(8633 million) - 2 ln(13,000) = 3.933. σ = 1.983.
μ = ln(13,000) - σ^2/2 = 7.507. E[X] = exp(μ + σ^2/2) = 13,000.
E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln x - μ - σ^2)/σ] + x {1 - Φ[(ln x - μ)/σ]}.
E[X ∧ 25,000] = 13,000 Φ[-0.66] + (25,000){1 - Φ[1.32]} =
(13,000)(1 - 0.7454) + (25,000)(1 - 0.9066) = 5645.
E[X] - E[X ∧ 25,000] = 13,000 - 5645 = 7355.
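As a check on this solution, here is a minimal Python sketch (my own, not from the text; scipy.stats.norm supplies Φ) that fits the LogNormal by the method of moments and evaluates the stop loss premium directly.

```python
from math import exp, log, sqrt
from scipy.stats import norm

mean, sd = 13_000.0, 92_000.0
second_moment = sd**2 + mean**2
sigma = sqrt(log(second_moment) - 2 * log(mean))   # ≈ 1.983
mu = log(mean) - sigma**2 / 2                      # ≈ 7.507

d = 25_000.0
limited = (exp(mu + sigma**2 / 2) * norm.cdf((log(d) - mu - sigma**2) / sigma)
           + d * (1 - norm.cdf((log(d) - mu) / sigma)))
print(mean - limited)                              # ≈ 7355, answer E
```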

11.3. A. (0)(0.3) + (10)(0.4) + (20)(0.1) + (30)(0.08) + (40)(0.06) + (50)(0.04) + (60)(0.02) = 14.0.
11.4. E., 11.5. D., 11.6. E., 11.7. D., 11.8. C., and 11.9. A.
The fastest way to do this set of problems is to use the recursion formula, with ΔA the spacing
between support points:
E[(A - (j+1)ΔA)+] = E[(A - jΔA)+] - ΔA S(jΔA).
Deductible   Survival Function   Stop Loss Premium
0            0.7                 14.00
10           0.3                 7.00
20           0.2                 4.00
30           0.12                2.00
40           0.06                0.80
50           0.02                0.20
60           0                   0.00
11.10. D. Since there is no probability between 30 and 40, we can linearly interpolate:
(0.7)(2.00) + (0.3)(0.8) = 1.64.
11.11. D. Linearly interpolating between a deductible of 20 and 30:
d = {(4 - 3.7)(30) + (3.7 - 2)(20)} / (4 - 2) = 21.5.

11.12. B. The mean excess loss for an Exponential is equal to its mean.
The losses excess of 250 are: e(250) S(250) = 100 e^(-250/100) = 8.21.
11.13. A. The mean excess loss for an Exponential is equal to its mean.
The losses excess of 250 are: e(250) S(250) = 110 e^(-250/110) = 11.33.
11.14. B. The insurer would charge a gross premium of: (1.3)(8.21) = 10.67.
The mean excess losses are: 11.33. 10.67 / 11.33 = 0.942.
Comment: Thus the expected excess losses are 94.2% of charged premiums, rather than the
desired: 1/1.3 = 76.9%.
Even a 10% mistake in estimating the mean, had a large effect on excess losses.
This is the same mathematical reason why the inflation rate of excess losses over a fixed limit is
greater than that of the total losses.
11.15. D. Since there is no probability in the interval 150 to 180,
the stop loss premium at 180 = stop loss premium at 150 - (180 - 150)S(150).
Therefore, S(150) = (11.5 - 9.1)/(180 - 150) = .08. Therefore F(150) = 1 - 0.08 = 0.92.
Since there is no probability in the interval (140, 150], F(140) = F(150) = 0.92.
11.16. C. If the insurer paid for everything, then the expected cost = (47)(80) = $3760.
There is the equivalent of a (25%)(80)= $20 per day deductible for the first 10 days.
For those who stay more than 10 days this is $200. For those who stay 10 days or less, they
average: (1+10)/2 = 5.5 days, so the deductible is worth on average: (5.5)($20) = $110.
Weighting together the two cases, the deductible is worth: (60%)(110) + (40%)(200) = $146.
Thus the insurer expects to pay: 3760 - 146 = $3614.
11.17. A. Slippery Rocks expected losses are:
(1)(0.1) + (2)(0.2) + (5)(0.13) + (10)(0.06) + (25)(0.03) + (50)(0.01) = 3.
Thus it charges of a premium of: (1.1)(3) = 3.3.
11.18. C. The second moment of Slippery Rocks losses are:
(1)(0.1) + (4)(0.2) + (25)(0.13) + (100)(0.06) + (625)(0.03) + (2500)(0.01) = 53.9.
Therefore, the variance = 53.9 - 32 = 44.9.
11.19. E. (1)(0.1) + (2)(0.2) + (5)(0.13) + (8)(0.06) + (8)(0.03) + (8)(0.01) = 1.95.
11.20. D. After reinsurance, the second moment of Slippery Rocks losses are:
(1)(0.1) + (4)(0.2) + (25)(0.13) + (64)(0.06) +(64)(0.03) + (64)(0.01) = 10.55.
Therefore, the variance = 10.55 - 1.952 = 6.75.

11.21. A. Global's expected payments are 3 - 1.95 = 1.05.
Therefore, Global charges Slippery Rock: (1.25)(1.05) = 1.31.
11.22. B. Cosmos pays Global when Slippery Rock's losses exceed: 8 + 12 = 20.
Thus Cosmos's expected losses are: (5)(0.03) + (30)(0.01) = 0.45.
Thus Global's expected losses net of reinsurance are 1.05 - 0.45 = 0.60.
11.23. A. E[X] = exp(μ + σ^2/2) = exp(9.902 + 1.483^2/2) = 59,973.
E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln x - μ - σ^2)/σ] + x {1 - Φ[(ln x - μ)/σ]}.
E[X ∧ 100,000] = (59,973) Φ[(ln 100,000 - 9.902 - 1.483^2)/1.483] +
(100,000) {1 - Φ[(ln 100,000 - 9.902)/1.483]} =
(59,973) Φ[-0.40] + (100,000)(1 - Φ[1.09]) = (59,973)(0.3446) + (100,000)(0.1379) = 34,457.
E[X] - E[X ∧ 100,000] = 59,973 - 34,457 = 25.5 thousand.

11.24. A. Let A be the aggregate losses.
The net stop loss premium at 25 is: E[A] - E[A ∧ 25] = 14.2.
Thus E[A ∧ 25] = E[A] - 14.2 = (3)(10) - 14.2 = 15.8.
E[(25 - A)+] = 25 - E[A ∧ 25] = 25 - 15.8 = 9.2.
The dividend is 0.3 (25 - A)+. Therefore, the expected dividend is: (0.3)(9.2) = 2.76.
11.25. D. Linearly interpolating: (0.6)(293) + (0.4)(141) = 232.
11.26. B. In order to get the aggregate layer from 250,000 to 250,000 + 750,000 = 1,000,000,
subtract the stop loss premiums: 750 - 141 = 609.
11.27. D. An aggregate loss of 1,833,333 results in a payment of:
(0.75)(1,833,333 - 500,000) = 1 million.
Thus the insurer pays 75% of the layer from 500,000 to 1.833 million.
In order to get the stop loss premium at 1.833 million, linearly interpolate:
(0.167)(141) + (0.833)(53) = 67.7.
In order to get the aggregate layer from 500,000 to 1,833,333, subtract the stop loss premiums:
293 - 67.7 = 225.3. The insurer pays 75% of this layer: (75%)(225.3) = 169.

11.28. A. The stop loss premium is: E[(A - D)+] = E[A] - E[A ∧ D].
The average amount by which aggregate losses are less than D is: E[(D - A)+] = D - E[A ∧ D].
The stop loss premium plus expected bonus is:
E[(A - D)+] + E[(D - A)+]/3 = E[A] - E[A ∧ D] + (D - E[A ∧ D])/3 = E[A] + D/3 - (4/3) E[A ∧ D].
Note that E[A ∧ D] is the integral of the survival function from 0 to D, and therefore,
d E[A ∧ D] / dD = S(D).
Setting equal to zero the derivative of the stop loss premium plus expected bonus:
1/3 - (4/3) S(D) = 0. S(D) = 1/4.
11.29. A. Premium = 500,000 when 500,000 = 1.05(200,000 + 1.1A), i.e.,
A = (500,000/1.05 - 200,000)/1.1 = 251,082.
So the minimum premium is paid if A ≤ 251,082.
Premium = 2,500,000 when 2,500,000 = 1.05(200,000 + 1.1A), i.e.,
A = (2,500,000/1.05 - 200,000)/1.1 = 1,982,684.
So the maximum premium is paid if A ≥ 1,982,684.
If there were no maximum or minimum premium, then the average premium would be:
1.05(200,000 + 1.1 E[A]).
If there were no minimum premium, then the average premium would be:
1.05(200,000 + 1.1 E[A ∧ 1,982,684]).
Due to the minimum premium, we add to E[A ∧ 1,982,684] the average amount by which losses are
less than 251,082, which is: 251,082 - E[A ∧ 251,082].
Thus the average premiums are:
1.05(200,000 + 1.1{E[A ∧ 1,982,684] + 251,082 - E[A ∧ 251,082]}) =
500,000 + (1.05)(1.1){E[A ∧ 1,982,684] - E[A ∧ 251,082]} =
Minimum Premium + (1.05)(1.1){Layer of Loss from 251,082 to 1,982,684}.
For the LogNormal, E[X ∧ x] = exp(μ + σ^2/2) Φ[(ln x - μ - σ^2)/σ] + x {1 - Φ[(ln x - μ)/σ]}.
E[X ∧ 1,982,684] = exp(13.5 + 0.75^2/2) Φ[(ln 1,982,684 - 13.5 - 0.75^2)/0.75] +
1,982,684 {1 - Φ[(ln 1,982,684 - 13.5)/0.75]} = 966,320 Φ[0.59] + 1,982,684 {1 - Φ[1.33]} =
(966,320)(0.7224) + (1,982,684)(1 - 0.9082) = 880,080.
E[X ∧ 251,082] = exp(13.5 + 0.75^2/2) Φ[(ln 251,082 - 13.5 - 0.75^2)/0.75] +
251,082 {1 - Φ[(ln 251,082 - 13.5)/0.75]} = 966,320 Φ[-2.17] + 251,082 {1 - Φ[-1.42]} =
(966,320)(0.0150) + (251,082)(0.9222) = 246,048.
Therefore, the average premium is:
500,000 + (1.05)(1.1)(880,080 - 246,048) = 1.23 million.
Comment: A simplified example of Retrospective Rating.


11.30. C. The mean severity is: (3/4)(1) + (1/4)(2) = 5/4.


The mean aggregate loss is (5/4)(4) = 5.
The probability that the aggregate losses are zero is the probability of zero claims, which is: e-4.
The probability that the aggregate losses are one is the probability that there is one claim and it is
of size one, which is: (3/4)(4e-4) = 3e-4.
The stop loss premium at 0 is the mean aggregate loss of 5.
We can make use of the recursion: E[(X - (j+1)x)+] = E[(X - jx)+] - x S(jx).
Deductible    Survival Function    Stop Loss Premium
    0              0.9817               5
    1              0.9267               4.0183
    2                                   3.0916

Alternately, the expected value of aggregate losses limited to 2 is:


(0)( e-4) + (1)( 3e-4) + (2)(1 - 4e-4) = 1.9084.
The expected value of aggregate losses unlimited is 5.
Thus the expected value of aggregate losses excess of 2 is: 5 - 1.9084 = 3.0916.
11.31. C. If the insurer has aggregate losses of 26.25, then the reinsurer pays: 0.8(26.25 - 20) = 5.
If the insurer has losses greater than 26.25, then the reinsurer still pays 5. If the insurer has losses
less than 20, then the reinsurer pays nothing. Thus the reinsurer's payments are 80% of the layer of
aggregate losses from 20 to 26.25. The layer from 20 to 26.25 is the difference between the
aggregate losses excess of 20 and those excess of 26.25: I20 - I26.25.
By linear interpolation, I26.25 = 2.68.
Thus the reinsurer's expected payments are: (0.8)(I20 - I26.25) = (0.8)(3.33 - 2.68) = 0.52.
11.32. D. For a uniform distribution on (0, 80), the expected loss is 40.
Thus the premium for complete insurance is (1+k)(40).
For a deductible of 10, the average payment per loss is:
∫ from 10 to 80 of (x - 10)(1/80) dx = (1/80)(x - 10)²/2, evaluated from 10 to 80, = 70²/160 = 30.625.

Thus with a deductible of 10, the premium is: 30.625 + 14.6 = 45.225.
Setting the two premiums equal: (1+k)(40) = 45.225, or k = 0.13.


11.33. C. The mean severity is: (2)(2/3) + (5)(1/3) = 3. The mean frequency is: (2)(7/3) = 14/3.
Thus the mean aggregate loss is: (14/3)(3) = 14.
The aggregate losses are at least 2 if there is a claim.
The chance of no claims is: 1/(1+β)^r = 1/(10/3)² = 0.09.
Thus the expected aggregate losses limited to 2, are: (0)(0.09) + (2)(1 - 0.09) = 1.82.
Thus the expected aggregate losses excess of 2 are: 14 - 1.82 = 12.18.
11.34. C. Since the stop loss premium at 3 is zero, S is never greater than 3.
Since S can only take on positive integer values, S can only be 1, 2, or 3.
Stop loss premium at zero = (1)f(1) + (2)f(2) + (3)f(3) = 5/3.
Stop loss premium at two = (0)f(1) + (0)f(2) + (1)f(3) = 1/6.
Therefore, f(3) = 1/6 and f(1) + 2f(2) = 5/3 - 3/6 = 7/6. Therefore, f(2) = (7/6 - f(1))/2.
Now f(1) + f(2) + f(3) = 1. Therefore, f(1) + (7/6 - f(1))/2 + 1/6 = 1. Solving, f(1) = 1/2.
Comment: For f(1) = 1/2, f(2) = 1/3, and f(3) =1/6,
the stop loss premium at 0 is the mean: (1)(1/2) + (2)(1/3) + (3)(1/6) = 5/3.
The stop loss premium at 2 is: (0)(1/2) + (0)(1/3) + (1/6)(3 - 2) = 1/6.


11.35. D. If the loss is (80%)(500) = 400 or more, then the insurer pays no dividend.
If the loss S is less than 400, then the insurer pays a dividend of 400 - S.
Thus the dividend is 400 - S when S ≤ 400, and is zero when S ≥ 400.
The net stop loss premium at 400 is: E[zero when S ≤ 400 and S - 400 when S ≥ 400].
Dividend + S - 400 is zero when S ≤ 400, and S - 400 when S ≥ 400.
Therefore, E[dividend + S - 400] = net stop loss premium at 400.
E[dividend] + E[S] - 400 = net stop loss premium at 400.
E[dividend] = 400 + (net stop loss premium at 400) - E[S] = 400 + 100 - 200 = 300.
Comment: Somewhat similar to 3, 5/00, Q.30 and Course 151 Sample Exam #1, Q.16.
When the dividend is the excess of y over the aggregate loss, then the expected dividend is:
y + net stop loss premium at y - mean aggregate loss. In the following Lee Diagram, applied to
the distribution of aggregate losses, Area A is the average dividend, Area C is the net stop loss
premium at 400. Area B + C is the average aggregate loss. Therefore, Area B = Average
aggregate loss - stop loss premium at 400. Area A + Area B = 400.
Therefore, 400 = average dividend + average aggregate loss - stop loss premium at 400.

[Lee Diagram of the aggregate loss distribution omitted: it shows Areas A, B, and C and the level 400.]

11.36. The average severity is (1)(.4) + (2)(.3) + (3)(.3) = 1.9.


The average aggregate losses are (1.9)(2) = 3.8.
The only way the aggregate losses can be zero is if there are no claims, which has probability
e-2 = 0.1353.
The only way the aggregate losses can be 1 is if there is one claim of size 1, which has probability:
(0.4)(2e-2) = 0.1083.
Thus E[A ∧ 2] = (0)(0.1353) + (1)(0.1083) + (2)(1 - 0.1353 - 0.1083) = 1.6212.
Thus the net stop-loss premium at 2 is: E[A] - E[A ∧ 2] = 3.8 - 1.62 = 2.18.


11.37. The expected loss ratio excess of 55% is:
∫ from 55% to 60% of (x - 55%) / 20% dx = 0.625%.
The corresponding premium is: ($10.0 million)(0.625%) = $62,500.
11.38. We need to calculate the density of the aggregate losses at 0, 1, 2 and 3, then sum them
and subtract from unity.
The aggregate losses are 0 if there are no claims; f(0) = e-1.25. The aggregate losses are 1 if there
is a single claim of size 1; f(1) = (1/3)(1.25)e-1.25. The aggregate losses are 2 if either there is a
single loss of size 2 or there are two losses each of size 1;
f(2) = (1/3)(1.25)e-1.25 + (1/9)(1.25²/2)e-1.25. The aggregate losses are 3 if either there is a single
loss of size 3, there are two losses of sizes 1 and 2 or 2 and 1, or there are three losses each of size
1; f(3) = (1/3)(1.25)e-1.25 + (2/9)(1.25²/2)e-1.25 + (1/27)(1.25³/6)e-1.25.
Thus, f(0) + f(1) + f(2) + f(3) = e-1.25 {1 + 1.25 + (1.25²/6) + (1.25³/162)} = 0.723.
Thus the chance that the aggregate losses are greater than 3 is: 1 - 0.723 = 0.277.
Alternately, one can compute the convolutions of the severity distribution and weight them together
using the Poisson probabilities of various numbers of claims.
For example, (f*f*f)(7) = Σx (f*f)(7-x) f(x) = (1/9)(1/3) + (2/9)(1/3) + (3/9)(1/3) = 6/27 = 0.2222.
Note that I have shown more than is necessary in order to answer this question. One need only
calculate up to the f*f*f and only for values up to 3. I have not shown the aggregate distribution for
larger values, since that would require the calculation of higher convolutions.
Poisson Probability:   0.2865   0.3581   0.2238   0.0933   0.0291
Number of Claims:           0        1        2        3        4

Dollars of Loss     f*0       f       f*f     f*f*f   f*f*f*f    Aggregate Distribution
       0          1.0000   0.0000   0.0000   0.0000      0             0.2865
       1                   0.3333   0.0000   0.0000      0             0.1194
       2                   0.3333   0.1111   0.0000      0             0.1442
       3                   0.3333   0.2222   0.0370      0             0.1726
       4                            0.3333   0.1111   0.0123           0.0853
       5                            0.2222   0.2222   0.0494           N.A.

Then the chance of aggregate losses of 0, 1, 2 or 3 is: .2865 + .1194 + .1442 + .1726 = .7227.
Thus the chance that the aggregate losses are greater than 3 is: 1- .723 = 0.277.
Alternately, we can use the Panjer Algorithm, since this is a compound Poisson Distribution. The
severity distribution is s(1) = s(2) = s(3) = 1/3.
The p.g.f. of a Poisson is P(z) = exp[λ(z - 1)]. s(0) = severity distribution at zero = 0.


c(0) = Pf(s(0)) = p.g.f. of frequency dist. at (density of severity distribution at zero) = exp[1.25(0 - 1)] =
0.2865. For the Poisson Distribution, a = 0 and b = λ = 1.25.
c(x) = {1/(1 - a s(0))} Σj=1 to x (a + jb/x) s(j) c(x-j) = (1.25/x) Σj=1 to x j s(j) c(x-j).
c(1) = (1.25/1)(1) s(1) c(0) = (1.25)(1/3)(0.2865) = 0.1194.
c(2) = (1.25/2) {(1)(1/3)(0.1194) + (2)(1/3)(0.2865)} = 0.1442.
c(3) = (1.25/3) {(1)(1/3)(0.1442) + (2)(1/3)(0.1194) + (3)(1/3)(0.2865)} = 0.1726.
Then the chance of aggregate losses of 0, 1, 2 or 3 is: 0.2865 + 0.1194 + 0.1442 + 0.1726 = 0.7227.
Thus the chance that the aggregate losses are greater than 3 is: 1 - 0.723 = 0.277.
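As a quick editorial illustration (not part of the original solution), the Panjer recursion above is easy to carry out by machine. The minimal Python sketch below, with illustrative variable names, reproduces the numbers just computed:

    import math

    # Panjer recursion for a compound Poisson: a = 0, b = lambda = 1.25.
    lam = 1.25
    s = {1: 1/3, 2: 1/3, 3: 1/3}                    # severity density; s(0) = 0

    c = [math.exp(lam * (0.0 - 1.0))]               # c(0) = P_f(s(0)) = e^(-1.25)
    for x in range(1, 4):
        c.append((lam / x) * sum(j * s.get(j, 0.0) * c[x - j] for j in range(1, x + 1)))

    print([round(v, 4) for v in c])                 # [0.2865, 0.1194, 0.1442, 0.1726]
    print(round(1 - sum(c), 3))                     # 0.277 = chance the aggregate losses exceed 3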
11.39. In the absence of a deductible, the mean aggregate losses are:
(average frequency)(average severity) = 1.25(2) = 2.5. In the previous solution, we calculated f(0)
= .2865, and f(1) = .1194. Therefore, the limited expected value at 1.6 of the aggregate losses is:
(0)f(0) + (1)f(1) + 1.6{1 - (f(0) + f(1))} = 1.6 - 1.6 f(0) - 0.6 f(1) =
1.6 - (1.6)(0.2865) - (0.6)(0.1194) = 1.07. Thus the average aggregate losses with the deductible of
1.6 are: E[A] - E[A ∧ 1.6] = 2.5 - 1.07 = 1.43.


11.40. E. The mean severity is (0.3)(10) + (0.3)(20) + (0.4)(50) = 29. The mean frequency is 1.
Therefore, prior to a deductible the mean aggregate losses are: (1)(29) = 29.
The probability of no claims is: e-1 = 0.3679. The probability of one claim is: e-1 = 0.3679.
The probability of two claims is: e-1/2 = 0.1839. Therefore, the probability of no aggregate losses is
0.3679. Aggregate losses of 10 correspond to one power failure costing 10, with probability
(0.3)(0.3679) = 0.1104. Aggregate losses of 20 correspond to either one power failure costing 20,
or two power failures each costing 10, with probability: (0.3)(0.3679) + (0.3²)(0.1839) = 0.1269.
Thus the chance of aggregate losses of 30 or more is: 1 - (0.3679 + 0.1104 + 0.1269) = 0.3948.
Therefore, the limited expected value of aggregate losses at 30 is:
(0)(0.3679) + (10)(0.1104) + (20)(0.1269) + (30)(0.3948) = 15.49.
Thus the losses excess of 30 are: 29 - 15.49 = 13.5.
Alternately, one could use the Panjer Algorithm (Recursive Method) to get the distribution of
aggregate losses. Since the severity distribution has support 10, 20, 50, we use a span of 10:
10 → 1, 20 → 2, 30 → 3, ...
For the Poisson, a = 0, b = λ = 1, and P(z) = exp[λ(z - 1)].
c(0) = Pf(s(0)) = Pf(0) = exp[1(0 - 1)] = 0.3679.
c(x) = {1/(1 - a s(0))} Σj=1 to x (a + jb/x) s(j) c(x-j) = (1/x) Σj=1 to x j s(j) c(x-j).
c(1) = (1/1)(1) s(1) c(0) = (0.3)(0.3679) = 0.1104.
c(2) = (1/2){(1)s(1)c(1) + (2)s(2)c(0)} = (1/2){(0.3)(0.1104) + (2)(0.3)(0.3679)} = 0.1269.
One can also calculate the distribution of aggregate losses using convolutions.
For the severity distribution, s* s(20) = 0.09, s* s(30) = 0.18, s* s(40) = 0.09, s* s(60) = 0.24,
s* s(70) = 0.24, and s* s(100) = 0.16.
Number of Losses:         0        1        2
Poisson Frequency:   0.3679   0.3679   0.1839

Aggregate Losses    (0 losses)      s       s*s      Aggregate Distribution
       0                1           0                        0.3679
      10                          0.3                        0.1104
      20                          0.3      0.09              0.1269


Once one has the distribution of aggregate losses, one can use the recursion formula:
E[(A - (j+1)ΔA)+] = E[(A - jΔA)+] - ΔA S(jΔA).
Deductible    Survival Function    Stop Loss Premium
     0             0.6321               29
    10             0.5217               22.679
    20             0.3948               17.462
    30                                  13.514

Alternately, once one has the aggregate distribution, one can calculate the expected amount not paid
by the stop loss insurance as follows:
Aggregate      Probability    Amount Not Paid by Stop Loss    Product of
  Losses                      Insurance, Ded. of 30           Col. B & Col. C
     0            0.3679                 0                         0
    10            0.1104                10                         1.104
    20            0.1269                20                         2.538
30 or more        0.3948                30                        11.844
                                                           Sum:   15.486

Since the mean aggregate loss is 29, the expected amount paid by the stop loss insurance is:
29 - 15.486 = 13.514.
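As an editorial aside (not part of the original solution), the stop loss recursion used in the table above can be checked with a few lines of Python; the names below are illustrative only:

    # E[(A - (j+1)h)+] = E[(A - jh)+] - h S(jh), with span h = 10.
    h = 10
    agg = {0: 0.3679, 10: 0.1104, 20: 0.1269}       # P[A >= 30] is the remaining 0.3948
    slp, surv = [29.0], 1.0                         # stop loss premium at deductible 0 is the mean, 29

    for j in range(3):                              # deductibles 10, 20, 30
        surv -= agg[j * h]                          # S(j h)
        slp.append(slp[-1] - h * surv)

    print([round(v, 3) for v in slp])               # [29.0, 22.679, 17.462, 13.514]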


11.41. A. The densities of the Poisson frequency are:
Number of claims:     0        1        2        3        4
Density:           0.3679   0.3679   0.1839   0.0613   0.0153
Aggregate losses of 30 correspond to either two power failures costing 10 and 20, or three power
failures each costing 10, with probability: (2)(0.3²)(0.1839) + (0.3³)(0.0613) = 0.0348.
Aggregate losses of 40 correspond to either two power failures costing 20 each, three failures
costing 10, 10 and 20, or four power failures each costing 10, with probability:
(0.3²)(0.1839) + (3)(0.3³)(0.0613) + (0.3⁴)(0.0153) = 0.0216.
Thus the chance of aggregate losses of 50 or more is:
1 - (0.3679 + 0.1104 + 0.1269 + 0.0348 + 0.0216) = 0.3384.
Therefore, the limited expected value of aggregate losses at 50 is:
(0)(0.3679) + (10)(0.1104) + (20)(0.1269) + (30)(0.0348) + (40)(0.0216) + (50)(0.3384) =
22.47.
Thus the losses excess of 50 are: 29 - 22.47 = 6.5.
Alternately, one could use the Panjer Algorithm to get the distribution of aggregate losses.
Continuing from the previous solution:
c(3) = (1/3){(1)s(1)c(2) + (2)s(2)c(1) + (3)s(3)c(0)} =
(1/3) {(.3)(.1269) + (2)(.3)(.1104) + (3)(0)(.3679)} = 0.0348.
c(4) = (1/4){(1)s(1)c(3) + (2)s(2)c(2) + (3)s(3)c(1) + 4s(4)c(0)} =
(1/4) {(.3)(.0348) + (2)(.3)(.1269) + (3)(0)(.1104) + (4)(0)(.3679)} = 0.0216.
One can also calculate the distribution of aggregate losses using convolutions.
Number of Losses:         0        1        2        3        4
Poisson Frequency:   0.3679   0.3679   0.1839   0.0613   0.0153

Aggregate Losses      s      s*s     s*s*s    s*s*s*s    Aggregate Distribution
        0             0                                          0.3679
       10            0.3                                         0.1104
       20            0.3     0.09                                0.1269
       30             0      0.18     0.027                      0.0348
       40             0      0.09     0.081     0.0081           0.0216

Once one has the distribution of aggregate losses, one can use the recursion formula:
E[(A - (j+1)ΔA)+] = E[(A - jΔA)+] - ΔA S(jΔA).
Deductible    Survival Function    Stop Loss Premium
     0             0.6321               29.000
    10             0.5217               22.679
    20             0.3948               17.462
    30             0.3600               13.514
    40             0.3384                9.914
    50                                   6.530


11.42. C. For each person the mean is: (0.4)(0) + (0.3)(1) + (0.2)(2) + (0.1)(3) = 1.
Therefore, the overall mean for 3 people is: (3)(1) = 3.
Prob(Aggregate loss = 0) = 0.4³ = 0.064.
Therefore, the limited expected value at 1 is: (0.064)(0) + (1 - 0.064)(1) = 0.936.
Net Stop Loss Premium at 1 is: Mean - Limited Expected Value at 1 = 3 - 0.936 = 2.064.
11.43. D. The probability of zero loss: Prob(n = 1)Prob(x = 0) + Prob(n = 2)Prob(x = 0)² =
(0.8)(0.2) + (0.2)(0.2)² = 0.168. The probability of an aggregate loss of 100 is:
Prob(n = 1)Prob(x = 100) + Prob(n = 2)(2)Prob(x = 0)Prob(x = 100) =
(0.8)(0.7) + (0.2)(2)(0.2)(0.7) = 0.616. Therefore, the probability that the aggregate losses are 200 or
more is: 1 - (0.168 + 0.616) = 0.216.
Therefore, E[A ∧ 200] = (0.168)(0) + (0.616)(100) + (0.216)(200) = 104.8.
Mean frequency is: (0.8)(1) + (0.2)(2) = 1.2.
Mean severity is: (0.2)(0) + (0.7)(100) + (0.1)(1000) = 170.
Mean aggregate loss is: (1.2)(170) = 204.
Stop loss premium is: E[A] - E[A ∧ 200] = 204 - 104.8 = 99.2.
With a relative security loading of 175%, the insurance costs: (1 + 1.75)(99.2) = 273.
Alternately, the probability of 2000 in aggregate loss is:
Prob(n = 2)Prob(x = 1000)² = (0.2)(0.1²) = 0.002.
The probability of 1100 in aggregate loss is:
Prob(n = 2)(2)Prob(x = 100)Prob(x = 1000) = (.2)(2)(0.7)(0.1) = 0.028.
The probability of 1000 in aggregate loss is:
Prob(n = 2)(2)Prob(x = 0)Prob(x = 1000) + Prob(n = 1)Prob(x = 1000) =
(.2)(2)(0.2)(0.1) + (0.8)(0.1) = 0.088.
These are the only possible aggregate values greater than 200.
Therefore, the expected aggregate loss excess of 200 is:
(2000 - 200)(.002) + (1100 - 200)(.028) + (1000 - 200)(0.088) = 99.2.
With a relative security loading of 175%, the insurance costs: (1 + 1.75)(99.2) = 273.
11.44. B. Prob[0 claims] = e-2. Prob[1 claim] = 2e-2. Prob[aggregate = 0] = Prob[0 claims] = e-2.
Prob[aggregate = 1] = Prob[1 claim] Prob[size = 1] = (2 e-2) (1/3) = 2e-2/3.
Limited Expected Value of Aggregate at 2 = (0)e-2 + (1)2e-2/3 + (2){1- (e-2 + 2e-2/3)} =
2 - 8e-2/3. Mean Severity = (1 + 2 + 3)/3 = 2. Mean Aggregate Loss = (2)(2) = 4.
Expected Excess of 2 = 4 - (2 - 8e-2/3) = 2 + 8e-2/3 = 2.36.
Alternately, let A = aggregate loss. E[A] = (2)(2) = 4.
E[(A-1)+] = E[A] - S(0) = 4 - (1 - e-2).
E[(A-2)+] = E[(A-1)+] - S(1) = 4 - (1 - e-2) - (1 - e-2 - 2e-2/3) = 2 + 8e-2/3 = 2.36.


11.45. B. From the previous solution, Prob[aggregate = 0] = Prob[0 claims] = e-2.


Prob[aggregate = 1] = Prob[1 claim] Prob[size = 1] = 2e-2/3.
Now the aggregate can be two if there are 2 claims of size 1 or 1 claim of size 2.
Prob[aggregate = 2] = (2² e-2/2)(1/3)² + (2 e-2)(1/3) = 8e-2/9.
Thus the probability of a zero total payment by the insurer is: e-2 + 2e-2/3 + 8e-2/9 = 0.3459.
From the previous solution, expected claim payments are 2.36.
Thus the expected claim payments for this insurance policy when it is positive is:
2.36 / (1 - 0.3459) = 3.61.
11.46. C. For a geometric with β = 4: f(0) = 1/5 = 0.2, f(1) = 0.8f(0) = 0.16, f(2) = 0.8f(1) = 0.128.
E[S] = (4)(40) = 160. E[S ∧ 100] = 0f(0) + 40f(1) + 80f(2) + 100{1 - (f(0) + f(1) + f(2))} =
(40)(0.16) + (80)(0.128) + (100){1 - (0.2 + 0.16 + 0.128)} = 67.84.
E[(S - 100)+] = E[S] - E[S ∧ 100] = 160 - 67.84 = 92.16.
11.47. A. Prob[Agg = 0] = e-2 = 0.1353. Prob[Agg = 1] = 2e-2(0.6) = 0.1624.
Prob[Agg = 2] = Prob[1 loss of size 2 or 2 losses of size 1] = 2e-2(0.4) + (2² e-2/2)(0.6²) = 0.2057.
E[A ∧ 3] = 0.1624 + (2)(0.2057) + (3)(1 - 0.1353 - 0.1624 - 0.2057) = 2.0636.
E[A] = (mean frequency)(mean severity) = (2)(1.4) = 2.8.
E[(A - 3)+] = E[A] - E[A ∧ 3] = 2.8 - 2.0636 = 0.7364.
Comment: The Exam Committee meant to say "subject to an aggregate deductible of 3."
11.48. B. From the previous solution, Prob[Agg = 0] = e-2 = 0.1353,
Prob[Agg = 1] = 2e-2(0.6) = 0.1624,
Prob[Agg = 2] = Prob[1 loss of size 2 or 2 losses of size 1] = 2e-2(0.4) + (2² e-2/2)(0.6²) = 0.2057.
The aggregate can be three if: 3 claims of size 1, or one claim of size 1 and one claim of size 2.
Prob[Agg = 3] = (2³ e-2/6)(0.6³) + (2² e-2/2){(2)(0.6)(0.4)} = 0.1689.
Thus the chance the insurer makes a positive payment is:
1 - (0.1353 + 0.1624 + 0.2057 + 0.1689) = 0.3277.
From the previous solution, the expected aggregate payment is 0.7364.
Thus the average of the positive aggregate payments is: 0.7364 / 0.3277 = 2.247.
Comment: In the exam question we are determining the average aggregate payment that the
insurer makes in a year, including those years in which the aggregate payment is zero. In contrast, in
this followup question we restrict our attention to only those years where the insurer makes a
positive payment.


11.49. B. For a Geometric Distribution with β = 2:
f(0) = 1/3, f(1) = (2/3)f(0) = 2/9, f(2) = (2/3)f(1) = 4/27.
The mean of the distribution of overtime hours is: (5)(.2) + (10)(.3) + (20)(.5) = 14.
The mean aggregate is: (2)(14) = 28.
Prob[Agg = 0] = Prob[0 projects] = 1/3.
Prob[Agg = 5] = Prob[1 project]Prob[5 overtime] = (2/9)(.2) = .04444.
Prob[Agg = 10] = Prob[1 project]Prob[10 overtime] + Prob[2 projects]Prob[5 overtime]² =
(2/9)(0.3) + (4/27)(0.2)² = 0.07259.
E[Agg ∧ 15] = (0)(1/3) + (5)(0.04444) + (10)(0.07259) + 15(1 - 1/3 - 0.04444 - 0.07259) = 9.19.
Expected overtime in excess of 15 is:
Mean[Agg] - E[Agg ∧ 15] = 28 - 9.19 = 18.81.
Alternately, one can use a recursive method, with steps of 5.
As above, E[A] = 28. Also get the first few values of the aggregate distribution as above.
E[(A - 5)+] = E[A] - 5SA(0) = 28 - (5)(1 - 1/3) = 24.667.
E[(A - 10)+] = E[(A - 5)+] - 5SA(5) = 24.667 - (5)(1 - 1/3 - 0.04444) = 21.556.
E[(A - 15)+] = E[(A - 10)+] - 5SA(10) = 21.556 - (5)(1 - 1/3 - 0.04444 - 0.07259) = 18.81.
11.50. D. Let A be the aggregate loss. E[A] = (5){(0.6)(5) + 0.4k} = 15 + 2k.
Prob[A = 0] = Prob[0 claims] = e-5. Prob[A ≥ 5] = Prob[at least 1 claim] = 1 - e-5.
E[A ∧ 5] = (0)e-5 + 5(1 - e-5) = 5 - 5e-5.
28.03 = E[(A - 5)+] = E[A] - E[A ∧ 5] = 10 + 2k + 5e-5. ⇒ k = (18.03 - 5e-5)/2 = 9.
Comment: Given the output, solve for the missing input.


Section 12, Important Formulas and Ideas


Introduction (Section 1)
The Aggregate Loss is the total dollars of loss for an insured or a set of insureds.
Aggregate Losses = (Exposures)(Frequency)(Severity).
If one is not given the frequency per exposure, but is rather just given the frequency for the whole
number of exposures, whatever they are for the particular situation, then
Aggregate Losses = (Frequency)(Severity).
Loss Models' list of advantages of separately analyzing frequency and severity:
1. The number of claims changes as the volume of business changes.
2. The effects of inflation can be incorporated.
3. One can adjust the severity distribution for changes in deductibles, maximum covered loss, etc.
4. One can adjust frequency for changes in deductibles.
5. One can appropriately combine data from policies with different deductibles and
maximum covered losses into a single severity distribution.
6. One can create consistent models for the insurer, insured, and reinsurer.
7. One can analyze the tail of the aggregate losses by separately analyzing the tails of
the frequency and severity.
Loss Models recommends for modeling aggregate losses, infinitely divisible frequency distributions
and severity distributions that are members of scale families.
Convolutions (Section 2)
Convolution calculates the density or distribution function of the sum of two independent variables.
There are discrete and continuous cases.
Discrete case:
(f*g)(z) = Σx f(x) g(z - x) = Σy f(z - y) g(y).
(F*G)(z) = Σx F(x) g(z - x) = Σy f(z - y) G(y) = Σx f(x) G(z - x) = Σy F(z - y) g(y).
Continuous case:
(f*g)(z) = ∫ f(x) g(z - x) dx = ∫ f(z - y) g(y) dy.
(F*G)(z) = ∫ f(x) G(z - x) dx = ∫ F(z - y) g(y) dy = ∫ F(x) g(z - x) dx = ∫ f(z - y) G(y) dy.


The convolution operator is commutative and associative: f* g = g* f. (f* g)* h = f* (g* h).
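For readers who like to check such formulas numerically, here is a minimal Python sketch of the discrete case; the dictionary representation of a density is an illustrative choice, not anything from Loss Models:

    # (f*g)(z) = sum over x of f(x) g(z - x), for discrete densities stored as {value: probability}.
    def convolve(f, g):
        h = {}
        for x, fx in f.items():
            for y, gy in g.items():
                h[x + y] = h.get(x + y, 0.0) + fx * gy
        return h

    die = {i: 1/6 for i in range(1, 7)}
    two_dice = convolve(die, die)
    print(round(two_dice[7], 4))     # 0.1667, the familiar chance of rolling a 7 with two dice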
Using Convolutions (Section 3)
If frequency is N, if severity is X, frequency and severity are independent, and aggregate losses are
A then:

FA(x) = Σn=0 to ∞ fN(n) FX*n(x).
fA(x) = Σn=0 to ∞ fN(n) fX*n(x).
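A short Python sketch of these formulas (illustrative only): the aggregate density is a frequency-weighted sum of severity convolutions. Using the Poisson(1.25) frequency and the severity of 1/3 each on 1, 2, 3 from the Section 11 solutions, it reproduces the aggregate densities found there:

    from math import exp, factorial

    def convolve(f, g):
        h = {}
        for x, fx in f.items():
            for y, gy in g.items():
                h[x + y] = h.get(x + y, 0.0) + fx * gy
        return h

    sev = {1: 1/3, 2: 1/3, 3: 1/3}                                     # f_X
    lam = 1.25
    freq = {n: exp(-lam) * lam**n / factorial(n) for n in range(8)}    # f_N, truncated at 7 claims

    agg, conv = {0: freq[0]}, {0: 1.0}                                 # n = 0: f_X*0 is a point mass at 0
    for n in range(1, 8):
        conv = convolve(conv, sev)                                     # f_X*n
        for x, p in conv.items():
            agg[x] = agg.get(x, 0.0) + freq[n] * p

    print([round(agg[x], 4) for x in range(4)])    # [0.2865, 0.1194, 0.1442, 0.1726]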

Generating Functions (Section 4)


Probability Generating Function:     PX(t) = E[t^X] = MX(ln(t)).
Moment Generating Function:          MX(t) = E[e^(tX)] = PX(e^t).
The Moment Generating Functions of severity distributions, when they exist, are given in
Appendix A of Loss Models. The Probability Generating Functions of frequency distributions are
given in Appendix B of Loss Models. M(t) = P(e^t).
For an Exponential, M(t) = 1 / (1 - θt), t < 1/θ.
For a Poisson, P(z) = exp[λ(z - 1)], and M(t) = exp[λ(e^t - 1)].
The moment generating function of the sum of two independent variables is the product
of their moment generating functions:
MX+Y(t) = MX(t) MY(t).


The Moment Generating Function converts convolution into multiplication:


Mf*g = Mf Mg.
The sum of n independent identically distributed variables has the Moment Generating
Function taken to the power n.
The m.g.f. of f*n is the nth power of the m.g.f. of f.
MX+b(t) = e^(bt) MX(t).     McX(t) = E[e^(cXt)] = MX(ct).     McX+b(t) = e^(bt) MX(ct).
McX+dY+b(t) = e^(bt) MX(ct) MY(dt), for X and Y independent.
The Moment Generating Function of the average of n independent, identically distributed variables
is the nth power of the Moment Generating Function of t/n.
The moment generating function determines the distribution, and vice-versa. Therefore,
one can take limits of a distribution by instead taking limits of the Moment Generating Function.
M(0) = 1.     M'(0) = E[X].     M''(0) = E[X²].     M'''(0) = E[X³].     M^(n)(0) = E[X^n].
MX(t) = Σn=0 to ∞ (nth moment of X) t^n / n!.

Moment Generating Functions only exist for distributions all of whose moments exist. However the
converse is not true. For the LogNormal Distribution the Moment Generating Function fails to exist,
even though all of its moments exist.
d² ln[MX(t)] / dt², at t = 0, = Var[X].
d³ ln[MX(t)] / dt³, at t = 0, = 3rd central moment of X.

Let A be Aggregate Losses, X be severity and N be frequency, then the probability generating
function of the Aggregate Losses can be written in terms of the p.g.f. of the frequency and p.g.f. of
the severity:
PA(t) = PN(PX(t)).


The Moment Generating Function of the Aggregate Losses can be written in terms of the p.g.f. of
the frequency and m.g.f. of the severity:
M A (t) = PN ( MX(t)) = MN(ln(MX(t)))
For any Compound Poisson distribution, MA(t) = exp(λ(MX(t) - 1)).
The Moment Generating Function of a mixture is a mixture of the Moment Generating Functions.
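As an illustration of MA(t) = PN(MX(t)), the sketch below (my own example, not from the text) builds the m.g.f. of a compound Poisson with Exponential severity and recovers the mean by a numerical derivative at zero:

    from math import exp

    lam, theta = 3.0, 10.0                     # Poisson frequency, Exponential severity (illustrative)

    def M_X(t):                                # Exponential m.g.f., valid for t < 1/theta
        return 1.0 / (1.0 - theta * t)

    def M_A(t):                                # P_N(z) = exp(lam (z - 1)), so M_A(t) = exp(lam (M_X(t) - 1))
        return exp(lam * (M_X(t) - 1.0))

    h = 1e-6
    print(round((M_A(h) - M_A(-h)) / (2 * h), 2))    # about 30.0 = lam * theta = E[A]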
Moments of Aggregate Losses (Section 5)
Mean Aggregate Loss = (Mean Frequency)(Mean Severity)
When frequency and severity are independent:
Process Variance of Aggregate Loss =
(Mean Freq.)(Variance of Severity) + (Mean Severity)²(Variance of Freq.)
σA² = μF σX² + μX² σF².
The variance of a Compound Poisson is: λ(2nd moment of severity).
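The variance formula is easy to sanity-check by simulation. A small sketch (illustrative parameters; requires numpy), with Negative Binomial frequency and Exponential severity:

    import numpy as np

    rng = np.random.default_rng(0)
    r, beta, theta = 3, 2.0, 100.0
    mu_F, var_F = r * beta, r * beta * (1 + beta)          # 6 and 18
    mu_X, var_X = theta, theta**2                          # 100 and 10,000

    print(mu_F * var_X + mu_X**2 * var_F)                  # 240,000 by the formula

    n = rng.negative_binomial(r, 1 / (1 + beta), size=100_000)
    agg = np.array([rng.exponential(theta, k).sum() for k in n])
    print(round(agg.var()))                                # roughly 240,000, up to sampling error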
The mathematics of Aggregate Distributions and Compound Frequency Distributions are
the same:
Aggregate Dist.      Compound Frequency Dist.
Frequency            Primary (# of cabs)
Severity             Secondary (# of passengers)

One can approximate the distribution of aggregate losses using the Normal
Approximation. One could also approximate aggregate losses via a LogNormal Distribution by
matching the first two moments.
The Third Central Moment of a Compound Poisson Distribution is:
(mean frequency) (third moment of the severity).


Recursive Method / Panjer Algorithm (Sections 7 and 8)


The Panjer Algorithm (recursive method) can be used to compute the aggregate distribution when
the severity distribution is discrete and the frequency distribution is a member of the
(a, b, 0) class.
If the frequency distribution is a member of the (a, b, 0) class:
c(0) = Pf(s(0)).
c(x) = {1/(1 - a s(0))} Σj=1 to x (a + jb/x) s(j) c(x - j).
In situations in which there is a positive chance of a zero severity, it may be helpful to thin the
frequency distribution and work with the distribution of nonzero losses.
In the same manner, the Panjer Algorithm (recursive method) can be used to compute a compound
frequency distribution when the primary distribution is a member of the (a, b, 0) class.
If the frequency distribution, pk, is a member of the (a, b, 1) class:
c(0) = Pf(s(0)).
c(x) = s(x){p1 - (a + b)p0} / {1 - a s(0)} + {1/(1 - a s(0))} Σj=1 to x (a + jb/x) s(j) c(x - j).
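A compact Python sketch of the (a, b, 0) recursion (illustrative only; it assumes the severity places no probability at 0, so that c(0) = Pf(s(0)) = P[N = 0]):

    def panjer_ab0(a, b, p_n0, sev, x_max):
        c = [p_n0]
        for x in range(1, x_max + 1):
            tot = sum((a + j * b / x) * sev.get(j, 0.0) * c[x - j] for j in range(1, x + 1))
            c.append(tot / (1.0 - a * sev.get(0, 0.0)))
        return c

    # Geometric frequency with beta = 2: a = beta/(1+beta) = 2/3, b = 0, P[N = 0] = 1/3.
    # Severity (in units of 5): 0.2, 0.3, 0.5 on 1, 2, 4 -- the setup of solution 11.49.
    print([round(v, 5) for v in panjer_ab0(2/3, 0.0, 1/3, {1: 0.2, 2: 0.3, 4: 0.5}, 2)])
    # [0.33333, 0.04444, 0.07259]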

Discretization (Section 9)
For the method of rounding with span h, construct the discrete distribution g:
g(0) = F(h/2).
g(ih) = F(h(i + 1/2)) - F(h(i - 1/2)).
For the method of rounding, the original and approximating Distribution Function match at all of the
points halfway between the support of the discretized distribution.
In order to instead have the means match, the approximating densities are:
g(0) = 1 - E[X ∧ h]/h.
g(ih) = {2E[X ∧ ih] - E[X ∧ (i-1)h] - E[X ∧ (i+1)h]} / h.
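A minimal sketch of the method of rounding, applied to an Exponential severity with θ = 100 and span h = 25 (an illustrative choice of parameters):

    from math import exp

    theta, h = 100.0, 25.0
    F = lambda x: 1.0 - exp(-x / theta)

    g = {0.0: F(h / 2)}                                       # g(0) = F(h/2)
    for i in range(1, 201):
        g[i * h] = F(h * (i + 0.5)) - F(h * (i - 0.5))        # g(ih) = F(h(i+1/2)) - F(h(i-1/2))

    print(round(sum(x * p for x, p in g.items()), 2))         # about 99.7; the method of rounding
                                                              # nearly, but not exactly, preserves the mean

The matching-means formulas above are designed so that the discretized distribution reproduces the mean exactly.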

Analytic Results (Section 10)


A distribution is closed under convolution, if when one adds independent identically distributed
copies, one gets a member of the same family. Closed under convolution:
Gamma, Inverse Gaussian, Normal, Binomial, Poisson, and Negative Binomial.


Stop Loss Premiums (Section 11)


The stop loss premium is the expected aggregate losses excess of an aggregate
deductible.
The stop loss premium at zero is the mean; the stop loss premium at infinity is zero.

expected losses excess of d = E[(A - d)+] = ∫ from d to ∞ of (t - d) f(t) dt = ∫ from d to ∞ of S(t) dt.
expected losses excess of d = E[(A - d)+] = Σa>d (a - d) f(a).
expected aggregate losses excess of d = E[A] - E[A ∧ d].
When there is no probability for the aggregate losses in an interval, the stop loss premium for
deductibles in this interval can be gotten by linear interpolation.
If the distribution of aggregate losses is discrete with span ΔA:
E[(A - (j+1)ΔA)+] = E[(A - jΔA)+] - ΔA S(jΔA).
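A short sketch of the discrete formula and of the linear-interpolation fact, using the three-point distribution from solution 11.34 (f(1) = 1/2, f(2) = 1/3, f(3) = 1/6) as an illustrative example:

    agg = {1: 1/2, 2: 1/3, 3: 1/6}

    def stop_loss(d):
        return sum((a - d) * p for a, p in agg.items() if a > d)

    print(round(stop_loss(0), 4))        # 1.6667, the mean 5/3
    print(round(stop_loss(2), 4))        # 0.1667 = 1/6
    # Between points of positive probability the stop loss premium is linear in the deductible:
    print(round(stop_loss(1.5), 4), round((stop_loss(1) + stop_loss(2)) / 2, 4))    # both 0.4167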

Mahler's Guide to

Risk Measures
Joint Exam 4/C
prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-4


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching


Mahler's Guide to Risk Measures


Copyright 2013 by Howard C. Mahler.
The Risk Measure concepts in Loss Models are discussed.1 2
Information in bold or sections whose title is in bold are more important for passing the exam.
Larger bold type indicates it is extremely important.
Information presented in italics (and sections whose titles are in italics) should not be needed to
directly answer exam questions and should be skipped on first reading. It is provided to aid the
reader's overall understanding of the subject, and to be useful in practical applications.
Solutions to the problems in each section are at the end of that section.3

Section #    Pages    Section Name
    1         2-3     Introduction
    2        4-13     Premium Principles
    3       14-24     Value at Risk
    4       25-44     Tail Value at Risk
    5       45-64     Distortion Risk Measures
    6       65-72     Coherence
    7       73-82     Using Simulation
    8       83-84     Important Ideas and Formulas

Exam 4/C Exam Questions by Section of this Study Aid4


Question 27 of the Spring 2007 exam, in my Section 5, was on the Proportional Hazard
Transform, no longer on the syllabus.
The 11/07 and subsequent exams were not released.

1 See Section 3.5 and Section 21.2.5 of Loss Models.
2 Prior to 11/09 this material was from An Introduction to Risk Measures in Actuarial Applications by Mary Hardy.
3 Note that problems include both some written by me and some from past exams. Since this material was added to
the syllabus for 2007, there are few past exam questions. Past exam questions are copyright by the Casualty
Actuarial Society and the Society of Actuaries and are reproduced here solely to aid students in studying for
exams. The solutions and comments are solely the responsibility of the author; the CAS and SOA bear no
responsibility for their accuracy. While some of the comments may seem critical of certain questions, this is
intended solely to aid you in studying and in no way is intended as a criticism of the many volunteers who work
extremely long and hard to produce quality exams. In some cases I've rewritten past exam questions in order to
match the notation in the current Syllabus.
4 This topic was added to the syllabus in 2007.


Section 1, Introduction
Assume that aggregate annual losses (in millions of dollars) follow a LogNormal Distribution with
μ = 5 and σ = 1/2, with mean = exp[5 + (1/2)²/2] = 168.174, second moment =
exp[(2)(5) + (2)(1/2)²] = 36,315.5, and variance = 36,315.5 - 168.174² = 8033:
[Graph: probability density of the aggregate annual losses, shown for losses from 0 to 600 million.]
Assume instead that aggregate annual losses (in millions of dollars) follow a LogNormal Distribution
with μ = 4.625 and σ = 1, with mean = exp[4.625 + 1²/2] = 168.174, second moment =
exp[(2)(4.625) + (2)(1²)] = 76,879.9, and variance = 76,879.9 - 168.174² = 48,597:
[Graph: probability density of the aggregate annual losses, shown for losses from 0 to 600 million.]
While the two portfolios have the same mean loss, the second portfolio has a much bigger
variance. The second portfolio has a larger probability of an extremely bad year. Therefore, we
would consider the second portfolio riskier to insure than the first portfolio.
We will discuss various means to quantify the amount of risk, so-called risk measures.


There are three main uses of risk measures in insurance:5


1. Helping to determine the premium to charge.
2. Determining the appropriate amount of policyholder surplus (capital).
3. Helping to determine an appropriate amount for loss reserves.
We would expect that all other things being equal, an insurer would charge more to insure the riskier
second portfolio, than the less risky first portfolio.
We would expect that all other things being equal, an insurer should have more policyholder
surplus if insuring the riskier second portfolio, than the less risky first portfolio.6
Definition of a Risk Measure:
A risk measure is defined as a functional mapping of an aggregate loss distribution to
the real numbers.
ρ(X) is the notation used for the risk measure.
Given a specific choice of risk measure, a number is associated with each loss distribution
(distribution of aggregate losses), which encapsulates the risk associated with that loss distribution.
Exercise: Let the risk measure be: the mean + two standard deviations.7
In other words, ρ(X) = E[X] + 2 StdDev[X].
Determine the risk of the two portfolios discussed previously.
[Solution: For the first portfolio: 168.2 + 2√8025 = 347.4.
For the second portfolio: 168.2 + 2√48,589 = 609.1.
Comment: As expected, using this measure, the second portfolio has a larger risk than the first.]

These same ideas can be applied with appropriate modification to banking.


What is an appropriate amount of surplus might be determined by an insurance regulator or by the market effects
of the possible ratings given to the insurer by a rating agency.
7
This is an example of the standard deviation premium principle, to be discussed in the next section.
6


Section 2, Premium Principles


Three simple premium principles will be discussed:
1. The Expected Value Premium Principle
2. The Standard Deviation Premium Principle
3. The Variance Premium Principle
Each premium principle generates a premium which is bigger than the expected loss.
The difference between the premium and the mean loss is the premium loading, which acts as a
cushion against adverse experience.
For a given loss distribution, different choices of risk measure result in different premiums.
As elsewhere on the syllabus of this exam, we ignore expenses, investment income, etc., unless
specifically stated otherwise.
The Expected Value Premium Principle:
For example, let the premium be 110% of the expected losses.8
More generally, ρ(X) = (1 + k)E[X], k > 0.
In the above example, k = 10%.
The Standard Deviation Premium Principle:9
For example, let the premium be the expected losses plus 1.645 times the standard deviation.
More generally, ρ(X) = E[X] + k √Var[X], k > 0.10
In the above example, k = 1.645.
Using the Normal Approximation, since Φ[1.645] = 95%, E[X] + 1.645 √Var[X] is approximately
the 95th percentile of the aggregate distribution.11 Thus we would expect that the aggregate loss
would exceed the premium approximately 5% of the time.

8 On the exam, you would be given the 110%; you would not be responsible for selecting it.
9 See Example 3.12 in Loss Models.
10 While I have used the same letter k in the different risk measures, k does not have the same meaning.
11 The Normal Approximation is one common way to approximate an aggregate distribution, but not the only
method. See Mahler's Guide to Aggregate Distributions.


The Variance Premium Principle:


σ² = E[(X - E[X])²].
For example, let the premium be the expected losses plus 20% times the variance.12
More generally, ρ(X) = E[X] + k Var[X], k > 0.
In the above example, k = 0.2.
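To make the three principles concrete, here is a small Python sketch applying each of them to one discrete aggregate loss distribution (the distribution used in problem 2.6 below), with illustrative k values of 10%, 1.645, and the 1% used in that problem:

    from math import sqrt

    losses = {0: 0.50, 10: 0.30, 20: 0.10, 50: 0.05, 100: 0.05}
    mean = sum(x * p for x, p in losses.items())
    var = sum((x - mean) ** 2 * p for x, p in losses.items())

    print(round(1.10 * mean, 2))               # expected value principle, k = 10%:      13.75
    print(round(mean + 1.645 * sqrt(var), 2))  # standard deviation principle, k = 1.645: 50.68
    print(round(mean + 0.01 * var, 2))         # variance principle, k = 1%:             17.89 (solution 2.6)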

Further Reading:
There have been many discussions of the use of possible different methods of calculating risks
loads along the lines of these simple premium principles.13
Other risk measures have been developed more recently and have certain nice mathematical
properties.14

12 On the exam, you would be given the 20%; you would not be responsible for selecting it.
13 See for example, Reinsurer Risk Loads from Marginal Surplus Requirements by Rodney Kreps, PCAS 1990
and discussion by Daniel F. Gogol, PCAS 1992; Risk Loads for Insurers, by Sholom Feldblum, PCAS 1990,
discussion by Steve Philbrick PCAS 1991, Author's reply PCAS 1993, discussion by Todd R. Bault, PCAS 1995;
The Competitive Market Equilibrium Risk Load Formula, by Glenn G. Meyers, PCAS 1991, discussion by Ira
Robbin PCAS 1992, Author's reply PCAS 1993; Balancing Transaction Costs and Risk Load in Risk Sharing
Arrangements, by Clive L. Keatinge, PCAS 1995; Pricing to Optimize an Insurer's Risk-Return Relationship,
Daniel F. Gogol,
14 In the CAS literature, there have continued to be many papers on this subject. See for example, An Application
of a Game Theory: Property Catastrophe Risk Load, PCAS 1998, Capital Consumption: An Alternative
Methodology for Pricing Reinsurance, by Donald Mango, Winter 2003 CAS Forum,
Implementation of PH-Transforms in Ratemaking by Gary Venter, PCAS 1998,
and The Dynamic Financial Analysis Call Papers in the Spring 2001 CAS Forum.


Problems:
2.1 (2 points) Suppose S is a compound Poisson distribution of aggregate claims with a mean
number of claims = 500 and with individual claim amounts distributed as an Exponential with mean
1000. The insurer wishes to collect a premium equal to the mean plus one standard deviation of
the aggregate claims distribution. Calculate the required premium. Ignore expenses.
(A) 500,000 (B) 510,000 (C) 520,000 (D) 530,000 (E) 540,000
2.2 (2 points) For an insured portfolio, you are given:
(i) the number of claims has a Geometric distribution with β = 12.
(ii) individual claim amounts can take values of 1, 5, or 25 with equal probability.
(iii) the number of claims and claim amounts are independent.
(iv) the premium charged equals expected aggregate claims
plus 2% of the variance of aggregate claims.
Determine the premium charged.
(A) 200
(B) 300
(C) 400
(D) 500
(E) 600
2.3 (2 points) For aggregate claims S = X1 + X2 + ...+ XN:
(i) N has a Poisson distribution with mean 400.
(ii) X1 , X2 . . . have mean 2 and variance 3.
(iii) N, X1 , X2 . . . are mutually independent.
Three actuaries each propose premiums based on different premium principles.
Wallace proposes using the expected value premium principle with k = 15%.
Yasmin proposes using the standard deviation premium principle with k = 200%.
Zachary proposes using the variance premium principle with k = 5%.
Rank the three proposed premiums from smallest to largest.
(A) Wallace, Yasmin, Zachary
(B) Wallace, Zachary, Yasmin
(C) Yasmin, Zachary, Wallace
(D) Zachary, Yasmin, Wallace
(E) None of A, B, C, or D


Use the following information for the next two questions:

An insurer has a portfolio of 1000 insured properties as shown below.


Property Value    Number of Properties
   $50,000               300
  $100,000               500
  $200,000               200

The annual probability of a claim for each of the insured properties is .03.

Each property is independent of the others.


Assume only total losses are possible.

2.4 (2 points) Insurance premiums are set at the mean loss plus one standard deviation.
Determine the premium.
(A) Less than 3.5 million
(B) At least 3.5 million, but less than 3.6 million
(C) At least 3.6 million, but less than 3.7 million
(D) At least 3.7 million, but less than 3.8 million
(E) At least 3.8 million
2.5 (2 points) The insurer buys reinsurance with a retention of $75,000 on each property.
(For example, in the case of a loss of $200,000, the insurer would pay $75,000, while the
reinsurer would pay $125,000.)
The annual reinsurance premium is set at 110% of the expected annual excess claims.
Insurance premiums are set at the reinsurance premiums plus mean annual retained loss plus one
standard deviation of the annual retained loss.
Determine the premium.
(A) Less than 3.5 million
(B) At least 3.5 million, but less than 3.6 million
(C) At least 3.6 million, but less than 3.7 million
(D) At least 3.7 million, but less than 3.8 million
(E) At least 3.8 million
2.6 (2 points) Annual aggregate losses have the following distribution:
Annual Aggregate Losses    Probability
           0                  50%
          10                  30%
          20                  10%
          50                   5%
         100                   5%
Determine the premium using the variance premium principle with k = 1%.
A. 16
B. 18
C. 20
D. 22
E. 24


2.7 (Course 151 Sample Exam #1, Q.20) (2.5 points)


For aggregate claims S = X1 + X2 + ...+ XN:
(i) N has a Poisson distribution with mean 0.5
(ii) X1 , X2 . . . have mean 100 and variance 100
(iii) N, X1 , X2 . . . are mutually independent.
For a portfolio of insurance policies, the loss ratio during a premium period is the ratio of aggregate
claims to aggregate premiums collected during the period.
The relative security loading, (premiums / expected losses) - 1, is 0.1.
Using the normal approximation to the compound Poisson distribution, calculate the probability that
the loss ratio exceeds 0.75 during a particular period.
(A) 0.43
(B) 0.45
(C) 0.50
(D) 0.55
(E) 0.57
2.8 (Course 151 Sample Exam #2, Q.12) (1.7 points) An insurer provides life insurance for the
following group of independent lives:
Number of Lives    Death Benefit    Probability of Death
      100                1                 0.01
      200                2                 0.02
      300                3                 0.03
The insurer purchases reinsurance with a retention of 2 on each life.
The reinsurer charges a premium H equal to its expected claims plus the standard deviation of its
claims.
The insurer charges a premium G equal to expected retained claims plus the standard deviation of
retained claims plus H.
Determine G.
(A) 44
(B) 46
(C) 70
(D) 94
(E) 96
2.9 (Course 151 Sample Exam #3, Q.3) (0.8 points) A company buys insurance to cover
medical claims in excess of 50 for each of its three employees. You are given:
(i) claims per employee are independent with the following distribution:
  x      p(x)
  0      0.4
 50      0.4
100      0.2
(ii) the insurer's relative security loading, (premiums / expected losses) - 1, is 50%.
Determine the premium for this insurance.
(A) 30
(B) 35
(C) 40
(D) 45
(E) 50


2.10 (5A, 11/94, Q.34) (2 points) You are the actuary for Abnormal Insurance Company.
You are assigned the task of setting the initial surplus such that the probability of losses less
premiums collected exceeding this surplus at the end of the year is 2%.
Company premiums were set equal to 120% of expected losses.
Assume that the aggregate losses are distributed according to the information below:
Prob(Aggregate Losses < L) = 1 - [10,000,000/(L + 10,000,000)]².
What is the lowest value of the initial surplus that will satisfy the requirements described above?
2.11 (5A, 5/95, Q.35) (1 point) Suppose S is a compound Poisson distribution of aggregate
claims with a mean number of claims = 2 and with individual claim amounts distributed as
exponential with E(X) = 5 and VAR(X) = 25.
The insurer wishes to collect a premium equal to the mean plus one standard deviation of the
aggregate claims distribution.
Calculate the required premium. Ignore expenses.
2.12 (IOA, 9/09, Q. 3) (9 points) A small bank wishes to improve the performance of its
investments by investing 1m in high returning assets. An investment bank has offered the bank
two possible investments:
Investment A: A diversified portfolio of shares and derivatives which can be assumed to produce
a return of R1 million where R1 = 0.1 + N, where N is a normal N(1,1) random variable.
Investment B: An over-the-counter derivative which will produce a return of R2 million where the
investment bank estimates:
R2 = 1.5 with probability 0.99, and R2 = -5.0 with probability 0.01.
The chief executive of the bank says that if one investment has a better expected return and a
lower variance than the other then it is the best choice.
(i) (4.5 points)
(a) Calculate the expected return and variance of each investment A and B.
(b) Discuss the chief executive's comments in the light of your calculations.
(ii) (1.5 points) Calculate the following risk measures for each of the two investments A and B:
(a) probability of the returns falling below 0.
(b) probability of the returns falling below -2.
(iii) (3 points)
(a) Define other suitable risk measures that could be calculated.
(b) Discuss what these risk measures would show.


Solutions to Problems:
2.1. D. E[S] = λθ = (500)(1000) = 500,000. Var[S] = 2λθ² = (500)(2)(1000²) = 1,000,000,000.
E[S] + √Var[S] = 500,000 + √1,000,000,000 = 531,622.

2.2. D. E[N] = 12. Var[N] = (12)(12 + 1) = 156.
E[X] = (1 + 5 + 25)/3 = 10.333.
E[X²] = (1² + 5² + 25²)/3 = 217.
Var[X] = 217 - 10.333² = 110.2.
The aggregate has mean: (12)(10.333) = 124.
The aggregate has variance: (12)(110.2) + (10.333²)(156) = 17,979.
E[S] + (0.02)Var[S] = 124 + (2%)(17,979) = 484.
2.3. E. Mean of aggregate is: (400)(2) = 800.
Variance of aggregate is: λ(second moment of severity) = (400)(3 + 2²) = 2800.
Wallace's proposed premium is: (1.15)(800) = 920.
Yasmin's proposed premium is: 800 + (2)√2800 = 905.8.
Zachary's proposed premium is: 800 + (0.05)(2800) = 940.
From smallest to largest: Yasmin, Wallace, Zachary.
2.4. D. Frequency is Binomial with m = 1000 and q = .03.
Mean frequency is: (1000)(.03) = 30. Variance of Frequency is: (1000)(.03)(.97) = 29.1.
Mean severity is: (30%)(50000) + (50%)(100000) + (20%)(200000) = 105,000.
Variance of severity is:
(30%)(50000 - 105000)2 + (50%)(100000 - 105000)2 + (20%)(200000 - 105000)2 =
2725 million.
Mean aggregate is: (30)(105,000) = 3.15 million.
Variance of aggregate is: (30)(2725 million) + (105,000)2 (29.1) = 402,577.5 million.
Premium is: 3.15 million + √(402,577.5 million) = 3.15 million + 0.63 million = 3.78 million.


2.5. C. For a $50,000 loss, all $50,000 is retained. For a loss of either $100,000 or $200,000,
$75,000 is retained. The mean retained severity is: (30%)(50000) + (70%)(75000) = 67,500.
The mean aggregate retained is: (30)(67,500) = 2.025 million.
Therefore the mean aggregate excess is: 3.15 million - 2.025 million = 1.125 million.
The reinsurance premium is: (110%)(1.125 million) = 1.238 million.
Variance of retained severity is: (30%)(50000 - 67,500)2 + (70%)(75000 - 67,500)2 =
131.25 million.
Variance of aggregate retained is: (30)(131.25 million) + (67,500)2 (29.1) = 136,524.4 million.
Premium is: 1.238 million + 2.025 million + √(136,524.4 million) = 3.63 million.

Comment: Purchasing reinsurance has reduced the risk of the insurer.


2.6. B. E[X] = (0)(50%) + (30%)(10) + (10%)(20) + (5%)(50) + (5%)(100) = 12.5.
σ² = (50%)(0 - 12.5)² + (30%)(10 - 12.5)² + (10%)(20 - 12.5)² + (5%)(50 - 12.5)²
+ (5%)(100 - 12.5)² = 538.75.
E[X] + (1%)σ² = 12.5 + (0.01)(538.75) = 17.9.
2.7. D. The mean aggregate loss is: (100)(.5) = 50.
The premiums are: (1.1)(50) = 55.
Since frequency is Poisson, the variance of the aggregate loss is:
(mean frequency)(second moment of the severity) = (.5)(100 + 1002 ) = 5050.
The loss ratio is 75% if the loss is: (55)(.75) = 41.25. Thus the loss ratio exceeds 75% if the loss
exceeds 41.25. Thus using the Normal approximation, the probability that the loss ratio exceeds
75% is: 1 - Φ((41.25 - 50)/√5050) = 1 - Φ(-0.12) = Φ(0.12) = 0.5478.
2.8. B. For the insurer, the mean payment is:
(100)(0.01)(1) + (200)(0.02)(2) + (300)(0.03)(2) = 1 + 8 + 18 = 27.
For the insurer, the variance of payments is:
(100)(0.01)(0.99)(1²) + (200)(0.02)(0.98)(2²) + (300)(0.03)(0.97)(2²) = 51.59.
For the reinsurer, the mean payment is:
(100)(0.01)(0) + (200)(0.02)(0) + (300)(0.03)(1) = 9.
For the reinsurer, the variance of payments is:
(100)(0.01)(0.99)(0²) + (200)(0.02)(0.98)(0²) + (300)(0.03)(0.97)(1²) = 8.73.
Reinsurer's premium = 9 + √8.73 = 11.955.
Insurer's premium = 27 + √51.59 + 11.955 = 46.14.


2.9. D. The expected payment per employee is: (0)(.4) + (0)(.4) + (100 - 50)(.2) = 10.
The expected aggregate payments are: (3)(10) = 30. The premiums = (1.5)(30) = 45.
2.10. The distribution of L is a Pareto Distribution with α = 2 and θ = 10 million.
Therefore, E[L] = θ/(α - 1) = $10 million.
Premiums are (1.2)E(L) = (1.2)($10 million) = $12 million.
The 98th percentile of the distribution of aggregate losses is such that
0.02 = [10,000,000/(L + 10,000,000)]². Therefore the 98th percentile of L = 60.71 million.
Therefore, we require that: 60.71 million = initial surplus + $12 million.
initial surplus = $48.71 million.
Comment: Use the 98th percentile of the given Pareto Distribution, rather than the Normal
Approximation to the Pareto Distribution.
2.11. The mean of the aggregate losses = (2)(5) =10.
The variance of aggregate losses = (2)(25) + (2)(52 ) = 100. Mean + Stddev = 10 + 10 = 20.


2.12. (i) Investment A


Expected return = E[0.1 + N] = 0.1 + 1 = 1.1
Variance = 1
Investment B
Expected return = (1.5)(0.99) + (-5.0)(0.01) = 1.435.
Variance = (0.99)(1.435 - 1.5)² + (0.01){1.435 - (-5)}² = 0.418275.
Investment B has both a higher expected return and lower variance so would be preferred on this
basis. However there is an issue with the possibility of very bad returns on Investment B.
Also there might be an issue with the estimated probabilities of investment B being somewhat
unreliable as they are probably derived from the heavy righthand tail of a distribution. Thus it might
be wise to take this calculation with a grain of salt.
(ii) a. Investment A.
Probability of return below 0 is probability of the return from N(1, 1) being below -0.1:
Φ[(-0.1 - 1)/1] = Φ[-1.1] = 0.1357.
Investment B: Probability of return below 0 is 0.01.
b. Investment A.
Probability of return below -2 is probability of the return from N(1,1) being below -2.1:
Φ[(-2.1 - 1)/1] = Φ[-3.1] = 0.0010.
Investment B. Probability of return below -2 is 0.01.
(iii) One could instead use the Value at Risk or Tail Value at Risk.
For example, 95%-VaR is the 95th percentile of the distribution of losses, which in this case would
be the 5th percentile of the returns.
For Investment A, 95%-VaR is: 0.1 + (1 - 1.645) = -0.545.
For Investment B, 95%-VaR is: 1.5.
For Investment A, 99%-VaR is: 0.1 + (1 - 2.326) = -1.226.
For Investment B, 99%-VaR is: -5.0.
The 95%-TVaR is the average of those losses greater than or equal to 95%-VaR, or in this case
the average of the returns less than or equal to 95%-VaR.
For the Normal Distribution, TVaRp[X] = μ + σ φ[zp] / (1 - p), where φ is the Standard Normal density.
Thus for Investment A, 95%-TVaR is: 1.1 - (1) {exp[-(1.645²)/2] / √(2π)} / 0.05 = -0.9622.

For Investment B, 95%-TVar is: (0.4)(1.5) + (0.1)(-5.0) = 0.1.


For Investment A, 99%-TVaR is: 1.1 - (1) {exp[-(2.326²)/2] / √(2π)} / 0.01 = -1.5674.

For Investment B, 99%-TVar is: -5.0.


Comment: There is no one correct measure of risk.
Different measures of risk give different orderings in this case.


Section 3, Value at Risk15 16


In this section, another risk measure will be discussed:
Value at Risk = VaR = Quantile Risk Measure = Quantile Premium Principle.
Percentiles:
Exercise: Assume that aggregate annual losses (in millions of dollars) follow a LogNormal
Distribution with μ = 5 and σ = 1/2. Determine the 95th percentile of this distribution.
[Solution: 0.95 = F(x) = Φ[(lnx - 5)/(1/2)]. ⇒ (2)(lnx - 5) = 1.645. ⇒
x = exp[5 + (1/2)(1.645)] = 337.8.
Comment: Find the 95th percentile of the underlying Normal and exponentiate.]
In other words, for this portfolio, there is a 95% chance that the aggregate loss is less than 337.8.
πp is the 100pth percentile.
For this portfolio, π0.95 = 337.8.
Quantiles:
The 95th percentile is also referred to as Q0.95, the 95% quantile.
For this portfolio, the 95% quantile is 337.8.
90th percentile = Q0.90 = 90% quantile.
99th percentile = Q0.99 = 99% quantile.
median = Q0.50 = 50% quantile.

15 See Section 3.5.3 of Loss Models.
16 Value at Risk is also discussed in Chapter 25 of Derivative Markets by McDonald, not on the syllabus.


Definition of the Value at Risk:


The Value at Risk, VaRp, is defined as the 100pth percentile.
p is sometimes called the security level.
VaRp(X) = πp.
If aggregate annual losses follow a LogNormal Distribution with μ = 5 and σ = 1/2, then
VaR95% is the 95th percentile, or 337.8.
For this LogNormal Distribution with μ = 5 and σ = 1/2, here is a graph of VaRp as a function of p:
[Graph: VaRp as a function of p, for p from 0.2 to 0.999; VaRp runs from roughly 100 to 700.]
Exercise: If annual aggregate losses follow a Weibull Distribution with θ = 10 and τ = 3,
determine VaR90%.
[Solution: 0.90 = 1 - exp[-(x/10)³]. ⇒ x = 13.205.
Comment: We have determined the 90th percentile of this Weibull Distribution.
As shown in Appendix A: VaRp(X) = θ{-ln(1-p)}^(1/τ).]
In Appendix A of the Tables attached to the exam, there are formulas for VaRp(X) for
many of the distributions.17
17 This will also help in finding percentiles and in performing simulation by inversion.


Distribution                  VaRp(X)
Exponential                   -θ ln(1-p)
Pareto                        θ[(1-p)^(-1/α) - 1]
Weibull                       θ[-ln(1-p)]^(1/τ)
Single Parameter Pareto       θ(1-p)^(-1/α)
Loglogistic                   θ[p^(-1) - 1]^(-1/γ)
Inverse Pareto                θ[p^(-1/τ) - 1]^(-1)
Inverse Weibull               θ[-ln(p)]^(-1/τ)
Burr                          θ[(1-p)^(-1/α) - 1]^(1/γ)
Inverse Burr                  θ[p^(-1/τ) - 1]^(-1/γ)
Inverse Exponential           θ[-ln(p)]^(-1)
Paralogistic                  θ[(1-p)^(-1/α) - 1]^(1/α)
Inverse Paralogistic          θ[p^(-1/τ) - 1]^(-1/τ)
Normal18                      μ + σ zp
18 Not shown in Appendix A attached to the exam. See Example 3.14 in Loss Models.
zp is the pth percentile of the Standard Normal.
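These closed forms can be verified by numerically inverting the distribution function. A short sketch (requires scipy; illustrative only) for the Pareto entry, using α = 4 and θ = 100 as in problem 3.3 below:

    from scipy.optimize import brentq

    alpha, theta, p = 4.0, 100.0, 0.95
    F = lambda x: 1.0 - (theta / (theta + x)) ** alpha        # Pareto distribution function

    var_formula = theta * ((1 - p) ** (-1 / alpha) - 1)       # from the table above
    var_numeric = brentq(lambda x: F(x) - p, 0.0, 1e7)        # solve F(x) = p numerically
    print(round(var_formula, 1), round(var_numeric, 1))       # both 111.5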



Problems:
3.1 (1 point) Losses are Normal with μ = 1000 and σ = 25.
Determine the VaR80%.
A. 1010

B. 1020

C. 1030

D. 1040

E. 1050

3.2 (2 points) Losses follow a LogNormal Distribution with μ = 8 and σ = 0.7.


Premiums are 110% of expected losses.
Determine the amount of policyholder surplus the insurer must have so that there is a 10% chance
that the losses will exceed the premium plus surplus.
(A) Less than 2500
(B) At least 2500, but less than 3000
(C) At least 3000, but less than 3500
(D) At least 3500, but less than 4000
(E) At least 4000
3.3 (1 point) If annual aggregate losses follow a Pareto Distribution with α = 4 and θ = 100,
determine VaR95%.
A. 80

B. 90

C. 100

D. 110

E. 120

3.4 (2 points) A group medical insurance policy covers the medical expenses incurred by 2000
mutually independent lives.
The annual loss amount, X, incurred by each life is distributed as follows:
   x      Pr(X=x)
   0       0.40
 100       0.40
1000       0.15
5000       0.05
The premium is equal to the 99th percentile of the normal distribution which approximates the
distribution of total claims. Determine the premium per life.
(A) Less than 470
(B) At least 470, but less than 480
(C) At least 480, but less than 490
(D) At least 490, but less than 500
(E) At least 500


3.5 (3 points) Annual Losses for the Rocky Insurance Company are Normal with mean 20 and
standard deviation 3.
Annual Losses for the Bullwinkle Insurance Company are Normal with mean 30 and standard
deviation 4.

The annual losses for the Rocky and Bullwinkle companies have a correlation of 60%.
(i) Determine the VaR90% for the Rocky Insurance Company.
(ii) Determine the VaR90% for the Bullwinkle Insurance Company.
(iii) The Rocky and Bullwinkle companies merge.
Determine the VaR90% for the merged company.

3.6 (1 point) Losses follow a Weibull Distribution with θ = 10 and τ = 0.3; for a 99% security level
determine the Value at Risk.
(A) Less than 1500
(B) At least 1500, but less than 2000
(C) At least 2000, but less than 2500
(D) At least 2500, but less than 3000
(E) At least 3000


3.7 (1 point) Annual aggregate losses have the following distribution:


Annual Aggregate Losses    Probability
           0                  50%
          10                  30%
          20                  10%
          50                   4%
         100                   2%
         200                   2%
         500                   1%
        1000                   1%
Determine the 95% Value at Risk.
A. 60
B. 70
C. 80
D. 90
E. 100
3.8 (Course 151 Sample Exam #2, Q.11) (1.7 points) A group medical insurance policy
covers the medical expenses incurred by 100,000 mutually independent lives.
The annual loss amount, X, incurred by each life is distributed as follows:
     x      Pr(X=x)
     0       0.30
    50       0.10
   200       0.10
   500       0.20
 1,000       0.20
10,000       0.10
The policy pays 80% of the annual losses for each life.
The premium is equal to the 95th percentile of the normal distribution which approximates the
distribution of total claims.
Determine the difference between the premium and the expected aggregate payments.
(A) 1,213,000 (B) 1,356,000 (C) 1,446,000 (D) 1,516,000 (E) 1,624,000
3.9 (5A, 11/94, Q.36) (2 points) An auto insurer has 2 classes of insureds with the following claim
probabilities and distribution of claim amounts:
Class    Number of Insureds    Probability of One Claim    Claim Severity
  1             400                      0.10                   3,000
  2             600                      0.05                   2,000
An insured will have either no claims or exactly one claim.
The size of claim for each class is constant.
The insurer wants to collect a total dollar amount such that the probability of total claims dollars
exceeding that amount is 5%. Using the normal approximation and ignoring expenses, how much
should the insurer collect?


3.10 (5A, 11/95, Q.35) (2 points) An insurance company has two classes of insureds with the
following claim probabilities and distribution of claim amounts:
Class    Number of Insureds    Probability of 1 Claim    Claim Severity
  1            1,000                   0.15                   $600
  2            5,000                   0.05                   $800
The probability of an insured having more than one loss is zero. The company wants to collect an
amount equal to the 95th percentile of the distribution of aggregate losses.
Determine the total premium.
3.11 (5A, 11/98, Q.35) (2 points) You are a pricing actuary offering a new coverage and you
have analyzed the distribution of losses capped at various limits shown below:
Capped Limit    Expected Value    Variance
30,000          500               100,000
25,000          450                50,000
20,000          400                40,000
15,000          350                28,000
10,000          250                14,000
5,000           200                 9,000
Your chief actuary requires that the premiums be at the 95th percentile of the distribution of losses.
The general manager requires that the difference between the premiums and the expected losses
be no greater than $200. What is the highest limit of the new coverage that can be written
consistent with these requirements?
3.12 (5A, 11/99, Q.37) (2 points) An insurer issues 1-year warranty coverage policies to two
different types of insureds. Group 1 insureds have a probability of having a claim of .05 and
Group 2 insureds have a probability of having a claim of .10. There are two possible claim
amounts of $500 and $1,000. The following table shows the number of insureds in each class.
Class    Prob. of Claim    Claim Amount    # of Insureds
1        0.05              $500            200
2        0.10              $500            200
3        0.05              $1000           300
4        0.10              $1000           250
Using the Normal Approximation, how much premium should the insurer collect such that the
collected premium equals the 95th percentile of the distribution of total claims?

3.13 (8, 5/09, Q.28) (2.25 points) Given the following information about Portfolios A and B:

The returns on a stock are Normally distributed.
The volatility is the standard deviation of the returns on a stock.
If you buy stocks, then the loss is the difference between the initial cost of the portfolio
and the current value of the portfolio.
The value of Portfolio A is $15 million and consists only of Company A stock.
The daily volatility of Portfolio A is 3%.
The value of Portfolio B is $7 million and consists only of Company B stock.
The daily volatility of Portfolio B is 2%.
The correlation coefficient between Company A and Company B stock prices is 0.40.
a. (0.75 point) Calculate the 10-day 99% Value-at-Risk (VaR) for Portfolio A.
b. (0.75 point) Calculate the 10-day 99% VaR for a portfolio consisting of Portfolios A and B.
Note: I have revised this past exam question.

Solutions to Problems:
3.1. B. 0.80 = F(x) = Φ[(x - 1000)/25]. ⇒ (x - 1000)/25 = 0.842.
⇒ x = 1000 + (25)(0.842) = 1021.
3.2. C. E[X] = exp[8 + 0.7²/2] = 3808. Premium is: (1.1)(3808) = 4189.
0.90 = F(x) = Φ[(ln x - 8)/0.7]. ⇒ (ln x - 8)/0.7 = 1.282.
⇒ x = exp[8 + (0.7)(1.282)] = 7313. The 90th percentile of the LogNormal is 7313.
Required surplus is: 7313 - 4189 = 3124.
3.3. D. 0.95 = 1 - {100/(100 + x)}⁴. ⇒ 20 = (1 + x/100)⁴. ⇒ x = 111.5.
As shown in Appendix A, for a Pareto Distribution with parameters α and θ:
VaRp(X) = θ[(1-p)^(-1/α) - 1]. VaR0.95 = (100){(0.05)^(-1/4) - 1} = 111.5.
3.4. D. E[X] = (0)(0.4) + (100)(0.4) + (1000)(0.15) + (5000)(0.05) = 440.
E[X²] = (0²)(0.4) + (100²)(0.4) + (1000²)(0.15) + (5000²)(0.05) = 1,404,000.
Var[X] = 1,404,000 - 440² = 1,210,400.
The aggregate has mean: (2000)(440) and variance: (2000)(1,210,400).
Φ[2.326] = 0.99. Total premium is: (2000)(440) + 2.326 √[(2000)(1,210,400)].
Premium per life is: 440 + 2.326 √(1,210,400/2000) = 497.2.
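A quick numerical check of 3.4; this is a minimal Python sketch (Python is not part of the syllabus, and the variable names are mine), using only the standard library:
    from statistics import NormalDist
    probs = {0: 0.40, 100: 0.40, 1000: 0.15, 5000: 0.05}
    mean = sum(x * p for x, p in probs.items())            # 440
    second = sum(x * x * p for x, p in probs.items())      # 1,404,000
    var = second - mean ** 2                               # 1,210,400
    n = 2000
    z99 = NormalDist().inv_cdf(0.99)                       # about 2.326
    premium_per_life = mean + z99 * (var / n) ** 0.5
    print(round(premium_per_life, 1))                      # about 497.2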
3.5. (i) For Rocky, VaR90% is: 20 + (1.282)(3) = 23.846.
(ii) For Bullwinkle, VaR90% is: 30 + (1.282)(4) = 35.128.
(iii) Annual losses for Rocky plus Bullwinkle are Normal with mean: 20 + 30 = 50,
and variance: 3² + 4² + (2)(0.6)(3)(4) = 39.4.
For Rocky plus Bullwinkle, VaR90% is: 50 + (1.282)√39.4 = 58.047.
Comment: 58.047 < 58.974 = 23.846 + 35.128. Merging has reduced the risk measure, an
example of the advantage of diversification. As will be discussed with respect to coherent risk
measures, this property is called subadditivity. While Value at Risk is usually subadditive, it is not
always subadditive.
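A minimal Python sketch of this calculation (names are mine; only the stated 60% correlation is used):
    from statistics import NormalDist
    z90 = NormalDist().inv_cdf(0.90)                        # about 1.282
    var_rocky = 20 + z90 * 3
    var_bull = 30 + z90 * 4
    merged_sd = (3**2 + 4**2 + 2 * 0.60 * 3 * 4) ** 0.5     # sqrt(39.4)
    var_merged = 50 + z90 * merged_sd
    print(round(var_rocky, 3), round(var_bull, 3), round(var_merged, 3))
    # about 23.845, 35.126, 58.04; the merged VaR is less than the sum of the parts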

3.6. B. 0.99 = 1 - exp[-(x/10)^0.3]. ⇒ 100 = exp[(x/10)^0.3]. ⇒ x = 10 (ln 100)^(1/0.3) = 1625.
As shown in Appendix A: VaRp(X) = θ[-ln(1-p)]^(1/τ).
VaR0.99(X) = (10){-ln(0.01)}^(1/0.3) = 1625.
3.7. E. F(50) = 94% < 95%. F(100) = 96% ≥ 95%. Thus 100 is the 95% VaR.
3.8. A. The variance of the severity is: 10,254,250 - 1325² = 8,498,625.
x         density    first moment    second moment
0         0.3        0               0
50        0.1        5               250
200       0.1        20              4,000
500       0.2        100             50,000
1000      0.2        200             200,000
10000     0.1        1000            10,000,000
Sum                  1325            10,254,250
The mean aggregate payment by the insurer is: (100,000)(0.8)(1325) = 106 million.
The variance of the insurer's aggregate payment is: (0.8²)(100,000)(8,498,625).
The standard deviation is: 737,503. For the 95th percentile, one adds 1.645 standard deviations
to the mean. Thus the premium is: 106,000,000 + (1.645)(737,503).
Premiums - expected aggregate payments = (1.645)(737,503) = 1,213,192.
3.9. Mean Aggregate Loss = (400)(0.10)(3000) + (600)(0.05)(2000) = 180,000.
Variance of Aggregate Losses = (400)(0.10)(0.9)(3000²) + (600)(0.05)(0.95)(2000²) = 438,000,000.
Since the 95th percentile of the Unit Normal Distribution is 1.645, we want to collect:
Mean + 1.645 Standard Deviations = 180,000 + 1.645 √438,000,000 = 214,427.
3.10. The mean loss is: (0.15)(600)(1000) + (0.05)(800)(5000) = 290,000.
The variance of aggregate losses is:
(0.15)(0.85)(600²)(1000) + (0.05)(0.95)(800²)(5000) = 197,900,000.
The 95th percentile of aggregate losses is approximately:
290,000 + (1.645)√197,900,000 = 290,000 + 23,141 = 313,141.
Comment: The relative security loading is: 23,141/290,000 = 8.0%.

3.11. Using the Normal Approximation, the 95th percentile is approximately:
mean + 1.645 (Standard Deviation).
The difference between the premiums and the expected losses is: 1.645 (Standard Deviation).
Therefore, we require 1.645 (Standard Deviation) < 200. ⇒ Standard Deviation < 121.6. ⇒ Variance < 14,782.
The highest limit of the new coverage that can be written consistent with these requirements is $10,000.
3.12. With severity s, Bernoulli parameter q, and n insureds:
mean of aggregate losses = nqs, variance of aggregate losses = nq(1-q)s².
Class    Frequency    Severity    # of Insureds    Mean      Variance
1        0.05         500         200              5,000     2,375,000
2        0.10         500         200              10,000    4,500,000
3        0.05         1000        300              15,000    14,250,000
4        0.10         1000        250              25,000    22,500,000
Overall                                            55,000    43,625,000
Approximate the distribution of aggregate losses by the Normal Distribution with the same mean
and variance. The 95th percentile ≈ 55,000 + 1.645 √43,625,000 = 65,865.
3.13. a. Φ[2.326] = 99%.
Assuming the returns on different days are independent, the variances add; variances are
multiplied by N, while standard deviations are multiplied by √N.
The volatility over ten days is: 0.03√10.
One standard deviation of movement in value is: ($15 million)(0.03√10).
The 1% worst outcomes are when the value declines by 2.326 standard deviations or more.
VaR0.99 = ($15 million)(2.326)(0.03√10) = 3.31 million.
b. The standard deviation of the daily change in the value of the portfolio is:
√[(15²)(0.03²) + (7²)(0.02²) + (2)(0.4)(15)(0.03)(7)(0.02)] = 0.522 million.
VaR0.99 = (0.522 million)(2.326)√10 = 3.84 million.
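A minimal Python sketch of part (b), assuming independent daily returns so that the 10-day standard deviation scales with √10 (names are mine):
    from statistics import NormalDist
    z99 = NormalDist().inv_cdf(0.99)                       # about 2.326
    sd_daily = (15**2 * 0.03**2 + 7**2 * 0.02**2
                + 2 * 0.40 * 15 * 0.03 * 7 * 0.02) ** 0.5  # about 0.522 ($ million)
    var_10day = z99 * sd_daily * 10 ** 0.5
    print(round(var_10day, 2))                             # about 3.84 ($ million)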

Section 4, Tail Value at Risk19 20 21

In this section, another risk measure will be discussed:
Tail Value at Risk (TVaR) ⇔ Conditional Tail Expectation (CTE) ⇔ Tail Conditional Expectation (TCE)
⇔ Expected Shortfall ⇔ Expected Tail Loss.

Definition of the Tail Value at Risk:
For a given value of p, the security level, the Tail Value at Risk of a loss distribution is defined as
the average of the 1 - p worst outcomes: TVaRp(X) ≡ E[X | X > πp].
The corresponding risk measure is: ρ(X) = TVaRp(X).
Exercise: The aggregate losses are uniform from 0 to 100. Determine TVaR0.80 and TVaR0.90.
[Solution: TVaR0.80 = (100 + 80)/2 = 90. TVaR0.90 = (100 + 90)/2 = 95.]
As with the Value at Risk, for larger choices of p, the Tail Value at Risk is larger, all other things
being equal.
TVaRp = average size of those losses of size greater than the pth percentile, πp.

TVaRp = ∫_πp^∞ x f(x) dx / ∫_πp^∞ f(x) dx = ∫_πp^∞ x f(x) dx / (1-p).

The average size of those losses of size between a and b is:22
E[X | b > X > a] = ({E[X ∧ b] - b S(b)} - {E[X ∧ a] - a S(a)}) / {F(b) - F(a)}.
Letting a = πp and b = ∞:
TVaRp = ({E[X] - 0} - {E[X ∧ πp] - πp S(πp)}) / {1 - F(πp)}
= {E[X] - E[X ∧ πp] + πp(1 - p)}/(1 - p) = πp + (E[X] - E[X ∧ πp])/(1 - p).
TVaRp(X) = πp + (E[X] - E[X ∧ πp]) / (1 - p).
19 See Section 3.5.4 of Loss Models.
20 For an example of an application, see DFA Insurance Company Case Study, Part 2 Capital Adequacy and
Capital Allocation, by Stephen W. Philbrick and Robert A. Painter, in the Spring 2001 CAS Forum.
21 This is also discussed in Section 25.2 of Derivative Markets by McDonald, not on the syllabus.
22 See Mahler's Guide to Loss Distributions.

Exercise: Losses follow a Pareto Distribution with α = 3 and θ = 20.
Determine TVaR0.90.
[Solution: Set 0.90 = F(π0.90) = 1 - {20/(π0.90 + 20)}³. ⇒ π0.90 = 23.09.
E[X] = θ/(α - 1) = 20/(3 - 1) = 10.
E[X ∧ 23.09] = {θ/(α - 1)}{1 - (20/(23.09 + 20))²} = 7.846.
TVaR0.90 = π0.90 + (E[X] - E[X ∧ π0.90])/(1 - 0.90) = 23.09 + (10 - 7.846)/0.1 = 44.63.
Alternately, X truncated and shifted from below at 23.09 is Pareto with α = 3 and θ = 20 + 23.09 =
43.09, with mean 43.09/(3 - 1) = 21.54.
TVaR0.90 = E[X | X > π0.90] = 23.09 + 21.54 = 44.63.
Comment: As shown in Appendix A of the Tables attached to the exam:
TVaRp = θ[(1-p)^(-1/α) - 1] + θ(1-p)^(-1/α)/(α - 1), for α > 1.]
For this Pareto Distribution with α = 3 and θ = 20, TVaRp increases as a function of p.
[Graph of TVaRp versus p, rising steeply as p approaches 1; not reproduced.]
TVaR0(X) = E[X | over the worst 100% of outcomes] = E[X].
For a loss distribution with a maximum, TVaR1(X) = Max[X].

In Appendix A, there are formulas for TVaRp(X) for a few of the distributions:
Exponential, Pareto, Single Parameter Pareto.

Distribution                   TVaRp(X)
Exponential                    -θ ln(1-p) + θ
Pareto                         θ{(1-p)^(-1/α) - 1} + θ(1-p)^(-1/α)/(α - 1), α > 1
Single Parameter Pareto        α θ (1-p)^(-1/α)/(α - 1), α > 1
Normal23                       μ + σ φ[zp] / (1 - p)

23 Not shown in Appendix A attached to the exam. See Example 3.14 in Loss Models.
zp is the pth percentile of the Standard Normal, and φ is the density of the Standard Normal.
For example, z0.975 = 1.960.

Relationship to the Mean Excess Loss:


The mean excess loss, e(x) = E[X - x | X > x] = E[X | X > x] - x.24
Therefore, E[X | X > x] = x + e(x).
Therefore, TVaRp(X) = E[X | X > πp] = πp + e(πp).
This matches a previous formula, since e(πp) = (E[X] - E[X ∧ πp])/S(πp) =
(E[X] - E[X ∧ πp])/(1 - p). This form of the formula for the TVaR can be useful in those cases where
one remembers the form of the mean residual life.

For example, for a Pareto Distribution with α = 3 and θ = 20, as determined previously,
π0.90 = 23.09. The mean excess loss for a Pareto is e(x) = (x + θ)/(α - 1).
Therefore, e(23.09) = (23.09 + 20)/(3 - 1) = 21.54.
TVaR0.90 = 23.09 + 21.54 = 44.63, matching the previous result.
For a Pareto Distribution with parameters α and θ, α > 1:
πp = θ{(1 - p)^(-1/α) - 1}.
e(πp) = (πp + θ)/(α - 1) = θ(1 - p)^(-1/α)/(α - 1).
TVaRp = πp + e(πp) = θ{α(1 - p)^(-1/α)/(α - 1) - 1}.25
For the above example, TVaR0.90 = 20{(0.1^(-1/3))(3/2) - 1} = 44.63, matching the previous result.
Exercise: For an Exponential Distribution with mean 600, determine TVaR0.99.
[Solution: Set 0.99 = 1 - exp[-π0.99/600]. ⇒ π0.99 = 2763.
For the Exponential, e(x) = θ = 600. Therefore, TVaR0.99 = π0.99 + e(π0.99) = 2763 + 600 = 3363.]
For an Exponential Distribution with mean θ:
πp = -θ ln[1 - p]. e(πp) = θ.
TVaRp = πp + e(πp) = θ(1 - ln[1 - p]).26
For the above example, TVaR0.99 = 600(1 - ln[0.01]) = 3363, matching the previous result.

24 See Mahler's Guide to Loss Distributions.
25 I would not memorize this formula.
26 I would not memorize this formula.

An Example with a Discrete Distribution:

Let us assume that the aggregate distribution is:
Aggregate Losses    Probability
10                  50%
50                  30%
100                 10%
500                  8%
1000                 2%
Then E[L | L ≥ 500] = {(500)(8%) + (1000)(2%)}/10% = 600.
In contrast, E[L | L > 500] = 1000.
Neither 600 nor 1000 is the average of the 5% worst outcomes. Thus neither is used for TVaR0.95.
Rather we compute TVaR0.95 by averaging the 5% worst possible outcomes:
TVaR0.95 = {(500)(3%) + (1000)(2%)}/5% = 700.
In general, in order to calculate TVaRp:27
(1) Take the 1 - p worst outcomes.
(2) Average over these worst outcomes.
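A minimal Python sketch of this two-step recipe applied to the discrete distribution above (the function and variable names are mine):
    def tvar_discrete(dist, p):
        # Average the worst 1 - p of the probability; dist is {outcome: probability}.
        tail = 1.0 - p
        total, remaining = 0.0, tail
        for x, prob in sorted(dist.items(), reverse=True):   # largest outcomes first
            take = min(prob, remaining)
            total += x * take
            remaining -= take
            if remaining <= 1e-12:
                break
        return total / tail

    dist = {10: 0.50, 50: 0.30, 100: 0.10, 500: 0.08, 1000: 0.02}
    print(tvar_discrete(dist, 0.95))   # 700.0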

27 This is equivalent to what had been done in the case of a continuous aggregate distribution.

TVaR Versus VaR:


Exercise: The aggregate losses are uniform from 0 to 100. Determine VaR95% and TVaR95%.
[Solution: VaR95% = π0.95 = 95. TVaR95% = (100 + 95)/2 = 97.5.]
Since TVaRp(X) ≡ E[X | X > πp], TVaRp(X) ≥ VaRp(X).28
Unlike VaRp, TVaRp is affected by the behavior in the extreme righthand tail of the distribution.
Exercise: The aggregate losses are a two-component splice between a uniform from 0 to 95, and
a uniform from 95 to 200, with 95% weight to the first component of the splice.
Determine VaR95% and TVaR95%.
[Solution: VaR95% = π0.95 = 95. TVaR95% = (200 + 95)/2 = 147.5.]
For a heavier-tailed distribution, TVaRp can be much larger than VaRp.29

Exercise: The aggregate losses are a two-component splice between a uniform from 0 to 95, and
above 95 a density proportional to a Pareto with α = 3 and θ = 300, with 95% weight to the first
component of the splice. Determine VaR95% and TVaR95%.
[Solution: VaR95% = π0.95 = 95. Above 95 the density of the splice is proportional to a Pareto
Distribution, let us say c fPareto(x). e(95) = ∫_95^∞ (x - 95) c fPareto(x) dx / {c SPareto(95)} = ePareto(95).
For a Pareto with α = 3 and θ = 300, e(x) = (x + 300)/(3 - 1). e(95) = 395/2 = 197.5.
TVaR95% = 95 + e(95) = 95 + 197.5 = 292.5.
Comment: In this and the previous exercise, the 95% Values at Risk are the same, even though
the distribution in this exercise has a larger probability of extremely bad outcomes such as 300.]
Derivative of TVaR:

Let G(x) = E[X | X > x] = x + e(x) = x + ∫_x^∞ S(t) dt / S(x).
Then dG/dx = 1 - S(x)/S(x) + f(x) ∫_x^∞ S(t) dt / S(x)² = {f(x)/S(x)} ∫_x^∞ S(t) dt / S(x) = h(x) e(x).

28 Only in very unusual situations would the two be equal.
29 A heavier-tailed distribution has f(x) go to zero more slowly as x approaches infinity. The Pareto and LogNormal
are examples of heavier-tailed distributions. See Mahler's Guide to Loss Distributions.

dE[X | X > x]/dx = h(x) e(x).30
dE[X | X > x]/dx > 0, and as expected E[X | X > x] is an increasing function of x.
For example, for a Pareto with parameters α and θ, e(x) = (x + θ)/(α - 1), and h(x) = α/(θ + x).
Therefore, for a Pareto, dE[X | X > x]/dx = h(x) e(x) = α/(α - 1), for α > 1.31
Exercise: For an Exponential Distribution, determine dE[X | X > x]/dx.
[Solution: e(x) = θ, and h(x) = 1/θ. dE[X | X > x]/dx = h(x) e(x) = 1.
Comment: For the Exponential: E[X | X > x] = x + e(x) = x + θ.]
TVaRp(X) = E[X | X > πp] = G(πp).
Therefore, by the Chain Rule, dTVaRp/dp = h(πp) e(πp) dπp/dp.
For example, for a Pareto with parameters α and θ,
p = F(πp) = 1 - {θ/(πp + θ)}^α. ⇒ πp = θ{(1 - p)^(-1/α) - 1}.
Therefore, for a Pareto, with the shape parameter α > 1,
dTVaRp/dp = h(πp) e(πp) dπp/dp = {α/(α - 1)}(θ/α)(1 - p)^(-(1+1/α)) = {θ/(α - 1)}(1 - p)^(-(1+1/α)).32
Exercise: For an Exponential Distribution, determine dTVaRp/dp.
[Solution: p = F(πp) = 1 - exp[-πp/θ]. ⇒ πp = -θ ln[1 - p].
dTVaRp/dp = h(πp) e(πp) dπp/dp = (1) θ/(1 - p) = θ/(1 - p).
Comment: For the Exponential: TVaRp = πp + e(πp) = -θ ln[1 - p] + θ.]
Since πp is an increasing function of p, dπp/dp > 0.
Therefore, dTVaRp/dp = h(πp) e(πp) dπp/dp > 0,
and as expected TVaRp is an increasing function of p.

30 See Exercise 3.37 in Loss Models. h(x) is the hazard rate.
31 For the Pareto: E[X | X > x] = x + e(x) = x + (x + θ)/(α - 1).
32 As discussed previously, for the Pareto: TVaRp = θ{α(1 - p)^(-1/α)/(α - 1) - 1}.

Normal Distribution:33
For a Normal Distribution, the pth percentile is: μ + σ zp,
where zp is the pth percentile of the Standard Normal.
Exercise: For a Normal Distribution with μ = 100 and σ = 20, determine VaR0.95[X].
[Solution: The 95th percentile of the Standard Normal is 1.645.
VaR0.95[X] = 100 + (20)(1.645) = 132.9.]
As derived below, TVaRp[X] = μ + σ φ[zp] / (1 - p).
Exercise: For a Normal Distribution with μ = 100 and σ = 20, determine TVaR0.95[X].
[Solution: φ[zp] = φ[1.645] = exp[-1.645²/2] / √(2π) = 0.10311.
TVaR0.95[X] = 100 + (20)(0.10311)/(1 - 0.95) = 141.24.
Comment: Note that TVaR0.95[X] > VaR0.95[X].]
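A minimal Python sketch of this exercise, using only the standard library (names are mine):
    from statistics import NormalDist
    mu, sigma, p = 100.0, 20.0, 0.95
    std = NormalDist()
    z_p = std.inv_cdf(p)                              # about 1.645
    var_p = mu + sigma * z_p                          # about 132.9
    tvar_p = mu + sigma * std.pdf(z_p) / (1 - p)      # about 141.2
    print(round(var_p, 1), round(tvar_p, 2))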
For the Standard Normal:
∫_x^∞ t φ(t) dt = ∫_x^∞ t exp[-t²/2] / √(2π) dt = [-exp[-t²/2] / √(2π)]_x^∞ = exp[-x²/2] / √(2π) = φ(x).
For the nonstandard Normal:
∫_x^∞ t f(t) dt = ∫_x^∞ t φ[(t - μ)/σ] / σ dt = ∫_{(x-μ)/σ}^∞ (σy + μ) φ[y] dy =
σ ∫_{(x-μ)/σ}^∞ y φ[y] dy + μ ∫_{(x-μ)/σ}^∞ φ[y] dy = σ φ[(x-μ)/σ] + μ (1 - Φ[(x-μ)/σ]).
TVaRp[X] = ∫_πp^∞ x f(x) dx / (1 - p) = σ φ[(πp-μ)/σ] / (1 - p) + μ {1 - Φ[(πp-μ)/σ]} / (1 - p) =
σ φ[zp] / (1 - p) + μ (1 - Φ[zp]) / (1 - p) = σ φ[zp] / (1 - p) + μ (1 - p)/(1 - p) = μ + σ φ[zp] / (1 - p).

33 See Example 3.14 in Loss Models.

Problems:
4.1 (2 points) What is the TVaR0.95 for an Exponential Distribution with mean 100?
A. 400

B. 425

C. 450

D. 475

E. 500

4.2 (3 points) Losses are Normal with μ = 300 and σ = 10.
Determine the 90% Tail Value at Risk.
Hint: For the Normal Distribution,
E[X ∧ x] = μ Φ[(x-μ)/σ] - σ φ[(x-μ)/σ] + x {1 - Φ[(x-μ)/σ]}.

A. less than 315


B. at least 315 but less than 320
C. at least 320 but less than 325
D. at least 325 but less than 330
E. at least 330
4.3 (3 points) F(x) = 1 - {θ/(θ + x)}⁴.
Calculate the Tail Value at Risk at a security level of 99%.
A. 2.6

B. 2.8

C. 3.0

D. 3.2

E. 3.4

4.4 (2 points) For an Exponential Distribution with mean θ, determine TVaRp - VaRp.
A. θ
B. -θ ln(1 - p)
C. θ - θ ln(1 - p)
D. θ + θ ln(1/p)
E. None of A, B, C, or D

4.5 (3 points) Losses follow a LogNormal Distribution with μ = 7 and σ = 0.8.


Determine TVaR0.995.
A. less than 12,000
B. at least 12,000 but less than 13,000
C. at least 13,000 but less than 14,000
D. at least 14,000 but less than 15,000
E. at least 15,000

Use the following information for the next two questions:


Annual aggregate losses have the following distribution:
Annual Aggregate Losses    Probability
0                          50%
10                         30%
20                         10%
50                          4%
100                         2%
200                         2%
500                         1%
1000                        1%
4.6 (1 point) Determine the 90% Tail Value at Risk.
A. 200
B. 210
C. 220
D. 230

E. 240

4.7 (1 point) Determine the 95% Tail Value at Risk.


A. 375
B. 400
C. 425
D. 450

E. 475

Use the following information for the next 2 questions:


Losses follow a Single Parameter Pareto Distribution, with α = 6 and θ = 1000.
4.8 (1 point) Determine the 98% Value at Risk.
A. 1800

B. 1900

C. 2000

D. 2100

E. 2200

4.9 (2 points) Determine the 98% Tail Value at Risk.


A. 1900

B. 2000

C. 2100

D. 2200

E. 2300

4.10 (3 points) You are given the following information:
Frequency is Binomial with m = 500 and q = 0.3.
Severity is LogNormal with μ = 8 and σ = 0.6.
Frequency and severity are independent.
Using the Normal Approximation, determine the 99% Tail Value at Risk for Aggregate Losses.
Hint: For the Normal Distribution, TVaRp(X) = μ + σ φ[Φ⁻¹(p)] / (1 - p).
A. 655,000

B. 660,000

C. 665,000

D. 670,000

E. 675,000

Use the following information for the next 4 questions:


For the aggregate losses, VaR0.9 is 1,000,000.
4.11 (2 points) John believes that the aggregate losses follow an Exponential Distribution.
Determine John's estimate of TVaR0.9.
4.12 (4 points) Paul believes that the aggregate losses follow a LogNormal Distribution
with σ = 0.6. Determine Paul's estimate of TVaR0.9.
4.13 (4 points) George believes that the aggregate losses follow a LogNormal Distribution
with σ = 1.2. Determine George's estimate of TVaR0.9.
4.14 (3 points) Ringo believes that the aggregate losses follow a Pareto Distribution with α = 3.
Determine Ringo's estimate of TVaR0.9.

Use the following information for the next 2 questions:


In the state of Windiana, a State Fund pays for losses due to hurricanes.
The worst possible annual amounts to be paid by the State Fund in millions of dollars are:
Amount    Probability
100       3.00%
200       1.00%
300       0.50%
400       0.25%
500       0.10%
600       0.05%
700       0.04%
800       0.03%
900       0.02%
1000      0.01%
4.15 (2 points) Determine TVaR0.95 in millions of dollars.
A. 120

B. 140

C. 160

D. 180

E. 200

4.16 (2 points) Determine TVaR0.99 in millions of dollars.


A. 400

B. 450

C. 500

D. 550

E. 600

Use the following information for the next 2 questions:


Losses follow a mixture of two Exponential Distributions,
with means of 1000 and 2000, and with weights of 60% and 40% respectively.
4.17 (2 points) Determine the 95% Value at Risk.
A. 3000
B. 3500
C. 4000
D. 4500

E. 5000

4.18 (3 points) Determine the 95% Tail Value at Risk.


A. 5700
B. 6000
C. 6300
D. 6600

E. 6900

Use the following information for the next 2 questions:


f(x) = 0.050 for 0 ≤ x ≤ 10, f(x) = 0.010 for 10 < x ≤ 50, and f(x) = 0.002 for 50 < x ≤ 100.
4.19 (1 point) Determine the 80% Value at Risk.
A. 35
B. 40
C. 45
D. 50

E. 55

4.20 (2 points) Determine the 80% Tail Value at Risk.


A. 50
B. 55
C. 60
D. 65

E. 70

4.21 (2 points) F(x) = (x/10)⁴, 0 ≤ x ≤ 10.


Determine TVaR0.90.
A. less than 9.70
B. at least 9.70 but less than 9.75
C. at least 9.75 but less than 9.80
D. at least 9.80 but less than 9.85
E. at least 9.85
4.22 (2 points) For a Normal Distribution with μ = 10 and σ = 3, determine TVaR95%.
A. less than 14
B. at least 14 but less than 15
C. at least 15 but less than 16
D. at least 16 but less than 17
E. at least 17
4.23 (3 points) f(x) = 0.0008 for x ≤ 1000, and f(x) = 0.0004 exp[2 - x/500] for x > 1000.
Determine the 95% Tail Value at Risk.
A. 2200
B. 2300
C. 2400
D. 2500
E. 2600

Solutions to Problems:
4.1. A. Set 0.95 = 1 - exp[-π0.95/100]. ⇒ π0.95 = -(100) ln(0.05) = 299.6.
For the Exponential, e(x) = θ = 100. TVaR0.95 = π0.95 + e(π0.95) = 299.6 + 100 = 399.6.
As shown in Appendix A: TVaRp(X) = -θ ln(1-p) + θ = -(100) ln(0.05) + 100 = 399.6.
Comment: See Example 3.15 in Loss Models.
4.2. B. 0.90 = F(x) = Φ[(x - 300)/10]. ⇒ (x - 300)/10 = 1.282.
⇒ x = 300 + (10)(1.282) = 312.82.
φ[(312.82 - μ)/σ] = φ[1.282] = exp[-1.282²/2]/√(2π) = 0.1754.
Φ[(312.82 - μ)/σ] = 0.9.
E[X ∧ x] = μ Φ[(x-μ)/σ] - σ φ[(x-μ)/σ] + x {1 - Φ[(x-μ)/σ]}.
E[X ∧ 312.82] = (300)(0.9) - (10)(0.1754) + (312.82)(1 - 0.9) = 299.53.
e(312.82) = (E[X] - E[X ∧ 312.82]) / (1 - 0.9) = (300 - 299.53)/0.1 = 4.7.
TVaRp = πp + e(πp) = 312.82 + 4.7 = 317.5.
Alternately, for a Normal Distribution, TVaRp[X] = μ + σ φ[zp]/(1 - p) = 300 + (10)φ[1.282]/0.1
= 300 + (100) exp[-1.282²/2]/√(2π) = 317.5.
Comment: See Example 3.14 in Loss Models.


4.3. D. For the Pareto Distribution, 0.99 = 1 - {θ/(θ + π0.99)}⁴. ⇒ π0.99 = θ(100^0.25 - 1) = 2.1623θ.
TVaRp = πp + (E[X] - E[X ∧ πp])/(1 - p) = πp + e(πp).
TVaR0.99 = 2.1623θ + e(2.1623θ) = 2.1623θ + (2.1623θ + θ)/(4 - 1) = 3.2164θ.
As shown in Appendix A, for a Pareto Distribution with parameters α and θ, α > 1:
TVaRp(X) = VaRp(X) + θ(1 - p)^(-1/α)/(α - 1) = θ[(1-p)^(-1/α) - 1] + θ(1 - p)^(-1/α)/(α - 1).
With α = 4, TVaR0.99(X) = θ[(1%)^(-0.25) - 1] + θ(1%)^(-0.25)/3 = 2.1623θ + 1.0541θ = 3.2164θ.
Comment: See Example 3.16 in Loss Models.
For the Pareto Distribution, e(x) = (x + θ)/(α - 1), α > 1.
[Graph of TVaRp, the Tail Value at Risk as a function of p, for F(x) = 1 - {θ/(θ + x)}⁴; increasing in p;
not reproduced.]
4.4. A. TVaRp = πp + e(πp). TVaRp - πp = e(πp) = θ.
As shown in Appendix A: VaRp(X) = -θ ln(1-p). TVaRp(X) = -θ ln(1-p) + θ.
TVaRp(X) - VaRp(X) = θ.
Comment: For an Exponential Distribution, e(x) = θ.

4.5. A. E[X] = exp[7 + 0.8²/2] = 1510.
0.995 = F(x) = Φ[(ln x - 7)/0.8]. ⇒ (ln x - 7)/0.8 = 2.576.
⇒ x = exp[7 + (0.8)(2.576)] = 8611. The 99.5th percentile of the LogNormal is 8611.
E[X ∧ 8611] = (1510)Φ[(ln 8611 - 7 - 0.8²)/0.8] + (8611){1 - Φ[(ln 8611 - 7)/0.8]}
= (1510)Φ[1.78] + (8611){1 - Φ[2.58]} = (1510)(0.9625) + (8611)(0.0049) = 1496.
TVaR0.995 = π0.995 + (E[X] - E[X ∧ π0.995])/S(π0.995) = 8611 + (1510 - 1496)/0.0049
= 11,468.
4.6. D. Average the 10% worst possible outcomes:
TVaR.90 = {(4%)(50) + (2%)(100) + (2%)(200) + (1%)(500) + (1%)(1000)}/10% = 230.
4.7. B. Average the 5% worst possible outcomes:
TVaR.95 = {(1%)(100) + (2%)(200) + (1%)(500) + (1%)(1000)}/5% = 400.
4.8. B. F(x) = 1 - (θ/x)^α. 0.98 = 1 - (1000/π0.98)⁶. ⇒ π0.98 = 1919.
As shown in Appendix A: VaRp = θ(1-p)^(-1/α).
VaR0.98 = (1000)(0.02)^(-1/6) = 1919.
4.9. E. E[X ∧ x] = θ{α - (θ/x)^(α-1)}/(α - 1).
E[X ∧ 1919] = (1000){6 - (1000/1919)⁵}/(6 - 1) = 1192.315.
E[X] = αθ/(α - 1) = (1000)(6/5) = 1200.
TVaR0.98 = π0.98 + (E[X] - E[X ∧ π0.98])/(1 - 0.98) = 1919 + (1200 - 1192.315)/0.02 = 2303.
Alternately, f(x) = 6 × 10¹⁸/x⁷. TVaR0.98 = ∫_1919^∞ x f(x) dx / 0.02 = 2303.
As shown in Appendix A: TVaRp = αθ(1-p)^(-1/α)/(α - 1), for α > 1.
TVaR0.98 = (6)(1000)(0.02)^(-1/6)/5 = 2303.
Comment: For a Single Parameter Pareto, with parameters α and θ, πp = θ/(1 - p)^(1/α).
E[X] - E[X ∧ πp] = θ(θ/πp)^(α-1)/(α - 1) = θ(1 - p)^(1 - 1/α)/(α - 1).
TVaRp = πp + (E[X] - E[X ∧ πp])/(1 - p) = {θ/(1 - p)^(1/α)}{α/(α - 1)} = {α/(α - 1)} πp.


4.10. B. The mean severity is: exp[8 + 0.6²/2] = 3569.
The second moment of severity is: exp[(2)(8) + (2)(0.6²)] = 18,255,921.
The variance of severity is: 18,255,921 - 3569² = 5,518,160.
The mean frequency is: (500)(0.3) = 150.
The variance of frequency is: (500)(0.3)(0.7) = 105.
The mean aggregate loss is: (150)(3569) = 535,350.
The variance of aggregate loss is: (150)(5,518,160) + (3569²)(105) = 2,165,188,905.
Thus we approximate by a Normal Distribution
with μ = 535,350 and σ = √2,165,188,905 = 46,532.
φ[Φ⁻¹(p)] = φ[Φ⁻¹(99%)] = φ[2.326] = exp[-2.326²/2]/√(2π) = 0.02667.
TVaR99%(X) = 535,350 + (46,532)(0.02667) / 0.01 = 659,451.
Comment: VaR0.99 = 535,350 + (2.326)(46,532) = 643,583.
The formula for TVaR for the Normal Distribution is given in Example 3.14 in Loss Models.
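A minimal Python sketch of this calculation (names are mine; standard library only):
    from math import exp
    from statistics import NormalDist
    m, q = 500, 0.3                     # Binomial frequency
    mu, sigma = 8.0, 0.6                # LogNormal severity
    sev_mean = exp(mu + sigma ** 2 / 2)
    sev_var = exp(2 * mu + 2 * sigma ** 2) - sev_mean ** 2
    agg_mean = m * q * sev_mean
    agg_var = m * q * sev_var + sev_mean ** 2 * m * q * (1 - q)
    agg_sd = agg_var ** 0.5
    std = NormalDist()
    p = 0.99
    tvar = agg_mean + agg_sd * std.pdf(std.inv_cdf(p)) / (1 - p)
    print(round(tvar))                  # about 659,000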
4.11. For the Exponential Distribution, VaRp = -θ ln(1-p).
Thus 1,000,000 = -θ ln(1 - 0.9). ⇒ θ = 434,294.
For the Exponential Distribution, TVaRp = VaRp + θ = 1,000,000 + 434,294 = 1,434,294.
4.12. For the LogNormal, the distribution function at 1,000,000 is 0.9.
0.9 = Φ[{ln(1,000,000) - μ}/0.6]. ⇒ 1.282 = {ln(1,000,000) - μ}/0.6. ⇒ μ = 13.0463.
For this LogNormal, E[X] = exp[13.0463 + 0.6²/2] = 554,765.
E[X ∧ x] = exp(μ + σ²/2) Φ[(ln(x) - μ - σ²)/σ] + x {1 - Φ[(ln(x) - μ)/σ]}.
E[X ∧ 1,000,000] = 554,765 Φ[0.682] + (1,000,000){1 - Φ[1.282]}
= (554,765)(0.7517) + (1,000,000)(0.10) = 517,067.
e(1 million) = (554,765 - 517,067)/0.1 = 376,980.
TVaR0.90 = 1,000,000 + 376,980 = 1,376,980.

4.13. For the LogNormal, the distribution function at 1,000,000 is 0.9.
0.9 = Φ[{ln(1,000,000) - μ}/1.2]. ⇒ 1.282 = {ln(1,000,000) - μ}/1.2. ⇒ μ = 12.2771.
For this LogNormal, E[X] = exp[12.2771 + 1.2²/2] = 441,132.
E[X ∧ x] = exp(μ + σ²/2) Φ[(ln(x) - μ - σ²)/σ] + x {1 - Φ[(ln(x) - μ)/σ]}.
E[X ∧ 1,000,000] = 441,132 Φ[0.082] + (1,000,000){1 - Φ[1.282]}
= (441,132)(0.5319) + (1,000,000)(0.10) = 334,638.
e(1 million) = (441,132 - 334,638)/0.1 = 1,064,940.
TVaR0.90 = 1,000,000 + 1,064,940 = 2,064,940.
Comment: The Tail Value at Risk depends on which form of distribution one assumes.
Even assuming a LogNormal Distribution, the Tail Value at Risk depends on σ.
[Graph of TVaR0.90 in $ millions as an increasing function of σ, for σ from 0.5 to 2.0; not reproduced.]
The bigger σ, the heavier the righthand tail and thus the larger TVaR, all else being equal.

4.14. For the Pareto Distribution, VaRp = θ{(1-p)^(-1/α) - 1}.
Thus 1,000,000 = θ{(1 - 0.9)^(-1/3) - 1}. ⇒ θ = 866,225.
For the Pareto Distribution, TVaRp = VaRp + θ(1 - p)^(-1/α)/(α - 1)
= 1,000,000 + (866,225)(1 - 0.9)^(-1/3)/(3 - 1) = 1,933,113.
Comment: The Tail Value at Risk as a function of α:
[Graph of TVaR0.90 in $ millions as a decreasing function of α, for α from 2 to 5; not reproduced.]
The smaller α, the heavier the righthand tail and thus the larger TVaR, all else being equal.
4.15. D. What is shown here is the 5% worst outcomes. Their average is:
{(100)(3%) + (200)(1.00%) + (300)(0.50%) + (400)(0.25%) + (500)(0.10%) + (600)(0.05%) +
(700)(0.04%) + (800)(0.03%) + (900)(0.02%) + (1000)(0.01%)}/5% = 182 million.
4.16. A. The average of the worst 1% of outcomes is:
{(300)(0.50%) + (400)(0.25%) + (500)(0.10%) + (600)(0.05%) +
(700)(0.04%) + (800)(0.03%) + (900)(0.02%) + (1000)(0.01%)}/1% = 410 million.

4.17. D. We wish to find where the survival function is 5%.
0.05 = 0.6 exp[-x/1000] + 0.4 exp[-x/2000].
⇒ 5 exp[x/1000] - 60 - 40 exp[x/2000] = 0.
Let y = exp[x/2000]. Then y² - 8y - 12 = 0.
y = {8 + √(64 + 48)}/2 = 9.292, taking the positive root.
exp[x/2000] = 9.292. ⇒ x = 4458.
4.18. C. The mean of the mixture is: (60%)(1000) + (40%)(2000) = 1400.
The limited expected value of the mixture at 4458 is:
(60%)(1000)(1 - e^(-4458/1000)) + (40%)(2000)(1 - e^(-4458/2000)) = 1306.94.
e(4458) = (1400 - 1306.94)/0.05 = 1861. TVaR95% = 4458 + e(4458) = 4458 + 1861 = 6319.
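A minimal Python sketch of 4.17 and 4.18, solving S(x) = 0.05 by bisection and then using limited expected values (names are mine):
    from math import exp
    w, t1, t2 = 0.6, 1000.0, 2000.0
    def S(x):
        return w * exp(-x / t1) + (1 - w) * exp(-x / t2)
    lo, hi = 0.0, 50000.0               # bisection: solve S(x) = 0.05
    for _ in range(100):
        mid = (lo + hi) / 2
        if S(mid) > 0.05:
            lo = mid
        else:
            hi = mid
    var95 = (lo + hi) / 2               # about 4458
    mean = w * t1 + (1 - w) * t2        # 1400
    lev = w * t1 * (1 - exp(-var95 / t1)) + (1 - w) * t2 * (1 - exp(-var95 / t2))
    tvar95 = var95 + (mean - lev) / 0.05
    print(round(var95), round(tvar95))  # about 4458 and 6319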
4.19. B. & 4.20. C. There is 50% probability on the first interval and 40% probability on the
second interval, so the 80th percentile is in the second interval.
F(40) = 50% + 30% = 80%. Thus the 80th percentile is 40.
The 80% Tail Value at Risk is: E[X | X > 40] = {∫_40^50 x (0.01) dx + ∫_50^100 x (0.002) dx}/0.2
= (4.5 + 7.5)/0.2 = 60.
Alternately, e(40) = E[X - 40 | X > 40] = {∫_40^50 (x - 40)(0.01) dx + ∫_50^100 (x - 40)(0.002) dx}/0.2
= (0.5 + 3.5)/0.2 = 20. The 80% Tail Value at Risk is: 40 + e(40) = 60.
4.21. E. 0.90 = (π0.90/10)⁴. ⇒ 90th percentile = 9.74.
f(x) = 4x³/10,000, 0 ≤ x ≤ 10.
TVaR0.90 = ∫_9.74^10 x (4x³/10,000) dx / 0.1 = 9.873.

4.22. D. For the Normal Distribution: TVaRp[X] = μ + σ φ[zp]/(1 - p).
The 95th percentile of the Standard Normal Distribution is 1.645.
φ[1.645] = exp[-1.645²/2]/√(2π) = 0.10311.
TVaR95% = 10 + (3)φ[1.645]/0.05 = 10 + (60)(0.10311) = 16.19.
4.23. A. f(x) = 0.0008 for x ≤ 1000. F(1000) = 0.8.
To find the 95th percentile:
0.95 = 0.8 + ∫_1000^x 0.0004 exp[2 - t/500] dt = 0.8 + (0.2)(1 - exp[2 - x/500]).
⇒ 0.25 = exp[2 - x/500]. ⇒ x = 1693.
TVaR95%[X] = ∫_1693^∞ 0.0004 exp[2 - t/500] t dt / 0.05 = 0.008 e² ∫_1693^∞ t e^(-t/500) dt
= 0.008 e² [-500 t e^(-t/500) - 500² e^(-t/500)]_(t=1693)^(t=∞) = (0.008) e² (37,110) = 2194.

Section 5, Distortion Risk Measures34

A distortion function, g, maps [0, 1] to [0, 1] such that g(0) = 0, g(1) = 1, and g is increasing.
A distortion risk measure is obtained by taking the integral of g[S(x)]:35
H(X) = ∫ g[S(x)] dx.
Examples of distortion risk measures are:
PH Transform36             g(y) = y^(1/κ)
Wang Transform             g(y) = Φ[Φ⁻¹[y] + λ]
Dual Power Transform       g(y) = 1 - (1 - y)^κ
It is less obvious, but the Value-at-Risk and Tail-Value-at-Risk risk measures can also be put in this
form and are thus distortion risk measures.
Proportional Hazard (PH) Transform:
Define the Proportional Hazard (PH) Transform to be:
g(S(x)) = S(x)^(1/κ), κ ≥ 1.
Exercise: What is the PH transform of an Exponential Distribution?
[Solution: For the Exponential, S(x) = exp[-x/θ]. S(x)^(1/κ) = exp[-x/θ]^(1/κ) = exp[-x/(κθ)].
Thus the PH transform is also an Exponential, but with θ replaced by κθ.]
Recall that E[X] = ∫ S(x) dx.37
The above integral computes the expected value of the losses. If instead we raised the survival
function to some power less than one, we would get a larger integral, since
S(x) < S(x)^(1/κ) for κ > 1 and S(x) < 1.

34 No longer on the syllabus.
35 The integral is taken over the domain of X, which is most commonly 0 to ∞.
36 The PH Transform is a special case of a Beta Transform, where g is a Beta Distribution with θ = 1.
37 See Mahler's Guide to Loss Distributions.
35

The Proportional Hazard Transform risk measure is, for κ ≥ 1:
H(X) = ∫ S(x)^(1/κ) dx.38
For κ = 1, the PH Transform is the mean. As the selected κ increases, so does the PH Transform.
The more averse to risk one is, the higher the selected κ should be, resulting in a higher level of
security.
For a Pareto Distribution with α = 4 and θ = 240, E[X] = 240/(4 - 1) = 80.
S(x) = {240/(240 + x)}⁴. S(x)^(1/κ) = {240/(240 + x)}^(4/κ), a Pareto with α = 4/κ and θ = 240.
Exercise: For κ = 1.2, what is the PH Transform risk measure for this situation?
[Solution: The transformed distribution is a Pareto with α = 4/1.2 = 3.33 and θ = 240.
Therefore ∫_0^∞ S(x)^(1/κ) dx = mean of this transformed Pareto = 240/(3.33 - 1) = 103.]
In the case of the Pareto Distribution, the PH Transform is also a Pareto, but with α replaced by
α/κ. Thus the PH Transform has reduced the Pareto's shape parameter, resulting in a distribution
with a heavier tail.39 The PH Transform risk measure is: θ/(α/κ - 1), for κ < α.

38 The integral is taken over the domain of X, which is most commonly 0 to ∞.
39 Heavier-tailed distributions are sometimes referred to as more risky.
Heavier-tailed distributions are sometimes referred to as more risky.

Here is a graph of the PH Transform Risk Measure, for a Pareto Distribution with α = 4 and
θ = 240, as a function of κ:
[Graph not reproduced; the risk measure starts at E[X] = 80 for κ = 1 and increases rapidly as κ approaches 4.]
Exercise: For this situation, what value of κ corresponds to a relative security loading of 50%?
[Solution: θ/(α/κ - 1) = 1.5 E[X] = 1.5θ/(α - 1). ⇒ α - 1 = 1.5α/κ - 1.5.
⇒ κ = 1.5α/(α + 0.5) = (1.5)(4)/(4 + 0.5) = 6/4.5 = 1.33.]
Exercise: Losses follow an Exponential Distribution with θ = 1000.
Determine the PH Transform risk measure for κ = 1.6.
[Solution: S(x) = exp[-x/1000]. S(x)^(1/1.6) = exp[-x/1600].
The PH Transform risk measure is the mean of the new Exponential, 1600.
Comment: For the Exponential Distribution, the PH Transform risk measure is κθ.]
Wang's Transform:
Wang's Transform produces another risk measure, which is useful for working with Normal or
LogNormal losses.
Let X be LogNormal with μ = 6 and σ = 2. Then S(x) = 1 - Φ[(ln x - 6)/2].
Φ⁻¹ is the inverse function of Φ. Φ⁻¹[0.95] = 1.645. Φ[1.645] = 0.95.
Φ⁻¹[1 - 0.95] = -Φ⁻¹[0.95] = -1.645. Φ[-1.645] = 0.05.

Φ⁻¹[S(x)] = Φ⁻¹[1 - Φ[(ln x - 6)/2]] = -Φ⁻¹[Φ[(ln x - 6)/2]] = -(ln x - 6)/2.
Φ[Φ⁻¹[S(x)] + 0.7] = Φ[0.7 - (ln x - 6)/2] = 1 - Φ[(ln x - 6)/2 - 0.7] = 1 - Φ[(ln x - {6 + (0.7)(2)})/2].
Thus Φ[Φ⁻¹[S(x)] + 0.7] is the survival function of a LogNormal with μ = 6 + (0.7)(2) and σ = 2.
Define Wang's Transform to be: g(S(x)) = Φ[Φ⁻¹[S(x)] + λ], λ ≥ 0.
As shown in the above example, if X is LogNormal with parameters μ and σ, then the Wang
Transform is also LogNormal but with parameters μ + λσ and σ.
The Wang Transform risk measure is, for λ ≥ 0:
H(X) = ∫ Φ[Φ⁻¹[S(x)] + λ] dx.40
Exercise: X is LogNormal with μ = 6 and σ = 2.
Determine the Wang Transform risk measure for λ = 0.7.
[Solution: The Wang Transform is LogNormal with μ = 6 + (0.7)(2) = 7.4 and σ = 2.
The Wang Transform risk measure is the mean of that LogNormal: exp[7.4 + 2²/2] = 12,088.]
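A minimal Python sketch of this exercise, checking at a few points that Φ[Φ⁻¹[S(x)] + λ] is the survival function of the shifted LogNormal, and then reporting its mean (names are mine; standard library only):
    from math import exp, log
    from statistics import NormalDist
    std = NormalDist()
    mu, sigma, lam = 6.0, 2.0, 0.7
    def S(x):                                            # LogNormal(6, 2) survival function
        return 1 - std.cdf((log(x) - mu) / sigma)
    def S_wang(x):                                       # Wang-transformed survival function
        return std.cdf(std.inv_cdf(S(x)) + lam)
    def S_shifted(x):                                    # LogNormal(mu + lam*sigma, sigma) survival
        return 1 - std.cdf((log(x) - (mu + lam * sigma)) / sigma)
    for x in (100.0, 1000.0, 50000.0):
        print(round(S_wang(x), 6), round(S_shifted(x), 6))   # the two columns agree
    print(round(exp(mu + lam * sigma + sigma ** 2 / 2)))     # risk measure, about 12,088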
Dual Power Transform:
g(S(x)) = 1 - {1 - S(x)}^κ = 1 - F(x)^κ, κ ≥ 1.
Exercise: X is uniform from 0 to 1. Determine the Dual Power Transform with κ = 3.
[Solution: F(x) = x. 1 - F(x)³ = 1 - x³, 0 ≤ x ≤ 1.
Comment: This is the survival function of a Beta Distribution with a = 3, b = 1, and θ = 1.
The corresponding density is: 3x², 0 ≤ x ≤ 1.]
Prob[Maximum of a sample of size N ≤ x] = Prob[X ≤ x]^N = F(x)^N.
The distribution function of the maximum of a sample of size N is F(x)^N.
Therefore, 1 - F(x)^N is the survival function of the maximum of a sample of size N.

40 The integral is taken over the domain of X, which is most commonly 0 to ∞.

Therefore, if κ is an integer, the Dual Power Transform is the survival function of the maximum of a
sample of size κ.
The Dual Power Transform measure of risk is:
H(X) = ∫ {1 - F(x)^κ} dx, κ ≥ 1.41
Thus if κ = N, then the Dual Power Transform risk measure is the expected value of the maximum
of a sample of size N.
Exercise: X is uniform from 0 to 1. Determine the Dual Power Transform risk measure with κ = 3.
[Solution: F(x) = x. 1 - F(x)³ = 1 - x³, 0 ≤ x ≤ 1. ∫_0^1 (1 - x³) dx = 1 - 1/4 = 3/4.
Comment: The mean of a Beta Distribution with a = 3, b = 1, and θ = 1 is: (1)(3)/(3 + 1) = 3/4.
The expected value of the maximum of a sample of size N from a uniform distribution on (0, θ) is:
θN/(N + 1). See Mahler's Guide to Statistics, covering material on the syllabus of CAS3.]
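A minimal Python sketch of this exercise, comparing the closed form 3/4 with a simulation of the maximum of samples of size 3 from a uniform on (0, 1) (names, seed, and trial count are mine):
    import random
    random.seed(1)
    kappa, trials = 3, 200000
    closed_form = kappa / (kappa + 1)                        # 3/4
    simulated = sum(max(random.random() for _ in range(kappa))
                    for _ in range(trials)) / trials
    print(closed_form, round(simulated, 3))                  # both about 0.75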
Value at Risk:
For the VaRp risk measure:
g(y) = 0 if 0 ≤ y ≤ 1 - p, and g(y) = 1 if 1 - p < y ≤ 1.
Using the above distortion function, g[S(x)] is one when S(x) > 1 - p, and otherwise zero.
S(x) > 1 - p when x < πp. Thus ∫ g[S(x)] dx = ∫_0^πp 1 dx = πp.
Exercise: What is the distortion function for VaR0.95?
[Solution: g(y) = 0 if 0 ≤ y ≤ 0.05, and g(y) = 1 if 0.05 < y ≤ 1.]

41 The integral is taken over the domain of X, which is most commonly 0 to ∞.

Tail-Value-at-Risk:
For the TVaRp risk measure:
g(y) = y/(1 - p) if 0 ≤ y ≤ 1 - p, and g(y) = 1 if 1 - p < y ≤ 1.
Then, g[S(x)] is S(x)/(1 - p) when S(x) ≤ 1 - p, in other words when x ≥ πp, and otherwise 1. Thus
∫ g[S(x)] dx = ∫_0^πp 1 dx + ∫_πp^∞ S(x)/(1 - p) dx = πp + e(πp) = TVaRp.
Exercise: What is the distortion function for TVaR0.95?
[Solution: g(y) = 20y if 0 ≤ y ≤ 0.05, and g(y) = 1 if 0.05 < y ≤ 1.]
Then, g[S(x)] is 20 S(x) when S(x) ≤ 0.05, in other words when x ≥ π0.95, and otherwise 1.
Thus ∫ g[S(x)] dx = ∫_0^π0.95 1 dx + ∫_π0.95^∞ 20 S(x) dx = π0.95 + (layer from π0.95 to ∞)/0.05
= π0.95 + e(π0.95) = TVaR0.95.

Problems:
5.1 (1 point) What is the PH Transform risk measure with κ = 1.5 for an Exponential Distribution
with mean 100?
A. 150
B. 160

C. 170

D. 180

E. 190

5.2 Which of the following distributions are not preserved under a PH Transform?
A. Single Parameter Pareto
B. Weibull
C. Burr
D. Gompertz's Law, F(x) = 1 - exp[-B(c^x - 1)/ln(c)]
E. LogNormal
5.3 (2 points) F(x) = 1 - {300/(300 + x)}⁵.
Determine the Proportional Hazard Transform risk measure with κ = 2.
A. 200

B. 220

C. 240

D. 260

E. 280

5.4 (2 points) Losses follow a Uniform Distribution from 0 to 100.


Determine the Proportional Hazard Transform risk measure with κ = 1.3.
A. less than 55
B. at least 55 but less than 60
C. at least 60 but less than 65
D. at least 65 but less than 70
E. at least 70
5.5 (3 points) Aggregate losses follow a Single Parameter Pareto Distribution with α = 3 and
θ = 10. However, a reinsurance contract caps the insurer's payments at 30.
Determine the Proportional Hazard Transform risk measure of the insurer's payments with κ = 1.2.
A. less than 16
B. at least 16 but less than 17
C. at least 17 but less than 18
D. at least 18 but less than 19
E. at least 19
5.6 (3 points) Losses follow a Weibull Distribution with θ = 1000 and τ = 0.4.
Determine the Proportional Hazard Transform risk measure with κ = 1.8. Hint: Γ(1/2) = √π.
A. 14,000

B. 14,500

C. 15,000

D. 15,500

E. 16,000

2013-4-4,

Risk Measures 5 Distortion Risk Measures,

HCM 10/8/12,

Page 52

5.7 (2 points) Annual aggregate losses have the following distribution:


Annual Aggregate Losses    Probability
100                        60%
500                        30%
1000                       10%
Determine the Proportional Hazard Transform risk measure with κ = 2.
A. less than 350
B. at least 350 but less than 400
C. at least 400 but less than 450
D. at least 450 but less than 500
E. at least 500
Use the following information for the next two questions:
The premium will be set equal to the proportional hazard transform of the distribution of aggregate
annual losses retained by the insurer or reinsurer, with κ = 1.2.
The relative security loading is such that: Premiums = (1 + relative security loading) × (Expected Losses).
5.8 (6 points) Annual aggregate losses follow an Exponential Distribution with mean 100.
Determine the relative security loading for the following situations:
(a) The insurer retains all losses.
(b) The insurer retains only the layer from 0 to 50.
(c) A reinsurer retains only the layer from 50 to 100.
(d) A reinsurer retains only the layer above 100.
5.9 (6 points) Annual aggregate losses follow a Pareto Distribution with α = 3 and θ = 200.
Determine the relative security loading for the following situations:
(a) The insurer retains all losses.
(b) The insurer retains only the layer from 0 to 50.
(c) A reinsurer retains only the layer from 50 to 100.
(d) A reinsurer retains only the layer above 100.
5.10 (2 points) Losses are LogNormal with μ = 4 and σ = 0.6.
Determine the Wang Transform risk measure for λ = 0.3.
A. less than 75
B. at least 75 but less than 80
C. at least 80 but less than 85
D. at least 85 but less than 90
E. at least 90

5.11 (2 points) Losses are Normal with μ = 7000 and σ = 500.
Determine the Wang Transform risk measure for λ = 0.8.
A. 7000

B. 7100

C. 7200

D. 7300

E. 7400

5.12 (3 points) Annual aggregate losses have the following distribution:


Annual Aggregate Losses    Probability
100                        60%
500                        30%
1000                       10%
Determine the Wang Transform risk measure with λ = 0.5.
A. 350

B. 400

C. 450

D. 500

E. 550

5.13 (4 points) Y+ is defined as 0 if Y ≤ 0, and Y if Y > 0.
X is the value of a put option.
X = (220 - P)+, where P follows a LogNormal Distribution with μ = 5.5 and σ = 0.2.
Determine the Wang Transform risk measure with λ = 0.6.
A. 17

B. 19

C. 21

D. 23

E. 25

5.14 (4 points) Y+ is defined as 0 if Y ≤ 0, and Y if Y > 0.
X is the value of a call option.
X = (P - 300)+, where P follows a LogNormal Distribution with μ = 5.5 and σ = 0.2.
Determine the Wang Transform risk measure with λ = 0.4.
A. 8

B. 9

C. 10

D. 11

E. 12

5.15 (2 points) What is the Dual Power Transform risk measure with κ = 3 for an Exponential
Distribution with mean 100?
A. 140
B. 160
C. 180

D. 200

E. 220

5.16 (2 points) Losses are uniform from 0 to 100.


Determine the Dual Power Transform risk measure with κ = 1.4.
A. 52

B. 54

C. 56

D. 58

E. 60

5.17 (2 points) Losses follow a Pareto Distribution with α = 5 and θ = 10.
Determine the Dual Power Transform risk measure with κ = 2.
A. 3.3

B. 3.5

C. 3.7

D. 3.9

E. 4.1

5.18 (2 points) Annual aggregate losses have the following distribution:


Annual Aggregate Losses    Probability
100                        60%
500                        30%
1000                       10%
Determine the Dual Power Transform risk measure with κ = 1.5.
A. less than 350
B. at least 350 but less than 400
C. at least 400 but less than 450
D. at least 450 but less than 500
E. at least 500
5.19 (1 point) Graph the distortion function corresponding to VaR0.90.
5.20 (1 point) Graph the distortion function corresponding to TVaR0.90.
5.21 (1 point) Graph the distortion function corresponding to the PH Transform with κ = 2.
5.22 (1 point) Graph the distortion function corresponding to the Dual Power Transform with κ = 2.
5.23 (3 points) Graph the distortion function corresponding to the Wang Transform with λ = 0.3.
5.24 (3 points) A distortion risk measure has:
g(y) = 10y if 0 ≤ y ≤ 0.1, and g(y) = 1 if 0.1 < y ≤ 1.
Determine the risk measure for a Pareto Distribution with α = 3 and θ = 200.
A. less than 500
B. at least 500 but less than 600
C. at least 600 but less than 700
D. at least 700 but less than 800
E. at least 800
5.25 (1 point) Which of the following are distortion risk measures?
1. PH (Proportional Hazard) Transform
2. Dual Power Transform
3. Expected Value Premium Principle
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, and 3
E. Not A, B, C, or D

5.26 (2 points) Which of the following is the distortion function for the VaR risk measure for
p = 90%?
A. g(y) = 0 if 0 ≤ y ≤ 0.10, and g(y) = 1 if 0.10 < y ≤ 1
B. g(y) = 0 if 0 ≤ y ≤ 0.90, and g(y) = 1 if 0.90 < y ≤ 1
C. g(y) = 10y if 0 ≤ y ≤ 0.10, and g(y) = 1 if 0.10 < y ≤ 1
D. g(y) = 10y if 0 ≤ y ≤ 0.90, and g(y) = 1 if 0.90 < y ≤ 1
E. None of A, B, C, or D
5.27 (2 points) Which of the following is the distortion function for the TVaR risk measure for
p = 90%?
A. g(y) = 0 if 0 ≤ y ≤ 0.10, and g(y) = 1 if 0.10 < y ≤ 1
B. g(y) = 0 if 0 ≤ y ≤ 0.90, and g(y) = 1 if 0.90 < y ≤ 1
C. g(y) = 10y if 0 ≤ y ≤ 0.10, and g(y) = 1 if 0.10 < y ≤ 1
D. g(y) = 10y if 0 ≤ y ≤ 0.90, and g(y) = 1 if 0.90 < y ≤ 1
E. None of A, B, C, or D
5.28 (4, 5/07, Q.27) (2.5 points) You are given the distortion function:
g(x) = √x.
Calculate the distortion risk measure for losses that follow the Pareto distribution with θ = 1000 and
α = 4.
(A) Less than 300
(B) At least 300, but less than 600
(C) At least 600, but less than 900
(D) At least 900, but less than 1200
(E) At least 1200

Solutions to Problems:
5.1. A. S(x) = exp[-x/100]. S(x)^(1/κ) = exp[-x/100]^(1/1.5) = exp[-x/150].
Thus the PH transform is also an Exponential, but with mean 150.
5.2. E. For the Single Parameter Pareto, S(x) = (θ/x)^α. S(x)^(1/κ) = (θ/x)^(α/κ).
Another Single Parameter Pareto, but with α replaced by α/κ.
For the Weibull, S(x) = exp[-(x/θ)^τ].
S(x)^(1/κ) = exp[-(x/θ)^τ]^(1/κ) = exp[-(x/θ)^τ/κ] = exp[-{x/(θκ^(1/τ))}^τ].
Thus the PH transform is also a Weibull distribution, but with θ replaced by θκ^(1/τ).
For the Burr, S(x) = {1/(1 + (x/θ)^γ)}^α. S(x)^(1/κ) = {1/(1 + (x/θ)^γ)}^(α/κ).
Thus the PH transform is also a Burr Distribution, but with α replaced by α/κ.
For Gompertz's Law, S(x) = exp[-B(c^x - 1)/ln(c)]. S(x)^(1/κ) = exp[-(B/κ)(c^x - 1)/ln(c)].
Thus the PH transform is also Gompertz's Law, but with B replaced by B/κ.
For the LogNormal, S(x) = 1 - Φ[(ln x - μ)/σ]. S(x)^(1/κ) is not of the same form.
5.3. A. S(x) = {300/(300 + x)}⁵. S(x)^(1/2) = {300/(300 + x)}^2.5, a Pareto Distribution with α = 2.5
and θ = 300. The risk measure is the mean of this second Pareto Distribution, 300/(2.5 - 1) = 200.
5.4. B. S(x) = 1 - x/100, x ≤ 100. S(x)^(1/1.3) = (1 - x/100)^(1/1.3).
∫_0^100 (1 - x/100)^(1/1.3) dx = [-100(1 - x/100)^(1+1/1.3)/(1 + 1/1.3)]_(x=0)^(x=100)
= 100/(1 + 1/1.3) = 56.52.

5.5. A. S(x) = (10/x)³, x < 30. S(x)^(1/1.2) = (10/x)^(3/1.2), x < 30, a Single Parameter Pareto
Distribution with α = 3/1.2 = 2.5 and θ = 10, capped at 30.
The risk measure is the limited expected value at 30 of this transformed Single Parameter Pareto.
E[X ∧ x] = θ{α - (θ/x)^(α-1)}/(α - 1). E[X ∧ 30] = 10{2.5 - (10/30)^1.5}/1.5 = 15.38.

5.6. B. For the Weibull, S(x) = exp[-(x/θ)^τ].
S(x)^(1/κ) = exp[-(x/θ)^τ/κ] = exp[-{x/(θκ^(1/τ))}^τ].
Thus the PH transform is also a Weibull distribution, but with θ replaced by θκ^(1/τ).
Therefore, the risk measure is the mean of a Weibull with θ = (1000)(1.8^(1/0.4)) = 4347 and τ = 0.4:
4347 Γ(1 + 1/0.4) = 4347 Γ(3.5) = (4347)(2.5)(1.5)(0.5)Γ(1/2) = 8151√π = 14,447.
5.7. E. For the original distribution: S(x) = 1 for x < 100, 0.4 for 100 ≤ x < 500,
0.1 for 500 ≤ x < 1000, 0 for x ≥ 1000.
For the PH Transform, S(x)^(1/2) = 1 for x < 100, 0.4^(1/2) = 0.6325 for 100 ≤ x < 500,
0.1^(1/2) = 0.3162 for 500 ≤ x < 1000, 0 for x ≥ 1000.
The integral of the Survival Function of the PH Transform is:
(100)(1) + (500 - 100)(0.6325) + (1000 - 500)(0.3162) = 511.
Comment: The mean of the original distribution is:
(100)(1) + (500 - 100)(0.4) + (1000 - 500)(0.1) = 310
= (60%)(100) + (30%)(500) + (10%)(1000).
5.8. For an Exponential with mean θ, S(x) = e^(-x/θ). S(x)^(1/κ) = e^(-x/(κθ)).
∫_d^u S(x) dx = θ(e^(-d/θ) - e^(-u/θ)).
Therefore for the layer from d to u, E[X] = θ(e^(-d/θ) - e^(-u/θ)), and H(X) = κθ{e^(-d/(κθ)) - e^(-u/(κθ))}.
H(X)/E[X] = κ{e^(-d/(κθ)) - e^(-u/(κθ))}/(e^(-d/θ) - e^(-u/θ)) = 1.2(e^(-d/120) - e^(-u/120))/(e^(-d/100) - e^(-u/100)).
(a) For d = 0 and u = ∞, H(X)/E[X] = 1.2. ⇒ relative security loading = 20.0%.
(b) For d = 0 and u = 50, H(X)/E[X] = 1.2(1 - e^(-50/120))/(1 - e^(-50/100)) = 1.039. ⇒ 3.9%.
(c) For d = 50 and u = 100, H(X)/E[X] = 1.2(e^(-50/120) - e^(-100/120))/(e^(-50/100) - e^(-100/100)) = 1.130.
⇒ 13.0%.
(d) For d = 100 and u = ∞, H(X)/E[X] = 1.2(e^(-100/120) - 0)/(e^(-100/100) - 0) = 1.418. ⇒ 41.8%.
Comment: The lowest layer gets the smallest relative security loading, while the highest layer gets
the highest relative security loading.

5.9. For the Pareto, S(x) = θ^α/(x + θ)^α. S(x)^(1/κ) = θ^(α/κ)/(x + θ)^(α/κ).
∫_d^u S(x) dx = θ^α {1/(θ + d)^(α-1) - 1/(θ + u)^(α-1)}/(α - 1).
Therefore for the layer from d to u, E[X] = θ^α{1/(θ + d)^(α-1) - 1/(θ + u)^(α-1)}/(α - 1) =
200³{1/(200 + d)² - 1/(200 + u)²}/2, and H(X) = θ^(α/κ){1/(θ + d)^(α/κ-1) - 1/(θ + u)^(α/κ-1)}/(α/κ - 1) =
200^2.5{1/(200 + d)^1.5 - 1/(200 + u)^1.5}/1.5.
H(X)/E[X] = (4/3)200^(-0.5){1/(200 + d)^1.5 - 1/(200 + u)^1.5}/{1/(200 + d)² - 1/(200 + u)²}.
(a) For d = 0 and u = ∞, H(X)/E[X] = 4/3. ⇒ relative security loading = 1/3 = 33.3%.
(b) For d = 0 and u = 50, H(X)/E[X] = (4/3)200^(-0.5){1/200^1.5 - 1/250^1.5}/{1/200² - 1/250²} = 1.054.
⇒ 5.4%.
(c) For d = 50 and u = 100, H(X)/E[X] = (4/3)200^(-0.5){1/250^1.5 - 1/300^1.5}/{1/250² - 1/300²} =
1.167. ⇒ 16.7%.
(d) For d = 100 and u = ∞, H(X)/E[X] = (4/3)200^(-0.5){1/300^1.5}/{1/300²} = 1.633. ⇒ 63.3%.
5.10. B. The Wang Transform is LogNormal with μ = 4 + (0.3)(0.6) = 4.18 and σ = 0.6.
The Wang Transform risk measure is the mean of that LogNormal: exp[4.18 + 0.6²/2] = 78.26.
5.11. E. Φ⁻¹[S(x)] = Φ⁻¹[1 - Φ[(x - μ)/σ]] = -Φ⁻¹[Φ[(x - μ)/σ]] = -(x - μ)/σ.
Φ[Φ⁻¹[S(x)] + λ] = Φ[λ - (x - μ)/σ] = 1 - Φ[(x - μ)/σ - λ] = 1 - Φ[(x - {μ + λσ})/σ].
This is the survival function of a Normal with mean μ + λσ and standard deviation σ.
The Wang Transform risk measure is the mean of that Normal: μ + λσ.
In this case, μ + λσ = 7000 + (0.8)(500) = 7400.
Comment: As applied to the Normal Distribution, the Wang Transform is equivalent to the
Standard Deviation Premium Principle, with the multiplier of the standard deviation equal to λ.
The Wang Transform Risk Measure for λ > 0 is greater than the mean, eliminating choice A.

5.12. C. For the original distribution: S(x) = 1 for x < 100, 0.4 for 100 ≤ x < 500,
0.1 for 500 ≤ x < 1000, 0 for x ≥ 1000.
Φ⁻¹[S(x)] is: ∞ for x < 100, Φ⁻¹[0.4] = -0.253 for 100 ≤ x < 500,
Φ⁻¹[0.1] = -1.282 for 500 ≤ x < 1000, -∞ for x ≥ 1000.
Φ⁻¹[S(x)] + λ is: ∞ for x < 100, 0.247 for 100 ≤ x < 500, -0.782 for 500 ≤ x < 1000,
-∞ for x ≥ 1000.
Φ[Φ⁻¹[S(x)] + λ] is: 1 for x < 100, Φ[0.25] = 0.5987 for 100 ≤ x < 500,
Φ[-0.78] = 0.2177 for 500 ≤ x < 1000, 0 for x ≥ 1000.
The integral of the Survival Function of the Wang Transform is:
(100)(1) + (500 - 100)(0.5987) + (1000 - 500)(0.2177) = 448.
5.13. A. SX(x) = Prob[X > x] = Prob[220 - P > x] = Prob[P < 220 - x] = FP(220 - x) =
Φ[{ln(220 - x) - 5.5}/0.2], for x ≤ 220.
Φ⁻¹[S(x)] = {ln(220 - x) - 5.5}/0.2, for x ≤ 220.
Φ⁻¹[S(x)] + 0.6 = {ln(220 - x) - 5.38}/0.2, for x ≤ 220.
Φ[Φ⁻¹[S(x)] + 0.6] = Φ[{ln(220 - x) - 5.38}/0.2], for x ≤ 220.
Let p = 220 - x; then Φ[Φ⁻¹[S(x)] + 0.6] = Φ[{ln(p) - 5.38}/0.2], for p ≤ 220.
The integral of Φ[Φ⁻¹[S(x)] + 0.6] is the integral of a LogNormal Distribution Function with μ = 5.38
and σ = 0.2, from 0 to 220.
∫_0^220 F(x) dx = ∫_0^220 {1 - S(x)} dx = 220 - ∫_0^220 S(x) dx = 220 - E[X ∧ 220].
For the LogNormal with μ = 5.38 and σ = 0.2, E[X ∧ 220] =
exp[5.38 + 0.2²/2] Φ[(ln 220 - 5.38 - 0.2²)/0.2] + 220{1 - Φ[(ln 220 - 5.38)/0.2]}
= (221.406)Φ[-0.13] + 220{1 - Φ[0.07]} = (221.406)(0.4483) + (220)(0.4721) = 203.1.
The integral of Φ[Φ⁻¹[S(x)] + 0.6] is: 220 - 203.1 = 16.9.

5.14. D. SX(x) = Prob[X > x] = Prob[P - 300 > x] = Prob[P > 300 + x] = SP(300 + x) =
1 - Φ[{ln(300 + x) - 5.5}/0.2] = Φ[{5.5 - ln(300 + x)}/0.2], for x > 0.
Φ⁻¹[S(x)] = {5.5 - ln(300 + x)}/0.2, for x > 0.
Φ⁻¹[S(x)] + 0.4 = {5.58 - ln(300 + x)}/0.2, for x > 0.
Φ[Φ⁻¹[S(x)] + 0.4] = Φ[{5.58 - ln(300 + x)}/0.2] = 1 - Φ[{ln(300 + x) - 5.58}/0.2], for x > 0.
Let p = 300 + x; then Φ[Φ⁻¹[S(x)] + 0.4] = 1 - Φ[{ln(p) - 5.58}/0.2], for p > 300.
The integral of Φ[Φ⁻¹[S(x)] + 0.4] is the integral of a LogNormal Survival Function with μ = 5.58 and
σ = 0.2, from 300 to ∞, which is for that LogNormal: E[X] - E[X ∧ 300].
For the LogNormal with μ = 5.58 and σ = 0.2, E[X] = exp[5.58 + 0.2²/2] = 270.426. E[X ∧ 300] =
exp[5.58 + 0.2²/2] Φ[(ln 300 - 5.58 - 0.2²)/0.2] + 300{1 - Φ[(ln 300 - 5.58)/0.2]}
= (270.426)Φ[0.42] + 300{1 - Φ[0.62]} = (270.426)(0.6628) + (300)(0.2676) = 259.5.
The integral of Φ[Φ⁻¹[S(x)] + 0.4] is: 270.426 - 259.5 = 10.9.
5.15. C. F(x) = 1 - exp[-x/100]. F(x)³ = 1 - 3 exp[-x/100] + 3 exp[-2x/100] - exp[-3x/100].
∫_0^∞ {1 - F(x)³} dx = 3(100) - (3)(100/2) + 100/3 = 183.3.
Comment: The expected value of the maximum of a sample of size N from an Exponential with
mean θ is: θ Σ_(i=1)^N 1/i. See Mahler's Guide to Statistics, covering material on the syllabus of
CAS Exam 3.
5.16. D. F(x) = x/100. F(x)^1.4 = x^1.4/100^1.4.
∫_0^100 {1 - F(x)^1.4} dx = 100 - 100/2.4 = 58.33.
Comment: For a uniform distribution on (0, θ), the Dual Power Transform risk measure is:
θκ/(κ + 1).
5.17. D. F(x) = 1 - {10/(10 + x)}⁵. F(x)² = 1 - 2{10/(10 + x)}⁵ + {10/(10 + x)}¹⁰.
∫_0^∞ {1 - F(x)²} dx = 10{2/4 - 1/9} = 3.89.

5.18. B. For the original distribution: F(x) = 0 for x < 100, 0.6 for 100 ≤ x < 500,
0.9 for 500 ≤ x < 1000, 1 for x ≥ 1000.
1 - F(x)^1.5 is: 1 for x < 100, 1 - 0.6^1.5 = 0.5352 for 100 ≤ x < 500,
1 - 0.9^1.5 = 0.1462 for 500 ≤ x < 1000, 0 for x ≥ 1000.
The integral of the Survival Function of the Dual Power Transform is:
(100)(1) + (500 - 100)(0.5352) + (1000 - 500)(0.1462) = 387.
5.19. 90%-VaR. g(y) = 0 for 0 ≤ y ≤ 10%, and 1 for 10% < y ≤ 1.
[Graph of g(y): a step function equal to 0 up to y = 0.10 and 1 thereafter; not reproduced.]
5.20. TVaR90%. g(y) = y/0.1 = 10y for 0 ≤ y ≤ 10%, and 1 for 10% < y ≤ 1.
[Graph of g(y): rises linearly from 0 to 1 on [0, 0.10], then equals 1; not reproduced.]
5.21. PH Transform with κ = 2. g(y) = y^(1/2) = √y.
[Graph of g(y) = √y on [0, 1]; not reproduced.]
5.22. Dual Power Transform with κ = 2. g(y) = 1 - (1 - y)².
[Graph of g(y) = 1 - (1 - y)² on [0, 1]; not reproduced.]
5.23. Wang Transform with λ = 0.3. g(y) = Φ[Φ⁻¹[y] + 0.3].
[Graph of g(y) on [0, 1]; not reproduced.]
For example, without rounding, g(0.05) = Φ[Φ⁻¹[0.05] + 0.3] = Φ[-1.645 + 0.3] = Φ[-1.345] = 0.0893.
g(0.7) = Φ[Φ⁻¹[0.7] + 0.3] = Φ[0.524 + 0.3] = Φ[0.824] = 0.795.
5.24. A. This is the distortion function for TVaR0.90.
H(X) = ∫ g[S(x)] dx = ∫_0^Q0.90 dx + ∫_Q0.90^∞ 10 S(x) dx = Q0.90 + (E[X] - E[X ∧ Q0.90])/(1 - 0.9)
= TVaR0.90.
E[X] = 200/(3 - 1) = 100.
0.9 = F(x) = 1 - {200/(200 + x)}³. ⇒ Q0.90 = 230.89.
E[X ∧ 230.9] = {200/(3 - 1)}{1 - (200/(200 + 230.9))²} = 78.46.
TVaR0.90 = Q0.90 + (E[X] - E[X ∧ Q0.90])/(1 - 0.9) = 230.89 + (100 - 78.46)/0.1 = 446.
Comment: y = S(x) in the definition of g(y), the distortion function.
0 ≤ y ≤ 0.1 corresponds to 0 ≤ S(x) ≤ 0.1, in other words F(x) ≥ 0.9, in other words x ≥ Q0.90.
When S(x) is small, x is large, while when S(x) is large, x is small.
0.1 < y corresponds to 0.1 < S(x), in other words F(x) < 0.9, in other words x < Q0.90.
5.25. A. The Expected Value Premium Principle is not a distortion risk measure.

5.26. A. For the VaRp risk measure: g(y) = 0 if 0 ≤ y ≤ 1 - p, and g(y) = 1 if 1 - p < y ≤ 1.
For 90%-VaR: g(y) = 0 if 0 ≤ y ≤ 0.10, and g(y) = 1 if 0.10 < y ≤ 1.
Comment: g(S(x)) = 0 for S(x) ≤ 0.10, and g(S(x)) = 1 for S(x) > 0.10.
In other words, g(S(x)) = 0 for x ≥ Q0.90, and g(S(x)) = 1 for x < Q0.90.
Therefore, ∫ g(S(x)) dx = ∫_0^Q0.90 dx = Q0.90.
5.27. C. For the TVaRp risk measure: g(y) = y/(1 - p) if 0 ≤ y ≤ 1 - p, and g(y) = 1 if 1 - p < y ≤ 1.
For 90%-TVaR: g(y) = 10y if 0 ≤ y ≤ 0.10, and g(y) = 1 if 0.10 < y ≤ 1.
Comment: The distortion function for TVaRp involves 1/(1 - p).
g(S(x)) = 10 S(x) for S(x) ≤ 0.10, and g(S(x)) = 1 for S(x) > 0.10.
In other words, g(S(x)) = 10 S(x) for x ≥ Q0.90, and g(S(x)) = 1 for x < Q0.90.
Therefore, ∫ g(S(x)) dx = ∫_0^Q0.90 dx + ∫_Q0.90^∞ 10 S(x) dx = Q0.90 + 10 E[(X - Q0.90)+] =
Q0.90 + E[(X - Q0.90)+]/0.1 = Q0.90 + e(Q0.90) = 90%-TVaR.


5.28. D. For this Pareto, S(x) = {1000/(1000 + x)}⁴.
g(S(x)) = {1000/(1000 + x)}², the Survival Function of another Pareto with θ = 1000 and α = 2.
H(X) is the integral of g(S(x)), the mean of this second Pareto: 1000/(2 - 1) = 1000.
Alternately, ∫ g(S(x)) dx = ∫_0^∞ 1000²/(1000 + x)² dx = [-1,000,000/(1000 + x)]_0^∞ = 1000.
Comment: PH transform with κ = 2.
For a Pareto Distribution, the PH Transform risk measure is: θ/(α/κ - 1), α/κ > 1.

Section 6, Coherence42
There are various desirable properties for a risk measure to satisfy.
A risk measure is coherent if it has the following four properties:
1. Translation Invariance
2. Positive Homogeneity
3. Subadditivity
4. Monotonicity
Translation Invariance:
(X + c) = (X) + c, for any constant c.
In other words, a risk measure is translation invariant if adding a constant to the loss variable, adds
that same constant to the risk measure.
Letting X = 0, Translation Invariance ⇒ ρ(c) = c.
In other words, if the outcome is certain, the risk measure is equal to the loss.
For example, if the loss is always 1000, then the risk measure is 1000.
Positive Homogeneity:
ρ(cX) = c ρ(X), for any constant c > 0.
In other words, a risk measure is positive homogeneous if multiplying the loss variable by a
positive constant, multiplies the risk measure by the same constant.
Positive Homogeneity ⇒ If the loss variable is converted to a different currency at a fixed rate of
exchange, then so is the risk measure.
Positive Homogeneity ⇒ If the exposure to loss is doubled, then so is the risk measure.
Subadditivity:
ρ(X + Y) ≤ ρ(X) + ρ(Y).
42

See Section 3.5.2 of Loss Models, in particular Definition 3.11. See also Setting Capital Requirements With
Coherent Measures of Risk, by Glenn G. Meyers, August 2002 and November 2002 Actuarial Reviews.

2013-4-4,

Risk Measures 6 Coherence,

HCM 10/8/12,

Page 66

In other words, a risk measure satisfies subadditivity, if the merging of two portfolios can not
increase the total risk compared to the sum of their individual risks, but may decrease the total risk.
It should not be possible to reduce the appropriate premium or the required surplus by splitting a
portfolio into its constituent parts.43
Exercise: Determine whether VaR90% satisfies subadditivity.
[Solution: For example, take the following joint distribution for X and Y:
X = 0 and Y = 0, with probability 88%
X = 0 and Y = 1, with probability 4%
X = 1 and Y = 0, with probability 4%
X = 1 and Y = 1, with probability 4%
Then for X, Prob[X = 0] = 92%, Prob[X = 1] = 8%, π.90 = 0.
For Y, Prob[Y = 0] = 92%, Prob[Y = 1] = 8%, π.90 = 0.
For X + Y, Prob[X + Y = 0] = 88%, Prob[X + Y = 1] = 8%, Prob[X + Y = 2] = 4%, π.90 = 1.
ρ(X + Y) = 1 > 0 = 0 + 0 = ρ(X) + ρ(Y).
Thus VaR90% does not have the subadditivity property.
Comment: See Example 3.13 in Loss Models.
X and Y are not independent.
In order to have the subadditivity property, one must have that H(X + Y) ≤ H(X) + H(Y), for all
possible distributions of losses X and Y.]
Since it does not satisfy subadditivity, Value at Risk (VaR) is not coherent.44
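For readers who like to check such calculations numerically, here is a minimal Python sketch of the counterexample above; the function name var_p and the layout are illustrative, not from the text.

# Numerical check of the subadditivity counterexample above (illustrative sketch).
outcomes = [((0, 0), 0.88), ((0, 1), 0.04), ((1, 0), 0.04), ((1, 1), 0.04)]

def var_p(values, probs, p):
    """Return the 100p-th percentile: the smallest value with cumulative probability >= p."""
    cum = 0.0
    for v, pr in sorted(zip(values, probs)):
        cum += pr
        if cum >= p:
            return v

probs = [pr for _, pr in outcomes]
xs = [x for (x, _y), _ in outcomes]
ys = [y for (_x, y), _ in outcomes]
sums = [x + y for (x, y), _ in outcomes]

print(var_p(xs, probs, 0.90), var_p(ys, probs, 0.90), var_p(sums, probs, 0.90))
# prints: 0 0 1, so VaR(X + Y) = 1 > 0 + 0 = VaR(X) + VaR(Y)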
Monotonicity:
If Prob[X ≤ Y] = 1, then ρ(X) ≤ ρ(Y).45
In other words, a risk measure satisfies monotonicity if, when X is never greater than Y, the risk
associated with X is not greater than the risk associated with Y.
For example, let X = (180 - P)+ and Y = (200 - P)+.46 Then X ≤ Y.
Therefore, for any risk measure that satisfies monotonicity, ρ(X) ≤ ρ(Y).
43 Mergers do not increase risk. Diversification does not increase risk.
44 Nevertheless, VaR is still commonly used, particularly in banking. In most practical applications, VaR is
subadditive. Also in some circumstances it may be valuable to disaggregate risks.
45 Technically, we are allowing X > Y on a set of probability zero, something of interest to mathematicians but not
most actuaries.
46 X and Y are two put options on the price of the same stock P, with different strike prices.

Risk Measures:
The Tail Value at Risk is a coherent measure of risk.
The Standard Deviation Premium Principle and the Value at Risk are not coherent
measures of risk.
Risk Measure                            Translation   Positive      Subadditivity   Monotonicity   Coherence
                                        Invariance    Homogeneity
Expected Value Premium Principle        No            Yes           Yes             Yes            No
Standard Deviation Premium Principle    Yes           Yes           Yes             No             No
Variance Premium Principle              Yes           No            No              No             No
Value at Risk                           Yes           Yes           No              Yes            No
Tail Value at Risk                      Yes           Yes           Yes             Yes            Yes

A measure of risk is coherent if and only if it can be expressed as the supremum of the
expected losses taken over a class of probability measures on a finite set of scenarios.47
A distortion measure is coherent if and only if the distortion function is concave.48
From this it follows that the PH Transform Risk Measure, the Dual Power Transform, and
the Wang Transform are each coherent.
It can be shown that for a coherent risk measure: E[X] ≤ ρ(X) ≤ Max[X].49

47

Coherent Measures of Risk by Philippe Artzner, Freddy Delbaen, Jean-Marc Eber, and David Heath,
Mathematical Finance 9 (1999), No. 3.
48
Distortion Risk Measures: Coherence and Stochastic Dominance by Julia L. Wirch and Mary R. Hardy,
presented at the 6th International Congress on Insurance: Mathematics and Economics.
49
TVaR0 = E[X] and TVaR1 = Max[X].

Problems:
6.1 (3 points) List and briefly define the properties that make a risk measure, ρ(X), coherent.
6.2 (3 points) Briefly discuss whether the Expected Value Premium Principle is a coherent risk
measure. Which of the properties does it satisfy?
6.3 (3 points) Briefly discuss whether the Standard Deviation Premium Principle is a coherent risk
measure. Which of the properties does it satisfy?
6.4 (3 points) Briefly discuss whether the Variance Premium Principle is a coherent risk measure.
Which of the properties does it satisfy?
6.5 (3 points) Briefly discuss whether the Value at Risk is a coherent risk measure.
Which of the properties does it satisfy?
6.6 (2 points) Briefly discuss whether the Tail Value at Risk is a coherent risk measure.
Which of the properties does it satisfy?
6.7 (3 points) Briefly discuss whether ρ(X) = E[X] is a coherent risk measure.
Which of the properties does it satisfy?
6.8 (3 points) Define ρ(X) = Maximum[X], for loss distributions for which Maximum[X] < ∞.
Briefly discuss whether this is a coherent risk measure.
Which of the properties does it satisfy?
6.9 (3 points) The Exponential Premium Principle has ρ(X) = ln[E[e^(αX)]]/α, α > 0.
Briefly discuss whether it is a coherent risk measure. Which of the properties does it satisfy?

2013-4-4,

Risk Measures 6 Coherence,

HCM 10/8/12,

Page 69

Solution to Problems:
6.1. 1. Translation Invariance. ρ(X + c) = ρ(X) + c.
2. Positive Homogeneity. ρ(cX) = c ρ(X), for any constant c > 0.
3. Subadditivity. ρ(X + Y) ≤ ρ(X) + ρ(Y).
4. Monotonicity. If Prob[X ≤ Y] = 1, then ρ(X) ≤ ρ(Y).
6.2. ρ(X) = (1 + k)E[X], k > 0.
1. ρ(X + c) = (1 + k)E[X + c] = (1 + k)E[X] + (1 + k)c
= ρ(X) + (1 + k)c ≠ ρ(X) + c. Translation Invariance does not hold.
2. ρ(cX) = (1 + k)E[cX] = c(1 + k)E[X] = c ρ(X). Positive Homogeneity does hold.
3. ρ(X + Y) = (1 + k)E[X + Y] = (1 + k)E[X] + (1 + k)E[Y] = ρ(X) + ρ(Y) ≤ ρ(X) + ρ(Y).
Subadditivity does hold.
4. If Prob[X ≤ Y] = 1, then ρ(X) = (1 + k)E[X] ≤ (1 + k)E[Y] = ρ(Y). Monotonicity does hold.
The Expected Value Premium Principle is not coherent since #1 does not hold.
6.3. ρ(X) = E[X] + k StdDev[X], k > 0.
1. ρ(X + c) = E[X + c] + k StdDev[X + c] = E[X] + c + k StdDev[X] = ρ(X) + c.
Translation Invariance does hold.
2. ρ(cX) = E[cX] + k StdDev[cX] = c E[X] + k c StdDev[X] = c ρ(X).
Positive Homogeneity does hold.
3. ρ(X + Y) = E[X + Y] + k StdDev[X + Y] = E[X] + E[Y] + k StdDev[X + Y].
Now Var[X + Y] = σX² + σY² + 2 σX σY Corr[X, Y] ≤ σX² + σY² + 2 σX σY, since Corr[X, Y] ≤ 1.
⇒ Var[X + Y] ≤ (σX + σY)². ⇒ StdDev[X + Y] ≤ σX + σY.
⇒ ρ(X + Y) ≤ E[X] + E[Y] + k StdDev[X] + k StdDev[Y] = ρ(X) + ρ(Y).
Subadditivity does hold.
4. Let X be uniform from 0 to 1. Let Y be constant at 2.
Let k = 10. Then ρ(X) = 0.5 + 10/√12 = 3.39. ρ(Y) = 2 + (10)(0) = 2.
Prob[X ≤ Y] = 1, yet ρ(X) > ρ(Y). Monotonicity does not hold.
The Standard Deviation Premium Principle is not coherent since #4 does not hold.

6.4. ρ(X) = E[X] + k Var[X], k > 0.
1. ρ(X + c) = E[X + c] + k Var[X + c] = E[X] + c + k Var[X] = ρ(X) + c.
Translation Invariance does hold.
2. ρ(cX) = E[cX] + k Var[cX] = c E[X] + k c² Var[X] ≠ c ρ(X).
Positive Homogeneity does not hold.
3. ρ(X + Y) = E[X + Y] + k Var[X + Y] = E[X] + E[Y] + k Var[X + Y].
Now Var[X + Y] = σX² + σY² + 2 σX σY Corr[X, Y].
If Corr[X, Y] > 0, then Var[X + Y] > Var[X] + Var[Y], and ρ(X + Y) > ρ(X) + ρ(Y).
Subadditivity does not hold.
4. Let X be uniform from 0 to 1. Let Y be constant at 1.
Let k = 10. Then ρ(X) = 0.5 + (10)(1/12) = 1.333. ρ(Y) = 1 + (10)(0) = 1.
Prob[X ≤ Y] = 1, yet ρ(X) > ρ(Y). Monotonicity does not hold.
The Variance Premium Principle is not coherent.
6.5. ρ(X) = πp, the pth percentile.
1. Adding a constant to a variable adds a constant to each percentile.
ρ(X + c) = ρ(X) + c. Translation Invariance does hold.
2. Multiplying a variable by a constant multiplies each percentile by that constant.
ρ(cX) = c ρ(X). Positive Homogeneity does hold.
3. For example, take the following joint distribution for X and Y:
X = 0 and Y = 0, with probability 88%
X = 0 and Y = 1, with probability 4%
X = 1 and Y = 0, with probability 4%
X = 1 and Y = 1, with probability 4%
Then for X, Prob[X = 0] = 92%, Prob[X = 1] = 8%, π.90 = 0.
For Y, Prob[Y = 0] = 92%, Prob[Y = 1] = 8%, π.90 = 0.
For X + Y, Prob[X + Y = 0] = 88%, Prob[X + Y = 1] = 8%, Prob[X + Y = 2] = 4%, π.90 = 1.
Let p = 90%. ρ(X + Y) = 1 > 0 = 0 + 0 = ρ(X) + ρ(Y).
Subadditivity does not hold.
4. If Prob[X ≤ Y] = 1, then the pth percentile of X is ≤ the pth percentile of Y.
⇒ ρ(X) ≤ ρ(Y). Monotonicity does hold.
Value at Risk is not coherent since #3 does not hold.

2013-4-4,

Risk Measures 6 Coherence,

HCM 10/8/12,

Page 71

6.6. ρ(X) = E[X | X > πp].
1. Adding a constant to a variable adds a constant to each percentile.
ρ(X + c) = E[X + c | X + c > πp + c] = E[X + c | X > πp] = E[X | X > πp] + c = ρ(X) + c.
Translation Invariance does hold.
2. Multiplying a variable by a constant multiplies each quantile by that constant.
ρ(cX) = E[cX | cX > c πp] = E[cX | X > πp] = c E[X | X > πp] = c ρ(X).
Positive Homogeneity does hold.
3. E[X | worst p of the outcomes for X] ≥ E[X | worst p of the outcomes for X + Y].
ρ(X + Y) = E[X + Y | worst p of the outcomes for X + Y] =
E[X | worst p of the outcomes for X + Y] + E[Y | worst p of the outcomes for X + Y]
≤ E[X | worst p of the outcomes for X] + E[Y | worst p of the outcomes for Y]
= ρ(X) + ρ(Y). Subadditivity does hold.
4. If Prob[X ≤ Y] = 1, then the pth percentile of X is ≤ the pth percentile of Y.
Therefore ρ(X) = E[X | worst p of the outcomes for X] ≤ E[Y | worst p of the outcomes for X]
≤ E[Y | worst p of the outcomes for Y] = ρ(Y). Monotonicity does hold.
The Tail Value at Risk is coherent.
6.7. ρ(X) = E[X].
1. ρ(X + c) = E[X + c] = E[X] + c = ρ(X) + c. Translation Invariance does hold.
2. ρ(cX) = E[cX] = c E[X] = c ρ(X). Positive Homogeneity does hold.
3. ρ(X + Y) = E[X + Y] = E[X] + E[Y] = ρ(X) + ρ(Y) ≤ ρ(X) + ρ(Y). Subadditivity does hold.
4. If Prob[X ≤ Y] = 1, then ρ(X) = E[X] ≤ E[Y] = ρ(Y). Monotonicity does hold.
This risk measure is coherent.
6.8. ρ(X) = Max[X].
1. ρ(X + c) = Max[X + c] = Max[X] + c = ρ(X) + c. Translation Invariance does hold.
2. ρ(cX) = Max[cX] = c Max[X] = c ρ(X). Positive Homogeneity does hold.
3. ρ(X + Y) = Max[X + Y] ≤ Max[X] + Max[Y] = ρ(X) + ρ(Y). Subadditivity does hold.
4. If Prob[X ≤ Y] = 1, then ρ(X) = Max[X] ≤ Max[Y] = ρ(Y). Monotonicity does hold.
This risk measure is coherent.

6.9. ρ(X) = ln[E[e^(αX)]]/α.
1. ρ(X + c) = ln[E[e^(α(X+c))]]/α = ln[e^(αc) E[e^(αX)]]/α = {αc + ln[E[e^(αX)]]}/α = ρ(X) + c.
Translation Invariance does hold.
2. If X is Normal, ρ(X) = μ + ασ²/2. cX is Normal with parameters cμ and cσ.
ρ(cX) = cμ + α(cσ)²/2 ≠ c ρ(X). Positive Homogeneity does not hold.
3. ρ(X + Y) ≤ ρ(X) + ρ(Y). ⇔ ln[E[e^(α(X+Y))]] ≤ ln[E[e^(αX)]] + ln[E[e^(αY)]].
⇔ E[e^(αX) e^(αY)] ≤ E[e^(αX)] E[e^(αY)]. ⇔ Cov[e^(αX), e^(αY)] ≤ 0. However, this covariance can be positive.
For example, take the following joint distribution for X and Y:
X = 0 and Y = 0, with probability 88%
X = 0 and Y = 1, with probability 4%
X = 1 and Y = 0, with probability 4%
X = 1 and Y = 1, with probability 4%
Then for X, Prob[X = 0] = 92%, Prob[X = 1] = 8%. ρ(X) = ln(0.92 + 0.08e^α)/α.
For Y, Prob[Y = 0] = 92%, Prob[Y = 1] = 8%. ρ(Y) = ln(0.92 + 0.08e^α)/α.
For X + Y, Prob[X + Y = 0] = 88%, Prob[X + Y = 1] = 8%, Prob[X + Y = 2] = 4%.
ρ(X + Y) = ln(0.88 + 0.08e^α + 0.04e^(2α))/α.
For example, for α = 2, ρ(X) = ρ(Y) = ln(0.92 + 0.08e²)/2 = 0.206.
ρ(X + Y) = ln(0.88 + 0.08e² + 0.04e⁴)/2 = 0.648.
ρ(X + Y) = 0.648 > 0.412 = ρ(X) + ρ(Y). Subadditivity does not hold.
4. If Prob[X ≤ Y] = 1, then for α > 0, e^(αX) ≤ e^(αY). ⇒ E[e^(αX)] ≤ E[e^(αY)]. ⇒ ln[E[e^(αX)]] ≤ ln[E[e^(αY)]].
⇒ ρ(X) ≤ ρ(Y). Monotonicity does hold.


The Exponential Premium Principle is not coherent.
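A quick numerical check of the subadditivity counterexample in part 3 can be done in Python; this is a sketch, assuming the parameter in the exponent is the α of the reconstructed formula above, and the function name is my own.

from math import exp, log

def exponential_premium(pmf, alpha):
    """rho(X) = ln(E[exp(alpha * X)]) / alpha for a discrete distribution {value: probability}."""
    return log(sum(p * exp(alpha * v) for v, p in pmf.items())) / alpha

alpha = 2.0
x_pmf = {0: 0.92, 1: 0.08}                   # marginal distribution of X (and of Y)
sum_pmf = {0: 0.88, 1: 0.08, 2: 0.04}        # distribution of X + Y
print(exponential_premium(x_pmf, alpha))     # about 0.206
print(exponential_premium(sum_pmf, alpha))   # about 0.648 > 0.206 + 0.206, so subadditivity fails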

Section 7, Using Simulation50


For a general discussion of simulation see Mahler's Guide to Simulation. Here I will discuss using
the results of a simulation of aggregate losses to estimate risk measures.
Simulating aggregate losses could be relatively simple, for example if one assumes that
aggregate losses are LogNormal. On the other hand, it could involve a very complicated simulation
model of a property/casualty insurer with many different lines of insurance whose results are not
independent, with complicated reinsurance arrangements, etc.51
Here we will not worry about how the simulation was performed. Rather we will be given a large
simulated sample. For example, let us assume we have simulated from the distribution of
aggregate losses the following sample of size 100, arranged from smallest to largest:52
13, 19, 20, 25, 25, 31, 35, 35, 37, 39, 43, 48, 49, 51, 53, 55, 65, 68, 69, 75, 75, 79, 81, 84, 86,
87, 88, 90, 90, 94, 97, 112, 121, 128, 129, 132, 133, 133, 134, 137, 137, 138, 141, 142, 143,
144, 145, 145, 150, 150, 161, 166, 171, 186, 187, 191, 191, 206, 212, 212, 222, 226, 228,
228, 239, 250, 252, 270, 272, 274, 303, 315, 317, 319, 321, 322, 326, 340, 352, 356, 362,
365, 373, 388, 415, 434, 455, 456, 459, 516, 560, 638, 691, 762, 906, 1031, 1456, 1467,
1525, 2034.
Mean and Standard Deviation:
The sum of this sample is 27,305. X̄ = 27,305/100 = 273.05.
The sum of the squares of the sample is 18,722,291. The estimated 2nd moment is 187,223.
Therefore, the sample variance is: (187,223 - 273.05²)(100/99) = 113,805.
Exercise: Estimate the appropriate premium using the Standard Deviation Premium Principle with
k = 0.5.
[Solution: 273.05 + (0.5)√113,805 = 442.]
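These sample statistics are easy to reproduce in code; the following is a minimal Python sketch (the function names are illustrative, not from the text). Applied to the 100 simulated values listed above, it gives a mean of 273.05, a sample variance of about 113,805, and a Standard Deviation Premium of about 442 for k = 0.5.

def sample_mean_and_variance(xs):
    """Sample mean and the unbiased sample variance, computed from the first two empirical moments."""
    n = len(xs)
    mean = sum(xs) / n
    second_moment = sum(x * x for x in xs) / n
    sample_variance = (second_moment - mean ** 2) * n / (n - 1)
    return mean, sample_variance

def std_dev_premium(xs, k):
    """Standard Deviation Premium Principle, E[X] + k StdDev[X], estimated from a sample."""
    mean, variance = sample_mean_and_variance(xs)
    return mean + k * variance ** 0.5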

50 See Section 21.2.5 of Loss Models.
51 See for example, The Dynamic Financial Analysis Call Papers in the Spring 2001 CAS Forum.
52 In practical applications, one would usually simulate a bigger sample, such as size 1000 or 10,000.

Estimating Value at Risk:


Here, in order to estimate the pth percentile, Loss Models takes the value in the sample
corresponding to: 1 + the largest integer in Np.53
For a sample of size 100, VaR0.90 is estimated as:
[(100)(0.9)] + 1 = 91st value from smallest to largest.
Exercise: For the previous sample of size 100, estimate VaR0.80.
[Solution: Take 1 + the largest integer in: (0.80)(100) = 80.
So we take the 81st element in the sample from smallest to largest: 362.]
In general, let [x] be the greatest integer contained in x.
[7.2] = 7.
[7.6] = 7.
[8.0] = 8.
VaRp is estimated as the [Np] + 1 value from smallest to largest.
In other words, the estimate of VaRp is L([Np] + 1), the ([Np] + 1)st loss from smallest to largest.
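In code this estimator is a one-liner; the sketch below (my own names, assuming the sample is already sorted from smallest to largest) picks out the 81st value, 362, when called as var_estimate(sample, 0.80) on the 100-point sample above.

def var_estimate(sorted_sample, p):
    """Estimate VaR_p as the ([N p] + 1)-th smallest value; with 0-based indexing that is index floor(N p)."""
    n = len(sorted_sample)
    return sorted_sample[int(n * p)]   # int() truncates, which is floor for the positive values used here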
Using a Series of Simulations: 54
Loss Models does not discuss how to estimate the variance of this estimate of VaR.80.
One way would be through a series of simulations.
One could repeat the simulation that resulted in the previous sample of 100, and get a new
sample of size 100. Using the original sample the estimate of VaR.80 was 362. Using the new
sample, the estimate of VaR.80 would be somewhat different. Then we could proceed to simulate
a third sample, and get a third estimate of VaR.80.
We could produce for example 500 different samples and get 500 corresponding estimates of
VaR.80. Then the mean of these 500 estimates of VaR.80, would be a good estimate of VaR.80.
The sample variance of these 500 estimates of VaR.80, would be an estimate of the variance of
any of the individual estimates of VaR.80. However, the variance of the average of these 500
estimates of VaR.80 would be the sample variance divided by 500.55
53 This differs from the smoothed empirical estimate of πp, which is the p(N+1)th loss from smallest to largest, linearly
interpolating between two loss amounts if necessary. See Mahler's Guide to Fitting Loss Distributions.
54 Mahler's Guide to Simulation has many examples of simulation experiments.
See especially the section on Estimating the p-value via Simulation.
55 The variance of an average is the variance of a single draw, divided by the number of items being averaged.

2013-4-4,

Risk Measures 7 Using Simulation,

HCM 10/8/12,

Page 75

Estimating Tail Value at Risk:


One can estimate TVaRp as an average of the worst outcomes of a simulated sample.
For a sample of size 100, VaR0.90 is estimated as:
[(100)(0.9)] + 1 = 91st value from smallest to largest.
For a sample of size 100, in order to estimate TVaR0.90, take an average of the 10 largest values.
Average the values starting at the 91st.
For the previous sample of size 100, the 91st value is 560, the estimate of π0.90.
We could estimate TVaR90% as the average of the 10 largest values in the sample:
(560 + 638 + 691 + 762 + 906 + 1031 + 1456 + 1467 + 1525 + 2034)/10 = 1107.
In general, let [x] be the greatest integer contained in x.
TVaRp is estimated as the average of the largest values in the sample,
starting with the [Np] + 1 value from smallest to largest.56
Exercise: For the previous sample of size 100, estimate TVaR95%.
[Solution: [(100)(.95)] + 1 = 96. (1031 + 1456 + 1467 + 1525 + 2034)/5 = 1502.6.
Comment: For a small sample such as this, and a large p such as 95%, the estimate of the
TVaR95% is subject to a lot of random fluctuation.]
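The TVaR estimator is just the average of the tail of the sorted sample; here is a minimal Python sketch (function name mine). For the 100-point sample above, tvar_estimate(sample, 0.90) averages the 91st through 100th values and returns 1107.

def tvar_estimate(sorted_sample, p):
    """Estimate TVaR_p as the average of the values from the ([N p] + 1)-th smallest onward."""
    n = len(sorted_sample)
    tail = sorted_sample[int(n * p):]   # the worst N - [N p] outcomes
    return sum(tail) / len(tail)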

56

There are other similar estimators that would also be reasonable.

Variance of the Estimate of the Tail Value at Risk:


In general a variance can be divided into two pieces.57
Conditioning the estimate of TVaRp on π̂p, the estimator of πp:
Var[TVaR̂p] = E[Var[TVaR̂p | π̂p]] + Var[E[TVaR̂p | π̂p]].
This leads to the following estimate of the variance of the estimate of TVaRp:58
{sp² + p(TVaR̂p - π̂p)²} / {N - [Np]},
where sp² is the sample variance of the worst outcomes used to estimate TVaRp.
For the previous sample of size 100, TVaR90% was estimated as an average of the 10 largest
values in the sample:
(560 + 638 + 691 + 762 + 906 + 1031 + 1456 + 1467 + 1525 + 2034)/10 = 1107.
The sample variance of these 10 worst outcomes is:
{(560 - 1107)2 + (638 - 1107)2 + (691 - 1107)2 + (762 - 1107)2 + (906 - 1107)2 +
(1031 - 1107)2 + (1456 - 1107)2 + (1467 - 1107)2 + (1525 - 1107)2 + (2034 - 1107)2 }/9 =
238,098.
Thus the estimate of the variance of this estimate of TVaR90% is:
{238,098 + (0.9)(1107 - 560)²}/(100 - 90) = 50,739.
Thus the estimate of the standard deviation of this estimate of TVaR90% is: √50,739 ≈ 225.
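This calculation can be reproduced directly from the ten largest simulated values; the Python sketch below (my own, using the variance formula quoted above) returns roughly 1107 for the TVaR estimate, 238,098 for the tail sample variance, and 50,739 for the estimated variance.

tail = [560, 638, 691, 762, 906, 1031, 1456, 1467, 1525, 2034]   # the 10 worst of the 100 simulated values
N, p = 100, 0.90
var_hat = tail[0]                          # 560, the estimate of the 90th percentile
tvar_hat = sum(tail) / len(tail)           # 1107
s2 = sum((x - tvar_hat) ** 2 for x in tail) / (len(tail) - 1)    # sample variance of the tail, about 238,098
variance_of_tvar_hat = (s2 + p * (tvar_hat - var_hat) ** 2) / (N - int(N * p))
print(tvar_hat, s2, variance_of_tvar_hat, variance_of_tvar_hat ** 0.5)
# about 1107, 238098, 50739, and a standard deviation of about 225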

57 As discussed in Mahler's Guide to Buhlmann Credibility.
58 The first term is the EPV, while the second term is the VHM.
The first term dominates for a heavier-tailed distribution, while the second term is significant for a lighter-tailed
distribution.
Although Loss Models does not explain how to derive the second term, it does cite Variance of the CTE
Estimator, by B. John Manistre and Geoffrey H. Hancock, April 2005 NAAJ. The derivation in Manistre and
Hancock uses the information matrix and the delta method, which are discussed in Mahler's Guide to Fitting Loss
Distributions.

Problems:
For the next six questions, you simulate the following 35 random values from a distribution:
6, 7, 11, 14, 15, 17, 18, 19, 25, 29, 30, 34, 38, 40, 41, 48, 49, 53, 60, 63,
78, 103, 124, 140, 192, 198, 227, 330, 361, 421, 514, 546, 750, 864, 1638
7.1 (1 point) What is the estimate of VaR0.9?
A. 546

B. 750

C. 864

D. 1638

E. None of A, B, C, or D

7.2 (1 point) What is the estimate of VaR0.7?


A. 140

B. 192

C. 198

D. 227

E. 330

7.3 (2 points) What is the estimate of TVaR0.9?


A. 900

B. 950

C. 1000

D. 1050

E. 1100

7.4 (3 points) What is the variance of the estimate in the previous question?
A. 88,000
B. 90,000
C. 92,000
D. 94,000
E. 96,000
7.5 (2 points) What is the estimate of TVaR0.7?
A. 400

B. 450

C. 500

D. 550

E. 600

7.6 (3 points) What is the variance of the estimate in the previous question?
A. 22,000
B. 24,000
C. 26,000
D. 28,000
E. 30,000

Use the following information for the next 3 questions:


One hundred values of the annual earthquake losses in the state of Allshookup for the
Presley Insurance Company have been simulated, and ranked from smallest to largest:
57, 72, 98, 128, 151, 160, 163, 171, 203, 218,
257, 262, 267, 301, 323, 327, 337, 372, 397, 401,
441, 447, 454, 464, 491, 498, 500, 509, 512, 520,
522, 523, 526, 530, 531, 553, 554, 565, 620, 632,
633, 637, 641, 648, 660, 666, 678, 685, 695, 708,
709, 728, 732, 782, 810, 826, 851, 858, 862, 871,
890, 903, 942, 947, 955, 976, 984, 992, 1016, 1023,
1024, 1027, 1041, 1047, 1048, 1050, 1055, 1055, 1057, 1062,
1076, 1081, 1088, 1117, 1131, 1148, 1192, 1220, 1253, 1270,
1329, 1398, 1406, 1537, 1578, 1658, 1814, 1909, 2431, 2702.
7.7 (1 point) Estimate VaR0.9.
A. 1192   B. 1220   C. 1253   D. 1270   E. 1329
7.8 (2 points) Estimate TVaR0.95.
A. 1800   B. 1900   C. 2000   D. 2100   E. 2200
7.9 (3 points) Estimate the standard deviation of the estimate made in the previous question.
A. 240
B. 260
C. 280
D. 300
E. 320

7.10 (1 point) XYZ Insurance Company wrote a portfolio of medical professional liability
insurance. 100 scenarios were simulated to model the aggregate losses.
The 10 worst results of these 100 scenarios are (in $ million):
104, 132, 132, 143, 152, 183, 131, 126, 191, 117.
Estimate the 95% Tail Value at Risk.

Use the following information for the next 4 questions:


One thousand values of aggregate annual losses net of reinsurance have been simulated.
They have been ranked from smallest to largest, and here are the largest 100:
3985, 4239, 4521, 4705, 4875, 5220, 5239, 5294, 5384, 5503,
5514, 5581, 5601, 5630, 5735, 5756, 5823, 5872, 5902, 5909,
5945, 6004, 6038, 6085, 6204, 6249, 6265, 6270, 6326, 6338,
6371, 6378, 6398, 6402, 6457, 6533, 6548, 6667, 6679, 6688,
6822, 6920, 6994, 7004, 7039, 7050, 7100, 7126, 7126, 7128,
7129, 7133, 7208, 7250, 7317, 7317, 7352, 7361, 7377, 7466,
7467, 7468, 7470, 7472, 7527, 7534, 7538, 7544, 7547, 7547,
7578, 7607, 7613, 7651, 7663, 7712, 7757, 7771, 7785, 7823,
7849, 7865, 7878, 7880, 7906, 7923, 7928, 7941, 7955, 7963,
7976, 7979, 8011, 8021, 8032, 8034, 8052, 8065, 8089, 8116.
7.11 (1 point) Estimate VaR0.95.
A. 7128   B. 7129   C. 7133   D. 7208   E. 7250
7.12 (1 point) Estimate VaR0.90.
A. 3985   B. 4239   C. 4521   D. 4705   E. 4875
7.13 (2 points) Estimate TVaR0.99.
A. 7600   B. 7700   C. 7800   D. 7900   E. 8000
7.14 (3 points) Estimate the standard deviation of the estimate made in the previous question.
A. 20   B. 22   C. 24   D. 26   E. 28

Solution to Problems:
7.1. A. VaR.9 is estimated as the [(.90)(35)] + 1 = 32nd value from smallest to largest: 546.
7.2. B. VaR.7 is estimated as the [(.70)(35)] + 1 = 25th value from smallest to largest: 192.
7.3. B. [(.90)(35)] + 1 = 32.
Estimate TVaR.9 as the average of the worst outcomes starting with the 32nd value.
(546 + 750 + 864 + 1638)/4 = 949.5.
7.4. D. [Np] + 1 = [(.90)(35)] + 1 = 32.
The 32nd element from smallest to largest is 546, the estimate of π.9.
sp² is the sample variance of the worst outcomes used to estimate TVaR.9:
{(546 - 949.5)² + (750 - 949.5)² + (864 - 949.5)² + (1638 - 949.5)²}/3 = 227,985.
The variance of the estimate of TVaRp: {sp² + p(TVaR̂p - π̂p)²}/{N - [Np]}
= {227,985 + (.9)(949.5 - 546)²}/(35 - 31) = 93,629.
7.5. D. [(.70)(35)] + 1 = 25.
Estimate TVaR.7 as the average of the worst outcomes starting with the 25th value.
(192 + 198 + 227 + 330 + 361 + 421 + 514 + 546 + 750 + 864 + 1638)/11 = 549.2.
7.6. B. [Np] + 1 = [(.70)(35)] + 1 = 25.
The 25th element from smallest to largest is 192, the estimate of π.7.
sp² is the sample variance of the worst outcomes used to estimate TVaR.7:
{(192 - 549.2)² + (198 - 549.2)² + (227 - 549.2)² + (330 - 549.2)² + (361 - 549.2)² +
(421 - 549.2)² + (514 - 549.2)² + (546 - 549.2)² + (750 - 549.2)² + (864 - 549.2)²
+ (1638 - 549.2)²}/10 = 178,080.
The variance of the estimate of TVaRp: {sp² + p(TVaR̂p - π̂p)²}/{N - [Np]}
= {178,080 + (.7)(549.2 - 192)²}/(35 - 24) = 24,309.
Comment: The variance of the estimate of TVaR.7 is much smaller than the variance of the
estimate of TVaR.9. It is easier to estimate the Tail Value at Risk at a smaller value of p than a
larger value of p; it is hard to estimate what is going on in the extreme righthand tail.
7.7. E. [Np] + 1 = [(100)(.90)] + 1 = 91.
The 91st element from smallest to largest is: 1329.

7.8. D. [Np] + 1 = [(100)(.95)] + 1 = 96.


Average the 96th to 100th values:
(1658 + 1814 + 1909 + 2431 + 2702)/5 = 2102.8.
7.9. C. [Np] + 1 = [(100)(.95)] + 1 = 96.
The 96th element from smallest to largest is 1658, the estimate of π.95.
sp² is the sample variance of the worst outcomes used to estimate TVaR.95: {(1658 - 2102.8)² +
(1814 - 2102.8)² + (1909 - 2102.8)² + (2431 - 2102.8)² + (2702 - 2102.8)²}/4 = 196,392.
The variance of the estimate of TVaRp: {sp² + p(TVaR̂p - π̂p)²} / {N - [Np]}
= {196,392 + (.95)(2102.8 - 1658)²} / (100 - 95) = 76,869.
The standard deviation is: √76,869 = 277.

7.10. [Np] + 1 = [(100)(0.95)] + 1 = 96.


We average the 96th, 97th, 98th, 99th, 100th values:
(132 + 143 + 152 + 183 + 191) / 5 = $160.2 million.
Comment: TVaRp is estimated as the average of the largest values in the sample,
starting with the [Np] + 1 value from smallest to largest.
7.11. B. [Np] + 1 = [(1000)(.95)] + 1 = 951.
The 951st value is: 7129.
7.12. A. [Np] + 1 = [(1000)(.90)] + 1 = 901.
The 901st value is: 3985.
7.13. E. [Np] + 1 = [(1000)(.99)] + 1 = 991.
Average the 991st to the 1000th values:
(7976 + 7979 + 8011 + 8021 + 8032 + 8034 + 8052 + 8065 + 8089 + 8116)/10 = 8037.5.

7.14. C. [Np] + 1 = [(1000)(.99)] + 1 = 991.


The 991st element from smallest to largest is 7976, the estimate of π.99.
sp² is the sample variance of the worst outcomes used to estimate TVaR.99:
{(7976 - 8037.5)² + (7979 - 8037.5)² + (8011 - 8037.5)² + (8021 - 8037.5)² +
(8032 - 8037.5)² + (8034 - 8037.5)² + (8052 - 8037.5)² + (8065 - 8037.5)²
+ (8089 - 8037.5)² + (8116 - 8037.5)²}/9 = 2000.
The variance of the estimate of TVaRp: {sp² + p(TVaR̂p - π̂p)²}/{N - [Np]}
= {2000 + (.99)(8037.5 - 7976)²}/(1000 - 990) = 574.4.
The standard deviation is: √574.4 = 24.0.


Section 8, Important Ideas and Formulas


Introduction (Section 1):
A risk measure is defined as a functional mapping of a loss distribution to the real numbers.
ρ(X) is the notation used for the risk measure.
Premium Principles (Section 2):
Expected Value Premium Principle: ρ(X) = (1 + k)E[X], k > 0.
Standard Deviation Premium Principle: ρ(X) = E[X] + k √Var[X], k > 0.
Variance Premium Principle: ρ(X) = E[X] + k Var[X], k > 0.
Value at Risk (Section 3):
Value at Risk, VaRp(X), is defined as the 100pth percentile.
VaRp(X) = πp.
In Appendix A of the Tables attached to the exam, there are formulas for VaRp(X) for
many of the distributions: Exponential, Pareto, Single Parameter Pareto, Inverse Pareto,
Inverse Weibull, Burr, Inverse Burr, Inverse Exponential, Paralogistic, Inverse Paralogistic.
Tail Value at Risk (Section 4):
TVaRp(X) ≡ E[X | X > πp] = πp + e(πp) = πp + (E[X] - E[X ∧ πp]) / (1 - p).
The corresponding risk measure is: ρ(X) = TVaRp(X).
TVaRp(X) ≥ VaRp(X).

TVaR0 (X) = E[X].

TVaR1 (X) = Max[X].

In Appendix A, there are formulas for TVaRp (X) for a few of the distributions:
Exponential, Pareto, Single Parameter Pareto.

For the Normal Distribution: TVaRp(X) = μ + σ φ[zp] / (1 - p), where φ is the Standard Normal density.


Coherence (Section 6):
A risk measure is coherent if it has the following four properties:
1. Translation Invariance: ρ(X + c) = ρ(X) + c, for any constant c.
2. Positive Homogeneity: ρ(cX) = c ρ(X), for any constant c > 0.
3. Subadditivity: ρ(X + Y) ≤ ρ(X) + ρ(Y).
4. Monotonicity: If Prob[X ≤ Y] = 1, then ρ(X) ≤ ρ(Y).

The Tail Value at Risk is a coherent measure of risk.


The Standard Deviation Premium Principle and the Value at Risk
are not coherent measures of risk.
Using Simulation (Section 7):
Let [x] be the greatest integer contained in x.
VaRp is estimated as the [Np] + 1 value from smallest to largest.
TVaRp is estimated as the average of the largest values in the sample,
starting with the [Np] + 1 value from smallest to largest.
Estimate of the variance of the estimate of TVaRp:
{sp² + p(TVaR̂p - π̂p)²} / {N - [Np]},
where sp² is the sample variance of the worst outcomes used to estimate TVaRp.

Mahler's Guide to Classical Credibility
Joint Exam 4/C
prepared by Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.
Study Aid 2013-4-8
Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching

Mahler's Guide to Classical Credibility


Copyright 2013 by Howard C. Mahler.
The concepts in Section 2 of Credibility by Mahler and Dean1
or Section 20.2 of Loss Models are demonstrated.
Information in bold or sections whose title is in bold are more important for passing the exam. Larger
bold type indicates it is extremely important. Information presented in italics (and sections whose title
is in italics) should not be needed to directly answer exam questions and should be skipped on first
reading. It is provided to aid the readers overall understanding of the subject, and to be useful in
practical applications.
Highly Recommended problems are double underlined.
Recommended problems are underlined.
Solutions to the problems in each section are at the end of that section.
Note that problems include both some written by me and some from past exams.2 The latter are
copyright by the Casualty Actuarial Society and the Society of Actuaries and are reproduced here
solely to aid students in studying for exams.3
Section #   Pages      Section Name
A           3-4        Normal Distribution Table
1           5-9        Introduction
2           10-25      Full Credibility for Frequency
3           26-32      Full Credibility for Severity
4           33-65      Variance of Pure Premiums & Aggregate Losses
5           66-114     Full Credibility for Pure Premiums & Aggregate Losses
6           115-150    Partial Credibility
7           150-151    Important Formulas and Ideas

1 From Chapter 8 of the fourth Edition of Foundations of Casualty Actuarial Science. My study guide is very similar to
and formed a basis for the Credibility Chapter written by myself and Curtis Gary Dean.
2 In some cases I've rewritten these questions in order to match the notation in the current Syllabus.
3 The solutions and comments are solely the responsibility of the author; the CAS/SOA bear no responsibility for
their accuracy. While some of the comments may seem critical of certain questions, this is intended solely to aid you
in studying and in no way is intended as a criticism of the many volunteers who work extremely long and hard to
produce quality exams.

Course 4 Exam Questions by Section of this Study Aid4

[Chart of past Course 4 exam questions by section of this study aid and by exam date: Sample, 5/00, 11/00, 5/01, 11/01, 11/02, 11/03, 11/04, 5/05, 11/05, 11/06, and 5/07.]

The CAS/SOA did not release the 5/02, 5/03, 5/04, 5/06, 11/07 and subsequent exams.
4 Excluding any questions that are no longer on the syllabus.

Normal Distribution Table


Entries represent the area under the standardized normal distribution from -∞ to z, Pr(Z < z).
The value of z to the first decimal place is given in the left column.
The second decimal is given in the top row.
[Standard Normal Distribution table: Pr(Z < z) for z from 0.00 to 2.49, in steps of 0.01; identical to the table attached to the exam.]
[Standard Normal Distribution table, continued: Pr(Z < z) for z from 2.50 to 3.99, in steps of 0.01.]

Values of z for selected values of Pr(Z < z):
Pr(Z < z):  0.800   0.850   0.900   0.950   0.975   0.990   0.995
z:          0.842   1.036   1.282   1.645   1.960   2.326   2.576

For Classical Credibility, we will be using the chart at the bottom of the table, showing various
percentiles of the Standard Normal Distribution.
Using the Normal Table:5
When using the normal distribution, choose the nearest z-value to find the probability, or
if the probability is given, choose the nearest z-value. No interpolation should be used.
Example: If the given z-value is 0.759, and you need to find Pr(Z < 0.759) from the normal
distribution table, then choose the probability value for z-value = 0.76; Pr(Z < 0.76) = 0.7764.
When using the Normal Approximation to a discrete distribution, use the continuity correction.

Instructions for Exam 4/C from the SOA/CAS.

Section 1, Introduction
Assume Carpenters are currently charged a rate of $10 (per $100 of payroll) for Workers
Compensation Insurance.6 Assume further that the recent experience would indicate a rate of $5.
Then an actuary's new estimate of the rate for Carpenters might be $5, $10, or most likely
something in between. In other words, the new estimate of the appropriate rate for Carpenters
would be a weighted average of the separate $5 and $10 estimates.
If the actuary put more weight on the observation, the new estimate would be closer to the
observation of $5. If on the other hand, the actuary put less weight on the observation, then the new
estimate would be closer to the current rate of $10. One could write this as:
new estimate = (5)(Z) + (10)(1-Z), where Z is the weight, 0 ≤ Z ≤ 1.
So for example if Z = 20%, then the new estimate is ($5)(.2) + ($10)(.8) = $9. If instead Z = 60%,
then the new estimate is ($5)(.6)+($10)(.4) = $7. The weight Z is generally referred to as the
credibility assigned to the observed data.
Credibility is commonly used by actuaries in order to weight together two estimates7 of the same
quantity. Let X and Y be two estimates. X might be from a recent observation based on limited
data, while Y might be a previous estimate or one obtained from a larger but less specific data set.8
Then the estimate using credibility would =
ZX + (1 - Z)Y, where Z is the credibility assigned to the observation X.
1 - Z is generally referred to as the complement of credibility.
Thus the use of credibility involves a linear estimate of the true expectation derived as a result
of a compromise between hypothesis and observations.

Credibility: A linear estimator by which data external to a particular group or individual are combined
with the experience of the group or individual in order to better estimate the expected loss (or any
other statistical quantity) for each group or individual.
Credibility or Credibility Factor: Z, the weight given the observation.
The basic formula is: new estimate = (observation) (Z) + (old estimate) (1-Z).

Assume that there is no change in rates indicated for the Contracting Industry Group in which Carpenters are
included. So that in the absence of any specific data for the Carpenters class, Carpenters would continue to be
charged $10.
7
In some actual applications more than two estimates are weighted together.
8
For example, Y might be (appropriately adjusted) countrywide data for the Carpenters class.

2013-4-8,

Classical Credibility 1 Introduction,

HCM 10/16/12,

Page 6

Sometimes it is useful to use the equivalent formula:


new estimate = old estimate + Z (observation - old estimate).
This can be solved for the credibility:
Z = (new estimate - old estimate) / (observation - old estimate).
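Both formulas are simple enough to verify in a couple of lines of code; this is an illustrative Python sketch (function names mine), using the Carpenters example from the previous page.

def credibility_estimate(observation, prior_estimate, z):
    """New estimate = Z * observation + (1 - Z) * prior estimate."""
    return z * observation + (1 - z) * prior_estimate

def implied_credibility(new_estimate, prior_estimate, observation):
    """Solve the basic formula for Z."""
    return (new_estimate - prior_estimate) / (observation - prior_estimate)

print(credibility_estimate(5, 10, 0.20))   # 9.0, the Carpenters example with Z = 20%
print(implied_credibility(7, 10, 5))       # 0.60, the weight implied by a new estimate of $7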
In the example, in order to calculate a new estimate of the appropriate rate for Carpenters, one first
has to decide that one will weight together the current observations9 with the current rate for
Carpenters10. Generally on the exam when it is relevant to answering the question, it will be clear
which two estimates to weight together. Second one has to decide how much credibility to assign to
the current observation. On the exam this is generally the crux of the questions asked.
Two manners of determining how much credibility to assign are covered on the Syllabus. The first is
called Classical Credibility or Limited Fluctuation Credibility and is covered in this Study Aid.11
The second is referred to as Buhlmann Credibility, Least Squares Credibility, or Greatest
Accuracy Credibility and is covered in another Study Aid.12
Either form of credibility can be applied to various actuarial issues such as: Classification and/or
Territorial Ratemaking, Experience Rating (Individual Risk Rating), Loss Reserving, Trend, etc. On
the exam, credibility questions will usually involve experience rating13 or perhaps classification
ratemaking14, unless they deal with urns, dice, spinners, etc., that are used to model probability and
credibility theory situations.15

One would have to decide what period of time to use, for example the most recently available 3 years. Also one
would adjust the data for law changes, trend, development, etc.
10
In actual applications, various adjustments would be made to the current rate for Carpenters before using it to
estimate the proposed rate for Carpenters.
11
Classical Credibility was developed in the U.S. in the first third of the 20th century by early members of the CAS
such as Albert Mowbray and Francis Perryman.
12
Greatest Accuracy Credibility was developed in the late 1940s by Arthur Bailey, FCAS, based on earlier work by
Albert Whitney and other members of the CAS.
13
Experience Rating refers to the use of the experience of an individual policyholder in order to help determine his
premium. This can be for either Commercial Insureds (e.g. Workers Compensation) or Personal Lines Insureds (e.g.
Private Passenger Automobile.)
14
For example making the rates for the Workers Compensation class of Carpenters. Similar situations occur when
making the rates for the territories of a state or for the classes and territories in a state.
15
The reason you are given problems involving urns, etc. is that one can then ask questions that do not require the
knowledge of the specific situation. For example, in order to ask a question involving an actual application to
Workers Compensation Classification Ratemaking would require knowledge many students do not have and which
can not be covered on the syllabus for this exam. Also, the questions involving urns, etc., illustrate the importance
of modeling. In actual applications, someone has to propose a model of the underlying process, so that one can
properly apply Credibility Theory. Urn models, etc. allow one to determine which features are important and how
they are likely to affect real world situations. A good example is Philbrick's target shooting example.


In general, all other things being equal, one would assign more credibility to a larger volume of data.
In Classical Credibility, one determines how much data one needs before one will assign to it
100% credibility. This amount of data is referred to as the Full Credibility Criterion or the
Standard for Full Credibility. If one has this much data or more, then Z = 100%; if one has
observed less than this amount of data then one has 0 ≤ Z < 1.
For example, if I observed 1000 full time Carpenters, then I might assign 100% credibility to their
data.16 Then if I observed 2000 full time Carpenters I would also assign them 100% credibility.
I might assign 100 full time Carpenters 32% credibility. In this case we say we have assigned the
observation partial credibility, i.e., less than full credibility. Exactly how to determine the amount of
credibility assigned to different amounts of data is discussed in the following sections.
There are five basic concepts from Classical Credibility you need to know how to apply in order to
answer exam questions:
1. How to determine the Criterion for Full Credibility when estimating frequencies.
2. How to determine the Criterion for Full Credibility when estimating severities.
3. How to determine the Criterion for Full Credibility when estimating pure premiums
or aggregate losses.
4. How to determine the amount of partial credibility to assign when one has less data
than is needed for full credibility.
5. How to use credibility to estimate the future, by combining the observation and the old estimate.

16

For Workers Compensation that data would be dollars of loss and dollars of payroll.

2013-4-8,

Classical Credibility 1 Introduction,

HCM 10/16/12,

Page 8

Problems:
1.1 (1 point) The observed claim frequency is 120. The credibility given to this data is 25%.
The complement of credibility is given to the prior estimate of 200.
What is the new estimate of the claim frequency?
A. Less than 165
B. At least 165 but less than 175
C. At least 175 but less than 185
D. At least 185 but less than 195
E. At least 195
1.2 (1 point) The prior estimate was 100 and after an observation of 800 the new estimate is 150.
How much credibility was assigned to the data?
A. Less than 4%
B. At least 4% but less than 5%
C. At least 5% but less than 6%
D. At least 6% but less than 7%
E. At least 7%

Solutions to Problems:
1.1. C. (25%)(120) + (75%)(200) = 180.
1.2. E. New estimate = old estimate + Z (observation - old estimate).

Z = (new estimate - old estimate) / (observation - old estimate)


= (150-100)/(800-100) = 50/700 = 7.1%.



Section 2, Full Credibility for Frequency17


The most common uses of Classical Credibility assume that the frequency is (approximately)
Poisson. Thus we'll deal with that case first.
Poisson Case:
Assume we have a Poisson process for claim frequency, with an average of 500 claims per year.
Then if we observe the numbers of claims, they will vary from year to year around the mean of 500.
The variance of a Poisson process is equal to its mean of 500. We can approximate this Poisson
Process by a Normal Distribution with a mean of 500 and a variance of 500.
We can use this Normal Approximation to estimate how often we will observe results far from the
mean. For example, how often can one expect to observe more than 550 claims? The standard
deviation is: √500 = 22.36. So 550 claims corresponds to about 50/22.36 = 2.24 standard
deviations greater than average. Since Φ(2.24) = 0.9875, there is approximately a 1.25% chance of
observing more than 550 claims.
Thus there is about a 1.25% chance of observing more than 10% greater than the expected number
of claims. Similarly, we can calculate the chance of observing fewer than 450 claims as
approximately 1.25%. Thus the chance of observing outside 10% from the mean number of
claims is about 2.5%. In other words, the chance of observing within 10% of the expected number
of claims is 97.5% in this case18 .
If we had a mean of 1000 claims instead of 500 claims, then there would be a greater chance of
observing within 10% of the expected number of claims. This is given by the Normal
approximation as: Φ[(10%)(1000)/√1000] - Φ[-(10%)(1000)/√1000] = Φ[3.162] - Φ[-3.162] =
1 - (2){1 - Φ[3.162]} = 2Φ[3.162] - 1 = (2)(0.9992) - 1 = 99.84%.
Exercise: Compute the probability of being within 5% of the mean, for 100 expected claims.
[Solution: 2Φ[(5%)(100)/√100] - 1 = 2Φ[0.5] - 1 = 38.29%.]
17

A subsequent section deals with estimating Pure Premiums rather than Frequencies. As will be seen in order to
calculate a Standard for Full Credibility for the Pure Premium generally one first calculates a Standard for Full
Credibility for the Frequency. Thus questions about the former also test whether one knows how to do the latter.
18
Note that here we have ignored the continuity correction. As shown in Mahler's Guide to Frequency
Distributions, the probability would be calculated including the continuity correction. The probability of more than
550 claims is approximately: 1 - Φ[(550.5 - 500)/√500] = 1 - Φ(2.258) = 1 - 0.9880 = 1.20%.


In general, let P be the chance of being within k of the mean, given an expected number of claims
equal to n. Then P = 2Φ[k√n] - 1.
Here is a table showing P, for k = 10%, 5%, 2.5%, 1%, and 0.5%, and for 10, 50, 100, 500, 1000,
5000, and 10,000 claims:
Probability of Being Within k of the Mean
Expected #
of Claims     k = 10%    k = 5%     k = 2.5%   k = 1%     k = 0.5%
10             24.82%    12.56%      6.30%      2.52%      1.26%
50             52.05%    27.63%     14.03%      5.64%      2.82%
100            68.27%    38.29%     19.74%      7.97%      3.99%
500            97.47%    73.64%     42.38%     17.69%      8.90%
1000           99.84%    88.62%     57.08%     24.82%     12.56%
5000          100.00%    99.96%     92.29%     52.05%     27.63%
10000         100.00%   100.00%     98.76%     68.27%     38.29%
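Each entry in the table comes directly from P = 2Φ[k√n] - 1; the following Python sketch (my own, using the exact Normal CDF rather than the printed table) reproduces them.

from math import erf, sqrt

def phi(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_within_k(expected_claims, k):
    """P = 2 * Phi(k * sqrt(n)) - 1 for a Poisson frequency with n expected claims."""
    return 2.0 * phi(k * sqrt(expected_claims)) - 1.0

print(prob_within_k(1000, 0.10))   # about 0.9984
print(prob_within_k(100, 0.05))    # about 0.3829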

Turning things around, given values of P and k, then one can compute the number of expected
claims n0 such that the chance of being within k of the mean is P.
For example, if P = 90% and k = 2.5%, then based on the above table n0 is somewhat less than
5000 claims. More precisely, P = 2Φ[k√n] - 1, and therefore for P = 0.9 and k = 2.5%,
0.9 = 2Φ[2.5%√n0] - 1.
Thus we want Φ[2.5%√n0] = (1+P)/2 = 0.95. Let y be such that Φ(y) = (1+P)/2 = 0.95.
Consulting the Standard Normal Table, y = 1.645. Then we want y = 0.025√n0.
Thus n0 = y²/k² = 1.645²/0.025² = 4330 claims.


Having taken P = 90% and k = 2.5%, we would refer to 4330 as the Standard for Full Credibility (for
estimating frequencies.)
In general, assume one desires that the chance of being within k of the mean frequency
to be at least P, then for a Poisson Frequency, the Standard for Full Credibility is:
n0 = y²/k², where y is such that Φ(y) = (1+P)/2.19

Exercise: Assuming frequency is Poisson, for P = 95% and for k = 5%, what is the number of claims
required for Full Credibility for estimating the frequency?
[Solution: y = 1.960 since Φ(1.960) = (1+P)/2 = 97.5%.
Therefore, n0 = y²/k² = (1.96/0.05)² = 1537 claims.]
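A short Python sketch of the Poisson-case standard (illustrative; it uses statistics.NormalDist for the percentile y, but any inverse Normal CDF would do):

from statistics import NormalDist

def full_credibility_claims(P, k):
    """Poisson case: n0 = (y / k)^2, where Phi(y) = (1 + P) / 2."""
    y = NormalDist().inv_cdf((1.0 + P) / 2.0)
    return (y / k) ** 2

print(full_credibility_claims(0.90, 0.05))   # about 1082
print(full_credibility_claims(0.95, 0.05))   # about 1537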
19

See Equations 2.2.4 and 2.2.5 in Mahler and Dean.


Here are values of y corresponding to various values of P:


P         (1+P)/2    y
60.00%    80.00%     0.842
70.00%    85.00%     1.036
80.00%    90.00%     1.282
90.00%    95.00%     1.645
95.00%    97.50%     1.960
98.00%    99.00%     2.326
99.00%    99.50%     2.576

The relevant values are shown in the lower portion of the Normal table attached to the exam.
Here is a table of values for the Standard for Full Credibility for the Frequency n0 , given various
values of P and k:20

Standards for Full Credibility for Frequency (Claims)

Probability
Level        k = 30%   k = 20%   k = 10%   k = 7.5%   k = 5%    k = 2.5%   k = 1%
80.00%            18        41       164        292       657      2,628    16,424
90.00%            30        68       271        481     1,082      4,329    27,055
95.00%            43        96       384        683     1,537      6,146    38,415
96.00%            47       105       422        750     1,687      6,749    42,179
97.00%            52       118       471        837     1,884      7,535    47,093
98.00%            60       135       541        962     2,165      8,659    54,119
99.00%            74       166       664      1,180     2,654     10,616    66,349
99.90%           120       271     1,083      1,925     4,331     17,324   108,276
99.99%           168       378     1,514      2,691     6,055     24,219   151,367

The Standard of 1082 claims corresponding to P = 90% and k = 5% is the most commonly used,
followed by the Standard of 683 claims corresponding to P = 95% and k = 7.5%.
You should on several different occasions verify that you can calculate quickly and accurately a
randomly selected value from this table. The value 1082 claims corresponding to P = 90% and
k = 5% is commonly used in applications. For P = 90%, we want to have a 90% chance of being
within k of the mean, so we are willing to have a 5% probability outside on either tail, for a total of
10% probability of being outside the error bars. Thus (y) = 0.95 or y =1.645.
Thus n0 = y2 /k2 = (1.645 / 0.05)2 = 1082 claims.
20

See Longley-Cooks An Introduction to Credibility Theory PCAS 1962, or Some Notes on Credibility by
Perryman, PCAS 1932.


Variations from the Poisson Assumption:


Assume one desires that the chance of being within k of the mean frequency to be at least P, then
the Standard for Full Credibility is n0 = y²/k², where y is such that Φ(y) = (1+P)/2.
However, this depended on the following assumptions:21
1. One is trying to Estimate Frequency
2. Frequency is given by a Poisson Process (so that the variance is equal to the mean)
3. There are enough expected claims to use the Normal Approximation.
If any of these assumptions do not hold then one should not apply the above technique. Questions
can also deal with situations where the frequency is not assumed to be Poisson.
If a Binomial, Negative Binomial, or other frequency distribution is substituted for a Poisson
distribution, then the difference in the derivation is that the variance is not equal to the mean.
For example, assume one has a Binomial Distribution with parameters m = 1000 and
q = 0.3. The mean is 300 and the variance is (1000)(0.3)(0.7) = 210. So the chance of being
within 5% of the expected value is approximately:
Φ[(5%)(300)/√210] - Φ[-(5%)(300)/√210] = Φ(1.035) - Φ(-1.035) = 0.8496 - 0.1504 = 69.9%.
So in the case of a Binomial with parameter 0.3,
the Standard for Full Credibility with P = 70% and k = 5% is about 1000 exposures or 300
expected claims.
If instead a Negative Binomial Distribution had been assumed, then the variance would have been
greater than the mean. This would have resulted in a standard for Full Credibility greater than in the
Poisson situation.
One can derive a more general formula when the Poisson assumption does not apply.
Standard for Full Credibility for Frequency, General Case:
The Standard for Full Credibility for Frequency in terms of claims is:22
n0 (σf²/μf) = (y²/k²)(σf²/μf),
which reduces to the Poisson case when σf²/μf = 1.
21

Unlike Buhlmann Credibility, in Classical Credibility the weight given to the prior mean does not depend on the
actuarys view of its accuracy.
22
Equation 2.2.6 in Mahler and Dean.


Exercise: Find the number of claims required for full credibility. Require that there is a 90% chance
that the estimate of the frequency is correct within 2.5%. The frequency distribution has a variance
twice its mean.
[Solution: P = 90% and y = 1.645. k = 2.5%. n0 = y²/k² = (1.645/0.025)² = 4329 claims.
We are given that σf²/μf = 2. Thus n0 (σf²/μf) = (4329)(2) = 8658 claims.]
Exercise: Find the number of claims required for full credibility. Require that there is a 99% chance
that the estimate of the frequency is correct within 10%. Assume the frequency distribution is
Negative Binomial, with parameters β = 1.5 and r unknown.
[Solution: P = 99% and thus we want Φ(y) = (1+P)/2 = 0.995. Thus y = 2.576.
k = 10%. n0 = y²/k² = (2.576/0.10)² = 664 claims. We are given that the frequency is Negative
Binomial with mean μf = rβ and variance σf² = rβ(1+β). Thus σf²/μf = 1+β = 2.5.
Thus n0 (σf²/μf) = (664)(2.5) = 1660 claims.
This is larger than the standard of 664 for a Poisson frequency, since the Negative Binomial has a
variance greater than its mean. In this case the variance is 2.5 times the mean.
Thus the standard of 1660 claims is 2.5 times 664.]
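The general-case standard is the Poisson standard scaled by the variance-to-mean ratio; a minimal Python sketch (my own names):

from statistics import NormalDist

def full_credibility_claims_general(P, k, variance_to_mean):
    """General case: n0 * (sigma_f^2 / mu_f), where n0 = (y / k)^2 and Phi(y) = (1 + P) / 2."""
    y = NormalDist().inv_cdf((1.0 + P) / 2.0)
    return (y / k) ** 2 * variance_to_mean

# Negative Binomial with beta = 1.5 has variance / mean = 1 + beta = 2.5:
print(full_credibility_claims_general(0.99, 0.10, 2.5))   # about 1659; the text rounds n0 to 664 first, giving 1660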
Derivation for Standard for Full Credibility for Frequency:
Require that the observed frequency should be within 100k% of the expected frequency with
probability P. Use the following notation:
μf = mean frequency. σf² = variance of frequency.
Let y be such that Φ(y) = (1+P)/2.
Using the Normal Approximation, what is a formula for the number of claims needed for full credibility
of the frequency?
Assume there are N claims expected and therefore N/μf exposures.
The mean frequency is μf. The variance of the frequency for a single exposure is: σf².
A key idea is that if one adds up for example 3 independent, identically distributed variables, one
gets 3 times the variance. In this case we are assumed to have N/μf independent exposures.
Therefore, the variance of the number of claims observed for N/μf independent exposures is:
(N/μf)σf².


The observed frequency is the number of claims divided by the number of exposures, N/μf.
When one divides by a constant, the variance is divided by that constant squared.
Therefore, the variance of the observed frequency is the variance of the number of claims,
(N/μf)σf², divided by (N/μf)², which is: μf σf² / N.
Thus the standard deviation of the observed claim frequency is: σ = σf √(μf / N).
We desire that Prob(μf - kμf ≤ X ≤ μf + kμf) ≥ P.
Using the Normal Approximation, this is true provided: kμf = yσ = y σf √(μf / N).
Solving for N: N = (y²/k²)(σf²/μf).


Exposures vs. Claims:
Standards for Full Credibility have been calculated so far in terms of the expected number of claims.
It is common to translate these into a number of exposures by dividing by the (approximate)
expected claim frequency. So for example, if the Standard for Full Credibility is 1082 claims (P =
90%, k = 5%) and the expected claim frequency in Homeowners Insurance were .04 claims per
house-year, then 1082 / 0.04 ≈ 27,000 house-years would be a corresponding Standard for Full
Credibility in terms of exposures. In general, one can divide the Standard for Full Credibility in terms
of claims by μf, in order to get it in terms of exposures.
Thus in general, the Standard for Full Credibility for Frequency in terms of exposures is:23
n0 (σf²/μf²) = (y²/k²)(σf²/μf²).

23 This is equation 20.6 in Loss Models, as applied to this situation. Equation 20.6 in Loss Models gives the number
of exposures required for full credibility: n0 (σ/μ)². What Loss Models refers to as σ is the standard deviation of
the quantity to be estimated, in this case frequency, and μ is the mean of the quantity to be estimated. So in this
case (σ/μ)² = (σf/μf)².


Exercise: Find the number of exposures required for full credibility. Require that there is a 99%
chance that the estimate of the frequency is correct within 10%. Assume the frequency distribution
is Negative Binomial, with parameters β = 1.5 and r = 4.
[Solution: P = 99% and thus we want Φ(y) = (1+P)/2 = 0.995. Thus y = 2.576.
k = 10%. n0 = y²/k² = (2.576/0.10)² = 664 claims. We are given that the frequency is Negative
Binomial with mean μf = rβ and variance σf² = rβ(1+β).
Thus σf²/μf² = (1+β)/(rβ) = 2.5/{(4)(1.5)} = 0.4167.
Thus n0 σf²/μf² = (664)(0.4167) = 277 exposures.
Comment: Note the assumed mean frequency is rβ = (4)(1.5) = 6. Thus 277 exposures correspond to
about (277)(6) = 1660 expected claims, as found in a previous exercise.]
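Again as an unofficial check, here is a small Python sketch of the exposure calculation in the exercise above; small differences from 277 come only from how y and n0 are rounded.

from statistics import NormalDist

r, beta = 4, 1.5                              # Negative Binomial parameters from the exercise
mean_freq = r * beta                          # mu_f = 6
var_freq = r * beta * (1 + beta)              # sigma_f^2 = 15

y = NormalDist().inv_cdf((1 + 0.99) / 2)      # about 2.576
n0 = (y / 0.10) ** 2                          # about 664 claims
claims_standard = n0 * var_freq / mean_freq   # about 1660 claims
exposure_standard = claims_standard / mean_freq   # about 277 exposures
print(round(claims_standard), round(exposure_standard))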
The Choice of P and k:
On the exam one will be given P and k. In practical applications appropriate values of P and k have
to be selected.24 While there is clearly some judgment involved in the choice of P and k, the
Standards for Full Credibility for a given application are generally chosen by actuaries within a similar
range.25
This same type of judgment is involved in the choice of error bars around an estimate of a quantity
such as the loss elimination ratio at $10,000. Many times 2 standard deviations (corresponding to
about a 95% confidence interval) will be chosen, but that is not necessarily better than choosing
1.5 or 2.5 standard deviations. Similarly one has to decide at what significance level to reject or
accept H0 when doing hypothesis testing. Should one use 5%, 1%, or some other significance
level?
So while Classical Credibility also involves somewhat arbitrary judgments, that has not stood in the
way of it being very useful for many decades in many applications.

24 For situations that come up repeatedly, the choice of P and k may have been made several decades ago, but
nevertheless the choice was made at some point in time. 1082 claims, corresponding to P = 90% and k = 5%, is the
single most commonly used value.
25 For example, if an actuary were estimating frequency for private passenger automobile insurance, he would
probably pick values of P and k that have been used before by other actuaries. These practical applications are
beyond the syllabus of this exam.


Problems:
2.1 (1 point) Assume frequency is Poisson.
How many claims are required for Full Credibility if one requires that there be a 98% chance of the
estimated frequency being within 2.5% of the true value?
A. Less than 8,000
B. At least 8,000 but less than 9,000
C. At least 9,000 but less than 10,000
D. At least 10,000 but less than 11,000
E. At least 11,000
2.2 (3 points) Y represents the number of independent homogeneous exposures in an insurance
portfolio. The claim frequency rate per exposure is a random variable with mean = 0.10 and
variance = 0.25.
A full credibility standard is devised that requires the observed sample frequency rate per exposure
to be within 4% of the expected population frequency rate per exposure 95% of the time.
Determine the value of Y needed to produce full credibility for the portfolio's experience.
A. 50,000
B. 60,000
C. 70,000
D. 80,000
E. 90,000
2.3 (1 point) Let A be the number of claims needed for full credibility, if the estimate is to be within
3% of the true value with a 80% probability. Let B be the similar number using 8% rather than 3%.
What is the ratio of A divided by B?
A. 3
B. 4
C. 5
D. 6
E. 7
2.4 (2 points) Assume you are conducting a poll relating to a single question and that each
respondent will answer either yes or no. You pick a random sample of respondents out of a very
large population. Assume that the true percentage of yes responses in the total population is
between 20% and 80%. How many respondents do you need, in order to require that there be a
90% chance that the results of the poll are within 8% of the true answer?
A. Less than 1,000
B. At least 1,000 but less than 2,000
C. At least 2,000 but less than 3,000
D. At least 3,000 but less than 4,000
E. At least 4,000


2.5 (1 point) Assume frequency is Poisson. The full credibility standard for a company is set so that
the total number of claims is to be within 8% of the true value with probability P.
This full credibility standard is calculated to be 625 claims. What is the value of P?
A. Less than 93%
B. At least 93% but less than 94%
C. At least 94% but less than 95%
D. At least 95% but less than 96%
E. 96% or more
2.6 (1 point) Find the number of claims required for full credibility.
Require that there is a 95% chance that the estimate of the frequency is correct within 10%.
The frequency distribution has a variance 3 times its mean.
A. Less than 1,000
B. At least 1,000, but less than 1,100
C. At least 1,100, but less than 1,200
D. At least 1,200, but less than 1,300
E. 1,300 or more
2.7 (2 points) A Standard for Full Credibility in terms of claims has been established for frequency
assuming that the frequency is Poisson. If instead the frequency is assumed to follow a Negative
Binomial with parameters r = 12 and = 0.5, what is the ratio of the revised Standard for Full
Credibility to the original one?
A. Less than 1
B. At least 1 but less than 1.2
C. At least 1.2 but less than 1.4
D. At least 1.4 but less than 1.6
E. At least 1.6
2.8 (1 point) Assume frequency is Poisson. How many claims are required for Full Credibility if one
requires that there be a 95% chance of being within 10% of the true frequency?
A. Less than 250
B. At least 250 but less than 300
C. At least 300 but less than 350
D. At least 350 but less than 400
E. 400 or more


2.9 (1 point) The total number of claims for a group of insureds is Poisson distributed with a mean of
m. Using the Normal approximation, calculate the value of m such that the observed number of
claims will be within 6% of m with a probability of 0.98.
A. Less than 1,000
B. At least 1,000, but less than 1,500
C. At least 1,500, but less than 2,000
D. At least 2,000, but less than 2,500
E. 2,500 or more
2.10 (1 point) Assume frequency is Poisson.
How many claims are required for Full Credibility if one requires that there be a 99% chance of the
estimated frequency being within 7.5% of the true value?
A. Less than 800
B. At least 800 but less than 900
C. At least 900 but less than 1000
D. At least 1000 but less than 1100
E. At least 1100
2.11 (2 points) You are given the following information about a book of business:
(i) Each insured's claim count has a Poisson distribution with mean λ, where λ has
a gamma distribution with α = 5 and θ = 0.3.
(ii) The full credibility standard is for frequency to be within 2.5% of the expected
with probability 0.95.
Using classical credibility, determine the expected number of claims required for full credibility.
(A) 6,000
(B) 7,000
(C) 8,000
(D) 9,000
(E) 10,000
2.12 (2 points) Frequency is assumed to follow a Binomial with parameters q = 0.4 and m.
How many claims are required for Full Credibility if one requires that there be a 90% chance of the
estimated frequency being within 5% of the true value?
(A) 650
(B) 700
(C) 750
(D) 800
(E) 850


2.13 (3 points) A standard for full credibility has been selected so that the actual frequency would be
within 10% of the expected frequency 80% of the time.
The number of claims for an individual insured is Poisson with mean λ.
However, λ in turn varies across the portfolio via a Poisson with mean c.
What is the smallest value of c, such that the data for one insured would be given full credibility?
A. Less than 300
B. At least 300 but less than 400
C. At least 400 but less than 500
D. At least 500 but less than 600
E. At least 600
2.14 (2, 5/85, Q. 31) (1.5 points) Some scientists believe that Drug X would benefit about half of
all people with a certain blood disorder. To estimate the proportion, p, of patients who would
benefit from taking Drug X, the scientists will administer it to a random sample of patients who have
the blood disorder. The estimate of p will be p̂, the proportion of patients in the sample who benefit
from having taken the drug. Which of the following is closest to the minimum sample size that
guarantees P[|p − p̂| ≤ 0.03] ≥ 0.95?
A. 748
B. 1,068
C. 1,503
D. 2,056
E. 2,401

2.15 (4, 5/86, Q.34) (1 point) Let X be the number of claims needed for full credibility, if the
estimate is to be within 5% of the true value with a 90% probability.
Let Y be the similar number using 10% rather than 5%.
What is the ratio of X divided by Y?
A. 1/4
B. 1/2
C. 1
D. 2
E. 4
2.16 (4, 5/87, Q.46) (2 points) The "Classical" approach to credibility optimizes which of the
following error measures?
A. least squares error criterion
B. variance of the hypothetical means
C. normal approximation for skewness
D. coefficient of variation
E. None of the above


2.17 (4, 5/89, Q.29) (1 point) The total number of claims for a group of insureds is Poisson
distributed with a mean of m. Calculate the value of m such that the observed number of claims will
be within 3% of m with a probability of 0.975 using the normal approximation.
A. Less than 5,000
B. At least 5,000, but less than 5,500
C. At least 5,500, but less than 6,000
D. At least 6,000, but less than 6,500
E. 6,500 or more
2.18 (4B, 11/94, Q.15) (3 points) You are given the following:
Y represents the number of independent homogeneous exposures in an insurance portfolio.
The claim frequency rate per exposure is a random variable with mean = 0.025 and
variance = 0.0025.
A full credibility standard is devised that requires the observed sample frequency rate per exposure
to be within 5% of the expected population frequency rate per exposure 90% of the time.
Determine the value of Y needed to produce full credibility for the portfolio's experience.
A. Less than 900
B. At least 900, but less than 1,500
C. At least 1,500, but less than 3,000
D. At least 3,000, but less than 4,500
E. At least 4,500
2.19 (4B, 5/96, Q.13) (1 point) Using the methods of Classical credibility, a full credibility standard
of 1,000 expected claims has been established such that the observed frequency will be within 5%
of the underlying frequency, with probability P.
Determine the number of expected claims that would be required for full credibility if 5% were
changed to 1%.
A. 40
B. 200
C. 1,000
D. 5,000
E. 25,000
2.20 (4, 11/04, Q.21 & 2009 Sample Q.148) (2.5 points) You are given:
(i) The number of claims has probability function:
p(x) = (m choose x) q^x (1−q)^(m−x),  x = 0, 1, 2, …, m
(ii) The actual number of claims must be within 1% of the expected number of claims
with probability 0.95.
(iii) The expected number of claims for full credibility is 34,574.
Determine q.
(A) 0.05
(B) 0.10
(C) 0.20
(D) 0.40
(E) 0.80


Solutions to Problems:
2.1. B. Φ(2.326) = (1+P)/2 = (1+0.98)/2 = 0.99, so that y = 2.326.
n0 = y²/k² = (2.326/0.025)² = 8656.
2.2. B. k = 0.04, P = 95%, y = 1.960, μf = 0.10, σf² = 0.25, and (y²/k²)(σf²/μf)
= (1.960/0.04)² (0.25/0.10) = 6002.5 claims. 6002.5/0.10 = 60,025 exposures.
2.3. E. n0 = y²/k², and thus for a given P the standard for full credibility is inversely proportional to
the square of k. Thus A/B = 8²/3² = 7.11.
Comment: The standard for full credibility is larger the smaller k is; being within 3% is a more stringent
requirement, which requires more claims than being within 8%.
2.4. B. Let m be the number of respondents and let q be the true percentage of yes respondents
in the total population. The number of yes responses in the sample is given by a Binomial
Distribution with parameters q and m, with variance mq(1−q).
The percentage of yes responses is N/m, with variance: mq(1−q)/m² = q(1−q)/m.
Using the Normal Approximation, 90% probability corresponds to ±1.645 standard deviations around
the mean of q. Thus we want: (0.08)(q) = (1.645)√{q(1−q)/m}.
Solving, m = (1.645/0.08)² (1−q)/q = 423 {(1/q) − 1}. The desired m is a decreasing function of q.
However, we assume q ≥ 0.2, so that m ≤ 423(5 − 1) = 1692.
Alternately, for each respondent, which can be thought of as an exposure, we have a Bernoulli
distribution, with σf²/μf² = q(1−q)/q² = 1/q − 1.
The standard for full credibility in terms of exposures is:
(σf²/μf)(y²/k²)/μf = (σf²/μf²)(y²/k²) = (1.645/0.08)² (1/q − 1) = 423 (1/q − 1).
For 0.2 ≤ q ≤ 0.8, this is maximized when q = 0.2, and is then: 423(5 − 1) = 1692 exposures.
Comment: The number of exposures needed for full credibility depends on q. We want a standard
for full credibility that will be enough exposures to satisfy the criterion regardless of q, so we pick the
maximum over q from 20% to 80%.
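As an unofficial numerical check of 2.4 (my own sketch, not part of the solution), the maximization over q can be done directly:

from statistics import NormalDist

y = NormalDist().inv_cdf(0.95)                # 90% two-sided corresponds to y = 1.645
const = (y / 0.08) ** 2                       # about 423
needed = {q: const * (1/q - 1) for q in (0.2, 0.3, 0.5, 0.8)}
print(max(needed.values()))                   # about 1691, attained at q = 0.2; the text rounds the constant to 423, giving 1692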
2.5. D. n0 = y²/k². Therefore y = k√n0 = 0.08 √625 = 2.00. Φ(y) = (1+P)/2.
P = 2Φ(y) − 1 = 2Φ(2.00) − 1 = (2)(0.9772) − 1 = 0.9544.


2.6. C. P = 95%. Φ(1.960) = (1+P)/2 = 0.975, so that y = 1.960.
(σf²/μf)(y²/k²) = (3)(1.960/0.10)² = (3)(384) = 1152 claims.
2.7. D. For frequency, the general formula for the Standard for Full Credibility in terms of claims is:
(σf²/μf)(y²/k²). Assuming y and k are fixed, the Standard for Full Credibility is proportional to
the ratio of the variance to the mean. For the Poisson this ratio is one. For the Negative Binomial this
ratio is: rβ(1+β)/(rβ) = 1 + β.
Thus the second Standard is 1 + β = 1.5 times the first Standard.
Comment: The Negative Binomial has more random fluctuation than the Poisson, and therefore the
Standard for Full Credibility is larger.
2.8. D. Φ(1.960) = 0.975, so that y = 1.960. n0 = y²/k² = (1.960/0.10)² = 384.
2.9. C. Φ(2.326) = 0.99, so that y = 2.326.
n0 = y²/k² = (2.326/0.06)² = 1503 claims.
2.10. E. Φ(2.576) = (1+P)/2 = (1+0.99)/2 = 0.995, so that y = 2.576.
n0 = y²/k² = (2.576/0.075)² = 1180 claims.
2.11. C. For the Gamma-Poisson, the mixed distribution is Negative Binomial, with r = α = 5
and β = θ = 0.3. Therefore, σf²/μf = rβ(1+β)/(rβ) = 1 + β = 1.3.
For P = 0.95, y = 1.960. k = 0.025.
(y²/k²)(σf²/μf) = (1.960/0.025)² (1.3) = 7991 claims.
Comment: Similar to 4, 11/02, Q.14.
2.12. A. Φ(1.645) = (1+P)/2 = (1 + 0.90)/2 = 0.95, so that y = 1.645.
n0 = y²/k² = (1.645/0.05)² = 1082 claims. σf²/μf = mq(1−q)/(mq) = 1 − q = 0.6.
n0 σf²/μf = (1082)(0.6) ≈ 650 claims.


2.13. B. We have y = 1.282, since Φ(1.282) = 0.90 = (1 + 0.80)/2.
Therefore, n0 = y²/k² = (1.282/0.10)² = 164 claims.
The frequency for the portfolio is a mixture of Poissons.
The mean of the mixture is: E[λ] = c.
The second moment of each Poisson is: variance + mean² = λ + λ².
The second moment of the mixture is the mixture of the second moments:
E[λ + λ²] = E[λ] + E[λ²] = c + (c + c²) = 2c + c².
Thus the variance of the mixture is: 2c + c² − c² = 2c.
Thus the standard for full credibility in terms of number of claims is:
(σf²/μf) n0 = (2c/c) n0 = 2n0 = (2)(164) = 328 claims.
For the data for one insured to be given full credibility, we need the expected number of
claims for an individual insured to be at least 328.
The smallest possible c is 328.
Alternately, EPV = E[λ] = c. VHM = Var[λ] = c.
So the variance of the mixture is: EPV + VHM = 2c. Proceed as before.
2.14. B. Want a 95% probability of the estimate being within 0.03/p, as a fraction, of the true value of p.
k = 0.03/p. Φ(1.960) = (1 + 95%)/2 = 97.5%. y = 1.960.
For Classical Credibility, the standard for full credibility for frequency is:
(σf²/μf)(y/k)² = {p(1−p)/p}{1.960/(0.03/p)}² = 4268(1−p)p² claims.
Put it in terms of exposures by dividing by the mean frequency p: 4268(1−p)p exposures.
p(1−p) ≤ 1/4. Therefore, one can take n = 4268/4 = 1067.
Alternately, let x be the number who benefit. For a Binomial Distribution, p̂ = x/n.
Var[X] = np(1−p). Var[p̂] = Var[X]/n² = p(1−p)/n.
Using the Normal Approximation, P[|p − p̂| ≤ 0.03] = 0.95 if 0.03 is 1.960 standard deviations:
0.03 = 1.96 √{p(1−p)/n}. n = 4268 p(1−p).
Now for 0 ≤ p ≤ 1, p(1−p) has its maximum at p = 1/2, when p(1−p) = 1/4.
So we can take n = 4268/4 = 1067.


2.15. E. Since the full credibility standard is inversely proportional to the square of k,
n0 = y²/k², X/Y = (10%/5%)² = 4. Alternately, one can compute the values of X and Y assuming
one is dealing with the standard for frequency and that the frequency is Poisson.
(The answer to this question does not depend on these assumptions.)
For k = 5% and P = 90%: Φ(1.645) = 0.95 = (1 + 0.90)/2, so that y = 1.645,
n0 = y²/k² = (1.645/0.05)² = 1082 = X.
For k = 10% and P = 90%: Φ(1.645) = 0.95 = (1 + 0.90)/2, so that y = 1.645,
n0 = y²/k² = (1.645/0.10)² = 271 = Y.
Thus X/Y = 1082/271 = 4.
Comment: As the requirement gets less strict, for example k = 10% rather than 5%, the number of
claims needed for Full Credibility decreases.
2.16. E. The classical approach to credibility attempts to limit the probability of large errors.
What is considered a large error is determined by the choice of k. The classical approach to
credibility does not optimize any particular error measure. The Buhlmann, or greatest accuracy,
approach optimizes the least squares error criterion.
2.17. C. Classical Credibility for frequency with k = 0.03 and P = 0.975.
y = 2.24, since Φ(2.24) = 0.9875 = (1+P)/2.
n0 = y²/k² = (2.24/0.03)² = 5575 claims.
2.18. D. Φ(1.645) = (1 + 0.90)/2 = 0.95. y = 1.645. n0 = y²/k² = (1.645/0.05)² = 1082 claims.
nf = n0 (σf²/μf) = (1082)(0.0025/0.025) = 108.2 claims. 108.2 claims / 0.025 = 4328 exposures.
Alternately, the standard for full credibility for frequency in terms of number of exposures is:
(y²/k²)(σf²/μf²) = (1.645/0.05)² (0.0025/0.025²) = (1082)(4) = 4328 exposures.
2.19. E. The Standard for Full Credibility (whether it is for frequency, severity, or pure premiums)
is inversely proportional to k².
Thus the revised Standard is: (0.05/0.01)² (1000) = 25,000.
2.20. B. k = 1%. P = 95%. y = 1.960. n0 = (y/k)² = 38,416 claims. Binomial frequency:
σf²/μf = mq(1−q)/(mq) = 1 − q. 34,574 = n0 σf²/μf = 38,416(1 − q). q = 0.100.


Section 3, Full Credibility for Severity


You are less likely to be asked a question on the exam involving applying Classical Credibility to
estimating future severities. However, the same ideas easily apply as they did to frequencies.
Assume we have 5 claims, each independently drawn from an Exponential Distribution:
F(x) = 1 − e^(−x/100).
Then since the variance of an Exponential is θ², the variance of a single claim is: 100² = 10,000.
Thus the variance of the total cost of five independent claims is (5)(10,000) = 50,000.
The observed severity is the total observed cost divided by the number of claims, in this case 5.
Thus the variance of the observed severity is (1/5)²(50,000) = 2000.
When one has N claims, the variance of the observed severity is (N)(10,000)/N² = 10,000/N.
In general, the variance of the observed severity =
(process variance of the severity) / (number of claims) = σS²/N.
Therefore, the standard deviation of the observed severity is σS/√N.
Assume we wish to have a chance of P that the observed severity will be within ±kμS of the true
average severity. As before with credibility for the frequency, use the Normal Approximation, with y
such that Φ(y) = (1+P)/2. Then within ±y (standard deviations of observed severity) of the mean
covers probability P on the Normal Distribution. Therefore, in order to have probability P of
differing from the mean severity by less than kμS, we want y(σS/√N) = kμS.
Solving: N = (y/k)² σS²/μS² = n0 CVSev².
The Standard for Full Credibility for the Severity in terms of number of expected claims is:
(y²/k²)(σS²/μS²) = n0 CVSev²,
where CVSev is the coefficient of variation of the severity = standard deviation / mean.26 27

26 Equation 2.3.2 in Mahler and Dean.
27 This is equation 20.6 in Loss Models, as applied to this situation. Equation 20.6 in Loss Models gives the
requirement for full credibility: n0 (σ/μ)². What Loss Models refers to as σ is the standard deviation of the
quantity to be estimated, in this case severity, and μ is the mean of the quantity to be estimated. So in this case
(σ/μ)² = (σS/μS)². Note that since the denominator of severity is claims, equation 20.6 gives a result here in terms of
expected claims rather than exposures.


Note that no assumption was made about the distributional form of the frequency.
The Standard for Full Credibility for severity does not depend on whether the frequency is Poisson,
Negative Binomial, etc. However, we have assumed that frequency and severity are independent
and that all of the claims are drawn from the same distribution.
Exercise: Let P = 90% and k = 5%.
If the coefficient of variation of the severity is 3, then what is the Standard for Full Credibility for the
severity in terms of expected claims?
[Solution: n0 = (1.645/0.05)² = 1082 claims.
Then the Standard for Full Credibility for the severity is: (1082)(3²) = 9738 expected claims.]
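As an unofficial aside, a short Python sketch of the severity standard n0 CVSev² (the function name full_cred_severity is my own):

from statistics import NormalDist

def full_cred_severity(P, k, cv_severity):
    y = NormalDist().inv_cdf((1 + P) / 2)     # Phi(y) = (1+P)/2
    return (y / k) ** 2 * cv_severity ** 2    # n0 * CV^2 expected claims

print(full_cred_severity(0.90, 0.05, 3))      # about 9740; the text rounds n0 to 1082, giving 9738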
Exposures vs. Claims:
Standards for Full Credibility have been calculated so far in terms of the expected number of claims.
It is common to translate these into a number of exposures by dividing by the (approximate)
expected claim frequency. So for example, if the Standard for Full Credibility is 9738 claims and the
expected claim frequency in Homeowners Insurance were 0.04 claims per house-year, then
9738 / 0.04 ≈ 243,000 house-years would be a corresponding Standard for Full Credibility in terms
of exposures.
In general, one can divide the Standard for Full Credibility in terms of claims by μf, in order to get it in
terms of exposures.
Thus in general, the Standard for Full Credibility for the Severity in terms of number of exposures is:
n0 CVSev²/μf = (y²/k²)(σS²/μS²)(1/μf),
where CVSev is the coefficient of variation of the severity.


Problems:
3.1 (2 points) You are given the following:

The claim amount distribution has mean 500, variance 5,000,000.

Frequency and severity are independent.

Find the number of claims required for full credibility, if you require that there will be a 80% chance
that the estimate of the severity is correct within 2%.
A. Less than 60,000
B. At least 60,000 but less than 70,000
C. At least 70,000 but less than 80,000
D. At least 80,000 but less than 90,000
E. At least 90,000

3.2 (3 points) You are given the following:

The claim amount distribution is LogNormal, with σ = 1.5.

Frequency and severity are independent.

Find the number of claims required for full credibility, if you require that there will be a 95% chance
that the estimate of the severity is correct within 10%.
A. Less than 2900
B. At least 2900, but less than 3000
C. At least 3000, but less than 3100
D. At least 3100, but less than 3200
E. At least 3200

3.3 (3 points) You are given the following:

The claim amount distribution is Pareto, with α = 2.3.

Frequency and severity are independent.

Find the number of claims required for full credibility, if you require that there will be a 90% chance
that the estimate of the severity is correct within 7.5%.
(A) 2900
(B) 3100
(C) 3300
(D) 3500
(E) 3700


3.4 (2 points) You require that there will be a 99% chance that the estimate of the severity is correct
within 5%. 17,000 claims are required for full credibility.
Determine the coefficient of variation of the size of loss distribution.
A. Less than 1
B. At least 1, but less than 2
C. At least 2, but less than 3
D. At least 3, but less than 4
E. At least 4
3.5 (2 points) You are given the following:
The estimated claim frequency is 4%.
Number of claims and claim severity are independent.
Claim severity has the following distribution:
    Claim Size     Probability
        10            0.50
        20            0.30
        50            0.20
Determine the number of exposures needed so that the estimated average size of claim is within
2% of the expected size with 95% probability.
(A) 95,000
(B) 105,000
(C) 115,000
(D) 125,000
(E) 135,000
3.6 (3 points) An actuary is determining the number of claims needed for full credibility in three
different situations:
(1) Assuming claim severity is Pareto, the estimated claim severity is to be within r
of the true value with probability p.
(2) Assuming claim frequency is Binomial, the estimated claim frequency is to be within r
of the true value with probability p.
(3) Assuming claim severity is Exponential, the estimated claim severity is to be within r
of the true value with probability p.
The same values of r and p are chosen for each situation.
Rank these three limited fluctuation full credibility standards from smallest to largest.
(A) 1, 2, 3
(B) 2, 1, 3
(C) 3, 1, 2
(D) 2, 3, 1
(E) None of A, B, C, or D


3.7 (3 points) You wish to estimate the average insured damage per hurricane for hurricanes that hit
the east or gulf coasts of the United States.
The full credibility standard is to be within 10% of the expected severity 98% of the time.
The insured damage from a single hurricane in millions of dollars is modeled as a mixture of five
Exponential Distributions:
    θ1 = 6         p1 = 24%
    θ2 = 40        p2 = 26%
    θ3 = 700       p3 = 30%
    θ4 = 4000      p4 = 17%
    θ5 = 18,000    p5 = 3%
Determine the number of hurricanes needed for full credibility.


A. 3000
B. 4000
C. 5000
D. 6000
E. 7000


Solutions to Problems:
3.1. D. y = 1.282 since Φ(1.282) = 0.90. n0 = y²/k² = (1.282/0.02)² = 4109.
For severity, the Standard for Full Credibility is:
n0 CV² = (4109)(5,000,000/500²) = (4109)(20) = 82,180 claims.
3.2. E. y = 1.960 since Φ(1.960) = 0.975. n0 = y²/k² = (1.960/0.1)² = 384.
For the LogNormal Distribution: Mean = exp(μ + 0.5σ²),
Variance = exp(2μ + σ²){exp(σ²) − 1}, and therefore the Coefficient of Variation = √(exp(σ²) − 1).
For σ = 1.5, CV² = exp(2.25) − 1 = 8.49.
For severity, the Standard for Full Credibility is: n0 CV² = (384)(8.49) = 3260.
3.3. E. Φ(1.645) = 0.95, so that y = 1.645. n0 = y²/k² = (1.645/0.075)² = 481.
Using the formulas for the moments: CV² = E[X²]/E[X]² − 1 = {2θ²/[(α−1)(α−2)]}/{θ/(α−1)}² − 1 =
2(α−1)/(α−2) − 1 = α/(α−2). For α = 2.3, CV² = 2.3/0.3 = 7.667.
Therefore n0 CV² = (481)(7.667) = 3688.
Comment: The smaller α, the heavier-tailed the Pareto Distribution, making it harder to limit
fluctuations in the estimated severity, since a single large claim can affect the observed average
severity. Therefore, the smaller α, the larger the Standard for Full Credibility.
3.4. C. Φ(2.576) = 0.995 = (1 + 0.99)/2, so that y = 2.576. n0 = y²/k² = (2.576/0.05)² = 2654.
17,000 = n0 CVSev². CVSev = √(17,000/2654) = 2.53.

3.5. D. We have y = 1.960 since Φ(1.960) = 0.975. Therefore n0 = y²/k² = (1.960/0.02)² =
9604. The mean severity is: (10)(0.5) + (20)(0.3) + (50)(0.2) = 21. The variance of the severity is:
(11²)(0.5) + (1²)(0.3) + (29²)(0.2) = 229. Thus the coefficient of variation squared = 229/21² =
0.519. n0 CV² = (9604)(0.519) = 4984 claims.
This corresponds to: 4984 / 0.04 = 124,600 exposures.


3.6. D. (1) The coefficient of variation for the Pareto is greater than 1 (or infinite).
Thus the Standard for Full Credibility for Severity is: CVSev² n0 > 1² n0 = n0.
(2) For the Binomial, variance/mean = mq(1−q)/(mq) = 1 − q.
The Standard for Full Credibility for Frequency is: (1 − q) n0 < n0.
(3) The CV for the Exponential is 1.
Thus the Standard for Full Credibility for Severity is: CVSev² n0 = 1² n0 = n0.
Thus ranking the standards from smallest to largest: 2, 3, 1.
Comment: The CV of the Pareto is discussed in Section 30 of Mahler's Guide to Loss
Distributions. Since it is heavier-tailed than the Exponential, when it is finite, the CV of the Pareto is
greater than that of the Exponential.
From its mean and second moment, one can determine that for a Pareto Distribution:
Coefficient of Variation = √{α/(α−2)}, α > 2.
3.7. D. P = 98%. y = 2.326. k = 10%. n0 = (2.326/0.1)² = 541 hurricanes.
The first moment of the mixed severity is:
(24%)(6) + (26%)(40) + (30%)(700) + (17%)(4000) + (3%)(18,000) = 1441.8.
The second moment of each Exponential is 2θ².
Thus the second moment of the mixed severity is:
(24%)(2)(6²) + (26%)(2)(40²) + (30%)(2)(700²) + (17%)(2)(4000²) + (3%)(2)(18,000²) =
25,174,850.
CV² = E[X²]/E[X]² − 1 = 25,174,850/1441.8² − 1 = 11.110.
Standard for full credibility is: (11.110)(541) = 6010 hurricanes.
Comment: 167 hurricanes hit the continental United States from 1900 to 1999.
The reported losses would have been adjusted to a current level for inflation, changes in per capita
wealth (to represent the changes in property value above the rate of inflation), changes in insurance
utilization, and changes in number of housing units (by county). See "A Macro Validation Dataset for
U.S. Hurricane Models," by Douglas J. Collins and Stephen P. Lowe, CAS Forum, Winter 2001.
Using the criterion in this question, the credibility for a century of data used for estimating severity of
hurricanes is: √(167/6010) = 17%.
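As an unofficial check of the mixture arithmetic in 3.7, a short Python sketch:

weights = [0.24, 0.26, 0.30, 0.17, 0.03]      # p_i
thetas = [6, 40, 700, 4000, 18000]            # Exponential means theta_i (in $ million)

mean = sum(p * t for p, t in zip(weights, thetas))           # 1441.8
second = sum(p * 2 * t**2 for p, t in zip(weights, thetas))  # about 25,174,849; 2nd moment of an Exponential is 2 theta^2
cv2 = second / mean**2 - 1                                   # about 11.11
print(mean, second, round(cv2 * 541))                        # about 6010 hurricanes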


Section 4, Variance of Pure Premiums and Aggregate Losses28


The same formulas and techniques can be used to calculate the process variance of the aggregate
losses or loss ratios. The loss ratio is defined as losses divided by premiums.
Exercise: XYZ Insurance insures 123,000 automobiles for one year.
Total premiums are $57 million. Total loss payments are $48 million.
What are the pure premium, aggregate annual loss, and loss ratio?
[Solution: The aggregate loss is $48 million. Pure premium = $48 million/123,000 car years = $390/
car year. Loss Ratio = $48 million/ $57 million = 84.2%.]
Aggregate Loss:
The Aggregate Loss is the total dollars of loss for an insured or set of insureds. If not stated
otherwise, the period of time is one year.
For example, during 1999 the MT Trucking Company may have had $952,000 in aggregate losses
on its commercial automobile collision insurance policy. All of the trucking firms insured by the
Fly-by-Night Insurance Company may have had $15.1 million in aggregate losses for collision. The
dollars of aggregate losses are determined by how many losses there are and the severity of each
one.
Exercise: During 1998 MT Trucking suffered three collision losses for $8,000, $13,500, and
$22,000. What are its aggregate losses?
[Solution: $8,000 + $13,500 + $22,000 = $43,500.]
Aggregate Losses = (# of Exposures) × (# of Claims / # of Exposures) × ($ of Loss / # of Claims)
= (Exposures)(Frequency)(Severity).
If one is not given the frequency per exposure, but is rather just given the frequency for the whole
number of exposures,29 whatever they are for the particular situation, then
Aggregate Losses = (Frequency)(Severity).
Similarly, the Aggregate Payment is the total dollars paid by an insurer on an insurance policy or set
of insurance policies. If not stated otherwise, the period of time is one year.

28 This important material is covered in Mahler's Guide to Aggregate Distributions, as well as here.
29 For example, the expected number of claims from a large commercial insured is 27.3 per year, or the expected
number of Homeowners claims expected by XYZ Insurer in the State of Florida is 12,310.


Exercise: During 1998 MT Trucking suffered three collision losses for $8,000, $13,500, and
$22,000. MT Trucking has a $10,000 per claim deductible on its policy with the Fly-by-Night
Insurance Company. What are the aggregate payments by Fly-by-Night?
[Solution: $0 + $3,500 + $12,000 = $15,500.]
Pure Premium:
Pure Premium = Aggregate Loss per exposure.
The mean pure premium is: (mean frequency per exposure)(mean severity).
Expected Aggregate Loss = (Mean Pure Premium)(Exposure)
Estimated expected pure premiums serve as a starting point for pricing insurance.30

Process Variance:
Random fluctuation occurs when one rolls dice, spins spinners, picks balls from urns, etc. The
observed result varies from time period to time period due to random chance. This is also true for
the pure premium observed for a collection of insureds.31 The variance of the observation for a
given risk that occurs due to random fluctuation is referred to as the process variance. That is what
will be discussed here.32
Since pure premiums depend on both the number of claims and the size of claims, pure premiums
have more reasons to vary than do either frequency or severity individually.

30 One would have to load for loss adjustment expenses, expenses, taxes, profits, etc.
31 In fact this is the fundamental reason for the existence of insurance.
32 The process variance is distinguished from the variance of the hypothetical pure premiums as discussed in
Buhlmann Credibility.


Independent Frequency and Severity:
You are given the following:
• For a given risk, the number of claims for a single exposure period is given by
a Binomial Distribution with q = 0.3 and m = 2.
• The size of a claim will be 50, with probability 80%, or 100, with probability 20%.
• Frequency and severity are independent.
Exercise: Determine the variance of the pure premium for this risk.
[Solution: List the possibilities and compute the first two moments:

    Situation                       Probability   Pure Premium   Square of P.P.
    0 claims                          49.00%             0                0
    1 claim @ 50                      33.60%            50            2,500
    1 claim @ 100                      8.40%           100           10,000
    2 claims @ 50 each                 5.76%           100           10,000
    2 claims: 1 @ 50 & 1 @ 100         2.88%           150           22,500
    2 claims @ 100 each                0.36%           200           40,000

    Overall                          100.00%            36            3,048

For example, the probability of 2 claims is: 0.3² = 9%. We split this 9% among the possible claim
sizes: 50 and 50 @ (0.8)(0.8) = 64%, 50 and 100 @ (0.8)(0.2) = 16%,
100 and 50 @ (0.2)(0.8) = 16%, 100 and 100 @ (0.2)(0.2) = 4%.
(9%)(64%) = 5.76%, (9%)(16% + 16%) = 2.88%, (9%)(4%) = 0.36%.
One takes the weighted average over all the possibilities. The average Pure Premium is 36.
The second moment of the Pure Premium is 3048.
Therefore, the variance of the pure premium is: 3048 − 36² = 1752.]
In this case, since frequency and severity are independent, one can make use of the following
formula:33
Process Variance of Pure Premium =
(Mean Frequency)(Variance of Severity) + (Mean Severity)²(Variance of Frequency)
σPP² = μF σS² + μS² σF².
Memorize this formula! Note that each of the two terms has a mean and a variance, one from
frequency and one from severity. Each term is in dollars squared; that is one way to remember that
the mean severity (which is in dollars) enters as a square while that for mean frequency (which is not
in dollars) does not.
33 Equation 2.4.1 in Mahler and Dean.


In the above example, the mean frequency is mq = 0.6 and the variance of the frequency is:
mq(1 − q) = (2)(0.3)(0.7) = 0.42. The average severity is 60 and the variance of the severity is:
(0.8)(10²) + (0.2)(40²) = 400. Thus, the process variance of the pure premium is:
(0.6)(400) + (60²)(0.42) = 1752, which matches the result calculated previously.
This same formula can also be used to compute the process variance of the aggregate
losses, when frequency and severity are independent:
σA² = μF σS² + μS² σF².
The sum of losses is just the product of the pure premium and the number of exposures. Provided
the risk processes for the individual exposures are independent and identical, then both μF and σF²
are multiplied by the number of exposures, as is σPP².
In the above example, the process variance of the sum of the losses from 10 exposures is:
(6)(400) + (60²)(4.2) = 17,520 = (10)(1752).
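As an unofficial illustration, the following Python sketch checks the formula against brute-force enumeration of the Binomial(m = 2, q = 0.3) example above; both approaches give 1752.

from itertools import product

q = 0.3
sizes = {50: 0.8, 100: 0.2}                       # severity distribution
freq = {0: (1 - q)**2, 1: 2*q*(1 - q), 2: q**2}   # Binomial(m = 2, q = 0.3)

first = second = 0.0
for n, pn in freq.items():                        # enumerate all aggregate outcomes
    for claims in product(sizes, repeat=n):
        p = pn
        for c in claims:
            p *= sizes[c]
        total = sum(claims)
        first += p * total
        second += p * total**2
print(second - first**2)                          # 1752.0 by enumeration

mu_f, var_f = 2*q, 2*q*(1 - q)                    # 0.6 and 0.42
mu_s = 50*0.8 + 100*0.2                           # 60
var_s = 0.8*(50 - mu_s)**2 + 0.2*(100 - mu_s)**2  # 400
print(mu_f*var_s + mu_s**2*var_f)                 # 1752.0 by the formula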
Dependent Frequency and Severity:
While frequency and severity are almost always independent, if they are dependent one can use a
more general technique.34 The first and second moments can be calculated by listing the pure
premiums for all the possible outcomes and taking the weighted average, applying the probabilities
as weights to either the pure premium or its square. In continuous cases, this will involve taking
integrals, rather than sums. Then one can calculate the variance of the pure premium as:
second moment − (first moment)².
Aggregate Losses Versus Pure Premiums:
Exercise: Assume frequency is Poisson with mean 5% for one exposure.
Severity is Exponential, with mean 100. What is the mean and variance of the pure premium?
[Solution: The mean pure premium = μF μS = (5%)(100) = 5.
Variance of pure premium = μF σS² + μS² σF² = (5%)(100²) + (100²)(5%) = 1000.]
Exercise: If we insure 1000 independent, identically distributed exposures, what is the mean and
variance of the aggregate loss?
[Solution: Overall frequency is Poisson with mean: (5%)(1000) = 50.
Mean aggregate: (50)(100) = 5000. Variance of aggregate: (50)(100²) + (100²)(50) = 1 million.]
34 See 4B, 5/95, Q.14 and 4, 11/02, Q.36 for examples where frequency and severity are dependent.


So we can use basically the same formula for the mean and variance when working with either the
aggregate losses or pure premiums.35 When working with pure premiums, we used 5% as the
mean frequency and 5% as the variance of the frequency, the mean and variance of the frequency
distribution for a single exposure. However, when working with the aggregate losses, we used 50
as the mean frequency and 50 as the variance of the frequency, the mean and variance of the
frequency distribution of the whole portfolio.
Note that when we add up 1000 independent, identically distributed exposures, we get 1000 times
the mean and 1000 times the variance for a single exposure.
In general, when we have N identical, independent exposures:
Mean aggregate loss = (N)(mean pure premium).
Variance of aggregate loss = (N)(variance of pure premium).
Derivation of the formula for the Process Variance of the Pure Premium:
The above formula for the process variance of the pure premium for independent frequency and
severity is a special case of the formula that also underlies analysis of variance:
Var(Y) = EX[VarY(Y|X)] + VarX(EY[Y|X]), where X and Y are any random variables.
Letting Y be the pure premium PP and X be the number of claims N in the above formula gives:
Var(PP) = EN[VarPP(PP|N)] + VarN(EPP[PP|N]) = EN[N σS²] + VarN(μS N) =
EN[N] σS² + μS² VarN(N) = μF σS² + μS² σF².
Where I have used the assumption that the frequency and severity are independent and the facts:
• For a fixed number of claims N, the variance of the pure premium is the variance
of the sum of N independent identically distributed variables each with variance σS².
(Since frequency and severity are assumed independent, σS² is the same for each value
of N.) Such variances add, so that VarPP(PP|N) = N σS².
• For a fixed number of claims N, for frequency and severity independent, the expected
value of the pure premium is N times the mean severity: EPP[PP|N] = μS N.
• Since with respect to N the variance of the severity acts as a constant:
EN[N σS²] = σS² EN[N] = μF σS².
• Since with respect to N the mean of the severity acts as a constant:
VarN(μS N) = μS² VarN(N) = μS² σF².

35 One can define the whole portfolio as one exposure; then the Aggregate Loss is mathematically just a special
case of the Pure Premium.


Let's apply this derivation to a previous example. You were given the following:
• For a given risk, the number of claims for a single exposure period is given by
a Binomial Distribution with q = 0.3 and m = 2.
• The size of the claim will be 50, with probability 80%, or 100, with probability 20%.
• Frequency and severity are independent.
There are only three possible values of N: N = 0, N = 1 or N = 2. If N = 0, then PP = 0.
If N = 1, then either PP = 50 with 80% chance or PP = 100 with 20% chance. If N = 2, then
PP = 100 with 64% chance, PP = 150 with 32% chance, or PP = 200 with 4% chance.
We then get:

    N    Probability   Mean PP    Square of Mean    Second Moment of   Var of PP
                       Given N    of PP Given N     PP Given N         Given N
    0       49%            0              0                 0                0
    1       42%           60          3,600             4,000              400
    2        9%          120         14,400            15,200              800
    Mean                  36          2,808                                240

For example, given two claims the second moment of the pure premium =
(64%)(100²) + (32%)(150²) + (4%)(200²) = 15,200.
Thus given two claims the variance of the pure premium is: 15,200 − 120² = 800.
Thus EN[VarPP(PP|N)] = 240, and VarN(EPP[PP|N]) = 2808 − 36² = 1512. Thus the variance of
the pure premium is EN[VarPP(PP|N)] + VarN(EPP[PP|N]) = 240 + 1512 = 1752, which
matches the result calculated above. The (total) process variance of the pure premium has been
split into two pieces. The first piece, calculated as 240, is the expected value over the possible
numbers of claims of the process variance of the pure premium for fixed N. The second piece,
calculated as 1512, is the variance over the possible numbers of claims of the mean pure
premium for fixed N.
Expected Value of the Process Variance:
In order to solve questions involving Greatest Accuracy/Buhlmann Credibility and Pure Premiums or
Aggregate Losses one has to compute the Expected Value of the Process Variance of the Pure
Premium or Aggregate Losses.36 This involves being able to compute the process variance for
each specific type of risk and then averaging over the different types of risks possible. This may
involve taking a weighted average or performing an integral.

36 See Mahler's Guide to Buhlmann Credibility.


Poisson Frequency:
Assume you are given the following:
• For a given risk, the number of claims for a single exposure period is Poisson with mean 7.
• The size of the claim will be 50, with probability 80%, or 100, with probability 20%.
• Frequency and severity are independent.
Exercise: Determine the variance of the pure premium for this risk.
[Solution: μF = σF² = 7. μS = 60. σS² = 400.
σPP² = μF σS² + μS² σF² = (7)(400) + (60²)(7) = 28,000.]
In the case of a Poisson frequency with independent frequency and severity, the formula for the
process variance of the pure premium simplifies. Since μF = σF²:
σPP² = μF σS² + μS² σF² = μF(σS² + μS²) = μF (2nd moment of the severity).
When there is a Poisson frequency, the variance of aggregate losses is:
λ (2nd moment of severity).
In the example above, the second moment of the severity is: (0.8)(50²) + (0.2)(100²) = 4000.
Thus σPP² = μF (2nd moment of the severity) = (7)(4000) = 28,000. If instead we have 20
independent exposures and take the sum of the losses, then the variance of these aggregate
losses is: (140)(4000) = (20)(28,000) = 560,000.
As another example, assume you are given the following:
• For a given risk, the number of claims for a single exposure period is Poisson with mean 3645.
• The severity distribution is LogNormal, with parameters μ = 5 and σ = 1.5.
• Frequency and severity are independent.
Exercise: Determine the variance of the pure premium for this risk.
[Solution: The second moment of the severity = exp(2μ + 2σ²) = exp(14.5) = 1,982,759.264.
Thus σPP² = μF (2nd moment of the severity) = (3645)(1,982,759) = 7.22716 x 10⁹.]
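As an unofficial check of the last exercise, using variance of aggregate losses = λ × (2nd moment of severity) for a Poisson frequency:

import math

lam, mu, sigma = 3645, 5, 1.5                     # Poisson mean and LogNormal parameters
second_moment = math.exp(2*mu + 2*sigma**2)       # exp(14.5), about 1,982,759
print(lam * second_moment)                        # about 7.227 x 10^9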


Normal Approximation:
For large numbers of expected claims, the observed pure premiums are approximately
Normally Distributed.37 For example, continuing the example above,
mean severity = exp(μ + 0.5σ²) = exp(6.125) = 457.14.
Thus the mean pure premium is (3645)(457.14) = 1,666,292.
One could ask what is the chance of the observed pure premium being between 1.4997 million and
1.8329 million.
Since the variance is 7.22716 x 10⁹, the standard deviation of the pure premium is 85,013.
Thus the probability of the observed pure premium being within ±10% of 1.6663 million is
approximately:
Φ[(1.8329 million − 1.6663 million)/85,013] − Φ[(1.4997 million − 1.6663 million)/85,013] =
Φ[1.96] − Φ[−1.96] = 0.975 − (1 − 0.975) = 95%.
Thus in this case, with an expected number of claims equal to 3645, there is about a 95% chance that
the observed pure premium will be within ±10% of the expected value. One could turn this around
and ask how many claims one would need in order to have a 95% chance that the observed pure
premium will be within ±10% of the expected value. The answer of 3645 claims could be taken as a
Standard for Full Credibility for the Pure Premium.38

37 The more skewed the severity distribution, the higher the expected frequency has to be for the Normal
Approximation to produce worthwhile results.
38 As discussed in the next section.
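As an unofficial aside, the Normal Approximation calculation above can be sketched in Python as follows (NormalDist is the standard library's Normal Distribution; the variable names are my own):

import math
from statistics import NormalDist

lam, mu, sigma = 3645, 5, 1.5
mean_sev = math.exp(mu + 0.5*sigma**2)                   # about 457.14
mean_agg = lam * mean_sev                                # about 1.666 million
sd_agg = math.sqrt(lam * math.exp(2*mu + 2*sigma**2))    # about 85,013

dist = NormalDist(mean_agg, sd_agg)
print(dist.cdf(1.1*mean_agg) - dist.cdf(0.9*mean_agg))   # about 0.95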


Policies of Different Types:
Let us assume we have a portfolio consisting of two types of policies:

    Type    Number of     Mean Aggregate     Variance of Aggregate
            Policies      Loss per Policy    Loss per Policy
     A         10               6                    3
     B         20               9                    4

Assuming the results of each policy are independent, then the mean aggregate loss for the portfolio
is: (10)(6) + (20)(9) = 240.
The variance of aggregate loss for the portfolio is: (10)(3) + (20)(4) = 110.
For independent policies, the means and variances add.
Note that as we have more policies, all other things being equal, the coefficient of variation goes
down.
Exercise: Compare the coefficient of variation of aggregate losses in the above example to that if
one had instead 100 policies of Type A and 200 policies of Type B.
[Solution: For the original example, CV = √110 / 240 = 0.0437.
For the new example, CV = √1100 / 2400 = 0.0138.]
Exercise: For each of the two cases in the previous exercise, using the Normal Approximation,
estimate the probability that the aggregate losses will be at least 5% more than their mean.
[Solution: For the original example, Prob[Agg > 252] ≅ 1 − Φ[(252 − 240)/√110] = 1 − Φ[1.144] =
12.6%. For the new example, Prob[Agg > 2520] ≅ 1 − Φ[(2520 − 2400)/√1100] =
1 − Φ[3.618] = 0.015%.]
For a larger portfolio, all else being equal, there is less chance of an extreme outcome in a given year,
measured as a percentage of the mean.
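As an unofficial sketch of the portfolio calculations above (my own illustration, using the standard library's NormalDist):

import math
from statistics import NormalDist

n_a, n_b = 10, 20                         # numbers of policies of Types A and B
mean = n_a*6 + n_b*9                      # 240
var = n_a*3 + n_b*4                       # 110

print(math.sqrt(var) / mean)              # CV, about 0.044
print(1 - NormalDist(mean, math.sqrt(var)).cdf(1.05*mean))   # about 0.126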


Problems:
Use the following information for the next five questions:
• The number of claims for a single year is Poisson with mean 6200.
• The severity distribution is LogNormal, with parameters μ = 5 and σ = 0.6.
• Frequency and severity are independent.

4.1 (1 point) Determine the expected annual aggregate losses.


A. Less than 0.8 million
B. At least 0.8 million but less than 0.9 million
C. At least 0.9 million but less than 1.0 million
D. At least 1.0 million but less than 1.1 million
E. At least 1.1 million
4.2 (2 points) Determine the variance of the annual aggregate losses.
A. Less than 270 million
B. At least 270 million but less than 275 million
C. At least 275 million but less than 280 million
D. At least 280 million but less than 285 million
E. At least 285 million
4.3 (2 points) Determine the chance that the observed annual aggregate losses will be more than
1.130 million. (Use the Normal Approximation.)
A. Less than 4%
B. At least 4%, but less than 5%
C. At least 5%, but less than 6%
D. At least 6%, but less than 7%
E. At least 7%
4.4 (2 points) Determine the chance that the observed annual aggregate losses will be less than
1.075 million. (Use the Normal Approximation.)
A. Less than 4%
B. At least 4%, but less than 5%
C. At least 5%, but less than 6%
D. At least 6%, but less than 7%
E. At least 7%
4.5 (1 point) Determine the chance that the observed annual aggregate losses will be within 2.5%
of its expected value. (Use the Normal Approximation.)
A. 86%
B. 88%
C. 90%
D. 92%
E. 94%


Use the following information for the next three questions:
• There are two types of risks.
• For each type of risk, the frequency and severity are independent.

    Type    Frequency Distribution    Severity Distribution
     1      Poisson: λ = 4%           Gamma: α = 3, θ = 10
     2      Poisson: λ = 6%           Gamma: α = 3, θ = 15

4.6 (1 point) Calculate the process variance of the pure premium for Type 1.
A. 48
B. 50
C. 52
D. 54
E. 56
4.7 (1 point) Calculate the process variance of the pure premium for Type 2.
A. 150
B. 156
C. 162
D. 168
E. 174
4.8 (1 point) Assume one has a portfolio made up of 80% risks of Type 1,
and 20% risks of Type 2.
For this portfolio, what is the expected value of the process variance of the pure premium?
A. 65
B. 67
C. 69
D. 71
E. 73
Use the following information for the next 3 questions:
• Number of claims for a single insured follows a Negative Binomial distribution,
with parameters r = 30 and β = 2/3.
• The amount of a single claim has a Gamma distribution with α = 4 and θ = 1000.
• Number of claims and claim severity distributions are independent.

4.9 (2 points) Determine EN[VARPP(PP | N)], the expected value over the number of possible
claims of the variance of the pure premium for a given number of claims.
A. 50 million B. 60 million C. 70 million D. 80 million E. 90 million
4.10 (2 points) Determine VARN(EPP[PP|N]), the variance over the number of claims of the
expected value of the pure premium for a given number of claims.
A. Less than 400 million
B. At least 400 million but less than 450 million
C. At least 450 million but less than 500 million
D. At least 500 million but less than 550 million
E. At least 550 million
4.11 (2 points) Determine the pure premium's process variance for a single insured.
A. 575 million B. 585 million C. 595 million D. 605 million E. 615 million


Use the following information for the next four questions:
• There are three types of risks.
• For each type of risk, the frequency and severity are independent.

    Type    Frequency Distribution                 Severity Distribution
     1      Binomial: m = 10, q = 0.3              Pareto: α = 3, θ = 500
     2      Poisson: λ = 5                         LogNormal: μ = 6, σ = 0.8
     3      Negative Binomial: r = 2.7, β = 7/3    Gamma: α = 2, θ = 250

4.12 (2 points) For a risk of Type 1, what is the process variance of the pure premium?
A. Less than 0.5 million
B. At least 0.5 million but less than 0.6 million
C. At least 0.6 million but less than 0.7 million
D. At least 0.7 million but less than 0.8 million
E. At least 0.8 million
4.13 (2 points) For a risk of Type 2, what is the process variance of the pure premium?
A. Less than 2.7 million
B. At least 2.7 million but less than 2.8 million
C. At least 2.8 million but less than 2.9 million
D. At least 2.9 million but less than 3.0 million
E. At least 3 million
4.14 (2 points) For a risk of Type 3, what is the process variance of the pure premium?
A. Less than 5.7 million
B. At least 5.7 million but less than 5.8 million
C. At least 5.8 million but less than 5.9 million
D. At least 5.9 million but less than 6.0 million
E. At least 6.0 million
4.15 (2 points) Assume one has a portfolio made up of 55% risks of Type 1, 35% risks of Type 2,
and 10% risks of Type 3.
For this portfolio, what is the expected value of the process variance of the pure premium?
A. Less than 1.7 million
B. At least 1.7 million but less than 1.8 million
C. At least 1.8 million but less than 1.9 million
D. At least 1.9 million but less than 2.0 million
E. At least 2.0 million


4.16 (4, 5/89, Q.35) (1 point) For a given risk situation, the frequency distribution follows the
Poisson process with mean 0.5. The second moment about the origin for the severity distribution is
1,000. Frequency and severity are independent of each other.
What is the process variance of the aggregate claim amount?
A. 500

B. (0.5)2

C. 1000
D. 0.5 √1000
E. Cannot be determined from the information given

4.17 (4, 5/90, Q.43) (2 points) Let N be a random variable for the claim count with:
Pr{N = 4} = 1/4
Pr{N = 5} = 1/2
Pr{N = 6} = 1/4
Let X be a random variable for claim severity with probability density function
f(x) = 3x^(-4), for 1 ≤ x < ∞.
Find the coefficient of variation, R, of the aggregate loss distribution, assuming that claim severity and
frequency are independent.
A. R < 0.35
B. 0.35 R < 0.50
C. 0.50 R < 0.65
D. 0.65 R < 0.70
E. 0.70 R
4.18 (4, 5/91, Q.26) (2 points)
The probability function of claims per year for an individual risk is Poisson with a mean of 0.10.
There are four types of claims.
The number of claims has a Poisson distribution for each type of claim.
The table below describes the characteristics of the four types of claims.

    Type of    Mean         ---------Severity---------
    Claim      Frequency      Mean        Variance
      W         0.02            200           2,500
      X         0.03          1,000       1,000,000
      Y         0.04            100               0
      Z         0.01          1,500       2,000,000
Calculate the variance of the pure premium.
A. Less than 70,000
B. At least 70,000 but less than 80,000
C. At least 80,000 but less than 90,000
D. At least 90,000 but less than 100,000
E. At least 100,000


4.19 (4B, 5/92, Q.31) (2 points)


You are given that N and X are independent random variables where:

N is the number of claims, and has a binomial distribution with parameters m = 3 and q = 1/6.

X is the size of claim and has the following distribution:


P[X=100] = 2/3 P[X=1100] = 1/6 P[X=2100] = 1/6
Determine the coefficient of variation of the aggregate loss distribution.
A. Less than 1.5
B. At least 1.5 but less than 2.5
C. At least 2.5 but less than 3.5
D. At least 3.5 but less than 4.5
E. At least 4.5
4.20 (5A, 5/94, Q.22) (1 point) The probability of a particular automobile's being in an accident in a
given time period is 0.05. The probability of more than one accident in the time period is zero. The
damage to the automobile is assumed to be uniformly distributed over the interval from 0 to 2000.
What is the variance of the pure premium?
A. Less than 40,000
B. At least 40,000, but less than 50,000
C. At least 50,000, but less than 60,000
D. At least 60,000, but less than 70,000
E. 70,000 or more
4.21 (5A, 5/94, Q.35) (2 points) Your company plans to sell a certain type of policy that is
expected to have a claim frequency per policy of 0.15, and a claim size distribution with a mean of
1200 and a standard deviation of 2000.
Management believes that 40,000 of these policies can be written this year.
Assume that for the portfolio of policies, the number of claims is Poisson distributed.
Assume that the premium for each policy is 105% of expected losses. Ignore expenses.
What is the amount of surplus that must be held for this portfolio such that the probability that the
surplus will be exhausted is .005?
4.22 (5A, 5/94, Q.39) (2 points) Your company plans to sell a certain policy but will not commit
any surplus to support it. You have determined that the policy will have a mean frequency per
policy of 0.045, and a claim size distribution with a mean of 750 and a second moment about the
origin of 60,000,000. The price that is suggested is 105% of expected losses.
Management will allow the policy to be written only if the probability that losses will exceed
premiums is less than 1%. Ignore expenses and assume that for the portfolio of policies, the
number of claims is Poisson distributed. What is the smallest number of policies that must be sold
in order to satisfy management's requirement?


4.23 (5A, 11/94, Q.22) (1 point) Assume S is a compound Poisson distribution of aggregate
claims with a Poisson parameter of 3. Individual claims are uniformly distributed over the integer values
from 1 to 6. What is the variance of S?
A. Less than 30
B. At least 30, but less than 40
C. At least 40, but less than 50
D. At least 50, but less than 60
E. Greater than or equal to 60
4.24 (5A, 11/94, Q.38) (3 points) Your company's automobile liability portfolio consists of three
tiers. You have determined that the aggregate claim distribution for each tier is compound Poisson,
characterized by the following:
                          Tier 1      Tier 2      Tier 3
Poisson parameter         2.3         3.0         1.9

Pr[X = xi | a claim has occurred]:
Claim Amount              Tier 1      Tier 2      Tier 3
x1 = 1,000                0.60        0.70        0.80
x2 = 5,000                0.30        0.20        0.15
x3 = 10,000               0.10        0.10        0.05
What are the mean and variance of the aggregate claim distribution for the entire automobile
portfolio?
4.25 (4B, 5/95, Q.14) (3 points) You are given the following:

For a given risk, the number of claims for a single exposure period will be 1,
with probability 3/4; or 2, with probability 1/4.

If only one claim is incurred, the size of the claim will be 80, with probability 2/3;
or 160, with probability 1/3.

If two claims are incurred, the size of each claim, independent of the other, will
be 80, with probability 1/2; or 160, with probability 1/2.
Determine the variance of the pure premium for this risk.
A. Less than 3,600
B. At least 3,600, but less than 4,300
C. At least 4,300, but less than 5,000
D. At least 5,000, but less than 5,700
E. At least 5,700


4.26 (5A, 5/95, Q.20) (1 point)


Assume S is compound Poisson with a mean number of claims = 4.
Individual claims will be of amounts 100, 200, and 500 with probabilities 0.4, 0.5, and 0.1,
respectively. What is the variance of S?
A. Less than 150,000
B. At least 150,000, but less than 175,000
C. At least 175,000, but less than 200,000
D. At least 200,000 but less than 225,000
E. Greater than or equal to 225,000
4.27 (4B, 5/96, Q.7) (3 points) You are given the following:

The number of claims follows a negative binomial distribution with mean 800
and variance 3,200.
Claim sizes follow a transformed gamma distribution with mean 3,000

and variance 36,000,000.


The number of claims and claim sizes are independent.

Using the Central Limit Theorem, determine the approximate probability that the aggregate losses
will exceed 3,000,000.
A. Less than 0.005
B. At least 0.005, but less than 0.01
C. At least 0.01, but less than 0.1
D. At least 0.1, but less than 0.5
E. At least 0.5


4.28 (4B, 5/96, Q.18) (2 points) Two dice, A and B, are used to determine the number of claims.
The faces of each die are marked with either a 1 or a 2, where 1 represents 1 claim and 2 represents
2 claims. The probabilities for each die are:
Die     Probability of 1 Claim     Probability of 2 Claims
A       2/3                        1/3
B       1/3                        2/3
In addition, there are two spinners, X and Y, which are used to determine claim size.
Each spinner has two areas marked 2 and 5. The probabilities for each spinner are:
Spinner     Probability that Claim Size = 2     Probability that Claim Size = 5
X           2/3                                 1/3
Y           1/3                                 2/3
For the first trial, a die is randomly selected from A and B and rolled. If 1 claim occurs, spinner X is
spun. If 2 claims occur, both spinner X and spinner Y are spun. For the second trial, the same die
selected in the first trial is rolled again. If 1 claim occurs, spinner X is spun. If 2 claims occur, both
spinner X and spinner Y are spun.
Determine the expected amount of total losses for the first trial.
A. Less than 4.8
B. At least 4.8, but less than 5.1
C. At least 5.1, but less than 5.4
D. At least 5.4, but less than 5.7
E. At least 5.7
4.29 (3 points) In the previous question, 4B, 5/96, Q.18, determine the variance of the distribution
of total losses for the first trial.
A. 4
B. 5
C. 6
D. 7
E. 8

4.30 (5A, 5/96, Q.37) (2.5 points) Given the following information regarding a single commercial
property exposure:
The probability of claim in a policy period is 0.2.
Each risk has at most one claim per period.
The distribution of individual claim amounts is LogNormal with parameters μ = 7.54 and σ = 1.14.
Assume that all exposures are independent and identically distributed.
Using the normal approximation, how many exposures must an insurer write to be 95% sure that the
total loss does not exceed twice the expected loss?


4.31 (5A, 11/96, Q.38) (2 points) A portfolio of insurance policies is assumed to follow a
compound Poisson claims process with 100 claims expected. The claim amount distribution is
assumed to have an expected value of 1,000 and variance of 1,000,000. These insureds would
like to self insure their risk provided that there is no more than a 5% chance of insolvency in the first
year. If the premium equals the expected loss, and if there are no other risks, then how much capital
must the insureds possess in order to meet their solvency requirement?
4.32 (5A, 11/98, Q.22) (1 point) Assume S is compound Poisson with mean number of claims
(N) equal to 3. Individual claim amounts follow a distribution with E[X] = 560 and Var[X] = 194,400.
What is the variance of S?
A. Less than 1,500,000
B. At least 1,500,000, but less than 1,750,000
C. At least 1,750,000, but less than 2,000,000
D. At least 2,000,000, but less than 2,250,000
E. At least 2,250,000
4.33 (5A, 11/98, Q.36) (2 points) Assume the following:
i. S = X1 + X2 + X3 +...+ XN where X1 , X2 , X3 ,... XN are identically distributed
and N, X1 , X2 , X3 , . . ., XN are mutually independent random variables.
ii. N follows a Poisson distribution with λ = 4.
iii. Expected value of the variance of S given N, E[Var(S | N)] = 1,344.
iv. Var(N) [E(X)]² = 4,096.
Calculate E[X²].
4.34 (5A, 5/99, Q.23) (1 point) Let S be the aggregate amount of claims. The number of claims,
N, has the following probability function: Pr(N=0) = 0.25, Pr(N = 1) = 0.25, and Pr(N=2) = 0.50.
Each claim size is independent and is uniformly distributed over the interval (2, 6).
The number of claims and the claim sizes are mutually independent. What is Var(S)?
A. Less than 6
B. At least 6, but less than 9
C. At least 9, but less than 12
D. At least 12, but less than 15
E. At least 15
4.35 (5A, 5/99, Q.36) (2 points) In a given time period, the probability that a particular automobile
insurance policyholder will have a physical damage claim is 0.05.
Assume that the policyholder can have at most one claim during the given time period.
If a physical damage claim is made, the cost of the damages is uniformly distributed over the interval
(0, 5000). Calculate the mean and variance of aggregate policy losses within the given time period.


Use the following information for the next two questions:


The number of claims per year follows a Poisson distribution with mean 300.
Claim sizes follow a Generalized Pareto distribution, as per Loss Models,
with parameters θ = 1,000, α = 3, and τ = 2.
The nth moment of a Generalized Pareto Distribution is:
E[X^n] = θ^n Γ(τ + n) Γ(α - n) / {Γ(α) Γ(τ)}, for α > n.
The number of claims and claim sizes are independent.
4.36 (4B, 11/99, Q.12) (2 points) Using the Normal Approximation, determine the probability that
annual aggregate losses will exceed 360,000.
A. Less than 0.01
B. At least 0.01, but less than 0.03
C. At least 0.03, but less than 0.05
D. At least 0.05, but less than 0.07
E. At least 0.07
4.37 (4B, 11/99, Q.13) (2 points) After a number of years, the number of claims per year still
follows a Poisson distribution, but the expected number of claims per year has been cut in half.
Claim sizes have increased uniformly by a factor of two. Using the Normal Approximation,
determine the probability that annual aggregate losses will exceed 360,000.
A. Less than 0.01
B. At least 0.01, but less than 0.03
C. At least 0.03, but less than 0.05
D. At least 0.05, but less than 0.07
E. At least 0.07
4.38 (Course 151 Sample Exam #1, Q.4) (0.8 points) For an insurance portfolio:
(i) the number of claims has the probability distribution
n       p(n)
0       0.4
1       0.3
2       0.2
3       0.1
(ii) each claim amount has a Poisson distribution with mean 4
(iii) the number of claims and claim amounts are mutually independent.
Determine the variance of aggregate claims.
(A) 8
(B) 12
(C) 16
(D) 20
(E) 24


4.39 (Course 151 Sample Exam #2, Q.4) (0.8 points)


You are given S = S1 + S2 , where S1 and S2 are independent and have compound Poisson
distributions with the following characteristics:
(i) λ1 = 2 and λ2 = 3
(ii)
x       p1(x)       p2(x)
1       0.6         0.1
2       0.4         0.3
3       0.0         0.5
4       0.0         0.1
Determine the variance of S.
(A) 15.1
(B) 18.6
(C) 22.1

(D) 26.6

(E) 30.1

4.40 (Course 151 Sample Exam #3, Q.1) (0.8 points)


For a portfolio of insurance, you are given the distribution of number of claims:
n       Pr(N=n)
0       0.40
5       0.10
10      0.50
and the distribution of the claim amounts:
x       p(x)
1       0.90
2       0.10
Individual claim amounts and the number of claims are mutually independent.
Determine the variance of aggregate claims.
(A) 22.3
(B) 24.1
(C) 25.0
(D) 26.9
(E) 27.4
4.41 (Course 151 Sample Exam #3, Q.13) (1.7 points) You are given:

The number of claims is given by a mixed Poisson with an Inverse Gaussian
mixing distribution, with μ = 500 and θ = 5000.

The number and amount of claims are independent.


The mean aggregate loss is 1000.

The variance of aggregate losses is 150,000.


Determine the variance of the claim amount distribution.
(A) 88
(B) 92
(C) 96
(D) 100
(E) 104


4.42 (4, 11/02, Q.36 & 2009 Sample Q. 53) (2.5 points) You are given:
Number of Claims    Probability    Claim Size    Probability
0                   1/5
1                   3/5            25            1/3
                                   150           2/3
2                   1/5            50            2/3
                                   200           1/3

Claim sizes are independent.


Determine the variance of the aggregate loss.
(A) 4,050    (B) 8,100    (C) 10,500    (D) 12,510    (E) 15,612


Solutions to Problems:
4.1. E. The mean severity = exp(μ + 0.5σ²) = exp(5.18) = 177.6828.
Thus the mean aggregate losses are: (6200)(177.6828) = 1,101,633.
4.2. D. The second moment of the severity = exp(2μ + 2σ²) = exp(10.72) = 45,252.
Thus since the frequency is Poisson and independent of the severity:
σPP² = λ(2nd moment of the severity) = (6200)(45,252) = 280.56 million.
4.3. B. Since the variance is 280.56 million, the standard deviation of the aggregate losses is
16,750. Thus the probability of the observed aggregate losses being more than 1130 thousand is
approximately: 1 - Φ[(1130 - 1101.63)/16.75] = 1 - Φ[1.69] = 1 - 0.9545 = 4.55%.
4.4. C. Prob[aggregate losses < 1075 thousand] ≅ Φ[(1075 - 1101.63)/16.75] =
Φ(-1.59) = 1 - 0.9441 = 5.59%.
4.5. C. Using the solutions to the prior two questions: 1 - 4.55% - 5.59% = 89.9%.
Comment: If one were asked for the Full Credibility criterion for Aggregate Losses corresponding to
a 90% chance of being within 2.5% of the expected aggregate losses, in the case of a Poisson
frequency, as explained in the next section the answer would be:
(y/k)²(1 + CV²) = (1.645/0.025)² exp(σ²) = 4330(1.4333) = 6206 claims. Note that for the
LogNormal Distribution: 1 + CV² = exp(σ²) = exp(0.36) = 1.4333. That is just another way of
saying there is about a 90% chance of being within 2.5% of the expected aggregate losses when
one has about 6200 expected claims.
4.6. A. μf = σf² = λ = 0.04. μs = αθ = 30. σs² = αθ² = 300.
σPP² = μf σs² + μs² σf² = (0.04)(300) + (30²)(0.04) = 48.
4.7. C. σPP² = λ(second moment of severity) = (0.06){α(α + 1)θ²} = (0.06)(3)(4)(15²) = 162.
4.8. D. EPV = (80%)(48) + (20%)(162) = 70.8.


4.9. D. For a fixed number of claims N, the variance of the pure premium is the variance of the sum
of N independent identically distributed variables each with variance σS².
(Since frequency and severity are assumed independent, σS² is the same for each value of N.)
Such variances add, so that VAR[PP | N] = N σS².
EN[VAR[PP | N]] = EN[N σS²] = σS² EN[N] = σS² μF.
For the Negative Binomial Distribution: mean = rβ = (30)(2/3) = 20.
For the Gamma the variance = αθ² = 4(1000²) = 4,000,000.
Thus EN[VAR[PP | N]] = μF σS² = (20)(4 million) = 80 million.
4.10. D. For a fixed number of claims N, with frequency and severity independent, the expected
value of the pure premium is N times the mean severity: E[PP | N] = μS N.
VARN(E[PP | N]) = VARN(μS N) = μS² VARN(N) = μS² σF².
For the Negative Binomial: variance = rβ(1 + β) = (30)(2/3)(5/3) = 33.33.
For the Gamma the mean is αθ = 4(1000) = 4000.
Therefore, VARN(E[PP | N]) = μS² σF² = (4000²)(33.33) = 533.3 million.
4.11. E. For the Negative Binomial Distribution: mean = 20, variance = 33.33.
For the Gamma: mean = 4000, variance = 4,000,000. Thus σPP² = μF σS² + μS² σF² =
(20)(4 million) + (4000²)(33.33) = 80 million + 533.3 million = 613.3 million.
Comment: Note that the process variance is also the sum of the answers to the two previous
questions: 80 million + 533.3 million = 613.3 million. This is the analysis of variance that is used in the
derivation of the formula used to solve this problem.
4.12. C. For the Binomial frequency: mean = mq = 3, variance = mq(1 - q) = (10)(0.3)(0.7) = 2.1.
For the Pareto severity: mean = θ/(α - 1) = 500/2 = 250,
variance = αθ²/{(α - 1)²(α - 2)} = (3)(500²)/{(3 - 1)²(3 - 2)} = 187,500.
Since the frequency and severity are independent:
σPP² = μF σS² + μS² σF² = (3)(187,500) + (250²)(2.1) = 693,750.
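As a quick numerical check of the formula σPP² = μF σS² + μS² σF², here is a minimal Python sketch (illustrative only, not part of the exam solution; the moments are those of this problem):

def pure_premium_variance(mean_freq, var_freq, mean_sev, var_sev):
    # sigma_PP^2 = (mean freq)(variance of severity) + (mean severity)^2 (variance of freq)
    return mean_freq * var_sev + mean_sev ** 2 * var_freq

# Binomial(m=10, q=0.3) frequency and Pareto(alpha=3, theta=500) severity, as in 4.12:
print(round(pure_premium_variance(3.0, 2.1, 250.0, 187500.0)))  # 693750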


4.13. D. For the Poisson frequency: mean = variance = λ = 5.
For the LogNormal severity: Mean = exp(μ + 0.5σ²) = exp[6 + (0.5)(0.8²)] = 555.573,
Variance = exp(2μ + σ²){exp(σ²) - 1} = exp[2(6) + 0.8²](exp[0.8²] - 1) =
(308,661.3)(1.89648 - 1) = 276,709.
Since the frequency and severity are independent:
σPP² = μF σS² + μS² σF² = (5)(276,709) + (555.573²)(5) = 2,926,852.
Alternately, since the frequency is Poisson and the frequency and severity are independent:
σPP² = λ(2nd moment of the severity).
The 2nd moment of a LogNormal Distribution is:
exp(2μ + 2σ²) = exp[2(6) + 2(0.8²)] = exp(13.28) = 585,370.3. Therefore,
σPP² = λ(2nd moment of the severity) = (5)(585,370.3) = 2,926,852.


4.14. E. For the Negative Binomial frequency: mean = rβ = (2.7)(7/3) = 6.3,
variance = rβ(1 + β) = (2.7)(7/3)(10/3) = 21.
For the Gamma severity: mean = αθ = 2(250) = 500, variance = αθ² = 2(250²) = 125,000.
Since the frequency and severity are independent:
σPP² = μF σS² + μS² σF² = (6.3)(125,000) + (500²)(21) = 6,037,500.


4.15. E. (55%)(693,750) + (35%)(2,926,852) + (10%) (6,037,500) = 2,009,711.
4.16. A. For a Poisson frequency, σPP² = λ(2nd moment of the severity) = (0.5)(1000) = 500.


4.17. A. The mean frequency = (1/4)(4) + (1/2)(5) + (1/4)(6) = 5. 2nd moment of frequency =
(1/4)(4²) + (1/2)(5²) + (1/4)(6²) = 25.5. The variance of the frequency = 25.5 - 5² = 0.5.
mean severity = ∫ x f(x) dx = ∫ x (3/x⁴) dx = -(3/2)x⁻², evaluated from x = 1 to x = ∞, = 3/2.
second moment = ∫ x² f(x) dx = ∫ x² (3/x⁴) dx = -3x⁻¹, evaluated from x = 1 to x = ∞, = 3.
Thus the variance of the severity is: 3 - (3/2)² = 3/4.
For independent frequency and severity, the variance of the pure premiums =
(mean frequency)(variance of severity) + (mean severity)²(variance of frequency) =
(5)(3/4) + (3/2)²(0.5) = 4.875.
The mean of the pure premium is (mean frequency)(mean severity) = (5)(3/2) = 7.5.
The coefficient of variation of the pure premium =
√(variance of P.P.) / (mean of P.P.) = √4.875 / 7.5 = 0.294.
Comment: The severity distribution is a Single Parameter Pareto with θ = 1 and α = 3.
The mean = αθ/(α - 1) = 3/2. The variance = αθ²/{(α - 1)²(α - 2)} = 3/4.

4.18. E. Since we have a Poisson Frequency, the Process Variance for each type of claim is given
by the mean frequency times the second moment of the severity.
For example, for Claim Type Z, the process variance of the pure premium is:
(0.01)(2,250,000 + 2,000,000) = 42,500.
Then the process variances for each type of claim add to get the total variance, 103,750.
Type of     Mean         Mean        Square of        Variance of     Process Variance
Claim       Frequency    Severity    Mean Severity    Severity        of P.P.
W           0.02         200         40,000           2,500           850
X           0.03         1,000       1,000,000        1,000,000       60,000
Y           0.04         100         10,000           0               400
Z           0.01         1,500       2,250,000        2,000,000       42,500
SUM                                                                   103,750

Comment: This is like adding up four independent die rolls; the variances add. For example this
could be a nonrealistic model of homeowners insurance with the four types of claims being: Fire,
Liability, Theft and Windstorm.
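The column arithmetic above can be reproduced with a minimal Python sketch (illustrative only; the frequencies and severity moments are those of the table):

# Each independent compound Poisson claim type contributes
# lambda * (2nd moment of severity) = lambda * (mean^2 + variance) to the variance.
claim_types = {  # type: (mean frequency, mean severity, variance of severity)
    "W": (0.02, 200, 2500),
    "X": (0.03, 1000, 1000000),
    "Y": (0.04, 100, 0),
    "Z": (0.01, 1500, 2000000),
}
total_variance = sum(lam * (mean ** 2 + var) for lam, mean, var in claim_types.values())
print(round(total_variance))  # 103750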


4.19. B. The mean frequency is mq = 1/2, while the variance of the frequency is
mq(1 - q) = (3)(1/6)(5/6) = 5/12.
The mean severity is: (2/3)(100) + (1/6)(1100) + (1/6)(2100) = 600.
The second moment of the severity is: (2/3)(100²) + (1/6)(1100²) + (1/6)(2100²) = 943,333.
Thus the variance of the severity is: 943,333 - 600² = 583,333.
The variance of the pure premium = (variance of frequency)(mean severity)² + (variance of
severity)(mean frequency) = (5/12)(600²) + (1/2)(583,333) = 441,667.
The mean pure premium is (1/2)(600) = 300. Therefore, the coefficient of variation is:
standard deviation / mean = √441,667 / 300 = 2.2.
4.20. D. The variance of the frequency is: (0.05)(0.95) = 0.0475. The mean damage is: 1000.
The variance of the damage is: (2000 - 0)²/12 = 333,333.
The variance of the pure premium = (1000²)(0.0475) + (0.05)(333,333) = 64,167.
4.21. The mean aggregate loss is: (40000)(0.15)(1200) = 7.2 million.
Since frequency is Poisson with mean: (40000)(0.15) = 6000, the variance of aggregate losses is:
(mean frequency)(2nd moment of severity) = (6000)(2000² + 1200²) = 32,640 million.
The standard deviation of aggregate losses is: √(32,640 million) = 180,665.
Premium = (1.05)(expected losses) = (1.05)(7.2 million) = 7.56 million.
We want: Premiums + Surplus ≥ Actual Losses.
Surplus ≥ Actual Losses - Premiums = Actual Losses - 7.56 million.
Φ(2.576) = 0.995; the 99.5th percentile of the Standard Normal Distribution is 2.576.
Therefore, 99.5% of the time actual losses are less than or equal to:
7.2 million + (2.576)(180,665) = 7.665 million.
Therefore, we want surplus of at least: 7.665 million - 7.56 million = 105 thousand.
Comment: 100% - 99.5% = 0.5% of the time, actual losses will be greater than
7.665 million, and a surplus of 105 thousand would be exhausted.
4.22. Let N be the number of policies written.
The mean aggregate loss = N(0.045)(750) and the variance of aggregate losses =
N(0.045)(60,000,000). Thus premiums are: 1.05N(0.045)(750).
The 99th percentile of the Unit Normal Distribution is 2.326. Thus we want:
Premiums - Expected Losses = 2.326(standard deviation of aggregate losses).
(0.05)N(0.045)(750) = 2.326 √{N(0.045)(60,000,000)}.
Therefore, N = (60,000,000/750²)(2.326/0.05)²/0.045 = 5,129,743.


4.23. C. Second moment of the severity = (1² + 2² + 3² + 4² + 5² + 6²)/6 = 15.167.
Since the frequency is Poisson, the variance of aggregate losses =
(mean frequency)(second moment of the severity) = (3)(15.167) = 45.5.
4.24. For each tier, the mean aggregate loss = (mean frequency)(mean severity) and since
frequency is Poisson, the variance of aggregate loss = (mean frequency)(second moment of the
severity). The means and variances of the tiers add to get an overall mean of: 19,125,
and an overall variance of: 106,875,000.
                            Tier 1         Tier 2         Tier 3
Pr[Claim Amount = 1,000]    0.6            0.7            0.8
Pr[Claim Amount = 5,000]    0.3            0.2            0.15
Pr[Claim Amount = 10,000]   0.1            0.1            0.05
Mean Severity               3,100          2,700          2,050
2nd Moment of Severity      18,100,000     15,700,000     9,550,000
Poisson Parameter           2.3            3.0            1.9
Mean Aggregate              7,130          8,100          3,895
Variance of Aggregate       41,630,000     47,100,000     18,145,000

Overall: Mean Aggregate = 19,125.   Variance of Aggregate = 106,875,000.

Alternately, the tiers, each of which is compound Poisson, add to get a new compound
Poisson with mean frequency: 2.3 + 3 + 1.9 = 7.2. The mean severity overall is a weighted
average of the means for the individual tiers:
{(2.3)(3100) + (3)(2700) + (1.9)(2050)}/7.2 = 2656.25.
Thus the mean aggregate loss is: (2656.25)(7.2) = 19,125.
The second moment of the severity overall is a weighted average of the second moments for the
individual tiers:
{(2.3)(18.1 million) + (3)(15.7 million) + (1.9)(9.55 million)}/7.2 = 14.844 million.
Thus the variance of aggregate losses is: (7.2)(14.844 million) = 106.9 million.
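A minimal Python sketch of this alternate approach (illustrative only; the Poisson parameters and severity distributions are those of the problem):

tiers = [  # (Poisson parameter, {claim amount: probability given a claim})
    (2.3, {1000: 0.60, 5000: 0.30, 10000: 0.10}),
    (3.0, {1000: 0.70, 5000: 0.20, 10000: 0.10}),
    (1.9, {1000: 0.80, 5000: 0.15, 10000: 0.05}),
]
lam_total = sum(lam for lam, _ in tiers)
# Weighted averages of the per-tier severity moments, with the Poisson parameters as weights:
mean_sev = sum(lam * sum(x * p for x, p in sev.items()) for lam, sev in tiers) / lam_total
second_moment = sum(lam * sum(x ** 2 * p for x, p in sev.items()) for lam, sev in tiers) / lam_total
print(round(lam_total * mean_sev))       # 19125, the mean aggregate loss
print(round(lam_total * second_moment))  # 106875000, the variance of aggregate losses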
4.25. D. For example, the chance of 2 claims of size 80 each is the chance of having two claims
times the chance given two claims that they will each be 80: (1/4)(1/2)² = 1/16. In that case the
pure premium is 80 + 80 = 160. One takes the weighted average over all the possibilities. The
average Pure Premium is 140. The second moment of the Pure Premium is 24800. Therefore, the
variance = 24800 - 1402 = 5200.
Situation                           Probability    Pure Premium    Square of P.P.
1 claim @ 80                        0.5000         80              6,400
1 claim @ 160                       0.2500         160             25,600
2 claims @ 80 each                  0.0625         160             25,600
2 claims: 1 @ 80 & 1 @ 160          0.1250         240             57,600
2 claims @ 160 each                 0.0625         320             102,400
Overall                             1.0000         140             24,800

Comment: Note that the frequency and severity are not independent.
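Because frequency and severity are not independent here, one can simply enumerate the joint outcomes; a minimal Python sketch (illustrative only, using the probabilities and claim sizes of this problem):

from itertools import product

outcomes = {}  # pure premium -> probability
# One claim (probability 3/4): size 80 with probability 2/3, size 160 with probability 1/3.
for size, p in [(80, 2 / 3), (160, 1 / 3)]:
    outcomes[size] = outcomes.get(size, 0) + 0.75 * p
# Two claims (probability 1/4): each claim 80 or 160 with probability 1/2, independently.
for (s1, p1), (s2, p2) in product([(80, 0.5), (160, 0.5)], repeat=2):
    outcomes[s1 + s2] = outcomes.get(s1 + s2, 0) + 0.25 * p1 * p2

mean = sum(pp * p for pp, p in outcomes.items())
second = sum(pp ** 2 * p for pp, p in outcomes.items())
print(round(mean), round(second - mean ** 2))  # 140 and 5200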


4.26. C. Since the frequency is Poisson, the variance of aggregate losses =
(mean frequency)(second moment of the severity) =
(4){(0.4)(100²) + (0.5)(200²) + (0.1)(500²)} = 196,000.
4.27. B. The mean pure premium is (3000)(800) = 2.4 million. Since frequency and severity are
independent, the (process) variance of the aggregate losses is: μf σs² + μs² σf² =
(800)(36 million) + (3000²)(3200) = 57.6 billion.
Thus the standard deviation of the pure premiums is: √(57.6 billion) = 240,000.
To apply the Normal Approximation we subtract the mean and divide by the standard deviation.
The probability that the total losses will exceed 3 million is approximately:
1 - Φ[(3 million - 2.4 million)/240,000] = 1 - Φ(2.5) = 1 - 0.9938 = 0.0062.
Comment: One makes no specific use of the information that the frequency is given by a Negative
Binomial, nor that the severity is given by a Transformed Gamma Distribution.
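A minimal Python sketch of this Normal Approximation calculation (illustrative only; the moments are those given above):

from math import erf, sqrt

def normal_cdf(z):
    # Standard Normal distribution function, via the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))

mean_freq, var_freq = 800, 3200
mean_sev, var_sev = 3000, 36e6
mean_agg = mean_freq * mean_sev                           # 2.4 million
var_agg = mean_freq * var_sev + mean_sev ** 2 * var_freq  # 57.6 billion
z = (3e6 - mean_agg) / sqrt(var_agg)                      # 2.5
print(1 - normal_cdf(z))                                  # about 0.0062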
4.28. B. Since Die A and Die B are equally likely, the chance of 1 claim is: (1/2)(2/3) + (1/2)(1/3) =
1/2, while the chance of 2 claims is: (1/2)(1/3) + (1/2)(2/3) = 1/2.
The mean of Spinner X is: (2/3)(2) + (1/3)(5) = 3,
while the mean of Spinner Y is: (1/3)(2) + (2/3)(5) = 4.
If we have one claim the mean loss is E[X] = 3. If we have two claims, then the mean loss is:
E[X+Y] = E[X] + E[Y] = 3 + 4 = 7. The overall mean pure premium is:
(chance of 1 claim)(mean loss if 1 claim) + (chance of 2 claims)(mean loss if 2 claims)
= (1/2)(3) + (1/2)(7) = 5.
Comment: In this problem frequency and severity are not independent.
The next question on this exam used the same setup and asked one to perform Bayes Analysis;
See section 13 of my guide to Buhlmann Credibility and Bayes Analysis.


4.29. D. Since Die A and Die B are equally likely, the chance of 1 claim is: (1/2)(2/3) + (1/2)(1/3) =
1/2, while the chance of 2 claims is: (1/2)(1/3) + (1/2)(2/3) = 1/2.
If we have one claim, then spinner X is spun and the loss is either:
2 with probability 2/3 or 5 with probability 1/3.
If we have 2 claims, then spinners X and Y are spun and the loss is either:
4 with probability 2/9, 7 with probability 5/9, or 10 with probability 2/9.
Thus the distribution of losses is:
2 @ 1/3, 5 @ 1/6, 4 @ 1/9, 7 @ 5/18, and 10 @ 1/9.
Mean loss is: (2)(1/3) + (5)(1/6) + (4)(1/9) + (7)(5/18) + (10)(1/9) = 5.
Second moment is: (2²)(1/3) + (5²)(1/6) + (4²)(1/9) + (7²)(5/18) + (10²)(1/9) = 32.
Variance = 32 - 5² = 7.
Alternately, this is a 50-50 mixture of two situations: one claim or two claims.
The mean of Spinner X is: (2/3)(2) + (1/3)(5) = 3.
The variance of Spinner X is: (2/3)(2 - 3)² + (1/3)(5 - 3)² = 2.
The mean of Spinner Y is: (1/3)(2) + (2/3)(5) = 4.
The variance of Spinner Y is: (1/3)(2 - 4)² + (2/3)(5 - 4)² = 2.
If we have one claim the mean loss is E[X] = 3.
If we have two claims, then the mean loss is: E[X+Y] = E[X] + E[Y] = 3 + 4 = 7.
The overall mean is: (1/2)(3) + (1/2)(7) = 5.
If we have one claim the second moment is from spinner X: 2 + 3² = 11.
If we have two claims the variance is the sum of those for X and Y: 2 + 2 = 4.
Thus if we have two claims the second moment is: 4 + 7² = 53.
Thus the second moment of the mixture is: (1/2)(11) + (1/2)(53) = 32.
Therefore, the variance of the mixture is: 32 - 5² = 7.
Alternately, take the two types as 1 or 2 claims, equally likely.
The hypothetical means for 1 and 2 claims are: 3 and 7.
Therefore, the variance of the hypothetical means is: (1/2)(3 - 5)² + (1/2)(7 - 5)² = 4.
When there is one claim, the process variance is that of spinner X: 2.
When there are 2 claims, the process variance is the sum of those for spinners X and Y: 2 + 2 = 4.
Expected Value of the process variance is: (1/2)(2) + (1/2)(4) = 3.
Total variance is: EPV + VHM = 3 + 4 = 7.


Alternately, take the two types as Die A and B, equally likely.


The hypothetical mean if Die A is: (2/3)(3) + (1/3)(7) = 13/3.
The hypothetical mean if Die B is: (1/3)(3) + (2/3)(7) = 17/3.
Therefore, the variance of the hypothetical means is: (1/2)(13/3 - 5)² + (1/2)(17/3 - 5)² = 4/9.
When there is one claim, the second moment of the pure premium is: 2 + 3² = 11.
When there are two claims, the second moment of the pure premium is: 4 + 7² = 53.
Therefore, if one has die A, the second moment of the pure premium is: (2/3)(11) + (1/3)(53) = 25.
Thus the process variance for die A is: 25 - (13/3)² = 56/9.
Therefore, if one has die B, the second moment of the pure premium is: (1/3)(11) + (2/3)(53) = 39.
Thus the process variance for die B is: 39 - (17/3)² = 62/9.
Expected Value of the process variance is: (1/2)(56/9) + (1/2)(62/9) = 59/9.
Total variance is: EPV + VHM = 59/9 + 4/9 = 7.
Comment: In this problem frequency and severity are not independent.
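The conditional variance decomposition (total variance = EPV + VHM) used in the alternatives above can be checked with a minimal Python sketch (illustrative only), conditioning on the number of claims:

probs = {1: 0.5, 2: 0.5}          # number of claims: 1 or 2, equally likely
hyp_means = {1: 3.0, 2: 7.0}      # mean loss given the number of claims
process_vars = {1: 2.0, 2: 4.0}   # variance of the loss given the number of claims

overall_mean = sum(probs[n] * hyp_means[n] for n in probs)
epv = sum(probs[n] * process_vars[n] for n in probs)
vhm = sum(probs[n] * (hyp_means[n] - overall_mean) ** 2 for n in probs)
print(epv, vhm, epv + vhm)        # 3.0 4.0 7.0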
4.30. For the LogNormal Distribution, E[X] = exp[μ + σ²/2] = exp[7.54 + 1.14²/2] = 3604.
E[X²] = exp[2μ + 2σ²] = exp[(2)(7.54) + (2)(1.14²)] = 47,640,795.
Var[X] = 47,640,795 - 3604² = 34,651,979.
Mean aggregate loss (per exposure) = (0.2)(3604) = 721.
Variance of Aggregate Losses (per exposure) is:
(0.2)(34,651,979) + (0.2)(0.8)(3604²) = 9,008,606.
Thus if we write N exposures, the mean loss is 721N,
and the standard deviation of the aggregate loss is 3001√N.
We want N such that Prob[Aggregate Loss > (2)(mean)] ≤ 5%.
Prob[(Aggregate Loss - mean) > 721N] ≤ 5%.
Using the Normal Approximation, we want: (1.645)(standard deviation) < 721N.
1.645(3001√N) < 721N. ⇒ N > 46.9.
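A minimal Python sketch of this calculation (illustrative only; the LogNormal parameters and claim probability are those of the problem):

from math import exp, sqrt

mu, sigma, p_claim = 7.54, 1.14, 0.2
mean_sev = exp(mu + sigma ** 2 / 2)
var_sev = exp(2 * mu + 2 * sigma ** 2) - mean_sev ** 2
mean_per_exposure = p_claim * mean_sev
var_per_exposure = p_claim * var_sev + p_claim * (1 - p_claim) * mean_sev ** 2

# Require 1.645 * sqrt(N * var_per_exposure) <= N * mean_per_exposure:
n_required = (1.645 * sqrt(var_per_exposure) / mean_per_exposure) ** 2
print(n_required)  # about 46.9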


4.31. The variance of aggregate losses is: (100)(1,000,000 + 1000²) = 200,000,000.
The 95th percentile of the aggregate losses exceeds the mean by about 1.645 standard
deviations: (1.645)√200,000,000 = 23,264.
With at least this much capital, there is no more than a 5% chance of insolvency in the first year.
4.32. B. For a compound Poisson, variance of aggregate losses =
(mean frequency)(second moment of severity) = (3)(194,400 + 560²) = 1,524,000.


4.33. 1344 = EN[Var(S | N)] = EN[Var(X1 + X2 + ... + XN)] = EN[N Var[X]] = E[N] Var[X] =
4 Var[X]. ⇒ Var[X] = 1344/4 = 336. 4,096 = Var(N)[E(X)]² = 4[E(X)]². ⇒ [E(X)]² = 1024.
E[X²] = Var[X] + [E(X)]² = 336 + 1024 = 1360.
Comment: E[Var(S | N)] given in the question is the expected value over N of the variance of the
aggregate losses conditional on N.
4.34. D. Mean Frequency = (0.25)(0) + (0.25)(1) + (0.50)(2) = 1.25.
Second Moment of the Frequency = (0.25)(0²) + (0.25)(1²) + (0.50)(2²) = 2.25.
Variance of the Frequency = 2.25 - 1.25² = 0.6875.
Mean Severity = (2 + 6)/2 = 4. Variance of the severity = (6 - 2)²/12 = 4/3.
Variance of the Aggregate Losses = (4²)(0.6875) + (4/3)(1.25) = 12.67.
4.35. Mean frequency = 0.05. Variance of frequency = (0.05)(0.95) = 0.0475.
Mean severity = (0 + 5000)/2 = 2500. Variance of severity = (5000 - 0)²/12 = 2,083,333.
Mean aggregate loss = (0.05)(2500) = 125.
Variance of aggregate losses = (0.05)(2,083,333) + (2500²)(0.0475) = 401,042.
4.36. B. The mean and variance of the frequency is 300.
The mean of the Generalized Pareto severity is: θτ/(α - 1) = (1000)(2)/(3 - 1) = 1000.
The 2nd moment of the Generalized Pareto severity is:
θ²τ(τ + 1)/{(α - 1)(α - 2)} = (1000²)(2)(3)/{(3 - 1)(3 - 2)} = 3 million.
Mean aggregate losses = (300)(1000) = 300,000.
Variance of Aggregate Losses = (mean of Poisson)(2nd moment of severity) =
(300)(3 million) = 900 million.
Standard Deviation of Aggregate Losses = 30,000.
Using the Normal Approximation, the chance that the aggregate losses are greater than 360,000 is
approximately: 1 - Φ[(360,000 - 300,000)/30,000] = 1 - Φ[2] = 1 - 0.9772 = 0.0228.


4.37. E. The mean and variance of the frequency is 150.
The mean of the Generalized Pareto severity is twice what it was, or 2000.
The 2nd moment of the Generalized Pareto severity is four times what it was, or 12 million.
Mean aggregate losses = (150)(2000) = 300,000.
Variance of Aggregate Losses = (mean of Poisson)(2nd moment of severity) =
(150)(12 million) = 1800 million.
Standard Deviation of Aggregate Losses = 42,426.
Using the Normal Approximation, the chance that the aggregate losses are greater than 360,000 is
approximately: 1 - Φ[(360,000 - 300,000)/42,426] = 1 - Φ[1.41] = 1 - 0.9207 = 0.0793.
Comment: The second moment is always multiplied by the square of the inflation factor under
uniform inflation. Alternately, one can instead use the behavior under uniform inflation of the
Generalized Pareto Distribution; the new severity distribution is also a Generalized Pareto, but with
parameters θ = 2000, α = 3 and τ = 2. Its mean and second moment are as I've stated. In general,
when one halves the frequency and uniformly doubles the claim size, while the expected aggregate
losses remain the same, the variance of the aggregate losses increases. (Given a Poisson
Frequency, the variance of the aggregate losses doubles.) Therefore, there is a larger chance of an
unusual year. High Severity / Low Frequency lines of insurance are more volatile than High
Frequency / Low Severity lines of insurance.
4.38. D. Mean Frequency = (0.4)(0) + (0.3)(1) + (0.2)(2) + (0.1)(3) = 1.
2nd moment of Frequency = (0.4)(0²) + (0.3)(1²) + (0.2)(2²) + (0.1)(3²) = 2.
Variance of Frequency = 2 - 1² = 1.
Mean Severity = Variance of Severity = 4.
Variance of aggregate claims = (4)(1) + (4²)(1) = 20.
4.39. D. The second moment of severity p1 is: (0.6)(1²) + (0.4)(2²) = 2.2.
The second moment of severity p2 is: (0.1)(1²) + (0.3)(2²) + (0.5)(3²) + (0.1)(4²) = 7.4.
Var[S] = (2)(2.2) + (3)(7.4) = 26.6.
4.40. E. Mean frequency is: (0)(0.4) + (5)(0.1) + (10)(0.5) = 5.5.
The 2nd moment of the frequency is: (0²)(0.4) + (5²)(0.1) + (10²)(0.5) = 52.5.
Variance of the frequency is: 52.5 - 5.5² = 22.25.
Mean severity is: (0.9)(1) + (0.1)(2) = 1.1.
2nd moment of the severity is: (0.9)(1²) + (0.1)(2²) = 1.3.
Variance of the severity is: 1.3 - 1.1² = 0.09.
Variance of aggregate losses = (1.1²)(22.25) + (5.5)(0.09) = 27.4.


4.41. C. The mean frequency = mean of the Inverse Gaussian = μ = 500.
Variance of frequency = mean of Inverse Gaussian + variance of Inverse Gaussian =
μ + μ³/θ = 500 + 500³/5000 = 25,500.
Let X be the severity distribution. Then we are given that:
1000 = Mean aggregate loss = 500 E[X].
150,000 = Variance of aggregate losses = 500 Var[X] + 25,500 E[X]².
Therefore, E[X] = 1000/500 = 2 and Var[X] = {150,000 - 25,500(2²)}/500 = 96.
Comment: In general when one has a mixture of Poissons,
Mean frequency = E[λ] = mean of mixing distribution, and
Second moment of the frequency = E[second moment of Poisson | λ] = E[λ + λ²] =
mean of mixing distribution + second moment of mixing distribution.
Variance of frequency =
mean of mixing distribution + second moment of mixing - (mean of mixing distribution)²
= mean of mixing distribution + variance of mixing distribution.
4.42. B. List the different possible situations and their probabilities:
Situation                          Probability    Aggregate Loss    Square of the Aggregate Loss
no claims                          20.00%         0                 0
1 claim @ 25                       20.00%         25                625
1 claim @ 150                      40.00%         150               22,500
2 claims each @ 50                 8.89%          100               10,000
1 claim @ 50 and 1 claim @ 200     8.89%          250               62,500
2 claims each @ 200                2.22%          400               160,000
Weighted Average                                  105               19,125
Mean = 105. Second Moment = 19,125. Variance = 19,125 - 105² = 8100.


Section 5, Full Credibility for Pure Premiums & Aggregate Losses


A single standard for full credibility applies when one wishes to estimate either pure premiums,
aggregate losses, or loss ratios.
Pure Premium = ($ of Loss)/(# of Exposures) = (# of Claims / # of Exposures)($ of Loss / # of Claims)
= (Frequency)(Severity).
Loss Ratio = ($ of Loss)/($ of Premium).

Since they depend on both the number of claims and the size of claims, pure premiums and
aggregate losses have more reasons to vary than do either frequency or severity. Since pure
premiums are more difficult to estimate than frequencies, all other things being equal the Standard for
Full Credibility for Pure Premiums is larger than that for Frequencies.
Poisson Frequency Example:
For example, assume frequency is Poisson distributed with a mean of 9 (and a variance of 9) and
every claim is of size 10. Then since the severity is constant, it does not increase the random
fluctuations. Since Var[cX] = c² Var[X], the variance of the pure premium for a single exposure is:
(variance of the frequency)(10²) = 900.
Exercise: In the above situation, what is the Standard for Full Credibility (in terms of expected
number of claims), so that the estimated pure premium will have a 90% chance of being within 5%
of the true value?
[Solution: We wish to have a 90% probability, so we are to be within 1.645 standard deviations,
since Φ(1.645) = 0.95.
For X exposures the variance of the sum of the pure premiums for each exposure is 900X.
The variance of the average pure premium per exposure is this divided by X².
Thus we have a variance of 900/X and a standard deviation of 30/√X.
The mean pure premium is (9)(10) = 90.
We wish to be within 5% of this, or 4.5.
Setting this equal to 1.645 standard deviations we have:
4.5 = 1.645(30/√X), or X = {(1.645)(30/4.5)}² = 120.26 exposures.
The expected number of claims is: (120.26)(9) = 1082.
Comment: Since severity is constant, the Standard for Full Credibility is the same as that for
estimating the frequency, with Poisson frequency, P = 90%, and k = 5%: 1082 claims.]
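A minimal Python sketch of the exercise above (illustrative only); statistics.NormalDist supplies the Normal quantile y:

from statistics import NormalDist

P, k = 0.90, 0.05
lam, claim_size = 9, 10                 # Poisson mean and the constant claim size
y = NormalDist().inv_cdf((1 + P) / 2)   # about 1.645
mean_pp = lam * claim_size              # 90
var_pp = lam * claim_size ** 2          # 900
exposures = (y * var_pp ** 0.5 / (k * mean_pp)) ** 2
print(exposures, exposures * lam)       # about 120.2 exposures and 1082 claims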

If the severity is not constant but instead varies, then the variance of the pure premium is greater
than 900. Specifically assume that the severity is given by a Gamma Distribution, with α = 3 and
θ = 10. This distribution has a mean of: αθ = 30, and a variance of: αθ² = 300.
Then if we assume frequency and severity are independent we can use the formula developed for
the variance of the pure premium in terms of that of the frequency and severity:
σpp² = μs² σf² + μf σs². In this case μs = 30, σs² = 300, μf = σf² = 9, so σpp² = 10,800.
Assume we wish the Standard for Full Credibility (in terms of expected number of claims) to be
such that the estimated pure premium will have a 90% chance of being within 5% of the true value.
We wish to have a 90% chance, so we want to be within 1.645 standard deviations, since
Φ(1.645) = 0.95.
For X exposures the variance of the sum of the pure premiums for each exposure is 10,800X.
The variance of the average pure premium per exposure is this divided by X². Thus we have a
variance of 10,800/X and a standard deviation of 103.9/√X. The mean pure premium is (9)(30) = 270.
We wish to be within 5% of this, or 13.5. Setting this equal to 1.645 standard deviations we
have: 13.5 = 1.645(103.9/√X), or X = {(1.645)(103.9/13.5)}² = 160.3 exposures.
The expected number of claims is: (160.3)(9) = 1443.
Note that this is greater than the 1082 claims needed for Full Credibility of the frequency when
P = 90% and k = 5%. In fact the ratio is: 1443/1082 = 1 + 1/3 = 1 + CV², where CV² is the square
of the coefficient of variation of the severity distribution, which for the Gamma is 1/α = 1/3.
It turns out in general, when frequency is Poisson, that the Standard for Full Credibility for the Pure
Premium is: the Standard for Full Credibility for the Frequency times
(1 + square of coefficient of variation of the severity):39
nF = n0 (1 + CVsev²) = (y²/k²)(1 + CVsev²).

39 Equation 2.5.4 in Mahler and Dean.
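A minimal Python sketch of this formula (illustrative only), reproducing the Gamma-severity example above:

from statistics import NormalDist

P, k = 0.90, 0.05
y = NormalDist().inv_cdf((1 + P) / 2)   # about 1.645
n0 = (y / k) ** 2                       # about 1082 claims
cv_squared = 1 / 3                      # Gamma severity with alpha = 3 has CV^2 = 1/alpha
print(round(n0), round(n0 * (1 + cv_squared)))  # about 1082 and 1443 claims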

Derivation of the Standard for Full Credibility for Pure Premiums, Poisson Case:
The derivation follows that of the particular case above.
Let μs be the mean of the severity distribution while σs² is the variance. Assume that the frequency
is Poisson and therefore μf = σf². Assuming the frequency and severity are independent, the
variance of the Pure Premium for one exposure unit is: σpp² = μs² σf² + μf σs² = μf (μs² + σs²).
For X exposure units, the variance of the estimated average pure premium is this divided by X.
We wish to be within y standard deviations, where as usual y is such that Φ(y) = (1+P)/2.
For a mean pure premium of μf μs we wish to be within k μf μs.
Setting the two expressions for the error bars equal yields:
k μf μs = y √{μf (μs² + σs²)/X}. Solving for X: X = (y/k)² (μs² + σs²)/(μf μs²).
The expected number of claims needed for Full Credibility is:
nF = μf X = (y/k)² (1 + σs²/μs²) = n0 (1 + CVsev²).
A Formula for the Square of the Coefficient of Variation:
The following formula for unity plus the square of the coefficient of variation follows directly from the
definition of the Coefficient of Variation.
CV² = Variance/E[X]² = (E[X²] - E[X]²)/E[X]² = (E[X²]/E[X]²) - 1.
Thus, 1 + CV² = E[X²]/E[X]² = 2nd moment divided by the square of the mean.
This formula is useful for Classical credibility problems involving the Pure Premium.
For example, assume one has a Pareto Distribution. Then using the formulas for the moments:
1 + CV² = E[X²]/E[X]² = {2θ²/[(α - 1)(α - 2)]} / {θ/(α - 1)}² = 2(α - 1)/(α - 2).
For example, if α = 5, then 1 + CV² = 2(4)/3 = 8/3.
Exercise: Assume frequency and severity are independent and frequency is Poisson.
For P = 90% and k = 5%, and if severity follows a Pareto Distribution with
α = 5, what is the Standard for Full Credibility for the Pure Premium in terms of claims?
[Solution: n0 (1 + CV²) = 1082(8/3) = 2885 claims.]
In general the Standard for Full Credibility for the pure premium is the sum of those for frequency
and severity: n0 (1 + CV²) = n0 + n0 CV². In this case: 1082 + 1803 = 2885.

General Case, if Frequency is Not Poisson:
As with the Standard for Full Credibility for frequency, one can derive a more general formula when
the Poisson assumption does not apply. The Standard for Full Credibility for estimating
either pure premiums or aggregate losses is:40
nF = (y²/k²)(σf²/μf + σs²/μs²) = n0 (σf²/μf + CVsev²),
which reduces to the Poisson case when σf²/μf = 1. Note that if every claim is of size one, then the
variance of the severity is zero and the standard for full credibility reduces to that for frequency:
n0 σf²/μf.
Exercise: Frequency is Negative Binomial with r = 0.1 and β = 0.5. Severity has a coefficient of
variation of 3. The number of claims and claim sizes are independent.
The observed aggregate loss should be within 5% of the expected aggregate loss 90% of the
time. Determine the expected number of claims needed for full credibility.
[Solution: P = 90%. y = 1.645. k = 0.05. σf²/μf = rβ(1 + β)/(rβ) = 1 + β = 1.5.
(y²/k²)(σf²/μf + CVsev²) = (1.645/0.05)²(1.5 + 3²) = 11,365 claims.]
Note that a Negative Binomial has σf²/μf > 1, so the standard for full credibility is larger than if one
assumed a Poisson frequency. Note that if one limits the size of claims, then the coefficient of
variation is smaller. Therefore, the criterion for full credibility for basic limits losses is less than that for
total losses.
In general the Standard for Full Credibility for the pure premium is the sum of those for frequency
and severity: n0 (σf²/μf + CV²) = n0 σf²/μf + n0 CV².

40 S: severity, f: frequency. Equation 2.5.5 in Mahler and Dean, as derived in one of the problems below.
See "The Credibility of the Pure Premium," by Mayerson, Jones, and Bowers, PCAS 1968.
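A minimal Python sketch of the general formula (illustrative only), reproducing the Negative Binomial exercise above:

from statistics import NormalDist

P, k = 0.90, 0.05
y = NormalDist().inv_cdf((1 + P) / 2)
n0 = (y / k) ** 2
var_over_mean_freq = 1.5   # Negative Binomial: sigma_f^2 / mu_f = 1 + beta = 1.5
cv_sev_squared = 3 ** 2    # severity coefficient of variation of 3
print(n0 * (var_over_mean_freq + cv_sev_squared))  # about 11,363; the text's 11,365 uses y = 1.645 rounded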

Derivation of the Standard for Full Credibility for Pure Premiums or Aggregate Loss:
Let μf be the mean of the frequency distribution while σf² is its variance.
Let μs be the mean of the severity distribution while σs² is its variance.
Assuming that frequency and severity are independent, the variance of the Pure Premium for one
exposure unit is: σpp² = μf σs² + μs² σf².
We assume the number of exposures is known; it is not random.41
For X exposure units, the variance of the estimated average pure premium is this divided by X.
We wish to be within y standard deviations, where as usual y is such that Φ(y) = (1+P)/2.
For a mean pure premium of μf μs we wish to be within k μf μs.
Setting the two expressions for the error bars equal yields:
k μf μs = y √{(μf σs² + μs² σf²)/X}.
Solving for X, the full credibility standard in exposures: X = (y/k)² (μs² σf² + μf σs²)/(μf² μs²).
The expected number of claims needed for Full Credibility is:
nF = μf X = (y/k)² (σf²/μf + σs²/μs²) = n0 (σf²/μf + CVsev²).
Exposures vs. Claims:
Standards for Full Credibility are calculated in terms of the expected number of claims. It is common
to translate these into a number of exposures by dividing by the (approximate) expected claim
frequency. So for example, if the Standard for Full Credibility is 2885 claims and the expected claim
frequency in Auto Insurance were 0.07 claims per car-year, then 2885/0.07 ≅ 41,214 car-years
would be a corresponding Standard for Full Credibility in terms of exposures.
The Standard for Full Credibility in terms of claims can be converted to exposures by
dividing by μf, the mean claim frequency.
Standard for Full Credibility in terms of exposures = nF/μf =
n0 (σf²/μf + σs²/μs²)/μf = n0 (μs² σf² + μf σs²)/(μf μs)² =
n0 (variance of pure premium)/(mean pure premium)² = n0 (CV of the Pure Premium)².
41

While we will solve for that number of exposures which satisfies the criterion for full credibility, in any given
application of the credibility technique the number of exposures is known.

When asked for the number of exposures needed for Full Credibility for Pure Premiums, one can
directly use this formula:42 43 the Standard for Full Credibility for Pure Premiums in terms of exposures is:
n0 (Coefficient of Variation of the Pure Premium)² = (y²/k²)(CVPP)².
Exercise: The variance of pure premiums is 100,000. The mean pure premium is 40. Frequency is
Poisson. We require that the estimated pure premiums be within 2.5% of the true value 90% of the
time. How many exposures are needed for full credibility?
[Solution: The square of the Coefficient of Variation of the Pure Premium is 100,000/40² = 62.5.
y = 1.645. k = 0.025. n0 = (y/k)² = 4330.
n0 (Coefficient of Variation of the Pure Premium)² = 4330(62.5) = 270,625 exposures.
Alternately, let m be the mean frequency. Then since the frequency is assumed to be Poisson,
variance of pure premium = m(second moment of severity).
Thus E[X²] = 100,000/m. E[X] = 40/m. The Standard for Full Credibility in terms of claims is:
n0 (1 + CV²) = n0 E[X²]/E[X]² = 4330 (100,000/40²) m = 270,625m claims.
To convert to exposures divide by m, to get 270,625 exposures.]
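A minimal Python sketch of this exercise (illustrative only):

from statistics import NormalDist

P, k = 0.90, 0.025
y = NormalDist().inv_cdf((1 + P) / 2)
n0 = (y / k) ** 2                 # about 4329
cv_pp_squared = 100000 / 40 ** 2  # variance / mean^2 of the pure premium = 62.5
print(n0 * cv_pp_squared)         # about 270,555 exposures; the text's 270,625 uses n0 = 4330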
Assumptions:
The formula nF = n0 (σf²/μf + CVsev²) assumes:
1. Frequency and Severity are independent.
2. The claims are drawn from the same distribution or at least from distributions with the same finite
mean and variance.44
3. The pure premium or aggregate loss is approximately Normally Distributed
(the Central Limit Theorem applies.)
4. The number of exposures is known; it is not stochastic.
The pure premiums are often approximately Normal; generally the greater the expected number of
claims or the shorter tailed the frequency and severity distributions, the better the Normal
Approximation. It is assumed that one has enough claims that the aggregate losses approximate a
Normal Distribution.
42 See Equation 20.6 in Loss Models.
43 In the formula for the standard for full credibility the CV is calculated for one exposure. The CV would go down as
the number of exposures increased, since the mean increases as a factor of N, while the standard deviation
increases as a factor of the square root of N. This is precisely why, when we have a lot of exposures, we get a good
estimate of the pure premium by relying solely on the data. In other words, this is why there is a standard for full
credibility.
44 The claim sizes can follow any distribution with a finite mean and variance, so that one can compute the coefficient
of variation.

While it is possible to derive formulas that don't depend on the Normal Approximation, they are not
on the Syllabus.45
Notation in Loss Models:
The Loss Models text does not use the same notation as Mahler-Dean and many other casualty
actuarial papers. Thus if one wanted to read this material in Loss Models, one would have to learn
the notation in Loss Models.
Mahler-Dean                                                 Loss Models
P    probability level                                      p
k    range parameter                                        r
y    such that the mean ± y standard deviations covers      yp
     probability P on the Normal Distribution
n0   Standard for Full Credibility for Poisson Frequency    λ0
nF   Standard for Full Credibility for Pure Premium

Loss Models refers to Classical Credibility as Limited Fluctuation Credibility.

45

See for example Appendix 1 of Classical Partial Credibility with Application to Trend by Gary Venter,
PCAS 1986.

Problems:
5.1 (2 points) You are given the following information :

The number of claims is Poisson.

The severity distribution is LogNormal, with parameters μ = 6 and σ = 1.2.

Frequency and severity are independent


Full credibility is defined as having a 95% probability of being within plus or minus 10% of
the true pure premium.
What is the minimum number of expected claims that will be given full credibility?
A. Less than 1600
B. At least 1600 but less than 1700
C. At least 1700 but less than 1800
D. At least 1800 but less than 1900
E. At least 1900
5.2 (2 points) The number of claims is Poisson.
Mean claim frequency = 7%. Mean claim severity = $500.
Variance of the claim severity = 1 million. Full credibility is defined as having a 80% probability of
being within plus or minus 5% of the true pure premium.
What is the minimum number of policies that will be given full credibility?
A. 47,000 B. 48,000 C. 49,000 D. 50,000 E. 51,000
5.3 (3 points) The number of claims is Poisson. The full credibility standard for a company is set so
that the total number of claims is to be within 5% of the true value with probability P. This full
credibility standard is calculated to be 5000 claims. The standard is altered so that the total cost of
claims is to be within 10% of the true value with probability P. The claim frequency has a Poisson
distribution and the claim severity has the following distribution:
f(x) = 0.000008 (500 - x), 0 ≤ x ≤ 500
What is the expected number of claims necessary to obtain full credibility under the new standard?
A. 1825
B. 1850
C. 1875
D. 1900
E. 1925

5.4 (2 points) You are given the following information:
A standard for full credibility of 3,000 claims has been selected so that the actual
pure premium would be within 5% of the expected pure premium 98% of the time.
The number of claims follows a Poisson distribution, and is independent of the
severity distribution.
Using the concepts of classical credibility, determine the coefficient of variation of the severity
distribution underlying the full credibility standard.
A. Less than 0.6
B. At least 0.6 but less than 0.7
C. At least 0.7 but less than 0.8
D. At least 0.8 but less than 0.9
E. At least 0.9
5.5 (2 points) You are given the following:
The number of claims is Poisson distributed.
Number of claims and claim severity are independent.
Claim severity has the following distribution:
Claim Size      Probability
1               0.50
5               0.30
10              0.20
Determine the number of claims needed so that the total cost of claims is within 3% of the expected
cost with 90% probability.
A. Less than 5000
B. At least 5000 but less than 5100
C. At least 5100 but less than 5200
D. At least 5200 but less than 5300
E. At least 5300
5.6. (2 points) Frequency is Poisson, and severity is Pareto with α = 4.
The standard for full credibility is that actual aggregate losses be within 10% of expected aggregate
losses 99% of the time.
50,000 exposures are needed for full credibility.
Determine the expected number of claims per exposure.
A. 2%
B. 3%
C. 4%
D. 5%
E. 6%
5.7 (2 points) The distribution of pure premium has a coefficient of variation of 5.
The full credibility standard has been selected so that actual aggregate losses will be within 5% of
expected aggregate losses 90% of the time.
Using limited fluctuation credibility, determine the number of exposures required for full credibility.
(A) 23,000
(B) 24,000
(C) 25,000
(D) 26,000
(E) 27,000

5.8 (3 points) Require that the estimated pure premium should be within 100k% of the expected
pure premium with probability P. Assume frequency and severity are independent.
Use the following notation:
μf = mean frequency        σf² = variance of frequency
μs = mean severity         σs² = variance of severity
Let y be such that Φ(y) = (1 + P)/2.
Using the Normal Approximation, which of the following is a formula for the number of claims needed
for full credibility of the pure premium?
A. (μf/σf² + μs²/σs²) y²/k²
B. (σf²/μf + μs²/σs²) y²/k²
C. (μf/σf² + σs²/μs²) y²/k²
D. (σf²/μf + σs²/μs²) y²/k²
E. None of the above.


5.9 (2 points) Using the formula derived in the previous question, find the number
of claims required for full credibility. Require that there is a 90% chance that the estimate of the pure
premium is correct within 7.5%. The frequency distribution has a variance 2.5 times its mean.
The claim amount distribution is a Pareto with α = 2.3.
A. Less than 4500
B. At least 4500 but less than 4600
C. At least 4600 but less than 4700
D. At least 4700 but less than 4800
E. At least 4800
5.10 (2 points) The number of claims is Poisson. The expected number of claims needed to
produce a selected standard for full credibility for the pure premium is 1500. If the severity were
constant, the same selected standard for full credibility would require 850 claims.
Given the information below, what is the variance of the severity in the first situation?
Average Claim Frequency = 200
Average Claim Severity = 500.
A. less than 190,000
B. at least 190,000 but less than 200,000
C. at least 200,000 but less than 210,000
D. at least 210,000 but less than 220,000
E. at least 220,000

5.11 (1 point) The expected number of claims needed to produce full credibility for the claim
frequency is 700. Let:
Average claim frequency = 100
Average claim cost = 400
Variance of claim frequency = 100
Variance of claim cost = 280,000
What is the expected number of claims required to produce full credibility for the pure premium?
A. Less than 1,750
B. At least 1,750, but less than 1,850
C. At least 1,850, but less than 1,950
D. At least 1,950, but less than 2,050
E. 2,050 or more
5.12 (2 points) A full credibility standard is determined so that the total number of claims is within 5%
of the expected number with probability 99%. If the same expected number of claims for full
credibility is applied to the total cost of claims, the actual total cost would be within 100k% of the
expected cost with 95% probability. The coefficient of variation of the severity is 2.5. The frequency
is Poisson. Frequency and severity are independent. Using the normal approximation of the
aggregate loss distribution, determine k.
A. 4%
B. 6%
C. 8%
D. 10%
E. 12%
5.13 (1 point) Which of the following are true regarding Standards for Full Credibility?
1. A Standard for Full Credibility should be adjusted for inflation.
2. All other things being equal, if severity is not constant, a Standard for Full Credibility
for pure premiums is larger than that for frequency.
3. All other things being equal, a Standard for Full Credibility for pure premiums is
larger as applied to losses limited by a policy limit than when applied to unlimited losses.
A. None of 1, 2 or 3
B. 1
C. 2
D. 3
E. None of A, B, C or D
5.14 (2 points) You are given the following:

The frequency distribution is Poisson

The claim amount distribution has mean 1000, variance 4,000,000.

Frequency and severity are independent.

Find the number of claims required for full credibility, if you require that there will be a
80% chance that the estimate of the pure premium is correct within 10%.
A. Less than 750
B. At least 750 but less than 800
C. At least 800 but less than 850
D. At least 850 but less than 900
E. At least 900

5.15 (3 points) Standards for full credibility for aggregate losses are being determined for three
situations.
The only thing that differs among the situations is the assumed size of loss distribution:
1. Exponential.
2. Weibull, τ = 1/2.
3. LogNormal, σ = 0.8.
Rank the resulting standards for full credibility from smallest to largest.
A. 1, 2, 3
B. 1, 3, 2
C. 2, 1, 3
D. 2, 3, 1
E. none of A, B, C, or D
5.16 (2 points) You are given the following:

The total losses for one risk within a class of homogeneous risks equals T.
E[{T - E(T)}²] = 40,000.
The average amount of each claim = 100.
The frequency for each insured is Poisson.
The average number of claims for each risk = 2.
Find the number of claims required for full credibility, if you require that there will be a 90% chance
that the estimate of the pure premium is correct within 5%.
A. Less than 1,000
B. At least 1,000 but less than 1,500
C. At least 1,500 but less than 2,000
D. At least 2,000 but less than 2,500
E. 2,500 or more
5.17 (1 point) You are given the following:

You require that the estimated frequency should be


within 100k% of the expected frequency with probability P.

The standard for full credibility for frequency is 800 claims.

You require that the estimated pure premium should be


within 100k% of the expected pure premium with probability P.

The standard for full credibility for pure premiums is 2000 claims.

You require that the estimated severity should be

within 100k% of the expected severity with probability P.


What is the standard for full credibility for the severity, in terms of the number of claims?
A. 900    B. 1000    C. 1100    D. 1200    E. 1300

5.18 (3 points) You are given the following:

The number of claims follows a Poisson distribution.

Claim sizes follow a Burr distribution, with parameters θ (unknown), α = 9, and γ = 0.25.

The number of claims and claim sizes are independent.


The full credibility standard has been selected so that actual aggregate claim
costs will be within 10% of expected aggregate claim costs 85% of the time.
Using the methods of Classical credibility, determine the expected number of claims needed for full
credibility.
A. Less than 1000
B. At least 1000, but less than 10,000
C. At least 10,000, but less than 100,000
D. At least 100,000, but less than 1,000,000
E. At least 1,000,000

5.19 (2 points) You are given the following:
• The number of claims follows a Poisson distribution.
• The variance of the number of claims is 20.
• The variance of the claim size distribution is 35.
• The variance of aggregate claim costs is 1300.
• The number of claims and claim sizes are independent.
• The full credibility standard has been selected so that actual aggregate claim
costs will be within 7.5% of expected aggregate claim costs 98% of the time.
Using the methods of classical credibility, determine the expected number of claims required for full
credibility.
A. Less than 2,000
B. At least 2,000, but less than 2,100
C. At least 2,100, but less than 2,200
D. At least 2,200, but less than 2,300
E. At least 2,300
5.20 (3 points) Determine the number of claims needed for full credibility in three situations. In each
case, there will be a 90% chance that the estimate is correct within 10%.
1. Estimating frequency. Frequency is assumed to be Negative Binomial with β = 0.3.
2. Estimating severity. Severity is assumed to be Pareto with α = 5.
3. Estimating aggregate losses. Frequency is assumed to be Poisson.
   Severity is assumed to be Gamma with α = 2.
Rank the resulting standards for full credibility from smallest to largest.
A. 1, 2, 3
B. 1, 3, 2
C. 2, 1, 3
D. 2, 3, 1
E. none of A, B, C, or D

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 79
5.21 (2 points)
A company has determined that the limited fluctuation full credibility standard is 16,000 claims if:
(i) The total cost of claims is to be within r% of the true value with probability p.
(ii) The number of claims follows a Geometric distribution with β = 0.4.
(iii) The severity distribution is Exponential.
The standard is changed so that the total cost of claims is to be within 3r% of the true value
with probability p, where claim severity is Gamma with α = 2.
Using limited fluctuation credibility, determine the expected number of claims necessary to
obtain full credibility under the new standard.
A. 1100
B. 1200
C. 1300
D. 1400
E. 1500
5.22 (2 points) You are given the following information about a book of business:
(i) Each insured's claim count has a Poisson distribution with mean λ, where λ has a
gamma distribution with α = 4 and θ = 0.5.
(ii) Individual claim size amounts are independent and uniformly distributed from 0 to 500.
(iii) The full credibility standard is for aggregate losses to be within 10% of the expected
with probability 0.98.
Using classical credibility, determine the expected number of claims required for full credibility.
(A) 600
(B) 700
(C) 800
(D) 900
(E) 1000
5.23 (3 points) You are given the following:
• Claim sizes follow a gamma distribution, with parameters α = 2.5 and θ unknown.
• The number of claims and claim sizes are independent.
• The full credibility standard for frequency has been selected so that the actual number of
claims will be within 2.5% of the expected number of claims P of the time.
• The full credibility standard for aggregate loss has been selected so that the actual aggregate
losses will be within 2.5% of the expected aggregate losses P of the time, using the
same P as for the standard for frequency.
• 13,801 expected claims are needed for full credibility for frequency.
• 18,047 expected claims are needed for full credibility for aggregate loss.
Using the methods of Classical credibility, determine the value of P.
A. 80%
B. 90%
C. 95%
D. 98%
E. 99%

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 80
5.24 (4, 11/82, Q.47) (3 points) You are given the following:
• The frequency distribution is Negative Binomial with variance equal to twice its mean.
• The claim amount distribution is LogNormal with mean 100, variance 25,000.
• Frequency and severity are independent.
Find the number of claims required for full credibility, if you require that there will be a 90% chance
that the estimate of the pure premium is correct within 5%. Use the Normal Approximation.
A. Less than 4500
B. At least 4500 but less than 4600
C. At least 4600 but less than 4700
D. At least 4700 but less than 4800
E. At least 4800
5.25 (4, 5/83, Q.36) (1 point) The number of claims is Poisson. Assume that claim severity has
mean equal to 100 and standard deviation equal to 200. Which of the following is closest to the
factor which would need to be applied to the full credibility standard based on frequency only, in
order to approximate the full credibility standard for the pure premium?
A. 0.5
B. 1.0
C. 1.5
D. 2.0
E. 5.0

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 81
5.26 (4, 5/83, Q.46) (3 points) Chebyshev's inequality says that for a probability distribution X,
with mean m and standard deviation σ, for any constant a: Prob(|X - m| ≥ aσ) ≤ 1/a².
Using Chebyshev's inequality (rather than the Normal Approximation) derive a formula for the
number of claims needed for full credibility of the pure premium.
Assume frequency and severity are independent. Require that the observed pure premium should
be within 100k% of the expected pure premium with probability P. Use the following notation:
μf = mean frequency
σf² = variance of frequency
μs = mean severity
σs² = variance of severity
A. (σf²/μf + σs²/μs²) / {k²(1 - P)}
B. (σf²/μf + σs²/μs²) k²(1 - P)
C. (σf/μf + σs²/μs²) k²/P
D. (σf²/μf + σs²/μs²) k²/P
E. None of the above.
5.27 (2 points) Using the formula derived in the previous question, find the number
of claims required for full credibility.
Require that there is a 90% chance that the estimate of the pure premium is correct within 7.5%.
The frequency distribution has a variance 2.5 times its mean.
The claim amount distribution is a Pareto with α = 2.3.
A. Less than 15,000
B. At least 15,000 but less than 16,000
C. At least 16,000 but less than 17,000
D. At least 17,000 but less than 18,000
E. At least 18,000

5.28 (4, 5/85, Q.32) (1 point) The expected number of claims needed to produce a selected level
of credibility for the claim frequency is 1200. Let:
Average claim frequency = 200
Average claim cost = 400
Variance of claim frequency = 200
Variance of claim cost = 80,000
What is the expected number of claims required to produce the same level of credibility for the pure
premium? (Use Classical Credibility.)
A. Less than 1,750
B. At least 1,750, but less than 1,850
C. At least 1,850, but less than 1,950
D. At least 1,950, but less than 2,050
E. 2,050 or more

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 82
5.29 (4, 5/85, Q.33) (2 points) How many claims are necessary for full credibility if the standard for
full credibility is to have the estimated pure premium be within 8% of the true pure premium 90% of
the time? Assume the average claim severity is $1000 and the standard deviation of the claim
severity is 4000. Assume the variance of the number of claims is 1.5 times the mean number of
claims. Assume frequency and severity are independent.
A. Less than 7,150
B. At least 7,150, but less than 7,250
C. At least 7,250, but less than 7,350
D. At least 7,350, but less than 7,450
E. 7,450 or more
5.30 (4, 5/87, Q.34) (1 point) The expected number of claims needed to produce a selected
standard for full credibility for the pure premium is 1800. If the claim size were constant, the same
selected standard for full credibility would require 1200 claims.
Given the information below, what is the variance of the claim cost in the first situation?
• The number of claims is Poisson.
• Average Claim Frequency = 200.
• Average Claim Cost = 400.
A. 20,000    B. 40,000    C. 80,000    D. 120,000    E. 160,000
5.31 (4, 5/87, Q.35) (2 points) The number of claims for a company's major line of business is
Poisson distributed, and during the past year, the following claim size distribution was observed:
Claim Size        Number of Claims
$    0 -  400            20
   400 -  800           240
   800 - 1200           320
  1200 - 1600           210
  1600 - 2000           100
  2000 - 2400            60
  2400 - 2800            30
  2800 - 3200            10
  3200 - 3600            10
Total                  1000
The mean of this claim size distribution is $1216 and the standard deviation is √362,944 ≈ $602.
You need to select the number of claims needed to ensure that the estimate of losses is within 8%
of the actual value 90% of the time. How many claims are needed for full credibility if the claim size
distribution is considered?
A. Less than 450 claims
B. At least 450, but less than 500 claims
C. At least 500, but less than 550 claims
D. At least 550, but less than 600 claims
E. 600 claims or more

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 83
5.32 (4, 5/90, Q.29) (2 points) The ABC Insurance Company has decided to establish its full
credibility requirements for an individual state rate filing using Classical Credibility. The full credibility
standard is to be set so that the observed total cost of claims underlying the rate filing should be
within 5% of the true value with probability 0.95. The claim frequency follows a Poisson distribution
and the claim severity is distributed according to the following distribution:
f(x) = 1/100,000, for 0 ≤ x ≤ 100,000.
What is the expected number of claims, nF, necessary to obtain full credibility?
A. nF < 1,500
B. 1,500 ≤ nF < 1,800
C. 1,800 ≤ nF < 2,100
D. 2,100 ≤ nF < 2,400
E. 2,400 ≤ nF
5.33 (4, 5/91, Q.22) (1 point) The average claim size for a group of insureds is $1,500 with
standard deviation $7,500. Assuming a Poisson claim count distribution, calculate the expected
number of claims so that the total loss will be within 6% of the expected total loss with probability
P = 0.90.
A. Less than 10,000
B. At least 10,000 but less than 15,000
C. At least 15,000 but less than 20,000
D. At least 20,000 but less than 25,000
E. At least 25,000
5.34 (4, 5/91, Q.39) (3 points) The full credibility standard for a company is set so that the total
number of claims is to be within 5% of the true value with probability P. This full credibility standard
is calculated to be 800 claims. The standard is altered so that the total cost of claims is to be within
10% of the true value with probability P. The claim frequency has a Poisson distribution and the
claim severity has the following distribution.
f(x) = (0.0002)(100 - x), 0 ≤ x ≤ 100.
What is the expected number of claims necessary to obtain full credibility under the new standard?
A. Less than 250
B. At least 250 but less than 500
C. At least 500 but less than 750
D. At least 750 but less than 1000
E. At least 1000

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 84
5.35 (4B, 5/92, Q.1) (2 points) You are given the following information:
• A standard for full credibility of 1,000 claims has been selected so that the actual pure
premium would be within 10% of the expected pure premium 95% of the time.
• The number of claims follows a Poisson distribution, and is independent of the
severity distribution.
Using the concepts from Classical Credibility determine the coefficient of variation of the severity
distribution underlying the full credibility standard.
A. Less than 1.20
B. At least 1.20 but less than 1.35
C. At least 1.35 but less than 1.50
D. At least 1.50 but less than 1.65
E. At least 1.65
5.36 (4B, 5/92, Q.16) (2 points) You are given the following information:

The number of claims follows a Poisson distribution.

Claim severity is independent of the number of claims and has the following distribution:
f(x) = (5/2) x^(-7/2), x > 1.
A full credibility standard is determined so that the total number of claims is within 5% of the
expected number with probability 98%. If the same expected number of claims for full credibility is
applied to the total cost of claims, the actual total cost would be within 100K% of the expected cost
with 95% probability.
Using the normal approximation of the aggregate loss distribution, determine K.
A. Less than 0.04
B. At least 0.04 but less than 0.05
C. At least 0.05 but less than 0.06
D. At least 0.06 but less than 0.07
E. At least 0.07

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 85
5.37 (4B, 11/92, Q.1) (2 points) You are given the following:
The number of claims is Poisson distributed.
Number of claims and claim severity are independent.
Claim severity has the following distribution:
Claim Size    Probability
     1            0.50
     2            0.30
    10            0.20
Determine the number of claims needed so that the total cost of claims is within 10% of the
expected cost with 90% probability.
A. Less than 625
B. At least 625 but less than 825
C. At least 825 but less than 1,025
D. At least 1,025 but less than 1,225
E. At least 1,225
5.38 (4B, 11/92, Q.10) (2 points) You are given the following:
• A full credibility standard of 3,025 claims has been determined using classical credibility
concepts.
• The full credibility standard was determined so that the actual pure premium is
within 10% of the expected pure premium 95% of the time.
• Number of claims is Poisson distributed.
Determine the coefficient of variation for the severity distribution.
A. Less than 2.25
B. At least 2.25 but less than 2.75
C. At least 2.75 but less than 3.25
D. At least 3.25 but less than 3.75
E. At least 3.75

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 86
5.39 (4B, 11/92, Q.15) (2 points) You are given the following:
• X is the random variable for claim size.
• N is the random variable for number of claims and has a Poisson distribution.
• X and N are independent.
• n0 is the standard for full credibility based only on number of claims.
• nF is the standard for full credibility based on total cost of claims.
• n is the observed number of claims.
• C is the random variable for total cost of claims.
• Z is the amount of credibility to be assigned to total cost of claims.
According to the Classical credibility concepts, which of the following are true?
1. Var(C) = E(N) Var(X) + E(X) Var(N)
2. nF = n0 {E(X)² + Var(X)} / E(X)²
3. Z = √(n / nF)
A. 1 only    B. 2 only    C. 1, 3 only    D. 2, 3 only    E. 1, 2, 3

5.40 (4B, 5/93, Q.10) (2 points) You are given the following:
The number of claims for a single insured follows a Poisson distribution.
The coefficient of variation of the severity distribution is 2.
The number of claims and claim severity distributions are independent.
Claim size amounts are independent and identically distributed.
Based on Classical credibility, the standard for full credibility is 3415 claims.
With this standard, the observed pure premium will be within k% of the expected pure premium
95% of the time.
Determine k.
A. Less than 5.75%
B. At least 5.75% but less than 6.25%
C. At least 6.25% but less than 6.75%
D. At least 6.75% but less than 7.25%
E. At least 7.25%

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 87
5.41 (4B, 11/93, Q.11) (3 points) You are given the following:
Number of claims follows a Poisson distribution.
Claim severity is independent of the number of claims and has the following probability density
distribution
f(x) = 5x^(-6), x > 1.
A full credibility standard has been determined so that the total cost of claims is within 5% of the
expected cost with a probability of 90%. If the same number of claims for full credibility of total cost
is applied to frequency only, the actual number of claims would be within 100k% of the expected
number of claims with a probability of 95%.
Using the normal approximation of the aggregate loss distribution, determine k.
A. Less than 0.0545
B. At least 0.0545, but less than 0.0565
C. At least 0.0565, but less than 0.0585
D. At least 0.0585, but less than 0.0605
E. At least 0.0605
5.42 (4B, 5/94, Q.13) (2 points) You are given the following:
• 120,000 exposures are needed for full credibility.
• The 120,000 exposures standard was selected so that the actual total cost of claims is
within 5% of the expected total 95% of the time.
• The number of claims per exposure follows a Poisson distribution with mean m.
• m was estimated from the following observed data using the method of moments:
Year    Exposures    Claims
  1       18,467      1,293
  2       26,531      1,592
  3       20,002      1,418
If mean claim severity is $5,000, determine the standard deviation of the claim severity distribution.
A. Less than $9,000
B. At least $9,000, but less than $12,000
C. At least $12,000, but less than $15,000
D. At least $15,000, but less than $18,000
E. At least $18,000

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 88
5.43 (4B, 11/94, Q.11) (3 points) You are given the following:
Number of claims follows a Poisson distribution with mean m.
X is the random variable for claim severity, and has a Pareto distribution with parameters α = 3.0
and θ = 6000.
A standard for full credibility was developed so that the observed pure premium is within 10% of
the expected pure premium 98% of the time.
Number of claims and claim severity are independent.
Using Classical credibility concepts, determine the number of claims needed for full credibility for
estimates of the pure premium.
A. Less than 600
B. At least 600, but less than 1200
C. At least 1200, but less than 1800
D. At least 1800, but less than 2400
E. At least 2400
5.44 (4B, 5/95, Q.10) (1 point) You are given the following:
The number of claims follows a Poisson distribution.
The distribution of claim sizes has a mean of 5 and variance of 10.
The number of claims and claim sizes are independent.
How many expected claims are needed to be 90% certain that actual claim costs will be within 10%
of the expected claim costs?
A. Less than 100
B. At least 100, but less than 300
C. At least 300, but less than 500
D. At least 500, but less than 700
E. At least 700

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 89
5.45 (4B, 5/95, Q.26) (3 points) You are given the following:
• 40,000 exposures are needed for full credibility.
• The 40,000 exposures standard was selected so that the actual total cost of claims is
within 7.5% of the expected total 95% of the time.
• The number of claims per exposure follows a Poisson distribution with mean m.
• The claim size distribution is lognormal with parameters μ (unknown) and σ = 1.5.
• The lognormal distribution has the following moments:
  mean: exp(μ + σ²/2)
  variance: exp(2μ + σ²) {exp(σ²) - 1}.
• The number of claims per exposure and claim sizes are independent.
Using the methods of classical credibility, determine the value of m.
A. Less than 0.05
B. At least 0.05, but less than 0.10
C. At least 0.10, but less than 0.15
D. At least 0.15, but less than 0.20
E. At least 0.20
5.46 (4B, 11/95, Q.11) (2 points) You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes follow a Pareto distribution, with parameters θ = 3000 and α = 4.
• The number of claims and claim sizes are independent.
• 2000 expected claims are needed for full credibility.
• The full credibility standard has been selected so that actual claim costs will be
within 5% of expected claim costs P% of the time.
Using the methods of Classical credibility, determine the value of P.
A. Less than 82.5
B. At least 82.5, but less than 87.5
C. At least 87.5, but less than 92.5
D. At least 92.5, but less than 97.5
E. At least 97.5

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 90
5.47 (4B, 5/96, Q.27) (2 points) You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes follow a gamma distribution, with parameters α = 1 and θ (unknown).
• The number of claims and claim sizes are independent.
• The full credibility standard has been selected so that actual claim costs will be within 5% of
expected claim costs 90% of the time.
Using the methods of Classical credibility, determine the expected number of claims required for full credibility.
A. Less than 1,000
B. At least 1,000, but less than 2,000
C. At least 2,000, but less than 3,000
D. At least 3,000
E. Cannot be determined from the given information.
5.48 (4B, 11/96, Q.2) (1 point) Using the methods of Classical credibility, a full credibility standard
of 1,000 expected claims has been established so that actual claim costs will be within 100c% of
expected claim costs 90% of the time. Determine the number of expected claims that would be
required for full credibility if actual claim costs were to be within 100c% of expected claim costs 95%
of the time.
A. Less than 1,100
B. At least 1,100, but less than 1,300
C. At least 1,300, but less than 1,500
D. At least 1,500, but less than 1,700
E. At least 1,700
5.49 (4B, 11/96, Q.28) (2 points) You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes are discrete and follow a Poisson distribution with mean 4.
• The number of claims and claim sizes are independent.
• The full credibility standard has been selected so that actual claim costs will be within 10% of
expected claim costs 95% of the time.
Using the methods of Classical credibility, determine the expected number of claims required for full credibility.
A. Less than 400
B. At least 400, but less than 600
C. At least 600, but less than 800
D. At least 800, but less than 1,000
E. At least 1,000

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 91
5.50 (4B, 5/97, Q.2) (2 points) The number of claims follows a Poisson distribution.
Using the methods of Classical credibility, a full credibility standard of 1,200 expected claims has
been established for aggregate claim costs. Determine the number of expected claims that would
be required for full credibility if the coefficient of variation of the claim size distribution were changed
from 2 to 4 and the range parameter, k, were doubled.
A. 500
B. 1,000
C. 1,020
D. 1,200
E. 2,040
5.51 (4B, 11/97, Q.24 & Course 4 Sample Exam 2000, Q.15) (3 points)
You are given the following:
• The number of claims per exposure follows a Poisson distribution with mean 0.01.
• Claim sizes follow a lognormal distribution, with parameters μ (unknown) and σ = 1.
• The number of claims per exposure and claim sizes are independent.
The full credibility standard has been selected so that actual aggregate claim costs will be
within 10% of expected aggregate claim costs 95% of the time.
Using the methods of Classical credibility, determine the number of exposures required for full
credibility.
A. Less than 25,000
B. At least 25,000, but less than 50,000
C. At least 50,000, but less than 75,000
D. At least 75,000, but less than 100,000
E. At least 100,000

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 92
Use the following information for the next two questions:
You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes follow an inverse gamma distribution, with parameters α = 4 and θ unknown.
• The number of claims and claim sizes are independent.
• The full credibility standard has been selected so that the actual aggregate claim costs
will be within 5% of expected aggregate claim costs 95% of the time.

5.52 (4B, 5/98, Q.18) (2 points) Using the methods of Classical credibility, determine the
expected number of claims required for full credibility.
A. Less than 1,600
B. At least 1,600, but less than 1,800
C. At least 1,800, but less than 2,000
D. At least 2,000, but less than 2,200
E. At least 2,200
5.53 (4B, 5/98 Q.19) (1 point) If the number of claims were to follow a negative binomial
distribution instead of a Poisson distribution, determine which of the following statements would be
true about the expected number of claims required for full credibility.
A. The expected number of claims required for full credibility would be smaller.
B. The expected number of claims required for full credibility would be the same.
C. The expected number of claims required for full credibility would be larger.
D. The expected number of claims required for full credibility would be either the same or smaller,
depending on the parameters of the negative binomial distribution.
E. The expected number of claims required for full credibility would be either smaller or larger,
depending on the parameters of the negative binomial distribution.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 93
5.54 (4B, 11/98, Q.5) (2 points) You are given the following:
• The number of claims follows a Poisson distribution.
• The variance of the number of claims is 10.
• The variance of the claim size distribution is 10.
• The variance of aggregate claim costs is 500.
• The number of claims and claim sizes are independent.
• The full credibility standard has been selected so that actual aggregate claim
costs will be within 5% of expected aggregate claim costs 95% of the time.
Using the methods of Classical credibility, determine the expected number of claims required for full
credibility.
A. Less than 2,000
B. At least 2,000, but less than 4,000
C. At least 4,000, but less than 6,000
D. At least 6,000, but less than 8,000
E. At least 8,000
5.55 (4B, 11/98, Q.29) (3 points) You are given the following:
• The number of claims follows a Poisson distribution.
• Claim sizes follow a Burr distribution, with parameters θ (unknown), α = 6, and γ = 0.5.
• The number of claims and claim sizes are independent.
• 6,000 expected claims are needed for full credibility.
• The full credibility standard has been selected so that actual aggregate claim
costs will be within 10% of expected aggregate claim costs P% of the time.
Using the methods of Classical credibility, determine the value of P.
Hint: For the Burr Distribution, E[X^n] = θ^n Γ(1 + n/γ) Γ(α - n/γ) / Γ(α).
A. Less than 80
B. At least 80, but less than 85
C. At least 85, but less than 90
D. At least 90, but less than 95
E. At least 95

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 94
5.56 (4B, 5/99, Q.19) (1 point) You are given the following:
• The number of claims follows a Poisson distribution.
• The coefficient of variation of the claim size distribution is 2.
• The number of claims and claim sizes are independent.
• 1,000 expected claims are needed for full credibility.
• The full credibility standard has been selected so that the actual number of
claims will be within k% of the expected number of claims P% of the time.
Using the methods of Classical credibility, determine the number of expected claims that would be
needed for full credibility if the full credibility standard were selected so that actual aggregate claim
costs will be within k% of expected aggregate claim costs P% of the time.
A. 1,000
B. 1,250
C. 2,000
D. 2,500
E. 5,000
5.57 (4B, 11/99, Q.2) (2 points) You are given the following:
The number of claims follows a Poisson distribution.
Claim sizes follow a lognormal distribution, with parameters μ and σ.
The number of claims and claim sizes are independent.
13,000 expected claims are needed for full credibility.
The full credibility standard has been selected so that actual aggregate claim costs
will be within 5% of expected aggregate claim costs 90% of the time.
Determine σ.
A. Less than 1.2
B. At least 1.2, but less than 1.4
C. At least 1.4, but less than 1.6
D. At least 1.6, but less than 1.8
E. At least 1.8

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 95
5.58 (4, 11/00, Q.14) (2.5 points) For an insurance portfolio, you are given:
(i) For each individual insured, the number of claims follows a Poisson distribution.
(ii) The mean claim count varies by insured, and the distribution of mean claim
counts follows a gamma distribution.
(iii) For a random sample of 1000 insureds, the observed claim counts are as follows:
Number of Claims, n:        0     1     2     3     4     5
Number of Insureds, fn:   512   307   123    41    11     6
Σ n fn = 750, Σ n² fn = 1494.
(iv) Claim sizes follow a Pareto distribution with mean 1500 and variance 6,750,000.
(v) Claim sizes and claim counts are independent.
(vi) The full credibility standard is to be within 5% of the expected aggregate loss 95% of the time.
Determine the minimum number of insureds needed for the aggregate loss to be fully credible.
(A) Less than 8300
(B) At least 8300, but less than 8400
(C) At least 8400, but less than 8500
(D) At least 8500, but less than 8600
(E) At least 8600
5.59 (4, 11/02, Q.14 & 2009 Sample Q. 39) (2.5 points) You are given the following information
about a commercial auto liability book of business:
(i) Each insured's claim count has a Poisson distribution with mean λ,
where λ has a gamma distribution with α = 15 and θ = 0.2.
(ii) Individual claim size amounts are independent and exponentially distributed with mean 5000.
(iii) The full credibility standard is for aggregate losses to be within 5% of the expected
with probability 0.90.
Using classical credibility, determine the expected number of claims required for full credibility.
(A) 2165
(B) 2381
(C) 3514
(D) 7216
(E) 7938

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 96
5.60 (4, 11/03, Q.3 & 2009 Sample Q.2) (2.5 points) You are given:
(i) The number of claims has a Poisson distribution.
(ii) Claim sizes have a Pareto distribution with parameters θ = 0.5 and α = 6.
(iii) The number of claims and claim sizes are independent.
(iv) The observed pure premium should be within 2% of the expected pure premium 90%
of the time.
Determine the expected number of claims needed for full credibility.
(A) Less than 7,000
(B) At least 7,000, but less than 10,000
(C) At least 10,000, but less than 13,000
(D) At least 13,000, but less than 16,000
(E) At least 16,000
5.61 (4, 5/05, Q.2 & 2009 Sample Q.173) (2.9 points) You are given:
(i) The number of claims follows a negative binomial distribution with parameters r and β = 3.
(ii) Claim severity has the following distribution:
Claim Size    Probability
     1            0.4
    10            0.4
   100            0.2
(iii) The number of claims is independent of the severity of claims.
Determine the expected number of claims needed for aggregate losses to be within 10% of
expected aggregate losses with 95% probability.
(A) Less than 1200
(B) At least 1200, but less than 1600
(C) At least 1600, but less than 2000
(D) At least 2000, but less than 2400
(E) At least 2400

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 97
5.62 (4, 11/05, Q.35 & 2009 Sample Q.245) (2.9 points) You are given:
(i) The number of claims follows a Poisson distribution.
(ii) Claim sizes follow a gamma distribution with parameters α (unknown) and θ = 10,000.
(iii) The number of claims and claim sizes are independent.
(iv) The full credibility standard has been selected so that actual aggregate losses will
be within 10% of expected aggregate losses 95% of the time.
Using limited fluctuation (classical) credibility, determine the expected number of claims required for
full credibility.
(A) Less than 400
(B) At least 400, but less than 450
(C) At least 450, but less than 500
(D) At least 500
(E) The expected number of claims required for full credibility cannot be determined
from the information given.
5.63 (4, 11/06, Q.30 & 2009 Sample Q.273) (2.9 points)
A company has determined that the limited fluctuation full credibility standard is 2000 claims if:
(i) The total number of claims is to be within 3% of the true value with probability p.
(ii) The number of claims follows a Poisson distribution.
The standard is changed so that the total cost of claims is to be within 5% of the true value
with probability p, where claim severity has probability density function:
f(x) = 1/10,000, for 0 ≤ x ≤ 10,000.
Using limited fluctuation credibility, determine the expected number of claims necessary to
obtain full credibility under the new standard.
(A) 720
(B) 960
(C) 2160
(D) 2667
(E) 2880

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 98
Solutions to Problems:
5.1. B. The mean severity = exp[μ + 0.5σ²] = exp(6.72) = 828.82. The second moment of the
severity = exp[2μ + 2σ²] = exp(14.88) = 2,899,358. Thus 1 + CV² = E[X²]/E[X]² =
2,899,358/828.82² = 4.221. y = 1.960 since Φ(1.960) = 0.975 = (1 + 0.95)/2. Therefore
n0 = y²/k² = (1.96/0.1)² = 384. Therefore nF = n0(1 + CV²) = (384)(4.221) = 1621 claims.
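Calculations like the one in 5.1 are easy to check by machine. The following is a minimal sketch, not part of the original solution; it assumes Python 3.8+ for statistics.NormalDist, and the helper name full_credibility_poisson is just an illustrative label.

    import statistics

    def full_credibility_poisson(P, k, cv_squared):
        # Standard for full credibility of the pure premium with Poisson frequency:
        # n_F = (y/k)^2 * (1 + CV^2), where Phi(y) = (1 + P)/2.
        y = statistics.NormalDist().inv_cdf((1 + P) / 2)
        n0 = (y / k) ** 2
        return n0 * (1 + cv_squared)

    # Check of solution 5.1: lognormal severity with E[X] = 828.82, E[X^2] = 2,899,358.
    cv2 = 2899358 / 828.82 ** 2 - 1                      # about 3.221
    print(full_credibility_poisson(0.95, 0.10, cv2))     # roughly 1621 claims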
5.2. A. Square of Coefficient of Variation = (1 million)/(500²) = 4.
y = 1.282 since Φ(1.282) = 0.9 = (1 + 0.8)/2. k = 5%.
Therefore in terms of number of claims the full credibility standard is:
(y²/k²)(1 + CV²) = (1.282/0.05)²(1 + 4) = 3287 claims.
This is equivalent to: 3287 / 0.07 = 46,958 policies.
5.3. C. The severity has a mean of 166.7 and a second moment of 41,667:
E[X] = ∫ x f(x) dx over [0, 500] = 0.000008 ∫ (500x - x²) dx = (0.000008)(250x² - x³/3) evaluated from 0 to 500 = 166.7.
E[X²] = ∫ x² f(x) dx over [0, 500] = 0.000008 ∫ (500x² - x³) dx = (0.000008)(500x³/3 - x⁴/4) evaluated from 0 to 500 = 41,667.
1 + CV² = E[X²] / E[X]² = 41,667 / 166.7² = 1.5.
The standard for Full Credibility for the pure premiums for k = 5% is therefore
nF = n0(1 + CV²) = (5000)(1.5) = 7500. For k = 10% we need to multiply by (5%/10%)² = 1/4,
since the full credibility standard is inversely proportional to k². 7500/4 = 1875.
5.4. B. We have y = 2.326 since Φ(2.326) = 0.99 = (1 + 0.98)/2.
Therefore n0 = y²/k² = (2.326/0.05)² = 2164.
nF = n0(1 + CV²), therefore CV = √(nF/n0 - 1) = √(3000/2164 - 1) = 0.62.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 99
5.5. D. We have y = 1.645 since Φ(1.645) = 0.95 = (1 + 0.9)/2.
Therefore, n0 = y²/k² = (1.645/0.03)² = 3007.
The mean severity is (1)(0.5) + (5)(0.3) + (10)(0.2) = 4.
The 2nd moment of the severity is: (1²)(0.5) + (5²)(0.3) + (10²)(0.2) = 28.
1 + CV² = E[X²]/E[X]² = 28/4² = 1.75. nF = n0(1 + CV²) = (3007)(1.75) = 5262.
5.6. C. P = 99% ⇒ y = 2.576. n0 = (2.576/0.1)² = 664 claims.
For the Pareto severity: 1 + CV² = E[X²]/E[X]² = {2θ²/[(4 - 1)(4 - 2)]} / {θ/(4 - 1)}² = 3.
Thus the standard for full credibility is: (664)(3) = 1992 claims.
Thus, 1992 claims correspond to 50,000 exposures. λ = 1992/50,000 = 3.98%.
5.7. E. P = 90%. y = 1.645. n0 = (1.645/0.05)² = 1082.
Standard for full credibility is: n0 CVPP² = (1082)(5²) = 27,050 exposures.
5.8. C. Assume there are N claims expected and therefore N/μf exposures.
The mean pure premium is m = Nμs.
For frequency and severity independent, the variance of the pure premium for a single exposure is:
μf σs² + μs² σf².
The variance of the aggregate loss for N/μf independent exposures is:
σ² = (N/μf)(μf σs² + μs² σf²) = N(σs² + μs² σf²/μf).
We desire that Prob[m - km ≤ X ≤ m + km] ≥ P.
Using the Normal Approximation this is true provided km = yσ.
Therefore, k²m² = y²σ². Thus, k²N²μs² = y²N(σs² + μs² σf²/μf).
Solving, N = y²(σs² + μs² σf²/μf)/(k²μs²) = (σf²/μf + σs²/μs²)(y²/k²) = n0(σf²/μf + CVSev²).
Comment: See Mayerson, Jones and Bowers, "The Credibility of the Pure Premium," PCAS
1968. Note that if one assumes a Poisson Frequency Distribution, then σf²/μf = 1 and the formula
becomes: (1 + CV²){y²/k²}.
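The general formula n0(σf²/μf + σs²/μs²) derived in 5.8 can be coded directly. A minimal sketch, assuming Python's statistics module; the helper name full_credibility_general is illustrative only.

    import statistics

    def full_credibility_general(P, k, var_freq_over_mean_freq, cv_sev_squared):
        # General standard: n_F = (y/k)^2 * (sigma_f^2/mu_f + sigma_s^2/mu_s^2).
        # With Poisson frequency, sigma_f^2/mu_f = 1 and this reduces to n0 * (1 + CV^2).
        y = statistics.NormalDist().inv_cdf((1 + P) / 2)
        return (y / k) ** 2 * (var_freq_over_mean_freq + cv_sev_squared)

    # Check of problem 5.24: Negative Binomial with variance = 2 * mean,
    # LogNormal severity with mean 100 and variance 25,000, P = 90%, k = 5%.
    print(full_credibility_general(0.90, 0.05, 2.0, 25000 / 100 ** 2))
    # about 4870 claims (the solution, rounding n0 to 1082, gets 4869)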

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 100
5.9. E. Φ(1.645) = 0.95 so that y = 1.645. n0 = y²/k² = (1.645/0.075)² = 481. Using the formulas
for the moments: CV² = E[X²]/E²[X] - 1 = {2θ²/[(α - 1)(α - 2)]}/{θ/(α - 1)}² - 1 = 2(α - 1)/(α - 2) - 1
= α/(α - 2). For α = 2.3, CV² = 2.3/0.3 = 7.667.
Therefore, n0(σf²/μf + CV²) = (481)(2.5 + 7.667) = 4890.
5.10. B. nF = n0 (1 + CV2 ). Therefore CV2 = {(nF / n0 ) - 1} = {(1500 / 850) - 1} = 0.7647.
Variance of severity = CV2 mean2 = (0.7647)(500)2 = 191,175.
5.11. C. CV2 = 280000 / 4002 = 1.75. nF = n0 (1+CV2 ) = (700)(1 + 1.75) = 1925.
5.12. D. Φ(2.576) = 0.995, so y = 2.576.
For frequency the standard for full credibility is: (2.576/0.05)² = 2654.
Φ(1.960) = 0.975, so y = 1.960 for the Standard for Full Credibility for pure premium.
Thus 2654 = nF = (y²/k²)(1 + CV²) = {1.96²/k²}(1 + 2.5²) = 27.85/k².
Thus k = √(27.85/2654) = 0.102.

5.13. C. 1. False. The formula for the Standard for Full Credibility for either severity or the Pure
Premium involves the severity distribution via the coefficient of variation, which is not affected by
(uniform) inflation. (The Standard for Full Credibility for frequency doesn't involve severity at all, and
is thus also unaffected.)
2. True. (y²/k²)(1 + CV²) ≥ (y²/k²).
3. False. Limited (basic limits) losses have a smaller coefficient of variation than do unlimited (total limits)
losses. Therefore, the Standard for Full Credibility for Basic Limits losses is less.
5.14. C. n0 = y2 / k2 = (1.282/0.10)2 = 164. For the Pure Premium, the Standard For Full Credibility
is: n0 (1 + CV2 ) = (164)(1 + 4000000/10002 ) = (164)(5) = 820.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 101
5.15. E. The standard for full credibility is n0(σf²/μf + CVsev²). Since the only thing that differs is
the severity distribution, the ranking depends on CVsev, the coefficient of variation of the severity
distribution. For the Exponential, CV = 1.
For the Weibull, 1 + CV² = E[X²]/E[X]² = θ²Γ(1 + 2/τ) / {θΓ(1 + 1/τ)}² = Γ(5)/Γ(3)² = 4!/(2!)² = 6.
CV = √5 = 2.236.
For the Lognormal, 1 + CV² = E[X²]/E[X]² = exp[2μ + 2σ²] / exp[μ + σ²/2]² = exp[σ²] =
exp[0.64] = 1.896. CV = √0.896 = 0.95. From smallest to largest: 3, 1, 2.
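The three coefficients of variation in 5.15 can be reproduced numerically. A minimal sketch, assuming only Python's math module:

    from math import gamma, exp, sqrt

    # Exponential: CV = 1.
    cv_expon = 1.0

    # Weibull with tau = 1/2: 1 + CV^2 = Gamma(1 + 2/tau) / Gamma(1 + 1/tau)^2.
    tau = 0.5
    cv_weibull = sqrt(gamma(1 + 2 / tau) / gamma(1 + 1 / tau) ** 2 - 1)   # sqrt(5) = 2.236

    # LogNormal with sigma = 0.8: 1 + CV^2 = exp(sigma^2).
    sigma = 0.8
    cv_lognormal = sqrt(exp(sigma ** 2) - 1)                              # about 0.947

    print(cv_lognormal, cv_expon, cv_weibull)   # smallest to largest: 3, 1, 2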

5.16. D. k = 5% and P = 90%. We have y = 1.645 since Φ(1.645) = 0.95 = (1 + P)/2.
Therefore, n0 = y²/k² = (1.645/0.05)² = 1082.
For Poisson frequency, the variance of the total losses is:
(mean frequency)(σs² + μs²) = (mean frequency)(mean severity)²(1 + CVsev²).
Thus 40,000 = (2)(100²)(1 + CVsev²). (1 + CVsev²) = 2.
But nF = n0(1 + CVsev²) = (1082)(2) = 2164 claims.
5.17. D. The Standard for Full Credibility for the pure premium is the sum of those for frequency
and severity. Thus in this case, the standard for full credibility for the severity is 2000 - 800 =1200
claims.
5.18. E. For the Burr Distribution, E[X^n] = θ^n Γ(1 + n/γ) Γ(α - n/γ) / Γ(α).
For α = 9 and γ = 0.25, E[X] = θ Γ(1 + 4) Γ(9 - 4) / Γ(9) = θ (4!)(4!)/8! = θ/70.
E[X²] = θ² Γ(1 + 8) Γ(9 - 8) / Γ(9) = θ² (8!)(1)/8! = θ².
(1 + CV²) = E[X²] / E²[X] = θ²/(θ/70)² = 4900.
We have y = 1.439 since Φ(1.439) = 0.925 = (1 + P)/2. k = 0.10. Therefore,
nF = n0(1 + CV²) = (y/k)²(1 + CV²) = (1.439/0.10)²(4900) = 1.015 million claims.
Comment: For γ = 0.25 one gets a very heavy-tailed Burr Distribution and therefore a very large
Standard for Full Credibility.
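The Burr moments in 5.18 can be checked with the gamma function. A minimal sketch, not part of the original solution, taking θ = 1 since θ drops out of the coefficient of variation:

    from math import gamma

    alpha, g = 9.0, 0.25        # Burr parameters alpha and gamma (theta cancels in the CV)

    def burr_moment(n, theta=1.0):
        # E[X^n] = theta^n * Gamma(1 + n/g) * Gamma(alpha - n/g) / Gamma(alpha)
        return theta ** n * gamma(1 + n / g) * gamma(alpha - n / g) / gamma(alpha)

    one_plus_cv2 = burr_moment(2) / burr_moment(1) ** 2   # 4900
    y, k = 1.439, 0.10                                    # Phi(1.439) = 0.925 = (1 + 0.85)/2
    print((y / k) ** 2 * one_plus_cv2)                    # about 1.015 million claims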

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 102
5.19. B. Frequency is Poisson and therefore μf = σf².
σPP² = μs²σf² + μf σs² = μf(μs² + σs²).
Thus 1300 = 20(μs² + 35). Therefore, μs² = 30.
CVsev² = σs²/μs² = 35/30 = 1.167. k = 0.075.
Φ(y) = (1 + P)/2 = (1 + 0.98)/2 = 0.99. Thus y = 2.326.
n0 = (y/k)² = (2.326/0.075)² = 962. nF = n0(1 + CV²) = (962)(1 + 1.167) = 2085.
5.20. B. For situation #1: n0(σf²/μf) = n0 rβ(1 + β)/(rβ) = n0(1 + β) = 1.3 n0.
For situation #2: n0(CV²) = n0(E[X²]/E[X]² - 1) = n0({2θ²/[(α - 1)(α - 2)]}/{θ/(α - 1)}² - 1) =
{2(α - 1)/(α - 2) - 1} n0 = {α/(α - 2)} n0 = (5/3) n0 = 1.67 n0.
For situation #3: n0(1 + CV²) = n0(1 + αθ²/(αθ)²) = n0(1 + 1/α) = 1.5 n0.
From smallest to largest: 1, 3, 2.
5.21. D. For the Geometric Distribution: variance/mean = β(1 + β)/β = 1 + β = 1.4.
For the Exponential distribution: CV = 1.
For the Gamma Distribution, CV² = αθ²/(αθ)² = 1/α = 1/2.
Old standard: 16,000 = (y/r)²(1.4 + 1²). (y/r)² = 16,000/2.4.
New standard: {y/(3r)}²(1.4 + 1/2) = (1/9)(y/r)²(1.9) = (1.9/9)(16,000/2.4) = 1407.
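The rescaling in 5.21 is just a ratio of two standards; the following is a minimal sketch (variable names are illustrative only):

    # Old standard: 16,000 = (y/r)^2 * (sigma_f^2/mu_f + CV_sev^2) with
    # Geometric frequency (ratio 1 + beta = 1.4) and Exponential severity (CV^2 = 1).
    y_over_r_sq = 16000 / (1.4 + 1.0)

    # New standard: tolerance 3r (so divide by 9), Gamma severity with alpha = 2 (CV^2 = 1/2).
    new_standard = (y_over_r_sq / 9) * (1.4 + 0.5)
    print(round(new_standard))   # about 1407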
5.22. E. For the Gamma-Poisson, the mixed distribution is Negative Binomial,
with r = α = 4 and β = θ = 0.5. Therefore, for frequency σf²/μf = rβ(1 + β)/(rβ) = 1 + β = 1.5.
For the Uniform Distribution from 0 to 500, σs²/μs² = {(500²)/12}/250² = 1/3.
For P = 0.98, y = 2.326. k = 0.1.
{y²/k²}(σf²/μf + σs²/μs²) = (2.326/0.1)²(1.5 + 0.333) = 992 claims.
Comment: Similar to 4, 11/02, Q.14.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 103
5.23. E. 13,801 = (σf²/μf) n0.
18,047 = (σf²/μf + CVSev²) n0.
Subtracting the first equation from the second: CVSev² n0 = 4246.
For the Gamma, CV² = variance/mean² = αθ²/(αθ)² = 1/α = 1/2.5 = 0.4.
Therefore, n0 = 4246/0.4 = 10,615. (y/k)² = 10,615. y/k = 103.03.
y = (103.03)(2.5%) = 2.576.
99.5% = Φ[y] = (1 + P)/2. P = 99%.
5.24. E. n0 = (y/k)² = (1.645/0.05)² = 1082. For the Pure Premium, when we have a general
frequency distribution (not necessarily Poisson), the Standard For Full Credibility is:
n0(σf²/μf + CVsev²) = (1082)(2 + 25,000/100²) = (1082)(4.5) = 4869.
5.25. E. The coefficient of variation = standard deviation / mean = 200 /100 = 2.
nF = n0 (1+CVsev2 ) = n0 (1+22 ) = 5n0 .
5.26. A. Assume there are N claims expected and therefore N/μf exposures. The mean pure
premium is m = Nμs. For frequency and severity independent, the variance of the pure premium for
a single exposure is: μf σs² + μs² σf². The variance of the aggregate loss for
N/μf independent exposures = σ² = (N/μf)(μf σs² + μs² σf²) = N(σs² + μs² σf²/μf).
We desire that: Prob[(m - km) ≤ X ≤ (m + km)] ≥ P, or equivalently Prob[|X - m| ≥ km] ≤ 1 - P.
This is in the form of Chebyshev's inequality provided we take 1/a² = 1 - P, and km = aσ.
Thus a = 1/√(1 - P) and km = σ/√(1 - P). Therefore k²m²(1 - P) = σ².
Thus, k²N²μs²(1 - P) = N(σs² + μs² σf²/μf). Solving for N:
N = (σs² + μs² σf²/μf) / {k²μs²(1 - P)} = (σf²/μf + σs²/μs²) / {k²(1 - P)}.
Comment: See Dale Nelson's review in PCAS 1969 of Mayerson, Jones and Bowers, "The
Credibility of the Pure Premium." Note that this formula resembles that derived from the normal
approximation, but with y² replaced by 1/(1 - P). For example, for P = 95%, y² = 1.96² = 3.84,
while 1/(1 - P) = 1/0.05 = 20. Thus while Chebyshev's inequality holds regardless of the form of the
distribution, it is very conservative if the distribution is approximately Normal. For P = 95% it results
in a standard for full credibility 5.2 times as large.
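The comparison at the end of the comment can be made concrete. A minimal sketch, assuming Python's statistics module, contrasting the Chebyshev-based standard with the Normal-approximation standard for the same P and k:

    import statistics

    def standard_chebyshev(P, k, var_f_over_mean_f, cv_sev_sq):
        # N = (sigma_f^2/mu_f + sigma_s^2/mu_s^2) / (k^2 * (1 - P))
        return (var_f_over_mean_f + cv_sev_sq) / (k ** 2 * (1 - P))

    def standard_normal(P, k, var_f_over_mean_f, cv_sev_sq):
        y = statistics.NormalDist().inv_cdf((1 + P) / 2)
        return (y / k) ** 2 * (var_f_over_mean_f + cv_sev_sq)

    # Poisson frequency, CV_sev = 1, P = 95%, k = 5%:
    print(standard_chebyshev(0.95, 0.05, 1, 1))   # 16,000
    print(standard_normal(0.95, 0.05, 1, 1))      # about 3074; Chebyshev is roughly 5.2 times larger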

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 104
5.27. E. Using the formulas for the moments: CV² = E[X²]/E²[X] - 1 =
{2θ²/[(α - 1)(α - 2)]}/{θ/(α - 1)}² - 1 = 2(α - 1)/(α - 2) - 1 = α/(α - 2).
For α = 2.3, CV² = 2.3/0.3 = 7.667.
Therefore, (σf²/μf + σs²/μs²) / {k²(1 - P)} = (2.5 + 7.667) / {(0.075²)(1 - 0.9)} = 18,075.
Comment: Note how much larger the Standard for Full Credibility is than when using the Normal
Approximation as in a previous question.
5.28. B. nF = n0 (1+CV2 ) = (1200)(1 + 80000/4002 ) = (1200)(1.5) = 1800 claims.
5.29. D. k = 8%, P = 90%. Therefore y = 1.645, since Φ(1.645) = 0.95 = (1 + P)/2.
n0 = (y/k)² = (1.645/0.08)² = 423. CV = standard deviation/mean = 4000/1000 = 4.
nF = (σf²/μf + σs²/μs²){y²/k²} = n0(σf²/μf + CV²) = (423)(1.5 + 4²) = 7403.
5.30. C. For a Poisson frequency, the standard for full credibility for the pure premium is
n0 (1 + CV2 ), where CV is the coefficient of variation of the severity and n0 is the standard for full
credibility for frequency. Therefore in this case, 1800 = 1200(1 + CV2 ).
Therefore CV2 = (1800/1200) - 1 = 0.5. But the square of the coefficient of variation =
variance / mean2 . Therefore variance of severity = (0.5)(4002 ) = 80,000.
Comment: Given an output, you are asked to solve for the missing input.
Note that one makes no use of the given average frequency.
5.31. C. k = 0.08 and P = 0.90. y = 1.645, since Φ(1.645) = 0.95 = (1 + P)/2.
n0 = y²/k² = (1.645/0.08)² = 423.
The coefficient of variation of the severity = standard deviation/mean.
CV² = 362,944/1216² = 0.245.
Thus nF = n0(1 + CV²) = (423)(1.245) = 527.
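The mean of $1216 and variance of 362,944 used in 5.31 follow from treating each interval of the claim size table at its midpoint. A minimal sketch, under that midpoint assumption:

    # Interval midpoints and observed claim counts from problem 5.31.
    midpoints = [200, 600, 1000, 1400, 1800, 2200, 2600, 3000, 3400]
    counts    = [ 20, 240,  320,  210,  100,   60,   30,   10,   10]

    n = sum(counts)                                                   # 1000 claims
    mean = sum(m * c for m, c in zip(midpoints, counts)) / n          # 1216
    second = sum(m * m * c for m, c in zip(midpoints, counts)) / n
    variance = second - mean ** 2                                     # 362,944

    cv2 = variance / mean ** 2                                        # about 0.245
    n0 = (1.645 / 0.08) ** 2                                          # about 423
    print(mean, variance, n0 * (1 + cv2))                             # about 527 claims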

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 105
5.32. C. The mean of the severity distribution is 100,000/2 = 50,000.
The Second Moment of the Severity Distribution is the integral from 0 to 100,000 of x² f(x), which is
100,000³/{3(100,000)} = 100,000²/3. Thus the variance is: 100,000²/3 - 50,000² = 833,333,333.
Thus the square of the coefficient of variation is 833,333,333/50,000² = 1/3.
k = 5% (within 5%) and since P = 0.95, y = 1.960 since Φ(1.960) = (1 + P)/2 = 0.975.
The Standard for Full Credibility for the Pure Premium, nF = (y²/k²)(1 + CV²) =
(1.96/0.05)²(1 + 1/3) = (1537)(4/3) = 2049 claims.
Comment: For the uniform distribution on the interval (a, b), the coefficient of variation is:
(b - a) / {(b + a)√3}.
Thus CV² = (b - a)²/{(b + a)²(3)} = (100,000 - 0)²/{(100,000 + 0)²(3)} = 1/3.
Note that the CV² is 1/3 whenever a = 0.


5.33. C. k = 6% (within 6% of the expected total cost). Φ(1.645) = 0.95 = (1 + 0.90)/2, so that
y = 1.645. Standard for full credibility for frequency = n0 = y²/k² = (1.645/0.06)² = 751.7.
Coefficient of Variation of the severity = 7500/1500 = 5. Standard for full credibility for pure
premium = nF = n0(1 + CV²) = (751.7)(1 + 5²) = 19,544 claims.
5.34. B. For the given severity distribution the mean is:
∫ x f(x) dx over [0, 100] = (0.0002) ∫ x(100 - x) dx over [0, 100] = (0.0002)(50x² - x³/3) evaluated from 0 to 100 = 33.33.
For the given severity distribution the second moment is:
∫ x² f(x) dx over [0, 100] = (0.0002) ∫ x²(100 - x) dx over [0, 100] = (0.0002){(100/3)x³ - x⁴/4} evaluated from 0 to 100 = 1666.67.
Thus the variance of the severity is: 1666.67 - 33.33² = 555.8.
Coefficient of variation squared = CV² = 555.8/33.33² = 0.50.
For the given standard for full credibility for frequency, 800 = y²/k² = y²/0.05².
y² = (800)(0.05²) = 2.
Now for the same P value that produced this y value, we want a standard for full credibility for pure
premiums, with k = 0.10: {y²/k²}(1 + CV²) = {2/0.1²}(1 + 0.50) = (200)(1.5) = 300 claims.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 106
5.35. B. k = 10% (within 10% of the expected pure premium). Φ(1.960) = 0.975 = (1 + 0.95)/2,
so that y = 1.960.
Standard for full credibility for frequency = n0 = (y/k)² = (1.960/0.10)² = 384 claims.
Standard for full credibility for pure premium = nF = n0(1 + CV²).
Therefore CV² = (nF/n0) - 1 = (1000/384) - 1 = 1.604.
Thus CV = 1.27.
5.36. C. k = 5% (within 5% of the expected frequency). Φ(2.327) = 0.99 = (1 + 0.98)/2, so that
y = 2.327. Standard for full credibility for frequency = n0 = (y/k)² = (2.327/0.05)² = 2166 claims.
Now one has to start fresh and write down the formula for a standard for full credibility for the pure
premium, with a new y and k. Since Φ(1.960) = 0.975 = (1 + 0.95)/2, the new y = 1.960.
The standard for full credibility for pure premium = nF = n0(1 + CV²).
The mean severity is: ∫ x f(x) dx over [1, ∞) = ∫ x (5/2) x^(-7/2) dx = {(5/2)/(-3/2)} x^(-3/2) evaluated from 1 to ∞ = 5/3.
The 2nd moment is: ∫ x² (5/2) x^(-7/2) dx over [1, ∞) = {(5/2)/(-1/2)} x^(-1/2) evaluated from 1 to ∞ = 5.
Thus the variance = 5 - (5/3)² = 2.22.
The coefficient of variation is the standard deviation divided by the mean: √2.22 / (5/3) = 0.894.
We are given that this standard for full credibility for pure premium is equal to the previously
calculated standard for full credibility for frequency; thus
2166 = (1.960²/k²)(1 + 0.894²). Solving, the new k = 0.056.
Comment: The severity is a Single Parameter Pareto Distribution, with α = 2.5 and θ = 1.
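The single parameter Pareto moments and the solved-for k in 5.36 can be checked directly, using the closed-form moments E[X^n] = αθ^n/(α - n) for n < α. A minimal sketch (variable names are illustrative only):

    from math import sqrt

    # Single Parameter Pareto with alpha = 2.5, theta = 1.
    alpha = 2.5
    mean = alpha / (alpha - 1)             # 5/3
    second = alpha / (alpha - 2)           # 5
    cv = sqrt(second - mean ** 2) / mean   # about 0.894

    # Frequency standard with k = 5%, P = 98% (y = 2.327):
    n_freq = (2.327 / 0.05) ** 2           # about 2166 claims
    # Solve 2166 = (1.96/k)^2 * (1 + CV^2) for the pure premium tolerance k:
    k_new = 1.960 * sqrt((1 + cv ** 2) / n_freq)
    print(k_new)                           # about 0.056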

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 107
5.37. A. We are given k = 10%, P = 90%. Φ(y) = (P + 1)/2 = 95%. The Normal distribution has a
95% chance of being less than 1.645. Thus y = 1.645. n0 = (y/k)² = 271.
Claim Size    Probability    Square of Claim Size
     1            0.50                1
     2            0.30                4
    10            0.20              100
The mean severity is (1)(0.5) + (2)(0.3) + (10)(0.2) = 3.1, the second moment is
(1)(0.5) + (4)(0.3) + (100)(0.2) = 21.7, and the variance of the severity = 21.7 - 3.1² = 12.09.
Therefore the Square of the Coefficient of Variation = variance/mean² = 12.09/3.1² = 1.258.
Therefore the full credibility standard is: n0(1 + CV²) = (271)(1 + 1.258) = 612 claims.
5.38. B. nF = n0(1 + CV²) = (y²/k²)(1 + CV²). k = 10%. y = 1.960 since Φ(1.960) = 0.975 =
(P + 1)/2 = 1.90/2. Thus 3025 = (1.96/0.1)²(1 + CV²). Therefore: 7.87 = 1 + CV². CV = 2.62.
5.39. D. 1. False. The correct formula contains the square of the mean severity:
Var(C) = E(N) Var(X) + E(X)² Var(N).
2. True. Using the fact that the Coefficient of Variation is the standard deviation over the mean:
n0 {E(X)² + Var(X)}/E(X)² = n0 {1 + Var(X)/E(X)²} = n0 [1 + CV²] = nF.
3. True. The square root rule for partial credibility used in Classical Credibility.
Comment: Statement 3 is only true for n ≤ nF. For n ≥ nF, Z = 1.
5.40. E. The Normal distribution has a (1 + P)/2 = (1 + 0.95)/2 = 97.5% chance of being less than
1.960. Thus y = 1.960. Therefore in terms of number of claims the full credibility standard is:
y²(1 + CV²)/k² = (1.96²)(1 + 4)/k² = 3415 claims.
Therefore k = (1.96)√(5/3415) = 0.075.
Comment: You are given the output, 3415 claims, and asked to solve for the missing input, k.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 108
5.41. C. We are given k = 5%. Φ(y) = (1 + P)/2 = (1 + 0.90)/2 = 0.95, therefore y = 1.645.
n0 = y²/k² = 1.645²/0.05² = 1082 claims.
The given severity distribution is a Single Parameter Pareto, with α = 5 and θ = 1.
First moment is: αθ/(α - 1) = 5/4.
Second moment is: αθ²/(α - 2) = 5/3.
1 + CV² = E[X²]/E[X]² = (5/3)/(5/4)² = 16/15.
Therefore, the standard for full credibility for the Pure Premium is:
n0(1 + CV²) = (1082)(16/15) = 1154 claims.
Next the problem states that this is also a full credibility standard for frequency.
In this case, Φ(y) = (1 + 0.95)/2 = 0.975, therefore y = 1.960.
Thus setting 1154 claims = y²/k² = 1.96²/k², one solves for k = 0.0577.
5.42. B. Let the standard deviation of the severity distribution, for which we will solve, be σ.
The Classical Credibility Standard for the Pure Premium is given by:
nF = n0(1 + CV²). CV² = σ²/5000². n0 = (y/k)² = (1.96/0.05)² = 1537 claims.
One must now translate n0 into exposures, since that is the manner in which the full credibility criterion
is stated in this problem. One does so by dividing by the expected frequency, which is the fitted
Poisson parameter m.
Using the method of moments, m = (observed # of claims)/(observed # of exposures) =
(1293 + 1592 + 1418)/(18,467 + 26,531 + 20,002) = 4303/65,000 = 0.0662.
Thus n0 in terms of exposures is:
1537 claims / (0.0662 claims/exposure) = 23,218 exposures.
Now one sets the given criterion for full credibility equal to its calculated value:
120,000 = 23,218(1 + σ²/5000²). Solving, σ = $10,208.
Comment: Assuming a Poisson frequency with parameter 0.0662, a severity distribution with a
mean of $5000 and a standard deviation of $10,208, how many exposures are needed for full
credibility if we want the actual total cost of claims to be within 5% of the expected total 95% of the
time? The solution to this alternate question is:
(1537 claims){1 + (10,208/5000)²}/(0.0662 claims per exposure) ≈ 120,000 exposures.
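The back-solving for σ in 5.42 can be reproduced end to end. A minimal sketch, assuming only Python's math module:

    from math import sqrt

    # Estimated Poisson mean per exposure, by the method of moments.
    claims = 1293 + 1592 + 1418
    exposures = 18467 + 26531 + 20002
    m = claims / exposures                      # about 0.0662

    n0 = (1.960 / 0.05) ** 2                    # about 1537 claims
    n0_exposures = n0 / m                       # about 23,200 exposures

    # 120,000 = n0_exposures * (1 + sigma^2 / 5000^2); solve for sigma.
    sigma = 5000 * sqrt(120000 / n0_exposures - 1)
    print(m, sigma)                             # sigma is roughly $10,200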

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 109
5.43. D. nF = n0{1 + square of coefficient of variation of severity}.
n0 = y²/k². k = 10%. P = 98%. y = 2.326.
For the Pareto, mean = θ/(α - 1) = 3000, and the second moment = 2θ²/{(α - 1)(α - 2)} = 36 million.
1 + CV² = E[X²]/E[X]² = 36 million/3000² = 4. Thus nF = (2.326/0.1)²(4) = 2164.
5.44. C. k = 0.10 (within 10% of the expected).
y = 1.645 since Φ(1.645) = 0.95 (to be 90% certain, allow 5% outside on each tail).
n0 = y²/k² = 1.645²/0.1² = 271. Coefficient of Variation² = Variance/mean² = 10/25 = 0.4.
nF = n0(1 + CV²) = (271)(1.4) = 379 claims.
5.45. D. k = 0.075 (within 7.5% of the expected).
y = 1.960 since Φ(1.960) = 0.975 (to be 95% certain, allow 2.5% outside on each tail).
n0 = y²/k² = 1.960²/0.075² = 683. 1 + CV² = second moment/mean² =
exp(2μ + 2σ²)/exp(2μ + σ²) = exp(σ²) = e^2.25 = 9.49.
nF = n0(1 + CV²) = (683)(9.49) = 6482 claims.
But we are given that the full credibility criterion with respect to exposures is 40,000.
To convert to claims we multiply by the mean claim frequency.
Therefore nF = 40,000m. Therefore m = 6482/40,000 = 16.2%.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 110
5.46. A. n0 = y²/k² = y²/0.05² = 400y². For the full credibility standard for pure premiums (claim
costs) we need to compute the coefficient of variation. For a Pareto with θ = 3000 and
α = 4, the second moment is: 2(3000²)/{(4 - 1)(4 - 2)}, while the mean is: 3000/(4 - 1).
Thus 1 + CV² = second moment/mean² = 2(α - 1)/(α - 2) = 2(3)/2 = 3.
Therefore, the standard for full credibility is: n0(1 + CV²) = 400y²(3) = 1200y².
Setting this equal to the given 2000 claims we solve for y: y = √(2000/1200) = 1.291.
One then needs to compute how much probability is within 1.291 standard deviations on the
Normal Distribution. Φ(1.291) = 0.9017.
Therefore, P = 1 - (2)(1 - 0.9017) = 0.803. (9.83% is outside on each tail.)
Comment: If one had been given P = 80.3% and were asked to solve for the standard for full
credibility, then we would want 0.0985 outside on either tail, so we want Φ(y) = 0.9015.
Thus y ≈ 1.29 and the standard for full credibility ≈ (3)(1.29²)/(0.05²) ≈ 2000.
5.47. C. For the Gamma Distribution the Coefficient of Variation = 1/√α = 1.
We are given k = 5% and P = 90%. Φ(1.645) = 0.95 = (1 + P)/2. y = 1.645.
Therefore n0 = y²/k² = (1.645/0.05)² = 1082. nF = n0(1 + CV²) = 1082(1 + 1²) = 2164.
Comment: The Gamma Distribution for α = 1 is an Exponential Distribution,
with Coefficient of Variation of 1 (and Skewness of 2).
5.48. C. nF = (y²/k²)(1 + CV²), so nF is proportional to y².
For P = 90%, Φ(1.645) = 0.95 = (1 + P)/2. y = 1.645.
For P = 95%, y = 1.960, since Φ(1.960) = 0.975 = (1 + 0.95)/2.
Thus the new criterion for full credibility = (1000)(1.960/1.645)² = 1420.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 111
5.49. B. P = 0.95 and k = 0.1. Φ(1.960) = 0.975 = (1 + P)/2, so that y = 1.960.
n0 = (y/k)² = (1.960/0.10)² = 384.
The mean severity is 4 and so is the variance, since it follows a Poisson Distribution. Thus the square
of the coefficient of variation of the severity = CV² = variance/mean² = 4/4² = 1/4.
Therefore nF = n0(1 + CV²) = (384)(1 + 1/4) = 480.
Comment: It's unusual to have severity follow a Poisson Distribution. This situation is mathematically
equivalent to a Poisson-Poisson compound frequency distribution.
5.50. C. nF = (y²/k²)(1 + CV²). If the CV goes from 2 to 4, and k doubles, then the Standard for Full
Credibility is multiplied by: {(1 + 4²)/(1 + 2²)}/2² = (17/5)/4.
Thus the Standard for Full Credibility is altered to: (1200)(17/5)/4 = 1020.
Comment: If k doubled and the CV stayed the same, then the Standard for Full Credibility would
be altered to: (1200)/4 = 300. If k stayed the same and the CV went from 2 to 4, then the
Standard for Full Credibility would be altered to: (1200){(1 + 4²)/(1 + 2²)} = 4080.
5.51. E. For the Lognormal, Mean = exp(μ + σ²/2), 2nd Moment = exp(2μ + 2σ²),
1 + square of coefficient of variation = 2nd moment/mean² = exp(σ²) = e¹ = 2.718.
k = 0.1. P = 95%, so that y = 1.96 since Φ(1.96) = 0.975 = (1 + P)/2.
Thus n0 = y²/k² = 384. Thus the number of claims needed for full credibility of the pure premium is:
n0(1 + CV²) = 384(2.718) = 1044 claims.
To convert the full credibility standard to exposures, divide by the expected frequency of 0.01:
1044/0.01 = 104.4 thousand exposures.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 112
5.52. E. k = 5% (within 5%), P = 95% (95% of the time). Φ(y) = (1 + P)/2 = 0.975, thus y = 1.960.
The Inverse Gamma has: E[X] = θ/(α - 1). E[X²] = θ²/{(α - 1)(α - 2)}.
nF = n0(1 + CV²) = (y/k)²(E[X²]/E²[X]) = (1.96/0.05)²{(α - 1)/(α - 2)} = (1537)(3/2) ≈ 2306 claims.
Comment: In this case, CV² = 1/2. For the Inverse Gamma Distribution, CV² = 1/(α - 2).
5.53. C. The Negative Binomial has a larger variance than the Poisson, so there is more random
fluctuation, and therefore the standard for Full Credibility is larger. Specifically, one can derive a more
general formula for when the Poisson assumption does not apply. The Standard for Full
Credibility is: {y²/k²}(σf²/μf + σs²/μs²), which reduces to the Poisson case when σf²/μf = 1.
For the Negative Binomial the variance is greater than the mean, so σf²/μf > 1. Thus for the Negative
Binomial the standard for Full Credibility is larger than in the Poisson case, all else equal.
Comment: For example, assume one had in the previous question, instead of a Poisson, a
Negative Binomial frequency distribution with parameters β = 2.5 and r = 3. Then σf²/μf =
rβ(1 + β)/(rβ) = 1 + β = 3.5. Thus, the Standard for Full Credibility would have been:
{y²/k²}(σf²/μf + σs²/μs²) = (1537)(3.5 + 0.5) ≈ 6148 claims, rather than about 2306 claims.
5.54. A. Frequency is Poisson and therefore μf = σf². σPP² = μs²σf² + μf σs² =
μf(μs² + σs²). Thus 500 = 10(μs² + 10). Therefore, μs² = 40. CV² = σs²/μs² = 10/40 = 0.25.
k = 0.05. Φ(y) = (1 + P)/2 = (1 + 0.95)/2 = 0.975. Thus y = 1.96. n0 = (y/k)² = (1.96/0.05)² = 1537.
nF = n0(1 + CV²) = 1537(1 + 0.25) = 1921.
5.55. D. For the Burr Distribution, E[X^n] = θ^n Γ(1 + n/γ) Γ(α - n/γ) / Γ(α).
For α = 6 and γ = 0.5, E[X] = θ Γ(1 + 2) Γ(6 - 2)/Γ(6) = θ (2!)(3!)/5! = θ/10.
E[X²] = θ² Γ(1 + 4) Γ(6 - 4)/Γ(6) = θ² (4!)(1!)/5! = θ²/5.
(1 + CV²) = E[X²]/E²[X] = (θ²/5)/(θ/10)² = 100/5 = 20.
k = 0.10. nF = n0(1 + CV²) = (y²/k²)(1 + CV²). Since we are given that nF = 6000:
6000 = (y/0.1)²(20). y = 0.1√(6000/20) = 0.1√300 = √3 = 1.732.
(1 + P)/2 = Φ(y) = Φ(1.732) = 0.9584. Thus P = 0.917.

2013-4-8, Classical Credibility 5 Full Credibility Aggregate Loss, HCM 10/16/12, Page 113
5.56. E. nF = n0 (1+CV2 ) = (1000)(1 + 22 ) = 5000.
5.57. C. For the LogNormal Distribution, 1 + CV² = (2nd moment)/mean² =
exp(2μ + 2σ²)/exp(μ + 0.5σ²)² = exp(σ²).
k = 5%, P = 90%. We have y = 1.645, since Φ(1.645) = 0.95 = (1 + P)/2.
Therefore, n0 = (y/k)² = (1.645/0.05)² = 1082.
We are given nF = 13,000. But nF = n0(1 + CV²). Thus 13,000 = 1082(1 + CV²). 1 + CV² = 12.01.
Therefore, 12.01 = exp(σ²). σ = √ln(12.01) = 1.577.

5.58. E. The mean frequency is: 750/1000 = 0.75.
The variance of the frequency is: 1494/1000 - 0.75² = 0.9315.
CVSev² = 6,750,000/1500² = 3. k = 5%. P = 95%. y = 1.960. n0 = y²/k² = 1.960²/0.05² = 1537.
Standard for full credibility = n0(σF²/μF + CVSev²) = (1537)(0.9315/0.75 + 3) = 6520 claims.
6520 claims corresponds to 6520/0.75 = 8693 exposures.
Alternately, Standard for full credibility in terms of exposures =
n0 (coefficient of variation of the pure premium)² =
(1537)(variance of the pure premium)/(mean pure premium)² =
(1537){(0.75)(6,750,000) + (1500²)(0.9315)} / {(0.75)(1500)}² =
(1537)(7.1584 million)/1125² = 8693 exposures.
Comment: Items (i) and (ii) are not needed to answer the question, although they do imply that the
frequency for the whole portfolio is Negative Binomial. Therefore the factor σF²/μF should be greater
than 1. That the severity is Pareto is also not used to answer the question, although one can infer that
α = 3 and θ = 3000.
5.59. B. For the Gamma-Poisson, the mixed distribution is Negative Binomial, with r = α = 15 and
β = θ = 0.2. Therefore, for frequency, σf²/μf = rβ(1 + β)/(rβ) = 1 + β = 1.2.
For the Exponential Distribution, σs²/μs² = θ²/θ² = 1. For P = 0.90, y = 1.645. k = 0.05.
{y²/k²}(σf²/μf + σs²/μs²) = (1.645/0.05)²(1.2 + 1) = 2381 claims.
Comment: We use the Negative Binomial Distribution for the whole portfolio of insureds, in order to
compute the standard for full credibility, thereby taking into account the larger random fluctuation of
results due to the heterogeneity of the portfolio.

5.52. E. k = 0.02. P = 90%. y = 1.645. n0 = y²/k² = (1.645/0.02)² = 6765 claims.
For the Pareto, E[X] = θ/(α-1) = 0.5/5 = 0.1, E[X²] = 2θ²/{(α-1)(α-2)} = (2)(0.5²)/{(6-1)(6-2)} = 0.025,
and 1 + CV² = E[X²]/E[X]² = 0.025/0.1² = 2.5.
Standard for Full Credibility for pure premium when frequency is Poisson =
n0 (1 + CVSev²) = (6765)(2.5) = 16,913 claims.
Comment: For the Pareto Distribution, CV² = α/(α-2) = 6/4 = 1.5.
5.61. E. σf²/μf = rβ(1 + β)/(rβ) = 1 + β = 4.
E[X] = 24.4. E[X²] = 2040.4. CVSev² = 2040.4/24.4² - 1 = 2.427.
k = 10%. P = 95%. y = 1.960. n0 = (1.960/0.1)² = 384.
The standard for full credibility for aggregate losses is: (4 + 2.427)(384) = 2468 claims.
5.62. E. Since we have a Poisson frequency, the standard for full credibility is
n0 (1 + CVSev²). Thus we need to determine the coefficient of variation of severity.
Mean = αθ. Variance = αθ². CV² = αθ²/(αθ)² = 1/α, which cannot be determined.
Comment: One need only know α in order to determine the coefficient of variation of the Gamma
Distribution, as in 4B, 5/96, Q.27. P = 95%. y = 1.960. k = 10%. n0 = y²/k² = 384.
5.63. B. From the standard for frequency, 2000 = y²/0.03². y² = 1.8.
For the uniform severity: CV² = variance/mean² = (10000²/12)/(10000/2)² = 1/3.
Standard for Aggregate Losses is: n0 (1 + CV²) = (1.8/0.05²)(1 + 1/3) = 960 claims.


Section 6, Partial Credibility


When one has at least the number of claims needed for Full Credibility, then one assigns
100% credibility to the observed data. However, when one has less data than is needed for full
credibility, one assigns an amount of Credibility less than 100%.
If the Standard for Full Credibility is 683 claims and one has only 300 claims, then one assigns less
than full credibility to this data. How much less is determined via the square root rule.46
Let n be the (expected) number of claims for the volume of data, and nF be the standard
for Full Credibility for the pure premium or aggregate losses. Then the partial credibility
assigned is Z = √(n/nF). When dealing with frequency or severity a similar formula applies.

Unless stated otherwise assume that for Classical Credibility the partial credibility is given by
this square root rule.47 Use the square root rule for partial credibility for either frequency, severity,
pure premiums, or aggregate losses.
For example if 1000 claims are needed for full credibility for frequency, then the following credibilities
would be assigned:
Expected # of Claims      Credibility
          1                    3%
         10                   10%
         25                   16%
         50                   22%
        100                   32%
        200                   45%
        300                   55%
        400                   63%
        500                   71%
        600                   77%
        700                   84%
        800                   89%
        900                   95%
       1000                  100%
       1500                  100%

46 In some practical applications an exponent other than 1/2 is used. For example, in Workers Compensation
classification ratemaking, an exponent of 0.4 is used by the NCCI.
47 In contrast, for Buhlmann / Greatest Accuracy Credibility, Z = N/(N+K) for K equal to the Buhlmann Credibility
parameter. There is no Standard for Full Credibility for Buhlmann Credibility.
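As a quick numeric check of the table above, here is a minimal Python sketch (an illustration of my own; the function name is not from the text) that applies the square root rule with a 1000-claim standard:

import math

def partial_credibility(n, n_full):
    # Square root rule: Z = sqrt(n / n_full), capped at 100%.
    return min(1.0, math.sqrt(n / n_full))

# Reproduce the table above for a full credibility standard of 1000 claims.
for n in [1, 10, 25, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500]:
    print(f"{n:5d} expected claims -> Z = {partial_credibility(n, 1000):.0%}")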


Exercise: The Standard for Full Credibility is 683 claims and one has observed 300 claims.
How much credibility is assigned to this data?
[Solution: Z = √(300/683) = 66.3%.]

Exercise: The Standard for Full Credibility is 683 claims and one has observed 2000 claims.
How much credibility is assigned to this data?
[Solution: 100%. When the volume of data is greater than (or equal to) Standard for Full Credibility,
one assigns 100% credibility to the data.]
When available, one generally uses the number of exposures or the expected number of
claims in the square root rule, rather than the observed number of claims.48
Make sure that in the square root rule you divide comparable quantities:
Z = √(number of claims / standard for full credibility in terms of claims), or
Z = √(number of exposures / standard for full credibility in terms of exposures).

Exercise: Prior to observing any data, you assume that the claim frequency rate per exposure has
mean = 0.25. The Standard for Full Credibility for frequency is 683 claims.
One has observed 300 claims on 1000 exposures.
Estimate the number of claims you expect for these 1000 exposures next year.
[Solution: The expected number of claims on 1000 exposures is: (1000)(0.25) = 250.
Z = √(250/683) = 60.5%.
Alternately, a standard of 683 claims corresponds to 683/0.25 = 2732 exposures.
Z = √(1000/2732) = 60.5%.
In either case, the estimated future frequency = (60.5%)(0.30) + (1 - 60.5%)(0.25) = 0.280.
(1000)(0.280) = 280 claims.]

48 See Credibility by Mahler and Dean, page 29.
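The exercise above can also be verified in a few lines of Python. This is a sketch of my own (the variable names are assumptions, not from the text); it shows that using expected claims or exposures in the square root rule gives the same credibility, and then blends the observation with the prior mean:

import math

prior_freq = 0.25        # assumed mean claim frequency per exposure
std_claims = 683         # standard for full credibility, in claims
exposures = 1000
observed_claims = 300

expected_claims = prior_freq * exposures             # 250
std_exposures = std_claims / prior_freq              # 2732 exposures

z = math.sqrt(expected_claims / std_claims)          # 0.605
z_alt = math.sqrt(exposures / std_exposures)         # 0.605, the same value

observed_freq = observed_claims / exposures          # 0.30
est_freq = z * observed_freq + (1 - z) * prior_freq  # about 0.280
print(round(z, 3), round(z_alt, 3), round(est_freq * exposures))  # 0.605 0.605 280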


Limiting Fluctuations:
For example, assume that the mean frequency per exposure is 2%, and we have 50,000
exposures. Then the expected number of claims is: (2%)(50,000) = 1000.
If frequency is Poisson, then the variance of the number of claims from a single exposure is 2%.
The variance of the average frequency for the portfolio of 50,000 exposures is: 2%/50,000.49
The standard deviation of the observed claim frequency is: √(2%/50,000) = 0.000632.
If instead of 50,000 exposures one had only 5000 exposures, then the expected number of claims
is: (2%)(5,000) = 100. The standard deviation of the estimated frequency is: √(2%/5000) = 0.002.
With only 5000 rather than 50,000 exposures, there would be considerably more fluctuation in the
observed claim frequency.

49 The variance of an average is the variance of a single draw divided by the number of items being averaged.


Below are shown 100 random simulations of the claim frequency for 50,000 exposures with a
Poisson parameter λ = 0.02, for 1000 expected claims:
[Figure: 100 simulated claim frequencies plotted against trial number; the vertical axis runs from about 0.014 to 0.024.]

Below are shown 100 random simulations of the claim frequency for 5,000 exposures with a
Poisson parameter λ = 0.02, for 100 expected claims:
[Figure: 100 simulated claim frequencies plotted against trial number; the vertical axis runs from about 0.014 to 0.024, with much wider scatter than in the previous exhibit.]

With only 100 expected claims, there is much more random fluctuation in the observed claim
frequency, than with 1000 expected claims.
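The two exhibits above can be reproduced with a short simulation. Here is a Python sketch of my own (it uses numpy; the seed and the 100 trials are arbitrary choices, not from the text):

import numpy as np

rng = np.random.default_rng(seed=1)
freq = 0.02  # Poisson mean claim frequency per exposure

for exposures in (50_000, 5_000):
    # Total claims in each of 100 simulated years, converted to an observed frequency.
    claims = rng.poisson(freq * exposures, size=100)
    observed = claims / exposures
    print(f"{exposures:6d} exposures ({freq * exposures:.0f} expected claims): "
          f"observed frequencies from {observed.min():.4f} to {observed.max():.4f}")

The 5,000-exposure portfolio shows roughly three times the spread of the 50,000-exposure portfolio, consistent with the standard deviations of 0.002 and 0.000632 computed above.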


Let us now assume that the standard for full credibility for estimating frequency is chosen as 1000
expected claims.50 Then if we had 50,000 exposures and 1000 expected claims, we would give the
observed frequency a credibility of one; we would rely totally on the observed frequency to
estimate the future frequency. As discussed previously, for this amount of data the standard
deviation of the observed claim frequency is: 0.000632. This is also the standard deviation of the
estimated claim frequency. Thus the chosen standard for full credibility results in a standard deviation
of the estimated claim frequency of 0.000632.51
If we had only 5000 exposures and 100 expected claims, then as discussed previously, the
standard deviation of the observed claim frequency is: 0.002. If we were to rely totally on the
observed frequency to estimate the future frequency, then the standard deviation of that estimate
would be much larger than desired.
However, with only 100 expected claims, in estimating the future frequency we multiply the
observation by Z = √(100/1000) = 31.6%. The standard deviation of Z times the observation is:

(0.316)(0.002) = 0.000632. This is the same standard deviation as when we had full credibility.
Therefore, using credibility, the fluctuation in the estimated frequency due to the fluctuations in the
data will be the same.
To reiterate, the standard deviation of the observed claim frequency is larger for 100 expected
claims than for 1000 claims. If one uses 1000 claims as the Standard for Full Credibility, then the
credibility assigned to 100 expected claims is the ratio of the standard deviation with 1000 expected
claims to the standard deviation with 100 expected claims.
In this case, Z = 0.000632/0.002 = 31.6% = √(100/1000).

This concept is shown below for two Normal Distributions, approximating Poisson frequency
Distributions, one with mean 1000 and variance 1000 (solid curve) and the other with mean 100 and
variance 100 (dotted curve).

50 This would have been based on some choice of P and k, as discussed previously.
51 If we had more than 50,000 exposures, the standard deviation of the estimated claim frequency would be less.


The x-axis is the number of claims / mean number of claims.

[Figure: the two scaled Normal densities plotted against (number of claims)/(mean number of claims), over roughly 0.7 to 1.3, with an arrow on each curve marking plus or minus one coefficient of variation.]

Each arrow is plus or minus one coefficient of variation, since each curve has been scaled in terms of
its mean number of claims. With a full credibility standard of 1000 claims, the partial credibility for 100
expected claims is the ratio of the lengths of the arrows: Z = 0.0316/0.100 = 31.6%.
The credibilities are inversely proportional to the standard deviations of the observed frequencies.
In general, the partial credibility assigned to n claims for n ≤ nF will be the ratio of the standard
deviation with nF expected claims to the standard deviation with n expected claims.
This ratio will be such that Z = √(n/nF).
The standard deviation of Z times the observation will be that for full credibility:
VAR[(Z)(observation)] = Z² VAR[observation] = (n/nF)(σ²/n) = σ²/nF. Thus the random fluctuation in the
estimate that is due to the contribution of Z times the observation has been limited to that which was
deemed acceptable when the Standard for Full Credibility was determined. This is why the term
Limited Fluctuation Credibility is sometimes used to describe Classical Credibility.


The square root rule for partial credibility is designed so that when one has less data than the
standard for the full credibility, the weight given the observation is such that the standard deviation of
the estimate of the future has the same value it would have had if instead we had an amount of data
equal to the standard for full credibility.

Deriving the Square Root Rule:52


Let nF be such that when the observed pure premium Xfull is based on nF claims:
P = Prob[μ - kμ ≤ Xfull ≤ μ + kμ] = Prob[-kμ/σfull ≤ (Xfull - μ)/σfull ≤ kμ/σfull].
In this case, our estimate = Xfull.
Let Xpartial be the observed pure premium based on n claims, with n < nF.
In this case, our estimate = ZXpartial + (1-Z)Y, where Y is other information.
We desire to limit the fluctuation in this estimate due to the term ZXpartial.
We desire ZXpartial to have a large probability of being close to Zμ:
P = Prob[Zμ - kμ ≤ ZXpartial ≤ Zμ + kμ]
  = Prob[-kμ/(Zσpartial) ≤ (Xpartial - μ)/σpartial ≤ kμ/(Zσpartial)].
Assuming both (Xfull - μ)/σfull and (Xpartial - μ)/σpartial are approximately Standard Normals,
comparing the two requirements, in order to make both probabilities P, we require that
kμ/σfull = kμ/(Zσpartial). Therefore Z = σfull/σpartial.
However, the standard deviation of an average goes down as the inverse of the square root of the
amount of data. Therefore, σfull/σpartial = √(1/nF) / √(1/n) = √(n/nF). Thus Z = √(n/nF).

52 See pages 514-515 of Mahler and Dean.
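As a numeric check of this derivation, here is a sketch of my own (reusing the 2% frequency example from earlier in this section): the standard deviation of Z times a partially credible observation equals the standard deviation of a fully credible observation.

import math

freq = 0.02      # Poisson claim frequency per exposure
n_full = 1000    # standard for full credibility, in expected claims

def sd_of_observed_frequency(expected_claims):
    exposures = expected_claims / freq
    return math.sqrt(freq / exposures)

sd_full = sd_of_observed_frequency(n_full)       # 0.000632
n = 100                                          # a partially credible volume
z = math.sqrt(n / n_full)                        # 0.316
print(sd_full, z * sd_of_observed_frequency(n))  # both about 0.000632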


Comparing Different Standards for Full Credibility:


The credibilities assigned to various numbers of claims under either a Standard for Full Credibility of
2500 claims (dashed) or 1000 claims (solid) are shown below.
[Figure: credibility Z versus number of claims (0 to 3000), under full credibility standards of 1000 claims (solid) and 2500 claims (dashed).]

For large volumes of data the credibility is 100% under either Standard. For smaller volumes of data,
more credibility is assigned when using a Standard for Full Credibility of 1000 claims rather than
2500 claims. The differences in the amount of credibility assigned using these two different
Standards for Full Credibility of 1000 and 2500 claims are:
[Figure: difference in credibility between the 1000-claim and 2500-claim standards, versus number of claims (0 to 3000).]

For smaller volumes of data there is as much as a 37% difference in the credibilities depending on
the Standard for Full Credibility. Nevertheless, even for the criteria differing by a factor of 2.5, the
credibilities assigned to most volumes of data are not that dissimilar.53
53 Rounding Standards for Full Credibility to a whole number of claims should be more than sufficient.


Classical Credibility vs. Buhlmann Credibility:


Below the Classical Credibility formula for credibility with 2500 claims for Full Credibility (dashed
curve) is compared to one from Buhlmann Credibility (solid curve): Z = N/(N + 350).54
[Figure: credibility versus number of claims (0 to 5000), under the Classical formula with a 2500-claim standard (dashed) and the Buhlmann formula Z = N/(N + 350) (solid).]

One important distinction is that as the volume of data increases the Buhlmann Credibility
approaches but never quite attains 100% credibility.55

54 Z = N/(N+K) for K equal to the Buhlmann Credibility parameter. In this example, K = 350.
See Mahler's Guide to Buhlmann Credibility.
55 However, the credibilities produced by these two formulas are relatively similar. Generally this will be true provided
the Standard for Full Credibility is about 7 or 8 times the Buhlmann Credibility Parameter.
See "An Actuarial Note on Credibility Parameters," by Howard Mahler, PCAS 1986.


Here is the difference in the credibilities produced by these two formulas:

[Figure: Classical credibility minus Buhlmann credibility versus number of claims (0 to 5000); the difference stays within roughly plus or minus 0.1.]
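The two curves and their difference can be reproduced with a small Python sketch of my own:

import math

def z_classical(n, n_full=2500):
    return min(1.0, math.sqrt(n / n_full))

def z_buhlmann(n, k=350):
    return n / (n + k)

for n in (100, 500, 1000, 2000, 2500, 5000):
    zc, zb = z_classical(n), z_buhlmann(n)
    print(f"{n:5d} claims: classical {zc:.3f}, Buhlmann {zb:.3f}, difference {zc - zb:+.3f}")

Only the Classical formula ever reaches 100%; the Buhlmann formula approaches it asymptotically.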


Problems:
6.1 (1 point) The Standard for Full Credibility is 1500 claims.
How much credibility is assigned to 200 claims?
A. less than 0.2
B. at least 0.2 but less than 0.3
C. at least 0.3 but less than 0.4
D. at least 0.4 but less than 0.5
E. at least 0.5
6.2 (1 point) The 1996 pure premium underlying the rate equals $1,000. The loss experience is
such that the actual pure premium for that year equals $1,200 and the number of claims equals 400.
If 8000 claims are needed for full credibility and the square root rule for partial credibility is used,
estimate the pure premium underlying the rate in 1997.
(Assume no change in the pure premium due to inflation.)
A. Less than $1,020
B. At least $1,020, but less than $1,030
C. At least $1,030, but less than $1,040
D. At least $1,040, but less than $1,050
E. $1,050 or more
6.3 (1 point) Using the square root rule for partial credibility a certain volume of data is assigned
credibility of 0.26.
How much credibility would be assigned to 20 times that volume of data?
A. less than 0.5
B. at least 0.5 but less than 0.7
C. at least 0.7 but less than 0.9
D. at least 0.9 but less than 1.1
E. at least 1.1
6.4 (2 points) Assume a Standard for Full Credibility for severity of 2000 claims.
Assume that for the class of Plumbers one has observed 513 claims totaling $4,771,000.
Assume the average cost per claim for all similar classes is $10,300.
What is the estimated average cost per claim for the Plumbers class?
A. less than 9600
B. at least 9600 but less than 9650
C. at least 9650 but less than 9700
D. at least 9700 but less than 9750
E. at least 9750


6.5 (1 point) The Standard for Full Credibility is 4500 claims. The expected claim frequency is 4%
per house-year. How much credibility is assigned to 5000 house-years of data?
A. less than 0.2
B. at least 0.2 but less than 0.3
C. at least 0.3 but less than 0.4
D. at least 0.4 but less than 0.5
E. at least 0.5
6.6 (2 points) You are given the following information:

Frequency is Poisson

Severity follows a Gamma Distribution with = 2.5.

Frequency and Severity are Independent.

Full credibility is defined as having a 98% probability of being within plus or minus 6%

of the true pure premium.


What credibility is assigned to 200 claims?
A. less than 0.32
B. at least 0.32 but less than 0.34
C. at least 0.34 but less than 0.36
D. at least 0.36 but less than 0.38
E. at least 0.38
6.7 (3 points) You are given the following:
Prior to observing any data, you assume that the claim frequency rate per exposure has
mean = 0.05 and variance = 0.15.
A full credibility standard is devised that requires the observed sample frequency rate per exposure
to be within 3% of the expected population frequency rate per exposure 98% of the time.
You observe 9021 claims on 200,000 exposures.
Estimate the number of claims you expect for these 200,000 exposures next year.
A. less than 9200
B. at least 9200 but less than 9300
C. at least 9300 but less than 9400
D. at least 9400 but less than 9500
E. at least 9500
6.8 (1 point) The standard for full credibility is 1000 exposures.
For how many exposures would Z = 40%?


6.9 (3 points) You are given the following:

The number of claims follows a Poisson distribution.

The variance of the pure premium distribution is 100.

The a priori estimate of the mean pure premium is 6.

The full credibility standard has been selected so that the estimated pure premiums will be
within 2.5% of their expected value 80% of the time.

You observe $3,200 of losses for 800 exposures.


Using the methods of classical credibility, estimate the future pure premium.
A. Less than 5.0
B. At least 5.0, but less than 5.2
C. At least 5.2, but less than 5.4
D. At least 5.4, but less than 5.6
E. At least 5.6
Use the following information for the next two questions:

The number of claims follows a Poisson distribution.


Claim sizes follow an exponential distribution.
The number of claims and claim sizes are independent.
Credibility is assigned to the observed data using the concepts of classical credibility.

6.10 (2 points) If one were estimating the future frequency, the volume of data observed would be
assigned 60% credibility. Assume the same value of k and P are used to determine the Full
Credibility Criterion for frequency and pure premiums. How much credibility would be assigned to
this same volume of data for estimating the future pure premium?
A. Less than 45%
B. At least 45%, but less than 50%
C. At least 50%, but less than 55%
D. At least 55%
E. Cannot be determined from the given information.
6.11 (2 points) If one were estimating the future frequency, the volume of data observed would be
assigned 100% credibility. Assume the same value of k and P are used to determine the Full
Credibility Criterion for frequency and pure premiums. How much credibility would be assigned to
this same volume of data for estimating the future pure premium?
A. Less than 85%
B. At least 85%, but less than 90%
C. At least 90%, but less than 95%
D. At least 95%
E. Cannot be determined from the given information.


6.12 (3 points) You are given the following:

The number of claims follows a Poisson distribution.

The number of claims and claim sizes are independent.


Credibility is assigned to the observed data using the concepts of classical credibility.
The estimated pure premium is to be within 10% of its expected value 95% of the time.
You observe the following data:
Year:              1     2     3     4
Dollars of Loss:  200   150   230   180
There is no inflation.
There is no change in exposure.
The current manual premium contains a provision for losses of 210.


Estimate the future annual losses.
A. Less than 197
B. At least 197, but less than 199
C. At least 199, but less than 201
D. At least 201, but less than 203
E. At least 203
6.13 (3 points) You are given:

Claim counts follow a Poisson distribution.


Claim sizes have a coefficient of variation squared of 1/2.
Claim sizes and claim counts are independent.
The number of claims in 2001 was 810.
The aggregate loss in 2001 was $1,134,000.
The manual premium for 2001 was $1.6 million.
The expected loss ratio underlying the manual rates is 80%. (The expected aggregate
losses are 80% of manual premiums.)

The exposure in 2002 is 12% more than the exposure in 2001.


The full credibility standard is to be within 2.5% of the expected aggregate loss 90% of the time.
Estimate the aggregate losses (in millions) for 2002.
(A) Less than 1.25
(B) At least 1.25, but less than 1.30
(C) At least 1.30, but less than 1.35
(D) At least 1.35, but less than 1.40
(E) At least 1.40


6.14 (2 points) So far this baseball season, the Houston Astros baseball team has won 35 games
and lost 72 games. Using a simulation, a website has predicted that for the entire season the
Houston Astros are expected to win 55.4 games and lose 106.6 games.
Sanford Beech is an actuarial student.
Sandy notices that using classical credibility, giving weight 1 - Z to a 50% winning percentage,
he can get the same estimate.
How many games would it take for Sandy to give full credibility?
A. Less than 190
B. At least 190 but less than 195
C. At least 195 but less than 200
D. At least 200 but less than 205
E. At least 205
6.15 (3 points) You are given the following:
The number of losses is Poisson distributed with mean 500.
Number of losses and loss severity are independent.
Loss severity has the following distribution:
Loss Size     Probability
    100          0.30
   1000          0.40
 10,000          0.20
100,000          0.10
There is a 1000 deductible and maximum covered loss of 25,000.
How much credibility would be assigned so that the estimated total cost of claim payments is within
10% of the expected cost with 90% probability?
A. Less than 55%
B. At least 55% but less than 60%
C. At least 60% but less than 65%
D. At least 65% but less than 70%
E. At least 70%
6.16 (2 points) Prior to the beginning of the baseball season you expected the New York Yankees
to win 100 of 162 games. The Yankees have won 8 of their first 19 games this season.
Using a standard for full credibility of 1000 games, predict how many games in total the Yankees will
win this season.
A. 88
B. 90
C. 92
D. 94
E. 96


6.17 (3 points) You are given the following information:

Claim counts follow a Poisson distribution.

Claim sizes follow a Gamma Distribution.

Claim sizes and claim counts are independent.

The full credibility standard is to be within 5% of the expected aggregate loss 90%
of the time.

The number of claims in 2007 was 77.

The average size of claims in 2007 was 6861.

In 2007, the provision in the premium in order to pay losses was 400,000.

The exposure in 2008 is identical to the exposure in the 2007.

There is 4% inflation between 2007 and 2008.

If the estimate of aggregate losses in 2008 is 447,900,


what is the value of the parameter α for the Gamma distribution of severity?
(A) 2    (B) 3    (C) 4    (D) 5    (E) 6

6.18 (2 points) For Workers Compensation Insurance for Hazard Group D you are given the
following information on lost times claims:
                    State of Con Island     Countrywide
Number of Claims         7,363                442,124
Dollars of Loss        218 million          23,868 million
The full credibility standard has been selected so that actual severity will be within 7.5% of expected
severity 99% of the time.
The coefficient of variation of the size of loss distribution is 4.
What is the estimated average severity for Hazard Group D in the state of Con Island?
A. 37,000
B. 39,000
C. 41,000
D. 43,000
E. 45,000
6.19 (3 points) The average baseball player has a batting average of 0.260.
In his first six at bats, Reginald Mantle gets 3 hits, for a batting average of 0.500.
In his 3000 at bats, Willie Mays Hayes has gotten 900 hits, for a batting average of 0.300.
Which of these two players would you expect to have a better batting average in the future?
Use Classical Credibility to discuss why.


6.20 (4, 5/84, Q.35) (2 points) Frequency is Poisson. Three years of data are used to calculate the
pure premium. In the case of an average annual claim count of 36 claims, 20% credibility is assigned
to the observed pure premium. The standard for full credibility was chosen so as to achieve a 90%
probability of departing no more than 5% from the expected value. What is the ratio of the standard
deviation to the mean for the claim severity distribution?
A. Less than 1.1
B. At least 1.1, but less than 1.4
C. At least 1.4, but less than 1.7
D. At least 1.7, but less than 2.0
E. 2.0 or more
6.21 (4, 5/85, Q.30) (1 point) The 1984 pure premium underlying the rate equals $1,000.
The loss experience is such that the actual pure premium for that year equals $1,200 and the
number of claims equals 600.
If 5400 claims are needed for full credibility and the square root rule for partial credibility is used,
estimate the pure premium underlying the rate in 1985.
(Assume no change in the pure premium due to inflation.)
A. Less than $1,025
B. At least $1,025, but less than $1,075
C. At least $1,075, but less than $1,125
D. At least $1,125, but less than $1,175
E. $1,175 or more
6.22 (4, 5/86, Q.35) (1 point) You are in the process of revising rates.
The premiums currently being used reflect a loss cost per insured of $100.
The loss costs experienced during the two year period used in the rate review averaged $130 per
insured.
The average frequency during the two year review period was 250 claims per year.
Using a full credibility standard of 2,500 claims and assigning partial credibility, what loss cost per
insured should be reflected in the new rates?
(Assume that there is no inflation.)
A. Less than $105
B. At least $105, but less than $110
C. At least $110, but less than $115
D. At least $115, but less than $120
E. $120 or more


6.23 (4, 5/87, Q.36) (2 points) The actuary for XYZ Insurance Company has just developed a
new rate for a particular class of insureds. The new rate has a loss cost provision of $125. In doing
so, he used the partial credibility approach of classical credibility. In the experience period used,
there were 10,000 insureds with an average claim frequency of 0.0210. If the loss cost in the old
rate was $100 and the loss cost in the experience period was $200, what was the actuary's
standard for full credibility? (Assume zero inflation.)
A. Less than 3,000
B. At least 3,000, but less than 3,200
C. At least 3,200, but less than 3,400
D. At least 3,400, but less than 3,600
E. 3,600 or more.
6.24 (4, 5/88, Q.34) (2 points) Assume the random variable N, representing the number of claims
for a given insurance portfolio during a one year period, has a Poisson distribution with a mean of n.
Also assume X1 , X2 ..., XN are N independent, identically distributed random variables with Xi
representing the size of the ith claim. Let C = X1 + X2 + ... Xn represent the total cost of claims
during a year. We want to use the observed value of C as an estimate of future costs. Using
Classical credibility procedures, we are willing to assign full credibility to C provided it is within
10.0% of its expected value with probability 0.96. Frequency is Poisson. If the claim size
distribution has a coefficient of variation of 0.60, what credibility should we assign to the experience if
213 claims occur?
A. Less than 0.60
B. At least 0.60, but less than 0.625
C. At least 0.625, but less than 0.650
D. At least 0.650, but less than 0.675
E. 0.675 or more


6.25 (4, 5/88, Q.35) (2 points) The High Risk Insurance Company is revising its rates, based on
its experience during the past two years. The company experienced an average of 1,250 claims
annually over these two years. The loss costs underlying the current rates average $500 per
insured. The Actuary is proposing that this loss costs provision be revised upward to $550, based
on the average loss costs of $700 experienced over the two year experience period. The Actuary
is using the Classical credibility approach. The expected number of claims necessary for full
credibility is determined by the requirement that the observed total cost of claims should be within
100k% of the true value 100P% of the time. What is the probability that a fully credible estimate of
the loss costs (for a sample whose expected number of claims is equal to the full credibility
standard) would be within 5% of the true value?
Assume that frequency is Poisson, the average claim size is $700, and the variance of the claim size
distribution is 17,640,000.
A. Less than 0.775
B. At least 0.775, but less than 0.825
C. At least 0.825, but less than 0.875
D. At least 0.875, but less than 0.925
E. 0.925 or more
6.26 (4, 5/89, Q.30) (2 points) The Slippery Rock Insurance Company is reviewing their rates.
In order to calculate the credibility of the most recent loss experience they have decided to use
Classical credibility.
The expected number of claims necessary for full credibility is to be determined so that the
observed total cost of claims should be within 5% of the true value 90% of the time. Based on
independent studies, they have estimated that individual claims are independent and identically
distributed as follows: f(x) = 1/200,000, for 0 ≤ x ≤ 200,000.
Assume that the number of claims follows a Poisson distribution.
What is the credibility Z to be assigned to the most recent experience given that it contains 1,082
claims? Use a normal approximation.
A.
Z 0.800
B. 0.800 < Z < 0.825
C. 0.825 < Z < 0.850
D. 0.850 < Z < 0.875
E. 0.875 < Z


6.27 (4, 5/91, Q.23) (2 points) The average claim size for a group of insureds is $1,500 with
standard deviation $7,500. Assuming a Poisson claim count distribution, use as your standard for full
credibility, the expected number of claims so that the total loss will be within 6% of the expected
total loss with probability P = 0.90. We observe 6,000 claims and a total loss of $15,600,000 for a
group of insureds. If our prior estimate of the total loss is
16,500,000, find the Classical credibility estimate of the total loss for this group of insureds.
A. Less than 15,780,000
B. At least 15,780,000 but less than 15,870,000
C. At least 15,870,000 but less than 15,960,000
D. At least 15,960,000 but less than 16,050,000
E. At least 16,050,000
6.28 (4B, 5/92, Q.6) (1 point)
You are given the following information for a group of insureds:
Prior estimate of expected total losses
$20,000,000
Observed total losses
$25,000,000
Observed number of claims
10,000
Required number of claims for full credibility
17,500
Using the partial credibility as in Classical credibility, determine the estimate for the group's expected
total losses based upon the latest observation.
A. Less than $21,000,000
B. At least $21,000,000 but less than $22,000,000
C. At least $22,000,000 but less than $23,000,000
D. At least $23,000,000 but less than $24,000,000
E. At least $24,000,000
6.29 (4B, 11/93, Q.20) (2 points) You are given the following:
P = Prior estimate of pure premium for a particular class of business.
O = Observed pure premium during latest experience period for same class of business.
R = Revised estimate of pure premium for same class following observations.
F = Number of claims required for full credibility of pure premium.
Based on the concepts of Classical credibility, determine the number of claims used as the basis for
determining R.
A.

D.

F (R - P)
O - P
F (R - P)2
(O - P)2

B.

F (R - P)2
(O - P) 2

E.

F2 (R - P)
O - P

C.

F (R - P)
O - P


6.30 (4B, 11/95, Q.12) (1 point) 2000 expected claims are needed for full credibility. Determine
the number of expected claims needed for 60% credibility.
A. Less than 700
B. At least 700, but less than 900
C. At least 900, but less than 1100
D. At least 1100, but less than 1300
E. At least 1300
6.31 (4B, 5/96, Q.28) (1 point) The full credibility standard has been selected so that the actual
number of claims will be within 5% of the expected number of claims 90% of the time.
Frequency is Poisson.
Using the methods of Classical credibility, determine the credibility to be given to the experience if
500 claims are expected.
A. Less than 0.2
B. At least 0.2, but less than 0.4
C. At least 0.4, but less than 0.6
D. At least 0.6, but less than 0.8
E. At least 0.8
6.32 (4B, 11/96, Q.29) (1 point) You are given the following:

The number of claims follows a Poisson distribution.


Claim sizes are discrete and follow a Poisson distribution with mean 4.

The number of claims and claim sizes are independent.


The full credibility standard has been selected so that the actual number of claims will be within
10% of the expected number of claims 95% of the time. Using the methods of Classical
credibility, determine the expected number of claims needed for 40% credibility.
A. Less than 100
B. At least 100, but less than 200
C. At least 200, but less than 300
D. At least 300, but less than 400
E. At least 400


6.33 (4B, 5/99, Q.18) (1 point) You are given the following:

The number of claims follows a Poisson distribution.

The coefficient of variation of the claim size distribution is 2.

The number of claims and claim sizes are independent.

1,000 expected claims are needed for full credibility.

The full credibility standard has been selected so that the actual number of
claims will be within k% of the expected number of claims P% of the time.
Using the methods of Classical credibility, determine the number of expected claims needed for
50% credibility.
A. Less than 200
B. At least 200, but less than 400
C. At least 400, but less than 600
D. At least 600, but less than 800
E. At least 800
6.34 (4B, 11/99, Q.18) (2 points) You are given the following:
Partial Credibility Formula A is based on the methods of classical credibility,
with 1,600 expected claims needed for full credibility.
Partial Credibility Formula B is based on Buhlmann's credibility formula with a
Buhlmann Credibility Parameter of K = 391.
One claim is expected during each period of observation.
Determine the largest number of periods of observation for which Partial Credibility Formula B
yields a larger credibility value than Partial Credibility Formula A.
A. Less than 400
B. At least 400, but less than 800
C. At least 800, but less than 1,200
D. At least 1,200, but less than 1,600
E. At least 1,600


6.35 (4, 5/00, Q.26) (2.5 points) You are given:


(i) Claim counts follow a Poisson distribution.
(ii) Claim sizes follow a lognormal distribution with coefficient of variation 3.
(iii) Claim sizes and claim counts are independent.
(iv) The number of claims in the first year was 1000.
(v) The aggregate loss in the first year was 6.75 million.
(vi) In the first year, the provision in the premium in order to pay losses was 5.00 million.
(vii) The exposure in the second year is identical to the exposure in the first year.
(viii) The full credibility standard is to be within 5% of the expected aggregate loss 95%
of the time.
Determine the classical credibility estimate of losses (in millions) for the second year.
(A) Less than 5.5
(B) At least 5.5, but less than 5.7
(C) At least 5.7, but less than 5.9
(D) At least 5.9, but less than 6.1
(E) At least 6.1
Note: I have reworded bullet vi in the original exam question
6.36 (4, 11/01, Q.15 & 2009 Sample Q.65) (2.5 points) You are given the following information
about a general liability book of business comprised of 2500 insureds:
(i) Xi = Σ_{j=1}^{Ni} Yij is a random variable representing the annual loss of the ith insured.
(ii) N1, N2, ..., N2500 are independent and identically distributed random variables
following a negative binomial distribution with parameters r = 2 and β = 0.2.
(iii) Yi1, Yi2, ..., YiNi are independent and identically distributed random variables
following a Pareto distribution with α = 3.0 and θ = 1000.
(iv) The full credibility standard is to be within 5% of the expected aggregate losses
90% of the time.
Using classical credibility theory, determine the partial credibility of the annual loss
experience for this book of business.
(A) 0.34
(B) 0.42
(C) 0.47
(D) 0.50
(E) 0.53


6.37 (4, 11/03, Q.35 & 2009 Sample Q.27) (2.5 points) You are given:
(i) Xpartial = pure premium calculated from partially credible data
(ii) μ = E[Xpartial]
(iii) Fluctuations are limited to kμ of the mean with probability P
(iv) Z = credibility factor
Which of the following is equal to P?
(A) Pr[μ - kμ ≤ Xpartial ≤ μ + kμ]
(B) Pr[Zμ - k ≤ Z Xpartial ≤ Zμ + k]
(C) Pr[Zμ - μ ≤ Z Xpartial ≤ Zμ + μ]
(D) Pr[1 - k ≤ Z Xpartial + (1-Z)μ ≤ 1 + k]
(E) Pr[μ - kμ ≤ Z Xpartial + (1-Z)μ ≤ μ + kμ]


Solutions to Problems:

6.1. C. Z = √(200/1500) = 36.5%.

6.2. D. Z = √(400/8000) = 22.4%.
Estimated Pure Premium = (22.4%)(1200) + (77.6%)(1000) = $1045.


6.3. D. Since the credibility is proportional to the square root of the number of claims, we get
(26%)(√20) = 116%. However, the credibility is limited to 100%.

6.4. E. Z = √(513/2000) = 0.506. Observed average cost per claim is: 4,771,000/513 = 9300.
Thus the estimated severity = (0.506)(9300) + (1 - 0.506)(10,300) = $9794.


6.5. B. The expected number of claims is (0.04)(5000) = 200.
Z = √(200/4500) = 21.1%.

6.6. A. Φ(2.326) = 0.99, so that y = 2.326. n0 = y²/k² = (2.326/0.06)² = 1503.
For the Gamma Distribution, the mean is αθ, while the variance is αθ².
Thus CV² = αθ²/(αθ)² = 1/α = 1/2.5 = 0.4. nF = n0 (1 + CV²) = (1503)(1.4) = 2104.
Z = √(200/2104) = 30.8%.


6.7. B. P = 98%. Therefore y = 2.326, since Φ(2.326) = 0.99 = (1+P)/2. k = 0.03.
Standard For Full Credibility is: nF = (y/k)²(σf²/μf) = (2.326/0.03)²(0.15/0.05) = 18,034 claims,
or 18,034/0.05 = 360,680 exposures. Z = √(200,000/360,680) = 74.5%.
Estimated future frequency is: (74.5%)(9021/200000) + (25.5%)(0.05) = 4.635%.
Expected number of future claims is: (200000)(4.635%) = 9270.
Comment: When available, one generally uses the number of exposures or the expected number
of claims in the square root rule, rather than the observed number of claims.
Using the expected number of claims, Z = √(10,000/18,034) = 74.5%.
6.8. √(x/1000) = 0.4. x = (0.4²)(1000) = 160 exposures.

6.9. C. CV² of the Pure Premium is: 100/6² = 2.778. y = 1.282. k = 0.025. n0 = y²/k² = 2630.
Standard for Full Credibility for P.P. = n0 (Coefficient of Variation of the P.P.)² = (2630)(2.778) =
7306 exposures. Z = √(800/7306) = 33.1%. Observation = 3200/800 = 4.
New Estimate = (4)(33.1%) + (6)(66.9%) = 5.34.
Alternately, let m be the mean frequency. Then since the frequency is assumed to be Poisson,
variance of pure premium = m(second moment of severity). Thus E[X²] = 100/m. E[X] = 6/m.
Standard for Full Credibility in terms of claims is: n0 (1 + CV²) = n0 E[X²]/E[X]² =
(2630)(100/6²)m = 7306m claims. Expected number of claims = 800m.
Z = √(800m/7306m) = 33.1%. Proceed as before.

Comment: You are given the number of exposures and not the number of claims, so that it may be
easier to get a standard for full credibility in terms of exposures. When computing Z, make sure the
ratio you use is either claims/claims or exposures/exposures. The numerator and the standard for full
credibility in the denominator should be in the same units.


6.10. A. The Exponential Distribution has a coefficient of variation of 1. For a Poisson frequency,
standard for full credibility for pure premium = nF = n0 (1 + CV²) = n0 (1 + 1²) =
2 n0 = twice the standard for full credibility for frequencies. Since the credibility is inversely proportional
to the square root of the standard for full credibility, the credibility for pure premiums is that for
frequency divided by √2: 60%/√2 = 42.4%.

6.11. E. We know we have an amount of data at least equal to the full credibility criterion for
frequency. If we have a lot more data, we would also assign 100% credibility for estimating pure
premiums. If we have just enough data to assign 100% credibility for estimating frequencies, then
we would assign 100%/√2 = 70.7% credibility for estimating pure premiums. Thus we cannot
determine the answer from the given information.
Comment: One could proceed as in the previous question and calculate 100%/√2 = 70.7%.
However, this assumes that we have just enough data to assign 100% credibility for estimating
frequencies. In fact we may have much more data than this. For example, if the full credibility criterion
for frequency is 1082 claims, we might have either 1082 or 100,000 claims in our data.
6.12. B. y = 1.960. k = 0.10. n0 = (y/k)² = 384.
Estimated annual pure premium is: (200 + 150 + 230 + 180)/4 = 190.
Estimated variance of the pure premium is:
{(200 - 190)² + (150 - 190)² + (230 - 190)² + (180 - 190)²}/(4 - 1) = 1133.
Using the formula: Standard for Full Credibility for P.P. in exposures =
n0 (Coefficient of Variation of the Pure Premium)² = (384)(1133/190²) = 12.1 exposures.
Since we have 4 exposures (we have counted each year as one exposure),
Z = √(4/12.1) = 57.5%. Observation = 190. Prior estimate is 210.
Therefore, estimated P.P. = (190)(57.5%) + (210)(42.5%) = 198.5.
Alternately, let μF be the mean frequency, σF be the standard deviation of the frequency, μS be the
mean severity, and σS be the standard deviation of the severity. Then in terms of claims, the
Standard for Full Credibility for P.P. is: n0 (σF²/μF + CV²) =
n0 (σF²/μF + σS²/μS²). Thus in terms of exposures, the Standard for Full Credibility for P.P. is:
n0 (σF²/μF + σS²/μS²)/μF = n0 (μFσS² + μS²σF²)/(μF²μS²) = n0 (variance of P.P.)/(mean of P.P.)² =
n0 (Coefficient of Variation of the Pure Premium)². Proceed as above.
Comment: The use of the unbiased estimator of the variance, with n - 1 in the denominator, when
we have a sample, is the type of thing that is done in Loss Models.


6.13. D. P = 0.90 and k = 0.025. Φ(1.645) = 0.95 = (1+P)/2, so that y = 1.645.
n0 = (1.645/0.025)² = 4330. CVSev² = 1/2.
nF = n0 (1 + CVSev²) = (4330)(1 + 1/2) = 6495 claims. Z = √(810/6495) = 35.3%.
The prior estimate of aggregate losses is: (80%)($1.6 million) = $1.28 million.
The observation of aggregate losses is $1.134 million.
Thus the new estimate is: (0.353)(1.134) + (1 - 0.353)(1.28) = 1.228 million.
Since exposures have increased by 12%, the estimate of aggregate losses for 2002 is:
(1.12)(1.228) = $1.38 million.
6.14. B. The observed winning percentage is: 35/107.
The predicted winning percentage for the remainder of the season is:
(55.4 - 35) / (162 - 107) = 20.4/55.
Z(35/107) + (1 - Z)(0.5) = 20.4/55. Z = 0.7466.
√(107/nF) = 0.7466. nF = 107/0.7466² = 192 games.
Comment: Information taken from www.coolstandings.com, as of August 4, 2012.


6.15. D. We are given k = 10%, P = 90%. Thus y = 1.645. n0 = (y/k)² = 271.
For losses of size 100 and 1000 the insurer makes no payment.
In the case of 10,000, the insurer pays 10000 - 1000 = 9000.
In the case of 100,000, the insurer pays 25000 - 1000 = 24000.
The distribution of the size of nonzero payments is: 9000 @ 2/3 and 24,000 @ 1/3.
This has mean of: (2/3)(9000) + (1/3)(24000) = 14,000.
This has second moment of: (2/3)(9000²) + (1/3)(24000²) = 246,000,000.
1 + CV² = 246,000,000/14,000² = 1.255.
We expect 500 losses, and (0.3)(500) = 150 nonzero payments.
Number of claims (nonzero payments) needed for full credibility is: (271)(1.255) = 340.
Z = √(150/340) = 66.4%.
Alternately, the distribution of amounts paid is: 0 @ 0.7, 9000 @ 0.2 and 24,000 @ 0.1.
This has mean of: (0.2)(9000) + (0.1)(24000) = 4,200.
This has second moment of: (0.2)(9000²) + (0.1)(24,000²) = 73,800,000.
1 + CV² = 73,800,000/4200² = 4.184.
Number of losses needed for full credibility is: (271)(4.184) = 1134.
Z = √(500/1134) = 66.4%.

Comment: The expected total payments are: (150)(14000) = 2,100,000 = (500)(4200).


If for example, we observed 2,500,000 in total payments this year, we would estimate total
payments next year of: (66.4%)(2,500,000) + (33.6%)(2,100,000) = 2,365,600.
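The per-payment calculation in this solution can be checked with a short Python sketch of my own (the variable names are assumptions):

import math

severity = [(100, 0.30), (1000, 0.40), (10_000, 0.20), (100_000, 0.10)]
deductible, max_covered = 1000, 25_000

# Amount paid per loss, after the deductible and the maximum covered loss.
payments = [(min(x, max_covered) - deductible if x > deductible else 0.0, p)
            for x, p in severity]
mean = sum(pay * p for pay, p in payments)         # 4200 per loss
second = sum(pay ** 2 * p for pay, p in payments)  # 73,800,000

n0 = (1.645 / 0.10) ** 2                           # about 271
full_std_losses = n0 * second / mean ** 2          # about 1134 losses
z = math.sqrt(500 / full_std_losses)               # about 66.4%
print(round(z, 3))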

6.16. C. Z = √(19/1000) = 13.8%. Observed frequency = 8/19 = 0.421.
Prior estimate of frequency = 100/162 = 0.617.
Estimated future frequency = (13.8%)(0.421) + (1 - 13.8%)(0.617) = 0.590.
Estimated number of games won rest of season = (0.590)(162 - 19) = 84.4.
Estimated total number of games won = 8 + 84.4 = 92.4.


6.17. C. Prior to taking into account inflation, the estimate of aggregate losses in 2008 must have
been: 447,900/1.04 = 430,673.
The observed aggregate loss is: (77)(6861) = 528,297.
Z(528,297) + (1 - Z)(400,000) = 430,673. Z = 23.9%.
P = 90%. y = 1.645. n0 = (y/k)² = (1.645/0.05)² = 1082 claims.
For a Gamma, CV² = (αθ²)/(αθ)² = 1/α.
Standard for Full Credibility is: (1082)(1 + 1/α).
23.9% = Z = √(77/{(1082)(1 + 1/α)}). 0.239²{(1082)(1 + 1/α)} = 77. α = 4.07.

6.18. B. Average severity for the State = $218 million/7,363 = $29,608.
Average severity for countrywide = $23,868 million/442,124 = $53,985.
P = 99%. y = 2.576. n0 = (2.576/0.075)² = 1180 claims.
Standard for full credibility for severity is: CV² n0 = (4²)(1180) = 18,880 claims.
Z = √(7,363/18,880) = 62.4%.
Estimated state severity is: (62.4%)($29,608) + (1 - 62.4%)($53,985) = $38,774.


Comment: You are not responsible for knowing the details of any specific line of insurance.
A simplified portion of the calculation of State/Hazard Group Relativities for Workers
Compensation Insurance.


6.19. I would expect Willie to have a better batting average in the future than Reginald.
While Reginald has a batting average of 0.500, there is too little data to have much credibility.
Thus the estimated future batting average of Reginald is probably only slightly higher than the
overall mean of 0.260.
On the other hand, Willie has a considerable amount of data.
His estimated future batting average is close to or equal to his observed 0.300.
For example, let us assume a Binomial Model.
Then for q = 0.26, the ratio of the variance to the mean frequency is: mq(1 - q)/(mq) = 1 - q = 0.74.
If for example, we were to take P = 90% and k = 5%, then n0 = 1082 claims.
The Standard for Full Credibility for frequency would be: (0.74)(1082) = 801 claims.
This is equivalent to: 801/0.26 = 3081 exposures (at bats).
Then for Reginald's data, Z = √(6/3081) = 4.4%.
Reginald's estimated future batting average is: (4.4%)(0.5) + (1 - 4.4%)(0.26) = 0.271.
For Willie's data, Z = √(3000/3081) = 98.7%.
Willie's estimated future batting average is: (98.7%)(0.3) + (1 - 98.7%)(0.26) = 0.299.


Comment: Not the style of question you will get on your exam.
Other reasonable choices for P and k would produce somewhat different credibilities.
With additional information besides the results of their batting, one could make better estimates.
6.20. B. k = 5% and P = 90%. We have y = 1.645, since Φ(1.645) = 0.95 = (1+P)/2.
Therefore n0 = (y/k)² = (1.645/0.05)² = 1082. When we have 36 claims per year for three years
we assign 20% credibility; therefore 0.20 = √(108/nF). Thus nF = 2700.
But nF = n0 (1 + CV²). Thus 2700 = 1082(1 + CV²). CV = 1.22.

6.21. B. The credibility Z = √(600/5400) = 1/3.
Thus the new estimate is: (1/3)(1200) + (1 - 1/3)(1000) = $1067.
6.22. C. The credibility assigned to (2)(250) = 500 claims: Z = √(500/2500) = 0.447.
The new estimate is (0.447)(130) + (1 - 0.447)(100) = $113.


6.23. C. The credibility assigned was:
(change in loss cost)/(difference between observation and prior estimate) =
(125 - 100)/(200 - 100) = 25%. The expected number of claims was (10,000)(0.0210) = 210.
Z = √(210/nF). Therefore nF = 210/0.25² = 3360.
Comment: We expect 210 claims, and Z = √(210/3360) = 0.25.
Then the new estimate of the loss costs is: ($200)(0.25) + ($100)(1 - 0.25) = $125.
6.24. B. (1+P)/2 = (1 + 0.96)/2 = 0.98. Thus y = 2.054, since Φ(2.054) = 0.98.
The standard for full credibility is: (y²/k²)(1 + CV²) = (2.054/0.10)²(1 + 0.6²) = 574 claims.
Thus we assign credibility of Z = √(213/574) = 60.9%.

6.25. D. CV² = variance/mean² = 17,640,000/700² = 36. k = 0.05, while P (and y) are to be
solved for. The credibility being applied to the observation is:
Z = (change in estimate)/(observation - prior estimate) = (550 - 500)/(700 - 500) = 0.25.
We expect: (2)(1250) = 2500 claims. Thus since 2500 claims are given 0.25 credibility,
the full credibility standard is: 2500/0.25² = 40,000 claims. However, that should equal
(y²/k²)(1 + CV²) = (y²/0.05²)(1 + 36). Thus: y = (0.05)√(40,000/37) = 1.644.
Φ(y) = (1+P)/2. Thus P = 2Φ(1.644) - 1 = (2)(0.9499) - 1 = 0.90.


6.26. D. k = 0.05 and P = 0.90. y = 1.645, since Φ(1.645) = 0.95 = (1+P)/2.
n0 = y²/k² = (1.645/0.05)² = 1082. The mean of the severity distribution is 100,000. The second
moment of the severity is the integral of x²/200,000 from 0 to 200,000, which is 200,000²/3.
Thus the variance is 3,333,333,333. The square of the coefficient of variation is variance/mean² =
3,333,333,333/100,000² = 0.3333. Thus nF = n0 (1 + CV²) = (1082)(1.333) = 1443.
For 1082 claims, Z = √(1082/1443) = √(3/4) = 0.866.
Comment: For the uniform distribution on [a, b], the CV = (b - a)/{(b + a)√3}. For a = 0, CV² = 1/3.


6.27. D. k = 6%. Φ(1.645) = 0.95 = (1 + 0.90)/2, so that y = 1.645.
Standard for full credibility for frequency = n0 = y²/k² = (1.645/0.06)² = 756.
Coefficient of Variation of the severity = 7500/1500 = 5.
Standard for full credibility for pure premium = nF = n0 (1 + CV²) = 756(1 + 5²) = 19,656 claims.
Z = √(6000/19,656) = 0.552. The prior estimate is given as $16.5 million.
The observation is given as $15.6 million. Thus the new estimate is:
(0.552)(15.6) + (1 - 0.552)(16.5) = $16.00 million.

6.28. D. Z = √(10,000/17,500) = 75.6%.
Thus the new estimate = (25 million)(0.756) + (20 million)(1 - 0.756) = $23.78 million.

6.29. B. Z = √(N/F). Thus, R = O√(N/F) + P{1 - √(N/F)}.
Solving for N, N = F(R - P)²/(O - P)².

Comment: Writing the revised estimate as R = P + Z(O-P) can be useful in general and allows a
slightly quicker solution of the problem. This can also be written as
Z = (R - P) / (O - P); i.e., the credibility is the ratio of the revision of the estimate from the prior
estimate to the deviation of the observation from the prior estimate.

6.30. B. 0.6 = Z = √(n/2000). Therefore, n = (0.6²)(2000) = 720 claims.

6.31. D. We are given k = 5% and P = 90%, therefore we have y = 1.645 since Φ(1.645) = 0.95 =
(1 + P)/2. Therefore, n0 = (y/k)² = (1.645/0.05)² = 1082. The partial credibility is given by the square
root rule: Z = √(500/1082) = 0.68.

6.32. A. P = 0.95 and k = 0.1. Φ(1.960) = 0.975 = (1+P)/2, so that y = 1.960.
n0 = (y/k)² = (1.960/0.10)² = 384. Z = √(n/384) = 0.4. Thus n = (384)(0.4²) = 61.4.

6.33. B. √(n/1000) = 0.5. Thus n = (1000)(0.5²) = 250.

6.34. B. For N observations, Classical Credibility = √(N/1600) = √N/40, for N ≤ 1600.
For N observations, Greatest Accuracy / Buhlmann Credibility = N/(N + K) = N/(N + 391).
We want N/(N + 391) > √N/40, i.e. N - 40√N + 391 < 0.
Setting N - 40√N + 391 = 0: √N = {40 ± √(40² - (4)(1)(391))}/2 = 17 or 23.
For N between 17² = 289 and 23² = 529, the Buhlmann Credibility is greater than the Classical
Credibility.
Comment: The 2 formulas for K = 391 and nF = 1600 produce very similar credibilities.
N                 0      100    200    300    400    500    529    600    1000
Classical Cred.   0.0%   25.0%  35.4%  43.3%  50.0%  55.9%  57.5%  61.2%  79.1%
Buhlmann Cred.    0.0%   20.4%  33.8%  43.4%  50.6%  56.1%  57.5%  60.5%  71.9%

See An Actuarial Note on Credibility Parameters, by Howard Mahler, PCAS 1986.


6.35. A. P = 0.95 and k = 0.05. Φ(1.960) = 0.975 = (1+P)/2, so that y = 1.960.
n0 = (y/k)² = (1.960/0.05)² = 1537. Standard for full credibility for pure premium =
nF = n0 (1 + CV²) = 1537(1 + 3²) = 15,370 claims. Z = √(1000/15,370) = 25.5%.
The prior estimate is given as $5 million. The observation is given as $6.75 million.
Thus the new estimate is: (25.5%)(6.75) + (1 - 25.5%)(5) = $5.45 million.
6.36. C. k = 0.05. P = 90%. y = 1.645. n0 = (1.645/0.05)² = 1082 claims.
For the Negative Binomial, μf = rβ = (2)(0.2) = 0.4. σf² = rβ(1 + β) = (2)(0.2)(1.2). σf²/μf = 1.2.
For the Pareto, E[X] = 1000/(3 - 1) = 500. E[X²] = (2)(1000²)/{(3 - 1)(3 - 2)} = 1,000,000.
CV² = E[X²]/E[X]² - 1 = 1,000,000/500² - 1 = 4 - 1 = 3.
Standard for Full Credibility = (σf²/μf + CVSev²) n0 = (1.2 + 3)(1082) = 4546 claims.
2500 exposures correspond to (2500)(0.4) = 1000 expected claims. Z = √(1000/4546) = 47%.
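This calculation can be checked with a short Python sketch of my own:

import math

r, beta = 2, 0.2            # Negative Binomial frequency per insured
alpha, theta = 3.0, 1000.0  # Pareto severity
insureds = 2500

mean_freq = r * beta                                # 0.4
var_freq = r * beta * (1 + beta)                    # 0.48
ex = theta / (alpha - 1)                            # 500
ex2 = 2 * theta ** 2 / ((alpha - 1) * (alpha - 2))  # 1,000,000
cv_sq = ex2 / ex ** 2 - 1                           # 3

n0 = (1.645 / 0.05) ** 2                            # about 1082
full_std = n0 * (var_freq / mean_freq + cv_sq)      # about 4546 claims
z = math.sqrt(insureds * mean_freq / full_std)      # about 0.47
print(round(z, 2))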


6.37. E. The estimate using classical credibility is: Z Xpartial + (1 - Z)μ.
We want this estimate to be within kμ of μ, with probability P:
P = Pr[μ - kμ ≤ Z Xpartial + (1 - Z)μ ≤ μ + kμ].
Comment: See page 30 in Section 2.6 of Mahler & Dean.
P = Pr[-kμ ≤ Z Xpartial - Zμ ≤ kμ]
P = Pr[Zμ - kμ ≤ Z Xpartial ≤ Zμ + kμ]
P = Pr[(1 - Z)μ + Zμ - kμ ≤ (1 - Z)μ + Z Xpartial ≤ (1 - Z)μ + Zμ + kμ]
P = Pr[μ - kμ ≤ Z Xpartial + (1 - Z)μ ≤ μ + kμ].


Section 7, Important Formulas and Ideas


The estimate using credibility =
ZX + (1-Z)Y, where Z is the credibility assigned to the observation X.
Full Credibility (Sections 2, 3, and 5):
Assume one desires that the chance of being within ±k of the mean frequency be at least P; then
n0 = y²/k², where y is such that Φ(y) = (1+P)/2.
The Standard for Full Credibility for Frequency is in terms of claims: (σf²/μf) n0.
In the Poisson case this is: n0.
The Standard for Full Credibility for Severity is in terms of claims: CVSev² n0.
The Standard for Full Credibility for either Pure Premiums or Aggregate Losses is in
terms of claims: (σf²/μf + CVSev²) n0. In the Poisson case this is: (1 + CVSev²) n0.
The standard can be put in terms of exposures rather than claims by dividing by μf.
Standard for Full Credibility for Pure Premiums or Aggregate Losses is in terms of exposures:
n0 (coefficient of variation of the pure premium)².
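The full credibility formulas above can be collected into a small Python sketch (a summary of my own, with hypothetical function names; it uses the standard library NormalDist for the inverse of Φ):

from statistics import NormalDist

def n0(p, k):
    # Claims needed for full credibility of a Poisson frequency.
    y = NormalDist().inv_cdf((1 + p) / 2)
    return (y / k) ** 2

def full_cred_aggregate(p, k, var_over_mean_freq, cv_sev_sq):
    # Standard for full credibility of pure premiums / aggregate losses, in claims.
    return n0(p, k) * (var_over_mean_freq + cv_sev_sq)

# Example: P = 90%, k = 5%, Poisson frequency, severity CV squared = 3.
print(round(n0(0.90, 0.05)))                             # about 1082
print(round(full_cred_aggregate(0.90, 0.05, 1.0, 3.0)))  # about 4329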
Variance of Pure Premiums and Aggregate Losses (Section 4):
Pure Premium = ($ of Loss)/(# of Exposures) = (# of Claims/# of Exposures)($ of Loss/# of Claims)
= (Frequency)(Severity).
When frequency and severity are independent: σPP² = μFreq σSev² + μSev² σFreq², and
σA² = μF σS² + μS² σF².

2013-4-8,

Classical Credibility 7 Important Ideas , HCM 10/16/12, Page 151

Partial Credibility (Section 6):

When one has at least the number of claims needed for Full Credibility, then one assigns
100% credibility to the observations.
Otherwise use the square root rule:

Z = √(number of claims / standard for full credibility in terms of claims), or

Z = √(number of exposures / standard for full credibility in terms of exposures).

When available, one generally uses the number of exposures or the expected number of claims in
the square root rule, rather than the observed number of claims.
Make sure that in the square root rule you divide comparable quantities; either divide claims by
claims or divide exposures by exposures.
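A small Python helper (an illustration of mine, not from the syllabus readings) that applies the square root rule; the function name and arguments are my own:

    from math import sqrt

    def partial_credibility(observed, full_standard):
        """Square root rule: Z = sqrt(observed / full standard), capped at 100%.
        Both arguments must be in the same units: claims with claims,
        or exposures with exposures."""
        return min(sqrt(observed / full_standard), 1.0)

    # 1000 expected claims against a 15,370-claim full credibility standard:
    print(partial_credibility(1000, 15370))   # about 0.255, as in solution 6.35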

Mahler's Guide to

Buhlmann Credibility
and Bayesian Analysis
Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-9


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching

2013-4-9

Buhlmann Credibility,

HCM 10/19/12,

Page 1

Mahler's Guide to Buhlmann Credibility and Bayesian Analysis


Copyright 2013 by Howard C. Mahler.
Information in bold or sections whose title is in bold are more important for passing the exam.
Information presented in italics (and sections whose title is in italics) should not be needed to directly
answer exam questions and should be skipped on first reading. It is provided to aid the reader's
overall understanding of the subject, and to be useful in practical applications.
Solutions to the problems in each section are at the end of that section.1
Section #    Pages      Section Name
1            4-6        Introduction
2            7-48       Conditional Distributions
3            49-76      Covariances and Correlations
4            77-126     Bayesian Analysis, Introduction
5            127-205    Bayesian Analysis, with Discrete Risk Types
6            206-282    Bayesian Analysis, with Continuous Risk Types
7            283-331    EPV and VHM
8            332-393    Buhlmann Credibility, Introduction
9            394-453    Buhlmann Credibility, Discrete Risk Types
10           454-504    Buhlmann Credibility, with Continuous Risk Types
11           505-516    Linear Regression & Buhlmann Credibility
12           517-555    Philbrick's Target Shooting Example
13           556-596    Die / Spinner Models
14           597-611    Classification Ratemaking
15           612-625    Experience Rating
16           626-646    Loss Functions
17           647-679    Least Squares Credibility
18           680-703    The Normal Equations for Credibilities
19           704-707    Important Formulas and Ideas

Note that problems include both some written by me and some from past exams. In some cases I've rewritten these
questions in order to match the notation in the current Syllabus. Past exam questions are copyrighted by the Casualty
Actuarial Society and Society of Actuaries and are reproduced here solely to aid students in studying for exams. The
solutions and comments are solely the responsibility of the author; the CAS and SOA bear no responsibility for their
accuracy.

2013-4-9

Buhlmann Credibility,

HCM 10/19/12,

Page 2

Course 4 Exam Questions by Section of this Study Aid2


Section Sample
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

5/00

11/00

5/01

11/01

11/02

11/03

5/05

13
32
11

7 22

29
7

28
14

28
24

13
39
21 24

14 39
19 31

5
33

19 38

37

19 20

33

10 11 28

35

11 26 38

29 32

23

9 25

13
32
20

18

18
7

11

29

11 17

23

18
23

The CAS/SOA did not release the 5/02, 5/03, 5/04, and 5/06 exams.

11/04

Excluding any questions that are no longer on the syllabus.

2013-4-9

Buhlmann Credibility,

Section 11/05
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

11/06

HCM 10/19/12,

5/07

35
15
32

16

2, 21
36

26

32

The CAS/SOA did not release the 11/07 and subsequent exams.

Page 3

2013-4-9

Buhlmann Credibility 1 Introduction,

HCM 10/19/12,

Page 4

Section 1, Introduction
This study guide covers a number of related ideas.3 The basic concepts of Credibility are covered
in Mahler's Guide to Classical Credibility. Read at least the first section of that Study Guide prior to
this one. The concepts in this study guide are applied to special situations in Mahler's Guide to
Conjugate Priors.
In this study guide, the preliminary mathematical ideas of conditional distributions, covariances and
correlations are covered first.4
The first key idea is that of Bayes Theorem and Bayesian Analysis.
The second key idea is that of Greatest Accuracy or Buhlmann Credibility. Loss Models uses the
term Greatest Accuracy Credibility for what is more commonly known as Buhlmann,
Bühlmann-Straub, or Least Squares Credibility. In this study guide I will use the terms Buhlmann
Credibility and Greatest Accuracy Credibility interchangeably. Many of you will benefit by reading
my section on the Philbrick Target Shooting Example, prior to studying Buhlmann Credibility.5
One has to become proficient at applying Buhlmann Credibility to various situations typically posed
in exam questions, in particular calculating the expected value of the process variance and the
variance of the hypothetical means. Therefore, one has to become proficient at calculating variances.6
The third key idea, Nonparametric Empirical Bayesian Estimation, is presented in its own study
guide. Rather than a model, the data is used in order to estimate the expected value of the process
variance and the variance of the hypothetical means.
The fourth key idea, Semiparametric Estimation, is presented in its own study guide. Both a model
and data are relied upon in order to estimate the expected value of the process variance and the
variance of the hypothetical means.

The concepts in Chapter 20 of Loss Models related to Buhlmann or Greatest Accuracy Credibility are
demonstrated. This material can also be covered from Credibility by Mahler and Dean, Chapter 8 of the fourth
Edition of Foundations of Casualty Actuarial Science. My study guide is very similar to and formed a basis for the
Credibility Chapter written by myself and Curtis Gary Dean.
4
Many of those familiar with these ideas would benefit by glancing over the important ideas in these sections and
doing the highly recommended problems. Those unfamiliar with these ideas should go through these sections in
more detail.
5
While not on the syllabus, it will help many students develop an understanding of the ideas on the syllabus.
6
The process variances of various loss (severity) and frequency distributions are covered in those study guides.
The process variance of pure premiums is covered in Mahler's Guide to Classical Credibility.
In general, credibility depends on the variance-covariance structure, as discussed in the section on the Normal
Equations for Credibility; however, with rare exceptions, on the exam one need only calculate variances in order to
calculate credibilities.

2013-4-9

Buhlmann Credibility 1 Introduction,

HCM 10/19/12,

Page 5

Problems:
1.1 (1 point) You observe 1256 claims per 10,000 exposures. The credibility given to this data is
70%. The complement of credibility is given to the prior estimate of .203 claims per exposure.
What is the new estimate of the claim frequency?
A. Less than .135
B. At least .135 but less than .145
C. At least .145 but less than .155
D. At least .155 but less than .165
E. At least .165
1.2 (1 point) The prior estimate was a pure premium of $2.53 per $100 of payroll. After observing
$81,472 per $795,034 of payroll, the new estimate is a pure premium of $2.87. How much
credibility was assigned to the observed data?
A. Less than 4%
B. At least 4% but less than 5%
C. At least 5% but less than 6%
D. At least 6% but less than 7%
E. At least 7%
1.3 (1 point) Given an observation with a value of 250, the Buhlmann credibility estimate for the
expected value of the next observation would be 223. If instead the observation had been 100,
the Buhlmann credibility estimate for the expected value of the next observation would have been
118. Determine the Buhlmann credibility of the first observation.
A. 40%
B. 50%
C. 60%
D. 70%
E. 80%
1.4 (4, 5/83, Q.35, Q.39, Q.41) (1 point) Which of the following are true?
1. Credibility can be characterized as a measure of the relative value of the information
contained in the data.
2. The definition of full credibility depends on two parameters.
3. If one assumes that claims for an individual driver are Poisson distributed and that the
means of these distributions are Gamma distributed, then the total number of accidents
follows a Poisson distribution.
A. 1
B. 2
C. 3
D. 1, 2, 3
E. None of A ,B, C or D.
1.5 (4, 5/96, Q.3) (1 point) Given a first observation with a value of 2, the Buhlmann credibility
estimate for the expected value of the second observation would be 1.
Given a first observation with a value of 5, the Buhlmann credibility estimate for the expected value
of the second observation would be 2.
Determine the Buhlmann credibility of the first observation.
A. 1/3
B. 2/5
C. 1/2
D. 3/5
E. 2/3

2013-4-9

Buhlmann Credibility 1 Introduction,

HCM 10/19/12,

Page 6

Solutions to Problems:
1.1. C. (70%)(1256/10,000) + (1-70%)(0.203) = 0.149
1.2. B. Observed pure premium per $100 of payroll is: 81,472 / 7950.34 = $10.25.
Z = (new estimate - old estimate) / (observation - old estimate) = ($2.87 - $2.53) / ($10.25 - $2.53) = 4.4%.

1.3. D. Let Y be the prior estimate and Z be the credibility of the first observation.
250Z + (1 - Z)Y = 223, and 100Z + (1 - Z)Y = 118. 150Z = 105. Z = 105/150 = 70%.
Alternately, the credibility is the slope of the line of posterior estimates versus observations:
Z = Δ estimates / Δ observations = (223 - 118) / (250 - 100) = 105/150 = 70%.
1.4. E. 1. True. This is one way to describe Buhlmann Credibility. 2. True. The Classical Credibility
Standard For Full Credibility depends on choosing P and k. 3. False. The mixed predictive
distribution for the Gamma-Poisson is a Negative Binomial Distribution.
1.5. A. Let Y be the prior estimate and Z be the credibility of the first observation.
Then: 2Z + (1 - Z)Y = 1, and 5Z + (1 - Z)Y = 2. Therefore, 3Z = 1 or Z = 1/3.
Alternately, the credibility is the slope of the line of posterior estimates versus observations:
Z = Δ estimates / Δ observations = (2 - 1) / (5 - 2) = 1/3.
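The same slope idea can be coded up directly. A small Python sketch of mine, solving est = Z·obs + (1 - Z)·Y for Z and the prior estimate Y from two such pairs:

    def credibility_from_two_estimates(obs1, est1, obs2, est2):
        """Solve est = Z*obs + (1 - Z)*Y, given two (observation, estimate) pairs
        from the same Buhlmann setup."""
        Z = (est1 - est2) / (obs1 - obs2)   # slope of estimates versus observations
        Y = (est1 - Z * obs1) / (1 - Z)     # prior estimate
        return Z, Y

    print(credibility_from_two_estimates(250, 223, 100, 118))   # Z = 0.7 (problem 1.3)
    print(credibility_from_two_estimates(2, 1, 5, 2))           # Z = 1/3 (problem 1.5)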

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 7

Section 2, Conditional Distributions7


Assume that 14% of actuarial students take exam seminars and that 8% of actuarial students both
take exam seminars and pass their exam. Then the chance of a student who has taken an exam
seminar of passing his exam is 8% / 14% = 57%. Assume 1000 total students, of whom 140 take
exam seminars. Of these 140 students, 80 pass, for a pass rate of: 80/140.
This is a simple example of a conditional probability.
The conditional probability of an event A given another event B is defined as:
P[A|B] = P[A and B] / P[B].
In the simple example, A = {student passes exam}, B = {student takes exam seminar},
P[A and B] = 8%, P[B] = 14%. Thus P[A|B] = P[A and B] / P[B] = 8% / 14% = 57%.
Here is a more complicated example of a conditional probability. Assume two honest,
six-sided dice of different colors are rolled, and the results D1 and D2 are observed.
Let S = D1 + 2D2. Then one can easily compute S for all the possible outcomes:

                         D1
   D2        1     2     3     4     5     6
   1         3     4     5     6     7     8
   2         5     6     7     8     9    10
   3         7     8     9    10    11    12
   4         9    10    11    12    13    14
   5        11    12    13    14    15    16
   6        13    14    15    16    17    18
Then when S ≤ 13 we have the following equally likely possibilities, counting for each value of D1
the number of values of D2 with D1 + 2D2 ≤ 13:

   D1                                          1      2      3      4      5      6
   Possibilities (values of D2)                6      5      5      4      4      3
   Conditional Density Function of D1,
   given that S ≤ 13                          6/27   5/27   5/27   4/27   4/27   3/27

Conditional distributions form the basis for Bayesian Analysis. Thus even though these ideas are only occasionally
tested directly, you should make sure you have a firm understanding of the concepts in this section. For those who
already know this material, do only a few problems from this section in order to refresh your memory.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 8

The mean of the conditional density function of D1 given that S ≤ 13 is:
{(6)(1) + (5)(2) + (5)(3) + (4)(4) + (4)(5) + (3)(6)} / 27 = 3.148.
The median is 3, since the Distribution Function at 3 is 16/27 ≥ 0.5 while that at 2 is 11/27 < 0.5.
The mode is 1, since that is the value at which the density is a maximum.
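These conditional probabilities are easy to check by brute force. The following Python sketch (my own illustration, not part of the text) enumerates the 36 equally likely rolls and reproduces the conditional density of D1 given S ≤ 13 along with its mean:

    from itertools import product

    outcomes = list(product(range(1, 7), repeat=2))                  # (D1, D2), 36 equally likely
    given = [(d1, d2) for d1, d2 in outcomes if d1 + 2 * d2 <= 13]   # S = D1 + 2*D2 <= 13

    counts = {v: sum(1 for d1, _ in given if d1 == v) for v in range(1, 7)}
    print(counts, len(given))       # {1: 6, 2: 5, 3: 5, 4: 4, 5: 4, 6: 3} out of 27

    mean = sum(v * c for v, c in counts.items()) / len(given)
    print(round(mean, 3))           # 3.148, matching the text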
Exercise: In the above example, if S > 13, what is the chance that D1 = 4?
[Solution: The following 9 pairs have S > 13: (2,6), (3,6), (4,5), (4,6), (5,5),(5,6), (6,4), (6,5) and
(6,6). Thus P[D1 = 4 | S > 13] = 2 / 9.]
Theorem of Total Probability:
We observe that P[D1 = 4 | S > 13] P[S > 13] = (2/9)(9/36) = 2/36, while
P[D1 = 4 | S ≤ 13] P[S ≤ 13] = (4/27)(27/36) = 4/36.
The sum of these two terms is: (2/36) + (4/36) = 6/36 = P[D1 = 4].
Note that either S > 13 or S ≤ 13; these are two disjoint events that cover all the possibilities.
One can have a longer series of mutually disjoint events that cover all the possibilities rather than just
two. If one has such a set of events Bi, then one can write the marginal distribution function P[A] in
terms of the conditional distributions P[A | Bi] and the probabilities P[Bi]:

P[A] = Σi P[A | Bi] P[Bi].

This theorem follows from: Σ P[A | Bi] P[Bi] = Σ P[A and Bi] = P[A], provided that the Bi are
disjoint events that cover all possibilities.
Thus one can compute probabilities of events either directly or by summing a product of terms. For
example, we already know that in this example there is 1 in 6 chance that the first die is 3. However,
we can compute this probability by taking the set of disjoint events, that S is 3, 4, 5,..., or 18.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 9

Let Bi = {S = D1 + 2D2 = i} for i = 3 to 18. Then we have for D1 = 3:

   i      P[D1 = 3 | S = i]     P[S = i]     P[D1 = 3 | S = i] P[S = i]
   3            0.00%            2.78%                0.00%
   4            0.00%            2.78%                0.00%
   5           50.00%            5.56%                2.78%
   6            0.00%            5.56%                0.00%
   7           33.33%            8.33%                2.78%
   8            0.00%            8.33%                0.00%
   9           33.33%            8.33%                2.78%
   10           0.00%            8.33%                0.00%
   11          33.33%            8.33%                2.78%
   12           0.00%            8.33%                0.00%
   13          33.33%            8.33%                2.78%
   14           0.00%            8.33%                0.00%
   15          50.00%            5.56%                2.78%
   16           0.00%            5.56%                0.00%
   17           0.00%            2.78%                0.00%
   18           0.00%            2.78%                0.00%
   Sum                                               16.67%

So in this case we can indeed calculate the probability that the first die is 3, using the Theorem of
Total Probability: 1/6 = P[D1 = 3] = Σi P[D1 = 3 | S = i] P[S = i].
Exercise: Assume that the number of students taking an exam by exam center are as follows:
Chicago 3500, Los Angeles 2000, New York 4500. The number of students from each exam
center passing the exam are: Chicago 2625, Los Angeles 1200, New York 3060.
What is the overall passing percentage?
[Solution: (2625+1200+3060) / (3500+2000+4500) = 6885/10000 = 68.85%. ]
Exercise: Assume that the percent of students taking an exam by exam center are as follows:
Chicago 35%, Los Angeles 20%, New York 45%. The percent of students from each exam center
passing the exam are: Chicago 75%, Los Angeles 60%, New York 68%.
What is the overall passing percentage?
[Solution: P[A | Bi] P[Bi] = (75%)(35%) + (60%)(20%) + (68%)(45%) = 68.85%. ]
Note that this exercise is mathematically the same as the previous exercise. This is a concrete
example of the Theorem of Total Probability.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 10

Conditional Expectation:
The mean of the conditional density function of D2 given that S ≤ 13 is:
{(6)(1) + (6)(2) + (6)(3) + (5)(4) + (3)(5) + (1)(6)} / 27 = 2.852.
In general, in order to compute such a conditional expectation, we take the weighted average over
all the possibilities y:

E[X | B] = Σy y P[X = y | B].

Note that the conditional expectation of D2 given that S ≤ 13, which is 2.852, is not equal to the
unconditional expectation of D2, which is 3.5, the mean of a fair six-sided die.
The fact that we observed that S ≤ 13 decreased the expected value of D2.
Exercise: What is the mean of the conditional density function of D2 given that S > 13?
[Solution: (0/9)(1) + (0/9)(2) + (0/9)(3) + (1/9)(4) + (3/9)(5) + (5/9)(6) = 5.444.]
In general, we can compute the unconditional expectation by taking a weighted average of the
conditional expectations over all the possibilities:

E[X] = Σi E[X | Bi] P[Bi].

The different events Bi must be disjoint and cover all the possibilities.
In this particular case, for example, we can take the two possibilities S ≤ 13 and S > 13:
E[D2] = E[D2 | S ≤ 13] P[S ≤ 13] + E[D2 | S > 13] P[S > 13] =
(2.852)(27/36) + (5.444)(9/36) = 3.5, which is the correct unconditional mean.
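A quick numerical check of this decomposition (a sketch of mine, not part of the text):

    from itertools import product

    rolls = list(product(range(1, 7), repeat=2))          # equally likely (D1, D2)
    low  = [d2 for d1, d2 in rolls if d1 + 2 * d2 <= 13]  # S <= 13
    high = [d2 for d1, d2 in rolls if d1 + 2 * d2 > 13]   # S > 13

    e_low, e_high = sum(low) / len(low), sum(high) / len(high)
    print(round(e_low, 3), round(e_high, 3))              # 2.852 and 5.444

    # Weighting by P[S <= 13] = 27/36 and P[S > 13] = 9/36 recovers E[D2] = 3.5:
    print(e_low * len(low) / 36 + e_high * len(high) / 36)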

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 11

One could have obtained 3.5 by instead summing over each possible value of S:

   i      E[D2 | S = i]     P[S = i]     E[D2 | S = i] P[S = i]
   3          1.000           2.78%              0.028
   4          1.000           2.78%              0.028
   5          1.500           5.56%              0.083
   6          1.500           5.56%              0.083
   7          2.000           8.33%              0.167
   8          2.000           8.33%              0.167
   9          3.000           8.33%              0.250
   10         3.000           8.33%              0.250
   11         4.000           8.33%              0.333
   12         4.000           8.33%              0.333
   13         5.000           8.33%              0.417
   14         5.000           8.33%              0.417
   15         5.500           5.56%              0.306
   16         5.500           5.56%              0.306
   17         6.000           2.78%              0.167
   18         6.000           2.78%              0.167
   Sum                          1                3.500

For example, for S = 11 there are three equally likely possibilities:8
(D1 = 5, D2 = 3), (D1 = 3, D2 = 4), (D1 = 1, D2 = 5). Thus E[D2 | S = 11] = (3+4+5)/3 = 4.
Conditional Variances:
One could be asked any question about a conditional distribution that one could be asked about
any other distribution. For example, the variance of the conditional distribution of D2 given S = 11 is
computed by subtracting the square of the mean from the second moment.
Since for S = 11 there are three equally likely possibilities: (D1 = 5, D2 = 3),
(D1 = 3, D2 = 4), (D1 = 1, D2 = 5), the conditional distribution has probability of 1/3 at each of 3, 4,
and 5. Thus its second moment is: (1/3)(3²) + (1/3)(4²) + (1/3)(5²) = 16.667.
Thus since the mean is 4, the conditional variance is: 16.667 - 4² = 0.667.
In general, one can compute higher moments in the same way one computes the conditional mean:

E[X^n | B] = Σy y^n P[X = y | B].

8 Recall that we defined S = D1 + 2D2.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 12

In general, we can compute the unconditional higher moments by taking a weighted average of the
conditional moments over all the possibilities:

E[X^n] = Σi E[X^n | Bi] P[Bi].
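As a check on the conditional variance example above (Var[D2 | S = 11] = 0.667), here is a brief Python sketch of mine:

    from itertools import product

    # Outcomes with S = D1 + 2*D2 = 11: (5,3), (3,4), (1,5), each equally likely.
    d2_values = [d2 for d1, d2 in product(range(1, 7), repeat=2) if d1 + 2 * d2 == 11]

    mean = sum(d2_values) / len(d2_values)                            # 4
    second_moment = sum(v * v for v in d2_values) / len(d2_values)    # 16.667
    print(round(second_moment - mean ** 2, 3))                        # 0.667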

A Continuous Example:
I've gone through a discrete example involving conditional distributions. Here is a continuous
example.9 Wherever one would use sums in the discrete case, one uses integrals in the continuous
case.
For a given value of λ, the number of claims is Poisson distributed with mean λ.
In turn λ is distributed uniformly from 0.1 to 0.4.
Exercise: From an insured picked at random, what is the chance that zero claims are observed?
[Solution: Given λ, the chance that we observe zero claims is e^-λ.
P(n=0) = ∫[0.1, 0.4] P(n = 0 | λ) f(λ) dλ = ∫[0.1, 0.4] e^-λ (1/0.3) dλ
= (-1/0.3) e^-λ, evaluated from λ = 0.1 to λ = 0.4,
= (e^-0.1 - e^-0.4)/0.3 = 0.782.]


Exercise: From an insured picked at random, what is the chance that one claim is observed?
[Solution: Given λ, the chance that we observe one claim is: e^-λ λ¹/1! = λ e^-λ.
P(n=1) = ∫[0.1, 0.4] P(n = 1 | λ) f(λ) dλ = ∫[0.1, 0.4] λ e^-λ (1/0.3) dλ
= (-1/0.3)(λ e^-λ + e^-λ), evaluated from λ = 0.1 to λ = 0.4,
= (1.1e^-0.1 - 1.4e^-0.4)/0.3 = 0.190.]


Exercise: What is the unconditional mean?
[Solution: The unconditional mean can be obtained by integrating the conditional means versus the
distribution of λ:
E[X] = ∫[0.1, 0.4] E[X | λ] f(λ) dλ = ∫[0.1, 0.4] λ (1/0.3) dλ = {(0.4² - 0.1²)/2} / 0.3 = 0.25.]

9 For additional continuous examples, see the problems below.
Also the Conjugate Prior processes provide continuous examples, including the important Gamma-Poisson.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 13

In general, we can compute the unconditional higher moments by taking an integral of the conditional
moments times the chance of each possibility over all the possibilities:

E[X^n] = ∫ E[X^n | λ] f(λ) dλ.

Exercise: What is the unconditional variance?
[Solution: For the Poisson Distribution the mean is λ and the variance is also λ, and therefore the
(conditional) second moment is: λ + λ². The unconditional second moment can be obtained by
integrating the conditional second moments versus the distribution of λ:
E[X²] = ∫[0.1, 0.4] E[X² | λ] f(λ) dλ = ∫[0.1, 0.4] (λ + λ²) (1/0.3) dλ
= (λ²/2 + λ³/3)(1/0.3), evaluated from λ = 0.1 to λ = 0.4, = 0.32.
From the previous exercise the unconditional mean is 0.25.
Thus the unconditional variance is: 0.32 - 0.25² = 0.2575.
Comment: Integrate the conditional moments, not the conditional variance.]
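These integrals are easy to verify numerically. The following Python sketch of mine uses scipy's quadrature to reproduce P(n=0), P(n=1), and the unconditional mean and variance for λ uniform on (0.1, 0.4):

    from math import exp
    from scipy.integrate import quad

    a, b = 0.1, 0.4
    f = lambda lam: 1 / (b - a)                                    # uniform density of lambda

    p0, _ = quad(lambda lam: exp(-lam) * f(lam), a, b)             # about 0.782
    p1, _ = quad(lambda lam: lam * exp(-lam) * f(lam), a, b)       # about 0.190
    mean, _ = quad(lambda lam: lam * f(lam), a, b)                 # 0.25
    second, _ = quad(lambda lam: (lam + lam**2) * f(lam), a, b)    # 0.32
    print(round(p0, 3), round(p1, 3), round(mean, 2), round(second - mean**2, 4))  # variance 0.2575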

Summary:
This material is preliminary, and therefore there will be very few if any exam questions based solely
on the material in this section. However, it does form the basis for the many Bayesian Analysis
questions covered subsequently. Thus it is a good idea to have an ability to apply the concepts
related to conditional distributions to the type of situations that come up on questions involving
Bayesian Analysis, including those covered in Mahler's Guide to Conjugate Priors.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 14

Problems:
2.1 (2 points) Assume that 3% of men are colorblind, while 0.2% of women are colorblind.
The studio audience of the television show The Vista is made up 20% of men and 80% of
women. A person in this audience is colorblind.
What is the chance that this colorblind person is a man?
A. less than 0.65
B. at least 0.65 but less than 0.70
C. at least 0.70 but less than 0.75
D. at least 0.75 but less than 0.80
E. at least 0.80
Use the following information for the next 6 questions:
A large set of urns contain many black and red balls. There are four types of urns each with differing
percentages of black balls. Each type of urn has a differing chance of being picked.
Type of Urn    A Priori Probability    Percentage of Black Balls
I                      40%                        3%
II                     30%                        5%
III                    20%                        8%
IV                     10%                       13%
2.2 (1 point) An urn is picked and a ball is selected from that urn.
What is the chance that the ball is black?
A. Less than 0.050
B. At least 0.050 but less than 0.055
C. At least 0.055 but less than 0.060
D. At least 0.060 but less than 0.065
E. At least 0.065
2.3 (1 point) An urn is picked and a ball is selected from that urn.
If the ball is black, what is the chance that Urn I was picked?
A. Less than 0.22
B. At least 0.22 but less than 0.24
C. At least 0.24 but less than 0.26
D. At least 0.26 but less than 0.28
E. At least 0.28

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 15

2.4 (1 point) An urn is picked and a ball is selected from that urn.
If the ball is black, what is the chance that Urn II was picked?
A. Less than 0.22
B. At least 0.22 but less than 0.24
C. At least 0.24 but less than 0.26
D. At least 0.26 but less than 0.28
E. At least 0.28
2.5 (1 point) An urn is picked and a ball is selected from that urn.
If the ball is black, what is the chance that Urn III was picked?
A. Less than 0.22
B. At least 0.22 but less than 0.24
C. At least 0.24 but less than 0.26
D. At least 0.26 but less than 0.28
E. At least 0.28
2.6 (1 point) An urn is picked and a ball is selected from that urn.
If the ball is black, what is the chance that Urn IV was picked?
A. Less than 0.22
B. At least 0.22 but less than 0.24
C. At least 0.24 but less than 0.26
D. At least 0.26 but less than 0.28
E. At least 0.28
2.7 (2 points) An urn is picked and a ball is selected from that urn and then replaced. If the ball is
black, what is the chance that the next ball picked from that same urn will be black?
A. Less than 0.06
B. At least 0.06 but less than 0.07
C. At least 0.07 but less than 0.08
D. At least 0.08 but less than 0.09
E. At least 0.09

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 16

Use the following information for the next two questions:


For a given value of λ, the number of claims is Poisson distributed with mean λ.
In turn λ is distributed uniformly from 0 to 1.5.
2.8 (2 points) What is the chance that zero claims are observed?
A. Less than 0.35
B. At least 0.35 but less than 0.40
C. At least 0.40 but less than 0.45
D. At least 0.45 but less than 0.50
E. At least 0.50
2.9 (2 points) What is the chance that one claim is observed?
A. Less than 0.35
B. At least 0.35 but less than 0.40
C. At least 0.40 but less than 0.45
D. At least 0.45 but less than 0.50
E. At least 0.50
Use the following information for the next 8 questions:
Let X and Y be two continuous random variables with joint density function:
f(x,y) = (6 + 12x + 18y) / 25, for 1< y < x < 2. f(x, y) = 0 otherwise.
2.10 (3 points) What is the (unconditional) marginal density f(x)?
A. (11x2 - 6x - 5) / 25
B. (11x2 - 16x - 15) / 25
C. (21x2 - 16x - 15) / 25
D. (21x2 - 6x - 5) / 25
E. None of the above
2.11 (3 points) What is the (unconditional) marginal density f(y)?
A. (30 + 36y - 16y2 ) / 25
B. (36 + 36y - 24y2 ) / 25
C. (30 + 30y - 16y2 ) / 25
D. (36 + 30y - 24y2 ) / 25
E. None of the above

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 17

2.12 (2 points) What is the conditional density f(y | X=1.5)?


A. (24/93) (4 + 3y)
B. (24/93) (3 + 3y)
C. (24/93) (4 + 5y)
D. (24/93) (3 + 5y)
E. None of the above
2.13 (2 points) What is the conditional density f(x | Y = 1.5)?
A. (8x + 4) / 9
B. (4x + 11) / 9
C. (2x + 15) / 9
D. (x + 19) / 9
E. None of the above
2.14 (4 points) What is the conditional expectation E[Y | X =1.6]?
A. 1.27
B. 1.29
C. 1.31
D. 1.33
E. 1.35
2.15 (4 points) What is the conditional expectation E[X | Y = 1.2]?
A. Less than 1.50
B. At least 1.50 but less than 1.55
C. At least 1.55 but less than 1.60
D. At least 1.60 but less than 1.65
E. At least 1.65
2.16 (4 points) What is the unconditional expectation E[X]?
A. Less than 1.50
B. At least 1.50 but less than 1.55
C. At least 1.55 but less than 1.60
D. At least 1.60 but less than 1.65
E. At least 1.65
2.17 (4 points) What is the unconditional expectation E[Y]?
A. Less than 1.25
B. At least 1.25 but less than 1.30
C. At least 1.30 but less than 1.35
D. At least 1.35 but less than 1.40
E. At least 1.40

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 18

Use the following information for the next two questions:


X and Y are each given by the result of rolling a six-sided die.
X and Y are independent of each other. Z = X + Y.
2.18 (1 point) What is the probability that X = 6 if Z 10?
A. 1/5
B. 1/4
C. 1/3
D. 2/5
E. 1/2
2.19 (2 points) What is the expected value of X if Z 10?
A. less than 4.8
B. at least 4.8 but less than 5.0
C. at least 5.0 but less than 5.2
D. at least 5.2 but less than 5.4
E. at least 5.4
2.20 (2 points) Let X and Y each be distributed exponentially with distribution functions:
F(x) = 1 - e-3x, x > 0, H(y) = 1 - e-3y, y > 0. X and Y are independent.
Let Z = X + Y. Given Z = 1/2, what is the conditional distribution of X?
A. P[X=x | Z=1/2] = 2, for 0<x 1/2
B. P[X=x | Z=1/2] = 8x, for 0<x 1/2
C. P[X=x | Z=1/2] = 24x2 , for 0<x 1/2
D. P[X=x | Z=1/2] = 64x3 , for 0<x 1/2.
E. None of the above.
2.21 (2 points) X is Binomial with m = 3 and q = 0.2. Y is Binomial with m = 5 and q = 0.2.
X and Y are independent. What is the conditional distribution of X, given that X + Y = 6?
2.22 (2 points) X is Poisson with mean 3. Y is Poisson with mean 7. X and Y are independent.
What is the conditional distribution of X, given that X + Y = 9?
2.23 (2 points) A nurse has just started to count the babies in a hospital nursery. She has just
counted that there are four boys, and has not counted the girls, when a new baby is brought in to the
nursery. A baby is then selected at random from all the babies present, to have its footprint taken.
The selected baby happens to be a boy.
What is the probability that the baby added was a girl?
A. Less than 44%
B. At least 44%, but less than 46%
C. At least 46%, but less than 48%
D. At least 48%, but less than 50%
E. 50% or more

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 19

Use the following information for the next two questions:


City           Percentage of Total Drivers    Percent of Drivers Accident-Free
Boston                    40%                              90%
Springfield               25%                              92%
Worcester                 20%                              94%
Pittsfield                15%                              96%

2.24 (1 point) A driver is picked at random.


If the driver is accident-free, what is the chance the driver is from Boston?
A. 35%
B. 36%
C. 37%
D. 38%
E. 39%
2.25 (1 point) A driver is picked at random.
If the driver has had an accident, what is the chance the driver is from Pittsfield?
A. Less than 0.05
B. At least 0.05 but less than 0.06
C. At least 0.06 but less than 0.07
D. At least 0.07 but less than 0.08
E. At least 0.08

2.26 ( 3 points ) A die is selected at random from an urn that contains four six-sided dice with the
following characteristics:
                       Number of Faces
Number on Face    Die A    Die B    Die C    Die D
1                   3        1        1        1
2                   1        3        1        1
3                   1        1        3        1
4                   1        1        1        3
The first five rolls of the selected die yielded the following in sequential order: 2, 3, 1, 2, and 4.
What is the probability that the selected die is B?
A. Less than 0.5
B. At least 0.5, but less than 0.6
C. At least 0.6, but less than 0.7
D. At least 0.7, but less than 0.8
E. 0.8 or more

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 20

2.27 (1 point) On a multiple choice exam, each question has 5 possible answers, exactly one of
which is correct.
On those questions for which he is not certain of the answer, Les N. DeRisk's strategy for taking the
exam is to answer at random from the 5 possible answers.
Assume he correctly answers the questions for which he knows the answers.
If Les knows the answers to 76% of the questions, what is the probability that he knew the answer
to a question he answered correctly?
(A) 90%
(B) 92%
(C) 94%
(D) 96%
(E) 98%
2.28 (19 points) You are given the following joint distribution of X and Y:
                 y
x           0      1      2
0          0.1    0.2     0
1           0     0.2    0.1
2          0.2     0     0.2
(a) What are the marginal distributions of X and Y?
(b) What are the conditional distributions of X, given y = 0, 1, 2?
(c) What are the conditional expected values of X, given y = 0, 1, 2?
(d) What are the conditional variances of X, given y = 0, 1, 2?
(e) Verify that E[X] = E[E[X | Y]].
(f) What is E[Var[X | Y]]?
(g) What is Var[E[X | Y]]?
(h) Verify that Var[X] = E[Var[X | Y]] + Var[E[X | Y]].
(i) What are the conditional distributions of Y, given x = 0, 1, 2?
(j) What are the conditional expected values of Y, given x = 0, 1, 2?
(k) What are the conditional variances of Y, given x = 0, 1, 2?
(l) Verify that E[Y] = E[E[Y | X]].
(m) What is E[Var[Y | X]]?
(n) What is Var[E[Y | X]]?
(o) Verify that Var[Y] = E[Var[Y | X]] + Var[E[Y | X]].
(p) What is the skewness of X?
(q) What is the skewness of Y?
(r) What is the kurtosis of X?
(s) What is the kurtosis of Y?

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 21

Use the following information for the next 4 questions:


For any given insured, the number of claims is Negative Binomial with parameters β and r = 3.
Let p = 1/(1+β). Over the portfolio of insureds, p is distributed uniformly from 0.1 to 0.6.
2.29 (2 points) For an insured picked at random, what is the chance that zero claims are observed?
A. Less than 5%
B. At least 5%, but less than 7%
C. At least 7%, but less than 9%
D. At least 9%, but less than 11%
E. 11% or more
2.30 (2 points) For an insured picked at random, what is the chance that one claim is observed?
A. Less than 5%
B. At least 5%, but less than 7%
C. At least 7%, but less than 9%
D. At least 9%, but less than 11%
E. 11% or more
2.31 (2 points) What is the unconditional mean?
A. Less than 6
B. At least 6, but less than 7
C. At least 7, but less than 8
D. At least 8, but less than 9
E. 9 or more
2.32 (2 points) What is the unconditional variance?
A. Less than 55
B. At least 55, but less than 60
C. At least 60, but less than 65
D. At least 65, but less than 70
E. 70 or more
2.33 (2 points) A medical test has been developed for the disease Hemoglophagia.
The test gives either a positive result, indicating that the patient has Hemoglophagia, or a negative
result, indicating that the patient does not have Hemoglophagia.
However, the test sometimes gives an incorrect result.
1 in 500 of those who do not have Hemoglophagia nevertheless have a positive test result.
3% of those having Hemoglophagia have a negative test result.
If 83% is the probability that a person with a positive test result has Hemoglophagia, determine the
percent of the general population that has Hemoglophagia.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 22

Use the following information for the next two questions:


The number of children per family is Poisson with λ = 2.
2.34 (1 point) A family is picked at random.
If the family has at least one child, what is the probability that it has more than one child?
A. 69%
B. 71%
C. 73%
D. 75%
E. 77%
2.35 (2 points) A child is picked at random.
What is the probability that this child has at least one brother or sister?
A. 83%
B. 85%
C. 87%
D. 89%
E. 91%

2.36 (3 points) Prior to your going on vacation for a week, your neighbor, forgetful Frank, has agreed
to water your prized potted plant.
If your plant is watered it has a 95% chance of living.
If your plant is not watered it has only a 40% chance of living.
When you return from vacation your plant is dead!
Discuss how an actuary would determine the probability that Frank watered your plant.
2.37 (2, 5/85, Q.10) (1.5 points) Let X and Y be continuous random variables with joint density
function f(x, y) = 1.5x for 0 < y < 2x < 1.
What is the conditional density function of Y given X = x ?
A. 1/(2x) for 0 < x < 1/2.
B. 1/(2x) for 0 < y < 2x < 1.
C. 4/x for 0 < x < 1.
D. 4/x for 0 < y < 2x < 1.
E. 16x/(1 - 2y2 ) for 0 < y < 2x < 1.
2.38 (2, 5/85, Q.27) (1.5 points) Let X and Y have the joint density function
f(x, y) = x + y for 0 < x < 1 and 0 < y < 1. What is the conditional mean E(Y | X = 1/3)?
A. 3/8
B. 5/12
C. 1/2
D. 7/12
E. 3/5
2.39 (4, 5/86, Q.32) (1 point) Let X,Y and Z be random variables.
Which of the following statements are true?
1. The variance of X is the second moment about the origin of X.
2. If Z is the product of X and Y, then the expected value of Z is the product of the expected
values of X and Y.
3. The expected value of X is equal to the expectation over all possible values of Y,
of the conditional expectation of X given Y.
A. 2
B. 3
C. 1, 2
D. 1, 3
E. 2, 3

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 23

2.40 (2, 5/88, Q.18) (1.5 points) Let X and Y be continuous random variables with joint density
function f(x, y) = 6xy + 3x² for 0 < x < y < 1. What is E(X | Y = y)?
A. 3y⁴      B. 2y⁵      C. (6xy + 3x²)/(4y³)      D. (6x²y + 3x³)/(4y³)      E. 11y/16

2.41 (2, 5/88, Q.37) (1.5 points) Let X and Y be continuous random variables with joint density
function f(x, y) = e^-y / 2 for -y < x < y and y > 0. What is P[X < 1 | Y = 3]?
A. e^-3/2      B. 2e^-3      C. (e^-1 - e^-3)/2      D. 1/6      E. 2/3

2.42 (4, 5/88, Q.33) (1 point) On this multiple choice exam, each question has 5 possible
answers, exactly one of which is correct. On those questions for which he is not certain of the answer,
Stu Dent's strategy for taking the exam is to answer at random from the 5 possible answers.
Assume he correctly answers the questions for which he knows the answers.
If Stu Dent knows the answers to 75% of the questions, what is the probability that he knew the
answer to a question he answered correctly?
A. Less than 0.850
B. At least 0.850, but less than 0.875
C. At least 0.875, but less than 0.900
D. At least 0.900, but less than 0.925
E. 0.925 or more
2.43 (4, 5/89, Q.28) (2 points) A die is selected at random from an urn that contains two six-sided
dice with the following characteristics:
                       Number of Faces
Number on Face      Die #1      Die #2
1                      1           1
2                      3           1
3                      1           1
4                      1           3
The first five rolls of the selected die yielded the following in sequential order: 2, 3, 4, 1, and 4.
What is the probability that the selected die is the second one?
A. Less than 0.5
B. At least 0.5, but less than 0.6
C. At least 0.6, but less than 0.7
D. At least 0.7, but less than 0.8
E. 0.8 or more

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 24

2.44 (4, 5/89, Q.31) (3 points) Let Y be a random variable which represents the number of claims
that occur in a given year. The probability density function for Y is a function of the parameter θ.
The parameter θ is distributed uniformly over the interval (0, 1).
The probability of no claims occurring during a given year is greater than 0.350.
That is, P(Y=0) > 0.350 under the assumption that the prior distribution of θ is uniform on (0, 1).
Which of the following represent possible conditional probability distributions for Y given θ?
(Here C(n, y) denotes the binomial coefficient.)
1. P(Y = y | θ) = e^-θ θ^y / y!
2. P(Y = y | θ) = C(n+y-1, y) θ^n (1-θ)^y, for n = 2
3. P(Y = y | θ) = C(n, y) θ^y (1-θ)^(n-y), for n = 2
A. 1      B. 2      C. 3      D. 1, 2      E. 1, 3

2.45 (2, 5/90, Q.14) (1.7 points) Let X and Y be continuous random variables with joint density
function f(x, y) = (12/25)(x + y2 ) for 1 < x < y < 2.
What is the marginal density function of Y where nonzero?
A. (6/25)(2y3 - y2 - 1) for 1 < y < 2.

B. (6/25)(3 + 2y2 ) for 1 < y < 2.

C. (6/25)y2 (1 + 2y) for 1 < y < 2.

D. 3(x + y2 )/(8 + 6x - 3x2 - x3 ) for 1 < x < y < 2.

E. 3(x + y2 )/(7 + 3x) for 1 < x < y < 2.


2.46 (2, 5/90, Q.32) (1.7 points) Let X and Y be discrete random variables with joint probability
function f(x, y) = (x² + y²)/56, for x = 1, 2, 3, and y = 1,..., x. What is P[Y = 3 | Y ≥ 2]?
A. 9/28
B. 1/3
C. 6/13
D. 41/54
E. 6/7

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 25

2.47 (2, 5/90, Q.35) (1.7 points) Let X and Y be continuous random variables with joint
probability function f(x, y) and marginal density functions fX and fY, respectively, that are nonzero
only on the interval (0, 1). Which of the following statements is always true?
A. E[X²Y³] = ∫[0, 1] x² dx · ∫[0, 1] y³ dy
B. E[X²] = ∫[0, 1] x² f(x,y) dx
C. E[X²Y³] = ∫[0, 1] x² f(x,y) dx · ∫[0, 1] y³ f(x,y) dy
D. E[X²] = ∫[0, 1] x² fX(x) dx
E. E[Y³] = ∫[0, 1] y³ fX(x) dx

2.48 (4, 5/91, Q.30) (2 points) The expected value of a random variable X, is written as E[X].
Which of the following are true?
1. E[X] = EY[E[X|Y]]
2. Var[X] = EY[Var[X|Y]] + VarY[E[X|Y]]
3. E[g(Y)] = EY[EX[g(Y)| X]]
A. 1, 2

B. 1, 3

C. 2, 3

D. 1, 2, 3

E. None of A, B, C, or D

2.49 (2, 5/92, Q.11) (1.7 points) Let X and Y be continuous random variables with joint density
function f(x, y) = 3x/4 for 0 < x < 2 and 0 < y < 2 - x. What is P[X > 1]?
A. 1/8
B. 1/4
C. 3/8
D. 1/2
E. 3/4
2.50 (2, 5/92, Q.17) (1.7 points) Let X and Y be continuous random variables with joint density
function f(x, y) = x + y for 0 < x < 1, 0 < y < 1.
What is the marginal density function for X, where nonzero?
A. y + 1/2

B. 2x

C. x

D. (x + x2 )/2

E. x + 1/2

2.51 (2, 5/92, Q.44) (1.7 points) Let X and Y be discrete random variables with joint probability
function f(x, y) = (x + 1)(y + 2)/54 for x = 0, 1, 2 and y = 0, 1, 2. What is E(Y | X = 1)?
A. 11/27

B. 1

C. 11/9

D. (y + 2)/9

E. (y2 + 2y)/9

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 26

2.52 (4B, 5/92, Q.4) (2 points)


You have selected a die at random from the two dice described below.
Die A
Die B
2 sides labeled 1
4 sides labeled 1
2 sides labeled 2
1 side labeled 2
2 sides labeled 3
1 side labeled 3
The following outcomes from 5 tosses of the selected die are observed: 1, 1, 2, 3, 1.
Determine the probability that you selected Die A.
A. Less than 0.20
B. At least 0.20 but less than 0.30
C. At least 0.30 but less than 0.40
D. At least 0.40 but less than 0.50
E. At least 0.50
2.53 (4B, 5/94, Q.5) (2 points) Two honest, six-sided dice are rolled, and the results D1 and D2
are observed. Let S = D1 + D2 .
Which of the following are true concerning the conditional distribution of D1 given that S<6?
1. The mean is less than the median.
2. The mode is less than the mean.
3. The probability that D1 = 2 is 1/3.
A. 2

B. 3

C. 1, 2

D. 2, 3

E. None of A, B, C, or D

2.54 (2, 2/96, Q.34) (1.7 points) Let X and Y be continuous random variables with joint density
function f(x, y) = 2 for 0 < x < y < 1.
Determine the conditional density function of Y given X = x, where 0 < x < 1.
A. 1/(1 - x) for x < y < 1
B. 2(1 - x) for x < y < 1
C. 2 for x < y < 1
D. 1/y for x < y <1
E. 1/(1 - y) for x < y < 1
2.55 (4B, 5/96, Q.8) (1 point) You are given the following:

X1 and X2 are two independent observations of a discrete random variable X.

X is equally likely to be 1, 2, 3, 4, 5, or 6.
Determine the conditional mean of X1 given that X1 + X2 is less than or equal to 4.
A. 5/3
B. 2
C. 5/2
D. 10/3
E. 7/2

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 27

2.56 (4B, 11/98, Q.9) (2 points) You are given the following:
A portfolio consists of 75 liability risks and 25 property risks.
The risks have identical claim count distributions.
Loss sizes for liability risks follow a Pareto distribution, with parameters θ = 300 and α = 4.
Loss sizes for property risks follow a Pareto distribution, with parameters θ = 1,000 and α = 3.
A risk is randomly selected from the portfolio and a claim of size k is observed.
Determine the limit of the posterior probability that this risk is a liability risk as k goes to zero.
A. 3/4
B. 40/49
C. 10/11
D. 40/43
E. 1
2.57 (4B, 11/98, Q.16) (2 points) You are given the following:
A portfolio of automobile risks consists of 900 youthful drivers and 800 nonyouthful drivers.
A youthful driver is twice as likely as a nonyouthful driver to incur at least one claim during
the next year.
The expected number of youthful drivers (n) who will be claim-free during the next year is
equal to the expected number of nonyouthful drivers who will be claim-free during
the next year.
Determine n.
A. Less than 150
B. At least 150, but less than 350
C. At least 350, but less than 550
D. At least 550, but less than 750
E. At least 750

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 28

Use the following information for the next two questions:

Four shooters are available to shoot at a target some distance away that has the
following design:

Shooter A hits Areas R, S, U, and V, each with probability 1/4.


Shooter B hits Areas S, T, V, and W, each with probability 1/4.
Shooter C hits Areas U, V, X, and Y, each with probability 1/4.
Shooter D hits Areas V, W, Y, and Z, each with probability 1/4.

2.58 (4B, 5/99, Q.9) (2 points) One shooter is randomly selected and fires two shots.
Determine the probability that the shooter can be identified with certainty.
A. 1/4
B. 7/16
C. 1/2
D. 9/16
E. 3/4
2.59 (4B, 11/99, Q.11) (2 points)
Two distinct shooters are randomly selected, and each fires one shot.
Determine the probability that both shots land in the same Area.
A. 1/16
B. 5/48
C. 3/16
D. 1/4
E. 5/12
2.60 (Course 1 Sample Exam, Q.13) (1.9 points)
Let X and Y be discrete random variables with joint probability function
p(x, y) = (2x + y)/12, for (x, y) = (0, 1), (0, 2), (1, 2) and (1, 3).
Determine the marginal probability function of X.
A. p(x) is 1/6 for x = 0 and 5/6 for x = 1.
B. p(x) is 1/4 for x = 0 and 3/4 for x = 1.
C. p(x) is 1/3 for x = 0 and 2/3 for x = 1.
D. p(x) is 2/9 for x = 1, 3/9 x = 2, and 4/9 for x = 3.
E. p(x) is y/12 for x = 0 and (2 + y)/12 for x = 1.
2.61 (IOA 101, 4/00, Q.7) (2.25 points) Suppose that X and Y are continuous random variables.
Prove that E(X) = ∫[-∞, ∞] E(X | Y = y) fY(y) dy.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 29

2.62 (1, 5/00, Q.22) (1.9 points) An actuary determines that the annual numbers of tornadoes in
counties P and Q are jointly distributed as follows:

                                        Annual number of tornadoes in county Q
                                           0       1       2       3
   Annual number of tornadoes      0      0.12    0.06    0.05    0.02
   in county P                     1      0.13    0.15    0.12    0.03
                                   2      0.05    0.15    0.10    0.02
Calculate the conditional variance of the annual number of tornadoes in county Q, given that there are
no tornadoes in county P.
(A) 0.51
(B) 0.84
(C) 0.88
(D) 0.99
(E) 1.76
2.63 (2 points) In the previous question, calculate the conditional variance of the annual number of
tornadoes in county Q, given that there is at least one tornado in county P.
A. Less than 0.75
B. At least 0.75, but less than 0.80
C. At least 0.80, but less than 0.85
D. At least 0.85, but less than 0.90
E. At least 0.90
2.64 (1, 11/00, Q.4) (1.9 points) A diagnostic test for the presence of a disease has two possible
outcomes: 1 for disease present and 0 for disease not present. Let X denote the disease state of a
patient, and let Y denote the outcome of the diagnostic test. The joint probability function of X and
Y is given by: P(X = 0, Y = 0) = 0.800. P(X = 1, Y = 0) = 0.050. P(X = 0, Y = 1) = 0.025.
P(X = 1, Y = 1) = 0.125.
Calculate Var(Y | X = 1).
(A) 0.13
(B) 0.15
(C) 0.20
(D) 0.51
(E) 0.71
2.65 (IOA 101, 4/01, Q.2) (1.5 points) A certain medical test either gives a positive or negative
result. The positive test result is intended to indicate that a person has a particular (rare) disease,
while a negative test result is intended to indicate that they do not have the disease. Suppose,
however, that the test sometimes gives an incorrect result: 1 in 100 of those who do not have the
disease have positive test results, and 2 in 100 of those having the disease have negative test
results. If 1 person in 1000 has the disease, calculate the probability that a person with a positive
test result has the disease.
2.66 (4, 11/04, Q.13 & 2009 Sample Q.142) (2.5 points) You are given:
(i) The number of claims observed in a 1-year period has a Poisson distribution with mean θ.
(ii) The prior density is: π(θ) = e^-θ/(1 - e^-k), 0 < θ < k.
(iii) The unconditional probability of observing zero claims in 1 year is 0.575.
Determine k.
(A) 1.5
(B) 1.7
(C) 1.9
(D) 2.1
(E) 2.3

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 30

Solutions to Problems:
2.1. D. The probability of picking a colorblind person out of this population is:
(3%)(20%) + (0.2%)(80%) = 0.76%.
The chance of a person being both colorblind and male is: (3%)(20%) = 0.6%.
Thus the (conditional) probability that the colorblind person is a man is: 0.6% / 0.76% = 78.9%.
Alternately, assume we have a 1000 people: 200 men and 800 women.
Expected number of colorblind people is: (3%)(200) + (0.2%)(800) = 6 + 1.6 = 7.6.
The proportion of the colorblind people in the audience who are male is: 6/7.6 = 78.9%.
2.2. C. Taking a weighted average, the a priori chance of a black ball is 5.6%.
Type    A Priori Probability    % Black Balls    Col. B times Col. C
I                0.4                 0.03              0.0120
II               0.3                 0.05              0.0150
III              0.2                 0.08              0.0160
IV               0.1                 0.13              0.0130
SUM              1                                     0.0560

2.3. A. P[Urn = I | Ball = Black] = P[Urn = I and Ball =Black] / P[Ball = Black] =
(.4)(.03) / .056 = .012/.056 = 21.4%.
2.4. D. P[Urn = II | Ball = Black] = P[Urn = II and Ball =Black] / P[Ball = Black] =
(.3)(.05) / .056 = .015/.056 = 26.8%.
2.5. E. P[Urn = III | Ball = Black] = P[Urn = III and Ball =Black] / P[Ball = Black] =
(.2)(.08) / .056 = .016/.056 = 28.6%.
2.6. B. P[Urn = IV | Ball = Black] = P[Urn = IV and Ball =Black] / P[Ball = Black] =
(.1)(.13) / .056 = .013/.056 = 23.2%.
Comment: The conditional probabilities of the four types of urns add to unity.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 31

2.7. C. Using the solutions to the previous problems, one takes a weighted average using the
posterior probabilities of each type of urn:
(21.4%)(.03) + (26.8%)(.05) + (28.6%)(.08) + (23.2%)(.13) = 0.073.
Comment: This whole set of problems can be usefully organized into a spreadsheet:
Type    A Priori       % Black    Probability Weights =    Posterior       Col. C x
        Probability    Balls      Col. B x Col. C          Probability     Col. E
I          0.4          0.03           0.0120                0.214          0.0064
II         0.3          0.05           0.0150                0.268          0.0134
III        0.2          0.08           0.0160                0.286          0.0229
IV         0.1          0.13           0.0130                0.232          0.0302
SUM                                    0.0560                1.000          0.0729

This is a simple example of Bayesian Analysis, which is covered subsequently.
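The spreadsheet logic above translates directly into code. A minimal Python sketch of mine of the same Bayes table:

    priors = {"I": 0.40, "II": 0.30, "III": 0.20, "IV": 0.10}
    p_black = {"I": 0.03, "II": 0.05, "III": 0.08, "IV": 0.13}

    weights = {urn: priors[urn] * p_black[urn] for urn in priors}    # P(urn and black)
    p_black_total = sum(weights.values())                            # P(black) = 0.056
    posterior = {urn: w / p_black_total for urn, w in weights.items()}

    # Predictive probability that the next ball from the same urn is black (problem 2.7):
    print(sum(posterior[urn] * p_black[urn] for urn in priors))      # about 0.073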


2.8. E. Given λ, the chance that we observe zero claims is: e^-λ λ⁰/0! = e^-λ.
P(n=0) = ∫[0, 1.5] P(n=0 | λ) f(λ) dλ = ∫[0, 1.5] e^-λ (1/1.5) dλ = (-1/1.5) e^-λ, evaluated from 0 to 1.5,
= (1 - e^-1.5)/1.5 = 0.518.

2.9. A. Given λ, the chance that we observe one claim is: e^-λ λ¹/1! = λ e^-λ.
P(n=1) = ∫[0, 1.5] P(n=1 | λ) f(λ) dλ = ∫[0, 1.5] λ e^-λ (1/1.5) dλ = (-1/1.5){λ e^-λ + e^-λ}, evaluated from 0 to 1.5,
= (1 - 2.5e^-1.5)/1.5 = 0.295.
Comment: For larger numbers of observed claims the result is an incomplete Gamma Function.
In this case P(n=1) = (1/1.5) Γ(2; 1.5) = (0.4422)/1.5 = 0.295.
For example, P(n=2) = (1/1.5) Γ(3; 1.5) = (0.1912)/1.5 = 0.127.
One can compute the other probabilities in a similar manner:

# claims     0        1        2        3        4        5        6        7
Prob       0.5179   0.2948   0.1274   0.0438   0.0124   0.0030   0.0006   0.0001
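The incomplete Gamma values quoted in this comment can be checked with scipy (a sketch of mine; scipy's gammainc(a, x) is the regularized lower incomplete gamma function, which matches the Γ(α; x) used here):

    from scipy.special import gammainc

    # For lambda uniform on (0, 1.5): P(n = k) = (1/1.5) * Gamma(k+1; 1.5).
    for k in range(4):
        print(k, round(gammainc(k + 1, 1.5) / 1.5, 4))   # 0.5179, 0.2948, 0.1274, 0.0438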

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 32


2.10. E. f(x) = ∫[1, x] f(x, y) dy = (1/25) ∫[1, x] (6 + 12x + 18y) dy = (1/25)(6y + 12xy + 9y²), evaluated from y = 1 to y = x,
= (6x + 12x² + 9x² - 6 - 12x - 9)/25 = (21x² - 6x - 15)/25.
Comment: This unconditional density function is referred to as the marginal or full marginal density of x.

2.11. D. f(y) = ∫[y, 2] f(x, y) dx = (1/25) ∫[y, 2] (6 + 12x + 18y) dx = (1/25)(6x + 6x² + 18yx), evaluated from x = y to x = 2,
= (12 + 24 + 36y - 6y - 6y² - 18y²)/25 = (36 + 30y - 24y²)/25.


2.12. A. f(y | x) = f(x,y)/f(x) = (6 + 12x + 18y)/(21x² - 6x - 15). f(y | X = 1.5) =
(6 + 12(1.5) + 18y)/(21(1.5)² - 6(1.5) - 15) = (24 + 18y)/23.25 = (4 + 3y)(24/93).
2.13. B. f(x | y) = f(x,y)/f(y) = (6 + 12x + 18y)/(36 + 30y - 24y²). f(x | Y = 1.5) =
(6 + 12x + 18(1.5))/(36 + 30(1.5) - 24(1.5)²) = (12x + 33)/27 = (4x + 11)/9.
Comment: Verify that the integral of this density function from x=1.5 to 2 is in fact unity.
2.14. C. E[Y | X = x] = ∫[1, x] y f(y | x) dy = {1/(21x² - 6x - 15)} ∫[1, x] (6y + 12xy + 18y²) dy =
{1/(21x² - 6x - 15)} (3y² + 6xy² + 6y³), evaluated from y = 1 to y = x, = (12x³ + 3x² - 6x - 9)/(21x² - 6x - 15)
= (4x² + 5x + 3)/(7x + 5).
Therefore, E[Y | X = 1.6] = {4(1.6)² + 5(1.6) + 3}/{7(1.6) + 5} = 1.311.
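Such integrals can also be checked symbolically. A short sympy sketch of mine for this conditional expectation:

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)
    f = (6 + 12*x + 18*y) / 25                        # joint density on 1 < y < x < 2
    f_x = sp.integrate(f, (y, 1, x))                  # marginal density of X
    e_y_given_x = sp.simplify(sp.integrate(y * f / f_x, (y, 1, x)))
    print(e_y_given_x)                                # equivalent to (4x^2 + 5x + 3)/(7x + 5)
    print(float(e_y_given_x.subs(x, 1.6)))            # about 1.311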

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 33


2.15. D. E[X | Y = y] = ∫[y, 2] x f(x | y) dx = {1/(36 + 30y - 24y²)} ∫[y, 2] (6x + 12x² + 18yx) dx =
{1/(36 + 30y - 24y²)} (3x² + 4x³ + 9yx²), evaluated from x = y to x = 2, = (44 + 36y - 3y² - 13y³)/(36 + 30y - 24y²) =
(13y² + 29y + 22)/(24y + 18).
Therefore, E[X | Y = 1.2] = {13(1.2)² + 29(1.2) + 22}/{24(1.2) + 18} = 1.614.
Comment: E[X | Y = y] is close to being linear.

2.16. E. E[X] = ∫[1, 2] ∫[y, 2] x f(x,y) dx dy = (1/25) ∫[1, 2] ∫[y, 2] (6x + 12x² + 18xy) dx dy =
(1/25) ∫[1, 2] {(3x² + 4x³ + 9x²y), evaluated from x = y to x = 2} dy = (1/25) ∫[1, 2] (44 + 36y - 3y² - 13y³) dy
= (1/25)(44y + 18y² - y³ - (13/4)y⁴), evaluated from y = 1 to y = 2, = 169/100 = 1.69.
Comment: One can get the same answer by taking the integral over y of: E[X | Y = y] f(y).

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 34


2.17. D. E[Y] = ∫[1, 2] ∫[y, 2] y f(x,y) dx dy = (1/25) ∫[1, 2] ∫[y, 2] (6y + 12xy + 18y²) dx dy =
(1/25) ∫[1, 2] {(6xy + 6x²y + 18xy²), evaluated from x = y to x = 2} dy = (1/25) ∫[1, 2] (36y + 30y² - 24y³) dy
= (1/25)(18y² + 10y³ - 6y⁴), evaluated from y = 1 to y = 2, = 34/25 = 1.36.
Comment: One can get the same answer by taking the integral over x of: E[Y | X = x] f(x).
2.18. E. There are the following 6 equally likely possibilities such that Z ≥ 10:
(4,6), (5,5), (5,6), (6,4), (6,5), (6,6). Of these, 3 have X = 6, so that
Prob[X = 6 | Z ≥ 10] = Prob[X = 6 and Z ≥ 10] / Prob[Z ≥ 10] = (3/36)/(6/36) = 1/2.
2.19. D. There are the following 6 equally likely possibilities such that X + Y ≥ 10: (4,6), (5,5),
(5,6), (6,4), (6,5), (6,6). Of these one has X = 4, two have X = 5, and three have X = 6.
Therefore E[X | Z ≥ 10] = {(1)(4) + (2)(5) + (3)(6)}/6 = 5.33.
For those who like diagrams:

   X                                        1      2      3      4      5      6
   Possibilities (values of Y)              0      0      0      1      2      3
   Conditional Density Function of X,
   given that X + Y ≥ 10                    0      0      0     1/6    2/6    3/6

E[X | Z ≥ 10] = Σi i P[X = i | Z ≥ 10] = (1)(0) + (2)(0) + (3)(0) + (4)(1/6) + (5)(2/6) + (6)(3/6) = 5.33.
2.20. A. Since y > 0 and x + y = 1/2, x ≤ 1/2. If x + y = 1/2, then y = 0.5 - x.
The probability that X = x and Y = 0.5 - x is: f(x) h(0.5 - x) = (3e^-3x)(3e^-3(0.5-x)) = 9e^-1.5. Now this is
proportional to the conditional distribution of x since: P[X=x | Z=1/2] = P[X=x and Z=1/2]/P[Z=1/2].
Thus the conditional distribution of x is proportional to a constant, 9e^-1.5, thus it is uniform on (0, 0.5].
P[X=x | Z=1/2] = 2, for 0 < x ≤ 1/2.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 35

2.21. X + Y is Binomial with m = 3 + 5 = 8 and q = 0.2.
Prob[X = x | X + Y = 6] = Prob[X = x and X + Y = 6]/Prob[X + Y = 6] =
Prob[X = x] Prob[Y = 6 - x]/Prob[X + Y = 6] =
{3!/((3-x)! x!)} 0.2^x 0.8^(3-x) {5!/((x-1)! (6-x)!)} 0.2^(6-x) 0.8^(x-1) / [{8!/(2! 6!)} 0.2⁶ 0.8²]
= {3!/((3-x)! x!)} {5!/((x-1)! (6-x)!)} / {8!/(2! 6!)} = C(3, x) C(5, 6-x) / C(8, 6).
This is a Hypergeometric Distribution.
Comment: Beyond what you are likely to be asked on your exam! The Hypergeometric Distribution is
discussed for example in Introduction to Probability Models by Ross.
2.22. X + Y is Poisson with mean 3 + 7 = 10.
Prob[X = x | X + Y = 9] = Prob[X = x and X + Y = 9]/Prob[X + Y = 9] =
Prob[X = x] Prob[Y = 9 - x]/Prob[X + Y = 9] =
{e^-3 3^x/x!} {e^-7 7^(9-x)/(9-x)!} / {e^-10 10⁹/9!} = {9!/(x! (9-x)!)} 3^x 7^(9-x)/10⁹ = C(9, x) 0.3^x 0.7^(9-x).
This is Binomial with m = 9 and q = 0.3.
Comment: In general, Prob[X = x | X + Y = z] is Binomial with m = z and q = λ1/(λ1 + λ2).
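This thinning result is easy to confirm numerically; a brief sketch of mine using scipy:

    from scipy.stats import binom, poisson

    lam_x, lam_y, total = 3, 7, 9
    for x in range(total + 1):
        direct = (poisson.pmf(x, lam_x) * poisson.pmf(total - x, lam_y)
                  / poisson.pmf(total, lam_x + lam_y))
        via_binomial = binom.pmf(x, total, lam_x / (lam_x + lam_y))
        print(x, round(direct, 4), round(via_binomial, 4))   # the two columns agree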
2.23. B. Assume a priori that the unknown baby is equally like to be a boy or a girl. After the baby
is added, let there be a total of N babies. Then there are 4 boys, N-5 girls, and one unknown baby.
If the unknown baby is a boy, then the chance of having picked a boy (the chance of the
observation) is 5/N. On the other hand, if the unknown baby is a girl, then the chance of having
picked a boy (the chance of the observation) is 4/N. Then the posterior chances of being a boy or a
girl are proportional to 5/N and 4/N respectively.
Thus the posterior probabilities are: 5/9 and 4/9.
The probability that the baby added was a girl is: 4/9 = 0.444.
   Type of         A Priori Chance    Chance of the      Prob. Weight =      Posterior Chance
   Unknown Baby    of This Type       Observation        Product of          of This Type
                                      Given Type         Columns B & C
   Boy                 0.500              5/N               2.5/N                 5/9
   Girl                0.500              4/N               2/N                   4/9
   Overall                                                  4.5/N                 1.000

Comment: See "It's a Puzzlement" by John Robertson in the 5/97 Actuarial Review.

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 36

2.24. E. The chance that a driver is accident-free is:


(40%)(90%) + (25%)(92%) + (20%)(94%) + (15%)(96%) = 92.2%.
The chance that a driver is both accident-free and from Boston is: (40%)(90%) = 36%.
Thus the chance this driver is from Boston is: 36% / 92.2% = 39.0%.
Comment: Some may find to helpful to assume for example a total of 100,000 drivers.
2.25. D. The chance that a driver has had an accident is:
(40%)(10%) + (25%)(8%) + (20%)(6%) + (15%)(4%) = 7.8%.
The chance that a driver both has had an accident and is from Pittsfield is (15%)(4%) = .6%.
Thus the chance this driver is from Pittsfield is: .6% / 7.8% = 0.077.
Comment: This is the type of reasoning required to do an exam question involving Bayesian
Analysis. Note that the chances for each of the other cities are:
{(40%)(10%), (25%)(8%), (20%)(6%)} / 7.8%. You should confirm that the conditional probabilities
for the four cities sum to 100%.
2.26. B. If one has Die A, then the chance of the observation is: (1/6)(1/6)(3/6)(1/6)(1/6) =
3/7776. This is also the chance of the observation for Die C or Die D. If one has Die B, then the
chance of the observation is: (3/6)(1/6)(1/6)(3/6)(1/6) = 9/7776.

   Die    A Priori       Chance of       Probability Weights =    Posterior
          Probability    Observation     Col. B x Col. C          Probability
   A         0.25         0.000386            0.000096              0.1667
   B         0.25         0.001157            0.000289              0.5000
   C         0.25         0.000386            0.000096              0.1667
   D         0.25         0.000386            0.000096              0.1667
   SUM                                        0.000579              1.000

2.27. C. If Les knows the answer, then the chance of observing a correct answer is 100%.
If Les doesn't know the answer to a question, then the chance of observing a correct answer is 20%.

   Type of             A Priori Chance of       Chance of the    Prob. Weight = Product    Posterior Chance of
   Question            This Type of Question    Observation      of Columns B & C          This Type of Question
   Les knows                 0.760                 1.0000               0.7600                    94.06%
   Les Doesn't know          0.240                 0.2000               0.0480                     5.94%
   Overall                                                              0.808                     1.000

2013-4-9

Buhlmann Credibility 2 Conditional Distributions, HCM 10/19/12, Page 37

2.28. (a) Marginal Distribution of X is the result of adding across columns:


30% @0, 30% @1, 40% @2.
Marginal Distribution of Y is the result of adding down rows:
30% @0, 40% @1, 30% @2.
(b) Given y = 0, conditional distribution of X: 1/3 @0, 0 @1, 2/3 @2.
Given y = 1, conditional distribution of X: 1/2 @0, 1/2 @1, 0 @2.
Given y = 2, conditional distribution of X: 0 @0, 1/3 @1, 2/3 @2.
(c) Given y = 0, conditional expected value of X: (1/3)(0) + (0)(1) + (2/3)(2) = 4/3.
Given y = 1, conditional expected value of X: (1/2)(0) + (1/2)(1) + (0)(2) = 1/2.
Given y = 2, conditional expected value of X: (0)(0) + (1/3)(1) + (2/3)(2) = 5/3.
(d) Given y = 0, conditional second moment of X: (1/3)(02 ) + (0)(12 ) + (2/3)(22 ) = 8/3.
Given y = 1, conditional second moment of X: (1/2)(02 ) + (1/2)(12 ) + (0)(22 ) = 1/2.
Given y = 2, conditional second moment of X: (0)(02 ) + (1/3)(12 ) + (2/3)(22 ) = 3.
Given y = 0, conditional variance of X: 8/3 - (4/3)2 = 8/9.
Given y = 1, conditional variance of X: 1/2 - (1/2)2 = 1/4.
Given y = 2, conditional variance of X: 3 - (5/3)2 = 2/9.
(e) Using the marginal distribution, E[X] = (.3)(0) + (.3)(1) + (.4)(2) = 1.1.
E[E[X | Y]] = Prob(y = 0)E[X | y = 0] + Prob(y = 1)E[X | y = 1] + Prob(y = 2)E[X | y = 2] =
(.3)(4/3) + (.4)(1/2) + (.3)(5/3) = 1.1.
(f) E[Var[X | Y]] = Prob(y = 0)Var[X | y =0] + Prob(y = 1)Var[X | y =1] + Prob(y = 2)Var[X | y =2]
= (.3)(8/9) + (.4)(1/4) + (.3)(2/9) = .4333.
(g) Var[E[X | Y]] = {(.3)(4/3)^2 + (.4)(1/2)^2 + (.3)(5/3)^2} - 1.1^2 = 0.2567.
(h) Var[X] = {(.3)(0^2) + (.3)(1^2) + (.4)(2^2)} - 1.1^2 = 0.690 = 0.4333 + 0.2567 =
E[Var[X | Y]] + Var[E[X | Y]].
(i) Given x = 0, conditional distribution of Y: 1/3 @0, 2/3 @1, 0 @2.
Given x = 1, conditional distribution of Y: 0 @0, 2/3 @1, 1/3 @2.
Given x = 2, conditional distribution of Y: 1/2 @0, 0 @1, 1/2 @2.
(j) Given x = 0, conditional expected value of Y: (1/3)(0) + (2/3)(1) + (0)(2) = 2/3.
Given x = 1, conditional expected value of Y: (0)(0) + (2/3)(1) + (1/3)(2) = 4/3.
Given x = 2, conditional expected value of Y: (1/2)(0) + (0)(1) + (1/2)(2) = 1.
(k) Given x = 0, conditional second moment of Y: (1/3)(02 ) + (2/3)(12 ) + (0)(22 ) = 2/3.
Given x = 1, conditional second moment of Y: (0)(02 ) + (2/3)(12 ) + (1/3)(22 ) = 2.
Given x = 2, conditional second moment of Y: (1/2)(02 ) + (0)(12 ) + (1/2)(22 ) = 2.
Given x = 0, conditional variance of Y: 2/3 - (2/3)2 = 2/9.
Given x = 1, conditional variance of Y: 2 - (4/3)2 = 2/9.
Given x = 2, conditional variance of Y: 2 - (1)2 = 1.


(l) Using the marginal distribution, E[Y] = (.3)(0) + (.4)(1) + (.3)(2) = 1.


E[E[Y | X]] = Prob(x = 0)E[Y | x = 0] + Prob(x = 1)E[Y | x = 1] + Prob(x = 2)E[Y | x = 2] =
(.3)(2/3) + (.3)(4/3) + (.4)(1) = 1.
(m) E[Var[Y | X]] = Prob(x = 0)Var[Y | x=0] + Prob(x = 1)Var[Y | x=1] + Prob(x = 2)Var[Y | x=2]
= (.3)(2/9) + (.3)(2/9) + (.4)(1) = 0.5333.
(n) Var[E[Y | X]] = {(.3)(2/3)^2 + (.3)(4/3)^2 + (.4)(1)^2} - 1^2 = 0.0667.
(o) Var[Y] = {(.3)(0^2) + (.4)(1^2) + (.3)(2^2)} - 1^2 = 0.6 = 0.5333 + 0.0667 = E[Var[Y | X]] + Var[E[Y | X]].
(p) Marginal Distribution of X is the result of adding across columns: 30% @0, 30% @1, 40% @2.
E[X] = (.3)(0) + (.3)(1) + (.4)(2) = 1.1.
Var[X] = {(.3)(0^2) + (.3)(1^2) + (.4)(2^2)} - 1.1^2 = 0.690.
E[(X - E[X])^3] = (.3)(0 - 1.1)^3 + (.3)(1 - 1.1)^3 + (.4)(2 - 1.1)^3 = -0.108.
Skew[X] = -0.108 / 0.690^1.5 = -0.188.
(q) Marginal Distribution of Y is the result of adding down rows: 30% @0, 40% @1, 30% @2.
Since the distribution of Y is symmetric around 1, its skewness is zero.
(r) E[(X - E[X])^4] = (.3)(0 - 1.1)^4 + (.3)(1 - 1.1)^4 + (.4)(2 - 1.1)^4 = 0.7017.
Kurtosis[X] = 0.7017 / 0.690^2 = 1.474.
(s) E[Y] = (.3)(0) + (.4)(1) + (.3)(2) = 1.
Var[Y] = {(.3)(0^2) + (.4)(1^2) + (.3)(2^2)} - 1^2 = 0.6.
E[(Y - E[Y])^4] = (.3)(0 - 1)^4 + (.4)(1 - 1)^4 + (.3)(2 - 1)^4 = 0.6.
Kurtosis[Y] = 0.6 / 0.6^2 = 1.667.
2.29. B. p = 1/(1+β). Therefore, 1 - p = β/(1+β).
For the Negative Binomial Distribution with r = 3, the conditional probability given p is:
P(x | p) = {β^x / (1+β)^(x+r)} (x+r-1)! / {x! (r-1)!} = p^3 (1-p)^x (x+2)! / {(2)(x!)}.
Thus the conditional probability of zero claims is: P(x = 0 | p) = p^3.
P(x = 0) = ∫_{p=0.1}^{0.6} P(x = 0 | p) f(p) dp = ∫_{p=0.1}^{0.6} p^3 (1/0.5) dp = (1/2) p^4 |_{p=0.1}^{p=0.6} = 0.06475.

2.30. D. P(x = 1 | p) = 3 p^3 (1 - p) = 3(p^3 - p^4).
P(x = 1) = ∫_{p=0.1}^{0.6} P(x = 1 | p) f(p) dp = ∫_{p=0.1}^{0.6} 3(p^3 - p^4)(2) dp = (6)(p^4/4 - p^5/5) |_{p=0.1}^{p=0.6} = 0.10095.


2.31. C. For the Negative Binomial Distribution with r = 3, the mean is:
3β = 3(1 - p)/p = 3p^-1 - 3. The unconditional mean can be obtained by integrating the conditional
means versus the distribution of p:
E[X] = ∫_{p=0.1}^{0.6} E[X | p] f(p) dp = ∫_{p=0.1}^{0.6} 3(p^-1 - 1)(2) dp = (6)(ln(p) - p) |_{p=0.1}^{p=0.6} = 7.751.

2.32. E. For the Negative Binomial Distribution with r = 3, the mean is 3p^-1 - 3, while the variance is
3β(1 + β) = 3{(1 - p)/p}/p = 3p^-2 - 3p^-1.
Therefore the (conditional) second moment is: 3p^-2 - 3p^-1 + (3p^-1 - 3)^2 = 12p^-2 - 21p^-1 + 9.
The unconditional second moment can be obtained by integrating the conditional second moments
versus the distribution of p:
E[X^2] = ∫_{p=0.1}^{0.6} E[X^2 | p] f(p) dp = ∫_{p=0.1}^{0.6} (12p^-2 - 21p^-1 + 9)(2) dp = (-24p^-1 - 42 ln(p) + 18p) |_{p=0.1}^{p=0.6}
= 133.746. From the previous solution the mean is: 7.751.
Thus the variance = 133.746 - 7.751^2 = 73.675.
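The four integrals in solutions 2.29 to 2.32 can be double-checked numerically. The sketch below is my own illustration (not part of the text); it uses scipy's quad, and the helper names are arbitrary.

```python
from scipy.integrate import quad

a, b = 0.1, 0.6
f = lambda p: 1.0 / 0.5                   # uniform density of p on [0.1, 0.6]

p0   = lambda p: p**3                     # P(X = 0 | p), Negative Binomial with r = 3
p1   = lambda p: 3 * p**3 * (1 - p)       # P(X = 1 | p)
mean = lambda p: 3 * (1 - p) / p          # E[X | p]
m2   = lambda p: 12 / p**2 - 21 / p + 9   # E[X^2 | p]

prob0 = quad(lambda p: p0(p) * f(p), a, b)[0]    # 0.06475
prob1 = quad(lambda p: p1(p) * f(p), a, b)[0]    # 0.10095
ex    = quad(lambda p: mean(p) * f(p), a, b)[0]  # 7.751
ex2   = quad(lambda p: m2(p) * f(p), a, b)[0]    # 133.746
print(prob0, prob1, ex, ex2, ex2 - ex**2)        # variance: 73.675
```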
2.33. Prob[Positive | Hemoglophagia] = 0.97. Prob[Positive | No Hemoglophagia] = 0.002.
x = percent of the general population that has Hemoglophagia.
Prob[Positive] = (x)(0.97) + (1-x)(0.002) = 0.002 + 0.968x.
0.83 = Prob[Yes | Positive] = Prob[Positive and Yes]/Prob[Positive] = (x)(0.97) / (0.002 + 0.968x).
⇒ 0.00166 + 0.80344x = 0.97x. ⇒ x = 1.0%.


Comment: The percent of those who test positive who have Hemoglophagia, as a function of the
percent of the population with Hemoglophagia:

Percent with Hem.    Percent of Positives with Hem.
5.00%                96.23%
1.00%                83.05%
0.10%                32.68%
0.01%                4.63%
For an extremely rare disease, the majority of positive test results are false positives.
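The comment's table is easy to reproduce with a few lines of code. This is my own illustrative sketch; the function name and default arguments (sensitivity 0.97 and false positive rate 0.002, taken from the solution above) are not part of the original text.

```python
def prob_disease_given_positive(prevalence, sensitivity=0.97, false_pos=0.002):
    # Bayes Theorem: P(disease | positive) = P(pos | disease) P(disease) / P(positive)
    p_positive = prevalence * sensitivity + (1 - prevalence) * false_pos
    return prevalence * sensitivity / p_positive

for prev in [0.05, 0.01, 0.001, 0.0001]:
    pct = prob_disease_given_positive(prev)
    print(f"{prev:7.2%} with disease -> {pct:6.2%} of positives have it")
```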
2.34. A. f(0) = e^-2 = 0.1353. f(1) = 2e^-2 = 0.2707.
Prob[N > 1 | N > 0] = (1 - 0.1353 - 0.2707)/(1 - 0.1353) = 68.7%.


2.35. C. f(0) = e^-2 = 0.1353. f(1) = 2e^-2 = 0.2707.
Let x = average number of children in families with more than one child.
2 = Overall average = (0)(0.1353) + (1)(0.2707) + x(1 - 0.1353 - 0.2707). ⇒ x = 2.911.
Therefore the percentage of children in families with more than one child is:
(2.911)(1 - 0.1353 - 0.2707)/2 = 86.5%.
Alternately, the percentage of children in families with n children is: n f(n)/E[N].
The percentage of children in families with one child is: (1)(0.2707)/2 = 0.135.
Percentage of children in families with more than one child is: 1 - 0.135 = 86.5%.
Comment: A child must come from a family with at least one child.
Unlike the previous question, here we pick the child rather than the family at random.
Picking a child at random, means each child is equally likely to be picked.
Then we are more likely to pick a child from a family with a lot of children.
For example, assume there are only two families.
Family 1 has one child: Alice.
Family 2 has three children: Ben, Charlotte, and Dan.
A family is picked at random.
If the family has at least one child, what is the probability that it has more than one child?
Since we are equally likely to pick Family #1 or #2, the probability is 1/2.
If a child is picked at random, what is the probability that this child has at least one brother or sister?
We are equally likely to pick Alice, Ben, Charlotte, or Dan, thus the probability is 3/4.


2.36. Let p be the a priori probability that Frank remembered to water your plant.
Then the probability that he watered your plant and it died anyway is: 0.05p.
The probability that he failed to water your plant and it died is: 0.6(1 - p).
Thus given that the plant died, the probability Frank failed to water your plant is:
0.6(1 - p) / {0.6(1 - p) + 0.05p} = (12 - 12p) / (12 - 11p).
For example, if you assume a 30% a priori probability that Frank will water your plant,
then the probability that Frank failed to water your plant is: 8.4/8.7 = 96.6%.
If you instead assume an 80% a priori probability that Frank will water your plant,
then the probability that Frank failed to water your plant is: 2.4/3.2 = 75%.
[Graph: the probability that Frank failed to water your plant, (12 - 12p)/(12 - 11p), plotted as a function of p; it decreases from near 1 for small p toward 0 as p approaches 1.]
Comment: One could instead just check to see how damp the soil around your plant is.
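A tiny sketch (my own illustration, not from the original) that evaluates the posterior formula above for a few choices of p:

```python
def prob_not_watered_given_dead(p):
    """p = a priori probability that Frank remembered to water the plant."""
    died_if_watered     = 0.05 * p
    died_if_not_watered = 0.60 * (1 - p)
    return died_if_not_watered / (died_if_not_watered + died_if_watered)

for p in (0.3, 0.5, 0.8):
    print(p, round(prob_not_watered_given_dead(p), 3))   # 0.966, 0.923, 0.75
```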
2.37. B. 1.5x / ∫_{y=0}^{2x} 1.5x dy = 1/(2x), for 0 < y < 2x < 1.

2.38. E. ∫_{y=0}^{1} y (1/3 + y) dy / ∫_{y=0}^{1} (1/3 + y) dy = (1/6 + 1/3)/(1/3 + 1/2) = (1/2)/(5/6) = 3/5.


2.39. B. 1. False. The variance is the second central moment: Var[X] = E[(X - E[X])^2] = E[X^2] - E[X]^2.
The second moment around the origin is E[X^2]. 2. False. Cov[X,Y] = E[XY] - E[X]E[Y], so statement 2 only
holds when the covariance of X and Y is zero. (This is true if X and Y are independent.)
3. True. E[X] = E_Y[E[X | Y]].
2.40. E. ∫_{x=0}^{y} x (6xy + 3x^2) dx / ∫_{x=0}^{y} (6xy + 3x^2) dx = (2y^4 + 3y^4/4)/(3y^3 + y^3) = (11/4)y/4 = 11y/16.

2.41. E. Given Y = 3, X is uniform from -3 to +3. P[X < 1 | Y = 3] = 4/6 = 2/3.
Alternately, ∫_{x=-3}^{1} (e^-3/2) dx / ∫_{x=-3}^{3} (e^-3/2) dx = 2e^-3 / (3e^-3) = 2/3.

2.42. E. If Stu knows the answer, then the chance of observing a correct answer is 100%.
If Stu doesn't know the answer to a question, then the chance of observing a correct answer is 20%.

Type of            A Priori Chance    Chance of the    Prob. Weight =      Posterior Chance
Question           of This Type       Observation      Product of B & C    of This Type
Stu knows          0.750              1.0000           0.7500              93.75%
Stu doesn't know   0.250              0.2000           0.0500              6.25%
Overall                                                0.800               1.000

2.43. D. If one has Die #1, then the chance of the observation is: (3/6)(1/6)(1/6)(1/6)(1/6) = 3/7776.
If one has Die #2, then the chance of the observation is: (3/6)(1/6)(1/6)(3/6)(1/6) = 9/7776.

Die   A Priori      Chance of      Probability Weights =    Posterior Probability =
      Probability   Observation    Col. B x Col. C          Col. D / Sum of Col. D
1     0.5           0.000386       0.000193                 0.2500
2     0.5           0.001157       0.000579                 0.7500
SUM                                0.000772                 1.000



2.44. A. P(Y=0) = ∫_0^1 P(Y = 0 | θ) f(θ) dθ = ∫_0^1 P(Y = 0 | θ) dθ.
For the first case, P(Y = 0 | θ) = e^-θ, and P(Y=0) = ∫_0^1 e^-θ dθ = 1 - e^-1 = 0.632.
For the second case, P(Y = 0 | θ) = θ^2, and P(Y=0) = ∫_0^1 θ^2 dθ = 1/3 = 0.333.
For the third case, P(Y = 0 | θ) = (1-θ)^2, and P(Y=0) = ∫_0^1 (1-θ)^2 dθ = 1/3 = 0.333.
Only in the first case is P(Y=0) > 0.35.


Comment: Three separate problems in which you need to calculate P(Y=0) given P(Y = y | θ) and f(θ).
The conditional distributions are a Poisson, a Negative Binomial, and a Binomial.
In each case they are being mixed by the same uniform distribution on [0, 1].
2.45. A. ∫_{x=1}^{y} (12/25)(x + y^2) dx = (12/25){y^2/2 - 1/2 + (y - 1)y^2} = (6/25)(2y^3 - y^2 - 1), for 1 < y < 2.

Comment: The marginal density of Y can not depend on x, eliminating choices D and E.
The marginal density of Y has to integrate to 1, eliminating choices B, C, and E.
2.46. C. f(1, 1) = 2/56. f(2, 1) = 5/56. f(2, 2) = 8/56. f(3, 1) = 10/56. f(3, 2) = 13/56. f(3, 3) = 18/56.
P[Y = 3 | Y ≥ 2] = (18/56)/(8/56 + 13/56 + 18/56) = 18/39 = 6/13.
2.47. D. Statement A is false. Even if X and Y were independent, then E[X^2 Y^3] = E[X^2] E[Y^3] =
{∫_0^1 x^2 f_X(x) dx} {∫_0^1 y^3 f_Y(y) dy}. Statement B is false. Statement C is false.
E[X^2 Y^3] = ∫∫ x^2 y^3 f(x,y) dx dy. Statement D is true. Statement E is false. E[Y^3] = ∫_0^1 y^3 f_Y(y) dy.


2.48. D. 1. True. 2. True.
3. True. Assume discrete distributions; if they are instead continuous distributions then replace
summations by integrals.
E_X[E_Y[g(Y) | X]] = Σ_k P[x_k] E_Y[g(Y) | X = x_k] = Σ_k P[x_k] {Σ_i g(y_i) P[y_i and x_k] / P[x_k]}
= Σ_k Σ_i g(y_i) P[y_i and x_k] = Σ_i g(y_i) Σ_k P[y_i and x_k] = Σ_i g(y_i) P[y_i] = E[g(Y)].

Comment: E_Y denotes the expectation taken over the sample space of Y (the weighted average taken
over all possible values of Y, using the probabilities as the weights). Statement 3 is similar to
E[g(Y)] = E_Y[g(Y)] = E_X[E_Y[g(Y) | X]]. It turns out you can get the same result (in the regular situations
encountered in practice) whether you first take the expectation over X or first take the expectation
over Y. Basically you are just taking the double integral of g(y) times the probability density function
over all possible values of x and y. The order in which you evaluate this double integral should not
matter. In the demonstration of Statement 3, I've used the fact that the sum over all k of P[x_k and y_i]
is P[y_i], since we exhaust all possible x values. The final equality in the demonstration of Statement 3
is the definition of the expected value.
2.49. D. ∫_{x=1}^{2} ∫_{y=0}^{2-x} (3x/4) dy dx = (3/4) ∫_{x=1}^{2} (2x - x^2) dx = (3/4)(3 - 7/3) = 1/2.

2.50. E. ∫_{y=0}^{1} (x + y) dy = x + 1/2, for 0 < x < 1.

Comment: The marginal density of X can not depend on y, eliminating choice A.


The marginal density of X has to integrate to 1, eliminating choices C and D.
2.51. C. f(1, 0) = 4/54. f(1, 1) = 6/54. f(1, 2) = 8/54.
E[Y | X = 1] = {(0)(4/54) + (1)(6/54) + (2)(8/54)}/(4/54 + 6/54 + 8/54) = 22/18 = 11/9.


2.52. C.

Type of   A Priori Chance   Chance of the   Prob. Weight =      Posterior Chance =
Die       of This Type      Observation     Product of B & C    Col. D / Sum of Col. D
A         0.500             0.00412         0.00206             33.3%
B         0.500             0.00823         0.00412             66.7%
Overall                                     0.00617             1.000

For example, the chance of the observation if we have picked Die B is:
(4/6)(4/6)(1/6)(1/6)(4/6) = 0.00823.
2.53. A. When S < 6 we have the following equally likely possibilities (each x marks a possible value of D2):

D1    Possibilities for D2    Number    Conditional Density of D1 given S < 6
1     x x x x                 4         4/10
2     x x x                   3         3/10
3     x x                     2         2/10
4     x                       1         1/10

The mean of the conditional density function of D1 given that S < 6 is:
(0.4)(1) + (0.3)(2) + (0.2)(3) + (0.1)(4) = 2.
The median is equal to 2, since the Distribution Function at 2 is 0.7 ≥ 0.5, but at 1 it is 0.4 < 0.5.
The mode is 1, since that is the value at which the density is a maximum. Thus 1. F, 2. T, 3. F.
2.54. A. 2 / ∫_{y=x}^{1} 2 dy = 2/{2(1 - x)} = 1/(1 - x), for x < y < 1.

2.55. A. When X1 + X2 ≤ 4 we have the following equally likely possibilities (each x marks a possible value of X2):

X1    Possibilities for X2    Number    Conditional Density of X1 given X1 + X2 ≤ 4
1     x x x                   3         3/6
2     x x                     2         2/6
3     x                       1         1/6
4                             0         0
5                             0         0
6                             0         0

The mean of the conditional density function of X1 given that X1 + X2 ≤ 4 is:
{(3)(1) + (2)(2) + (1)(3)}/(3 + 2 + 1) = 10/6 = 5/3.


2.56. D. The posterior distribution is proportional to the product of the prior distribution and the
chance of the observation given the type of risk. The prior distribution is: 0.75, 0.25. The chances of the
observation given the types of risks are the densities at k: (4)(300^4)(300 + k)^-5, and
(3)(1000^3)(1000 + k)^-4. Thus the posterior distribution is proportional to:
(0.75)(4)(300^4)(300 + k)^-5 = (3)(300^4)/(300 + k)^5 = (3/300)/(1 + k/300)^5 = (1/100)(1 + k/300)^-5,
and (0.25)(3)(1000^3)(1000 + k)^-4 = (0.75/1000)(1 + k/1000)^-4.
As k goes to zero, these probability weights go to 0.01 and 0.75/1000 = 0.00075. Normalized to
unity, these weights are: 0.01/0.01075 = 0.930 = 40/43, and 0.00075/0.01075 = 0.070 = 3/43.
2.57. D. n = the expected number of youthful claim free drivers
= the expected number of nonyouthful claims free drivers
The chance of a youthful driver incurring at least one claim is: (900 - n)/900.
The chance of a nonyouthful driver incurring at least one claim is: (800 - n)/800.
We are given that the former is twice the latter: (900 - n)/900 = 2(800 - n)/800.
Solving for n: n = 1/{(1/400 - 1/900)} = 720.
Comment: Check: (900 - 720)/900 = 20% is twice (800 - 720)/800 = 10%.
2.58. D. The situation is totally symmetric with respect to the shooters and the shooters are equally
likely, so we can just take one of the four shooters and examine that situation.
If for example we have shooter A, then there are 16 equally likely outcomes. "Yes" marks those
situations where the shooter can be identified with certainty:

         R      S      U      V
  R      Yes    Yes    Yes    Yes
  S      Yes    No     Yes    No
  U      Yes    Yes    No     No
  V      Yes    No     No     No

There are 9 out of 16 such situations, for a probability of 9/16.
Comment: Note that in the case of targets S and U being hit, you can eliminate both shooters B and
C, leaving only shooter A. The pattern of shooters is as follows:

  A      A,B        B
  A,C    A,B,C,D    C,D
  C      C,D        D


2.59. B. The situation is totally symmetric with respect to the shooters and the shooters are equally
likely, so we can just take one of the four shooters as first and examine that situation. If for example
we assume shooter A is the first shooter, then there is 1/3 chance each that B, C or D is the second
shooter.
Assume A, hits area R, then there is zero chance that both shots land in the same area. If A hits area
V, then regardless of the identity of the second shooter, there is a 1/4 chance of his shot hitting the
same area V. If A hits area S, then if the second shooter is B, there is a 1/4 chance of the second
shot landing in the same area S; otherwise the chance is zero. Therefore, if A hits area S, there is a
(1/3)(1/4) =1/12 chance of the second shot being in the same area. Similarly, if A hits area U, there is
a (1/3)(1/4) =1/12 chance of the second shot being in the same area.
Thus overall, there is a (0 + 1/4 + 1/12 + 1/12)/4 = 5/48 chance of two shots in the same area.
2.60. B. Prob[X = 0] = p(0, 1) + p(0, 2) = 1/12 + 2/12 = 1/4.
Prob[X = 1] = p(1, 2) + p(1, 3) = 4/12 + 5/12 = 3/4.

2.61. E(X | Y = y) = ∫ x f(x | y) dx = ∫ x f(x, y)/f_Y(y) dx.
∫ E(X | Y = y) f_Y(y) dy = ∫∫ x f(x, y) dx dy = E[X].

2.62. D. E[Q | P = 0] = {(0)(.12) + (1)(.06) + (2)(.05) + (3)(.02)}/(.12 + .06 + .05 + .02) = .22/.25 = 0.88.
E[Q^2 | P = 0] = {(0^2)(.12) + (1^2)(.06) + (2^2)(.05) + (3^2)(.02)}/(.12 + .06 + .05 + .02) = .44/.25 = 1.76.
Var[Q | P = 0] = 1.76 - 0.88^2 = 0.9856.
2.63. B. E[Q | P ≠ 0] = {(0)(.18) + (1)(.3) + (2)(.22) + (3)(.05)}/(.18 + .3 + .22 + .05) = .89/.75 = 1.1867.
E[Q^2 | P ≠ 0] = {(0^2)(.18) + (1^2)(.3) + (2^2)(.22) + (3^2)(.05)}/(.18 + .3 + .22 + .05) = 1.63/.75 = 2.1733.
Var[Q | P ≠ 0] = 2.1733 - 1.1867^2 = 0.765.
2.64. C. E[Y | X = 1] = {(.050)(0) + (.125)(1)}/(.050 + .125) = .7143.
E[Y^2 | X = 1] = {(.050)(0^2) + (.125)(1^2)}/(.050 + .125) = .7143.
Var(Y | X = 1) = .7143 - .7143^2 = 0.2041.
2.65. Prob[Positive | Yes] = .98. Prob[Positive | No] = .01.
Prob[Positive] = (.001)(.98) + (.01)(.999) = .01097.
Prob[Yes | Positive] = Prob[Positive and Yes]/Prob[Positive] = (.001)(.98)/.01097 = 8.93%.


2.66. C. Prob[zero claims | λ] = e^-λ.
Prob[zero claims] = ∫_0^k e^-λ π(λ) dλ = ∫_0^k e^-λ {e^-λ/(1 - e^-k)} dλ = {(1 - e^-2k)/2}/(1 - e^-k) = (1 + e^-k)/2.
Set 0.575 = (1 + e^-k)/2. ⇒ e^-k = 0.15. ⇒ k = 1.897.
Comment: I used the fact that (1 - x^2)/(1 - x) = 1 + x, with x = e^-k.

Section 3, Covariances and Correlations10


Given the joint distribution of two variables, one can compute their covariance and correlation.
Covariances:
The Covariance of X and Y is defined as: Cov[X,Y] ≡ E[XY] - E[X]E[Y].
Exercise: E[X] = 9, E[Y] = 8, and E[XY] = 60. What is the covariance of X and Y?
[Solution: Cov[X,Y] = E[XY] - E[X]E[Y] = 60 - (9)(8) = -12.]
As in the previous section, assume two honest, six-sided dice of different colors are rolled, and the
results D1 and D2 are observed. Let S = D1 + 2D2 .
In order to compute the covariance of D2 and S, one must first compute E[D2 S].
Since S = D1 + 2D2 , for all the possible outcomes, S has values:

        D1:  1    2    3    4    5    6
D2 = 1:      3    4    5    6    7    8
D2 = 2:      5    6    7    8    9    10
D2 = 3:      7    8    9    10   11   12
D2 = 4:      9    10   11   12   13   14
D2 = 5:      11   12   13   14   15   16
D2 = 6:      13   14   15   16   17   18

Then for these 36 equally likely possibilities the product of D2 and S is:

        D1:  1    2    3    4    5    6
D2 = 1:      3    4    5    6    7    8
D2 = 2:      10   12   14   16   18   20
D2 = 3:      21   24   27   30   33   36
D2 = 4:      36   40   44   48   52   56
D2 = 5:      55   60   65   70   75   80
D2 = 6:      78   84   90   96   102  108

Thus E[D2 S] = (3+4+5+6+7+8 + 10+12+14+16+18+20 + 21+24+27+30+33+36 + 36+40+44+48+52+56
+ 55+60+65+70+75+80 + 78+84+90+96+102+108)/36 = 1533/36 = 42.5833.
10

Even though these ideas are only occasionally tested directly, it will be assumed that you have a good
understanding of the concepts in this section. For those who already know this material, do only a few problems
from this section in order to refresh your memory.



Alternately, one could use the conditional distributions of S given various values of D2 to simplify
this calculation somewhat: E[D2 S] = Σ_i {P(D2 = i) E[D2 | D2 = i] E[S | D2 = i]} = Σ_i P(D2 = i) i E[S | D2 = i]
= (1/6)(1)(5.5) + (1/6)(2)(7.5) + (1/6)(3)(9.5) + (1/6)(4)(11.5) + (1/6)(5)(13.5) + (1/6)(6)(15.5) = 42.5833.
Now E[D2 ] = 3.5, while E[S]= E[D1 + 2D2] = E[D1] + 2 E[ D2] = 3.5 + (2)(3.5) = 10.5.
Therefore Cov[D2 , S] = E[D2 S] - E[D2 ]E[S] = 42.5833 - (3.5)(10.5) = 5.8333.
Alternately, Cov[D2 , S] = Cov[D2 , D1 + 2D2] = Cov[D2 , D1] + 2 Cov[D2 ,D2] =
0 + 2 Var[D2 ] = (2)(35/12) = 35/6 = 5.8333.
Cov[X, X] = E[X2 ] - E[X]2 = Var[X].
Thus the variance is a special case of the covariance, Cov[X, X] = Var[X].
Note that if X and Y are independent, then E[XY] = E[X]E[Y] and thus Cov[X,Y] = 0.
Note that Cov[X, Y] = Cov[Y, X].
Also Cov[X, Y + Z] = Cov[X, Y] + Cov[X, Z].
Also if b is any constant, Cov[X, bY] = b Cov[X,Y] while Cov[X, b] = 0.
Var[X + Y] = E[(X+Y)2 ] - E[X + Y]2 = E[X2 ] + 2E[XY] + E[Y2 ] - (E[X] + E[Y])2 =
(E[X2 ] - E[X]2 ) + 2(E[XY] - E[X]E[Y]) + (E[Y2 ] - E[Y]2 ) = Var[X] + Var[Y] + 2Cov[X,Y].
Var[X + Y] = Var[X] + Var[Y] + 2Cov[X,Y].
If X and Y are independent then:
Cov[X, Y] = 0 and Var[X + Y] = Var[X] + Var[Y].
Exercise: Var[X] = 10, Var[Y] = 20, and Cov[X, Y] = -12. What is Var[X + Y]?
[Solution: Var[X + Y] = Var[X] + Var[Y] + 2Cov[X,Y] = 10 + 20 + (2)(-12) = 6.]
Similarly, Var[aX + bY] = a2 Var[X] + b2 Var[Y] + 2abCov[X,Y].
Exercise: Var[X] = 10, Var[Y] = 20, and Cov[X, Y] = -12. What is Var[3X + 4Y]?
[Solution: Var[3X + 4Y] = 9Var[X] + 16Var[Y] + (2)(3)(4)Cov[X,Y] = 90 + 320 + (24)(-12) = 122.]



Correlations:
The Correlation of two random variables is defined in terms of their covariance:
Corr[X,Y] ≡ Cov[X, Y] / √(Var[X] Var[Y]).
Exercise: Var[X] = 10, Var[Y] = 20, and Cov[X, Y] = -12. What is Corr[X, Y]?
[Solution: Corr[X,Y] = Cov[X, Y] / √(Var[X] Var[Y]) = -12/√((10)(20)) = -0.85.]
The correlation is always in the interval [-1, +1].
Corr[X, Y] = Corr[Y, X].
Corr[X, X] = 1.   Corr[X, -X] = -1.
Corr[X, aY] = Corr[X, Y] if a > 0, 0 if a = 0, and -Corr[X, Y] if a < 0.
Corr[X, aX] = 1 if a > 0.

Two variables that are proportional with a positive proportionality constant are
perfectly correlated and have a correlation of one. Closely related variables, such as height
and weight, have a correlation close to but less than one. Unrelated variables have a correlation near
zero. Inversely related variables, such as the average temperature and the use of heating oil, are
negatively correlated.
Continuing the example with dice, we can determine the correlation of D2 , and S = D1 + 2D2 .
Exercise: What is the variance of D2 ?
[Solution: The second moment is (1/6)(12 ) + (1/6)(22 ) + (1/6)(32 ) + (1/6)(42 ) + (1/6)(52 ) +
(1/6)(62 ) = 91/6. The mean is 7/2. Thus the variance = 91/6 - (7/2)2 = 35/12.]
Exercise: What is the variance of S?
[Solution: Since D1 and D2 are independent, Var[S] = Var[D1 + 2D2 ] =
Var[ D1 ] + 4 Var[ D2 ] = (35/12) + (4)(35/12) = 175/12. ]



Thus, Corr[D2, S] = Cov[D2, S] / √(Var[D2] Var[S]) = 5.8333 / √((35/12)(175/12)) = 0.894.
Note that D2 and S are positively correlated. Larger values of D2 tend to be associated with larger
values of S, and vice versa.11
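Because there are only 36 equally likely outcomes, all of these quantities can be verified by brute-force enumeration. The sketch below is my own illustration (not part of the text); it reproduces E[D2 S], Cov[D2, S], and Corr[D2, S].

```python
from itertools import product
from statistics import mean

rolls = list(product(range(1, 7), repeat=2))       # 36 equally likely (d1, d2) outcomes
d2 = [b for a, b in rolls]
s  = [a + 2 * b for a, b in rolls]                 # S = D1 + 2 D2

e_d2, e_s = mean(d2), mean(s)                      # 3.5 and 10.5
e_d2s  = mean(x * y for x, y in zip(d2, s))        # 42.5833
cov    = e_d2s - e_d2 * e_s                        # 5.8333
var_d2 = mean(x * x for x in d2) - e_d2 ** 2       # 35/12
var_s  = mean(x * x for x in s) - e_s ** 2         # 175/12
print(cov, cov / (var_d2 * var_s) ** 0.5)          # 5.8333 and 0.894
```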
A Poisson Process Example:12
N has a Poisson distribution with mean 6.
Let A = X1 +...+ XN, where Prob(Xi = 5) = 50% and Prob(Xi = 25) = 50%, for all i,
where the Xi s are independent.
Let B = Y1 +...+ YN, where Prob(Yi = 5) = 80% and Prob(Yi = 25) = 20%, for all i,
where the Yi s are independent.
Note that we assume that A and B have the same number of claims N.
Exercise: Calculate the Covariance of A and B.
[Solution: E[X] = 15. E[Y] = 9. E[AB] = EN[E[AB | n]] = EN[15n9n] = 135E[N2 ] = 135(6 + 62 ) =
5670. Cov[A, B] = 5670 - (15)(6)(9)(6) = 810.]
One could instead use the general result established below for this type of situation:
Cov[A, B] = Var[N] E[X]E[Y] = (6)(15)(9) = 810.
Exercise: Calculate the correlation coefficient between A and B.
[Solution: E[X^2] = (0.5)(5^2) + (0.5)(25^2) = 325. Var[A] = (6)(325) = 1950.
E[Y^2] = (0.8)(5^2) + (0.2)(25^2) = 145. Var[B] = (6)(145) = 870.
Corr[A, B] = 810/√((1950)(870)) = 0.622.]

11 The maximum possible correlation is +1, so that D2 and S are very strongly (positively) correlated.
12 See 4, 11/01, Q.29.



A General Result for the Covariance when there is a common Number of Claims:
Assume two aggregate loss processes.
In each process frequency and severity are independent of each other, and the claim sizes are
mutually independent, identically distributed random variables.
Let N be the number of claims.
Assume N is the same for each of the two processes, but the first process has severity X and the
second process has severity Y.
X and Y are independent.
Let A be the aggregate loss from the first process and B be the aggregate loss from the second
process.
Then Cov[A, B] = Var[N] E[X]E[Y].
Proof: E[AB] = EN[E[AB | n]] = EN[E[nXnY | n] ] = EN[E[X]E[Y]n2 ] = E[N2 ]E[X]E[Y].
Cov[A, B] = E[AB] - E[A]E[B] = E[N2 ]E[X]E[Y] - E[N]E[X]E[N]E[Y] = Var[N] E[X]E[Y].
If the frequency is Poisson with mean λ, then Cov[A, B] = λ E[X]E[Y], Var[A] = λ E[X^2],
and Var[B] = λ E[Y^2].
Corr[A, B] = E[X]E[Y] / √(E[X^2] E[Y^2]) = 1 / √{(E[X^2]/E[X]^2) (E[Y^2]/E[Y]^2)}.
For the example above, Corr[A, B] = 1 / √{(1.4444)(1.7901)} = 0.622.
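The result Cov[A, B] = Var[N] E[X]E[Y] (and the correlation of 0.622) can also be checked by simulation. The following sketch is illustrative only; the seed, the number of trials, and the use of numpy are my own choices.

```python
import numpy as np

rng = np.random.default_rng(1)
trials, lam = 100_000, 6

N = rng.poisson(lam, trials)                 # common claim count for both aggregates
A = np.array([rng.choice([5, 25], size=n, p=[0.5, 0.5]).sum() for n in N])
B = np.array([rng.choice([5, 25], size=n, p=[0.8, 0.2]).sum() for n in N])

print(np.cov(A, B)[0, 1])        # close to Var[N] E[X] E[Y] = (6)(15)(9) = 810
print(np.corrcoef(A, B)[0, 1])   # close to 0.622
```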



Sample Covariance and Correlation:
Assume we have the following heights of eight fathers and their adult sons (in inches):13
Father:  53   54   57   58   61   62   63   66
Son:     56   58   61   60   63   62   65   64
Here is a graph of this data: [scatterplot of Son's height (vertical axis, 56 to 64) against Father's height (horizontal axis, 54 to 66); the points trend upward.]

There appears to be a relationship between the height of the father, X, and the height of his son, Y.
A taller father seems to be more likely to have a taller son. One way to measure this relationship is
by computing the sample covariance and sample correlation.

13

There are only 8 pairs of observations solely in order to keep things simple.



The sample variance of X is: s_X^2 = Σ(Xi - X̄)^2 / (N - 1).

Exercise: What are the sample variances of X and Y?
[Solution: Mean height of fathers = X̄ = 59.25.
Xi - X̄ = (53, 54, 57, 58, 61, 62, 63, 66) - 59.25 = (-6.25, -5.25, -2.25, -1.25, 1.75, 2.75, 3.75, 6.75).
s_X^2 = {(-6.25)^2 + ... + 6.75^2} / (8 - 1) = 143.5/7 = 20.5.
Mean height of sons = Ȳ = 61.125. Yi - Ȳ = (56, 58, 61, 60, 63, 62, 65, 64) - 61.125 =
(-5.125, -3.125, -0.125, -1.125, 1.875, 0.875, 3.875, 2.875).
s_Y^2 = {(-5.125)^2 + ... + 2.875^2} / (8 - 1) = 64.875/7 = 9.2679.]
In analogy to the sample variance, the sample covariance of X and Y is computed as:14
Côv[X, Y] = Σ(Xi - X̄)(Yi - Ȳ) / (N - 1).

Exercise: What is the sample covariance of X and Y?
[Solution: Σ(Xi - X̄)(Yi - Ȳ) / (N - 1) = {(-6.25)(-5.125) + ... + (6.75)(2.875)}/(8 - 1) = 89.75/7 = 12.8214.]
Then the sample correlation is the sample covariance divided by the product of the two sample
standard deviations: r = Côv[X, Y] / (s_X s_Y) = Σ(Xi - X̄)(Yi - Ȳ) / √{Σ(Xi - X̄)^2 Σ(Yi - Ȳ)^2}.15

Exercise: What is the sample correlation of X and Y?
[Solution: Using previous solutions: r = 12.8214 / √((20.5)(9.2679)) = 0.9302.
Comment: The slope of a linear regression with intercept fit to this data is:
β̂ = r s_Y/s_X = (0.93019)(3.0443)/(4.5277) = 0.6254.]


14 Just as the variance is a special case of the covariance, the sample variance is a special case of the sample covariance.
15 The factors of 1/(N-1) cancel.
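These sample statistics are easy to reproduce with numpy. The sketch below is my own illustration, not part of the original text; note that var, cov, and std as used here all divide by N - 1.

```python
import numpy as np

father = np.array([53, 54, 57, 58, 61, 62, 63, 66])
son    = np.array([56, 58, 61, 60, 63, 62, 65, 64])

s2_x = father.var(ddof=1)                 # 20.5
s2_y = son.var(ddof=1)                    # 9.2679
cov  = np.cov(father, son)[0, 1]          # 12.8214 (np.cov uses N - 1 by default)
r    = np.corrcoef(father, son)[0, 1]     # 0.9302
slope = r * son.std(ddof=1) / father.std(ddof=1)   # 0.6254, the fitted regression slope
print(s2_x, s2_y, cov, r, slope)
```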



Covariance of Sample Means:
As discussed previously, Var[X̄] = Var[X] / n.16
A similar result holds for Cov[X̄, Ȳ].
For paired samples of data:17
Cov[X̄, Ȳ] = Cov[(1/n) Σ_{i=1}^{n} Xi, (1/n) Σ_{j=1}^{n} Yj] = Σ_{j=1}^{n} Σ_{i=1}^{n} Cov[Xi, Yj] / n^2 = n Cov[X, Y]/n^2 = Cov[X, Y]/n.
Cov[X̄, Ȳ] = Cov[X, Y]/n.
Exercise: For the heights example assume Var[X] = Var[Y] = 14, and Corr[X, Y] = 0.93.
Determine Cov[X̄, Ȳ].
[Solution: Cov[X̄, Ȳ] = Cov[X, Y]/n = (0.93)(14)/8 = 1.63.]
For the example, mean height of fathers = X̄ = 59.25, and mean height of sons = Ȳ = 61.125.
Ȳ - X̄ = 61.125 - 59.25 = 1.875, is an estimate of the average difference in heights between sons
and their fathers.
Exercise: For the heights example assume Var[X] = Var[Y] = 14, and Corr[X, Y] = 0.93.
Determine Var[Ȳ - X̄].
[Solution: Var[Ȳ - X̄] = Var[X̄] + Var[Ȳ] - 2 Cov[X̄, Ȳ] = 14/8 + 14/8 - (2)(0.93)(14)/8 = 0.245.]
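Both results in these exercises can be checked by simulating many samples of 8 correlated pairs. The sketch below is illustrative only; it assumes a bivariate Normal model for the heights, which is my own assumption and not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 8, 100_000
cov_xy = 0.93 * 14                                    # Cov[X, Y] = Corr[X, Y] sigma_X sigma_Y
cov_matrix = np.array([[14.0, cov_xy], [cov_xy, 14.0]])
samples = rng.multivariate_normal([59.25, 61.125], cov_matrix, size=(trials, n))

xbar = samples[:, :, 0].mean(axis=1)                  # sample mean of the 8 fathers, per trial
ybar = samples[:, :, 1].mean(axis=1)                  # sample mean of the 8 sons, per trial
print(np.cov(xbar, ybar)[0, 1])                       # close to Cov[X, Y]/n = (0.93)(14)/8 = 1.63
print(np.var(ybar - xbar, ddof=1))                    # close to 0.245
```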

16 See Mahler's Guide to Frequency Distributions.
17 The Xi are a series of independent, identically distributed variables. The Yi are another series of
independent, identically distributed variables. Xi and Yj for i ≠ j are independent, and therefore have
a covariance of zero. For i = j, Cov[Xi, Yj] = Cov[X, Y].



Problems:
3.1 (1 point) E[X] = 3, E[X2 ] = 15, E[Y] = 5. E[Y2 ] = 37. E[XY] = 19.
What is the correlation of X and Y?
A. less than 0.30
B. at least 0.30 but less than 0.35
C. at least 0.35 but less than 0.40
D. at least 0.40 but less than 0.45
E. at least 0.45
3.2 (1 point) You are given the following:

Let X be a random variable.

Z is defined to be 0.75X.
Determine the correlation coefficient of X and Z.
A. 0.00
B. 0.25
C. 0.50
D. 0.75

E. 1.00

3.3 (3 points) N has a Poisson distribution with mean 4.


Let A = X1 +...+ XN, where Xi = 9 for all i.
Let B = Y1 +...+ YN, where Prob(Yi = 5) = 80% and Prob(Yi = 25) = 20%, for all i,
where the Yi s are independent.
Calculate the correlation coefficient between A and B.
(A) 0.75
(B) 0.80
(C) 0.85
(D) 0.90

(E) 0.95

Use the following information for the next 8 questions:

G is the result of rolling a green six-sided die.

R is the result of rolling a red six-sided die.

G and R are independent of each other.

M is the maximum of G and R.

3.4 (1 point) What is the mode of M ?


A. 2
B. 3
C. 4

D. 5

E. 6

3.5 (1 point) What is the mode of the conditional distribution of M if G = 3?


A. 2
B. 3
C. 4
D. 5
E. 6



3.6 (1 point) What is the expected value of M?
A. less than 4.0
B. at least 4.0 but less than 4.2
C. at least 4.2 but less than 4.4
D. at least 4.4 but less than 4.6
E. at least 4.6
3.7 (1 point) What is the expected value of the conditional distribution of M if G = 3?
A. 3
B. 4
C. 5
D. 6
E. 7
3.8 (1 point) What is the variance of M?
A. less than 1.0
B. at least 1.0 but less than 1.3
C. at least 1.3 but less than 1.6
D. at least 1.6 but less than 1.9
E. at least 1.9
3.9 (1 point) What is the variance of the conditional distribution of M if G = 3?
A. less than 1.0
B. at least 1.0 but less than 1.3
C. at least 1.3 but less than 1.6
D. at least 1.6 but less than 1.9
E. at least 1.9
3.10 (3 points) What is the covariance of G and M?
A. less than 1.0
B. at least 1.0 but less than 1.3
C. at least 1.3 but less than 1.6
D. at least 1.6 but less than 1.9
E. at least 1.9
3.11 (3 points) What is the correlation of G and M?
A. less than 0.3
B. at least 0.3 but less than 0.4
C. at least 0.4 but less than 0.5
D. at least 0.5 but less than 0.6
E. at least 0.6



3.12 (2 points) You are given the following for a sample of four observations from a bivariate
distribution:
(i)   x     y
      12    33
      14    11
      25    16
      36    40
A is the covariance of the empirical distribution Fe as defined by these four observations.
B is the maximum possible covariance of an empirical distribution with identical
marginal distributions to Fe .
Determine B - A.
A. less than 40
B. at least 40 but less than 45
C. at least 45 but less than 50
D. at least 50 but less than 55
E. at least 55
3.13 (2 points) You are given the following:
A is a random variable with mean 15 and variance 7.
B is a random variable with mean 25 and variance 13.
C is a random variable with mean 45 and variance 28.
A, B, and C are independent.
X=A+B
Y=A+C
Determine the correlation coefficient between X and Y.
A. less than 0.3
B. at least 0.3 but less than 0.4
C. at least 0.4 but less than 0.5
D. at least 0.5 but less than 0.6
E. at least 0.6
3.14 (2 points) You have a paired sample of data: (X1 , Y1 ), (X2 , Y2 ), ... , (X100, Y100).
The Xi are a series of independent, identically distributed variables.
The Yi are another series of independent, identically distributed variables.
Assume Var[X] = 9, Var[Y] = 16, and Corr[X, Y] = 0.8.
Determine Var[X̄ - Ȳ].
A. 0.02
B. 0.04

C. 0.06

D. 0.08

E. 0.10



Use the following information for the next 11 questions:

A green ball is placed at random in one of three urns.

Then independently a red ball is placed at random in one of these three urns.

Let X be the number of balls in Urn 1.

Let Y be the number of balls in Urn 2.

Let Z be the number of balls in Urn 3.

Let N be the number of occupied urns.

3.15 (1 point) Given X = 0, what is the (conditional) variance of Y?


A. 0.2
B. 0.3
C. 0.4
D. 0.5
E. 0.6
3.16 (1 point) Given X = 1, what is the (conditional) variance of Z?
A. less than 0.3
B. at least 0.3 but less than 0.4
C. at least 0.4 but less than 0.5
D. at least 0.5 but less than 0.6
E. at least 0.6
3.17 (2 points) What is the covariance of Y and Z?
A. -0.5
B. -0.4
C. -0.3
D. -0.2

E. -0.1

3.18 (2 points) Given X = 0, what is the (conditional) covariance of Y and Z?


A. -0.5
B. -0.4
C. -0.3
D. -0.2
E. -0.1
3.19 (2 points) Given X = 1, what is the (conditional) covariance of Y and Z?
A. less than -0.3
B. at least -0.3 but less than -0.2
C. at least -0.2 but less than -0.1
D. at least -0.1 but less than 0
E. at least 0
3.20 (1 point) What is the correlation of Y and Z?
A. less than -0.3
B. at least -0.3 but less than -0.2
C. at least -0.2 but less than -0.1
D. at least -0.1 but less than 0
E. at least 0



3.21 (1 point) Given X = 0, what is the (conditional) correlation of Y and Z?
A. less than -0.9
B. at least -0.9 but less than -0.8
C. at least -0.8 but less than -0.7
D. at least -0.7 but less than -0.6
E. at least -0.6
3.22 (1 point) Given X = 1, what is the (conditional) correlation of Y and Z?
A. less than -0.9
B. at least -0.9 but less than -0.8
C. at least -0.8 but less than -0.7
D. at least -0.7 but less than -0.6
E. at least -0.6
3.23 (1 point) Given X = 0, what is the (conditional) expected value of N?
A. 1/4
B. 1/2
C. 1
D. 3/2
E. 2
3.24 (1 point) Given X = 1, what is the (conditional) expected value of N?
A. 1/4
B. 1/2
C. 1
D. 3/2
E. 2
3.25 (1 point) Given X = 2, what is the (conditional) expected value of N?
A. 1/4
B. 1/2
C. 1
D. 3/2
E. 2
Use the following information for the next 3 questions:
Two independent lives x and y are of the same age.
Both lives follow De Moivre's law of mortality with the same ω.
T(xy) is the time until failure for the joint life status of x and y.
T( xy ) is the time until failure for the last survivor status of x and y.
3.26 (3 points) What is the correlation of T(x) and T(xy)?
A. 0.3
B. 0.4
C. 0.5
D. 0.6
E. 0.7
3.27 (3 points) What is the correlation of T(x) and T( xy )?
A. 0.3
B. 0.4
C. 0.5
D. 0.6
E. 0.7
3.28 (3 points) What is the correlation of T(xy) and T( xy )?
A. 0.3
B. 0.4
C. 0.5
D. 0.6
E. 0.7



3.29 (7 points) You are given the following data points and summary statistics:

Price of Silver (X)    Price of Gold (Y)
$4.67                  $343.80
$5.97                  $416.25
$6.39                  $427.75
$9.04                  $530.00
$13.01                 $639.75

Σ_{i=1}^{5} Xi = 39.08.   Σ_{i=1}^{5} Yi = 2,357.55.

(i) (2 points) Determine the sample variance of X.


(ii) (2 points) Determine the sample variance of Y.
(iii) (2 points) Determine the sample covariance of X and Y.
(iv) (1 point) Determine the sample correlation of X and Y.
3.30 (3 points) You are given the following joint distribution of X and Y:

        y:   0      1      2
x = 0:      0.1    0.2    0
x = 1:      0      0.2    0.1
x = 2:      0.2    0      0.2
Determine the correlation of X and Y.
A. less than 0.1
B. at least 0.1 but less than 0.2
C. at least 0.2 but less than 0.3
D. at least 0.3 but less than 0.4
E. at least 0.4
3.31 (2, 5/88, Q.41) (1.5 points) A hat contains 3 chips numbered 1, 2, and 3. Two chips are
drawn successively from the hat without replacement. What is the correlation between the number
on the first chip and the number on the second chip?
A. -1/2
B. -1/3
C. 0
D. 1/3
E. 1/2
3.32 (2, 5/90, Q.4) (1.7 points) Let X and Y be discrete random variables with joint probability
distribution given in the table below:

        X:   -1     0      1
Y = 0:      0.1    0.1    0.2
Y = 1:      0.1    0.3    0.2
What is Cov(X, Y)?
A. -0.02
B. 0
C. 0.02
D. 0.10
E. 0.12



3.33 (2, 5/92, Q.40) (1.7 points) Let X and Y be discrete random variables with joint probability
function given by the following table:

        x:   0      1      2
y = 0:      0      2/5    1/5
y = 1:      1/5    1/5    0
What is the variance of Y - X?
A. 4/25
B. 16/25
C. 26/25
D. 5/4
E. 7/5
3.34 (2, 5/92, Q.45) (1.7 points) Let X and Y be continuous random variables with joint density
function f(x, y) = 6x for 0 < x < y < 1.
Note that E[X] = 1/2 and E[Y] = 3/4. What is Cov(X, Y)?
A. 1/40
B. 2/5
C. 5/8
D. 1
E. 13/8
3.35 (2, 2/96, Q.4) (1.7 points) Let X1, X2, X3 be uniform random variables on the interval (0, 1)
with Cov(Xi, Xj) = 1/24 for i, j = 1, 2, 3, i ≠ j.
Calculate the variance of X1 + 2X2 - X3.
A. 1/6

B. 1/4

C. 5/12

D. 1/2

E. 11/12

3.36 (2, 2/96, Q.8) (1.7 points) Let X and Y be discrete random variables with joint probability
function p(x, y) given by the following table:

        x:   2      3      4      5
y = 0:      0.05   0.05   0.15   0.05
y = 1:      0.40   0      0      0
y = 2:      0.05   0.15   0.10   0
For this joint distribution E(X) = 2.85 and E(Y) = 1. Calculate Cov(X, Y).
A. -0.20
B. -0.15
C. 0.95
D. 2.70
E. 2.85
3.37 (5B, 11/98, Q.24) (1.5 points) Given the following information, what is the variance of a
portfolio consisting of equal weights of stocks B and C? Show all work.
ρ_AB = 0.6. ρ_AC = 1.0. σ_A^2 = 0.4. σ_B^2 = 0.7. σ_C^2 = 0.6.



3.38 (4B, 11/99, Q.29) (2 points) You are given the following:
A is a random variable with mean 5 and coefficient of variation 1.
B is a random variable with mean 5 and coefficient of variation 1.
C is a random variable with mean 20 and coefficient of variation 1/2.
A, B, and C are independent.
X=A+B
Y=A+C
Determine the correlation coefficient between X and Y.
A. -2/√10    B. -1/√10    C. 0    D. 1/√10    E. 2/√10

3.39 (IOA 101, 4/00, Q.12) (3 points) A random sample of 200 pairs of observations (x, y) from
a discrete bivariate distribution (X, Y) is as follows:
the observation (-2, 2) occurs 50 times
the observation (0, 0) occurs 90 times
the observation (2, -1) occurs 60 times.
Calculate the sample correlation coefficient for these data.
3.40 (1, 5/00, Q.20) (1.9 points) Let X and Y denote the values of two stocks at the end of a
five-year period. X is uniformly distributed on the interval (0, 12). Given X = x, Y is uniformly
distributed on the interval (0, x). Determine Cov(X, Y) according to this model.
(A) 0
(B) 4
(C) 6
(D) 12
(E) 24
3.41 (IOA 101, 9/00, Q.10) (3.75 points) Let Z be a random variable with mean 0 and variance 1,
and let X be a random variable independent of Z with mean 5 and variance 4.
Let Y = X - Z. Calculate the correlation coefficient between X and Y.
3.42 (4, 11/00, Q.32) (2.5 points) You are given the following for a sample of five observations
from a bivariate distribution:
(i)   x    y
      1    4
      2    2
      4    3
      5    6
      6    4
(ii) x̄ = 3.6, ȳ = 3.8.
A is the covariance of the empirical distribution Fe as defined by these five observations.
B is the maximum possible covariance of an empirical distribution with identical
marginal distributions to Fe .
Determine B - A.
(A) 0.9
(B) 1.0

(C) 1.1

(D) 1.2

(E) 1.3



3.43 (1, 5/01, Q.7) (1.9 points) A joint density function is given by
f(x, y) = kx, for 0 < x < 1, 0 < y < 1, where k is a constant.
What is Cov(X,Y)?
(A) - 1/6
(B) 0
(C) 1/9
(D) 1/6
(E) 2/3
3.44 (4, 11/01, Q.29) (2.5 points) In order to simplify an actuarial analysis Actuary A uses an
aggregate distribution S = X1 +...+ XN, where N has a Poisson distribution with mean 10 and
Xi = 1.5 for all i.
Actuary A's work is criticized because the actual severity distribution is given by
Pr(Yi = 1) = Pr(Yi = 2) = 0.5, for all i, where the Yi's are independent.
Actuary A counters this criticism by claiming that the correlation coefficient between S and
S* = Y1 +...+ YN is high.
Calculate the correlation coefficient between S and S*.
(A) 0.75
(B) 0.80
(C) 0.85
(D) 0.90
(E) 0.95
3.45 (4, 11/03, Q.13) (2.5 points) You are given:
(i) Z1 and Z2 are independent N(0,1) random variables.
(ii) a, b, c, d, e, f are constants.
(iii) Y = a + bZ1 + cZ2 and X = d + eZ1 + fZ2 .
Determine E[Y | X].
(A) a
(B) a + (b + c)(X - d)
(C) a + (be + cf)(X - d)
(D) a + [(be + cf) / (e2 + f2 )] X
(E) a + [(be + cf) / (e2 + f2 )] (X - d)



Solutions to Problems:
3.1. E. Var[X] = 15 - 3^2 = 6. Var[Y] = 37 - 5^2 = 12.
Cov[X, Y] = E[XY] - E[X]E[Y] = 19 - (3)(5) = 4.
Corr[X,Y] = Cov[X,Y] / √(Var[X] Var[Y]) = 4/√((6)(12)) = 0.47.

3.2. E. Var[Z] = Var[.75X] = .75^2 Var[X]. Cov[X, Z] = Cov[X, .75X] = .75 Cov[X, X] = .75 Var[X].
Therefore, Corr[X,Z] = .75 Var[X] / √(Var[X] · 0.75^2 Var[X]) = 1.
Comments: Two variables that are proportional with a positive proportionality constant are perfectly
correlated and have a correlation of one.
3.3. A. E[A] = (4)(9) = 36. Var[A] = (4)(9^2) = 324.
E[B] = (4){(.8)(5) + (.2)(25)} = 36. Var[B] = (4){(.8)(5^2) + (.2)(25^2)} = 580.
E[AB] = E[E[AB | n]] = E[E[9n B | n]] = E[9n E[B | n]] = E[(9n)(9n)] = 81 E[n^2] =
(81)(2nd moment of the Poisson) = (81)(4 + 4^2) = 1620.
Cov[A, B] = E[AB] - E[A]E[B] = 1620 - (36)(36) = 324.
Corr[A, B] = Cov[A, B]/√(Var[A] Var[B]) = 324/√((324)(580)) = 0.747.
Comment: Similar to 4, 11/01, Q.29.
3.4. E. There are 36 equally likely possibilities, with corresponding values of M = MAX[G,R]:

     R:   1    2    3    4    5    6
G = 1:    1    2    3    4    5    6
G = 2:    2    2    3    4    5    6
G = 3:    3    3    3    4    5    6
G = 4:    4    4    4    4    5    6
G = 5:    5    5    5    5    5    6
G = 6:    6    6    6    6    6    6

The most likely value of M is therefore 6.

3.5. B. The conditional distribution of M if G =3 is: f(3) = 3/6, f(4) = 1/6, f(5) = 1/6, and
f(6) = 1/6. Thus the mode of the conditional distribution of M if G =3 is 3.
3.6. D. Examining the 36 equally likely possibilities, the distribution of M is:
f(1) =1/36, f(2) = 3/36, f(3) = 5/36, f(4) = 7/36, f(5) = 9/36, and f(6) = 11/36.
Thus the mean of M is: ((1)(1) + (3)(2) + (5)(3) + (7)(4) + (9)(5) + (11)(6)) / 36 = 4.472.



3.7. B. The conditional distribution of M if G =3 is: f(3) = 3/6, f(4) = 1/6, f(5) = 1/6, and
f(6) = 1/6. Thus the mean of the conditional distribution of M if G =3 is: {(3)(3) +4 +5 +6} /6 = 4.
3.8. E. The distribution of M is: f(1) =1/36, f(2) = 3/36, f(3) = 5/36, f(4) = 7/36, f(5) = 9/36,
and f(6) = 11/36. Thus the second moment of M is:
((1)(12 ) + (3)(22 ) + (5)(32 ) + (7)(42 ) + (9)(52 ) + (11)(62 )) / 36 = 21.972.
Thus since the mean of M is 4.472, the variance of M is: 21.972 - 4.4722 = 1.97.
3.9. C. The conditional distribution of M if G =3 is: f(3) = 3/6, f(4) = 1/6, f(5) = 1/6, and
f(6) = 1/6. Thus the the second moment of the conditional distribution of M if G =3 is:
{(3)(32 ) + 42 + 52 + 62 } /6 = 17.33. Thus since the mean of the conditional distribution of M is 4,
the variance the conditional distribution of M is: 17.33 - 42 = 1.33.
3.10. C. First one has to compute E[GM]. There are 36 equally likely possibilities and the
corresponding values of M = MAX[G,R] are the same as in the solution to 3.4:

     R:   1    2    3    4    5    6
G = 1:    1    2    3    4    5    6
G = 2:    2    2    3    4    5    6
G = 3:    3    3    3    4    5    6
G = 4:    4    4    4    4    5    6
G = 5:    5    5    5    5    5    6
G = 6:    6    6    6    6    6    6

E[GM] = Σ_i P(G = i) i E[M | G = i] =
(1/6)(1)(21/6) + (1/6)(2)(22/6) + (1/6)(3)(24/6) + (1/6)(4)(27/6) + (1/6)(5)(31/6) + (1/6)(6)(36/6) = 17.1111.
Thus Covar[G,M] = E[GM] - E[G]E[M] = 17.1111 - (3.5)(4.4722) = 1.458.
3.11. E. Corr[G,M] = Covar[G,M] / {Var[G] Var[M]}^0.5 = 1.458 / {(35/12)(1.973)}^0.5 = 0.608.



3.12. E. For the observations, E[XY] = {(12)(33) + (14)(11) + (25)(16) + (36)(40)}/4 = 597.5.
A = Cov[X,Y] = E[XY] - E[X]E[Y] = 597.5 - E[X]E[Y]. The maximum correlation and covariance
occurs when the smallest x corresponds to the smallest y, and the largest x corresponds to the
largest y, keeping the observed sets of values of x and y the same as before:

x     y
12    11
14    16
25    33
36    40

Now, E[XY] = {(12)(11) + (14)(16) + (25)(33) + (36)(40)}/4 = 655.25.
B = Cov[X,Y] = E[XY] - E[X]E[Y] = 655.25 - E[X]E[Y]. B - A = 655.25 - 597.5 = 57.75.
Comment: Similar to 4, 11/00, Q.32.
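The "sort both columns" idea in the comment is easy to check directly. This sketch is my own illustration, not part of the original solution.

```python
import numpy as np

x = np.array([12, 14, 25, 36])
y = np.array([33, 11, 16, 40])

def emp_cov(a, b):
    """Covariance of the empirical distribution (divide by N, not N - 1)."""
    return np.mean(a * b) - np.mean(a) * np.mean(b)

A = emp_cov(x, y)                        # covariance of the pairs as observed
B = emp_cov(np.sort(x), np.sort(y))      # smallest with smallest, ..., largest with largest
print(A, B, B - A)                       # B - A = 57.75
```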
3.13. A. Var[X] = Var[A] + Var[B] = 7 + 13 = 20, since A and B are independent.
Var[Y] = Var[A] + Var[C] = 7 + 28 = 35, since A and C are independent.
Cov[X, Y] = Cov[A + B, A + C] = Cov[A, A] + Cov[A, C] + Cov[B, A] + Cov[B, C] = Var[A] + 0 + 0 + 0 = 7.
Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]) = 7 / √((20)(35)) = 0.26.
Comment: No use is made of the given means.


3.14. C. Cov[X, Y] = Corr[X, Y] √(Var[X] Var[Y]) = (0.8)√((9)(16)) = 9.6.
Cov[X̄, Ȳ] = Cov[X, Y]/n = 9.6/100 = 0.096.
Var[X̄ - Ȳ] = Var[X̄] + Var[Ȳ] - 2 Cov[X̄, Ȳ] = 9/100 + 16/100 - (2)(0.096) = 0.058.
3.15. D. If X = 0, then Y is binomial with q = 1/2 and m = 2.
So the conditional variance of Y is: (2)(1/2)(1-1/2) = 1/2.
3.16. A. If X = 1, then Z is binomial with q = 1/2 and m = 1.
So the conditional variance of Z is: (1)(1/2)(1-1/2) = 1/4.
3.17. D. There are nine equally likely possibilities: 0,rg,0; r,g,0; g,r,0; gr,0,0; 0,0,rg; r,0,g; g,0,r;
0,r,g; 0,g,r. Therefore, E[YZ] = (0+0+0+0+0+0+0+1+1)/9 = 2/9.
E[Y] = E[Z] = 2/3. Thus, Covar[Y,Z] = 2/9 - (2/3)(2/3) = - 2/9 = -0.222.
3.18. A. If X = 0, then of the original nine equally likely possibilities only 4 apply: 0,rg,0; 0,0,rg;
0,r,g; 0,g,r. Thus E[YZ | X =0] = (0 + 0 + 1 + 1) /4 = 1/2. E[Y | X=0] = E[Z | X=0] = 1. Thus,
Covar[Y,Z | X =0] = 1/2 - (1)(1) = -1/2.
3.19. B. If X = 1, then of the original nine equally likely possibilities only 4 apply: r,g,0; g,r,0; r,0,g;
g,0, r. Thus E[YZ | X =1] = (0 + 0 + 0 + 0) /4 = 0. E[Y | X=1] = E[Z | X=1] = 1/2.
Thus, Covar[Y,Z | X =1] = 0 - (1/2)(1/2) = -1/4.



3.20. A. Y and Z are each distributed as a Binomial with q = 1/3 and m = 2.
Thus Var[Y] = Var[Z] = (2)(1/3)(2/3) = 4/9. From a previous question, Covar[Y,Z] = - 2/9.
Thus Corr[Y,Z] = (-2/9) / {(4/9)(4/9)}0.5 = -1/2.
Comment: Note that Y and Z are negatively correlated. Larger values of Y tend to be associated
with smaller values of Z and vice versa.
3.21. A. From a previous question, Var[Y | X=0] = Var[Z | X=0] = 1/2. From a previous question,
Covar[Y,Z | X =0] = -1/2. Thus Corr[Y,Z | X =0] = (-1/2) / {(1/2)(1/2)}0.5 = -1.
Comment: When X = 0, Y and Z are perfectly negatively correlated. This follows from the fact that Y
and Z are identically distributed and that Y + Z = 2, a constant. Therefore,
Cov[Y, Z] = Cov[Y, 2 - Y] = Cov[Y, 2] - Cov[Y, Y] = 0 - Var[Y] = - Var[Y].
Corr[Y, Z] = Cov[Y, Z] / {Var[Y] Var[Z]}0.5 = -Var[Y] / {Var[Y] Var[Y]}0.5 = -1.
3.22. A. From a previous question, Var[Y | X=1] = Var[Z | X=1] = 1/4. From a previous question,
Covar[Y, Z | X =1] = -1/4. Thus Corr[Y, Z | X =0] = (-1/4) / {(1/4)(1/4)}0.5 = -1.
3.23. D. If X = 0, then of the original nine equally likely possibilities only 4 apply: 0,rg,0; 0,0,rg;
0,r,g; 0,g,r. Thus E[N | X =0] = (2 + 2 + 1 + 1) /4 = 3/2.
3.24. E. If X = 1, then there is exactly one other urn (either Y or Z) that is occupied, so that N = 2
and the (conditional) expected value of N is 2.
3.25. C. If X = 2, then there are no other occupied urns , so that N=1 and the (conditional) expected
value of N is 1.



3.26 to 3.28. D., D., & C. Let u = T(x); u is uniformly distributed from 0 to ω - x.
Let v = T(y); v is uniformly distributed from 0 to ω - x.
Let w = T(xy) = min[u, v]. Let z = T( xy ) = max[u, v]. Let ω - x = b.
Prob[W > w] = Prob[both future lifetimes > w] = Prob[u > w] Prob[v > w] = (1 - w/b)(1 - w/b).
f(w) = -d(Prob[W > w])/dw = (2/b)(1 - w/b), 0 ≤ w ≤ b.
Integrating w f(w) from 0 to b, E[W] = b/3.
Integrating w^2 f(w) from 0 to b, E[W^2] = b^2/6. Var[W] = b^2/6 - (b/3)^2 = b^2/18.
E[w | u] = E[min[u, v] | u] = Prob[v < u] E[v | v < u] + Prob[v ≥ u] u = (u/b)(u/2) + (1 - u/b)u = u - u^2/(2b).
E[uw] = E[u E[w | u]] = E[u(u - u^2/(2b))] = E[u^2] - E[u^3]/(2b) = ∫_0^b u^2/b du - {∫_0^b u^3/b du}/(2b)
= b^2/3 - b^2/8 = 5b^2/24.
Cov[u, w] = E[uw] - E[u]E[w] = 5b^2/24 - (b/2)(b/3) = b^2/24.
Corr[u, w] = Cov[u, w]/√(Var[u] Var[w]) = (b^2/24) / √((b^2/12)(b^2/18)) = √6/4 = 0.612.
By symmetry, Corr[T(x), T( xy )] ≡ Corr[u, max[u, v]] = Corr[u, min[u, v]] = 0.612.
u + v = min[u, v] + max[u, v] ≡ w + z. Therefore, Cov[u + v, u + v] = Cov[w + z, w + z].
Since u and v are independent, Var[u] + Var[v] = Var[w] + Var[z] + 2Cov[w, z].
Cov[w, z] = (Var[u] + Var[v] - Var[w] - Var[z])/2 = (b^2/12 + b^2/12 - b^2/18 - b^2/18)/2 = b^2/36.
Corr[w, z] = Cov[w, z]/√(Var[w] Var[z]) = (b^2/36) / √((b^2/18)(b^2/18)) = 18/36 = 1/2.
Comment: Similar to Example 9.5.2 and Exercise 9.13 in Actuarial Mathematics.
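These three correlations are also easy to confirm by simulation, since they do not depend on b = ω - x. The sketch below is my own illustration; the seed and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
b, n = 60.0, 1_000_000            # b = omega - x; any positive value gives the same correlations

u = rng.uniform(0, b, n)          # T(x)
v = rng.uniform(0, b, n)          # T(y), independent of T(x)
w = np.minimum(u, v)              # joint-life failure time, T(xy)
z = np.maximum(u, v)              # last-survivor failure time

print(np.corrcoef(u, w)[0, 1])    # close to sqrt(6)/4 = 0.612
print(np.corrcoef(u, z)[0, 1])    # also close to 0.612, by symmetry
print(np.corrcoef(w, z)[0, 1])    # close to 1/2
```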


3.29. (i) X̄ = 39.08/5 = 7.816. Xi - X̄ = -3.146, -1.846, -1.426, 1.224, 5.194.
s_X^2 = {(-3.146)^2 + (-1.846)^2 + (-1.426)^2 + (1.224)^2 + (5.194)^2}/(5 - 1) = 10.954.
(ii) Ȳ = 2,357.55/5 = 471.51. Yi - Ȳ = -127.71, -55.26, -43.76, 58.49, 168.24.
s_Y^2 = {(-127.71)^2 + (-55.26)^2 + (-43.76)^2 + (58.49)^2 + (168.24)^2}/(5 - 1) = 13,251.
(iii) Cov[X, Y] = {(-3.146)(-127.71) + (-1.846)(-55.26) + (-1.426)(-43.76) + (1.224)(58.49) + (5.194)(168.24)}/(5 - 1) = 377.90.
(iv) r = Cov[X, Y]/(s_X s_Y) = 377.90 / √((10.954)(13,251)) = 0.992.
Comment: Setup taken from CAS3L, 5/08, Q.9.
For a linear regression, the fitted slope is β̂ = Cov[X, Y]/s_X^2 = 377.90/10.954 = 34.5.
The fitted intercept is α̂ = Ȳ - β̂ X̄ = 471.51 - (34.5)(7.816) = 201.86.



3.30. B. E[X] = (.3)(0) + (.3)(1) + (.4)(2) = 1.1.
Var[X] = {(.3)(0^2) + (.3)(1^2) + (.4)(2^2)} - 1.1^2 = 0.69.
E[Y] = (.3)(0) + (.4)(1) + (.3)(2) = 1.
Var[Y] = {(.3)(0^2) + (.4)(1^2) + (.3)(2^2)} - 1^2 = 0.6.
E[XY] = (.5)(0) + (.2)(1) + (.1)(2) + (.2)(4) = 1.2.
Cov[X, Y] = E[XY] - E[X]E[Y] = 1.2 - (1.1)(1) = 0.1.
Corr[X, Y] = Cov[X, Y]/√(Var[X] Var[Y]) = 0.1/√((0.69)(0.6)) = 0.155.
3.31. A. Let X be the first chip and Y be the second chip picked.
E[XY] = {(1)(2) + (1)(3) + (2)(1) + (2)(3) + (3)(1) + (3)(2)}/6 = 11/3.
Cov[X, Y] = E[XY] - E[X]E[Y] = 11/3 - (2)(2) = -1/3.
Var[X] = Var[Y] = 2/3.
Corr[X, Y] = Cov[X, Y]/√(Var[X] Var[Y]) = (-1/3)/(2/3) = -1/2.
3.32. A. E[XY] = (0.1)(-1)(0) + (0.1)(0)(0) + (0.2)(1)(0) + (0.1)(-1)(1) + (0.3)(0)(1) + (0.2)(1)(1)
= 0.1.
E[X] = (0.2)(-1) + (0.4)(0) + (0.4)(1) = .2. E[Y] = (0.4)(0) + (0.6)(1) = 0.6.
Cov[X, Y] = E[XY] - E[X]E[Y] = 0.1 - (0.2)(0.6) = -0.02.
3.33. C. E[XY] = (2/5)(1)(0) + (1/5)(2)(0) + (1/5)(1)(0) + (1/5)(1)(1) = 1/5.
E[X] = (1/5)(0) + (3/5)(1) + (1/5)(2) = 1. E[Y] = (3/5)(0) + (2/5)(1) = 2/5.
E[X2 ] = (1/5)(0) + (3/5)(1) + (1/5)(4) = 7/5. E[Y2 ] = (3/5)(0) + (2/5)(1) = 2/5.
E[Y - X] = E[Y] - E[X] = 2/5 - 1 = -3/5. E[(Y - X)^2] = E[Y^2] + E[X^2] - 2E[XY] =
2/5 + 7/5 - (2)(1/5) = 7/5. Var[Y - X] = 7/5 - (-3/5)^2 = 26/25.
Alternately, Var[X] = 7/5 - 12 = 2/5. Var[Y] = 2/5 - (2/5)2 = 6/25.
Cov[X, Y] = E[XY] - E[X]E[Y] = 1/5 - (1)(2/5) = -1/5.
Var[Y - X] = Var[Y] + Var[X] - 2Cov[X, Y] = 6/25 + 2/5 + 2/5 = 26/25.
3.34. A. E[XY] = ∫_{y=0}^{1} ∫_{x=0}^{y} (xy)(6x) dx dy = ∫_{y=0}^{1} 2y^4 dy = 2/5.
Cov[X, Y] = E[XY] - E[X]E[Y] = 2/5 - (1/2)(3/4) = 1/40.


3.35. C. Var[Xi] = 1/12.

Var[X1 + 2X2 - X3 ] =

Var[X1 ] + 4Var[X2 ] + Var[X3 ] + 4Cov[X1 , X2 ] - 2Cov[X1 , X3 ] - 4Cov[X2 , X3 ]


= 6/12 - 2/24 = 5/12.



3.36. B. E[XY] = (.05)(2)(0) + (.05)(3)(0) + (.15)(4)(0) + (.05)(5)(0) + (.4)(2)(1) + (.05)(2)(2) +
(.15)(2)(3) + (.1)(2)(4) = 2.7. Cov[X, Y] = E[XY] - E[X]E[Y] = 2.7 - 2.85 = -0.15.
3.37. ρ_AC = 1.0 implies C is a positive linear function of A, so ρ_BC = ρ_AB = 0.6.
Variance of a portfolio consisting of equal weights of stocks B and C:
(0.5^2)σ_B^2 + (0.5^2)σ_C^2 + (2)(0.5)(0.5)ρ_BC σ_B σ_C = (0.25)(0.7) + (0.25)(0.6) + (0.5)(0.6)√0.7 √0.6 = 0.519.

3.38. D. Var[A] = {(mean)(CV)}^2 = 25. Var[B] = {(5)(1)}^2 = 25. Var[C] = {(20)(1/2)}^2 = 100.
Var[X] = Var[A] + Var[B] = 25 + 25 = 50, since A and B are independent.
Var[Y] = Var[A] + Var[C] = 25 + 100 = 125, since A and C are independent.
Cov[X, Y] = Cov[A+B, A+C] = Cov[A, A] + Cov[A, C] + Cov[B, A] + Cov[B, C] = Var[A] + 0 + 0 + 0 = 25.
Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]) = 25 / √((50)(125)) = 1/√10.

Comment: Since A, B, and C are independent, Cov[A, C] = Cov[B, A] = Cov[B, C] = 0.


3.39. E[X] = {(50)(-2) + (90)(0) + (60)(2)}/200 = 0.1.
E[Y] = {(50)(2) + (90)(0) + (60)(1)}/200 = 0.2.
E[XY] = {(50)(-2)(2) + (90)(0)(0) + (60)(2)(1)}/200 = -1.6.
E[X2 ] = {(50)(-2)2 + (90)(0)2 + (60)(2)2 }/200 = 2.2.
E[Y2 ] = {(50)(2)2 + (90)(0)2 + (60)(1)2 }/200 = 1.3.
Cov[X, Y] = E[XY] - E[X]E[Y] = - 1.6 - (.1)(.2) = -1.62.
Var[X] = E[X2 ] - E[X]2 = 2.2 - 0.12 = 2.19. Var[Y] = E[Y2 ] - E[Y]2 = 1.3 - 0.22 = 1.26.
Corr[X, Y] = Cov[X, Y]/√(Var[X] Var[Y]) = -1.62/√((2.19)(1.26)) = -0.975.



3.40. C. E[XY] = ∫_{x=0}^{12} ∫_{y=0}^{x} (xy) (1/(12x)) dy dx = ∫_{x=0}^{12} x^2/24 dx = 24.
E[X] = ∫_{x=0}^{12} ∫_{y=0}^{x} x (1/(12x)) dy dx = ∫_{x=0}^{12} x/12 dx = 6.
E[Y] = ∫_{x=0}^{12} ∫_{y=0}^{x} y (1/(12x)) dy dx = ∫_{x=0}^{12} x/24 dx = 3.
Cov[X, Y] = E[XY] - E[X]E[Y] = 24 - (6)(3) = 6.


3.41. Cov[X, Y] = Cov[X, X - Z] = Cov[X, X] - Cov[X, Z] = Var[X] - 0 = 4.
Var[Y] = Var[X - Z] = Var[X] + Var[Z] - 2Cov[X, Z] = 4 + 1 - (2)(0) = 5.
Corr[X, Y] = 4/√((4)(5)) = 0.894.
3.42. D. For the five observations, E[XY] = {(1)(4) + (2)(2) + (4)(3) + (5)(6) + (6)(4)}/5 = 14.8.
Cov[X,Y] = E[XY] - E[X]E[Y] = 14.8 - (3.6)(3.8) = 1.12 = A.
If instead we had the maximum correlation between the observed x and y, keeping the
observed sets of values of x and y the same as before:

x    y
1    2
2    3
4    4
5    4
6    6

E[XY] = {(1)(2) + (2)(3) + (4)(4) + (5)(4) + (6)(6)}/5 = 16.
Cov[X,Y] = E[XY] - E[X]E[Y] = 16 - (3.6)(3.8) = 2.32 = B. B - A = 2.32 - 1.12 = 1.2.
Comment: The maximum correlation and covariance occurs when the smallest x corresponds to the
smallest y, and the largest x corresponds to the largest y.



3.43. B. Since f can be written as a product of a function of x and a function of y (and the support of
x does not depend on y and vice versa), x and y are independent. Cov[X, Y] = 0.
Alternately, the double integral of x over the square is 1/2, so k = 2.
E[XY] = ∫_{x=0}^{1} ∫_{y=0}^{1} (xy)(2x) dy dx = 1/3.
E[X] = ∫_{x=0}^{1} ∫_{y=0}^{1} x(2x) dy dx = 2/3.
E[Y] = ∫_{x=0}^{1} ∫_{y=0}^{1} y(2x) dy dx = 1/2.
Cov[X, Y] = E[XY] - E[X]E[Y] = 1/3 - (2/3)(1/2) = 0.


3.44. E. E[S] = (10)(1.5) = 15. Var[S] = (10)(1.5^2) = 22.5.
E[S*] = (10)((1+2)/2) = 15. Var[S*] = (10)(2nd moment of severity) = (10){(1^2 + 2^2)/2} = 25.
E[SS*] = E_n[E[SS* | n]] = E_n[E[1.5n S* | n]] = E_n[1.5n E[S* | n]] = E_n[(1.5n)(1.5n)] =
2.25 E_n[n^2] = 2.25 (2nd moment of the Poisson) = 2.25(10 + 10^2) = 247.5.
Cov[S, S*] = E[SS*] - E[S]E[S*] = 247.5 - (15)(15) = 22.5.
Corr[S, S*] = Cov[S, S*]/√(Var[S] Var[S*]) = 22.5/√((22.5)(25)) = 0.949.



3.45. E. Assume, for example, that all the constants are positive. If d = 1000 and X = 1002, then
Z1 and Z2 are more likely to be positive than negative ⇒ E[Z1] > 0 and E[Z2] > 0 ⇒
E[Y | X = 1002] > 0. If d = 1000, then E[Y | X = 998] < 0.
So we expect E[Y | X] to depend on X - d. This eliminates choices A and D.
Now try a = 0, b = 1, c = 0, d = 0, e = 0, f = 1. Then Y = Z1 and X = Z2. X and Y are independent ⇒
E[Y | X] = E[Y] = E[Z1] = 0. However, in this case choice B would give X, eliminating choice B.
Now try a = 0, b = 1, c = 1, d = 0, e = 1, f = 1. Then Y = Z1 + Z2 = X ⇒ E[Y | X] = X.
However, in this case choice C would give 2X, eliminating choice C.
Thus the answer must be the only remaining choice, E.
Alternately, notice that all of the choices are linear in X. Thus in this case, the least squares linear
estimator must be exact; i.e., equal to E[Y | X]. Applying linear regression, the least squares linear
estimator of Y given X has slope: Cov[X, Y]/Var[X] =
Cov[d + eZ1 + fZ2, a + bZ1 + cZ2]/Var[d + eZ1 + fZ2] =
{be Var[Z1] + cf Var[Z2]}/{e² Var[Z1] + f² Var[Z2]} = (be + cf) / (e² + f²), using the fact that Z1 and Z2
are independent and each have variance of 1.
The intercept is: E[Y] - (slope)E[X] = a - d(be + cf) / (e² + f²).
So the linear least squares estimator is: a - d(be + cf) / (e² + f²) + X(be + cf) / (e² + f²) =
a + {(be + cf) / (e² + f²)}(X - d).
Alternately, one could prove the result as follows: Let G = Z1 + uZ2 and H = Z1 + vZ2.
Prob[Z1 = z | G = g] = Prob[Z1 = z and G = g] / Prob[G = g] =
Prob[Z1 = z] Prob[Z2 = (g-z)/u] / Prob[G = g] = φ[z] φ[(g-z)/u] / Prob[G = g].
The density of the Standard Normal is: φ[z] = exp[-z²/2]/√(2π).
Therefore, given G, the density of Z1 is proportional to:
φ[z] φ[(g-z)/u] ~ exp[-z²/2] exp[-0.5(g-z)²/u²] = exp[-0.5{z²(1 + u²)/u² - 2gz/u² + g²/u²}]
~ exp[-0.5{z²(1 + u²)/u² - 2gz/u² + g²/(u² + u⁴)}] = exp[-0.5{(z/u)√(1 + u²) - g/√(u² + u⁴)}²] =
exp[-0.5{z - g/(1 + u²)}² / {u²/(1 + u²)}],
which is proportional to a Normal Distribution with μ = g/(1 + u²) and σ² = u²/(1 + u²).
Given G, the density of Z1 is a Normal Distribution with μ = g/(1 + u²) and σ² = u²/(1 + u²).
⇒ E[Z1 | G = g] = g/(1 + u²).

2013-4-9 Buhlmann Credibility 3 Covariances and Correlations, HCM 10/19/12, Page 76


Given G, the density of Z2 is proportional to:
Prob[Z2 = z and G = g] = Prob[Z1 = g - uz] Prob[Z2 = z] =
φ[g - uz] φ[z] ~ exp[-0.5(g - uz)²] exp[-z²/2] = exp[-0.5{z²(1 + u²) - 2gzu + g²}]
~ exp[-0.5{z²(1 + u²) - 2gzu + g²u²/(1 + u²)}] = exp[-0.5{z√(1 + u²) - gu/√(1 + u²)}²] =
exp[-0.5{z - gu/(1 + u²)}² / {1/(1 + u²)}],
which is proportional to a Normal Distribution with μ = gu/(1 + u²) and σ² = 1/(1 + u²).
Given G, the density of Z2 is a Normal Distribution with μ = gu/(1 + u²) and σ² = 1/(1 + u²).
⇒ E[Z2 | G = g] = gu/(1 + u²).
E[H | G = g] = E[Z1 + vZ2 | G = g] = E[Z1 | G = g] + vE[Z2 | G = g] = g/(1 + u²) + vgu/(1 + u²) =
g(1 + uv)/(1 + u²).
X = d + eZ1 + fZ2 ⇒ (X - d)/e = Z1 + (f/e)Z2. Let G = (X - d)/e, and u = f/e.
Y = a + bZ1 + cZ2 ⇒ (Y - a)/b = Z1 + (c/b)Z2. Let H = (Y - a)/b, and v = c/b.
Then E[Y | X] = E[a + bH | G = (X - d)/e] = a + b E[H | G = (X - d)/e] =
a + b {(X - d)/e}(1 + uv)/(1 + u²) = a + b {(X - d)/e}{1 + (f/e)(c/b)}/(1 + (f/e)²) =
a + (X - d)(be + cf) / (e² + f²).
Comment: Beyond what you are likely to be asked on your exam, but not beyond what you could
be asked. Some exams have a very hard question like this one. You cannot be expected to prove
this result under exam conditions! X and Y are Bivariate Normal, and E[Y | X] is linear in X;
see Example 5d and Section 7.7 in A First Course in Probability by Ross. For a more general
discussion, see page 206 of Applied Regression Analysis by Draper and Smith.
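A rough simulation check of the final formula; the constants below are arbitrary illustrative choices (not from the problem):

# Rough simulation check of E[Y | X] = a + (X - d)(be + cf)/(e^2 + f^2) from 3.45.
# The constants a, b, c, d, e, f below are arbitrary illustrative choices.
import numpy as np

a, b, c, d, e, f = 2.0, 1.0, 3.0, 5.0, 2.0, 1.0
rng = np.random.default_rng(1)
z1, z2 = rng.standard_normal((2, 1_000_000))
X = d + e * z1 + f * z2
Y = a + b * z1 + c * z2

# Average Y over a thin slice of X near x0, versus the formula.
x0 = 6.0
mask = np.abs(X - x0) < 0.05
print(Y[mask].mean())                              # empirical E[Y | X near x0]
print(a + (x0 - d) * (b*e + c*f) / (e**2 + f**2))  # formula: 2 + (1)(5/5) = 3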

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 77

Section 4, Bayesian Analysis, Introduction


Bayesian Analysis will be discussed in this and the following two sections.18 It is also discussed to
some extent in the section on the Die-Spinner Models and in the section on the Philbrick Target
Shooting Example.
Bayes Theorem:
Take the following simple example. Assume there are two types of risks, each with Bernoulli claim
frequencies. One type of risk has a 30% chance of a claim (and a 70% chance for no claims.)
The second type of risk has a 50% chance of having a claim. Of the universe of risks, 3/4 are of the
first type with a 30% chance of a claim, while 1/4 are of the second type with a 50% chance of
having a claim.
Type of Risk     A Priori Probability     Chance of a Claim
1                3/4                      30%
2                1/4                      50%

If a risk is chosen at random, then the chance of having a claim is (3/4)(30%) + (1/4)(50%) = 35%.
In this simple example, there are two possible outcomes: either we observe 1 claim or no claims.
Thus the chance of no claims is 65%.
Assume we pick a risk at random and observe no claim. Then what is the chance that we have risk
Type 1? By the definition of the conditional probability we have:
P(Type = 1 | n = 0) = P(Type =1 and n =0) / P(n=0).
However, P(Type =1 and n =0) = P(n =0 | Type =1) P(Type =1) = (0.7)(0.75).
Therefore, P(Type = 1 | n = 0) = P(n =0 | Type =1) P(Type =1) / P(n=0)
= (0.7)(0.75) / 0.65 = 0.8077.
This is a special case of Bayes Theorem:
P(A | B) = P(B | A) P(A) / P(B).
P(Risk Type | Observation) = P(Observation | Risk Type) P(Risk Type) / P(Observation).

Exercise: Assume we pick a risk at random and observe no claim. Then what is the chance that we
have risk Type 2?
[Solution: P(Type = 2 | n = 0) = P(n =0 | Type =2) P(Type =2) / P(n=0) = (.5)(.25) / .65 = 0.1923.]
18

This material could just as easily all be in one big section; the division into three is somewhat artificial.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 78


Of course with only two types of risks the chance of a risk being Type 2 is unity minus the chance of
being Type 1. We observe that 0.8077 + 0.1923 = 1.
Some may find it helpful to assume we have for example 1000 risks, 750 of Type 1 and 250 of
Type 2. Of the 750 risks of Type 1, on average (70%)(750) = 525 have no claim. Of the 250 risks
of Type 2, on average (50%)(250) = 125 have no claim. Thus there are expected to be 650 risks
with no claim. Of these 650 risks, 525 or 80.77% are of Type 1, and 125 or 19.23% are of Type 2.
Estimating the Future from an Observation:
Now not only do we have probabilities posterior to an observation, but we can use these to
estimate the chance of a claim if the same risk is observed again. For example, if we observe no
claim the estimated claim frequency for the same risk is:
(post. prob. Type 1)(claim freq. Type 1) + (posterior prob. Type 2)(claim freq. Type 2) =
(0.8077)(30%) + (0.1923)(50%) = 33.85%.
This type of Bayesian analysis can be organized into a spreadsheet. For the above example with
an observation of zero claims:
  A           B                  C               D                      E                    F
Type of     A Priori Chance    Chance of the   Prob. Weight =         Posterior Chance     Mean Annual
Risk        of This Type       Observation     Product of Col. B & C  of This Type         Freq.
1           0.75               0.7             0.525                  80.77%               0.30
2           0.25               0.5             0.125                  19.23%               0.50
Overall                                        0.650                  1.000                33.85%
Study this very carefully. Organize all of your solutions of Bayesian Analysis questions in a single
manner that works well for you. In my spreadsheet, one lists the different types of risks.19 Then list
the a priori chance of each type of risk. Next determine the chance of the observation given a
particular type of risk. Next compute:
probability weights = (a priori chance of risk)(chance of observation given that type of risk).
These are the Bayes Theorem probabilities except we have not divided by the a priori chance of
the observation. We can automatically convert the probability weights to probabilities by dividing
by their sum so that they add up to unity. (Note how the sum of the probability weights is 0.65,
which is the a priori chance of observing zero claims.) Then one can use these posterior probabilities
to estimate any quantity of interest; in this case we get a posterior estimate of the frequency.20
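The spreadsheet logic can also be written as a short routine. Here is a minimal Python sketch; the function name and argument names are mine, not from the text:

# Minimal sketch of the Bayesian-analysis "spreadsheet" in code.
# priors: a priori chances of each type; likes: chance of the observation
# given each type; means: hypothetical mean for each type.
def bayes_estimate(priors, likes, means):
    weights = [p * l for p, l in zip(priors, likes)]      # probability weights
    total = sum(weights)                                  # a priori chance of the observation
    posteriors = [w / total for w in weights]             # posterior chance of each type
    return sum(q * m for q, m in zip(posteriors, means))  # posterior estimate

# The two-type Bernoulli example, after observing no claim:
print(bayes_estimate([0.75, 0.25], [0.7, 0.5], [0.30, 0.50]))  # about 0.3385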
19

In this example there are only two types of risks. In more complicated examples I sometimes use additional
columns of the spreadsheet to list characteristics of the different types of risks.
20
In more complicated examples I sometimes use additional columns of the spreadsheet to compute the quantity of
interest for the different types of risks.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 79


Note that the posterior estimate is a weighted average of the hypothetical means for the different
types of risks. Thus the posterior estimate of 33.85% is in the range of the hypotheses, 30% to
50%. This is true in general for Bayesian analysis.
The result of Bayesian Analysis is always within the range of hypotheses.
This is not necessarily true for the results of applying Credibility.
Exercise: What if a risk is chosen at random and one claim is observed.
What is the posterior estimate of the chance of a claim from this same risk?
[Solution: (0.6429)(0.3) + (0.3571)(0.5) = 37.14%.
Type of     A Priori Chance    Chance of the   Prob. Weight =         Posterior Chance     Mean Annual
Risk        of This Type       Observation     Product of Col. B & C  of This Type         Freq.
1           0.75               0.3             0.225                  64.29%               0.30
2           0.25               0.5             0.125                  35.71%               0.50
Overall                                        0.350                  1.000                37.14%
For example, P(Type = 1 | n = 1) = P(Type =1 and n =1) / P(n=1) = (0.75)(0.3) / 0.35 = 0.643,
P(Type = 2 | n = 1) = P(Type =2 and n =1) / P(n=1) = (0.25)(0.5) / 0.35 = 0.357.]
Note how the estimate posterior to the observation of one claim is 37.14%, greater than the a priori
estimate of 35%. The observation has let us infer that it is more likely that the risk is of the high
frequency type than it was prior to the observation. Thus we infer that the future chance of a claim
from this risk is higher than it was prior to the observation.
Similarly, the estimate posterior to the observation of no claim is 33.85%, less than the a priori
estimate of 35%.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 80


Steps for Bayesian Analysis:
There are generally the following steps to Bayesian Analysis exam questions:
1. Read and understand the given model21. An overall population is divided into
different subgroups with (possibly) different a priori probabilities.
2. Calculate the expected value of the quantity of interest22 for each a priori possibility23.
3. Read and understand the observation(s) of an individual from the overall population.24
Compute the chance of the observation given a risk from each of the subgroups
of the overall population.
4. Compute the posterior probabilities using Bayes Theorem, using steps 1 and 3.
5. Take a weighted average of the expected values from step 2,
using the posterior probabilities from step 4.
Note that this assumes that we have observed an individual from the overall population without
knowing which subgroup it is from. Then we are estimating the future outcome for the same individual
observed in step 3. If instead one picked a new individual from the population, then the estimate
would be the weighted average of the expected values from step 2, using the a priori probabilities
from step 1 rather than the posterior probabilities from step 4.

21

In actual applications you may specify the model yourself. On the exam the model will be specified for you. It may
involve dice, urns, spinners, insured drivers, etc.
22
The quantity of interest may be the sum of die rolls, claim frequency, claim severity, total losses, etc.
23
The a priori possibilities may be different urns, spinners, classes, etc.
24
In actual applications you may need to obtain the relevant data. On the exam the data will be given you. It may
involve rolls of dice, spins of spinners, dollars of loss. etc.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 81


Multi-Sided Dice Example:
Lets illustrate Bayesian Analysis with a simple example involving multi-sided dice:
Assume that there are a total of 100 multi-sided dice of which 60 are 4-sided, 30 are 6-sided, and
10 are 8-sided. The multi-sided dice with 4 sides have 1, 2, 3 and 4 on them.25
The multi-sided dice with the usual 6 sides have numbers 1 through 6 on them.
The multi-sided dice with 8 sides have numbers 1 through 8 on them.26
For a given die each side has an equal chance of being rolled; i.e., the die is fair.
Your friend has picked at random a multi-sided die. (You do not know what sided die he has picked.)
He then rolled the die and told you the result. You are to estimate the result when he rolls that same
die again.
If the result is a 3 then the estimate of the next roll of the same die is 2.853:
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean
Die        This Type of Die     Observation     Product of Col. B & C   This Type of Die      Die Roll
4-sided    0.600                0.250           0.1500                  70.6%                 2.5
6-sided    0.300                0.167           0.0500                  23.5%                 3.5
8-sided    0.100                0.125           0.0125                   5.9%                 4.5
Overall                                         0.2125                  1.000                 2.853

The general steps to Bayesian Analysis exam questions were in this case:
1. Read and understand the given model . Make sure you understand what is meant
by the three different types of dice and note their a priori probabilities.
2. Calculate the expected value of the quantity of interest for each a priori possibility.
The mean die rolls for the three types of dice are: 2.5, 3.5 and 4.5.
3. Read and understand the observation(s). There is a single die roll for a 3.
4. Compute the posterior probabilities using Bayes Theorem. The posterior
probabilities for the three types of dice are: 70.6%, 23.5%, and 5.9%.
5. Take a weighted average of the expected values from step 2, using the posterior
probabilities from step 4. (70.6%)(2.5) + (23.5%)(3.5) + (5.9%)(4.5) = 2.85.

25 The mean of a 4-sided die is: (1 + 2 + 3 + 4)/4 = 2.5.
26 The mean of an 8-sided die is: (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8)/8 = (1+8)/2 = 4.5.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 82


Exercise: If instead a 6 is rolled, what is the estimate of the next roll of the same die?
[Solution: The estimate of the next roll of the same die is 3.700:
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean
Die        This Type of Die     Observation     Product of Col. B & C   This Type of Die      Die Roll
4-sided    0.600                0.000           0.0000                   0.0%                 2.5
6-sided    0.300                0.167           0.0500                  80.0%                 3.5
8-sided    0.100                0.125           0.0125                  20.0%                 4.5
Overall                                         0.0625                  1.000                 3.700

Alternately, assume we picked each of the 100 dice, and rolled each 24 times. Of these 2400 rolls,
1440 would be of 4-sided dice, 720 would be of 6-sided dice, and 240 would be of 8-sided dice.
The expected total number of sixes rolled is: 720/6 + 240/8 = 120 + 30 = 150. Of the sixes rolled,
120/150 = 80% are from 6-sided dice, and 30/150 = 20% are from 8-sided dice. Proceed as before.]
For this multisided die example, we get the following set of estimates corresponding to each
possible observation:
Observation:           1      2      3      4      5    6    7    8
Bayesian Estimate:   2.853  2.853  2.853  2.853  3.7  3.7  4.5  4.5

Note that while in this simple example the posterior estimates are the same for a number of different
observations, this is not usually the case.
Exercise: What is the a priori chance of each possible outcome?
[Solution: In this case there is a 60% / 4 = 15% chance that a 4-sided die will be picked and then a 1
will be rolled. Similarly, there is a 30% / 6 = 5% chance that a 6-sided die will be selected and then a
1 will be rolled. There is a 10% / 8 = 1.25% chance that an 8-sided die will be selected and then a 1
will be rolled.
The total chance of a 1 is therefore: 15% + 5% + 1.25% = 21.25%.
Roll     Probability due   Probability due   Probability due   A Priori
of Die   to 4-sided die    to 6-sided die    to 8-sided die    Probability
1        0.15              0.05              0.0125            0.2125
2        0.15              0.05              0.0125            0.2125
3        0.15              0.05              0.0125            0.2125
4        0.15              0.05              0.0125            0.2125
5        0                 0.05              0.0125            0.0625
6        0                 0.05              0.0125            0.0625
7        0                 0                 0.0125            0.0125
8        0                 0                 0.0125            0.0125
Sum      0.60              0.30              0.10              1.0000 ]

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 83


Using the Multiview Calculator:
Using the TI-30X-IIS Multiview, one could work as follows to do Bayes Analysis for this multisided
die example when a 3 is rolled:
DATA
DATA
Clear L1 ENTER
0.6 x 1/4 ENTER
0.3 x 1/6 ENTER
0.1 x 1/8 ENTER
(The three probability weights should now be in the column labeled L1.)
DATA
Clear L2 ENTER (Use the arrow keys on the big button at the upper right to select Clear L2.)
2.5 ENTER
3.5 ENTER
4.5 ENTER
(The means of the three types of dice should now be in the column labeled L2.)
2nd STAT
1-VAR ENTER (If necessary, use the arrow keys on the big button at the upper right to select 1-VAR.)
DATA L2 ENTER (Use the arrow keys on the big button at the upper right to select DATA L2.)
FRQ L1 ENTER (Use the arrow keys on the big button at the upper right to select FRQ L1.)
CALC ENTER (Use the arrow keys on the big button at the upper right to select CALC.)
Various outputs are displayed. Use the up and down arrows on the big button to scroll through them.
n = 0.2125 (the sum of the weights, the a priori chance of the observation.)
x̄ = 2.853 (weighted average of the means in L2 with weights in L1,
the estimate using Bayes Analysis.)
To exit stat mode, hit 2ND QUIT.
To display the outputs again:
2nd STAT
STATVAR ENTER (Use the arrow keys on the big button at the upper right to select STATVAR.)

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 84


Using the Functions of the Calculator to Compute a Weighted Average:
Assume the following:
Value
2.5
3.5
4.5

Weight
0.6
0.3
0.1

The weighted average is: {(0.6)(2.5) + (0.3)(3.5) + (0.1)(4.5)} / (0.6 + 0.3 + 0.1) = 3.0.
Note that as is often the case, here the weights add to one, so there is no need to divide by the
sum of the weights.
You could just calculate this weighted average directly. Alternately, you can use the statistics
functions of the allowed electronic calculators to calculate weighted averages.
If you already know how to fit a linear regression, you can just use some of the outputs of that.
We let X be 2.5, 3.5, 4.5, and the dependent variable Y be the weights: 0.6, 0.3, 0.1.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 85


Using the TI-30X-IIS Multiview, one would fit a straight line with intercept as follows:
DATA
DATA
Clear L1 ENTER
2.5 ENTER
3.5 ENTER
4.5 ENTER
(The three values should now be in the column labeled L1.)
DATA
Clear L2 ENTER (Use the arrow keys on the big button at the upper right to select Clear L2.)
0.6 ENTER
0.3 ENTER
0.1 ENTER
(The three weights should now be in the column labeled L2.)
2nd STAT
2-VAR ENTER (Use the arrow keys on the big button at the upper right to select 2-VAR.)
CALC ENTER (Use the arrow keys on the big button at the upper right to select CALC.)
Various outputs are displayed. Use the up and down arrows on the big button to scroll through them.
n = 3 (number of data points.)
etc.
Σy = 1 (sum of the weights)
Σxy = 3
a = -0.25 (slope)
b = 1.208 (intercept)
In general the weighted average is Σxy / Σy.
To exit stat mode, hit 2ND QUIT.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 86


Alternately, using the TI-30X-IIS Multiview:
DATA
DATA
Clear All ENTER
(Use the arrow keys on the big button at the upper right to select Clear All.)
2.5 ENTER
3.5 ENTER
4.5 ENTER
(The three values should now be in the column labeled L1.)
(Use the right arrow to move to the column labeled L2.)
0.6 ENTER
0.3 ENTER
0.1 ENTER
(The three weights should now be in the column labeled L2.)
2nd STAT
1-VAR ENTER
Under Data select L1 (If necessary, use the arrow keys on the big button at the upper right.)
Under Frequency select L2 (If necessary, use the arrow keys on the big button at the upper right.)
CALC ENTER (Use the arrow keys on the big button at the upper right to select CALC.)
Various outputs are displayed. Use the up and down arrows on the big button to scroll through them.
n = 1 (the sum of the weights.)
x̄ = 3 (the weighted average.)
To exit stat mode, hit 2ND QUIT.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 87


Using the TI-30X-IIS, one would fit a straight line with intercept as follows:
2nd STAT
CLRDATA ENTER
2nd STAT
2-VAR ENTER (Use the arrow keys if necessary to select 2-VAR rather than 1-VAR.)
DATA
X1 = 2.5
Y1 = 0.6
X2 = 3.5
Y2 = 0.3
X3 = 4.5
Y3 = 0.1 ENTER
STATVAR
Various outputs are displayed. Use the arrow keys to scroll through them.
n = 3 (number of data points.)
etc.
Σy = 1 (sum of the weights)
Σxy = 3
a = -0.25 (slope)
b = 1.208 (intercept)
In general the weighted average is Σxy / Σy.


Alternately, using the TI-30X-IIS:
2nd STAT
CLRDATA ENTER
2nd STAT
1-VAR ENTER (Use the arrow key if necessary to select 1-VAR rather than 2-VAR.)
DATA
X1 = 2.5
Freq = 6 (Frequencies have to be integers and proportional to the weights.)
X2 = 3.5
Freq = 3
X3 = 4.5
Freq = 1
ENTER
STATVAR
Various outputs are displayed. Use the arrow keys to scroll through them.
x̄ = 3 (the weighted average.)

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 88


Using the BA II Plus Professional, one would fit a straight line with intercept as follows:
2nd DATA
2nd CLR WORK
X1 2.5 ENTER
Y 1 0.6 ENTER
X2 3.5 ENTER
Y2 0.3 ENTER
X3 4.5 ENTER
Y3 0.1 ENTER
2nd STAT
If necessary press 2nd SET until LIN is displayed (for linear regression)
Various outputs are displayed. Use the up and down arrow keys to scroll through them.
n = 3 (number of data points.)
etc.
Σy = 1 (sum of the weights)
Σxy = 3
a = 1.208 (intercept)
b = -0.25 (slope)
In general the weighted average is Σxy / Σy.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 89


Balance:
Exercise: What is the a priori overall mean?
[Solution: The a priori chance of the types of dice are 60%, 30%, and 10%, with means of 2.5, 3.5,
and 4.5. Therefore, the a priori overall mean is: (60%)(2.5) + (30%)(3.5) + (10%)(4.5) = 3.]
Note that the Bayesian Estimates are in balance; the weighted average of the Bayesian Estimates,
using the a priori chance of each observation, is equal to the a priori overall mean of 3:
(0.2125)(2.853) + (0.2125)(2.853) + (0.2125)(2.853) + (0.2125)(2.853) + (0.0625)(3.7)
+ (0.0625)(3.7) + (0.0125)(4.5) + (0.0125)(4.5) = 3.00.

Roll      A Priori      Bayesian Analysis
of Die    Probability   Estimate
1         0.2125        2.853
2         0.2125        2.853
3         0.2125        2.853
4         0.2125        2.853
5         0.0625        3.7
6         0.0625        3.7
7         0.0125        4.5
8         0.0125        4.5
Average                 3.000

If Di are the possible outcomes, then the Bayesian estimates are E[X | Di].
Then Σ P(Di) E[X | Di] = E[X] = the a priori mean.
The estimates that result from Bayesian Analysis are always in balance:
The sum of the product of the a priori chance of each outcome times its posterior
Bayesian estimate is equal to the a priori mean.
Assume we were to repeat many times the following simulation of the multi-sided die example:
1. Choose a die.
2. Roll the die.
3. Apply Bayes Analysis to predict the outcome of the next die roll from the same die.
Then the average of the results of step three would be the a priori mean of the model, 3.
This is what we mean by the estimates that result from Bayesian Analysis are always in balance.
If the original model is correct, then the expected value of the Bayes Estimator is the a priori mean.
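This balance can be checked directly for the multi-sided dice example; a brief sketch using the a priori outcome probabilities and Bayesian estimates from the tables above:

# Check that the Bayesian estimates for the multi-sided dice example are in balance.
a_priori = [0.2125]*4 + [0.0625]*2 + [0.0125]*2          # P(roll = 1), ..., P(roll = 8)
estimates = [2.853]*4 + [3.7]*2 + [4.5]*2                # Bayesian estimate given each roll
print(sum(p * e for p, e in zip(a_priori, estimates)))   # about 3.0, the a priori mean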

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 90


Multi-sided Dice Example, More than One Roll:
If one observes more than one roll from the same die, the Bayesian Analysis proceeds in the same
fashion as in the previous case of one roll, but one computes the probability of observing these
rolls. Sometimes one is just given the sum of the observations. Again one just computes the
probability of the observation given the particular type of die.
For example, if one observes two rolls that sum to 11, the estimate of the next roll is 3.86:
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean
Die        This Type of Die     Observation     Product of Col. B & C   This Type of Die      Die Roll
4-sided    0.600                0.0000          0.0000                   0.0%                 2.5
6-sided    0.300                0.0556          0.0167                  64.0%                 3.5
8-sided    0.100                0.0938          0.0094                  36.0%                 4.5
Overall                                         0.0260                  1.000                 3.86

For example, the chance of the observation is 6/64 = 0.09375 if your friend has been rolling an
8-sided die. Two rolls of an 8-sided die can sum to 11 with rolls of: (3,8), (4,7), (5,6), (6,5), (7,4) or
(8,3), for six possibilities out of 8² = 64.
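The chance of the observation for each die can also be found by brute-force enumeration; a quick Python sketch:

# Chance that two rolls of a fair n-sided die sum to 11, for n = 4, 6, 8.
from itertools import product

for n in (4, 6, 8):
    ways = sum(1 for r in product(range(1, n + 1), repeat=2) if sum(r) == 11)
    print(n, ways / n**2)   # 0, then 2/36 = 0.0556, then 6/64 = 0.0938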
Exercise: If the sum of three rolls is 4, what is the estimate of the next roll from the same die?
[Solution: The estimate using Bayesian Analysis of the next roll is 2.66.
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean
Die        This Type of Die     Observation     Product of Col. B & C   This Type of Die      Die Roll
4-sided    0.600                0.04688         0.02813                 85.5%                 2.5
6-sided    0.300                0.01389         0.00417                 12.7%                 3.5
8-sided    0.100                0.00586         0.00059                  1.8%                 4.5
Overall                                         0.03288                 1.000                 2.66

For example, the chance of the observation is 3/216 if your friend has been rolling a 6-sided die.
Three rolls of a 6-sided die can sum to 4 with rolls of: (1,1,2), (1,2,1), or (2,1,1), for three possibilities
out of 6³ = 216.]
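The same enumeration idea gives the whole three-roll exercise in a few lines; a rough sketch:

# End-to-end check of the three-rolls-sum-to-4 exercise.
from itertools import product

priors = {4: 0.6, 6: 0.3, 8: 0.1}      # a priori chance of each die, keyed by number of sides
means  = {4: 2.5, 6: 3.5, 8: 4.5}      # mean roll of each die
likes  = {n: sum(1 for r in product(range(1, n + 1), repeat=3) if sum(r) == 4) / n**3
          for n in priors}             # chance three rolls sum to 4
weights = {n: priors[n] * likes[n] for n in priors}
total = sum(weights.values())
print(sum(weights[n] / total * means[n] for n in priors))   # about 2.66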

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 91


Problems:
4.1 (1 point) Box I contains 5 red and 3 blue marbles, while Box II contains 3 red and 7 blue
marbles. A fair die is rolled. If the roll results in a one, two, three, or four, a marble is chosen at
random from Box I. If it results in a five or six, a marble is chosen at random from Box II.
If the chosen marble is blue, but you are not allowed to see the roll of the die (and thus, you don't
know which box has been chosen), what is the probability that Box I was chosen?
A. Less than 49%
B. At least 49%, but less than 50%
C. At least 50%, but less than 51%
D. At least 51%, but less than 52%
E. 52% or more.

Use the following information for the next two questions:


There are three dice:
Die A: 2 faces labeled 0, 4 faces labeled 1.
Die B: 4 faces labeled 0, 2 faces labeled 1.
Die C: 5 faces labeled 0, 1 face labeled 1.
4.2 (2 points) You select a die at random and then roll the selected die twice. The rolls add to 1.
What is the expected value of the next roll of the same die?
A. less than 0.45
B. at least 0.45 but less than 0.50
C. at least 0.50 but less than 0.55
D. at least 0.55 but less than 0.60
E. at least 0.60
4.3 (2 points) You select a die at random and then roll the selected die three times.
The rolls add to 1. What is the expected value of the next roll of the same die?
A. 0.25
B. 0.30
C. 0.35
D. 0.40
E. 0.45

4.4 (1 point) Which of the following statements are true with respect to the estimates from Bayesian
Analysis?
1. They are in balance.
2. They are between the observation and the a priori mean.
3. They are within the range of hypotheses.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3
E. None of A, B, C, or D

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 92


Use the following information for the next two questions:
There are two types of urns, each with many balls labeled $1000 and $2000.
Type of Urn   A Priori Chance of   Percentage of   Percentage of
              This Type of Urn     $1000 Balls     $2000 Balls
I             80%                  90%             10%
II            20%                  70%             30%
4.5 (2 points) You pick an Urn at random (80% chance it is of Type I) and pick one ball.
If the ball is $2000, what is the expected value of the next ball picked from that same urn?
A. 1130
B. 1150
C. 1170
D. 1190
E. 1210
4.6 (2 points) You pick an Urn at random (80% chance it is of Type I) and pick three balls.
If two of the balls were $1000 and one of the balls was $2000, what is the expected value of the
next ball picked from that same urn?
A. 1140
B. 1160
C. 1180
D. 1200
E. 1220

4.7 (2 points) Let X1 be the outcome of a single trial and let E[X2 | X1 ] be the expected value of the
outcome of a second trial. You are given the following information:
Outcome = T   P(X1 = T)   Bayesian Estimate For E[X2 | X1 = T]
1             5/8         1.4
4             2/8         3.6
16            1/8         ---
Determine the Bayesian estimate for E[X2 | X1 = 16].
A. Less than 11
B. At least 11, but less than 12
C. At least 12, but less than 13
D. At least 13, but less than 14
E. 14 or more
4.8 (2 points) There are four types of urns with differing percentages of black balls.
Each type of urn has a differing chance of being picked.
Type of Urn   A Priori Probability   Percentage of Black Balls
I             40%                    4%
II            30%                    8%
III           20%                    12%
IV            10%                    16%
An urn is chosen and fifty balls are drawn from it, with replacement; no black balls are drawn.
Use Bayes Theorem to estimate the probability of picking a black ball from the same urn.
A. 3.0%
B. 3.5%
C. 4.5%
D. 5.0%
E. 5.5%

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 93


4.9 (3 points ) A die is selected at random from an urn that contains four six-sided dice with the
following characteristics:
                    Number of Faces
Number on Face   Die A   Die B   Die C   Die D
1                3       1       1       1
2                1       3       1       1
3                1       1       3       1
4                1       1       1       3
The first five rolls of the selected die yielded the following in sequential order: 2, 3, 1, 2, and 4.
Using Bayesian Analysis, what is the expected value of the next roll of the same die?
A. 1.8
B. 2.0
C. 2.2
D. 2.4
E. 2.6
4.10 (3 points) A game of chance has been designed where you are dealt a two-card hand from a
deck of cards chosen at random from two available decks.
The two decks are as follows:
Deck A: 1 suit from a regular deck of cards (13 cards).
Deck B: same as Deck A except the ace is missing (12 cards).
You will receive $10 for each ace or face card in your hand. Assume that you have been dealt two
cards which are either an ace or a face card (i.e., a $20 hand).
NOTE: A face card equals either a Jack, Queen or King.
Using Bayesian Analysis, what is the expected value of the next hand drawn from the same deck
assuming the first hand is replaced?
A. 5.1
B. 5.3
C. 5.5
D. 5.7
E. 5.9
4.11 (3 points) Your friend has picked at random one of three multi-sided dice. He then rolled the
die and told you the result. You are to estimate the result when he rolls that same die again. One of
the three multi-sided dice has 4 sides (with 1, 2, 3 and 4 on them), the second die has the usual 6
sides (with numbers 1 through 6), and the last die has 8 sides (with numbers 1 through 8).
For a given die each side has an equal chance of being rolled; i.e., the die is fair.
Assume the first roll was a five.
Use Bayes Theorem to estimate the next roll.
A. 3.5
B. 3.7
C. 3.9
D. 4.1
E. 4.3

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 94


Use the following information for the next three questions:
You are on a game show, and the host Monty Hall gives you the choice of one of three doors.
Behind one door is a car, while behind each of the other two doors is a goat.
You pick a door at random, and will receive what is behind your door when it is opened.
Monty Hall knows what is behind each door.
4.12 (2 points) Assume that whenever this game is played, Monty Hall opens a door the
contestant did not pick that has a goat behind it.
Then Monty Hall gives the contestant a chance to switch doors.
What is the probability that the car is behind the door you originally picked?
4.13 (2 points) Assume that whenever this game is played, Monty Hall opens at random a door
that the contestant did not pick. If opening this door reveals a goat behind it, then Monty Hall gives
the contestant a chance to switch doors. If opening this door reveals the car, then the game is over,
and the contestant gets a goat.
After you pick your door, Monty opens a different door revealing a goat.
What is the probability that the car is behind the door you originally picked?
4.14 (2 points) Assume that whenever this game is played, Monty Hall opens a door the
contestant did not pick. If the door the contestant picked has a goat behind it, then one third of the
time Monty opens the other door with a goat behind it, while two thirds of the time Monty opens the
door with a car behind it. If the door the contestant picked has the car behind it, then Monty opens
another door revealing a goat.
After you pick your door, Monty opens a different door revealing a goat, and gives you an
opportunity to switch doors.
What is the probability that the car is behind the door you originally picked?

4.15 (2 points) Only two cab companies operate in a city, Green and Blue.
Eighty-five percent of the cabs are Green and 15 percent are Blue.
A cab was involved in a hit-and-run accident at night.
A witness identified the cab as Blue.
The court tested the reliability of the witness under the circumstances that existed on the night of the
accident and concluded that the witness correctly identified each one of the two colors 80 percent of
the time and got the color wrong 20 percent of the time.
What is the probability that the cab involved in the accident is Blue rather than Green?
(A) 40%
(B) 50%
(C) 60%
(D) 70%
(E) 80%

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 95


4.16 (3 points) You are given the following:

Two urns each contain three marbles.

One urn contains two red marbles and one black marble.

The other urn contains one red marble and two black marbles.
An urn is randomly selected and designated Urn A. The other urn is designated Urn B.
One marble is randomly drawn from Urn A. The selected marble is placed in Urn B. One marble is
randomly drawn from Urn B. This second selected marble is placed in Urn A. One marble is
randomly drawn from Urn A. This third selected marble is placed in Urn B. One marble is randomly
drawn from Urn B. This fourth selected marble is placed in Urn A. This process is continued
indefinitely, with marbles alternately drawn from Urn A and Urn B.
The first selected marble is red. The second selected marble is black. Determine the Bayesian
analysis estimate of the probability that the third selected marble will be red.
A. Less than 0.3
B. At least 0.3, but less than 0.5
C. At least 0.5, but less than 0.7
D. At least 0.7, but less than 0.9
E. At least 0.9
Use the following information for the next two questions:
There are three large urns, each filled with so many balls that you can treat it as if there are an infinite
number. Urn 1 contains balls with "zero" written on them. Urn 2 has balls with "one" written on them.
The final Urn 3 is filled with 50% balls with "zero" and 50% balls with "one". An urn is chosen at
random and five balls are picked.
4.17 (4, 5/83, Q.44a) (2 points) If all five balls have zero written on them, use Bayes Theorem to
estimate the expected value of another ball picked from that urn.
A. less than 0.02
B. at least 0.02 but less than 0.04
C. at least 0.04 but less than 0.06
D. at least 0.06 but less than 0.08
E. at least 0.08
4.18 (4, 5/83, Q.44c) (2 points) If 3 balls have 0 written on them and 2 have 1 written on them,
use Bayes Theorem to estimate the expected value of another ball picked from that urn.
A. less than 0.42
B. at least 0.42 but less than 0.44
C. at least 0.44 but less than 0.46
D. at least 0.46 but less than 0.48
E. at least 0.48

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 96


Use the following information for the next two questions:
There are three dice:
Die A: 2 faces labeled 0, 4 faces labeled 1.
Die B: 4 faces labeled 0, 2 faces labeled 1.
Die C: 5 faces labeled 0, 1 face labeled 1.
4.19 (4, 5/84, Q.51) (2 points) You select a die at random and then roll the selected die.
Assuming a 0 is rolled what is the expected value of the next roll of the same die?
A. less than 0.2
B. at least 0.2 but less than 0.3
C. at least 0.3 but less than 0.4
D. at least 0.4 but less than 0.5
E. at least 0.5
4.20 (4, 5/84, Q.51) (2 points) You select a die at random and then roll the selected die.
Assuming a 1 is rolled what is the expected value of the next roll of the same die?
A. less than 0.5
B. at least 0.5 but less than 0.6
C. at least 0.6 but less than 0.7
D. at least 0.7 but less than 0.8
E. at least 0.8
4.21 (4, 5/86, Q.36) (1 point) Box I contains 3 red and 2 blue marbles, while Box II contains 3 red
and 7 blue marbles. A fair die is rolled. If the roll results in a one, two, three, or four, a marble is
chosen at random from Box I. If it results in a five or six a marble is chosen at random from Box II.
If the chosen marble is red, but you are not allowed to see the roll of the die (and thus, you don't
know which box has been chosen), what is the probability that Box I was chosen?
A. Less than 55%
B. At least 55%, but less than 65%
C. At least 65%, but less than 75%
D. At least 75%, but less than 85%
E. 85% or more.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 97


4.22 (4, 5/89, Q.36) (2 points) Your friend selected at random one of two urns and then she pulled
a ball with the number 4 on it from the urn. Then, she replaced the ball in the urn. One of the urns
contains four balls numbered 1 through 4. The other urn contains six balls numbered 1 through 6.
Your friend will make another random selection of a ball from the same urn. Using the Bayesian
method (i.e. Bayes' Theorem) what is the expected value of the number on the ball?
A. Less than 2.925
B. At least 2.925, but less than 2.975
C. At least 2.975, but less than 3.025
D. At least 3.025, but less than 3.075
E. 3.075 or more
4.23 (4, 5/90, Q.39) (2 points) Three urns contain balls marked with either 0 or 1 in the proportions
described below.
         Marked 0   Marked 1
Urn A    10%        90%
Urn B    60         40
Urn C    80         20
An urn is selected at random and three balls are selected, with replacement, from the urn. The total
of the values is 1. Three more balls are selected from the same urn.
Calculate the expected total of the three balls using Bayes theorem.
A. less than 1.05
B. at least 1.05 but less than 1.10
C. at least 1.10 but less than 1.15
D. at least 1.15 but less than 1.20
E. at least 1.20
4.24 (4, 5/91, Q.38) (2 points) One spinner is selected at random from a group of three different
spinners. Each of the spinners is divided into six equally likely sectors marked as described below.
              --------- Number of Sectors ---------
Spinner   Marked 0   Marked 12   Marked 48
A         2          2           2
B         3          2           1
C         4          1           1
Assume a spinner is selected and a 12 was obtained on the first spin.
Use Bayes' theorem to calculate the expected value of the second spin using the same spinner.
A. Less than 12.5
B. At least 12.5 but less than 13.0
C. At least 13.0 but less than 13.5
D. At least 13.5 but less than 14.0
E. At least 14.0

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 98


4.25 (4, 5/91, Q.50) (2 points) Four urns contain balls marked with either 0 or 1 in the proportions
described below.
Urn   Marked 0   Marked 1
A     70%        30%
B     70         30
C     30         70
D     20         80
An urn is selected at random and four balls are selected from the urn with replacement. The total of
the values is 2. Four more balls are selected from the same urn. Calculate the expected total of the
four balls using Bayes theorem.
A. Less than 1.96
B. At least 1.96 but less than 1.99
C. At least 1.99 but less than 2.02
D. At least 2.02 but less than 2.05
E. At least 2.05
4.26 (2, 5/92, Q.19) (1.7 points) A test for a disease correctly diagnoses a diseased person as
having the disease with probability .85. The test incorrectly diagnoses someone without the
disease as having the disease with probability .10. If 1% of the people in a population have the
disease, what is the chance that a person from this population who tests positive for the disease
actually has the disease?
A. 0.0085
B. 0.0791
C. 0.1075
D. 0.1500
E. 0.9000
4.27 (4B, 5/92, Q.8) (3 points) Two urns contain balls each marked with 0, 1, or 2 in the
proportions described below:
         Percentage of Balls in Urn
         Marked 0   Marked 1   Marked 2
Urn A    0.20       0.40       0.40
Urn B    0.70       0.20       0.10
An urn is selected at random and two balls are selected, with replacement, from the urn.
The sum of values on the selected balls is 2. Two more balls are selected from the same urn.
Determine the expected total of the two balls using Bayes' Theorem.
A. Less than 1.6
B. At least 1.6 but less than 1.7
C At least 1.7 but less than 1.8
D. At least 1.8 but less than 1.9
E. At least 1.9

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 99


4.28 (4B, 5/92, Q.24) (2 points) Let X1 be the outcome of a single trial and let E[X2 | X1 ] be the
expected value of the outcome of a second trial. You are given the following information:
Outcome              Buhlmann Credibility      Bayesian
T        P(X1 =T)    Estimate For              Estimate For
                     E[X2 | X1 =T]             E[X2 | X1 =T]
1        1/3         3.4                       2.6
8        1/3         7.6                       7.8
12       1/3         10.0                      ---
Determine the Bayesian estimate for E[X2 | X1 = 12].
A. 8.6        B. 10.0        C. 10.6        D. 12.0        E. Cannot be determined.

4.29 (4B, 11/94, Q.5) (3 points) Two urns contain balls with each ball marked 0 or 1 in the
proportions described below:
         Percentage of Balls in Urn
         Marked 0   Marked 1
Urn A    20%        80%
Urn B    70%        30%
An urn is randomly selected and two balls are drawn from the urn. The sum of the values on the
selected balls is 1. Two more balls are selected from the same urn.
Note: Assume that each selected ball has been returned to the urn before the next ball is drawn.
Determine the Bayesian analysis estimate of the expected value of the sum of the values on the
second pair of selected balls.
A. Less than 1.035
B. At least 1.035, but less than 1.055
C. At least 1.055, but less than 1.075
D. At least 1.075, but less than 1.095
E. At least 1.095

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 100
4.30 (4B, 11/95, Q.20) (2 points) Ten urns each contain five balls, numbered as follows:
Urn 1: 1,2,3,4,5 Urn 2: 1,2,3,4,5 Urn 3: 1,2,3,4,5 Urn 4: 1,2,3,4,5 Urn 5: 1,2,3,4,5
Urn 6: 1,1,1,1,1 Urn 7: 2,2,2,2,2 Urn 8: 3,3,3,3,3 Urn 9: 4,4,4,4,4 Urn 10: 5,5,5,5,5
An urn is randomly selected. A ball is then randomly selected from this urn. The selected ball has the
number 2 on it. This ball is then replaced, and another ball is randomly selected from the same urn.
The second selected ball has the number 3 on it. This ball is then replaced, and another ball is
randomly selected from the same urn. Determine the Bayesian analysis estimate of the expected
value of the number on this third selected ball.
A. Less than 2.2
B. At least 2.2, but less than 2.4
C. At least 2.4, but less than 2.6
D. At least 2.6, but less than 2.8
E. At least 2.8
4.31 (4B, 5/97, Q.18) (3 points) You are given the following:

12 urns each contain 10 marbles.


n of the urns contain 3 red marbles and 7 black marbles.

The remaining 12-n urns contain 6 red marbles and 4 black marbles.
An urn is randomly selected, and one marble is randomly drawn from it. The selected marble is red.
This marble is replaced, and a marble is again randomly drawn from the same urn. The Bayesian
analysis estimate of the probability that the second selected marble is red is 0.54. Determine n.
A. 4
B. 5
C. 6
D. 7
E. 8

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 101
Use the following information for the next two questions:

Two urns each contain three marbles.

One urn contains two red marbles and one black marble.

The other urn contains one red marble and two black marbles.
An urn is randomly selected and designated Urn A. The other urn is designated Urn B.
One marble is randomly drawn from Urn A. The selected marble is placed in Urn B.
One marble is randomly drawn from Urn B. This second selected marble is placed in Urn A.
One marble is randomly drawn from Urn A. This third selected marble is placed in Urn B.
One marble is randomly drawn from Urn B. This fourth selected marble is placed in Urn A.
This process is continued indefinitely, with marbles alternately drawn from Urn A and Urn B.
4.32 (4B, 11/97, Q.27) (1 point) The first two selected marbles are red. Determine the Bayesian
analysis estimate of the probability that the third selected marble will be red.
A. 1/2
B. 11/21
C. 4/7
D. 2/3
E. 1
4.32 (4B, 11/97, Q.28) (2 points) Determine the limit as n goes to infinity of the Bayesian analysis
estimate of the probability that the (2n+1)st selected marble will be red if the first 2n selected
marbles are red (where n is an integer).
A. 1/2
B. 11/21
C. 4/7
D. 2/3
E. 1

4.33 (4B, 5/99, Q.2) (2 points) Each of two urns contains two fair, six-sided dice.
Three of the four dice have faces marked with 1, 2, 3, 4, 5, and 6.
The other die has faces marked with 1, 1, 1, 2, 2, and 2.
One urn is randomly selected, and the dice in it are rolled. The total on the two dice is 3.
Determine the Bayesian analysis estimate of the expected value of the total on the same two dice
on the next roll.
A. 5.0
B. 5.5
C. 6.0
D. 6.5
E. 7.0

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 102
4.34 (4B, 5/99, Q.26) (3 points) A company invests in a newly offered stock that it judges will turn
out to be one of three types. The company believes that the stock is equally likely to be any of the
three types.
The annual dividend for Type A stocks has a normal distribution with mean 10 and variance 1.
The annual dividend for Type B stocks has a normal distribution with mean 10 and variance 4.
The annual dividend for Type C stocks has a normal distribution with mean 10 and variance 16.
After the company has held the stock for one year, the stock pays a dividend of amount d. The
company then determines that the posterior probability that the stock is of Type B is greater than
either the posterior probability that the stock is of Type A or the posterior probability that the stock is
of Type C.
Determine all the values of d for which this would be true.
Hint: The density function for a normal distribution is f(x) = exp[-(x-μ)²/(2σ²)] / {σ√(2π)}.
A. |d-10| < 2√[(2 ln 2) / 3]
B. |d-10| < 4√[(2 ln 2) / 3]
C. 2√[(2 ln 2) / 3] < |d-10| < 4√[(2 ln 2) / 3]
D. |d-10| > 2√[(2 ln 2) / 3]
E. |d-10| > 4√[(2 ln 2) / 3]

4.35 (4B, 11/99, Q.16) (2 points) You are given the following:
A red urn and a blue urn each contain 100 balls.
Each ball is labeled with both a letter and a number.
The distribution of letters and numbers on the balls is as follows:
            Letter A   Letter B   Number 1   Number 2
Red Urn     90         10         90         10
Blue Urn    60         40         10         90
Within each urn, the appearance of the letter A on a ball is independent of the appearance of the
number 1 on a ball.
One ball is drawn randomly from a randomly selected urn, observed to be labeled A-2, and then
replaced.
Determine the expected value of the number on another ball drawn randomly from the same urn.
A. Less than 1.2
B. At least 1.2, but less than 1.4
C. At least 1.4, but less than 1.6
D. At least 1.6, but less than 1.8
E. At least 1.8

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 103
4.36 (Course 1 Sample Exam, Q.3) (1.9 points) Ten percent of a company's life insurance
policyholders are smokers. The rest are nonsmokers.
For each nonsmoker, the probability of dying during the year is 0.01.
For each smoker, the probability of dying during the year is 0.05.
Given that a policyholder has died, what is the probability that the policyholder was a smoker?
A. 0.05
B. 0.20
C. 0.36
D. 0.56
E. 0.90
4.37 (Course 1 Sample Exam, Q.29) (1.9 points) An insurance company designates 10% of its
customers as high risk and 90% as low risk. The number of claims made by a customer in a calendar
year is Poisson distributed with mean and is independent of the number of claims made by a
customer in the previous calendar year.
For high risk customers = 0.6, while for low risk customers = 0.1.
Calculate the expected number of claims made in calendar year 1998 by a customer who made
one claim in calendar year 1997.
A. 0.15
B. 0.18
C. 0.24
D. 0.30
E. 0.40
4.38 (1, 5/00, Q.2) (1.9 points)
A study of automobile accidents produced the following data:
Model year   Proportion of all vehicles   Probability of involvement in an accident
1997         0.16                         0.05
1998         0.18                         0.02
1999         0.20                         0.03
Other        0.46                         0.04
An automobile from one of the model years 1997, 1998, and 1999 was involved in an accident.
Determine the probability that the model year of this automobile is 1997.
(A) 0.22
(B) 0.30
(C) 0.33
(D) 0.45
(E) 0.50
4.39 (1, 5/00, Q.33) (1.9 points) A blood test indicates the presence of a particular disease 95%
of the time when the disease is actually present. The same test indicates the presence of the
disease 0.5% of the time when the disease is not present. One percent of the population actually
has the disease. Calculate the probability that a person has the disease given that the test indicates
the presence of the disease.
(A) 0.324
(B) 0.657
(C) 0.945
(D) 0.950
(E) 0.995

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 104
4.40 (1, 11/00, Q.12) (1.9 points) An actuary studied the likelihood that different types of drivers
would be involved in at least one collision during any one-year period.
The results of the study are presented below.
Type of driver   Percentage of all drivers   Probability of at least one collision
Teen             8%                          0.15
Young adult      16%                         0.08
Midlife          45%                         0.04
Senior           31%                         0.05
Total            100%
Given that a driver has been involved in at least one collision in the past year, what is the
probability that the driver is a young adult driver?
(A) 0.06
(B) 0.16
(C) 0.19
(D) 0.22
(E) 0.25
4.41 (1, 11/00, Q.22) (1.9 points) The probability that a randomly chosen male has a circulation
problem is 0.25. Males who have a circulation problem are twice as likely to be smokers as those
who do not have a circulation problem. What is the conditional probability that a male has a circulation
problem, given that he is a smoker?
(A) 1/4
(B) 1/3
(C) 2/5
(D) 1/2
(E) 2/3
4.42 (1, 5/01, Q.6) (1.9 points) An insurance company issues life insurance policies in three
separate categories: standard, preferred, and ultra-preferred.
Of the company's policyholders, 50% are standard, 40% are preferred, and 10% are
ultra-preferred. Each standard policyholder has probability 0.010 of dying in the next year, each
preferred policyholder has probability 0.005 of dying in the next year, and each
ultra-preferred policyholder has probability 0.001 of dying in the next year.
A policyholder dies in the next year.
What is the probability that the deceased policyholder was ultra-preferred?
(A) 0.0001 (B) 0.0010 (C) 0.0071 (D) 0.0141 (E) 0.2817
4.43 (1, 5/01, Q.23) (1.9 points) A hospital receives 1/5 of its flu vaccine shipments from
Company X and the remainder of its shipments from other companies. Each shipment contains a
very large number of vaccine vials. For Company Xs shipments, 10% of the vials are ineffective.
For every other company, 2% of the vials are ineffective. The hospital tests 30 randomly selected
vials from a shipment and finds that one vial is ineffective.
What is the probability that this shipment came from Company X ?
(A) 0.10
(B) 0.14
(C) 0.37
(D) 0.63
(E) 0.86

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 105
4.44 (4, 11/01, Q.7 & 2009 Sample Q.60) (2.5 points)
You are given the following information about six coins:
Coin    Probability of Heads
1-4     0.50
5       0.25
6       0.75
A coin is selected at random and then flipped repeatedly. Xi denotes the outcome of the ith flip,
where 1 indicates heads and 0 indicates tails.
The following sequence is obtained: S = {X1 , X2 , X3 , X4 } = {1, 1, 0, 1}.
Determine E(X5 | S) using Bayesian analysis.
(A) 0.52        (B) 0.54        (C) 0.56        (D) 0.59        (E) 0.63

4.45 (2 points) In the previous question, 4, 11/01, Q. 7, using Bayesian analysis, determine the
probability that coin flips 5 and 6 are both heads.
A. Less than 29%
B. At least 29%, but less than 31%
C. At least 31%, but less than 33%
D. At least 33%, but less than 35%
E. At least 35%
4.46 (2 points) In 4, 11/01, Q. 7, you are instead given that: X1 + X2 + X3 + X4 = 3.
Determine E(X5 | S) using Bayesian analysis.
(A) 0.52        (B) 0.54        (C) 0.56        (D) 0.59        (E) 0.63

4.47 (3 points) In 4, 11/01, Q. 7, using Bayesian analysis, determine the probability that
coin flip 5 is a tail, coin flip 6 is a head, and coin flip 7 is a tail.
A. Less than 10.0%
B. At least 10.0%, but less than 10.5%
C. At least 10.5%, but less than 11.0%
D. At least 11.0%, but less than 11.5%
E. At least 11.5%

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 106
4.48 (1, 5/03, Q.8) (2.5 points) An auto insurance company insures drivers of all ages.
An actuary compiled the following statistics on the company's insured drivers:
Age of Driver   Probability of Accident   Portion of Company's Insured Drivers
16-20           0.06                      0.08
21-30           0.03                      0.15
31-65           0.02                      0.49
66-99           0.04                      0.28
A randomly selected driver that the company insures has an accident.
Calculate the probability that the driver was age 16-20.
(A) 0.13
(B) 0.16
(C) 0.19
(D) 0.23
(E) 0.40
4.49 (1, 5/03, Q.31) (2.5 points) A health study tracked a group of persons for five years.
At the beginning of the study, 20% were classified as heavy smokers, 30% as light smokers, and
50% as nonsmokers.
Results of the study showed that light smokers were twice as likely as nonsmokers to die during the
five-year study, but only half as likely as heavy smokers.
A randomly selected participant from the study died over the five-year period.
Calculate the probability that the participant was a heavy smoker.
(A) 0.20
(B) 0.25
(C) 0.35
(D) 0.42
(E) 0.57
4.50 (4, 5/07, Q.35) (2.5 points) The observation from a single experiment has distribution:
Pr(D = d | G = g) = g^(1-d) (1-g)^d, for d = 0, 1
The prior distribution of G is:
Pr(G = 1/5) = 3/5 and Pr(G = 1/3) = 2/5
Calculate Pr(G = 1/3 | D = 0).
(A) 2/19        (B) 3/19        (C) 1/3        (D) 9/19        (E) 10/19

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 107
Solutions to Problems:
4.1. D.
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of
Box        This Type of Box     Observation     Product of Col. B & C   This Type of Box
I          0.667                0.3750          0.2500                  51.72%
II         0.333                0.7000          0.2333                  48.28%
Overall                                         0.483                   1.000

4.2. A.
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean Roll
Die        This Type of Die     Observation     Product of Col. B & C   This Type of Die      of Die
A          0.3333               0.4444          0.1481                  0.3810                0.6667
B          0.3333               0.4444          0.1481                  0.3810                0.3333
C          0.3333               0.2778          0.0926                  0.2381                0.1667
Overall                                         0.389                   1.000                 0.421

If the sum of two rolls is 1, then one of them must have been a 0 and the other a 1.
For example, if die A is chosen, then the chance of the sum being 1 is: (2)(1/3)(2/3) = 4/9.
If instead die C is chosen, then the chance of the sum being 1 is: (2)(5/6)(1/6) = 5/18.
4.3. C.
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean Roll
Die        This Type of Die     Observation     Product of Col. B & C   This Type of Die      of Die
A          0.3333               0.2222          0.0741                  0.2192                0.6667
B          0.3333               0.4444          0.1481                  0.4384                0.3333
C          0.3333               0.3472          0.1157                  0.3425                0.1667
Overall                                         0.338                   1.000                 0.349

For example, the chance of observing a single 1 on three rolls for Die C is: 3(1/6)(5/6)² =
75/216. (The chance of given numbers of ones being rolled is given by a Binomial Distribution with
m = 3 and q = 1/6 in the case of Die C.)
4.4. B. 1. True. 2. Not necessarily true (is always true of Credibility.) 3. True.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 108
4.5. D.
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean Draw
Urn        This Type of Urn     Observation     Product of Col. B & C   This Type of Urn      from Urn
I          0.8000               0.1000          0.0800                  0.5714                1100
II         0.2000               0.3000          0.0600                  0.4286                1300
Overall                                         0.140                   1.000                 1186

4.6. B. For example, the chance of picking 2 @ $1000 and 1 @ $2000 from Urn II is given by f(2)
for a Binomial distribution with n = 3 and q = 0.7: (3)(0.7²)(0.3) = 0.441.
Type of    A Priori Chance of   Chance of the   Prob. Weight =          Posterior Chance of   Mean Draw
Urn        This Type of Urn     Observation     Product of Col. B & C   This Type of Urn      from Urn
I          0.8000               0.2430          0.1944                  0.6879                1100
II         0.2000               0.4410          0.0882                  0.3121                1300
Overall                                         0.283                   1.000                 1162

4.7. E. Bayesian Estimates are in balance; the sum of the product of the a priori chance of each
outcome times its posterior Bayesian estimate is equal to the a priori mean. The a priori mean is:
(5/8)(1) + (2/8)(4) + (1/8)(16) = 3.625. Let E[X2 | X1 = 16] = y. Then setting the sum of the chance
of each outcome times its posterior mean equal to the a priori mean:
(5/8)(1.4) + (2/8)(3.6) + (1/8)(y) = 3.625. Therefore y = 14.8.
4.8. C. Chance of no Black Balls in Fifty Draws = (1-p)⁵⁰.
Type   A Priori      % Black   Chance of No Black     Probability Weights =   Posterior     Col. C x
       Probability   Balls     Balls in Fifty Draws   Col. B x Col. D         Probability   Col. F
I      0.4           0.04      0.12989                0.0520                  0.912         0.036
II     0.3           0.08      0.01547                0.0046                  0.081         0.007
III    0.2           0.12      0.00168                0.0003                  0.006         0.001
IV     0.1           0.16      0.00016                0.0000                  0.000         0.000
Sum                                                   0.0569                  1.000         0.044

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 109
4.9. D. If one has Die A, then the chance of the observation is: (1/6)(1/6)(3/6)(1/6)(1/6) =
3 / 7776. This is also the chance of the observation for Die C or Die D. If one has Die B, then the
chance of the observation is: (3/6)(1/6)(1/6)(3/6)(1/6) = 9 / 7776.
Die   A Priori      Chance of     Probability Weights =   Posterior     Mean Value of a
      Probability   Observation   Col. B x Col. C         Probability   Single Roll of a Die
A     0.25          0.00387       0.00097                 0.1667        2.000
B     0.25          0.01160       0.00290                 0.5000        2.333
C     0.25          0.00387       0.00097                 0.1667        2.667
D     0.25          0.00387       0.00097                 0.1667        3.000
Sum                               0.00580                 1.000         2.444

4.10. D. Given Deck A, the chance of getting a $20 hand is (4/13)(3/12) = 3/39. Given Deck B,
the chance of getting a $20 hand is (3/12)(2/11) = 1/22.
For Deck A, the expected value of a card is (10)(4/13); the expected value of a two card hand is 80/13 = 6.15.
For Deck B, the expected value of a two card hand is (2)(10)(3/12) = 5.
Deck   A Priori      Chance of the   Prob. Weight =          Posterior Chance   Mean Value
       Probability   Observation     Product of Col. B & C   of This Deck       of a Hand
A      50%           0.07692         0.038462                62.86%             6.15
B      50%           0.04545         0.022727                37.14%             5.00
Overall                              0.061189                1.000              5.73

4.11. C.
Type of Die | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Die Roll
4-sided | 0.333 | 0.000 | 0.000 | 0.0% | 2.5
6-sided | 0.333 | 0.167 | 0.056 | 57.1% | 3.5
8-sided | 0.333 | 0.125 | 0.042 | 42.9% | 4.5
Overall | | | 0.097 | 100.0% | 3.93

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 110
4.12. Applying Bayes Theorem as shown below, there is a 1/3 chance that the car is behind the
door you originally picked.
Behind Chosen Door | A Priori | Chance of Obs. | Prob. Weight | Posterior
Car | 0.3333 | 1.0000 | 0.3333 | 0.3333
Goat | 0.6667 | 1.0000 | 0.6667 | 0.6667
Overall | | | 1.000 | 1.000

The observation is that Monty opened a door (other than the one you picked) with a goat behind it.
Prob[Monty opens a door with a goat behind it given the door you picked has a car behind it] = 1.
Prob[Monty opens a door with a goat behind it given the door you picked has a goat behind it] = 1.
Comment: In this case, it is advantageous to accept Monty's offer and switch doors.
Regardless of what is behind the door that you picked, Monty will open a door you did not pick with
a goat behind it and give you a chance to switch. There is a 100% chance you will observe Monty
opening a door with a goat behind it and give you a chance to switch doors.
4.13. Applying Bayes Theorem as shown below, there is a 1/2 chance that the car is behind the
door you originally picked.
Behind Chosen Door | A Priori | Chance of Obs. | Prob. Weight | Posterior
Car | 0.3333 | 1.0000 | 0.3333 | 0.5000
Goat | 0.6667 | 0.5000 | 0.3333 | 0.5000
Overall | | | 0.667 | 1.000

Comment: In this case, you are indifferent between accepting and refusing Monty's offer to switch doors.
Monty opens a door you did not pick at random.
If you picked the door with a car, then the door Monty opens is always a goat.
There is a 100% chance you will observe Monty giving you a chance to switch doors.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 111
4.14. Applying Bayes Theorem as shown below, there is a 60% chance that the car is behind the
door you originally picked.
Behind Chosen Door | A Priori | Chance of Obs. | Prob. Weight | Posterior
Car | 0.3333 | 1.0000 | 0.3333 | 0.6000
Goat | 0.6667 | 0.3333 | 0.2222 | 0.4000
Overall | | | 0.556 | 1.000

For example, (2/3)(1/3) = 2/9. (2/9)/(1/3 + 2/9) = 0.4.


Comment: In this case, it is advantageous to refuse Monty's offer to switch doors.
The probability that the car is behind the door you originally picked, depends on the procedure
employed by Monty Hall. Quite often this problem is imprecisely stated, without specifying
Monty's procedure, but implicitly assuming the procedure specified in the first question, rather
than some other procedure such as those in one of the other two questions!
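For readers who like to check such results by simulation, here is a minimal Python sketch of the first procedure, in which Monty always opens an unpicked door hiding a goat (illustrative only, not part of the original solutions):

import random

def monty_trial():
    # One trial: Monty always opens an unpicked door that hides a goat.
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    opened = random.choice([d for d in doors if d != pick and d != car])
    switch_pick = next(d for d in doors if d not in (pick, opened))
    return pick == car, switch_pick == car

trials = 100_000
stay = switch = 0
for _ in range(trials):
    s, w = monty_trial()
    stay += s
    switch += w
print(stay / trials, switch / trials)   # approximately 1/3 and 2/3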
4.15. A. Prob[witness says Blue and cab is Blue] = (15%)(80%) = 12%.
Prob[witness says Blue and cab is Green] = (85%)(20%) = 17%.
Prob[witness says Blue] = 12% + 17% = 29%.
Prob[Blue | witness said Blue] = 12% / 29% = 41.4%.
4.16. A. Let Urn 1 contain two red marbles and one black marble.
Let Urn 2 contain one red marble and two black marbles.
If A is Urn 1, then the chance of the first marble being red is 2/3. Then this red marble is placed in Urn
2 and the chance that the second marble is black is 2/4 (2 black marbles out of 1 + 3 = 4 marbles).
Thus in this case the chance of the observation is (2/3)(2/4) = 1/3. The black marble is placed in Urn
1, which now has 1 red and 2 black marbles. So there is a 1/3 chance the third marble is red.
If A is Urn 2, then the chance of the first marble being red is 1/3. Then this red marble is placed in
Urn 1 and the chance that the second marble is black is 1/4 (1 black marble out of 1 + 3 = 4
marbles). Thus in this case the chance of the observation is (1/3)(1/4) = 1/12. The black marble is
placed in Urn 2, which now has 3 black marbles. So there is a 0 chance the third marble is red.
Urn Which is A | A Priori | Chance of Obs. | Prob. Weight | Posterior | Chance of Red Third Marble
1 | 0.5 | 0.3333 | 0.1667 | 0.8000 | 0.3333
2 | 0.5 | 0.0833 | 0.0417 | 0.2000 | 0.0000
Overall | | | 0.2083 | 1.0000 | 0.2667

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 112
4.17. A.
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Ball from Urn
1 | 0.333 | 1.0000 | 0.3333 | 97.0% | 0.0
2 | 0.333 | 0 | 0.0000 | 0.0% | 1.0
3 | 0.333 | 0.0312 | 0.0104 | 3.0% | 0.5
Overall | | | 0.3438 | 100% | 0.0152

4.18. E. Since the observation is impossible from either Urn 1 or Urn 2, the posterior probability is
100% Urn 3. Therefore the posterior estimate is 1/2.
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Ball from Urn
1 | 0.333 | 0.0000 | 0.0000 | 0.0% | 0.0
2 | 0.333 | 0 | 0.0000 | 0.0% | 1.0
3 | 0.333 | 0.3125 | 0.1042 | 100.0% | 0.5
Overall | | | 0.1042 | 100% | 0.5000

4.19. C.
Type of Die | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Roll of Die
A | 0.3333 | 0.3333 | 0.1111 | 0.1818 | 0.6667
B | 0.3333 | 0.6667 | 0.2222 | 0.3636 | 0.3333
C | 0.3333 | 0.8333 | 0.2778 | 0.4545 | 0.1667
Overall | | | 0.611 | 1.000 | 0.318

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 113
4.20. B.
Type of Die | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Roll of Die
A | 0.3333 | 0.6667 | 0.2222 | 0.5714 | 0.6667
B | 0.3333 | 0.3333 | 0.1111 | 0.2857 | 0.3333
C | 0.3333 | 0.1667 | 0.0556 | 0.1429 | 0.1667
Overall | | | 0.389 | 1.000 | 0.500

Comment: Note that the a priori mean is (1/3)(2/3) + (1/3)(1/3) + (1/3)(1/6) = 7/18.
The chance of observing a 0 is 11/18 and the chance of observing a 1 is 7/18.
The Bayesian Estimates balance to the a priori mean: (11/18)(.318) + (7/18)(.500) = 7/18.
4.21. D. The posterior distribution is proportional to the product of the a priori chance of picking
each box and the chance of the observation.
Type of Box | A Priori | Chance of Obs. | Prob. Weight | Posterior
I | 0.667 | 0.6000 | 0.4000 | 80.00%
II | 0.333 | 0.3000 | 0.1000 | 20.00%
Overall | | | 0.500 | 100%

4.22. A.
Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Ball for this Urn
I | 0.500 | 0.2500 | 0.1250 | 60.0% | 2.500
II | 0.500 | 0.1667 | 0.0833 | 40.0% | 3.500
Overall | | | 0.2083 | 1.000 | 2.900

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 114
4.23. A.
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean for 3 Balls from Urn
A | 0.3333 | 0.0270 | 0.0090 | 0.0320 | 2.7000
B | 0.3333 | 0.4320 | 0.1440 | 0.5125 | 1.2000
C | 0.3333 | 0.3840 | 0.1280 | 0.4555 | 0.6000
Overall | | | 0.2810 | 1.000 | 0.975
For example, the chance of observing a single 1 and 2 zeros on three draws from Urn C is:
(3)(0.2)(0.8)^2 = 0.384. (The chance of a given number of ones being drawn is given by a Binomial
Distribution with n = 3 and q = 0.2 in the case of Urn C.)

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 115
4.24. E. Compute the probability weights as the product of the chance of observing a 12 given
that type of Spinner and the a priori chance of that type of Spinner. Then the new estimate is a
weighted average of the means of the spinners, using either the probability weights or the posterior
probabilities. (The posterior probabilities are just the probability weights divided by their sum. Note
that the sum of the probability weights of .278 is the a priori chance of observing a 12.)
Type of Spinner | A Priori | Chance of Observing Spin of 12 | Prob. Weight | Posterior | Mean for This Spinner
A | 0.3333 | 0.3333 | 0.1111 | 0.4000 | 20
B | 0.3333 | 0.3333 | 0.1111 | 0.4000 | 12
C | 0.3333 | 0.1667 | 0.0556 | 0.2000 | 10
Overall | | | 0.278 | 1.000 | 14.800

Comment: The observation in this question is a single spin of 12. You could just as easily have been
asked for the new estimate if the observation was a single spin of 0:
Type of Spinner | A Priori | Chance of Observing Spin of 0 | Prob. Weight | Posterior | Mean for This Spinner
A | 0.3333 | 0.3333 | 0.1111 | 0.2222 | 20
B | 0.3333 | 0.5000 | 0.1667 | 0.3333 | 12
C | 0.3333 | 0.6667 | 0.2222 | 0.4444 | 10
Overall | | | 0.500 | 1.000 | 12.889

If the observation was a single spin of 48:


Type of Spinner | A Priori | Chance of Observing Spin of 48 | Prob. Weight | Posterior | Mean for This Spinner
A | 0.3333 | 0.3333 | 0.1111 | 0.5000 | 20
B | 0.3333 | 0.1667 | 0.0556 | 0.2500 | 12
C | 0.3333 | 0.1667 | 0.0556 | 0.2500 | 10
Overall | | | 0.222 | 1.000 | 15.500

Note that the three estimates weighted by the corresponding probabilities of the three observations
equal the a priori mean: (.278)(14.8) +(.5)(12.889) + (.222)(15.5) = 14.00.
This is an example of the general result, that Bayesian Estimates are in balance.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 116
4.25. B. For example, the chance of 2 balls of each type from Urn D is given by f(2) for a Binomial
distribution with n = 4 and q = 0.8: 6(0.8^2)(0.2^2) = 0.1536.
Urn | A Priori | % of Balls Marked 1 | Chance of 2 Balls with 1 in 4 Draws | Prob. Weight | Posterior | Posterior x % Marked 1
A | 0.25 | 0.3 | 0.2646 | 0.0662 | 0.2793 | 0.0838
B | 0.25 | 0.3 | 0.2646 | 0.0662 | 0.2793 | 0.0838
C | 0.25 | 0.7 | 0.2646 | 0.0662 | 0.2793 | 0.1955
D | 0.25 | 0.8 | 0.1536 | 0.0384 | 0.1621 | 0.1297
Sum | | | | 0.2369 | 1.0000 | 0.4928

The posterior probabilities are the probability weights divided by the sum of the probability
weights. For example, 0.1621 = 0.0384/0.2368. By weighting the means by the posterior
probabilities, the posterior expected value of a single ball from the same urn is 0.4928.
Thus the expected value of 4 balls is: (4)(0.4928) = 1.971.
4.26. B. Prob[disease | positive] = Prob[positive | disease] Prob[disease] / Prob[positive] =
(0.85)(0.01)/{(0.85)(0.01) + (0.10)(0.99)} = 0.00850/.1075 = 0.0791.
4.27. D. If we observe that the sum of two balls is 2, then the balls picked were in order either:
2,0 1,1 or 0,2.
Therefore the chance of the observation if we have picked Urn B is:
(0.1)(0.7) + (0.2)(0.2) + (0.7)(0.1) = 0.18.
Similarly, the chance of the observation if we have picked Urn A is:
(0.4)(0.2) + (0.4)(0.4) + (0.2)(0.4) = 0.32.
As shown below, the posterior probability for Urn A is: 0.16 / 0.25 = 64%.
For example, the mean for Urn B is: (0)(0.7) + (1)(0.2) + (2)(0.1) = 0.4
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean of This Type of Urn
A | 0.500 | 0.32 | 0.16 | 64.0% | 1.2
B | 0.500 | 0.18 | 0.09 | 36.0% | 0.4
Overall | | | 0.25 | 1.000 | 0.912
The posterior estimate for a single ball drawn from the same urn is 0.912.
Thus the posterior estimate for two balls is: (2)(0.912) = 1.824.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 117
4.28. C. The Bayesian Estimates are always in balance; the sum of the product of the a priori
chance of each outcome times its posterior Bayesian estimate is equal to the a priori mean. The a
priori mean is (1/3)(1) + (1/3)(8) + (1/3)(12) = 7. Let y = E[X2 | X1 = 12]. Then setting the sum of
the chance of each outcome times its posterior mean equal to the a priori mean:
(1/3)(2.6) + (1/3)(7.8) + (1/3)(y) = 7. Therefore y = 10.6.
Comment: The given Buhlmann Credibility estimates are on the line 0.6T + 2.8. (They average to
7, the a priori mean.) They should be the least squares linear approximation to the Bayesian
estimates. One should be able to solve for the missing Bayesian estimate y, since the credibility
must equal the slope of the weighted least squares line, which is given (incorrectly) as 0.6. This
slope is: {Σ wi Xi Yi - (Σ wi Yi)(Σ wi Xi)} / {Σ wi Xi^2 - (Σ wi Xi)^2}, where the wi are the a priori
probabilities, which in this case are all 1/3, the Xi are the possible outcomes, and the Yi are the
Bayesian estimates. Using the missing value of 10.6, derived in the solution above, one gets a
slope of: {64.07 - (7)(7)} / {69.67 - 7^2} = 0.73. Thus the given Buhlmann Credibility estimates are
in fact inconsistent with the other given information. In fact they should be along the line 1.89 + 0.73T.
(The values along that line are 2.62, 7.73, 10.65, and fall very close to the Bayesian estimates.)
Thus if one uses the given incorrect Buhlmann Credibility estimates in an attempt to solve this
problem, one will end up with the wrong answer. If instead one just ignores these Buhlmann
Credibility estimates, one should obtain the right answer.
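A minimal Python sketch of the slope computation in this comment (illustrative only; it uses the reconstructed Bayesian estimate of 10.6, and the variable names are mine):

# Slope of the weighted least squares line fit to the Bayesian estimates,
# which should equal the Buhlmann credibility Z.
w = [1/3, 1/3, 1/3]            # a priori probabilities of the outcomes
X = [1, 8, 12]                 # possible outcomes
Y = [2.6, 7.8, 10.6]           # Bayesian estimates (10.6 derived above)

EX  = sum(wi * xi for wi, xi in zip(w, X))
EY  = sum(wi * yi for wi, yi in zip(w, Y))
EXY = sum(wi * xi * yi for wi, xi, yi in zip(w, X, Y))
EX2 = sum(wi * xi * xi for wi, xi in zip(w, X))

slope = (EXY - EX * EY) / (EX2 - EX**2)
print(slope, EY - slope * EX)   # approximately 0.729 and 1.90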
4.29. A. Since the urns have the same a priori probability, the chance of having picked each urn is
proportional to the chance of observing one ball of each type.
Thus the New Estimate = {(0.16)(1.6) + (0.21)(.6)} / (0.16 + 0.21) = 1.032.
In more detail, for Urn 1, the chance of picking one ball marked 0 and one marked 1 is: 2(0.2)(0.8) = 0.32,
as given by the Binomial distribution. One then gets probability weights as the product of the a
priori probability and the chance of observation. For example, for Urn B, (0.5)(0.42) = 0.21.
One converts these to the posterior probabilities by dividing by the sum of the weights.
Then one weights together the means for two balls drawn from each urn, using the posterior
probabilities (or probability weights): (.4324)(1.6) + (.5676)(.6) = 1.032.
Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean (2 balls)
A | 0.5 | 0.3200 | 0.1600 | 0.4324 | 1.6
B | 0.5 | 0.4200 | 0.2100 | 0.5676 | 0.6
Sum | | | 0.3700 | 1.0000 | 1.032

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 118
4.30. E.
Urn (Type of Risk) | A Priori | Chance of Picking 2 | Chance of Picking 3 | Chance of Obs. = C x D | Prob. Weight | Posterior | Mean
1 | 0.1 | 0.2 | 0.2 | 0.04 | 0.004 | 0.2 | 3
2 | 0.1 | 0.2 | 0.2 | 0.04 | 0.004 | 0.2 | 3
3 | 0.1 | 0.2 | 0.2 | 0.04 | 0.004 | 0.2 | 3
4 | 0.1 | 0.2 | 0.2 | 0.04 | 0.004 | 0.2 | 3
5 | 0.1 | 0.2 | 0.2 | 0.04 | 0.004 | 0.2 | 3
6 | 0.1 | 0 | 0 | 0 | 0 | 0 | 1
7 | 0.1 | 1 | 0 | 0 | 0 | 0 | 2
8 | 0.1 | 0 | 1 | 0 | 0 | 0 | 3
9 | 0.1 | 0 | 0 | 0 | 0 | 0 | 4
10 | 0.1 | 0 | 0 | 0 | 0 | 0 | 5
Overall | | | | | 0.02 | 1.000 | 3

4.31. A. The posterior Bayesian estimate is:


(0.3)(0.3n) / (7.2 - 0.3n) + (0.6)(7.2 - 0.6n) / (7.2 - 0.3n).
We set this equal to 0.54 as given: (0.54)(7.2 - 0.3n) = 0.09n + 4.32 - 0.36n. n = 4.
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean of this Type of Urn
1 | n/12 | 0.3 | 0.3n/12 | 0.3n/(7.2 - 0.3n) | 0.3
2 | (12 - n)/12 | 0.6 | 0.6(12 - n)/12 | (7.2 - 0.6n)/(7.2 - 0.3n) | 0.6

Comment: Given the output, we must solve for the missing input. One can verify that if
n = 4, then the usual Bayesian Analysis would produce an estimate of 0.54.
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean of this Type of Urn
1 | 0.3333 | 0.30 | 0.10 | 0.20 | 0.30
2 | 0.6667 | 0.60 | 0.40 | 0.80 | 0.60
Overall | | | 0.50 | 1.00 | 0.54

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 119
4.32. B. Let Urn 1 contain two red marbles and one black marble.
Let Urn 2 contain one red marble and two black marbles.
If A is Urn 1, then the chance of the first marble being red is 2/3. Then this red marble is placed in Urn
2 and the chance that the second marble is red is 2/4. (1 + 1 = 2 red marbles out of 1 + 3 = 4
marbles.) Thus in this case the chance of the observation is (2/3)(2/4) = 1/3.
If A is Urn 2, then the chance of the first marble being red is 1/3. Then this red marble is placed in
Urn 1 and the chance that the second marble is red is 3/4. (1+2 = 3 red marbles out of 1+3 = 4
marbles.) Thus in this case the chance of the observation is (1/3)(3/4) = 1/4.
Note that regardless of whether A is Urn 1 or 2, the first step removes a red marble from Urn A
while the second step returns a red marble to Urn A. Thus prior to the third step, Urn A, as well as
Urn B, are in their original configurations.
Thus the chance that the third selected marble is red if A is Urn 1 is 2/3 and if A is Urn 2 is 1/3.
The posterior chances of A being 1 and 2 are 4/7 and 3/7, resulting in a chance of the third marble
being red of: (4/7)(2/3) + (3/7)(1/3) = 11/21.
Urn Which is A | A Priori | Chance of Obs. | Prob. Weight | Posterior | Chance of Red Third Marble
1 | 0.5 | 0.3333 | 0.1667 | 0.5714 | 0.6667
2 | 0.5 | 0.2500 | 0.1250 | 0.4286 | 0.3333
Overall | | | 0.2917 | 1.0000 | 0.5238

Comment: Too hard and too long for 1 point.


4.33. D. If the first 2n marbles are red, each cycle of 2 draws returns us to the starting configuration
of marbles in the urns. Thus we have n independent repetitions of the experiment in the previous
question. Thus the chances of the observations are (1/3)^n and (1/4)^n, if A is Urn 1 and Urn 2
respectively. Since the two situations are a priori equally probable, the posterior probabilities are
proportional to (1/3)^n and (1/4)^n. Thus the posterior probabilities are: (1/3)^n / {(1/3)^n + (1/4)^n}
and (1/4)^n / {(1/3)^n + (1/4)^n}. As n approaches infinity, the posterior probabilities approach 1 and 0.
Thus the estimate that the (2n+1)st marble is red approaches: (1)(2/3) + (0)(1/3) = 2/3.
Comment: Uses the intermediate results of the prior solution.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 120
4.34. B. Label the urn with two dice with faces 1, 2, 3, 4, 5, and 6, as Urn A.
Label the urn with one die with faces 1, 2, 3, 4, 5, and 6, and one die with faces 1, 1, 1, 2, 2, and 2
as Urn B.
Given Urn A, the average result of rolling two dice is 3.5 + 3.5 = 7.
Given Urn A, the chance of observing a sum of 3 is 2/36 = 1/18.
Given Urn B, the average result of rolling two dice is 3.5 + 1.5 = 5.
Given Urn B, the chance of observing a sum of 3 is 6/36 = 1/6.
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Outcome for Urn
A | 0.5000 | 0.0556 | 0.0278 | 0.2500 | 7.0000
B | 0.5000 | 0.1667 | 0.0833 | 0.7500 | 5.0000
Overall | | | 0.111 | 1.000 | 5.500

4.35. C. Since the a priori probabilities are equal, the posterior probabilities are proportional to the
densities for the observation. Let y = d - 10. The densities for Types A, B, and C are proportional to:
exp(-y^2/2), exp(-y^2/8)/2, and exp(-y^2/32)/4.
Therefore, since the posterior probability that it is Type B is largest, we have:
exp(-y^2/8)/2 > exp(-y^2/2), and exp(-y^2/8)/2 > exp(-y^2/32)/4.
Taking logs: -y^2/8 - ln(2) > -y^2/2, and -y^2/8 - ln(2) > -y^2/32 - ln(4).
(3/8)y^2 > ln(2), and ln(2) > (3/32)y^2.
|y| > sqrt(8 ln(2)/3) = 2 sqrt(2 ln(2)/3), and |y| < sqrt(32 ln(2)/3) = 4 sqrt(2 ln(2)/3).
Since y = d - 10: 2 sqrt(2 ln(2)/3) < |d - 10| < 4 sqrt(2 ln(2)/3).

Comment: If d is very close to 10, then it is most likely to have come from Type A, which has the
smallest variance around 10. If d is very far from 10, then it is most likely to have come from Type C.
Therefore, the only way Type B could have the largest posterior probability of the three types is if
d is neither too far from nor too close to 10. Of the five choices, only choice C is of this form.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 121
4.36. D. Given the red urn, the chance of the observation is: (90%)(10%) = 9%.
Given the blue urn, the chance of the observation is: (60%)(90%) = 54%.
The mean number from the red urn is: (.9)(1) + (.1)(2) = 1.1.
The mean number from the blue urn is: (.1)(1) + (.9)(2) = 1.9.
Type of Urn | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Outcome for Urn
Red | 0.5000 | 0.0900 | 0.0450 | 0.1429 | 1.1000
Blue | 0.5000 | 0.5400 | 0.2700 | 0.8571 | 1.9000
Overall | | | 0.315 | 1.000 | 1.786

4.37. C. Prob[smoker | died] = Prob[died | smoker] Prob[smoker]/Prob[died] =


(.05)(.1)/{(.05)(.1) + (.01)(.9)} = .005/.014 = 0.357.
4.38. C. Prob[high | 1 claim] = Prob[1 claim | high] Prob[high] / Prob[1 claim] =
(0.6e^-0.6)(0.1) / {(0.6e^-0.6)(0.1) + (0.1e^-0.1)(0.9)} = 0.0329/0.1144 = 0.288.
Prob[low | 1 claim] = Prob[1 claim | low] Prob[low] / Prob[1 claim] =
(0.1e^-0.1)(0.9) / {(0.6e^-0.6)(0.1) + (0.1e^-0.1)(0.9)} = 0.0814/0.1144 = 0.712.
Expected number of claims: (0.288)(0.6) + (0.712)(0.1) = 0.244.
4.39. D. Prob[1997 | accident] = Prob[accident | 1997] Prob[1997]/Prob[accident] =
(.05)(.16)/{(.05)(.16) + (.02)(.18) + (.03)(.20)} = .008/.0176 = 0.4545.
Comment: Since the automobile was from one of the model years 1997, 1998, and 1999, we do
not use the information on other model years.
4.40. B. Prob[disease | positive] = Prob[positive | disease] Prob[disease] / Prob[positive] =
(.95)(.01)/{(.95)(.01) + (.005)(.99)} = .0095/.01445 = 0.657.
4.41. D. Prob[young | collision] = Prob[collision | young] Prob[young] / Prob[collision] =
(.08)(16%)/{(.15)(8%) + (.08)(16%) + (.04)(45%) + (.05)(31%)} = .0128/.0583 = 0.220.
4.42. C. Let Prob[smoker | no problem] = p. Then Prob[smoker | problem] = 2p.
Prob[problem | smoker] = Prob[smoker | problem] Prob[problem] / Prob[smoker] =
(2p)(.25)/{(2p)(.25) + (p)(.75)} = .5/1.25 = 0.4.
4.43. D. Prob[ultra | die] = Prob[die | ultra] Prob[ultra] / Prob[die] =
(.001)(10%)/{(.001)(10%) + (.005)(40%) + (.010)(50%)} = .0001/.0071 = 0.0141.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 122
4.44. A. Prob[observation | X] = 30(0.1)(0.9^29). Prob[observation | not X] = 30(0.02)(0.98^29).
Prob[X | obs.] = Prob[obs. | X] Prob[X] / Prob[obs.] =
30(0.1)(0.9^29)(1/5) / {30(0.1)(0.9^29)(1/5) + 30(0.02)(0.98^29)(4/5)} =
(0.1)(0.9^29) / {(0.1)(0.9^29) + (4)(0.02)(0.98^29)} = 1 / {1 + 0.8(0.98/0.90)^29} = 0.096.
4.45. C. If q is the chance of a head, then the probability of the observation of head, head, tail,
head, in that order, is: q^3(1 - q).
Type of Coin | A Priori | Chance of Obs. | Prob. Weight | Posterior | Prob. of a Head
I | 66.67% | 0.0625 | 0.0417 | 68.09% | 0.500
II | 16.67% | 0.0117 | 0.0020 | 3.19% | 0.250
III | 16.67% | 0.1055 | 0.0176 | 28.72% | 0.750
Overall | | | 0.0612 | 100.00% | 0.564

Comment: Since this is a Bernoulli, with 1 corresponding to a head and 0 corresponding to a tail,
Prob[Head on 5th flip] = Prob[X5 = 1] = E[X5 ].

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 123
4.46. D. One uses the posterior distribution from the solution to the exam question.
Then the chance of two heads on the next two coin flips is q^2.
Type of Coin | A Priori | Chance of Obs. | Prob. Weight | Posterior | Prob. of Two Heads
I | 66.67% | 0.0625 | 0.0417 | 68.09% | 0.2500
II | 16.67% | 0.0117 | 0.0020 | 3.19% | 0.0625
III | 16.67% | 0.1055 | 0.0176 | 28.72% | 0.5625
Overall | | | 0.0612 | 100.00% | 0.334

Alternately, from the solution to the exam question, the chance of a head on coin flip number five is
0.564. Also the distribution posterior to the fourth coin flip is: 68.09%, 3.19%, and 28.72%.
Use this as the prior distribution to the fifth coin flip, and then get the distribution posterior to the fifth
coin flip, assuming a head on the fifth coin flip.
Type of Coin | Probability Prior to the 5th Flip | Chance of Head on the 5th Flip | Prob. Weight | Posterior | Chance of Head on the 6th Flip
I | 68.09% | 0.500 | 0.3404 | 60.38% | 0.500
II | 3.19% | 0.250 | 0.0080 | 1.41% | 0.250
III | 28.72% | 0.750 | 0.2154 | 38.20% | 0.750
Overall | | | 0.5638 | 100.00% | 0.592

Prob[head on 5th and head on 6th] = Prob[head on 5th] Prob[head on 6th | head on 5th] =
(0.564)(0.592) = 0.334.
Comment: The correct answer is not: 0.564^2 = 0.318.
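A minimal Python sketch of this calculation (illustrative only), showing that the probability of two further heads is the posterior-weighted average of q^2, not the square of 0.564:

priors = [2/3, 1/6, 1/6]     # coins I, II, III
q      = [0.5, 0.25, 0.75]   # chance of heads for each coin

weights = [p * qi**3 * (1 - qi) for p, qi in zip(priors, q)]   # likelihood of H, H, T, H
total = sum(weights)
posterior = [w / total for w in weights]

p_head_5th  = sum(pi * qi for pi, qi in zip(posterior, q))       # about 0.564
p_two_heads = sum(pi * qi**2 for pi, qi in zip(posterior, q))    # about 0.334
print(p_head_5th, p_two_heads, p_head_5th**2)                    # 0.334 is not 0.318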

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 124
4.47. C. If q is the chance of a head, then the probability of the observation of 3 heads in 4 trials is
the density at 3 of a Binomial with m = 4: 4q^3(1 - q).
Type of Coin | A Priori | Chance of Obs. | Prob. Weight | Posterior | Prob. of a Head
I | 66.67% | 0.2500 | 0.1667 | 68.09% | 0.500
II | 16.67% | 0.0469 | 0.0078 | 3.19% | 0.250
III | 16.67% | 0.4219 | 0.0703 | 28.72% | 0.750
Overall | | | 0.2448 | 100.00% | 0.564

Comment: The observation in this question did not specify the order in which the heads and tails
occurred. Therefore, each of the chances of observation were 4 times those in the exam question.
However, the factor of 4 was common to each row, so the posterior distribution was the same as in
the previous question. Therefore, the solution was the same as in the exam question.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 125
4.48. B. One uses the posterior distribution from the solution to the exam question.
Then the chance of tail, head, tail, on the next three coin flips is (1 - q)q(1 - q).
Type of Coin | A Priori | Chance of Obs. | Prob. Weight | Posterior | Prob. of Tail, Head, Tail
I | 66.67% | 0.0625 | 0.0417 | 68.09% | 0.1250
II | 16.67% | 0.0117 | 0.0020 | 3.19% | 0.1406
III | 16.67% | 0.1055 | 0.0176 | 28.72% | 0.0469
Overall | | | 0.0612 | 100.00% | 0.103

Alternately, from the solution to the exam question, the chance of a head on coin flip number five is
.564. Also the distribution posterior to the fourth coin flip is: 68.09%, 3.19%, and 28.72%.
Use this as the prior distribution to the fifth coin flip, and then get the distribution posterior to the fifth
coin flip, assuming a tail on the fifth coin flip.
Type of Coin | Probability Prior to the 5th Flip | Chance of Tail on the 5th Flip | Prob. Weight | Posterior | Chance of Head on the 6th Flip
I | 68.09% | 0.500 | 0.3404 | 78.05% | 0.500
II | 3.19% | 0.750 | 0.0239 | 5.49% | 0.250
III | 28.72% | 0.250 | 0.0718 | 16.46% | 0.750
Overall | | | 0.4362 | 100.00% | 0.527

Use this as the prior distribution to the sixth coin flip, and then get the distribution posterior to the
sixth coin flip, assuming a head on the sixth coin flip.
Type of Coin | Probability Prior to the 6th Flip | Chance of Head on the 6th Flip | Prob. Weight | Posterior | Chance of Head on the 7th Flip
I | 78.05% | 0.500 | 0.3902 | 73.99% | 0.500
II | 5.49% | 0.250 | 0.0137 | 2.60% | 0.250
III | 16.46% | 0.750 | 0.1235 | 23.41% | 0.750
Overall | | | 0.5274 | 100.00% | 0.552

Prob[tail on 5th, head on 6th, and tail on 7th] =


Prob[tail on 5th] Prob[head on 6th | tail on 5th] Prob[tail on 7th | tail on 5th and head on 6th] =
(1 - 0.564)(0.527)(1 - 0.552) = 0.103.
Comment: The correct answer is not: (1 - .564)(.564)(1 - .564) = 0.107.
4.49. B. Prob[young | accident] = Prob[accident | young] Prob[young] / Prob[accident] =
(.06)(8%)/{(.06)(8%) + (.03)(15%) + (.02)(49%) + (.04)(28%)} = .0048/.0303 = 0.158.

2013-4-9 Buhlmann Credibility 4 Bayes Analysis Introduction, HCM 10/19/12, Page 126
4.50. D. Let q = Prob[die | nonsmoker].
Then Prob[die | light smoker] = 2q and Prob[die | heavy smoker] = 4q.
Prob[heavy smoker | die] = Prob[die | heavy smoker] Prob[heavy smoker] / Prob[die] =
(4q)(0.2) / {(4q)(0.2) + (2q)(0.3) + (q)(0.5)} = 0.8/1.9 = 0.421.
4.51. E. Prob[G = 1/3 | D = 0] = Prob[G = 1/3] Prob[D = 0 | G = 1/3] / Prob[D = 0]
= (2/5)(1/3)/{(3/5)(1/5) + (2/5)(1/3)} = (2/15)/(3/25 + 2/15) = 10/19.
Comment: The distribution of D is Bernoulli with mean 1 - g. An application of Bayes Theorem.
For whatever reason, the way the question was worded made this simple question harder for me.
A mathematically equivalent statement of the question:
Frequency is Bernoulli. The prior distribution of q is: Prob[q = 4/5] = 60% and Prob[q = 2/3] = 40%.
If no claims are observed, what is the posterior probability that q is 2/3?

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 127

Section 5, Bayesian Analysis, with Discrete Types of Risks


In this section, Bayesian Analysis will be applied to situations involving frequency, severity, pure
premiums, or aggregate losses, when there are discrete types of risks. The case of continuous
types of risks will be covered in the next section.

Bayesian Analysis, Frequency Example:


One can use Bayesian Analysis to predict the future claim frequency. For example, assume the
following information:

Type | Portion of Risks in this Type | Bernoulli (Annual) Frequency Distribution
1 | 50% | q = 40%
2 | 30% | q = 70%
3 | 20% | q = 80%

We assume that the types are homogeneous; i.e., every risk of a given type has the same
frequency process.
Assume in addition that a risk is picked at random and that we do not know what type it is.27
If for this randomly selected risk during 4 years one observes 3 claims, then one can use Bayesian
Analysis to predict the future frequency.
For the three types, the mean (annual) frequencies are: 0.4, 0.7, and 0.8.
Therefore the a priori mean frequency is: (50%)(0.4) + (30%)(0.7) + (20%)(0.8) = 0.57.
If the frequency is Bernoulli each year, assuming the years are independent, then the frequency is
Binomial over 4 years, with parameters q and m = 4.
Therefore, a risk of type 1 is Binomial with q = 40% and m = 4.
Thus the chance of observing 3 claims in four years for a risk of type 1 is:
{4! / (3! 1!)} (0.4)^3 (0.6)^1 = 0.1536.
Similarly, the chance of observing 3 claims in four years for a risk of type 2 is:
{4! / (3! 1!)} (0.7)^3 (0.3)^1 = 0.4116.
The chance of observing 3 claims in four years for a risk of type 3 is:
{4! / (3! 1!)} (0.8)^3 (0.2)^1 = 0.4096.
27

The latter is very important. If one knew which type the risk was, one would use the expected value for that type in
order to estimate the future frequency.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 128
Then by Bayes Theorem after observing 3 claims in 4 years, the posterior chance of this individual
being of each type is proportional to the product of the a priori chance of being of that type and the
chance of the observation if the individual were of that type. So for example, the chance of the
individual being from type 1 is proportional to: (50%)(0.1536) = 0.0768.
The probability weights for the other two types are: (30%)(0.4116) = 0.12348,
and (20%)(0.4096) = 0.08192.
One can convert these probability weights to probabilities by dividing by their sum.28
Thus, 0.0768 / 0.2822 = 27.21%, 0.12348 / 0.2822 = 43.76%, and 0.08192 / 0.2822 = 29.03%.
Finally one can weight together the mean frequencies for each type, using these posterior
probabilities.29 (27.21%)(0.4) + (43.76%)(0.7) + (29.03%)(0.8) = 0.6474. Thus using Bayesian
Analysis the estimated future annual frequency for this individual is 0.6474.30
This whole calculation is organized in a spreadsheet as follows:
Type | A Priori Probability | Chance of the Observation | Prob. Weight | Posterior | Mean Annual Freq.
1 | 50% | 0.15360 | 0.076800 | 27.215% | 0.4
2 | 30% | 0.41160 | 0.123480 | 43.756% | 0.7
3 | 20% | 0.40960 | 0.081920 | 29.029% | 0.8
Overall | | | 0.282200 | 100.000% | 0.6474
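The same calculation as a minimal Python sketch (illustrative only; the variable names are mine):

from math import comb

priors = [0.50, 0.30, 0.20]       # portion of risks in types 1, 2, 3
qs     = [0.40, 0.70, 0.80]       # Bernoulli annual frequencies

like    = [comb(4, 3) * q**3 * (1 - q) for q in qs]          # Binomial(m = 4) density at 3
weights = [p * L for p, L in zip(priors, like)]
posterior = [w / sum(weights) for w in weights]
estimate = sum(pi * q for pi, q in zip(posterior, qs))

print(like)        # 0.1536, 0.4116, 0.4096
print(posterior)   # approximately 27.2%, 43.8%, 29.0%
print(estimate)    # approximately 0.6474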

Sequential Approach:
In the above example, let us assume that for a given insured, one observes 3 claims over the first
four years, and then no claim in the fifth year. Then there are two equivalent ways to perform Bayes
Analysis.
One could start from time zero and go through year 5 in one step.
The chance of the observation given q is: {4q^3(1 - q)}(1 - q) = 4q^3(1 - q)^2.

28

The sum of the probability weights is the a priori chance of the observations. In this case there was an a priori
chance of observing 3 claims in 4 years of 28.22%.
29
One would get the same answer whether one used the posterior probabilities or the posterior probability weights
to take the weighted average.
30
As shown in a subsequent section, in this case one would get a different estimate if one used Buhlmann
Credibility.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 129
Type | A Priori Probability | Chance of the Observation | Prob. Weight | Posterior | Mean Annual Freq.
1 | 50% | 0.09216 | 0.046080 | 46.31% | 0.4
2 | 30% | 0.12348 | 0.037044 | 37.23% | 0.7
3 | 20% | 0.08192 | 0.016384 | 16.47% | 0.8
Overall | | | 0.099508 | 1.000 | 0.5775

Alternately, one could use the distribution posterior to 4 years as the distribution prior to year 5.
The chance of the observation in year 5 given q is: (1 - q).
Type | Probability Posterior to Year 4 | Year 5 Chance of the Observation | Prob. Weight | Posterior | Mean Annual Freq.
1 | 27.215% | 0.6 | 0.163290 | 46.31% | 0.4
2 | 43.756% | 0.3 | 0.131268 | 37.23% | 0.7
3 | 29.029% | 0.2 | 0.058058 | 16.46% | 0.8
Overall | | | 0.352616 | 1.000 | 0.5775

This results in the same estimate as previously, using one big step rather than two smaller steps as
here. Such a sequential approach works for Bayesian Analysis in general.
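A minimal Python sketch (illustrative only) confirming that the one-step and two-step (sequential) updates give the same posterior estimate:

from math import comb

priors = [0.50, 0.30, 0.20]
qs     = [0.40, 0.70, 0.80]

def update(prior, likelihoods):
    w = [p * L for p, L in zip(prior, likelihoods)]
    s = sum(w)
    return [x / s for x in w]

# One step: 3 claims in years 1-4 followed by 0 claims in year 5.
one_step = update(priors, [comb(4, 3) * q**3 * (1 - q)**2 for q in qs])

# Two steps: the 4-year observation first, then the claim-free 5th year.
after4   = update(priors, [comb(4, 3) * q**3 * (1 - q) for q in qs])
two_step = update(after4, [1 - q for q in qs])

est = lambda post: sum(p * q for p, q in zip(post, qs))
print(est(one_step), est(two_step))   # both approximately 0.5775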
Bayesian Analysis, First Severity Example:
One can use Bayesian Analysis to predict the future claim severity.
For example, assume the following information:
Type | Portion of Risks in this Type | Gamma Severity Distribution
1 | 50% | α = 4, θ = 100
2 | 30% | α = 3, θ = 100
3 | 20% | α = 2, θ = 100

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 130
We assume that the types are homogeneous; i.e., every risk of a type has the same severity
process. Assume in addition that a risk is picked at random and we do not know what type it is.31
Finally we assume that we are unconcerned with frequency.32
If for this randomly selected risk one observes 3 claims for a total of $450, then one can use
Bayesian Analysis to predict the future severity of this risk.
The sum of 3 independent claims drawn from a single Gamma Distribution is another Gamma
Distribution, but with parameters 3α and θ. Thus if the risk is of type 1, the distribution of the sum of
3 claims is Gamma with α = 12 and θ = 100. For type 2 the sum has parameters α = 9 and θ = 100.
For type 3 the sum has parameters α = 6 and θ = 100.
Thus assuming one has 3 claims, the chance that they will add to $450 if the risk is from
type 1 is the density at 450 of a Gamma Distribution with α = 12 and θ = 100:
x^(α-1) e^(-x/θ) / {θ^α Γ(α)} = (0.01)^12 450^11 e^(-(0.01)(450)) / Γ(12) = 0.0000426439.
Similarly, the chance for type 2 is: (0.01)^9 450^8 e^(-4.5) / Γ(9) = 0.000463292.
The chance for a risk from type 3 is: (0.01)^6 450^5 e^(-4.5) / Γ(6) = 0.00170827.
Then by Bayes Theorem, the posterior chance of this individual being from each type is
proportional to the product of the a priori chance of being of that type and the chance of the
observation if the individual were of that type. So for example, the chance of the individual being of
type 1 is proportional to: (50%)(0.0000426439) = 0.0000213. The probability weights for the
other two types are: (30%)(0.000463292) = 0.0001390, and (20%)(0.00170827) = 0.0003417.
One can convert these probability weights to probabilities by dividing by their sum.
For example, 0.0003417 / 0.0005020 = 68.06%.
Since the mean of the Gamma Distribution is αθ, for the three types the mean severities are:
400, 300 and 200.
Finally one can weight together the mean severities for each type, using these posterior
probabilities: (4.25%)(400) + (27.69%)(300) + (68.06%)(200) = 236.33
Thus using Bayesian Analysis the estimated future annual severity for this individual is 236.

31

The latter is very important. If one knew which type the risk was, one would use the expected value for that type in
order to estimate the future severity.
32
Either all the risk types have the same frequency process, or we are given no time period within which the number
of claims is observed.
33
One would get the same answer whether one used the posterior probabilities or the posterior probability weights
to take the weighted average.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 131
This whole calculation is organized in a spreadsheet as follows:
Type | A Priori Probability | Chance of the Observation | Prob. Weight | Posterior | Mean Annual Severity
1 | 50% | 0.0000426 | 0.0000213 | 4.25% | 400
2 | 30% | 0.0004633 | 0.0001390 | 27.69% | 300
3 | 20% | 0.0017083 | 0.0003417 | 68.06% | 200
Overall | | | 0.0005020 | 100.00% | 236
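A minimal Python sketch of this severity example (illustrative only; it evaluates the Gamma density for the sum of the three claims directly, and the function names are mine):

from math import gamma, exp

def gamma_pdf(x, alpha, theta):
    # Density of a Gamma distribution with shape alpha and scale theta.
    return x**(alpha - 1) * exp(-x / theta) / (theta**alpha * gamma(alpha))

priors = [0.50, 0.30, 0.20]
alphas = [4, 3, 2]                    # per-claim Gamma shapes; theta = 100 for all types
means  = [400, 300, 200]              # alpha * theta

weights = [p * gamma_pdf(450, 3 * a, 100) for p, a in zip(priors, alphas)]
posterior = [w / sum(weights) for w in weights]
print(posterior)                                      # approximately 4.25%, 27.69%, 68.06%
print(sum(p * m for p, m in zip(posterior, means)))   # approximately 236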

Bayesian Analysis, Second Severity Example:


This example is similar to the previous example, except we observe the individual claim sizes rather
than just the total. As before, one can use Bayesian Analysis to predict the future claim severity.
Assume as previously:

Type | Portion of Risks in this Type | Gamma Severity Distribution
1 | 50% | α = 4, θ = 100
2 | 30% | α = 3, θ = 100
3 | 20% | α = 2, θ = 100

We assume that the types are homogeneous; i.e., every risk within a type has the same severity
process. Assume in addition that a risk is picked at random and we do not know what type it is.34
Finally we assume that we are unconcerned with frequency.35
Since the mean of the Gamma Distribution is αθ, for the three types the mean severities are:
400, 300, and 200.
If for this randomly selected risk one observes 3 claims of sizes: $50, $100 and $300 (for a total of
$450), then one can use Bayesian Analysis to predict the future severity of this risk.

34

The latter is very important. If one knew which type the insured was, one would use the expected value for that
type in order to estimate the future severity.
35
Either all the risk types have the same frequency process, or we are given no time period within which the number
of claims is observed.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 132
If one assumes that the claims occurred in the order 50, 100 and 300, then the likelihood of the three
observed claims is the product of the individual likelihoods: f(50)f(100)f(300). If one assumes the
claims could have occurred in any order, one gets an extra factor of 3! = 6.
Since no order was specified, I will include the factor of 6 (see footnote 36).
6 f(50) f(100) f(300) =
6 {50^(α-1) e^(-50/θ) / (θ^α Γ(α))} {100^(α-1) e^(-100/θ) / (θ^α Γ(α))} {300^(α-1) e^(-300/θ) / (θ^α Γ(α))} =
6 θ^(-3α) 1,500,000^(α-1) e^(-450/θ) / Γ(α)^3.
So if the risk is of type 1 with α = 4 and θ = 100, the likelihood is:
(6)(0.01)^12 (1,500,000)^3 e^(-4.5) / Γ(4)^3 = 1.0415 x 10^-9.
Similarly, the chance for type 2 is: (6)(0.01)^9 (1,500,000)^2 e^(-4.5) / Γ(3)^3 = 1.8746 x 10^-8.
The chance for a risk of type 3 is: (6)(0.01)^6 (1,500,000)^1 e^(-4.5) / Γ(2)^3 = 9.9981 x 10^-8.
Then by Bayes Theorem, the posterior chance of this individual being of each type is proportional
to the product of the a priori chance of being of that type and the chance of the observation if the
individual were of that type. So for example, the chance of the individual being from type 1 is
proportional to: (50%)(1.0415 x 10^-9) = 5.2075 x 10^-10.
The probability weights for the other two types are:
(30%)(1.8746 x 10^-8) = 5.6238 x 10^-9, and (20%)(9.9981 x 10^-8) = 1.9996 x 10^-8.
One can convert these probability weights to probabilities by dividing by their sum.
For example, 5.2075 x 10^-10 / 2.6141 x 10^-8 = 1.99%.
Finally one can weight together the mean severities for each type, using these posterior
probabilities:37 (1.99%)(400) + (21.51%)(300) + (76.49%)(200) = 225.
Thus using Bayesian Analysis, the estimated future annual severity for this individual is 225.

36

Since this extra factor would appear on each row, it would drop out when one calculated posterior probabilities.
Therefore, some prefer to leave it and similar factors out, when it would not make a difference such as in this case.
37
One would get the same answer whether one used the posterior probabilities or the probability weights to take the
weighted average.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 133
Note that this differs from the estimate of 236 we got previously using the sum of claims rather than the
individual claim values. On the exam, if you are given the individual claim sizes use them in this
situation, unless specifically told otherwise.38
This whole calculation is organized in a spreadsheet as follows:
Type | A Priori Probability | Chance of the Observation | Prob. Weight | Posterior | Mean Annual Severity
1 | 50% | 1.04 x 10^-9 | 5.2 x 10^-10 | 1.99% | 400
2 | 30% | 1.875 x 10^-8 | 5.62 x 10^-9 | 21.51% | 300
3 | 20% | 9.998 x 10^-8 | 2.00 x 10^-8 | 76.49% | 200
Overall | | | 2.614 x 10^-8 | 100.00% | 225

Bayesian Analysis, Third Severity Example:


This example is similar to the previous example, except we consider the frequency processes.
As before, one can use Bayesian Analysis to predict the future claim severity.
Assume the following information:
Type | Portion of Risks in this Type | Bernoulli Frequency Dist. | Gamma Severity Dist.
1 | 50% | q = 40% | α = 4, θ = 100
2 | 30% | q = 70% | α = 3, θ = 100
3 | 20% | q = 80% | α = 2, θ = 100

We assume that the types are homogeneous; i.e., every risk within a type has the same frequency
and severity processes. Assume in addition that a risk is picked at random and we do not know what
type it is. If for this randomly selected risk one observes 3 claims in 4 years, of sizes: $50, $100 and
$300, in any order, then one can use Bayesian Analysis to predict the future severity of this risk.
Combining the computations of the previous severity example and an earlier frequency example,
one can determine the probability of observing 3 claims in 4 years and the probability given one
has observed three claims that they are of sizes 50, 100 and 300.
Then P(Observation | Risk Type) =
P(3 claims in 4 years | Risk Type) P(Severities = 50, 100, 300 | Risk Type & 3 claims).

38

Note the similar Buhlmann Credibility situation, to be discussed subsequently, only uses the sum of claims, never
their separate values.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 134
For example, for risk type 2, the chance of observing 3 claims in 4 years was computed in the
frequency example as 0.4116. For risk type 2, the likelihood given one has observed three claims
that they were of sizes 50, 100 and 300, in any order, was computed in the previous severity
example as 1.8746 x 10^-8. Thus given risk type 2, the chance of the current observation is their
product: (0.4116)(1.8746 x 10^-8) = 7.716 x 10^-9. Then the probability weight for risk type 2 is the
product of the chance of the observation given risk type 2 and the a priori chance of risk type 2:
(7.716 x 10^-9)(30%) = 2.315 x 10^-9.
The Bayesian Analysis calculation is organized in a spreadsheet as follows:
Type | A Priori | Chance of 3 Claims in 4 Years | Chance of Severities Given 3 Claims | Chance of Obs. = C x D | Prob. Weight | Posterior | Mean Annual Severity
1 | 50% | 0.1536 | 1.04 x 10^-9 | 1.6 x 10^-10 | 8.0 x 10^-11 | 0.76% | 400
2 | 30% | 0.4116 | 1.875 x 10^-8 | 7.72 x 10^-9 | 2.31 x 10^-9 | 21.87% | 300
3 | 20% | 0.4096 | 9.998 x 10^-8 | 4.095 x 10^-8 | 8.19 x 10^-9 | 77.38% | 200
Overall | | | | | 1.059 x 10^-8 | 100.00% | 223

Bayesian Analysis, Pure Premium Example, Continuous Distributions:


One can use Bayesian Analysis to predict the future pure premium. However, exam questions
involving Bayesian Analysis, pure premiums, and continuous severity distributions tend to be very
long. Here is an example that shows why.
Assume the following information:
Type | Portion of Risks in this Type | Bernoulli Frequency Dist. | Gamma Severity Dist.
1 | 50% | q = 40% | α = 4, θ = 100
2 | 30% | q = 70% | α = 3, θ = 100
3 | 20% | q = 80% | α = 2, θ = 100

We assume that the types are homogeneous; i.e., every risk of a given type has the same
frequency process and the same severity process. Assume that for an individual risk the frequency
and severity are independent. Assume in addition that a risk is picked at random and that we do not
know what type it is.39
39

The latter is very important. If one knew which class the risk was from, one would use the expected value for that
class to estimate the future pure premium.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 135
If for this randomly selected risk one observes in 4 years a total of $450, then one can use Bayesian
Analysis to predict the future pure premium of this risk.
For the three types the mean (annual) frequencies are: 0.4, 0.7 and 0.8. Since the mean of the
Gamma Distribution is αθ, for the three types the mean severities are: 400, 300 and 200. Thus the
mean pure premiums are: (0.4)(400) = 160, (0.7)(300) = 210, and (0.8)(200) = 160.
Now comes the difficult part. One needs to compute the probability of observing $450 of loss in 4
years. In this case involving the Binomial Distribution, there are at most 4 claims in 4 years.40 If there
is no claim, then one would not have any losses, so that is not a possibility for the observation.
However, if one has one claim, then the chance of $450 in loss is the density of the Gamma
Distribution with α and θ at x = 450: x^(α-1) e^(-x/θ) / {θ^α Γ(α)}. If one has two claims, then the
chance of $450 in loss is the density of the Gamma Distribution41 with 2α and θ at x = 450:
x^(2α-1) e^(-x/θ) / {θ^(2α) Γ(2α)}. If one has three claims, then the chance of $450 in loss is the density
of the Gamma Distribution with 3α and θ at x = 450: x^(3α-1) e^(-x/θ) / {θ^(3α) Γ(3α)}. If one has four
claims, then the chance of $450 in loss is the density of the Gamma Distribution with 4α and θ at
x = 450: x^(4α-1) e^(-x/θ) / {θ^(4α) Γ(4α)}.
Now, the chance of one claim in four years is: 4q(1-q)^3. The chance of two claims in four years is:
6q^2(1-q)^2. The chance of three claims in four years is: 4q^3(1-q).
The chance of four claims in four years is: q^4.
Thus the overall chance of observing $450 in four years is, with x = 450:
4q(1-q)^3 x^(α-1) e^(-x/θ) / {θ^α Γ(α)} + 6q^2(1-q)^2 x^(2α-1) e^(-x/θ) / {θ^(2α) Γ(2α)} +
4q^3(1-q) x^(3α-1) e^(-x/θ) / {θ^(3α) Γ(3α)} + q^4 x^(4α-1) e^(-x/θ) / {θ^(4α) Γ(4α)}.

40

If instead of a Binomial one had a Poisson, there would be a small but positive chance of any very large number of
claims.
41
The sum of two independent Gamma distributions with α and θ is a Gamma distribution with 2α and θ.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 136
For type 1 with q = 0.4, α = 4, and θ = 100, the computation of the overall chance of observing
$450 goes as follows:42
Number of Claims | Probability of this Number of Claims | Given this Number of Claims, the Probability of $450 in Total Losses | B x C
0 | 0.1296 | 0 | 0
1 | 0.3456 | 0.0016871788 | 0.0005830890
2 | 0.3456 | 0.0008236295 | 0.0002846464
3 | 0.1536 | 0.0000426439 | 0.0000065501
4 | 0.0256 | 0.0000005338 | 0.0000000137
Sum | | | 0.0008742991

Thus for a risk of type 1 the likelihood of $450 in loss in 4 years is 0.000874299.
In a similar manner for a risk of type 2 the likelihood of $450 in loss in 4 years is: 0.000737971.
Number of Claims | Probability of this Number of Claims | Given this Number of Claims, the Probability of $450 in Total Losses | B x C
0 | 0.0081 | 0 | 0
1 | 0.0756 | 0.0011247859 | 0.0000850338
2 | 0.2646 | 0.0017082686 | 0.0004520079
3 | 0.4116 | 0.0004632916 | 0.0001906908
4 | 0.2401 | 0.0000426439 | 0.0000102388
Sum | | | 0.0007379713

In a similar manner for a risk of type 3 the likelihood of $450 in loss in 4 years is: 0.00130901.

42 The corresponding computation for a risk of type 3 (q = 0.8, α = 2, θ = 100) is:
Number of Claims | Probability of this Number of Claims | Given this Number of Claims, the Probability of $450 in Total Losses | B x C
0 | 0.0016 | 0 | 0
1 | 0.0256 | 0.0004999048 | 0.0000127976
2 | 0.1536 | 0.0016871788 | 0.0002591507
3 | 0.4096 | 0.0017082686 | 0.0006997068
4 | 0.4096 | 0.0008236295 | 0.0003373586
Sum | | | 0.0013090137
While this is not particularly difficult compared to computations actuaries typically do with the aid of a computer, it
would take very long on the exam.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 137
One can use these likelihoods in the usual manner to get posterior probabilities for each type:
Type | A Priori Probability | Chance of the Observation | Prob. Weight | Posterior | Mean Annual Pure Premium
1 | 50% | 0.0008742990 | 0.0004371495 | 47.50% | 160
2 | 30% | 0.0007379710 | 0.0002213913 | 24.06% | 210
3 | 20% | 0.0013090100 | 0.0002618020 | 28.45% | 160
Overall | | | 0.0009203428 | 100.00% | 172

Using the posterior probabilities of 47.50%, 24.06%, and 28.45% as weights, the estimated future
pure premium is $172. Other than the length of time necessary to calculate Column C, the chance of
the observation, this was a typical Bayesian Analysis question. In the next example, the frequency
and severity process are somewhat simpler.
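A minimal Python sketch of this pure premium example (illustrative only; it reproduces the chances of observing $450 in 4 years by summing over the possible numbers of claims, and the function names are mine):

from math import comb, gamma, exp

def gamma_pdf(x, alpha, theta):
    return x**(alpha - 1) * exp(-x / theta) / (theta**alpha * gamma(alpha))

def chance_of_450(q, alpha, theta=100, years=4, total=450.0):
    # Sum over n claims: Binomial(m=4, q) probability times Gamma(n*alpha, theta) density at 450.
    return sum(comb(years, n) * q**n * (1 - q)**(years - n) * gamma_pdf(total, n * alpha, theta)
               for n in range(1, years + 1))       # n = 0 cannot produce a positive total

types = [(0.50, 0.4, 4), (0.30, 0.7, 3), (0.20, 0.8, 2)]   # (a priori, q, alpha)
likes = [chance_of_450(q, a) for _, q, a in types]
print(likes)                 # approximately 0.000874, 0.000738, 0.001309

weights = [p * L for (p, _, _), L in zip(types, likes)]
posterior = [w / sum(weights) for w in weights]
pure_premiums = [160, 210, 160]
print(sum(pi * pp for pi, pp in zip(posterior, pure_premiums)))   # approximately 172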

Bayesian Analysis, Pure Premium Example, Discrete Distributions:43


Assume there are two types of insureds with different risk processes:
Type | Portion of Risks in this Type
A | 75%
B | 25%
You are given the following for Type A:
The number of claims for a single exposure period will be either 0, 1 or 2:
Number of Claims | Probability
0 | 60%
1 | 30%
2 | 10%
The size of each claim, independent of any other, will be 50 with probability 80%, or 100 with probability 20%.
You are given the following for Type B:
The number of claims for a single exposure period will be either 0, 1 or 2:
Number of Claims | Probability
0 | 50%
1 | 30%
2 | 20%
The size of each claim, independent of any other, will be 50 with probability 60%, or 100 with probability 40%.
43

See also the subsequent section on the Die/Spinner Models.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 138
A risk is selected at random44 and you observe a total of $100 in losses during a single year.
One can use Bayesian Analysis to estimate the future pure premium for this risk.
For type A, the chance of observing $100 in total losses is the sum of the chance for one claim of size
$100 and that for two claims of $50 each:
(0.3)(0.2) + (0.1)(0.8^2) = 6.0% + 6.4% = 12.4%.
For type B, the chance of observing $100 in total losses is:
(0.3)(0.4) + (0.2)(0.6^2) = 12.0% + 7.2% = 19.2%.
Then the Bayesian Analysis of the pure premiums proceeds as follows:
Type | A Priori | Chance of Obs. | Prob. Weight | Posterior | Mean Freq. | Mean Sev. | Mean Annual Pure Premium
A | 75% | 0.124 | 0.09300 | 65.96% | 0.5 | 60 | 30
B | 25% | 0.192 | 0.04800 | 34.04% | 0.7 | 70 | 49
Overall | | | 0.14100 | 100.00% | | | 36.47
Thus the estimated future pure premium is: (65.96%)(30) + (34.04%)(49) = $36.47.
44
You don't know to which type this risk belongs.
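A minimal Python sketch of this discrete example (illustrative only; the dictionary layout and names are mine):

types = {
    "A": dict(prior=0.75, claim_probs=[0.6, 0.3, 0.1], p100=0.2),
    "B": dict(prior=0.25, claim_probs=[0.5, 0.3, 0.2], p100=0.4),
}

def chance_of_100(t):
    # P(one claim of 100) + P(two claims of 50 each)
    p50 = 1 - t["p100"]
    return t["claim_probs"][1] * t["p100"] + t["claim_probs"][2] * p50**2

def pure_premium(t):
    mean_freq = sum(n * p for n, p in enumerate(t["claim_probs"]))
    mean_sev  = 50 * (1 - t["p100"]) + 100 * t["p100"]
    return mean_freq * mean_sev

weights = {k: t["prior"] * chance_of_100(t) for k, t in types.items()}
total = sum(weights.values())
estimate = sum(w / total * pure_premium(types[k]) for k, w in weights.items())
print(estimate)   # approximately 36.47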

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 139
Bayesian Analysis, Pure Premium Example, Dependent Frequency and Severity:
Unlike the previous example, in this case the frequency and severity are not independent.
Assume there are two types of insureds with different risk processes:
Type | Portion of Risks in this Type
A | 75%
B | 25%
You are given the following for Type A:
The number of claims for a single exposure period will be either 0, 1 or 2:
Number of Claims | Probability
0 | 60%
1 | 30%
2 | 10%
If only one claim is incurred, the size of the claim will be 50, with probability 80%; or 100,
with probability 20%.
If two claims are incurred, the size of each claim, independent of the other, will be 50,
with probability 50%; or 100, with probability 50%.
You are given the following for Type B:
The number of claims for a single exposure period will be either 0, 1 or 2:
Number of Claims | Probability
0 | 50%
1 | 30%
2 | 20%
If only one claim is incurred, the size of the claim will be 50, with probability 60%; or 100,
with probability 40%.
If two claims are incurred, the size of each claim, independent of the other, will be 50,
with probability 30%; or 100, with probability 70%.
A risk is selected at random45 and you observe a total of $100 in losses during a single year.
One can use Bayesian Analysis to estimate the future pure premium for this risk.

45

You don't know to which type this risk belongs.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 140
For type A one can compute the mean pure premium by listing all the possibilities:
Situation | Probability | Pure Premium
0 claims | 60.0% | 0
1 claim @ 50 | 24.0% | 50
1 claim @ 100 | 6.0% | 100
2 claims @ 50 each | 2.5% | 100
2 claims: 1 @ 50 & 1 @ 100 | 5.0% | 150
2 claims @ 100 each | 2.5% | 200
Overall | 100.0% | 33
For example, the chance for 1 claim at $50 and one at $100 is the chance of two claims, 10%,
multiplied by the Binomial probability (2)(0.5)(0.5) = 1/2: (10%)(1/2) = 5%.
Similarly, for type B:
Situation | Probability | Pure Premium
0 claims | 50.0% | 0
1 claim @ 50 | 18.0% | 50
1 claim @ 100 | 12.0% | 100
2 claims @ 50 each | 1.8% | 100
2 claims: 1 @ 50 & 1 @ 100 | 8.4% | 150
2 claims @ 100 each | 9.8% | 200
Overall | 100.0% | 55
For example, the chance for 1 claim at $50 and one at $100 is the chance of two claims, 20%,
multiplied by the Binomial probability (2)(0.7)(0.3) = 0.42: (20%)(0.42) = 8.4%.
For type A, the chance of observing $100 in total losses is the sum of the chance for one claim of
size $100 and that for two claims of $50 each: 6.0% + 2.5% = 8.5%.
For type B, the chance of observing $100 in total losses is: 12.0% + 1.8% = 13.8%.
Then the Bayesian Analysis for an observation of a total of $100 in losses during a single year
proceeds as follows:
Type | A Priori Probability | Chance of the Observation | Prob. Weight | Posterior | Mean Annual Pure Premium
A | 75% | 0.085 | 0.06375 | 64.89% | 33
B | 25% | 0.138 | 0.03450 | 35.11% | 55
Overall | | | 0.09825 | 100.00% | 40.73

Thus the estimated future pure premium is: (64.89%)(33) + (35.11%)(55) = $40.73.

2013-4-9 Buhlmann Credibility 5 Bayes Discrete Risk Types, HCM 10/19/12, Page 141
Bayesian Analysis, Layer Average Severity Example:46
One can apply Bayesian Analysis in order to estimate any quantity of interest. For example, one
might be interested in the layer average severity, the average dollars in a layer per loss (or per
loss excess of a certain amount.)
Exercise: Assume losses are given by a Single Parameter Pareto Distribution:
f(x) = α 10^α x^-(α+1), x > 10.
If α = 2.5, what are the average dollars in the layer from 15 to 30 per loss?
[Solution: In general, the average dollars in the layer from 15 to 30 per loss is: E[X ∧ 30] - E[X ∧ 15].
For the Single Parameter Pareto, with parameters α and θ:
E[X ∧ x] = αθ/(α - 1) - θ^α / {(α - 1) x^(α-1)}, for x ≥ θ.47
For θ = 10 and α = 2.5:
E[X ∧ 30] = (2.5)(10)/(1.5) - (10^2.5) / {(30^1.5)(1.5)} = 16.67 - 210.82/164.32 = 16.67 - 1.28 = 15.39.
E[X ∧ 15] = (2.5)(10)/(1.5) - (10^2.5) / {(15^1.5)(1.5)} = 16.67 - 210.82/58.09 = 16.67 - 3.63 = 13.04.
E[X ∧ 30] - E[X ∧ 15] = 15.39 - 13.04 = 2.35.]
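A minimal Python sketch of this limited expected value calculation (illustrative only; the function names are mine):

def lim_ev(x, alpha, theta=10):
    # E[X ^ x] for a Single Parameter Pareto with support x > theta:
    # alpha*theta/(alpha - 1) - theta**alpha / ((alpha - 1) * x**(alpha - 1))
    return alpha * theta / (alpha - 1) - theta**alpha / ((alpha - 1) * x**(alpha - 1))

def layer_15_30(alpha):
    return lim_ev(30, alpha) - lim_ev(15, alpha)

print(layer_15_30(2.5))   # approximately 2.35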
Exercise: Assume losses are given by a Single Parameter Pareto Distribution:
f(x) = α 10^α x^-(α+1), x > 10. Assume we expect per year 6 claims excess of 10.
If α = 2.5, what are the expected annual dollars in the layer from 15 to 30?
[Solution: We expect 2.35 dollars in the layer from 15 to 30 for each claim of size greater than 10.
The average dollars in the layer from 15 to 30 per loss excess of ten is 2.35. Therefore, the
expected annual dollars in the layer from 15 to 30 is: (6)(2.35) = 14.1.
Comment: Since the Single Parameter Pareto has support starting at 10, everything is excess of
10. The number of claims excess of 10 here is analogous to the number of losses in a situation in
which instead the support starts at zero.]
Thus we see that given a particular value of the parameter alpha, we can compute the
layer average severity.48 For α = 2.5, the layer average severity for the layer from 15 to 30 is 2.35.49

46 This example is somewhat hard.
47 Note that this formula for the limited expected value only works for α ≠ 1. For α = 1, one could perform the
appropriate integrals, rather than using the formula for the limited expected values.
48 Which, with additional information, can be used to compute the expected annual dollars in that layer.
49 It should be noted that if everything is in millions of dollars (θ = 10 million, and the layer is from 15 million to 30
million), then the layer average severity is just multiplied by one million. In general one can easily adjust the scale in
this manner. Changing scales from, for example, 10 million to 10 can often make a computation easier to perform. (On
the computer you may thereby avoid overflow problems.) It can also make the computation much easier to check.

As alpha gets smaller the Single Parameter Pareto Distribution gets heavier-tailed, while as alpha
gets larger the distribution gets lighter-tailed. Thus we expect the layer average severity to depend
on alpha.
Exercise: Assume losses are given by a Single Parameter Pareto Distribution:
f(x) = α 10^α / x^(α + 1), x > 10, α > 1.
What are the average dollars in the layer from 15 to 30 per loss?
[Solution: E[X ∧ 30] = 10α/(α - 1) - 10^α / {(30^(α - 1))(α - 1)}.
E[X ∧ 15] = 10α/(α - 1) - 10^α / {(15^(α - 1))(α - 1)}.
E[X ∧ 30] - E[X ∧ 15] = {10^α/(α - 1)} {1/15^(α - 1) - 1/30^(α - 1)}. ]

Now that we have the quantity of interest, the layer average severity, as a function of alpha, we are
now ready to perform Bayesian Analysis. Given an a priori distribution of values for alpha and a set
of observations, one can estimate the future layer average severity.
Assume the following:
• Losses are given by a Single Parameter Pareto Distribution: f(x) = α 10^α / x^(α + 1), x > 10.
• Based on prior information we assume that alpha has the following distribution:

α        A Priori Probability
2.1              20%
2.3              30%
2.5              30%
2.7              20%
We then observe 7 losses (each of size greater than 10): 12, 15, 17, 18, 23, 28, 39.
Use Bayesian Analysis to estimate the future layer average severity for the layer from 15 to 30,
(the average dollars of loss in the layer from 15 to 30 assuming there has been a single loss of size
greater than 10.)
We've seen how to compute the layer average severity for this situation given a value for alpha.
The other item to compute is the probability of the observation given alpha.
The probability of the observation is the product of the densities at the observed points, given
alpha. f(x) = α 10^α / x^(α + 1). So for example if alpha = 2.5, then
f(x) = 790.6 / x^3.5 and f(17) = 0.0390. For alpha = 2.5, f(12) f(15) f(17) f(18) f(23) f(28) f(39) =
(0.1321)(0.0605)(0.0390)(0.0320)(0.01355)(0.00681)(0.00213) = 1.960 x 10^-12.

The solution to this problem in the usual spreadsheet format is:
Alpha   A Priori      Chance of the    Probability    Posterior       Layer Average   Square of Layer
        Probability   Observation      Weight         Distribution    Severity        Average Severity
                                                      of Alpha
2.1        20%          4.155e-12       8.311e-13        32.60%            3.10              9.64
2.3        30%          2.931e-12       8.792e-13        34.49%            2.70              7.27
2.5        30%          1.960e-12       5.880e-13        23.07%            2.35              5.50
2.7        20%          1.253e-12       2.507e-13         9.83%            2.04              4.18
Overall                                 2.549e-12       100.00%            2.68              7.33

The posterior distribution of alpha is: 32.60%, 34.49%, 23.07%, 9.83%; posterior to the
observations we believe the loss distribution is somewhat more likely to have a smaller value of
alpha (be heavier-tailed.) Posterior to the observation, the estimated layer average severity is:
(32.60%)(3.10) + (34.49%)(2.70) + (23.07%)(2.35) + (9.83%)(2.04) = 2.68.50
This compares to the a priori estimate of the layer average severity:
(20%)(3.10) + (30%)(2.70) + (30%)(2.35) + ( 20%)(2.04) = 2.54.
We can also use the posterior distribution to compute that the expected value of the square of the
layer average severity is 7.33. Combining this second moment with the expected value, gives a
variance of the layer average severities of 7.33 - 2.682 = 0.148.
The posterior standard deviation is √0.148 = 0.38.
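
A minimal Python sketch of the whole table, my own illustration assuming the Single Parameter Pareto density and the prior on alpha given above, reproduces the posterior distribution, the estimate of 2.68, and the posterior standard deviation:

```python
# Minimal sketch of the Bayesian analysis over the grid of alpha values above.
# The prior, the observed losses, and the layer (15 to 30) are taken from the text.

losses = [12, 15, 17, 18, 23, 28, 39]
prior = {2.1: 0.20, 2.3: 0.30, 2.5: 0.30, 2.7: 0.20}
theta = 10.0

def density(x, alpha):
    # Single Parameter Pareto density: alpha * theta^alpha / x^(alpha + 1), x > theta.
    return alpha * theta**alpha / x**(alpha + 1)

def lim_ev(x, alpha):
    return alpha * theta / (alpha - 1) - theta**alpha / ((alpha - 1) * x**(alpha - 1))

def las(alpha, lower=15.0, upper=30.0):
    return lim_ev(upper, alpha) - lim_ev(lower, alpha)

likelihood = {a: 1.0 for a in prior}
for a in prior:
    for x in losses:
        likelihood[a] *= density(x, a)

weights = {a: prior[a] * likelihood[a] for a in prior}
total = sum(weights.values())
posterior = {a: w / total for a, w in weights.items()}

first = sum(posterior[a] * las(a) for a in prior)       # about 2.68
second = sum(posterior[a] * las(a)**2 for a in prior)   # about 7.33
variance = second - first**2                            # about 0.148
print(first, variance**0.5)
```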

Note that instead of an a priori distribution on only four values, one could have had an a priori
distribution with support over many more values. The a priori distribution could have been given by
some function, either discrete or continuous. If the a priori distribution is continuous, then integrals (or
numerical integration) would have replaced sums. The loss distribution function could have been
something other than a Single Parameter Pareto Distribution. In which case it could have had more
than one unknown parameter, in which case one would have to perform multiple sums or multiple
integrals. The quantity of interest could have been something other than the layer average severity.
In which case the only change is in the formula for the quantity of interest as a function of the
parameter(s).

50 Note one weights together the values of the quantity of interest. One does not calculate the posterior mean of
alpha and then calculate the layer average severity for this value of alpha. (In this case, the posterior mean of alpha is
2.327, and the corresponding layer average severity is 2.65.)

Exercise: Assume the following:
• Losses (in millions of dollars) are given by a Single Parameter Pareto Distribution:
  f(x) = α 10^α / x^(α + 1), x > 10.
• Based on prior information we assume alpha has the following distribution:

α        A Priori Probability
1.5              10%
2.0              20%
2.5              40%
3.0              20%
3.5              10%

• You then observe 7 losses (each of size greater than 10):
  12, 15, 17, 18, 23, 28, 39 (in millions of dollars).
• You expect per year 13 losses of size greater than 10.
Use Bayesian Analysis to estimate the expected annual dollars of loss (in millions of dollars) in the
layer from 30 million to 50 million.
[Solution:
Alpha   A Priori      Chance of the    Probability    Posterior          Layer Average
        Probability   Observation      Weight         Distribution       Severity
                                                      of Alpha
1.5        10%          7.591e-12       7.591e-13        28.70%              2.60
2.0        20%          4.835e-12       9.670e-13        36.57%              1.33
2.5        40%          1.960e-12       7.840e-13        29.65%              0.69
3.0        20%          5.971e-13       1.194e-13         4.52%              0.36
3.5        10%          1.494e-13       1.494e-14         0.56%              0.19
Overall                                 2.644e-12       100.00%              1.46

The estimated layer average severity for the layer from 30 million to 50 million is 1.46 million.
Therefore, the expected annual loss in the layer from 30 million to 50 million is:
(1.46 million)(13) = $19.0 million.]
Note that in this exercise, the a priori distribution of alpha was more diffuse than in the example
above. Therefore, the observations had an opportunity to have a greater impact on our posterior
estimate. The posterior distribution of alpha differed significantly from the prior distribution of alpha.
Therefore, the a priori estimate of $11.6 million differed significantly from the posterior estimate of
$19.0 million in expected annual losses in this layer.51

51 The a priori estimate of the layer average severity is:
(0.1)(2.60) + (0.2)(1.33) + (0.4)(0.69) + (0.2)(0.36) + (0.1)(0.19) = 0.89. (13)(0.89) = 11.6.

Problems:
Use the following information for the next 7 questions:
There are three types of risks. Assume 60% of the risks are of Type A, 25% of the risks are of
Type B, and 15% of the risks are of Type C. Each risk has either one or zero claims per year.
Type of Risk     Chance of a Claim     A Priori Chance of Type of Risk
A                      20%                          60%
B                      30%                          25%
C                      40%                          15%
5.1 (1 point) What is the overall mean annual claim frequency?
A. 24.5%

B. 25.0%

C. 25.5%

D. 26.0%

E. 26.5%

5.2 (1 point) You observe no claim in a year.


What is the probability that the risk you are observing is of Type A?
A. 58%
B. 60%
C. 62%
D. 64%
E. 66%
5.3 (1 point) You observe no claim in a year.
What is the probability that the risk you are observing is of Type B?
A. 23.0%
B. 23.5%
C. 24.0%
D. 24.5%
E. 25.0%
5.4 (1 point) You observe no claim in a year.
What is the probability that the risk you are observing is of Type C?
A. 12%
B. 14%
C. 16%
D. 18%
E. 20%
5.5 (1 point) You observe no claim in a year.
What is the expected annual claim frequency from the same risk?
A. 21%
B. 23%
C. 25%
D. 27%
E. 29%
5.6 (2 points) You observe one claim in a year.
What is the expected annual claim frequency from the same risk?
A. 22%
B. 24%
C. 26%
D. 28%
E. 30%
5.7 (3 points) You observe a single risk over five years. You observe 2 claims in 5 years.
What is the expected annual claim frequency from the same risk?
A. 21%
B. 23%
C. 25%
D. 27%
E. 29%

Use the following information for the next two questions:
An insured population consists of 9% youthful drivers and 91% adult
drivers. Based on experience, we have derived the following probabilities
that an individual driver will have n claims in a year:
n      Youth    Adult
0       85%      95%
1       10%       4%
2        4%       1%
3        1%       0%
5.8 (2 points) If a driver has had exactly two claims in the prior year, what is the probability it is a
youthful driver?
A. Less than 0.26
B. At least 0.26, but less than 0.27
C. At least 0.27, but less than 0.28
D. At least 0.28, but less than 0.29
E. 0.29 or more.
5.9 (2 points) If a driver has had exactly two claims in the prior year, what is the expected number of
claims for that same driver over the next year?
A. Less than 0.10
B. At least 0.10, but less than 0.11
C. At least 0.11, but less than 0.12
D. At least 0.12, but less than 0.13
E. 0.13 or more.
5.10 (2 points) There are two types of risks, with equal frequencies but different size of loss
distributions. Each claim is either $1000 or $2000.
Type of Risk    A Priori Chance of This Type of Risk    Percentage of $1000 Claims    Percentage of $2000 Claims
Low                          80%                                   90%                           10%
High                         20%                                   70%                           30%
You pick a risk at random (80% chance it is Low) and observe three claims. If two of the claims were
$1000 and one of the claims was $2000, what is the expected value of the next claim from that
same risk?
A. less than 1160
B. at least 1160 but less than 1170
C. at least 1170 but less than 1180
D. at least 1180 but less than 1190
E. at least 1190

Use the following information for the following three questions:
Frequency and severity are independently distributed for each driver.
There are three types of drivers with the following characteristics:
Type     Portion of Drivers     Poisson Annual       Pareto
         of This Type           Claim Frequency      Claim Severity
Good           60%                     5%            α = 5, θ = 10,000
Bad            30%                    10%            α = 4, θ = 10,000
Ugly           10%                    20%            α = 3, θ = 10,000

5.11 (3 points) A driver is observed to have over a five year period a single claim.
Use Bayes Theorem to predict this driver's future annual claim frequency.
A. 9.1%
B. 9.3%
C. 9.5%
D. 9.7%
E. 9.9%
5.12 (3 points) Over 1 year, for an individual driver you observe a single claim of size $25,000.
Use Bayes Theorem to estimate this driver's future average claim severity.
A. less than $4100
B. at least $4100 but less than $4200
C. at least $4200 but less than $4300
D. at least $4300 but less than $4400
E. at least $4400
5.13 (4 points) Over 3 years, for an individual driver you observe two claims of sizes $5,000 and
$25,000 in that order.
Use Bayes Theorem to estimate this driver's future average claim severity.
A. less than $4100
B. at least $4100 but less than $4200
C. at least $4200 but less than $4300
D. at least $4300 but less than $4400
E. at least $4400

5.14 (2 points) Annual claim counts for each policyholder follow a Negative Binomial distribution with
r = 3. Half of the policyholders have β = 1. The other half of the policyholders have β = 2.
A policyholder had 2 claims in one year.
Determine the probability that for this policyholder = 2.
A. 20%

B. 25%

C. 30%

D. 35%

E. 40%

5.15 (2 points) The aggregate loss distributions for three risks for one exposure
period are as follows:
            Aggregate Losses
Risk      $0      $100     $500
A        0.90     0.07     0.03
B        0.50     0.30     0.20
C        0.30     0.33     0.37
A risk is selected at random and is observed to have $500 of aggregate losses in the first exposure
period. Determine the Bayesian analysis estimate of the expected value of the aggregate losses
for the same risk's second exposure period.
A. Less than $100
B. At least $100, but less than $125
C. At least $125, but less than $150
D. At least $150, but less than $175
E. At least $175
5.16 (2 points) Annual claim counts for each policyholder follow a Binomial distribution with m = 2.
80% of the policyholders have q = 0.10. The remaining 20% of the policyholders have q = 0.20.
A policyholder had 1 claim in one year.
Determine the probability that for this policyholder q = 0.10.
A. 61%
B. 63%
C. 67%
D. 69%
E. 71%
5.17 (2 points) You are given the following information:
• Si = state of the world i, for i = 1, 2, 3
• The probability of each state = 1/3
• In any state there is either 0 or 1 claims, and the probability of a claim = 30%
• The claim size is either 1 or 2 units
• Given that a claim has occurred, the following are conditional probabilities of claim size (in units) for each possible state:

      S1              S2              S3
  Pr(1) = 1/3     Pr(1) = 1/2     Pr(1) = 1/6
  Pr(2) = 2/3     Pr(2) = 1/2     Pr(2) = 5/6
Use the data given above and Bayes' Theorem. If you observe a single claim of size 2 units, in
which range is your estimate of the pure premium for that risk?
A. 0.45
B. 0.47
C. 0.49
D. 0.51
E. 0.53

Use the following information for the next 6 questions:
For a single insured selected from type A:
• The number of claims for a single exposure period will be 1, with probability 4/5;
  or 2, with probability 1/5.
• If only one claim is incurred, the size of the claim will be 50, with probability 3/4;
  or 200, with probability 1/4.
• If two claims are incurred, the size of each claim, independent of the other, will be 50,
  with probability 60%; or 150, with probability 40%.
For a single insured selected from type B:
• The number of claims for a single exposure period will be 1, with probability 3/5;
  or 2, with probability 2/5.
• If only one claim is incurred, the size of the claim will be 50, with probability 1/2;
  or 200, with probability 1/2.
• If two claims are incurred, the size of each claim, independent of the other, will be 50,
  with probability 80%; or 150, with probability 20%.
An insured has been selected from a population consisting of 65% insureds in type A and 35%
insureds in type B. It is not known which of the two types the insured is from.
This insured will be observed for two exposure periods.
5.18 (3 points) If the first exposure period had total losses of 50, determine the expected
number of claims for the second exposure period.
A. 1.20
B. 1.22
C. 1.24
D. 1.26
E. 1.28
5.19 (2 points) What is the mean pure premium of a risk from type A?
A. 104
B. 106
C. 108
D. 110
E. 112
5.20 (2 points) What is the mean pure premium of a risk from type B?
A. 100
B. 110
C. 120
D. 130
E. 140
5.21 (2 points) If the first exposure period had total losses of 50, determine the expected
total losses for the second exposure period.
A. 109
B. 111
C. 113
D. 115
E. 117
5.22 (3 points) If the first exposure period had total losses of 200, determine the expected
number of claims for the second exposure period.
A. 1.21
B. 1.23
C. 1.25
D. 1.27
E. 1.29
5.23 (2 points) If the first exposure period had total losses of 200, determine the expected
total losses for the second exposure period.
A. 113
B. 115
C. 117
D. 119
E. 121

5.24 (3 points) You are given:
• A portfolio of independent insureds is divided into two classes, Class A and Class B.
• There are three times as many insureds in Class A as in Class B.
• The number of claims for each insured during a single year follows a Bernoulli distribution.
• The expected number of claims per year for an individual insured in Class A is 0.4.
• The expected number of claims per year for an individual insured in Class B is 0.8.
• Classes A and B have claim size distributions as follows:

Claim Size    Class A    Class B
1000           0.30       0.50
2000           0.70       0.50

One insured is chosen at random. The insured's loss for three years combined is 2000.
Use Bayesian Analysis to estimate the future pure premium for this insured.
(A) 700
(B) 725
(C) 750
(D) 775
(E) 800
5.25 (3 points) You are given the following:
• A portfolio consists of 1000 independent risks.
• 450 of the risks each have a policy with a $100 per claim deductible, 300 of the risks
  each have a policy with a $1000 per claim deductible, and 250 of the risks each have
  a policy with a $10,000 per claim deductible.
• The risks have identical claim count distributions.
• Prior to truncation by policy deductibles, the loss size distribution for each risk
  is as follows:

Claim Size    Probability
$50               40%
$500              30%
$5,000            20%
$50,000           10%

• A report is available which shows actual loss sizes incurred for each policy
  after truncation by policy deductibles, but does not identify the policy deductible
  associated with each policy.
The report shows exactly three losses for a single policy selected at random.
Two of the losses are $50,000 and $5,000, but the amount of the third is illegible.
Using Bayes Theorem, what is the expected value of this illegible number?
A. $11,000
B. $13,000
C. $15,000
D. $17,000
E. $19,000

5.26 (2 points) You are given the following information:
• There are three types of risks.
• The types are homogeneous; every risk of a given type has the same Poisson frequency process:

Type    Portion of Risks in this Type    Average (Annual) Claim Frequency
1                  70%                               40%
2                  20%                               60%
3                  10%                               80%

A risk is picked at random and we do not know what type it is.
For this randomly selected risk, during 1 year there are 3 claims.
Use Bayesian Analysis to predict the future claim frequency of this same risk.
A. 0.54
B. 0.56
C. 0.58
D. 0.60
E. 0.62
5.27 (3 points) You are given the following information:
• There are three types of risks.
• The types are homogeneous; every risk of a given type has the same Exponential severity process:

Type    Portion of Risks in this Type    Average Claim Size
1                  70%                         $25
2                  20%                         $40
3                  10%                         $50

A risk is picked at random and we do not know what type it is.
For this randomly selected risk, there are 3 claims for a total of $140.
Use Bayesian Analysis to predict the future average claim size of this same risk.
A. $34
B. $36
C. $38
D. $40
E. $42

Use the following information for the next four questions:
• There are three types of risks.
• The types are homogeneous; every risk of a given type has the same Poisson frequency process and the same Exponential severity process:

Type    Portion of Risks in this Type    Average Annual Claim Frequency    Average Claim Size
1                  70%                              40%                          $25
2                  20%                              60%                          $40
3                  10%                              80%                          $50

5.28 (2 points) If a risk is of type 1, what is the likelihood of the observed total losses in a year
being $140?
A. 0.01%
B. 0.02%
C. 0.03%
D. 0.04%
E. 0.05%
5.29 (2 points) If a risk is of type 2, what is the likelihood of the observed total losses in a year
being $140?
A. 0.04%
B. 0.05%
C. 0.06%
D. 0.07%
E. 0.08%
5.30 (2 points) If a risk is of type 3, what is the likelihood of the observed total losses in a year
being $140?
A. 0.03%
B. 0.05%
C. 0.07%
D. 0.09%
E. 0.11%
5.31 (2 points) A risk is picked at random and we do not know what type it is.
For this randomly selected risk, there are $140 of total losses in one year.
Use Bayesian Analysis to predict the future average pure premium of this same risk.
A. Less than $25
B. At least $25, but less than $26
C. At least $26, but less than $27
D. At least $27, but less than $28
E. $28 or more

Use the following information for the next nine questions:
The years for a crop insurer are either Good, Typical, or Poor, depending on the weather
conditions. The years follow a Markov Chain with transition matrix Q:

        0.70  0.20  0.10
Q =     0.15  0.80  0.05
        0.20  0.30  0.50

The insurer's aggregate losses for each type of year are Exponentially distributed:

Type       Mean
Good        25
Typical     50
Poor       100

5.32 (3 points) What is the insurer's average annual aggregate loss?
Hint: A stationary distribution, π, satisfies the matrix equation πQ = π, and Σ πi = 1.
A. 41

B. 43

C. 45

D. 47

E. 49

5.33 (2 points) What is the probability that the aggregate losses in a year picked at random are
greater than 120?
A. 8.4%
B. 8.6%
C. 8.8%
D. 9.0%
E. 9.2%
5.34 (1 point) Let X be the insurer's annual losses this year and Y be the insurer's annual
losses next year. If this year is Good, what is E[Y]?
A. 34
B. 36
C. 38
D. 40
E. 42
5.35 (1 point) Let X be the insurer's annual losses this year and Y be the insurer's annual
losses next year. If this year is Typical, what is E[Y]?
A. 41
B. 43
C. 45
D. 47
E. 49
5.36 (1 point) Let X be the insurer's annual losses this year and Y be the insurer's annual
losses next year. If this year is Poor, what is E[Y]?
A. 55
B. 60
C. 65
D. 70
E. 75
5.37 (2 points) Let X be the insurer's annual losses this year and Y be the insurer's annual
losses next year. What is E[XY]?
A. 2500
B. 2600
C. 2700
D. 2800
E. 2900
5.38 (3 points) What is the long term variance observed in the insurer's annual aggregate
losses?
A. 3000
B. 3100
C. 3200
D. 3300
E. 3400

5.39 (2 points) Let X be the insurer's annual losses this year and Y be the insurer's annual
losses next year. What is the correlation of X and Y?
A. 7%
B. 8%
C. 9%
D. 10%
E. 11%
5.40 (3 points) If the insurer has losses of 75 this year, what are the insurer's expected losses
next year?
A. 46
B. 48
C. 50
D. 52
E. 54

5.41 (3 points) You are given the following information:
• There are three types of risks.
• The types are homogeneous; every risk of a given type has the same Exponential severity process:

Type    Portion of Risks in This Type    Average Claim Size
1                  70%                         $25
2                  20%                         $40
3                  10%                         $50

A risk is picked at random and we do not know what type it is.
For this randomly selected risk, there are 3 claims of sizes: $30, $40, and $70.
Use Bayesian Analysis to predict the future average claim size of this same risk.
A. $34
B. $36
C. $38
D. $40
E. $42
5.42 (2 points) The number of claims incurred each year is 0, 1, 2, 3, or 4, with equal probability. If
there is a claim, there is a 65% chance it will be reported to the insurer by year end, independent of
any other claims.
If there is 1 claim incurred during 2003 that is reported by the end of year 2003, what is the
estimated number of claims incurred during 2003?
(A) Less than 1.7
(B) At least 1.7, but less than 1.8
(C) At least 1.8, but less than 1.9
(D) At least 1.9, but less than 2.0
(E) At least 2.0

Use the following information for the next three questions:
You are given the following information about two classes of risks:

Risks in Class A have a Poisson frequency with a mean of 0.6 per year.
Risks in Class B have a Poisson frequency with a mean of 0.8 per year.
Risks in Class A have an exponential severity distribution with a mean of 11.
Risks in Class B have an exponential severity distribution with a mean of 15.
Class A has three times the number of risks in Class B.
Within each class, severities and claim counts are independent.
A risk is randomly selected and observed to have three claims during one year.
The observed claim amounts were: 7, 10, and 21.
5.43 (2 points) Calculate the posterior expected value of the frequency for this risk.
(A) Less than 0.68
(B) At least 0.68, but less than 0.69
(C) At least 0.69, but less than 0.70
(D) At least 0.70, but less than 0.71
(E) At least 0.71
5.44 (2 points) Calculate the posterior expected value of the severity for this risk.
(A) Less than 12.0
(B) At least 12.0, but less than 12.5
(C) At least 12.5, but less than 13.0
(D) At least 13.0, but less than 13.5
(E) At least 13.5
5.45 (2 points) Calculate the posterior expected value of the pure premium for this risk.
(Do not make separate estimates of frequency and severity.)
(A) Less than 8.0
(B) At least 8.0, but less than 8.5
(C) At least 8.5, but less than 9.0
(D) At least 9.0, but less than 9.5
(E) At least 9.5

5.46 (3 points) You are given the following:
The number of claims incurred each year is Poisson with mean 4.

If there is a claim, there is a 70% chance it will be reported to the insurer by year end.
The chance of a claim being reported by year end is independent of the reporting of any
other claim, and is also independent of the number of claims incurred.
If there are 3 claims incurred during 2003 that are reported by the end of year 2003, what is the
estimated number of claims incurred during 2003?
A. 4.0
B. 4.2
C. 4.4
D. 4.6
E. 4.8
Use the following information for the next two questions:
There are two equally likely types of risks, each with severities 20 and 50:
Type    Probability of 20    Probability of 50
A              50%                  50%
B              90%                  10%
5.47 (2 points) A loss of size 20 is observed.
Using Bayes Analysis, what is the estimated future severity for this insured?
(A) Less than 26
(B) At least 26, but less than 27
(C) At least 27, but less than 28
(D) At least 28, but less than 29
(E) At least 29
5.48 (2 points) A second loss is observed for this same insured, this time of size 50.
Using Bayes Analysis, what is the estimated future severity for this insured?
(A) Less than 31
(B) At least 31, but less than 32
(C) At least 32, but less than 33
(D) At least 33, but less than 34
(E) At least 34

5.49 (3 points) You are given the following:

The number of claims incurred each year is Negative Binomial with r = 2 and β = 1.6.
If there is a claim, there is a 70% chance it will be reported to the insurer by year end.
The chance of a claim being reported by year end is independent of the reporting
of any other claim, and is also independent of the number of claims incurred.
If there are 5 claims incurred during 2003 that are reported by the end of year 2003, what is the
estimated number of claims incurred during 2003?
A. 6.0
B. 6.2
C. 6.4
D. 6.6
E. 6.8

5.50 (2 points) You are given:
(i) Two risks have the following severity distributions:
Amount of Claim     Probability of Claim      Probability of Claim
                    Amount for Risk 1         Amount for Risk 2
250                         0.5                       0.7
2,500                       0.3                       0.2
60,000                      0.2                       0.1
(ii) Risk 1 is twice as likely to be observed as Risk 2.
A claim of 250 is observed.
Determine the Bayesian estimate of the second claim amount from the same risk.
(A) Less than 10,200
(B) At least 10,200, but less than 10,400
(C) At least 10,400, but less than 10,600
(D) At least 10,600, but less than 10,800
(E) At least 10,800
5.51 (3 points) An insurance company sells three types of policies with the following characteristics:
Type of Policy    Proportion of Total Policies    Annual Claim Frequency
I                          30%                    Negative Binomial with r = 1 and β = 0.25
II                         50%                    Negative Binomial with r = 2 and β = 0.25
III                        20%                    Negative Binomial with r = 2 and β = 0.50

A randomly selected policyholder is observed to have a total of one claim for Year 1 through
Year 4. For the same policyholder, determine the Bayesian estimate of the expected number of
claims in Year 5.
(A) Less than 0.4
(B) At least 0.4, but less than 0.5
(C) At least 0.5, but less than 0.6
(D) At least 0.6, but less than 0.7
(E) At least 0.7
5.52 (4 points) There are two types of insured, equally likely.
They each have ground up size of loss distributions that are Pareto with α = 4.
However, for one type θ = 800, while for the other type θ = 1200.
For each type, there is a deductible of either 500 or 1000, equally likely.
From a policy picked at random you observe two payments: 400, 1500.
Determine the posterior probabilities of all four combinations of type of insured and deductible.

5.53 (2 points) Each insured has at most one claim a year.
                                                   Claim Size Distribution
Class    Prior Probability    Probability of a Claim      100      200
A              3/4                    1/5                  2/3      1/3
B              1/4                    2/5                  1/2      1/2
An insured is chosen at random and a single claim of size 100 has been observed during a year.
Use Bayes Theorem to estimate the future pure premium for this insured.
A. 32
B. 34
C. 36
D. 38
E. 40
5.54 (4, 5/85, Q.41) (3 points) Si = state of the world i, for i = 1, 2, 3.
The probability of each state = 1/3. In any state, the probability of a claim = 1/2.
The claim size is either 1 or 2 units. Given that a claim has occurred, the following are conditional
probabilities of claim size (in units) for each possible state:
      S1              S2              S3
  Pr(1) = 2/3     Pr(1) = 1/2     Pr(1) = 5/6
  Pr(2) = 1/3     Pr(2) = 1/2     Pr(2) = 1/6
Use the data given above and Bayes' Theorem. If you observe a single claim of size 2 units, in
which range is your estimate of the pure premium for that risk?
A. Less than 0.65
B. At least 0.65, but less than 0.67
C. At least .67, but less than 0.69
D. At least 0.69, but less than 0.71
E. 0.71 or more
5.55 (4, 5/87, Q.39) (2 points) An insured population consists of 1500 youthful drivers and 8500
adult drivers. Based on experience, we have derived the following probabilities that an individual
driver will have n claims in a year's time:
n      Youth    Adult
0       0.50     0.80
1       0.30     0.15
2       0.15     0.05
3       0.05     0.00
If you have a policy with exactly one claim on it in the prior year, what is the probability the insured is
a youthful driver?
A. Less than 0.260
B. At least 0.260, but less than 0.270
C. At least 0.270, but less than 0.280
D. At least 0.280, but less than 0.290
E. 0.290 or more

5.56 (165, 5/90, Q.9) (1.7 points) A Bayesian method is used to estimate a value t from an
observed value u. You are given:
(i) t is a sample from a random variable T that has a binomial distribution with q = 1/2 and m = 2.
(ii) u is a sample from a random variable U with conditional distribution, given t,
that is Binomial with q = t/2 and m = 2.
Determine E[T | u = 2].
(A) 5/6
(B) 1
(C) 5/4
(D) 5/3
(E) 5/2
5.57 (4B, 11/92, Q.24) (2 points) A portfolio of three risks exists with the following characteristics:

• The claim frequency for each risk is normally distributed with the following means and standard deviations:

         Distribution of Claim Frequency
Risk       Mean      Standard Deviation
A          0.10            0.03
B          0.50            0.05
C          0.90            0.01

A frequency of 0.12 is observed for an unknown risk in the portfolio.


Determine the Bayesian estimate of the same risk's expected claim frequency.
A. 0.10
B. 0.12
C. 0.13
D. 0.50
E. 0.90
5.58 (4B, 5/93, Q.26) (2 points) You are given the following information:
An insurance portfolio consists of two classes, A and B.
The number of claims distribution for each class is:
           Probability of Number of Claims =
Class      0      1      2      3
A         0.7    0.1    0.1    0.1
B         0.5    0.2    0.1    0.2
Class A has three times as many insureds as Class B.
A randomly selected risk from the portfolio generates 1 claim over the most recent policy period.
Determine the Bayesian analysis estimate of the claims frequency rate for the observed risk.
A. Less than 0.72
B. At least 0.72 but less than 0.78
C. At least 0.78 but less than 0.84
D. At least 0.84 but less than 0.90
E. At least 0.90

5.59 (4B, 11/93, Q.17) (2 points) You are given the following:
Two risks have the following severity distribution.
                    Probability of Claim Amount For
Amount of Claim        Risk 1        Risk 2
100                     0.50          0.70
1,000                   0.30          0.20
20,000                  0.20          0.10
Risk 1 is twice as likely as Risk 2 of being observed.
A claim of 100 is observed, but the observed risk is unknown.
Determine the Bayesian analysis estimate of the expected value of a second claim amount from the
same risk.
A. Less than 3,500
B. At least 3,500, but less than 3,650
C. At least 3,650, but less than 3,800
D. At least 3,800, but less than 3,950
E. At least 3,950
5.60 (4B, 5/94, Q.8) (2 points) The aggregate loss distributions for two risks for one exposure
period are as follows:
            Aggregate Losses
Risk      $0       $50      $1,000
A        0.80      0.16      0.04
B        0.60      0.24      0.16
A risk is selected at random and observed to have $0 of losses in the first two exposure periods.
Determine the Bayesian analysis estimator of the expected value of the aggregate losses for the
same risk's third exposure period.
A. Less than $90
B. At least $90, but less than $95
C. At least $95, but less than $100
D. At least $100, but less than $105
E. At least $105

5.61 (4B, 5/95, Q.19) (2 points) The aggregate loss distributions for three risks for one exposure
period are as follows:
            Aggregate Losses
Risk      $0       $50      $2,000
A        0.80      0.16      0.04
B        0.60      0.24      0.16
C        0.40      0.32      0.28
A risk is selected at random and is observed to have $50 of aggregate losses in the first exposure
period. Determine the Bayesian analysis estimate of the expected value of the aggregate losses
for the same risk's second exposure period.
A. Less than $300
B. At least $300, but less than $325
C. At least $325, but less than $350
D. At least $350, but less than $375
E. At least $375
5.62 (4B, 11/95, Q.18 & Course 4 Sample Exam 2000, Q. 11) (3 points)
You are given the following:

A portfolio consists of 150 independent risks.


100 of the risks each have a policy with a $100,000 maximum covered loss, and
50 of the risks each have a policy with a $1,000,000 maximum covered loss.
The risks have identical claim count distributions.

Prior to censoring by maximum covered losses, the claim size distribution for each risk is as
follows:
Claim Size
$10,000
$50,000
$100,000
$1,000,000

Probability
1/2
1/4
1/5
1/20

A claims report is available which shows actual claim sizes incurred for each policy after
censoring by maximum covered losses, but does not identify the maximum covered loss
associated with each policy.
The claims report shows exactly three claims for a policy selected at random.
Two of the claims are $100,000, but the amount of the third is illegible.
What is the expected value of this illegible number?
A. Less than $45,000
B. At least $45,000, but less than $50,000
C. At least $50,000, but less than $55,000
D. At least $55,000, but less than $60,000
E. At least $60,000

5.63 (4B, 5/96, Q.5) (2 points) You are given the following:

A portfolio of independent risks is divided into two classes.

Each class contains the same number of risks.

For each risk in Class 1, the number of claims for a single exposure period
follows a Poisson distribution with mean 1.

For each risk in Class 2, the number of claims for a single exposure period
follows a Poisson distribution with mean 2.
A risk is selected at random from the portfolio. During the first exposure period, 2 claims are
observed for this risk. During the second exposure period, 0 claims are observed for this same risk.
Determine the posterior probability that the risk selected came from Class 1.
A. Less than 0.53
B. At least 0.53, but less than 0.58
C. At least 0.58 but less than 0.63
D. At least 0.63 but less than 0.68
E. At least 0.68
5.64 (4B, 11/96, Q.12) (3 points) You are given the following:

75% of claims are of Type A and the other 25% of claims are of Type B.
Type A claim sizes follow a normal distribution with mean 3,000 and variance 1,000,000.
Type B claim sizes follow a normal distribution with mean 4,000 and variance 1,000,000.
A claim file exists for each of the claims, and one of them is randomly selected. The claim file
selected is incomplete and indicates only that its associated claim size is greater than 5,000.
Determine the posterior probability that a Type A claim was selected.
A. Less than 0.15
B. At least 0.15, but less than 0.25
C. At least 0.25, but less than 0.35
D. At least 0.35, but less than 0.45
E. At least 0.45

5.65 (4B, 5/97, Q.11) (2 points) You are given the following:

A portfolio of independent risks is divided into three classes.


Each class contains the same number of risks.

For all of the risks in Class 1, claim sizes follow a uniform distribution on the
interval from 0 to 400.

For all of the risks in Class 2, claim sizes follow a uniform distribution on the
interval from 0 to 600.

For all of the risks in Class 3, claim sizes follow a uniform distribution on the
interval from 0 to 800.
A risk is selected at random from the portfolio. The first claim observed for this risk is 340.
Determine the Bayesian analysis estimate of the expected value of the second claim observed for
this same risk.
A. Less than 270
B. At least 270, but less than 290
C. At least 290, but less than 310
D. At least 310, but less than 330
E. At least 330
5.66 (4B, 5/98, Q.24) (3 points) You are given the following:

A portfolio consists of 100 independent risks.

25 of the risks have a policy with a $5,000 maximum covered loss, 25 of the risks have
a policy with a $10,000 maximum covered loss, and 50 of the risks have a policy with a
$20,000 maximum covered loss.

The risks have identical claim count distributions.

Prior to censoring by maximum covered losses, claim sizes for each risk follow a
Pareto distribution, with parameters θ = 5,000 and α = 2.

A claims report is available which shows the number of claims in various claim size
ranges for each policy after censoring by maximum covered loss, but does not identify
the maximum covered loss associated with each policy.
The claims report shows exactly one claim for a policy selected at random. This claim falls in the claim
size range of $9,000 to $11,000.
Determine the probability that this policy has a $10,000 maximum covered loss.
A. Less than 0.35
B. At least 0.35, but less than 0.55
C. At least 0.55, but less than 0.75
D. At least 0.75, but less than 0.95
E. At least 0.95

5.67 (4B, 5/99, Q.16) (2 points) You are given the following:
The number of claims per year for Risk A follows a Poisson distribution with mean m .
The number of claims per year for Risk B follows a Poisson distribution with mean m + 1.
The probability of selecting Risk A is equal to the probability of selecting Risk B.
One of the risks is randomly selected, and zero claims are observed for this risk during one year.
Determine the posterior probability that the selected risk is Risk A.
A. Less than 0.3
B. At least 0.3, but less than 0.5
C. At least 0.5, but less than 0.7
D. At least 0.7, but less than 0.9
E. At least 0.9
5.68 (4B, 11/99, Q.28) (2 points) You are given the following:
The number of claims per year for Risk A follows a Poisson distribution with mean m.
The number of claims per year for Risk B follows a Poisson distribution with mean 2m.
The probability of selecting Risk A is equal to the probability of selecting Risk B.
One of the risks is randomly selected, and zero claims are observed for this risk during one year.
Determine the posterior probability that the selected risk will have at least one claim during the next
year.
A. (1 - e^-m) / (1 + e^-m)
B. (1 - e^-3m) / (1 + e^-m)
C. 1 - e^-m
D. 1 - e^-2m
E. 1 - e^-2m - e^-4m

5.69 (4, 5/00, Q.7) (2.5 points) You are given the following information about two classes of risks:
(i) Risks in Class A have a Poisson claim count distribution with a mean of 1.0 per year.
(ii) Risks in Class B have a Poisson claim count distribution with a mean of 3.0 per year.
(iii) Risks in Class A have an exponential severity distribution with a mean of 1.0.
(iv) Risks in Class B have an exponential severity distribution with a mean of 3.0.
(v) Each class has the same number of risks.
(vi) Within each class, severities and claim counts are independent.
A risk is randomly selected and observed to have two claims during one year.
The observed claim amounts were 1.0 and 3.0.
Calculate the posterior expected value of the aggregate loss for this risk during the next year.
(A) Less than 2.0
(B) At least 2.0, but less than 4.0
(C) At least 4.0, but less than 6.0
(D) At least 6.0, but less than 8.0
(E) At least 8.0
5.70 (2 points) In the previous question, 4, 5/00, Q.7, change the observation to:
the observed two claim amounts were 1.0, and at least 3.0.
Calculate the posterior expected value of the aggregate loss for this risk during the next year.

5.71 (4, 5/00, Q.22) (2.5 points) You are given:
(i) A portfolio of independent risks is divided into two classes, Class A and Class B.
(ii) There are twice as many risks in Class A as in Class B.
(iii) The number of claims for each insured during a single year follows a Bernoulli distribution.
(iv) Classes A and B have claim size distributions as follows:

Claim Size    Class A    Class B
50,000         0.60       0.36
100,000        0.40       0.64
(v) The expected number of claims per year is 0.22 for Class A and 0.11 for Class B.
One insured is chosen at random. The insured's loss for two years combined is 100,000.
Calculate the probability that the selected insured belongs to Class A.
(A) 0.55
(B) 0.57
(C) 0.67
(D) 0.71
(E) 0.73
5.72 (4, 11/00, Q.28) (2.5 points) Prior to observing any claims, you believed that claim sizes
followed a Pareto distribution with parameters θ = 10 and α = 1, 2 or 3, with each value being
equally likely. You then observe one claim of 20 for a randomly selected risk. Determine the
posterior probability that the next claim for this risk will be greater than 30.
(A) 0.06
(B) 0.11
(C) 0.15
(D) 0.19
(E) 0.25
5.73 (3 points) In 4, 11/00, Q.28, you instead observe for a randomly selected risk two claims, one
of size 20 and the other of size 40, not necessarily in that order.
Determine the posterior probability that the next claim for this risk will be greater than 30.
(A) 18%
(B) 20%
(C) 22%
(D) 24%
(E) 26%
5.74 (2 points) In 4, 11/00, Q.28, you instead observe for a randomly selected risk one claim of
size 20 or more.
Determine the posterior probability that the next claim for this risk will be greater than 30.
(A) 13%
(B) 15%
(C) 17%
(D) 19%
(E) 21%

5.75 (4, 11/02, Q.39 & 2009 Sample Q. 55) (2.5 points) You are given:
         Number of           Claim Count Probabilities
Class    Insureds        0      1      2      3      4
1          3000         1/3    1/3    1/3     0      0
2          2000          0     1/6    2/3    1/6     0
3          1000          0      0     1/6    2/3    1/6
A randomly selected insured has one claim in Year 1.
Determine the expected number of claims in Year 2 for that insured.
(A) 1.00
(B) 1.25
(C) 1.33
(D) 1.67
(E) 1.75

5.76 (CAS3, 11/03, Q.12) (2.5 points) A driver is selected at random. If the driver is a "good"
driver, he is from a Poisson population with a mean of 1 claim per year. If the driver is a "bad" driver,
he is from a Poisson population with a mean of 5 claims per year. There is equal probability that the
driver is either a "good" driver or a "bad" driver. If the driver had 3 claims last year, calculate the
probability that the driver is a "good" driver.
A. Less than 0.325
B. At least 0.325, but less than 0.375
C. At least 0.375, but less than 0.425
D. At least 0.425, but less than 0.475
E. At least 0.475
5.77 (CAS3, 11/03, Q.13) (2.5 points)
The Allerton Insurance Company insures 3 indistinguishable populations.
The claims frequency of each insured follows a Poisson process. Given:
Population     Expected time       Probability of      Claim
(class)        between claims      being in class      cost
I                12 months              1/3            1,000
II               15 months              1/3            1,000
III              18 months              1/3            1,000
Calculate the expected loss in year 2 for an insured that had no claims in year 1.
A. Less than 810
B. At least 810, but less than 910
C. At least 910, but less than 1,010
D. At least 1,010, but less than 1,110
E. At least 1,110
5.78 (4, 11/03, Q.14 & 2009 Sample Q.11) (2.5 points) You are given:
(i) Losses on a company's insurance policies follow a Pareto distribution with probability
density function: f(x | θ) = θ / (x + θ)^2, 0 < x < ∞.
(ii) For half of the company's policies θ = 1, while for the other half θ = 3.
For a randomly selected policy, losses in Year 1 were 5.
Determine the posterior probability that losses for this policy in Year 2 will exceed 8.
(A) 0.11
(B) 0.15
(C) 0.19
(D) 0.21
(E) 0.27

5.79 (4, 11/03, Q.39 & 2009 Sample Q.29) (2.5 points) You are given:
(i) Each risk has at most one claim each year.
(ii)
Type of Risk    Prior Probability    Annual Claim Probability
I                     0.7                     0.1
II                    0.2                     0.2
III                   0.1                     0.4
One randomly chosen risk has three claims during Years 1-6.
Determine the posterior probability of a claim for this risk in Year 7.
(A) 0.22
(B) 0.28
(C) 0.33
(D) 0.40
(E) 0.46
5.80 (4, 11/04, Q.5 & 2009 Sample Q.136) (2.5 points) You are given:
(i) Two classes of policyholders have the following severity distributions:

Claim Amount     Probability of Claim      Probability of Claim
                 Amount for Class 1        Amount for Class 2
250                      0.5                       0.7
2,500                    0.3                       0.2
60,000                   0.2                       0.1
(ii) Class 1 has twice as many claims as Class 2.
A claim of 250 is observed.
Determine the Bayesian estimate of the expected value of a second claim from the same
policyholder.
(A) Less than 10,200
(B) At least 10,200, but less than 10,400
(C) At least 10,400, but less than 10,600
(D) At least 10,600, but less than 10,800
(E) At least 10,800
5.81 (4, 5/05, Q.35 & 2009 Sample Q.203) (2.9 points) You are given:
(i) The annual number of claims on a given policy has the geometric distribution with
parameter β.
(ii) One-third of the policies have β = 2, and the remaining two-thirds have β = 5.
A randomly selected policy had two claims in Year 1.
Calculate the Bayesian expected number of claims for the selected policy in Year 2.
(A) 3.4
(B) 3.6
(C) 3.8
(D) 4.0
(E) 4.2

5.82 (4, 11/05, Q.15 & 2009 Sample Q.226) (2.9 points)
For a particular policy, the conditional probability of the annual number of claims given Θ = θ, and the
probability distribution of Θ are as follows:

Number of Claims     0       1       2
Probability         2θ       θ      1 - 3θ

θ               0.10    0.30
Probability     0.80    0.20
One claim was observed in Year 1.
Calculate the Bayesian estimate of the expected number of claims for Year 2.
(A) Less than 1.1
(B) At least 1.1, but less than 1.2
(C) At least 1.2, but less than 1.3
(D) At least 1.3, but less than 1.4
(E) At least 1.4
5.83 (CAS3, 5/06, Q.30) (2.5 points) Claim counts for each policyholder are independent and
follow a common Negative Binomial distribution. A priori, the parameters for this distribution are
(r, β) = (2, 2) or (r, β) = (4, 1). Each parameter set is considered equally likely.
Policy files are sampled at random. The first two files sampled do not contain any claims.
The third policy file contains a single claim.
Based on this information, calculate the probability that (r, ) = (2, 2).
A. Less than 0.30
B. At least 0.30, but less than 0.45
C. At least 0.45, but less than 0.60
D. At least 0.60, but less than 0.75
E. At least 0.75
5.84 (4, 11/06, Q.16 & 2009 Sample Q.260) (2.9 points) You are given:
(i) Claim sizes follow an exponential distribution with mean θ.
(ii) For 80% of the policies, θ = 8.
(iii) For 20% of the policies, θ = 2.
A randomly selected policy had one claim in Year 1 of size 5.
Calculate the Bayesian expected claim size for this policy in Year 2.
(A) Less than 5.8
(B) At least 5.8, but less than 6.2
(C) At least 6.2, but less than 6.6
(D) At least 6.6, but less than 7.0
(E) At least 7.0

Solutions to Problems:
5.1. C. (20%)(60%) + (30%)(25%) + (40%)(15%) = 25.5%.
Comment: The chance of observing no claim is: 1 - 0.255 = 0.745.
5.2. D. P(Type A | no claim) = P(no claim | Type A)P(Type A) / P (no claim) = (0.8)(0.6) / 0.745 =
64.43%.
5.3. B. (0.7)(0.25) / 0.745 = 23.49%.
5.4. A. (0.6)(0.15) / 0.745 = 12.08%.
5.5. C.
Type of    A Priori Chance     Chance of the    Probability    Posterior Chance     Mean
Risk       of This Type        Observation      Weight         of This Type         Annual Freq.
A              0.60                0.8            0.480             64.43%              0.20
B              0.25                0.7            0.175             23.49%              0.30
C              0.15                0.6            0.090             12.08%              0.40
Overall                                           0.745            100.00%             24.77%

5.6. D.
Type of    A Priori Chance     Chance of the    Probability    Posterior Chance     Mean
Risk       of This Type        Observation      Weight         of This Type         Annual Freq.
A              0.60                0.2            0.120             47.06%              0.20
B              0.25                0.3            0.075             29.41%              0.30
C              0.15                0.4            0.060             23.53%              0.40
Overall                                           0.255            100.00%             27.65%

5.7. D. For example, if one has a risk of Type B, the chance of observing 2 claims in 5 years is
given by (a Binomial Distribution): (10)(0.3^2)(0.7^3) = 0.3087.

Type of    A Priori Chance     Chance of the    Probability    Posterior Chance     Mean
Risk       of This Type        Observation      Weight         of This Type         Annual Freq.
A              0.60               0.2048           0.123            48.78%              0.20
B              0.25               0.3087           0.077            30.64%              0.30
C              0.15               0.3456           0.052            20.58%              0.40
Overall                                            0.252           100.00%             27.18%

5.8. D.
Type of    A Priori Chance     Chance of the    Probability    Posterior Chance
Driver     of This Type        Observation      Weight         of This Type
Youth          0.090                4%            0.0036            28.35%
Adult          0.910                1%            0.0091            71.65%
Overall                                           0.013            100.00%

5.9. B. The mean claim frequency for youthful drivers is: (1)(10%) + (2)(4%) + (3)(1%) = 21%.
The mean claim frequency for adult drivers is: (1)(4%) + (2)(1%) = 6%.
(28.35%)(21%) + (71.65%)(6%) = 10.25%.

Type of    A Priori Chance     Chance of the    Probability    Posterior Chance     Mean
Driver     of This Type        Observation      Weight         of This Type         Annual Freq.
Youth          0.090                4%            0.0036            28.35%             0.210
Adult          0.910                1%            0.0091            71.65%             0.060
Overall                                           0.013            100.00%             0.103

Comment: Assumes that if the driver were youthful in the prior year, he will also be youthful in the
future period. On the exam, do not worry about such possible real world concerns.

5.10. B.
Type of    A Priori Chance     Chance of the    Probability    Posterior Chance     Mean Claim
Risk       of This Type        Observation      Weight         of This Type         from Risk
Low            0.8000              0.2430          0.1944           0.6879             1100
High           0.2000              0.4410          0.0882           0.3121             1300
Overall                                            0.283            1.000              1162
5.11. A. Note that over 5 years one gets a Poisson with 5 times the mean for a single year.
For the Poisson with mean λ, the chance of n accidents is e^-λ λ^n / n!.
Therefore the chance of a single accident is λ exp(-λ).
The chance of a single accident over the 5 years is therefore, for each type of driver:
Good λ = 0.25: 0.195, Bad λ = 0.5: 0.303, Ugly λ = 1: 0.368.
Therefore the Probability Weights are: (0.195)(0.6), (0.303)(0.3), (0.368)(0.1).
Taking a weighted average of the claim frequencies of each type gives 9.1%.

Type of    A Priori Chance     Chance of the    Probability    Posterior Chance     Mean
Driver     of This Type        Observation      Weight         of This Type         Annual Freq.
Good            0.6                0.195            0.117            47.8%              0.05
Bad             0.3                0.303            0.091            37.1%              0.10
Ugly            0.1                0.368            0.037            15.0%              0.20
Overall                                             0.245           100.0%              9.1%
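
As a numerical check of this solution, here is a minimal Python sketch; it is my own illustration and the names are hypothetical, not part of the guide.

```python
# Minimal sketch of the Bayesian calculation in solution 5.11.
# Five years of a Poisson with annual mean lambda is Poisson with mean 5*lambda;
# the likelihood of exactly one claim is (5*lambda) * exp(-5*lambda).
import math

types = {  # prior probability and annual Poisson mean for each driver type
    "Good": (0.6, 0.05),
    "Bad":  (0.3, 0.10),
    "Ugly": (0.1, 0.20),
}

weights = {}
for name, (prior, lam) in types.items():
    mean5 = 5 * lam
    likelihood = mean5 * math.exp(-mean5)   # P(1 claim in 5 years)
    weights[name] = prior * likelihood

total = sum(weights.values())
posterior = {name: w / total for name, w in weights.items()}
estimate = sum(posterior[name] * types[name][1] for name in types)
print(estimate)   # approximately 0.091
```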

5.12. B. The posterior probability of the $25,000 claim having come from each type of driver is
proportional to the product of the a priori chance of having that type of driver, times the chance of
observing one claim during a year from that type of driver, times the chance of having a claim of size
25,000. (For example, the chance of a claim having come from a Good driver times the chance of
such a claim being of size 25,000 for a Good Driver.)
For example, for Good drivers (Poisson with mean 0.05) the chance of observing a single claim over
one year is (0.05)e^-0.05 = 0.04756. For the Pareto, f(x) = α θ^α (θ + x)^-(α + 1).
Thus for the Good drivers, with Pareto parameters α = 5 and θ = 10,000, given one has a claim the
chance it is of size 25,000 is f(25,000) = 5(10,000^5)/(35,000^6) = (5 x 10^-4)/(3.5^6) = 2.72 x 10^-7.
The chances of claims of size $25,000 are: 2.7, 7.6, and 20.0, each times 10^-7.
Thus the probability weights are proportional to: (0.04756)(2.7 x 10^-7)(60%), etc.
The average severities by Type of driver are θ/(α - 1): 2500, 3333, and 5000.
Therefore, the weighted average is $4120.

Type of   α    A Priori Chance   f(25000)    Chance of       Probability    Posterior Chance   Average
Driver         of This Type                  Observing       Weight         of This Type       Severity
                                             One Claim
Good      5        0.6           2.72e-7      0.04756         7.76e-9           12.7%           2500.00
Bad       4        0.3           7.62e-7      0.09048         2.07e-8           33.8%           3333.33
Ugly      3        0.1           2.00e-6      0.16375         3.27e-8           53.5%           5000.00
Overall                                                       6.12e-8          100.0%           4120

Comment: Note that with differing claim frequencies and severities for different types of risks one
has to take into account both when computing the chance of observing a given size claim.
A Good Driver is less likely to have produced a claim than a Bad Driver and in this example if a
Good Driver produces a claim it is less likely to be $25000 than a claim from a Bad Driver.

5.13. E. P(Observation | Risk Type) =
P(2 claims in 3 years | Risk Type) P(Severities = 5000 & 25000 | Risk Type & 2 claims).
Thus we need to compute both the probability of 2 claims in 3 years given a risk type, and the
probability, given a risk type and that we observe 2 claims in 3 years, that the claims are of sizes
$5000 and $25,000.
Over three years each of the risk types has a Poisson frequency with three times the annual mean.
For example, for Bad drivers the frequency over three years is Poisson with parameter (3)(10%) =
30%. For a Bad driver, the chance of observing 2 claims in 3 years is thus: (0.3^2)e^-0.3 / 2 = 0.03334.
The likelihood, given one has observed two claims, that they were of sizes 5,000 and 25,000 is the
product f(5000)f(25000). For example, for Bad drivers with Pareto parameters α = 4, θ = 10,000,
f(5000)f(25000) = {4(10,000^4)/(15,000^5)}{4(10,000^4)/(35,000^5)} = 4.0117 x 10^-11.
Thus given a Bad driver, the chance of the current observation is their product:
(0.03334)(4.0117 x 10^-11) = 1.337 x 10^-12.
Then the probability weight for Bad Drivers is the product of the chance of the observation given a
Bad Driver and the a priori chance of a Bad Driver:
(0.3)(1.337 x 10^-12) = (0.3)(0.03334)(4.0117 x 10^-11) = 4.01 x 10^-13.
Getting the probability weights for Good and Ugly drivers in a similar manner, one divides each
weight by the sum of the weights and computes posterior probabilities of: 4.2%, 24.5% and
71.3%. The average severities by type of driver are θ/(α - 1): 2500, 3333, and 5000.
Therefore, the weighted average is $4487.

Type of   α    A Priori     Chance of     f(5000)    f(25000)    Chance of      Probability   Posterior    Average
Driver         Chance of    2 Claims                             Observation    Weight        Chance       Severity
               This Type    in 3 Years
Good      5       0.6        0.00968      4.39e-5    2.72e-7      1.16e-13       6.94e-14       4.2%        2500.0
Bad       4       0.3        0.03334      5.27e-5    7.62e-7      1.34e-12       4.01e-13      24.5%        3333.3
Ugly      3       0.1        0.09879      5.93e-5    2.00e-6      1.17e-11       1.17e-12      71.3%        5000.0
Overall                                                                          1.64e-12      100.0%       4487
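
A minimal Python sketch of this combined frequency and severity likelihood reproduces the $4487 estimate. It is my own illustration; the function and variable names are hypothetical, not from the guide.

```python
# Minimal sketch of the combined frequency and severity likelihood in solution 5.13.
# Likelihood of the observation = P(2 claims in 3 years) * f(5000) * f(25000),
# with Poisson frequency and (two-parameter) Pareto severity, theta = 10,000.
import math

theta = 10_000.0
types = {  # prior probability, annual Poisson mean, Pareto alpha
    "Good": (0.6, 0.05, 5.0),
    "Bad":  (0.3, 0.10, 4.0),
    "Ugly": (0.1, 0.20, 3.0),
}

def pareto_pdf(x, alpha):
    return alpha * theta**alpha / (theta + x)**(alpha + 1)

weights = {}
for name, (prior, lam, alpha) in types.items():
    mean3 = 3 * lam
    p_two_claims = mean3**2 * math.exp(-mean3) / 2          # Poisson P(N = 2)
    likelihood = p_two_claims * pareto_pdf(5000, alpha) * pareto_pdf(25000, alpha)
    weights[name] = prior * likelihood

total = sum(weights.values())
posterior = {name: w / total for name, w in weights.items()}
# Mean severity for a Pareto is theta / (alpha - 1).
estimate = sum(posterior[name] * theta / (types[name][2] - 1) for name in types)
print(estimate)   # approximately 4487
```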

5.14. D. f(2) = {r(r+1)/2} β^2/(1+β)^(r+2) = 6β^2/(1+β)^5. For β = 1, f(2) = 3/16. For β = 2, f(2) = 8/81.
P[Observation] = (1/2)(3/16) + (1/2)(8/81) = 0.1431.
By Bayes Theorem, P[Risk Type | Observation] = P[Obser. | Type] P[Type] / P[Observation].
P[Risk Type Two | Observation] = (8/81)(0.5)/0.1431 = 34.5%.
Comment: Similar to CAS3, 5/06, Q.30.

5.15. E.
Type of    A Priori Chance     Chance of the    Probability    Posterior Chance     Avg. Aggregate
Risk       of This Type        Observation      Weight         of This Type         Losses
A              0.333               0.030           0.0100            5.0%                 22
B              0.333               0.200           0.0667           33.3%                130
C              0.333               0.370           0.1233           61.7%                218
Overall                                            0.2000          100.0%                179

For example, the average aggregate loss for risk type B is: (0)(.50) + (100)(.30) + (500)(.20) =
130. The estimated future aggregate losses are: (5.0%)(22) + (33.3%)(130) + (61.7%)(218) =
179.
5.16. D. f(1) = 2q(1-q). For q = 0.10, f(1) = 0.18. For q = 0.20, f(1) = 0.32.
P[Observation] = (80%)(0.18) + (20%)(0.32) = 0.208.
By Bayes Theorem, P[Risk Type | Observation] = P[Obser. | Type] P[Type] / P[Observation].
P[Risk Type One | Observation] = (.18)(.8)/0.208 = 69.2%.
Comment: Similar to CAS3, 5/06, Q.30.
5.17. D.
State of    A Priori Chance     Chance of the    Probability    Posterior Chance     Pure
World       of This State       Observation      Weight         of This State        Premium
1               0.333               0.200           0.0667           33.3%              0.500
2               0.333               0.150           0.0500           25.0%              0.450
3               0.333               0.250           0.0833           41.7%              0.550
Overall                                             0.2000          100.0%              0.508

Comment: Remember to multiply by the given 30% claim frequency in order to convert the mean
severities into mean pure premiums.

5.18. C. If one has total losses of 50, then one has had a single claim of size 50.
The chance of this observation for type A is: (4/5)(3/4) = 60%. For type B it is: (3/5)(1/2) = 30%.
Then the Bayesian Analysis to determine the expected claim frequency is:
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Freq.
A      0.650      0.6000           0.3900         78.8%       1.200
B      0.350      0.3000           0.1050         21.2%       1.400
Overall                            0.4950         1.000       1.242
Comment: Note that the frequency and severity are not independent.
5.19. B. The different possible outcomes and their probabilities for a Risk from type A are:
Situation                     Probability   Total Losses
1 claim @ 50                  60.0%         50
1 claim @ 200                 20.0%         200
2 claims @ 50 each            7.2%          100
2 claims: 1 @ 50 & 1 @ 150    9.6%          200
2 claims @ 150 each           3.2%          300
Overall                       100.0%        106.0
For example, the chance of 2 claims with one of size 50 and one of size 150 is the chance of having
two claims times the chance, given two claims, that one will be 50 and the other 150:
(0.2){(2)(0.6)(0.4)} = 9.6%. In that case the total losses are 50 + 150 = 200.
5.20. D. The different possible outcomes and their probabilities for a Risk from type B are:
Situation                     Probability   Total Losses
1 claim @ 50                  30.0%         50
1 claim @ 200                 30.0%         200
2 claims @ 50 each            25.6%         100
2 claims: 1 @ 50 & 1 @ 150    12.8%         200
2 claims @ 150 each           1.6%          300
Overall                       100.0%        131
For example, the chance of 2 claims each of size 150 is the chance of having two claims times the
chance, given two claims, that both will be 150: (0.4){(0.2)(0.2)} = 1.6%.

5.21. B. If one has total losses of 50, then one has had a single claim of size 50.
The chance of this observation for type A is: (4/5)(3/4) = 60%. For type B it is: (3/5)(1/2) = 30%.
Then the Bayesian Analysis to determine the expected pure premium is:
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean P.P.
A      0.650      0.6000           0.3900         78.8%       106
B      0.350      0.3000           0.1050         21.2%       131
Overall                            0.4950         1.000       111.3

5.22. E. If one has total losses of 200, then either one has had a single claim of size 200, or one
had claims of 50 and 150. The chance of this observation for type A is:
(4/5)(1/4) + (1/5){(2)(0.6)(0.4)} = 29.6%. For type B it is: (3/5)(1/2) + (2/5){(2)(0.8)(0.2)} = 42.8%.
Then the Bayesian Analysis to determine the expected claim frequency is:
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Freq.
A      0.650      0.2960           0.1924         56.2%       1.200
B      0.350      0.4280           0.1498         43.8%       1.400
Overall                            0.3422         1.000       1.288

5.23. C. If one has total losses of 200, then either one has had a single claim of size 200, or one
had claims of 50 and 150. The chance of this observation for type A is:
(4/5)(1/4) + (1/5){(2)(0.6)(0.4)} = 29.6%. For type B it is: (3/5)(1/2) + (2/5){(2)(0.8)(0.2)} = 42.8%.
Then the Bayesian Analysis to determine the expected pure premium is:
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean P.P.
A      0.650      0.2960           0.1924         56.2%       106
B      0.350      0.4280           0.1498         43.8%       131
Overall                            0.3422         1.000       116.9

5.24. C. The mean pure premium for class A is: (0.4)(1700) = 680.
The mean pure premium for class B is: (0.8)(1500) = 1200.
The observation of 2000 over 3 years corresponds to either a single claim of size 2000 (a claim in
one year and none in the others) or two claims of size 1000 (a claim in two of the years and none in
the remaining year).
The chance of 1 claim over three years is: 3q(1-q)^2.
The chance of 2 claims over three years is: 3q^2(1-q).
If the risk is from Class A, then the chance of the observation is:
(0.7){3(0.4)(1-0.4)^2} + (0.3)(0.3){3(0.4)^2(1-0.4)} = 0.32832.
If the risk is from Class B, then the chance of the observation is:
(0.5){3(0.8)(1-0.8)^2} + (0.5)(0.5){3(0.8)^2(1-0.8)} = 0.144.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Pure Premium
A       0.7500     0.3283           0.2462         0.8724      680
B       0.2500     0.1440           0.0360         0.1276      1,200
Overall                             0.2822         1.0000      746.3
Comment: Similar to 4, 5/00, Q.22.

5.25. D. If one has a $100 deductible, then the chance of having a $5,000 loss reported is
20%/60% = 1/3, while the chance of having a $50,000 loss reported is 10%/60% = 1/6.
So if the deductible were $100, then the chance of the first two observed losses having sizes of
$50,000 and $5,000 is: (2)(1/6)(1/3) = 1/9. If instead one has a $1000 deductible, the chance of
having a $5,000 loss reported is 20%/30% = 2/3, while the chance of having a $50,000 loss
reported is 10%/30% = 1/3. So if the deductible were $1000, then the chance of the first two
observed losses having sizes of $50,000 and $5,000 is: (2)(1/3)(2/3) = 4/9.
If one has a $10,000 deductible, then a $5000 loss would not be reported, so there is no chance for
this observation. Now the a priori chance of having a $100 deductible is 45%.
The mean size of a reported loss when the deductible is $100 is:
{(30%)(500) + (20%)(5000) + (10%)(50,000)} / {30% + 20% + 10%} = 10,250.
The mean size of a reported loss when the deductible is $1000 is:
{(20%)(5000) + (10%)(50,000)} / {20% + 10%} = 20,000.
Putting all of the above together, the posterior estimate of a third loss from the same policy is:
Deductible Size   A Priori   Chance of Obs.   Prob. Weight   Posterior   Avg. Reported Loss Severity
100               0.450      0.11110          0.04999        0.27273     10,250
1000              0.300      0.44440          0.13332        0.72727     20,000
10000             0.250      0.00000          0.00000        0.00000     50,000
Overall                                       0.18332        1.00000     17,341

5.26. C. The chance of observing 3 claims for a Poisson is: e^(-λ) λ^3 / 3!.
Therefore the chance of observing 3 claims for a risk of type 1 is: e^(-0.4) (0.4^3) / 6 = 0.00715.
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Annual Freq.
1      70%        0.00715          0.005005       39.13%      0.4
2      20%        0.01976          0.003951       30.89%      0.6
3      10%        0.03834          0.003834       29.98%      0.8
Overall                            0.012791       1.000       0.5817
5.27. A. The sum of 3 independent claims drawn from a single Exponential Distribution with
parameter θ is a Gamma Distribution with parameters α = 3 and θ. So if the risk is of type 1, the
distribution of the sum of 3 claims is Gamma with α = 3 and θ = 25. For type 2 the sum has
parameters α = 3 and θ = 40. For type 3 the sum has parameters α = 3 and θ = 50. Thus,
assuming one has 3 claims, the chance that they will add to $140 if the risk is of type 1 is the density
of a Gamma(3, 25) at 140: x^(α-1) e^(-x/θ) / {Γ(α) θ^α} = (0.04)^3 (140^2) e^(-(0.04)(140)) / Γ(3) = 0.0023193.
Similarly, the chance for type 2 is: (0.025)^3 (140^2) e^(-(0.025)(140)) / Γ(3) = 0.00462397.
The chance of the observation for type 3 is: (0.02)^3 (140^2) e^(-(0.02)(140)) / Γ(3) = 0.00476751.
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Severity
1      70%        0.00232          0.001624       53.67%      25
2      20%        0.00462          0.000925       30.57%      40
3      10%        0.00477          0.000477       15.75%      50
Overall                            0.003025       100.00%     33.52

5.28. A. The chance of observing n claims for a Poisson is e^(-λ) λ^n / n!. Therefore, for example, the
chance of observing 3 claims for a risk of type 1 is: e^(-0.4) (0.4^3) / 6 = 0.00715.
If one had three claims, then the sum of the claims is given by a Gamma with α = 3 and θ.
Thus, assuming one had 3 claims, the chance that they will add to $140 if the risk is of type 1 is the
density of Gamma(3, 25) at 140: x^(α-1) e^(-x/θ) / {Γ(α) θ^α} = (140^2) e^(-140/25) / {Γ(3) 25^3} = 0.0023193.
Thus if one has type 1, the chance of observing 3 claims totaling $140 is:
(0.00715)(0.0023193) = 0.00001658.
One can compute and then combine the other ways to get a total of $140 in loss for type 1:
Number of   Probability of this   Given this Number of Claims, the        Product
Claims      Number of Claims      Probability of $140 in Total Losses
0           0.67032               0                                        0
1           0.26813               0.0001479                                0.000039660
2           0.05363               0.0008283                                0.000044419
3           0.00715               0.0023193                                0.000016583
4           0.00072               0.0043294                                0.000003096
5           0.00006               0.0060611                                0.000000347
Sum         1.00000                                                        0.000104105
Thus for a risk of type 1 the likelihood of $140 in loss in a year is 0.000104.
Comment: I've ignored the possibility of more than 5 claims, since that adds very little to the total
likelihood in this case.
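A minimal Python sketch of this calculation, truncating at 5 claims just as the solution does (the tail contributes negligibly):

from math import exp, factorial

lam, theta = 0.4, 25.0        # type 1: Poisson mean and Exponential mean
total_loss = 140.0

def gamma_pdf(x, alpha, theta):
    # Gamma density with integer shape alpha; Gamma(alpha) = (alpha - 1)!
    return x**(alpha - 1) * exp(-x / theta) / (factorial(alpha - 1) * theta**alpha)

likelihood = 0.0
for n in range(1, 6):
    p_n = exp(-lam) * lam**n / factorial(n)        # Poisson probability of n claims
    # the sum of n independent Exponential(theta) claims is Gamma(n, theta)
    likelihood += p_n * gamma_pdf(total_loss, n, theta)

print(likelihood)   # about 0.000104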
5.29. C. For type two with average frequency of 60% and average severity of $40, the likelihood
of the observed total losses in a year being $140 is computed:
Number of   Probability of this   Given this Number of Claims, the        Product
Claims      Number of Claims      Probability of $140 in Total Losses
0           0.54881               0                                        0
1           0.32929               0.00075493                               0.00024859
2           0.09879               0.00264227                               0.00026102
3           0.01976               0.00462397                               0.00009136
4           0.00296               0.00539464                               0.00001599
5           0.00036               0.00472031                               0.00000168
Sum         0.99996                                                        0.00061863
Comment: We have ignored the possibility of more than 5 claims, since that adds very little to the
total likelihood in this case.

5.30. E. For type 3 with average frequency of 80% and average severity of $50, the likelihood of
the observed total losses in a year being $140 is computed:
Number of   Probability of this   Given this Number of Claims, the        Product
Claims      Number of Claims      Probability of $140 in Total Losses
0           0.44933               0                                        0
1           0.35946               0.00121620                               0.00043718
2           0.14379               0.00340536                               0.00048964
3           0.03834               0.00476751                               0.00018280
4           0.00767               0.00444967                               0.00003412
5           0.00123               0.00311477                               0.00000382
Sum         0.99982                                                        0.00114756
5.31. C. One can use the likelihoods computed in the prior three questions in the usual manner to
get posterior probabilities for each type:
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Annual Pure Premium
1      70%        0.00010410       0.00007287     23.40%      10
2      20%        0.00061860       0.00012372     39.73%      24
3      10%        0.00114800       0.00011480     36.87%      40
Overall                            0.00031139     100.00%     26.6
Using the posterior probabilities as weights, the estimated future pure premium is $26.6.
5.32. D. The balance equations for the stationary distribution are:
0.7π1 + 0.15π2 + 0.2π3 = π1.
0.2π1 + 0.8π2 + 0.3π3 = π2.
0.1π1 + 0.05π2 + 0.5π3 = π3.
Also π1 + π2 + π3 = 1.
Eliminating π3 from the first two equations (multiply the first by 3 and the second by 2, then subtract):
1.7π1 - 1.15π2 = 3π1 - 2π2. Thus π2 = 1.529π1.
Substituting into the third equation: π3 = {0.1π1 + 0.05(1.529)π1}/0.5 = 0.353π1.
Therefore, substituting into the constraint equation: (1 + 1.529 + 0.353)π1 = 1.
Thus π1 = 0.347, π2 = 0.531, and π3 = 0.122.
Therefore, the mean annual aggregate losses = (0.347)(25) + (0.531)(50) + (0.122)(100) = 47.4.
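A minimal Python sketch of this solution, assuming the transition matrix implied by the balance equations (rows are the current state, columns the next state):

import numpy as np

P = np.array([[0.70, 0.20, 0.10],
              [0.15, 0.80, 0.05],
              [0.20, 0.30, 0.50]])

# Solve pi P = pi together with the constraint sum(pi) = 1.
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

print(pi.round(3))                        # about [0.347, 0.531, 0.122]
print(round(pi @ [25, 50, 100], 1))       # mean annual aggregate losses, about 47.4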

5.33. C. S(x) = e^(-x/θ), with θ varying by type of year.
Good: S(120) = exp(-120/25) = 0.82%. Typical: S(120) = exp(-120/50) = 9.07%.
Poor: S(120) = exp(-120/100) = 30.12%.
From the previous solution, π1 = 0.347, π2 = 0.531, and π3 = 0.122.
Therefore, the probability that the annual aggregate losses are greater than 120 is:
(0.347)(0.82%) + (0.531)(9.07%) + (0.122)(30.12%) = 8.8%.
5.34. C. If this year is Good, there is a 70% chance next year is Good, 20% chance next year is
Typical, and 10% chance next year is Poor.
Therefore, E[Y | X is Good] = (.7)(25) + (.2)(50) +(.1)(100) = 37.5.
5.35. E. (.15)(25) + (.8)(50) + (.05)(100) = 48.75.
5.36. D. (.2)(25) + (.3)(50) + (.5)(100) = 70.
Comment: Note that the insurer's average annual loss could be computed using the probabilities of
X being of a certain type and E[Y | Type of X]:
(34.7%)(37.5) + (53.1%)(48.75) + (12.2%)(70) = 47.4, which matches the solution to a previous
question.
5.37. A. E[XY | Type of X] = E[X | Type of X] E[Y | Type of X].
E[XY | X is Good] = (25)(37.5) = 937.5.
Type of First Year   Probability   Mean for First Year   Expected Value for Second Year   E[XY]
Good                 34.7%         25                    37.5                             937.5
Typical              53.1%         50                    48.75                            2437.5
Poor                 12.2%         100                   70                               7000
Average                                                                                   2474
Comment: Uses the solutions to previous questions.
E[XY | Type of X] = Σ Σ xy Prob[X = x and Y = y | Type of X]
= Σ Σ xy Prob[X = x | Type of X] Prob[Y = y | Type of X]
= {Σ x Prob[X = x | Type of X]} {Σ y Prob[Y = y | Type of X]} = E[X | Type of X] E[Y | Type of X].

5.38. D. For a year chosen at random, the aggregate losses are distributed as: an Exponential
with mean 25 (Good) with 34.7% probability, an Exponential with mean 50 (Typical) with 53.1%
probability, and an Exponential with mean 100 (Poor) with 12.2% probability.
This is a 3-point mixture of Exponentials, with mean: (0.347)(25) + (0.531)(50) + (0.122)(100) = 47.425.
Its second moment is: (0.347){(2)(25^2)} + (0.531){(2)(50^2)} + (0.122){(2)(100^2)} = 5528.75.
Its variance is: 5528.75 - 47.425^2 = 3280.
Alternately, the variance of an Exponential Distribution is θ^2. Thus if X is Good, the process variance
is: 25^2 = 625. The expected value of the process variance of X is:
(34.7%)(25^2) + (53.1%)(50^2) + (12.2%)(100^2) = 2764.
Type      Probability   Process Variance   Mean    Square of Mean
Good      34.7%         625                25      625
Typical   53.1%         2500               50      2500
Poor      12.2%         10000              100     10000
Average                 2764.38            47.42   2764.38
The variance of the hypothetical means = 2764.38 - 47.42^2 = 516.
The total variance is the sum of the expected value of the process variance and the variance of the
hypothetical means = 2764 + 516 = 3280.
Comment: One uses results from previous solutions.
The EPV and VHM are discussed in a subsequent section.
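A minimal Python sketch checking this solution: for the mixture of Exponentials, the total variance equals EPV plus VHM.

probs  = [0.347, 0.531, 0.122]
thetas = [25.0, 50.0, 100.0]          # Exponential means for Good / Typical / Poor

mean       = sum(p * t for p, t in zip(probs, thetas))
second_mom = sum(p * 2 * t**2 for p, t in zip(probs, thetas))   # second moment of an Exponential is 2*theta^2
total_var  = second_mom - mean**2

epv = sum(p * t**2 for p, t in zip(probs, thetas))              # process variance of an Exponential is theta^2
vhm = sum(p * t**2 for p, t in zip(probs, thetas)) - mean**2    # here the mean squared equals the process variance

print(round(total_var), round(epv + vhm))    # both about 3280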
5.39. A. Cov[X, Y] = E[XY] - E[X]E[Y] = 2474 - (47.4)(47.4) = 227.
Corr[X, Y] = Cov[X, Y] / sqrt(Var[X] Var[Y]) = 227/3280 = 6.9%.
Comment: Difficult. Uses the solutions to previous questions. The losses in consecutive years are
positively correlated, since a Good Year is more likely to be followed by another Good Year and a
Poor Year is more likely to be followed by another Poor Year.

5.40. C. One applies Bayes Theorem in order to compute the chances that this year was Good,
Typical, or Poor, conditional on the observation of losses of 75 this year.
Prob[This Year is Good | X = 75] = Prob[X = 75 | Good] Prob[Good] / Prob[X = 75] =
(e^(-75/25)/25)(0.347) / {(e^(-75/25)/25)(0.347) + (e^(-75/50)/50)(0.531) + (e^(-75/100)/100)(0.122)} =
0.00069 / (0.00069 + 0.00237 + 0.00058) = 19.0%. Similarly,
Prob[This Year is Typical | X = 75] = 0.00237 / (0.00069 + 0.00237 + 0.00058) = 65.2%.
Prob[This Year is Poor | X = 75] = 0.00058 / (0.00069 + 0.00237 + 0.00058) = 15.8%.
If this year is Good, 37.5 are the expected losses next year.
If this year is Typical, 48.75 are the expected losses next year.
If this year is Poor, 70 are the expected losses next year. Thus the expected losses next year are:
(19.0%)(37.5) + (65.2%)(48.75) + (15.8%)(70) = 50.0.
Comment: Probably beyond what you will be asked on the exam.
This whole calculation could be arranged in the following spreadsheet:
Type      A Priori   Chance of Obs.   Prob. Weight   Posterior   Expected Losses Next Year
Good      34.7%      0.00199          0.00069        0.19001     37.50
Typical   53.1%      0.00446          0.00237        0.65154     48.75
Poor      12.2%      0.00472          0.00058        0.15845     70.00
Overall                               0.00364        1.00000     49.98

5.41. A. If the risk is of type 1, the severity distribution is Exponential with θ = 25 and density
e^(-x/θ)/θ = 0.04 e^(-0.04x). Thus the chance of the observation is:
{0.04 e^(-0.04(30))}{0.04 e^(-0.04(40))}{0.04 e^(-0.04(70))} = (0.04)^3 e^(-0.04(140)) = 2.37 x 10^-7.
For type 2 the chance of the observation is: (0.025)^3 e^(-0.025(140)) = 4.72 x 10^-7.
For type 3 the chance of the observation is: (0.02)^3 e^(-0.02(140)) = 4.86 x 10^-7.
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Severity
1      70%        2.367e-7         1.657e-7       53.67%      25
2      20%        4.718e-7         0.944e-7       30.57%      40
3      10%        4.865e-7         0.486e-7       15.76%      50
Overall                            3.087e-7       100.00%     33.5
Comment: One gets the same solution as the previous question. This is true in this case since each
risk's severity is from a Gamma Distribution with the same value of α. (The Exponential is a Gamma
for α = 1.) In the example in the text of this section where this was not the case, the two severity
examples produced different results from each other.

5.42. C. Given that m ≥ 1 claims have been incurred, the probability of observing 1 claim by year end
is the density at 1 of a Binomial Distribution with parameters 0.65 and m: m(0.65)(0.35^(m-1)).
# of Claims   A Priori      Chance of   Prob.    Posterior     # of Claims
Incurred      Probability   Obs.        Weight   Probability   Incurred
0             0.20          0.0000      0.0000   0.0000        0
1             0.20          0.6500      0.1300   0.4466        1
2             0.20          0.4550      0.0910   0.3126        2
3             0.20          0.2389      0.0478   0.1641        3
4             0.20          0.1115      0.0223   0.0766        4
Overall       1.0000                    0.2911   1.0000        1.871
Comment: Thus the number of claims remaining to be reported is estimated as:
1.871 - 1 = 0.871. See "Loss Development Using Credibility," by Eric Brosius.
In a similar manner one can compute the estimates for other possible observations:
# of Claims Reported   Estimated Number of   Estimated Number of
by Year End            Claims Incurred       Claims Not Yet Reported
0                      0.512                 0.512
1                      1.871                 0.871
2                      2.905                 0.905
3                      3.583                 0.583
4                      4.000                 0.000
As usual, the estimates using Bayes Analysis are in balance:
# of Claims Reported   A Priori Probability   Estimated Number of
by Year End            of the Observation     Claims Incurred
0                      0.3061                 0.512
1                      0.2911                 1.871
2                      0.2353                 2.905
3                      0.1318                 3.583
4                      0.0357                 4.000
Weighted Average                              2.000
The prior mean number of claims incurred of (0 + 1 + 2 + 3 + 4)/5 = 2 is equal to the weighted
average of the posterior estimates of the numbers of claims incurred, using weights equal to the a
priori probabilities of each possible observation.
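A minimal Python sketch of this Bayes calculation, using the assumptions shown in the first table: claims incurred m is a priori uniform on {0, ..., 4}, and each incurred claim is reported by year end with probability 0.65, independently.

from math import comb

p_report = 0.65
observed = 1                                   # claims reported by year end

weights = {}
for m in range(0, 5):
    if observed > m:
        like = 0.0
    else:                                      # Binomial(m, 0.65) density at the observed count
        like = comb(m, observed) * p_report**observed * (1 - p_report)**(m - observed)
    weights[m] = 0.20 * like                   # a priori probability 0.20 for each m

total = sum(weights.values())
posterior_mean = sum(m * w / total for m, w in weights.items())
print(round(posterior_mean, 3))                # about 1.871
print(round(posterior_mean - observed, 3))     # estimated claims not yet reported, about 0.871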

5.43. A. If the risk is from Class A, then the chance of the observation is:
(e^(-0.6) 0.6^3/3!)(6)(e^(-7/11)/11)(e^(-10/11)/11)(e^(-21/11)/11) = (0.01976)(0.0001425) = 0.000002816.
If the risk is from Class B, then the chance of the observation is:
(e^(-0.8) 0.8^3/3!)(6)(e^(-7/15)/15)(e^(-10/15)/15)(e^(-21/15)/15) = (0.03834)(0.0001411) = 0.000005411.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Frequency
A       0.7500     2.816e-6         2.112e-6       0.6096      0.6000
B       0.2500     5.411e-6         1.353e-6       0.3904      0.8000
Overall                             3.465e-6       1.0000      0.6781
Comment: I have used the entire observation, including the information on severity, in order to
estimate the probability that the risk is from Class A or B.
In general when doing Bayes Analysis, use all of the information given, no more and no less.
5.44. C. Using the posterior probabilities from the previous solution:
(0.6096)(11) + (0.3904)(15) = 12.562.
5.45. C. The mean pure premium for Class A is: (0.6)(11) = 6.6.
The mean pure premium for Class B is: (0.8)(15) = 12.0.
Using the posterior probabilities from a previous solution:
(0.6096)(6.6) + (0.3904)(12.0) = 8.708.
Comment: Similar to 4, 5/00, Q.7.
Note that (0.6781)(12.562) = 8.518 ≠ 8.708; the product of the posterior mean frequency and the
posterior mean severity does not equal the posterior mean pure premium.

5.46. B. Given that m ≥ 3 claims have been incurred, the chance of the observation is the density
at 3 of a Binomial Distribution with parameters 0.7 and m: {m!/(3!(m-3)!)} 0.7^3 0.3^(m-3).
The chance that m claims have been incurred is: e^(-4) 4^m/m!.
Thus by Bayes Theorem, the posterior probability of m, for m ≥ 3, is:
(e^(-4) 4^m/m!){m!/(3!(m-3)!)} 0.7^3 0.3^(m-3) / Σ_{m≥3} (e^(-4) 4^m/m!){m!/(3!(m-3)!)} 0.7^3 0.3^(m-3)
= {0.3^m 4^m/(m-3)!} / Σ_{m≥3} 0.3^m 4^m/(m-3)!
= {1.2^m/(m-3)!} / {1.2^3 Σ_{i=0 to ∞} 1.2^i/i!}
= 1.2^m/(m-3)! / {1.2^3 e^1.2} = e^(-1.2) 1.2^(m-3)/(m-3)!.
Thus the posterior mean is:
Σ_{m=3 to ∞} m e^(-1.2) 1.2^(m-3)/(m-3)! = Σ_{i=0 to ∞} (i+3) e^(-1.2) 1.2^i/i!
= Σ_{i=0 to ∞} i e^(-1.2) 1.2^i/i! + 3 Σ_{i=0 to ∞} e^(-1.2) 1.2^i/i!
= (mean of a Poisson with λ = 1.2) + (3)(sum of the densities of a Poisson with λ = 1.2) = 1.2 + 3 = 4.2.
Alternately, divide the original Poisson Process into two independent Poisson Processes: claims
reported by year end with mean (0.7)(4) = 2.8, and claims not reported by year end with mean
(0.3)(4) = 1.2. Since the two processes are independent, the expected number of claims not
reported is 1.2, regardless of the observation. Therefore, for 3 claims observed by year end, the
expected number of claims incurred is: 3 + 1.2 = 4.2.
Comment: See "Loss Development Using Credibility," by Eric Brosius.
The posterior distribution of m - 3 is Poisson with mean 1.2.
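A minimal Python check of this result, using the parameters in the solution (incurred claims Poisson with mean 4, each reported by year end with probability 0.7, 3 claims reported); the infinite sum is truncated where the tail is negligible.

from math import exp, factorial, comb

lam, p, observed = 4.0, 0.7, 3

num = den = 0.0
for m in range(observed, 60):
    prior = exp(-lam) * lam**m / factorial(m)         # Poisson prior on m
    like = comb(m, observed) * p**observed * (1 - p)**(m - observed)
    num += m * prior * like
    den += prior * like

print(round(num / den, 4))    # posterior mean number of claims incurred, 4.2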
5.47. C.
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean
A      0.50       0.50             0.2500         0.3571      35.000
B      0.50       0.90             0.4500         0.6429      23.000
Overall                            0.7000         1.0000      27.286

5.48. B. Using the prior distribution and the observation of both losses:
Type   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean
A      0.50       0.25             0.1250         0.7353      35.000
B      0.50       0.09             0.0450         0.2647      23.000
Overall                            0.1700         1.0000      31.824
Alternately, one can use the posterior distribution to the observation of the first claim as the prior
distribution to the observation of the second claim:
Type   Distribution Posterior   Chance of Obs.   Prob. Weight   Posterior   Mean
       to 1st Claim
A      0.3571                   0.50             0.1785         0.7353      35.000
B      0.6429                   0.10             0.0643         0.2647      23.000
Overall                                          0.2428         1.0000      31.823
Comment: Such a sequential approach always works for Bayesian Analysis.


5.49. D. Given that m ≥ 5 claims have been incurred, the chance of the observation is the density
at 5 of a Binomial Distribution with parameters 0.7 and m: {m!/(5!(m-5)!)} 0.7^5 0.3^(m-5).
The chance that m claims have been incurred is (a Negative Binomial density):
{1.6^m/2.6^(m+2)} (m+1)!/(m! 1!) = (0.1479)(0.6154^m)(m+1).
Thus by Bayes Theorem, the posterior probability of m, for m ≥ 5, is proportional to:
(0.1479)(0.6154^m)(m+1){m!/(5!(m-5)!)} 0.7^5 0.3^(m-5), which is proportional to:
(0.18462^m){(m+1)!/(m-5)!}.
Letting i = m - 5, then i = 0, 1, 2, 3, ..., and the density of i is proportional to:
(0.18462^i){(i+6)!/i!}. This is proportional to a Negative Binomial Distribution with
r = 7 and β/(1+β) = 0.18462, i.e. β = 0.226. Therefore, the posterior mean of i is: (7)(0.226) = 1.58.
Therefore, the posterior mean of m is: 5 + 1.58 = 6.58.
Comment: See "Loss Development Using Credibility," by Eric Brosius. While the number of risk
types is infinite, m = 0, 1, 2, ..., they are discretely distributed rather than continuous.
5.50. B. For risk 1, the mean is: (0.5)(250) + (0.3)(2500) + (0.2)(60000) = 12,875.
Risk   A Priori Probability   Chance of Obs.   Prob. Weight   Posterior   Mean
1      66.67%                 0.5              0.3333         58.82%      12,875
2      33.33%                 0.7              0.2333         41.18%      6,675
Overall                                        0.5667         100%        10,322
Comment: Setup taken from 4, 11/03, Q.23.

5.51. B. If for each year we have a Negative Binomial with parameters r and β, then for a sum of four
independent years we have a Negative Binomial with parameters 4r and β.
Over 4 years, Type I is Negative Binomial with r = 4 and β = 0.25,
Type II is Negative Binomial with r = 8 and β = 0.25,
and Type III is Negative Binomial with r = 8 and β = 0.50. f(1) = rβ/(1+β)^(r+1).
Type of Risk   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Annual Freq.
I              0.30       0.3277           0.09830        38.80%      0.250
II             0.50       0.2684           0.13422        52.98%      0.500
III            0.20       0.1040           0.02081        8.21%       1.000
Overall                                    0.25333        1.000       0.444
Comment: Similar to 4, 11/06, Q.2.

5.52. For the Pareto Distribution: f(x) = αθ^α/(θ+x)^(α+1) = 4θ^4/(θ+x)^5.
The density for the payments excess of a deductible of size d is:
f(x+d)/S(d) = {4θ^4/(θ+x+d)^5} / {θ/(θ+d)}^4 = 4(θ+d)^4/(θ+x+d)^5.
There are four combinations, equally likely a priori:
θ = 800 and d = 500, θ = 800 and d = 1000, θ = 1200 and d = 500, θ = 1200 and d = 1000.
The chances of the observation are:
{4(800+500)^4/(800+400+500)^5}{4(800+500)^4/(800+1500+500)^5} = 53.411 x 10^-9,
{4(800+1000)^4/(800+400+1000)^5}{4(800+1000)^4/(800+1500+1000)^5} = 87.421 x 10^-9,
{4(1200+500)^4/(1200+400+500)^5}{4(1200+500)^4/(1200+1500+500)^5} = 81.445 x 10^-9,
{4(1200+1000)^4/(1200+400+1000)^5}{4(1200+1000)^4/(1200+1500+1000)^5} = 106.568 x 10^-9.
Since the four combinations are equally likely a priori, the probability weights are:
53.411, 87.421, 81.445, and 106.568.
Thus the posterior probabilities are:
16.24% for θ = 800 and d = 500,
26.58% for θ = 800 and d = 1000,
24.77% for θ = 1200 and d = 500,
32.41% for θ = 1200 and d = 1000.
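A minimal Python sketch of this calculation; the two observed payments of 400 and 1500 are the values implied by the densities written out above.

def excess_density(payment, theta, d, alpha=4):
    # density of a payment in excess of deductible d, for a Pareto(alpha, theta) ground-up loss
    return alpha * (theta + d)**alpha / (theta + payment + d)**(alpha + 1)

combos = [(800, 500), (800, 1000), (1200, 500), (1200, 1000)]
payments = [400, 1500]

weights = []
for theta, d in combos:
    like = 1.0
    for x in payments:
        like *= excess_density(x, theta, d)
    weights.append(like)               # equal priors, so the weight is just the likelihood

total = sum(weights)
for (theta, d), w in zip(combos, weights):
    print(theta, d, round(100 * w / total, 2))   # 16.24, 26.58, 24.77, 32.41 (percent)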
5.53. D. The average pure premium for Class A is: (1/5){(2/3)(100) + (1/3)(200)} = 26.67.
The average pure premium for Class B is: (2/5){(1/2)(100) + (1/2)(200)} = 60.
Chance of the observation if the insured is from class A: (1/5)(2/3).
Thus the probability weight for class A is: (3/4)(1/5)(2/3) = 1/10.
Similarly, the probability weight for class B is: (1/4)(2/5)(1/2) = 1/20.
Thus the posterior distribution is: 2/3 and 1/3.
The estimated future pure premium for this insured is: (2/3)(26.67) + (1/3)(60) = 37.78.

5.54. D.
State of World   A Priori   Chance of Obs.   Prob. Weight   Posterior   Average Severity   Average Pure Premium
1                0.333      0.333            0.1111         33.3%       1.3333             0.667
2                0.333      0.500            0.1667         50.0%       1.5000             0.750
3                0.333      0.167            0.0556         16.7%       1.1667             0.583
Overall                                      0.3333         1.000                          0.694
Comment: Remember to multiply by the given 50% claim frequency in order to convert the mean
severities into mean pure premiums.
5.55. B.
Type of Driver   A Priori   Chance of Obs.   Prob. Weight   Posterior
Youth            0.150      30%              0.0450         26.09%
Adult            0.850      15%              0.1275         73.91%
Overall                                      0.172          100%

5.56. D. Prob[u = 2 | T = 0] is the density at 2 of a Binomial with q = 0 and m = 2, which is 0.
Prob[u = 2 | T = 1] is the density at 2 of a Binomial with q = 1/2 and m = 2, which is 1/4.
Prob[u = 2 | T = 2] is the density at 2 of a Binomial with q = 1 and m = 2, which is 1.
T      A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Frequency
0      25%        0.000            0.000          0.00%       0
1      50%        0.250            0.125          33.33%      1
2      25%        1.000            0.250          66.67%      2
Overall                            0.375          100.00%     1.667
Expected future frequency is: (0)(0) + (1/3)(1) + (2/3)(2) = 5/3.

5.57. A. The Normal Distribution has a density function of exp[-(x-μ)^2/(2σ^2)] / {σ sqrt(2π)}.
For example, for Type A the Normal density at 0.12 is: e^(-0.222)/{(0.03)(2.507)} = 10.65.
The probability weights are the chance of the observation (the probability density function) times
the a priori probability. The probability density function at 0.12 from either risk B or C is so small
(10^-12 and 10^-1320 respectively) that the posterior chance of Risk A is 100%.
Thus the Bayesian estimate is 0.10, the mean of Risk A.
Type of Risk   A Priori   Mean   Standard Deviation   Probability Density   Prob. Weight   Posterior   Mean for this Type
A              0.333      0.1    0.03                 1.065e+1              3.5494         100.0%      0.1
B              0.333      0.5    0.05                 2.288e-12             0.0000         0.0%        0.5
C              0.333      0.9    0.01                 0.000e+0              0.0000         0.0%        0.9
Overall                                                                     3.5494         1.000       0.1000
Comment: Assume that each of the three risks is a priori equally likely.
5.58. B. The posterior probabilities are the probability weights divided by their sum.
Type of Risk   A Priori   Chance of Obs.   Prob. Weight   Posterior   Avg. Claim Frequency
A              0.750      0.100            0.075          60.0%       0.6
B              0.250      0.200            0.050          40.0%       1.0
Overall                                    0.125          1.000       0.76
The posterior estimate is the product of the posterior probabilities and the means for each type of
risk: (60%)(0.6) + (40%)(1.0) = 0.76.
5.59. A.
Risk   A Priori Probability   Chance of Obs.   Prob. Weight   Posterior   Mean
1      0.666                  0.5              0.333          0.588       4350
2      0.333                  0.7              0.233          0.412       2270
Overall                                        0.566          1.000       3494
Comment: (0.333)(0.7) = 0.233. 0.233/0.566 = 0.412. (0.7)(100) + (0.2)(1000) + (0.1)(20000) = 2270.
(0.588)(4350) + (0.412)(2270) = 3494.

5.60. B.
Risk   A Priori Probability   Chance of Obs.   Prob. Weight   Posterior   Mean
A      0.5                    0.64             0.32           0.64        48
B      0.5                    0.36             0.18           0.36        172
Overall                                        0.50           1.00        92.64
The probability weight is the product of the a priori probability and the chance of the observation.
The posterior probability is the probability weight divided by the sum of the probability weights.
The posterior estimate is: (0.64)(48) + (0.36)(172) = 92.64.
5.61. E. For example, the average aggregate loss for risk type A is:
(0)(0.80) + (50)(0.16) + (2000)(0.04) = 88.
Type of Risk   A Priori   Chance of Obs.   Prob. Weight   Posterior   Average Aggregate Losses
A              0.333      0.160            0.053          22.2%       88.0
B              0.333      0.240            0.080          33.3%       332.0
C              0.333      0.320            0.107          44.4%       576.0
Overall                                    0.240          1.000       386.2

5.62. C. If one has a $1 million maximum covered loss, the chance of having a $100,000 payment
is 1/5 = 0.2. So if the maximum covered loss were $1 million, then the chance of observing two
claims payments of $100,000 is 0.2^2 = 0.04. If instead one has a $100,000 maximum covered loss,
the chance of having a $100,000 payment is the chance of having a total size of loss of at least 100,000,
which is 1/5 + 1/20 = 0.25. So if the maximum covered loss were $100,000, then the chance of
observing two claims payments of $100,000 is 0.25^2 = 0.0625. Now the a priori chance of having a
maximum covered loss of $100,000 is 2/3; since the claim count distributions are the same for both
types, the chance of a claim coming from a policy with maximum covered loss $100,000 is also 2/3.
The mean payment for a claim when the maximum covered loss is $100,000 is:
(1/2)(10000) + (1/4)(50000) + (1/5)(100,000) + (1/20)(100,000) = 42,500.
The mean payment for a claim when the maximum covered loss is $1 million is:
(1/2)(10000) + (1/4)(50000) + (1/5)(100,000) + (1/20)(1,000,000) = 87,500.
Therefore, the posterior estimate of a third claim from the same policy is:
Type of Risk (Max. Covered Loss)   A Priori   Chance of Obs.   Prob. Weight   Posterior   Average Claim Severity
1 ($100,000)                       0.667      0.06250          0.04167        0.75758     $42,500
2 ($1 million)                     0.333      0.04000          0.01333        0.24242     $87,500
Overall                                                        0.05500        1.00000     $53,409
Comment: It may take a moment to recognize that this is a question involving Bayesian Analysis;
we observe two claims payments from a single policy of unknown type and we wish to estimate
the size of another claims payment from the same policy. While this is a somewhat artificial example
(but then how many times in your career have you picked balls from urns?), this question tests
whether you can recognize and apply Bayes Analysis to general situations.

5.63. D. For the Poisson Distribution with mean λ, f(x) = e^(-λ) λ^x/x!. The chance of observing zero
claims is therefore f(0) = e^(-λ), while the chance of observing 2 claims is f(2) = e^(-λ) λ^2/2.
For the Poisson Distribution the number of claims observed over the first period is
independent of the number of claims observed over the second period. (The Poisson has a
constant claims intensity, such that how many claims are observed over any interval of time is
independent of how many claims are observed over any other disjoint interval of time.) Thus the
chance of the observation given λ is f(0)f(2) = e^(-2λ) λ^2/2.
For Class 1 with λ = 1 this is: e^(-2)/2 = 0.06767. For Class 2 with λ = 2 this is: 2e^(-4) = 0.03663.
The Bayesian Analysis proceeds as follows:
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Frequency
1       0.500      0.0677           0.0338         64.9%       1
2       0.500      0.0366           0.0183         35.1%       2
Overall                             0.0522         1.000       1.35

5.64. C. If Type A, the chance of the observation is 1 - Φ((5000-3000)/1000) = 1 - Φ(2) = 0.0228.
If Type B, then the mean is 4000 and the standard deviation is 1000, so that the chance of the
observation is 1 - Φ((5000-4000)/1000) = 1 - Φ(1) = 0.1587.
Type of Claim   A Priori   Chance of Obs.   Prob. Weight   Posterior
A               0.75       0.0228           0.0171         0.301
B               0.25       0.1587           0.0397         0.699
Sum                                         0.0568         1.000
The posterior chance of Type A is proportional to the product of its a priori chance and the chance of
the observation if Type A. The posterior probability is: 0.0171/0.0568 = 0.301.

5.65. B. The posterior probabilities are proportional to the product of the chance of the observation
given each class and the a priori probability of each class. Since the a priori probabilities of the
classes are all equal, the posterior probabilities are proportional to the chance of the observation
given each class. Thus, the posterior probabilities are proportional to the density functions at 340:
1/400, 1/600 and 1/800. Dividing by their sum, these produce posterior probabilities of:
6/13, 4/13 and 3/13. The means of the classes are 200, 300 and 400.
Thus the Bayesian analysis estimate of the expected value of a second claim from the same risk is:
(6/13)(200) + (4/13)(300) + (3/13)(400) = 3600/13 = 277.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean of this Class
1       0.3333     0.002500         0.000833       0.4615      200
2       0.3333     0.001667         0.000556       0.3077      300
3       0.3333     0.001250         0.000417       0.2308      400
Overall                             0.001806       1.000       276.9

5.66. C. F(x) = 1 - {5000/(5000+x)}^2.
The probability of the observation given a maximum covered loss of $5000 is 0.
The probability of the observation given a maximum covered loss of $10,000 is:
1 - F(9000) = 0.1276.
The probability of the observation given a maximum covered loss of $20,000 is:
F(11000) - F(9000) = (1 - 0.0977) - (1 - 0.1276) = 0.0299.
Maximum Covered Loss   A Priori   Chance of Obs.   Prob. Weight   Posterior
5000                   0.2500     0.0000           0.0000         0.0000
10000                  0.2500     0.1276           0.0319         0.6809
20000                  0.5000     0.0299           0.0149         0.3191
Overall                                            0.0469         1.0000

5.67. D. If Risk A, the chance of the observation is e^(-m).
If Risk B, the chance of the observation is e^(-(m+1)).
The posterior probabilities are proportional to the product of the a priori probabilities and the chance
of the observation. Given that A and B are equally likely a priori, the posterior probabilities are
proportional to e^(-m) and e^(-(m+1)).
Thus the posterior probability of Risk A is: e^(-m) / {e^(-m) + e^(-(m+1))} = e/(1+e) = 0.731.

5.68. B. The chance of observing zero claims if we have risk type A is e^(-m).
The chance of observing zero claims if we have risk type B is e^(-2m).
The a priori chances of risk types A and B are each 0.5.
Therefore, by Bayes theorem the posterior probability of Risk Type A is:
(0.5)(e^(-m)) / {(0.5)(e^(-m)) + (0.5)(e^(-2m))} = 1/(1+e^(-m)).
The posterior probability of Risk Type B is: (0.5)(e^(-2m)) / {(0.5)(e^(-m)) + (0.5)(e^(-2m))} = e^(-m)/(1+e^(-m)).
Thus the posterior chance of zero claims is:
e^(-m)/(1+e^(-m)) + e^(-2m) e^(-m)/(1+e^(-m)) = {e^(-m) + e^(-3m)}/(1+e^(-m)).
Thus the posterior chance of at least one claim is:
1 - {e^(-m) + e^(-3m)}/(1+e^(-m)) = (1 - e^(-3m)) / (1 + e^(-m)).
Comment: Some students may find this easier to do by just plugging in a value for m at the
beginning, such as m = 0.7, and proceeding numerically. In that case, compare your numerical
answer to the available choices. With m = 0.7:
Type of Risk   Mean of Poisson   A Priori   Chance of Obs.   Prob. Weight   Posterior   Chance of at Least 1 Claim
A              0.7               0.5000     0.4966           0.2483         0.6682      0.5034
B              1.4               0.5000     0.2466           0.1233         0.3318      0.7534
Overall                                                      0.3716         1.0000      0.586
(1 - e^(-3m))/(1 + e^(-m)) = (1 - e^(-2.1))/(1 + e^(-0.7)) = 0.8775/1.4966 = 0.586.

5.69. D. If the risk is from Class A, then the chance of the observation is:
(e^(-1) 1^2/2!)(2)(e^(-1)/1)(e^(-3)/1) = e^(-5) = 0.00674.
If the risk is from Class B, then the chance of the observation is:
(e^(-3) 3^2/2!)(2)(e^(-1/3)/3)(e^(-3/3)/3) = e^(-13/3) = 0.01312.
The mean pure premium for Class A is: (1)(1) = 1.
The mean pure premium for Class B is: (3)(3) = 9.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Pure Premium
A       0.5000     0.00674          0.00337        0.3392      1.0000
B       0.5000     0.01312          0.00656        0.6608      9.0000
Overall                             0.00993        1.0000      6.286
Comment: I have included a factor of two in the chance of observations in order to take into account
the two combinations of claim severities (either claim can be first). If the question had instead said
that the first claim was of size 1 and the second claim was of size 3, then one should leave out this
factor of two. You get the same answer to the question whether you include this factor of two or not.
In general, any factor that shows up on every row of the chance of the observation column drops
out when one computes the posterior distribution.
5.70. If we have two claims, then the chance they are of sizes 1.0 and at least 3.0 is:
2 f(1) S(3) = 2 (e^(-1/θ)/θ) e^(-3/θ).
If the risk is from Class A, then the chance of the observation is:
(e^(-1) 1^2/2!)(2)(e^(-1)/1)(e^(-3)) = e^(-5) = 0.00674.
If the risk is from Class B, then the chance of the observation is:
(e^(-3) 3^2/2!)(2)(e^(-1/3)/3)(e^(-3/3)) = 3 e^(-13/3) = 0.03937.
The mean pure premium for Class A is: (1)(1) = 1.
The mean pure premium for Class B is: (3)(3) = 9.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Pure Premium
A       0.5000     0.00674          0.00337        0.1461      1.0000
B       0.5000     0.03937          0.01969        0.8539      9.0000
Overall                             0.02305        1.0000      7.831

5.71. D. The observation corresponds to either a single claim of size 100,000 (either a claim in the
first year and none in the second year, or vice versa) or two claims of size 50,000 (one in each year).
If the risk is from Class A, then the chance of the observation is:
(2)(0.22)(0.78)(0.4) + (0.22)(0.22)(0.6)(0.6) = 0.1547.
If the risk is from Class B, then the chance of the observation is:
(2)(0.11)(0.89)(0.64) + (0.11)(0.11)(0.36)(0.36) = 0.1269.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Pure Premium
A       0.6667     0.1547           0.1031         0.709       15,400
B       0.3333     0.1269           0.0423         0.291       9,020
Overall                             0.1454         1.000       13,544
Comment: Bullet (v) could/should have been worded more clearly. The expected number of
claims per year for an individual risk in Class A is 0.22. Usually expected frequencies are intended
to be per individual or per exposure, unless stated otherwise.
If not, in this case they would have said something like "the total number of claims per year expected
from Class A."

5.72. C. f(x) = αθ^α/(x+θ)^(α+1) = α 10^α/(x+10)^(α+1). f(20) = (α/30)/3^α.
S(x) = {θ/(x+θ)}^α. S(30) = (10/40)^α = 1/4^α.
Type (α)   A Priori   f(20) = Chance of Obs.   Prob. Weight   Posterior   S(30)
1          0.3333     0.01111                  0.00370        50.00%      25.00%
2          0.3333     0.00741                  0.00247        33.33%      6.25%
3          0.3333     0.00370                  0.00123        16.67%      1.56%
Overall                                        0.00741        1.000       14.84%
Comment: The a priori probability that a claim will be greater than 30 is:
(1/3)(25%) + (1/3)(6.25%) + (1/3)(1.56%) = 10.94%. Since posterior to the observation there is
a greater chance that the Pareto Distribution is longer-tailed (α smaller), the estimate of S(30) has
increased. Since the survival function is not linear in alpha, one should not weight together the alphas:
(1/2)(1) + (1/3)(2) + (1/6)(3) = 1.667. This would result in an incorrect value for the posterior
estimate of S(30): 1/4^1.667 = 9.9% ≠ 14.84%.
5.73. B. f(x) = αθ^α/(x+θ)^(α+1) = α 10^α/(x+10)^(α+1). f(20) = (α/30)/3^α. f(40) = (α/50)/5^α.
The probability of the observation is: 2 f(20) f(40).
S(x) = {θ/(x+θ)}^α. S(30) = (10/40)^α = 1/4^α.
α    A Priori   f(20)     f(40)     Probability of Obs.   Prob. Weight   Posterior   S(30)
1    0.3333     0.01111   0.00400   0.0000889             0.00002963     76.53%      25.00%
2    0.3333     0.00741   0.00160   0.0000237             0.00000790     20.41%      6.25%
3    0.3333     0.00370   0.00048   0.0000036             0.00000119     3.06%       1.56%
Overall                                                   0.00003872     1.000       20.46%
Comment: Since it appears on every row, the factor of 2 in the probability of the observation does
not affect the posterior distribution.

5.74. D. S(x) = {θ/(x+θ)}^α. S(30) = (10/40)^α = 1/4^α.
The probability of the observation is: S(20) = (10/30)^α = 1/3^α.
α    A Priori   Probability of Obs.   Prob. Weight   Posterior   S(30)
1    0.3333     33.33%                0.11111        69.23%      25.00%
2    0.3333     11.11%                0.03704        23.08%      6.25%
3    0.3333     3.70%                 0.01235        7.69%       1.56%
Overall                               0.16049        100.00%     18.87%

5.75. B. Mean for class 1 is: (0 + 1 + 2)/3 = 1.
Mean for class 2 is: (1)(1/6) + (2)(2/3) + (3)(1/6) = 2.
Mean for class 3 is: (2)(1/6) + (3)(2/3) + (4)(1/6) = 3.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Frequency
1       0.5000     0.33333          0.16667        75.00%      1.00
2       0.3333     0.16667          0.05556        25.00%      2.00
3       0.1667     0.00000          0.00000        0.00%       3.00
Overall                             0.22222        1.000       1.25
Comment: While there is no reason why they could not have asked you to estimate the future using
Buhlmann Credibility, they did not. Buhlmann Credibility is a least squares linear approximation to
Bayesian Analysis. Therefore, Bayes Analysis is the default.
Use Bayes Analysis when both are available, unless: they use the words credibility, Buhlmann,
Buhlmann-Straub, semiparametric, empirical Bayes, etc., or it is one of the Conjugate Prior situations
in which Buhlmann = Bayes, so it does not matter which one you use.
5.76. A. Prob[3 claims | good] = (1^3) e^(-1)/3! = 0.0613.
Prob[3 claims | bad] = (5^3) e^(-5)/3! = 0.1404.
Since the two types of drivers are equally likely, the a priori probability of the observation of 3
claims is: (0.5)(0.0613) + (0.5)(0.1404) = 0.1009.
By Bayes Theorem, P[A | B] = P[B | A]P[A]/P[B]:
Prob[good | 3 claims] = Prob[good]Prob[3 claims | good]/Prob[3 claims] = (0.5)(0.0613)/0.1009 = 30.4%.
Comment: Prob[bad | 3 claims] = Prob[bad]Prob[3 claims | bad]/Prob[3 claims] =
(0.5)(0.1404)/0.1009 = 69.6% = 1 - 30.4%.
The estimated future claim frequency for this driver is: (30.4%)(1) + (69.6%)(5) = 3.78.

5.77. A. For Class I, λ = 12/12 = 1 per year. For Class II, λ = 12/15 = 0.8 per year.
For Class III, λ = 12/18 = 2/3 per year. Prob[0 claims | Class I] = e^(-1) = 0.3679.
Prob[0 claims | Class II] = e^(-0.8) = 0.4493. Prob[0 claims | Class III] = e^(-2/3) = 0.5134.
The a priori probability of the observation of 0 claims is:
(1/3)(0.3679) + (1/3)(0.4493) + (1/3)(0.5134) = 0.4435.
By Bayes Theorem, P[A | B] = P[B | A]P[A]/P[B]:
Prob[Class I | 0 claims] = Prob[Class I]Prob[0 claims | Class I]/Prob[0 claims] = (1/3)(0.3679)/0.4435 = 27.65%.
Prob[Class II | 0 claims] = (1/3)(0.4493)/0.4435 = 33.77%.
Prob[Class III | 0 claims] = (1/3)(0.5134)/0.4435 = 38.59%.
Therefore, the expected future frequency for this insured is:
(27.65%)(1) + (33.77%)(0.8) + (38.59%)(2/3) = 0.8039.
The expected loss in year 2 for this insured is: (1000)(0.8039) = 804.
5.78. D. This is a Pareto Distribution with α = 1 and S(x) = θ/(x+θ).
Type    A Priori   f(5)      Prob. Weight   Posterior   S(8)
θ = 1   0.5        0.02778   0.01389        0.37209     0.1111
θ = 3   0.5        0.04688   0.02344        0.62791     0.2727
Overall                      0.03733        1.000       0.2126

5.79. B. For a Binomial with m = 6, f(3) = 20q^3(1-q)^3. For q = 0.1, f(3) = 0.01458.
Type   A Priori Probability   Chance of Obs.   Prob. Weight   Posterior   Mean Frequency
I      0.7                    0.01458          0.01021        0.188       0.1
II     0.2                    0.08192          0.01638        0.302       0.2
III    0.1                    0.27648          0.02765        0.510       0.4
Overall                                        0.05424        1.000       0.283
Comment: Buhlmann Credibility is a linear approximation to Bayes Analysis; therefore, on the exam
the default is to use Bayes Analysis unless they say to use credibility.

5.80. B. Mean severity for Class 1: (0.5)(250) + (0.3)(2500) + (0.2)(60000) = 12,875.
Class   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Severity
1       0.6667     0.5000           0.3333         0.588       12,875
2       0.3333     0.7000           0.2333         0.412       6,675
Overall                             0.5667         1.000       10,322
Comment: Same setup as 4, 11/03, Q.23, which uses Buhlmann Credibility.


5.81. C. The chance of the observation is the density at two of a Geometric: β^2/(1+β)^3.
Beta   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean Frequency
2      0.3333     0.1481           0.0494         0.390       2
5      0.6667     0.1157           0.0772         0.610       5
Overall                            0.1265         1.000       3.829

5.82. A. Type I, θ = 0.1, has distribution: 20% @ 0, 10% @ 1, and 70% @ 2, with mean 1.5.
Type II, θ = 0.3, has distribution: 60% @ 0, 30% @ 1, and 10% @ 2, with mean 0.5.
Theta   A Priori   Chance of Obs.   Prob. Weight   Posterior   Mean
0.1     0.80       0.10             0.08           0.571       1.500
0.3     0.20       0.30             0.06           0.429       0.500
Overall                             0.14           1.000       1.071

5.83. E. f(0) = 1/(1+β)^r. For r = 2 and β = 2, f(0) = 1/9. For r = 4 and β = 1, f(0) = 1/16.
f(1) = rβ/(1+β)^(r+1). For r = 2 and β = 2, f(1) = 4/27. For r = 4 and β = 1, f(1) = 1/8.
Probability of the observation is: f(0)f(0)f(1).
Prob[Obs. | r = 2 and β = 2] = (1/9)(1/9)(4/27) = 0.0018290.
Prob[Obs. | r = 4 and β = 1] = (1/16)(1/16)(1/8) = 0.0004883.
P[Observation] = (1/2)(0.0018290) + (1/2)(0.0004883) = 0.001159.
By Bayes Theorem, P[Risk Type | Observation] = P[Obser. | Type] P[Type] / P[Observation].
P[Risk Type One | Observation] = (0.0018290)(0.5)/0.001159 = 78.9%.
Comment: P[Risk Type Two | Observation] = (0.0004883)(0.5)/0.001159 = 21.1%.

5.84. E. Prob[θ = 8 | observation] = (0.8 e^(-5/8)/8) / {(0.8 e^(-5/8)/8) + (0.2 e^(-5/2)/2)} = 0.867.
Prob[θ = 2 | observation] = (0.2 e^(-5/2)/2) / {(0.8 e^(-5/8)/8) + (0.2 e^(-5/2)/2)} = 0.133.
Posterior estimate of θ is: (0.867)(8) + (0.133)(2) = 7.20.


Section 6, Bayesian Analysis, with Continuous Risk Types


In the prior section Bayes Theorem was applied to situations where there were several distinct
types of risks. In this section, Bayes Theorem will be applied in a similar manner to situations in
which there are an infinite number of risk types, parameterized in some continuous manner. Where
summation was used in the discrete case, integration will be used in the continuous case.
Mahler's Guide to Conjugate Priors contains many more examples of Bayesian Analysis with
Continuous Risk Types.52
52 See the sections on Mixing Poissons, Gamma-Poisson, Beta-Bernoulli, Inverse Gamma - Exponential, and the Overview.
An Example of Mixing Bernoullis:
For example, assume:
In a large portfolio of risks, the number of claims for one policyholder during one
year follows a Bernoulli distribution with mean q.
The number of claims for one policyholder for one year is independent of the number
of claims for the policyholder for any other year.
The distribution of q within the portfolio has density function:
π(q) = 20.006 q^4, 0.6 ≤ q ≤ 0.8.
A policyholder is selected at random from the portfolio. He is observed to have two claims in three
years. What is the density of his posterior distribution function of q?
The chance of observing 2 claims in 3 years, given q, is from a Binomial Distribution with
parameters m = 3 and q: 3q^2(1-q) = 3q^2 - 3q^3.
By Bayes Theorem, Prob(q | observation) =
Prob(q) Prob(observation | q) / Prob(observation).
Therefore, the posterior probability density function of q is proportional to:
Prob(q) Prob(observation | q) = π(q)(3q^2 - 3q^3).
In order to compute the posterior distribution we need to divide by the integral of
π(q)(3q^2 - 3q^3) over the support of π(q), which is [0.6, 0.8]:
∫ from 0.6 to 0.8 of π(q)(3q^2 - 3q^3) dq = 20.006 ∫ from 0.6 to 0.8 of 3q^4(q^2 - q^3) dq = 60.018 ∫ from 0.6 to 0.8 of (q^6 - q^7) dq
= 60.018 {q^7/7 - q^8/8} evaluated from q = 0.6 to q = 0.8
= 60.018 {(0.02996 - 0.02097) - (0.00400 - 0.00210)} = 0.4254.
Thus the density of the posterior distribution of q is:
π(q)(3q^2 - 3q^3) / 0.4254 = (20.006 q^4)(3)(q^2 - q^3) / 0.4254 = 141.079 (q^6 - q^7), for 0.6 ≤ q ≤ 0.8.
Exercise: In this example, a policyholder is selected at random from the portfolio. He is observed to
have two claims in three years. What is his future expected annual claim frequency?
[Solution: Since for each insured q is the mean frequency, what we want is the mean of the posterior
distribution of q. As calculated above, for this observation the density of the posterior distribution of
q is: 141.079 (q^6 - q^7), for 0.6 ≤ q ≤ 0.8. Its mean is:
∫ from 0.6 to 0.8 of 141.079 (q^6 - q^7) q dq = 141.079 ∫ from 0.6 to 0.8 of (q^7 - q^8) dq
= 141.079 {q^8/8 - q^9/9} evaluated from q = 0.6 to q = 0.8
= 141.079 {(0.02097 - 0.01491) - (0.00210 - 0.00112)} = 0.72.]
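A minimal Python check of the mixed Bernoulli example above, using numerical integration rather than the closed-form antiderivatives:

from scipy.integrate import quad

prior = lambda q: 20.006 * q**4                    # pi(q) on [0.6, 0.8]
like  = lambda q: 3 * q**2 * (1 - q)               # Binomial(3, q) density at 2 claims

norm, _ = quad(lambda q: prior(q) * like(q), 0.6, 0.8)
posterior = lambda q: prior(q) * like(q) / norm    # posterior density of q

mean, _ = quad(lambda q: q * posterior(q), 0.6, 0.8)
print(round(norm, 4), round(mean, 2))              # about 0.4254 and 0.72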

Bayesian Interval Estimates:
By the use of Bayes Theorem one obtains an entire posterior distribution. Rather than just using the
mean of that posterior distribution in order to get a point estimate, one can use the density function of
the posterior distribution to estimate the posterior chance that the quantity of interest is in a given
interval.53 For the above example, one could compute the posterior probability that the future
expected frequency for this insured, q, is in, for example, the interval [0.75, 0.76].
The posterior chance that q is in the interval [0.75, 0.76] is the integral from 0.75 to 0.76 of the
posterior density:
∫ from 0.75 to 0.76 of 141.079 (q^6 - q^7) dq = 141.079 {q^7/7 - q^8/8} evaluated from q = 0.75 to q = 0.76
= (141.079) {(0.020922 - 0.013913) - (0.019069 - 0.012514)} = 6.4%.
53 Bayesian Interval Estimates can come up in the situations covered in Mahler's Guide to Conjugate Priors.

More generally, the same technique can be applied whenever one has a continuous prior
distribution of the quantity of interest.54 Then given an observation one applies Bayes Theorem to
get the posterior density, which is proportional to the product of the prior density and the chance of
the observation. One needs to divide by the integral of this product over the support of the prior
density, in order to get the posterior density.55
Then in order to find the posterior chance that the quantity of interest is in a given interval, one can
integrate the posterior density over that interval.56
Exercise: For each individual within a portfolio, the probability of a claim in a year is a Bernoulli with
mean q. The prior distribution of q within the portfolio is uniform on [0, 1].
An insured is picked at random from the portfolio, and one claim was observed in one year.
What is the posterior estimate that this insured has a q parameter less than 0.2?
[Solution: The prior density function of q is π(q) = 1 for 0 ≤ q ≤ 1. The chance of observing one claim
in one year, given q, is q. By Bayes Theorem, the posterior probability density function is
proportional to: π(q)q = q. In order to compute the posterior density we need to divide by the
integral of π(q)q over the support of π(q), which is [0, 1]:
∫ from 0 to 1 of π(q) q dq = ∫ from 0 to 1 of q dq = 1/2.
Thus the posterior density is: π(q)q / ∫π(q)q dq = (1)(q)/(1/2) = 2q, for 0 ≤ q ≤ 1. The posterior
probability that q is in the interval [0, 0.2) is the integral from 0 to 0.2 of the posterior density:
∫ from 0 to 0.2 of 2q dq = q^2 evaluated from q = 0 to q = 0.2 = 0.2^2 = 0.04.]
If instead the prior distribution of q had been π(q) = 4q^3, 0 ≤ q ≤ 1, then the density of the posterior
distribution of q would be: π(q)q / ∫π(q)q dq = (4q^3)(q)/(4/5) = 5q^4, for 0 ≤ q ≤ 1.
In this case, the prior probability that q is in the interval [0, 0.2) is:
∫ from 0 to 0.2 of 4q^3 dq = q^4 evaluated from q = 0 to q = 0.2 = 0.2^4 = 0.0016.
While the posterior probability that q is in the interval [0, 0.2) is:
∫ from 0 to 0.2 of 5q^4 dq = q^5 evaluated from q = 0 to q = 0.2 = 0.2^5 = 0.00032.
54 In the example, q was distributed continuously.
55 The posterior density has to integrate to one. In the example, we divided π(q)(3q^2 - 3q^3) by 0.4254 so that the
posterior density would integrate to 1 rather than 0.4254.
56 In the example, we integrated the posterior density over the interval [0.75, 0.76], and determined that there was a
6.4% probability that q was in that interval.

Bayesian Estimation:57
Loss Models formalizes and generalizes what has been discussed so far. It is important to note that
many people perform and understand Bayesian Analysis without using the following formal
definitions. Nevertheless, the notation and terminology of the text may be used in exam questions,
so it is a good idea to learn it. Let's first go over some definitions by applying them to the previous
example.
In a large portfolio of risks, the number of claims for one policyholder during one
year follows a Bernoulli distribution with mean q.
The number of claims for one policyholder for one year is independent of the number
of claims for the policyholder for any other year.
The distribution of q within the portfolio has density function: 20.006 q^4, 0.6 ≤ q ≤ 0.8.
A policyholder is selected at random from the portfolio.
He is observed to have two claims in three years.
In general one has the possibilities described via a Prior Distribution of the parameter(s),
denoted by π(θ). In this example, the prior distribution of q is: π(q) = 20.006 q^4, 0.6 ≤ q ≤ 0.8.
The Model Distribution is the likelihood of the observation given a particular value of the
parameter or a particular type of insured, denoted f_X|Θ(x | θ).
In general the model distribution will be a product of densities; it is the likelihood function.
In this example, the model distribution is the product of Bernoulli densities:58
f_X|q(2 | q) = 3q^2(1-q).
The Joint Distribution is the product of the Prior Distribution and the Model Distribution:59
f_X,Θ(x, θ) = π(θ) f_X|Θ(x | θ). In this example, the joint distribution has probability density function:
20.006 q^4 3q^2(1-q), 0.6 ≤ q ≤ 0.8. Note that x denotes the observations, in this case 2 claims in 3
years. If instead one had seen 3 claims in 3 years, then the joint distribution would instead have
p.d.f.: 20.006 q^4 q^3, 0.6 ≤ q ≤ 0.8.
57 See Section 15.5.1 of Loss Models.
58 Note that this is a Binomial with parameters m = 3 and q. The factor of 3 in front comes from the fact that which years
had claims was not specified; there are 3 different combinations that would produce 2 claims in 3 years.
59 This is just the usual definition used in statistics; see the section on conditional distributions.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 210
The Marginal Distribution of x is the integral over the possible values of the parameter(s) of the
joint density:60
fX(x) = ∫ π(θ) fX|Θ(x | θ) dθ.
In the Bernoulli example, the marginal distribution for the number of claims over three years,
evaluated at 1, is:61
fX(1) = ∫ from 0.6 to 0.8 of 20.006 q⁴ · 3q(1 - q)² dq = 60.019 ∫ from 0.6 to 0.8 of (q⁵ - 2q⁶ + q⁷) dq =
(60.019) {(0.8⁶/6 - 2(0.8⁷)/7 + 0.8⁸/8) - (0.6⁶/6 - 2(0.6⁷)/7 + 0.6⁸/8)}
= (60.019)(0.00474 - 0.00188) = 17.2%.
Exercise: For the Bernoulli example, what is the marginal distribution for the number of claims over
three years?
[Solution:
x:      0       1       2       3
f(x):   2.5%    17.2%   42.5%   37.8% ]
In this example, it is the chance, prior to any observations, that we will observe 0, 1, 2, or 3 claims
over the coming 3 years. In general, the marginal distribution is prior to any observations. The
marginal distribution is often referred to as the prior mixed distribution. Posterior to observations
one has two additional distributions of interest.
The Posterior Distribution is the distribution of the parameters subsequent to the observation(s).
It is just the conditional distribution of the parameters given the observations. The density of the
posterior distribution is denoted by: πΘ|X(θ | x).
As computed previously using Bayes Theorem, in the Bernoulli example after observing 2 claims in
3 years the density of the posterior distribution of q is: 141.079(q⁶ - q⁷), for 0.6 ≤ q ≤ 0.8.
The Predictive Distribution is the distribution of x subsequent to the observation(s).
It is the mixed distribution of x, given the observations.
The density of the predictive distribution is denoted by: fY|X(y | x).

60

This is just the usual definition used in statistics; see the section on conditional distributions.
One could also compute the marginal distribution for the number of claims over a single year. In this case it would
be an a priori 71.9% chance observing one claim over the coming year and a 28.1% chance of observing no claims
over the coming year. (One integrates f(q)q and f(q)(1-q) respectively.)
61

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 211
Exercise: For the Bernoulli example, after observing 2 claims in 3 years, what is the predictive
distribution for the number of claims over the next year?
[Solution: The density of the posterior distribution of q is: 141.079(q⁶ - q⁷), for 0.6 ≤ q ≤ 0.8.
fY|X(1 | x) = ∫ from 0.6 to 0.8 of 141.079 (q⁶ - q⁷) q dq = 141.079 ∫ from 0.6 to 0.8 of (q⁷ - q⁸) dq = 71.6%.
Since for a Bernoulli we can have only 0 or 1 claims in a single year, the predictive distribution is:
28.4% chance of zero claims, and 71.6% chance of one claim, in other words a Bernoulli with
q = 0.716.
Comment: One can integrate (1-q) times the posterior density and obtain 28.4%.]
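As a check on the marginal, posterior, and predictive values above, here is a short Python sketch (my own illustration, not part of the text); the grid size and helper name are arbitrary.

```python
import numpy as np
from math import comb

# Bernoulli example: prior 20.006 q^4 on [0.6, 0.8]; 2 claims observed in 3 years.
n = 200_000
q = np.linspace(0.6, 0.8, n, endpoint=False) + 0.1 / n   # midpoints
dq = 0.2 / n
prior = 20.006 * q**4

def binom_lik(x, m, q):
    """Chance of x claims in m years given q (Binomial likelihood)."""
    return comb(m, x) * q**x * (1 - q)**(m - x)

# Marginal (prior mixed) distribution of the number of claims in 3 years:
marginal = [float((prior * binom_lik(x, 3, q) * dq).sum()) for x in range(4)]
print(marginal)                      # ~ [0.025, 0.172, 0.425, 0.378]

# Posterior after 2 claims in 3 years, then the predictive P[1 claim next year]:
weights = prior * binom_lik(2, 3, q)
posterior = weights / (weights.sum() * dq)
print((posterior * q * dq).sum())    # ~ 0.716
```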
The predictive distribution is analogous to the marginal (or prior mixed) distribution, but posterior to
the observations.62 The predictive distribution is computed from the posterior distribution in the
same manner as the marginal distribution is computed from the prior distribution.
Mixing Poissons and the Conjugate Prior situations63 provide good examples, which should allow
you to fully understand these concepts.
Finally it should be noted that throughout this section I have used the mean (expected value using
the posterior distribution) of the quantity of interest as the estimator. This is only one possible
Bayesian estimator; the use of the mean as the estimator corresponds to a least squares criterion.
The Bayes Estimator based on the mean of the posterior distribution is the least squares estimator
with respect to the true underlying value of the quantity of interest, as well as with respect to the next
observation.
Other estimators such as the median, percentiles and the mode of the posterior distribution, are
discussed in a subsequent section on Loss Functions / Error Functions. However, unless stated
otherwise, assume that Bayesian Estimation refers to the Bayes Estimator corresponding to the
squared-error loss function, the mean of the posterior distribution.

62 Therefore, the predictive distribution is sometimes referred to as the posterior mixed distribution.
63 See Mahler's Guide to Conjugate Priors.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 212

Summary:
If π(θ) is the density of the prior distribution of the parameter θ,
then the density of the posterior distribution of θ is proportional to: π(θ) P(Observation | θ).
The density of the posterior distribution of θ is:
π(θ) Prob[Observation | θ] / ∫ π(θ) Prob[Observation | θ] dθ.
The Bayes estimate is:64
∫ (Mean given θ) π(θ) Prob[Obs. | θ] dθ / ∫ π(θ) Prob[Obs. | θ] dθ.
The Bayes estimate is analogous to the case with a discrete distribution of risk types:
Σ (Mean given θ) Prob[θ] Prob[Obs. | θ] / Σ Prob[θ] Prob[Obs. | θ].
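The summary formula translates directly into a few lines of code. The following Python sketch is my own (the helper name and grid are arbitrary); it evaluates the Bayes estimate on a grid of parameter values and reproduces the 0.716 estimate from the Bernoulli example above.

```python
import numpy as np

def bayes_estimate(theta, prior, likelihood, hypothetical_mean):
    """Bayes (squared-error loss) estimate: integrate the hypothetical mean
    against the posterior, computed as prior times likelihood, normalized."""
    d = theta[1] - theta[0]
    weights = prior(theta) * likelihood(theta)          # pi(theta) * Prob[Obs. | theta]
    return (hypothetical_mean(theta) * weights * d).sum() / (weights * d).sum()

# Bernoulli example: prior 20.006 q^4 on [0.6, 0.8]; 2 claims in 3 years; mean given q is q.
q = np.linspace(0.6, 0.8, 100_001)
print(bayes_estimate(q, lambda q: 20.006 * q**4,
                     lambda q: 3 * q**2 * (1 - q),
                     lambda q: q))                      # ~ 0.716
```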

Improper Prior Distributions:


In the course of Bayesian Estimation one will use the prior distribution in order to get the posterior
distribution. In order to do so the only thing of importance will be the relative chances of each value
of the parameter or type of insured. Therefore, it is not even essential that the prior distribution, π(θ),
be a true distribution that integrates to unity. For example, if instead in the Bernoulli example we had
taken π(q) = q⁴, 0.6 ≤ q ≤ 0.8, we would have gotten the same posterior distribution, once we
normalized it so that it integrates to unity. One could generalize to a situation in which the prior
distribution does not even have a finite integral. All that is important is that the probabilities all be
non-negative.
An improper prior distribution is a set of non-negative probabilities for which the sum or integral is
infinite.65

64 In the case of the Bernoulli, the mean frequency is the parameter q.
In the case of a Negative Binomial with r fixed and β varying, the mean frequency would be rβ.
65 See Definition 15.8 in Loss Models.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 213
Exercise: Let severity be given by an exponential distribution with mean δ: f(x) = e^(-x/δ) / δ.
In turn, let δ have the improper prior distribution π(δ) = 1/δ, 0 < δ < ∞.
One observes 5 claims of sizes: 3, 4, 6, 9, 11. Determine the posterior distribution of δ.
[Solution: The chance of the observation is the product of the densities at the observed points:
e^(-3/δ) e^(-4/δ) e^(-6/δ) e^(-9/δ) e^(-11/δ) / δ⁵ = e^(-33/δ) / δ⁵.
Multiplying by the prior distribution of 1/δ gives the probability weights: e^(-33/δ) / δ⁶.
The posterior distribution is proportional to this and therefore is an Inverse Gamma Distribution.66
Specifically, the posterior distribution of delta is an Inverse Gamma with θ = 33 (the sum of the
observed claims) and α = 5 (the number of observed claims): πδ|x(δ | x) = 33⁵ e^(-33/δ) / {δ⁶ Γ(5)}.]
Exercise: Let severity be given by an exponential distribution with mean δ: f(x) = e^(-x/δ) / δ.
In turn, let δ have the improper prior distribution π(δ) = 1/δ, 0 < δ < ∞.
One observes 5 claims of sizes: 3, 4, 6, 9, 11. Estimate the future average claim size.
[Solution: The posterior distribution of delta is an Inverse Gamma with θ = 33 and α = 5.
The posterior average claim size is E[δ] = mean of posterior Inverse Gamma = θ/(α - 1) = 33/4 = 8.25.]
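Even with an improper prior, the posterior is a proper distribution and can be checked numerically. Here is a quick Python sketch (my own, not from the text); the grid endpoints are arbitrary but wide enough that the tails are negligible.

```python
import numpy as np

# Prior pi(delta) = 1/delta; five Exponential claims totaling 33 give probability weights
# proportional to exp(-33/delta) / delta^6, i.e. an Inverse Gamma posterior (alpha = 5, theta = 33).
delta = np.linspace(0.5, 400.0, 500_000)
d = delta[1] - delta[0]
weights = np.exp(-33.0 / delta) / delta**6
posterior = weights / (weights.sum() * d)
print((posterior * delta * d).sum())    # ~ 8.25 = 33 / (5 - 1)
```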
Credibility (Confidence) Intervals:67
Exercise: For each individual within a portfolio, the probability of a claim in a year is a Bernoulli with
mean q. The prior distribution of q within the portfolio is uniform on [0, 1].
An insured is picked at random from the portfolio, and one claim was observed in one year. What is
the posterior estimate that this insured has a q parameter in the interval [0.20, 0.98]?
[Solution: As shown in a previous exercise, the posterior density is: 2q, for 0 ≤ q ≤ 1.
The posterior chance that q is in the interval [0.20, 0.98] is the integral from 0.2 to 0.98 of the
posterior density:
∫ from 0.2 to 0.98 of 2q dq = q² evaluated from q = 0.2 to q = 0.98 = 0.98² - 0.2² = 0.9604 - 0.04 ≈ 0.92.]

Thus posterior to the observation, [0.20, 0.98] is a 92% confidence interval for q.
In general, by eliminating some probability on either tail, one can use the posterior distribution to
create a confidence interval of a given level.
66

One can normalize the probability weights by dividing by their integral; the integral is of the Gamma variety.
Alternately, one can recognize the Inverse Gamma from the negative power of the variable multiplied by the
exponential of the reciprocal of that variable.
67
Loss Models Definition 15.19 refers to what is commonly termed a confidence interval as a credibility interval.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 214
Quite often the Normal Approximation is used to get a confidence interval. Using the Normal
Approximation, the estimated mean ± 1.96 times the estimated standard deviation is a 95%
confidence interval. The use of the Normal Approximation is usually valid when one has a large
amount of data, since the posterior distribution is usually asymptotically normal; i.e., as the volume of
data increases the posterior distribution approaches a Normal.68
In general, [a, b] is a credibility interval (confidence interval) at level 1 - α for a parameter, provided
that the probability that the parameter is outside the interval [a, b] is less than or equal to α. In this
example, for a = 0.20, b = 0.98, we have a confidence interval with α = 0.08, for the parameter q.
Usually actuaries pick some reasonable value of α and get a reasonable interval that
(approximately) covers probability of 1 - α or a little more. Even if two actuaries agree on a value of
alpha, they may come up with slightly different confidence intervals.69 This can be the case because
they used different approximations.
More fundamentally, the confidence interval is not unique. One has to specify something more than
just alpha, if one wants a unique interval. For example, one might specify that the interval should be
symmetric around the estimated mean. Another example would be an equal probability interval,
which would leave half the probability on either tail. If α were 10%, then an equal probability interval
[a, b] would be such that there is a 5% probability of being less than a and a 5% probability of
being greater than b.
Even more generally, one is not limited to intervals. One can take the union of several intervals.
Such credibility sets have credibility intervals as a special case.70 So if for example, most of the
probability of a parameter θ were concentrated around θ = 2 and θ = 7, then a 90% credibility set for
θ might be: [1.5, 2.5] ∪ [6, 8].

68

See Theorem 15.22 of Loss Models. Besides the conditions that both the prior distribution and the model
distribution (prior likelihood function) are twice differentiable in the parameter(s), there are also the conditions stated
in Theorem 15.5. In particular applications it may be clear that the normal approximation is appropriate, for example, if
the posterior distribution is a Gamma. Loss Models implies that the Normal Approximation would only be applied if
one had difficulty calculating the exact form of the posterior distribution. In fact, the Normal Approximation is often
used by actuaries even when the exact form of the posterior distribution has been calculated.
69 Fortunately, I have never found this to be of major practical importance in actuarial work. Since the choice of the
confidence level 1 - α is somewhat arbitrary, I never agonize over small differences in confidence intervals.
70
I have never found this to be of practical use in actuarial work.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 215
Exercise: Let the posterior distribution of the parameter θ be given as follows:
θ:      1     1.5    2      2.5    3     4     5     6      7      8     9
f(θ):   1%    10%    27%    10%    3%    1%    3%    10%    25%    9%    1%
Find a 90% confidence set for θ.
[Solution: [1.5, 2.5] ∪ [6, 8] is a 90% confidence set. While it is not the only such confidence set, it is
the smallest one. For example [1, 7] also covers 90% probability.]
Loss Models defines the Highest Posterior Density (HPD) credibility set for a given level
1- , as that credibility set among all possible credibility sets for a given posterior density and for the
given level 1- , such that the smallest value the posterior density takes in the HPD credibility set is
as large as possible.71
In the above exercise, the smallest value of the posterior density in [1.5, 2.5] ∪ [6, 8] is 9%, while
the smallest value of the posterior density in [1, 7] is 1%. So we know that [1, 7] is not the HPD
90% credibility set for this posterior density.
You can confirm that [1.5, 2.5] ∪ [6, 8] is the HPD 90% credibility set for this posterior density.72
This concept applies equally well to either discrete or continuous densities. The way one would
construct an HPD credibility set is to start with the places where the density was largest. Then keep
adding places where the density is a little lower, until the sum of the probability covered is the
desired amount.
In the exercise above, one would start with places where the density is at least 27%, then keep
lowering that to 25%, 10%, etc. until one covered at least 90% probability. The HPD 90% credibility
set in the exercise turns out to consist of those places where the density is at least 9%.
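The construction just described is easy to automate. Below is a small Python sketch (my own); the table of posterior probabilities is the one from the exercise above as reconstructed, so treat the specific numbers as illustrative.

```python
def hpd_set(density, level):
    """Greedy HPD construction: keep adding the parameter values with the highest
    posterior probability until at least `level` of the probability is covered.
    density: dict mapping parameter value -> posterior probability."""
    covered, chosen = 0.0, []
    for theta, p in sorted(density.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= level:
            break
        chosen.append(theta)
        covered += p
    return sorted(chosen), covered

f = {1: 0.01, 1.5: 0.10, 2: 0.27, 2.5: 0.10, 3: 0.03, 4: 0.01, 5: 0.03,
     6: 0.10, 7: 0.25, 8: 0.09, 9: 0.01}
print(hpd_set(f, 0.90))   # the chosen points form [1.5, 2.5] and [6, 8], covering at least 90%
```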
Exercise: Given that the posterior distribution of a parameter θ is a Normal Distribution with
μ = 7 and σ = 11, construct the HPD 90% credibility set for θ.
[Solution: 7 ± (1.645)(11) = (-11.095, 25.095).]
In the case of the Normal Distribution the Highest Probability Density credibility set is just the usual
confidence interval symmetric around the mean.73 In the above exercise, the Highest Probability
Density 90% credibility set is a confidence interval of 1.645 standard deviations around the mean.
71

See Definition 15.21 of Loss Models.


The sum of the probabilities at those points where the density is > 9%, is only 81%. Thus any set that covers 90%
of the probability will have to include a point where the density is less than or equal to 9%.
73
This follows from the fact that the Normal is symmetric and unimodal.
72

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 216
If the posterior distribution is continuous and unimodal (has one mode), then the smallest interval of a
given confidence level is [a, b] such that the value of the posterior density at a and b are the same
and such that [a, b] covers the desired amount of probability.74
Exercise: Let the posterior distribution for the parameter q be given by q e^(-q), 0 < q.
Write down the equations that need to be solved (numerically) in order to find the smallest 80%
confidence interval for q.
[Solution: Let the desired interval be [a, b]. Then in order to cover 80% probability:
0.8 = ∫ from a to b of q e^(-q) dq = -(1 + q)e^(-q) evaluated from q = a to q = b = (1 + a)e^(-a) - (1 + b)e^(-b).
Also for the shortest interval, the density function at a and b need to be equal: a e^(-a) = b e^(-b).]
By solving these equations numerically, it turns out that a = 0.1673 and b = 3.08029.
Thus the interval [0.1673, 3.08029] is the smallest 80% confidence interval for this posterior
distribution (which is a Gamma distribution with α = 2 and θ = 1).
While the equations need to be solved numerically, the concept is relatively simple when put in
graphical terms. The density is shown below. Also shown are vertical lines at 0.1673 and 3.08029.
We note that the density has the same value at 0.1673 and 3.08029; the vertical lines are the same
height.75 The area under the density between these two lines is 80%.76

[Graph: the posterior density q e^(-q), with vertical lines at 0.1673 and 3.08029.]
74 Theorem 15.20 in Loss Models. Example 15.20 in Loss Models shows how to set up equations for a and b that
have to be solved numerically.
75 0.1673 e^(-0.1673) = 0.1415 = 3.08029 e^(-3.08029).
76 (1 + 0.1673)e^(-0.1673) - (1 + 3.08029)e^(-3.08029) = 0.9875 - 0.1875 = 0.8000. Since the posterior density is a
Gamma Distribution, this area can also be put in terms of Incomplete Gamma Functions:
Γ[2; 3.08029] - Γ[2; 0.1673] = 0.8125 - 0.0125 = 0.8000. (Note that 0.8125 = 1 - 0.1875 and 0.0125 = 1 - 0.9875.)

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 217
In order to solve graphically for the shortest 80% confidence interval, one picks a height, sees where
the density achieves that height, draws the corresponding vertical lines, and checks the area under
the curve between the two vertical lines. If the area is larger than the desired confidence level, in this
case 80%, then increase the selected height and try again. If the area is smaller than the desired
confidence level, then decrease the selected height and try again. Eventually youll find an interval
that covers the desired level of confidence.
We note that usually the shortest confidence interval has different probabilities outside on either tail.
In this case, there is 1.25% outside on the left and 18.75% outside on the right.
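For reference, the two equations can also be handed to a numerical root-finder. Here is a small Python sketch (my own, using scipy) that recovers a ≈ 0.1673 and b ≈ 3.08029; the starting point is an arbitrary guess.

```python
import numpy as np
from scipy.optimize import fsolve

def equations(v):
    """Smallest 80% interval for the posterior q e^(-q): equal density at the
    endpoints, and 80% of the probability between them."""
    a, b = v
    coverage = (1 + a) * np.exp(-a) - (1 + b) * np.exp(-b)   # integral of q e^(-q) from a to b
    equal_height = a * np.exp(-a) - b * np.exp(-b)
    return [coverage - 0.80, equal_height]

a, b = fsolve(equations, x0=[0.2, 3.0])
print(a, b)   # ~ 0.1673 and ~ 3.0803
```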
Exercise: Let the posterior distribution for the parameter q be given by q e^(-q), 0 < q.
Write down the equations that need to be solved (numerically) in order to find the equal probability
80% confidence interval for q, the confidence interval that places 10% outside at each end.
[Solution: Let the desired interval be [a, b]. Then in order to have 10% probability outside at each
tail, F(a) = 1 - (1 + a)e^(-a) = 0.1 and F(b) = 1 - (1 + b)e^(-b) = 0.9. Note that since the posterior distribution is
a Gamma Distribution with alpha = 2 and theta = 1, one can also write these equations in terms of
Incomplete Gamma Functions as: Γ[2; a] = 0.1 and Γ[2; b] = 0.9.]
By solving these equations numerically, it turns out that a = 0.531812 and b = 3.88972. Thus the
interval [0.531812, 3.88972] is 80% confidence interval for q for this posterior distribution, that
places 10% outside at each tail. While the equations need to be solved numerically, the concept is
relatively simple when put in graphical terms. The density is shown below. Also shown are vertical
lines at 0.531812 and 3.88972. We note that the areas outside the vertical lines but under the curve
are the same; they are each 10%. The area under the density between these two lines is 80%.

[Graph: the posterior density q e^(-q), with vertical lines at 0.531812 and 3.88972.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 218
The equal probability interval is longer than the smallest interval determined previously. One could
also determine an 80% confidence interval using the Normal Approximation.
Exercise: Let the posterior distribution for the parameter q be given by q e^(-q), 0 < q.
Use the Normal Approximation to determine an 80% confidence interval for q. Note that the
posterior distribution is a Gamma Distribution with α = 2 and θ = 1.
[Solution: The Gamma Distribution has a mean of αθ = 2 and a variance of αθ² = 2.
For an 80% confidence interval we want ± 1.282 standard deviations, since Φ(1.282) = (1 + 0.8)/2 = 0.9.
Thus the desired interval is 2 ± 1.282 √2 = [0.187, 3.813].]
The interval determined by the Normal Approximation differs from the other two previously
determined. This interval of [0.187, 3.813] actually covers a probability of Γ[2; 3.813] - Γ[2; 0.187] =
0.8937 - 0.0155 = 87.8%, which is more than the required 80%.77
This interval is shown below:

[Graph: the posterior density q e^(-q), with the Normal-approximation interval [0.187, 3.813] marked.]
77 That is why it is called the Normal approximation.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 219
An Exponential Example:78
Let us assume that severity follows an Exponential Distribution with hazard rate λ, f(x | λ) = λ e^(-λx),
with λ varying across the portfolio.
The prior distribution of the parameter λ has probability density function:
π(λ) = 2 for 0.1 ≤ λ ≤ 0.3,
π(λ) = 3 for 0.3 < λ ≤ 0.5.
Exercise: Verify that π(λ) integrates to one over its support.
[Solution: (2)(0.3 - 0.1) + (3)(0.5 - 0.3) = 1.]
Exercise: What is the a priori mean severity?
[Solution: E[X | λ] = 1/λ. Thus we integrate 1/λ versus the prior density of lambda.
∫ from 0.1 to 0.3 of 2/λ dλ + ∫ from 0.3 to 0.5 of 3/λ dλ = 2 ln[3] + 3 ln[5/3] = 3.73.]
Exercise: What is the prior probability that lambda is between 0.2 and 0.35?
[Solution: ∫ from 0.2 to 0.3 of 2 dλ + ∫ from 0.3 to 0.35 of 3 dλ = (2)(0.1) + (3)(0.05) = 35%.]
78 Some of the calculations in this example are longer than what you should get on your exam.
Concentrate on the concepts and then do some (more) of my questions or past exam questions in this section.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 220
Exercise: What is the marginal (prior mixed) distribution?
Hint: ∫ y e^(-cy) dy = -y e^(-cy)/c - e^(-cy)/c².
[Solution: f(x | λ) = λ e^(-λx).
The marginal distribution is: ∫ from 0.1 to 0.5 of π(λ) λ e^(-λx) dλ.
∫ from 0.1 to 0.3 of 2 λ e^(-λx) dλ = (2) {-λ e^(-λx)/x - e^(-λx)/x²} evaluated from λ = 0.1 to λ = 0.3 =
(2) {0.1 e^(-0.1x)/x - 0.3 e^(-0.3x)/x + e^(-0.1x)/x² - e^(-0.3x)/x²}.
∫ from 0.3 to 0.5 of 3 λ e^(-λx) dλ = (3) {-λ e^(-λx)/x - e^(-λx)/x²} evaluated from λ = 0.3 to λ = 0.5 =
(3) {0.3 e^(-0.3x)/x - 0.5 e^(-0.5x)/x + e^(-0.3x)/x² - e^(-0.5x)/x²}.
The marginal distribution is: ∫ from 0.1 to 0.3 of 2 λ e^(-λx) dλ + ∫ from 0.3 to 0.5 of 3 λ e^(-λx) dλ =
0.2 e^(-0.1x)/x + 2e^(-0.1x)/x² + 0.3 e^(-0.3x)/x + e^(-0.3x)/x² - 1.5 e^(-0.5x)/x - 3e^(-0.5x)/x².]

A graph of the marginal (prior mixed) density:
[Graph: the marginal density as a function of claim size x, for x from 0 to 10; it starts near 0.30 and declines.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 221
One can use the marginal distribution, for example, to determine the probability that the first
claim observed will be in a certain interval. For example, the probability that the first claim will be in
the interval from 6 to 8 is the integral from 6 to 8 of the density of the marginal distribution. Using a
computer, the probability that the first claim will be in the interval from 6 to 8 is 6.88%.
Exercise: A single claim of size 5 is observed from an insured.
What is the posterior distribution of lambda for that insured?
Hint: ∫ y e^(-cy) dy = -y e^(-cy)/c - e^(-cy)/c².
[Solution: The chance of the observation given lambda is: f(5 | λ) = λ e^(-5λ).
Thus the numerator of Bayes Theorem, the probability weight, is π(λ) λ e^(-5λ):
2 λ e^(-5λ) for 0.1 ≤ λ ≤ 0.3, and 3 λ e^(-5λ) for 0.3 < λ ≤ 0.5.
∫ from 0.1 to 0.3 of 2 λ e^(-5λ) dλ = (2) {-λ e^(-5λ)/5 - e^(-5λ)/25} evaluated from λ = 0.1 to λ = 0.3 =
(2) {0.1 e^(-0.5)/5 - 0.3 e^(-1.5)/5 + e^(-0.5)/25 - e^(-1.5)/25} = 0.02816.
∫ from 0.3 to 0.5 of 3 λ e^(-5λ) dλ = (3) {-λ e^(-5λ)/5 - e^(-5λ)/25} evaluated from λ = 0.3 to λ = 0.5 =
(3) {0.3 e^(-1.5)/5 - 0.5 e^(-2.5)/5 + e^(-1.5)/25 - e^(-2.5)/25} = 0.03246.
Thus the denominator of Bayes Theorem is:
∫ from 0.1 to 0.3 of 2 λ e^(-5λ) dλ + ∫ from 0.3 to 0.5 of 3 λ e^(-5λ) dλ = 0.06062.
Dividing the numerator by the denominator of Bayes Theorem, the density of the posterior distribution
of lambda is:
33 λ e^(-5λ) for 0.1 ≤ λ ≤ 0.3, and 49.5 λ e^(-5λ) for 0.3 < λ ≤ 0.5.
Comment: For a draw from a continuous distribution such as an Exponential, we use the density as
the chance of the observation.
The hint involves a Gamma type integral you should know how to do for your exam.
One can do this integral by parts or just remember the result.
Mean of an Exponential with hazard rate c is:
∫ from 0 to ∞ of x c e^(-cx) dx = c ∫ from 0 to ∞ of x e^(-cx) dx = (c)(1/c²) = 1/c.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 222
The density of lambda, posterior to observing one claim of size 5:
[Graph: the posterior density of lambda, for lambda from 0.1 to 0.5.]

Exercise: What is the expected value of the next claim from the same insured?
Hint: ∫ y e^(-cy) dy = -y e^(-cy)/c - e^(-cy)/c².
[Solution: E[X | λ] = 1/λ. Thus we integrate 1/λ versus the posterior density of lambda.
∫ from 0.1 to 0.3 of 33 e^(-5λ) dλ + ∫ from 0.3 to 0.5 of 49.5 e^(-5λ) dλ =
(33)(e^(-0.5)/5 - e^(-1.5)/5) + (49.5)(e^(-1.5)/5 - e^(-2.5)/5) = 3.93.
Comment: Differs somewhat from the a priori mean severity of 3.73.]


Exercise: What is the posterior probability that lambda is between 0.2 and 0.35?
[Solution: ∫ from 0.2 to 0.3 of 33 λ e^(-5λ) dλ + ∫ from 0.3 to 0.35 of 49.5 λ e^(-5λ) dλ =
(33) {-λ e^(-5λ)/5 - e^(-5λ)/25} evaluated from λ = 0.2 to λ = 0.3
+ (49.5) {-λ e^(-5λ)/5 - e^(-5λ)/25} evaluated from λ = 0.3 to λ = 0.35 =
(33) {0.2 e^(-1)/5 - 0.3 e^(-1.5)/5 + e^(-1)/25 - e^(-1.5)/25} +
(49.5) {0.3 e^(-1.5)/5 - 0.35 e^(-1.75)/5 + e^(-1.5)/25 - e^(-1.75)/25} = 0.23487 + 0.158295 = 39.3%.
Comment: This Bayes interval estimate differs from the a priori probability of 35%.]
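The posterior mean of 3.93 and the 39.3% interval probability can be checked on a grid. The following Python sketch is my own illustration, not part of the text; the grid size is arbitrary.

```python
import numpy as np

# Piecewise prior on lambda: 2 on [0.1, 0.3] and 3 on (0.3, 0.5]; one claim of size 5 observed.
lam = np.linspace(0.1, 0.5, 200_001)
d = lam[1] - lam[0]
prior = np.where(lam <= 0.3, 2.0, 3.0)
weights = prior * lam * np.exp(-5.0 * lam)          # pi(lambda) * f(5 | lambda)
posterior = weights / (weights.sum() * d)

print((posterior / lam).sum() * d)                                 # ~ 3.93, expected next claim
print(posterior[(lam >= 0.2) & (lam <= 0.35)].sum() * d)           # ~ 0.393, P[0.2 <= lambda <= 0.35]
```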

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 223
Exercise: A single claim of size 5 is observed from an insured.
What is the predictive (posterior mixed) distribution?
Hint: ∫ y² e^(-cy) dy = -y² e^(-cy)/c - 2y e^(-cy)/c² - 2e^(-cy)/c³.
[Solution: f(x | λ) = λ e^(-λx).
From a previous solution, the posterior density of lambda is:
33 λ e^(-5λ) for 0.1 ≤ λ ≤ 0.3, and 49.5 λ e^(-5λ) for 0.3 < λ ≤ 0.5.
The density of the predictive distribution is the integral of f(x | λ) times the posterior density of λ.
∫ from 0.1 to 0.3 of λ e^(-λx) 33 λ e^(-5λ) dλ = 33 ∫ from 0.1 to 0.3 of λ² e^(-λ(x+5)) dλ =
(33) {-λ² e^(-λ(x+5))/(x+5) - 2λ e^(-λ(x+5))/(x+5)² - 2e^(-λ(x+5))/(x+5)³} evaluated from λ = 0.1 to λ = 0.3 =
(33) {0.01 e^(-0.1(x+5))/(x+5) - 0.09 e^(-0.3(x+5))/(x+5) + 0.2e^(-0.1(x+5))/(x+5)² - 0.6e^(-0.3(x+5))/(x+5)²
+ 2e^(-0.1(x+5))/(x+5)³ - 2e^(-0.3(x+5))/(x+5)³}.
49.5 ∫ from 0.3 to 0.5 of λ² e^(-λ(x+5)) dλ =
(49.5) {0.09 e^(-0.3(x+5))/(x+5) - 0.25 e^(-0.5(x+5))/(x+5) + 0.6e^(-0.3(x+5))/(x+5)² - e^(-0.5(x+5))/(x+5)²
+ 2e^(-0.3(x+5))/(x+5)³ - 2e^(-0.5(x+5))/(x+5)³}.
Thus the density of the predictive distribution is:
0.33 e^(-0.1(x+5))/(x+5) + 6.6e^(-0.1(x+5))/(x+5)² + 66e^(-0.1(x+5))/(x+5)³
+ 1.485 e^(-0.3(x+5))/(x+5) + 9.9 e^(-0.3(x+5))/(x+5)² + 33 e^(-0.3(x+5))/(x+5)³
- 12.375 e^(-0.5(x+5))/(x+5) - 49.5e^(-0.5(x+5))/(x+5)² - 99 e^(-0.5(x+5))/(x+5)³.]
One can use the predictive distribution, for example, to determine the probability that the
next claim observed from this same insured will be in a certain interval. For example, the probability
that the next claim will be in the interval from 6 to 8 is the integral from 6 to 8 of the density of the
predictive distribution. Using a computer, the probability that the next claim will be in the interval from
6 to 8 is 7.26%. This compares to the a priori probability of 6.88%.
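The 6.88% and 7.26% figures quoted above can be reproduced by numerical double integration. The following Python sketch is my own illustration, not from the text; the grid sizes are arbitrary.

```python
import numpy as np

lam = np.linspace(0.1, 0.5, 2001)
dl = lam[1] - lam[0]
prior = np.where(lam <= 0.3, 2.0, 3.0)

x = np.linspace(6.0, 8.0, 1001)
dx = x[1] - x[0]
f_x_given_lam = lam[:, None] * np.exp(-np.outer(lam, x))    # Exponential density f(x | lambda)

# Marginal probability that a claim falls in [6, 8]:
print((prior[:, None] * f_x_given_lam).sum() * dl * dx)     # ~ 0.0688

# Predictive probability of [6, 8] after observing one claim of size 5:
post = prior * lam * np.exp(-5.0 * lam)
post = post / (post.sum() * dl)
print((post[:, None] * f_x_given_lam).sum() * dl * dx)      # ~ 0.0726
```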

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 224
A graph of the predictive (posterior mixed) density after observing a single claim of size 5:
[Graph: the predictive density as a function of claim size x, for x from 0 to 10.]

With only one observation, the predictive distribution is very similar to the marginal distribution;
however, here is a comparison of their righthand tails:
[Graph: comparison of the righthand tails of the predictive and marginal densities, for x from 12 to 20.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 225
Exercise: Assume instead that two claims were observed from a single insured.
The two observed claims are of sizes in that order 5 and 10.
What is the posterior distribution of lambda for that insured?
Hint: ∫ y² e^(-cy) dy = -y² e^(-cy)/c - 2y e^(-cy)/c² - 2e^(-cy)/c³.
[Solution: The chance of the observation given lambda is: f(5 | λ) f(10 | λ) = λ² e^(-15λ).
Thus the numerator of Bayes Theorem, the probability weight, is π(λ) λ² e^(-15λ):
2 λ² e^(-15λ) for 0.1 ≤ λ ≤ 0.3, and 3 λ² e^(-15λ) for 0.3 < λ ≤ 0.5.
∫ from 0.1 to 0.3 of 2 λ² e^(-15λ) dλ =
(2) {-λ² e^(-15λ)/15 - 2λ e^(-15λ)/225 - 2e^(-15λ)/3375} evaluated from λ = 0.1 to λ = 0.3 =
(2) {0.01e^(-1.5)/15 - 0.09e^(-4.5)/15 + 0.2e^(-1.5)/225 - 0.6e^(-4.5)/225 + 2e^(-1.5)/3375 - 2e^(-4.5)/3375}
= 0.0007529.
∫ from 0.3 to 0.5 of 3 λ² e^(-15λ) dλ =
(3) {-λ² e^(-15λ)/15 - 2λ e^(-15λ)/225 - 2e^(-15λ)/3375} evaluated from λ = 0.3 to λ = 0.5 =
(3) {0.09e^(-4.5)/15 - 0.25e^(-7.5)/15 + 0.6e^(-4.5)/225 - 1e^(-7.5)/225 + 2e^(-4.5)/3375 - 2e^(-7.5)/3375}
= 0.0002726.
The denominator of Bayes Theorem is:
∫ from 0.1 to 0.3 of 2 λ² e^(-15λ) dλ + ∫ from 0.3 to 0.5 of 3 λ² e^(-15λ) dλ = 0.0010255.
Thus the density of the posterior distribution of lambda is:
1950 λ² e^(-15λ) for 0.1 ≤ λ ≤ 0.3, and 2925 λ² e^(-15λ) for 0.3 < λ ≤ 0.5.
Comment: If the observed claims were 5 and 10 in either order, then the chance of the observation
would have been multiplied by two. However, this 2 would have canceled in the numerator and
denominator of Bayes Theorem, resulting in the same posterior distribution of lambda.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 226
The density of lambda, posterior to observing two claims of sizes 5 and 10:
[Graph: the posterior density of lambda, for lambda from 0.1 to 0.5.]

Exercise: The two observed claims are of sizes in that order 5 and 10.
What is the expected value of the next claim from the same insured?
Hint: ∫ y e^(-cy) dy = -y e^(-cy)/c - e^(-cy)/c².
[Solution: E[X | λ] = 1/λ. Thus we integrate 1/λ versus the posterior density of lambda.
∫ from 0.1 to 0.3 of 1950 λ e^(-15λ) dλ + ∫ from 0.3 to 0.5 of 2925 λ e^(-15λ) dλ =
(1950)(0.1e^(-1.5)/15 + e^(-1.5)/15² - 0.3e^(-4.5)/15 - e^(-4.5)/15²) +
(2925)(0.3e^(-4.5)/15 + e^(-4.5)/15² - 0.5e^(-7.5)/15 - e^(-7.5)/15²) = 5.04.
Comment: Differs from the a priori mean severity of 3.73.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 227
Exercise: The two observed claims are of sizes in that order 5 and 10.
What is the posterior probability that lambda is between 0.2 and 0.35?
Hint: ∫ y² e^(-cy) dy = -y² e^(-cy)/c - 2y e^(-cy)/c² - 2e^(-cy)/c³.
[Solution: ∫ from 0.2 to 0.3 of 1950 λ² e^(-15λ) dλ + ∫ from 0.3 to 0.35 of 2925 λ² e^(-15λ) dλ =
(1950) {0.2² e^(-3)/15 - 0.3² e^(-4.5)/15 + 0.4 e^(-3)/15² - 0.6 e^(-4.5)/15² + 2e^(-3)/15³ - 2e^(-4.5)/15³} +
(2925) {0.3² e^(-4.5)/15 - 0.35² e^(-5.25)/15 + 0.6e^(-4.5)/15² - 0.7e^(-5.25)/15² + 2e^(-4.5)/15³ - 2e^(-5.25)/15³}
= 40.7%.
Comment: This Bayes interval estimate differs from the a priori probability of 35%.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 228
Exercise: The two observed claims are of sizes in that order 5 and 10.
What is the predictive (posterior mixed) distribution?
Hint: ∫ y³ e^(-cy) dy = -y³ e^(-cy)/c - 3y² e^(-cy)/c² - 6y e^(-cy)/c³ - 6e^(-cy)/c⁴.
[Solution: f(x | λ) = λ e^(-λx).
From a previous solution, the posterior density of lambda is:
1950 λ² e^(-15λ) for 0.1 ≤ λ ≤ 0.3, and 2925 λ² e^(-15λ) for 0.3 < λ ≤ 0.5.
The predictive distribution is the integral of f(x | λ) times the posterior density of λ.
∫ from 0.1 to 0.3 of λ e^(-λx) 1950 λ² e^(-15λ) dλ = 1950 ∫ from 0.1 to 0.3 of λ³ e^(-λ(x+15)) dλ =
(1950) {0.001e^(-0.1(x+15))/(x+15) - 0.027 e^(-0.3(x+15))/(x+15)
+ 0.03e^(-0.1(x+15))/(x+15)² - 0.27e^(-0.3(x+15))/(x+15)²
+ 0.6e^(-0.1(x+15))/(x+15)³ - 1.8e^(-0.3(x+15))/(x+15)³
+ 6e^(-0.1(x+15))/(x+15)⁴ - 6e^(-0.3(x+15))/(x+15)⁴}.
2925 ∫ from 0.3 to 0.5 of λ³ e^(-λ(x+15)) dλ =
(2925) {0.027 e^(-0.3(x+15))/(x+15) - 0.125 e^(-0.5(x+15))/(x+15)
+ 0.27e^(-0.3(x+15))/(x+15)² - 0.75e^(-0.5(x+15))/(x+15)²
+ 1.8e^(-0.3(x+15))/(x+15)³ - 3e^(-0.5(x+15))/(x+15)³
+ 6e^(-0.3(x+15))/(x+15)⁴ - 6e^(-0.5(x+15))/(x+15)⁴}.
Thus the predictive distribution is:
1.95 e^(-0.1(x+15))/(x+15) + 58.5 e^(-0.1(x+15))/(x+15)²
+ 1170 e^(-0.1(x+15))/(x+15)³ + 11,700 e^(-0.1(x+15))/(x+15)⁴
+ 26.325 e^(-0.3(x+15))/(x+15) + 263.25 e^(-0.3(x+15))/(x+15)²
+ 1755 e^(-0.3(x+15))/(x+15)³ + 5850 e^(-0.3(x+15))/(x+15)⁴
- 365.625 e^(-0.5(x+15))/(x+15) - 2193.75 e^(-0.5(x+15))/(x+15)²
- 8775 e^(-0.5(x+15))/(x+15)³ - 17,550 e^(-0.5(x+15))/(x+15)⁴.]
The probability that the next claim will be in the interval from 6 to 8 is the integral from 6 to 8 of the
density of the predictive distribution. Using a computer, the probability that the next claim will be in
the interval from 6 to 8 is 8.71%. This compares to the a priori probability of 6.88%.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 229
A graph of the predictive (posterior mixed) density after observing two claims of sizes 5 and 10:
[Graph: the predictive density as a function of claim size x, for x from 0 to 10.]

Here is a comparison of this predictive distribution and the marginal distribution:


[Graph: the predictive density (after claims of 5 and 10) compared with the marginal density, for x from 0 to 14.]

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 230
Problems:
Use the following information for the next two questions:

The probability of y successes in m trials is given by a Binomial distribution with


parameters m and q.

The prior distribution of q is uniform on [0,1].

Two successes were observed in three trials.

6.1 (3 points) What is the Bayesian estimate for the probability that the unknown parameter q is in
the interval [0.5, 0.6]?
A. Less than 0.15
B. At least 0.15, but less than 0.16
C. At least 0.16, but less than 0.17
D. At least 0.17, but less than 0.18
E. 0.18 or more
6.2 (2 points) What is the probability that a success will occur on the fourth trial?
A. 0.3

B. 0.4

C. 0.5

D. 0.6

E. 0.7

6.3 (3 points) For a group of insureds, you are given:


(i) The amount of a claim is uniformly distributed from 0 to θ.
(ii) The prior distribution of θ is a Single Parameter Pareto with α = 2 and θ = 10.
(iii) Four independent claims are observed of sizes: 4, 5, 7, and 13 .
Determine the probability that the next claim will exceed 15.
(A) 5%
(B) 6%
(C) 7%
(D) 8%
(E) 9%
6.4 (2 points) Use the following information:
(i) The number of days per hospital stay for patients at an individual hospital follows
a zero-truncated negative binomial distribution with parameters r = 2 and β.
(ii) β varies between different hospitals.
(iii) The assumed prior distribution of β is: π[β] = 2310 β⁶ / (1 + β)¹², 0 < β < ∞.
At Mercy Hospital you observe 100 hospital stays that total 450 days.
For Mercy Hospital, determine the density of the posterior distribution of β up to a proportionality
constant.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 231
Use the following information for the next four questions:
In a large portfolio of risks, the number of claims for one policyholder during one
year follows a Bernoulli distribution with mean q.
The number of claims for one policyholder for one year is independent of the number
of claims for the policyholder for any other year. The number of claims for one
policyholder is independent of the number of claims for any other policyholder.
The distribution of q within the portfolio has density function:
f(q) = 400q, 0 < q ≤ 0.05,
f(q) = 40 - 400q, 0.05 < q < 0.10.
A policyholder Phillip DeTanque is selected at random from the portfolio.
6.5. (1 point) Prior to any observations, what is the probability that Phil has a Bernoulli parameter in
the interval [0.03, 0.04]?
A. 10%
B. 11%
C. 12%
D. 13%
E. 14%
6.6. (2 points) During Year 1, Phil has one claim. What is the probability that Phil has a Bernoulli
parameter in the interval [0.03, 0.04]?
A. 10%
B. 11%
C. 12%
D. 13%
E. 14%
6.7. (2 points) During Year 1, Phil has one claim. During Year 2, Phil has no claim.
What is the probability that Phil has a Bernoulli parameter in the interval [0.03, 0.04]?
A. less than 9.7%
B. at least 9.7% but less than 10.0%
C. at least 10.0% but less than 10.3%
D. at least 10.3% but less than 10.6%
E. at least 10.6%
6.8. (3 points) During Year 1, Phil has one claim. During Year 2, Phil has no claim.
During Year 3, Phil has no claim.
What is the probability that Phil has a Bernoulli parameter in the interval [0.03, 0.04]?
A. less than 9.7%
B. at least 9.7% but less than 10.0%
C. at least 10.0% but less than 10.3%
D. at least 10.3% but less than 10.6%
E. at least 10.6%

6.9 (2 points) f(x; θ) is a probability density function with one parameter θ.
θ is distributed via π(θ), 0 < θ < ∞. One observes: x1, x2, x3, ..., xn.
What is the posterior probability that θ is in the interval [a, b]?

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 232
Use the following information for the next two questions:
Each insured has its accident frequency given by a Poisson Distribution with mean λ.
λ is assumed to be distributed across the portfolio via the improper prior distribution:
π(λ) = 1/λ, λ > 0.
An insured is randomly selected from the portfolio and you observe C claims in Y years.
6.10 (2 points) What is the density of the posterior distribution of λ?
A. e Y
B. C (Y)C exp(-(Y)C) /
C. YC+1 CeY / C!
D. C YC /(+Y)C+1
E. None of the above
6.11 (2 points) Using Bayesian Analysis, what is the estimated future claim frequency for this
insured?
A. (C-1)/Y

B. C/Y

C. (C-1)/(Y-1)

D. C/(Y-1)

E. None of A, B, C, or D

Use the following information for a group of insureds for the next four questions:
(i) The amount of a claim is uniformly distributed, but will not exceed a certain unknown limit b.
(ii) The prior distribution of b is: π(b) = 200/b³, b > 10.
6.12 (2 points) Determine the probability that 25 < b < 35.
(A) 0.08
(B) 0.10
(C) 0.12
(D) 0.14
(E) 0.16
6.13 (2 points) Determine the probability that the next claim will exceed 20.
(A) 0.02
(B) 0.04
(C) 0.06
(D) 0.08
(E) 0.10
6.14 (2 points) From a given insured, three independent claims of sizes 17, 13, and 22 are
observed in that order. For this insured, determine the posterior probability that 25 < b < 35.
(A) 0.41
(B) 0.43
(C) 0.45
(D) 0.47
(E) 0.49
6.15 (2 points) From a given insured, three independent claims of sizes 17, 13, and 22 are
observed in that order.
Determine the probability that the next claim from this insured will exceed 20.
(A) 0.24
(B) 0.26
(C) 0.28
(D) 0.30
(E) 0.32

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 233
Use the following information for the next two questions:
Losses for individual policyholders follow a Compound Poisson Process.
The prior distribution of the annual claims intensity λ is uniform on [2, 6].
Severity is Gamma with parameters α = 3 and θ.
The prior distribution of θ has density 25 e^(-5/θ)/θ³, θ > 0.
6.16 (3 points) An individual policyholder has 3 claims this year.
What is the expected number of claims from that policyholder next year?
Hint: ∫ xⁿ e^(-x)/n! dx = -e^(-x) (xⁿ/n! + xⁿ⁻¹/(n-1)! + ... + x + 1).
(A) 3.00

(B) 3.25

(C) 3.50

(D) 3.75

(E) 4.00

6.17 (3 points) An individual policyholder has 3 claims this year, of sizes 4, 7, and 13.
What is the expected aggregate loss from that policyholder next year?

Hint:
∫ from 0 to ∞ of x^(-(α + 1)) e^(-θ/x) dx = Γ(α)/θ^α.

(A) 27

(B) 30

(C) 33

(D) 36

(E) 39

6.18 (3 points) You are given the following:


Claim sizes for a given policyholder follow a distribution with density function
f(x) = 3x²/b³, 0 < x < b.
The prior distribution of b is a Single Parameter Pareto Distribution with α = 3 and θ = 40.
A policyholder experiences two claims of sizes 30 and 60.
Determine the expected value of the next claim from this policyholder.
A. 30
B. 40
C. 50
D. 60
E. 70
6.19 (4 points) Use the following information:
Claim sizes for a given policyholder follow a mixed exponential distribution with density function
f(x) = 0.75 λ e^(-λx) + 0.5 λ e^(-2λx), 0 < x < ∞.
The prior distribution of λ is uniform from 0.01 to 0.05.
The policyholder experiences a claim of size 60.
Use Bayesian Analysis to determine the expected size of the next claim from this policyholder.
A. 36
B. 37
C. 38
D. 39
E. 40

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 234
Use the following information for the next two questions:
(i) Xi is the claim count observed for driver i for one year.
(ii) Xi has a negative binomial distribution with parameters β = 0.5 and ri.
(iii) The ri's have an exponential distribution with mean 0.4.
(iv) The size of claims follows a Pareto Distribution with α = 3 and θ = 1000.
6.20 (4 points) An individual driver is observed to have 2 claims in one year.
Use Bayesian Analysis to estimate this drivers future annual claim frequency.
(A) 0.33
(B) 0.35
(C) 0.37
(D) 0.39
(E) 0.41
6.21 (2 points) An individual driver is observed to have 2 claims in one year, of sizes 1500 and
800. Use Bayesian Analysis to estimate this drivers future annual aggregate loss.
(A) 190
(B) 210
(C) 230
(D) 250
(E) 270
Use the following information for the next two questions:
(i) Xi is the claim count observed for insured i for one year.
(ii) Xi has a negative binomial distribution with parameters r = 2 and βi.
(iii) The βi's have a distribution π[β] = 280 β⁴ / (1 + β)⁹, 0 < β < ∞.
6.22 (3 points) What is the mean annual claim frequency?
A. 1.67
B. 2
C. 2.5
D. 3
E. 3.33
6.23 (3 points) An insured has 8 claims in one year.
What is that insureds expected future annual claim frequency?
A. 4.8
B. 5.0
C. 5.2
D. 5.4
E. 5.6

6.24 (4 points) Use the following information:


Students are given standardized tests on different subjects.
The scores of students on each test are Normally distributed with mean 65.
However, the variance of each of these Normal Distributions, v, differs between the test.
v is assumed to be distributed across the portfolio via the improper prior distribution:
π(v) = 1/v, v > 0.
A test on a new subject is administered to 80 students.
The mean of these scores is 66.
The second moment of these scores is 4400.
Using Bayes Analysis, estimate the variance of this test.
A. 90
B. 95
C. 100
D. 105
E. 110

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 235
Use the following information for the next two questions:
Claim sizes for a given policyholder follow an exponential distribution with density function
f(x) = λ e^(-λx), 0 < x < ∞.
The prior distribution of λ is uniform from 0.02 to 0.10.
The policyholder experiences a claim of size y.
6.25 (3 points) If y = 60, use Bayesian Analysis to determine the expected size of the next claim
from this policyholder.
A. 24
B. 26
C. 28
D. 30
E. 32
6.26 (3 points) Determine the limit as y approaches infinity of the expected size of the next claim
from this policyholder.
A. 20
B. 30
C. 40
D. 50
E. ∞

Use the following information for the next three questions:


(i) The prior distribution of the parameter Θ has probability density function:
π(θ) = 1/θ², 1 < θ < ∞.
(ii) Given Θ = θ, claim sizes follow a Pareto distribution with parameters α = 3 and θ.
A claim of 5 is observed.
6.27 (3 points) Calculate the posterior probability that Θ exceeds 2.
(A) 0.81

(B) 0.84

(C) 0.87

(D) 0.90

(E) 0.93

6.28 (3 points) Calculate the expected value of the next claim.


(A) 5.2
(B) 5.4
(C) 5.6
(D) 5.8
(E) 6.0
6.29 (2 points) Which of the following is the probability that the next claim exceeds 5?
A. 162 ∫ from 1 to ∞ of θ³ / (5 + θ)⁶ dθ
B. 162 ∫ from 1 to ∞ of θ⁴ / (5 + θ)⁶ dθ
C. 162 ∫ from 1 to ∞ of θ³ / (5 + θ)⁷ dθ
D. 162 ∫ from 1 to ∞ of θ⁴ / (5 + θ)⁷ dθ
E. None of A, B, C, or D.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 236
6.30 (2 points) For each insured, the probability of event A in each year is p.
p is constant for each insured. Each year is independent of any other.
Over a portfolio of insureds, p is uniformly distributed from 0 to 10%.
For a given insured, event A occurs in each of three years.
In the fourth year, what is probability of observing event A for this same insured?
A. 7.5%
B. 8.0%
C. 8.5%
D. 9.0%
E. 9.5%
Use the following information for the next three questions:
Severity is uniform from 0 to b.
b is distributed uniformly from 10 to 30.
6.31 (1 point) What is the chance that the next loss is less than 10?
A. 55%
B. 60%
C. 65%
D. 70%
E. 75%
6.32 (3 points) If you observe two losses each of size less than 10, what is the chance that the next
loss is less than 10?
A. 55%
B. 60%
C. 65%
D. 70%
E. 75%
6.33 (2 points) If you observe two losses each of size less than 10, what is the expected size of
the next loss?
A. Less than 8.0
B. At least 8.0, but less than 8.5
C. At least 8.5, but less than 9.0
D. At least 9.0, but less than 9.5
E. At least 9.5

6.34 (3 points) Use the following information:

Each insured has its severity given by an Exponential Distribution with mean θ.
θ is assumed to be distributed across the portfolio via the improper prior distribution:
π(θ) = 1/θ, θ > 0.

An insured is randomly selected from the portfolio and you observe two losses of sizes 100
and 400.
Using Bayesian Analysis, what is the estimated future average severity for this insured?
A. 200

B. 250

C. 300

D. 400

E. 500

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 237
6.35 (3 points) For a portfolio of policies, you are given:
(i) The annual claim amount on a policy has probability density function: f(x | θ) = 2x/θ², 0 < x < θ.
(ii) The prior distribution of θ has density function: π(θ) = 4θ³, 0 < θ < 1.
(iii) A randomly selected policy had claim amount 0.9 in Year 1.
Determine the Bayesian estimate of the claim amount for the selected policy in Year 2.
(A) 0.444
(B) 0.500
(C) 0.622
(D) 0.634
(E) 0.667
6.36 (5 points) Use the following information:
Claim sizes follow a Gamma Distribution, with parameters α = 3 and θ.
The prior distribution of θ is assumed to be uniform on the interval (5, 10).
You observe from an insured 2 claims of sizes 15 and 31.
Using Bayes Analysis, what is the estimated future claim severity from this insured?
You may use the following values of the incomplete Gamma Function:
Γ[4; 4.6] = 0.674294.    Γ[4; 9.2] = 0.981580.
Γ[5; 4.6] = 0.486766.    Γ[5; 9.2] = 0.951420.

6.37 (5 points) You are given:


(i) The size of claims on a given policy has a Pareto Distribution with parameters = 1000 and .
(ii) The prior distribution of is uniform from 3 to 5.
A randomly selected policy had a claim of size 800.
Use Bayes Analysis to estimate the size of the next claim from this policy.
Hint:

x cx dx = x cx / ln[c] - cx / ln[c]2.
x

Let the Exponential Integral Function be Ei[x] =

Ei[ln(4/9)] = -0.30453219.
Ei[4 ln(4/9)] = -0.00959173.

et
dt .
t

Ei[2 ln(4/9)] = -0.08359820.


Ei[5 ln(4/9)] = -0.00353746.

Ei[3 ln(4/9)] = -0.02722911.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 238
6.38 (4 points) You are given:
(i) The annual number of claims on a given policy has a geometric distribution with parameter β.
(ii) The prior distribution of β has the density function:
π(β) = θ / (β + 1)^(θ + 1), 0 < β < ∞, where θ is a known constant greater than 2.
A randomly selected policy had x claims in Year 1.
Determine the Bayesian estimate of the number of claims for the selected policy in Year 2.
Hint: ∫ from 0 to 1 of y^(a-1) (1 - y)^(b-1) dy = β(a, b) = Γ[a] Γ[b] / Γ[a + b].
(A) 1/(θ - 1)
(B) 1/{(θ - 1)x + (θ - 1)}
(C) x/θ
(D) (x + 1)/θ
(E) (x + 1)/(θ - 1)

Use the following information for the next two questions:


Each insured has its severity given by a Gamma Distribution with α = 5.
θ is assumed to be distributed across the portfolio via the improper prior distribution:
π(θ) = 1/θ, θ > 0.

An insured is randomly selected from the portfolio and you observe 3 losses of sizes:
10, 30, 50.
6.39 (3 points) Using Bayesian Analysis, what is the estimated future average severity for this
insured?
A. 26

B. 28

C. 30

D. 32

E. 34

6.40 (3 points) Use the Bayesian central limit theorem to construct a 90% credibility interval
(confidence interval) for the estimate in the previous question.

6.41 (4 points) Severity is LogNormal with parameters μ and σ = 1.2.
μ is uniformly distributed across the portfolio from 9 to 10.
From an individual insured, you observe one claim of size 59,874.
Use Bayes Analysis to estimate the size of the next claim from the same insured.
Hint: ∫ Exp[bx - ax²] dx = Exp[b²/(4a)] √(π/a) {Φ[(2ax - b)/√(2a)] - 1/2}.
A. 31,000    B. 32,000    C. 33,000    D. 34,000    E. 35,000

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 239
6.42 (3 points) Use the following information:
Each insured has its annual frequency given by a Binomial Distribution with m = 5.
q is assumed to be distributed across the portfolio via the improper prior distribution:
π(q) = 1 / {q(1 - q)}, 0 < q < 1.
An insured is randomly selected from the portfolio and you observe three years
with the following numbers of claims: 3, 2, 5.
What is the estimated future average frequency for this insured?
A. 2.9

B. 3.0

C. 3.1

D. 3.2

E. 3.3

Use the following information for the next two questions:

For each insured, claim sizes are uniformly distributed from b to b + 100.
b varies between the insureds via an Exponential Distribution with θ = 80.
6.43 (3 points) An insured is selected at random,
and a claim of size 300 is observed from that insured.
Determine the expected value of the next claim from this same insured.
A. 280
B. 290
C. 300
D. 310
E. 320
6.44 (3 points) An insured is selected at random,
and two claims of sizes 200 and 230 are observed from that insured.
Determine the expected value of the next claim from this same insured.
A. 200
B. 210
C. 215
D. 220
E. 230

Use the following information for the next two questions:

Frequency for an individual is a 50-50 mixture of two Poissons with means λ and 2λ.
The prior distribution of λ is Exponential with a mean of 0.1.
6.45 (2 points) An insured is chosen at random and observed to have no claims in the first year.
Estimate of the expected number of claims next year for the same insured.
A. 0.09
B. 0.10
C. 0.11
D. 0.12
E. 0.13
6.46 (3 points) An insured is chosen at random and observed to have one claim in the first year.
Estimate of the expected number of claims next year for the same insured.
A. 0.24
B. 0.26
C. 0.28
D. 0.30
E. 0.32

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 240
6.47 (15 points) During World War II, you assume that the enemy has a total of n tanks that have
serial numbers from 1 to n.
You observe k enemy tanks and the maximum of their serial numbers is m.
You assume n follows the improper prior discrete distribution uniform on m to infinity.
(a) (2 points) For k > 1, using Bayes Analysis, determine the posterior distribution for n.

Hints: For binomial coefficients, for m ≥ k > 1:
Σ from i = m to ∞ of 1/C(i, k) = {k/(k - 1)} / C(m - 1, k - 1), where C(n, j) denotes the binomial coefficient.
For a sample of size k from the numbers 1 to n, for k ≤ m ≤ n, the probability that the maximum of
the sample is m is: C(m - 1, k - 1) / C(n, k).

(b) (2 points) For k > 2, determine the mean of this posterior distribution for n.
(c) (4 points) For k > 3, determine the variance of this posterior distribution.
Hint: Calculate the second factorial moment.
(d) (2 points) For k > 1, determine the posterior probability that n > x, for x m.
(e) (5 points) You observe 5 enemy tanks and their maximum serial number is 20.
With the aid of a computer, graph the posterior density of n, the total number of enemy tanks.
Determine the mean and variance of this posterior distribution of n.
With the aid of a computer, graph the posterior survival function of n.
6.48 (2, 5/83, Q.21) (1.5 points) Suppose one observation x is taken on a random variable X with
density function f(x | θ) = 2x / (1 - θ²), θ ≤ x ≤ 1.
The prior density function for θ is p(θ) = 4θ(1 - θ²), 0 < θ ≤ 1. What is E(θ | x)?
A. 2/(3x²)    B. 8/15    C. 2x/3    D. 2/x²    E. 2/x

6.49 (4B, 11/95, Q.5) (2 points) A number x is randomly selected from a uniform distribution on
the interval [0, 1]. Three independent Bernoulli trials are performed with probability of success x on
each trial. All three are successes.
What is the posterior probability that x is less than 0.9?
A. Less than 0.6
B. At least 0.6, but less than 0.7
C. At least 0.7, but less than 0.8
D. At least 0.8, but less than 0.9
E. At least 0.9

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 241
6.50 (4B, 11/97, Q.9) (2 points) You are given the following:
In a large portfolio of automobile risks, the number of claims for one policyholder during one
year follows a Bernoulli distribution with mean m/100,000, where m is the number of
miles driven each and every year by the policyholder.
The number of claims for one policyholder for one year is independent of the number of
claims for the policyholder for any other year. The number of claims for one
policyholder is independent of the number of claims for any other policyholder.
The distribution of m within the portfolio has density function
f(m) = m/100,000,000, 0 < m ≤ 10,000
= (20,000 - m)/100,000,000, 10,000 < m < 20,000.
A policyholder is selected at random from the portfolio. During Year 1, one claim is observed for this
policyholder. During Year 2, no claims are observed for this policyholder. No information is available
regarding the number of claims observed during Years 3 and 4.
Determine the posterior probability that the selected policyholder drives less than 10,000 miles
each year. Hint: Use a change of variable such as q = m/100,000.
A. 1/3
B. 37/106
C. 23/54
D. 1/2
E. 14/27
6.51 (4B, 5/98, Q.8) (2 points) You are given the following:
The number of claims during one exposure period follows a Bernoulli distribution
with mean q.
The prior density function of q is assumed to be f(q) = (π/2) sin(πq/2), 0 < q < 1.
Hint: ∫ from 0 to 1 of (πq/2) sin(πq/2) dq = 2/π, and ∫ from 0 to 1 of (πq²/2) sin(πq/2) dq = 4(π - 2)/π².

The claims experience is observed for one exposure period and no claims are observed.
Determine the posterior density function of q.
A. (/2) sin(q/2) , 0 < q < 1
B. (p/2) sin(q/2) , 0 < q < 1
C. ((1-q)/2) sin(q/2) , 0 < q < 1
D. (2q/4) sin(q/2) , 0 < q < 1
E. (2(1-q)/{2(-2)}) sin(q/2) , 0 < q < 1

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 242
6.52 (4B, 11/99, Q.5) (3 points) You are given the following:
Claim sizes for a given policyholder follow a distribution with density function
f(x) = 2x/b², 0 < x < b.
The prior distribution of b has density function
g(b) = 1/b², 1 < b < ∞.
The policyholder experiences a claim of size 2.
Determine the expected value of a second claim from this policyholder.
A. 1    B. 3/2    C. 2    D. 3    E. ∞
6.53 (4, 11/01, Q.14 & 2009 Sample Q.64) (2.5 points) For a group of insureds, you are given:
(i) The amount of a claim is uniformly distributed, but will not exceed a certain unknown limit θ.
(ii) The prior distribution of θ is π(θ) = 500/θ², θ > 500.
(iii) Two independent claims of 400 and 600 are observed.
Determine the probability that the next claim will exceed 550.
(A) 0.19
(B) 0.22
(C) 0.25
(D) 0.28
(E) 0.31
6.54 (2 points) Altering bullet (iii) in the previous question, 4, 11/01, Q.14,
assume that instead two independent claims of 400 and 300 are observed.
Determine the probability that the next claim will exceed 550.
(A) 0.19
(B) 0.22
(C) 0.25
(D) 0.28
(E) 0.31

6.55 (4, 11/02, Q.21 & 2009 Sample Q. 43) (2.5 points) You are given:
(i) The prior distribution of the parameter Θ has probability density function: π(θ) = 1/θ², 1 < θ < ∞.
(ii) Given Θ = θ, claim sizes follow a Pareto distribution with parameters α = 2 and θ.
A claim of 3 is observed.
Calculate the posterior probability that Θ exceeds 2.
(A) 0.33

(B) 0.42

(C) 0.50

(D) 0.58

(E) 0.64

6.56 (2 points) Altering bullet (i) in 4, 11/02, Q.21, you are given:
(i) The prior distribution of the parameter Θ has discrete probability density function:
Prob[Θ = 2] = 70%, Prob[Θ = 4] = 30%.
(ii) Given Θ = θ, claim sizes follow a Pareto distribution with parameters α = 2 and θ.
A claim of 3 is observed.
Calculate the posterior distribution of Θ.

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 243
6.57 (4, 11/02, Q.24 & 2009 Sample Q. 45) (2.5 points) You are given:
(i) The amount of a claim, X, is uniformly distributed on the interval [0, θ].
(ii) The prior distribution of θ is π(θ) = 500/θ², θ > 500.
Two claims, x1 = 400 and x2 = 600, are observed.
You calculate the posterior distribution as: f(θ | x1, x2) = 3(600³/θ⁴), θ > 600.
Calculate the Bayesian premium, E(X3 | x1, x2).
(A) 450

(B) 500

(C) 550

(D) 600

(E) 650

6.58 (2 points) Altering bullet (ii) in 4, 11/02, Q.24, you are given:
(i) The amount of a claim, X, is uniformly distributed on the interval [0, θ].
(ii) The prior distribution of θ is:
Prob[θ = 500] = 50%, Prob[θ = 1000] = 30%, Prob[θ = 2000] = 20%.
Two claims, x1 = 400 and x2 = 600, are observed.
Calculate the Bayesian premium, E(X3 | x1, x2).
A. Less than 500
B. At least 500, but less than 550
C. At least 550, but less than 600
D. At least 600, but less than 650
E. At least 650

6.59 (4, 11/03, Q.19 & 2009 Sample Q.15) (2.5 points) You are given:
(i) The probability that an insured will have at least one loss during any year is p.
(ii) The prior distribution for p is uniform on [0, 0.5].
(iii) An insured is observed for 8 years and has at least one loss every year.
Determine the posterior probability that the insured will have at least one loss during Year 9.
(A) 0.450
(B) 0.475
(C) 0.500
(D) 0.550
(E) 0.625
6.60 (4, 11/03, Q.31 & 2009 Sample Q.24) (2.5 points) You are given:
(i) The probability that an insured will have exactly one claim is θ.
(ii) The prior distribution of θ has probability density function:
π(θ) = (3/2) θ^(1/2), 0 < θ < 1.
A randomly chosen insured is observed to have exactly one claim.
Determine the posterior probability that θ is greater than 0.60.
(A) 0.54

(B) 0.58

(C) 0.63

(D) 0.67

(E) 0.72

2013-4-9 Buhlmann Cred. 6 Bayes Continuous Risk Types, HCM 10/19/12, Page 244
6.61 (4, 11/04, Q.33 & 2009 Sample Q.157) (2.5 points) You are given:
(i) In a portfolio of risks, each policyholder can have at most one claim per year.
(ii) The probability of a claim for a policyholder during a year is q.
(iii) The prior density is π(q) = q³/0.07, 0.6 < q < 0.8.
A randomly selected policyholder has one claim in Year 1 and zero claims
in Year 2.
For this policyholder, determine the posterior probability that 0.7 < q < 0.8.
(A) Less than 0.3
(B) At least 0.3, but less than 0.4
(C) At least 0.4, but less than 0.5
(D) At least 0.5, but less than 0.6
(E) At least 0.6
6.62 (4, 11/05, Q.32 & 2009 Sample Q.242) (2.9 points) You are given:
(i) In a portfolio of risks, each policyholder can have at most two claims per year.
(ii) For each year, the distribution of the number of claims is:
Number of Claims Probability
0
0.10
1
0.90 - q
2
q
(iii) The prior density is:
π(q) = q²/0.039, 0.2 < q < 0.5.
A randomly selected policyholder had two claims in Year 1 and two claims in Year 2.
For this insured, determine the Bayesian estimate of the expected number of claims in
Year 3.
(A) Less than 1.30
(B) At least 1.30, but less than 1.40
(C) At least 1.40, but less than 1.50
(D) At least 1.50, but less than 1.60
(E) At least 1.60

Solutions to Problems:
6.1. C. Assuming a given value of q, the chance of observing two successes in three trials is
3q2 (1-q). The prior distribution of q is: g(q) = 1, 0 q 1. By Bayes Theorem, the posterior
distribution of q is proportional to the product of the chance of the observation and the prior
distribution: 3q2 (1-q). Thus the posterior distribution of q is proportional to q2 - q3 .
(You can keep the factor of 3 and get the same result.)
The integral of q2 - q3 from 0 to 1 is 1/3 - 1/4 = 1/12.
Thus the posterior distribution of q is 12(q2 - q3 ). (The integral of the posterior distribution has to be
unity. In this case dividing by 1/12; i.e., multiplying by 12, will make it so.)
The posterior chance of q in [.5, .6] is:
.6

.6

] = 0.1627.

12 (q2 - q3 ) dq = 4q3 - 3q4


q=.5

q=.5

Comment: A Beta-Bernoulli conjugate prior situation. The uniform distribution is a Beta distribution
with a = 1 and b = 1. The posterior distribution is Beta with parameters a′ = a + 2 = 1 + 2 = 3, and
b′ = b + 3 - 2 = 1 + 3 - 2 = 2: {4!/((2!)(1!))} q^(3-1) (1 - q)^(2-1) = 12(q² - q³).
See Mahlers Guide to Conjugate Priors.
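As a numerical cross-check (not part of the original solution), the 0.1627 above can be reproduced by integrating the prior times the likelihood directly; the short Python sketch below assumes only the uniform prior and the chance of observation 3q²(1-q) stated above.

from scipy import integrate

# Prior: q uniform on [0, 1]; chance of 2 successes in 3 trials: 3 q^2 (1 - q).
likelihood = lambda q: 3 * q**2 * (1 - q)
norm, _ = integrate.quad(likelihood, 0, 1)        # normalizing constant of the posterior
num, _ = integrate.quad(likelihood, 0.5, 0.6)     # unnormalized posterior mass on [0.5, 0.6]
print(num / norm)                                 # approximately 0.1627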
6.2. D. From the solution to the prior question, the posterior distribution of q is: 12(q2 - q3 ).
The mean of this posterior distribution is:
∫_0^1 q 12(q² - q³) dq = [3q⁴ - (12/5)q⁵] evaluated from q = 0 to q = 1 = 3 - 2.4 = 0.6.
The chance of a success on the fourth trial is E[q] = 0.6.


Comment: A Beta-Bernoulli conjugate prior situation. The uniform distribution is a Beta distribution
with a=1 and b=1. The posterior mean is:
(a + number of successes) / (a + b + number of trials) = (1 + 2)/(1 + 1 + 3) = 3/5 = .6.
See Mahlers Guide to Conjugate Priors.

6.3. B. Severity is uniform on [0, θ].
If for example, θ = 13.01, the chance of a claim of size 13 is 1/13.01.
If for example, θ = 12.99, the chance of a claim of size 13 is 0.
For θ ≥ 13, Prob[observation] = 24 f(4) f(5) f(7) f(13) = (24)(1/θ)(1/θ)(1/θ)(1/θ) = 24/θ⁴.
For θ < 13, Prob[observation] = 24 f(4) f(5) f(7) f(13) = 24 f(4) f(5) f(7) (0) = 0.
π(θ) = 200/θ³, θ > 10.
∫_10^∞ π(θ) Prob[observation | θ] dθ = ∫_10^13 π(θ) (0) dθ + ∫_13^∞ (200/θ³)(24/θ⁴) dθ = 800/13⁶.
By Bayes Theorem, the density of the posterior distribution of θ is:
π(θ) Prob[observation | θ] / (800/13⁶) = (4800/θ⁷)/(800/13⁶) = 6(13⁶)/θ⁷, θ ≥ 13.
(Recall that if θ < 13, Prob[observation] = 0.)
For θ < 15, S(15) = 0. For θ ≥ 15, S(15) = (θ - 15)/θ = 1 - 15/θ.
The probability that the next claim will exceed 15 is:
∫_13^∞ S(15) 6(13⁶)/θ⁷ dθ = ∫_13^15 0 dθ + ∫_15^∞ (1 - 15/θ) 6(13⁶)/θ⁷ dθ =
13⁶/15⁶ - (90/7)(13⁶/15⁷) = 6.05%.
Comment: Similar to 4, 11/01, Q.14.
Note that the posterior distribution of θ is also a Single Parameter Pareto, so the Single Parameter
Pareto is a Conjugate Prior to the uniform likelihood.
In general: α′ = α + n, and θ′ = Max[x1, ..., xn, θ].
In this case, α′ = 2 + 4 = 6, and θ′ = Max[4, 5, 7, 13, 10] = 13.
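As a sketch (not part of the original solution), the 6.05% figure can be checked numerically by integrating the posterior Single Parameter Pareto density against the conditional probability that a claim exceeds 15:

import numpy as np
from scipy import integrate

alpha, theta = 6, 13.0                                          # posterior Single Parameter Pareto
posterior = lambda t: alpha * theta**alpha / t**(alpha + 1)     # density for t >= 13
surv15    = lambda t: max(0.0, 1 - 15 / t)                      # S(15 | theta = t) for the uniform severity
prob, _ = integrate.quad(lambda t: surv15(t) * posterior(t), theta, np.inf)
print(prob)                                                     # approximately 0.0605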

6.4. The density at x of the zero-truncated negative binomial (with r = 2) is proportional to:
{β^x/(1+β)^(x+2)} / {1 - 1/(1+β)²} = β^(x-1)/{(1+β)^x (2+β)}.
Thus the chance of the observation is proportional to:
β^350/{(1+β)^450 (2+β)^100}.
Thus multiplying by π[β], the posterior distribution of β is proportional to:
β^356/{(1+β)^462 (2+β)^100}, β > 0.

6.5. E. ∫_0.03^0.04 f(q) dq = ∫_0.03^0.04 400q dq = [200q²] evaluated from q = 0.03 to q = 0.04 =
200(0.0016 - 0.0009) = 14%.
6.6. A. The chance of the observation given q is q.
By Bayes Theorem, the posterior probability density function is proportional to f(q)q.
In order to compute the posterior density function we need to divide by the integral of f(q)q.
∫_0^0.1 f(q)q dq = ∫_0^0.05 400q² dq + ∫_0.05^0.1 (40q - 400q²) dq =
[(400/3)q³] from 0 to 0.05 + [20q² - (400/3)q³] from 0.05 to 0.1 = 0.01667 + (0.2 - 0.1333) - (0.05 - 0.01667) = 0.05.
Thus the posterior density is: 400q²/0.05 = 8000q² for 0 < q ≤ 0.05, and
(40q - 400q²)/0.05 = 800q - 8000q² for 0.05 < q ≤ 0.1. Then the posterior chance that q is in the
interval [0.03, 0.04] is the integral from 0.03 to 0.04 of the posterior density:
∫_0.03^0.04 8000q² dq = [8000q³/3] evaluated from q = 0.03 to q = 0.04 = 0.1707 - 0.0720 = 9.87%.
Comment: An example of a Bayesian Interval Estimate. After having observed a claim, the chance
of Phil being a better than average risk has declined. For example, the chance of Phil's expected
frequency being in the interval [0.03, 0.04] has declined from 14% to 9.9%.
6.7. C. The chance of the observation given q is q(1-q) = q - q².
By Bayes Theorem, the posterior probability density function is proportional to f(q)(q - q²).
In order to compute the posterior density function we need to divide by the integral of f(q)(q - q²).
∫_0^0.1 f(q)(q - q²) dq = ∫_0^0.05 (400q² - 400q³) dq + ∫_0.05^0.1 (40q - 440q² + 400q³) dq =
[(400/3)q³ - 100q⁴] from 0 to 0.05 + [20q² - (440/3)q³ + 100q⁴] from 0.05 to 0.1 = 0.016042 + 0.031042 = 0.047083.
Thus the posterior density is: (400q² - 400q³)/0.047083 for 0 < q ≤ 0.05, and
(40q - 440q² + 400q³)/0.047083 for 0.05 < q ≤ 0.1. Then the posterior chance that q is in the interval
[0.03, 0.04] is the integral from 0.03 to 0.04 of the posterior density:
∫_0.03^0.04 (400q² - 400q³)/0.047083 dq = [{(400/3)q³ - 100q⁴}/0.047083] evaluated from q = 0.03 to q = 0.04 =
0.004758/0.047083 = 10.1%.
6.8. D. The chance of the observation given q is q(1-q)² = q - 2q² + q³.
By Bayes Theorem, the posterior probability density function is proportional to f(q)(q - 2q² + q³).
In order to compute the posterior density we need to divide by the integral of f(q)(q - 2q² + q³).
∫_0^0.1 f(q)(q - 2q² + q³) dq = ∫_0^0.05 400(q² - 2q³ + q⁴) dq + ∫_0.05^0.1 40(q - 12q² + 21q³ - 10q⁴) dq =
[400(q³/3 - q⁴/2 + q⁵/5)] from 0 to 0.05 + [40(q²/2 - 4q³ + 21q⁴/4 - 2q⁵)] from 0.05 to 0.1 =
(400)(0.000041667 - 0.000003125 + 0.000000063) +
40{(0.005 - 0.004 + 0.000525 - 0.00002) - (0.00125 - 0.0005 + 0.000032813 - 0.000000625)} =
0.01544 + 0.02891 = 0.04435.
Thus the posterior density is: 400(q² - 2q³ + q⁴)/0.04435 for 0 < q ≤ 0.05, and
40(q - 12q² + 21q³ - 10q⁴)/0.04435 for 0.05 < q ≤ 0.1. Then the posterior chance that q is in the
interval [0.03, 0.04] is the integral from 0.03 to 0.04 of the posterior density:
∫_0.03^0.04 400(q² - 2q³ + q⁴)/0.04435 dq = [400(q³/3 - q⁴/2 + q⁵/5)/0.04435] evaluated from q = 0.03 to q = 0.04 =
0.004590/0.04435 = 10.35%.

n

6.9. The probability of the observation given is: f(x1 ; ) f(x2 ; )... f(xn ; ) = f(xi ; ).
i=1
n

Posterior density of is: () f(xi ; ) / () f(xi ; ) d.


i=1

i=1

Prob[ a b] = Integral of the posterior density of , from a to b:


b

() f(xi ; ) d / () f(xi ; ) d.
a

i=1

i=1

6.10. E. Given lambda, the number of claims over Y years is Poisson with mean Yλ.
Therefore, the chance of the observation given λ is: (Yλ)^C e^(-Yλ)/C!.
The posterior distribution is proportional to the product of the chance of the observation given
lambda and the prior distribution of lambda: {Y^C λ^C e^(-Yλ)/C!} π(λ) = {Y^C λ^C e^(-Yλ)/C!}/λ, which is
proportional to: λ^(C-1) e^(-Yλ), proportional to a Gamma Distribution with α = C and θ = 1/Y.
Thus the posterior distribution of lambda is a Gamma Distribution with α = C and θ = 1/Y:
π(λ | x) = Y^C λ^(C-1) e^(-Yλ)/(C-1)!, λ > 0.
Comment: After this first set of observations one now has the starting point for a Gamma-Poisson
process, which can be easily updated for any additional observations of the same insured, as
discussed in "Mahler's Guide to Conjugate Priors".
6.11. B. The posterior distribution of lambda is a Gamma Distribution with α = C and θ = 1/Y.
Since each insured's average claim frequency is lambda, the expected future claim frequency is just
the expected value of the posterior distribution of lambda.
The mean of a Gamma Distribution is: αθ = C/Y.
Comment: The estimated future claim frequency is equal to the observed claim frequency C/Y.
That is why 1/λ is referred to as the noninformative or vague (improper) prior distribution for a
Poisson.
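As an illustration (not part of the original solution), the Gamma posterior above can be updated conjugately as more data for the same insured arrive; the Python sketch below uses hypothetical values of C and Y.

def update(alpha, theta, extra_claims, extra_years):
    # Gamma-Poisson conjugate update: alpha' = alpha + claims, 1/theta' = 1/theta + years.
    return alpha + extra_claims, 1.0 / (1.0 / theta + extra_years)

C, Y = 4, 10                      # hypothetical observed claims and years
alpha, theta = C, 1.0 / Y         # posterior from the solution above
print(alpha * theta)              # posterior mean frequency = C/Y = 0.4
alpha, theta = update(alpha, theta, extra_claims=1, extra_years=2)
print(alpha * theta)              # updated mean = (C + 1)/(Y + 2) = 5/12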
6.12. A. & 6.13. D.
Prob[25 < b < 35] = ∫_25^35 200/b³ db = [-100/b²] evaluated from 25 to 35 = 0.16 - 0.0816 = 0.0784.
For b ≥ 20, S(20) = (b - 20)/b = 1 - 20/b.
∫_20^∞ (1 - 20/b) 200/b³ db = [-100/b² + (20)(200/3)/b³] evaluated from b = 20 to b = ∞ = 0.25 - 0.1667 = 0.0833.
6.14. B. & 6.15. A. For b ≥ 22, Prob[observation] = f(17)f(13)f(22) = (1/b)(1/b)(1/b) = 1/b³.
For b < 22, Prob[observation] = f(17)f(13)f(22) = f(17)f(13)(0) = 0.
∫ π(b) Prob[observation | b] db = ∫_22^∞ (200/b³)(1/b³) db = (200/5)/22⁵ = 40/22⁵.
By Bayes Theorem, the posterior distribution of b is:
π(b) Prob[observation | b] / {40/22⁵} = (5)(22⁵)/b⁶, b > 22.
Prob[25 < b < 35] = ∫_25^35 (5)(22⁵)/b⁶ db = [-(22⁵)/b⁵] evaluated from 25 to 35 = 0.5277 - 0.0981 = 0.4296.
For b ≥ 20, S(20) = (b - 20)/b = 1 - 20/b.
∫_22^∞ (1 - 20/b)(5)(22⁵)/b⁶ db = 1 - (20)(5/6)/22 = 1 - 0.7576 = 0.2424.
Comment: Similar to 4, 11/01, Q.14.
If we did not assume a prior distribution of b, then the maximum likelihood fit of b would be the
maximum of the sample, or 22. Then the estimate of S(20) would be: (22 - 20)/22 = 1/11.
6.16. D. The chance of the observation given θ is: f(3 | θ) = θ³e^(-θ)/6.
The prior density of θ is: 1/4, 2 ≤ θ ≤ 6.
Therefore, the posterior distribution is proportional to: (1/4)θ³e^(-θ)/6 ∝ θ³e^(-θ).
The posterior distribution of θ is: θ³e^(-θ) / ∫_2^6 θ³e^(-θ) dθ = θ³e^(-θ) / [(3!){-e^(-x)(x³/3! + x²/2! + x + 1)} evaluated from 2 to 6] =
θ³e^(-θ) / [(6){(e^-2)(6.333) - (e^-6)(61)}] = θ³e^(-θ)/4.236. The mean of the posterior distribution is:
∫_2^6 θ⁴e^(-θ) dθ / 4.236 = [(4!){-e^(-x)(x⁴/4! + x³/3! + x²/2! + x + 1)} evaluated from 2 to 6]/4.236 =
(24){(e^-2)(7) - (e^-6)(115)}/4.236 = 15.894/4.236 = 3.75.

6.17. C. Since the prior distributions of θ and δ are independent, the distribution of θ posterior to
observing 3 claims is that from the previous solution, with resulting estimated future frequency of 3.75.
Given δ, the chance of the observed sizes of loss is proportional to:
6 f(4) f(7) f(13) ∝ e^(-4/δ)δ^(-3) e^(-7/δ)δ^(-3) e^(-13/δ)δ^(-3) = e^(-24/δ)δ^(-9).
Thus the posterior distribution of δ is proportional to: e^(-24/δ)δ^(-9) e^(-5/δ)/δ³ = e^(-29/δ)/δ^12.
Mean of the posterior distribution of δ is:
∫_0^∞ δ e^(-29/δ)/δ^12 dδ / ∫_0^∞ e^(-29/δ)/δ^12 dδ = {Γ(10)/29^10}/{Γ(11)/29^11} = 29(9!/10!) = 2.9.
E[severity] = E[3δ] = (3)(2.9) = 8.7.
Mean aggregate loss = (mean frequency)(mean severity) = (3.75)(8.7) = 32.6.
Comment: The posterior distribution of δ is an Inverse Gamma with α = 11 and scale parameter 29,
with mean 29/(11 - 1) = 2.9.
6.18. C. If b < 60, then we would not observe a loss of size 60.
The probability of the observation is: 2f(30)f(60), for b ≥ 60.
2f(30)f(60) = (2)(900/b³)(3600/b³), which is proportional to: 1/b⁶, b ≥ 60.
Single Parameter Pareto Distribution with α = 3 and θ = 40:
π(b) = (3)40³/b⁴, which is proportional to: 1/b⁴, b > 40.
Thus the posterior distribution of b is proportional to: (1/b⁴)(1/b⁶) = 1/b¹⁰, b ≥ 60.
The posterior distribution of b is:
(1/b¹⁰) / ∫_60^∞ 1/b¹⁰ db = (1/b¹⁰)/{1/((9)(60⁹))} = (9)(60⁹)/b¹⁰.
Given b, the mean severity is the integral from 0 to b of xf(x): 3b/4.
Posterior expected value of this mean severity:
∫_60^∞ (3b/4){(9)(60⁹)/b¹⁰} db = {1/((8)(60⁸))}(3/4)(9)(60⁹) = 50.6.
Comment: Similar to 4B, 11/99, Q.5.

6.19. E. X is a 75%-25% mixture of Exponentials with means 1/λ and 1/(2λ).
E[X | λ] = 0.75/λ + 0.25/(2λ) = 0.875/λ. The posterior density of λ is proportional to:
π(λ) f(60) = (0.75λe^(-60λ) + 0.5λe^(-120λ))/0.04 = 18.75λe^(-60λ) + 12.5λe^(-120λ), 0.01 ≤ λ ≤ 0.05.
∫_0.01^0.05 (18.75λe^(-60λ) + 12.5λe^(-120λ)) dλ =
[18.75(-λe^(-60λ)/60 - e^(-60λ)/3600) + 12.5(-λe^(-120λ)/120 - e^(-120λ)/14400)] evaluated from λ = 0.01 to λ = 0.05 = 0.00409634.
Therefore, the posterior density is:
(18.75λe^(-60λ) + 12.5λe^(-120λ))/0.00409634 = 4577λe^(-60λ) + 3052λe^(-120λ), 0.01 ≤ λ ≤ 0.05.
The expected size of the next claim from the same policyholder is:
∫_0.01^0.05 (4577λe^(-60λ) + 3052λe^(-120λ)) E[X | λ] dλ = ∫_0.01^0.05 (4577λe^(-60λ) + 3052λe^(-120λ)) 0.875/λ dλ =
0.875 ∫_0.01^0.05 (4577e^(-60λ) + 3052e^(-120λ)) dλ = 39.96.
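A numerical cross-check (not part of the original solution), using only the uniform prior, the mixture density at 60, and E[X | λ] = 0.875/λ stated above:

from math import exp
from scipy import integrate

prior  = lambda lam: 25.0                                        # uniform on [0.01, 0.05], density 1/0.04
f60    = lambda lam: 0.75 * lam * exp(-60 * lam) + 0.25 * (2 * lam) * exp(-120 * lam)
mean_x = lambda lam: 0.875 / lam                                 # E[X | lambda]
norm, _ = integrate.quad(lambda lam: prior(lam) * f60(lam), 0.01, 0.05)
est,  _ = integrate.quad(lambda lam: prior(lam) * f60(lam) * mean_x(lam), 0.01, 0.05)
print(est / norm)                                                # approximately 39.96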

6.20. E. For a Negative Binomial, f(2) = (r(r+1)/2) β²/(1+β)^(2+r). g(r) = e^(-r/0.4)/0.4.
Therefore the posterior distribution is proportional to: e^(-2.5r) r(r+1)/(1.5)^r = (r + r²)e^(-2.905r).
∫_0^∞ (r + r²)e^(-2.905r) dr = Γ(2)/2.905² + Γ(3)/2.905³ = 1/2.905² + 2/2.905³ = 0.200.
Therefore, the posterior distribution of r is: (r + r²)e^(-2.905r)/0.200 = 5(r + r²)e^(-2.905r).
E[r] = ∫_0^∞ r 5(r + r²)e^(-2.905r) dr = 5Γ(4)/2.905⁴ + 5Γ(3)/2.905³ = 30/2.905⁴ + 10/2.905³ = 0.829.
Expected future annual frequency = E[rβ] = 0.5 E[r] = 0.414.
Comment: Set up taken from 4, 5/00, Q.37. 1/(1.5)^r = e^(-ln(1.5) r) = e^(-0.405r).
∫_0^∞ t^(α-1) e^(-t/θ) dt = Γ(α) θ^α, see "Mahler's Guide to Conjugate Priors".
We apply this result twice, once with α = 2 and θ = 1/2.905, and once with α = 3 and θ = 1/2.905.

6.21. B. The chance of the observation is:
2(Negative Binomial @2)(Pareto @ 1500)(Pareto @ 800).
However, since the Pareto densities do not involve r, they do not affect the posterior density of r.
From the previous solution, the posterior density of r is 5(r + r2 )e-2.905r, and the expected future
annual frequency is .414. The average severity is: 1000/(3-1) = 500.
Expected future annual aggregate loss is: (500)(.414) = 207.
Comment: Since the parameters of the Pareto do not vary by insured, the observed claim sizes do
not affect the posterior distribution.
6.22. E. Making the change of variables, x = β/(1+β), dβ = dx/(1-x)²:
E[β] = ∫_0^∞ β π(β) dβ = 280 ∫_0^∞ β⁵/(1+β)⁹ dβ = 280 ∫_0^1 x⁵(1-x)⁴ dx/(1-x)² = 280 ∫_0^1 x⁵(1-x)² dx =
(280){Γ(6)Γ(3)/Γ(6 + 3)} = (280)(5! 2!/8!) = 5/3. E[rβ] = (2)E[β] = (2)(5/3) = 10/3 = 3.33.
Comment: By a change of variables the distribution of the β parameter was converted to a Beta
distribution and the integral into a Beta type integral. The Beta Distribution and Beta Type Integrals
are discussed in "Mahler's Guide to Conjugate Priors". See also page 2 of the tables attached to
your exam. If π[β] is proportional to β^(a-1)/(1 + β)^(a+b), then x = β/(1+β) follows a Beta Distribution
with parameters a and b. E[β] = a/(b-1). For this problem a = 5 and b = 4, and a/(b-1) = 5/3.
The mixed distribution is sometimes called a Generalized Waring Distribution.
See Example 4.7.2 in Insurance Risk Models by Panjer and Willmot.

6.23. C. The chance of the observation is the p.d.f. at 8 of a Negative Binomial Distribution, which
given β is proportional to: β⁸/(1+β)^(8+2) = β⁸/(1+β)^10. Therefore, the posterior distribution of β is
proportional to: {β⁸/(1+β)^10}{β⁴/(1+β)⁹} = β^12/(1+β)^19.
E[β] = ∫_0^∞ β^13/(1+β)^19 dβ / ∫_0^∞ β^12/(1+β)^19 dβ = ∫_0^1 x^13(1-x)^6 dx/(1-x)² / ∫_0^1 x^12(1-x)^7 dx/(1-x)²
= ∫_0^1 x^13(1-x)⁴ dx / ∫_0^1 x^12(1-x)⁵ dx = {Γ(14)Γ(5)/Γ(14 + 5)}/{Γ(13)Γ(6)/Γ(13 + 6)} =
(13! 4!/18!)/(12! 5!/18!) = 13/5. E[rβ] = (2)E[β] = (2)(13/5) = 26/5 = 5.2.
Comment: If π[β] is proportional to β^(a-1)/(1 + β)^(a+b), and one observes C claims in one year, then
the mean of the posterior distribution of β is: (a + C)/(r + b - 1).
For this problem a = 5, b = 4, r = 2 and C = 8. (a + C)/(r + b - 1) = 13/5.
If for fixed r, 1/(1+β) of the Negative Binomial is distributed over a portfolio by a Beta,
then the posterior distribution of the 1/(1+β) parameters is also given by a Beta.
Thus the Beta distribution is a conjugate prior to the Negative Binomial Distribution for fixed r.
Equivalently, β/(1+β) can be distributed via a different Beta Distribution.
It turns out that this is an example of exact credibility, in which the estimate from Bayesian Analysis
equals that from Buhlmann Credibility. In this case K = (b-1)/r and Z = r/(r+b-1).
Other examples of exact credibility are discussed in "Mahler's Guide to Conjugate Priors".

6.24. B. A Normal Distribution with mean 65 and variance v has a density proportional to:
exp[-(x - 65)²/(2v)]/√v.
Thus the chance of the observation is proportional to: exp[-Σ(xi - 65)²/(2v)]/v^20.
π(v) = 1/v, v > 0.
Thus the posterior distribution of v is proportional to: exp[-Σ(xi - 65)²/(2v)]/v^21, v > 0.
Therefore, the posterior distribution of v is Inverse Gamma with:
α + 1 = 21, and θ = Σ(xi - 65)²/2.
Σ(xi - 65)² = Σxi² - 130 Σxi + (n)(65²) = (80)(4400) - (130)(80)(66) + (80)(65²) = 3600.
The posterior distribution of v is Inverse Gamma with α = 21 - 1 = 20, and θ = 3600/2 = 1800.
The mean of this Inverse Gamma is: 1800/(20 - 1) = 94.7.
The estimate of the variance of this test is 94.7.
Comment: In the absence of any other information, such as a prior mean and prior distribution of v,
our estimate of the variance of this test would be the sample variance of:
(80/79)(4400 - 66²) = 44.56.
As usual, our estimate depends on the prior distribution used.

6.25. C. The chance of the observation given δ is: δe^(-60δ).
The prior distribution of δ is uniform from 0.02 to 0.10, with constant density 12.5.
The posterior distribution of δ is proportional to: 12.5 δe^(-60δ), 0.02 < δ < 0.1.
Therefore, the posterior distribution of δ is proportional to: δe^(-60δ), 0.02 < δ < 0.1.
∫_0.02^0.1 δe^(-60δ) dδ = [-δe^(-60δ)/60 - e^(-60δ)/3600] evaluated from 0.02 to 0.1 = 0.000179.
Thus the posterior distribution of δ is: δe^(-60δ)/0.000179, 0.02 < δ < 0.1.
The expected future severity is: E[1/δ] =
∫_0.02^0.1 (1/δ) δe^(-60δ)/0.000179 dδ = (1/0.000179) ∫_0.02^0.1 e^(-60δ) dδ = (1/0.000179)[-e^(-60δ)/60] evaluated from 0.02 to 0.1 = 27.8.
6.26. D. The chance of the observation given δ is: δe^(-yδ).
Thus the posterior distribution of δ is proportional to: δe^(-yδ), 0.02 < δ < 0.1.
∫_0.02^0.1 δe^(-yδ) dδ = [-δe^(-yδ)/y - e^(-yδ)/y²] evaluated from 0.02 to 0.1 =
0.02e^(-0.02y)/y + e^(-0.02y)/y² - 0.1e^(-0.1y)/y - e^(-0.1y)/y².
The posterior distribution of δ is: δe^(-yδ)/{0.02e^(-0.02y)/y + e^(-0.02y)/y² - 0.1e^(-0.1y)/y - e^(-0.1y)/y²}, 0.02 < δ < 0.1.
The expected future severity is:
E[1/δ] = ∫_0.02^0.1 e^(-yδ) dδ / {0.02e^(-0.02y)/y + e^(-0.02y)/y² - 0.1e^(-0.1y)/y - e^(-0.1y)/y²} =
{e^(-0.02y)/y - e^(-0.1y)/y}/{0.02e^(-0.02y)/y + e^(-0.02y)/y² - 0.1e^(-0.1y)/y - e^(-0.1y)/y²} =
{1 - e^(-0.08y)}/{0.02 + 1/y - 0.1e^(-0.08y) - e^(-0.08y)/y}. As y goes to infinity this goes to 1/0.02 = 50.
Alternately, as y gets larger it is more and more likely that the mean severity of 1/δ gets larger, or δ
gets smaller. In the limit δ = 0.02, the smallest possible value. 1/δ → 50.

6.27. C. π(θ) = 1/θ², 1 < θ < ∞. f(x) = αθ^α/(θ + x)^(α+1) = 3θ³/(x + θ)⁴. f(5) = 3θ³/(5 + θ)⁴.
Posterior distribution of θ is proportional to: π(θ)f(5) = 3θ/(5 + θ)⁴, 1 < θ < ∞.
∫_1^∞ 3θ/(5 + θ)⁴ dθ = 3 ∫_1^∞ (5 + θ - 5)/(5 + θ)⁴ dθ = 3 ∫_1^∞ {1/(5 + θ)³ - 5/(5 + θ)⁴} dθ =
3[-(1/2)/(5 + θ)² + (5/3)/(5 + θ)³] evaluated from θ = 1 to θ = ∞ = (3)(1/162) = 1/54.
Posterior distribution of θ is: {3θ/(5 + θ)⁴}/(1/54) = 162θ/(5 + θ)⁴, 1 < θ < ∞.
The posterior probability that θ exceeds 2 = ∫_2^∞ 162θ/(5 + θ)⁴ dθ =
(162)[-(1/2)/(5 + θ)² + (5/3)/(5 + θ)³] evaluated from θ = 2 to θ = ∞ = (162)(0.005345) = 0.866.
Comment: Similar to 4, 11/02, Q.21, however the integral here is harder.


6.28. B. The mean of the Pareto Distribution is: θ/(α - 1) = θ/2.
Posterior distribution of θ is: 162θ/(5 + θ)⁴, 1 < θ < ∞.
The expected value of the next claim is: ∫_1^∞ (θ/2) 162θ/(5 + θ)⁴ dθ = 81 ∫_1^∞ θ²/(5 + θ)⁴ dθ =
81 ∫_1^∞ (θ² + 10θ + 25 - 10θ - 25)/(5 + θ)⁴ dθ = 81 ∫_1^∞ {1/(5 + θ)² - 10θ/(5 + θ)⁴ - 25/(5 + θ)⁴} dθ =
81{(1/6) - 10(1/162) - 25(1/648)} = (81)(0.06636) = 5.375.
Alternately, one can let y = θ/(5 + θ). Then dy = 5dθ/(5 + θ)².
expected value = 81 ∫_{1/6}^1 y²/5 dy = [(27/5)y³] evaluated from y = 1/6 to y = 1 = 5.375.
6.29. D. For the Pareto, S(x) = {θ/(θ + x)}^α. S(5) = θ³/(5 + θ)³.
Posterior distribution of θ is: 162θ/(5 + θ)⁴, 1 < θ < ∞.
Therefore, the probability that the next claim exceeds 5 is:
∫_1^∞ {θ³/(5 + θ)³} 162θ/(5 + θ)⁴ dθ = ∫_1^∞ 162θ⁴/(5 + θ)⁷ dθ.
Comment: Let y = θ/(5 + θ). Then 1 - y = 5/(5 + θ) and dy = 5dθ/(5 + θ)².
Probability = 162 ∫_{1/6}^1 y⁴(1-y)/25 dy = (162/25)[y⁵/5 - y⁶/6] evaluated from y = 1/6 to y = 1 = 21.6%.
6.30. B. The posterior distribution is proportional to:
(density of p)(probability of observation given p) = (10)(p³).
∫_0^0.1 p³ dp = 0.000025. The posterior distribution of p is: p³/0.000025 = 40,000p³, 0 < p < 0.1.
Posterior mean of p = ∫_0^0.1 p 40,000p³ dp = 8%.

6.31. A. Prob[x < 10 | b] = 10/b. Prob[x < 10] = ∫_10^30 (10/b)(1/20) db = (1/2)(ln 30 - ln 10) = 0.549.
6.32. C. The chance of the observation given b is: (10/b)².
Prior density of b is: 1/20, 10 ≤ b ≤ 30.
Posterior distribution of b is proportional to: (10/b)²(1/20) = 5/b².
Posterior distribution of b = (5/b²) / ∫_10^30 5/b² db = (5/b²)/(1/3) = 15/b², 10 ≤ b ≤ 30.
Posterior Prob[x < 10] = ∫_10^30 (10/b)(15/b²) db = 2/3.

6.33. B. Posterior distribution of b = 15/b², 10 ≤ b ≤ 30. E[X | b] = b/2.
Posterior mean of X = ∫_10^30 (b/2)(15/b²) db = 8.24.
6.34. E. The chance of the observation is: 2(e^(-100/θ)/θ)(e^(-400/θ)/θ) = 2e^(-500/θ)/θ².
Therefore, the posterior distribution is proportional to: (1/θ)e^(-500/θ)/θ² = e^(-500/θ)/θ³.
Therefore, the posterior mean is:
∫_0^∞ θ e^(-500/θ)/θ³ dθ / ∫_0^∞ e^(-500/θ)/θ³ dθ.
Making the change of variables, θ = 1/x, dθ = -dx/x²:
∫_0^∞ e^(-500/θ)/θ² dθ = ∫_0^∞ e^(-500x) dx = 1/500.
∫_0^∞ e^(-500/θ)/θ³ dθ = ∫_0^∞ x e^(-500x) dx = 1/500².
Therefore, the posterior mean is: (1/500)/(1/500²) = 500.
Comment: The posterior distribution is proportional to an Inverse Gamma Distribution with
α = 2 and θ = 500. Therefore, this is the posterior distribution. Its mean is: 500/(2 - 1) = 500.
For this situation, the posterior distribution of θ is an Inverse Gamma with
scale parameter = the sum of the observed losses and α = the number of observed losses.
Thus the posterior mean is: (sum of losses)/(number of losses - 1).

6.35. D. The chance of the observation is zero if θ ≤ 0.9.
For θ > 0.9, the chance of the observation is: (2)(0.9)/θ².
π(θ) = 4θ³, 0 < θ < 1.
Thus the posterior distribution is proportional to: θ³/θ² = θ, for 0.9 < θ < 1.
∫_0.9^1 θ dθ = (1/2)(1² - 0.9²) = 0.095.
Thus the posterior distribution of θ is: θ/0.095, for 0.9 < θ < 1.
E[X | θ] = ∫_0^θ x 2x/θ² dx = 2θ/3.
Therefore, the posterior mean is: ∫_0.9^1 (θ/0.095)(2θ/3) dθ = (2/9)(1³ - 0.9³)/0.095 = 0.634.
Comment: Setup taken from 4, 11/05, Q.7, which instead uses Buhlmann Credibility.
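A numerical sketch of this Bayes estimate (not part of the original solution), assuming the prior 4θ³, the severity density 2x/θ², and a single observed claim of 0.9:

from scipy import integrate

prior  = lambda t: 4 * t**3                                   # pi(theta) on (0, 1)
like   = lambda t: 2 * 0.9 / t**2 if t > 0.9 else 0.0         # f(0.9 | theta)
mean_x = lambda t: 2 * t / 3                                  # E[X | theta]
norm, _ = integrate.quad(lambda t: prior(t) * like(t), 0, 1, points=[0.9])
est,  _ = integrate.quad(lambda t: prior(t) * like(t) * mean_x(t), 0, 1, points=[0.9])
print(est / norm)                                             # approximately 0.634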

6.36. The chance of the observation is: f(15) f(31) = {15² e^(-15/θ)/(2θ³)}{31² e^(-31/θ)/(2θ³)}.
Thus the probability weight is: (1/5){15² e^(-15/θ)/(2θ³)}{31² e^(-31/θ)/(2θ³)}, 5 < θ < 10.
This is proportional to: e^(-46/θ)/θ⁶.
The mean of the Gamma is 3θ.
Thus the estimated future severity is: 3 ∫_5^10 e^(-46/θ) θ^(-5) dθ / ∫_5^10 e^(-46/θ) θ^(-6) dθ.
Let t = 46/θ.
∫_5^10 e^(-46/θ) θ^(-6) dθ = (1/46⁵) ∫_4.6^9.2 e^(-t) t⁴ dt = (1/46⁵) Γ(5) {Γ[5; 9.2] - Γ[5; 4.6]} =
(24/46⁵)(0.951420 - 0.486766) = 5.41442 x 10^-8.
∫_5^10 e^(-46/θ) θ^(-5) dθ = (1/46⁴) ∫_4.6^9.2 e^(-t) t³ dt = (1/46⁴) Γ(4) {Γ[4; 9.2] - Γ[4; 4.6]} =
(6/46⁴)(0.981580 - 0.674294) = 4.11778 x 10^-7.
Estimated future severity = (3)(4.11778 x 10^-7)/(5.41442 x 10^-8) = 22.8.
Comment: Beyond what you are likely to be asked on your exam.

6.37. The chance of the observation is: f(800) = α 1000^α/1800^(α+1) = (1/1800) α (4/9)^α.
The prior density of α is: 1/2, 3 < α < 5.
The integral of the probability weight is:
∫_3^5 (1/2)(1/1800) α (4/9)^α dα = (1/3600) ∫_3^5 α (4/9)^α dα =
(1/3600){5(4/9)⁵/ln(4/9) - 3(4/9)³/ln(4/9) - (4/9)⁵/ln(4/9)² + (4/9)³/ln(4/9)²} = 0.32499/3600.
Therefore, the posterior distribution of α is:
(1/3600) α (4/9)^α / (0.32499/3600) = 3.077 α (4/9)^α, 3 < α < 5.
Given α, the mean of the Pareto is: 1000/(α - 1).
Thus the estimate of the size of the next claim from this policy is:
∫_3^5 3.077 α (4/9)^α 1000/(α - 1) dα = 3077 ∫_3^5 α (4/9)^α/(α - 1) dα.
Let x = α - 1. Then the estimate is:
3077 ∫_2^4 (x + 1)(4/9)^(x+1)/x dx = (12,308/9){∫_2^4 (4/9)^x dx + ∫_2^4 (4/9)^x/x dx}.
∫_2^4 (4/9)^x dx = (4/9)⁴/ln(4/9) - (4/9)²/ln(4/9) = 0.19547.
Letting t = x ln(4/9), ∫_2^4 (4/9)^x/x dx = ∫_{2 ln(4/9)}^{4 ln(4/9)} e^t/t dt = Ei[4 ln(4/9)] - Ei[2 ln(4/9)] =
(-0.00959173) - (-0.0835982) = 0.07401.
Therefore the estimate is: (12,308/9)(0.19547 + 0.07401) = 368.5.
Comment: Beyond what you should be asked on your exam.

6.38. D. The chance of the observation given β is: β^x/(1+β)^(x+1).
Therefore, the posterior distribution is proportional to: β^x/(1+β)^(x+α+2), 0 < β < ∞.
The mean of each Geometric is β.
Thus the posterior mean is: ∫_0^∞ β^(x+1)/(1+β)^(x+α+2) dβ / ∫_0^∞ β^x/(1+β)^(x+α+2) dβ.
Let y = 1/(1+β). dy = -dβ/(1+β)². 1 - y = β/(1+β).
∫_0^∞ β^x/(1+β)^(x+α+2) dβ = ∫_0^1 y^α (1-y)^x dy = β(x+1, α+1) = Γ[x+1] Γ[α+1] / Γ[x+2+α].
∫_0^∞ β^(x+1)/(1+β)^(x+α+2) dβ = ∫_0^1 y^(α-1) (1-y)^(x+1) dy = β(x+2, α) = Γ[x+2] Γ[α] / Γ[x+2+α].
Thus the posterior mean is:
{Γ[x+2] Γ[α] / Γ[x+2+α]} / {Γ[x+1] Γ[α+1] / Γ[x+2+α]} = {Γ[x+2]/Γ[x+1]} / {Γ[α+1]/Γ[α]} = (x + 1)/α.
Comment: Difficult! Setup taken from 4, 5/05, Q.17, which instead uses Buhlmann Credibility.
In this case, Bayes Analysis and Buhlmann Credibility produce the same answer.

6.39. D. The density of the Gamma is: exp[-x/θ] x⁴/(θ⁵ 4!).
Thus the chance of the observation is proportional to:
(exp[-10/θ]/θ⁵)(exp[-30/θ]/θ⁵)(exp[-50/θ]/θ⁵) = exp[-90/θ]/θ^15.
π(θ) = 1/θ, θ > 0. Therefore, the posterior distribution of θ is proportional to:
(exp[-90/θ]/θ^15)(1/θ) = exp[-90/θ]/θ^16, θ > 0.
Thus the posterior distribution of the θ parameter of the severity distribution is
Inverse Gamma with: α = 15, and scale parameter = 90.
The mean of this posterior Inverse Gamma is: 90/(15 - 1) = 6.4286.
Therefore, the estimate of the future mean severity is: E[5θ] = (5)(6.4286) = 32.143.
Comment: If instead we had taken π(θ) = 1/θ², θ > 0, then the posterior distribution would have
been an Inverse Gamma with α = 16, and scale parameter = 90.
The mean of this Inverse Gamma is: 90/(16 - 1) = 6.
The resulting estimate of the future mean severity would be: (5)(6) = 30 = X̄.
Thus in this situation, π(θ) = 1/θ², θ > 0, would be called the noninformative or diffuse prior.
6.40. From the previous solution, the posterior distribution of the θ parameter of the severity
distribution is Inverse Gamma with: α = 15, and scale parameter = 90.
Thus posterior, E[θ²] = 90²/{(α - 1)(α - 2)} = 90²/{(14)(13)} = 44.5055.
E[θ] = 90/(α - 1) = 90/(15 - 1) = 6.4286.
Thus posterior, Var[θ] = 44.5055 - 6.4286² = 3.1786.
E[5θ] = (5)(6.4286) = 32.143.
Var[5θ] = (25)Var[θ] = (25)(3.1786) = 79.465.
Thus using the Normal Approximation, a 90% confidence interval for the estimated mean severity is:
32.143 ± 1.645 √79.465 = 32.143 ± 14.664 = [17.48, 46.81].
Comment: Similar to Exercise 15.81 in Loss Models.

6.41. A. The chance of the observation is: f(59,874).
The probability weight is: f(59,874) π(μ), with π(μ) = 1, 9 < μ < 10.
This is proportional to: exp[-{ln(59,874) - μ}²/{(2)(1.2²)}] = exp[-(11 - μ)²/2.88].
This is proportional to: exp[7.639μ - 0.3472μ²].
The mean of the LogNormal is: exp[μ + 1.2²/2] = 2.0544 e^μ.
Thus the estimated future severity is:
∫_9^10 exp[7.639μ - 0.3472μ²] 2.0544 exp[μ] dμ / ∫_9^10 exp[7.639μ - 0.3472μ²] dμ.
∫_9^10 exp[7.639μ - 0.3472μ²] dμ =
Exp[7.639²/{(4)(0.3472)}] √(π/0.3472) {Φ[{(2)(0.3472)(10) - 7.639}/√{(2)(0.3472)}] - Φ[{(2)(0.3472)(9) - 7.639}/√{(2)(0.3472)}]} =
(5.3258 x 10^18){Φ[-0.83] - Φ[-1.67]} = (5.3258 x 10^18)(0.2033 - 0.0475) = 8.298 x 10^17.
∫_9^10 exp[7.639μ - 0.3472μ²] 2.0544 exp[μ] dμ = 2.0544 ∫_9^10 exp[8.639μ - 0.3472μ²] dμ.
∫_9^10 exp[8.639μ - 0.3472μ²] dμ =
Exp[8.639²/{(4)(0.3472)}] √(π/0.3472) {Φ[{(2)(0.3472)(10) - 8.639}/√{(2)(0.3472)}] - Φ[{(2)(0.3472)(9) - 8.639}/√{(2)(0.3472)}]} =
(6.5571 x 10^23){Φ[-2.03] - Φ[-2.87]} = (6.5571 x 10^23)(0.0212 - 0.0021) = 1.2524 x 10^22.
Thus, the estimated future severity = (2.0544)(1.2524 x 10^22)/(8.298 x 10^17) = 31,007.

6.42. E. The chance of the observation given q is:
{10 q³(1-q)²}{10 q²(1-q)³} q⁵ = 100 q^10 (1-q)⁵.
π(q) = 1/{q(1-q)}, 0 < q < 1. Thus the posterior distribution of q is proportional to:
q^10 (1-q)⁵/{q(1-q)} = q⁹(1-q)⁴, 0 < q < 1.
Therefore, the posterior distribution of q is Beta with a = 10, b = 5, and θ = 1.
The mean of this posterior distribution of q is: 10/(10 + 5) = 2/3.
Thus the estimated future average frequency for this insured is: m E[q] = (5)(2/3) = 3.333.
Comment: More generally let the data for n years be: x1, x2, ..., xn.
Then the chance of the observation is proportional to:
q^(x1) (1-q)^(m - x1) ... q^(xn) (1-q)^(m - xn) = q^(Σxi) (1-q)^(mn - Σxi).
Thus the posterior distribution is proportional to: q^(Σxi - 1) (1-q)^(mn - Σxi - 1).
Therefore, the posterior distribution of q is Beta with a = Σxi, b = mn - Σxi, and θ = 1.
The mean of this posterior distribution of q is: a/(a + b) = Σxi/(mn).
Thus the estimated future average frequency for the insured is: m Σxi/(mn) = X̄.
The estimate of the future is the observed claim frequency.
Thus for this situation, 1/{q(1-q)} is called the noninformative or diffuse prior.
Since the posterior distribution is Beta, the predictive distribution (posterior mixed distribution) is a
Beta-Binomial, as discussed in Mahlers Guide to Conjugate Priors.

6.43. B. If b were 300, then claim sizes are uniform from 300 to 400, and we could observe a claim
of size 300.
If b were 200, then claim sizes are uniform from 200 to 300, and we could observe a claim of size 300.
b can be 200, 300, or anything in between.
Given 200 ≤ b ≤ 300, the chance of the observation given b is 1/100.
π(b) = e^(-b/80)/80. Thus the probability weight is: (e^(-b/80)/80)(1/100).
Thus the posterior distribution of b is proportional to: e^(-b/80), 200 ≤ b ≤ 300.
∫_200^300 e^(-b/80) db = [-80 e^(-b/80)] evaluated from b = 200 to b = 300 = 80(e^-2.5 - e^-3.75).
Thus the posterior distribution of b is: e^(-b/80)/{80(e^-2.5 - e^-3.75)}, 200 ≤ b ≤ 300.
Given b, the mean severity is: b + 50.
Therefore, the expected value of the next claim from the same insured is:
∫_200^300 (b + 50) e^(-b/80)/{80(e^-2.5 - e^-3.75)} db =
[1/{80(e^-2.5 - e^-3.75)}] ∫_200^300 b e^(-b/80) db + 50 [1/{80(e^-2.5 - e^-3.75)}] ∫_200^300 e^(-b/80) db =
[(-80b e^(-b/80) - 80² e^(-b/80)) evaluated from b = 200 to b = 300]/{80(e^-2.5 - e^-3.75)} + (50)(1) =
{200 e^-2.5 + 80 e^-2.5 - 300 e^-3.75 - 80 e^-3.75}/(e^-2.5 - e^-3.75) + 50 = 239.845 + 50 = 289.845.
Comment: The integral of the posterior density of b over its support has to be one.
6.44. B. If b were 200, then claim sizes are uniform from 200 to 300, and we could observe a claim
of size 200 and a claim of size 230.
If b were 130, then claim sizes are uniform from 130 to 230, and we could observe a claim of size
200 and a claim of size 230.
b can be 130, 200, or anything in between.
Given 130 ≤ b ≤ 200, the chance of the observation given b is: (1/100)(1/100).
π(b) = e^(-b/80)/80. Thus the probability weight is: (e^(-b/80)/80)(1/100)(1/100).
Thus the posterior distribution of b is proportional to: e^(-b/80), 130 ≤ b ≤ 200.
∫_130^200 e^(-b/80) db = [-80 e^(-b/80)] evaluated from b = 130 to b = 200 = 80(e^-1.625 - e^-2.5).
Thus the posterior distribution of b is: e^(-b/80)/{80(e^-1.625 - e^-2.5)}, 130 ≤ b ≤ 200.
Given b, the mean severity is: b + 50.
Therefore, the expected value of the next claim from the same insured is:
∫_130^200 (b + 50) e^(-b/80)/{80(e^-1.625 - e^-2.5)} db =
[1/{80(e^-1.625 - e^-2.5)}] ∫_130^200 b e^(-b/80) db + 50 =
[(-80b e^(-b/80) - 80² e^(-b/80)) evaluated from b = 130 to b = 200]/{80(e^-1.625 - e^-2.5)} + (50)(1) =
{130 e^-1.625 + 80 e^-1.625 - 200 e^-2.5 - 80 e^-2.5}/(e^-1.625 - e^-2.5) + 50 = 159.960 + 50 = 209.960.
Comment: The integral of the posterior density of b over its support has to be one.

6.45. E. & 6.46. B. π(δ) = 10 e^(-10δ), δ > 0.
The chance of no claims given δ is: 0.5e^(-δ) + 0.5e^(-2δ).
Thus the posterior distribution is proportional to: e^(-10δ)(e^(-δ) + e^(-2δ)) = e^(-11δ) + e^(-12δ).
The mean frequency given δ is: 0.5δ + (0.5)(2δ) = 1.5δ.
Thus the posterior mean frequency is:
1.5 {∫_0^∞ δe^(-11δ) dδ + ∫_0^∞ δe^(-12δ) dδ} / {∫_0^∞ e^(-11δ) dδ + ∫_0^∞ e^(-12δ) dδ} =
1.5 (1/11² + 1/12²)/(1/11 + 1/12) = 0.1309.
The chance of one claim given δ is: 0.5δe^(-δ) + 0.5(2δ)e^(-2δ).
Thus the posterior distribution is proportional to: e^(-10δ)(δe^(-δ) + 2δe^(-2δ)) = δe^(-11δ) + 2δe^(-12δ).
The mean frequency given δ is: 0.5δ + (0.5)(2δ) = 1.5δ.
Thus the posterior mean frequency is:
1.5 {∫_0^∞ δ²e^(-11δ) dδ + 2 ∫_0^∞ δ²e^(-12δ) dδ} / {∫_0^∞ δe^(-11δ) dδ + 2 ∫_0^∞ δe^(-12δ) dδ} =
1.5 {2/11³ + (2)(2/12³)}/{1/11² + 2/12²} = 0.2585.
Comment: For Gamma type integrals: ∫_0^∞ t^n e^(-ct) dt = n!/c^(n+1).

6.47. (a) Since the prior distribution is uniform, given m ≥ k, the probability weight for n ≥ m is
proportional to: C(m-1, k-1)/C(n, k). Given we have observed k and m, this is proportional to: 1/C(n, k).
From the given identity, the sum of the probability weights is:
Σ_{n=m}^∞ 1/C(n, k) = {k/(k-1)}/C(m-1, k-1).
Thus dividing the probability weight by its sum, the posterior probability for n is:
{(k-1)/k} C(m-1, k-1)/C(n, k), n ≥ m ≥ k > 1.
(b) Using n/C(n, k) = k/C(n-1, k-1), the mean of the posterior distribution is:
Σ_{n=m}^∞ n {(k-1)/k} C(m-1, k-1)/C(n, k) = (k-1) C(m-1, k-1) Σ_{i=m-1}^∞ 1/C(i, k-1) =
(k-1) C(m-1, k-1) {(k-1)/(k-2)}/C(m-2, k-2) = (m-1)(k-1)/(k-2), k > 2.
(c) Using n(n-1)/C(n, k) = k(k-1)/C(n-2, k-2), the second factorial moment of the posterior distribution is:
E[N(N-1)] = Σ_{n=m}^∞ n(n-1) {(k-1)/k} C(m-1, k-1)/C(n, k) = (k-1)² C(m-1, k-1) Σ_{j=m-2}^∞ 1/C(j, k-2) =
(k-1)² C(m-1, k-1) {(k-2)/(k-3)}/C(m-3, k-3) = {(k-1)/(k-3)}(m-1)(m-2).
E[N²] = E[N(N-1)] + E[N] = {(k-1)/(k-3)}(m-1)(m-2) + (m-1)(k-1)/(k-2).
Var[N] = E[N²] - E[N]² = {(k-1)/(k-3)}(m-1)(m-2) + (m-1)(k-1)/(k-2) - (m-1)²{(k-1)/(k-2)}² =
{(k-1)(m-1)/((k-2)²(k-3))}{(k-2)²(m-2) + (k-2)(k-3) - (m-1)(k-1)(k-3)} = (k-1)(m-1)(m+1-k)/{(k-2)²(k-3)}, k > 3.
(d) For x ≥ m, Prob[N > x] = Σ_{n=x+1}^∞ {(k-1)/k} C(m-1, k-1)/C(n, k) = {(k-1)/k} C(m-1, k-1) {k/(k-1)}/C(x, k-1) =
C(m-1, k-1)/C(x, k-1) = {(m-1)!(x+1-k)!}/{(m-k)! x!}, k > 1.
(e) k = 5 and m = 20. The posterior probability is, for n ≥ 20:
{(k-1)/k} C(m-1, k-1)/C(n, k) = (4/5) C(19, 4)/C(n, 5) = 372,096 (n-5)!/n!.
For example, Prob[n = 20] = 1/5 = 20%, and Prob[n = 21] = 16/105 = 15.24%.
[Graph of the posterior densities of n, for n from 20 to 40.]
The mean of the posterior distribution of n is: (m-1)(k-1)/(k-2) = (19)(4/3) = 25.33.
The variance of the posterior distribution of n is:
(k-1)(m-1)(m+1-k)/{(k-2)²(k-3)} = (4)(19)(16)/{(9)(2)} = 67.56.
For x ≥ 20, S(x) = {(m-1)!(x+1-k)!}/{(m-k)! x!} = 19!(x-4)!/{15! x!}.
For example, S(20) = 16/20 = 80%, and S(21) = 272/420 = 64.76%.
[Graph of the posterior survival function, for x from 20 to 50.]
Comment: The German Tank Problem.
See for example, http://en.wikipedia.org/wiki/German_tank_problem
Assume that we have the numbers from 1 to n, and pick a subset of size k, without repeating any numbers.
There are C(n, k) equally likely such subsets. If a subset of size k has a maximum of m ≤ n, then it can be
thought of as first choosing m, and then choosing the remaining k - 1 elements from the numbers from 1 to
m - 1; there are C(m-1, k-1) such subsets. Thus given k ≤ m ≤ n, the probability the subset has a maximum
of m is: C(m-1, k-1)/C(n, k), matching one of the given hints.

6.48. C. The probability weight is: f(x | θ) p(θ) = 8xθ, 0 < θ < x.
E(θ | x) = ∫_0^x θ 8xθ dθ / ∫_0^x 8xθ dθ = (8x⁴/3)/(8x³/2) = 2x/3.

6.49. B. Given x, the chance of observing three successes is x3 . The a priori distribution of x is
f(x) = 1, 0 x 1. By Bayes Theorem, the posterior density is proportional to the product of the
chance of the observation and the a priori density function. Thus the posterior density is proportional
to x3 for 0 x 1. Since the integral from zero to one of x3 is 1/4, the posterior density is 4x3 .
(The posterior density has to integrate to unity.) Thus the posterior chance that x < 0.9 is the integral
of the posterior density from 0 to 0.9, which is 0.94 = 0.656.
Alternately, by Bayes Theorem (or directly from the definition of a conditional distribution):
Pr[x<.9 | 3 successes] = Pr[3 successes | x<.9 ] Pr[x<.9] / Pr[3 successes] =
Pr[3 successes and x<.9 ] / Pr[3 successes] =
∫_0^0.9 x³ f(x) dx / ∫_0^1 x³ f(x) dx = ∫_0^0.9 x³ dx / ∫_0^1 x³ dx = {(0.9⁴)/4}/{(1⁴)/4} = 0.656.

6.50. B. Let q = m/100,000. Then m = 100,000q. dm/dq = 100000. Then f(q) = f(m)dm/dq =
(m/100,000,000)100000 = (100,000q/100,000,000)100000 = 100q for 0 < q .1, and
f(q) = f(m)dm/dq = ((20,000- m)/100,000,000)100000 =
((20,000 -100,000q)/100,000,000)100000 = 20 - 100q for .1 < q < .2.
Thus, after the change of variables, f(q) = 100q for 0< q 0.1, f(q) = 20 - 100q for 1 < q < 0.2.
We want to compute the posterior probability that q < .1.
The chance of the observation given q is q(1-q). The posterior probability density function is
proportional to f(q)q(1-q). In order to compute the posterior distribution we need to divide by the
integral of f(q)q(1-q).
.2

.1

.2

f(q)q(1-q)dq = (100q)q(1-q)dq + (20 - 100q)q(1-q)dq =


0

.1
q=.1

{100q3 /3 - 25q4 }

q=.2

+ {10q2 - 40q3 + 25q4 } ] = .37/12 + (3.6 - 3.36 +.45)/12 = 1.06/12.

q=0

q =.1

Thus the posterior density is:


100(q2 -q3 )12/1.06 for 0 < q .1, and (.2 - 100q)(q-q2 )12/1.06 for .1< q .2.
Thus the posterior chance that q<.1 is the integral from 0 to .1 of the posterior density:
.1

q=.1

(q2-q3)1200/1.06 dq = (1200/1.06)(q3/3 - q4/4) ] = (1200/1.06)(.0037/12) = 37/106.


0

q=0

Comment: The key point for the change of variables is that since probability density functions are
derivatives there is an extra factor of dm/dq. f(q) = f(m)dm/dq. One way to remember this is that
f(q) = dF(q)/dq and thus f(q)dq = dF = f(m)dm. Both f(q)dq and f(m)dm must integrate to unity since
they are probability density functions.

6.51. E. Given q, the chance of observing no claims is 1 - q. The posterior density is proportional
to the product of the chance of the observation and the prior density:
(1 - q)(π/2) sin(πq/2) = (π/2) sin(πq/2) - (πq/2) sin(πq/2). Integrating from zero to one:
∫_0^1 (π/2) sin(πq/2) dq - ∫_0^1 (πq/2) sin(πq/2) dq = 1 - 2/π = (π - 2)/π.
(The first integral is unity, since it is the integral of the given density. The second integral is gotten
from the hint.) Dividing by this integral, the posterior density is:
(1-q)(π/2) sin(πq/2) / {(π - 2)/π} = {π²(1-q)/{2(π-2)}} sin(πq/2).
Comment: Choice B integrates to 2/π ≠ 1, while Choice C integrates to 1 - 2/π ≠ 1, thus neither is a
density function. Choice A can also be eliminated, since in this case if one observes zero rather than
one claim, then the posterior mean is lower than the prior mean and thus the posterior density can
not be equal to the prior density.

6.52. C. Given b, the chance of the observation is f(2). If b > 2, then f(2) = 4/b².
If b ≤ 2, f(2) = 0. The a priori chance of a given value of b, 1 < b < ∞, is 1/b².
Thus by Bayes Theorem, the posterior density of b is proportional to:
(4/b²)(1/b²) = 4/b⁴ if b > 2, and zero for b ≤ 2.
In order to convert this to a density we need to divide by its integral over the domain of b:
∫_1^2 0 db + ∫_2^∞ 4/b⁴ db = [-4/(3b³)] evaluated from 2 to ∞ = 1/6.
Thus the posterior density of b is: (4/b⁴)/(1/6) = 24/b⁴, b > 2. The mean conditional on b:
Mean = ∫_0^b x(2x/b²) dx = [(2/3)x³/b²] evaluated from x = 0 to x = b = 2b/3.
Given b, the mean is 2b/3. Thus the posterior estimate of the next claim is:
∫_2^∞ (2b/3)(24/b⁴) db = [-8/b²] evaluated from 2 to ∞ = 2.
Comment: It turns out in this case, that if one observed a single claim of size y > 1, the posterior
estimate is also y. This is an example of a noninformative or vague prior distribution.
One has to be very careful to distinguish the two cases where b ≤ 2 and b > 2.
Once we observe a claim of size 2, from the fact that the support of f(x) is 0 < x < b, we know b > 2.
Thus the posterior density of b is zero for b ≤ 2.
Even though it would have been better if it did so, this question did not specify whether to use
Buhlmann Credibility or Bayesian Analysis. However, if one tries to use Buhlmann Credibility one
would run into trouble. Prior to any observations, one has an infinite overall mean; the integral of
(2b/3)(1/b²) from 1 to infinity is infinite. Prior to any observations, one has an infinite second moment
of the hypothetical means; the integral of (2b/3)²(1/b²) from 1 to infinity is infinite. Therefore the
VHM is also infinite or undefined. The integral of (b²/18)(1/b²) from 1 to infinity is infinite; therefore
the EPV is infinite. When both the VHM and EPV are infinite, one can not calculate K and therefore
one can not apply Buhlmann Credibility. However, in this case one could take a limit of
Buhlmann Credibility Parameters. For g(b) = L/{(L-1)b²}, 1 < b < L, one can calculate,
EPV = L/18, overall mean = 2L(ln L)/{3(L-1)}, and VHM = (4/9){L - {L(ln L)/(L-1)}²}.
K = EPV/VHM = (1/8)/{1 - L(ln L)²/(L-1)²}. As L goes to infinity, K goes to 1/8. Thus for large
L, for one observation Z ≈ 8/9. The estimate using Buhlmann Credibility is approximately
(8/9)(2) + (1/9){2L(ln L)/(3(L-1))}; as L goes to infinity, this estimate goes to infinity.

6.53. E. Severity is uniform on [0, θ].
If for example, θ = 601, the chance of a claim of size 600 is 1/601.
If for example, θ = 599, the chance of a claim of size 600 is 0.
For θ ≥ 600, Prob[observation] = 2f(400)f(600) = (2)(1/θ)(1/θ) = 2/θ².
For θ < 600, Prob[observation] = 2f(400)f(600) = 2f(400)(0) = 0.
∫_500^∞ π(θ) Prob[observation | θ] dθ = ∫_600^∞ (500/θ²)(2/θ²) dθ = (1000/3)/600³.
By Bayes Theorem, the posterior distribution of θ is:
π(θ) Prob[observation | θ]/{(1000/3)/600³} = (3)(600³)/θ⁴, θ > 600.
For θ ≥ 550, S(550) = (θ - 550)/θ = 1 - 550/θ.
∫_600^∞ (1 - 550/θ)(3)(600³)/θ⁴ dθ = 1 - (3)(550)/{(4)(600)} = 1 - 0.6875 = 0.3125.
Comment: The integral over its support of the posterior distribution must be one.
The Single Parameter Pareto is a Conjugate Prior to the uniform likelihood.
Assume a uniform distribution on (0, θ), with π[θ] a Single Parameter Pareto with parameters α and θ.
Then the parameters of the posterior Single Parameter Pareto are:
α′ = α + n, and θ′ = Max[x1, ..., xn, θ].
In this case, the posterior distribution is Single Parameter Pareto with parameters:
α′ = 1 + 2 = 3, and θ′ = Max[400, 600, 500] = 600.

6.54. A. Severity is uniform on [0, θ].
For θ ≥ 400, Prob[observation] = 2f(400)f(300) = (2)(1/θ)(1/θ) = 2/θ².
∫ π(θ) Prob[observation | θ] dθ = ∫_500^∞ (500/θ²)(2/θ²) dθ = (1000/3)/500³.
By Bayes Theorem, the posterior distribution of θ is:
π(θ) Prob[observation | θ]/{(1000/3)/500³} = (3)(500³)/θ⁴, θ > 500.
For θ ≥ 550, S(550) = (θ - 550)/θ = 1 - 550/θ. For θ ≤ 550, S(550) = 0.
∫_550^∞ (1 - 550/θ)(3)(500³)/θ⁴ dθ = (3)(500³){1/{(3)(550³)} - 550/{(4)(550⁴)}} = 0.188.

6.55. E. π(θ) = 1/θ², 1 < θ < ∞. f(x) = 2θ²/(x + θ)³. f(3) = 2θ²/(3 + θ)³.
Posterior distribution of θ is proportional to: π(θ)f(3) = 2/(3 + θ)³, 1 < θ < ∞.
∫_1^∞ 2/(3 + θ)³ dθ = [-1/(3 + θ)²] evaluated from θ = 1 to θ = ∞ = 1/16.
Posterior density of θ is: {2/(3 + θ)³}/(1/16) = 32/(3 + θ)³, 1 < θ < ∞.
The posterior probability that θ exceeds 2 = ∫_2^∞ 32/(3 + θ)³ dθ = [-16/(3 + θ)²] evaluated from θ = 2 to θ = ∞ = 16/25 = 0.64.

6.56. f(x) = 2θ²/(x + θ)³. f(3) = 2θ²/(3 + θ)³.
If θ = 2, f(3) = 8/125 = 0.06400. If θ = 4, f(3) = 32/343 = 0.09329.
The posterior distribution of θ is proportional to Prob[θ] f(3):
(0.7)(0.064) = 0.0448 for θ = 2, (0.3)(0.09329) = 0.02799 for θ = 4.
Posterior distribution of θ is:
0.0448/(0.0448 + 0.02799) = 61.5% for θ = 2,
0.02799/(0.0448 + 0.02799) = 38.5% for θ = 4.
6.57. A. Severity is uniform on [0, θ]. E[X | θ] = θ/2.
We are given that the posterior distribution of θ is: (3)(600³)/θ⁴, θ > 600.
Posterior estimate of the average severity is:
∫_600^∞ E[X | θ] f(θ | x1, x2) dθ = ∫_600^∞ (θ/2)(3)(600³)/θ⁴ dθ = [-(3/4)(600³)/θ²] evaluated from θ = 600 to θ = ∞ = 450.
Comment: Same setup as 4, 11/01, Q.14. See the solution to that question, in order to see how
one would derive the given posterior distribution.
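As a quick numerical check of the Bayesian premium (not part of the original solution):

import numpy as np
from scipy import integrate

posterior = lambda t: 3 * 600**3 / t**4        # given posterior density, theta > 600
mean_x    = lambda t: t / 2                    # E[X | theta] for a uniform on [0, theta]
premium, _ = integrate.quad(lambda t: mean_x(t) * posterior(t), 600, np.inf)
print(premium)                                 # 450.0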
6.58. C. Severity is uniform on [0, θ]. E[X | θ] = θ/2.
The uniform distribution from [0, 500] has density of 0 at 600, and the chance of the observation is 0.
The uniform distribution from [0, 1000] has density of 1/1000, and the chance of the observation is 1/1000².
The uniform distribution from [0, 2000] has density of 1/2000, and the chance of the observation is 1/2000².
If we assume that the claims are 400 and 600 in that order, then the chance of the observation is:
f(400) f(600).

Theta   A Priori      Chance of     Probability   Posterior     Mean
        Probability   Observation   Weight        Probability   Severity
500     0.5           0.000e+0      0.000e+0      0.000         250.0
1000    0.3           1.000e-6      3.000e-7      0.857         500.0
2000    0.2           2.500e-7      5.000e-8      0.143         1000.0
                                    3.500e-7                    571.4

If we instead assume that the claims are 400 and 600 in either order, then the chance of the
observation is: 2 f(400) f(600); however, we get the same posterior distribution.
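The discrete-prior table above can also be reproduced with a few lines of Python (a sketch, not part of the original solution); the only inputs are the three possible values of θ, their prior probabilities, and the two observed claims.

thetas, priors = [500, 1000, 2000], [0.5, 0.3, 0.2]

def likelihood(theta, claims=(400, 600)):
    # Chance of the observed claims for a severity uniform on [0, theta].
    return 0.0 if max(claims) > theta else (1.0 / theta) ** len(claims)

weights   = [p * likelihood(t) for t, p in zip(thetas, priors)]
posterior = [w / sum(weights) for w in weights]
premium   = sum(p * t / 2 for p, t in zip(posterior, thetas))   # E[X | theta] = theta/2
print(posterior, premium)                                       # [0.0, 0.857, 0.143], about 571.4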
6.59. A. The prior density of p is: π(p) = 2, 0 ≤ p ≤ 0.5.
The probability of the observation is p⁸. Therefore, the posterior distribution of p is:
2p⁸ / ∫_0^0.5 2p⁸ dp = p⁸/{(1/2)⁹/9} = 4608p⁸, 0 ≤ p ≤ 0.5.
The posterior probability that the insured will have at least one loss during Year 9 is:
∫_0^0.5 (4608p⁸) p dp = (4608)(1/2)^10/10 = 0.45.

6.60. E. The probability of the observation is θ. Therefore, the posterior distribution of θ is:
(3/2)θ^1.5 / ∫_0^1 (3/2)θ^1.5 dθ = θ^1.5/(1/2.5) = 2.5θ^1.5, 0 ≤ θ ≤ 1.
The posterior probability that θ is greater than 0.60 is:
∫_0.6^1 2.5θ^1.5 dθ = 1^2.5 - 0.6^2.5 = 0.721.
Comment: The prior distribution of θ is a Beta with a = 1.5 and b = 1. This is mathematically the
same as a Beta-Bernoulli frequency process; see "Mahler's Guide to Conjugate Priors".
The posterior distribution of θ is Beta with parameters: a′ = a + 1 = 2.5, b′ = b + 1 - 1 = 1.
6.61. D. Prob[observation | q] = q(1 - q) = q - q².
Posterior distribution = (q - q²)(q³/0.07) / ∫_0.6^0.8 (q - q²)(q³/0.07) dq = (q⁴ - q⁵)/∫_0.6^0.8 (q⁴ - q⁵) dq =
(q⁴ - q⁵)/0.01407. Posterior probability that 0.7 < q < 0.8:
∫_0.7^0.8 (q⁴ - q⁵) dq / 0.01407 = 0.00784/0.01407 = 0.557.
6.62. B. Given q, the probability of the observation is:
Prob[2 claims in Year 1] Prob[2 claims in Year 2] = q q = q².
Therefore, the posterior distribution is proportional to: q² π(q) = q⁴/0.039, 0.2 < q < 0.5.
∫_0.2^0.5 q⁴/0.039 dq = 0.1586.
Therefore, the posterior distribution is: (q⁴/0.039)/0.1586 = 161.7q⁴, 0.2 < q < 0.5.
Given q, the mean is: 0.90 - q + 2q = 0.90 + q. Therefore, the expected future frequency is:
∫_0.2^0.5 161.7q⁴(0.90 + q) dq = 0.90 + 0.419 = 1.319.
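A numerical cross-check of the 1.319 estimate (not part of the original solution), using the prior q²/0.039, the likelihood q², and the conditional mean 0.90 + q:

from scipy import integrate

prior  = lambda q: q**2 / 0.039
like   = lambda q: q * q                    # two claims in each of two years
mean_n = lambda q: 0.90 + q
norm, _ = integrate.quad(lambda q: prior(q) * like(q), 0.2, 0.5)
est,  _ = integrate.quad(lambda q: prior(q) * like(q) * mean_n(q), 0.2, 0.5)
print(est / norm)                           # approximately 1.319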


Section 7, EPV and VHM


In the prior sections we saw how to apply Bayesian Analysis. In the next section, how to apply
Buhlmann Credibility will be demonstrated. In order to apply Buhlmann Credibility, one will first
have to calculate the Expected Value of the Process Variance (EPV) and the Variance of
the Hypothetical Means (VHM), which together sum to the total variance.79 How to compute
these important quantities will be demonstrated in this section.80
A Series of Examples:
The following information will be used in a series of examples involving the frequency, severity, and
pure premium:
Type   Portion of Risks   Bernoulli (Annual)        Gamma Severity
       in this Type       Frequency Distribution    Distribution
1      50%                q = 40%                   α = 4, θ = 100
2      30%                q = 70%                   α = 3, θ = 100
3      20%                q = 80%                   α = 2, θ = 100

We assume that the types are homogeneous; i.e., every insured of a given type has the same
frequency and severity process. Assume that for an individual insured, frequency and severity are
independent.81
I will show how to compute the Expected Value of the Process Variance and the Variance of the
Hypothetical Means in each case. In general, the simplest case involves the frequency, followed by
the severity, with the pure premium being the most complex case.
Expected Value of the Process Variance, Frequency Example:
For type 1, the process variance of the Bernoulli frequency is: q(1 - q) = (0.4)(1 - 0.4) = 0.24.
Similarly for type 2 the process variance for the frequency is (0.7)(1 - 0.7) = 0.21.
For type 3 the process variance for the frequency is: (0.8)(1 - 0.8) = 0.16.

79 Those who are familiar with the general application of analysis of variance (ANOVA) may find that helps them to
understand the material in this section.
80 Many of you will benefit by first reading the section on the Philbrick Target Shooting Example.
81 Across types, the frequency and severity are not independent. In this example, types with higher average
frequency have lower average severity.


The expected value of the process variance is the weighted average of the process variances for
the individual types, using the a priori probabilities as the weights.82
The EPV of the frequency = (50%)(0.24) + (30%)(0.21) + (20%)(0.16) = 0.215.
This computation can be organized in the form of a spreadsheet:
Class     A Priori      Bernoulli     Process
          Probability   Parameter q   Variance
1         50%           0.4           0.240
2         30%           0.7           0.210
3         20%           0.8           0.160
Average                               0.215

I recommend you organize your computations for exam questions in a similar manner or one that
works for you. Using the same structure for similar problems every time reduces the chance for error.
Note that to compute the EPV one first computes variances and then one computes the expected
value. In contrast, in order to compute the VHM, one first computes expected values, and then one
computes the variance.
Variance of the Hypothetical Mean Frequencies:
For type 1, the mean of the Bernoulli frequency is q = 0.4. Similarly for type 2 the mean frequency
is 0.7. For type 3 the mean frequency is 0.8.
The variance of the hypothetical mean frequencies is computed the same way one would any other
variance. First one computes the first moment: (50%)(0.4) + (30%)(0.7) + (20%)(0.8) = 0.57.
Then one computes the second moment: (50%)(0.4²) + (30%)(0.7²) + (20%)(0.8²) = 0.355.
Then the VHM = 0.355 - 0.57² = 0.0301.
This computation can be organized in the form of a spreadsheet:
Class     A Priori      Bernoulli     Mean        Square of
          Probability   Parameter q   Frequency   Mean Freq.
1         50%           0.4           0.4         0.160
2         30%           0.7           0.7         0.490
3         20%           0.8           0.8         0.640
Average                               0.57        0.355

Then the variance of the hypothetical mean frequencies = 0.3550 - 0.570² = 0.0301.

82 Note that while in this case with discrete possibilities we take a sum, as discussed subsequently, in the case of
continuous risk types we would take an integral.


Total Variance, Frequency Example:


For an insured picked at random, there is a 0.57 chance of a claim and .43 chance of no claim.
This is a Bernoulli, with variance: (0.57)(0.43) = 0.2451.
Total Variance = 0.2451 = 0.215 + 0.0301 = EPV + VHM.
In general, as will be demonstrated subsequently,
EPV + VHM = Total Variance.
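The frequency calculations in this example can be summarized in a few lines of Python (a sketch, not part of the original text), which also confirms that EPV + VHM reproduces the total variance:

a_priori = [0.50, 0.30, 0.20]
q        = [0.40, 0.70, 0.80]                                    # Bernoulli parameter by class

epv  = sum(w * qi * (1 - qi) for w, qi in zip(a_priori, q))      # 0.215
mean = sum(w * qi for w, qi in zip(a_priori, q))                 # 0.57
vhm  = sum(w * qi**2 for w, qi in zip(a_priori, q)) - mean**2    # 0.0301
print(epv, vhm, epv + vhm, mean * (1 - mean))                    # 0.215, 0.0301, 0.2451, 0.2451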
Expected Value of the Process Variance, Severity Example:
The computation of the EPV for severity is similar to that for frequency with one important difference.
One has to weight together the process variances of the severities for the individual types using the
chance that a claim came from each type.83 The chance that a claim came from an individual of a given
type is proportional to the product of the a priori chance of an insured being of that type and the
mean frequency for that type.
Taking into account the mean frequencies in this manner is only necessary when one is predicting
future severities and the type of insured includes specifying both frequency and severity. In the
current example, while Type 1 represents 50% of the insureds, as shown below it represents only
35.1% of the claims, due to its relatively low claim frequency.
For type 1, the process variance of the Gamma severity is: αθ² = (4)(100²) = 40,000.
Similarly for type 2 the process variance for the severity is: (3)(100²) = 30,000.
For type 3 the process variance for the severity is: (2)(100²) = 20,000.
The mean frequencies are: 0.4, 0.7, and 0.8. The a priori chances of each type are: 50%, 30% and
20%. Thus the weights to use to compute the EPV of the severity are:
(0.4)(50%) = 0.20, (0.7)(30%) = 0.21, and (0.8)(20%) = 0.16.
Thus the probabilities that a claim came from each class are:
0.20/0.57 = 0.351, 0.21/0.57 = 0.368, and 0.16/0.57 = 0.281.
The expected value of the process variance of the severity is the weighted average of the process
variances for the individual types, using these weights. The EPV of the severity is:
{(0.2)(40000) + (0.21)(30000) + (0.16)(20000)} / (0.2 + 0.21 + 0.16) = 30,702.84
83 Each claim is one observation of the severity process. The denominator for severity is number of claims.
In contrast, the denominator for frequency, as well as pure premiums, is exposures.
84 Note that this result differs from what one would get by using the a priori probabilities as weights.
The latter method, which is not correct in this case, would result in:
(50%)(40,000) + (30%)(30,000) + (20%)(20,000) = 33,000 ≠ 30,702.


This computation can be organized in the form of a spreadsheet:


Class   A Priori      Mean        Weights =         Probability that a claim   Gamma Parameters   Process
        Probability   Frequency   Col. B x Col. C   came from this class       α        θ         Variance
1       50%           0.4         0.20              0.351                      4        100       40,000
2       30%           0.7         0.21              0.368                      3        100       30,000
3       20%           0.8         0.16              0.281                      2        100       20,000
Average                           0.57              1.000                                         30,702

Variance of the Hypothetical Mean Severities:


The computation of the VHM for severity is similar to that for frequency with one important
difference. In computing the moments one has to use for each individual type the chance that a claim
came from that type.85 The chance that a claim came from an individual of a given type is proportional
to the product of the a priori chance of an insured being of that type and the mean frequency for that
type.
For type 1, the mean of the Gamma severity is αθ = 4(100) = 400. Similarly, for type 2 the mean
severity is 3(100) = 300. For type 3 the mean severity is 2(100) = 200.
The mean frequencies are: 0.4, 0.7, and 0.8. The a priori chances of each type are: 50%, 30% and
20%. Thus the weights to use to compute the moments of the severity are:
(0.4)(50%) = 0.20, (0.7)(30%) = 0.21, and (0.8)(20%) = 0.16.
The variance of the hypothetical mean severities is computed the same way as one would any
other variance. First one computes the first moment:
{(0.2)(400) + (0.21)(300) + (0.16)(200)}/(0.2 + 0.21 + 0.16) = 307.02.
Then one computes the second moment:
{(0.2)(400²) + (0.21)(300²) + (0.16)(200²)}/(0.2 + 0.21 + 0.16) = 100,526.
Then the VHM of the severity = 100,526 - 307.02² = 6265.

85 Each claim is one observation of the severity process. The denominator for severity is number of claims. In
contrast, the denominator for frequency (as well as pure premiums) is exposures.


This computation can be organized in the form of a spreadsheet:


Class   A Priori      Mean        Weights =         Gamma Parameters   Mean       Square of
        Probability   Frequency   Col. B x Col. C   α        θ         Severity   Mean Severity
1       50%           0.4         0.20              4        100       400        160,000
2       30%           0.7         0.21              3        100       300        90,000
3       20%           0.8         0.16              2        100       200        40,000
Average                           0.57                                 307.02     100,526

Then the variance of the hypothetical mean severities = 100,526 - 307.02² = 6265.
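The claim-weighted severity calculations above can be sketched in Python as follows (not part of the original text); note how the weights are the a priori probabilities times the mean frequencies, renormalized:

a_priori  = [0.50, 0.30, 0.20]
mean_freq = [0.40, 0.70, 0.80]
alpha, theta = [4, 3, 2], 100

weights = [p * f for p, f in zip(a_priori, mean_freq)]                       # 0.20, 0.21, 0.16
total_w = sum(weights)                                                       # 0.57
epv_sev = sum(w * a * theta**2 for w, a in zip(weights, alpha)) / total_w    # about 30,702
first   = sum(w * a * theta for w, a in zip(weights, alpha)) / total_w       # about 307.02
second  = sum(w * (a * theta)**2 for w, a in zip(weights, alpha)) / total_w  # about 100,526
print(epv_sev, second - first**2)                                            # EPV and VHM (about 6,265)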
Two cases for Severities:
The above example assumes that not only do the types differ in their severities, they also differ in
their frequencies.
Exercise: There are two types of risks that are equally likely.
Class 1 has a mean frequency of 10% and an Exponential Severity with mean 5.
Class 2 has a mean frequency of 20% and an Exponential Severity with mean 8.
Calculate the EPV and VHM.
[Solution: For an Exponential Distribution, mean = θ and variance = θ².

Class      A Priori    Mean         Weights =          Process     Mean        Square of
           Prob.       Frequency    Col. B x Col. C    Variance    Severity    Mean Severity
1          50%         0.1          0.05               25          5           25
2          50%         0.2          0.10               64          8           64
Average                             0.15               51.00       7.00        51.00

EPV = 51. VHM = 51 - 7² = 2.]


If the types do not differ in their frequencies, then the computations of the EPV and VHM are
somewhat simpler.


Exercise: There are two types of risks that are equally likely.
Class 1 has an Exponential Severity with mean 5.
Class 2 has an Exponential Severity with mean 8. Calculate the EPV and VHM.
[Solution: For an Exponential Distribution, mean = θ and variance = θ².

Type       A Priori Chance    Process               Square of
           of This Type       Variance    Mean      Mean
1          50%                25          5         25
2          50%                64          8         64
Overall                       44.50       6.50      44.50

EPV = 44.50. VHM = 44.50 - 6.50² = 2.25.


Comment: Unlike the previous exercise, since there is no mention of differing frequency by type,
we assume the mean frequencies for the types are the same; i.e., we ignore frequency.
We therefore get different answers for the EPV and VHM than in the previous exercise.]
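The contrast between these two exercises can also be checked numerically. The short Python sketch below is an illustration added to these notes (not part of the original text); the only inputs are the two Exponential severity means and, in the first case, the mean frequencies 10% and 20%.

# Sketch: two classes with Exponential severities (means 5 and 8), equally likely a priori.
a_priori = [0.5, 0.5]
mean_sev = [5.0, 8.0]
var_sev = [m**2 for m in mean_sev]   # Exponential: variance = theta^2

def epv_vhm(weights):
    # EPV and VHM of severity for the given (unnormalized) weights per class.
    total = sum(weights)
    epv = sum(w * v for w, v in zip(weights, var_sev)) / total
    first = sum(w * m for w, m in zip(weights, mean_sev)) / total
    second = sum(w * m**2 for w, m in zip(weights, mean_sev)) / total
    return epv, second - first**2

# Exercise with mean frequencies 10% and 20%: weight by expected claim counts.
print(epv_vhm([0.5 * 0.1, 0.5 * 0.2]))   # (51.0, 2.0)
# Exercise with no frequency differences: weight by the a priori probabilities alone.
print(epv_vhm(a_priori))                 # (44.5, 2.25)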
Expected Value of the Process Variance, Pure Premium Example:
The computation of the EPV for the pure premiums is similar to that for frequency.
However, it is more complicated to compute each process variance of the pure premiums.86
For type 1, the mean of the Bernoulli frequency is q = 0.4.
For type 1, the variance of the Bernoulli frequency is q(1-q) = (0.4)(1 - 0.4) = 0.24.
For type 1, the mean of the Gamma severity is αθ = 4(100) = 400,
and the variance of the Gamma severity is αθ² = 4(100²) = 40,000.
Thus, since frequency and severity are assumed to be independent,
the process variance of the pure premium is:
(Mean Freq.)(Variance of Severity) + (Mean Severity)²(Variance of Freq.)
= (0.4)(40,000) + (400)²(0.24) = 54,400.
Similarly, for type 2 the process variance of the pure premium is: (0.7)(30,000) + (300)²(0.21) = 39,900.
For type 3 the process variance of the pure premium is: (0.8)(20,000) + (200)²(0.16) = 22,400.
The expected value of the process variance is the weighted average of the process variances for
the individual types, using the a priori probabilities as the weights.
The EPV of the pure premium = (50%)(54,400) + (30%)(39,900) + (20%)(22,400) = 43,650.
86 See "Mahler's Guide to Classical Credibility."


This computation can be organized in the form of a spreadsheet:


Class      A Priori       Mean         Variance of    Mean        Variance of    Process
           Probability    Frequency    Frequency      Severity    Severity       Variance
1          50%            0.4          0.24           400         40,000         54,400
2          30%            0.7          0.21           300         30,000         39,900
3          20%            0.8          0.16           200         20,000         22,400
Average                                                                          43,650

Variance of the Hypothetical Mean Pure Premiums:


The computation of the VHM for the pure premiums is similar to that for frequency. One has to first
compute the mean pure premium for each type.
For type 1, the mean of the Bernoulli frequency is q = 0.4.
For type 1, the mean of the Gamma severity is: αθ = 4(100) = 400.
Thus, since frequency and severity are assumed to be independent, the mean pure premium is:
(Mean Frequency)(Mean Severity) = (0.4)(400) = 160.
For type 2, the mean pure premium is: (0.7)(300) = 210.
For type 3, the mean pure premium is: (0.8)(200) = 160.⁸⁷
One computes the first and second moments of the mean pure premiums as follows:
Class      A Priori       Mean         Mean        Mean            Square of
           Probability    Frequency    Severity    Pure Premium    Pure Premium
1          50%            0.4          400         160             25,600
2          30%            0.7          300         210             44,100
3          20%            0.8          200         160             25,600
Average                                            175             31,150

Thus the variance of the hypothetical mean pure premiums = 31,150 - 175² = 525.

87 Note that in this example it turns out that the mean pure premium for type 3 happens to equal that for type 1, even
though the two types have different mean frequencies and severities. The mean pure premiums tend to be similar
when, as in this example, high frequency is associated with low severity.
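Here is a short Python sketch (added as an illustration; it uses only the inputs of the running example) reproducing the pure premium EPV and VHM, with each class's process variance computed from the independence formula (mean freq.)(var. sev.) + (mean sev.)²(var. freq.):

# Sketch: EPV and VHM of the pure premium for the Bernoulli/Gamma example.
a_priori = [0.50, 0.30, 0.20]
q = [0.4, 0.7, 0.8]                      # Bernoulli mean frequencies
mean_sev = [400, 300, 200]               # Gamma means alpha*theta
var_sev = [40000, 30000, 20000]          # Gamma variances alpha*theta^2

var_freq = [qi * (1 - qi) for qi in q]   # Bernoulli variance q(1-q)
proc_var = [qi * vs + ms**2 * vf for qi, vs, ms, vf in zip(q, var_sev, mean_sev, var_freq)]
mean_pp = [qi * ms for qi, ms in zip(q, mean_sev)]

epv = sum(p * pv for p, pv in zip(a_priori, proc_var))       # 43,650
first = sum(p * m for p, m in zip(a_priori, mean_pp))        # 175
second = sum(p * m**2 for p, m in zip(a_priori, mean_pp))    # 31,150
vhm = second - first**2                                      # 525
print(proc_var, epv, vhm)   # roughly [54400, 39900, 22400], 43650, 525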


Total Variance, Pure Premium Example:


For each risk type, the second moment of the pure premium is the process variance plus the square
of the mean. The second moments are:
Class      A Priori       Process Variance    Mean            Second Moment
           Probability    of Pure Premium     Pure Premium    of Pure Premium
1          50%            54,400              160             80,000
2          30%            39,900              210             84,000
3          20%            22,400              160             48,000
Average                                       175             74,800

The second moment for the mixture is a weighted average of the individual second moments, using
the a priori probabilities as the weights:
(50%)(80,000) + (30%)(84,000) + (20%)(48,000) = 74,800.
The total variance = 74,800 - 175² = 44,175.
EPV + VHM = 43,650 + 525 = 44,175.
Thus, as is true in general, in this case EPV + VHM = Total Variance.
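As a numeric check, the following Python sketch (an illustration added to these notes, using the same three classes) gets the total variance directly from the mixed pure premium's first two moments and confirms that it matches EPV + VHM:

# Sketch: total variance of the pure premium two ways for the running example.
a_priori = [0.50, 0.30, 0.20]
proc_var = [54400, 39900, 22400]   # process variances of the pure premium by class
mean_pp = [160, 210, 160]          # hypothetical mean pure premiums by class

epv = sum(p * v for p, v in zip(a_priori, proc_var))                        # 43,650
mean = sum(p * m for p, m in zip(a_priori, mean_pp))                        # 175
vhm = sum(p * m**2 for p, m in zip(a_priori, mean_pp)) - mean**2            # 525

# Second moment of the mixture = weighted average of (process variance + mean^2) by class.
second = sum(p * (v + m**2) for p, v, m in zip(a_priori, proc_var, mean_pp))   # 74,800
total_variance = second - mean**2                                              # 44,175
print(total_variance, epv + vhm)   # both 44,175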
EPV versus VHM:
While the two pieces of the total variance seem similar, the order of operations in their computation is
different. In the case of the Expected Value of the Process Variance, EPV, first one separately
computes the process variance for each of the types of risks and then one takes the expected value
over all types of risks. Symbolically: EPV = E[VAR[X | θ]].
Loss Models uses the symbol v to refer to the Expected Value of the Process Variance.
In the case of the Variance of the Hypothetical Means, VHM, first one computes the expected value
for each type of risk and then one takes their variance over all types of risks. Symbolically:
VHM = VAR[E[X | θ]].
Loss Models uses the symbol a to refer to the Variance of the Hypothetical Means.


Demonstration that Total Variance = EPV + VHM:


One can demonstrate that in general:88
VAR[X] = E[VAR[X | θ]] + VAR[E[X | θ]].
First one can rewrite the EPV:
E[VAR[X | θ]] = E[E[X² | θ] - E[X | θ]²] = E[E[X² | θ]] - E[E[X | θ]²] = E[X²] - E[E[X | θ]²].
Second, one can rewrite the VHM:
VAR[E[X | θ]] = E[E[X | θ]²] - E[E[X | θ]]² = E[E[X | θ]²] - E[X]².
Putting together the first two steps:
EPV + VHM = E[VAR[X | θ]] + VAR[E[X | θ]] = E[X²] - E[E[X | θ]²] + E[E[X | θ]²] - E[X]² = E[X²] - E[X]²
= VAR[X] = Total Variance of X.

EPV + VHM = Total Variance.


Use in Buhlmann Credibility:
The Variance of the Hypothetical Means and the Expected Value of the Process Variance will be
used in the next section to compute the Buhlmann Credibility to assign to N observations of this risk
process; i.e., observing the same insured for N years. However, in each case one should compute
these quantities for one insured for one year.
In general, one computes the EPV and VHM for a single observation of the risk process, whether
that consists of observing the pure premium for a single randomly selected insured for one year, or it
consists of observing the value of a single ball drawn from a randomly selected urn.

88 Similarly, Cov[X, Y] = E[Cov[X, Y | θ]] + Cov[E[X | θ], E[Y | θ]].
See for example, Howard Mahler's discussion of Glenn Meyers' "An Analysis of Experience Rating," PCAS 1987.


VHM When There Are Only Two Types of Risks:


When there are only two types of risks, the calculation of the VHM can be done via a shortcut.
If the two hypothetical means are μ1 and μ2, and the a priori probabilities are p1 and 1 - p1,
then the VHM = (μ1 - μ2)² (p1)(1 - p1).
Exercise: There are two risk types, with hypothetical means of 3 and 8. The a priori probabilities are
60% and 40%. Calculate the Variance of the Hypothetical Means.
[Solution: VHM = (3 - 8)²(0.6)(0.4) = 6.
Alternately, the overall mean is: (60%)(3) + (40%)(8) = 5.
Second moment of the hypothetical means = (60%)(3²) + (40%)(8²) = 31.
VHM = 31 - 5² = 6.]
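The shortcut is easy to verify in a few lines; the Python sketch below (illustrative only) compares it to the usual first-and-second-moment calculation for this exercise:

# Sketch: VHM for two risk types via the shortcut and via moments.
mu1, mu2, p1 = 3.0, 8.0, 0.6
shortcut = (mu1 - mu2)**2 * p1 * (1 - p1)            # 6.0
mean = p1 * mu1 + (1 - p1) * mu2                     # 5.0
vhm = p1 * mu1**2 + (1 - p1) * mu2**2 - mean**2      # 6.0
print(shortcut, vhm)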
Dividing the Covariance Into Two Pieces:89
Similar to the result for variances, I will show that: Total Covariance =
Expected Value of the Process Covariance + Covariance of the Hypothetical Means.90
Assume that a set of parameters θ varies across a portfolio.
Cov[X, Y] = E[XY] - E[X] E[Y] = E[E[XY | θ]] - E[E[X | θ]] E[E[Y | θ]]
= E[E[XY | θ]] - E[E[X | θ] E[Y | θ]] + E[E[X | θ] E[Y | θ]] - E[E[X | θ]] E[E[Y | θ]]
= E[E[XY | θ] - E[X | θ] E[Y | θ]] + {E[E[X | θ] E[Y | θ]] - E[E[X | θ]] E[E[Y | θ]]}
= E[Cov[X, Y | θ]] + Cov[E[X | θ], E[Y | θ]].
For example, X might be the losses limited to 1000, and Y might be the losses excess of 1000.
There are two equally likely types of risks.

Type    E[X]    E[Y]     E[XY]
1       400     7,000    3 million
2       600     9,000    6 million

Then, for Type 1: Cov[X, Y] = 3 million - (400)(7000) = 200,000.
For Type 2: Cov[X, Y] = 6 million - (600)(9000) = 600,000. E[Cov[X, Y | θ]] = 400,000.
Cov[E[X | θ], E[Y | θ]] = (1/2)(400)(7000) + (1/2)(600)(9000) - (500)(8000) = 100,000.
Then, Cov[X, Y] = E[Cov[X, Y | θ]] + Cov[E[X | θ], E[Y | θ]] = 400,000 + 100,000 = 500,000.
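The decomposition can also be checked numerically. The sketch below (an illustration added to these notes, using only the limited/excess example above) computes both pieces and the total covariance directly:

# Sketch: Total Covariance = E[Cov[X, Y | type]] + Cov[E[X | type], E[Y | type]].
prob = [0.5, 0.5]
EX = [400, 600]
EY = [7000, 9000]
EXY = [3_000_000, 6_000_000]

cov_within = [exy - ex * ey for exy, ex, ey in zip(EXY, EX, EY)]   # 200,000 and 600,000
e_cov = sum(p * c for p, c in zip(prob, cov_within))               # 400,000

mean_x = sum(p * ex for p, ex in zip(prob, EX))                    # 500
mean_y = sum(p * ey for p, ey in zip(prob, EY))                    # 8,000
cov_of_means = sum(p * ex * ey for p, ex, ey in zip(prob, EX, EY)) - mean_x * mean_y   # 100,000

total_cov = sum(p * exy for p, exy in zip(prob, EXY)) - mean_x * mean_y                # 500,000
print(e_cov, cov_of_means, e_cov + cov_of_means == total_cov)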
89 See for example, Howard Mahler's discussion of Glenn Meyers' "An Analysis of Experience Rating," PCAS 1987.
90 This is a generalization of the fact that Total Variance = EPV + VHM.


Mixed Distributions:
If θ is the parameter being mixed, then one can split the variance of a mixed distribution into two
pieces as: Var[X] = E[Var[X | θ]] + Var[E[X | θ]].91
Var[E[X | θ]] > 0. ⇒ Var[X] > E[Var[X | θ]].

Variance of a mixture > Average of the variances of the components.

Mixing increases the variance.
Exercise: A claim count distribution can be expressed as a mixed Poisson distribution.
The mean λ of the Poisson distribution is uniformly distributed over the interval [0, 5].
Determine the variance of the mixed distribution.
[Solution: The distribution of λ has mean 2.5 and variance: 5²/12 = 25/12.
Var[N] = E[Var[N | λ]] + Var[E[N | λ]] = E[λ] + Var[λ] = 2.5 + 25/12 = 4.583.
Alternately, the mean of the mixed distribution is: E[λ] = 2.5.
Given lambda, the second moment of each Poisson is: λ + λ².
The second moment of the mixed distribution is: E[λ + λ²] = E[λ] + E[λ²] = 2.5 + (25/12 + 2.5²) = 10.833.
Variance of the mixed distribution is: 10.833 - 2.5² = 4.583.]
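For those who like simulation checks, the sketch below (illustrative only, not needed for the exam) confirms the uniform-mixed Poisson variance of about 4.583, both from the EPV + VHM split and by simulating the mixture directly:

# Sketch: variance of a Poisson mixed over lambda ~ Uniform[0, 5].
import random

e_lambda, var_lambda = 2.5, 25 / 12
analytic = e_lambda + var_lambda          # EPV + VHM = 2.5 + 25/12 = 4.583...

random.seed(1)
draws = []
for _ in range(200_000):
    lam = random.uniform(0.0, 5.0)
    # Simulate a Poisson(lam) count by counting rate-1 exponential arrivals before time lam.
    n, t = 0, random.expovariate(1.0)
    while t < lam:
        n += 1
        t += random.expovariate(1.0)
    draws.append(n)

mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)
print(round(analytic, 3), round(var, 3))   # both close to 4.583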
Exercise: One has a two-point mixture of Binomials.
The first component has m = 3 and q = 0.2. The second component has m = 4 and q = 0.1.
The first component is given 70% weight and the second component is given 30% weight.
Determine the variance of the mixed distribution.
[Solution: The first Binomial distribution has mean 0.6 and variance 0.48.
The second Binomial distribution has mean 0.4 and variance 0.36.
The EPV is: (0.7)(0.48) + (0.3)(0.36) = 0.444.
The overall mean is: (0.7)(0.6) + (0.3)(0.4) = 0.54.
The second moment of the hypothetical means is: (0.7)(0.6²) + (0.3)(0.4²) = 0.30.
The VHM is: 0.30 - 0.54² = 0.0084.
The variance of the mixture is: EPV + VHM = 0.444 + 0.0084 = 0.4524.
Alternately, the second moment of the mixture is: (0.7)(0.48 + 0.6²) + (0.3)(0.36 + 0.4²) = 0.744.
The variance of the mixture is: 0.744 - 0.54² = 0.4524.]
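The same two-point Binomial mixture can be checked in a few lines of Python (sketch added for illustration, using only the moments above):

# Sketch: variance of a 70%/30% mixture of Binomial(3, 0.2) and Binomial(4, 0.1).
weights = [0.7, 0.3]
means = [3 * 0.2, 4 * 0.1]                   # 0.6 and 0.4
variances = [3 * 0.2 * 0.8, 4 * 0.1 * 0.9]   # 0.48 and 0.36

epv = sum(w * v for w, v in zip(weights, variances))                          # 0.444
mean = sum(w * m for w, m in zip(weights, means))                             # 0.54
vhm = sum(w * m**2 for w, m in zip(weights, means)) - mean**2                 # 0.0084
second = sum(w * (v + m**2) for w, v, m in zip(weights, variances, means))    # 0.744
print(epv + vhm, second - mean**2)   # both 0.4524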
91 See Theorem 5.7 in Loss Models. As discussed, the first piece, E[Var[X | θ]], is the Expected Value of the
Process Variance, while the second piece, Var[E[X | θ]], is the Variance of the Hypothetical Means.
Total Variance = Expected Value of the Process Variance + Variance of the Hypothetical Means.


Problems:
Use the following information for the next 2 questions:
There are three types of risks. Each risk has either one or zero claims per year.

Type of Risk    Chance of a Claim    A Priori Chance of Type of Risk
A               30%                  50%
B               40%                  35%
C               50%                  15%

7.1 (2 points) What is the Expected Value of the Process Variance?
A. Less than 0.19
B. At least 0.19 but less than 0.20
C. At least 0.20 but less than 0.21
D. At least 0.21 but less than 0.22
E. At least 0.22

7.2 (2 points) What is the Variance of the Hypothetical Means?
A. Less than 0.006
B. At least 0.006 but less than 0.007
C. At least 0.007 but less than 0.008
D. At least 0.008 but less than 0.009
E. At least 0.009

7.3 (3 points) For a large group of insureds:
The hypothetical mean frequency for an individual insured is m.
Over the group, m is distributed uniformly on the interval (0, 5].
The severity for an individual insured is Exponential: f(x) = (1/r) exp(-x/r), x ≥ 0.
Over the group, r is distributed: g(r) = 2r/9, 0 ≤ r ≤ 3.
For any individual insured, frequency and severity are independent.
m and r are independently distributed.
In which range is the variance of the hypothetical mean pure premiums for this class of risks?
A. Less than 6
B. At least 6, but less than 8
C. At least 8, but less than 10
D. At least 10, but less than 12
E. 12 or more


Use the following information in the next two questions:
An insured population consists of 12% youthful drivers and 88% adult drivers.
Based on experience, we have derived the following probabilities
that an individual driver will have n claims in a year's time:

n    Youthful    Adult
0    0.85        0.95
1    0.10        0.04
2    0.04        0.01
3    0.01        0.00
7.4 (2 points) What is the Expected Value of the Process Variance?
A. Less than 0.07
B. At least 0.07 but less than 0.09
C. At least 0.09 but less than 0.11
D. At least 0.11 but less than 0.13
E. At least 0.13
7.5 (1 point) What is the Variance of the Hypothetical Means?
A. 0.0018 B. 0.0020
C. 0.0022
D. 0.0024
E. 0.0026

Use the following information in the next two questions:
A portfolio of three risks exists with the claim frequency for each risk Normally Distributed with the
following means and standard deviations:

Risk    Mean    Standard Deviation
A       0.10    0.03
B       0.30    0.05
C       0.50    0.01
7.6 (1 point) What is the Expected Value of the Process Variance?
A. 0.0010
B. 0.0012
C. 0.0014
D. 0.0016
E. 0.0018
7.7 (1 point) What is the Variance of the Hypothetical Means?
A. Less than 0.023
B. At least 0.023 but less than 0.025
C. At least 0.025 but less than 0.027
D. At least 0.027 but less than 0.029
E. At least 0.029


Use the following information for the next 6 questions:
Two dice, A1 and A2, are used to determine the number of claims.
Each side of both dice is marked with either a 0 or a 1, where 0 represents no claim and 1
represents a claim. The probability of a claim for each die is:

Die    Probability of Claim
A1     2/6
A2     3/6

In addition, there are two spinners, B1 and B2, representing claim severity.
Each spinner has two areas marked 20 and 50.
The probabilities for each claim size are:

               Claim Size
Spinner    20        50
B1         0.60      0.40
B2         0.20      0.80

A single observation consists of selecting a die randomly from A1 and A2 and a spinner randomly
from B1 and B2, rolling the selected die, and if there is a claim spinning the selected spinner.
7.8 (1 point) Determine the Expected Value of the Process Variance for the frequency.
A. Less than 0.22
B. At least 0.22 but less than 0.23
C. At least 0.23 but less than 0.24
D. At least 0.24 but less than 0.25
E. At least 0.25
7.9 (1 point) Determine the Variance of the Hypothetical Mean frequencies.
A. Less than 0.0060
B. At least 0.0060 but less than 0.0063
C. At least 0.0063 but less than 0.0066
D. At least 0.0066 but less than 0.0069
E. At least 0.0069
7.10 (2 points) Determine the Expected Value of the Process Variance for the severity.
A. Less than 150
B. At least 150 but less than 170
C. At least 170 but less than 190
D. At least 190 but less than 210
E. At least 210


7.11 (1 point) Determine the Variance of the Hypothetical Mean severities.


A. Less than 33
B. At least 33 but less than 34
C. At least 34 but less than 35
D. At least 35 but less than 36
E. At least 36
7.12 (2 points) Determine the Expected Value of the Process Variance for the pure premium.
A. Less than 420
B. At least 420 but less than 425
C. At least 425 but less than 430
D. At least 430 but less than 435
E. At least 435
7.13 (2 points) Determine the Variance of the Hypothetical Mean pure premiums.
A. Less than 15.3
B. At least 15.3 but less than 15.8
C. At least 15.8 but less than 16.3
D. At least 16.3 but less than 16.8
E. At least 16.8
Use the following information for the next two questions:
There are two types of urns, each with many balls labeled $1000 and $2000.

Type of Urn    A Priori Chance of     Percentage of    Percentage of
               This Type of Urn       $1000 Balls      $2000 Balls
I              80%                    90%              10%
II             20%                    70%              30%
7.14 (2 points) What is the Expected Value of the Process Variance?
A. Less than 90,000
B. At least 90,000 but less than 100,000
C. At least 100,000 but less than 110,000
D. At least 110,000 but less than 120,000
E. At least 120,000
7.15 (2 points) What is the Variance of the Hypothetical Means?
A. Less than 5000
B. At least 5000 but less than 6000
C. At least 6000 but less than 7000
D. At least 7000 but less than 8000
E. At least 8000


Use the following information for the next 6 questions:
For an individual insured, frequency and severity are independent.
For an individual insured, frequency is given by a Poisson Distribution.
For an individual insured, severity is given by an Exponential Distribution.
Each type is homogeneous; i.e., every insured of a given type has the same
frequency process and severity process.

        Portion of Insureds    Mean         Mean
Type    in this Type           Frequency    Severity
1       30%                    4            50
2       45%                    6            100
3       25%                    9            200

7.16 (2 points) What is the Expected Value of the Process Variance for the frequency?
A. Less than 6.0
B. At least 6.0 but less than 6.2
C. At least 6.2 but less than 6.4
D. At least 6.4 but less than 6.6
E. At least 6.6
7.17 (2 points) What is the Variance of the Hypothetical Mean frequencies?
A. Less than 3.1
B. At least 3.1 but less than 3.3
C. At least 3.3 but less than 3.5
D. At least 3.5 but less than 3.7
E. At least 3.7
7.18 (2 points) What is the Expected Value of the Process Variance for the severity?
A. Less than 20,000
B. At least 20,000 but less than 20,500
C. At least 20,500 but less than 21,000
D. At least 21,000 but less than 21,500
E. At least 21,500
7.19 (2 points) What is the Variance of the Hypothetical Mean severities?
A. 3400    B. 3600    C. 3800    D. 4000    E. 4200


7.20 (3 points) What is the Expected Value of the Process Variance for the pure premium?
A. Less than 200,000
B. At least 200,000 but less than 250,000
C. At least 250,000 but less than 300,000
D. At least 300,000 but less than 350,000
E. At least 350,000
7.21 (2 points) What is the Variance of the Hypothetical Mean pure premiums?
A. 275,000 B. 300,000 C. 325,000 D. 350,000 E. 375,000

Use the following information in the next two questions:


The number of claims is given by a Binomial distribution with parameters m = 10 and q.
The prior distribution of q is uniform on [0, 1].
7.22 (2 points) What is the Expected Value of the Process Variance?
A. Less than 1.8
B. At least 1.8 but less than 2.0
C. At least 2.0 but less than 2.2
D. At least 2.2 but less than 2.4
E. At least 2.4
7.23 (2 point) What is the Variance of the Hypothetical Means?
A. Less than 8.2
B. At least 8.2 but less than 8.4
C. At least 8.4 but less than 8.6
D. At least 8.6 but less than 8.8
E. At least 8.8

Use the following information for the next two questions:
(i) Xi is the claim count observed for insured i for one year.
(ii) Xi has a negative binomial distribution with parameters r = 2 and βi.
(iii) The βi's have the distribution π(β) = 280β⁴ / (1 + β)⁹, 0 < β < ∞.
7.24 (3 points) What is the Expected Value of the Process Variance of claim frequency?
A. 7
B. 9
C. 11
D. 13
E. 15
7.25 (2 points) What is the Variance of the Hypothetical Mean claim frequencies?
A. 7
B. 9
C. 11
D. 13
E. 15


7.26 (2 points) You are given the following:
A portfolio of risks consists of 2 classes, A and B.
For an individual risk in either class, the number of claims follows a Poisson
distribution with mean λ.

                                          Distribution of Lambdas within Class
Class              Number of Exposures    Mean     Standard Deviation
A                  700                    0.080    0.17
B                  300                    0.200    0.24
Total Portfolio    1,000

Determine the standard deviation of the distribution of the lambdas of the individuals within the total portfolio.
A. 0.19    B. 0.20    C. 0.21    D. 0.22    E. 0.23

Use the following information for the next two questions:
Number of claims for a single insured follows a Poisson distribution with mean λ.
The amount of a single claim has a Pareto distribution with α = 4 given by:
F(x) = 1 - {θ/(θ + x)}⁴, x > 0, θ > 0.
λ and θ are independent random variables.
E[λ] = 0.70, Var[λ] = 0.20.
E[θ] = 100, Var[θ] = 40,000.
Number of claims and claim severity distributions are independent.
7.27 (2 points) Determine the expected value of the pure premium's process variance for a single
risk.
A. Less than 10,000
B. At least 10,000 but less than 11,000
C. At least 11,000 but less than 12,000
D. At least 12,000 but less than 13,000
E. At least 13,000
7.28 (2 points) Determine the variance of the hypothetical mean pure premiums.
A. Less than 2000
B. At least 2000 but less than 2500
C. At least 2500 but less than 3000
D. At least 3000 but less than 3500
E. At least 3500


Use the following information for the next four questions:
There are two types of insureds, A and B.
30% are Type A and 70% are Type B.
Each insured has a LogNormal Distribution of size of loss.
Type A has μ = 4 and σ = 1.
Type B has μ = 4 and σ = 1.5.
7.29 (2 points) What is the Expected Value of the Process Variance for severities?
A. Less than 150,000
B. At least 150,000 but less than 160,000
C. At least 160,000 but less than 170,000
D. At least 170,000 but less than 180,000
E. At least 180,000

7.30 (2 points) What is the Variance of the Hypothetical Mean Severities?
A. 1,100    B. 1,200    C. 1,300    D. 1,400    E. 1,500

7.31 (2 points) The mean frequency for Type A is 3, while that for Type B is 2.
What is the Expected Value of the Process Variance for severities?
A. Less than 150,000
B. At least 150,000 but less than 160,000
C. At least 160,000 but less than 170,000
D. At least 170,000 but less than 180,000
E. At least 180,000
7.32 (2 points) The mean frequency for Type A is 3, while that for Type B is 2.
What is the Variance of the Hypothetical Mean Severities?
A. Less than 1,000
B. At least 1,100 but less than 1,200
C. At least 1,200 but less than 1,300
D. At least 1,300 but less than 1,400
E. At least 1,400
7.33 (3 points) You are given the following:
The amount of an individual claim has an Inverse Gamma distribution with shape parameter α = 5
and scale parameter θ.
The parameter θ is distributed via an Exponential Distribution with mean 60.
What is the variance of the mixed distribution?
A. 350    B. 375    C. 400    D. 425    E. 450

Use the following information for the next two questions:
There are three dice:

Die A                  Die B                  Die C
2 faces labeled 0      4 faces labeled 0      5 faces labeled 0
4 faces labeled 1      2 faces labeled 1      1 face labeled 1
7.34 (2 points) What is the Expected Value of the Process Variance?
A. Less than 0.15
B. At least 0.15 but less than 0.16
C. At least 0.16 but less than 0.17
D. At least 0.17 but less than 0.18
E. At least 0.18
7.35 (1 point) What is the Variance of the Hypothetical Means?
A. Less than 0.04
B. At least 0.04 but less than 0.05
C. At least 0.05 but less than 0.06
D. At least 0.06 but less than 0.07
E. At least 0.07

Use the following joint distribution of X and θ for the next two questions:

            θ = 100    θ = 200
X = 0       0.2        0.1
X = 10      0.1        0.3
X = 40      0.1        0.2

7.36 (2 points) What is the Expected Value of the Process Variance?
(A) 245    (B) 250    (C) 255    (D) 260    (E) 265
7.37 (2 points) What is the Variance of the Hypothetical Means?
(A) 5    (B) 6    (C) 7    (D) 8    (E) 9


7.38 (2 points) For each insured, frequency is Poisson with mean λ.
For each insured, the severity distribution has a parameter θ.
For each insured, frequency and severity are independent.
λ and θ vary across a portfolio of insureds independently of each other.
The mean frequency for the portfolio is 10%.
For aggregate losses, the expected value of the process variance is 20,000.
Determine the second moment of the mixed severity distribution.
(A) 100,000    (B) 200,000    (C) 300,000    (D) 400,000    (E) Cannot be determined

Use the following information for the next three questions:
Severity is LogNormal with parameters m and v.
m varies across the portfolio via a Normal Distribution with parameters μ and σ.
7.39 (3 points) What is the Expected Value of the Process Variance?
7.40 (3 points) What is the Variance of the Hypothetical Means?
7.41 (1 point) What is the Buhlmann Credibility Parameter K = EPV/VHM?

7.42 (4 points) For each of the following models of a baseball team, determine the standard
deviation of the number of games won in a year. The team plays 162 games in a year.
(a) There is a 50% probability of winning each game independent of any other game.
(b) The team plays 81 road games with a 45% probability of winning each game independent of
any other road game. The team plays 81 home games with a 55% probability of winning
each game independent of any other home game.
(c) For each game there is equally likely to be a 45% or 55% probability of winning that game.
The winning probability of each game is independent of that of any other game.
The outcome of each game is independent of that of any other game.
(d) The team is equally likely to have a 45% or 55% probability of winning each game.
In any single year, the winning probability of each game is equal to that of any other game.
The outcome of each game is independent of that of any other game.
7.43 (2 points) For each insured, frequency is Poisson with mean λ.
λ varies across a set of insureds via a Poisson Distribution with mean μ.
Determine the variance of the mixed distribution.
A. μ    B. 2μ    C. μ + μ²    D. 2μ²    E. Cannot be determined


7.44 (3 points) Use the following information:
Frequency for an individual is a 50-50 mixture of two Poissons with means λ and 2λ.
The prior distribution of λ is Exponential with a mean of 0.1.
Determine the Buhlmann Credibility Parameter
K = (the expected value of the process variance) / (the variance of the hypothetical means).
A. 6    B. 7    C. 8    D. 9    E. 10

Use the following information in the next two questions:


There are three large urns, each filled with so many balls that you can treat it as if there are an infinite
number. Urn 1 contains balls with "zero" written on them.
Urn 2 has balls with "one" written on them.
The final Urn 3 is filled with 50% balls with "zero" and 50% balls with "one".
An urn is chosen at random and a single ball is selected.
7.45 (4, 5/83, Q.44a) (1 point) What is the Expected Value of the Process Variance?
A. Less than 0.06
B. At least 0.06 but less than 0.07
C. At least 0.07 but less than 0.08
D. At least 0.08 but less than 0.09
E. At least 0.09
7.46 (4, 5/83, Q.44b) (1 point) What is the Variance of the Hypothetical Means?
A. Less than 0.17
B. At least 0.17 but less than 0.18
C. At least 0.18 but less than 0.19
D. At least 0.19 but less than 0.20
E. At least 0.20
7.47 (4, 5/85, Q.40) (3 points) The hypothetical mean frequencies of the members of a class of
risks are distributed uniformly on the interval (0, 1].
The probability density function for severity is f(x) = exp(-x/r) / r, x ≥ 0, with the r parameter being
different for different individuals.
r is distributed on (0, 1) by the function g(r) = 2r, 0 < r ≤ 1.
The frequency and severity are independently distributed.
In which range is the variance of the hypothetical mean pure premiums for this class of risks?
A. Less than 0.06
B. At least 0.06, but less than 0.08
C. At least 0.08, but less than 0.10
D. At least 0.10, but less than 0.12
E. 0.12 or more


7.48 (4, 5/87,Q.31) (1 point) Let X, Y, and Z be discrete random variables.


Which of the following statements are true?
1. If Z is the sum of X and Y, the variance of Z is the sum of the variance of X and the variance of Y.
2. If Z is the difference between X and Y, the variance of Z is the difference between the variance
of X and the variance of Y.
3. EY[VAR[X|Y]] = VARY [E[X|Y]].
A. 1

B. 1, 2

C. 1, 3

D. 1, 2, 3

E. None of A, B, C, D.

7.49 (4B, 11/92, Q.23) (2 points) You are given the following:
A portfolio of risks consists of 2 classes, A and B.
For an individual risk in either class, the number of claims follows a Poisson distribution.

                                          Distribution of Claim Frequency Rates
Class              Number of Exposures    Mean     Standard Deviation
A                  500                    0.050    0.227
B                  500                    0.210    0.561
Total Portfolio    1,000

Determine the standard deviation of the claim frequency for the total portfolio.
A. Less than 0.390
B. At least 0.390 but less than 0.410
C. At least 0.410 but less than 0.430
D. At least 0.430 but less than 0.450
E. At least 0.450


Use the following information for the next two questions:
Number of claims for a single insured follows a Poisson distribution with mean λ.
The amount of a single claim has an exponential distribution given by: f(x) = (1/θ) e^(-x/θ), x > 0, θ > 0.
λ and θ are independent random variables.
E[λ] = 0.10, Var[λ] = 0.0025.
E[θ] = 1000, Var[θ] = 640,000.
Number of claims and claim severity distributions are independent.
7.50 (4B, 5/93, Q.21) (2 points) Determine the expected value of the pure premium's process
variance for a single risk.
A. Less than 150,000
B. At least 150,000 but less than 200,000
C. At least 200,000 but less than 250,000
D. At least 250,000 but less than 300,000
E. At least 300,000
7.51 (4B, 5/93, Q.22) (2 points) Determine the variance of the hypothetical means for the pure
premium.
A. Less than 10,000
B. At least 10,000 but less than 20,000
C. At least 20,000 but less than 30,000
D. At least 30,000 but less than 40,000
E. At least 40,000


Use the following information for the next two questions.


The number of claims for a single risk follows a Poisson distribution with mean m.
m is a random variable with E[m] = 0.40 and var(m) = 0.10.
The amount of an individual claim has a uniform distribution on [0, 100,000].
The number of claims and the amount of an individual claim are independent.
7.52 (4B, 5/94, Q.22) (3 points) Determine the expected value of the pure premium's process
variance for a single risk.
A. Less than 400 million
B. At least 400 million, but less than 800 million
C. At least 800 million, but less than 1,200 million
D. At least 1,200 million, but less than 1,600 million
E. At least 1,600 million
7.53 (4B, 5/94, Q.23) (2 points) Determine the variance of the hypothetical means for the pure
premium.
A. Less than 400 million
B. At least 400 million, but less than 800 million
C. At least 800 million, but less than 1,200 million
D. At least 1,200 million, but less than 1,600 million
E. At least 1,600 million

7.54 (4B, 5/95, Q.4) (3 points) You are given the following:
The number of losses for a single risk follows a Poisson distribution with mean m.
The amount of an individual loss follows an exponential distribution with mean 1000/m
and variance (1000/m)2 .
m is a random variable with density function
f(m) = (1 + m)/ 6, 1 < m < 3
The number of losses and the individual loss amounts are independent.
Determine the expected value of the pure premium's process variance for a single risk.
A. Less than 940,000
B. At least 940,000, but less than 980,000
C. At least 980,000, but less than 1,020,000
D. At least 1,020,000, but less than 1,060,000
E. At least 1,060,000


7.55 (4B, 5/96, Q.11 & 4B, 11/98, Q.21) (3 points) You are given the following:
The number of claims for a single risk follows a Poisson distribution with mean λ.
The amount of an individual claim is always 1,000.
λ is a random variable with density function f(λ) = 4/λ⁵, 1 < λ < ∞.
Determine the expected value of the process variance of the aggregate losses for a single risk.
A. Less than 1,500,000
B. At least 1,500,000, but less than 2,500,000
C. At least 2,500,000, but less than 3,500,000
D. At least 3,500,000, but less than 4,500,000
E. At least 4,500,000
7.56 (4B, 11/97, Q.17) (2 points) You are given the following:
The number of claims follows a Poisson distribution with mean λ.
Claim sizes follow the following distribution:

Claim Size    Probability
2             1/3
              2/3

The prior distribution for λ is:

λ    Probability
1    1/3
2    1/3
3    1/3

The number of claims and claim sizes are independent.
Determine the expected value of the process variance of the aggregate losses.
A. Less than 150
B. At least 150, but less than 300
C. At least 300, but less than 450
D. At least 450, but less than 600
E. At least 600


7.57 (4B, 5/98, Q.7) (2 points) You are given the following:
The number of claims during one exposure period follows a Bernoulli distribution with mean p.
The prior density function of p is assumed to be f(p) = (π/2) sin(πp/2), 0 < p < 1.
Hint: ∫₀¹ (πp/2) sin(πp/2) dp = 2/π and ∫₀¹ (πp²/2) sin(πp/2) dp = 4(π - 2)/π².
Determine the expected value of the process variance.
A. 4(π - 3)/π²    B. 2(4 - π)/π²    C. 4(π - 2)/π²    D. 2/π    E. (4 - π) / {2(π - 3)}

7.58 (4B, 5/98, Q.26) (3 points) You are given the following:
The number of claims follows a Poisson distribution with mean m.
Claim sizes follow a distribution with mean 20m and variance 400m².
m is a gamma random variable with density function f(m) = m² e^(-m) / 2, 0 < m < ∞.
For any value of m, the number of claims and the claim sizes are independent.
Determine the expected value of the process variance of the aggregate losses.
A. Less than 10,000
B. At least 10,000, but less than 25,000
C. At least 25,000, but less than 40,000
D. At least 40,000, but less than 55,000
E. At least 55,000
Use the following information for the next two questions:
Claim sizes follow a Pareto distribution, with parameters θ and α = 3.
The prior distribution of θ has density function f(θ) = e^(-θ), 0 < θ < ∞.
7.59 (4B, 5/99, Q.5) (2 points) Determine the expected value of the process variance.
A. 3/8    B. 3/4    C. 3/2    D. 3    E. 6
7.60 (4B, 5/99, Q.6) (2 points) Determine the variance of the hypothetical means.
A. 1/4    B. 1/2    C. 1    D. 2    E. 4


7.61 (4B, 11/99, Q.4) (2 points) You are given the following:
Claim sizes for a given policyholder follow a distribution with density function
f(x) = 2x/b², 0 < x < b.
The prior distribution of b has density function
g(b) = 1/b², 1 < b < ∞.
Determine the expected value of the process variance.
A. 0    B. 1/18    C. 4/9    D. 1/2    E. ∞
7.62 (IOA 101, 9/01, Q.5) (2.25 points) The number of claims, X, to be processed in a day by an
employee of an insurance company is modeled as X ~ Poisson with mean 10.
The time (minutes) the employee takes, Y, to process x claims is modeled as having a distribution
with conditional mean and variance given by E[Y | X = x] = 15x + 20,
Var[Y | X = x] = x + 12. Calculate the unconditional variance of the time the employee takes to
process claims in a day.
7.63 (4, 5/05, Q.13 & 2009 Sample Q.183) (2.9 points) You are given claim count data for which
the sample mean is roughly equal to the sample variance. Thus you would like to use a claim count
model that has its mean equal to its variance. An obvious choice is the Poisson distribution.
Determine which of the following models may also be appropriate.
(A) A mixture of two binomial distributions with different means
(B) A mixture of two Poisson distributions with different means
(C) A mixture of two negative binomial distributions with different means
(D) None of (A), (B) or (C)
(E) All of (A), (B) and (C)


Solutions to Problems:
7.1. E. For a Bernoulli the process variance is q(1-q).

Type of    A Priori       Chance of    Process
Risk       Probability    a Claim      Variance
A          0.50           0.3          0.21
B          0.35           0.4          0.24
C          0.15           0.5          0.25
Average                                0.2265

7.2. A. The Variance of the Hypothetical Means = 0.1385 - 0.365² = 0.005275.

Type of    A Priori                 Square of
Risk       Probability    Mean      Mean
A          0.50           0.3       0.09
B          0.35           0.4       0.16
C          0.15           0.5       0.25
Average                   0.3650    0.1385

7.3. E. m is the mean claim frequency for an insured, and h(m) = 1/5 on (0, 5].
The mean severity for an insured is r, since that is the mean for the given exponential distribution.
Therefore for a given insured the mean pure premium is mr. The first moment of the hypothetical
mean pure premiums is (since the distributions of m and r are independent):
∫₀⁵ ∫₀³ m r g(r) h(m) dr dm = {∫₀⁵ (m/5) dm} {∫₀³ r (2r/9) dr} = (2.5)(2) = 5.
Similarly, the second moment of the hypothetical mean pure premiums is:
∫₀⁵ ∫₀³ m² r² g(r) h(m) dr dm = {∫₀⁵ (m²/5) dm} {∫₀³ r² (2r/9) dr} = (25/3)(9/2) = 37.5.
Thus, the variance of the hypothetical mean pure premiums is: 37.5 - 5² = 12.5.
Comment: When two variables are independent, the second moment of their product is equal to
the product of their second moments. The same is not true for variances.
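A brief Python sketch of this solution (added as an illustration; the four single-variable moments below come from the integrals in the solution):

# Sketch: VHM of the pure premium when m ~ Uniform(0, 5] and r has density 2r/9 on [0, 3].
# E[m] = 2.5, E[m^2] = 25/3; E[r] = 2, E[r^2] = 4.5.
E_m, E_m2 = 2.5, 25 / 3
E_r, E_r2 = 2.0, 4.5

first = E_m * E_r            # 5   (moments of a product of independent variables multiply)
second = E_m2 * E_r2         # 37.5
print(second - first**2)     # 12.5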


7.4. C. The process variance for the Youthful drivers is: 0.35 - 0.21² = 0.3059.

Youthful drivers:
n              0       1       2       3
Probability    0.85    0.10    0.04    0.01
Mean = 0.2100.  Second Moment = 0.3500.

Adult drivers:
n              0       1       2       3
Probability    0.95    0.04    0.01    0.00
Mean = 0.0600.  Second Moment = 0.0800.

Thus the process variance for the Adult drivers is: 0.08 - 0.06² = 0.0764.
Thus the Expected Value of the Process Variance = (0.0764)(88%) + (0.3059)(12%) = 0.104.

7.5. D. The Variance of the Hypothetical Means = 0.00846 - 0.0780² = 0.00238.

Type of     A Priori                 Square of
Driver      Probability    Mean      Mean
Youthful    0.1200         0.2100    0.04410
Adult       0.8800         0.0600    0.00360
Average                    0.0780    0.00846

7.6. B. The process variances are the squares of the standard deviations:
0.0009, 0.0025, 0.0001. Averaging, we get EPV = (0.0035)/3 = 0.001167.
7.7. C. The Variance of the Hypothetical Means = 0.1167 - 0.3² = 0.0267.

Type of    A Priori                 Square of
Risk       Probability    Mean      Mean
A          0.3333         0.1       0.01
B          0.3333         0.3       0.09
C          0.3333         0.5       0.25
Average                   0.3000    0.1167


7.8. C. For a Bernoulli the process variance is q(1-q).
For example, for Die A1, the process variance = (2/6)(1 - 2/6) = 2/9 = 0.2222.

Type of    Bernoulli    A Priori       Process
Die        Parameter    Probability    Variance
A1         0.3333       0.50           0.2222
A2         0.5000       0.50           0.2500
Average                                0.2361

7.9. E. The Variance of the Hypothetical Means = 0.18056 - 0.41667² = 0.00695.

Type of    A Priori                  Square of
Die        Probability    Mean       Mean
A1         0.50           0.33333    0.11111
A2         0.50           0.50000    0.25000
Average                   0.41667    0.18056

7.10. C. For spinner B1 the first moment is: (20)(0.6) + (50)(0.4) = 32, and the second moment is:
(20²)(0.6) + (50²)(0.4) = 1240. Thus the process variance is: 1240 - 32² = 216.
For spinner B2 the first moment is: (20)(0.2) + (50)(0.8) = 44, and the second moment is:
(20²)(0.2) + (50²)(0.8) = 2080. Thus the process variance is: 2080 - 44² = 144.
Therefore, the expected value of the process variance is: (1/2)(216) + (1/2)(144) = 180.

Type of    A Priori               Second    Process
Spinner    Probability    Mean    Moment    Variance
B1         0.50           32      1240      216
B2         0.50           44      2080      144
Average                                     180

7.11. E. The Variance of the Hypothetical Means = 1480 - 38² = 36.

Type of    A Priori               Square of
Spinner    Probability    Mean    Mean
B1         0.50           32      1024
B2         0.50           44      1936
Average                   38      1480

Comment: Note that the spinners are chosen independently of the dice, so frequency and severity
are independent across risk types. Thus one can ignore the frequency process in this and the prior
question. One can not do so when, for example, low frequency is associated with low severity, as in
the questions related to good, bad and ugly drivers.


7.12. B. For each possible pair of die and spinner, use the formula:
variance of p.p. = μf σs² + μs² σf².

Die and    A Priori Chance    Mean     Variance    Mean        Variance    Process Variance
Spinner    of Risk            Freq.    of Freq.    Severity    of Sev.     of P.P.
A1, B1     0.250              0.333    0.222       32          216         299.6
A1, B2     0.250              0.333    0.222       44          144         478.2
A2, B1     0.250              0.500    0.250       32          216         364.0
A2, B2     0.250              0.500    0.250       44          144         556.0
Mean                                                                       424.4

Comment: It is a much longer problem if one does not make use of values calculated in the solutions
to the previous questions.

7.13. D. The Variance of the Hypothetical Means = 267.222 - 15.833² = 16.53.

Die and    A Priori Chance    Mean     Mean        Mean Pure    Square of
Spinner    of Risk            Freq.    Severity    Premium      Mean P.P.
A1, B1     0.250              0.333    32          10.667       113.778
A1, B2     0.250              0.333    44          14.667       215.111
A2, B1     0.250              0.500    32          16.000       256.000
A2, B2     0.250              0.500    44          22.000       484.000
Mean                                               15.833       267.222

7.14. D. For example, the second moment of Urn II is (0.7)(1000²) + (0.3)(2000²) = 1,900,000.
The process variance of Urn II = 1,900,000 - 1300² = 210,000.

Type of    A Priori               Second       Process
Urn        Probability    Mean    Moment       Variance
I          0.8000         1100    1,300,000    90,000
II         0.2000         1300    1,900,000    210,000
Average                                        114,000

7.15. C. The variance of the hypothetical means is: 1,306,000 - 1140² = 6400.

Type of    A Priori               Square of
Urn        Probability    Mean    Mean
I          0.8000         1100    1,210,000
II         0.2000         1300    1,690,000
Average                   1,140   1,306,000


7.16. B. For the Poisson the process variance is equal to the mean.
The expected value of the process variance is the weighted average of the process variances for
the individual types, using the a priori probabilities as the weights.
The EPV of the frequency = (30%)(4) + (45%)(6) + (25%)(9) = 6.15.

        A Priori       Poisson      Process
Type    Probability    Parameter    Variance
1       30%            4            4
2       45%            6            6
3       25%            9            9
Average                             6.15

7.17. C. One computes the first and second moments of the mean frequencies as follows:

        A Priori       Poisson      Mean         Square of
Type    Probability    Parameter    Frequency    Mean Freq.
1       30%            4            4            16
2       45%            6            6            36
3       25%            9            9            81
Average                             6.15         41.25

Then the variance of the hypothetical mean frequencies = 41.25 - 6.15² = 3.43.
Comment: Using the solution to this question and the previous question, as explained in the next
section, the Buhlmann Credibility parameter for frequency is K = EPV/VHM = 6.15/3.43 = 1.79.
The Buhlmann Credibility applied to the observation of the frequency for E exposures would be:
Z = E / (E + 1.79).


7.18. A. One has to weight together the process variances of the severities for the individual types
using the chance that a claim came from each type. The chance that a claim came from an individual
type is proportional to the product of the a priori chance of an insured being of that type and the
mean frequency for that type.
Parameterize the Exponential with mean θ.
For type 1, the process variance of the Exponential severity is: θ² = 50² = 2500.
The mean frequencies are: 4, 6, and 9. The a priori chances of each type are: 30%, 45% and 25%.
Thus the weights to use to compute the EPV of the severity are (4)(30%), (6)(45%), (9)(25%) =
1.2, 2.7, 2.25. The expected value of the process variance of the severity is the weighted average
of the process variances for the individual types, using these weights.
EPV = {(1.2)(2500) + (2.7)(10000) + (2.25)(40000)} / (1.2 + 2.7 + 2.25) = 19,512.

        A Priori       Mean         Weights =          Exponential    Process
Type    Probability    Frequency    Col. B x Col. C    Parameter      Variance
1       30%            4            1.20               50             2,500
2       45%            6            2.70               100            10,000
3       25%            9            2.25               200            40,000
Average                             6.15                              19,512

7.19. A. As in the previous question, in computing the moments one has to use as weights for
each individual type the chance that a claim came from that type.

        A Priori       Mean         Weights =          Mean        Square of
Type    Probability    Frequency    Col. B x Col. C    Severity    Mean Severity
1       30%            4            1.20               50          2,500
2       45%            6            2.70               100         10,000
3       25%            9            2.25               200         40,000
Average                             6.15               126.83      19,512

Then the variance of the hypothetical mean severities = 19,512 - 126.83² = 3426.
Comment: As explained in the next section, the Buhlmann Credibility parameter for severity is
K = EPV/VHM = 19,512 / 3426 = 5.7. The Buhlmann Credibility applied to the observation of the
mean severity for N claims would be: Z = N / (N + 5.7).


7.20. B. Since frequency and severity are assumed to be independent,
the process variance of the pure premium =
(Mean Frequency)(Variance of Severity) + (Mean Severity)²(Variance of Frequency).

        A Priori       Mean         Variance of    Mean        Variance of    Process
Type    Probability    Frequency    Frequency      Severity    Severity       Variance
1       30%            4            4              50          2,500          20,000
2       45%            6            6              100         10,000         120,000
3       25%            9            9              200         40,000         720,000
Average                                                                        240,000

7.21. E. The mean pure premium = (Mean Frequency)(Mean Severity).
Then one computes the first and second moments of the mean pure premiums as follows:

        A Priori       Mean         Mean        Mean            Square of
Type    Probability    Frequency    Severity    Pure Premium    Pure Premium
1       30%            4            50          200             40,000
2       45%            6            100         600             360,000
3       25%            9            200         1,800           3,240,000
Average                                         780.00          984,000

Then the VHM of the pure premiums = 984,000 - 780² = 375,600.
Comment: Using the solution to this question and the previous question, as explained in the next
section, the Buhlmann Credibility parameter for the pure premium is K = EPV/VHM =
240,000 / 375,600 = 0.64. The Buhlmann Credibility applied to the observation of the pure premium
for E exposures would be: Z = E / (E + 0.64).
7.22. A. The process variance for a Binomial is: mq(1-q) = 10q(1-q) = 10q - 10q².
EPV = 10 E[q - q²] = 10 ∫₀¹ (q - q²) dq = 10(1/2 - 1/3) = 10/6 = 1.67.
7.23. B. For the Binomial Distribution the mean is mq = 10q.
VHM = VAR[10q] = 100 VAR[q] = 100{E[q²] - E[q]²} = 100{1/3 - (1/2)²} = 100/12 = 8.33.
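A tiny numeric check of 7.22 and 7.23 (an illustrative sketch; the uniform-prior moments E[q] = 1/2 and E[q²] = 1/3 are taken from the solution):

# Sketch: EPV and VHM for Binomial(m=10, q) with q ~ Uniform(0, 1).
E_q, E_q2 = 1 / 2, 1 / 3
epv = 10 * (E_q - E_q2)         # E[10q(1-q)] = 10/6 = 1.67
vhm = 100 * (E_q2 - E_q**2)     # Var[10q] = 100/12 = 8.33
print(round(epv, 2), round(vhm, 2))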


7.24. D. Making the change of variables, x = β/(1+β), dβ = dx/(1-x)²:
E[β] = ∫ β π(β) dβ = 280 ∫ β⁵/(1+β)⁹ dβ = 280 ∫₀¹ x⁵(1-x)⁴ dx/(1-x)² = 280 ∫₀¹ x⁵(1-x)² dx =
(280)(Γ(6)Γ(3)/Γ(6 + 3)) = (280)(5! 2! / 8!) = 5/3.
E[β²] = ∫ β² π(β) dβ = 280 ∫ β⁶/(1+β)⁹ dβ = 280 ∫₀¹ x⁶(1-x)³ dx/(1-x)² = 280 ∫₀¹ x⁶(1-x) dx =
(280)(Γ(7)Γ(2)/Γ(7 + 2)) = (280)(6! 1! / 8!) = 5.
Process Variance = rβ(1+β) = 2β + 2β².
EPV = E[2β + 2β²] = 2E[β] + 2E[β²] = (2)(5/3) + (2)(5) = 13.33.
Comment: By a change of variables the distribution of the β parameter was converted to a Beta
distribution and the integral into a Beta type integral. The Beta Distribution and Beta Type Integrals
are discussed in "Mahler's Guide to Conjugate Priors." See also page 2 of the tables attached to
your exam. If π(β) is proportional to β^(a-1)/(1 + β)^(a+b), then x = β/(1+β) follows a Beta Distribution
with parameters a and b.
7.25. B. VHM = Var[rβ] = Var[2β] = 4 Var[β] = 4(E[β²] - E[β]²) = (4)(5 - 25/9) = 80/9 = 8.89.
Comment: Buhlmann Credibility Parameter = K = EPV/VHM = 13.33/8.89 = 1.5.
If π(β) is proportional to β^(a-1)/(1 + β)^(a+b), then it turns out that K = (b-1)/r.
For this problem a = 5, b = 4, and r = 2. K = (b-1)/r = 3/2. This is an example of exact credibility,
in which the estimate of the future frequency of an insured using Bayesian Analysis equals that from
Buhlmann Credibility. See "Mahler's Guide to Conjugate Priors."


7.26. B. The distribution of claim frequencies for the combined portfolio has mean given by:
E[λ] = {(700)(0.080) + (300)(0.200)} / 1000 = 0.116.
The second moment for Class A is the variance plus the square of the mean:
(0.17²) + (0.08²) = 0.0353.
The second moment for Class B is: (0.24²) + (0.20²) = 0.0976.
The combined portfolio's second moment is: {(700)(0.0353) + (300)(0.0976)} / 1000 = 0.05399.
Thus the variance of λ for the combined portfolio is: 0.05399 - 0.116² = 0.0405.
The standard deviation is: √0.0405 = 0.2013.
Alternately, the variance of the means for the two classes is:
VAR[E[λ | Class]] = {(0.7)(0.080²) + (0.3)(0.200²)} - 0.116² = 0.003024.
The average of the variance for the two classes is:
E[Var[λ | Class]] = {(700)(0.17²) + (300)(0.24²)} / 1000 = 0.03751.
The variance of λ across the whole portfolio is VAR[λ] = E[Var[λ | Class]] + VAR[E[λ | Class]] =
0.003024 + 0.03751 = 0.0405. Thus the standard deviation is: √0.0405 = 0.2013.
7.27. C. E[θ²] = Var[θ] + E[θ]² = 40,000 + 10,000 = 50,000.
Since frequency and severity are independent, for fixed λ and θ,
Process Variance of the Pure Premium = E[freq.]Var[sev.] + E²[sev.]Var[freq.] =
λ Var[sev.] + λ E²[sev.] = λ (2nd moment of the severity) = λ (2θ² / {(α-1)(α-2)}) = λθ²/3.
EPV = E[λθ²/3] = (1/3)E[λ]E[θ²] = (1/3)(0.7)(50,000) = 11,667.
Comment: The 2nd moment of the Pareto Distribution is: 2θ² / {(α-1)(α-2)}.
7.28. D. E[λ²] = Var[λ] + E[λ]² = 0.20 + 0.49 = 0.69. E[θ²] = Var[θ] + E[θ]² = 40,000 + 10,000 =
50,000. The hypothetical mean pure premium is: (avg. freq.)(avg. severity) = λθ/(α-1) = λθ/3.
Var[Mean P.P.] = Var[λθ/3] = E[(λθ/3)²] - E[λθ/3]² = E[λ²]E[θ²]/9 - E[λ]²E[θ]²/9 =
(0.69)(50,000)/9 - (0.49)(10,000)/9 = 3,289.
Comment: The mean of the Pareto Distribution is: θ/(α-1).
Combining the answer to this question and the previous one, K = 11,667/3,289 = 3.5.


7.29. D. For a LogNormal, mean = exp[μ + 0.5σ²], 2nd moment = exp[2μ + 2σ²].

        A Priori                            2nd        Process
Type    Probability    μ    σ      Mean     Moment     Variance
A       0.3            4    1      90.02    22,026     13,923
B       0.7            4    1.5    168.17   268,337    240,055
Average                                                172,215

7.30. C. VHM = 22,229 - 144.73² = 1282.

        A Priori                            Square
Type    Probability    μ    σ      Mean     of Mean
A       0.3            4    1      90.02    8,103
B       0.7            4    1.5    168.17   28,283
Average                            144.73   22,229

7.31. B. The portion of claims from Type A is: (0.3)(3) / {(0.3)(3) + (0.7)(2)} = 0.9/(0.9 + 1.4) = 39.13%.
The portion of claims from Type B is: 1.4/(0.9 + 1.4) = 60.87%.

        A Priori    Mean                                   2nd        Process
Type    Prob.       Freq.    Weight    μ    σ      Mean    Moment     Variance
A       0.3         3        0.9       4    1      90.02   22,026     13,923
B       0.7         2        1.4       4    1.5    168.17  268,337    240,055
                                                                      151,569

EPV = (39.13%)(13,923) + (60.87%)(240,055) = 151,569.

7.32. E. VHM = 20,386 - 137.59² = 1455.

        A Priori    Mean               Percent                         Square
Type    Prob.       Freq.    Weight    of Claims    μ    σ      Mean   of Mean
A       0.3         3        0.9       39.13%       4    1      90.02  8,103
B       0.7         2        1.4       60.87%       4    1.5    168.17 28,283
                             2.3       100.00%                  137.59 20,386


7.33. B. Each Inverse Gamma has mean: θ/(α - 1) = θ/4,
and second moment: θ²/{(α - 1)(α - 2)} = θ²/12.
Therefore, each Inverse Gamma has a (process) variance of: θ²/12 - (θ/4)² = θ²/48.
Since θ is distributed via an Exponential Distribution with mean 60,
E[θ] = 60, E[θ²] = (2)(60²) = 7200, and Var[θ] = 60² = 3600.
EPV = E[θ²/48] = E[θ²]/48 = 7200/48 = 150.
VHM = Var[θ/4] = Var[θ]/4² = 3600/16 = 225.
Total Variance = EPV + VHM = 150 + 225 = 375.
Alternately, the mean of the mixture is the mixture of the means: E[θ/4] = E[θ]/4 = 60/4 = 15.
The second moment of the mixture is the mixture of the second moments:
E[θ²/12] = E[θ²]/12 = 7200/12 = 600.
Therefore, the variance of the mixture is: 600 - 15² = 375.
Alternately, this is an example of an Exponential-Inverse Gamma.
The mixed distribution is Pareto with α = 5 and θ = 60.
This Pareto has mean: θ/(α - 1) = 60/4 = 15,
and second moment: 2θ²/{(α - 1)(α - 2)} = (2)(60²)/{(4)(3)} = 600.
Therefore, this Pareto has a variance of: 600 - 15² = 375.
7.34. E. For example, for Die Type C, the process variance = (1/6)(5/6) = 5/36 = 0.1389.

Type of    A Priori       Process
Die        Probability    Variance
A          0.3333         0.2222
B          0.3333         0.2222
C          0.3333         0.1389
Average                   0.1944


7.35. B. The Variance of the Hypothetical Means = 0.1944 - 0.3889² = 0.0432.

Type of    A Priori                 Square of
Die        Probability    Mean      Mean
A          0.3333         0.6667    0.4444
B          0.3333         0.3333    0.1111
C          0.3333         0.1667    0.0278
Average                   0.3889    0.1944

7.36. C. & 7.37. D. Adding the probabilities, there is a 40% a priori probability of θ = 100, risk
type A, and a 60% a priori probability of θ = 200, risk type B.
For Risk Type A, the distribution of X is: 0 @ 0.2/0.4 = 1/2, 10 @ 0.1/0.4 = 1/4, and 40 @ 0.1/0.4 = 1/4.
The mean for Risk Type A is: E[X | θ = 100] = (0)(1/2) + (10)(1/4) + (40)(1/4) = 12.5.
The 2nd moment for Risk Type A is: E[X² | θ = 100] = (0²)(1/2) + (10²)(1/4) + (40²)(1/4) = 425.
Process Variance for Risk Type A is: Var[X | θ = 100] = 425 - 12.5² = 268.75.
Similarly, Risk Type B has mean 18.333, second moment 583.33, and process variance 247.22.
EPV = 255.8. The variance of the hypothetical means = 264.16 - 16² = 8.2.

Risk    A Priori              Square of    Process
Type    Chance      Mean      Mean         Variance
A       0.4         12.50     156.25       268.75
B       0.6         18.33     336.10       247.22
Average             16.00     264.16       255.83

Comment: Overall we have: 30% X = 0, 40% X = 10, and 30% X = 40. Thus the overall mean is
16, and the total variance is 264. Note that EPV + VHM = 255.8 + 8.2 = 264 = total variance.
Similar to 4, 11/02, Q.29.
7.38. B. For a compound Poisson, the process variance of aggregate loss is: λ E[X² | θ].
Thus for aggregate loss, EPV = E[λ] E[E[X² | θ]] = E[λ] E[X²].
Therefore, 20,000 = (10%) E[X²]. E[X²] = 200,000.
Comment: For severity, the mixture of the second moments is the second moment of the mixture.

7.39, 7.40, & 7.41. The process variance given m is:
Second Moment of the LogNormal - Square of the First Moment of the LogNormal =
Exp[2m + 2v²] - Exp[m + v²/2]² = Exp[2m + 2v²] - Exp[2m + v²] = Exp[2m] (Exp[2v²] - Exp[v²]).
Therefore, EPV = E[Exp[2m]] (Exp[2v²] - Exp[v²]).
m is Normal with mean μ and standard deviation σ.
Therefore, 2m is Normal with parameters 2μ and 2σ.
Therefore, Exp[2m] is LogNormal with parameters 2μ and 2σ.
Therefore, E[Exp[2m]] is the mean of this LogNormal: Exp[2μ + (2σ)²/2] = Exp[2μ + 2σ²].
Therefore, EPV = Exp[2μ + 2σ²] (Exp[2v²] - Exp[v²]).
The hypothetical mean given m is:
First Moment of the LogNormal = Exp[m + v²/2] = Exp[m] Exp[v²/2].
Therefore, VHM = Var[Exp[m]] Exp[v²/2]² = Var[Exp[m]] Exp[v²].
m is Normal with mean μ and standard deviation σ.
Therefore, Exp[m] is LogNormal with parameters μ and σ.
Therefore, Var[Exp[m]] is the variance of this LogNormal:
Exp[2μ + 2σ²] - Exp[μ + σ²/2]² = Exp[2μ + 2σ²] - Exp[2μ + σ²] = Exp[2μ] (Exp[2σ²] - Exp[σ²]).
Therefore, VHM = Exp[2μ] (Exp[2σ²] - Exp[σ²]) Exp[v²].
K = EPV/VHM = Exp[2μ + 2σ²] (Exp[2v²] - Exp[v²]) / {Exp[2μ] (Exp[2σ²] - Exp[σ²]) Exp[v²]}
= (Exp[v²] - 1) / (1 - Exp[-σ²]).
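These closed forms can be spot-checked by simulation. The sketch below (added as an illustration, not part of the original text) picks arbitrary illustrative values μ = 1, σ = 0.5, v = 0.8 and compares simulated EPV and VHM to the formulas, including the simplified K:

# Sketch: Monte Carlo check of the LogNormal-Normal EPV and VHM formulas.
import math, random

mu, sigma, v = 1.0, 0.5, 0.8      # arbitrary illustrative parameters
exp = math.exp

epv_formula = exp(2 * mu + 2 * sigma**2) * (exp(2 * v**2) - exp(v**2))
vhm_formula = exp(2 * mu) * (exp(2 * sigma**2) - exp(sigma**2)) * exp(v**2)

random.seed(7)
ms = [random.gauss(mu, sigma) for _ in range(300_000)]
# Given m, severity is LogNormal(m, v): hypothetical mean and process variance in closed form.
hyp_means = [exp(m + v**2 / 2) for m in ms]
proc_vars = [exp(2 * m + 2 * v**2) - exp(m + v**2 / 2) ** 2 for m in ms]

epv_sim = sum(proc_vars) / len(proc_vars)
mean_hm = sum(hyp_means) / len(hyp_means)
vhm_sim = sum(h * h for h in hyp_means) / len(hyp_means) - mean_hm**2
print(round(epv_formula, 1), round(epv_sim, 1))
print(round(vhm_formula, 1), round(vhm_sim, 1))
print(round(epv_formula / vhm_formula, 3), round((exp(v**2) - 1) / (1 - exp(-sigma**2)), 3))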



7.42. a) Variance = (0.5)(1 - 0.5)(162) = 40.5. Standard Deviation = 6.364.
b) Variance of the road games is: (0.45)(1 - 0.45)(81) = 20.0475.
Variance of the home games is: (0.55)(1 - 0.55)(81) = 20.0475.
Variance of total number of games won is: 20.0475 + 20.0475 = 40.095.
Standard Deviation = 6.332.
c) This is a mixture. For a single game, the mean is 0.5.
The second moment of a Bernoulli is: q(1-q) + q² = q.
For a single game, the second moment is: (1/2)(0.45) + (1/2)(0.55) = 0.5.
The variance of the mixture for one game is: 0.5 - 0.5² = 0.25.
Variance of the mixture for 162 games is: (0.25)(162) = 40.5. Standard Deviation = 6.364.
Alternately, EPV = (1/2)(0.45)(0.55) + (1/2)(0.55)(0.45) = 0.2475.
VHM = (1/2)(0.45 - 0.5)² + (1/2)(0.55 - 0.5)² = 0.0025.
Therefore, the total variance for one game is: 0.2475 + 0.0025 = 0.25.
Variance of the mixture is: (0.25)(162) = 40.5. Standard Deviation = 6.364.
d) This is a mixture. The mean number of wins for the year is: (0.5)(162) = 81.
The second moment of a Binomial is: mq(1-q) + (mq)².
The second moment of a Binomial with m = 162 and q = 0.45 is:
(162)(0.45)(0.55) + {(0.45)(162)}² = 5354.5.
The second moment of a Binomial with m = 162 and q = 0.55 is:
(162)(0.55)(0.45) + {(0.55)(162)}² = 7978.9.
The second moment of the mixture is: (1/2)(5354.5) + (1/2)(7978.9) = 6666.7.
Variance of the mixture is: 6666.7 - 81² = 105.7. Standard Deviation = 10.281.
Alternately, EPV = (1/2)(162)(0.45)(0.55) + (1/2)(162)(0.55)(0.45) = 40.095.
VHM = (1/2)(72.9 - 81)² + (1/2)(89.1 - 81)² = 65.61.
Therefore, the Variance of the mixture for 162 games is: 40.095 + 65.61 = 105.7.
Standard Deviation = 10.281.
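The contrast between models (c) and (d) can also be seen by simulation; the sketch below (illustrative only, not needed for the exam) shows that per-game mixing leaves the variance at the binomial level, while one season-long winning probability adds the VHM term for all 162 games.

# Sketch: standard deviation of season wins under models (c) and (d) of problem 7.42.
import random, statistics

GAMES, SEASONS = 162, 20_000
random.seed(3)

def season_c():
    # Model (c): each game independently has a 45% or 55% win probability.
    return sum(random.random() < random.choice([0.45, 0.55]) for _ in range(GAMES))

def season_d():
    # Model (d): one 45% or 55% win probability applies to the whole season.
    p = random.choice([0.45, 0.55])
    return sum(random.random() < p for _ in range(GAMES))

wins_c = [season_c() for _ in range(SEASONS)]
wins_d = [season_d() for _ in range(SEASONS)]
print(round(statistics.pstdev(wins_c), 2))   # about 6.36, i.e. sqrt(40.5)
print(round(statistics.pstdev(wins_d), 2))   # about 10.28, i.e. sqrt(105.7)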
7.43. B. The process variance given λ is λ. EPV = E[λ] = μ.
The hypothetical mean given λ is λ. VHM = Var[λ] = μ.
Total Variance = EPV + VHM = 2μ.
Comment: The Buhlmann Credibility Parameter is: K = EPV/VHM = μ/μ = 1.


7.44. B. The mean frequency given λ is: 0.5λ + (0.5)(2λ) = 1.5λ.
VHM = Var[1.5λ] = 1.5² Var[λ] = 1.5²(0.1²) = 0.0225.
Second moment given lambda is: (0.5)(λ + λ²) + (0.5){2λ + (2λ)²} = 1.5λ + 2.5λ².
Process variance given lambda is: 1.5λ + 2.5λ² - (1.5λ)² = 1.5λ + 0.25λ².
EPV = 1.5 E[λ] + 0.25 E[λ²] = (1.5)(0.1) + (0.25)(2)(0.1²) = 0.155.
K = EPV / VHM = 0.155 / 0.0225 = 6.89.
7.45. D.
Type of Urn    A Priori Probability    Process Variance
1              0.3333                  0
2              0.3333                  0
3              0.3333                  0.25
Average                                0.0833

7.46. A. Variance of the Hypothetical Means = 0.4167 - 0.5² = 0.1667.
Type of Urn    A Priori Probability    Mean for this Type of Urn    Square of Mean of this Type of Urn
1              0.3333                  0                            0
2              0.3333                  1                            1
3              0.3333                  0.5                          0.25
Average                                0.5                          0.4167
Comment: This is a mixture. Mean of the mixture is: (0 + 1 + 1/2)/3 = 1/2.
Each Urn is Bernoulli. For each Urn, its mean is q and variance is q(1-q).
Therefore, for each Urn its second moment is: q(1-q) + q² = q.
Second moment of the mixture is: (0 + 1 + 1/2)/3 = 1/2.
Variance of mixture: 1/2 - (1/2)² = 1/4.
EPV + VHM = 0.0833 + 0.1667 = 1/4 = Total Variance.
If you calculated the variance of the mixture, which is the total variance, and calculated either the EPV
or the VHM, then you could back the other one out of the total variance.

7.47. A. Let m be the mean claim frequency for an insured. Then h(m) = 1 on (0,1].
The mean severity for a risk is r, since that is the mean for the given exponential distribution.
Therefore for a given insured the mean pure premium is mr.
The first moment of the hypothetical mean pure premiums is:
∫∫ m r g(r) h(m) dr dm = {∫ m dm} {∫ r (2r) dr} = (1/2)(2/3) = 1/3,
with m running from 0 to 1 and r running from 0 to 1.
The second moment of the hypothetical mean pure premiums is (since the frequency and severity
distributions are independent):
∫∫ m² r² g(r) h(m) dr dm = {∫ m² dm} {∫ r² (2r) dr} = (1/3)(1/2) = 1/6.
The variance of the hypothetical mean pure premiums is: 1/6 - (1/3)² = 1/18 = 0.0556.
7.48. E. 1. False. In general VAR[X+Y] = VAR[X] + VAR[Y] + 2COV[X,Y]. Thus statement 1 is
only true when COV[X,Y] = 0. 2. False. In general VAR[X-Y] = VAR[X] + VAR[Y] - 2COV[X,Y].
3. False. In analysis of variance, these are the two pieces that make up the unconditional variance,
and usually they are not equal. For example the expected value of the process variance is usually
not equal to the variance of the hypothetical means.
7.49. D. Assume that what is given as the Distribution of Claim Frequencies Rates is the
Distribution of the Poisson parameters λ of the individual drivers across each class.
The distribution of claim frequencies for the combined portfolio has mean given by
E[λ] = {(500)(.050) + (500)(.210)} / 1000 = .130. The second moment for Class A is:
(.227²) + (.05²) = .054029. The second moment for Class B is: (.561²) + (.21²) = .358821.
The combined portfolio's second moment is a weighted average:
E[λ²] = {(500)(.054029) + (500)(.358821)} / 1000 = .2064. Thus the variance of λ over the
combined portfolio is: .2064 - .13² = .1895. The standard deviation is: √0.1895 = 0.4353.
Alternately, the variance of the means for the two classes is VAR[E[λ | Class]] = .08² = .0064.
The average of the variance for the two classes is:
E[Var[λ | Class]] = {(500)(.227²) + (500)(.561²)} / 1000 = .1831. The variance of λ across the
whole portfolio is: VAR[λ] = E[Var[λ | Class]] + VAR[E[λ | Class]] = .1831 + .0064 = .1895.
Thus the standard deviation is: √0.1895 = 0.4353.
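A minimal Python check of this calculation, using only numbers from the solution:

import math
e_lambda = (500*0.05 + 500*0.21) / 1000
e_lambda2 = (500*(0.227**2 + 0.05**2) + 500*(0.561**2 + 0.21**2)) / 1000
print(math.sqrt(e_lambda2 - e_lambda**2))   # about 0.4353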

7.50. E. E[θ²] = Var[θ] + E[θ]² = 640,000 + 1,000,000 = 1,640,000.
Since frequency and severity are independent, for fixed λ and θ,
Process Variance of the Pure Premium = E[freq.] Var[sev.] + E²[sev.] Var[freq.] = λθ² + θ²λ = 2λθ².
Expected Value of the Process Variance = E[2λθ²] = 2 E[λ] E[θ²] = (2)(.1)(1,640,000) = 328,000.
7.51. B. E[λ²] = Var[λ] + E[λ]² = .0025 + .01 = .0125.
E[θ²] = Var[θ] + E[θ]² = 640,000 + 1,000,000 = 1,640,000.
The hypothetical mean pure premium is: (avg. freq)(avg. severity) = λθ.
Var[Mean P.P.] = Var[λθ] = E[(λθ)²] - E[λθ]² = E[λ²] E[θ²] - E[λ]² E[θ]² =
(.0125)(1,640,000) - (.01)(1,000,000) = 10,500.
Comment: Note that if one were to combine the answer to this question and the previous one, then
the Buhlmann credibility parameter is K = 328,000 / 10,500 = 31.2.
7.52. D. Given m, since the frequency and severity are independent, the process variance of the
Pure Premium = (mean freq.)(variance of severity) + (mean severity)² (variance of freq.)
= m {(variance of severity) + (mean severity)²} = m (2nd moment of the severity)
= (m / 100000) ∫ x² dx, for x from 0 to 100000,
= (m/100000) (100000)³ / 3 = m 3.333 x 10⁹.
Thus, E[Process Variance] = 3.333 x 10⁹ E[m] = (3.333 x 10⁹)(.4) = 1333 million.
7.53. A. The mean severity is 50,000 for each risk.
Therefore, given m, the hypothetical mean pure premium = m 50000.
Thus the Variance of the Hypothetical Means of the Pure Premiums is:
VAR[50000m] = 500002 VAR[m] = (2.5 x 109 )(.10) = 2.5 x 108 = 250 million.

7.54. D. Take m fixed, then since the frequency and severity are independent:
Process Variance = μf σs² + μs² σf² = m (1000/m)² + (1000/m)² m = 2,000,000 / m.
To get the expected value of the Process Variance one has to integrate over all values of m using
the p.d.f. f(m):
EPV = ∫ (2 million) (1/m) (1+m)/6 dm, for m from 1 to 3,
= (1 million / 3) ∫ {(1/m) + 1} dm = (1 million / 3) {ln(m) + m}, evaluated from m = 1 to m = 3,
= (1 million / 3) {ln(3) + 2} = 1.03 million.
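A quick numeric check of this integral in Python (a crude midpoint-rule sum; everything else is taken from the solution):

import math
n = 200000
h = 2.0 / n                                   # integrate m from 1 to 3
mids = (1.0 + (i + 0.5)*h for i in range(n))
epv = sum((2e6 / m) * (1 + m) / 6 for m in mids) * h
print(epv, (1e6/3) * (math.log(3) + 2))       # both about 1,032,870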

7.55. D. The individuals in the portfolio are parameterized via λ, which in turn is distributed as
f(λ) = 4/λ⁵, 1 < λ < ∞. For each individual we are given that frequency is Poisson with mean λ and
that severity is constant at 1000λ. Thus for each individual (λ fixed), we have:
μf = σf² = λ, μs = 1000λ, and σs² = 0. Thus for each individual (λ fixed), we have:
σPP² = μf σs² + μs² σf² = (λ)(0) + (1000λ)²(λ) = 1,000,000 λ³.
In order to find the Expected Value of the Process Variance, one needs to take the integral with
respect to λ of σPP² f(λ) dλ:
∫ 1,000,000 λ³ (4/λ⁵) dλ, for λ from 1 to ∞, = 4,000,000 ∫ λ⁻² dλ = 4,000,000 (1) = 4,000,000.
Comment: You have to carefully calculate the process variance of the aggregate losses for each
type of risk and then average over the different types of risks.

7.56. D. For a Poisson frequency with independent frequency and severity, the process variance
of the aggregate losses is equal to the mean frequency times the second moment of the severity.
Given λ, the second moment of the severity is: (1/3)(2λ)² + (2/3)(8λ)² = 44λ². Thus given λ, the
process variance of the pure premiums is: λ E[X²] = λ(44λ²) = 44λ³.
The Expected Value of the Process Variance is: (1/3)(44) + (1/3)(352) + (1/3)(1188) = 528.
Lambda    A Priori Probability    Process Variance of P.P. = 44 Lambda Cubed
1         0.3333                  44
2         0.3333                  352
3         0.3333                  1188
Average                           528
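A one-line Python check of the EPV above (all inputs are from the solution):

lam_values = (1, 2, 3)
epv = sum(lam * ((1/3)*(2*lam)**2 + (2/3)*(8*lam)**2) for lam in lam_values) / 3
print(epv)   # 528.0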

7.57. B. The Bernoulli has process variance: q(1-q) = q - q².
E[q] = ∫ q (π/2) sin(πq/2) dq, for q from 0 to 1, = 2/π.
E[q²] = ∫ q² (π/2) sin(πq/2) dq, for q from 0 to 1, = 4(π - 2)/π².
Thus EPV = E[q - q²] = E[q] - E[q²] = 2/π - 4(π - 2)/π² = (8 - 2π)/π² = 2(4 - π)/π².
Comment: The Variance of the Hypothetical Means would be computed as follows.
VHM = VAR[q] = E[q²] - E²[q] = 4(π - 2)/π² - (2/π)² = (4π - 12)/π².
Then the Buhlmann Credibility Parameter K = EPV/VHM = (8 - 2π)/(4π - 12) = 3.03.
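A small Python sketch that checks the two integrals numerically (midpoint rule) against the closed forms above; everything here comes from the solution:

import math
n = 20000
h = 1.0 / n
mids = [(i + 0.5) * h for i in range(n)]
density = [(math.pi/2) * math.sin(math.pi*q/2) for q in mids]
e_q = sum(q*f for q, f in zip(mids, density)) * h
e_q2 = sum(q*q*f for q, f in zip(mids, density)) * h
print(e_q, 2/math.pi)                       # both about 0.6366
print(e_q2, 4*(math.pi - 2)/math.pi**2)     # both about 0.4625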
7.58. D. For m fixed the process variance of the aggregate losses is:
μF σS² + μS² σF² = (m)(400m²) + (20m)²(m) = 800m³.
Therefore the expected value of the process variance of the aggregate losses is:
∫ f(m) 800m³ dm, for m from 0 to ∞, = 400 ∫ m⁵ e^(-m) dm = 400 Γ(6) = (400)(5!) = 48,000.
Comment: The integral from 0 to ∞ of m⁵ e^(-m) is given in terms of a Gamma Function. Alternately, we
need 800 times the integral of m³ times f(m), where f(m) is a Gamma Distribution with parameters
α = 3 and θ = 1. This integral is the third moment of a Gamma Distribution; it is equal to
α(α+1)(α+2)θ³ = 60. Thus the expected value of the process variance of the aggregate losses is equal to
(800)(60) = 48,000.

7.59. C. The process variance for a Pareto is: 2θ²/{(α-1)(α-2)} - {θ/(α-1)}² = 3θ²/4.
EPV = ∫ (Process Variance | θ) f(θ) dθ = ∫ (3θ²/4) e^(-θ) dθ, for θ from 0 to ∞,
= (3/4) ∫ θ² e^(-θ) dθ = (3/4) Γ(3) = (3/4)(2) = 3/2.
7.60. A. The mean for a Pareto is: θ/(α-1) = θ/2.
2nd Moment of the Hypothetical Means = ∫ (θ/2)² f(θ) dθ, for θ from 0 to ∞,
= (1/4) ∫ θ² e^(-θ) dθ = (1/4) Γ(3) = (1/4)(2) = 1/2.
1st Moment of the Hypothetical Means = ∫ (θ/2) f(θ) dθ = (1/2) ∫ θ e^(-θ) dθ = (1/2) Γ(2) = (1/2)(1) = 1/2.
Therefore, Variance of the Hypothetical Means = (1/2) - (1/2)² = 1/4.
Comment: There are many ways to make a mistake and still get the right answer.
Combining the solutions to this and the previous question would produce a Buhlmann Credibility
Parameter K = EPV/VHM = (3/2) / (1/4) = 6.
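A short Python check of these gamma-function integrals (math.gamma is the standard-library Gamma function; all constants are from the two solutions):

import math
epv = 0.75 * math.gamma(3)          # (3/4) Gamma(3) = 3/2
second = 0.25 * math.gamma(3)       # 1/2
first = 0.5 * math.gamma(2)         # 1/2
vhm = second - first**2             # 1/4
print(epv, vhm, epv / vhm)          # 1.5, 0.25, 6.0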
7.61. E. Given b, one can get the process variance by computing the first and second moments:
Mean = ∫ x (2x/b²) dx, for x from 0 to b, = (2/3) x³/b², evaluated at x = b, = 2b/3.
2nd moment = ∫ x² (2x/b²) dx, for x from 0 to b, = x⁴/(2b²), evaluated at x = b, = b²/2.
Given b, the Process Variance = b²/2 - (2b/3)² = b²/18.
EPV = ∫ (b²/18)(1/b²) db, for b from 1 to ∞, = b/18, evaluated from 1 to ∞, = ∞.
Comment: The overall mean, the second moment of the hypothetical means, and the VHM are all
also infinite.

7.62. Var[Y] = Var[E[Y | X = x]] + E[Var[Y | X = x]] = Var[15x + 20] + E[x + 12] =
152 Var[X] + E[X] + 12 = (225)(10) + 10 + 12 = 2272.
7.63. A. When one mixes distributions, the variance increases.
Var[X] = E[Var[X | λ]] + Var[E[X | λ]] ≥ E[Var[X | λ]].
Thus since for a Poisson Distribution, the variance is equal to the mean, for a mixture of
Poisson Distributions, the variance is greater than the mean.
Since for a Negative Binomial Distribution, the variance is greater than the mean, for a mixture of
Negative Binomial Distributions, the variance is greater than the mean.
Since for a Binomial Distribution, the variance is less than the mean, for a mixture of Binomial
Distributions, the variance can be either less than, greater than, or equal to the mean.
Thus a mixture of two binomial distributions with different means may be appropriate for the given
situation.

Section 8, Buhlmann Credibility, Introduction


The expected value of the process variance and the variance of the hypothetical means, discussed
in the previous section, will be used in Buhlmann Credibility.
Buhlmann Credibility Parameter:
The Buhlmann Credibility Parameter is calculated as: K = EPV / VHM.92
K = Expected Value of Process Variance / Variance of Hypothetical Means.
Where the Expected Value of the Process Variance and the Variance of the Hypothetical
Means are each calculated for a single observation of the risk process.93
Buhlmann Credibility Formula:
Then for N observations, the Buhlmann Credibility is: Z = N / (N + K).94
Loss Models calls Z the Buhlmann Credibility Factor or Bühlmann-Straub Credibility Factor.95
Using Buhlmann Credibility,
the estimate of the future = Z(observation) + (1 - Z)(prior mean).
Multi-sided Dice Example:
An example involving multi-sided dice was discussed previously with respect to Bayesian
Analysis:
There are a total of 100 multi-sided dice of which 60 are 4-sided, 30 are 6-sided and 10 are
8-sided. The multi-sided dice with 4 sides have 1, 2, 3 and 4 on them. The multi-sided dice with the
usual 6 sides have numbers 1 through 6 on them. The multi-sided dice with 8 sides have numbers
1 through 8 on them. For a given die each side has an equal chance of being rolled; i.e., the die is
fair.
Your friend has picked at random a multi-sided die. He then rolled the die and told you the result.
You are to estimate the result when he rolls that same die again.
92 In Loss Models notation, k = v/a.
93 It is important to use a single consistent definition of a single draw from the risk process, for both the calculation of
K and the determination of N.
94 If N = 1, then Z = 1/(1 + K) = VHM/(VHM + EPV) = VHM / Total Variance.
95 Bühlmann-Straub refers to the case where there are varying numbers of exposure units, as in for example Group
Health Insurance or Commercial Automobile Insurance.
Expected Value of the Process Variance, Multi-sided Die Example:
For each type of die we can compute the mean and the (process) variance.
For example for a six-sided die one need only list all the possibilities:
A: Roll of Die    B: A Priori Probability    Col. A times Col. B    Square of Col. A times Col. B
1                 0.16667                    0.16667                0.16667
2                 0.16667                    0.33333                0.66667
3                 0.16667                    0.50000                1.50000
4                 0.16667                    0.66667                2.66667
5                 0.16667                    0.83333                4.16667
6                 0.16667                    1.00000                6.00000
Sum                                          3.5                    15.16667

Thus the mean is 3.5 and the variance is 15.16667 - 3.5² = 2.91667 = 35/12. Thus the conditional
variance if a six-sided die is picked is 35/12. VAR[X | 6-sided] = 35/12.
Exercise: What are the mean and variance of a four sided die?
[Solution: The mean is 2.5 and the variance is 15/12.]
Exercise: What are the mean and variance of an eight sided die?
[Solution: The mean is 4.5 and the variance is 63/12.]
Exercise: What are the mean and variance of a die with S sides?
[Solution: The mean is (S+1)/2 and the variance is (S² - 1)/12. The mean is the sum of the
integers from 1 to S divided by S. The former is S(S+1)/2, thus the mean is (S+1)/2.
The second moment is the sum of the squares of the integers from 1 to S divided by S.
The former is S(S+1)(2S+1)/6, thus the second moment is (S+1)(2S+1)/6.
Then the variance is: (S+1)(2S+1)/6 - {(S+1)/2}² = (S² - 1)/12.]
One computes the Expected Value of the Process Variance by weighting together the process
variances for each type of risk using as weights the chance of having each type of risk96.
In this case the Expected Value of the Process Variance is:
(60%)(15/12) + (30%)(35/12) + (10%)(63/12) = 25.8 /12 = 2.15.
In symbols this sum is:
P(4 -sided)VAR[X | 4-sided] + P(6 -sided)VAR[X | 6-sided] +P(8-sided)VAR[X | 8-sided].
96 In situations where the types of risks are parameterized by a continuous distribution, as for example in the
Gamma-Poisson frequency process, one will take an integral rather than a sum.

Using the fact that a die with S sides has Process Variance of (S² - 1)/12:
Type of Die    A Priori Chance of this Type of Die    Process Variance of this Type of Die
4-sided        0.6                                    1.25000
6-sided        0.3                                    2.91667
8-sided        0.1                                    5.25000
Average                                               2.15

Note that this is the Expected Value of the Process Variance for one observation of the risk process;
i.e., one roll of a die.
Variance of the Hypothetical Means, Multi-sided Die Example:
One can also compute the Variance of the Hypothetical Means by the usual technique; compute the
first and second moments of the hypothetical means.
In this case:
Type of Die    A Priori Chance of this Type of Die    Mean for this Type of Die    Square of Mean of this Type of Die
4-sided        0.6                                    2.5                          6.25
6-sided        0.3                                    3.5                          12.25
8-sided        0.1                                    4.5                          20.25
Average                                               3.0                          9.45
The Variance of the Hypothetical Means is the second moment minus the square of the (overall)
mean = 9.45 - 3² = 0.45. Note that this is the variance for a single observation; i.e., one roll of a die.
Total Variance:
One can compute the total variance of the observed results if one were to repeat this experiment
repeatedly. One need merely compute the chance of each possible outcome.
In this case there is a 60% / 4 = 15% chance that a 4-sided die will be picked and then a 1 will be
rolled. Similarly, there is a 30% / 6 = 5% chance that a 6-sided die will be selected and then a 1 will
be rolled. There is a 10% / 8 = 1.25% chance that an 8-sided die will be selected and then a 1 will
be rolled. The total chance of a 1 is therefore: 15% + 5% + 1.25% = 21.25%.

A: Roll    Probability due    Probability due    Probability due    E: A Priori    Col. A times    Square of Col. A
of Die     to 4-sided die     to 6-sided die     to 8-sided die     Probability    Col. E          times Col. E
1          0.15               0.05               0.0125             0.2125         0.2125          0.2125
2          0.15               0.05               0.0125             0.2125         0.4250          0.8500
3          0.15               0.05               0.0125             0.2125         0.6375          1.9125
4          0.15               0.05               0.0125             0.2125         0.8500          3.4000
5                             0.05               0.0125             0.0625         0.3125          1.5625
6                             0.05               0.0125             0.0625         0.3750          2.2500
7                                                0.0125             0.0125         0.0875          0.6125
8                                                0.0125             0.0125         0.1000          0.8000
Sum        0.6                0.3                0.1                1.0000         3.0             11.6
The mean is 3 (the same as computed above) and the second moment is 11.6.
Therefore, the total variance is 11.6 - 3² = 2.6.
As is generally true, in this case, EPV + VHM = 2.15 + 0.45 = 2.6 = Total Variance.
Thus the total variance has been split into two pieces.
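The following minimal Python sketch redoes the die-example bookkeeping above (the 60%/30%/10% mix of 4-, 6-, and 8-sided dice) and confirms that EPV + VHM equals the total variance:

probs = {4: 0.6, 6: 0.3, 8: 0.1}
epv = sum(p * (s*s - 1) / 12 for s, p in probs.items())                 # 2.15
mean = sum(p * (s + 1) / 2 for s, p in probs.items())                   # 3.0
vhm = sum(p * ((s + 1) / 2)**2 for s, p in probs.items()) - mean**2     # 0.45
# total variance computed directly from the a priori distribution of a single roll
face_prob = {}
for s, p in probs.items():
    for face in range(1, s + 1):
        face_prob[face] = face_prob.get(face, 0) + p / s
total = sum(pr * f*f for f, pr in face_prob.items()) - mean**2          # 2.6
print(epv, vhm, epv + vhm, total)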
Estimating Future Die Rolls:
Your friend has picked at random a multi-sided die. He then rolled the die and told you the result.
You are to estimate the result when he rolls that same die again.
In this case, K = EPV / VHM = 2.15 / 0.45 = 4.778 = 43/9, where the EPV and VHM were
calculated previously for a single die roll.
In this example, the prior mean is: (60%)(2.5) + (30%)(3.5) + (10%)(4.5) = 3.0.
Thus the new estimate = Z(observation) + (1 - Z)(3) = 3 + Z (observation - 3).
If the credibility assigned to the observation is larger, then our new estimate is more responsive to
the observation, and vice versa.
Z = N/(N + K), where N is the number of observations.
For 1 observation, Z = 1/(1 + 4.778) = 0.1731 = 9 /52
= 0.45 / (0.45 + 2.15) = VHM/(VHM + EPV).
Thus in this case if we observe a roll of a 5, then the new estimate is:
(0.1731)(5) + (1 - 0.1731)(3) = 3.3462.
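Continuing the sketch, here is a self-contained snippet for the credibility estimate after observing a single roll of 5; the EPV, VHM, and prior mean are the values just computed in the text:

epv, vhm, prior_mean = 2.15, 0.45, 3.0
k = epv / vhm                            # 4.778
z = 1 / (1 + k)                          # 0.1731, i.e. VHM / Total Variance
print(k, z, z*5 + (1 - z)*prior_mean)    # about 4.778, 0.1731, 3.3462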

Estimates Are In Balance:
The Buhlmann Credibility estimate is a linear function of the observation.
When observing one die roll,
estimate = 0.1731(observation) + (1 - 0.1731)(3) = 0.1731(observation) + 2.4807:
Observation     1       2       3       4       5       6       7       8
New Estimate    2.6538  2.8269  3       3.1731  3.3462  3.5193  3.6924  3.8655

Exercise: Calculate the a priori chances of the possible observations for this multi-sided die
example, with a 60% chance of a 4-sided die, 30% chance of a 6-sided die, and a 10% chance of an
8-sided die.
[Solution:
Observation             1       2       3       4       5       6       7       8
A Priori Probability    0.2125  0.2125  0.2125  0.2125  0.0625  0.0625  0.0125  0.0125  ]

Exercise: Calculate the weighted average of the Buhlmann Credibility estimates, using as weights
the a priori chances of the possible observations for this multi-sided die example.
[Solution: (0.2125)(2.6538 + 2.8269 + 3 + 3.1731) + (0.0625)(3.3462 + 3.5193) +
(0.0125)(3.6924 + 3.8655) = 3. ]
The weighted average of the Buhlmann Credibility estimates using as weights the a priori chances
of the possible observations is equal to the a priori mean of 3.
Let μ be the a priori mean. In general, the Buhlmann Credibility estimates are:
(observation - μ)Z + μ.
If as here Z is the same for each of the estimates being averaged,97 then the average over all the
possible observations is:
(average of possible observations - μ)Z + μ = (μ - μ)Z + μ = μ.
Thus, if exposures do not vary, the estimates that result from Buhlmann Credibility are in
balance:
The weighted average of the Buhlmann Credibility estimates over the possible
observations for a given situation, using as weights the a priori chances of the possible
observations, is equal to the a priori mean.
97

This might not be the case, if rather than averaging over the possible observations for a single situation, one were
averaging over the actual observations for classes with different numbers of exposures when doing classification
ratemaking or over commercial insureds when doing experience rating. See the method that preserves total
losses in the section on varying exposures in Mahler's Guide to Empirical Bayesian Credibility.

Buhlmann Credibility versus Bayesian Analysis:
The above results of applying Buhlmann Credibility in the case of observing a single die roll,
differ from those previously obtained for Bayesian Analysis.
Observation                      1       2       3       4       5       6       7       8
Buhlmann Credibility Estimate    2.6538  2.8269  3       3.1731  3.3462  3.5193  3.6924  3.8655
Bayesian Analysis Estimate       2.853   2.853   2.853   2.853   3.7     3.7     4.5     4.5

[Graph: the Bayesian Analysis estimates (dots) and the Buhlmann Credibility estimates (a straight line), plotted against the observation; the vertical axis is the estimate, running from 3.0 to 4.5.]

Using Buhlmann Credibility: Estimate = Z(observation) + (1 - Z)(a priori mean).


The slope of this line is Z, and 0 ≤ Z ≤ 1. Therefore, the Buhlmann Credibility estimates are on a
straight line, with nonnegative slope not exceeding one.
The straight line formed by the Buhlmann Credibility estimates seems to approximate the Bayesian
Analysis Estimates (dots). As discussed subsequently, the Buhlmann Credibility Estimates are the
weighted least squares line fit to the Bayesian Estimates.
The a priori mean is 3. Therefore, if one observes a 3, the estimate from Buhlmann credibility is also
3. In general, for an observation equal to the a priori mean, the estimate using Buhlmann Credibility
is equal to the observation. This is usually not the case for Bayesian Analysis.
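Here is a short Python sketch that reproduces the comparison table above: it computes the Bayes estimate for each possible first roll by updating the 60%/30%/10% prior, and the Buhlmann estimate from the straight-line formula with Z = 9/52 (all inputs are from the text).

probs = {4: 0.6, 6: 0.3, 8: 0.1}
z, prior_mean = 9/52, 3.0
for obs in range(1, 9):
    post = {s: p / s for s, p in probs.items() if obs <= s}   # prior times likelihood of the roll
    norm = sum(post.values())
    bayes = sum((w / norm) * (s + 1) / 2 for s, w in post.items())
    buhlmann = z*obs + (1 - z)*prior_mean
    print(obs, round(bayes, 3), round(buhlmann, 4))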

Also note that in the case of the application of credibility, the estimate is always between
the a priori estimate and the observation. This is not necessarily true in general for Bayesian
Analysis.
As discussed in a previous section, the estimates from Bayes Analysis are within the range of
hypotheses, in this case from 2.5, the mean of a 4-sided die, to 4.5, the mean of an eight-sided die.
This is not necessarily true in general for estimates from Buhlmann Credibility.
Exercise: Calculate the weighted average of the Bayesian Analysis Estimates, using as weights the
a priori chances of the possible observations for this multi-sided die example.
[Solution: (4)(0.2125)(2.853) + (2)(0.0625)(3.7) + (2)(0.0125)(4.5) = 3. ]
The weighted average of the Bayesian Analysis estimates using as weights the a priori chances of
the possible observations is equal to the a priori mean of 3. As discussed in a previous section,
Bayesian Analysis estimates are always in balance.
For either Bayes Analysis or Buhlmann Credibility, one starts with different risk types, and an a
priori probability for each risk type. One can apply one or the other to the same setup.98
Buhlmann Credibility is the weighted least squares linear approximation to Bayes Analysis.
Note that in the multi-sided die example, if we observe a 7 or an 8, then we know it must have been
the eight-sided die with mean 4.5. This is the Bayes estimate, but not the estimate from Buhlmann
Credibility. Being a linear approximation, Buhlmann Credibility may have some logical deficiencies
in simple models such as the multi-sided die example.99
In Buhlmann Credibility, one uses the initial given probabilities to calculate the EPV and VHM.
One calculates the EPV, VHM, and K prior to knowing the particular observation!
One does not use Bayes Theorem to get the posterior probabilities, and then use them to calculate
the EPV and VHM! Use either the more exact Bayes Analysis or its approximation Buhlmann
Credibility, not both.
If an exam question can be done by either method, and they do not specify which, use the more
exact Bayes Analysis rather than its approximation Buhlmann Credibility.100

98

On some exams they have had two questions with the same set up and asked you to apply Bayes Analysis and
Buhlmann Credibility in the two separate questions.
99
While this kind of thing can happen in simple models used on the exam or for teaching purposes, it would be
extremely unlikely to occur in a practical application of Buhlmann Credibility.
100
For some situations discussed in Mahler's Guide to Conjugate Priors, the two methods give the same result.

Multiple Die Rolls:
So far we have computed variances in the case of a single roll of a die. One can also compute
variances when one is rolling more than one die.101 There are a number of somewhat different
situations which lead to different variances, which lead in turn to different credibilities.
Exercise: Each actuary attending a CAS Meeting rolls 2 multi-sided dice.
One die is 4-sided and the other is 6-sided.
Each actuary rolls his two dice and reports the sum.
What is the expected variance of the results reported by all the actuaries?
[Solution: The variance is the sum of that for a 4-sided and 6-sided die.
Variance = (15/12) + (35/12) = 50/12 = 4.167.]
One has to distinguish the situation in this exercise where the types of dice rolled are known, from
one where each actuary is selecting dice at random. The latter introduces an additional source of
random variation, as shown in the following exercise.
Exercise: Each actuary attending a CAS Meeting independently selects 2 multi-sided dice. For
each actuary his two multi-sided dice are selected independently of each other, with each die having
a 60% chance of being 4-sided, a 30% chance of being 6-sided, and a 10% chance of being 8-sided.
Each actuary rolls his two dice and reports the sum.
What is the expected variance of the results reported by all the actuaries?
[Solution: The total variance is the sum of the EPV and VHM. For each actuary let his 2 dice be A
and B. Let the parameter (number of sides) for A be α and that for B be β.
Note that A only depends on α, while B only depends on β, since the two dice were selected
independently.
Then EPV = E[Var[A+B | α, β]] = E[Var[A | α, β]] + E[Var[B | α, β]] =
E[Var[A | α]] + E[Var[B | β]] = 2.15 + 2.15 = (2)(2.15) = 4.30.
VHM = Var[E[A+B | α, β]] = Var[E[A | α, β] + E[B | α, β]] =
Var[E[A | α]] + Var[E[B | β]] = (2)(0.45) = 0.90.
Where I have used the fact that E[A | α] and E[B | β] are independent and thus their variances add.
Total variance = EPV + VHM = 4.3 + 0.9 = 5.2. ]
This previous exercise is subtly different from a situation where the two dice selected by a given
actuary are always of the same type.
101

These dice examples can help one to think about insurance situations where one has more than one observation
or insureds of different sizes.

Exercise: Each actuary attending a CAS Meeting selects two multi-sided dice, both of the same
type. For each actuary, his multi-sided dice have a 60% chance of being 4-sided, a 30% chance of
being 6-sided, and a 10% chance of being 8-sided.
Each actuary rolls his dice and reports the sum.
What is the expected variance of the results reported by all the actuaries?
[Solution: The total variance is the sum of the EPV and VHM. For each actuary let his two die rolls be
A and B. Let the parameter (number of sides) for his dice be θ, the same for both dice.
Then EPV = E[Var[A+B | θ]] = E[Var[A | θ]] + E[Var[B | θ]] = 2.15 + 2.15 = (2)(2.15) = 4.30.
The VHM = Var[E[A+B | θ]] = Var[2 E[A | θ]] = (2²) Var[E[A | θ]] = (4)(0.45) = 1.80.
Where we have used the fact that E[A | θ] and E[B | θ] are the same.
Total variance = EPV + VHM = 4.3 + 1.8 = 6.1. Alternately, Total Variance =
(N)(EPV for one observation) + (N²)(VHM for one observation) = (2)(2.15) + (2²)(0.45) = 6.1.]
Note that this exercise is the same mathematically as if each actuary chose a single die and reported
the sum of rolling his die twice. Contrast this exercise with the previous one in which each actuary
chose two dice, with the type of each die independent of the other.
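A small Monte Carlo sketch in Python contrasting the two exercises (independently chosen dice versus two dice forced to be the same type); with enough trials the sample variances come out near 5.2 and 6.1. The 60%/30%/10% mix is taken from the text; the trial count is my own choice.

import random
sides, weights = [4, 6, 8], [0.6, 0.3, 0.1]
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m)**2 for x in xs) / len(xs)
independent, same_type = [], []
for _ in range(200000):
    a, b = random.choices(sides, weights, k=2)          # two independently selected dice
    independent.append(random.randint(1, a) + random.randint(1, b))
    s = random.choices(sides, weights, k=1)[0]          # both dice of the same type
    same_type.append(random.randint(1, s) + random.randint(1, s))
print(variance(independent), variance(same_type))       # roughly 5.2 and 6.1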
Let's go over these ideas again with each actuary rolling 3 dice. For example, assume you sum the
independent rolls of 2 four-sided dice and 1 six-sided die. The mean result is:
(2)(2.5) + (1)(3.5) = 8.5. The variance is: (2)(15/12) + (1)(35/12) = 65/12 = 5.4167.102
In contrast if one selected 3 dice at random, with a 2/3 chance that each one was four-sided and a 1/3
that it was six-sided, the variance of the sum of rolling the dice would be larger than in the previous
situation where we know the type of dice.
Type of Die    A Priori Chance of this Type of Die    Mean for this Type of Die    Square of Mean of this Type of Die    Process Variance of this Type of Die
4-sided        0.6667                                 2.5                          6.25                                  1.2500
6-sided        0.3333                                 3.5                          12.25                                 2.9167
Average                                               2.8333                       8.25                                  1.8056
The Expected Value of the Process Variance of a single die is 1.8056.
For the roll of three dice it is (3)(1.8056) = 5.4167.
However, now that we are picking the dice at random we need to add the VHM.
The Variance of the Hypothetical Means for the selection of one die is: 8.25 - 2.8333² = 0.222.
102

The variance of a four-sided die is 15/12. The variance of a six-sided die is 35/12.

For the selection of three independent dice, the variance of the means is multiplied by 3;
the VHM is: (3)(0.222) = 0.666.
One can also compute this variance by going back to first principles and listing all the possibilities:
Number of 4-sided Dice    Number of 6-sided Dice    A Priori Chance    Hypothetical Mean    Square of Hypothetical Mean
0                         3                         0.0370             10.5                 110.25
1                         2                         0.2222             9.5                  90.25
2                         1                         0.4444             8.5                  72.25
3                         0                         0.2963             7.5                  56.25
Overall                                                                8.5                  72.9167
For example, the chance of one 4-sided die and two 6-sided dice is (3)(2/3)(1/3)² = 2/9.
The Variance of the Hypothetical Means is: 72.9167 - 8.5² = 0.6666.
Thus the total variance is: 5.417 + 0.666 = 6.083, which we note is higher than the 5.417 variance
calculated when we know we have 2 four-sided and 1 six-sided die.
Instead of making independent selections, you could pick dice which are always all of the same
type. Assume you have three urns, A, B, and C. Urns A and B each contain 4-sided dice. Urn C
contains 6-sided dice. Pick an urn at random, and then select 3 dice from the selected urn. Then there
is a 2/3 chance that the selected dice all were four-sided and a 1/3 chance that they all were
six-sided.
In this situation, the variance of the sum of rolling the dice would be even larger than in the previous
situation where the selection of each die was instead independent. The Expected Value of the
Process Variance of a single die is 1.8056. For the roll of three dice it is (3)(1.8056) = 5.4167, the
same as above. The Variance of the Hypothetical Means for the selection of one die is 2/9. For the
selection of three identical dice, the hypothetical means are each multiplied by 3, so their variance is
multiplied by 9. Thus the VHM = (3²)(2/9) = 2. Thus the total variance is: 5.417 + 2 = 7.417.
Mathematically, this situation is the same as one in which a single die is selected and then rolled three
times. This is the situation that is common in credibility questions. In that case the VHM for the sum of
the three rolls is: (3²)(2/9) = 2 and the total variance is:
EPV + VHM = 5.417 + 2 = 7.417.
The Total Variance = (3)(EPV single die roll) + (3²)(VHM single die roll).
The VHM has increased as per N², the square of the number of observations, while the EPV goes
up only as N.

Total Variance of N observations =
(N)(EPV for one observation) + (N²)(VHM for one observation).
This is the assumption behind the Buhlmann Credibility formula: Z = N / (N+K). Note that the
Buhlmann Credibility parameter K is the ratio of the EPV to VHM for a single die.
The Buhlmann Credibility formula is set up to automatically adjust the credibility for the number of
observations N.
Number of Observations:
Exercise: For the multi-sided die example, three rolls from a random die are: 2, 5 and 1.
Use Buhlmann Credibility to estimate the next roll from the same die.
[Solution: The average observation is: (2 + 5 + 1)/3 = 8/3.
There are three observations, so that Z = 3/(3 + K) =3/(3 + 4.778) = 38.6%.
The a priori mean is 3.
Thus the estimate is: (38.6%)(8/3) + (61.4%)(3) = 2.87.
Comment: We have previously calculated the Buhlmann Credibility Parameter, K, for this situation.
Since K does not depend on the observations, there is no need to calculate it again.]
It makes sense to assign more credibility to more rolls of the selected die, since as we gather more
information, we should be able to get a better idea of which type of die has been chosen.
If one has N observations of the risk process one assigns Buhlmann Credibility of:
Z = N / (N + K). For N = K, Z = K/(K + K) = 1/2.
Therefore, the Buhlmann Credibility Parameter, K, is the number of observations needed
for 50% credibility.
For the Buhlmann Credibility formula, as N → ∞, Z → 1, but Buhlmann Credibility never quite
reaches 100%. In this example with K = 4.778:
# of Observations    1      2      3      5      10     25     100    1000
Credibility          17.3%  29.5%  38.6%  51.1%  67.7%  84.0%  95.4%  99.5%
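A minimal Python sketch reproducing this table and the three-roll estimate from the exercise above (all inputs are from the text):

k, prior_mean = 4.778, 3.0
n_obs, xbar = 3, (2 + 5 + 1) / 3
z = n_obs / (n_obs + k)
print(round(z, 3), round(z*xbar + (1 - z)*prior_mean, 2))    # about 0.386 and 2.87
for n in (1, 2, 3, 5, 10, 25, 100, 1000):
    print(n, round(100 * n / (n + k), 1))                    # the credibility table above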

Below are two examples, corresponding to K = 100 and K = 400, of how Buhlmann Credibility
varies with the number of observations:
[Graph: Z = N/(N + K) plotted against the number of observations N, from 0 to 5000, for K = 100 and for K = 400; both curves rise toward 1, with the K = 400 curve rising more slowly.]
Below is shown the difference in the resulting credibilities for these two different values, 100 and
400, of the Buhlmann Credibility Parameter, K:
[Graph: the difference between the two credibility curves as a function of the number of observations, from 0 to 5000.]
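A few lines of Python regenerate the numbers behind these graphs (matplotlib could be used to plot them; here they are simply printed):

for n in (100, 200, 400, 1000, 2000, 5000):
    z100 = n / (n + 100)
    z400 = n / (n + 400)
    print(n, round(z100, 3), round(z400, 3), round(z100 - z400, 3))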
If we add up N independent rolls of the same die, the process variances add.
So if σ² is the expected value of the process variance of a single die, then Nσ² is the expected
value of the process variance of the sum of N identical dice.
The process variance of one 6-sided die is 35/12, while the process variance of the sum of ten
6-sided dice is 350/12.
In contrast, if τ² is the variance of the hypothetical means of one die roll, then the variance of the
hypothetical means of the sum of N rolls of the same die is N²τ². This follows from the fact that each
of the means is multiplied by N, and that multiplying by a constant multiplies the variance by the
square of that constant.
Thus as N increases, the variance of the hypothetical means of the sum goes up as N², while the
process variance goes up only as N. Based on the case with one roll, we expect the credibility to
be given by Z = VHM / Total Variance = VHM / (VHM + EPV) =
N²τ² / (N²τ² + Nσ²) = N / (N + σ²/τ²) = N / (N + K), where K = σ²/τ² = EPV / VHM, with the EPV
and VHM each for a single die.103
In general, one computes the EPV and VHM for a single observation of the risk process,
and then plug into the formula for Buhlmann Credibility the number of observations N.
If one is estimating claim frequencies, pure premiums, or aggregate losses,
then N is in exposures.
If one is estimating claim severities, then N is in number of claims.
N is in the units of whatever is in the denominator of the quantity one is estimating.
Claim frequency = (number of claims) / (number of exposures).
Claim severity = (dollars of loss) / (number of claims).
Pure Premium = (dollars of loss) / (number of exposures).

103

In a subsequent section Z = VHM / Total Variance, is derived in the case of N dice or N observations.

Assumptions Underlying Z = N / (N+K):104
There are a number of important assumptions underlying the formula Z = N / (N+K) where
K = EPV / VHM . While these assumptions usually hold on the exam, they do not hold in some real
world applications.105 These assumptions are:
1. The complement of credibility is given to the overall mean.
2. The credibility is determined as the slope of the weighted least squares line
to the Bayesian Estimates.
3. The risk parameters and risk process do not shift over time.106
4. The expected value of the process variance of the sum of N observations increases as N.
Therefore the expected value of the process variance of the average of N observations
decreases as 1/N.
5. The variance of the hypothetical means of the sum of N observations increases as N2 .
Therefore the variance of the hypothetical means of the average of N observations is
independent of N.
In addition, we must be careful that an insured has been picked at random, that we observe that
insured, and that we then attempt to make an estimate of the future observation of that same
insured. If instead one goes back and chooses a new insured at random, then the information
contained in the observation has been lost.
There is an exception, when they talk about picking insureds from a class.107
One class of policyholders is selected at random from the book. Nine policyholders are
selected at random from this class and are observed to have produced a total of seven
claims. Five additional policyholders are selected at random from the same class. Determine the
Bühlmann credibility estimate for the total number of claims for these five policyholders. 108
Each insured in the class is assumed to have the same risk process, so we can use the experience
from the first set of insureds to predict the future for the second set of insureds from the same class.
104 These assumptions are also discussed in the subsequent section on Least Squares Credibility.
105 See for example, Howard Mahler's discussion of Glenn Meyers' "An Analysis of Experience Rating," PCAS 1987,
or "Credibility with Parameter Uncertainty, Risk Heterogeneity, and Shifting Risk Parameters," by Howard C. Mahler,
PCAS 1998.
106 In the Philbrick Target Shooting example discussed in a subsequent section, we assume the targets are fixed and
that the skill of the marksmen does not change over time.
107 See for example, 4, 11/00, Q. 38 or 4, 11/02, Q. 32.
108 Language quoted from 4, 11/00, Q. 38.

Problems:
8.1 (1 point) The Expected Value of the Process Variance is 50.
The Variance of the Hypothetical Means is 5.
How much Buhlmann Credibility is assigned to 30 observations of this risk process?
A. less than 60%
B. at least 60% but less than 65%
C. at least 65% but less than 70%
D. at least 70% but less than 75%
E. at least 75%
8.2 (1 point) If 4 observations are assigned 30% Buhlmann Credibility, what is the value of the
Buhlmann Credibility parameter K?
A. less than 9.0
B. at least 9.0 but less than 9.2
C. at least 9.2 but less than 9.4
D. at least 9.4 but less than 9.6
E. at least 9.6
8.3 (3 points) Your friend picked at random one of three multi-sided dice. He then rolled the die and
told you the result. You are to estimate the result when he rolls that same die again. One of the three
multi-sided dice has 4 sides (with 1, 2, 3 and 4 on them), the second die has the usual 6 sides
(number 1 through 6), and the last die has 8 sides (with numbers 1 through 8). For a given die each
side has an equal chance of being rolled, in other words the die is fair. Assume the first roll was a
seven. Use Buhlmann Credibility to estimate the next roll of the same die. Hint: A die with S sides
has mean: (S+1)/2, and variance: (S² - 1)/12.
A. 4.1        B. 4.2        C. 4.3        D. 4.4        E. 4.5

Use the following information for the next two questions:


There are three large urns, each filled with so many balls that you can treat it as if there are an infinite
number. Urn 1 contains balls with "zero" written on them. Urn 2 has balls with "one" written on them.
The final Urn 3 is filled with 50% balls with "zero" and 50% balls with "one". An urn is chosen at
random and three balls are selected.
8.4 (2 points) If all three balls have zero written on them, use Buhlmann Credibility to estimate the
expected value of another ball picked from that urn.
A. 0.05
B. 0.06
C. 0.07
D. 0.08
E. 0.09
8.5 (2 points) If two balls have zero written on them and one ball has one written on it, use
Buhlmann Credibility to estimate the expected value of another ball picked from that urn.
A. 0.30
B. 0.35
C. 0.40
D. 0.45
E. 0.50

8.6 (1 point) Which of the following statements are true with respect to the estimates from Buhlmann
Credibility?
1. They are the linear least squares approximation to the estimates from Bayesian Analysis.
2. They are between the observation and the a priori mean.
3. They are within the range of hypotheses.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3
E. None of A, B, C, or D
8.7 (2 points) There are two types of urns, each with many balls labeled $10 and $20.
Type of Urn    A Priori Chance of This Type of Urn    Percentage of $10 Balls    Percentage of $20 Balls
I              70%                                    80%                        20%
II             30%                                    60%                        40%
An urn is selected at random, and you observe a total of $60 on 5 balls drawn from that urn at
random. Using Buhlmann Credibility, what is estimated value of the next ball drawn from that urn?
A. 12.1
B. 12.2
C. 12.3
D. 12.4
E. 12.5
8.8 (3 points) A die is selected at random from an urn that contains four six-sided dice with the
following characteristics:
                  ------------ Number of Faces ------------
Number on Face    Die A    Die B    Die C    Die D
1                 3        1        1        1
2                 1        3        1        1
3                 1        1        3        1
4                 1        1        1        3
The first four rolls of the selected die yielded the following in sequential order: 3, 4, 2, and 4.
Using Buhlmann Credibility, what is the expected value of the next roll of the same die?
A. Less than 2.8
B. At least 2.8, but less than 2.9
C. At least 2.9, but less than 3.0
D. At least 3.0, but less than 3.1
E. 3.1 or more
8.9 (2 points) The a priori mean annual number of claims per policy for a block of insurance policies
is 10.
A randomly selected policy had 15 claims in Year 1 and 11 claims in Year 2.
Based on these two years of data, the Bühlmann credibility estimate of the number of claims on the
selected policy in Year 3 is 10.75.
In Year 3 this policy had 22 claims.
Calculate the Bühlmann credibility estimate of the number of claims on the selected policy in Year 4
based on the data for Years 1, 2 and 3.
(A) 11.0
(B) 11.5
(C) 12.0
(D) 12.5
(E) 13.0

8.10 (3 points) A teacher gives to her class a final exam with 100 available points. She finds that
the scores very closely approximate a normal distribution with a mean of 56 and standard deviation
of 8. The teacher assumes that the score actually achieved by a student on the exam is normally
distributed around his "true competence" with the course material. Based on her past experience
the teacher estimates the standard deviation of this distribution to be 4. The teacher uses Buhlmann
credibility to estimate, for each student, his "true competence" from his observed exam score. The
teacher wishes to pass those students whose "true competence" she estimates to be greater than
or equal to 65. In which range should the passing grade be?
A. Less than 65
B. At least 65, but less than 67
C. At least 67, but less than 69
D. At least 69, but less than 71
E. 71 or more
8.11 (3 points) There are three dice:
Die A: 2 faces labeled 0, 4 faces labeled 1.
Die B: 4 faces labeled 0, 2 faces labeled 1.
Die C: 5 faces labeled 0, 1 face labeled 1.
A die is picked at random. The die is rolled 8 times and 2 ones are observed.
Using Buhlmann Credibility what is the expected value of the next roll of that same die?
A. Less than 0.25
B. At least 0.25 but less than 0.26
C. At least 0.26 but less than 0.27
D. At least 0.27 but less than 0.28
E. At least 0.28
8.12 (3 points) There are four types of urns with differing percentages of black balls.
Each type of urn has a differing chance of being picked.
Type of Urn    A Priori Probability    Percentage of Black Balls
I              40%                     4%
II             30%                     8%
III            20%                     12%
IV             10%                     16%
An urn is chosen and fifty balls are drawn from it, with replacement; no black balls are drawn.
Use Buhlmann credibility to estimate the probability of picking a black ball from the same urn.
A. less than 3%
B. at least 3% but less than 4%
C. at least 4% but less than 5%
D. at least 5% but less than 6%
E. at least 6%

8.13 (3 points) Smith has selected a handful of ordinary six-sided dice. You do not know how many
dice Smith has selected, but you assume it is either 10, 11, 12 or 13, with the following probabilities:
Number of Dice    A Priori Probability
10                20%
11                40%
12                30%
13                10%
Smith rolls this same number of dice five times, without you seeing how many dice he rolled.
Smith reports totals of: 54, 44, 41, 48 and 47.
Using Buhlmann Credibility, what is your estimate of the average sum you can expect if Smith rolls
this same number of dice again?
Hint: A six-sided die has a mean of 3.5 and a variance of 35/12.
A. Less than 43.0
B. At least 43.0, but less than 43.5
C. At least 43.5, but less than 44.0
D. At least 44.0, but less than 44.5
E. 44.5 or more.
8.14 (2 points) You are given the following:
A portfolio consists of a number of independent insureds.
Losses for each insured for each exposure period are one of three values: , , or .
The probabilities for , , and vary by insured, but are fixed over time.
The average probabilities for , , and over all insureds are 30%, 60%, and 10%, respectively.
One insured is selected at random from the portfolio and its losses are observed for one exposure
period.
Estimates of the same insured's expected losses for the next exposure period are as follows:
Observed Bayesian Analysis
Buhlmann
Losses
Estimate
Credibility Estimate

100

150

140

260

300

Determine y.
A. Less than 85
B. At least 85, but less than 90
C. At least 90, but less than 95
D. At least 95, but less than 100
E. At least 100

8.15 (3 points) You are given:
(i) Laurence uses classical credibility.
(ii) Hans uses the Bühlmann credibility formula.
(iii) The full credibility standard used by Laurence is eight times the Bühlmann Credibility Parameter,
K, used by Hans.
For most amounts of data, the amounts of credibility assigned by Laurence and Hans differ.
However, for two non-zero amounts of data, they assign the same credibility.
Which of the following are the credibilities assigned in these two cases?
(A) 35% and 65%
(B) 30% and 70%
(C) 25% and 75%
(D) 20% and 80%
(E) 15% and 85%
Use the following information for the next two questions:
The a priori estimate is 25.
Individual observations can extend from 13 to 49.
The Buhlmann Credibility parameter, k, is 11.
There are 4 observations.
8.16 (1 point) What is the minimum possible value of the Buhlmann credibility estimate of the next
observation?
A. Less than 21
B. At least 21, but less than 22
C. At least 22, but less than 23
D. At least 23, but less than 24
E. At least 24
8.17 (1 point) What is the maximum possible value of the Buhlmann credibility estimate of the next
observation?
A. Less than 31
B. At least 31, but less than 32
C. At least 32, but less than 33
D. At least 33, but less than 34
E. At least 34

8.18 (3 points) For several types of risks, you are given:
(i) The expected number of claims in a year for these risks ranges from 0.5 to 2.0.
(ii) The number of claims follows a Binomial distribution with m = 6 for each risk.
During Year 1, n claims are observed for a randomly selected risk.
For the same risk, both Bayes and Bühlmann credibility estimates of the number of claims in
Year 2 are calculated for n = 0, 1, 2, ... , 6. Which graph represents these estimates?
[Answer choices (A) through (E): five graphs, each plotting the Bayes and Buhlmann estimates of the number of claims in Year 2 (vertical axis, roughly 0.5 to 2.5) against the number of claims observed in Year 1 (horizontal axis, 0 to 6).]
8.19 (2 points) For a group of policies, you are given:
(i) The annual loss on an individual policy follows a gamma distribution with parameters α = 4 and θ.
(ii) The prior distribution of θ has mean 600.
(iii) A randomly selected policy had losses of 1400 in Year 1 and 1900 in Year 2.
(iv) Loss data for Year 3 was misfiled and unavailable.
(v) Based on the data in (iii), the Bhlmann credibility estimate of the loss on the selected policy
in Year 4 is 1800.
(vi) After the estimate in (v) was calculated, the data for Year 3 was located.
The loss on the selected policy in Year 3 was 2763.
Calculate the Bühlmann credibility estimate of the loss on the selected policy in Year 4 based
on the data for Years 1, 2, and 3.
(A) Less than 1850
(B) At least 1850, but less than 1950
(C) At least 1950, but less than 2050
(D) At least 2050, but less than 2150
(E) At least 2150
8.20 (3 points) Use the following information for a portfolio of insureds:
(i) The frequency distribution has a parameter γ, which varies across the portfolio.
(ii) The mean frequency given γ is N(γ).
(iii) The distribution of N(γ) has a mean of 0.3 and a variance of 0.5.
(iv) The severity distribution has a parameter δ, which varies across the portfolio.
(v) The mean severity given δ is X(δ).
(vi) The distribution of X(δ) has a mean of 200 and a variance of 6000.
(vii) γ and δ vary independently across the portfolio.
(viii) Given γ and δ, frequency and severity are independent.
(ix) The overall variance of aggregate losses is 165,000.
Determine the Buhlmann Credibility Parameter for aggregate losses, K.
A. 2
B. 3
C. 4
D. 5
E. 6
8.21 (7 points) For a group of risks, you are given:
(i) The number of claims for each risk follows a binomial distribution with parameters m = 6 and q.
(ii) The values of q are equally likely to be 0.1, 0.3, or 0.6.
During Year 1, k claims are observed for a randomly selected risk.
For the same risk, both Bayesian and Bühlmann credibility estimates of the number of claims in Year
2 are calculated for k = 0, 1, 2, ... , 6.
Plot as a function of k these Bayesian and Bühlmann credibility estimates together on the same graph.

Use the following information for the next two questions:
At Dinah's Diner by the Shore, her lunchtime customers have the following joint distribution of
number of cups of coffee they drink and number of pieces of pie they eat:
                            Number of Pieces of Pie
Number of Cups of Coffee    0      1      2
0                           15%    10%    5%
1                           10%    20%    10%
2                           5%     10%    15%
Any given customer drinks the same number of cups of coffee each time they have lunch at Dinah's.
Burt has eaten a total of 4 pieces of pie with his last 3 lunches.
8.22 (3 points) Use Bayes Analysis to estimate the number of pieces of pie that Burt will eat with
his next lunch.
8.23 (3 points) Use Buhlmann credibility to estimate the number of pieces of pie that Burt will eat
with his next lunch.
Use the following information for the next two questions:
Your company offers an insurance product.
There is at most one claim a year per policyholder.
There are the following three equally likely types of policyholders:
Type    Frequency of Claim    Severity of Claim Given a Claim Occurs
                              Probability    Claim Size
1       20%                   80%            $1,500
                              20%            $1,000
2       40%                   50%            $1,500
                              50%            $1,000
3       80%                   10%            $1,500
                              90%            $1,000
You are also given the following 7 years of claims history for a policyholder named Jim:
Year      1    2        3    4        5    6        7
Losses    0    $1000    0    $1000    0    $1500    0
8.24 (4 points) Use Buhlmann Credibility to estimate Jim's losses in year 8.
A. 503        B. 508        C. 513        D. 518        E. 523
8.25 (3 points) Use Bayes Analysis to estimate Jim's losses in year 8.
A. 500        B. 505        C. 510        D. 515        E. 520

Use the following information for the following 2 questions:
There are two classes of insureds.
70% of policyholders are in group A, while 30% of policyholders are in group B.
The size of claims is either 1000 or 2000.
In group A, 80% of the claims are of size 1000.
In group B, 50% of the claims are of size 1000.
For a particular policyholder, you do not know what group it is from, but you observe the following
experience over four years:
Year    Number of Claims    Claim Sizes
1       1                   1000
2       3                   1000, 1000, 2000
3       0                   ---
4       2                   2000, 2000
8.26 (2 points)
Using Buhlmann Credibility, estimate the future average severity for this policyholder.
A. 1350
B. 1370
C. 1390
D. 1410
E. 1430
8.27 (2 points)
Using Bayes Analysis, estimate the future average severity for this policyholder.
A. 1350
B. 1370
C. 1390
D. 1410
E. 1430

8.28 (3 points) Each insured has at most one claim a year.


                                                   Claim Size Distribution
Class    Prior Probability    Probability of a Claim    100    200
A        3/4                  1/5                       2/3    1/3
B        1/4                  2/5                       1/2    1/2
An insured is chosen at random and a single claim of size 200 has been observed during two years.
Use Buhlmann Credibility to estimate the future pure premium for this insured.
A. 40
B. 41
C. 42
D. 43
E. 44
8.29 (2 points) You are using Buhlmann Credibility to estimate annual frequency.
The estimated future frequency for an individual with 1 year claim free is 7.875%.
The estimated future frequency for an individual with 2 years claim free is 7.000%.
Determine the estimated future frequency for an individual with 3 years claim free.
A. 6.1%
B. 6.2%
C. 6.3%
D. 6.4%
E. 6.5%

8.30 (4, 11/82, Q.40) (1 point) In which of the following situations, should credibility be expected
to increase?
1. Larger quantity of observations.
2. Increase in the prior mean.
3. Increase in the variance of hypothetical means.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3

E. None of A, B, C, or D.

Use the following information for the next three questions:


A game of chance has been designed where you are dealt cards from a deck of cards chosen at
random from two available decks. The two decks are as follows:
Deck A: 1 suit from a regular deck of cards (13 cards).
Deck B: same as Deck A except the ace is missing (12 cards).
You will receive $10 for each ace or face card in your hand.
NOTE: A face card equals either a Jack, Queen, or King.
8.31 (2 points) What is the Expected Value of the Process Variance (for the dealing of a single
card)?
A. 16
B. 18
C. 20
D. 22
E. 24
8.32 (2 points) What is the Variance of the Hypothetical Means (for the dealing of a single card)?
A. 0.06
B. 0.08
C. 0.10
D. 0.12
E. 0.14
8.33 (4, 5/84, Q.49) (2 points) Assume that you have been dealt two cards with replacement and
both cards are either an ace or a face card (i.e., a $20 hand).
Using Buhlmann Credibility, what is the expected value of the next card drawn from the same deck
assuming the previous cards have been replaced?
A. Less than 2.7
B. At least 2.7, but less than 2.8
C. At least 2.8, but less than 2.9
D. At least 2.9, but less than 3.0
E. 3.0 or more

8.34 (4, 5/85, Q.39) (3 points) A teacher gives a final exam to his class. He finds that the scores
very closely approximate a normal distribution with a mean of 55 and standard deviation of 10. The
teacher assumes that the score actually achieved by a student on the exam is normally distributed
around his "true competence" with the course material. Based on his past experience the teacher
estimates the standard deviation of this distribution to be 5. The teacher uses Buhlmann credibility to
estimate, for each student, his "true competence" from his observed exam score. He wishes to
pass those students whose "true competence" he estimates to be greater than or equal to 70%.
In which range should the passing grade be?
Hint: Total variance = expected value of the process variance + variance of the hypothetical means.
A. 73
B. 75
C. 77
D. 79
E. 81
8.35 (4, 5/86 Q.40) (1 point) Which of the following statements are true?
1. If X is the random variable representing the aggregate loss amount, N the random variable
representing number of claims, and Yi the random variable representing the amount of
the ith claim, then VAR[X] = E[Yi] VAR[N] + VAR[Yi] (E[N])².
2. Using Buhlmann/Greatest Accuracy credibility methods, the amount of credibility assigned is a
decreasing function of the expected value of the process variance.
3. P(H and B) = P(H | B) P(B)
A. 3
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3
8.36 (4, 5/86, Q.43) (3 points) Jones has selected a handful of ordinary six-sided dice. You do not
know how many dice Jones has selected, but you assume it is either 11, 12 or 13, each with equal
probability. Jones rolls this same number of dice five times, without you seeing how many dice he
rolled. Jones reports totals of 45, 44, 51, 48 and 47. Using Buhlmann Credibility, what is your
estimate of the average sum you can expect if he rolls this same number of dice again?
Note: A six-sided die has a mean of 3.5 and a variance of 35/12.
A. 42
B. 43
C. 44
D. 45
E. 46
8.37 (4, 5/88, Q.37) (3 points) The universe for this problem consists of two urns. Each urn
contains several balls of equal size, weight and shape. Urn 1 contains 5 white, 10 black and 5 red
balls. Urn 2 contains 20 white, 8 black and 12 red balls. Each white ball is worth zero, each black
$100, and each red $500. An urn is selected at random, and you want to determine the expected
value of a ball drawn from the urn. Each observation consists of drawing a ball at random from the
urn, recording the result, and then replacing the ball. You make two observations from the same urn.
What Buhlmann credibility would you assign these two observations, for the purpose of
determining the expected value of a ball drawn from the urn?
A. Less than 0.0003
B. At least 0.0003, but less than 0.0004
C. At least 0.0004, but less than 0.0005
D. At least 0.0005, but less than 0.0006
E. 0.0006 or more
8.38 (4, 5/88, Q.42) (1 point) Which of the following statements are true?
1. The Buhlmann credibility estimate is the "best" linear approximation to the Bayesian estimate
of the pure premium.
2. A Bayesian estimate is the weighted average of the hypothesis and outcome.
3. If the number of claims for an individual are Poisson distributed with parameter q,
where q is a Gamma distributed random variable,
then the total number of accidents is also Gamma distributed.
A. 1
B. 2
C. 1, 2
D. 1, 3
E. 1, 2 and 3
8.39 (4, 5/89, Q.32) (1 point)
Which of the following will increase the credibility of your body of data?
1. A larger number of observations.
2. A smaller process variance.
3. A larger variance of the hypothetical means.
A. 1
B. 2
C. 1, 2
D. 2, 3
E. 1, 2, 3
8.40 (4, 5/89, Q.37) (2 points) Your friend selected at random one of two urns and then she pulled
a ball with the number 4 on it from the urn. Then, she replaced the ball in the urn. One of the urns
contains four balls numbered 1 through 4. The other urn contains six balls numbered 1 through 6.
Your friend will make another random selection of a ball from the same urn. Using the Buhlmann
credibility model what is the estimate of the expected value of the number on the ball?
A. Less than 2.925
B. At least 2.925, but less than 2.975
C. At least 2.975, but less than 3.025
D. At least 3.025, but less than 3.075
E. 3.075 or more
8.41 (4, 5/90, Q.35) (1 point) The underlying expected loss for each individual insured is assumed
to be constant over time. If the Buhlmann credibility assigned to the pure premium for an insured
observed for one year is 1/2, what is the Buhlmann credibility to be assigned to the pure premium
for an insured observed for 3 years?
A. 1/2
B. 2/3
C. 3/4
D. 6/7
E. Cannot be determined
8.42 (4, 5/90, Q.40) (2 points)
Three urns contain balls marked with either 0 or 1 in the proportions described below.
          Marked 0    Marked 1
Urn A       10%         90%
Urn B       60%         40%
Urn C       80%         20%
An urn is selected at random and three balls are selected, with replacement, from the urn. The total
of the values is 1. Three more balls are selected from the same urn.
Calculate the expected total of the three balls using Buhlmann's credibility formula.
A. less than 1.05
B. at least 1.05 but less than 1.10
C. at least 1.10 but less than 1.15
D. at least 1.15 but less than 1.20
E. at least 1.20
8.43 (4, 5/91, Q.37) (2 points) One spinner is selected at random from a group of three different
spinners. Each of the spinners is divided into six equally likely sectors marked as described below.
                     Number of Sectors
Spinner    Marked 0    Marked 12    Marked 48
   A           2            2            2
   B           3            2            1
   C           4            1            1
Assume a spinner is selected and a zero is obtained on the first spin. What is the Buhlmann
credibility estimate of the expected value of the second spin using the same spinner?
A. Less than 12.5
B. At least 12.5 but less than 13.0
C. At least 13.0 but less than 13.5
D. At least 13.5 but less than 14.0
E. At least 14.0
8.44 (4, 5/91, Q.51) (3 points) Four urns contain balls marked with either 0 or 1 in the proportions
described below.
Urn    Marked 0    Marked 1
 A        70%         30%
 B        70%         30%
 C        30%         70%
 D        20%         80%
An urn is selected at random and four balls are selected from the urn with replacement. The total of
the values is 2. Four more balls are selected from the same urn.
Calculate the expected total of the four balls using Buhlmann's credibility formula.
A. Less than 1.96
B. At least 1.96 but less than 1.99
C. At least 1.99 but less than 2.02
D. At least 2.02 but less than 2.05
E. At least 2.05
8.45 (4B, 5/92, Q.9) (3 points) Two urns contain balls each marked with 0, 1, or 2 in the
proportions described below:
            Percentage of Balls in Urn
          Marked 0    Marked 1    Marked 2
Urn A       0.20        0.40        0.40
Urn B       0.70        0.20        0.10
An urn is selected at random and two balls are selected, with replacement, from the urn.
The sum of values on the selected balls is 2. Two more balls are selected from the same urn.
Determine the expected total of the two balls using Buhlmann's credibility formula.
A. Less than 1.6
B. At least 1.6 but less than 1.7
C. At least 1.7 but less than 1.8
D. At least 1.8 but less than 1.9
E. At least 1.9
8.46 (4B, 11/92, Q.19) (1 point) You are given the following:
• The Buhlmann credibility of an individual risk's experience is 1/3 based upon 1 observation.
• The risk's underlying expected loss is constant.
Determine the Buhlmann credibility for the risk's experience after four observations.
A. 1/4
B. 1/2
C. 2/3
D. 3/4
E. Cannot be determined.
8.47 (4B, 5/93, Q.3) (1 point) You are given the following:
• X is a random variable with mean m and variance v.
• m is a random variable with mean 2 and variance 4.
• v is a random variable with mean 8 and variance 32.
Determine the value of the Buhlmann credibility factor Z, after three observations of X.
A. Less than 0.25
B. At least 0.25 but less than 0.50
C. At least 0.50 but less than 0.75
D. At least 0.75 but less than 0.90
E. At least 0.90
Use the following information for the next two questions:
Two urns contain balls with each ball marked 0 or 1 in the proportions described below:
           Percentage of Balls in Urn
          Marked 0    Marked 1
Urn A       20%         80%
Urn B       70%         30%
An urn is randomly selected and two balls are drawn from the urn. The sum of the values on the
selected balls is 1. Two more balls are selected from the same urn.
Note: Assume that each selected ball has been returned to the urn before the next ball is drawn.
8.48 (4B, 11/94, Q.6) (3 points) Determine the Buhlmann credibility estimate of the expected
value of the sum of the values on the second pair of selected balls.
A. Less than 1.035
B. At least 1.035, but less than 1.055
C. At least 1.055, but less than 1.075
D. At least 1.075, but less than 1.095
E. At least 1.095
8.49 (4B, 11/94, Q.7) (1 point) The sum of the values of the second pair of selected balls was 2.
One of the two urns is then randomly selected and two balls are drawn from the urn.
Determine the Buhlmann credibility estimate of the expected value of the sum of the values on the
third pair of selected balls.
A. Less than 1.07
B. At least 1.07, but less than 1.17
C. At least 1.17, but less than 1.27
D. At least 1.27, but less than 1.37
E. At least 1.37
8.50 (4B, 5/95, Q.2) (2 points) You are given the following:
The Buhlmann credibility of three observations is twice the credibility of one observation.
The expected value of the process variance is 9.
What is the variance of the hypothetical means?
A. 3
B. 4
C. 6
D. 8
E. 9
8.51 (4B, 5/95, Q.16) (1 point)
Which of the following will DECREASE the credibility of the current observations?
1. Decrease in the number of observations
2. Decrease in the variance of the hypothetical means
3. Decrease in the expected value of the process variance
A. 1
B. 2
C. 3
D. 1, 2
E. 1, 3
8.52 (4B, 11/95, Q.2) (1 point)
The Buhlmann credibility of five observations of the loss experience of a single risk is 0.29.
What is the Buhlmann credibility of two observations of the loss experience of this risk?
A. Less than 0.100
B. At least 0.100, but less than 0.125
C. At least 0.125, but less than 0.150
D. At least 0.150, but less than 0.175
E. At least 0.175
8.53 (4B, 11/95, Q.21) (3 points) Ten urns each contain five balls, numbered as follows:
Urn 1: 1,2,3,4,5 Urn 2: 1,2,3,4,5 Urn 3: 1,2,3,4,5 Urn 4: 1,2,3,4,5 Urn 5: 1,2,3,4,5
Urn 6: 1,1,1,1,1 Urn 7: 2,2,2,2,2 Urn 8: 3,3,3,3,3 Urn 9: 4,4,4,4,4 Urn 10: 5,5,5,5,5
An urn is randomly selected. A ball is then randomly selected from this urn. The selected ball has the
number 2 on it. This ball is then replaced, and another ball is randomly selected from the same urn.
The second selected ball has the number 3 on it. This ball is then replaced, and another ball is
randomly selected from the same urn. Determine the Buhlmann credibility estimate of the expected
value of the number on this third selected ball.
A. Less than 2.2
B. At least 2.2, but less than 2.4
C. At least 2.4, but less than 2.6
D. At least 2.6, but less than 2.8
E. At least 2.8
8.54 (4B, 5/96, Q.16) (3 points) You are given the following:
• Two urns contain balls.
• In Urn A, half of the balls are marked 0 and half of the balls are marked 2.
• In Urn B, half of the balls are marked 0 and half of the balls are marked t.
An urn is randomly selected. A ball is then randomly selected from this urn, observed, and replaced.
You wish to estimate the expected value of the number on the second ball randomly selected from
this same urn. For which of the following values of t would the Buhlmann credibility of the first
observation be greater than 1/10?
A. 1
B. 2
C. 3
D. 4
E. 5
8.55 (4B, 5/96, Q.24) (2 points) A die is randomly selected from a pair of fair, six-sided dice, A
and B. Die A has its faces marked with 1, 2, 3, 4, 5, and 6. Die B has its faces marked with 6, 7, 8, 9,
10, and 11. The selected die is rolled four times. The results of the first three rolls are 1, 2, and 3.
Determine the Buhlmann credibility estimate of the expected value of the result of the fourth roll.
A. Less than 1.75
B. At least 1.75, but less than 2.25
C. At least 2.25, but less than 2.75
D. At least 2.75, but less than 3.25
E. At least 3.25
8.56 (4B, 11/96, Q.10) (2 points) The Buhlmann credibility of n observations of the loss
experience of a single risk is 1/3. The Buhlmann credibility of n+1 observations of the loss
experience of this risk is 2/5. Determine the Buhlmann credibility of n+2 observations of the loss
experience of this risk.
A. 4/9
B. 5/11
C. 1/2
D. 6/11
E. 5/9
8.57 (4B, 5/97, Q.23) (3 points) You are given the following:
• Two urns contain balls.
• In Urn A, half of the balls are marked 0 and half of the balls are marked 2.
• In Urn B, half of the balls are marked 0 and half of the balls are marked t.
An urn is randomly selected. A ball is then randomly selected from this urn, observed, and replaced.
An estimate is to be made of the expected value of the number on the second ball randomly
selected from this same urn.
Determine the limit of the Buhlmann credibility of the first observation as t goes to infinity.
A. 0
B. 1/3
C. 1/2
D. 2/3
E. 1
8.58 (4B, 11/97, Q.12) (2 points) You are given the following:
A portfolio consists of a number of independent insureds.
Losses for each insured for each exposure period are one of three values: a, b, or c.
The probabilities for a, b, and c vary by insured, but are fixed over time.
The average probabilities for a, b, and c over all insureds are 5/12, 1/6, and 5/12,
respectively.
One insured is selected at random from the portfolio and its losses are observed for one exposure
period. Estimates of the same insured's expected losses for the next exposure period are as
follows:
Observed    Bayesian Analysis    Buhlmann
 Losses         Estimate         Credibility Estimate
   a              3.0                    x
   b              4.5                   3.8
   c              6.0                   6.1
Determine x.
A. Less than 1.75
B. At least 1.75, but less than 2.50
C. At least 2.50, but less than 3.25
D. At least 3.25, but less than 4.00
E. At least 4.00
Use the following information for the next two questions:
An urn contains six dice.
Three of the dice have two sides marked 1, two sides marked 2, and two sides marked 3.
Two of the dice have two sides marked 1, two sides marked 3, and two sides marked 5.
One die has all six sides marked 6.
One die is randomly selected from the urn and rolled. A 6 is observed.
8.59 (4B, 5/98, Q.14) (2 points) Determine the Buhlmann credibility estimate of the expected
value of the second roll of this same die.
A. Less than 4.5
B. At least 4.5, but less than 5.0
C. At least 5.0, but less than 5.5
D. At least 5.5, but less than 6.0
E. At least 6.0
8.60 (4B, 5/98, Q.15) (3 points) The selected die is placed back in the urn. A seventh die is then
added to the urn. The seventh die is one of the following three types:
1. Two sides marked 1, two sides marked 3, and two sides marked 5.
2. All six sides marked 3.
3. All six sides marked 6.
One die is again randomly selected from the urn and rolled. An estimate is to be made of the
expected value of the second roll of this same die.
Determine which of the three types for the seventh die would increase the Buhlmann credibility of
the first roll of the selected die (compared to the Buhlmann credibility used in the previous question.)
A. 1
B. 2
C. 3
D. 1, 3
E. 2, 3
8.61 (Course 4 Sample Exam 2000, Q.28)
Four urns contain balls marked either 1 or 3 in the following proportions:
Urn    Marked 1    Marked 3
 1        p1        1 - p1
 2        p2        1 - p2
 3        p3        1 - p3
 4        p4        1 - p4
An urn is selected at random (with each urn being equally likely) and balls are drawn from it in three
separate rounds. In the first round, two balls are drawn with replacement. In the second round, one
ball is drawn with replacement. In the third round two balls are drawn with replacement.
After two rounds, the Buhlmann-Straub credibility estimate of the total of the values on the two balls
to be drawn in the third round could range from 3.8 to 5.0 (depending on the results of the first two
rounds).
Determine the value of the Buhlmann-Straub k.
8.62 (4, 5/01, Q.6) (2.5 points) You are given:
(i) The full credibility standard is 100 expected claims.
(ii) The square-root rule is used for partial credibility.
You approximate the partial credibility formula with a Buhlmann credibility formula by
selecting a Buhlmann k value that matches the partial credibility formula when 25 claims are
expected. Determine the credibility factor for the Buhlmann credibility formula when 100 claims are
expected.
(A) 0.44
(B) 0.50
(C) 0.80
(D) 0.95
(E) 1.00
8.63 (4, 5/05, Q.32 & 2009 Sample Q.200) (2.9 points) For five types of risks, you are given:
(i) The expected number of claims in a year for these risks ranges from 1.0 to 4.0.
(ii) The number of claims follows a Poisson distribution for each risk.
During Year 1, n claims are observed for a randomly selected risk.
For the same risk, both Bayes and Buhlmann credibility estimates of the number of claims in
Year 2 are calculated for n = 0, 1, 2, ... , 9. Which graph represents these estimates?
[Answer choices (A) through (E) are five graphs, each plotting the Bayes and Buhlmann estimates of the Year 2 claim count (vertical axis, roughly 0.5 to 4.5) against the number of claims observed in Year 1 (horizontal axis, 0 to 10). The graphs cannot be reproduced in this text.]
8.64 (4, 11/06, Q.6 & 2009 Sample Q.251) (2.9 points) For a group of policies, you are given:
(i) The annual loss on an individual policy follows a gamma distribution with parameters α = 4 and θ.
(ii) The prior distribution of θ has mean 600.
(iii) A randomly selected policy had losses of 1400 in Year 1 and 1900 in Year 2.
(iv) Loss data for Year 3 was misfiled and unavailable.
(v) Based on the data in (iii), the Buhlmann credibility estimate of the loss on the selected
policy in Year 4 is 1800.
(vi) After the estimate in (v) was calculated, the data for Year 3 was located.
The loss on the selected policy in Year 3 was 2763.
Calculate the Buhlmann credibility estimate of the loss on the selected policy in Year 4 based
on the data for Years 1, 2 and 3.
(A) Less than 1850
(B) At least 1850, but less than 1950
(C) At least 1950, but less than 2050
(D) At least 2050, but less than 2150
(E) At least 2150
8.65 (4, 5/07, Q.2) (2.5 points) For a group of risks, you are given:
(i) The number of claims for each risk follows a binomial distribution with parameters m = 6 and q.
(ii) The values of q range from 0.1 to 0.6.
During Year 1, k claims are observed for a randomly selected risk. For the same risk, both Bayesian
and Buhlmann credibility estimates of the number of claims in Year 2 are calculated for
k = 0, 1, 2, ... , 6. Determine the graph that is consistent with these estimates.
[Answer choices (A) through (E) are five graphs plotting the Bayesian and Buhlmann estimates against k; they cannot be reproduced in this text.]
8.66 (4, 5/07, Q.21) (2.5 points) You are given:
(i) Losses in a given year follow a gamma distribution with parameters α and θ,
where θ does not vary by policyholder.
(ii) The prior distribution of α has mean 50.
(iii) The Buhlmann credibility factor based on two years of experience is 0.25.
Calculate Var(α).
(A) Less than 10
(B) At least 10, but less than 15
(C) At least 15, but less than 20
(D) At least 20, but less than 25
(E) At least 25
Solutions to Problems:
8.1. E. The Buhlmann Credibility parameter is:
K = (Expected Value of the Process Variance) / (Variance of the Hypothetical Means) =
50/ 5 = 10. Z = N / (N+K) = 30 / (30 + 10) = 75%.
8.2. C. Z = N/(N+K), therefore K = N{(1/Z) - 1} = 4{(1/0.3) - 1} = 9.33.
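Solutions 8.1 and 8.2 use only the two basic relationships K = EPV/VHM and Z = N/(N+K). For those who like to check this arithmetic with a few lines of code, here is a minimal Python sketch; it is my own illustration (the function names are mine), not part of the exam solutions.

    # Basic Buhlmann relationships used in solutions 8.1 and 8.2.
    def buhlmann_k(epv, vhm):
        """Buhlmann credibility parameter K = EPV / VHM."""
        return epv / vhm

    def credibility(n, k):
        """Credibility of n observations, Z = n / (n + K)."""
        return n / (n + k)

    def k_from_credibility(n, z):
        """Invert Z = n/(n+K) to recover K = n(1/Z - 1)."""
        return n * (1.0 / z - 1.0)

    print(credibility(30, buhlmann_k(50, 5)))   # solution 8.1: 0.75
    print(k_from_credibility(4, 0.3))           # solution 8.2: about 9.33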
8.3. A. The variance of the hypothetical means = 12.91667 - 3.5² = 0.6667.

Type of    A Priori Chance of    Process     Mean       Square of
  Die      This Type of Die      Variance    Die Roll   Mean Die Roll
4-sided          0.333             1.250        2.5         6.25
6-sided          0.333             2.917        3.5        12.25
8-sided          0.333             5.250        4.5        20.25
Overall                            3.1389       3.50       12.91667

K = EPV / VHM = 3.1389 / .6667 = 4.71. Z = 1/(1 + 4.71) = .175. The a priori estimate is 3.5 and
the observation is 7, so the new estimate is: (.175)(7) + (.825)(3.5) = 4.11.
Comment: We know that the 4-sided and 6-sided dice could not have resulted in a seven. Thus
using Bayes Analysis, the posterior distribution would be 100% probability of the 8-sided die.
Buhlmann Credibility is a linear approximation to Bayes Analysis. This illustrates the problems with
using a linear estimator such as Buhlmann Credibility. The Buhlmann Credibility estimate when there
is an extreme observation may not be very sensible. On the exam, use Buhlmann Credibility when
they ask you to.
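The numbers in solution 8.3 can be reproduced with a short script. The following Python sketch is my own illustration (not part of the original solution), assuming the same setup: an equally likely 4-, 6-, or 8-sided die, with a 7 observed on one roll.

    # Equally likely 4-, 6-, and 8-sided dice; one roll of 7 is observed (solution 8.3).
    dice = [4, 6, 8]
    prob = 1.0 / len(dice)

    means = [(n + 1) / 2 for n in dice]            # mean of a fair n-sided die
    pvars = [(n * n - 1) / 12 for n in dice]       # variance of a fair n-sided die

    overall_mean = sum(prob * m for m in means)                    # 3.5
    epv = sum(prob * v for v in pvars)                             # 3.1389
    vhm = sum(prob * m * m for m in means) - overall_mean ** 2     # 0.6667

    k = epv / vhm                                  # about 4.71
    z = 1 / (1 + k)                                # about 0.175
    print(z * 7 + (1 - z) * overall_mean)          # about 4.11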
8.4. C. Expected Value of the Process Variance = 0.0833.
Variance of the Hypothetical Means = 0.4167 - 0.5² = 0.1667.

Type of    A Priori       Mean for this    Square of Mean      Process
  Urn      Probability    Type of Urn      of this Type Urn    Variance
   1         0.3333            0                 0              0.00000
   2         0.3333            1                 1              0.00000
   3         0.3333           0.5               0.25            0.25000
Average                       0.5               0.4167          0.0833

K = EPV / VHM = 0.0833 / 0.1667 = 0.5. Thus for N = 3, Z = 3/(3 + 0.5) = 85.7%. The observed mean is
0 and the a priori mean is 0.5, therefore the new estimate = (0)(0.857) + (0.5)(1 - 0.857) = 0.0715.
8.5. B. As computed in the solution to the previous question, for 3 observations Z = 85.7% and
the a priori mean is .5. Since the observed mean is 1/3, the new estimate is:
(1/3)(.857) + (0.5)(1 - 0.857) = 0.357.
8.6. A. 1. True. 2. True (This is always true of credibility, but may not be true of Bayesian analysis.)
3. False (This is always true of Bayesian analysis, but may not be true of credibility.)
8.7. E. For example, the second moment of Urn II is: (0.6)(10²) + (0.4)(20²) = 220.
The process variance of Urn II = 220 - 14² = 24.

Type of    A Priori               Square of    Second     Process
  Urn      Probability    Mean      Mean       Moment     Variance
   I          70%         12.0     144.0         160        16.0
  II          30%         14.0     196.0         220        24.0
Average                   12.6     159.6                    18.4

The variance of the hypothetical means is: 159.6 - 12.6² = 0.84.


Thus the Buhlmann credibility parameter is K = EPV / VHM = 18.4 / 0.84 = 21.9.
Thus for 5 observations Z = 5 / (5 + 21.9) = 18.6%.
The prior mean is $12.6. The observed mean is: 60/5 = $12.0.
Thus the new estimate is: (.186)(12.0) + (1 - .186)(12.6) = $12.49.
8.8. A. For Die A the mean is: (1+1+1+2+3+4)/6 = 2 and the second moment is:
(1+1+1+4+9+16)/6 = 5.3333. Thus the process variance for Die A is 5.3333 - 2² = 1.3333.
Similarly for Die B the mean is 2.3333 and the second moment is 6.3333. Thus the process
variance for Die B is: 6.3333 - 2.3333² = 0.889. The mean of Die C is 2.6667. The process variance
for Die C is: {(1-2.6667)² + (2-2.6667)² + (3)(3-2.6667)² + (4-2.6667)²} / 6 = 0.889, the same as
Die B. The mean of Die D is 3. The process variance for Die D is:
{(1-3)² + (2-3)² + (3-3)² + (3)(4-3)²} / 6 = 1.333, the same as Die A. Thus the expected value of
the process variance = (1/4)(1.333) + (1/4)(0.889) + (1/4)(0.889) + (1/4)(1.333) = 1.111.

        A Priori                  Square of
Die     Chance of Die    Mean       Mean
 A         0.250        2.0000     4.0000
 B         0.250        2.3333     5.4443
 C         0.250        2.6667     7.1113
 D         0.250        3.0000     9.0000
Mean                    2.5000     6.3889

Thus the Variance of the Hypothetical Means = 6.3889 - 2.5² = 0.1389.


Therefore, the Buhlmann Credibility Parameter = K = EPV / VHM = 1.111 / .1389 = 8.0.
Thus the credibility for 4 observations is: 4/(4+K) = 4 /12 = 1/3.
The a priori mean is 2.5. The observed mean is: (3 + 4 + 2 + 4)/4 = 3.25.
Thus the estimated future die roll is: (1/3)(3.25) + (1 - 1/3)(2.5) = 2.75.
Comment: I've illustrated the two different methods of computing variances, first in terms of the
moments and second as the second central moment.
8.9. C. The a priori mean is 10. The mean for observed Years 1 and 2 is: (15 + 11)/2 = 13.
Therefore, 13Z + (1 - Z)(10) = 10.75. ⇒ Z = 0.25. 2/(2 + K) = 0.25. ⇒ K = 6.
For three years of data, the observed mean is: (15 + 11 + 22)/3 = 16.
Z = 3/(3 + 6) = 1/3. (1/3)(16) + (2/3)(10) = 12.
8.10. C. The total variance is 8² = 64. The (assumed) expected value of the process variance
= 4² = 16. Thus the variance of the hypothetical means = total variance - EPV = 64 - 16 = 48.
The Buhlmann Credibility parameter K is EPV/VHM = 16/48 = 1/3. Thus the credibility assigned to
one observation is 1/(1+K) = 1/(4/3) = 0.75. Thus if one observes a score of s, our estimate of that
student's true competence would be s(0.75) + (1 - 0.75)(56) = 14 + 0.75s. Thus an estimated true
competency of 65 would correspond to a score such that 65 = 14 + 0.75s. Thus s = 51/0.75 = 68.
Comment: Note that for a single observation,
Z = (total variance - EPV) / total variance = (64 - 16)/64 = 3/4.
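Solution 8.10 illustrates a useful shortcut: when the total variance and the EPV are known, the credibility of a single observation is simply (Total Variance - EPV)/(Total Variance). Here is a small Python sketch of that shortcut, my own illustration using the figures assumed in this problem (overall mean 56, total standard deviation 8, process standard deviation 4).

    # Single-observation credibility from the variance decomposition (solution 8.10).
    total_var = 8 ** 2          # total variance of observed scores
    epv = 4 ** 2                # assumed process variance around each "true" score
    vhm = total_var - epv       # variance of the hypothetical means

    z = vhm / total_var         # same as 1/(1+K) with K = EPV/VHM; here 0.75
    mean = 56                   # overall mean score assumed in the problem

    # estimate(s) = z*s + (1-z)*mean; find the score whose estimate is 65
    s = (65 - (1 - z) * mean) / z
    print(z, s)                 # 0.75, 68.0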
8.11. E. The Variance of the Hypothetical Means = 0.1944 - 0.3889² = 0.0432.

Type of    A Priori                  Square of    Process
  Die      Probability    Mean         Mean       Variance
   A         0.3333       0.6667      0.4444       0.2222
   B         0.3333       0.3333      0.1111       0.2222
   C         0.3333       0.1667      0.0278       0.1389
Average                   0.3889      0.1944       0.1944

Thus the Buhlmann credibility parameter is K =EPV / VHM = .1944 / .0432 = 4.50. Thus for 8
observations Z = 8 / (8 +4.5) = 64%. The prior mean is .3889 and the observation is 2/8= .25.
Thus the new estimate is: (64%)(.25) + (36%)(.3889) = 0.300.
8.12. B. Assign a value of zero to a non-black ball and a value of 1 to a black ball.
Then the future estimate is equal to the chance of picking a black ball.
Use the fact that for the Bernoulli the process variance is q(1-q).

Type       A Priori Prob.    % Black Balls    Process Var.
  I             0.4               0.04           0.0384
 II             0.3               0.08           0.0736
III             0.2               0.12           0.1056
 IV             0.1               0.16           0.1344
Overall                           0.08           0.072

The EPV = 0.072. Using the fact that the overall mean is 0.08, the variance of the hypothetical means
is: (0.4)(0.04-0.08)² + (0.3)(0.08-0.08)² + (0.2)(0.12-0.08)² + (0.1)(0.16-0.08)² = 0.0016.
Thus K = 0.072 / 0.0016 = 45. Z = 50 / (50+45) = 53%.
Estimate = (0)(53%) + (8%)(47%) = 3.8%.
Comment: Note that the estimate is outside the range of hypothetical means. While this can happen
for estimates based on Credibility, it can't for estimates based on Bayes Theorem.
8.13. C. The mean of a sum of N 6-sided dice is 3.5N.
The process variance of a sum of N independent 6-sided dice is N(35/12).

Number     A Priori Chance of      Mean of Sum    Square of Mean    Process Variance
of Dice    this Number of Dice     of Dice        of Sum of Dice    for Sum of Dice
  10              0.2                 35.0           1225.000            29.167
  11              0.4                 38.5           1482.250            32.083
  12              0.3                 42.0           1764.000            35.000
  13              0.1                 45.5           2070.250            37.917
Average                               39.550         1574.125            32.958

Thus the variance of the hypothetical means = 1574.125 - 39.55² = 9.92.


The EPV = 32.958. Thus K = EPV/VHM = 32.958 / 9.92 = 3.32.
Thus five observations are given credibility Z = 5/(5+K) = 5/8.32 = .601.
The observed average is (54+ 44+ 41+ 48 +47)/5 = 46.8. The a priori mean is 39.55.
Thus the new estimate is: (.601)(46.8)+(1-.601)(39.55) = 43.9
8.14. C. The weighted average of the Buhlmann Credibility Estimates are:
(30%)(100) + (60%)(140) + (10%)(300) = 144.
Since the Buhlmann Credibility Estimates are in balance, the a priori mean is 144.
Since the Bayesian Analysis Estimates are also in balance, we must have:
(30%)(y) + (60%)(150) + (10%)(260) = 144. Thus y = (144 - 116)/.3 = 93.33.
8.15. E. For classical credibility Z = √(N / (8K)). For Buhlmann Credibility, Z = N/(N+K).
Setting the two credibilities equal, √(N / (8K)) = N/(N+K). Therefore, for N ≠ 0,
8KN = (N + K)². N² - 6NK + K² = 0.
N = {6K ± √(36K² - 4K²)} / 2 = K(3 ± √8) = 5.83K or 0.172K.
For N = 5.83K, Z = 5.83/6.83 = 85%. For N = 0.172K, Z = 0.172/1.172 = 15%.


Comment: Here is a graph of the two formulas for K = 100 and a standard for full Credibility of 800,
with the Buhlmann formula shown as thick:
[Graph omitted: Z versus number of claims from 5 to 1000 (log scale), with the Buhlmann curve drawn thick and the Classical curve thin.]
The two curves cross at 17 and 583 claims, when Z = 15% or 85%.
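The crossing points quoted above are easy to confirm numerically. Here is a short Python check, my own illustration (not part of the original solution), using K = 100 and a full-credibility standard of 8K = 800 claims as in the graph.

    import math

    K = 100
    full_std = 8 * K   # classical full-credibility standard assumed in the comment

    def z_classical(n):
        return min(1.0, math.sqrt(n / full_std))

    def z_buhlmann(n):
        return n / (n + K)

    # The two formulas agree where n = K*(3 +/- sqrt(8)), i.e. about 17 and 583 claims.
    for n in (K * (3 - math.sqrt(8)), K * (3 + math.sqrt(8))):
        print(round(n, 1), round(z_classical(n), 3), round(z_buhlmann(n), 3))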

8.16. B. & 8.17. B. Z = 4/(4+11) = 4/15. Let x be the average of the 4 observations; x can range
from 13 to 49. The estimate is: (4/15)(x) + (11/15)(25) = (4x + 275)/15.
For x = 13 the estimate is 327/15 = 21.8. For x = 49 the estimate is: 471/15 = 31.4.
8.18. E. The Buhlmann estimates should be on a straight line, but they are not in Graph B,
eliminating graph B.
The Buhlmann estimates are the weighted least squares linear approximation to those of Bayes
Analysis. However, in graph A, the Buhlmann estimates are always higher than the Bayes
estimates. Thus graph A is eliminated.
The Bayes estimates must remain within the range of hypotheses, in this case 0.5 to 2.0, eliminating
graph C.
In graph D, the slope of the Buhlmann line is about: (2.1 - 0.3)/6 = 0.3 = Z.
In graph D, the intercept of the Buhlmann line is about: 0.3 = μ(1 - Z). ⇒ μ = 0.3/(1 - 0.3) = 0.43.
However, μ, the a priori overall mean, should be between 0.5 and 2.0, eliminating graph D.
Comment: Similar to 4, 5/05, Q.32. The problem with graph D is not obvious!
The slope of the straight line formed by the Buhlmann estimates is Z. For ordinary situations,
0 < Z < 1. Thus this slope must be positive and less than 1, which is true for all those graphs in
which the Buhlmann estimates are on a straight line. There is no requirement that the Buhlmann
estimates must remain within the range of hypotheses.
8.19. D. The a priori mean is E[αθ] = 4E[θ] = (4)(600) = 2400.
The mean for observed Years 1 and 2 is: (1400 + 1900)/2 = 1650.
Therefore, 1650Z + (1 - Z)(2400) = 1800. ⇒ Z = 0.8. 2/(2+K) = 0.8. ⇒ K = 1/2.
For three years of data, the observed mean is: (1400 + 1900 + 2763)/3 = 2021,
and Z = 3/(3 + 1/2) = 6/7.
(6/7)(2021) + (1/7)(2400) = 2075.
Comment: Similar to 4, 11/06, Q.6.
8.20. E. E[N(θ)] = 0.3. E[N(θ)²] = 0.5 + 0.3² = 0.59.
E[X(θ)] = 200. E[X(θ)²] = 6000 + 200² = 46,000.
Hypothetical Mean Aggregate Loss = N(θ)X(θ).
First Moment of Hypothetical Means of Aggregate Loss =
E[N(θ)X(θ)] = E[N(θ)]E[X(θ)] = (0.3)(200) = 60.
Second Moment of Hypothetical Means of Aggregate Loss =
E[(N(θ)X(θ))²] = E[N(θ)²]E[X(θ)²] = (0.59)(46,000) = 27,140.
Variance of Hypothetical Means of Aggregate Loss = 27,140 - 60² = 23,540.
We are given that the total variance of aggregate losses is 165,000.
Therefore, EPV + VHM = 165,000. EPV = 165,000 - 23,540 = 141,460.
K = EPV/VHM = 141,460/23,540 = 6.0.
8.21. The hypothetical means are 0.6, 1.8, and 3.6, equally likely.
The a priori mean is: (0.6 + 1.8 + 3.6)/3 = 2.
The variance of the hypothetical means is: {(0.6 - 2)² + (1.8 - 2)² + (3.6 - 2)²}/3 = 1.52.
The process variances are: 0.54, 1.26, and 1.44, equally likely.
The expected value of the process variance is: (0.54 + 1.26 + 1.44)/3 = 1.08.
K = EPV/VHM = 1.08/1.52 = 0.711. Z = 1/(1 + K) = 58.4%.
Thus, the Buhlmann credibility estimates are: 0.584k + (1 - 0.584)(2) = 0.584k + 0.832.
For Bayes analysis, the chance of the observation is f(k) for a Binomial with m = 6:
f(k) = C(6, k) q^k (1-q)^(6-k). For a given value of k, this is proportional to: q^k (1-q)^(6-k).
The 3 values of q are equally likely, and thus the probability weights are proportional to: q^k (1-q)^(6-k).
For k = 0, the probability weights are: 0.9⁶, 0.7⁶, and 0.4⁶.
The mean frequencies are: (6)(0.1), (6)(0.3), and (6)(0.6).
Bayes Estimate is: 6{(0.1)(0.9⁶) + (0.3)(0.7⁶) + (0.6)(0.4⁶)} / {0.9⁶ + 0.7⁶ + 0.4⁶} = 0.835.
For k = 1, the probability weights are: (0.1)(0.9⁵), (0.3)(0.7⁵), and (0.6)(0.4⁵).
Bayes Estimate is: 6{(0.1²)(0.9⁵) + (0.3²)(0.7⁵) + (0.6²)(0.4⁵)} / {(0.1)(0.9⁵) + (0.3)(0.7⁵) + (0.6)(0.4⁵)} = 1.283.
For k = 2, the probability weights are: (0.1²)(0.9⁴), (0.3²)(0.7⁴), and (0.6²)(0.4⁴).
Bayes Estimate is: 6{(0.1³)(0.9⁴) + (0.3³)(0.7⁴) + (0.6³)(0.4⁴)} / {(0.1²)(0.9⁴) + (0.3²)(0.7⁴) + (0.6²)(0.4⁴)} = 2.033.
For k = 3, the probability weights are: (0.1³)(0.9³), (0.3³)(0.7³), and (0.6³)(0.4³).
Bayes Estimate is: 6{(0.1⁴)(0.9³) + (0.3⁴)(0.7³) + (0.6⁴)(0.4³)} / {(0.1³)(0.9³) + (0.3³)(0.7³) + (0.6³)(0.4³)} = 2.808.
For k = 4, the Bayes Estimate is:
6{(0.1⁵)(0.9²) + (0.3⁵)(0.7²) + (0.6⁵)(0.4²)} / {(0.1⁴)(0.9²) + (0.3⁴)(0.7²) + (0.6⁴)(0.4²)} = 3.302.
For k = 5, the Bayes Estimate is:
6{(0.1⁶)(0.9) + (0.3⁶)(0.7) + (0.6⁶)(0.4)} / {(0.1⁵)(0.9) + (0.3⁵)(0.7) + (0.6⁵)(0.4)} = 3.506.
For k = 6, the Bayes Estimate is: 6{0.1⁷ + 0.3⁷ + 0.6⁷} / {0.1⁶ + 0.3⁶ + 0.6⁶} = 3.572.
k    Bayes Estimate    Buhlmann Estimate
0        0.835               0.832
1        1.283               1.416
2        2.033               2.000
3        2.808               2.584
4        3.302               3.168
5        3.506               3.752
6        3.572               4.336

Here is a graph, with the Bayes Estimates as the points and the Buhlmann Estimates as the straight
line:
[Graph omitted: the estimate (vertical axis, roughly 1 to 4) plotted against k, with the Bayes estimates shown as points and the Buhlmann estimates as a straight line.]

Comment: Similar to 4, 5/07, Q.2.
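The Bayes and Buhlmann columns in solution 8.21 can also be generated directly. The Python sketch below is my own illustration, assuming the three equally likely Binomial risk types with m = 6 and q = 0.1, 0.3, 0.6 used above.

    from math import comb

    m, qs = 6, [0.1, 0.3, 0.6]          # equally likely risk types (solution 8.21)

    means = [m * q for q in qs]                           # 0.6, 1.8, 3.6
    mu = sum(means) / 3                                   # a priori mean = 2
    vhm = sum((x - mu) ** 2 for x in means) / 3           # 1.52
    epv = sum(m * q * (1 - q) for q in qs) / 3            # 1.08
    z = 1 / (1 + epv / vhm)                               # about 0.584

    for k in range(m + 1):
        w = [comb(m, k) * q ** k * (1 - q) ** (m - k) for q in qs]   # posterior weights
        bayes = sum(wi * mi for wi, mi in zip(w, means)) / sum(w)
        buhlmann = z * k + (1 - z) * mu
        print(k, round(bayes, 3), round(buhlmann, 3))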


8.22. There are various possible combinations of pieces of pie for Burt's three lunches:
0, 2, 2; 2, 0, 2; 2, 2, 0; 1, 1, 2; 1, 2, 1; 2, 1, 1.
If Burt is someone who always drinks no cups of coffee, then the chance of the observation is:
3(1/2)(1/6)(1/6) + (3)(1/3)(1/3)(1/6) = .0972.
If Burt is someone who always drinks one cup of coffee, then the chance of the observation is:
3(1/4)(1/4)(1/4) + (3)(1/2)(1/2)(1/4) = .2344.
If Burt is someone who always drinks two cups of coffee, then the chance of the observation is:
3(1/6)(1/2)(1/2) + (3)(1/3)(1/3)(1/2) = .2917.
Cups of    A Priori Chance    Chance of the    Prob. Weight =      Posterior Chance    Average Number
Coffee     of This Type       Observation      Product of Prior    of This Type        of Pieces
           of Risk                             Two Columns         of Risk             of Pie
   0            0.3               9.72%           0.0292               13.86%             0.667
   1            0.4              23.44%           0.0938               44.55%             1.000
   2            0.3              29.17%           0.0875               41.58%             1.333
Overall         1.000                             0.2104              100.00%             1.092

Comment: If we had been told how many cups of coffee Burt drinks with his lunch each day, then
there would have been no need to use Bayes Analysis.
That any given customer drinks the same number of cups of coffee each time they have lunch at
Dinah's means that there is useful information contained in the number of pieces of pie eaten in the
past for predicting the number of pieces of pie eaten in the future by the same customer. If each
customer instead had the whole joint distribution shown, then the expected number of pieces of pie
eaten is one per lunch for every customer.
8.23. The hypothetical means are: {(15%)(0) + (10%)(1) + (5%)(2)}/30% = 2/3,
{(10%)(0) + (20%)(1) + (10%)(2)}/40% = 1, {(5%)(0) + (10%)(1) + (15%)(2)}/30% = 4/3.
The first moment of the hypothetical means is: (30%)(2/3) + (40%)(1) + (30%)(4/3) = 1.
The 2nd moment of the hypothetical means is: (30%)(2/3)² + (40%)(1)² + (30%)(4/3)² = 1.0667.
VHM = 1.0667 - 1 = 0.0667.
For someone who buys no cups of coffee, the second moment is:
{(15%)(0²) + (10%)(1²) + (5%)(2²)}/30% = 1. The process variance is: 1 - (2/3)² = 5/9.
For someone who buys 1 cup of coffee, the second moment is:
{(10%)(0²) + (20%)(1²) + (10%)(2²)}/40% = 1.5. The process variance is: 1.5 - (1)² = 1/2.
For someone who buys 2 cups of coffee, the second moment is:
{(5%)(0²) + (10%)(1²) + (15%)(2²)}/30% = 2.333. The process variance is: 2.333 - (4/3)² = 5/9.
EPV = (30%)(5/9) + (40%)(1/2) + (30%)(5/9) = 0.533.
K = EPV/VHM = 0.533/0.0667 = 8. Z = 3/(3 + 8) = 3/11.
Estimated number of pieces of pie: (3/11)(4/3) + (8/11)(1) = 12/11 = 1.091.
Comment: While the estimates using Bayes Analysis and Buhlmann credibility are very similar for
this observation, they are not equal.
8.24. D. Type 3 has average severity of $1050, and variance of severity of:
(10%)(450²) + (90%)(50²) = 22,500.
Annual aggregate losses for Type 3 have a process variance of:
(80%)(22,500) + (1050²)(80%)(20%) = 194,400.

Type    A Priori       Mean         Variance of    Mean        Variance of    Process
        Probability    Frequency    Frequency      Severity    Severity       Variance
 1       33.33%          0.2           0.16          1400        40,000        321,600
 2       33.33%          0.4           0.24          1250        62,500        400,000
 3       33.33%          0.8           0.16          1050        22,500        194,400
Average                                                                        305,333

EPV = (321,600 + 400,000 + 194,400)/3 = 305,333.


Type    A Priori       Mean         Mean        Mean         Square of
        Probability    Frequency    Severity    Aggregate    Mean Aggregate
 1       33.33%          0.2          1400         280           78,400
 2       33.33%          0.4          1250         500          250,000
 3       33.33%          0.8          1050         840          705,600
Average                                            540          344,667

VHM = 344,667 - 540² = 53,067. K = EPV/VHM = 305,333/53,067 = 5.75.

For 7 years of data, Z = 7/(7 + 5.75) = 54.9%.
Jim's observed mean annual loss is: (1000 + 1000 + 1500)/7 = $500.
Estimate for Jim for year 8 is: (54.9%)($500) + (1 - 54.9%)($540) = $518.
Comment: Based on Q. 18 of the SOA Fall 2009 Group and Health - Design and Pricing Exam.
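The EPV and VHM for aggregate losses in solution 8.24 come from the usual compound-distribution formulas: for each type, the process variance is (mean frequency)(variance of severity) + (mean severity)²(variance of frequency). Here is a Python sketch of the whole calculation; it is my own illustration, using the same inputs as the tables above.

    # Aggregate-loss EPV, VHM, and credibility estimate for solution 8.24.
    types = [  # (mean freq, var freq, mean severity, var severity), equally likely
        (0.2, 0.16, 1400, 40_000),
        (0.4, 0.24, 1250, 62_500),
        (0.8, 0.16, 1050, 22_500),
    ]

    pv  = [f * vs + s * s * vf for f, vf, s, vs in types]      # process variances
    agg = [f * s for f, vf, s, vs in types]                    # hypothetical means

    epv = sum(pv) / 3                                          # about 305,333
    mu = sum(agg) / 3                                          # 540
    vhm = sum(a * a for a in agg) / 3 - mu ** 2                # about 53,067

    k = epv / vhm
    z = 7 / (7 + k)                                            # 7 years of data
    print(round(z * 500 + (1 - z) * mu))                       # about 518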
8.25. A. For Type 1, the chance of the observation of Jim is:
(.8){(.2)(.2)}(.8){(.2)(.2)}(.8){(.2)(.8)}(.8) = 0.00010486.
For Type 2, the chance of the observation of Jim is:
(.6){(.4)(.5)}(.6){(.4)(.5)}(.6){(.4)(.5)}(.6) = 0.00103680.
For Type 3, the chance of the observation of Jim is:
(.2){(.8)(.9)}(.2){(.8)(.9)}(.2){(.8)(.1)}(.2) = 0.00006636.
The mean aggregate for Type 1 is: (0.2)(1400) = 280.
The mean aggregate for Type 2 is: (0.4)(1250) = 500.
The mean aggregate for Type 3 is: (0.8)(1050) = 840.
Type    A Priori       Chance of      Probability     Posterior       Mean
        Probability    Observation    Weight          Distribution    Aggregate
 1       33.33%        0.00010486     0.000034953        8.68%          280
 2       33.33%        0.00103680     0.000345600       85.83%          500
 3       33.33%        0.00006636     0.000022120        5.49%          840
Sum                                   0.000402673       100.00%         500

Estimate for Jim for year 8 is:


(8.68%)($280) + (85.83%)($500) + (5.49%)($840) = $500.
8.26. B. For group A, the mean is 1200, and the variance is: (0.8)(200²) + (0.2)(800²) = 160,000.
For group B, the mean is 1500, and the variance is: (0.5)(500²) + (0.5)(500²) = 250,000.
EPV = (0.7)(160,000) + (0.3)(250,000) = 187,000.
Overall mean is: (0.7)(1200) + (0.3)(1500) = 1290.
Second Moment of the hypothetical means is: (0.7)(1200²) + (0.3)(1500²) = 1,683,000.
VHM = 1,683,000 - 1290² = 18,900.
K = EPV / VHM = 187,000 / 18,900 = 9.89.
We are estimating severity; there are a total of 6 claims, so N = 6.
Z = 6 / (6+K) = 37.8%.
Observed mean severity is 1500.
The future estimate is: (0.378)(1500) + (1 - 0.378)(1290) = 1369.
8.27. C. For group A, the chance of the observation is proportional to: (0.8³)(0.2³) = 0.004096.
For group B, the chance of the observation is proportional to: (0.5³)(0.5³) = 0.015625.
Thus the probability weights are: (0.7)(0.004096) and (0.3)(0.015625).
The posterior distribution is: 38.0% and 62.0%
The means for the two groups are: 1200 and 1500.
The estimate of future severity is: (38%)(1200) + (62%)(1500) = 1386.
Comment: One can include or exclude binomial coefficients. As long as one is consistent between
the two groups, it will not affect the posterior distribution you get.
In the absence of inflation, we make no use of which years the claims occurred in.

8.28. B. The average pure premium for Class A is: (1/5){(2/3)(100) + (1/3)(200)} = 26.67.
The average pure premium for Class B is: (2/5){(1/2)(100) + (1/2)(200)} = 60.
The a priori mean is: (3/4)(26.67) + (1/4)(60) = 35.00.
The second moment of the hypothetical means is: (3/4)(26.67²) + (1/4)(60²) = 1433.5.
VHM = 1433.5 - 35.00² = 208.5.
The variance of severity for Class A is: (2/3)(100 - 133.33)² + (1/3)(200 - 133.33)² = 2222.
The variance of pure premium for Class A is: (1/5)(2222) + (133.33²)(1/5)(4/5) = 3289.
The variance of severity for Class B is: (1/2)(100 - 150)² + (1/2)(200 - 150)² = 2500.
The variance of pure premium for Class B is: (2/5)(2500) + (150²)(2/5)(3/5) = 6400.
EPV = (3/4)(3289) + (1/4)(6400) = 4067.
K = EPV / VHM = 4067 / 208.5 = 19.5.
We observe two years so that N = 2.
Z = 2 / (2 + K) = 9.3%.
(0.093)(200/2) + (1 - 0.093)(35) = 41.0.
8.29. C. For one year Z = 1/(1+K), 1 - Z = K/(1+K).
Let μ be the a priori mean frequency.
Then the estimate for an individual with one year claim free is: μK / (1 + K) = 0.07875.
For two years Z = 2/(2+K), 1 - Z = K/(2+K).
The estimate for an individual with two years claim free is: μK / (2 + K) = 0.07.
Dividing the two equations: (2+K)/(1+K) = 1.125. ⇒ K = 7. ⇒ μ = 9%.
For three years Z = 3/(3+K), 1 - Z = K/(3+K).
The estimate for an individual with three years claim free is: μK / (3 + K) = (9%)(7/10) = 6.3%.
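The answer to 8.29 is easy to verify by plugging K = 7 and μ = 9% back into the claim-free estimates, as in this short Python check (my own illustration, not part of the original solution).

    # Check of solution 8.29: claim-free (observation = 0) estimates for 1, 2, 3 years.
    K, mu = 7, 0.09
    for n in (1, 2, 3):
        z = n / (n + K)
        print(n, round((1 - z) * mu, 5))   # 0.07875, 0.07, 0.063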
8.30. B. 1. True. 2. False. 3. True.
8.31. C. For each deck the process is 10 times a Bernoulli process. Thus the process variance is
100 times that of a Bernoulli. For Deck A, q = 4/13 and the process variance is (100)(4/13)(1- 4/13)
= 21.302. For Deck B, q = 3/12 and the process variance is (100)(3/12)(1- 3/12) = 18.750.
Since the Decks are equally likely, the Expected Value of the Process Variance =
(21.302+18.750) /2 = 20.026.
8.32. B. The variance of the hypothetical means = 7.85873 - 2.78846² = 0.0832.

        A Priori Chance    Mean of a Card     Square of Mean of      Process Variance for
Deck    of This Deck       from this Deck     Card from This Deck    a draw from this Deck
 A          0.5               3.07692              9.46746                 21.302
 B          0.5               2.50000              6.25000                 18.750
Average                       2.78846              7.85873                 20.026

8.33. C. As shown in the prior solutions, EPV = 20.026 and VHM = .0832.
Thus the Buhlmann credibility parameter = K = EPV/VHM = 20.026 / .0832 = 241.
Thus two observations are given credibility: 2/(2+K) = 2/243 = .0082.
The observed average is $10 per card. The a priori mean is $2.788 per card.
Thus the estimate of the next card is: (.0082)(10)+(1-.0082)(2.788) = 2.85.
8.34. B. The total variance is 10² = 100.
The (assumed) expected value of the process variance = 5² = 25.
Thus the variance of the hypothetical means = total variance - EPV = 100 - 25 = 75.
The Buhlmann Credibility parameter K is EPV/VHM = 25/75 = 1/3.
Thus the credibility assigned to one observation is 1/(1+K) = 1/(4/3) = 0.75.
Thus if one observes a score of s, our estimate of that student's true competence would be:
(0.75)s + (1 - 0.75)(55) = 13.75 + 0.75s.
Thus an estimated true competency of 70% would correspond to a score such that:
70 = 13.75 + 0.75s. Thus s = 56.25 / 0.75 = 75.
8.35. D. 1. False. The correct formula is VAR[X] = (E[Yi])² VAR[N] + VAR[Yi] E[N].
Note this formula only holds when frequency and severity are independent.
2. True. As the EPV increases there is more random fluctuation and the credibility assigned to the
observation decreases. 3. True. P(H | B) = P(H and B) / P(B).
8.36. D. The mean of a sum of N 6-sided dice is 3.5N. The process variance of a sum of N
independent 6-sided dice is N(35/12). Thus as calculated below, the variance of the hypothetical
means = 1772.167 - 42² = 8.167, while the EPV = 35. Thus the Buhlmann credibility parameter =
K = EPV/VHM = 35 / 8.167 = 4.29. Thus five observations are given credibility: 5/(5+K) = 5/9.29 =
0.538. The observed average is: (45 + 44 + 51 + 48 + 47)/5 = 47. The a priori mean is 42.
Thus the new estimate is: (0.538)(47) + (1 - 0.538)(42) = 44.7.

Number     A Priori Chance of    Mean of Sum    Square of Mean     Process Variance
of Dice    This # of Dice        of Dice        of Sum of Dice     for Sum of Dice
  11            0.3333              38.5           1482.250            32.083
  12            0.3333              42.0           1764.000            35.000
  13            0.3333              45.5           2070.250            37.917
Average                             42.000         1772.167            35.000
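Problems 8.13 and 8.36 have the same structure: an unknown number of dice, hypothetical mean 3.5N and process variance N(35/12) for the sum. Here is a Python sketch of solution 8.36; it is my own illustration using the same assumptions (N equal to 11, 12, or 13 with equal probability, observed totals 45, 44, 51, 48, 47).

    # Buhlmann estimate for the sum of an unknown number of dice (solution 8.36).
    counts = [11, 12, 13]                  # equally likely numbers of dice
    rolls = [45, 44, 51, 48, 47]           # observed totals for five throws

    means = [3.5 * n for n in counts]                      # 38.5, 42.0, 45.5
    mu = sum(means) / 3                                    # 42
    vhm = sum(m * m for m in means) / 3 - mu ** 2          # about 8.167
    epv = sum(n * 35 / 12 for n in counts) / 3             # 35

    k = epv / vhm                                          # about 4.29
    z = len(rolls) / (len(rolls) + k)                      # about 0.538
    xbar = sum(rolls) / len(rolls)                         # 47
    print(round(z * xbar + (1 - z) * mu, 1))               # about 44.7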

8.37. A. For Urn 1 the mean is: {(5)(0) + (10)(100) + (5)(500)}/(5 + 10 + 5) = 175.
For Urn 1 the second moment is: {(5)(0²) + (10)(100²) + (5)(500²)}/(5 + 10 + 5) = 67,500.
Therefore, the process variance of Urn 1 is: 67,500 - 175² = 36,875.
For Urn 2, the mean is: {(20)(0) + (8)(100) + (12)(500)}/(20 + 8 + 12) = 170.
For Urn 2, the second moment is: {(20)(0²) + (8)(100²) + (12)(500²)}/(20 + 8 + 12) = 77,000.
Therefore, the process variance of Urn 2 is: 77,000 - 170² = 48,100.
Expected Value of the Process Variance = 42,488.
Variance of the Hypothetical Means = 29,762.5 - 172.5² = 6.25.
K = EPV / VHM = 42,488 / 6.25 = 6798. Thus for N = 2, Z = 2/(2 + 6798) = 0.00029.

Type of    A Priori       Mean for this    Square of Mean      Process
  Urn      Probability    Type of Urn      of this Type Urn    Variance
   1         0.5000           175             30,625.0          36,875
   2         0.5000           170             28,900.0          48,100
Average                       172.5           29,762.5          42,488

8.38. A. 1. True.
2. False. An estimate using credibility is the weighted average of the hypothesis and outcome.
3. False. The total number of accidents follows a Negative Binomial Distribution.
8.39. E. 1. True. 2. True. 3. True.
8.40. E. For the first urn the mean is: (1 + 2 + 3 + 4)/4 = 2.5.
The second moment is: (1² + 2² + 3² + 4²)/4 = 7.5.
Thus the process variance = 7.5 - 2.5² = 1.25.
For the second urn the mean is: (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.
The second moment is: (1² + 2² + 3² + 4² + 5² + 6²)/6 = 15.167.
Thus the process variance = 15.167 - 3.5² = 2.917.
Thus the expected value of the process variance is: (0.5)(1.25) + (0.5)(2.917) = 2.083.

        A Priori Chance    Process     Mean of Ball    Square of Mean
Urn     of This Urn        Variance    From Urn        of Ball From Urn
  I         0.500           1.250          2.5               6.25
 II         0.500           2.917          3.5              12.25
Overall                     2.083          3.00              9.25

The variance of the hypothetical means = 9.25 - 3² = 0.25. K = EPV / VHM = 2.083 / 0.25 = 8.33.
Z = 1/(1 + 8.33) = 10.7%. The a priori estimate is 3 and the observation is 4.
Therefore, the new estimate is: (.107)(4) + (1-.107)(3) = 3.11.
8.41. C. Z = N/(N + K). For N =1, Z = 1/2, therefore K = N{(1/Z) - 1} = 1(2 - 1) = 1.
Therefore for N =3, Z = 3 / (3 + 1) = 3/4.
8.42. D. The variance of the hypothetical means = 0.33667 - 0.5² = 0.08667.

Type of    A Priori Chance of    Process     Mean for 1       Square of Mean of
  Urn      This Type of Urn      Variance    Ball from Urn    1 Ball from Urn
   A            0.333             0.090          0.9               0.81
   B            0.333             0.240          0.4               0.16
   C            0.333             0.160          0.2               0.04
Overall                           0.1633         0.50              0.33667

K = EPV / VHM = .1633 / .08667 = 1.884. For 3 balls, Z = 3/(3 + 1.884) = .614.
The a priori estimate for three balls is (3)(.5) = 1.5 and the observation is 1, so the new estimate is:
(.614)(1) + (1 - .614)(1.5) = 1.193.
8.43. C. For each type of spinner one calculates the mean. For example, for Spinner B it is:
{(3)(0) + (2)(12) + (1)(48)}/6 = 12. One can also compute the 2nd moment for each type of
spinner; for example, for Spinner B it is: {(3)(0²) + (2)(12²) + (1)(48²)}/6 = 432.
Then for each type of Spinner, the process variance is the second moment minus the square of the
mean. For example for Spinner B the process variance is: 432 - 12² = 288.
One weights together the individual process variances to get: EPV = 337.33.
Variance of the Hypothetical Means = 214.6667 - 14² = 18.6667.

Type of    A Priori       Mean for this    Square of Mean          2nd Moment    Process
Spinner    Probability    Type Spinner     of this type Spinner    of Spinner    Variance
   A         0.3333            20                400                  816          416
   B         0.3333            12                144                  432          288
   C         0.3333            10                100                  408          308
Average                        14                214.6667                          337.3333

K = EPV / VHM = 337.33/18.6667 = 18.07. Thus for N =1, Z = 1/(1+18.07) = 5.2%.


The observed mean is 0 and the a priori mean is 14, therefore, the new estimate is:
(0)(.052) + (14)(1 - .052) = 13.3.
8.44. D. The overall mean is 0.525. The second moment of the hypothetical means is 0.3275.

       A Priori       % of Balls    Square of Mean      Process
Urn    Probability    Marked 1      of this type Urn    Variance
 A        0.25          0.3              0.09             0.21
 B        0.25          0.3              0.09             0.21
 C        0.25          0.7              0.49             0.21
 D        0.25          0.8              0.64             0.16
SUM                     0.5250           0.3275           0.1975

Therefore, the Variance of the Hypothetical Means = 0.3275 - 0.525² = 0.051875.


The process variance for a single draw from an Urn is q(1-q), since it is a Bernoulli process. For
example, for Urn D, the process variance is (.8)(1 - .8) = .16.
EPV = (.25)(.21) + (.25)(.21) + (.25)(.21) + (.25)(.16) = .1975.
Then K = EPV /VHM = .1975 /.051875 = 3.807.
For four balls drawn, Z = N /(N + K) = 4/(4 + 3.807) = .5124.
The prior estimate of the average draw is .525. The observed average draw is 2/4 = .5.
Thus the new estimate of the average draw is: (.5124)(.5)+(1-.5124)(.525) = .5122.
For four draws, the new estimate = (4)(.5122) = 2.049.
Comment: Note that we calculate the VHM & EPV for a draw of a single ball.
The Buhlmann Credibility formula automatically adjusts for N = 4.
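As the comment notes, the EPV and VHM in solution 8.44 are computed per ball drawn, and Z = N/(N+K) then handles the four draws. Here is a short Python sketch of that per-draw approach, my own illustration using the four equally likely urns given in the question.

    # Per-draw EPV/VHM with four draws (solution 8.44); each draw is Bernoulli in the value 1.
    p1 = [0.3, 0.3, 0.7, 0.8]              # chance a drawn ball is marked 1, by urn

    mu = sum(p1) / 4                                       # 0.525
    vhm = sum(p * p for p in p1) / 4 - mu ** 2             # 0.051875
    epv = sum(p * (1 - p) for p in p1) / 4                 # 0.1975

    k = epv / vhm                                          # about 3.807
    z = 4 / (4 + k)                                        # four balls drawn
    per_ball = z * (2 / 4) + (1 - z) * mu                  # observed total was 2
    print(round(4 * per_ball, 3))                          # about 2.049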
8.45. C. The variance of the hypothetical means = 0.8 - 0.8² = 0.16.

Type of    A Priori Chance of    Mean of This    Square of Mean of    Second Moment of    Process
  Urn      This Type of Urn      Type of Urn     This Type of Urn     This Type of Urn    Variance
   A            0.500                1.2               1.44                 2.0             0.560
   B            0.500                0.4               0.16                 0.6             0.440
Overall                              0.80              0.80                                 0.50

The second moment of a ball from Urn B is: (0²)(0.7) + (1²)(0.2) + (2²)(0.1) = 0.6.
Thus the process variance of Urn B is: 0.6 - (0.4²) = 0.44. K = EPV / VHM = 0.5 / 0.16 = 3.125.
Z = N /(N + K) = 2/(2 + 3.125) = .39. The a priori estimate for the sum of two balls is (2)(.8) = 1.6
and the observation is 2, so the new estimate is: (.39)(2) + (1 - .39)(1.6) = 1.756.
8.46. C. For one observation Z = 1 / (1+K) = 1/3. Thus K = 2.
For four observations, Z = 4 / (4+K) = 4/6 = 2/3.
8.47. C. Expected Value of the Process Variance = E[v] = 8.
Variance of the Hypothetical Means = Var[m] = 4.
K = EPV / VHM = 8/4 = 2. So, Z = 3 / (3+K) = 3 / (3+2) = 3/5 = 0.6.
8.48. C. The process variance for picking one ball is given for the Bernoulli by q(1-q).

        A Priori       Mean        Square of    Process
Urn     Probability    (1 ball)    Mean         Variance
 A         0.5           0.8         0.64         0.16
 B         0.5           0.3         0.09         0.21
Mean                     0.550       0.365        0.185

Variance of the Hypothetical Means = 0.365 - 0.55² = 0.25² = 0.0625.


Thus, K = .185 / .0625 = 2.96. Z = 2/(2+2.96) = 40.3%. The complement of credibility is given to
the a priori estimate for a pair of balls which is: (2)(.55) = 1.1.
Thus, the New Estimate for a pair of balls = (40.3%)(1) + (59.7%)(1.1) = 1.06.
Comment: One can instead calculate the process variance, VHM, and K for a pair of balls.
Then K = .37 / .25 = 1.48 and Z = 1 / (1+1.48) = 40.3%, thus getting the same result.
8.49. B. Any information about which urn we had chosen which may have been contained in prior
observations is no longer relevant once we make a new random selection of an urn. Therefore our
best estimate is the grand mean (assuming equal probabilities for the two urns) of 1.10.
Comment: Has to be read carefully. Tests basic understanding of an important point.
8.50. A. Let the variance of the hypothetical means = VHM; we will solve for VHM.
K = Expected Value of the Process Variance / Variance of the Hypothetical Means =
9 / VHM. Thus the credibility of one observation is: 1 / (1+(9/VHM)) = VHM / (VHM + 9).
The credibility of three observations is: 3 / (3 + (9/VHM)) = 3VHM / (3VHM + 9).
We are given that 3VHM / (3VHM + 9) = 2{VHM / (VHM + 9)} .
Therefore, 6VHM + 18 = 3 VHM + 27. Therefore VHM = 3.
Comment: Backwards! One is usually given the Variance of the Hypothetical Means and asked to
calculate the credibilities. Note one can first solve for K, which equals 3, and then solve for
VHM = EPV / K = 9/3 = 3.
8.51. D. 1. T. Fewer observations are less valuable all other things being equal.
2. T. When the risks are more similar to each other, the relative value compared to the overall mean
of the observation of an individual risk is less. 3. F. The information value of an individual observation
is increased when the random noise is decreased.
8.52. C. Z = N/(N+K), therefore K = N(1-Z)/Z. If Z = .29 when N = 5, then K = 12.24.
Therefore when N = 2, Z = 2 / (2+12.24) = 0.140.
8.53. D. The Process Variance of Urn #1 is {(1-3)² + (2-3)² + (3-3)² + (4-3)² + (5-3)²} / 5 = 2.

Urn (Type    A Priori Chance    Hypothetical    Square of    Process
of Risk)     of Risk            Mean            Mean         Variance
    1           0.100                3              9            2
    2           0.100                3              9            2
    3           0.100                3              9            2
    4           0.100                3              9            2
    5           0.100                3              9            2
    6           0.100                1              1            0
    7           0.100                2              4            0
    8           0.100                3              9            0
    9           0.100                4             16            0
   10           0.100                5             25            0
Mean                                 3             10            1

Expected Value of the Process Variance = 1.
Variance of the Hypothetical Means = 10 - 3² = 1.
K = EPV /VHM = 1/1 =1. For two observations, Z = 2 / (2+1) = 2/3. The a priori mean is 3 and the
observation is (2+3)/2 = 2.5. The new estimate = (2/3)(2.5) + (1/3)(3) = 2.67.
Comment: Note that due to the particular values in this question it is easy to make a mistake, but still
end up with the correct answer.
8.54. E. Urn A is twice a Bernoulli process with q = 1/2, thus it has a mean of: 2(0.5) = 1, and
variance of: 2²(0.5)(1 - 0.5) = 1. Urn B is t times a Bernoulli process with q = 1/2, thus it has a mean of:
(t)(0.5) = t/2, and variance of: t²(0.5)(1 - 0.5) = t²/4. Thus since the two Urns are equally likely, the
Expected Value of the Process Variance is: (0.5)(1 + t²/4).
The overall mean is: (1 + t/2)/2 = 1/2 + t/4.
The Variance of the Hypothetical Means is:
(1/2)(1/2 + t/4 - 1)² + (1/2)(1/2 + t/4 - t/2)² = (1/4)(1 - t/2)².
K = EPV / VHM and for one observation Z = 1/(1+K), therefore:

t     EPV      VHM        K        Z
1    0.625    0.0625    10.000    9.1%
2    1.000    0.0000      ∞       0
3    1.625    0.0625    26.000    3.7%
4    2.500    0.2500    10.000    9.1%
5    3.625    0.5625     6.444   13.4%

Comment: Since for one observation K = (1/Z) - 1, if Z > 1/10, then K < 9.
This may save some time testing the five cases.
8.55. C. Die A has a mean of 3.5. The second moment is (1² + 2² + 3² + 4² + 5² + 6²) / 6 = 91/6.
Therefore the variance is 91/6 - 3.5² = 35/12 = 2.9167. Die B has a mean of 8.5 and the same
variance as Die A. EPV = (0.5)(35/12) + (0.5)(35/12) = 35/12 = 2.9167.
Variance of the Hypothetical Means = 42.25 - 6² = 6.25.

        A Priori       Mean for    Square of Mean    Process
Die     Probability    this Die    of this Class     Variance
 A        0.5000         3.5           12.25          2.9167
 B        0.5000         8.5           72.25          2.9167
Average                  6.0           42.25          2.9167

K= EPV / VHM = (35/12) / (6.25) = 7/15 = .4667. Thus for N =3, Z = 3/(3+.4667) = 86.5%.
The observed mean is (1+2+3)/3 =2 and the a priori mean is 6.
Therefore, the new estimate = (2)(86.5%) + (6)(13.5%) = 2.54.
Comment: All of the outcomes for Die B are 5 more than those for Die A. Adding a constant to a
variable adds that same constant to the mean and does not alter the variance. Note the contrast in
this case of the Buhlmann Credibility estimate compared to the Bayesian Analysis result. The
observation is only possible if we have chosen Die A. Thus, Bayesian Analysis gives a posterior
estimate of 3.5, the mean of Die A.
8.56. B. n / (n+K) = 1/3 and (n+1)/ (n+1+K) = 2/5. Therefore, 3n = n+K and 5n+5 = 2n+2+2K.
Therefore, K = 2n. Then n =3 and K = 6. Thus n+2 observations have credibility:
Z = (n+2)/(n+2+K) = (3+2) / (3+2+6) = 5/11.
8.57. B.

         Mean           Square of Mean    Process Variance
Urn A      1                  1                  1
Urn B     t/2                t²/4               t²/4
Average   1/2 + t/4          1/2 + t²/8         1/2 + t²/8

Thus the Variance of the Hypothetical Means is: 1/2 + t²/8 - (1/2 + t/4)² = 1/4 - t/4 + t²/16.
Then K = EPV / VHM = (1/2 + t²/8) / (1/4 - t/4 + t²/16).
As t approaches infinity, K approaches: (t²/8) / (t²/16) = 2.
Thus for one observation, Z approaches: 1/(1+2) = 1/3.
Comment: The process variance of Urn B is {(0 - t/2)² + (t - t/2)²}/2 = t²/4.
8.58. C. The (weighted) average of the Bayesian Analysis Estimates is:
(5/12)(3) + (1/6)(4.5) + (5/12)(6) = 9/2.
Since the Bayesian Analysis Estimates are in balance, the a priori mean is 9/2.
The Buhlmann Credibility estimates are thus: (observation - 9/2)Z + 9/2. Since Z is the same for
each of the insureds (one exposure for each, so Z = 1/(1+K)), the Buhlmann
Credibility estimates are also in balance. Thus we want x to be such that:
(x)(5/12) + (3.8)(1/6) + (6.1)(5/12) = 9/2. ⇒ x = 3.18.
Comment: Both the Bayesian and Buhlmann Estimates are in balance.
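The balance property used in solution 8.58 is easy to check numerically: weighting either column of estimates by the a priori probabilities must reproduce the a priori mean. A short Python sketch (my own illustration, using the figures from the problem):

    # Balance check for solution 8.58: weighted estimates must equal the a priori mean.
    probs = [5/12, 1/6, 5/12]              # a priori probabilities of outcomes a, b, c
    bayes = [3.0, 4.5, 6.0]                # Bayesian estimates
    prior_mean = sum(p * e for p, e in zip(probs, bayes))          # 4.5

    # Solve for the missing Buhlmann estimate x so that the Buhlmann column balances too.
    known = (1/6) * 3.8 + (5/12) * 6.1
    x = (prior_mean - known) / (5/12)
    print(prior_mean, round(x, 2))         # 4.5, 3.18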
8.59. B. Define the three types of dice as: Type A: 2@1, 2@2, 2@3
Type B: 2@1, 2@3, 2@5
Type C: All @ 6

Type of    A Priori Chance of    Process     Mean        Square of
  Die       This Type of Die     Variance    Die Roll    Mean Die Roll
   A             0.500            0.667         2             4
   B             0.333            2.667         3             9
   C             0.167            0.000         6            36
Overall                           1.2222        3.000        11.000

The variance of the hypothetical means = 11 - 3² = 2. K = EPV / VHM = 1.2222 / 2 = 0.6111.


Z = 1/(1 + .6111) = .621. The a priori estimate is 3 and the observation is 6, so the new estimate is:
(.621)(6) + (1 - .621)(3) = 4.86.
8.60. C. Define the four types of dice as: Type A : 2@1, 2@2, 2@3
Type B : 2@1, 2@3, 2@5
Type C: All @ 6
Type D: All @ 3
Situation 1:

Type of    A Priori Chance of    Process     Mean        Square of
  Die       This Type of Die     Variance    Die Roll    Mean Die Roll
   A             0.429            0.667         2             4
   B             0.429            2.667         3             9
   C             0.143            0.000         6            36
Overall          1.0000           1.4286        3.000        10.7143

The variance of the hypothetical means = 10.7143 - 3² = 1.7143.
K = EPV / VHM = 1.4286 / 1.7143 = 0.833. Z = 1/(1 + 0.833) = 0.546.
Situation 2:

Type of    A Priori Chance of    Process     Mean        Square of
  Die       This Type of Die     Variance    Die Roll    Mean Die Roll
   A             0.429            0.667         2             4
   B             0.286            2.667         3             9
   C             0.143            0.000         6            36
   D             0.143            0.000         3             9
Overall          1.0000           1.0476        3.0000       10.7143

The variance of the hypothetical means = 10.7143 - 3² = 1.7143.
K = EPV / VHM = 1.0476 / 1.7143 = 0.611. Z = 1/(1 + 0.611) = 0.621.
Situation 3:
Type of    A Priori Chance of    Process     Mean        Square of
  Die       This Type of Die     Variance    Die Roll    Mean Die Roll
   A             0.429            0.667         2             4
   B             0.286            2.667         3             9
   C             0.286            0.000         6            36
Overall          1.0000           1.0476        3.4286       14.5714

The variance of the hypothetical means = 14.5714 - 3.4286² = 2.8161.
K = EPV / VHM = 1.0476 / 2.8161 = 0.372. Z = 1/(1 + 0.372) = 0.729.
Since the Credibility in the previous question is .621, the Credibility is less in Situation 1, the same
in Situation 2, and higher in Situation 3.
Comment: One can just calculate the values of the Buhlmann Credibility Parameter K and check
when K is smaller so that Z will be larger. The situations compare as follows
Situation             EPV       VHM        K        Z
in the Question      1.2222    2.0000    0.611    0.621
       1             1.4286    1.7143    0.833    0.545
       2             1.0476    1.7143    0.611    0.621
       3             1.0476    2.8161    0.372    0.729

In Situation 1, the EPV is higher and the VHM is smaller than in the previous question, each of which
decreases the Credibility. In Situation 2, the EPV is smaller and the VHM is smaller than in the
previous question, which act in different directions on the Credibility. In this case the credibility turns
out to be the same as in the previous question, but it could have been either higher or lower. In
Situation 3, the EPV is lower and the VHM is higher than in the previous question, each of which
increases the Credibility.
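Here is a brief Python sketch (the names are my own, shown only as a check under the same assumptions about the die types) that recomputes K and Z for the original question and for each of the three situations:

# K = EPV/VHM and Z = 1/(1+K) for each mixture of die types (one roll observed).
# Hypothetical means and process variances of the die types:
A, B, C, D = (2, 2/3), (3, 8/3), (6, 0.0), (3, 0.0)   # (mean, process variance)

situations = {
    "question":   [(1/2, A), (1/3, B), (1/6, C)],
    "situation1": [(3/7, A), (3/7, B), (1/7, C)],
    "situation2": [(3/7, A), (2/7, B), (1/7, C), (1/7, D)],
    "situation3": [(3/7, A), (2/7, B), (2/7, C)],
}

for name, mix in situations.items():
    epv = sum(p * pv for p, (m, pv) in mix)
    mu = sum(p * m for p, (m, pv) in mix)
    vhm = sum(p * m**2 for p, (m, pv) in mix) - mu**2
    K = epv / vhm
    Z = 1 / (1 + K)
    print(f"{name}: EPV={epv:.4f} VHM={vhm:.4f} K={K:.3f} Z={Z:.3f}")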
8.61. Let m be the a priori mean. Let x̄ = the average draw observed over the first two rounds,
which consist of three draws in total. Then 1 ≤ x̄ ≤ 3. The Buhlmann-Straub credibility estimate after 3
draws is: (3/(3+k))x̄ + (k/(3+k))m = (3x̄ + km)/(3+k). We are told that the estimates for the average
draw run from 3.8/2 = 1.9 to 5.0/2 = 2.5.
Therefore, when x̄ = 1, the estimate is 1.9 and when x̄ = 3, the estimate is 2.5.
Thus (3 + km)/(3+k) = 1.9 and (9 + km)/(3 + k) = 2.5.
Subtracting the two equations: 6/(3 + k) = 0.6. Therefore k = 7.
Alternately, the smallest observation for the sum of two balls is: (2)(1) = 2, while the largest such
observation is: (2)(3) = 6. Z = (change in estimate)/(change in observation) = (5.0 - 3.8)/(6 - 2) = 0.3.
During the first 2 rounds there are a total of three observations; therefore, Z = 3/(3 + k).
Thus since Z = 0.3: 0.3 = 3/(3+k), so k = 7.

8.62. C. For classical credibility, for 25 claims, Z = √(25/100) = 1/2.
In order to have the Buhlmann Credibility be the same for 25 claims, 25/(25+K) = 1/2, so K = 25.
Therefore for 100 claims the Buhlmann Credibility is: Z = 100/(100 + 25) = 0.80.
Comment: If K were to be put in terms of exposures rather than claims, then K = 25/f, where f is the expected claim frequency per exposure.
100 claims correspond to 100/f exposures. Therefore, for 100 claims the Buhlmann credibility is:
Z = (100/f)/(100/f + 25/f) = 100/(100 + 25) = 0.80.
Here are the Buhlmann Credibility (dashed) and the Classical Credibility (solid):
[Graph: Credibility, from 0 to 1, versus number of Claims, from 0 to 140.]
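As an illustrative sketch (the full credibility standard of 100 claims and K = 25 are taken from this solution; the function names are my own), the two formulas behind the graph can be tabulated in Python:

# Classical (square-root rule, full credibility at 100 claims) versus
# Buhlmann Z = n/(n + 25), matched so that both give Z = 1/2 at 25 claims.
def z_classical(n, full_standard=100):
    return min((n / full_standard) ** 0.5, 1.0)

def z_buhlmann(n, K=25):
    return n / (n + K)

for n in [0, 10, 25, 50, 100, 140]:
    print(n, round(z_classical(n), 3), round(z_buhlmann(n), 3))
# At 25 claims both equal 0.5; at 100 claims classical reaches 1.0 while Buhlmann is 0.8.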
8.63. A. The Buhlmann estimates should lie on a straight line, but in Graph E they do not,
eliminating graph E.
The Buhlmann estimates are the weighted least squares linear approximation to those of Bayes
Analysis. However, in graph C, the Buhlmann estimates are always lower than the Bayes
estimates. Thus graph C is eliminated.
The Bayes estimates must remain within the range of hypotheses, in this case 1 to 4, eliminating
graphs B & D.
The a priori mean is between 1 and 4. The Buhlmann estimate is always between the observation
and the a priori mean. Therefore, the Buhlmann estimate for an observation of 1 must be at least 1.
In graph C, for an observation of 1 the Buhlmann estimate is below 1.
Another reason to eliminate graph C.
In graph C, the slope of the Buhlmann line is about: (3.3 - 0.5)/8 = 0.35 = Z.
In graph C, the intercept of the Buhlmann line is about:
0.5 = (1 - Z)µ, so µ = 0.5/(1 - 0.35) = 0.77.
However, µ, the a priori overall mean, should be between 1 and 4, also eliminating graph C.
Comment: The slope of the straight line formed by the Buhlmann estimates is Z.
For ordinary situations, 0 < Z < 1.
Thus this slope must be positive and less than 1, which is true for graphs A to D.
8.64. D. The a priori mean is: (4)(600) = 2400.
The mean for observed Years 1 and 2 is: (1400 + 1900)/2 = 1650.
Therefore, (Z)(1650) + (1 - Z)(2400) = 1800, so Z = 0.8.
2/(2+K) = 0.8, so K = 1/2.
For three years of data, the observed mean is: (1400 + 1900 + 2763)/3 = 2021,
and Z = 3/(3 + 1/2) = 6/7.
(6/7)(2021) + (1/7)(2400) = 2075.
8.65. E. (A) Buhlmann is not a linear approximation to the Bayesian since the Buhlmann estimate is
always less than the corresponding Bayesian estimate. Also the Bayesian estimates go outside the
range of hypotheses, which is (6)(.1) = 0.6 to (6)(0.6) = 3.6.
(B) Buhlmann is not a linear approximation to the Bayesian since the Buhlmann estimate is always
greater than the corresponding Bayesian estimate.
(C) The Bayesian estimates go outside the range of hypotheses, which is (6)(.1) = 0.6 to
(6)(0.6) = 3.6.
(D) The Buhlmann estimates are not on a straight line.
Comment: For graph E, we can estimate the slope as about (4.3 - 0.6)/6 = 62% and the intercept
as about 0.7. Thus, Z ≈ 62%, and the a priori mean ≈ 0.7/(1 - 62%) = 1.84.

8.66. A. 0.25 = Z = 2/(2 + K), so K = 6.
Process Variance = variance of a Gamma Distribution = αθ².
EPV = E[αθ²] = θ²E[α] = 50θ².
Hypothetical Mean = mean of a Gamma Distribution = αθ.
VHM = Var[αθ] = θ²Var[α].
6 = K = EPV/VHM = 50θ²/(θ²Var[α]) = 50/Var[α]. Therefore Var[α] = 50/6 = 8.33.
Comment: Here the Gamma is the distribution of annual aggregate losses, rather than a distribution
of severity.


Section 9, Buhlmann Credibility, Discrete Risk Types


Buhlmann Credibility will be applied to situations involving frequency, severity, pure premiums, or
aggregate losses.
A Series of Examples:
In a previous section, the following information was used in a series of examples involving the
frequency, severity, and pure premium:
        Portion of Risks    Bernoulli (Annual)         Gamma Severity
Type    in this Type        Frequency Distribution     Distribution
1       50%                 q = 40%                    α = 4, θ = 100
2       30%                 q = 70%                    α = 3, θ = 100
3       20%                 q = 80%                    α = 2, θ = 100
We assume that the types are homogeneous; i.e., every insured of a given type has the same
frequency and severity process. Assume that for an individual insured, frequency and severity are
independent.
Using the Expected Value of the Process Variance and the Variance of the Hypothetical Means
computed in a previous section, one can compute the Buhlmann Credibility Parameter in each case.
An insured is picked at random of an unknown type.109 For this randomly selected insured during 4
years one observes 3 claims for a total of $450.110 Use Buhlmann Credibility to predict the future
frequency, severity, or pure premium of this insured.
Frequency Example:
As computed in the previous section, the EPV of the frequency = 0.215, while the variance of the
hypothetical mean frequencies = 0.0301.
Thus the Buhlmann Credibility parameter is: K = EPV / VHM = 0.215 / 0.0301 = 7.14.
Thus 4 years of experience are given a credibility of: 4/(4+K) = 4/11.14 = 35.9%.
The observed frequency is 3/4 = 0.75. The a priori mean frequency is 0.57.
The estimate of the future frequency for this insured is: (0.359)(0.75) + (1 - 0.359)(0.57) = 0.635.
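As a check, here is a minimal Python sketch (variable names are my own) that reproduces the EPV of 0.215, the VHM of 0.0301, and the estimated future frequency of 0.635:

# Bernoulli frequency mixture: EPV, VHM, K, and the credibility-weighted estimate.
types = [(0.50, 0.40), (0.30, 0.70), (0.20, 0.80)]   # (weight, annual q)

epv = sum(w * q * (1 - q) for w, q in types)          # 0.215
mean_freq = sum(w * q for w, q in types)              # 0.57
vhm = sum(w * q**2 for w, q in types) - mean_freq**2  # 0.0301
K = epv / vhm                                         # 7.14
Z = 4 / (4 + K)                                       # 35.9% for 4 years
estimate = Z * (3 / 4) + (1 - Z) * mean_freq          # 0.635
print(round(K, 2), round(Z, 3), round(estimate, 3))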
109

If one knew which type the insured was, one would use the expected value for that type to estimate the future
frequency, severity, or pure premium.
110
Unlike the Bayesian Analysis case, even if one were given the separate claim amounts, the Buhlmann Credibility
estimate of severity only makes use of the sum of the claim amounts, or equivalently the average.

Severity Example:
As computed in the previous section, the EPV of the severity = 30,702, while the variance of the
hypothetical mean severities = 6265.
Thus the Buhlmann Credibility parameter is K = EPV / VHM = 30,702 / 6265 = 4.90.
Thus 3 observed claims are given a credibility of 3/(3+K) = 3/7.9 = 38.0%.111 The observed mean
severity is: $450/3 = $150. The a priori mean severity is $307. Thus the estimate of the future
severity for this insured is: (0.380)(150) + (1 - 0.380)(307) = $247.3.
Two cases for Severities:
Assume there are two types of risks that are equally likely.
Class 1 has a mean frequency of 10% and an Exponential Severity with mean 5.
Class 2 has a mean frequency of 20% and an Exponential Severity with mean 8.
As computed in the previous section, EPV = 51 and VHM = 2.
Therefore, K = 51/2 = 25.5.
If the types do not differ in their frequencies, then as computed in the previous section,
EPV = 44.50 and VHM = 2.25. Therefore, K = 44.50/2.25 = 19.8, rather than 25.5.
Pure Premium Example:
As computed in the previous section, the EPV of the pure premium is 43,650, while the variance of
the hypothetical mean pure premiums is 525.
Thus the Buhlmann Credibility parameter is: K = EPV / VHM = 43,650 / 525 = 83.1.
Thus 4 years of experience are given a credibility of: 4/(4+K) = 4/87.1 = 4.6%. The observed pure
premium is $450/4 = $112.5. The a priori mean pure premium is $175. Thus the estimate of the
future pure premium for this insured is: (0.046)(112.5) + (1 - 0.046)(175) = $172.
Note that this estimate of the future pure premium is not equal to the product of our previous
estimates of the future frequency and severity. (0.635)($247.3) = $157 $172. In general, one
does not get the same result if one uses credibility to make separate estimates of the frequency and
severity instead of directly estimating the pure premium. Therefore, carefully read exam questions
involving credibility estimates of the pure premium, to see which of the two methods one is
expected to use.
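Here is a short Python sketch (my own naming; it assumes the Bernoulli/Gamma setup tabulated above) that computes the pure premium estimate directly and shows that it differs from the product of the separate frequency and severity estimates:

# Pure premium credibility for the Bernoulli/Gamma example, per exposure (year).
# (weight, q, alpha); every type has theta = 100.
types = [(0.50, 0.40, 4), (0.30, 0.70, 3), (0.20, 0.80, 2)]
theta = 100

def pp_moments(q, alpha):
    sev_mean, sev_2nd = alpha * theta, alpha * (alpha + 1) * theta**2
    mean = q * sev_mean                        # hypothetical mean pure premium
    var = q * sev_2nd - (q * sev_mean)**2      # process variance (compound Bernoulli)
    return mean, var

epv = sum(w * pp_moments(q, a)[1] for w, q, a in types)              # 43,650
mu = sum(w * pp_moments(q, a)[0] for w, q, a in types)               # 175
vhm = sum(w * pp_moments(q, a)[0]**2 for w, q, a in types) - mu**2   # 525
K = epv / vhm                                                        # 83.1
Z = 4 / (4 + K)                                                      # 4.6% for 4 years
estimate = Z * (450 / 4) + (1 - Z) * mu                              # about 172
print(round(K, 1), round(Z, 3), round(estimate, 1))
# Compare with the product of the separate estimates: 0.635 * 247.3 = 157, not 172.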

111

Note that the number of observed claims is used to determine the Buhlmann credibility of the severity.

Exposures:
In exam questions, it is common that one policyholder observed for one year is one exposure.
In those situations, one policyholder for three years is three exposures, nine policyholders for one
year is nine exposures,112 and four policyholders observed for five years is 20 exposures.
For example, in automobile insurance, exposures are measured in car-years. If one observes 100
cars in each of three years, then one observes a total of 300 exposures.
A commercial automobile policyholder may have a different large number of vehicles insured each
year. In which case, one adds up the automobiles from each year in order to get the total exposures.
The same would be done for the members of a group health insurance policy.
Exercise: For a group health insurance policy, you observe the following number of employees in
Years 1, 2 and 3 respectively: 800, 600, 400.113 How many exposures are there in total?
[Solution: 800 + 600 + 400 = 1800.]
While the unit of time is usually a year, occasionally it is something different such as a month.
In that case, one insured for one month is one exposure.114
Buhlmann and Bayes Each Lack a Nice Property the Other Has:
There are two types of insureds, equally likely.
Each insured has a Bernoulli frequency.
Type A has a mean frequency of 2%.
Type B has a mean frequency of 50%.
Exercise: Determine the Buhlmann Credibility Parameter, K.
[Solution: Overall mean is: (50%)(0.02) + (50%)(0.5) = 0.26.
Second Moment of the Hypothetical Means is: (50%)(0.022 ) + (50%)(0.52 ) = 0.1252.
Variance of the Hypothetical Means is: 0.1252 - 0.262 = 0.0576.
Expected Value of the Process Variance is: (50%)(0.02)(1 - 0.02) + (50%)(0.5)(1 - 0.5) = 0.1348.
K = EPV / VHM = 0.1348/0.0576 = 2.34.]
Exercise: An insured is picked at random. This insured has 2 claims in 2 years.
Use Buhlmann credibility to estimate the future annual claim frequency for this insured.
[Solution: Z = 2/(2 + K) = 2/4.34 = 46.1%.
Estimate is: (46.1%)(2/2) + (1 - 46.1%)(0.26) = 0.601.]
112
113
114

See 4, 11/00, Q.38.


See 4, 11/01, Q.26.
See 4, 11/03, Q.27.

In this case, the range of hypotheses is from 2% to 50%. However, the estimated future frequency
using Buhlmann Credibility of 60.1% is outside that range of hypotheses. The estimate from
Buhlmann Credibility can be outside the range of hypotheses, since it is a linear estimator which
approximates the Bayes result. As discussed previously, in contrast, the estimate from Bayes
Analysis is always within the range of hypotheses, since it is a weighted average of the hypothetical
means.
Exercise: An insured is picked at random. This insured has 2 claims in 10 years.
Use Bayes Analysis to estimate the future annual claim frequency for this insured.
[Solution: The probability of the observation is: C(10,2) q²(1-q)⁸ = 45 q²(1-q)⁸.

Type     A Priori       q      Chance of the   Probability   Posterior Chance       Mean Annual
         Probability           Observation     Weight        of This Type of Risk   Freq.
1        50%            0.02   0.01531         0.007657      25.84%                 0.02
2        50%            0.50   0.04395         0.021973      74.16%                 0.50
Overall                                        0.029630      1.000                  0.376
]
In this case, the observed frequency is 20% and the a priori mean frequency is 26%. However, the
estimated future frequency using Bayes Analysis of 37.6% is not between the observation and the
a priori mean. The estimate from Bayes Analysis is not necessarily between the observation and
the a priori mean. As discussed previously, in contrast, the estimate from Buhlmann Credibility is
always between the observation and the a priori mean, since it is a weighted average of these two
items.
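The two exercises above can be verified with a brief Python sketch (my own naming, a check rather than anything from the syllabus):

from math import comb

types = [(0.5, 0.02), (0.5, 0.50)]          # (a priori probability, Bernoulli q)
prior_mean = sum(p * q for p, q in types)   # 0.26

# Buhlmann: 2 claims in 2 years.
epv = sum(p * q * (1 - q) for p, q in types)                      # 0.1348
vhm = sum(p * q**2 for p, q in types) - prior_mean**2             # 0.0576
Z = 2 / (2 + epv / vhm)
buhlmann = Z * (2 / 2) + (1 - Z) * prior_mean                     # 0.601 > 0.50
print(round(buhlmann, 3))   # outside the range of hypotheses [0.02, 0.50]

# Bayes: 2 claims in 10 years.
likelihoods = [comb(10, 2) * q**2 * (1 - q)**8 for _, q in types]
weights = [p * L for (p, _), L in zip(types, likelihoods)]
posterior = [w / sum(weights) for w in weights]
bayes = sum(post * q for post, (_, q) in zip(posterior, types))   # 0.376
print(round(bayes, 3))      # not between the observation 0.20 and the prior mean 0.26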
Loss Models, Notation and Terminology:
a = VHM, v = EPV, µ = collective premium = a priori mean.
Credibility premium ⇔ estimate using credibility.
k = Buhlmann Credibility Parameter.
Z = credibility ⇔ Buhlmann credibility factor ⇔ Bühlmann-Straub credibility factor.
Buhlmann credibility premium ⇔ Bühlmann-Straub credibility premium ⇔ estimate using Buhlmann Credibility.
Greatest Accuracy Credibility ⇔ Buhlmann Credibility.
Sequential Approach:
Assume the observations come in a sequence, first x1 , then x2 , then x3 , etc.
Let the initial estimate be y0 . Then let y1 = the estimate after observing x1 , y2 = the estimate after
observing x1 and x2 , etc. Then one standard estimation method would be to let
y n = yn-1 + an (xn - yn-1), where an = nth gain factor, 0 < an < 1. In other words, the new estimate
after an observation is the most recent estimate plus some fraction of the difference between the
latest observation and the most recent estimate. It turns out the use of Buhlmann Credibility is a
special case of this method, with an = 1/(n+K), and y0 = m = a priori mean.
Exercise: Show that for the use of Buhlmann Credibility, an = (yn - yn-1)/(xn - yn-1) = 1/(n+K).
[Solution: Let Xn-bar = the average of the first n observations, and let Sn = x1 + x2 + ... + xn.
yn = Z Xn-bar + (1-Z)m = (n Xn-bar + Km)/(n+K) = (Sn + Km)/(n+K).
yn - yn-1 = {(Sn + Km)(n+K-1) - (Sn-1 + Km)(n+K)} / {(n+K)(n+K-1)}
= {n xn + K xn - Sn - Km} / {(n+K)(n+K-1)}.
xn - yn-1 = xn - (Sn-1 + Km)/(n+K-1) = {n xn + K xn - xn - Sn-1 - Km} / (n+K-1)
= {n xn + K xn - Sn - Km} / (n+K-1) = (yn - yn-1)(n+K).
Thus an = (yn - yn-1)/(xn - yn-1) = 1/(n+K). ]
For example, let m = 10, K = 5, and the data be: 7, 15, 12, 8. Then y0 = 10.
y 1 = 10 + (1/6)(7 - 10) = 9.5. y2 = 9.5 + (1/7)(15 - 9.5) = 10.286.
y 3 = 10.286 + (1/8)(12 - 10.286) = 10.500. y4 = 10.500 + (1/9)(8 - 10.500) = 10.222.
If instead we used all the data at once, X = 10.5, Z = 4/(4+5) = 4/9, and
the estimate using Buhlmann Credibility = 10.5(4/9) + (10)(5/9) = 92/9 = 10.222, the same
estimate as obtained using the sequential approach.
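A minimal Python sketch of the sequential method (my own naming), reproducing 10.222 both recursively and all at once:

# Sequential (recursive) form of the Buhlmann estimate: y_n = y_{n-1} + (x_n - y_{n-1})/(n + K).
def sequential(data, m, K):
    y = m
    for n, x in enumerate(data, start=1):
        y += (x - y) / (n + K)
    return y

data, m, K = [7, 15, 12, 8], 10, 5
print(round(sequential(data, m, K), 3))                      # 10.222

# Same answer using all of the data at once.
Z = len(data) / (len(data) + K)
print(round(Z * (sum(data) / len(data)) + (1 - Z) * m, 3))   # 10.222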
As n → ∞, an = 1/(n+K) → 0. As we get more data, the method is less responsive to the
difference between the latest observation and the most recent estimate.

Problems:
9.1 (2 points) You are given:
(i) The annual number of claims on a given policy has the Geometric distribution with parameter β.
(ii) One-third of the policies have β = 2, and the remaining two-thirds have β = 5.
A randomly selected policy had ten claims in Year 1. Using Buhlmann Credibility determine the
expected number of claims for the selected policy in Year 2.
(A) 4.3      (B) 4.4      (C) 4.5      (D) 4.6      (E) 4.7
9.2 (3 points) You are given:
(i) Size of loss follows a LogNormal distribution with µ = 5.
(ii) For half of the company's policies σ = 1.0, while for the other half σ = 1.5.
For a randomly selected policy, you observe 5 losses of sizes: 50, 100, 150, 200, and 250.
Using Buhlmann Credibility determine the expected size of the next loss from the selected policy.
(A) 325      (B) 330      (C) 335      (D) 340      (E) 345
9.3 (2 points) The aggregate loss distributions for three risks for one exposure period are as
follows:
                 Aggregate Losses
Risk      $0        $100      $500
A         0.80      0.10      0.10
B         0.60      0.20      0.20
C         0.30      0.50      0.20
A risk is selected at random and is observed to have $500 of aggregate losses in the first exposure
period. Determine the Buhlmann Credibility estimate of the expected value of the aggregate losses
for the same risk's second exposure period.
A. $120      B. $130      C. $140      D. $150      E. $160
9.4 (3 points) You are given:
(i) Losses on a company's insurance policies follow a Pareto distribution with probability
density function: f(x | θ) = 4θ⁴ / (x + θ)⁵, 0 < x < ∞.
(ii) For half of the company's policies θ = 1, while for the other half θ = 3.
For a randomly selected policy, losses in Year 1 were 5.
Using Buhlmann Credibility determine the expected losses for the selected policy in Year 2.
(A) 0.95      (B) 1.00      (C) 1.05      (D) 1.10      (E) 1.15

Use the following information for the following 7 questions:
There are three types of drivers with the following characteristics:
Type    Portion of Drivers    Poisson Annual      Pareto
        of This Type          Claim Frequency     Claim Severity
Good    50%                   3%                  α = 6, θ = 1000
Bad     30%                   5%                  α = 5, θ = 1000
Ugly    20%                   10%                 α = 4, θ = 1000
For any individual driver, frequency and severity are independent.


9.5 (3 points) A driver is observed to have over a five year period a single claim.
Use Buhlmann Credibility to predict this drivers future annual claim frequency.
A. 5.5%
B. 6.0%
C. 6.5%
D. 7.0%
E. 7.5%
9.6 (2 points) What is the expected value of the process variance of the claim severities (for the
observation of a single claim)?
A. less than 130,000
B. at least 130,000 but less than 140,000
C. at least 140,000 but less than 150,000
D. at least 150,000 but less than 160,000
E. at least 160,000
9.7 (2 points) What is the variance of the hypothetical mean severities (for the observation of a
single claim)?
A. less than 2000
B. at least 2000 but less than 3000
C. at least 3000 but less than 4000
D. at least 4000 but less than 5000
E. at least 5000
9.8 (2 points) Over several years, for an individual driver you observe a single claim of size $2500.
Use Buhlmann credibility to estimate this drivers future average claim severity.
A. less than $325
B. at least $325 but less than $335
C. at least $335 but less than $345
D. at least $345 but less than $355
E. at least $355

9.9 (3 points) What is the expected value of the process variance of the pure premiums (for the
observation of a single exposure)?
A. less than 11,000
B. at least 11,000 but less than 12,000
C. at least 12,000 but less than 13,000
D. at least 13,000 but less than 14,000
E. at least 14,000
9.10 (2 points) What is the variance of the hypothetical mean pure premiums (for the observation of
a single exposure)?
A. less than 70
B. at least 70 but less than 80
C. at least 80 but less than 90
D. at least 90 but less than 100
E. at least 100
9.11 (2 points) A driver is observed to have over a five year period a total of $2500 in losses.
Use Buhlmann Credibility to predict this drivers future pure premium.
A. less than $25
B. at least $25 but less than $30
C. at least $30 but less than $35
D. at least $35 but less than $40
E. at least $40

9.12 (3 points) There are three types of risks. Assume 60% of the risks are of Type A, 25% of the
risks are of Type B, and 15% of the risks are of Type C.
Each risk has either one or zero claims per year.
Type of Risk    Chance of a Claim    A Priori Chance of Type of Risk
A               20%                  60%
B               30%                  25%
C               40%                  15%
A risk is selected at random, and you observe 4 claims in 9 years.
Using Buhlmann Credibility, what is the estimated future claim frequency for that risk?
A. Less than 0.29
B. At least 0.29 but less than 0.30
C. At least 0.30 but less than 0.31
D. At least 0.31 but less than 0.32
E. At least 0.32

9.13 (3 points) Each taxicab has either have zero or one accident per month. Taxicabs are of two
equally common types, with mean accident frequency of either 1% or 2% per month.
Over the last 3 years, Deniro Taxis has had 10 cabs all of the same type.
Over the last 3 years, Deniro Taxis had 4 accidents.
In the future Deniro Taxis will have 12 cabs.
Use Buhlmann credibility to predict how many accidents Deniro Taxis will have over the coming 3
years.
A. 5.0
B. 5.2
C. 5.4
D. 5.6
E. 5.8
9.14 (3 points) You are given:
(i) Size of loss follows a Gamma Distribution with α = 4.
(ii) Three quarters of the company's policies have θ = 10, while the other quarter have θ = 12.
For a randomly selected policy, you observe 8 losses of sizes:
10, 20, 25, 30, 35, 40, 40, and 50.
Using Buhlmann Credibility determine the expected size of the next loss from the selected policy.
(A) 39.0      (B) 39.5      (C) 40.0      (D) 40.5      (E) 41.0
9.15 (3 points) You are given the following information about two classes of risks:

Risks in Class A have a Poisson frequency with a mean of 0.6 per year.
Risks in Class B have a Poisson frequency with a mean of 0.8 per year.
Risks in Class A have an Exponential severity distribution with a mean of 11.
Risks in Class B have an Exponential severity distribution with a mean of 15.
Class A has three times the number of risks in Class B.
Within each class, severities and claim counts are independent.
A risk is randomly selected and observed to have three claims during one year.
The observed claim amounts were: 7, 10, and 21.
Using Buhlmann Credibility, estimate the annual losses for next year for this risk.
(Do not make separate estimates of frequency and severity.)
(A) 8.0
(B) 8.2
(C) 8.4
(D) 8.6
(E) 8.8
9.16 (2 points) You are given the following information about three types of insureds:

60% of insureds are Type 1, 30% are Type 2, and 10% are Type 3.
Insureds in Class 1 have a Binomial frequency with m = 10 and q = 0.1.
Insureds in Class 2 have a Binomial frequency with m = 10 and q = 0.2.
Insureds in Class 3 have a Binomial frequency with m = 10 and q = 0.4.
An insured is randomly selected and observed to have five losses during one year.
Using Buhlmann Credibility, estimate the number of losses next year for this risk.
(A) 2.0
(B) 2.5
(C) 3.0
(D) 3.5
(E) 4.0

9.17 (3 points) You are given the following:

A portfolio consists of 150 independent risks.

100 of the risks each have a policy with a $100,000 per claim policy limit,
and 50 of the risks each have a policy with a $1,000,000 per claim policy limit.

The risks have identical claim count distributions.

Prior to censoring by policy limits, the claim size distribution for each risk is as follows:
Claim Size      Probability
$10,000         1/2
$50,000         1/4
$100,000        1/5
$1,000,000      1/20

A claims report is available which shows actual claim sizes incurred for each policy
after censoring by policy limits, but does not identify the policy limit
associated with each policy.
The claims report shows exactly three claims for a policy selected at random. Two of the claims are
$100,000, but the amount of the third is illegible. Use Buhlmann Credibility to estimate the value of
this illegible number.
A. Less than $61,000
B. At least $61,000, but less than $62,000
C. At least $62,000, but less than $63,000
D. At least $63,000, but less than $64,000
E. At least $64,000
9.18 (2 points) You assume the following information on claim frequency for individual automobile
drivers:
Type of Driver    Portion of Drivers    Expected Claims    Claim Variance
A                 20%                   0.02               0.03
B                 50%                   0.05               0.06
C                 30%                   0.10               0.15
Determine the Bühlmann credibility factor for one year of experience of a single driver selected at
random from the population, if its type is unknown.
(A) 0.5%      (B) 1.0%      (C) 1.5%      (D) 2.0%      (E) 2.5%
9.19 (3 points) The number of claims incurred each year is 2, 3, 4, 5, or 6, with equal probability.
If there is a claim, there is a 65% chance it will be reported to the insurer by year end, independent
of any other claims.
There are 3 claims incurred during 2003 that are reported by the end of year 2003.
Use Buhlmann Credibility in order to estimate the number of claims incurred during 2003.
(A) 4.30
(B) 4.35
(C) 4.40
(D) 4.45
(E) 4.50

2013-4-9 Buhlmann Cred. 9 Buhl. Cred. Discrete Risk Types, HCM 10/19/12, Page 404
9.20 (3 points) You are given the following joint distribution:

0.1

0.0

0.1

0.3

25

0.1

0.4

For a given value of , a sample of size 30 for X sums to 510.


Determine the Bhlmann credibility premium.
(A) 15.5
(B) 15.7
(C) 15.9
(D) 16.1

(E) 16.3

9.21 (3 points) You are given the following:

The number of claims incurred each year is Negative Binomial with r = 2 and = 1.6.
If there is a claim, there is a 70% chance it will be reported to the insurer by year end.
The chance of a claim being reported by year end is independent of the reporting
of any other claim, and is also independent of the number of claims incurred.
There are 5 claims incurred during 2003 that are reported by the end of year 2003.
Use Buhlmann Credibility in order to estimate the number of claims incurred during 2003.
A. 6.0
B. 6.2
C. 6.4
D. 6.6
E. 6.8
9.22 (2 points) An insurer writes a large book of policies. You are given the following information
regarding claims filed by insureds against these policies:
(i) A maximum of one claim may be filed per year.
(ii) The probability of a claim varies by insured, and the claims experience for each
insured is independent of every other insured.
(iii) The probability of a claim for each insured remains constant over time.
(iv) The overall probability of a claim being filed by a randomly selected insured in a year is 0.12.
(v) The variance of the individual insured claim probabilities is 0.03.
An insured selected at random is found to have filed 2 claims over the past 8 years.
Determine the Bhlmann credibility estimate for the expected number of claims the selected insured
will file over the next 3 years.
(A) 0.55
(B) 0.60
(C) 0.65
(D) 0.70
(E) 0.75

9.23 (3 points) You are given the following information:
• There are three types of risks.
• The types are homogeneous; every risk of a given type has the same Exponential severity process:
Type    Portion of Risks in This Type    Average Claim Size
1       70%                              25
2       20%                              40
3       10%                              50
A risk is picked at random and we do not know what type it is.
For this randomly selected risk, over 5 years there are 3 claims of sizes: 30, 40, and 70.
Use Bühlmann Credibility to predict the future average claim size of this same risk.
A. Less than 35
B. At least 35, but less than 36
C. At least 36, but less than 37
D. At least 37, but less than 38
E. 38 or more
9.24 (3 points) For a portfolio of insurance risks, aggregate losses per year per exposure follow a
distribution with mean µ and coefficient of variation 1.4, with µ varying by class as follows:
Class    µ    Percent of Risks in Class
X        3    50%
Y        4    30%
Z        5    20%
A randomly selected risk has the following experience over three years:
Year    Number of Exposures    Aggregate Losses
1       124                    403
2       103                    360
3       98                     371
Assuming 100 exposures in Year 4, calculate the Bühlmann-Straub estimate of the mean aggregate
losses in Year 4 for this risk.
A. Less than 350
B. At least 350, but less than 355
C. At least 355, but less than 360
D. At least 360, but less than 365
E. 365 or more

9.25 (3 points) You are given the following information:
• There are two types of risks.
• The types are homogeneous; every risk of a given type has the same Exponential severity process:
Type    Portion of Risks in This Type    Average Claim Size in Year 1
1       75%                              200
2       25%                              300
• Inflation is 5% per year.
A risk is picked at random and we do not know what type it is.
For this randomly selected risk, the experience is as follows:
In Year 1 there are two claims of sizes 20 and 100.
In Year 2 there are no claims.
In Year 3 there are two claims of sizes 50 and 400.
Use Bühlmann Credibility to predict the average claim size of this same risk in Year 4.
A. Less than 235
B. At least 235, but less than 240
C. At least 240, but less than 245
D. At least 245, but less than 250
E. 250 or more
9.26 (4 points) For a portfolio of insurance risks, aggregate losses in 2005 follow a
LogNormal Distribution with parameters σ = 1.5 and µ varying by type:
Class    µ    Percent of Risks in Class
1        8    50%
2        9    50%
A randomly selected risk has the following experience:
Year    Aggregate Losses
2005    32,000
2006    29,000
2007    37,000
Inflation is 10% per year.
Estimate the mean aggregate losses in 2008 for this risk using Buhlmann credibility.
A. Less than 24,000
B. At least 24,000, but less than 25,000
C. At least 25,000, but less than 26,000
D. At least 26,000, but less than 27,000
E. 27,000 or more

Use the following information for the next two questions:
Annual claim counts for each policyholder follow a Negative Binomial distribution with r = 3.
Half of the policyholders have = 1.
The other half of the policyholders have = 2.
A policyholder had 7 claims in one year.
9.27 (2 points)
Using Buhlmann Credibility, estimate the future claim frequency for this policyholder.
A. 4.8
B. 4.9
C. 5.0
D. 5.1
E. 5.2
9.28 (3 points)
Using Bayes Analysis, estimate the future claim frequency for this policyholder.
A. 4.8
B. 4.9
C. 5.0
D. 5.1
E. 5.2

9.29 (4 points) Severity is LogNormal.
µ is equally likely to be 6 or 7.
σ is equally likely to be 0.5 or 1.
µ and σ are distributed independently of each other.
Determine the Buhlmann Credibility Parameter K.
A. 4      B. 6      C. 8      D. 10      E. 12

9.30 (2 points) You are given:


(i) Each risk has at most one claim each year.
(ii)
Type of Risk
Prior Probability
Annual Claim Probability
I
0.7
0.1
II
0.2
0.2
III
0.1
0.4
One randomly chosen risk has three claims during Years 1-6.
Use Buhlmann Credibility in order to estimate the probability of a claim for this risk in Year 7.
(A) 0.25
(B) 0.28
(C) 0.31
(D) 0.34
(E) 0.37

Use the following information for the following 6 questions:
There are two types of insureds with the following characteristics:
Type    Portion of Insureds    Annual Claim Frequency     Claim Severity
        of This Type           per Exposure
1       40%                    Bernoulli q = 0.03         Gamma α = 6 and θ = 100
2       60%                    Bernoulli q = 0.06         Gamma α = 4 and θ = 100
For a particular policyholder you observe the following experience over four years:
Year    Exposures    Number of Claims    Claim Sizes
2005    20           2                   500, 1000
2006    25           0                   ---
2007    30           1                   300
2008    25           1                   800
9.31 (3 points)
Using Buhlmann Credibility, estimate the future claim frequency for this policyholder.
A. 4.1%
B. 4.3%
C. 4.5%
D. 4.7%
E. 4.9%
9.32 (3 points)
Using Bayes Analysis, estimate the future claim frequency for this policyholder.
A. 3.2%
B. 3.4%
C. 3.6%
D. 3.8%
E. 4.0%
9.33 (3 points)
Using Buhlmann Credibility, estimate the future average severity for this policyholder.
A. 470
B. 500
C. 530
D. 560
E. 590
9.34 (3 points)
Using Bayes Analysis, estimate the future average severity for this policyholder.
A. 500
B. 520
C. 540
D. 560
E. 580
9.35 (3 points) Assuming 40 exposures in 2009, using Buhlmann Credibility,
estimate the aggregate losses for this policyholder in 2009.
A. 820
B. 840
C. 860
D. 880
E. 900
9.36 (3 points) Assuming 30 exposures in 2010, using Bayes Analysis,
estimate the aggregate losses for this policyholder in 2010.
A. 560
B. 580
C. 600
D. 620
E. 640

Use the following information for the next two questions:
• Each driver can not have 2 or more accidents per year.
• Two groups of drivers comprise equal proportions of the insured population.
One group has a 5% annual frequency, while the other has a 10% annual frequency.
• Every driver has the following accident severity distribution:
Probability    Size of Loss
60%            1000
30%            3000
10%            5000

9.37 (2 points) What credibility would be given to an insured's number of accidents over three
years using Buhlmann's Credibility Formula?
A. Less than 1.5%
B. At least 1.5%, but less than 2.0%
C. At least 2.0%, but less than 2.5%
D. At least 2.5%, but less than 3.0%
E. 3.0% or more
9.38 (3 points) What credibility would be given to an insured's aggregate losses over three years
using Buhlmann's Credibility Formula?
A. Less than 1.5%
B. At least 1.5%, but less than 2.0%
C. At least 2.0%, but less than 2.5%
D. At least 2.5%, but less than 3.0%
E. 3.0% or more
9.39 (4, 5/87, Q.40) (3 points) There are two classes of insureds in a given population.
Each insured has either no claims or exactly one claim in one experience period.
For each insured the distribution of the number of claims is binomial. The probability of a claim in one
experience period is 0.20 for Class 1 insureds and 0.30 for Class 2.
The population consists of 40% Class 1 insureds and 60% for Class 2.
An insured is selected at random without knowing the insured's class.
What credibility would be given to this insured's experience for five experience periods using
Buhlmann's Credibility Formula?
A. Less than 0.06
B. At least 0.06, but less than 0.08
C. At least 0.08, but less than 0.10
D. At least 0.10, but less than 0.12
E. 0.12 or more

9.40 (4, 5/90, Q.56) (3 points) Employ a Buhlmann credibility estimate that uses pure premium
rather than using frequency or severity separately.
Consider a group of insureds described by the following.
1. An individual insured's claim frequency is Poisson.
2. An individual insured's claim severity has a variance equal to the mean squared.
Frequency Severity
3. Expected value of the hypothetical means
0.1
100
4. Variance of the hypothetical means
0.1
2500
5. Frequency and severity are independently distributed.
What is the Buhlmann credibility, Z, for a single pure premium observation of an insured selected
from the group?
A. 0 < Z 1/5
B. 1/5 < Z 1/4
C. 1/4 < Z 1/3
D. 1/3 < Z 1/2
E. 1/2 < Z 1
9.41 (4, 5/91, Q.25) (1 point) Assume that the expected pure premium for an individual insured is
constant over time. If the Buhlmann credibility for two years of experience is equal to 0.40, find the
Buhlmann credibility for three years of experience.
A. Less than 0.500
B. At least 0.500 but less than 0.525
C. At least 0.525 but less than 0.550
D. At least 0.550 but less than 0.575
E. At least 0.575

Use the following information for the next two questions:
Classes A and B have the same number of risks. Each class is homogeneous.
The following data are the mean and process variance for a risk from the given class.
            Number of Claims          Size of Loss
Class       Mean      Variance        Mean      Variance
A           0.1667    0.1389          4         20
B           0.8333    0.1389          2         5
A risk is randomly selected from one of the two classes and four observations are made of the risk.
9.42 (4B, 5/92, Q.18) (3 points) Determine the value for the Buhlmann credibility, Z, that can be
applied to the observed pure premium.
A. Less than 0.05
B. At least 0.05 but less than 0.10
C. At least 0.10 but less than 0.15
D. At least 0.15 but less than 0.20
E. At least 0.20
9.43 (4B, 5/92, Q.19) (1 point) The pure premium calculated from the four observations is 0.25.
Determine the Buhlmann credibility estimate for the risk's pure premium.
A. Less than 0.25
B. At least 0.25 but less than 0.50
C. At least 0.50 but less than 0.75
D. At least 0.75 but less than 1.00
E. At least 1.00

9.44 (4B, 5/93, Q.27) (2 points) Use the following information:


An insurance portfolio consists of two classes, A and B.
The number of claims distribution for each class is:
Probability of Number of Claims =
Class 0
1
2
3
A
0.7 0.1
0.1
0.1
B
0.5 0.2
0.1
0.2
Class A has three times as many insureds as Class B.
A randomly selected risk from the portfolio generates 1 claim over the most recent policy period.
Determine the Buhlmann credibility estimate of the claims frequency rate for the observed risk.
A. Less than 0.72
B. At least 0.72 but less than 0.78
C. At least 0.78 but less than 0.84
D. At least 0.84 but less than 0.90
E. At least 0.90

9.45 (4B, 5/93, Q.29) (2 points) You are given the following:
The distribution for number of claims is binomial with parameters q and m, where m = 1.
The prior distribution of q has mean = 0.25 and variance = 0.07.
Determine the Buhlmann credibility to be assigned to a single observation of one risk.
A. Less than 0.20
B. At least 0.20 but less than 0.25
C. At least 0.25 but less than 0.30
D. At least 0.30 but less than 0.35
E. At least 0.35
9.46 (4B, 11/93, Q.18) (3 points) You are given the following:
Two risks have the following severity distribution.
Probability of Claim Amount For
Amount of Claim
Risk 1
Risk 2
100
0.50
0.70
1,000
0.30
0.20
20,000
0.20
0.10
Risk 1 is twice as likely as Risk 2 of being observed. A claim of 100 is observed, but the observed
risk is unknown. Determine the Buhlmann credibility estimate of the expected value of a second
claim amount from the same risk.
A. Less than 3,500
B. At least 3,500, but less than 3,650
C. At least 3,650, but less than 3,800
D. At least 3,800, but less than 3,950
E. At least 3,950
9.47 (4B, 5/94, Q.9) (3 points) The aggregate loss distributions for two risks for one exposure
period are as follows:
___ Aggregate Losses
$50
$1,000
Risk $0
A
0.80
0.16
0.04
B
0.60
0.24
0.16
A risk is selected at random and observed to have $0 of losses in the first two exposure periods.
Determine the Buhlmann credibility estimate of the expected value of the aggregate losses for the
same risk's third exposure period.
A. Less than $90
B. At least $90, but less than $95
C. At least $95, but less than $100
D. At least $100, but less than $105
E. At least $105

Use the following information for the next two questions:
A portfolio of 200 independent insureds is subdivided into two classes as follows:
Expected Variance of
Number
Number of Number of Expected Variance of
of
Claims Per Claims Per Severity Severity
Class Insureds
Insured
Insured Per Claim Per Claim
1
50
0.25
0.75
4
20
2
150
0.50
0.75
8
36
Claim count and severity for each insured are independent.
A risk is selected at random from the portfolio, and its pure premium, P1 , for one exposure
period is observed.
9.48 (4B, 5/94, Q.17) (3 points) Use the Buhlmann credibility method to estimate the expected
value of the pure premium for the second exposure period for the same selected risk.
A. 3.25
B. 0.03 P1 + 3.15
C. 0.05 P1 + 3.09
D. 0.08 P1 + 3.00

E. None of A, B, C, or D

9.49 (4B, 5/94, Q.18) (1 point) After three exposure periods, the observed pure premium for the
selected risk is P. The selected risk is returned to the portfolio.
Then, a second risk is selected at random from the portfolio.
Use the Buhlmann credibility method to estimate the expected pure premium for the next exposure
period for the newly selected risk.
A. 3.25
B. 0.09 P + 2.97
C. 0.14 P + 2.80
D. 0.20 P + 2.59
E. 0.99 P + 0.03
9.50 (4B, 5/95, Q.20) (3 points) The aggregate loss distributions for three risks for one exposure
period are as follows:
Aggregate Losses
$0
$50
$2,000
Risk
A
0.80
0.16
0.04
B
0.60
0.24
0.16
C
0.40
0.32
0.28
A risk is selected at random and is observed to have $50 of aggregate losses in the first exposure
period.
Determine the Buhlmann credibility estimate of the expected value of the aggregate losses for the
same risk's second exposure period.
A. Less than $300
B. At least $300, but less than $325
C. At least $325, but less than $350
D. At least $350, but less than $375
E. At least $375

9.51 (4B, 5/96, Q.6) (2 points) You are given the following:

A portfolio of independent risks is divided into two classes.

Each class contains the same number of risks.

For each risk in Class 1, the number of claims for a single exposure period follows a
Poisson distribution with mean 1.

For each risk in Class 2, the number of claims for a single exposure period follows a
Poisson distribution with mean 2.
A risk is selected at random from the portfolio. During the first exposure period, 2 claims are
observed for this risk. During the second exposure period, 0 claims are observed for this same risk.
Determine the Buhlmann credibility estimate of the expected number of claims for this same risk for
the third exposure period.
A. Less than 1.32
B. At least 1.32, but less than 1.34
C. At least 1.34, but less than 1.36
D. At least 1.36, but less than 1.38
E. At least 1.38
9.52 (4B, 11/96, Q.4) (2 points) You are given the following:

A portfolio of independent risks is divided into three classes.


Each class contains the same number of risks.

For each risk in Classes 1 and 2, the probability of exactly one claim during
one exposure period is 1/3, while the probability of no claim is 2/3.

For each risk in Class 3, the probability of exactly one claim during one exposure period
is 2/3, while the probability of no claim is 1/3.
A risk is selected at random from the portfolio. During the first two exposure periods, two claims are
observed for this risk (one in each exposure period). Determine the Buhlmann credibility estimate of
the probability that a claim will be observed for this same risk during the third exposure period.
A. 4/9
B. 1/2
C. 6/11
D. 5/9
E. 3/5

9.53 (4B, 5/97, Q.12) (2 points) You are given the following:

A portfolio of independent risks is divided into three classes.


Each class contains the same number of risks.
For all of the risks in Class 1, claim sizes follow a uniform distribution on [0, 400].
For all of the risks in Class 2, claim sizes follow a uniform distribution on [0, 600].

For all of the risks in Class 3, claim sizes follow a uniform distribution on [0, 800].
A risk is selected at random from the portfolio. The first claim observed for this risk is 340. Determine
the Buhlmann credibility estimate of the expected value of the second claim observed for this same
risk.
A. Less than 270
B. At least 270, but less than 290
C. At least 290, but less than 310
D. At least 310, but less than 330
E. At least 330

9.54 (4B, 11/97, Q.5) (2 points) You are given the following:
A portfolio of independent risks is divided into two classes.
Each class contains the same number of risks.
The claim count probabilities for each risk for a single exposure period are as follows:
Class
Probability of 0 Claims Probability of 1 Claim
1
1/4
3/4
2
3/4
1/4
All claims incurred by risks in Class 1 are of size u.
All claims incurred by risks in Class 2 are of size 2u.
A risk is selected at random from the portfolio. Determine the Buhlmann credibility for the pure
premium of one exposure period of loss experience for this risk.
A. Less than 0.05
B. At least 0.05, but less than 0.15
C. At least 0.15, but less than 0.25
D. At least 0.25, but less than 0.35
E. At least 0.35

9.55 (4B, 11/99, Q.14) (3 points) You are given the following:
A portfolio of independent risks is divided into two classes.
Each class contains the same number of risks.
The claim count distribution for each risk in Class A is a mixture of a Poisson distribution
with mean 1/6 and a Poisson distribution with mean 1/3,
with each distribution in the mixture having a weight of 0.5.
The claim count distribution for each risk in Class B is a mixture of a Poisson distribution
with mean 2/3 and a Poisson distribution with mean 5/6,
with each distribution in the mixture having a weight of 0.5.
A risk is selected at random from the portfolio.
Determine the Buhlmann credibility of one observation for this risk.
A. 9/83
B. 9/82
C. 1/9
D. 10/83
E. 5/41
9.56 (Course 4 Sample Exam 2000, Q.24) You are given the following:

Type A risks have each year's losses uniformly distributed on the interval [0, 1].
Type B risks have each year's losses uniformly distributed on the interval [0, 2].
A risk is selected at random with each type being equally likely.
The first year's losses equal L.
Let X be the Buhlmann credibility estimate of the second year's losses.

Let Y be the Bayesian estimate of the second year's losses.


Which of the following statements is true?
A. If L < 1, then X > Y.
B. If L > 1, then X < Y.
C. If L = 1/2, then X < Y.
D. There are no values of L such that X = Y.
E. There are exactly two values of L such that X = Y.
9.57 (4, 5/00, Q.3) (2.5 points) You are given the following information about two classes of
business, where X is the loss for an individual insured:
Class 1 Class 2
Number of insureds
25
50
E(X)
380
23
E(X2 )

365,000
---You are also given that an analysis has resulted in a Buhlmann k value of 2.65.
Calculate the process variance for Class 2.
(A) 2,280
(B) 2,810
(C) 7,280
(D) 28,320 (E) 75,050

9.58 (4, 11/00, Q.19) (2.5 points) For a portfolio of independent risks, you are given:
(i) The risks are divided into two classes, Class A and Class B.
(ii) Equal numbers of risks are in Class A and Class B.
(iii) For each risk, the probability of having exactly 1 claim during the year is 20%
and the probability of having 0 claims is 80%.
(iv) All claims for Class A are of size 2.
(v) All claims for Class B are of size c, an unknown but fixed quantity.
One risk is chosen at random, and the total loss for one year for that risk is observed.
You wish to estimate the expected loss for that same risk in the following year.
Determine the limit of the Bhlmann credibility factor as c goes to infinity.
(A) 0
(B) 1/9
(C) 4/5
(D) 8/9
(E) 1
9.59 (4, 11/00, Q.38) (2.5 points) An insurance company writes a book of business that contains
several classes of policyholders. You are given:
(i) The average claim frequency for a policyholder over the entire book is 0.425.
(ii) The variance of the hypothetical means is 0.370.
(iii) The expected value of the process variance is 1.793.
One class of policyholders is selected at random from the book.
Nine policyholders are selected at random from this class and are observed to have produced a
total of seven claims.
Five additional policyholders are selected at random from the same class.
Determine the Bhlmann credibility estimate for the total number of claims for these five
policyholders.
(A) 2.5
(B) 2.8
(C) 3.0
(D) 3.3
(E) 3.9
9.60 (4, 11/01, Q.11 & 2009 Sample Q.62) (2.5 points)
An insurer writes a large book of home warranty policies.
You are given the following information regarding claims filed by insureds against these policies:
(i) A maximum of one claim may be filed per year.
(ii) The probability of a claim varies by insured, and the claims experience for each insured
is independent of every other insured.
(iii) The probability of a claim for each insured remains constant over time.
(iv) The overall probability of a claim being filed by a randomly selected insured in a year is 0.10.
(v) The variance of the individual insured claim probabilities is 0.01.
An insured selected at random is found to have filed 0 claims over the past 10 years.
Determine the Bhlmann credibility estimate for the expected number of claims the selected insured
will file over the next 5 years.
(A) 0.04
(B) 0.08
(C) 0.17
(D) 0.22
(E) 0.25

9.61 (4, 11/01, Q.26 & 2009 Sample Q.72) (2.5 points)
You are given the following data on large business policyholders:
(i) Losses for each employee of a given policyholder are independent
and have a common mean and variance.
(ii) The overall average loss per employee for all policyholders is 20.
(iii) The variance of the hypothetical means is 40.
(iv) The expected value of the process variance is 8000.
(v) The following experience is observed for a randomly selected policyholder:
Year    Average Loss per Employee    Number of Employees
1       15                           800
2       10                           600
3       5                            400
Determine the Bhlmann-Straub credibility premium per employee for this policyholder.
(A) Less than 10.5
(B) At least 10.5, but less than 11.5
(C) At least 11.5, but less than 12.5
(D) At least 12.5, but less than 13.5
(E) At least 13.5
9.62 (4, 11/01, Q.38 & 2009 Sample Q.78) (2.5 points) You are given:
(i) Claim size, X, has mean µ and variance 500.
(ii) The random variable µ has a mean of 1000 and variance of 50.
(iii) The following three claims were observed: 750, 1075, 2000.
Calculate the expected size of the next claim using Bühlmann credibility.
(A) 1025      (B) 1063      (C) 1115      (D) 1181      (E) 1266

9.63 (4, 11/02, Q.29 & 2009 Sample Q. 48) (2.5 points)
You are given the following joint distribution:
                 θ
X          0          1
0          0.4        0.1
1          0.1        0.2
2          0.1        0.1
For a given value of θ and a sample of size 10 for X, the sum of the ten observed values of X is 10.
Determine the Bühlmann credibility premium.
(A) 0.75      (B) 0.79      (C) 0.82      (D) 0.86      (E) 0.89

9.64 (4, 11/02, Q.32 & 2009 Sample Q. 50) (2.5 points) You are given four classes of insureds,
each of whom may have zero or one claim, with the following probabilities:
Number of Claims
Class
0
1
I
0.9
0.1
II
0.8
0.2
III
0.5
0.5
IV
0.1
0.9
A class is selected at random (with probability 1/4), and four insureds are selected at random
from the class. The total number of claims is two.
If five insureds are selected at random from the same class, estimate the total number of
claims using Bhlmann-Straub credibility.
(A) 2.0
(B) 2.2
(C) 2.4
(D) 2.6
(E) 2.8

9.65 (4, 11/03, Q.23 & 2009 Sample Q.18) (2.5 points) You are given:
(i) Two risks have the following severity distributions:
Amount of Claim    Probability of Claim      Probability of Claim
                   Amount for Risk 1         Amount for Risk 2
250                0.5                       0.7
2,500              0.3                       0.2
60,000             0.2                       0.1
(ii) Risk 1 is twice as likely to be observed as Risk 2.
A claim of 250 is observed.
Determine the Bühlmann credibility estimate of the second claim amount from the same risk.
(A) Less than 10,200
(B) At least 10,200, but less than 10,400
(C) At least 10,400, but less than 10,600
(D) At least 10,600, but less than 10,800
(E) At least 10,800
9.66 (4, 11/04, Q.9 & 2009 Sample Q.139) (2.5 points) Members of three classes of insureds
can have 0, 1 or 2 claims, with the following probabilities:
Number of Claims
Class
0
1
2
I
0.9
0.0
0.1
II
0.8
0.1
0.1
III
0.7
0.2
0.1
A class is chosen at random, and varying numbers of insureds from that class are observed
over 2 years, as shown below:
Year Number of Insureds
Number of Claims
1
20
7
2
30
10
Determine the Bhlmann-Straub credibility estimate of the number of claims in Year 3 for 35
insureds from the same class.
(A) 10.6
(B) 10.9
(C) 11.1
(D) 11.4
(E) 11.6

9.67 (4, 11/04, Q.25 & 2009 Sample Q.151) (2.5 points) You are given:
(i) A portfolio of independent risks is divided into two classes.
(ii) Each class contains the same number of risks.
(iii) For each risk in Class 1, the number of claims per year follows a Poisson distribution with mean 5.
(iv) For each risk in Class 2, the number of claims per year follows a binomial distribution
with m = 8 and q = 0.55.
(v) A randomly selected risk has three claims in Year 1, r claims in Year 2 and four claims in Year 3.
The Bhlmann credibility estimate for the number of claims in Year 4 for this risk is 4.6019.
Determine r.
(A) 1
(B) 2
(C) 3
(D) 4
(E) 5
9.68 (4, 5/05, Q.20 & 2009 Sample Q.190) (2.9 points)
For a particular policy, the conditional probability of the annual number of claims given Θ = θ, and the
probability distribution of Θ, are as follows:
Number of claims    0       1      2
Probability         2θ      θ      1 - 3θ
Probability[Θ = 0.05] = 0.80. Probability[Θ = 0.30] = 0.20.
Two claims are observed in Year 1.
Calculate the Bühlmann credibility estimate of the number of claims in Year 2.
(A) Less than 1.68
(B) At least 1.68, but less than 1.70
(C) At least 1.70, but less than 1.72
(D) At least 1.72, but less than 1.74
(E) At least 1.74

9.69 (4, 5/07, Q.36) (2.5 points)
For a portfolio of insurance risks, aggregate losses per year per exposure follow a normal
distribution with mean µ and standard deviation 1000, with µ varying by class as follows:
Class    µ       Percent of Risks in Class
X        2000    60%
Y        3000    30%
Z        4000    10%
A randomly selected risk has the following experience over three years:
Year    Number of Exposures    Aggregate Losses
1       24                     24,000
2       30                     36,000
3       26                     28,000
Calculate the Bühlmann-Straub estimate of the mean aggregate losses per year per exposure
in Year 4 for this risk.
(A) 1100      (B) 1138      (C) 1696      (D) 2462      (E) 2500

Solutions to Problems:
9.1. C. Each Geometric Distribution has mean β, and variance β(1+β).
Beta    A Priori Chance of    Hypothetical    Square of             Process
        This Type of Risk     Mean            Hypothetical Mean     Variance
2       0.333                 2.000           4.000                 6.000
5       0.667                 5.000           25.000                30.000
Overall                       4.000           18.000                22.000

VHM = 18 - 4² = 2. K = EPV / VHM = 22/2 = 11.
One insured for one year. N = 1. Z = 1/(1 + 11) = 1/12.
A priori mean = 4. Observation = 10.
Estimate = (10)(1/12) + (4)(11/12) = 4.5.
Comment: Setup taken from 4, 5/05, Q.35 on Bayesian Analysis.
If it were one insured for three years, then N = 3.
If it were instead five insureds for 3 years each, then N = 15.
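As a check on this solution, a short Python sketch (my own naming):

# Mixture of Geometric frequencies (beta = 2 or 5): EPV, VHM, K, Z, estimate.
mix = [(1/3, 2.0), (2/3, 5.0)]                          # (a priori prob, beta)

epv = sum(p * b * (1 + b) for p, b in mix)              # 22
mu = sum(p * b for p, b in mix)                         # 4
vhm = sum(p * b**2 for p, b in mix) - mu**2             # 2
K = epv / vhm                                           # 11
Z = 1 / (1 + K)                                         # 1/12 for one year
print(round(Z * 10 + (1 - Z) * mu, 2))                  # 4.5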
9.2. D. Each LogNormal Distribution has µ = 5, mean = exp[5 + σ²/2], and
second moment = exp[10 + 2σ²].
Sigma    A Priori Chance of    Hypothetical    Square of            Second       Process
         This Type of Risk     Mean            Hypothetical Mean    Moment       Variance
1        0.500                 245             59,874               162,755      102,881
1.5      0.500                 457             208,981              1,982,759    1,773,778
Overall                        351             134,428                           938,329

VHM = 134,428 - 351² = 11,227. K = EPV / VHM = 938,329/11,227 = 83.6.
N = number of claims = 5. Z = 5/(5 + 83.6) = 5.6%. A priori mean = 351.
Observation = (50 + 100 + 150 + 200 + 250)/5 = 150.
Estimate = (5.6%)(150) + (1 - 5.6%)(351) = 340.
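A short Python sketch checking this solution (my own naming; small differences from 340 are due only to rounding of intermediate values):

from math import exp

# Mixture of LogNormal severities with mu = 5 and sigma = 1.0 or 1.5.
mix = [(0.5, 1.0), (0.5, 1.5)]                           # (a priori prob, sigma)
mu_ln = 5

means = [(p, exp(mu_ln + s**2 / 2)) for p, s in mix]             # 245 and 457
second = [(p, exp(2 * mu_ln + 2 * s**2)) for p, s in mix]        # 162,755 and 1,982,759

epv = sum(p * (m2 - m**2) for (p, m), (_, m2) in zip(means, second))   # about 938,000
grand = sum(p * m for p, m in means)                                   # about 351
vhm = sum(p * m**2 for p, m in means) - grand**2                       # about 11,300
K = epv / vhm
Z = 5 / (5 + K)                                          # 5 observed claims
xbar = (50 + 100 + 150 + 200 + 250) / 5                  # 150
print(round(Z * xbar + (1 - Z) * grand))                 # about 340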

9.3. B. For Risk A the mean is: (0.8)(0) + (0.1)(100) + (0.1)(500) = 60, and the second moment is:
(0.8)(0²) + (0.1)(100²) + (0.1)(500²) = 26,000.
Thus the process variance for Risk A is: 26,000 - 60² = 22,400.
Risk    A Priori Chance    Mean     Square of    Second     Process
        of Risk                     Mean         Moment     Variance
A       0.333              60       3,600        26,000     22,400
B       0.333              120      14,400       52,000     37,600
C       0.333              150      22,500       55,000     32,500
Overall                    110.00   13,500                  30,833

Thus the Variance of the Hypothetical Means = 13,500 - 110² = 1,400.
K = EPV / VHM = 30,833 / 1,400 = 22.0. Z = 1/(1 + K) = 1/23 = 0.043.
The a priori mean is 110. The observation is 500.
Thus the estimated aggregate losses are: (0.043)(500) + (1 - 0.043)(110) = 126.8.
9.4. C. Each Pareto Distribution has α = 4, mean θ/(α-1) = θ/3,
second moment 2θ²/{(α-1)(α-2)} = θ²/3, and variance θ²/3 - (θ/3)² = 2θ²/9.
Theta    A Priori Chance of    Hypothetical    Square of            Process
         This Type of Risk     Mean            Hypothetical Mean    Variance
1        0.500                 0.333           0.111                0.222
3        0.500                 1.000           1.000                2.000
Overall                        0.667           0.556                1.111

VHM = 0.556 - 0.667² = 0.111. K = EPV / VHM = 1.111/0.111 = 10.
N = 1. Z = 1/(1 + 10) = 1/11. A priori mean = 0.667. Observation = 5.
Estimate = (5)(1/11) + (0.667)(10/11) = 1.06.
9.5. B. For each Poisson, the process variance is the mean. Therefore, the Expected Value of the
Process Variance = (0.5)(0.03) + (0.3)(0.05) + (0.2)(0.10) = 0.05 = Overall mean frequency.
Type of    A Priori Chance of     Mean Annual    Square of Mean    Poisson Process
Driver     This Type of Driver    Claim Freq.    Claim Freq.       Variance
Good       0.5                    0.03           0.0009            0.03
Bad        0.3                    0.05           0.0025            0.05
Ugly       0.2                    0.10           0.0100            0.10
Average                           0.050          0.0032            0.050

Therefore the variance of the hypothetical mean frequencies = 0.0032 - 0.05² = 0.0007.
Therefore K = EPV / VHM = 0.05 / 0.0007 = 71.4.
Z = 5/(5 + 71.4) = 6.5%. Estimated frequency = (6.5%)(0.2) + (93.5%)(0.05) = 0.060.
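A brief Python check of this solution (my own naming):

# Poisson frequency mixture for the Good/Bad/Ugly drivers: one claim in five years.
mix = [(0.5, 0.03), (0.3, 0.05), (0.2, 0.10)]           # (a priori prob, Poisson mean)

epv = sum(p * lam for p, lam in mix)                    # 0.05 (Poisson: variance = mean)
mu = epv                                                # 0.05
vhm = sum(p * lam**2 for p, lam in mix) - mu**2         # 0.0007
K = epv / vhm                                           # 71.4
Z = 5 / (5 + K)                                         # 6.5% for 5 years
print(round(Z * (1 / 5) + (1 - Z) * mu, 3))             # 0.060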

9.6. B. One needs to figure out, for a single observation of the risk process, in
other words for the observation of a single claim, the process variance of the size of that claim.
The process variance for a Pareto Distribution is: αθ² / {(α-1)²(α-2)},
so the process variances are: 60,000, 104,167, and 222,222.
The probability weights are the product of claim frequency and the a priori frequency of each type of
driver: (0.5)(0.03), (0.3)(0.05), (0.2)(0.10). The probabilities that a claim came from each of the types of
drivers are the probability weights divided by their sum: 0.3, 0.3, 0.4.
Thus the weighted average process variance of the severity is:
(60,000)(0.3) + (104,167)(0.3) + (222,222)(0.4) = 138,139.
Type of    A Priori Chance of     Avg. Claim    Probability Weight    Probability    Alpha    Process Variance
Driver     This Type of Driver    Freq.         For Claim             For Claim               of Claim Severity
Good       0.5                    0.03          0.015                 0.3            6        60,000
Bad        0.3                    0.05          0.015                 0.3            5        104,167
Ugly       0.2                    0.10          0.020                 0.4            4        222,222
Average                           0.050                               1.000                   138,139

Comment: On the one hand, a claim is more likely to be from a Good Driver since there are many
Good Drivers. On the other hand, a claim is more likely to be from an Ugly Driver, because each
such driver produces more claims. Thus one needs to take into account both the proportion of a
type of driver and its expected claim frequency. The probability that a claim came from each type of
driver is proportional to the product of claim frequency and the a priori frequency of each type of
driver.
9.7. C. Average severities for the Pareto Distributions are: θ/(α-1) = 200, 250, and 333.
The overall average severity is 268.3. Average of the severity squared is:
(.3)(40,000) + (.3)(62,500) + (.4)(111,111) = 75,194. Therefore, the variance of the hypothetical
mean severities = 75,194 - 268.3² = 3209.
Type of    A Priori Chance    Avg. Claim    Probability    Probability            Avg. Claim    Square of Avg.
Driver     of This Type       Freq.         Weight         for Claim      Alpha   Severity      Claim Severity
Good            0.5              0.03          0.015           0.300          6       200            40,000
Bad             0.3              0.05          0.015           0.300          5       250            62,500
Ugly            0.2              0.1           0.020           0.400          4       333           111,111
Average                          0.050                         1.000                  268.3          75,194

9.8. A. K = EPV/ VHM = 138,139 / 3209 = 43.0. Z = 1 / (1 + 43.0) = 1/44.


New estimate = {2500 + (43)(268.3)} / 44 = $319.

9.9. A. For a Poisson frequency, variance of p.p. = (mean freq.)(2nd moment of severity).
For the Pareto Distribution, the second moment is: 2θ²/{(α-1)(α-2)}.
Type of    A Priori Chance    Claim            Expected Value of        Variance of P.P.:
Driver     of This Type       Freq.    Alpha   Square of Claim Sizes    Freq. x 2nd Moment
Good            0.5            0.03       6          100,000                   3,000
Bad             0.3            0.05       5          166,667                   8,333
Ugly            0.2            0.1        4          333,333                  33,333
Average                                                                       10,667

9.10. E. The variance of the hypothetical pure premiums = 287.1 - 13.42² = 107.0.

Type of    A Priori Chance    Avg. Claim            Avg. Claim    Avg. Pure    Square of Avg.
Driver     of This Type       Freq.         Alpha   Severity      Premium      Pure Premium
Good            0.5              0.03           6       200          6.00            36.0
Bad             0.3              0.05           5       250         12.50           156.2
Ugly            0.2              0.1            4       333         33.33         1,111.1
Average                                                             13.42           287.1

9.11. D. The observed pure premium is $2500 / 5 = $500.


K = EPV/ VHM = 10667 / 107.0 = 99.7. Z = 5/ (5 + 99.7) = 4.8%.
Estimated pure premium = (4.8%)($500) + (1 - 4.8%)($13.42) = $36.78.
Comment: The result of making separate estimates for frequency and severity is different than just
working directly with pure premiums: 36.78 ≠ (0.060)(319) = 19.1.
9.12. B. As shown in the solutions to problems in the previous section, EPV = .1845 and the
VHM = .0055. Thus K = EPV / VHM = .1845 / .0055 = 33.7.
Thus for 9 observations Z = 9 / (9+33.7) = 21.1%.
The prior mean is .255 and the observation is 4/9 = .444.
Thus the new estimate is: (.211)(.444) + (.789)(.255) = 0.295.

9.13. E. For a single cab for a single month, frequency is Bernoulli.
Process variance is: q(1-q), which is: (.01)(.99) = .0099 and (.02)(.98) = .0196, for the two types.
EPV = (1/2)(.0099) + (1/2)(.0196) = .01475.
Overall mean = (1/2)(.01) + (1/2)(.02) = .015.
VHM = (1/2)(.01 - .015)2 + (1/2)(.02 - .015)2 = .000025.
K = EPV/VHM = .01475/.000025 = 590. We observe: (3)(12)(10) = 360 cab-months.
Z = 360/(360 + K) = 37.9%. Observed frequency = 4/360.
Estimated future frequency = (.379)(4/360) + (1 - .379)(.015) = .01353 per cab-month.
For (3)(12)(12) = 432 cab-months, we expect: (.01353)(432) = 5.84 accidents.
Alternately, since the frequency per month is so small, it is very close to a Poisson.
Thus we have approximately, Poisson frequency with annual mean of .12 or .24.
EPV = overall mean = .18. VHM = .06² = .0036. K = EPV/VHM = .18/.0036 = 50.
We observe 30 cab-years. Z = 30/(30 + K) = 3/8. Observed frequency = 4/30.
Estimated future frequency = (3/8)(4/30) + (5/8)(.18) = .1625 per cab-year.
For (3)(12) = 36 cab-years, we expect: (.1625)(36) = 5.85 accidents.
Comment: Have assumed that the 12 future cabs are all of the same (unknown) type as were the
10 past cabs.
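The first solution can be verified with a short sketch (Python, purely illustrative):

q1, q2 = 0.01, 0.02                                    # monthly accident probabilities for the two types
epv = 0.5 * q1 * (1 - q1) + 0.5 * q2 * (1 - q2)        # 0.01475
mean = 0.5 * q1 + 0.5 * q2                             # 0.015
vhm = 0.5 * (q1 - mean) ** 2 + 0.5 * (q2 - mean) ** 2  # 0.000025
k = epv / vhm                                          # 590
n = 3 * 12 * 10                                        # 360 cab-months observed
z = n / (n + k)
freq = z * 4 / n + (1 - z) * mean                      # estimated frequency per cab-month
print(freq * 3 * 12 * 12)                              # about 5.84 accidents for 12 cabs over a year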
9.14. C. Each Gamma Distribution has α = 4, mean = 4θ, and variance = 4θ².

Theta     A Priori Chance    Hypothetical    Square of            Process
          of This Type       Mean            Hypothetical Mean    Variance
10             0.750             40               1,600              400
12             0.250             48               2,304              576
Overall                          42               1,776              444

VHM = 1776 - 42² = 12.


K = EPV / VHM = 444/12 = 37.
N = number of claims = 8. Z = 8/(8 + 37) = 17.8%. A priori mean = 42.
Observation = (10 + 20 + 25 + 30 + 35 + 40 + 40 + 50)/8 = 31.25.
Estimate = (17.8%)(31.25) + (1 - 17.8%)(42) = 40.1.

9.15. E. Each Poisson Distribution has mean = λ,
and each Exponential Distribution has mean θ and second moment 2θ².
The mean aggregate loss is λθ. The variance of the aggregate loss is 2λθ².

Class     λ      A Priori Chance     θ     Hypothetical    Square of            Process
                 of This Type              Mean            Hypothetical Mean    Variance
A         0.6         0.750          11        6.60             43.56             145.20
B         0.8         0.250          15       12.00            144.00             360.00
Overall                                         7.95             68.67             198.90

VHM = 68.67 - 7.95² = 5.47.


K = EPV / VHM = 198.9/5.47 = 36.36.
One risk for one year. N = 1. Z = 1/(1 + 36.36) = 2.7%. A priori mean = 7.95.
Observation = 7 + 10 + 21 = 38.
Estimate = (2.7%)(38) + (1 - 2.7%)(7.95) = 8.76.
9.16. C. Each Binomial Distribution has mean = 10q and variance = 10q(1-q).
q        A Priori Chance    Hypothetical    Square of            Process
         of This Type       Mean            Hypothetical Mean    Variance
0.1           0.600             1.00              1.00              0.90
0.2           0.300             2.00              4.00              1.60
0.4           0.100             4.00             16.00              2.40
Overall                         1.60              3.40              1.26

VHM = 3.4 - 1.6² = 0.84.


K = EPV / VHM = 1.26/0.84 = 1.5.
One insured for one year. N = 1. Z = 1/(1 + 1.5) = 40.0%.
A priori mean = 1.6. Observation = 5.
Estimate = (40%)(5) + (1 - 40%)(1.6) = 2.96.

9.17. A. If one has a $1 million maximum covered loss the censoring has no effect. If instead one
has a $100,000 maximum covered loss, the chance of having a $100,000 payment is the chance of
having a total size of loss ≥ 100,000, which is 1/5 + 1/20 = .25.
In that case the mean is: (.5)(10,000) + (.25)(50,000) + (.25)(100,000) = 42,500.
Similarly, with a $100,000 maximum covered loss, the second moment of the severity is:
(.5)(10,000²) + (.25)(50,000²) + (.25)(100,000²) = 3.175 x 10^9. Thus with a $100,000 maximum
covered loss the process variance is: 3.175 x 10^9 - 42,500² = 1.369 x 10^9.
Maximum         A Priori Chance    Hypothetical    Second        Process       Square of
Covered Loss    of This Type       Mean            Moment        Variance      Hypothetical Mean
1 million            0.333            87,500       5.2675e+10    4.5019e+10       7.6562e+9
100 thous.           0.667            42,500       3.1750e+9     1.3688e+9        1.8062e+9
Overall                               57,500                     1.5919e+10       3.7562e+9

VHM = 3.756 x 10^9 - 57,500² = 4.50 x 10^8. K = EPV/VHM = 1.592 x 10^10 / 4.50 x 10^8 = 35.4.
N = 2. Z = 2 / (2+35.4) = 5.4%. A priori mean = $57,500. Observation = $100,000.
Estimate = (100,000)(5.4%) + (57,500)(1 - 5.4%) = 59,800.
9.18. B. VHM = .00433 - .0592 = .000849. EPV = .081.
Type of    A Priori Chance    Mean      Square of     Variance
Driver     of Driver          Freq.     Mean Freq.    of Freq.
A               0.200         0.020       0.0004        0.03
B               0.500         0.050       0.0025        0.06
C               0.300         0.100       0.0100        0.15
Mean                          0.0590      0.00433       0.081

K = EPV/VHM = .081/.000849 = 95.4. Z = 1/(1 + 95.4) = 1.0%.
Comment: Similar to 4, 11/01, Q.23.

9.19. A. Given the number of claims incurred m, the mean number of claims reported is: .65m.
Thus the VHM = Var[.65m] = .652 Var[m] =
(.4225){((2-4)2 + (3-4)2 + (4-4)2 + (5-4)2 + (6-4)2 )/5} = (.4225)(2) = .845.
Given the number of claims incurred m, the number of claims reported by year end is Binomial with
q = .65 and m. Thus the process variance is: (.65)(.35)m = .2275m.
EPV = E[.2275m] = .2275E[m] = (.2275)(4) = .91. K = EPV/VHM = .91/.845 = 1.077.
For one observation of the risk process, Z = 1/(1+1.077) = 48.1%.
Relying solely on the observation, the estimated number of claims incurred is:
3/.65= 4.615. The a priori mean number of claims incurred is 4.
Thus the estimated number of claims incurred is: (.481)(4.615) + (.519)(4) = 4.30.
Comment: Beyond what you are likely to be asked on your exam.
One would estimate the number of claims yet to be reported as: 4.30 - 3 = 1.30.
See Loss Development Using Credibility, by Eric Brosius.

9.20. E. Adding the probabilities, there is a 30% a priori probability of θ = 1, risk type A, and a
70% a priori probability of θ = 2, risk type B.
The mean for risk type A is: E[X | θ = 1] = (0)(1/3) + (5)(1/3) + (25)(1/3) = 10.
The 2nd moment for risk type A is: E[X² | θ = 1] = (0²)(1/3) + (5²)(1/3) + (25²)(1/3) = 216.67.
Process Variance for Risk Type A is: Var[X | θ = 1] = 216.67 - 10² = 116.67.
Similarly, Risk Type B has mean 16.43, 2nd moment 367.86, and process variance 97.91.
Risk Type    A Priori Chance    Mean     Square of Mean    Process Variance
A                 0.3           10.00        100.00             116.67
B                 0.7           16.43        269.94              97.91
Average                         14.50        218.96             103.54

The variance of the hypothetical means = 218.96 - 14.50² = 8.71.


K = EPV/VHM = 103.54/8.71 = 11.9. Z = 30/(30 + K) = 71.6%.
Observed mean is: 510/30 = 17. Prior mean is 14.5.
The estimate using Credibility is: (.716)(17) + (.284)(14.5) = 16.3.
Comment: Similar to 4, 11/02, Q.29.
9.21. D. Given the number of claims incurred m, the mean number of claims reported is: .7m.
Thus the VHM = Var[.7m] = .72 Var[m] = (.49){(2)(1.6)(1+1.6)} = 4.077.
Given the number of claims incurred m, the number of claims reported is Binomial with q = .7 and m.
Thus the process variance is (.7)(.3)m = .21m.
EPV = E[.21m] = .21E[m] = (.21)(2)(1.6) = .672. K = EPV/VHM = .672/4.077 = .165.
For one observation of the risk process, Z = 1/1.165 = .858.
Relying solely on the observation the estimated number of claims incurred is:
5/.7 = 7.143 The a priori mean number of claims incurred is: (2)(1.6) = 3.2.
Thus the estimated number of claims incurred is: (.858)(7.143) + (.142)(3.2) = 6.58.
Comment: Beyond what you are likely to be asked. Since in this case the Bayesian Analysis
estimates turn out to lay along a straight line, they are equal to the estimate using Buhlmann
Credibility. See a problem in a previous section for the Bayesian Analysis estimate for the same
situation. See Loss Development Using Credibility, by Eric Brosius.
9.22. C. E[q] = Overall mean = 0.12. .03 = VHM = Var[q] = E[q²] - E[q]².
Therefore, E[q²] = .03 + E[q]² = .03 + .12² = .0444.
EPV = E[q(1-q)] = E[q] - E[q²] = .12 - .0444 = .0756. K = EPV/VHM = .0756/.03 = 2.52.
Z = 8/(8 + 2.52) = .760. Estimated future annual frequency = (.760)(2/8) + (.240)(.12) = .219.
Estimated number of claims for 3 years = (3)(.219) = 0.657.
Comment: Similar to 4, 11/01, Q.11.
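The moment manipulation in 9.22 is worth seeing in code form; the following sketch (Python, illustrative only) mirrors it step by step:

e_q, var_q = 0.12, 0.03
e_q2 = var_q + e_q ** 2          # 0.0444
epv = e_q - e_q2                 # E[q(1 - q)] = 0.0756
k = epv / var_q                  # 2.52
z = 8 / (8 + k)                  # 0.760, for 8 years of data
annual = z * 2 / 8 + (1 - z) * e_q
print(3 * annual)                # about 0.657 claims expected over the next 3 years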

9.23. A. EPV = 1007.5. VHM = 1007.5 - 30.5² = 77.25. K = EPV/VHM = 13.0.

Type     A Priori Chance    Mean     Square of    Process
         of This Type                Mean         Variance
1             0.700           25        625          625
2             0.200           40      1,600        1,600
3             0.100           50      2,500        2,500
Overall                     30.50    1,007.50     1,007.50

Z = 3/(3 + 13.0) = 3/16. Observed Mean = (30 + 40 + 70)/3 = 46.67.


Prior Mean = 30.5. Estimate = (3/16)(46.67) + (13/16)(30.5) = 33.53.
Comment: Since there is no mention of differing frequency by type, we assume the
mean frequencies for the types are the same; i.e., we ignore frequency.
N = 3 is the number of claims. The number of years observed is not used.
9.24. B. The process variance for a risk from class X is: {(1.4)(3)}2 = 17.64.

Class    A Priori Chance    Mean    Process     Square of
         of This Type               Variance    Mean
X             0.500           3       17.64         9
Y             0.300           4       31.36        16
Z             0.200           5       49.00        25
Overall                     3.70      28.03      14.30

EPV = 28.03. VHM = 14.30 - 3.70² = 0.61.


K = EPV/VHM = 50.0. Z = 325/(325 + 50.0) = 86.7%.
Prior Mean = 3.70. Observed Mean = (403 + 360 + 371)/(124 + 103 + 98) = 1134/325 = 3.49.
Estimated future pure premium = (86.7%)(3.49) + (1 - 86.7%)(3.70) = 3.52.
Estimated aggregate loss for 100 exposures: (100)(3.52) = 352.
Comment: Similar to 4, 5/07, Q. 36.

9.25. D. As a first step adjust everything to the year 4 level.
The means of the Exponentials are: (200)(1.053 ) = 231.53, and (300)(1.053 ) = 347.29.
The EPV is: (75%)(231.532 ) + (25%)(347.292 ) = 70,357.
The overall mean is: (75%)(231.53) + (25%)(347.29) = 260.47.
The VHM is: (75%)(231.53 - 260.47)2 + (25%)(347.29 - 260.47)2 = 2513.
K = 70,357/2513 = 28. We observe 4 claims, so Z = 4/(4 + 28) = 1/8.
On a year four level, the observed average claim size is:
{20(1.053 ) + 100(1.053 ) + 50(1.05) + 400(1.05)}/4 = 152.85.
The estimated future claim size on the year 4 level is:
(1/8)(152.85) + (7/8)(260.47) = 247.02.
Alternately, working in year 1, the EPV is:
(75%)(2002 ) + (25%)(3002 ) = 52,500.
The overall mean is: (75%)(200) + (25%)(300) = 225.
The VHM is: (75%)(200 - 225)2 + (25%)(300 - 225)2 = 1875.
K = 52,500/1875 = 28. We observe 4 claims, so Z = 4/(4 + 28) = 1/8.
On a year one level, the observed average claim size is:
(20 + 100 + 50/1.052 + 400/1.052 )/4 = 132.04.
The estimated future claim size on the year 1 level is:
(1/8)(132.04) + (7/8)(225) = 213.38.
On the year four level: (213.38)(1.053 ) = 247.01.
Comment: Both the EPV and VHM are multiplied by the total inflation factor squared. Therefore, K,
the Buhlmann Credibility Parameter, is not affected by inflation which changes the scale.
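That invariance is easy to confirm numerically; in the sketch below (Python; the helper name k_param is just for illustration), K comes out the same whether the Exponential means are stated at the year 1 or the year 4 level:

def k_param(scale):
    m1, m2 = 200 * scale, 300 * scale          # Exponential means for the two types
    epv = 0.75 * m1 ** 2 + 0.25 * m2 ** 2      # Exponential process variance = mean^2
    mean = 0.75 * m1 + 0.25 * m2
    vhm = 0.75 * (m1 - mean) ** 2 + 0.25 * (m2 - mean) ** 2
    return epv / vhm

print(k_param(1.0), k_param(1.05 ** 3))        # both 28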

9.26. A. As a first step adjust everything to the year 2008 level.
The LogNormals have parameters σ = 1.5, and μ = 8 + ln(1.1³) = 8.286 and 9 + ln(1.1³) = 9.286.
The first LogNormal has mean exp[8.286 + 1.5²/2] = 12,222,
second moment exp[(2)(8.286) + (2)(1.5²)] = 1,417,272,377,
and variance 1,417,272,377 - 12,222² = 1,267,895,093.
The second LogNormal has mean exp[9.286 + 1.5²/2] = 33,223,
second moment exp[(2)(9.286) + (2)(1.5²)] = 10,472,305,100,
and variance 10,472,305,100 - 33,223² = 9,368,537,371.
The EPV is: (50%)(1,267,895,093) + (50%)(9,368,537,371) = 5,318,216,232.
The overall mean is: (50%)(12,222) + (50%)(33,223) = 22,723.
The VHM is: (50%)(12,222 - 22,723)² + (50%)(33,223 - 22,723)² = 110,250,000.
K = EPV/VHM = 5,318.22/110.25 = 48.2. We observe 3 years, so Z = 3/(3 + 48.2) = 5.9%.
On the year 2008 level, the average aggregate loss is:
{(32,000)(1.13 ) + (29,000)(1.12 ) + (37,000)(1.1)}/3 = 39,461.
The estimated aggregate loss for 2008 is: (5.9%)(39,461) + (1 - 5.9%)(22,723) = 23,711.
Alternately, working in the year 2004, the first LogNormal has mean exp[8 + 1.5²/2] = 9182,
second moment exp[(2)(8) + (2)(1.5²)] = 799,902,178,
and variance 799,902,178 - 9182² = 715,593,054.
The second LogNormal has mean exp[9 + 1.5²/2] = 24,959,
second moment exp[(2)(9) + (2)(1.5²)] = 5,910,522,063,
and variance 5,910,522,063 - 24,959² = 5,287,570,382.
The EPV is: (50%)(715,593,054) + (50%)(5,287,570,382) = 3,001.6 million.
The overall mean is: (50%)(9182) + (50%)(24,959) = 17,071.
The VHM is: (50%)(9182 - 17,071)2 + (50%)(24,959 - 17,071)2 = 622.2 million.
K = EPV/VHM = 3,001.6/ 622.2 = 48.2. We observe 3 years, so Z = 3/(3 + 48.2) = 5.9%.
On a year 2004 level, the average aggregate loss is:
(32,000 + 29,000/1.1 + 37,000/1.12 )/3 = 29,647.
The estimated future aggregate loss on the year 2004 level is:
(5.9%)(29,647) + (1 - 5.9%)(17,071) = 17,813.
On the year 2008 level, the estimated aggregate loss is: (17,813)(1.13 ) = 23,709.
Comment: Both the EPV and VHM are multiplied by the total inflation factor squared. Therefore, K,
the Buhlmann Credibility Parameter, is not affected by inflation which changes the scale.

9.27. B. For β = 1, the process variance is: rβ(1+β) = (3)(1)(1 + 1) = 6.
For β = 2, the process variance is: rβ(1+β) = (3)(2)(1 + 2) = 18.
EPV = (6 + 18)/2 = 12.
For β = 1, the mean is: rβ = (3)(1) = 3. For β = 2, the mean is: rβ = (3)(2) = 6.
The overall mean is: (3 + 6)/2 = 4.5.
The VHM = (.5)(3 - 4.5)2 + (.5)(6 - 4.5)2 = 2.25.
K = EPV/VHM = 12/2.25 = 5.33. Z = 1/(1 + 5.33) = 15.8%.
(15.8%)(7) + (1 - 15.8%)(4.5) = 4.895.
9.28. D. f(7) = {r(r+1)(r+2)(r+3)(r+4)(r+5)(r+6)/7!} β^7/(1+β)^(r+7) = 36β^7/(1+β)^10.
For β = 1, f(7) = 3.516%. For β = 2, f(7) = 7.804%.
P[Observation] = (1/2)(3.516%) + (1/2)(7.804%) = 5.66%.
By Bayes Theorem, P[Risk Type | Observation] = P[Obser. | Type] P[Type] / P[Observation].
P[β = 1 | Observation] = (3.516%)(.5)/(5.66%) = 31.06%.
P[β = 2 | Observation] = (7.804%)(.5)/(5.66%) = 68.94%.
(31.06%)(3)(1) + (68.94%)(3)(2) = 5.068.
9.29. B. For each LogNormal, E[X] = exp[μ + σ²/2] and E[X²] = exp[2μ + 2σ²].
Process variance = Second Moment - Mean².

μ     σ      Mean         Square of Mean    Second Moment    Process Variance
6     0.5      457.14        208,981.29        268,337.29          59,356.00
7     0.5    1,242.65      1,544,174.47      1,982,759.26         438,584.80
6     1        665.14        442,413.39      1,202,604.28         760,190.89
7     1      1,808.04      3,269,017.37      8,886,110.52       5,617,093.15
Average      1,043.24      1,366,146.63      3,084,952.84       1,718,806.21

VHM = 1,366,146.63 - 1043.24² = 277,797.


K = EPV/VHM = 1,718,806 / 277,797 = 6.2.

9.30. A. VHM = 0.031 - 0.15² = 0.0085.

Type     A priori        Mean       Square of    Process
         Probability                Mean         Variance
I            0.7         0.10000      0.010       0.09000
II           0.2         0.20000      0.040       0.16000
III          0.1         0.40000      0.160       0.24000
Average                  0.15000      0.03100     0.11900

EPV = 0.119. K = 0.119/0.0085 = 14.


Z = 6/(6 + 14) = 30%. Estimate = (3/6)(30%) + (0.15)(70%) = 0.255.
Comment: Setup taken from 4, 11/03, Q.39, where instead one should use Bayes analysis.
9.31. A. For frequency, EPV = 0.04548 and VHM = 0.00252 - 0.048² = 0.000216.

Type      A Priori       Mean       Square of    Process
          Probability               Mean         Variance
1             40%        0.03000     0.00090      0.02910
2             60%        0.06000     0.00360      0.05640
Average                  0.04800     0.00252      0.04548

K = EPV / VHM = 0.04548/0.00216 = 21.1.


There are a total of 100 exposures, so Z = 100/(100 + 21.1) = 82.6%.
Observed frequency is 4/100. Prior mean is 0.048.
Estimated future frequency is: (82.6%)(0.040) + (1 - 82.6%)(0.048) = 4.14%.

9.32. B. The density of the Gamma Distribution is: x^(α-1) e^(-x/θ) / {θ^α Γ(α)}.
Since θ = 100 is the same for both risk types, we can ignore the factor of e^(-x/θ).
The chance of the observation is proportional to:
q^2 (1-q)^18 {500^(α-1) / (100^α Γ(α))} q^0 (1-q)^25 {1000^(α-1) / (100^α Γ(α))}
q^1 (1-q)^29 {300^(α-1) / (100^α Γ(α))} q^1 (1-q)^24 {800^(α-1) / (100^α Γ(α))}
= q^4 (1-q)^96 1200^(α-1) / {100^4 Γ(α)^4}.
Therefore, ignoring the constant of 100^4 in the denominator, the probability weights are:
(40%)(0.03^4)(0.97^96) 1200^5/(5!)^4 = 0.20884, and (60%)(0.06^4)(0.94^96) 1200^3/(3!)^4 = 0.02729.
Thus the posterior distribution is: 88.44% and 11.56%.
Therefore, the estimated future frequency is: (88.44%)(0.03) + (11.56%)(0.06) = 3.35%.
Comment: In the case of Bayes analysis, we use all of the information given.
In this case, we are given information on severity, both in the model and the observation, and thus
this is used to help get the chance of the observation and thus the posterior distribution.
If we had not been given the information on severity, then the analysis would have been instead:
The chance of the observation is:
(20 choose 2) q^2 (1-q)^18 (25 choose 0) q^0 (1-q)^25 (30 choose 1) q^1 (1-q)^29 (25 choose 1) q^1 (1-q)^24.
This is proportional to: q^4 (1-q)^96.
Therefore, the probability weights are: (40%)(0.03^4)(0.97^96), and (60%)(0.06^4)(0.94^96).
Thus the posterior distribution is:
(40%)(0.03^4)(0.97^96) / {(40%)(0.03^4)(0.97^96) + (60%)(0.06^4)(0.94^96)} = 45.96%, and
(60%)(0.06^4)(0.94^96) / {(40%)(0.03^4)(0.97^96) + (60%)(0.06^4)(0.94^96)} = 54.04%.
Therefore, the estimated future frequency is: (45.96%)(0.03) + (54.04%)(0.06) = 4.62%.
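The probability weights above are easy to mangle by hand; this sketch (Python, illustrative) reproduces the 88.44%/11.56% posterior, dropping the common exp(-x/100) and 100^4 factors exactly as in the solution:

import math

sizes = [500, 1000, 300, 800]                 # the four observed claim sizes
types = [(0.40, 0.03, 6), (0.60, 0.06, 4)]    # (a priori probability, q, alpha)

weights = []
for prior, q, alpha in types:
    freq = q ** 4 * (1 - q) ** 96
    sev = math.prod((x / 100) ** (alpha - 1) for x in sizes) / math.factorial(alpha - 1) ** 4
    weights.append(prior * freq * sev)

total = sum(weights)
print([w / total for w in weights])           # about [0.8844, 0.1156]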

9.33. C. The mean frequencies are 3% and 6%, thus of the claims we expect from type 1:
(40%)(3%) / {(40%)(3%) + (60%)(6%)} = 25%, and the remaining 75% from type 2.
Mean Severity is 100α. Variance of Severity is 100²α = 10,000α.
Thus the two process variances are: 60,000 and 40,000.
For severity, EPV = (25%)(60,000) + (75%)(40,000) = 45,000.
The hypothetical means are: 600 and 400.
Thus the first moment of the hypothetical means is: (25%)(600) + (75%)(400) = 450.
The second moment of the hypothetical means is: (25%)(6002 ) + (75%)(4002 ) = 210,000.
VHM = 210,000 - 4502 = 7500.
K = EPV / VHM = 45,000/7500 = 6.
There are a total of 4 claims, so Z = 4/(4 + 6) = 40%.
Observed severity is: (500 + 1000 + 300 + 800)/4 = 650. Prior mean is 450.
Estimated future severity is: (40%)(650) + (60%)(450) = 530.

9.34. E. The density of the Gamma Distribution is: x^(α-1) e^(-x/θ) / {θ^α Γ(α)}.
Since θ = 100 is the same for both risk types, we can ignore the factor of e^(-x/θ).
The chance of the observation is proportional to:
q^2 (1-q)^18 {500^(α-1) / (100^α Γ(α))} q^0 (1-q)^25 {1000^(α-1) / (100^α Γ(α))}
q^1 (1-q)^29 {300^(α-1) / (100^α Γ(α))} q^1 (1-q)^24 {800^(α-1) / (100^α Γ(α))}
= q^4 (1-q)^96 1200^(α-1) / {100^4 Γ(α)^4}.
Therefore, ignoring the constant of 100^4 in the denominator, the probability weights are:
(40%)(0.03^4)(0.97^96) 1200^5/(5!)^4 = 0.20884, and (60%)(0.06^4)(0.94^96) 1200^3/(3!)^4 = 0.02729.
Thus the posterior distribution is: 88.44% and 11.56%.
The hypothetical mean severities are: 600 and 400.
Therefore, the estimated future severity is: (88.44%)(600) + (11.56%)(400) = 577.
Comment: In the case of Bayes analysis, we use all of the information given.
In this case, we are given information on frequency, both in the model and the observation, and thus
this is used to help get the chance of the observation and thus the posterior distribution.
If we had not been given the information on frequency in the model, then the analysis would have
been instead:
The chance of the observation is proportional to:
{500^(α-1) / (100^α Γ(α))} {1000^(α-1) / (100^α Γ(α))} {300^(α-1) / (100^α Γ(α))} {800^(α-1) / (100^α Γ(α))}
= 1200^(α-1) / {100^4 Γ(α)^4}.
Therefore, the probability weights are:
(40%) 1200^5 / {100^4 (5!)^4} = 0.048, and (60%) 1200^3 / {100^4 (3!)^4} = 0.008.
Thus the posterior distribution is: 85.71% and 14.29%.
The hypothetical mean severities are: 600 and 400.
Therefore, the estimated future severity is: (85.71%)(600) + (14.29%)(400) = 571.

9.35. D. Mean frequency is q. Variance of frequency is: q(1-q).
Mean Severity is 100α. Variance of Severity is: 100²α.
Thus the variance of the pure premium is: q(100²α) + (100α)² q(1-q) = 10,000qα{1 + α(1 - q)}.
Thus the two process variances are:
(10,000)(0.03)(6){1 + (6)(0.97)} = 12,276, and (10,000)(0.06)(4){1 + (4)(0.94)} = 11,424.
Thus, for pure premium, EPV = (40%)(12,276) + (60%)(11,424) = 11,765.
The hypothetical means are: (0.03)(600) = 18, and (0.06)(400) = 24.
Thus the first moment of the hypothetical means is: (40%)(18) + (60%)(24) = 21.6.
The second moment of the hypothetical means is: (40%)(18²) + (60%)(24²) = 475.2.
VHM = 475.2 - 21.6² = 8.64.
K = EPV / VHM = 11,765/8.64 = 1362.
There are a total of 100 exposures, so Z = 100/(100 + 1362) = 6.8%.
Observed pure premium is: (500 + 1000 + 300 + 800)/100 = 26. Prior mean is 21.6.
Estimated future pure premium is: (6.8%)(26) + (1 - 6.8%)(21.6) = 21.90
Given 40 exposures, the estimated aggregate loss is: (40)(21.90) = 876.
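The pure premium EPV and VHM in 9.35 can be checked with the following sketch (Python, illustrative):

types = [(0.4, 0.03, 6), (0.6, 0.06, 4)]      # (a priori probability, q, alpha)

epv = mean = second = 0.0
for prior, q, alpha in types:
    sev_mean, sev_var = 100 * alpha, 100 ** 2 * alpha
    pv = q * sev_var + sev_mean ** 2 * q * (1 - q)   # process variance of the pure premium
    hm = q * sev_mean                                # hypothetical mean pure premium
    epv += prior * pv
    mean += prior * hm
    second += prior * hm ** 2

k = epv / (second - mean ** 2)
z = 100 / (100 + k)
pp = z * 2600 / 100 + (1 - z) * mean
print(k, pp, 40 * pp)                                # about 1362, 21.9, 876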

9.36. A. The density of the Gamma Distribution is: x^(α-1) e^(-x/θ) / {θ^α Γ(α)}.
Since θ = 100 is the same for both risk types, we can ignore the factor of e^(-x/θ).
The chance of the observation is proportional to:
q^2 (1-q)^18 {500^(α-1) / (100^α Γ(α))} q^0 (1-q)^25 {1000^(α-1) / (100^α Γ(α))}
q^1 (1-q)^29 {300^(α-1) / (100^α Γ(α))} q^1 (1-q)^24 {800^(α-1) / (100^α Γ(α))}
= q^4 (1-q)^96 1200^(α-1) / {100^4 Γ(α)^4}.
Therefore, ignoring the constant of 100^4 in the denominator, the probability weights are:
(40%)(0.03^4)(0.97^96) 1200^5/(5!)^4 = 0.20884, and (60%)(0.06^4)(0.94^96) 1200^3/(3!)^4 = 0.02729.
Thus the posterior distribution is: 88.44% and 11.56%.
The hypothetical mean pure premiums are: (0.03)(600) = 18, and (0.06)(400) = 24.
Therefore, the estimated future pure premium is: (88.44%)(18) + (11.56%)(24) = 18.69.
Given 30 exposures, the estimated aggregate loss is: (30)(18.69) = 561.
Comment: In the case of Bayes analysis, we use all of the information given.
In this case, we are given information on both frequency and severity, both in the model and the
observation, and thus this is used to help get the chance of the observation and thus the posterior
distribution. Thus the posterior distribution is the same for estimating frequency, severity, and pure
premiums: 88.44% chance of type 1 and 11.56% chance of type 2.

9.37. D. The hypothetical mean frequencies are: 5% and 10%.
VHM = (2.5%)² = 0.000625.
The process variances are: (0.05)(0.95) = 0.0475, and (0.1)(0.9) = 0.09.
EPV = (0.0475 + 0.09)/2 = 0.06875.
K = EPV / VHM = 0.06875 / 0.000625 = 110.
Z = 3 / (3 + K) = 2.65%.
9.38. B. The mean severity is 2000. The variance of severity is 1,800,000.
The hypothetical means are: (5%)(2000) = 100, and (10%)(2000) = 200.
VHM = 50² = 2500.
The process variance for type 1 is:
(0.05)(1,800,000) + (2000²)(0.05)(0.95) = 280,000.
The process variance for type 2 is:
(0.10)(1,800,000) + (2000²)(0.10)(0.90) = 540,000.
EPV = (280,000 + 540,000)/2 = 410,000.
K = EPV / VHM = 410,000 / 2500 = 164.
Z = 3 / (3 + K) = 1.80%.
9.39. A. Expected Value of the Process Variance = .19.
Class     A Priori       Mean for      Square of Mean    Process
          Probability    this Class    of this Class     Variance
1            0.4000         0.2             0.04           0.16
2            0.6000         0.3             0.09           0.21
Average                     0.26            0.0700         0.1900

Variance of the Hypothetical Means = .070 - .26² = .0024.


K = EPV / VHM = .19 / .0024 = 79.2. Thus for N = 5, Z = 5/(5 + 79.2) = 5.94%.

9.40. D. Set the Poisson parameter for an insured equal to λ. We are given that across the different
insureds E[λ] = .1 and VAR[λ] = .1. Therefore, E[λ²] = .1 + .1² = .11.
Set the mean severity for an insured equal to μ. We are given that across the different insureds
E[μ] = 100 and VAR[μ] = 2500. Therefore, E[μ²] = 2500 + 100² = 12,500.
We are given that the process variance of the severity is μ². Therefore the second moment of the
severity process = severity process variance + (mean severity)² = μ² + μ² = 2μ².
Since the frequency is Poisson and frequency and severity are independent, the process variance
of the pure premium is equal to: (mean frequency)(2nd moment of the severity) = 2λμ². The
expected value of the process variance of the P.P. = E[2λμ²] = 2E[λ]E[μ²] = (2)(.1)(12,500) =
2500. (Where we have made use of the fact that the frequency and severity are independent.)
For an individual insured, the hypothetical pure premium is λμ.
The variance of the hypothetical pure premiums = E[(λμ)²] - E²[λμ] =
E[λ²]E[μ²] - E²[λ]E²[μ] = (.11)(12,500) - (.1²)(100²) = 1375 - 100 = 1275.
K = EPV/VHM = 2500 / 1275 = 1.961. For one observation Z = 1 /(1+1.961) = 0.338.
Comment: Difficult. Really tests your understanding of the concepts.
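A compact way to check 9.40 (Python sketch, illustrative; lam and mu stand for the Poisson mean and the mean severity):

e_lam, var_lam = 0.1, 0.1
e_mu, var_mu = 100.0, 2500.0
e_lam2 = var_lam + e_lam ** 2            # E[lam^2] = 0.11
e_mu2 = var_mu + e_mu ** 2               # E[mu^2] = 12,500

epv = 2 * e_lam * e_mu2                  # severity second moment is 2 mu^2 here
vhm = e_lam2 * e_mu2 - (e_lam * e_mu) ** 2
print(epv, vhm, 1 / (1 + epv / vhm))     # 2500, 1275, about 0.338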
9.41. B. Z = N / (N+K). .40 = 2 / (2+K). Therefore, K = 3.
Therefore for N = 3, Z = 3 / (3+3) = 0.5.
9.42. D. The hypothetical mean pure premiums are: (.1667)(4) = 2/3 and (.8333)(2) = 5/3.
Since the two classes have the same number of risks the overall mean is 7/6 and the variance of the
hypothetical mean pure premiums between classes is:
{(2/3 - 7/6)² + (5/3 - 7/6)²}/2 = 1/4.
For each type of risk, the process variance of the pure premiums is given by: μf σs² + μs² σf².
For Class A, that is: (.1667)(20) + (4²)(.1389) = 5.5564.
For Class B, that is: (.8333)(5) + (2²)(.1389) = 4.7221.
Since the classes have the same number of risks,
the Expected Value of the Process Variance = (.5)(5.5564) + (.5)(4.7221) = 5.139.
Thus K = EPV / VHM = 5.139 / .25 = 20.56. Z = N /(N + K) = 4/(4 + 20.56) = 0.163.
9.43. E. The prior estimate is the overall mean of 7/6. The observation is .25.
Thus the new estimate is: (.163)(.25) + (7/6)(1 - .163) = 1.017.
Comment: Uses the solution of the previous question.

9.44. A. Variance of Hypothetical Means = .520 - .7² = .03.

Type of    A Priori Chance    Avg. Claim    Square of Avg.     Second Moment     Process Variance
Risk       of Risk            Frequency     Claim Frequency    of Claim Freq.    of Claim Freq.
A               0.750            0.6              0.36              1.400              1.04
B               0.250            1.0              1.00              2.400              1.40
Mean                             0.700            0.520                                1.130

K= EPV /VHM = 1.13 / .030 = 37.7. Z = 1 / (1+37.7) = .026.


New Estimate = (.026)(1) + (1 - .026)(.700) = 0.708.
Comment: The Process Variance for each risk is the Second Moment of the Frequency minus the
square of Average Claim Frequency. For example, 1.4 - .36 = 1.04.
The second moment of the claim frequency for Risk B is computed as:
(02 )(.5) + (12 )(.2) + (22 )(.1) + (32 )(.2) = 2.40.
9.45. E. We have E[θ] = .25 and Var[θ] = E[θ²] - E[θ]² = .07. Therefore, E[θ²] = .1325.
For the Binomial we have a process variance of mq(1-q) = θ(1 - θ) = θ - θ².
Therefore, the Expected Value of the Process Variance = E[θ] - E[θ²] = .25 - .1325 = .1175.
For the Binomial the mean is mq = θ. Variance of the Hypothetical Means = Var[θ] = .07.
K = EPV / VHM = .1175 / .07 = 1.6786. For one observation Z = 1 / (1 + 1.6786) = 0.373.
Comment: Since the Buhlmann credibility does not depend on the form of the prior distribution (but
just on its mean and variance), one could assume it was a Beta Distribution. In that case one would
have a Beta-Bernoulli conjugate prior situation. (See Mahlers Guide to Conjugate Priors.)
One can then solve for the parameters a and b of the Beta:
mean = .25 = a / (a+b) and variance = .07 = ab / {(a+b+1)(a+b)2 }.
Therefore b = 3a. 4a+1 = (.25)(.75) / .07 = 2.679. a = .420. b = 1.260.
The Buhlmann Credibility parameter for the Beta-Bernoulli is: a + b = 1.68.
For one observation, Z = 1 /(1+1.68) = .373.

9.46. B. The variance of the hypothetical means = 14,318,301 - 3653² = 973,892.

Risk     A priori       Mean     Square of     Second        Process
         Probability             Mean          Moment        Variance
1           0.666        4350    18,922,500    80,305,000    61,382,500
2           0.333        2270     5,152,900    40,207,000    35,054,100
Average                 3,653    14,318,301                  52,553,760

Thus K = EPV / VHM = 52,553,760 / 973,892 = 54. For one observation, Z = 1 / (1 + 54) = 1/55.
Thus the new estimate = ($100)(1/55) + ($3653)(54/55) = $3588.
Comment: (.7)(1002 ) + (.2)(10002 ) + (.1)(200002 ) = 40,207,000. 40,207,000 - 22702 =
35,054,100. The Buhlmann Credibility estimate has to be between the a priori mean of $3653 and
the observation of $100, so that prior to calculating Z one knows that the solution is either A , B, or
just barely C.
9.47. D. The Mean for Risk A = (.84)(0) + (.16)(50) + (.04)(1000) = 48.
The Second Moment for Risk A = (.84)(0²) + (.16)(50²) + (.04)(1000²) = 40,400.
The Process Variance for Risk A = 40,400 - 48² = 38,096.

Risk     A priori       Mean    Square of    Second     Process
         Probability            Mean         Moment     Variance
A            0.5          48       2,304       40,400      38,096
B            0.5         172      29,584      160,600     131,016
Average                  110      15,944                   84,556

The variance of the hypothetical means = 15,944 - 110² = 3844.


Thus K = EPV / VHM = 84556 / 3844 = 22. For two observations, Z = 2 / (2 + 22) = 1/12.
Thus the new estimate = ($0)(1/12) + ($110)(11/12) = $100.83.
9.48. B. Since the frequency and severity are independent, for each class
the Process Variance of the Pure Premium =
(mean frequency)(variance of severity) + (mean severity)2 (variance of frequency).
Class               A Priori       Mean Pure    Variance of     Square of
                    Probability    Premium      Pure Premium    Mean P.P.
1                      0.25            1             17              1
2                      0.75            4             66             16
Weighted Average                       3.25          53.75          12.25

Variance of Hypothetical Mean P.P. = 12.25 - 3.25² = 1.6875.


K = EPV/ VHM = 53.75 / 1.6875 = 31.9. Z = 1/(1+31.9) = 3.0%.
New Estimate = (.03)(P1 ) + (.97)(3.25) = 0.03P1 + 3.15.
Comment: One has to assume that one is unaware of which class the risk has been selected from,
just as in for example problems involving urns, where one doesnt know from which urn a ball has
been drawn.

9.49. A. Any information about which risk we had chosen, which may have been contained in prior
observations, is no longer relevant once we make a new random selection of a risk from the
portfolio. Therefore our best estimate is the a priori mean of 3.25.
9.50. B. For example, the second moment of aggregate losses for risk type A is:
(0²)(.80) + (50²)(.16) + (2000²)(.04) = 160,400.
Therefore the Process Variance for type A risk is: 160,400 - 88² = 152,656.
Type of    A Priori Chance    Average             Square of Average    Second Moment     Process Variance
Risk       of Risk            Aggregate Losses    Aggregate Losses     of Agg. Losses    of Agg. Losses
A               0.333              88.0                  7,744             160,400           152,656
B               0.333             332.0                110,224             640,600           530,376
C               0.333             576.0                331,776           1,120,800           789,024
Mean                              332.0                149,915                               490,685

The overall a priori mean is 332. The second moment of the means is: 149,915.
Therefore, the Variance of the Hypothetical Means = 149,915 - 332² = 39,691.
Expected Value of the Process Variance = 490685. K = EPV / VHM = 490685/39691 = 12.4.
Z= 1 / (1+12.4) = 7.5%. New estimate = (.075)(50) + (.925) (332) = $311.
9.51. D. Variance of the Hypothetical Means = 2.5 - 1.5² = .25.

Class     A Priori       Mean for      Square of Mean    Process
          Probability    this Class    of this Class     Variance
1            0.5000          1               1               1
2            0.5000          2               4               2
Average                      1.5             2.5             1.5

Average

K= EPV / VHM = 1.5 / .25 = 6. Thus for N = 2, Z = 2/(2 + 6) = 25%.


The observed mean is (2+0)/2 = 1 and the a priori mean is 1.5.
Therefore, the new estimate = (1)(25%) + (1.5)(75%) = 1.375.
9.52. C. The Expected Value of the Process Variance = .2222.
The Variance of the Hypothetical Means = .2222 - .4444² = .0247.
K = EPV/VHM = .2222 / .0247 = 9. Z = 2/(2+9) = 2/11.
The observed frequency is 2/2 = 1. The a priori mean is .4444.
Thus the estimate of the future frequency is: (2/11)(1) + (9/11)(.4444) = .545 = 6 /11.
Class    A Priori Chance    Mean Annual    Square of           Bernoulli
         of This Class      Claim Freq.    Mean Claim Freq.    Process Variance
1             0.3333           0.3333         0.1111              0.2222
2             0.3333           0.3333         0.1111              0.2222
3             0.3333           0.6667         0.4444              0.2222
Average                        0.4444         0.2222              0.2222

9.53. C. The process variances of the classes are 400²/12, 600²/12, and 800²/12.
Since the a priori probabilities of the class are all equal,
the Expected Value of the Process Variance = {(400²/12) + (600²/12) + (800²/12)} / 3 = 32,222.
The means of the classes are 200, 300 and 400.
Thus the a priori overall mean is: (200+300+400)/3 = 300.
Thus the Variance of the Hypothetical Means is:
{(200-300)² + (300-300)² + (400-300)²} / 3 = 6667.
Thus the Buhlmann Credibility Parameter K = EPV/VHM = 32222/6667 = 4.83.
Thus for one claim, (N=1), Z = 1/(1+4.83) = .171.
Thus after an observation of 340, the Buhlmann credibility estimate of the expected value of a
second claim from the same risk is: (340)(.171) + (300)(1 - .171) = 307.
In spreadsheet form, the calculation of the EPV and VHM is as follows:
Class     A Priori Chance    Process     Mean for      Square of Mean
          of this Class      Variance    this Class    for this Class
1              0.333          13,333         200            40,000
2              0.333          30,000         300            90,000
3              0.333          53,333         400           160,000
Overall                       32,222         300            96,667

The Variance of the Hypothetical Means = 96,667 - 300² = 6667.


Comment: The variance of the uniform distribution on [a,b] is (b-a)2 /12. Recall that when estimating
severities, the number of observations is in terms of the number of claims, in this case one. Since
when using credibility the posterior estimate is always between the a priori mean and the
observation, we can eliminate choices A and B.

9.54. A. The mean pure premium for Class One is 3u/4. The second moment of the pure
premium is (0)(1/4) + (u²)(3/4) = (u²)(3/4); thus the Process Variance for the pure premium is:
(u²)(3/4) - (3u/4)² = 3u²/16.
Alternately, the process is u times a Bernoulli, so the process variance is u² times the process
variance of a Bernoulli: u² q(1-q) = u²(3/4)(1/4) = 3u²/16.
Similarly, the process variance for Class 2, which has 2u times a Bernoulli with q = 1/4, is:
(2u)²(1/4)(1 - 1/4) = 12u²/16. Thus since the classes are equally likely, the
EPV = (1/2)(3u²/16) + (1/2)(12u²/16) = 15u²/32.
The mean pure premium for Class 2 is (2u)(1/4) = 2u/4. That for Class 1 is 3u/4.
Thus since the two classes are equally likely, the overall mean pure premium is:
(1/2)(3u/4) + (1/2)(2u/4) = 5u/8.
The Variance of the Hypothetical Mean Pure Premiums is:
(1/2)(5u/8 - 3u/4)² + (1/2)(5u/8 - 2u/4)² = u²/64.
Thus the Buhlmann Credibility Parameter, K = EPV/VHM = (15u²/32)/(u²/64) = 30.
Thus for one exposure, the credibility Z = 1/(1+K) = 1/31 = 0.032.
In spreadsheet form the calculation of the EPV and VHM is as follows:

Class     A Priori       Mean for      Square of Mean    Process
          Probability    this Class    of this Class     Variance
1            0.5000         3u/4         (9/16)u^2        (3/16)u^2
2            0.5000         2u/4         (4/16)u^2       (12/16)u^2
Average                     5u/8        (13/32)u^2       (15/32)u^2

VHM = (13/32)u² - (5u/8)² = u²(26/64 - 25/64) = u²/64. EPV = (15/32)u².


Comment: Since the given solutions do not depend on u, just taking u =1 at the beginning may help
you to get the correct solution. (u drops out of the solution.)

9.55. B. Each class has a mixed distribution.
The moments of a mixed distribution are weighted averages of the moments of the individual
distributions.
Thus the mean for Class A is: (1/2)(1/6) + (1/2)(1/3) = 1/4.
Similarly, the mean for class B is: (1/2)(2/3) + (1/2)(5/6) = 3/4.
Class A and B are equally likely, therefore the overall mean is 1/2.
The Variance of the Hypothetical Means is: (1/2)(1/4 - 1/2)² + (1/2)(3/4 - 1/2)² = (1/4)² = 1/16.
The second moment of a Poisson with mean (and variance ) is: + 2.
Thus the second moment for a Poisson with mean 1/6 is: 1/6 + 1/36 = 7/36.
The second moment for a Poisson with mean 1/3 is: 1/3 + 1/9 = 16/36.
Thus the second moment for class A is: (1/2)(7/36) + (1/2)(16/36) = 23/72.
Therefore, the process variance of Class A is: 23/72 - (1/4)2 = 37/144.
Similarly, the second moment for Class B is: (1/2)(2/3 + 4/9) + (1/2)(5/6 + 25/36) = 95/72.
Therefore, the process variance of Class B is: 95/72 - (3/4)2 = 109/144.
Therefore, EPV = (1/2)(37/144) + (1/2)(109/144) = 73/144.
Buhlmann Credibility Parameter, K = EPV/VHM = (73/144)/(1/16) = 73/9.
For one observation, Z = 1/(1 + 73/9) = 9/82.

9.56. E. Risk A has a mean of .5 and a variance of 1²/12 = 1/12.
Risk B has a mean of 1 and a variance of 2²/12 = 1/3.
Since the risks are equally likely, the a priori mean is: (.5+1)/2 = .75.
The VHM = (1/4)² = 1/16. The EPV = (1/12 + 1/3)/2 = 5/24.
K = EPV/VHM = (5/24)/(1/16) = 10/3. For one observation, Z = 1/(1+10/3) = 3/13.
If the observation is L, the estimate from Buhlmann Credibility is:
X = (3/13)L + (10/13)(.75) = (6L + 15)/26.
The density at L given risk A is 1 if L ≤ 1 and 0 if L > 1. The density at L given risk B is 1/2.
If L ≤ 1, then the density at L given Risk A is twice that given Risk B.
Thus since the risks are equally likely a priori, the posterior distribution is: 2/3 Risk A and
1/3 Risk B, if L < 1. If L > 1, then the posterior distribution is: 100% B.
If L ≤ 1, then the Bayesian estimate is: Y = (2/3)(1/2) + (1/3)(1) = 2/3.
If L > 1, then the Bayesian estimate of next years losses is: Y = (0)(1/2) + (1)(1) = 1.
Now we check versus the given statements.
A. If L < 1, X = (6L + 15)/26 and Y = 2/3. For L < 7/18, X < Y. For L > 7/18, X > Y.
Thus Statement A is not true.
B. If L > 1, X = (6L + 15)/26 and Y = 1. For L < 11/6, X < Y. For L > 11/6, X > Y.
Thus Statement B is not true.
C. If L =1/2, then X = (6L + 15)/26 = 18/26 = 9/13 = .692 > 2/3 = Y.
Thus Statement C is not true.
D. X = Y at L = 7/18 and L = 11/6. Thus Statement D is not true.
E. At L = 7/18, X = Y = 2/3. At L = 11/6, X = Y = 1.
Statement E is true.
Comment: The uniform distribution on [a, b] has variance: (b-a)2 /12.
A graph of the two estimates as a function of L follows; the Buhlmann estimate X = (6L + 15)/26
is a straight line, while the Bayesian estimate Y is 2/3 for L ≤ 1 and 1 for L > 1.
[Graph omitted.]

9.57. A. Let the unknown process variance for Class 2 be y.
Mean = ((380)(25) + (23)(50))/(25+50) = 142.
Second Moment of the Hypothetical Means = ((380²)(25) + (23²)(50))/(25+50) = 48,486.
VHM = 48,486 - 142² = 28,322.
EPV = k VHM = (2.65)(28,322) = 75,053.
Process Variance for Class 1 = 365,000 - 380² = 220,600.
75,053 = EPV = ((220,600)(25) + 50y)/75.
Therefore, y = 2280.
Comment: Gives the usual output and asks you to solve for a missing input.
The missing input, E[X²] for Class 2 is: 2280 + 23² = 2809.
If one had been given this value, the calculation of K would have gone as follows:
Class     A Priori       Hypothetical    Square of       Second     Process
          Probability    Mean            Hypoth. Mean    Moment     Variance
1            0.3333          380           144,400       365,000     220,600
2            0.6667           23               529         2,809       2,280
Average                       142            48,486                   75,053

VHM = 48,486 - 142² = 28,322.


K = EPV/VHM = 75,053/28,322 = 2.65, matching the given value of K.
9.58. B. EPV = .32 + .08c². VHM = .08 + .02c² - (.2 + .1c)² = .04 - .04c + .01c².

Class      Mean P.P.       Square of Mean    Variance of P.P.
A          (2)(.2) = .4         .16           (2²)(.2)(.8) = .64
B          .2c                  .04c²         c²(.2)(.8) = .16c²
Overall    .2 + .1c             .08 + .02c²   .32 + .08c²

K = EPV/VHM = (.32 + .08c²)/(.04 - .04c + .01c²) = (.32/c² + .08)/(.04/c² - .04/c + .01).
The limit as c approaches ∞ of K is .08/.01 = 8. Therefore, Z approaches 1/(1+8) = 1/9.
Alternately, one can turn this into a numerical problem by taking a value for c that is much larger than
2. For example, for c = 1000:
For Class A, mean P.P. is: (.2)(2) = .4, process variance of P.P. is: (2²){(.2)(.8)} = .64.
For Class B, mean P.P. is: (.2)(1000) = 200,
process variance of P.P. is: (1000²){(.2)(.8)} = 160,000.

Class     A Priori       Mean     Square of     Process
          Probability    P.P.     Mean P.P.     Variance
A            0.5000        0.4        0.16          0.64
B            0.5000      200       40,000       160,000
Average                  100.2     20,000.08     80,000.32

VHM = 20,000.08 - 100.2² = 9960.04.
K = EPV/VHM = 80,000.32/9960.04 = 8.032. Z = 1/(1 + 8.032) ≈ 1/9.

9.59. D. K = EPV/VHM = 1.793/.370 = 4.85. Z = 9/(9 + K) = .650.
Estimated frequency for the class = (.65)(7/9) + (1 - .65)(.425) = .654.
For five policyholders, we expect (5)(.654) = 3.27 claims.
Comment: This is an example of classification ratemaking. One can use the observation to predict
the future claim frequency of policyholders from the same class, whether or not these are the same
policyholders one has observed. The hypothetical means are the hypothetical mean frequencies for
each class. It is assumed that each individual in a given class has the same frequency
distribution. The VHM is the variance between classes.
9.60. D. E[q] = Overall mean = 0.1. .01 = VHM = Var[q] = E[q²] - E[q]².
Therefore, E[q²] = .01 + E[q]² = .01 + .1² = .02.
EPV = E[q(1-q)] = E[q] - E[q²] = .10 - .02 = .08. K = EPV/VHM = .08/.01 = 8.
Z = 10/(10 + 8) = 5/9. Estimated future frequency = (5/9)(0) + (4/9)(.1) = .0444.
Estimated number of claims for 5 years = (5)(.0444) = 0.222.
9.61. C. K = EPV/VHM = 8000/40 = 200. Z = 1800/(1800 + 200) = 90%.
Total reported losses = (15)(800) + (10)(600) + (5)(400) = 20,000.
Observed pure premium = 20000/1800 = 11.11. Overall mean pure premium = 20.
Estimated future pure premium = (.9)(11.11) + (1 - .9)(20) = 12.
Comment: If we were told that there are 500 employees in year 4, then the estimated aggregate
loss for year 4 is: (500)(12) = 6000. First estimate the future pure premium or frequency. Then
multiply by the number of exposures in the later year. See for example, 4, 11/04, Q.9.
9.62. B. VHM = Var[θ] = 50. EPV = E[Var[X | θ]] = E[500] = 500. K = EPV/VHM = 10.
Z = 3/(3 + 10) = 3/13. A priori mean = E[θ] = 1000.
Estimated future severity = (3/13)((750 + 1075 + 2000)/3) + (10/13)(1000) = 1063.

9.63. D. Adding the probabilities, there is a 60% a priori probability of θ = 0, risk type A, and a
40% a priori probability of θ = 1, risk type B.
For risk type A, the distribution of X is: 0 @ .4/.6 = 2/3, 1 @ .1/.6 = 1/6, and 2 @ .1/.6 = 1/6.
Putting this into ordinary language:
Prob[Type A] = Prob[θ = 0] = .4 + .1 + .1 = 60%.
Prob[Type B] = Prob[θ = 1] = .1 + .2 + .1 = 40%.

X         0      1      2
Type A    4/6    1/6    1/6
Type B    1/4    2/4    1/4

The mean for risk type A is: E[X | θ = 0] = (0)(2/3) + (1)(1/6) + (2)(1/6) = 0.5.
The 2nd moment for risk type A is: E[X² | θ = 0] = (0²)(2/3) + (1²)(1/6) + (2²)(1/6) = .8333.
Process Variance for risk type A is: Var[X | θ = 0] = .8333 - .5² = .5833.
Similarly, risk type B has mean 1, second moment 1.5, and process variance 0.5.
Risk Type    A Priori Chance    Mean     Square of Mean    Process Variance
A                 0.6           0.50          0.25              0.583
B                 0.4           1.00          1.00              0.500
Average                         0.70          0.55              0.550

EPV = (.6)(.5833) + (.4)(0.5) = .550. Overall Mean = (.6)(0.5) + (.4)(1) = 0.7.
2nd moment of the hypothetical means is: (.6)(0.5²) + (.4)(1²) = 0.55.
VHM = .55 - .7² = .06. K = EPV/VHM = .55/.06 = 9.17.
Z = 10/(10 + K) = 52.2%. Observed mean is: 10/10 = 1. Prior mean is .7.
Estimate is: (.522)(1) + (.478)(.7) = 0.857.
Comment: Overall we have: 50% X = 0, 30% X = 1, and 20% X = 2. Thus the overall mean is .7,
and the total variance is .61. Note that EPV + VHM = .55 + .06 = .61 = total variance.
9.64. C. Variance of the Hypothetical Means = .2775 - .425² = .0969.

Class     A Priori       Mean     Square of    Process
          Probability             Mean         Variance
1            0.25         0.1        0.01        0.09
2            0.25         0.2        0.04        0.16
3            0.25         0.5        0.25        0.25
4            0.25         0.9        0.81        0.09
Average                   0.4250     0.2775      0.1475

K = EPV/ VHM = .1475/.0969 = 1.5. Z = 4/(4 + K) = 72.7%.


Estimated future frequency = (.727)(2/4) + (.273)(.425) = .48. (5)(.48) = 2.4.

9.65. D. For risk 1, the mean is: (.5)(250) + (.3)(2500) + (.2)(60000) = 12,875,
the second moment is: (.5)(250²) + (.3)(2500²) + (.2)(60000²) = 721,906,250,
and the process variance is: 721,906,250 - 12,875² = 556,140,625.

Risk     A Priori       Mean         Square of      Second         Process
         Probability                 the Mean       Moment         Variance
1          66.67%      12,875.00     165,765,625    721,906,250    556,140,625
2          33.33%       6,675.00      44,555,625    361,293,750    316,738,125
Average                10,808.33     125,362,292                   476,339,792

VHM = 125,362,292 - 10,808.33² = 8,542,945. EPV = 476,339,792.


K = EPV/VHM = 55.8. Z = 1/(1 + K) = 1.8%.
(1.8%)(250) + (1 - 1.8%)(10,808.33) = 10,618.
Comment: With only two types of risks, VHM = (12875 - 6675)2 (2/3)(1/3) = 8,542,222, with the
difference from above due to rounding.
9.66. C. For the third class, (.7)(0) + (.2)(1) + (.1)(2) = .4. (.7)(0²) + (.2)(1²) + (.1)(2²) = .6.

Class     A Priori Chance    Mean for      Square of Mean    Second Moment     Process
          of this Class      this Class    for this Class    for this Class    Variance
1              0.333           0.200           0.0400             0.400          0.360
2              0.333           0.300           0.0900             0.500          0.410
3              0.333           0.400           0.1600             0.600          0.440
Overall                        0.300           0.0967             0.500          0.403

The Variance of the Hypothetical Means = .0967 - .3² = .0067.


K = EPV/VHM = .403/.0067 = 60.1. Z = 50/(50 + 60.1) = 45.4%.
Estimated future frequency = (45.4%)(17/50) + (54.6%)(.3) = .318. (35)(.318) = 11.1.
9.67. C. Variance of the Hypothetical Means = 22.18 - 4.7² = 0.09.

Class     A Priori       Mean for      Square of Mean    Process
          Probability    this Class    of this Class     Variance
1            0.5000          5              25               5
2            0.5000          4.4            19.36            1.98
Average                      4.7            22.18            3.49

K= EPV / VHM = 3.49 / .09 = 38.8. Thus for N = 3, Z = 3/(3 + 38.8) = 7.2%.
The observed mean is: (3 + r + 4)/3 = (7 + r)/3, and the a priori mean is 4.7.
Therefore, the estimate = (7.2%)(7 + r)/3 + (92.8%)(4.7).
Set this equal to 4.6019: 4.6019 = (7.2%)(7 + r)/3 + (92.8%)(4.7). r = 3.0.

9.68. B. E[X | θ] = (0)(2θ) + (1)(θ) + (2)(1 - 3θ) = 2 - 5θ.
E[X² | θ] = (0)(2θ) + (1)(θ) + (4)(1 - 3θ) = 4 - 11θ.

θ       Prob.    Mean    Mean²     Second Moment    Process Variance
.05      .8      1.75    3.0625         3.45              .3875
.30      .2      0.50    0.2500         0.70              .4500
Avg.             1.50    2.5000                           .4000

VHM = 2.5 - 1.5² = .25. K = EPV/VHM = .4/.25 = 1.6. Z = 1/(1 + K) = 1/2.6.


Estimate is: (2)(1/2.6) + (1.5)(1 - 1/2.6) = 1.692.
9.69. B. The process variance is 1000² = 1 million for a risk from each class. EPV = 1 million.
The overall mean is: (60%)(2000) + (30%)(3000) + (10%)(4000) = 2500.
The second moment of the hypothetical means is:
(60%)(2000²) + (30%)(3000²) + (10%)(4000²) = 6.7 million.
Therefore, VHM = 6.7 million - 2500² = 0.45 million.
K = EPV/VHM = 1/0.45 = 2.22.
Z = 80/(80 + 2.22) = 97.3%.
The observed pure premium is:
(24,000 + 36,000 + 28,000)/(24 + 30 + 26) = 88,000/80 = 1100.
Estimated future pure premium is: (97.3%)(1100) + (1 - 97.3%)(2500) = 1138.
Comment: Choice A would correspond to Z = 1, while choice E would correspond to Z = 0,
thus one can probably eliminate these choices.


Section 10, Buhlmann Credibility, with Continuous Risk Types


In the prior section Buhlmann Credibility was applied to situations where there were several distinct
types of risks. In this section, Buhlmann Credibility will be applied in a similar manner to situations in
which there are an infinite number of risk types, parameterized in some continuous manner. Where
summation was used in the discrete case, instead integration will be used in the continuous case.
Mahlers Guide to Conjugate Priors contains many more examples of Buhlmann Credibility with
Continuous Risk Types.115
An Example of Mixing Bernoullis:
For example, assume:
In a large portfolio of risks, the number of claims for one policyholder during one
year follows a Bernoulli distribution with mean q.
The number of claims for one policyholder for one year is independent of the number
of claims for the policyholder for any other year.
The distribution of q within the portfolio has density function: π(q) = 20.006 q^4, 0.6 ≤ q ≤ 0.8.
For a given value of q, the process variance of a Bernoulli is q(1 - q) = q - q².116
We use integration to get the expected value of the process variance:117
EPV = E[q - q²] = ∫[0.6 to 0.8] (q - q²) π(q) dq = 20.006 ∫[0.6 to 0.8] (q^5 - q^6) dq = 0.199.
For a given value of q, the hypothetical mean is q.
We use integrals to get the first and second moments of the hypothetical means:
E[q] = ∫[0.6 to 0.8] q π(q) dq = 20.006 ∫[0.6 to 0.8] q^5 dq = 0.71851.
E[q²] = ∫[0.6 to 0.8] q² π(q) dq = 20.006 ∫[0.6 to 0.8] q^6 dq = 0.51936.
VHM = E[q²] - E[q]² = 0.51936 - 0.71851² = 0.0031.
K = EPV/VHM = 0.199/0.0031 = 64.
115 See the sections on Mixing Poissons, Gamma-Poisson, Beta-Binomial, Inverse Gamma - Exponential,
Normal-Normal, and Overview.
116 In general, the process variance and the hypothetical mean will be some function of the parameter.
117 In cases where the distribution of the parameter is well known, for example uniform or exponential, we can
avoid doing integrals, since we already know the mean, moments, and variance of such a distribution.

Exercise: A policyholder is selected at random from this portfolio. He is observed to have one claim
in three years. Using Buhlmann Credibility, what is his expected future frequency?
[Solution: Z = 3/(3 + K) = 4.5%. Estimate = (4.5%)(1/3) + (95.5%)(0.71851) = 0.70.]
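The three integrals above can also be done numerically; the sketch below (Python, assuming scipy is available for the quadrature) reproduces K and the estimate in the exercise:

from scipy import integrate

prior = lambda q: 20.006 * q ** 4                          # density of q on [0.6, 0.8]

epv, _ = integrate.quad(lambda q: (q - q ** 2) * prior(q), 0.6, 0.8)
e_q, _ = integrate.quad(lambda q: q * prior(q), 0.6, 0.8)
e_q2, _ = integrate.quad(lambda q: q ** 2 * prior(q), 0.6, 0.8)

k = epv / (e_q2 - e_q ** 2)
z = 3 / (3 + k)
print(round(k), round(z * (1 / 3) + (1 - z) * e_q, 2))     # about 64 and 0.70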
Summary:
Assume q is a parameter which varies across the portfolio, via prior distribution π(q).
EPV = ∫ (Process Variance | q) π(q) dq.
First Moment of Hypothetical Means = ∫ (Mean | q) π(q) dq.
Second Moment of Hypothetical Means = ∫ (Mean | q)² π(q) dq.
VHM = Second Moment of Hypothetical Means - (First Moment of Hypothetical Means)².
Using Moment Formulas:
In order to calculate the EPV and VHM, in those cases where the prior distribution π(q) is one of the
distributions in the Appendices attached to the exam, often one can use moment formulas rather
than doing integrals.
For example, assume that the annual number of claims on a given policy has a Geometric
Distribution with parameter β.
The prior distribution of β is a Gamma Distribution with parameters α = 4 and θ = 0.1.
The process variance is: β(1+β) = β + β².
Thus the EPV = E[β] + E[β²] = (mean of the Gamma) + (second moment of the Gamma) =
(4)(0.1) + (4)(5)(0.1²) = 0.60.
The hypothetical mean is β.
Thus the VHM = Var[β] = variance of the Gamma = (4)(0.1²) = 0.04.
Therefore, K = EPV / VHM = 0.60 / 0.04 = 15.
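In code, the same moment shortcut looks like this (Python sketch, illustrative):

alpha, theta = 4, 0.1                          # Gamma prior on beta
e_beta = alpha * theta                         # 0.4
e_beta2 = alpha * (alpha + 1) * theta ** 2     # 0.2
var_beta = alpha * theta ** 2                  # 0.04

epv = e_beta + e_beta2     # E[beta(1 + beta)] = 0.60
vhm = var_beta             # hypothetical mean is beta
print(epv / vhm)           # K = 15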

A Severity Example:
Assume severity is Pareto, with θ = 1, and alpha varying across the portfolio.
Assume alpha is uniformly distributed from 3 to 5.
E[X | α] = 1/(α - 1).
Thus the first moment of the hypothetical means is:
∫[3 to 5] {1/(α - 1)} (1/2) dα = {ln(4) - ln(2)}/2 = 0.34657.
Exercise: What is the second moment of the hypothetical means?
[Solution: ∫[3 to 5] {1/(α - 1)²} (1/2) dα = (1/2 - 1/4)/2 = 1/8 = 0.125.]
Thus the VHM = 0.125 - 0.34657² = 0.004887.
E[X² | α] = 2/{(α - 1)(α - 2)}.
Therefore, Var[X | α] = 2/{(α - 1)(α - 2)} - 1/(α - 1)².118
Thus, EPV = ∫[3 to 5] [2/{(α - 1)(α - 2)}] (1/2) dα - ∫[3 to 5] [1/(α - 1)²] (1/2) dα
= ∫[3 to 5] {1/(α - 2) - 1/(α - 1)} dα - 1/8 = ln(3/1) - ln(4/2) - 1/8 = 0.2805.
Therefore, K = EPV / VHM = 0.2805 / 0.004887 = 57.4.
Exercise: For an individual policyholder, we observe 20 claims which total 9.
Use Buhlmann Credibility to estimate the size of the next claim from the same policyholder.
[Solution: Z = 20 / (20 + 57.4) = 25.8%. Prior mean is 0.34657, from above.
Estimate is: (25.8%)(9/20) + (1 - 25.8%)(0.34657) = 0.373.]
118 I have left the process variance in this form in order to make it easier to integrate.
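As with the Bernoulli example, these integrals can be confirmed numerically (Python sketch, assuming scipy):

from scipy import integrate

density = 0.5                                   # alpha uniform on [3, 5]
mean = lambda a: 1 / (a - 1)                    # Pareto (theta = 1) mean
second = lambda a: 2 / ((a - 1) * (a - 2))      # Pareto (theta = 1) second moment

m1, _ = integrate.quad(lambda a: mean(a) * density, 3, 5)
m2, _ = integrate.quad(lambda a: mean(a) ** 2 * density, 3, 5)
epv, _ = integrate.quad(lambda a: (second(a) - mean(a) ** 2) * density, 3, 5)

k = epv / (m2 - m1 ** 2)
z = 20 / (20 + k)
print(round(k, 1), round(z * 9 / 20 + (1 - z) * m1, 3))    # about 57.4 and 0.373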

Problems:
Use the following information for the next two questions:

Claim sizes follow a Gamma Distribution, with parameters α and θ = 10.

The prior distribution of α is assumed to be uniform on the interval (2.5, 4.5).

Buhlmann Credibility is being used to estimate claim severity.

10.1 (2 points) Determine the value of the Buhlmann Credibility Parameter, K.


A. Less than 10
B. At least 10, but less than 11
C. At least 11, but less than 12
D. At least 12, but less than 13
E. 13 or more
10.2 (1 point) You observe from an insured 2 claims of sizes 15 and 31.
What is the estimated future claim severity from this insured?
A. 25
B. 27
C. 29
D. 31
E. 33
10.3 (3 points) You are given:

The annual claim count for each individual insured has a Negative Binomial Distribution,
with parameters r and β, which do not change over time.

For each insured, β = 0.3.

The r parameters vary across the portfolio of insureds, via a Gamma Distribution
with α = 4 and θ = 5.
Determine the Buhlmann credibility factor Z for an individual driver for one year.
A. 52%
B. 54%
C. 56%
D. 58%
E. 60%
10.4 (3 points) You are given:
The annual claim count for each individual insured has a Negative Binomial Distribution,
with parameters r and β, which do not change over time.

For each insured, r = 3.

The β parameters vary across the portfolio of insureds, via an Exponential Distribution
with θ = 0.7.
Determine the Buhlmann credibility factor for an individual driver for one year.
(A) 45%
(B) 55%
(C) 65%
(D) 75%
(E) 85%

10.5 (3 points) You are given the following:
Claim sizes for a given policyholder follow an exponential distribution with f(x) = λe^(-λx), 0 < x < ∞.
The prior distribution of λ is uniform from 0.02 to 0.10.
The policyholder experiences a claim of size 60.
Using Buhlmann Credibility, determine the expected size of the next claim from this policyholder.
A. 24
B. 26
C. 28
D. 30
E. 32
10.6 (2 points) Use the following information:

The probability of y successes in m trials is given by a Binomial distribution


with parameters m and q.

The prior distribution of q is uniform on [0, 1].

Two successes were observed in six trials.

Use Buhlmann Credibility to estimate the probability that a success will occur on the seventh trial.
A. 0.34
B. 0.36
C. 0.38
D. 0.40
E. 0.42
10.7 (4 points) You are given the following:
Claim sizes for a given policyholder follow a distribution with density function
f(x) = 3x²/b³, 0 < x < b.
The prior distribution of b is a Single Parameter Pareto Distribution with α = 6 and θ = 40.
A policyholder experiences two claims of sizes 30 and 60. Use Buhlmann Credibility to determine
the expected value of the next claim from this policyholder.
A. 37
B. 38
C. 39
D. 40
E. 41
Use the following information for the next two questions:
(i) Xi is the claim count observed for driver i for one year.
(ii) Xi has a negative binomial distribution with parameters β = 1.6 and ri.
(iii) The ri's have an exponential distribution with mean 0.8.
(iv) The size of claims follows a Pareto Distribution with α = 5 and θ = 1000.
10.8 (3 points) An individual driver is observed to have 2 claims in one year.
Use Buhlmann Credibility to estimate this driver's future annual claim frequency.
(A) 1.3
(B) 1.4
(C) 1.5
(D) 1.6
(E) 1.7
10.9 (3 points) An individual driver is observed to have 2 claims in one year, of sizes 1500 and 800.
Apply Buhlmann Credibility to aggregate losses in order to estimate this driver's future annual aggregate loss.
(A) 600
(B) 700
(C) 800
(D) 900
(E) 1000

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 459
10.10 (3 points) Use the following information:
In a large portfolio of risks, the number of claims for one policyholder during one year follows a Bernoulli distribution with mean q.
The number of claims for one policyholder for one year is independent of the number of claims for the policyholder for any other year. The number of claims for one policyholder is independent of the number of claims for any other policyholder.
The distribution of q within the portfolio has density function:
f(q) = 400q, 0 < q ≤ 0.05
f(q) = 40 - 400q, 0.05 < q < 0.10
A policyholder Phillip DeTanque is selected at random from the portfolio.
During Year 1, Phil has one claim. During Year 2, Phil has no claim. During Year 3, Phil has one claim.
Use Buhlmann Credibility to estimate the probability that Phil will have a claim during Year 4.
A. 5.4%
B. 5.8%
C. 6.2%
D. 6.6%
E. 7.0%
Use the following information for the next three questions:
Losses for individual policyholders follow a Compound Poisson Distribution.
The prior distribution of the Poisson parameter λ is uniform on [2, 6].
Severity is Gamma with parameters α = 3 and θ.
The prior distribution of θ has density 62.5 e^(-5/θ)/θ⁴, θ > 0.
The distributions of λ and θ are independent.
10.11 (2 points) An individual policyholder has 2 claims this year. Using Buhlmann Credibility, what
is the expected number of claims from that policyholder next year?
(A) 3.00
(B) 3.25
(C) 3.50
(D) 3.75
(E) 4.00
10.12 (3 points) An individual policyholder has 2 claims of sizes 4 and 7. Using Buhlmann
Credibility, what is the expected size of the next claim from that policyholder?
(A) 5.5
(B) 6.0
(C) 6.5
(D) 7.0
(E) 7.5
10.13 (4 points) This year, an individual policyholder has 2 claims of sizes 4 and 7.
Applying Buhlmann Credibility directly to the aggregate losses, what is the expected aggregate
loss from that policyholder next year?
(A) 16
(B) 18
(C) 20
(D) 22
(E) 24
10.14 (4 points) Severity is LogNormal with parameters μ and σ = 1.2.
μ varies across the portfolio via a Gamma Distribution with α = 8 and θ = 0.3.
Determine the value of the Buhlmann Credibility Parameter K for severity.
A. 3
B. 4
C. 5
D. 6
E. 7

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 460
10.15 (3 points) Use the following information:
(i) Xi is the claim count observed for insured i for one year.
(ii) Xi has a negative binomial distribution with parameters r = 2 and βi.
(iii) The βi's have a distribution π[β] = 280β⁴/(1 + β)⁹, 0 < β < ∞.
An insured has 8 claims in one year.
Using Buhlmann Credibility, what is that insured's expected future annual claim frequency?
Hint: ∫_{0}^{∞} x^(c-1)/(1 + x)^(c+d) dx = Γ(c)Γ(d)/Γ(c + d), for c > 0, d > 0.
A. 4.8

B. 5.0

C. 5.2

D. 5.4

E. 5.6

Use the following information for a group of insureds for the next two questions:
The amount of a claim is uniformly distributed, but will not exceed a certain unknown limit b.
The prior distribution of b is: π(b) = 3000/b⁴, b > 10.
From an insured, three claims of sizes 17, 13, and 22 are observed in that order.
10.16 (3 points) Use Buhlmann Credibility to estimate the size of the next claim from this insured.
A. 11
B. 12
C. 13
D. 14
E. 15
10.17 (3 points) Use Bayes Analysis to estimate the size of the next claim from this insured.
A. 11
B. 12
C. 13
D. 14
E. 15

10.18 (3 points) You are given the following:
Claim sizes for a given policyholder follow an exponential distribution with density function f(x) = e^(-x/θ)/θ, 0 < x < ∞.
The prior distribution of θ is Pareto with α = 2 and scale parameter 100.
The policyholder experiences a claim of size 60.
Using Buhlmann Credibility, determine the expected size of the next claim from this policyholder.
A. 60
B. 70
C. 80
D. 90
E. 100

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 461
10.19 (3 points) You are given:
(i) Claim counts follow a Poisson Distribution with mean λ/20.
(ii) Claim sizes follow a Gamma Distribution with α = 5 and θ = 4λ.
(iii) Claim counts and claim sizes are independent, given λ.
(iv) The prior distribution has probability density function: π(λ) = 7/λ⁸, λ > 1.
For 100 exposures, calculate the Buhlmann Credibility for aggregate losses.
(A) 44%
(B) 47%
(C) 50%
(D) 53%
(E) 56%
10.20 (2 points) You are given the following:
Severity is uniform from 0 to b.
b is distributed uniformly from 10 to 15.
Calculate Bühlmann's K for severity.
A. 25
B. 30
C. 35
D. 40

E. 45

10.21 (3 points) You are given the following:
Number of claims for a single insured follows a Poisson distribution with mean λ.
The amount of a single claim has a distribution with mean μ and coefficient of variation of 4.
λ and μ are independent random variables.
E[λ] = 2, Var[λ] = 3.
The distribution of μ has a coefficient of variation of 2.
Number of claims and claim severity distributions are independent.
Calculate Bühlmann's K for aggregate losses.
A. 3.5
B. 4.0
C. 4.5
D. 5.0
E. 5.5

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 462
Use the following information for the next two questions:
During year 5, claim sizes are LogNormal with μ = 6 and σ = 1.
Claim frequency per employee is Binomial with parameters m = 3 and q.
q is the same for the employees of a given employer.
The prior distribution of q between employers is uniform from 0.01 to 0.03.
Frequency and severity are independent.
10.22 (3 points) Determine the Buhlmann Credibility Parameter, K, for estimating pure premiums.
(A) Less than 400
(B) At least 400, but less than 500
(C) At least 500, but less than 600
(D) At least 600, but less than 700
(E) At least 700
10.23 (2 points) You observe the following experience for a particular employer.
Year   Number of Employees   Loss per Employee
1      100                   21
2      120                   28
3      140                   27
You expect 130 employees for this employer in year 5.
Inflation is 4% per year.
Use Buhlmann Credibility to determine the expected losses from this employer in year 5.
A. 4600
B. 4700
C. 4800
D. 4900
E. 5000

10.24 (3 points) You are given:
(i) The number of claims in a year for a selected risk follows a Poisson distribution with mean λ.
(ii) The severity of claims for the selected risk follows an exponential distribution with mean θ.
(iii) The number of claims is independent of the severity of claims.
(iv) The prior distribution of λ is exponential with mean 4.
(v) The prior distribution of θ is Poisson with mean 7.
(vi) A priori, λ and θ are independent.
Using Bühlmann's credibility for aggregate losses, determine k.
(A) 4/9
(B) 1/2
(C) 5/9
(D) 2/3
(E) 3/4

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 463
10.25 (4 points) You are given the following:
The number of claims follows a distribution with mean λ and variance 2λ.
Claim sizes follow a distribution with mean θ and variance 2θ².
The number of claims and claim sizes are independent.
λ and θ have a prior probability distribution with joint density function
f(, ) = 1.5, 0 < < 2, 0 < < 1.

Determine the value of Buhlmann's k for severity.


(A) 5
(B) 6
(C) 7
(D) 8

(E) 9

10.26 (2 points) You are given:
(i) The annual number of claims on a given policy has a geometric distribution with parameter β.
(ii) The prior distribution of β is Gamma with parameters α and θ.
Determine the Bühlmann credibility parameter, K.
(A) (α - 1)/θ   (B) α/θ   (C) (α + 1)/θ   (D) α + 1/θ   (E) None of A, B, C, or D.
10.27 (3 points) Use the following information:
Claim sizes for a given policyholder follow a mixed exponential distribution with density function f(x) = 0.75λe^(-λx) + 0.5λe^(-2λx), 0 < x < ∞.
The prior distribution of λ is uniform from 0.01 to 0.05.
The policyholder experiences a claim of size 60.
Use Buhlmann Credibility to determine the expected size of the next claim from this policyholder.
A. 36
B. 37
C. 38
D. 39
E. 40
10.28 (4 points) You are given:
(i) Claim counts follow a Poisson Distribution with mean λ/20.
(ii) Claim sizes follow a Gamma Distribution with α = 5 and θ = 4λ.
(iii) Claim counts and claim sizes are independent, given λ.
(iv) The prior distribution has probability density function: π(λ) = 7/λ⁸, λ > 1.
Calculate the Buhlmann Credibility parameter, K, for estimating severity.
(A) 1   (B) 3   (C) 5   (D) 7   (E) 9

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 464
10.29 (3 points) You are given:
(i) The number of claims in a year for a selected risk follows a Poisson distribution with mean λ.
(ii) The severity of claims for the selected risk follows an exponential distribution with mean θ.
(iii) The number of claims is independent of the severity of claims.
(iv) The joint density of λ and θ is: (2,500,000/3) λ³ e^(-10λ) θ⁻⁴ e^(-10/θ), λ > 0, θ > 0.
Using Bühlmann's credibility for aggregate losses, determine k.
(A) 1
(B) 3
(C) 5
(D) 7
(E) 9
Use the following information for the next two questions:
(i) The number of claims for each policyholder has a Binomial Distribution with parameters m = 3 and q.
(ii) The prior distribution of q is f(q) = 1.5 - q, 0 < q < 1.
(iii) A randomly selected policyholder had the following claims experience:
Year   Number of Claims
1      1
2      0
3      2
4      1
5      0
(iv) ∫_{0}^{1} q^(a-1) (1 - q)^(b-1) dq = (a - 1)! (b - 1)! / (a + b - 1)!
10.30 (4 points) Use Bayes Analysis to estimate this policyholder's frequency for year 6.
10.31 (3 points) Use Buhlmann Credibility to estimate this policyholder's frequency for year 6.

10.32 (5 points) The distribution of aggregate losses is LogNormal with parameters μ = 5 and σ.
σ² varies across the portfolio via an Inverse Gaussian Distribution with μ = 0.6 and θ = 2.
Determine the value of the Buhlmann Credibility Parameter for aggregate losses.
Hint: Use the moment generating function of the Inverse Gaussian Distribution.
A. 20
B. 30
C. 40
D. 50
E. 60

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 465
10.33 (3 points) You are given the following:
The amount of an individual claim has an Inverse Gamma distribution with shape parameter α = 6 and scale parameter θ.
The parameter θ is distributed via an Exponential Distribution.
Calculate Bühlmann's k for severity.
A. 1/4
B. 1/2
C. 1

D. 3/2

E. 5/2

10.34 (4 points) You are given:
(i) The number of claims in a year for a selected risk follows a Poisson distribution with mean λ.
(ii) The severity of claims for the selected risk follows an exponential distribution with mean θ.
(iii) The number of claims is independent of the severity of claims.
(iv) The joint density of λ and θ is: 0.3 θ e^(-0.1θ)/(1 + 10λ)⁴, λ > 0, θ > 0.
Using Bühlmann's credibility for aggregate losses, determine k.
(A) 10
(B) 12
(C) 14
(D) 16
(E) 18
10.35 (2 points) You are given:
(i) Given Θ = θ, claim sizes follow a Pareto distribution with parameters α = 4 and θ.
(ii) The prior distribution of the parameter θ is uniform from 10 to 50.
Six claims are observed.
Determine Bhlmanns credibility to be used in order to estimate the size of the seventh claim.
A. less than 10%
B. at least 10% but less than 15%
C. at least 15% but less than 20%
D. at least 20% but less than 25%
E. at least 25%
10.36 (3 points) Severity is Gamma with parameters that vary continuously:
π(α, θ) = α²θ³/23.75, 2 < α < 3, 1 < θ < 2.
Determine the Buhlmann credibility parameter.


A. 5
B. 7
C. 9
D. 11

E. 13

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 466
10.37 (5 points) Severity is LogNormal with parameters u and σ = 2.
u varies across the portfolio via a Normal Distribution with μ = 5 and σ² = 3.
A policyholder submits 20 claims with an average size of 10,000.
Use Buhlmann Credibility to predict the size of the next claim from this policyholder.
A. less than 6500
B. at least 6500 but less than 7000
C. at least 7000 but less than 7500
D. at least 7500 but less than 8000
E. at least 8000
10.38 (4 points) Severity is Inverse Gamma with parameters that vary continuously.
α and θ vary independently.
α is uniformly distributed from 3 to 5.
θ is distributed via: π(θ) = 3000/θ⁴, θ > 10.
Determine the Buhlmann credibility to be assigned to 4 claims.
Hint: ∫ 1/{(x - 2)(x - 1)²} dx = 1/(x - 1) + ln[x - 2] - ln[x - 1].
A. 55%

B. 60%

C. 65%

D. 70%

E. 75%

10.39 (2 points) The size of loss has a density: f(x | θ) = 3θ³/x⁴, x > θ.
The prior distribution of θ is uniform from 10 to 15.
We observe 8 claims for a total of 200 from an insured.
Use Buhlmann Credibility to estimate the size of the next claim from that insured.
A. 19.50
B. 19.75
C. 20.00
D. 20.25
E. 20.50
10.40 (3 points) You are given:
(i) The claim count observed for an individual driver for one year has a geometric distribution with parameter β.
(ii) β varies across a group of drivers.
(iii) 10β follows a zero-truncated Poisson distribution with λ = 2.
Determine the Buhlmann credibility factor for an individual driver for five years.
(A) Less than 0.05
(B) At least 0.05, but less than 0.10
(C) At least 0.10, but less than 0.15
(D) At least 0.15, but less than 0.20
(E) At least 0.20

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 467
10.41 (4 points) You are given:
(i) Severity follows a LogNormal Distribution with σ = 2.
(ii) μ varies across a group of drivers.
(iii) μ - 5 follows a Negative Binomial distribution with r = 6 and β = 0.05.
Determine the Buhlmann credibility parameter K.
A. 10
B. 20
C. 30
D. 50

E. 75

10.42 (3 points) For group medical insurance, you have the following three years of experience from a particular insured group:
Year   Number of Members   Number of Claims   Average Loss Per Member
1      10                  25                 2143
2      15                  40                 2551
3      20                  45                 2260
There will be 25 members in year 4.
The number of claims per member in any year follows a Binomial distribution with parameters m = 10 and q.
q is the same for all members in a group, but varies between groups.
q is distributed uniformly over the interval (0.20, 0.40).
Claim severity follows a Gamma distribution with parameters α = 5, θ = 200.
Calculate the Buhlmann-Straub estimate of aggregate losses in year 4.
(A) 60,000 (B) 61,000 (C) 62,000 (D) 64,000 (E) 66,000

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 468
Use the following information for the next four questions:
There exists a set of risks, each of which can have at most one accident during each year.
The hypothetical mean frequencies vary among the individual risks and are a priori distributed with equal probability in the interval between 0.07 and 0.13.
The severity and frequency distributions are independent.
There are two types of risks, each with a different severity distribution:
Risk Type   5 units   10 units   20 units
type 1:     1/3       1/2        1/6
type 2:     1/2       1/4        1/4
60% of the risks are type 1, and 40% are type 2.
10.43 (4, 11/82, Q.46A) (3 points)
What is the variance of the hypothetical mean pure premiums?
A. less than 0.02
B. at least 0.02 but less than 0.03
C. at least 0.03 but less than 0.04
D. at least 0.04 but less than 0.05
E. at least 0.05
10.44 (4, 11/82, Q.46B) (4 points) What is the expected value of the process variance?
A. less than 11.6
B. at least 11.6 but less than 11.7
C. at least 11.7 but less than 11.8
D. at least 11.8 but less than 11.9
E. at least 11.9
10.45 (4, 11/82, Q.46C) (1 point) Find K, the Buhlmann Credibility Parameter.
A. less than 350
B. at least 350 but less than 400
C. at least 400 but less than 450
D. at least 450 but less than 500
E. at least 500
10.46 (1 point) A risk is chosen at random. You observe a total of 45 units of losses over 15 years.
Use Buhlmann Credibility to estimate the future pure premium for that same risk.
A. less than 1.1
B. at least 1.1 but less than 1.2
C. at least 1.2 but less than 1.3
D. at least 1.3 but less than 1.4
E. at least 1.4

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 469
10.47 (4B, 11/97, Q.10) (3 points) You are given the following:
In a large portfolio of automobile risks, the number of claims for one policyholder during one year follows a Bernoulli distribution with mean m/100,000, where m is the number of miles driven each and every year by the policyholder.
The number of claims for one policyholder for one year is independent of the number of claims for the policyholder for any other year. The number of claims for one policyholder is independent of the number of claims for any other policyholder.
The distribution of m within the portfolio has density function:
f(m) = m/100,000,000, 0 < m ≤ 10,000
f(m) = (20,000 - m)/100,000,000, 10,000 < m < 20,000
A policyholder is selected at random from the portfolio. During Year 1, one claim is observed for this
policyholder. During Year 2, no claims are observed for this policyholder. No information is available
regarding the number of claims observed during Years 3 and 4.
Hint: Use a change of variable such as q = m/100,000. Determine the Buhlmann credibility estimate
of the expected number of claims for the selected policyholder during Year 5.
A. 3/31
B. 1/10
C. 7/62
D. 63/550
E. 73/570
10.48 (4B, 11/98, Q.19) (2 points) You are given the following:
Claim sizes follow a gamma distribution, with parameters α and θ = 1/2.
The prior distribution of α is assumed to be uniform on the interval (0, 4).
Determine the value of Buhlmann's k for estimating the expected value of a claim.
Determine the value of Buhlmann's k for estimating the expected value of a claim.
A. 2/3
B. 1
C. 4/3
D. 3/2
E. 2
10.49 (4B, 5/99, Q.13) (3 points) You are given the following:
The number of claims follows a distribution with mean λ and variance 2λ.
Claim sizes follow a distribution with mean θ and variance 2θ².
The number of claims and claim sizes are independent.
λ and θ have a prior probability distribution with joint density function f(λ, θ) = 1, 0 < λ < 1, 0 < θ < 1.

Determine the value of Buhlmann's k for aggregate losses.


A. Less than 3
B. At least 3, but less than 6
C. At least 6, but less than 9
D. At least 9, but less than 12
E. At least 12
10.50 (2 points) In the previous question, determine the value of Buhlmann's k for severity.
(A) 5
(B) 6
(C) 7
(D) 8
(E) 9

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 470
10.51 (4B, 11/99, Q.20) (3 points) You are given the following:
The number of claims follows a Poisson distribution with mean λ.
Claim sizes follow a distribution with density function f(x) = e^(-x/λ)/λ, 0 < x < ∞.
The number of claims and claim sizes are independent.
The prior distribution of λ has density function g(λ) = e^(-λ), 0 < λ < ∞.
Determine the value of Buhlmann's k for aggregate losses.
Hint: ∫_{0}^{∞} λⁿ e^(-λ) dλ = n!
A. 0   B. 3/5   C. 1   D. 2   E. ∞

10.52 (4, 5/00, Q.37) (2.5 points) You are given:
(i) Xi is the claim count observed for driver i for one year.
(ii) Xi has a negative binomial distribution with parameters β = 0.5 and ri.
(iii) λi is the expected claim count for driver i for one year.
(iv) The λi's have an exponential distribution with mean 0.2.
Determine the Buhlmann credibility factor for an individual driver for one year.
(A) Less than 0.05
(B) At least 0.05, but less than 0.10
(C) At least 0.10, but less than 0.15
(D) At least 0.15, but less than 0.20
(E) At least 0.20

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 471
10.53 (4, 11/01, Q.18 & 2009 Sample Q.67) (2.5 points) You are given the following information about a book of business comprised of 100 insureds:
(i) Xi = Σ_{j=1}^{Ni} Yij is a random variable representing the annual loss of the ith insured.
(ii) N1, N2, ..., N100 are independent random variables distributed according to a negative binomial distribution with parameters r (unknown) and β = 0.2.
(iii) Unknown parameter r has an exponential distribution with mean 2.
(iv) Yi1, Yi2, ..., YiNi are independent random variables distributed according to a Pareto distribution with α = 3 and θ = 1000.
Determine the Bühlmann credibility factor, Z, for the book of business.
(A) 0.000
(B) 0.045
(C) 0.500
(D) 0.826
(E) 0.905
10.54 (2 points) In the previous question, 4, 11/01, Q.18, change bullet iii:
(iii) The prior distribution of r is discrete:
Prob[r = 1] = 1/3, Prob[r = 2] = 1/3, and Prob[r = 3] = 1/3.
Determine the Bhlmann credibility factor, Z, for the book of business.
(A) 0.50
(B) 0.60
(C) 0.70
(D) 0.80
(E) 0.90

10.55 (4, 11/02, Q.18 & 2009 Sample Q.41) (2.5 points) You are given:
(i) Annual claim frequency for an individual policyholder has mean λ and variance σ².
(ii) The prior distribution for λ is uniform on the interval [0.5, 1.5].
(iii) The prior distribution for σ² is exponential with mean 1.25.
A policyholder is selected at random and observed to have no claims in Year 1.
Using Bühlmann credibility, estimate the number of claims in Year 2 for the selected policyholder.
(A) 0.56
(B) 0.65
(C) 0.71
(D) 0.83
(E) 0.94
10.56 (2 points) You are given:
(i) Annual claim frequency for an individual policyholder has mean λ and variance σ².
(ii) The prior distribution for λ has a 50% chance of 0.5 and a 50% chance of 1.5.
(iii) The prior distribution for σ² has a 75% chance of 1 and a 25% chance of 2.
(iv) The prior distributions for λ and σ² are independent.
A policyholder is selected at random and observed to have no claims in Year 1.
Using Bühlmann credibility, estimate the number of claims in Year 2 for the selected policyholder.
(A) 0.56
(B) 0.65
(C) 0.71
(D) 0.83
(E) 0.94

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 472
10.57 (4, 11/03, Q.11 & 2009 Sample Q.8) (2.5 points) You are given:
(i) Claim counts follow a Poisson distribution with mean θ.
(ii) Claim sizes follow an exponential distribution with mean 10θ.
(iii) Claim counts and claim sizes are independent, given θ.
(iv) The prior distribution has probability density function: π(θ) = 5/θ⁶, θ > 1.
Calculate Bühlmann's k for aggregate losses.
(A) Less than 1
(B) At least 1, but less than 2
(C) At least 2, but less than 3
(D) At least 3, but less than 4
(E) At least 4
10.58 (2 points) In the previous question, 4, 11/03, Q.11, change bullet iv:
(iv) The prior distribution of θ is discrete: Prob[θ = 1] = 70%, Prob[θ = 2] = 30%.
Calculate Bühlmann's k for aggregate losses.
(A) Less than 1
(B) At least 1, but less than 2
(C) At least 2, but less than 3
(D) At least 3, but less than 4
(E) At least 4

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 473
10.59 (4, 11/04, Q.29 & 2009 Sample Q.154) (2.5 points) You are given:
(i) Claim counts follow a Poisson distribution with mean λ.
(ii) Claim sizes follow a lognormal distribution with parameters μ and σ.
(iii) Claim counts and claim sizes are independent.
(iv) The prior distribution has joint probability density function: f(λ, μ, σ) = 2σ, 0 < λ < 1, 0 < μ < 1, 0 < σ < 1.
Calculate Bühlmann's k for aggregate losses.
(A) Less than 2
(B) At least 2, but less than 4
(C) At least 4, but less than 6
(D) At least 6, but less than 8
(E) At least 8
10.60 (3 points) In the previous question, change bullet iv:
(iv) The prior joint distribution of λ, μ, and σ is discrete:
Prob[λ = 1/4, μ = 3/4, σ = 1/2] = 30%.
Prob[λ = 3/4, μ = 1/2, σ = 1/4] = 20%.
Prob[λ = 1/2, μ = 1/4, σ = 3/4] = 50%.
Calculate Bühlmann's k for aggregate losses.
(A) Less than 10
(B) At least 10, but less than 20
(C) At least 20, but less than 30
(D) At least 30, but less than 40
(E) At least 40
10.61 (4, 5/05, Q.11 & 2009 Sample Q.181) (2.9 points) You are given:
(i) The number of claims in a year for a selected risk follows a Poisson distribution with mean λ.
(ii) The severity of claims for the selected risk follows an exponential distribution with mean θ.
(iii) The number of claims is independent of the severity of claims.
(iv) The prior distribution of λ is exponential with mean 1.
(v) The prior distribution of θ is Poisson with mean 1.
(vi) A priori, λ and θ are independent.
Using Bühlmann's credibility for aggregate losses, determine k.
(A) 1
(B) 4/3
(C) 2
(D) 3
(E) 4

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 474
10.62 (4, 5/05, Q.17 & 2009 Sample Q.187) (2.9 points) You are given:
(i) The annual number of claims on a given policy has a geometric distribution with parameter β.
(ii) The prior distribution of β has the Pareto density function π(β) = α/(β + 1)^(α+1), 0 < β < ∞, where α is a known constant greater than 2.
A randomly selected policy had x claims in Year 1.
Determine the Bühlmann credibility estimate of the number of claims for the selected policy in Year 2.
1
( 1) x
1
x+1
x +1
(A)
(B)
+
(C) x
(D)
(E)
( 1)

1
10.63 (4, 11/05, Q.7 & 2009 Sample Q.219) (2.9 points) For a portfolio of policies, you are given:
(i) The annual claim amount on a policy has probability density function: f(x | θ) = 2x/θ², 0 < x < θ.
(ii) The prior distribution of θ has density function: π(θ) = 4θ³, 0 < θ < 1.
(iii) A randomly selected policy had claim amount 0.1 in Year 1.
Determine the Bühlmann credibility estimate of the claim amount for the selected policy in Year 2.
(A) 0.43
(B) 0.45
(C) 0.50
(D) 0.53
(E) 0.56

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 475
Solutions to Problems:
10.1. B. The process variance for a Gamma is αθ². Thus EPV = E[αθ²] = E[100α] = 100E[α] =
(100)((4.5 + 2.5)/2) = 350. The mean of a Gamma is αθ.
Thus VHM = Var[αθ] = Var[10α] = 100Var[α] = (100){(4.5 - 2.5)²/12} = 33.33.
K = EPV/VHM = 350/33.33 = 10.5.
Comment: I have used the fact that the variance of a uniform distribution on [a, b] is:
(b - a)²/12. Note that the value of the scale parameter, θ, drops out of the calculation of the Buhlmann Credibility Parameter, K.
10.2. E. From the previous solution K = 10.5. Thus Z = 2/(2 + 10.5) = 16%.
The observation is: (15 + 31)/2 = 23. The a priori mean is: E[αθ] = E[10α] = 10E[α] =
(10)((4.5 + 2.5)/2) = 35. Thus the new estimate is: (23)(16%) + (35)(84%) = 33.1.
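
For students who like to confirm such calculations with software, here is a short Python sketch (a numerical check only, not part of the original solution) that redoes the arithmetic of 10.1 and 10.2:

# alpha is uniform on (2.5, 4.5), theta = 10; EPV = E[alpha*theta^2], VHM = Var[alpha*theta]
a, b, theta = 2.5, 4.5, 10.0
E_alpha = (a + b) / 2                 # mean of the uniform prior
Var_alpha = (b - a) ** 2 / 12         # variance of the uniform prior
EPV = theta ** 2 * E_alpha            # 350
VHM = theta ** 2 * Var_alpha          # 33.33
K = EPV / VHM                         # 10.5
# 10.2: two claims of sizes 15 and 31
Z = 2 / (2 + K)                       # 16%
prior_mean = theta * E_alpha          # 35
estimate = Z * (15 + 31) / 2 + (1 - Z) * prior_mean
print(round(K, 1), round(estimate, 1))   # 10.5  33.1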
10.3. B. E[r] = mean of the Gamma = αθ = (4)(5) = 20.
Var[r] = variance of the Gamma = αθ² = (4)(5²) = 100.
The mean of each Negative Binomial Distribution is: rβ = 0.3r.
The process variance of each Negative Binomial is: rβ(1 + β) = (0.3)(1.3)r = 0.39r.
EPV = E[0.39r] = 0.39E[r] = (0.39)(20) = 7.8.
VHM = Var[0.3r] = 0.3²Var[r] = (0.09)(100) = 9.
K = EPV/VHM = 7.8/9 = 0.867. For one driver, for one year, Z = 1/(1 + 0.867) = 53.6%.
Comment: Similar to 4, 5/00, Q.37.
10.4. A. E[β] = mean of the Exponential = θ = 0.7.
Var[β] = variance of the Exponential = θ² = 0.7² = 0.49.
E[β²] = second moment of the Exponential = 2θ² = 2(0.7²) = 0.98.
The mean of each Negative Binomial Distribution is: rβ = 3β.
The process variance of each Negative Binomial is: rβ(1 + β) = 3β + 3β².
EPV = E[3β + 3β²] = 3E[β] + 3E[β²] = (3)(0.7) + (3)(0.98) = 5.04.
VHM = Var[3β] = 3²Var[β] = (9)(0.49) = 4.41. K = EPV/VHM = 5.04/4.41 = 1.143.
For one driver, for one year, Z = 1/(1 + 1.143) = 46.7%.
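
A similar Python sketch (again just a numerical check, with the Exponential moments written out) for 10.4:

theta = 0.7                      # mean of the Exponential prior on beta
E_beta, E_beta2, Var_beta = theta, 2 * theta ** 2, theta ** 2
r = 3
EPV = r * E_beta + r * E_beta2   # E[r*beta*(1 + beta)] = 5.04
VHM = r ** 2 * Var_beta          # Var[r*beta] = 4.41
K = EPV / VHM
Z = 1 / (1 + K)
print(round(K, 3), round(Z, 3))  # 1.143  0.467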

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 476
10.5. B. Given λ, the process variance is 1/λ².
EPV = E[1/λ²] = (1/.08) ∫_{.02}^{.10} (1/λ²) dλ = 500. E[1/λ] = (1/.08) ∫_{.02}^{.10} (1/λ) dλ = 20.12.
Given λ, the mean is 1/λ. Prior Mean = E[1/λ] = 20.12.
VHM = E[(1/λ)²] - E[1/λ]² = 500 - 20.12² = 95.2.
K = EPV/VHM = 500/95.2 = 5.25. Z = 1/(1 + K) = .160.
Estimated future severity = (.160)(60) + (.840)(20.12) = 26.5.
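
The two integrals above are easy to verify by quadrature. A Python sketch (a numerical check only; it assumes SciPy is installed) for 10.5:

from scipy.integrate import quad

lo, hi = 0.02, 0.10
density = 1 / (hi - lo)                       # uniform prior on lambda
E_inv_lam  = quad(lambda lam: density / lam,      lo, hi)[0]   # E[1/lambda]   = 20.12
E_inv_lam2 = quad(lambda lam: density / lam ** 2, lo, hi)[0]   # E[1/lambda^2] = 500

EPV = E_inv_lam2                              # process variance is 1/lambda^2
VHM = E_inv_lam2 - E_inv_lam ** 2             # 95.2
K = EPV / VHM                                 # 5.25
Z = 1 / (1 + K)
estimate = Z * 60 + (1 - Z) * E_inv_lam
print(round(K, 2), round(estimate, 1))        # 5.25  26.5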
10.6. C. Process Variance for a single trial is: q(1- q) = q - q2 . EPV = E[q] - E[q2 ] = 1/2 - 1/3 = 1/6.
The hypothetical mean for one trial is q. VHM = variance of a uniform from [0 , 1] = 1/12.
K = EPV/ VHM = 2. Z = 6/(6 + K) = 75%.
Estimated future frequency per trial is: (75%)(2/6) + (25%)(1/2) = 0.375.
Comment: A Beta-Bernoulli conjugate prior situation. The uniform distribution is a Beta distribution
with a = 1 and b = 1. K = a + b = 2. See Mahler's Guide to Conjugate Priors.
10.7. E. The Single Parameter Pareto Distribution has mean: (6)(40)/(6 - 1) = 48,
second moment: (6)(40²)/(6 - 2) = 2400, and variance: 2400 - 48² = 96.
E[X | b] = ∫_{0}^{b} x (3x²/b³) dx = 3b/4. E[X² | b] = ∫_{0}^{b} x² (3x²/b³) dx = 3b²/5.
Process Variance given b is: 3b²/5 - (3b/4)² = 3b²/80.
EPV = E[3b²/80] = (3/80)(2nd moment of the Single Parameter Pareto Distribution) = (3/80)(2400) = 90.
VHM = Var[3b/4] = (9/16)Var[b] = (9/16)(96) = 54.
K = EPV/VHM = 90/54 = 5/3. We observe two claims, so Z = 2/(2 + K) = 54.5%.
Prior mean = E[E[X | b]] = E[3b/4] = (3/4)E[b] = (3/4)(48) = 36.
Observed mean = (30 + 60)/2 = 45.
Estimated future severity = (54.5%)(45) + (45.5%)(36) = 40.9.
10.8. C. Process variance given r is: r(1.6)(2.6) = 4.16 r.
EPV= E[4.16r] = 4.16 E[r] = (4.16)(.8) = 3.328.
Mean given r is: 1.6 r. Prior mean = E[1.6r] = (1.6)(.8) = 1.28.
VHM = Var[1.6 r] = 2.56Var[r] = 2.56(.82 ) = 1.6384.
K = EPV/VHM = 3.328/1.6384 = 2.03. Z = 1/(1 + K) = 33.0%.
Estimated future frequency = (.330)(2) + (.670)(1.28) = 1.52.
Comment: Similar to 4, 5/00, Q.37.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 477
10.9. C. The Pareto Severity has mean 1000/4 = 250, second moment (2)(10002 )/((4)(3)) =
166,667, and variance 166667 - 2502 = 104,167.
Given r, the process variance of aggregate loss is:
(1.6 r)(104,167) + (2502 )(4.16 r) = 426667r.
EPV = E[426667r] = (426667)(.8) = 341,334.
Given r, the mean aggregate loss is: (1.6r)(250) = 400r.
VHM = 4002 Var[r] = (160000)(.82 ) = 102400.
K = EPV/VHM = 341,334/102,400 = 3.33.
Observe one year, Z = 1/(1 + K) = 23.1%.
A priori mean aggregate loss is: E[400r] = (400)(.8) = 320.
Estimated future annual aggregate loss = (.231)(2300) + (.769)(320) = 777.
Comment: Similar to 4, 11/01, Q.18. Note that the mean severity times the answer to the previous
question: (250)(1.52) = 380 is not equal to the solution to this question.
Since the severity distribution does not vary by insured, the former estimate has something to
recommend it.
.05

.10

.10

10.10. D. E[q] = ∫ q f(q) dq = ∫_{0}^{.05} 400q² dq + ∫_{.05}^{.10} 40q dq - ∫_{.05}^{.10} 400q² dq = .05.
E[q²] = ∫ q² f(q) dq = ∫_{0}^{.05} 400q³ dq + ∫_{.05}^{.10} 40q² dq - ∫_{.05}^{.10} 400q³ dq = .002917.
Process Variance for a single year is: q(1 - q) = q - q².
EPV = E[q] - E[q²] = .05 - .002917 = .0471.
The hypothetical mean for one trial is q. VHM = variance of the distribution of q =
E[q²] - E[q]² = .002917 - .05² = .000417.
K = EPV/VHM = .0471/.000417 = 113. Z = 3/(3 + K) = 2.6%.
Estimated future frequency per trial is: (2.6%)(2/3) + (97.4%)(.05) = 6.6%.
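
A Python sketch (assuming SciPy is installed) that checks the moments of the triangular prior and the resulting estimate in 10.10:

from scipy.integrate import quad

def f(q):
    # triangular prior density on q, as given in the problem
    return 400 * q if q <= 0.05 else 40 - 400 * q

def integrate(g):
    # split the integral at the kink of the density
    return quad(g, 0, 0.05)[0] + quad(g, 0.05, 0.10)[0]

E_q  = integrate(lambda q: q * f(q))          # 0.05
E_q2 = integrate(lambda q: q * q * f(q))      # 0.002917
EPV = E_q - E_q2                              # E[q(1 - q)]
VHM = E_q2 - E_q ** 2
K = EPV / VHM                                 # about 113
Z = 3 / (3 + K)
print(round(K), round(Z * (2 / 3) + (1 - Z) * E_q, 3))   # 113  0.066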
10.11. C. Since we are mixing Poissons, EPV of Frequency = mean frequency = 4.
VHM frequencies = Variance of uniform distribution on [2, 6] = (6 - 2)2 /12 = 4/3.
K = EPV/ VHM = 4/(4/3) = 3. Z = 1/(1 + K) = 1/4.
Estimated future frequency = (1/4)(2) + (3/4)(4) = 3.5.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 478
10.12. B. The distribution of θ is Inverse Gamma with α = 3 and scale parameter 5, with mean 5/2, and second moment 25/2.
Given θ, the process variance of the severity is: αθ² = 3θ².
EPV = E[3θ²] = 3E[θ²] = (3)(25/2) = 37.5.
Given θ, the mean severity is: αθ = 3θ.
First moment of the hypothetical means is: E[3θ] = 3E[θ] = (3)(5/2) = 7.5.
Second moment of the hypothetical means is: E[9θ²] = 9E[θ²] = (9)(25/2) = 112.5.
VHM = 112.5 - 7.5² = 56.25. K = EPV/VHM = 37.5/56.25 = 2/3.
We observe 2 claims, Z = 2/(2 + K) = 75.0%.
Estimated future severity is: (75.0%)(11/2) + (25.0%)(7.5) = 6.0.
10.13. B. The distribution of λ has mean 4, variance 4/3, and second moment 17.333.
The distribution of θ is Inverse Gamma with α = 3 and scale parameter 5, with mean 5/2, and second moment 25/2. Given λ and θ, the process variance of the aggregate loss is:
λ(second moment of the Gamma severity) = λα(α + 1)θ² = 12λθ².
EPV = E[12λθ²] = 12E[λ]E[θ²] = (12)(4)(25/2) = 600.
Given λ and θ, the mean aggregate loss is: λαθ = 3λθ.
First moment of the hypothetical means is: E[3λθ] = 3E[λ]E[θ] = (3)(4)(5/2) = 30.
Second moment of the hypothetical means is: E[9λ²θ²] = 9E[λ²]E[θ²] = (9)(17.333)(25/2) = 1950.
VHM = 1950 - 30² = 1050.
K = EPV/VHM = 600/1050 = .57. We observe one year, Z = 1/(1 + K) = 63.7%.
Estimated future aggregate loss is: (63.7%)(11) + (36.3%)(30) = 17.9.
Comment: Note that the product of the separate estimates of frequency and severity:
(3.5)(6) = 21, is not equal to the estimate of aggregate losses 17.9.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 479
10.14. B. The moment generating function is defined as: M(t) = E[e^(xt)].
For the Gamma Distribution, M(t) = (1 - θt)^(-α), t < 1/θ.
Therefore, E[e^μ] = M(1) = (1 - 0.3)^(-8) = 17.3467. E[e^(2μ)] = M(2) = (1 - 0.6)^(-8) = 1525.88.
For the LogNormal Distribution, E[X] = exp[μ + σ²/2] = 2.0544 e^μ.
Thus, the first moment of the hypothetical means is: 2.0544 E[e^μ] = (2.0544)(17.3467) = 35.637.
Second moment of the hypothetical means is: 2.0544² E[e^(2μ)] = (2.0544²)(1525.88) = 6440.1.
VHM = 6440.1 - 35.637² = 5170.
For the LogNormal, E[X²] = exp[2μ + 2σ²] = 17.8143 e^(2μ).
Process Variance = 17.8143 e^(2μ) - (2.0544 e^μ)² = 13.5937 e^(2μ).
EPV = 13.5937 E[e^(2μ)] = (13.5937)(1525.88) = 20,742.
K = EPV/VHM = 20,742/5170 = 4.0.
Comment: 1/θ = 1/0.3 = 3.333, so that M(2) exists.
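
A Python sketch (a numerical check only, not part of the original solution) of the moment generating function argument in 10.14:

import math

alpha, theta, sigma = 8, 0.3, 1.2

def gamma_mgf(t):
    # MGF of a Gamma(alpha, theta) distribution, valid for t < 1/theta
    return (1 - theta * t) ** (-alpha)

c1 = math.exp(sigma ** 2 / 2)        # E[X | mu]   = c1 * e^mu
c2 = math.exp(2 * sigma ** 2)        # E[X^2 | mu] = c2 * e^(2 mu)
E_e_mu  = gamma_mgf(1)               # E[e^mu]     = 17.35
E_e_2mu = gamma_mgf(2)               # E[e^(2 mu)] = 1525.9

VHM = c1 ** 2 * E_e_2mu - (c1 * E_e_mu) ** 2
EPV = (c2 - c1 ** 2) * E_e_2mu
print(round(EPV / VHM, 1))           # about 4.0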
10.15. C. Given β, the process variance is: 2β(1 + β).
E[β] = ∫_{0}^{∞} 280β⁵/(1 + β)⁹ dβ = (280)Γ(6)Γ(3)/Γ(9) = 280(5!)(2!)/(8!) = 1.667.
E[β²] = ∫_{0}^{∞} 280β⁶/(1 + β)⁹ dβ = (280)Γ(7)Γ(2)/Γ(9) = 280(6!)(1!)/(8!) = 5.
EPV = E[2β(1 + β)] = (2){E[β] + E[β²]} = (2)(1.667 + 5) = 13.33.
Given β, the mean is: 2β. A priori mean is: 2E[β] = (2)(1.667) = 3.33.
VHM = Var[2β] = 4Var[β] = (4){E[β²] - E[β]²} = (4)(5 - 1.667²) = (4)(2.22) = 8.88.
K = EPV/VHM = 13.33/8.88 = 1.5. Z = 1/(1 + K) = .4.
Estimated future frequency = (.4)(8) + (.6)(3.33) = 5.2.
Comment: If for fixed r, 1/(1+β) of the Negative Binomial is distributed over a portfolio by a Beta, then the posterior distribution of 1/(1+β) is also given by a Beta.
Thus the Beta distribution is a conjugate prior to the Negative Binomial Distribution for fixed r.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 480
10.16. D. The prior distribution of b is a Single Parameter Pareto with θ = 10 and α = 3,
with first moment αθ/(α - 1) = 15, second moment αθ²/(α - 2) = 300,
and variance 300 - 15² = 75.
Given b, the losses are uniform on [0, b], with process variance b2 /12.
EPV = E[b2 /12] = E[b2 ]/12 = 300/12 = 25.
Given b, the hypothetical mean is b/2. VHM = Var[b/2] = Var[b]/4 = 75/4 = 18.75.
K = EPV/VHM = 25/18.75 = 4/3. Since we observe 3 claims, Z = 3/(3 + K) = 69.2%.
Prior mean = E[b/2] = E[b]/2 = 15/2 = 7.5. Observed mean = (17 + 13 + 22)/3 = 17.33.
Estimated future severity = (69.2%)(17.33) + (30.8%)(7.5) = 14.3.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 481
10.17. C. Severity is uniform on [0, b].
If for example, b = 22.01, the chance of a claim of size 22 is 1/22.01.
If for example, b = 21.99, the chance of a claim of size 22 is 0.
For b ≥ 22, Prob[observation] = 6 f(17) f(13) f(22) = (6)(1/b)(1/b)(1/b) = 6/b³.
For b < 22, Prob[observation] = 6 f(17) f(13) f(22) = (6) f(17) f(13) (0) = 0.
π(b) = 3000/b⁴, b > 10.
∫_{10}^{∞} π(b) Prob[observation | b] db = ∫_{10}^{22} π(b) (0) db + ∫_{22}^{∞} (3000/b⁴)(6/b³) db = 3000/22⁶.
By Bayes Theorem, the posterior distribution of b is:
π(b) Prob[observation | b] / (3000/22⁶) = (18,000/b⁷) / (3000/22⁶) = 6(22⁶)/b⁷, b ≥ 22.
(Recall that if b < 22, Prob[observation] = 0.)
The mean of the uniform from 0 to b is b/2.
Thus, the expected value of the next claim from the same insured is:
∫_{22}^{∞} (b/2) {6(22⁶)/b⁷} db = (3)(22⁶) / {(5)(22⁵)} = 13.2.
Comment: The Single Parameter Pareto is a Conjugate Prior to the uniform likelihood.
In general, α′ = α + n, and θ′ = Max[x₁, ..., xₙ, θ].
In this case, the posterior distribution of b is Single Parameter Pareto with:
α = 3 + 3 = 6, and θ = Max[10, 17, 13, 22] = 22.
The mean of the uniform from 0 to b is b/2.
Thus the estimate from Bayes Analysis is the posterior expected value of b/2:
(Mean of the posterior Single Parameter Pareto) / 2 = {(6)(22)/(6 - 1)} / 2 = 13.2.
Since the Single Parameter Pareto is not a member of a linear exponential family, it does not follow
that Buhlmann Credibility is equal to Bayes Analysis; in this case they differ.
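
The Bayes estimate in 10.17 can also be checked by direct numerical integration. A Python sketch (assuming SciPy is installed; the ordering constant 3! cancels in the ratio, so it is omitted):

from scipy.integrate import quad

inf = float("inf")
prior = lambda b: 3000 / b ** 4          # Single Parameter Pareto prior, b > 10

def likelihood(b):
    # density of claims of sizes 17, 13 and 22, each uniform on [0, b]
    return b ** -3 if b >= 22 else 0.0

num = quad(lambda b: (b / 2) * likelihood(b) * prior(b), 22, inf)[0]
den = quad(lambda b: likelihood(b) * prior(b), 22, inf)[0]
print(round(num / den, 1))               # 13.2, the Bayes estimate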

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 482
10.18. C. Var[X | θ] = θ².
EPV = E[Var[X | θ]] = E[θ²] = second moment of the Pareto = (2)(100²)/{(α - 1)(α - 2)}.
E[X | θ] = θ. VHM = Var[E[X | θ]] = Var[θ] = variance of the Pareto =
(2)(100²)/{(α - 1)(α - 2)} - 100²/(α - 1)² = (100²)α/{(α - 1)²(α - 2)}.
However, for α = 2, the 2nd moment and variance of the Pareto do not exist.
Nevertheless, as α → 2, K = EPV/VHM = 2(α - 1)/α → 1.
If one takes K = 1, then Z = 1/(1 + K) = 1/2. Prior mean = mean of Pareto = 100/(2 - 1) = 100.
Estimated severity = (1/2)(60) + (1/2)(100) = 80.
Comment: Beyond what you are likely to be asked on the exam! EPV → ∞ and VHM → ∞.
10.19. B. The second moment of the Gamma Distribution is: (6)(5)(4λ)² = 480λ².
For a Poisson frequency, the process variance of aggregate losses is:
(mean frequency)(second moment of severity) = (λ/20)(480λ²) = 24λ³.
EPV = ∫_{1}^{∞} (PV given λ) π(λ) dλ = ∫_{1}^{∞} 24λ³ (7/λ⁸) dλ = [-42λ⁻⁴] evaluated from 1 to ∞ = 42.
The mean aggregate loss given λ is: (λ/20)(5)(4λ) = λ².
Overall Mean = ∫_{1}^{∞} λ² (7/λ⁸) dλ = [-(7/5)λ⁻⁵] evaluated from 1 to ∞ = 7/5.
2nd moment of the hypothetical means = ∫_{1}^{∞} (λ²)² (7/λ⁸) dλ = [-(7/3)λ⁻³] evaluated from 1 to ∞ = 7/3.
VHM = 7/3 - (7/5)² = .3733. K = EPV/VHM = 42/.3733 = 112.5.
For 100 exposures, Z = 100/(100 + K) = 47.1%.
Comment: Similar to 4, 11/03, Q.11.
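
A Python sketch (assuming SciPy is installed, and using the same parameterization as above, i.e. severity Gamma with α = 5 and θ = 4λ) that confirms the integrals in 10.19:

from scipy.integrate import quad

inf = float("inf")
prior = lambda lam: 7 / lam ** 8                           # pi(lambda), lambda > 1
pv   = lambda lam: (lam / 20) * (5 * 6) * (4 * lam) ** 2   # lambda * E[X^2] = 24 lambda^3
mean = lambda lam: (lam / 20) * 5 * (4 * lam)              # hypothetical mean = lambda^2

EPV = quad(lambda lam: pv(lam) * prior(lam), 1, inf)[0]         # 42
m1  = quad(lambda lam: mean(lam) * prior(lam), 1, inf)[0]       # 7/5
m2  = quad(lambda lam: mean(lam) ** 2 * prior(lam), 1, inf)[0]  # 7/3

K = EPV / (m2 - m1 ** 2)
print(round(K, 1), round(100 / (100 + K), 3))   # 112.5  0.471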
10.20. A. Process Variance = b2 /12.
EPV = E[b2 /12] = E[b2 ]/12 = (52 /12 + 12.52 )/12 = 13.19.
Hypothetical Mean = b/2. VHM = Var[b/2] = Var[b]/4 = (52 /12)/4 = .521.
K = EPV/VHM = 13.19/.521 = 25.3.


2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 483
10.21. E. Let E[μ] = c. Then Var[μ] = 2²E[μ]² = 4c². E[μ²] = 5c².
Severity has mean μ, standard deviation 4μ, and 2nd moment: (4μ)² + μ² = 17μ².
Process Variance = λ(2nd moment of severity) = 17λμ².
EPV = E[17λμ²] = 17E[λ]E[μ²] = (17)(2)(5c²) = 170c².
Hypothetical Mean = λμ. VHM = Var[λμ] = E[(λμ)²] - E[λμ]² = E[λ²]E[μ²] - E[λ]²E[μ]² =
(3 + 2²)(5c²) - (2²)(c²) = 31c². K = EPV/VHM = 170/31 = 5.48.
10.22. C. The mean severity is: exp[6 + 1²/2] = e^6.5 = 665.142.
The second moment of severity is: exp[(2)(6) + (2)(1²)] = e^14 = 1,202,604.
The variance of severity is: 1,202,604 - 665.142² = 760,191.
E[q] = .02. Var[q] = (.03 - .01)²/12 = 0.0000333. E[q²] = Var[q] + E[q]² = 0.00043333.
For a given value of q, the process variance of the pure premium is:
3q(760,191) + 3q(1 - q)(665.142²) = 3,607,812q - 1,327,240q².
EPV = 3,607,812E[q] - 1,327,240E[q²] = 3,607,812(0.02) - 1,327,240(0.00043333) = 71,581.
Hypothetical mean = (3q)(665.142) = 1995.4q. Overall mean is: (1995.4)(.02) = 39.91.
VHM = 1995.4² Var[q] = (1995.4²)(0.0000333) = 132.7. K = EPV/VHM = 71,581/132.7 = 539.
10.23. A. The total number of employee-years is: 100 + 120 + 140 = 360.
Z = 360/(360 + 539) = 40.0%.
The inflated losses are: (100)(21)(1.044 ) + (120)(28)(1.043 ) + (140)(27)(1.042 ) = 10,324.69.
Observed pure premium is: 10,324.69/360 = 28.68, brought to the year 5 level.
From the previous solution, the a priori mean pure premium on the year 5 level is: 39.91.
Estimated future pure premium on the year 5 level is: (.4)(28.68) + (1 - .4)(39.91) = 35.42.
For 130 employees, the expected losses are: (130)(35.42) = 4605.
Comment: In year 4, one has data from years 1 to 3 and is predicting the losses for year 5.
10.24. A. The mean aggregate loss is: λθ.
The variance of aggregate loss is: λ(2nd moment of severity) = 2λθ².
EPV = E[2λθ²] = 2E[λ]E[θ²] = (2)(4)(7 + 7²) = 448.
First moment of the hypothetical means: E[λθ] = E[λ]E[θ] = (4)(7) = 28.
Second moment of the hypothetical means: E[(λθ)²] = E[λ²]E[θ²] = {(2)(4²)}(7 + 7²) = 1792.
VHM = 1792 - 28² = 1008. K = EPV/VHM = 448/1008 = 4/9.
Comment: Similar to 4, 5/05, Q.11.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 484
10.25. C. The process variance for severity is given as 22.
Since for the joint distribution and are not independent, one has to weight together the process
variances of the severities for the individual types using the chance that a claim came from each type.
The chance that a claim came from an individual of a given type is proportional to the product of the a
priori chance of an insured being of that type and the mean frequency for that type. This is similar to
picking a die and spinner together.
In this case, these weights are: f(, ) = 1.5 2, 0 < < 2 < 1.
1 2

1.5 2 d d = 33 d = 3/4.
0 0

0
1

1.52 22 d d / 1.52 d d= 1.5 2 163/3 d / (3/4) = (8/6)(4/3) = 16/9.

EPV =
0

The hypothetical mean severity is given as .


1

Overall Mean severity =

1.52 d d / 1.52 d d= 1.5 2 22 d / (3/4) =


0

(3/5)(4/3) = 4/5.
2

1.52 2 d d / 1.52 d d = 8/9.

2nd moment of the hypothetical means =


0

VHM = 8/9 - (4/5)2 = 56/225.


K = EPV/VHM = (16/9)/(56/225) = 450/63 = 7.14.
Comment: Beyond what you are likely to be asked on your exam! The setup of 4B, 5/99, Q.13
has been altered in the final bullet from an independent to a dependent distribution.
10.26. E. EPV = E[β(1 + β)] = E[β] + E[β²] =
1st Moment of Gamma + 2nd Moment of Gamma = αθ + α(α + 1)θ².
VHM = Var[β] = variance of the Gamma = αθ².
K = EPV/VHM = {αθ + α(α + 1)θ²}/(αθ²) = 1/θ + α + 1.
Comment: Similar to 4, 5/05, Q.17.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 485
10.27. D. f(x) = 0.75λe^(-λx) + 0.5λe^(-2λx) = 0.75(λe^(-λx)) + (0.25)(2λe^(-2λx)).
X is a 75%-25% mixture of Exponentials with means 1/λ and 1/(2λ).
E[X | λ] = .75/λ + .25/(2λ) = .875/λ.
E[X² | λ] = .75(2/λ²) + .25{2/(2λ)²} = 1.625/λ².
Var[X | λ] = 1.625/λ² - (.875/λ)² = .859375/λ².
EPV = ∫_{.01}^{.05} (.859375/λ²)/.04 dλ = 1718.75.
First Moment of the hypothetical means = ∫_{.01}^{.05} (.875/λ)/.04 dλ = 35.206.
Second Moment of the hypothetical means = ∫_{.01}^{.05} (.875/λ)²/.04 dλ = 1531.25.
VHM = 1531.25 - 35.206² = 291.8.
K = 1718.75/291.8 = 5.89. Z = 1/(1 + K) = 14.5%.
The expected size of the next claim from the same policyholder is:
(14.5%)(60) + (85.5%)(35.206) = 38.80.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 486
10.28. C. One has to weight together the process variances of the severities for the individual
types using the chance that a claim came from each type. The chance that a claim
came from an individual of a given type is proportional to the product of the a priori chance of an
insured being of that type and the mean frequency for that type.
In this case, these weights are: π(λ) λ/20 = (7/λ⁸)(λ/20) = (7/20)λ⁻⁷, λ > 1.
∫_{1}^{∞} λ π(λ)/20 dλ = (7/20) ∫_{1}^{∞} λ⁻⁷ dλ = (7/20)(1/6) = 7/120.
The variance of the Gamma Distribution is: (5)(4λ)² = 80λ².
EPV = ∫_{1}^{∞} (PV given λ) λπ(λ)/20 dλ / ∫_{1}^{∞} λπ(λ)/20 dλ = (120/7) ∫_{1}^{∞} 80λ² (7/20)λ⁻⁷ dλ
= 480 ∫_{1}^{∞} λ⁻⁵ dλ = 120.
The mean of the Gamma Distribution is: (5)(4λ) = 20λ.
Overall Mean = ∫_{1}^{∞} 20λ λπ(λ)/20 dλ / ∫_{1}^{∞} λπ(λ)/20 dλ = (120/7) ∫_{1}^{∞} 7λ⁻⁶ dλ = 120/5 = 24.
2nd moment of the hypothetical means = ∫_{1}^{∞} (20λ)² λπ(λ)/20 dλ / ∫_{1}^{∞} λπ(λ)/20 dλ
= (120/7) ∫_{1}^{∞} 140λ⁻⁵ dλ = (120/7)(140/4) = 600.
VHM = 600 - 24² = 24. K = EPV/VHM = 120/24 = 5.
Comment: Beyond what you are likely to be asked on your exam.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 487
10.29. D. λ and θ are distributed independently.
λ has a density which is Gamma with α = 4 and θ = 1/10:
π(λ) = λ³ e^(-10λ) 10⁴ / Γ[4] = λ³ e^(-10λ) (10,000/6).
E[λ] = (4)(1/10) = 0.4. E[λ²] = (4)(5)(1/10)² = 0.2.
θ has a density which is Inverse Gamma with α = 3 and θ = 10:
π(θ) = θ⁻⁴ e^(-10/θ) 10³ / Γ[3] = 500 θ⁻⁴ e^(-10/θ).
E[θ] = 10/2 = 5. E[θ²] = (10²)/{(2)(1)} = 50.
The mean aggregate loss is: λθ.
The variance of aggregate loss is: λ(2nd moment of severity) = 2λθ².
EPV = E[2λθ²] = 2E[λ]E[θ²] = (2)(0.4)(50) = 40.
First moment of the hypothetical means: E[λθ] = E[λ]E[θ] = (0.4)(5) = 2.
Second moment of the hypothetical means: E[(λθ)²] = E[λ²]E[θ²] = (0.2)(50) = 10.
VHM = 10 - 2² = 6. K = EPV/VHM = 40/6 = 6.67.
Comment: One can compute the moments by doing the relevant integrals.
All of the integrals are of the Gamma type.
E[2λθ²] = ∫∫ 2λθ² (2,500,000/3) λ³ e^(-10λ) θ⁻⁴ e^(-10/θ) dλ dθ
= (5,000,000/3) ∫_{0}^{∞} λ⁴ e^(-10λ) dλ ∫_{0}^{∞} θ⁻² e^(-10/θ) dθ = (5,000,000/3)(Γ[5]/10⁵)(Γ[1]/10¹) = 40.
E[λθ] = ∫∫ λθ (2,500,000/3) λ³ e^(-10λ) θ⁻⁴ e^(-10/θ) dλ dθ
= (2,500,000/3) ∫_{0}^{∞} λ⁴ e^(-10λ) dλ ∫_{0}^{∞} θ⁻³ e^(-10/θ) dθ = (2,500,000/3)(Γ[5]/10⁵)(Γ[2]/10²) = 2.
E[λ²θ²] = ∫∫ λ²θ² (2,500,000/3) λ³ e^(-10λ) θ⁻⁴ e^(-10/θ) dλ dθ
= (2,500,000/3) ∫_{0}^{∞} λ⁵ e^(-10λ) dλ ∫_{0}^{∞} θ⁻² e^(-10/θ) dθ = (2,500,000/3)(Γ[6]/10⁶)(Γ[1]/10¹) = 10.
The mixed distribution for frequency is a Negative Binomial with r = α = 4 and β = θ = 1/10.
This Negative Binomial has mean of (4)(1/10) = 0.4, and variance of (4)(1/10)(11/10) = 0.44.
The mixed distribution for severity is a Pareto with α = 3 and θ = 10.
This Pareto has mean of 10/2 = 5, second moment of (2)(10²)/{(2)(1)} = 100,
and variance of 100 - 5² = 75.
Thus one might think that the variance of aggregate loss is: (0.4)(75) + (5²)(0.44) = 41.
However, EPV + VHM = 40 + 6 = 46 > 41.
Just looking at the separate frequency and severity mixed distributions does not capture the
full variance, since it ignores the various combinations of Poissons with Exponentials such as
a small λ with a small θ, or a large λ with a large θ.
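
A Python sketch (assuming SciPy is installed) that confirms the moments used in 10.29; since λ and θ are independent, each double integral factors into two one-dimensional integrals:

from scipy.integrate import quad
import math

inf = float("inf")
# lambda ~ Gamma(alpha=4, theta=1/10); theta ~ Inverse Gamma(alpha=3, theta=10)
f_lam = lambda x: x ** 3 * math.exp(-10 * x) * 10_000 / 6
f_th  = lambda x: 500 * math.exp(-10 / x - 4 * math.log(x))   # 500 x^-4 e^(-10/x), written to avoid overflow

E_lam  = quad(lambda x: x * f_lam(x), 0, inf)[0]       # 0.4
E_lam2 = quad(lambda x: x * x * f_lam(x), 0, inf)[0]   # 0.2
E_th   = quad(lambda x: x * f_th(x), 0, inf)[0]        # 5
E_th2  = quad(lambda x: x * x * f_th(x), 0, inf)[0]    # 50

EPV = 2 * E_lam * E_th2                                # E[2 lambda theta^2] = 40
VHM = E_lam2 * E_th2 - (E_lam * E_th) ** 2             # 10 - 4 = 6
print(round(EPV / VHM, 2))                             # 6.67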
10.30. The chance of the observation is proportional to:
q(1-q)² (1-q)³ q²(1-q) q(1-q)² (1-q)³ = q⁴(1-q)¹¹.
Thus the posterior distribution is proportional to: (1.5 - q) q⁴(1-q)¹¹ = 1.5q⁴(1-q)¹¹ - q⁵(1-q)¹¹.
Therefore, the posterior mean of q is:
{1.5 ∫_{0}^{1} q⁵(1-q)¹¹ dq - ∫_{0}^{1} q⁶(1-q)¹¹ dq} / {1.5 ∫_{0}^{1} q⁴(1-q)¹¹ dq - ∫_{0}^{1} q⁵(1-q)¹¹ dq}
= {(1.5)(5! 11!/17!) - 6! 11!/18!} / {(1.5)(4! 11!/16!) - 5! 11!/17!}
= {(1.5)(5!/17!) - 6!/18!} / {(1.5)(4!/16!) - 5!/17!} = {(1.5)(18) 5! - 6!} / {(1.5)(17)(18) 4! - (18) 5!}
= {(1.5)(18)(5) - (5)(6)} / {(1.5)(17)(18) - (18)(5)} = 105/369 = 0.28455.
The estimated future frequency is: (3)(0.28455) = 0.8537.
10.31. The process variance is 3q(1-q).
EPV = ∫_{0}^{1} 3q(1-q)(1.5 - q) dq = 4.5 ∫_{0}^{1} q(1-q) dq - 3 ∫_{0}^{1} q²(1-q) dq = (4.5)(1! 1!/3!) - (3)(2! 1!/4!) = 0.5.
Overall mean is: ∫_{0}^{1} 3q(1.5 - q) dq = 4.5 ∫_{0}^{1} q dq - 3 ∫_{0}^{1} q² dq = 4.5/2 - 3/3 = 1.25.
Second Moment of the Hypothetical Means is:
∫_{0}^{1} (3q)²(1.5 - q) dq = 13.5 ∫_{0}^{1} q² dq - 9 ∫_{0}^{1} q³ dq = 13.5/3 - 9/4 = 2.25.
VHM = 2.25 - 1.25² = .6875. K = EPV/VHM = .5/.6875 = 0.7273.
Z = 5/(5 + K) = .873.
Estimated future frequency is: (.873)(4/5) + (1 - .873)(1.25) = 0.857.
Comment: While the Buhlmann and Bayes estimates are very similar, they are not equal.
The prior distribution of q is similar to but not equal to a Beta Distribution.
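
A Python sketch (assuming SciPy is installed) that reproduces both the Bayes estimate of 10.30 and the Buhlmann estimate of 10.31 by quadrature:

from scipy.integrate import quad

prior = lambda q: 1.5 - q                      # prior density on (0, 1)
like  = lambda q: q ** 4 * (1 - q) ** 11       # likelihood of the 5 years of claims

# 10.30: Bayes estimate of next year's frequency = 3 * posterior mean of q
num = quad(lambda q: q * like(q) * prior(q), 0, 1)[0]
den = quad(lambda q: like(q) * prior(q), 0, 1)[0]
print(round(3 * num / den, 4))                 # 0.8537

# 10.31: Buhlmann estimate
EPV = quad(lambda q: 3 * q * (1 - q) * prior(q), 0, 1)[0]
m1  = quad(lambda q: 3 * q * prior(q), 0, 1)[0]
m2  = quad(lambda q: (3 * q) ** 2 * prior(q), 0, 1)[0]
K = EPV / (m2 - m1 ** 2)
Z = 5 / (5 + K)
print(round(Z * (4 / 5) + (1 - Z) * m1, 3))    # 0.857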

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 489
10.32. C. The moment generating function is defined as: M(t) = E[e^(xt)].
For the Inverse Gaussian Distribution, M(t) = exp[(θ/μ)(1 - √(1 - 2tμ²/θ))], t < θ/(2μ²).
Therefore, E[exp[σ²]] = M(1) = exp[(2/0.6)(1 - √(1 - 2(0.6²)/2))] = 1.9477.
E[exp[σ²/2]] = M(0.5) = exp[(2/0.6)(1 - √(1 - 2(0.5)(0.6²)/2))] = 1.3701.
E[exp[2σ²]] = M(2) = exp[(2/0.6)(1 - √(1 - 2(2)(0.6²)/2))] = 4.8042.
For the LogNormal Distribution, E[X] = exp[μ + σ²/2] = exp[5 + σ²/2] = 148.41 exp[σ²/2].
Thus, the first moment of the hypothetical means is:
148.41 E[exp[σ²/2]] = (148.41)(1.3701) = 203.34.
Second moment of the hypothetical means is:
148.41² E[exp[σ²]] = (148.41²)(1.9477) = 42,899.
VHM = 42,899 - 203.34² = 1552.
For the LogNormal, E[X²] = exp[2μ + 2σ²] = 22,026 exp[2σ²].
Process Variance = 22,026 exp[2σ²] - (148.41 exp[σ²/2])² = 22,026 exp[2σ²] - 22,026 exp[σ²].
EPV = 22,026 E[exp[2σ²]] - 22,026 E[exp[σ²]] = (22,026)(4.8042) - (22,026)(1.9477) = 62,922.
K = EPV/VHM = 62,922/1552 = 40.5.
Comment: Long and difficult!
Note that θ/(2μ²) = 2/{(2)(0.6²)} = 2.777, so that M(2) exists.
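
A Python sketch (a numerical check only, not part of the original solution) of the Inverse Gaussian moment generating function calculation in 10.32:

import math

mu, theta = 0.6, 2.0          # Inverse Gaussian parameters of the prior on sigma^2

def ig_mgf(t):
    # MGF of the Inverse Gaussian, valid for t < theta / (2 mu^2) = 2.78
    return math.exp((theta / mu) * (1 - math.sqrt(1 - 2 * t * mu ** 2 / theta)))

m_half, m_one, m_two = ig_mgf(0.5), ig_mgf(1), ig_mgf(2)   # 1.3701, 1.9477, 4.8042

c = math.exp(5)                            # exp[mu] of the LogNormal, mu = 5
first  = c * m_half                        # first moment of the hypothetical means
second = c ** 2 * m_one                    # second moment of the hypothetical means
VHM = second - first ** 2                  # about 1555
EPV = c ** 2 * (m_two - m_one)             # e^10 (E[e^(2 sigma^2)] - E[e^(sigma^2)])
print(round(EPV / VHM, 1))                 # about 40.5, answer C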
10.33. B. Each Inverse Gamma has mean: θ/(α - 1) = θ/5,
and second moment: θ²/{(α - 1)(α - 2)} = θ²/20.
Therefore, each Inverse Gamma has a (process) variance of: θ²/20 - (θ/5)² = θ²/100.
θ is distributed via an Exponential Distribution; call the mean of this Exponential Distribution μ.
E[θ] = μ, E[θ²] = 2μ², and Var[θ] = μ².
EPV = E[θ²/100] = E[θ²]/100 = 2μ²/100 = μ²/50.
VHM = Var[θ/5] = Var[θ]/5² = μ²/25.
K = EPV/VHM = (μ²/50)/(μ²/25) = 0.5.
Comment: This is an example of an Exponential-Inverse Gamma. K = 2/(α - 2), for α > 2.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 490
10.34. B. λ and θ are distributed independently.
λ has a density which is Pareto with α = 3 and θ = 1/10:
π(λ) = 3(1/10)³/(1/10 + λ)⁴ = 30/(1 + 10λ)⁴.
E[λ] = (1/10)/2 = 0.05. E[λ²] = (2)(1/10)²/{(2)(1)} = 0.01.
θ has a density which is Gamma with α = 2 and θ = 10:
π(θ) = θ e^(-0.1θ)/(Γ[2] 10²) = θ e^(-0.1θ)/100.
E[θ] = (2)(10) = 20. E[θ²] = (2)(3)(10²) = 600.
The mean aggregate loss is: λθ.
The variance of aggregate loss is: λ(2nd moment of severity) = 2λθ².
EPV = E[2λθ²] = 2E[λ]E[θ²] = (2)(0.05)(600) = 60.
First moment of the hypothetical means: E[λθ] = E[λ]E[θ] = (0.05)(20) = 1.
Second moment of the hypothetical means: E[(λθ)²] = E[λ²]E[θ²] = (0.01)(600) = 6.
VHM = 6 - 1² = 5. K = EPV/VHM = 60/5 = 12.
Comment: One can compute the moments by doing the relevant integrals, but it is a lot of work!
E[2λθ²] = ∫∫ 2λθ² {0.3 θ e^(-0.1θ)/(1 + 10λ)⁴} dλ dθ = 0.6 ∫_{0}^{∞} λ/(1 + 10λ)⁴ dλ ∫_{0}^{∞} θ³ e^(-0.1θ) dθ
= (0.6)(1/600)(Γ[4] 10⁴) = 60.
E[λθ] = ∫∫ λθ {0.3 θ e^(-0.1θ)/(1 + 10λ)⁴} dλ dθ = 0.3 ∫_{0}^{∞} λ/(1 + 10λ)⁴ dλ ∫_{0}^{∞} θ² e^(-0.1θ) dθ
= (0.3)(1/600)(Γ[3] 10³) = 1.
E[λ²θ²] = ∫∫ λ²θ² {0.3 θ e^(-0.1θ)/(1 + 10λ)⁴} dλ dθ = 0.3 ∫_{0}^{∞} λ²/(1 + 10λ)⁴ dλ ∫_{0}^{∞} θ³ e^(-0.1θ) dθ
= (0.3)(1/3000)(Γ[4] 10⁴) = 6.
All of the integrals involving θ are of the Gamma type.
The integrals involving λ can be done via integration by parts, or if one is clever by using the
moment formulas for a Pareto Distribution.
For example, ∫_{0}^{∞} λ/(1 + 10λ)⁴ dλ = ∫_{0}^{∞} x (0.1⁴)/(0.1 + x)⁴ dx = (0.1/3) ∫_{0}^{∞} x {3(0.1³)/(0.1 + x)⁴} dx
= (0.1/3) ∫_{0}^{∞} x f(x) dx, where f(x) is the density of a Pareto with α = 3 and θ = 0.1.
Therefore, (0.1/3) ∫_{0}^{∞} x f(x) dx = (0.1/3)(1st moment of this Pareto) = (0.1/3)(0.1/2) = 1/600.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 491
10.35. E. For a Pareto Distribution with α = 4, E[X] = θ/3, E[X²] = θ²/3, and Var[X] = 2θ²/9.
The uniform distribution from 10 to 50 has mean 30 and variance 40²/12.
EPV = E[2θ²/9] = (2/9)E[θ²] = (2/9)(40²/12 + 30²) = 229.63.
VHM = Var[θ/3] = Var[θ]/3² = (40²/12)/9 = 14.815.
K = EPV/VHM = 229.63/14.815 = 15.5.
Z = 6/(6 + 15.5) = 27.9%.
10.36. D. The hypothetical mean is: αθ.
Therefore, the First Moment of the Hypothetical Means is:
∫∫ αθ π(α, θ) dα dθ = ∫_{2}^{3} α³ dα ∫_{1}^{2} θ⁴ dθ / 23.75 = (16.25)(6.2)/23.75 = 4.242.
Second Moment of the Hypothetical Means is:
∫∫ (αθ)² π(α, θ) dα dθ = ∫_{2}^{3} α⁴ dα ∫_{1}^{2} θ⁵ dθ / 23.75 = (42.2)(10.5)/23.75 = 18.657.
VHM = 18.657 - 4.242² = 0.662.
The process variance is: αθ².
Therefore, the EPV is:
∫∫ αθ² π(α, θ) dα dθ = ∫_{2}^{3} α³ dα ∫_{1}^{2} θ⁵ dθ / 23.75 = (16.25)(10.5)/23.75 = 7.184.
K = EPV/VHM = 7.184/0.662 = 10.85.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 492
10.37. A. The process variance given u is:
Second Moment of LogNormal - Square of First Moment of LogNormal =
Exp[2u + (2)(2²)] - Exp[u + (2²)/2]² = Exp[2u] (e⁸ - e⁴).
Therefore, EPV = E[e^(2u)] (e⁸ - e⁴).
u is Normal with mean 5 and variance 3.
(When we multiply a variable by a constant such as 2, we multiply the variance by that constant
squared, and thus we multiply the standard deviation by that constant.)
Therefore, 2u is Normal with mean 10, and variance: (2²)(3) = 12.
Therefore, e^(2u) is LogNormal with parameters 10 and √12.
Therefore, E[e^(2u)] is the mean of this LogNormal: Exp[10 + 12/2] = e^16.
Therefore, EPV = e^16 (e⁸ - e⁴) = 26,004 million.
The hypothetical mean given u is: First Moment of LogNormal = Exp[u + (2²)/2] = e^u e².
Therefore, VHM = Var[e^u] (e²)² = Var[e^u] e⁴.
u is Normal with mean 5 and variance 3. Therefore, e^u is LogNormal with parameters 5 and √3.
Therefore, Var[e^u] is the variance of this LogNormal:
Exp[(2)(5) + (2)(3)] - Exp[5 + 3/2]² = e^16 - e^13.
Therefore, VHM = (e^16 - e^13) e⁴ = 461.01 million.
K = EPV/VHM = 26,004/461.01 = 56.4.
Z = 20/(20 + 56.4) = 26.2%.
The hypothetical mean given u is: e^u e². e^u is LogNormal with parameters 5 and √3.
Therefore, E[e^u] is the mean of this LogNormal: Exp[5 + 3/2] = e^6.5.
Prior mean is: E[e^u] e² = e^6.5 e² = 4915.
Estimate = (26.2%)(10,000) + (1 - 26.2%)(4915) = 6247.
Comment: Long and hard!

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 493
10.38. C. The prior density of alpha is: 1/2, 3 ≤ α ≤ 5.
The hypothetical mean is the mean of the Inverse Gamma: θ/(α - 1).
Therefore, the First Moment of the Hypothetical Means is:
∫_{3}^{5} ∫_{10}^{∞} {θ/(α - 1)} (1/2)(3000/θ⁴) dθ dα = 1500 ∫_{3}^{5} 1/(α - 1) dα ∫_{10}^{∞} θ⁻³ dθ
= (1500){ln(4) - ln(2)}(1/200) = 5.1986.
Second Moment of the Hypothetical Means is:
∫_{3}^{5} ∫_{10}^{∞} {θ/(α - 1)}² (1/2)(3000/θ⁴) dθ dα = 1500 ∫_{3}^{5} 1/(α - 1)² dα ∫_{10}^{∞} θ⁻² dθ
= (1500){1/2 - 1/4}(1/10) = 37.5.
VHM = 37.5 - 5.1986² = 10.475.
The second moment of an Inverse Gamma is: θ²/{(α - 1)(α - 2)}.
Therefore, the process variance is: θ²/{(α - 1)(α - 2)} - θ²/(α - 1)² = θ²/{(α - 1)²(α - 2)}.
Therefore, the EPV is:
∫_{3}^{5} ∫_{10}^{∞} [θ²/{(α - 1)²(α - 2)}] (1/2)(3000/θ⁴) dθ dα = 1500 ∫_{3}^{5} 1/{(α - 1)²(α - 2)} dα ∫_{10}^{∞} θ⁻² dθ.
∫_{3}^{5} 1/{(α - 1)²(α - 2)} dα = [1/(α - 1) + ln(α - 2) - ln(α - 1)] evaluated from α = 3 to 5
= (1/4 - 1/2) + ln[3/1] - ln[4/2] = 0.155465.
EPV = (1500)(1/10)(0.155465) = 23.32. K = EPV/VHM = 23.32/10.475 = 2.23.
Z = 4/(4 + 2.23) = 64.2%.
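
A Python sketch (assuming SciPy is installed) for 10.38; since α and θ are independent, each double integral factors into a product of one-dimensional integrals:

from scipy.integrate import quad

inf = float("inf")
f_alpha = lambda a: 0.5                     # alpha uniform on (3, 5)
f_theta = lambda t: 3000 / t ** 4           # theta density, t > 10

Ea = lambda g: quad(lambda a: g(a) * f_alpha(a), 3, 5)[0]
Et = lambda g: quad(lambda t: g(t) * f_theta(t), 10, inf)[0]

m1  = Ea(lambda a: 1 / (a - 1)) * Et(lambda t: t)                      # 5.1986
m2  = Ea(lambda a: 1 / (a - 1) ** 2) * Et(lambda t: t * t)             # 37.5
EPV = Ea(lambda a: 1 / ((a - 1) ** 2 * (a - 2))) * Et(lambda t: t * t) # 23.32

K = EPV / (m2 - m1 ** 2)
print(round(K, 2), round(4 / (4 + K), 3))   # 2.23  0.642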
10.39. D. Severity follows a Single Parameter Pareto with α = 3.
E[X | θ] = 3θ/(3 - 1) = 3θ/2. E[X² | θ] = 3θ²/(3 - 2) = 3θ². Var[X | θ] = 3θ² - (3θ/2)² = 3θ²/4.
Then the VHM = Var[3θ/2] = (9/4)Var[θ] = (9/4)(5²/12) = 75/16.
EPV = E[3θ²/4] = (3/4)E[θ²] = (3/4)(5²/12 + 12.5²) = 118.75. K = EPV/VHM = 76/3.
The prior mean is: E[3θ/2] = (3/2)E[θ] = (1.5)(12.5) = 18.75.
The observed mean is: 200/8 = 25.
Z = 8 / (8 + 76/3) = 24.0%.
Estimate = (24.0%)(25) + (1 - 24.0%)(18.75) = 20.25.
Comment: The variance of a uniform is its width squared divided by 12.
The second moment of a uniform is its variance plus the square of its mean.

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 494
10.40. E. The mean of the zero-truncated Poisson distribution is: λ/(1 - e^(-λ)) = 2/(1 - e^(-2)) = 2.313.
Therefore, E[10β] = 2.313. E[β] = 0.2313.
As shown in Appendix B of Loss Models, the variance of the zero-truncated Poisson distribution is:
λ{1 - (λ + 1)e^(-λ)}/(1 - e^(-λ))² = (2){1 - 3e^(-2)}/(1 - e^(-2))² = 1.5890.
Therefore, Var[10β] = 1.5890. Var[β] = 1.5890/100 = 0.01589.
The process variance is: β(1 + β) = β + β².
EPV = E[β + β²] = E[β] + E[β²] = E[β] + Var[β] + E[β]² = 0.2313 + 0.01589 + 0.2313² = 0.3007.
VHM = Var[β] = 0.01589.
K = EPV / VHM = 0.3007 / 0.01589 = 18.92.
For 5 years of data, Z = 5/(5 + 18.92) = 20.9%.
10.41. E. E[X | μ] = exp[μ + 2²/2] = e⁷ e^(μ-5). E[X] = e⁷ E[e^(μ-5)].
E[X | μ]² = e^14 e^(2(μ-5)).
Second moment of the hypothetical means = E[E[X | μ]²] = e^14 E[e^(2(μ-5))].
Var[X | μ] = E[X² | μ] - E[X | μ]² = exp[2μ + (2)(2²)] - (e² e^μ)² = (e⁸ - e⁴) e^(2μ) = (e^18 - e^14) e^(2(μ-5)).
EPV = E[Var[X | μ]] = (e^18 - e^14) E[e^(2(μ-5))].
Now the probability generating function is defined as: P[z] = E[z^n].
Here μ - 5 follows a Negative Binomial distribution, so μ - 5 takes the place of n.
Therefore, E[e^(μ-5)] = P[e], and E[e^(2(μ-5))] = P[e²].
As shown in Appendix B of Loss Models, for the Negative Binomial:
P(z) = {1/(1 - β(z - 1))}^r = 1/(1.05 - 0.05z)⁶, z < 1 + 1/β = 21.
P(e) = 1.7143. P(e²) = 10.0659.
EPV = (e^18 - e^14) E[e^(2(μ-5))] = (e^18 - e^14) P(e²) = (e^18 - e^14)(10.0659) = 648.8 million.
VHM = E[E[X | μ]²] - E[X]² = e^14 E[e^(2(μ-5))] - (e⁷ E[e^(μ-5)])² = e^14 P(e²) - e^14 P(e)² =
e^14 (10.0659 - 1.7143²) = 8.571 million.
K = EPV / VHM = 648.8 / 8.571 = 75.7.
Comment: Difficult!

2013-4-9 Buhlmann Credibility 10 Buhl. Cred. Contin. Risk Types, HCM 10/19/12, Page 495
10.42. B. E[q] = 0.3. E[q²] = Var[q] + E[q]² = 0.2²/12 + 0.3² = 0.09333.
Process variance of pure premium is:
(10q)(5)(200²) + (1000²)(10)(q)(1 - q) = (12 million)q - (10 million)q².
EPV = (12 million)E[q] - (10 million)E[q²] = (12 million)(0.3) - (10 million)(0.09333)
= 2,666,667.
The mean pure premium is: (5)(200)(10q) = 10,000q.
A priori mean pure premium is: 10,000 E[q] = (10,000)(0.3) = 3000.
VHM = Var[10,000q] = (100 million) Var[q] = (100 million)(0.2²/12) = 333,333.
K = EPV / VHM = 2,666,667 / 333,333 = 8.0.
There are a total of 45 members during the three years. Z = 45 / (45 + 8) = 85.0%.
Observed average pure premium is:
{(10)(2143) + (15)(2551) + (20)(2260)} / (10 + 15 + 20) = 2331.
Estimated future pure premium is: (0.850)(2331) + (1 - 0.850)(3000) = 2431.
Estimate of aggregate losses in year 4 is: (2431)(25) = 60,775.
Comment: Similar to 4, 11/01, Q.18.
No use is made of the given claim counts.
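
A Python sketch (a numerical check only) of the Buhlmann-Straub calculation in 10.42; small differences from the figures above are due to rounding:

members = [10, 15, 20]
loss_per_member = [2143, 2551, 2260]

E_q, Var_q = 0.3, 0.2 ** 2 / 12
E_q2 = Var_q + E_q ** 2

m, alpha, theta = 10, 5, 200                    # Binomial m; Gamma severity parameters
sev_mean = alpha * theta                        # 1000
sev_var = alpha * theta ** 2                    # 200,000

# process variance of pure premium per member: m q Var[sev] + E[sev]^2 m q (1 - q)
EPV = m * sev_var * E_q + sev_mean ** 2 * m * (E_q - E_q2)
VHM = (m * sev_mean) ** 2 * Var_q
K = EPV / VHM                                   # 8.0

n = sum(members)                                # 45 member-years
Z = n / (n + K)
xbar = sum(mi * xi for mi, xi in zip(members, loss_per_member)) / n   # 2331
prior = m * sev_mean * E_q                      # 3000
estimate = 25 * (Z * xbar + (1 - Z) * prior)
print(round(K, 1), round(estimate))             # 8.0  about 60,800, answer (B)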
10.43. C. Mean frequency is .10.
Mean severity for risk type one is: (1/3)(5) + (1/2)(10) + (1/6)(20) = 10.
Mean severity for risk type two is: (1/2)(5) + (1/4)(10) + (1/4)(20) = 10.
Thus the mean severity is independent of whether a risk is type 1 or type 2.
Let m be the mean frequency for a risk. Since the frequency and severity are distributed
independently, the hypothetical mean is 10m.
The first moment of the hypothetical means is (10)(.10) = 1.
The second moment of the hypothetical means is:
∫_{.07}^{.13} (10m)² (1/.06) dm = (100/3){(.13³) - (.07³)}/.06 = .0618/.06 = 1.03.
Thus the variance of the hypothetical means is 1.03 - 1² = 0.03.


Comment: Since the mean severity is independent of the type of risk, the computation is simplified
somewhat.

10.44. E. For a risk with average frequency m, and of type 1, the variance of the pure premium is
E(PP^2) - E(PP)^2 = 125m - 100m^2:
Outcome               A Priori Probability    Pure Premium    Square of Pure Premium
No Claim              1 - m                   0               0
A Claim of Size 5     m/3                     5               25
A Claim of Size 10    m/2                     10              100
A Claim of Size 20    m/6                     20              400
Overall                                       10m             125m

For a risk with average frequency m, and of type 2, the variance of the pure premium is
E(PP^2) - E(PP)^2 = 137.5m - 100m^2:
Outcome               A Priori Probability    Pure Premium    Square of Pure Premium
No Claim              1 - m                   0               0
A Claim of Size 5     m/2                     5               25
A Claim of Size 10    m/4                     10              100
A Claim of Size 20    m/4                     20              400
Overall                                       10m             137.5m

The density function for the average frequency m is (1/.06) on [.07, .13]. Therefore, the expected
value of the process variance of the pure premiums can be obtained by weighting together the
chances of the two types of severities and integrating over m:
(60%) ∫ from .07 to .13 of {125m - 100m^2}(1/.06) dm + (40%) ∫ from .07 to .13 of {137.5m - 100m^2}(1/.06) dm =
(1/.06) ∫ from .07 to .13 of {130m - 100m^2} dm = (1/.06){(130)(.006) - (100)(.000618)} = 11.97.

Alternately, the mean severity for type 1 is 10 with a variance of 25. The mean severity for type 2
is 10 with a variance of 37.5. For a risk with mean frequency m, the variance of the frequency is
m(1-m), since we have a Bernoulli. Thus for a risk with average frequency m, and of type 1,
the variance of the pure premium is: m(25) + (10^2)m(1 - m) = 125m - 100m^2.
For a risk with average frequency m, and of type 2, the variance of the pure premium is:
m(37.5) + (10^2)m(1 - m) = 137.5m - 100m^2. Then proceed as above.
Comment: Alternately one can compute the total variance which turns out to be 12 and subtract the
variance of the hypothetical means from the previous question of .03, getting the expected value of
the process variance of 11.97.
10.45. B. K = EPV / VHM = 11.97 / 0.03 = 399.

10.46. A. Z = 15 / (15 + 399) = .036. The prior estimate is the mean pure premium of 1.
The observation is 45/15 = 3. Thus the new estimate is (.036) (3) + (1-.036)(1) = 1.072.
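The chain of calculations in 10.43 through 10.46 can be verified directly; a minimal Python sketch (m uniform on [.07, .13], hypothetical mean 10m, as in the solutions):

    E_m = (0.07 + 0.13) / 2
    E_m2 = (0.13**3 - 0.07**3) / (3 * 0.06)          # E[m^2] for m uniform on [.07, .13]
    VHM = 100 * E_m2 - (10 * E_m)**2                 # 0.03
    EPV = 130 * E_m - 100 * E_m2                     # 11.97, from the 60%/40% mixture
    K = EPV / VHM                                    # 399
    Z = 15 / (15 + K)
    print(round(VHM, 3), round(EPV, 2), round(K), round(Z * (45/15) + (1 - Z) * 1.0, 3))
    # 0.03 11.97 399 1.072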
10.47. D. For q = m/100,000, f(q) = 100q, for 0 < q ≤ .1, and f(q) = 20 - 100q, for .1 < q ≤ .2.
The process variance for the Bernoulli given q is q(1 - q). Thus the Expected Value of the Process
Variance is the integral of q(1 - q)f(q), which is 1.06/12:
∫ from 0 to .2 of f(q) q(1 - q) dq = ∫ from 0 to .1 of (100q) q(1 - q) dq + ∫ from .1 to .2 of (20 - 100q) q(1 - q) dq =
{100q^3/3 - 25q^4} evaluated from q = 0 to q = .1, plus {10q^2 - 40q^3 + 25q^4} evaluated from q = .1 to q = .2
= .37/12 + (3.6 - 3.36 + .45)/12 = 1.06/12.
The hypothetical mean is q. Therefore the overall mean = E[q] = the integral from 0 to 0.2 of q f(q):
∫ from 0 to .2 of f(q) q dq = ∫ from 0 to .1 of (100q) q dq + ∫ from .1 to .2 of (20 - 100q) q dq =
{100q^3/3} evaluated from q = 0 to q = .1, plus {10q^2 - (100/3)q^3} evaluated from q = .1 to q = .2
= .0333 + .3 - .2333 = .1.
The Variance of the Hypothetical Means is the integral of f(q)(q - .1)^2.
By symmetry around q = .1, we can take twice the integral from 0 to .1:
2 ∫ from 0 to .1 of f(q)(q - .1)^2 dq = 2 ∫ from 0 to .1 of (100q)(q^2 - .2q + .01) dq
= {50q^4 - 40q^3/3 + q^2} evaluated from q = 0 to q = .1 = .02/12.
Buhlmann Credibility Parameter, K = EPV/VHM = (1.06/12) / (.02/12) = 53.
For two exposures, Z = 2/(2 + 53) = 2/55. The observed frequency is 1/2.
The new estimate is: (2/55)(1/2) + (53/55)(.1) = 63/550.
Comment: Since f(q) is symmetric around q = .1, E[q] = .1.
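The three integrals can also be checked by crude numerical integration; a minimal Python sketch using a midpoint sum (not how the exam expects the problem to be done):

    n = 200_000
    dq = 0.2 / n
    EPV = mean = VHM = 0.0
    for i in range(n):
        q = (i + 0.5) * dq
        f = 100*q if q <= 0.1 else 20 - 100*q      # the density of q
        EPV += f * q * (1 - q) * dq
        mean += f * q * dq
        VHM += f * (q - 0.1)**2 * dq
    K = EPV / VHM                                  # 53
    Z = 2 / (2 + K)
    print(round(12*EPV, 3), round(mean, 3), round(12*VHM, 3), round(K, 1), round(Z*0.5 + (1 - Z)*0.1, 4))
    # 1.06 0.1 0.02 53.0 0.1145   (63/550 = 0.1145...)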
10.48. D. The process variance for a Gamma is αθ^2. Thus EPV = E[αθ^2] = E[α/2^2] = E[α]/4 =
((4 + 0)/2)/4 = 1/2. The mean of a Gamma is αθ. Thus VHM = Var[αθ] = Var[α/2] = Var[α]/2^2 =
((4 - 0)^2/12)/4 = 1/3. K = EPV/VHM = (1/2)/(1/3) = 3/2.
Comment: The variance of a uniform distribution on [a, b] is: (b - a)^2/12.

10.49. E. Since λ and θ are independent and each distributed uniformly over (0, 1),
E[λ] = E[θ] = 1/2, and E[λ^2] = E[θ^2] = 1/3. For a given λ and θ, the mean pure premium is λθ, and the
process variance of the pure premium is:
σ_PP^2 = µ_F σ_S^2 + µ_S^2 σ_F^2 = λ(2θ^2) + (θ^2)(2λ) = 4λθ^2.
Therefore, EPV = E[4λθ^2] = 4 E[θ^2] E[λ] = (4)(1/3)(1/2) = 2/3.
VHM = Var[λθ] = E[(λθ)^2] - E[λ]^2 E[θ]^2 = E[λ^2] E[θ^2] - (1/2)^2 (1/2)^2 = (1/3)(1/3) - 1/16 = 7/144.
Buhlmann Credibility Parameter K = EPV/VHM = (2/3)/(7/144) = 13.7.
Comment: E[λ], E[θ], E[λ^2], and E[θ^2] can each be computed by doing a double integral with
respect to f(λ, θ) dλ dθ.
10.50. D. Since for the joint distribution λ and θ are independent, for severity the type of risk is
determined just by θ. This is similar to picking a die and spinner separately.
EPV = ∫ from 0 to 1 of 2θ^2 dθ = 2/3.
Overall mean severity = ∫ from 0 to 1 of θ dθ = 1/2.
2nd moment of the hypothetical mean severities = ∫ from 0 to 1 of θ^2 dθ = 1/3. VHM = 1/3 - (1/2)^2 = 1/12.
K = EPV/VHM = (2/3)/(1/12) = 8.

10.51. B. Given λ, the mean losses = (mean of Poisson)(mean severity) = (λ)(λ) = λ^2.
Thus the overall mean is:
∫ from 0 to ∞ of λ^2 f(λ) dλ = ∫ from 0 to ∞ of λ^2 e^-λ dλ = 2! = 2.
The second moment of the hypothetical means is:
∫ from 0 to ∞ of λ^4 f(λ) dλ = ∫ from 0 to ∞ of λ^4 e^-λ dλ = 4! = 24.
Therefore the Variance of the Hypothetical Means = 24 - 2^2 = 20.
Given λ, the process variance = (mean of Poisson)(second moment of severity) = (λ)(2λ^2) = 2λ^3.
EPV = ∫ from 0 to ∞ of 2λ^3 f(λ) dλ = 2 ∫ from 0 to ∞ of λ^3 e^-λ dλ = 2(3!) = 12. K = EPV/VHM = 12/20 = 3/5.

10.52. C. The mean of each Negative Binomial Distribution is: rβ = 0.5ri = µi.
E[0.5ri] = E[µi] = 0.2. Therefore, E[ri] = 0.2/0.5 = 0.4.
The process variance of each Negative Binomial is: rβ(1 + β) = (0.5)(1.5)ri = 0.75ri.
EPV = E[0.75ri] = 0.75 E[ri] = (0.75)(0.4) = 0.3.
VHM = Var[µi] = Variance of the Exponential = 0.2^2 = 0.04.
K = EPV/VHM = .3/.04 = 7.5. For one driver, for one year, Z = 1/(1 + 7.5) = 11.8%.
Alternately, since β is fixed at 0.5, r = µ/0.5 = 2µ.
Therefore, since µ follows an Exponential Distribution with mean 0.2, the r parameters over the
portfolio have an Exponential Distribution with mean 0.4.
The hypothetical mean is: rβ = 0.5r. VHM = Var[0.5r] = 0.5^2 Var[r] = (0.25)(0.4^2) = 0.04.
Proceed as before.
Comment: The frequency process is Negative Binomial, with fixed and r varying across the
portfolio of risks.
It has been implicitly assumed that for each driver his r parameter is constant over time.
Whenever as here, we are given the distribution of the hypothetical means, we can use that
distribution to get the VHM directly, as I did here. For example, in the case of the Gamma-Poisson,
the VHM is the variance of the Gamma distribution of lambdas.
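A short numeric check, as a minimal Python sketch (µ denotes the Exponential hypothetical mean with mean 0.2, as in the solution):

    beta = 0.5
    E_mu, Var_mu = 0.2, 0.2**2         # Exponential: variance = mean^2
    E_r = E_mu / beta                  # 0.4
    EPV = beta * (1 + beta) * E_r      # E[r * beta * (1 + beta)] = 0.3
    VHM = Var_mu                       # 0.04
    K = EPV / VHM
    print(round(EPV, 3), round(VHM, 3), round(K, 2), round(1 / (1 + K), 3))   # 0.3 0.04 7.5 0.118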

10.53. E. Mean severity = 1000/(3 - 1) = 500.
2nd moment of severity = (2)(1000^2)/{(3 - 1)(3 - 2)} = 1,000,000. Variance = 750,000.
Mean frequency = .2r. Variance of frequency = (.2)(1.2)r = .24r.
EPV = E[(.24r)(500^2) + (.2r)(750,000)] = E[210,000r] = 210,000 E[r] = (210,000)(2) = 420,000.
VHM = Var[(.2r)(500)] = Var[100r] = 100^2 Var[r] = (10,000)(2^2) = 40,000.
K = EPV/VHM = 420,000/40,000 = 10.5. Z = 100/(100 + 10.5) = 0.905.
Comment: Assume r is the same for every insured in this book of business; bullet #2 could have
been clearer in this regard.
10.54. B. Mean severity = 1000/(3 - 1) = 500.
2nd moment of severity = (2)(1000^2)/{(3 - 1)(3 - 2)} = 1,000,000. Variance = 750,000.
Mean frequency = .2r. Variance of frequency = (.2)(1.2)r = .24r.
E[r] = 2. E[r^2] = 1/3 + 4/3 + 9/3 = 14/3. Var[r] = 14/3 - 2^2 = 2/3.
EPV = E[(.24r)(500^2) + (.2r)(750,000)] = E[210,000r] = 210,000 E[r] = (210,000)(2) = 420,000.
VHM = Var[(.2r)(500)] = Var[100r] = 100^2 Var[r] = (10,000)(2/3) = 6667.
K = EPV/VHM = 420,000/6667 = 63. Z = 100/(100 + 63) = 61.4%.
10.55. E. EPV = E[σ^2] = 1.25. VHM = Var[µ] = (1.5 - .5)^2/12 = 1/12. K = 1.25/(1/12) = 15.
Z = 1/(1 + K) = 1/16. Prior mean = E[µ] = 1. Estimate = (1/16)(0) + (15/16)(1) = 15/16 = 0.94.
Comment: You have to assume that the distributions of µ and σ^2 are independent.
10.56. D. EPV = E[σ^2] = (75%)(1) + (25%)(2) = 1.25.
VHM = Var[µ] = (.5)(.5 - 1)^2 + (.5)(1.5 - 1)^2 = 0.25. K = 1.25/0.25 = 5.
Z = 1/(1 + K) = 1/6. Prior mean = E[µ] = 1. Estimate = (1/6)(0) + (5/6)(1) = 0.833.
Comment: Discrete risk type analog to 4, 11/02, Q.18.

10.57. C. The second moment of the Exponential Distribution is: 2(10λ)^2 = 200λ^2.
For a Poisson frequency, the process variance of aggregate losses is:
λ(second moment of severity) = λ(200λ^2) = 200λ^3.
EPV = ∫ from 1 to ∞ of (PV given λ) π(λ) dλ = ∫ from 1 to ∞ of 200λ^3 (5/λ^6) dλ = -500/λ^2, evaluated from λ = 1 to ∞, = 500.
The mean aggregate loss given λ is: (λ)(10λ) = 10λ^2.
Overall Mean = ∫ from 1 to ∞ of 10λ^2 π(λ) dλ = ∫ from 1 to ∞ of 10λ^2 (5/λ^6) dλ = -(50/3)/λ^3, evaluated from λ = 1 to ∞, = 50/3.
2nd moment of the hypothetical means = ∫ from 1 to ∞ of (10λ^2)^2 π(λ) dλ = ∫ from 1 to ∞ of 100λ^4 (5/λ^6) dλ
= -500/λ, evaluated from λ = 1 to ∞, = 500.
VHM = 500 - (50/3)^2 = 2000/9. K = EPV/VHM = 500/(2000/9) = 2.25.
Alternately, π(λ) = 5/λ^6, λ > 1.
Rewriting this prior distribution as f(x) = 5/x^6, x > 1, it is a Single Parameter Pareto Distribution with
α = 5 and θ = 1.
The process variance of aggregate losses is: λ(second moment of severity) = λ(200λ^2) = 200λ^3.
Therefore, EPV = E[200λ^3] = 200(third moment of the Single Parameter Pareto Distribution)
= (200)(5)(1^3)/(5 - 3) = 500.
The mean aggregate loss given λ is: (λ)(10λ) = 10λ^2.
Overall Mean = E[10λ^2] = 10(second moment of the Single Parameter Pareto Distribution)
= (10)(5)(1^2)/(5 - 2) = 50/3.
2nd moment of the hypothetical means = E[(10λ^2)^2] = 100 E[λ^4] =
(100)(fourth moment of the Single Parameter Pareto Distribution) = (100)(5)(1^4)/(5 - 4) = 500.
VHM = 500 - (50/3)^2 = 2000/9. K = EPV/VHM = 500/(2000/9) = 2.25.
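The Single Parameter Pareto moments used in the alternate solution can be spot-checked; a minimal Python sketch:

    alpha, theta = 5.0, 1.0
    spp_moment = lambda k: alpha * theta**k / (alpha - k)   # k-th moment, valid for k < alpha
    EPV = 200 * spp_moment(3)             # 500
    mean = 10 * spp_moment(2)             # 50/3
    VHM = 100 * spp_moment(4) - mean**2   # 2000/9
    print(EPV, round(mean, 3), round(VHM, 3), round(EPV / VHM, 2))   # 500.0 16.667 222.222 2.25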

10.58. D. 2nd moment of the Exponential: 2(10λ)^2 = 200λ^2.
For a Poisson frequency, the process variance is: λ(2nd moment of severity) = λ(200λ^2) = 200λ^3.
For λ = 1, the process variance is: 200. For λ = 2, the process variance is: 1600.
EPV = (70%)(200) + (30%)(1600) = 620.
Mean aggregate loss given λ is: (λ)(10λ) = 10λ^2.
For λ = 1, the mean is: 10. For λ = 2, the mean is: 40.
Overall Mean = (70%)(10) + (30%)(40) = 19.
2nd moment of the hypothetical means = (70%)(10^2) + (30%)(40^2) = 550.
VHM = 550 - 19^2 = 189. K = EPV/VHM = 620/189 = 3.28.
10.59. E. For fixed parameters, the mean aggregate is: λ exp[µ + σ^2/2] = λ exp[µ] exp[σ^2/2].
The variance of aggregate losses is: λ(2nd moment of severity) = λ exp[2µ + 2σ^2] = λ exp[2µ] exp[2σ^2].
The triple integrals below factor, since the joint density on the unit cube is 2σ (λ and µ uniform, σ with density 2σ).
EPV = ∫∫∫ over the unit cube of λ exp[2µ] exp[2σ^2] 2σ dσ dµ dλ
= {λ^2/2 from 0 to 1} {exp[2µ]/2 from 0 to 1} {exp[2σ^2]/2 from 0 to 1}
= (1/2) {(e^2 - 1)/2} {(e^2 - 1)/2} = 5.1025.
Overall mean = ∫∫∫ over the unit cube of λ exp[µ] exp[σ^2/2] 2σ dσ dµ dλ
= {λ^2/2 from 0 to 1} {exp[µ] from 0 to 1} {2 exp[σ^2/2] from 0 to 1}
= (1/2)(e - 1)(2)(e^(1/2) - 1) = 1.1147.
2nd moment of the hypothetical means = ∫∫∫ over the unit cube of λ^2 exp[2µ] exp[σ^2] 2σ dσ dµ dλ
= {λ^3/3 from 0 to 1} {exp[2µ]/2 from 0 to 1} {exp[σ^2] from 0 to 1}
= (1/3) {(e^2 - 1)/2} (e - 1) = 1.8297.
VHM = 1.8297 - 1.1147^2 = 0.5872.
K = EPV/VHM = 5.1025/0.5872 = 8.69.
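The three triple integrals can be checked by simulation; a minimal Python sketch (it assumes, as the integrals above imply, that λ and µ are uniform on (0, 1) and σ has density 2σ on (0, 1)):

    from math import exp
    import random
    random.seed(1)
    N = 200_000
    hm = hm2 = pv = 0.0
    for _ in range(N):
        lam, mu = random.random(), random.random()
        sigma = random.random()**0.5             # density 2s on (0, 1)
        m = lam * exp(mu + sigma**2 / 2)         # hypothetical mean of aggregate losses
        v = lam * exp(2*mu + 2*sigma**2)         # process variance, Poisson frequency
        hm += m; hm2 += m*m; pv += v
    EPV, VHM = pv/N, hm2/N - (hm/N)**2
    print(round(EPV, 2), round(VHM, 2), round(EPV/VHM, 1))   # roughly 5.10, 0.59, 8.7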

10.60. E. For fixed parameters, the mean aggregate is: λ exp[µ + σ^2/2].
The variance of aggregate losses is: λ(2nd moment of severity) = λ exp[2µ + 2σ^2].
Type      Probability   lambda   mu     sigma   Mean     Square of Mean   Process Var.
1         0.3           0.25     0.75   0.50    0.5997   0.3597           1.8473
2         0.2           0.75     0.50   0.25    1.2758   1.6276           2.3102
3         0.5           0.50     0.25   0.75    0.8505   0.7234           2.5392
Average                                         0.8603   0.7951           2.2858
VHM = .7951 - .8603^2 = .0550. K = EPV/VHM = 2.2858/.0550 = 41.6.


10.61. B. The mean aggregate loss is: λθ.
The variance of aggregate loss is: λ(2nd moment of severity) = λ(2θ^2) = 2λθ^2.
EPV = E[2λθ^2] = 2 E[λ] E[θ^2] = (2)(1)(1 + 1^2) = 4.
First moment of the hypothetical means: E[λθ] = E[λ]E[θ] = (1)(1) = 1.
Second moment of the hypothetical means: E[(λθ)^2] = E[λ^2]E[θ^2] = {(2)(1^2)}(1 + 1^2) = 4.
VHM = 4 - 1^2 = 3. K = EPV/VHM = 4/3.
10.62. D. The distribution of β is a Pareto Distribution with θ = 1. It has mean 1/(α - 1),
second moment 2/{(α - 1)(α - 2)},
and variance: 2/{(α - 1)(α - 2)} - 1/(α - 1)^2 = α/{(α - 1)^2 (α - 2)}.
A Geometric Distribution has mean β and variance β(1 + β).
EPV = E[β(1 + β)] = E[β] + E[β^2] = 1/(α - 1) + 2/{(α - 1)(α - 2)} = α/{(α - 1)(α - 2)}.
VHM = Var[β] = variance of the Pareto = α/{(α - 1)^2 (α - 2)}.
K = EPV/VHM = α - 1. Z = 1/(1 + K) = 1/α.
The observation is x and the prior mean is: E[β] = 1/(α - 1).
The estimate is: (1/α)x + (1 - 1/α){1/(α - 1)} = (x + 1)/α.
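The algebra can be spot-checked with any α > 2; a minimal Python sketch using α = 3:

    alpha = 3.0
    E_b = 1 / (alpha - 1)                    # mean of the Pareto with theta = 1
    E_b2 = 2 / ((alpha - 1) * (alpha - 2))   # its second moment
    EPV = E_b + E_b2                         # E[beta(1 + beta)]
    VHM = E_b2 - E_b**2
    K = EPV / VHM
    print(K, alpha - 1, round(1/(1 + K), 3), round(1/alpha, 3))   # K = alpha - 1 and Z = 1/alpha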


10.63. A. E[X | θ] = ∫ from 0 to θ of 2x^2/θ^2 dx = 2θ/3.
E[X^2 | θ] = ∫ from 0 to θ of 2x^3/θ^2 dx = θ^2/2.
Process Variance = E[X^2 | θ] - E[X | θ]^2 = θ^2/2 - (2θ/3)^2 = θ^2/18.
EPV = ∫ from 0 to 1 of (θ^2/18) 4θ^3 dθ = (2/9) θ^6/6, evaluated from 0 to 1, = 1/27.
First moment of the hypothetical means = ∫ from 0 to 1 of (2θ/3) 4θ^3 dθ = (8/3) θ^5/5, evaluated from 0 to 1, = 8/15.
2nd moment of the hypothetical means = ∫ from 0 to 1 of (2θ/3)^2 4θ^3 dθ = (16/9) θ^6/6, evaluated from 0 to 1, = 8/27.
VHM = 8/27 - (8/15)^2 = 0.01185.
K = EPV/VHM = (1/27)/0.01185 = 3.125.
Z = 1/(1 + K) = 24.2%.
Estimate is: (24.2%)(0.1) + (1 - 24.2%)(8/15) = 0.428.


Section 11, Linear Regression & Buhlmann Credibility


Buhlmann Credibility is a linear regression approximation to Bayes Analysis.
Fitting a Straight Line with an Intercept:119 120
Two-variable regression model: 1 independent variable and 1 intercept.
Yi = α + βXi + εi.
Ordinary least squares regression: minimize the sum of the squared differences between the
estimated and observed values of the dependent variable.
Estimated slope = β^ = {N ΣXiYi - ΣXi ΣYi} / {N ΣXi^2 - (ΣXi)^2}.
α^ = Y̅ - β^ X̅.
To convert a variable to deviations form, one subtracts its mean.
A variable in deviations form is written with a small rather than a capital letter.
xi = Xi - X̅. Variables in deviations form always have a mean of zero.
In deviations form, the least squares regression for the two-variable (linear) regression model,
Yi = α + βXi + εi, has solution:
β^ = Σxi yi / Σxi^2 = Σxi Yi / Σxi^2.
α^ = Y̅ - β^ X̅.
119 A review of material not on the syllabus of Exam 4/C. See Mahler's Guide to Regression, in the Discussions
portion of the NEAS webpage: www.neas-seminars.com/Discussions/
120 Provided you are given the individual data rather than the summary statistics, the allowed electronic calculators will
fit a least squares straight line with an intercept.


Weighted Regressions:121
In a weighted regression we weight some of the observations more heavily than others.
One can perform a weighted regression by minimizing the weighted sum of squared errors:
Σwi (Yi - Y^i)^2 = Σwi (Yi - α - βXi)^2.
The resulting fitted parameters are:122
β^ = {Σwi ΣwiXiYi - ΣwiXi ΣwiYi} / {Σwi ΣwiXi^2 - (ΣwiXi)^2}.
α^ = {ΣwiYi - β^ ΣwiXi} / Σwi.
Provided that the weights add to one, the weighted regression can be put into deviations form, by
subtracting the weighted average from each variable:
xi = Xi - ΣwiXi.    yi = Yi - ΣwiYi.
For the two-variable model: β^ = Σwi xi yi / Σwi xi^2.    α^ = ΣwiYi - β^ ΣwiXi.

Multisided Die Example:
Let's apply weighted least squares regression to the Bayesian Estimates of the results of a single
die-roll in the multi-sided die example. We had previously for this example:
Observation    A Priori Probability    Bayesian Estimate
1              0.2125                  2.853
2              0.2125                  2.853
3              0.2125                  2.853
4              0.2125                  2.853
5              0.0625                  3.7
6              0.0625                  3.7
7              0.0125                  4.5
8              0.0125                  4.5
The weights to be used are the a priori probabilities of each observation.
Put the variables in deviations form, by subtracting the weighted average from each variable:
xi = Xi - ΣwiXi, yi = Yi - ΣwiYi.
w = {0.2125, 0.2125, 0.2125, 0.2125, 0.0625, 0.0625, 0.0125, 0.0125}.
X = {1, 2, 3, 4, 5, 6, 7, 8}.

121 A review of material not on the syllabus of Exam 4/C.
122 If all of the wi = 1, then this reduces to the case of an unweighted regression.


ΣwiXi = 3 = the a priori mean.
x = X - ΣwiXi = {-2, -1, 0, 1, 2, 3, 4, 5}.
Y = {2.853, 2.853, 2.853, 2.853, 3.7, 3.7, 4.5, 4.5}.
ΣwiYi = 3. (Footnote 123.)
y = Y - ΣwiYi = {-0.147, -0.147, -0.147, -0.147, 0.7, 0.7, 1.5, 1.5}.
Then if the least squares line is Y = α + βX,
β^ = Σwi xi yi / Σwi xi^2 = 0.45/2.6 = 0.173.
α^ = ΣwiYi - β^ ΣwiXi = 3 - (3)(0.173) = (3)(0.827) = 2.481.
Y^i = 2.481 + 0.173Xi, where Xi is the observation.
Note that the slope of the line is the previously calculated credibility for one roll of a die,
Z = 17.3%. (Footnote 124.) Note that for one exposure Z = 1/(1 + K) = 1/(1 + EPV/VHM) =
VHM/(VHM + EPV) = VHM/Total Variance = 0.45/2.6.
The intercept of the fitted line is: (0.827)(3) = (1 - Z)(a priori mean).
Thus in this example, the fitted weighted regression line is the estimate using Buhlmann Credibility:
Z(observation) + (1 - Z)(a priori mean). This is true in general.
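The weighted regression above is easy to reproduce; a minimal Python sketch of the multi-sided die example (the numbers are those given in the text):

    w = [0.2125]*4 + [0.0625]*2 + [0.0125]*2
    X = [1, 2, 3, 4, 5, 6, 7, 8]
    Y = [2.853]*4 + [3.7]*2 + [4.5]*2
    xbar = sum(wi*xi for wi, xi in zip(w, X))      # 3, the a priori mean
    ybar = sum(wi*yi for wi, yi in zip(w, Y))      # 3, the Bayesian estimates are in balance
    slope = sum(wi*(xi - xbar)*(yi - ybar) for wi, xi, yi in zip(w, X, Y)) \
            / sum(wi*(xi - xbar)**2 for wi, xi in zip(w, X))
    print(round(slope, 3), round(ybar - slope*xbar, 3))   # 0.173 2.481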
Derivation of the Relationship of Buhlmann to Bayes:
In a more general situation, we would have the possible types of risks Rm.
(In the example, these were the three types of multi-sided dice. In other situations the different types
of risks are parameterized by a continuous distribution such as a Gamma.)
The Bayesian Estimates would be E[X | D], where D is the observed data.
Let Di be the possible outcomes of the risk process.
ΣwiXi is the a priori mean.
ΣwiYi = ΣP(Di) E[X | Di] = E[X] = the prior mean.
In other words, the Bayesian Estimates are always in balance.
123 The Bayesian Estimates are in balance, so that their weighted average is equal to the a priori mean of 3.
124 For the multi-sided die example, we had EPV = 2.15, VHM = .45, K = EPV/VHM = 4.778, and for the roll of a
single die Z = 1/(1+K) = 17.3%.

Σwi xi^2 = ΣP(Di) (Di - prior mean)^2 = the variance of the whole risk process (by definition).

Σwi xi yi = ΣP(Di) (Di - prior mean) (E[X | Di] - prior mean) =
ΣP(Di) Di E[X | Di] - ΣP(Di) Di (prior mean) - (prior mean) ΣP(Di) E[X | Di] + (prior mean)^2 =
ΣP(Di) Di E[X | Di] - (prior mean)^2 =
Σ_i P(Di) Di Σ_m P(Rm | Di) E(Rm) - (prior mean)^2 =
Σ_{i,m} Di P(Di) P(Rm | Di) E(Rm) - (prior mean)^2 =
Σ_{i,m} Di P(Di | Rm) P(Rm) E(Rm) - (prior mean)^2 =
Σ_m E(Rm) P(Rm) E(Rm) - (prior mean)^2 =
second moment of the hypothetical means - (prior mean)^2 = VHM. (Footnote 125.)

Thus the slope of the weighted regression line is: β^ = Σwi xi yi / Σwi xi^2 =
(Variance of the Hypothetical Means) / (Total Variance) = VHM / (EPV + VHM) =
1 / (1 + EPV/VHM) = 1/(1 + K) = Buhlmann Credibility for one observation = Z.
Thus the slope of the weighted least squares line to the Bayesian Estimates is the Buhlmann
Credibility.
The intercept of the weighted regression line is:
α^ = ΣwiYi - β^ ΣwiXi = prior mean - Z (prior mean) = (1 - Z)(prior mean).
Thus the weighted regression line is:
Y = α + βX = (1 - Z)(prior mean) + Z(observation) = the estimate using Buhlmann Credibility.
125 By Bayes Theorem, P(Rm)P(Di | Rm) = P(Di)P(Rm | Di); they are each equal to the probability of having both Rm
and Di. Also Σ_i Di P(Di | Rm) = E(Rm).


A General Result:
Thus the line formed by the Buhlmann Credibility estimates is the weighted least
squares line to the Bayesian estimates, with the a priori probability of each outcome
acting as the weights. The slope of this weighted least squares line to the
Bayesian Estimates is the Buhlmann Credibility. Buhlmann Credibility
is the Least Squares approximation to the Bayesian Estimates.
When the a priori probabilities of each outcome are equal, then this weighted regression reduces to
an ordinary regression.
Exercise: You are given the following information about a model:
First Observation    Unconditional Probability    Bayesian Estimate of Second Observation
1                    1/4                          3
4                    1/4                          6
10                   1/4                          13
25                   1/4                          18
Determine the Bühlmann credibility, Z, to be applied to one observation.
[Solution: X̅ = 10. x = Xi - X̅ = {-9, -6, 0, 15}.
Y̅ = 10. y = Yi - Y̅ = {-7, -4, 3, 8}.
β^ = Σxi yi / Σxi^2 = 207/342 = 0.605 = Z for one observation.]
Continuing, the intercept of the regression line is: α^ = Y̅ - β^ X̅ = 10 - (10)(0.605) = 3.95.
The weighted regression line is: Y^i = 3.95 + 0.605Xi.
Thus the estimate using Buhlmann Credibility is:
(0.605)(observation) + 3.95 = Z(observation) + (1 - Z)(10), with Z = 60.5% and prior mean = 10.
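Since the a priori probabilities are equal, the exercise reduces to an ordinary regression; a minimal Python sketch:

    X, Y = [1, 4, 10, 25], [3, 6, 13, 18]
    xbar, ybar = sum(X)/4, sum(Y)/4
    Z = sum((x - xbar)*(y - ybar) for x, y in zip(X, Y)) / sum((x - xbar)**2 for x in X)
    print(round(Z, 3), round(ybar - Z*xbar, 2))   # 0.605 3.95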


Least Squares Approximations:


It turns out that Buhlmann Credibility Estimates form a Least Squares Approximation in three
different but related manners.
Buhlmann Credibility Estimates are the (weighted) least squares line between:
1. Bayesian Estimates vs. Possible Observations
2. True Means vs. Observations
3. Subsequent Observations vs. Prior Observations
The third result is an asymptotic one as the sample size approaches infinity. It will be demonstrated
in a later section.126 127

126

It is illustrated graphically for a particular example in A Graphical Illustration of Experience Rating Credibilities, by
Howard C. Mahler, PCAS 1998.
127
The connection between linear regression and Buhlmann Credibility is a key idea used in Loss Development
Using Credibility by Eric Brosius, not on the syllabus of this exam.


Problems:
11.1 (2 points) Let X1 be the outcome of a single trial and let E[X2 | X1] be the expected value of
the outcome of a second trial as described in the table below.
Outcome V    Initial Probability of Outcome    Bayesian Estimate E[X2 | X1 = V]
0            1/4                               2
3            1/2                               4
10           1/4                               6
Determine the Buhlmann Credibility assigned to a single observation.
A. 31%    B. 33%    C. 35%    D. 37%    E. 39%
11.2 (2 points) You are given the following information:
First Observation    Unconditional Probability    Bayesian Estimate of Second Observation
1                    75%                          1.50
4                    25%                          2.50
Determine the Bühlmann credibility estimate of the second observation, given that the first
observation is 4.
(A) 2.50    (B) 2.75    (C) 3.00    (D) 3.25    (E) 3.50
11.3 (1 point) You are given the following:
An experiment consists of ten possible outcomes: R1, R2, ..., R10.
The a priori probability of outcome Ri is Pi.
For each possible outcome, Bayesian analysis was used to calculate predictive
estimates, Ei, for the second observation of the experiment.
Σ_{i=1}^{10} Pi Ei = 24.
The Buhlmann credibility factor after one experiment is 1/4.
Determine the values for the parameters a and b that minimize the expression:
Σ_{i=1}^{10} Pi (a + bRi - Ei)^2.
A. a = 6; b = 3/4    B. a = 18; b = 3/4    C. a = 6; b = 1/4    D. a = 18; b = 1/4    E. None of A, B, C, or D.


11.4 (3 points) You are given the following information about a credibility model:
First Observation    Unconditional Probability    Bayesian Estimate of Second Observation
10                   1/3                          18
20                   1/3                          23
60                   1/3                          49
Determine the estimate of the second observation using Buhlmann Credibility, given that the first
observation is 60.
(A) 43    (B) 45    (C) 47    (D) 49    (E) 51
11.5 (4, 5/83, Q.38) (1 point) Which of the following are true?
1. The estimates resulting from the use of Buhlmann Credibility
and the application of Bayes Theorem are always equal.
2. The estimate resulting from the use of Buhlmann Credibility is a linear approximation to
the estimate resulting from the use of Bayes Theorem.
3. If the estimate resulting from the use of Buhlmann Credibility is greater than
the hypothetical mean, then the estimate resulting from the application of Bayes Theorem
is also greater than the hypothetical mean.
A. 2
B. 3
C. 1, 3
D. 2, 3
E. 1, 2, 3
11.6 (4, 5/90, Q.57) (3 points) Let X1 be the outcome of a single trial and let
E[X2 | X1] be the expected value of the outcome of a second trial as described in the table below.
Outcome K    Initial Probability of Outcome    Bayesian Estimate E[X2 | X1 = K]
0            1/3                               1
3            1/3                               6
12           1/3                               8
Which of the following represents the Buhlmann credibility estimates corresponding to the Bayesian
estimates (1, 6, 8)?
A. (3, 5, 10)    B. (2, 4, 10)    C. (2.5, 4.0, 8.5)    D. (1.5, 3.375, 9.0)    E. (1, 6, 8)
11.7 (4B, 5/93, Q.6) (1 point) Which of the following are true?
1. Buhlmann credibility estimates are the best linear least squares approximations to
estimates from Bayesian analysis.
2. Buhlmann credibility requires the assumption of a distribution for the underlying
process generating claims.
3. Buhlmann credibility estimates are equivalent to estimates from Bayesian analysis
when the likelihood density function is a member of a linear exponential family
and the prior distribution is the conjugate prior.
A. 1
B. 2
C. 3
D. 1, 2
E. 1, 3


11.8 (4B, 11/93, Q.24) (3 points) You are given the following:
An experiment consists of three possible outcomes, R1 = 0, R2 = 2, and R3 = 14.
The a priori probability distribution for the experiment's outcome is:
Outcome, Ri    Probability, Pi
0              2/3
2              2/9
14             1/9
For each possible outcome, Bayesian analysis was used to calculate predictive
estimates, Ei, for the second observation of the experiment.
The predictive estimates are:
Outcome, Ri    Bayesian Analysis Predictive Estimate Ei Given Outcome Ri
0              7/4
2              55/24
14             35/12
The Buhlmann credibility factor after one experiment is 1/12.
Determine the values for the parameters a and b that minimize the expression:
Σ_{i=1}^{3} Pi (a + bRi - Ei)^2.
A. a = 1/12; b = 11/12    B. a = 1/12; b = 22/12    C. a = 11/12; b = 1/12
D. a = 22/12; b = 1/12    E. a = 11/12; b = 11/12

11.9 (4, 11/02, Q.7 & 2009 Sample Q.35) (2.5 points)
You are given the following information about a credibility model:
First Observation    Unconditional Probability    Bayesian Estimate of Second Observation
1                    1/3                          1.50
2                    1/3                          1.50
3                    1/3                          3.00
Determine the Bühlmann credibility estimate of the second observation, given that the first
observation is 1.
(A) 0.75    (B) 1.00    (C) 1.25    (D) 1.50    (E) 1.75


Solutions to Problems:
11.1. D. The Buhlmann Credibility is the slope of the least squares line fit to the Bayesian
Estimates. One needs to do a weighted regression with the weights equal to the a priori
probabilities; in this case one can just duplicate the point (3, 4) and perform an unweighted
regression. Thus the X values are: 0, 3, 3, 10 and the Y values are: 2, 4, 4, 6.
The slope is: {(1/n)ΣXiYi - ((1/n)ΣXi)((1/n)ΣYi)} / {(1/n)ΣXi^2 - ((1/n)ΣXi)^2} =
{21 - (4)(4)} / {29.5 - 4^2} = 5/13.5 = 0.370.
11.2. A. The line formed by the Buhlmann Credibility estimates is the weighted least squares line
to the Bayesian estimates, with the a priori probability of each outcome acting as the weights. Since
there are only two values, the Bayesian Estimates are on a straight line, so Buhlmann equals Bayes.
Given that the first observation is 4, the Bhlmann credibility estimate is 2.5.
Comment: Fitting the weighted regression: X = 1, 4. X̅ = ΣwiXi = 1.75. x = X - X̅ = -.75, 2.25.
Y = 1.5, 2.5. Y̅ = ΣwiYi = 1.75. y = Y - Y̅ = -.25, .75. w = .75, .25. Σwixiyi = .5625.
Σwixi^2 = 1.6875. slope = Σwixiyi / Σwixi^2 = .5625/1.6875 = 0.333.
Intercept = Y̅ - (slope)X̅ = 1.75 - (.333)(1.75) = 1.167. 1.167 + (4)(.333) = 2.50.
11.3. D. Since the Bayesian Estimates are in balance, ΣPiEi = 24 = the a priori mean.
The line formed by the Buhlmann Credibility estimates is the weighted least squares line to the
Bayesian estimates, with the a priori probability of each outcome acting as the weights.
The slope, b = Z = 1/4. The intercept, a = (1 - Z)(a priori mean) = (3/4)(24) = 18.
11.4. D. The line formed by the Buhlmann Credibility estimates is the weighted least squares line
to the Bayesian estimates, with the a priori probability of each outcome acting as the weights.
Since the a priori probabilities are equal we fit an unweighted regression.
X = 10, 20, 60. X̅ = 30. x = -20, -10, 30. Y = 18, 23, 49. Y̅ = 30. y = -12, -7, 19.
Σxiyi = 880. Σxi^2 = 1400. slope = Σxiyi / Σxi^2 = 880/1400 = .6286.
Intercept = Y̅ - (slope)X̅ = 30 - (.6286)(30) = 11.14.
Bhlmann credibility estimate of the second observation = 11.14 + (.6286)(first observation).
Given that the first observation is 60, the estimate is: 11.14 + (.6286)(60) = 48.9.
Comment: Similar to 4, 11/02, Q.7.


11.5. A. 1. False. They are sometimes equal (as in the Gamma-Poisson Conjugate Prior), but are
often unequal. See for example my multi-sided die example in prior sections.
2. True.
3. False. The Buhlmann Credibility estimate is a linear approximation to the result of Bayesian
Analysis. They can be on different sides of the hypothetical mean. For example in my multi-sided
die example in prior sections, the prior mean is 3, and if a 4 is observed the Buhlmann credibility
estimate is 3.17 while the Bayes Analysis estimate is 2.85.
11.6. C. The Buhlmann Credibility is the slope of the least squares line fit to the Bayesian
Estimates. One needs to do a weighted regression with the weights equal to the a priori
probabilities; in this case since the a priori probabilities are the same one can perform an unweighted
regression.
The X values are: 0, 3, 12 and the Y values are: 1, 6, 8. The slope is:
{(1/n)ΣXiYi - ((1/n)ΣXi)((1/n)ΣYi)} / {(1/n)ΣXi^2 - ((1/n)ΣXi)^2} = {38 - (5)(5)} / {51 - 5^2} = 0.5.
X          Y     XY     X^2
0          1     0      0
3          6     18     9
12         8     96     144
Average          38     51
Thus the Buhlmann Credibility is .50 and the new estimates are:
(observation)Z + (prior mean)(1 - Z) = (0, 3, 12)(.5) + (5)(1 - .5) = (0, 1.5, 6) + 2.5 =
(2.5, 4.0, 8.5).
Alternately, one can check whether the given choices are each of the form:
new estimate = (observation)Z + (prior mean)(1 - Z) =
prior mean + Z(observation - prior mean) = 5 + Z(observation - 5).
This will be so if Z = (new estimate - 5)/(observation - 5) is the same for the different
observations and corresponding estimates.
Observation   Est. A   Cal. Z   Est. B   Cal. Z   Est. C   Cal. Z   Est. D   Cal. Z   Est. E   Cal. Z
0             3        0.40     2.00     0.60     2.50     0.50     1.50     0.70     1.00     0.80
3             5        0.00     4.00     0.50     4.00     0.50     3.38     0.81     6.00     -0.50
12            10       0.71     10.00    0.71     8.50     0.50     9.00     0.57     8.00     0.43
Since the new estimates in choice C are the only ones in the desired form, we have eliminated all
the other choices. If some of the other choices were in the proper form one could compare to see
which one had the smallest squared error compared to the Bayesian Estimates. In the case of choice
C, the squared error is:
(1/3)(2.5 - 1)^2 + (1/3)(4 - 6)^2 + (1/3)(8.5 - 8)^2 = 2.167.
Comment: Note that the Bayesian Estimates are in balance; they average to the a priori overall
mean of 5: (1/3)(1) + (1/3)(6) + (1/3)(8) = 5.
The average of the Buhlmann Credibility Estimates is also 5. This eliminates choices A, B, and D.


11.7. E. 1. True. 2. False: we need only know the mean, VHM, and EPV; we need not know the
distribution. Consider for example a die-spinner example. 3. True.
11.8. D. Buhlmann Credibility is the least squares linear approximation to the Bayesian analysis
result. The given expression is the squared error of a linear estimate. Thus the values of a and b that
minimize the given expression correspond to the Buhlmann credibility estimate. In this case, the new
estimate using Buhlmann Credibility =
(prior mean)(1 - Z) + (observation)Z = 2(1 - 1/12) + (1/12)(observation) = 22/12 + (1/12)(observation).
Therefore a = 22/12 and b = 1/12. Alternately, one can minimize the given expression.
One takes the partial derivatives with respect to a and b and sets them equal to zero:
Σ 2Pi (a + bRi - Ei) = 0, and Σ 2Pi Ri (a + bRi - Ei) = 0.
Therefore, (2/3)(a + b(0) - 7/4) + (2/9)(a + b(2) - 55/24) + (1/9)(a + b(14) - 35/12) = 0
⇒ a + 2b = 2, and (2/9)(2)(a + b(2) - 55/24) + (1/9)(14)(a + b(14) - 35/12) = 0
⇒ 18a + 204b = 300/6 = 50 ⇒ 9a + 102b = 25. One can either solve these two simultaneous linear
equations by matrix methods or try the choices A through E.
Comment: Normally one would not be given the Buhlmann credibility factor as was the case here,
allowing the first method of solution, which does not use the information given on the values of the
Bayesian analysis estimates. Note that the Bayesian estimates balance to the a priori mean of 2:
(2/3)(7/4) + (2/9)(55/24) + (1/9)(35/12) = (126 + 55 + 35)/108 = 216/108 = 2.
11.9. C. The line formed by the Buhlmann Credibility estimates is the weighted least squares line
to the Bayesian estimates, with the a priori probability of each outcome acting as the weights. Since
the a priori probabilities are equal we fit an unweighted regression.
X = 1, 2, 3. X̅ = 2. x = X - X̅ = -1, 0, 1. Y = 1.5, 1.5, 3. Y̅ = 2. y = Y - Y̅ = -.5, -.5, 1.
Σxiyi = 1.5. Σxi^2 = 2. slope = Σxiyi / Σxi^2 = 1.5/2 = .75 = Z.
Intercept = Y̅ - (slope)X̅ = 2 - (.75)(2) = .5.
Bühlmann credibility estimate of the second observation = .5 + .75(first observation).
Given that the first observation is 1, the Bühlmann credibility estimate is: (.5) + (.75)(1) = 1.25.
Comment: The Bühlmann credibility estimate given 2 is 2; the estimate given 3 is 2.75.
The Bayesian Estimates average to 2, the overall a priori mean. Bayesian estimates are in balance.
The Bühlmann Estimates are also in balance; they also average to 2.


Section 12, Philbrick Target Shooting Example128


In An Examination of Credibility Concepts, by Stephen Philbrick there is an excellent target
shooting example that illustrates the ideas of Buhlmann Credibility.129
Assume there are four marksmen each shooting at his own target.
Each marksman's shots are assumed to be distributed around his target, marked by the letters A, B,
C, and D, with an expected mean equal to the location of his target.
If the targets are arranged as in Figure 1, the resulting shots of each marksman would tend to cluster
around his own target. The shots of each marksman have been distinguished by a different symbol.
So for example the shots of marksman B are shown as triangles. We see that in some cases one
would have a hard time deciding which marksman had made a particular shot if we did not have the
convenient labels.

Figure 1

128

While the specific target shooting example is not on the current syllabus, all the ideas it illustrates are on the
syllabus. In addition, most of the problems in this section would be legitimate questions for your exam.
Therefore, many of you will benefit from going over this section and doing at least some of the problems.
129
In the 1981 Proceedings of the CAS. It is on the syllabus of the Group and Health - Design and Pricing Exam of
the SOA. In my opinion this is the single best paper ever written on the subject of credibility. More actuaries have
gotten a good intuitive understanding of credibility by reading this paper, than from any other source.


The point E represents the average of the four targets A, B, C, and D. Thus E is the overall mean.130
If we did not know which marksman were shooting we would estimate that the shot would be at E;
the a priori estimate is E.
Once we observe a shot from an unknown marksmen131, we could be asked to estimate the location
of the next shot from the same marksman. Using Buhlmann Credibility our estimate would be
between the observation and the a priori mean of E. The larger the Credibility assigned to the
observation, the closer the estimate is to the observation. The smaller the credibility assigned to the
data, the closer the estimate is to E.
There are a number of features of this target shooting example that control how much Buhlmann
Credibility is assigned to our observation. We have assumed that the marksmen are not perfect;
they do not always hit their target. The amount of spread of their shots around their targets can be
measured by the variance. The average spread over the marksmen is the Expected Value of the
Process Variance (EPV). The better the marksmen, the smaller the EPV and the more tightly
clustered around the targets the shots will be.
The worse the marksmen, the larger the EPV and the less tightly spread. The better the marksmen,
the more information is contained in a shot. The worse the marksmen, the more random noise
contained in the observation of the location of a shot. Thus when the marksmen are good, we expect
to give more weight to an observation (all other things being equal) than when the marksmen are
bad. Thus the better the marksmen, the higher the credibility:

Marksmen   Clustering of Shots   Expected Value of the Process Variance   Amount of Noise   Credibility Assigned to an Observation
Good       Tight                 Small                                    Low               Larger
Bad        Loose                 Large                                    High              Smaller

The smaller the Expected Value of the Process Variance the larger the credibility. This is illustrated
by Figure 2. It is assumed in Figure 2 that each marksman is better132 than was the case in Figure 1.
The EPV is smaller and we assign more credibility to the observation. This makes sense, since in
Figure 2 it is a lot easier to tell which marksmen is likely to have made a particular shot based solely
on its location.

130

In this example, each of the marksmen is equally likely. Thus we weight each target equally. As was seen
previously, in general one would take a weighted average using the not necessarily equal a priori probabilities as the
weights.
131
Thus the shot does not have one of the convenient labels attached to it. This is analogous to the situation in
Auto Insurance, where the drivers in a classification are presumed not to be wearing little labels telling us who are the
safer and less safe drivers in the class. We rely on the observed experience to help estimate that.
132
Alternately the marksmen could be shooting from closer to the targets. See Part 4B, 5/93, Q.4.


Figure 2

Another feature that determines how much credibility to give an observation is how far apart the four
targets are placed. As we move the targets further apart (all other things being equal) it is easier to
distinguish the shots of the different marksmen. Each target is a hypothetical mean of one of the
marksmen shots. The spread of the targets can be quantified as the Variance of the Hypothetical
Means.

Targets      Variance of the Hypothetical Means   Information Content   Credibility Assigned to an Observation
Closer       Small                                Lower                 Smaller
Far Apart    Large                                Higher                Larger

As illustrated in Figure 3, the further apart the targets the more credibility we would assign to our
observation. The larger the VHM the larger the credibility. It is easier to distinguish which marksmen
made a shot based solely on its location in Figure 3 than in Figure 1.


Figure 3

The third feature that one can vary is the number of shots observed from the same unknown
marksman. The more shots we observe, the more information we have and thus the more credibility
we would assign to the average of the observations.
Each of the three features discussed follows from the formula for Buhlmann Credibility
Z = N / (N + K) = N(VHM) / {N(VHM) + EPV}. Thus as the EPV increases, Z decreases.
As VHM increases, Z increases. As N increases, Z increases.
Feature of Target Shooting Example   Mathematical Quantification   Buhlmann Credibility
Better Marksmen                      Smaller EPV                   Larger
Targets Further Apart                Larger VHM                    Larger
More Shots                           Larger N                      Larger


Expected Value of the Process Variance versus Variance of the Hypothetical Means:
There are two separate reasons why the observed shots vary. First, the marksmen are not perfect.
In other words the Expected Value of the Process Variance is positive. Even if all the targets were in
the same place, there would still be a variance in the observed results. This component of the total
variance due to the imperfection of the marksmen is quantified by the EPV.
Second, the targets are spread apart. In other words, the Variance of the Hypothetical Means is
positive. Even if the every marksman were perfect, there would still be a variance in the observed
results, when the marksmen shoot at different targets. This component of the total variance due to
the spread of the targets is quantified by the VHM.
One needs to understand the distinction between these two sources of variance in the observed
results. Also one has to know that the total variance of the observed shots is a sum of these two
components: Total Variance = EPV + VHM.
Buhlmann Credibility is a Relative Concept:
In general, when trying to predict the future by using the past, one tries to separate the signal (useful
information) from the noise (random fluctuation).
In Philbricks target shooting example, we are comparing two estimates of the location of the next
shot from the same marksman:
1. average of the shots, average of the observations,
2. average of the targets, the a priori mean.
Z measures the usefulness of one estimator relative to the other estimator.
When the marksmen are better, there is less random fluctuation in the shots and the average of the
observations is a better estimate, relative to the a priori mean. In this case, the weight Z, applied to
the average of the observations is larger, while the weight, 1-Z, applied to the a priori mean is
smaller.
As the targets get closer together, there is less variation of the hypothetical means, and the a priori
mean becomes a better estimate, relative to the average of the observations. In this case, the
weight applied to the average of the observations, Z, is smaller, while that applied to the a priori
mean, 1-Z, is larger.
The Buhlmann Credibility measures the usefulness of one estimator, the average of the
observations, relative to another estimator, the a priori mean. If Z = 50%, then the two estimators are
equally good or equally bad. Buhlmann credibility is a relative measure of the value of the
information contained in the observation versus that in the a priori mean.


A One Dimension Target Shooting Example:


Here is a one-dimensional example of Philbrick's target shooting model, such that the marksmen
only miss to the left or right. Assume:
There are two marksmen.
The targets for the marksmen are at the points on the number line: 20 and 30.
The distribution of shots from each marksman follows a normal distribution with
mean equal to his target value and with standard deviation of 12.
Here are 20 simulated shots from each of the marksmen, with the shots labeled as to which
marksman it was from:

Assume instead we had unlabeled shots from a single unknown marksman. By observing where an
unknown marksman's shot(s) hit the number line, you want to predict the location of his next shot.
We can use either Bayesian Analysis or Buhlmann Credibility.
To use Buhlmann Credibility we need to calculate the Expected Value of the Process Variance and
the Variance of the Hypothetical Means. The process variance for every marksman is assumed to
be the same and equal to 12^2 = 144. Thus the EPV = 144.


The overall mean is 25 and the VHM is: {(20 - 25)^2 + (30 - 25)^2}/2 = 25. Thus the Buhlmann
Credibility parameter is K = EPV / VHM = 144/25 = 5.76.
Exercise: Assume a single shot at 18 from an unknown marksman. Use Buhlmann Credibility to
predict the location of the next shot from the same marksman.
[Solution: The credibility of a single observation is Z = 1/(1+5.76) = 14.8%. The Buhlmann
Credibility estimate of the next shot is: (18)(14.8%) + (25)(85.2%) = 24.0.]
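This calculation is short enough to verify directly; a minimal Python sketch:

    EPV = 12**2                                  # both marksmen have standard deviation 12
    VHM = ((20 - 25)**2 + (30 - 25)**2) / 2      # 25
    K = EPV / VHM                                # 5.76
    Z = 1 / (1 + K)
    print(K, round(Z, 3), round(Z*18 + (1 - Z)*25, 1))   # 5.76 0.148 24.0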
In order to apply Bayesian Analysis, one must figure out the likelihood of the observation of 18
given that the shot came from each marksman. In order to do so, one uses the assumed
Normal Distribution: f(x) = exp[-(x - µ)^2 / (2σ^2)] / {σ sqrt(2π)}, -∞ < x < ∞.
For example, the first marksman has the probability density function:
f(x) = exp[-(x - 20)^2 / 288] / {12 sqrt(2π)}.
Thus the density function at 18 for the first marksman is
f(18) = exp[-(18 - 20)^2 / 288] / {12 sqrt(2π)} = 0.0328.
Similarly the chance of observing 18 if the shot came from the other marksman is 0.0202.
One then computes probability weights as the product of the (equal) a priori chances and the
conditional likelihoods of the observation.
One converts these to probabilities by dividing by their sum.
The resulting posterior chance that it was marksman number 1 is: 0.01639 / 0.02648 = 61.9%. Then
the posterior estimate of the next shot is a weighted average of the means using the posterior
probabilities: (61.9%)(20) + (38.1%)(30) = 23.8.
This whole calculation can be arranged in a spreadsheet as follows:
Marksman   Mean   Standard    A Priori Chance of       Chance of      Prob. Weight =              Posterior Chance of this Type of    Mean
                  Deviation   this Type of Marksman    Observing 18   Product of Columns D & E    Marksman = Col. F / (Sum of Col. F)
1          20     12          0.500                    0.0328         0.01639                     61.92%                              20
2          30     12          0.500                    0.0202         0.01008                     38.08%                              30
Overall                                                               0.02648                     1.000                               23.81
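The spreadsheet can be reproduced in a few lines; a minimal Python sketch of the Bayesian update:

    from math import exp, pi, sqrt
    norm_pdf = lambda x, m, s: exp(-(x - m)**2 / (2*s*s)) / (s * sqrt(2*pi))
    like = [norm_pdf(18, 20, 12), norm_pdf(18, 30, 12)]   # 0.0328 and 0.0202
    weight = [0.5 * L for L in like]                      # equal a priori chances
    post = [w / sum(weight) for w in weight]              # 61.92% and 38.08%
    print([round(p, 4) for p in post], round(post[0]*20 + post[1]*30, 2))   # [0.6192, 0.3808] 23.81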


More Shots:
What if instead of a single shot at 18 one observed three shots at 18, 26 and 4 from the same
unknown marksman? Then the chance of the observation is now the product of the likelihoods at 18,
26, and 4. Otherwise the Bayesian Analysis proceeds as before and the estimate of the next shot
is 21.3:
Marksman   Mean   Std. Dev.   A Priori Chance   Chance of      Chance of      Chance of     Chance of      Probability   Posterior Chance   Mean
                                                Observing 18   Observing 26   Observing 4   Observation    Weight        of Marksman
1          20     12          0.50              0.0328         0.0293         0.0137        1.3147e-5      6.57e-6       86.70%             20
2          30     12          0.50              0.0202         0.0314         0.0032        2.0162e-6      1.01e-6       13.30%             30
Overall                                                                                                    7.58e-6       1.000              21.33

As calculated above, K = 5.76. The credibility of 3 observations is: Z = 3/(3+5.76) = 34.2%.


The larger number of observations has increased the credibility.
The average of the observations is: (18 + 26 + 4)/3 = 16.
The a priori mean is 25.
Thus the Buhlmann Credibility estimate of the next shot is: (16)(34.2%) + (25)(65.8%) = 21.9.


Moving the Targets:


Assume that the targets were further apart:
There are two marksmen.
The targets for the marksmen are at the points on the number line: 10 and 40.
The distribution of shots from each marksman follows a normal distribution with
mean equal to his target value and with standard deviation of 12.
Here are 20 simulated shots from each of the marksmen:

Each shot has more informational content than in the previous case. Without the labels, here one
would be more likely to be able to correctly determine the location of which marksman made a
particular shot, than when the targets were closer together.
Exercise: Assume we observe three shots from an unknown marksman at: 38, 46 and 24.
What is the Buhlmann Credibility estimate of the next shot from the same marksman?
[Solution: The EPV is still 144 while the VHM is now: 15^2 = 225.
K = EPV/VHM = 144/225 = .64. Z = 3/(3+.64) = 82.4%.
The average of the observations is: (38+46+24)/3 = 36.
The a priori mean is 25.
Thus the Buhlmann Credibility estimate of the next shot is:
(36)(82.4%) + (25)(17.6%) = 34.1.]


The larger VHM has increased the credibility.133


Exercise: Assume we observe three shots from an unknown marksman at: 38, 46 and 24.
Use Bayesian Analysis to estimate the location of the next shot from the same marksman.
Marksman   Mean   Std. Dev.   A Priori Chance   Chance of      Chance of      Chance of      Chance of      Probability   Posterior Chance   Mean
                                                Observing 38   Observing 46   Observing 24   Observation    Weight        of Marksman
1          10     12          0.50              0.0022         0.0004         0.0168         1.358e-8       6.792e-9      0.10%              10
2          40     12          0.50              0.0328         0.0293         0.0137         1.315e-5       6.574e-6      99.90%             40
Overall                                                                                                     6.580e-6      1.000              39.97

Altering the Skill of the Marksmen:


Return to the original situation, but now assume that the marksmen are more skilled:134
There are two marksmen.
The targets for the marksmen are at the points on the number line: 20 and 30.
The distribution of shots from each marksman follows a normal distribution with
mean equal to his target value and with standard deviation of 3.
With a smaller process variance, each shot contains more information about which marksman
produced it. Here we can more easily infer which marksman is likely to have made a shot than when
the marksmen were less skilled. The smaller EPV has increased the credibility.135

133

If instead one had moved the targets closer together then the credibility assigned to a single shot would have
been less. A smaller VHM leads to less credibility.
134
Alternately assume the marksmen are shooting from closer to the targets.
135
If instead one had less skilled marksmen then the credibility assigned to a single shot would have been less.
A larger EPV leads to less credibility.


Here are 20 simulated shots from each of the marksmen:

Exercise: Assume we observe three shots from an unknown marksman at: 18, 26 and 4.
What is the Buhlmann Credibility estimate of the next shot from the same marksman?
[Solution: The EPV is 3^2 = 9 while the VHM is 5^2 = 25. K = EPV/VHM = 9/25 = .36.
Z= 3/3.36 = 89.3%, more than in the original example.
The average of the observations is: (18+26+4)/3 = 16. The a priori mean is 25.
The Buhlmann Credibility estimate of the next shot is: (16)(89.3%) + (25)(10.7%) = 17.0.]
Exercise: Assume we observe three shots from an unknown marksman at: 18, 26 and 4.
Use Bayesian Analysis to estimate the location of the next shot from the same marksman.
Marksman   Mean   Std. Dev.   A Priori Chance   Chance of      Chance of      Chance of     Chance of      Probability   Posterior Chance   Mean
                                                Observing 18   Observing 26   Observing 4   Observation    Weight        of Marksman
1          20     3           0.50              0.1065         0.0180         0.0000        1.7e-10        8.48e-11      100.00%            20
2          30     3           0.50              0.0000         0.0547         0.0000        1.588e-23      7.94e-24      0.00%              30
Overall                                                                                                    8.48e-11      1.000              20.00


Limiting Situations and Buhlmann Credibility:


As the number of observation approaches infinity, the credibility approaches one.
In the target shooting example, as the number of shots approaches infinity, our Buhlmann Credibility
estimate approaches the mean of the observations.
On the other hand, if we have no observations, then the estimate is the a priori mean.
We give the a priori mean a weight of 1, so 1-Z = 1 or Z = 0.
Buhlmann Credibility is given by Z = N / (N + K). In the usual situations where one has a finite
number of observations, 0 < N < ∞, one will have 0 < Z < 1 provided 0 < K < ∞.
The Buhlmann Credibility is only zero or unity in unusual situations.
The Buhlmann Credibility parameter K = EPV / VHM. So K = 0 if EPV = 0 or VHM = ∞.
On the other hand K is infinite if EPV = ∞ or VHM = 0.
The Expected Value of the Process Variance is zero only if one has certainty of
results.136 In the case of the Philbrick Target Shooting Example, if all the marksmen were absolutely
perfect, then the expected value of the process variance would be zero. In that situation we assign
the observation a credibility of unity; our new estimate is the observation.
The Variance of the Hypothetical Means is infinite if one has little or no knowledge and therefore has
a large variation in hypotheses137. In the case of the Philbrick Target Shooting Example, as the
targets get further and further apart, the variance of the hypothetical means approaches infinity. We
assign the observations more and more weight as the targets get further apart. If one target were in
Alaska, another in California, another in Maine and the fourth in Florida, we would give the
observation virtually 100% credibility. In the limit, our new estimate is the observation; the credibility
is one.
However, in most applications of Buhlmann Credibility the Expected Value of the Process Variance
is positive and the Variance of the Hypothetical Means is finite, so that K > 0.
The Expected Value of the Process Variance can be infinite only if the process variance is infinite for
at least one of the types of risks. If in an example involving claim severity, one assumed a Pareto
distribution with α ≤ 2, then one would have an infinite process variance. In the Philbrick Target Shooting
example, a marksman would have to be infinitely terrible in order to have an infinite process
variance.
136

For example, one could assume that it is certain that the sun will rise tomorrow; there has been no variation of
results, the sun has risen every day of which you are aware.
137
For example, an ancient Greek philosopher might have hypothesized that the universe was more than 3000 years
old with all such ages equally likely.


As the marksmen get worse and worse, we give the observation less and less weight. In the limit
where the location of the shot is independent of the location of the target we give the observation no
weight; the credibility is zero.
The Variance of the Hypothetical Means is zero only if all the types of risks have the same mean.
For example, in the Philbrick Target Shooting example, if all the targets are at the same location (or
alternately each of the marksmen is shooting at the same target) then the VHM = 0. As the targets
get closer and closer to each other, we give the observation less and less weight. In the limit we
give the observation no weight; the credibility is zero. In limit all the weight is given to the single
target.
However, in the usual applications of Buhlmann Credibility there is variation in the hypotheses and
there is a finite expected value of process variance and therefore K is finite.
Assuming 0 < K < and 0 < N < , then 0 < Z < 1. Thus in ordinary circumstances the
Buhlmann Credibility is strictly between zero and one.
Buhlmann Credibility Parameter = K = EPV/VHM.
Buhlmann Credibility = Z = N / (N + K).
EPV → 0 ⇒ K → 0 ⇒ Z → 1.
EPV → ∞ ⇒ K → ∞ ⇒ Z → 0.
VHM → 0 ⇒ K → ∞ ⇒ Z → 0.
VHM → ∞ ⇒ K → 0 ⇒ Z → 1.
N → ∞ ⇒ Z → 1.
Analysis of Variance:
There are two distinct reasons why the shots from a series of randomly selected marksmen vary:
different targets and the imperfection of the marksmen.
Marksmen Perfect ⇒ EPV = 0 ⇒ shots vary only due to separate targets.
Targets the Same ⇒ VHM = 0 ⇒ shots vary only due to the imperfection of the marksmen.
These two effects add: Total Variance = EPV + VHM.


Bayesian Analysis:
The Philbrick Target Shooting example is also useful for illustrating ideas from Bayesian Analysis.
If we observe a shot from an unknown marksman, the new estimate using Bayesian Analysis is a
weighted average of the locations of the targets.138 Assuming the a priori probabilities of each
marksman are equal, the weight applied to each target is proportional to the chance of the shot
having come from the corresponding marksman.139
Consider the limiting case where the targets get further and further apart; i.e., the Variance of the
Hypothetical Means approaches infinity. As the targets get further and further apart the chance of
observing a shot goes quickly to zero for all but the closest target.140 Thus for the Bayesian Analysis
approach, (assuming the a priori chance of the closest target is greater than zero), as the targets get
further and further apart, the probability weight of the closest target is much larger than any of the
others. Thus virtually all of the weight is given to the mean of the closest target, and the posterior
estimate approaches the mean of that closest target as the VHM → ∞.
This differs from the Buhlmann Credibility approach. Assuming all else stays constant, as the
Variance of the Hypothetical Means approaches infinity, the Buhlmann Credibility Parameter
K = EPV / VHM approaches zero. Therefore, Z = 1/(1+K) 1, and the Buhlmann Credibility
estimate approaches the observed shot.
If the Variance of the Hypothetical Means approaches zero, the hypothetical means get closer and
closer. The Bayes Analysis estimate is a weighted average of these hypothetical means, which all
approach the overall mean as the VHM → 0.
If the EPV approaches zero, then all the process variances approach zero. Therefore, the chance of
observing a shot goes quickly to zero for all but the closest target. Thus for the Bayesian Analysis
approach, (assuming the a priori chance of the closest target is greater than zero), as the marksmen
get better, the probability weight of the closest target is much larger than any of the others. Thus
virtually all of the weight is given to the mean of the closest target, and the posterior estimate
approaches the mean of that closest target as the EPV → 0. It should be noted that as the
EPV → 0, the expected distance between the shot and the closest target also goes to zero. Thus
the posterior estimate also approaches the location of the shot as well as the nearest target.
138 In general the Bayesian Estimate is a weighted average of the hypothetical means.
139 In general, the weight applied to each hypothetical mean is proportional to the product of the chance of the
observation given that we have the corresponding type of risk times the a priori chance of that type of risk.
140 See for example, the one dimensional example above. When the targets were moved further apart, the posterior
probabilities for targets other than the closest got small. Move the targets even further apart, and the posterior
probability for the closest will quickly go to one while the others go to zero.


Problems:
Use the following information in the next four questions:
There are three marksmen, each of whose shots are Normally Distributed (in one dimension) with
means and standard deviations:
Risk    Mean    Standard Deviation
A       100     60
B       200     120
C       300     240
12.1 (2 points) A marksman is chosen at random.
What is the Buhlmann Credibility Parameter?
A. Less than 1
B. At least 1 but less than 2
C. At least 2 but less than 3
D. At least 3 but less than 4
E. At least 4
12.2 (1 point) A marksman is chosen at random. You observe two shots at 90 and 150.
Using Buhlmann Credibility, estimate the next shot from the same marksman.
A. Less than 130
B. At least 130 but less than 140
C. At least 140 but less than 150
D. At least 150 but less than 160
E. At least 160
12.3 (3 points) A marksman is chosen at random.
If you observe two shots at 90 and 150, what is the chance that it was marksman B?
A. Less than 10%
B. At least 10% but less than 15%
C. At least 15% but less than 20%
D. At least 20% but less than 25%
E. At least 25%
12.4 (1 point) A marksman is chosen at random. If you observe two shots at 90 and 150, what is
the Bayesian Estimate of the next shot from the same marksman?
A. Less than 130
B. At least 130 but less than 140
C. At least 140 but less than 150
D. At least 150 but less than 160
E. At least 160


12.5 (2 points) You are given the following:


Four shooters are to shoot at a target some distance away that has the following design:
[Diagram of the target, divided into Areas W, X, Y, and Z, omitted.]

Shooter A hits Area W with probability 1/2 and Area X with probability 1/2.
Shooter B hits Area X with probability 1/2 and Area Y with probability 1/2.
Shooter C hits Area Y with probability 1/2 and Area Z with probability 1/2.
Shooter D hits Area Z with probability 1/2 and Area W with probability 1/2.
Three of the four shooters are randomly selected, and each of the three selected shooters fires one
shot. Two shots land in Area X, and one shot lands in Area Z (not necessarily in that order).
The remaining shooter (who was not among the three previously selected) then fires a shot.
Determine the probability that this shot lands in Area Z.
A. 1/4
B. 1/2
C. 2/3
D. 3/4
E. 1

Use the following information to answer each of the next two questions:
Assume you have two shooters, each of whose shots is given by a (one dimensional) Normal
distribution:
Shooter    Mean    Variance
A          +10     9
B          -10     225
Assume a priori each shooter is equally likely. You observe a single shot at +20.
12.6 (2 points) Use Bayes Theorem to estimate the location of the next shot.
A. less than 0
B. at least 0 but less than 2
C. at least 4 but less than 6
D. at least 6 but less than 8
E. at least 8
12.7 (2 points) Use Buhlmann Credibility to estimate the location of the next shot.
A. less than 0
B. at least 0 but less than 2
C. at least 4 but less than 6
D. at least 6 but less than 8
E. at least 8


Use the following information for the next 5 questions:

There are three shooters P, Q, and R.


Each shooter is to shoot at his target, P, Q, or R, some distance away.
The shots of each shooter are distributed over a circle of radius 2 centered at his targeted point.
The probability density is given by f(r,θ) = 1/(4πr),
where r is the distance from his targeted point,
and θ is the angle measured counterclockwise from the vertical.

The targeted points P, Q, and R are at the vertices of an equilateral triangle with sides of length 1.
P is at (0,0), Q is at (1,0), and R is at (1/2, √3/2).
One of the three shooters is randomly selected, and that shooter fires a shot at his targeted point.
The shot lands at the point S (-0.4, 0). This same shooter then fires a second shot (at the same
point targeted in the first shot.)
12.8 (3 points) Determine the Bayesian analysis estimate of the location of the second shot.
A. (0.140, 0.153) B. (0.278, 0.173) C. (0.320, 0.231) D. (0.500, 0.866)
E. None of the A, B, C, or D
12.9 (2 points) What is the Expected Value of the Process Variance?
A. 2/3
B. 1
C. 4/3
D. 5/3
E. 2
12.10 (1 point) What is the Variance of the Hypothetical Means?
A. 1/4
B. 1/3
C. 2/5
D. 1/2

E. 3/5

12.11 (2 points) Determine the Buhlmann Credibility estimate of the location of the second shot.
A. (0.320, 0.231)
B. (0.140, 0.173)
C. ( -0.040, 0.115)
D. (-0.220, 0.058)
E. None of the A, B, C, or D
12.12 (2 points) The second shot is observed to be (-0.8, -0.9). The third shot is observed to be
(-0.3, -0.7). Using the information provided by the location of the first three shots, determine the
Buhlmann Credibility estimate of the location of the fourth shot from the same shooter.
A. (0.320, 0.231)
B. (0.140, 0.173)
C. ( -0.040, 0.115)
D. (-0.220, 0.058)
E. None of the A, B, C, or D


Use the following information for the next two questions:


There are four marksmen.
The targets for the marksmen are at the points on the number line:
10, 20, 30, and 40.
The marksmen only miss to the left or right.
The distribution of shots from each marksman follows a normal distribution with
mean equal to his target value and with standard deviation of 15.
Three shots from an unknown marksman are observed at 22, 26, and 14.
Normal Distribution: f(x) = exp[-(x - μ)² / (2σ²)] / {σ√(2π)}, -∞ < x < ∞, mean = μ, variance = σ².

12.13 (3 points) Use Buhlmann Credibility to predict the location of the next shot from the same
marksman.
A. less than 20
B. at least 20 but less than 21
C. at least 21 but less than 22
D. at least 22 but less than 23
E. at least 23
12.14 (4 points) Use Bayesian Analysis to predict the location of the next shot from the same
marksman.
A. less than 20
B. at least 20 but less than 21
C. at least 21 but less than 22
D. at least 22 but less than 23
E. at least 23


Use the following information for the next 4 questions:


With equal probability, one of two people will be shooting at one of two targets.
Each person aims for a different target. You observe one shot (but not the shooter).
Assume each person's shots are normally distributed around their target, with standard deviation 1.
Define a coordinate system with one target at +m, and the other at -m.
The observed shot was at x.
Normal Distribution: f(x) = exp[-(x - μ)² / (2σ²)] / {σ√(2π)}, -∞ < x < ∞, mean = μ, variance = σ².

12.15 (4, 5/83, Q.47a) (2 points) Using Buhlmann Credibility, estimate where the next shot by
the same person will appear.
A. x / (1 + m²)
B. xm / (1 + m²)
C. (m - x) / (1 + m²)
D. xm² / (1 + m²)
E. None of A, B, C, or D.

12.16 (4, 5/83, Q.47c) (1 point) For a fixed observed shot x, what happens to your estimate in
the prior question as the distance between the targets, 2m, gets very large?
A. It approaches zero.
B. It approaches m if x > 0, and -m if x < 0.
C. It approaches x, the observed shot
D. None of A, B, or C
E. Can not be determined
12.17 (4, 5/83, Q.47b) (2 points) Using Bayes' Theorem, estimate where the next shot by the
same person will appear.
A. m {exp(mx) + exp(-mx)} / {exp(mx) - exp(-mx)}
B. m {exp(mx) - exp(-mx)} / {exp(mx) + exp(-mx)}
C. x {exp(mx) + exp(-mx)} / {exp(mx) - exp(-mx)}
D. x {exp(mx) - exp(-mx)} / {exp(mx) + exp(-mx)}
E. None of the above.


12.18 (4, 5/83, Q.47c) (1 point) For a fixed observed shot x ≠ 0, what happens to your estimate
in the prior question as the distance between the targets, 2m, gets very large?
A. It approaches zero.
B. It approaches m if x > 0, and -m if x < 0.
C. It approaches x, the observed shot
D. None of A, B, or C
E. Can not be determined


12.19 (4, 5/88, Q.36) (1 point) In reference to Philbrick's gun shot example, which of the following
statements are correct?
1. The variance of the hypothetical means increases as the relative distance between
the means increases.
2. The credibility of a single observation will be increased as
the variance of the hypothetical means increases.
3. Using a Bayesian credibility approach, the best estimate of the location of a second shot
after observing a single shot is somewhere on a line connecting the mean of all of
the clusters and the mean of the cluster to which the observed shot is closest.
A. 1
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3
12.20 (4, 5/90, Q.27) (1 point) Which of the following will increase the credibility of your body of
data?
1. A larger number of observations.
2. A smaller process variance.
3. A smaller variance of the hypothetical means.
A. 1
B. 2
C. 1, 2
D. 1, 3
E. 1, 2, 3
12.21 (4B, 5/92, Q.13) (2 points) The following is based upon Philbrick's target shooting model
with marksmen a, b, c and d each shooting at targets with mean target hits A, B, C and D,
respectively, and overall mean E.
Given the observation of a single shot X without knowing which marksman fired the shot, which of the
following are true concerning the prediction of the same marksman's next shot, Y?
1. If Y is predicted using Bayesian analysis, then Y = F where F is the revised mean of
the marksmen determined using the posterior probabilities that shot X was fired by
each of the marksmen a, b, c, d.
2. The Buhlmann credibility estimate of Y is equivalent to a linear interpolation of the points X and E
where the point X is given weight Z= N/(M+N) and E is given weight 1-Z,
M is the expected variance of the marksmen's shots and N is the variance of the mean shots
of the marksmen.
3. If Y is predicted using Bayesian analysis, then it is possible that Y is farther in absolute distance
from E than both the observed shot X and the Buhlmann credibility estimate for Y.
A. 1 only
B. 1 and 2 only
C. 1 and 3 only
D. 2 and 3 only
E. 1, 2, and 3


Use the following information for the next two questions:


Consider a one-dimensional example of Philbrick's target shooting model such that the marksmen
only miss to the left or right.
There are four marksmen.
Each marksman's target is initially 100 feet from him.
The initial targets for the marksmen are at the points 2, 4, 6, and 8 on a number line
(measured in one-foot increments).
The accuracy of each marksman follows a normal distribution with mean equal to his
target value and with standard deviation directly proportional to the distance from
the target. At a distance of 100 feet from the target, the standard deviation is 3 feet.
By observing where an unknown marksman's shot hits the number line, you want to
predict the location of his next shot.
12.22 (4B, 11/93, Q.3) (1 point) Determine the Buhlmann credibility assigned to a single shot of a
randomly selected marksman.
A. Less than 0.10
B. At least 0.10, but less than 0.20
C. At least 0.20, but less than 0.30
D. At least 0.30, but less than 0.40
E. At least 0.40
12.23 (4B, 11/93, Q.4) (3 points) Which of the following will increase Buhlmann credibility the
most?
A. Revise targets to 0, 4, 8, and 12.
B. Move marksmen to 60 feet from targets.
C. Revise targets to 2, 2, 10, and 10.
D. Increase number of observations from same marksman to 3.
E. Move two marksmen to 50 feet from targets and increase number of observations from
same selected marksman to 2.
12.24 (4B, 11/94, Q.14) (1 point) Which of the following statements are true?
1. As the process variance goes to zero, the credibility associated with
the current observations goes to one.
2. As the variance of the hypothetical means increases, the credibility associated with
the current observations will increase.
3. As the variance of the hypothetical means increases,
estimates produced by the Buhlmann credibility method
approach those produced by a pure Bayesian analysis method.
A. 1
B. 3
C. 1, 2
D. 2, 3
E. 1, 2, 3


12.25 (4B, 11/94, Q.23) (2 points) Two marksmen shoot at a target. For each marksman, point
values are assigned based upon the location of each shot. The three possible locations and point
values are:
Location                      Point Value
Hit center of target          50 points
Hit target, but not center    10 points
Miss target completely        0 points
Probabilities for the shot locations for each marksman are:
Marksman    Probability of      Probability of Hitting      Probability of
            Hitting Center      Target, But Not Center      Missing Target
A           0.01                0.09                        0.90
B           0.20                0.45                        0.35
A marksman is randomly selected and his shots are observed. Determine the expected score of the
21st shot if the first 20 shots all missed the target.
A. 0.00
B. 1.40
C. 7.95
D. 14.50
E. Cannot be determined from the given information.
12.26 (4B, 5/95, Q.25) (1 point) Philbrick uses a target shooting example to help explain
credibility. For this problem, consider the limiting case where the variance of the hypothetical means
approaches infinity. Assume that the location of the shot, the mean of the closest cluster to the shot,
and the population mean are all known. Match each technique with its resulting best estimate of the
location of the next shot from the same shooter.
Technique                        Estimate of Location of Next Shot
B = Pure Bayesian approach       1 = Location of the shot
C = Buhlmann credibility         2 = Mean of the closest cluster
                                 3 = Mean of the population
A. B with 1, C with 1
B. B with 1, C with 2
C. B with 2, C with 1
D. B with 2, C with 2
E. B with 2, C with 3


Use the following information for the next two questions:


A shooter is to shoot at one of three points, X, Y, or Z, on a target some distance away.
The shots of the shooter are uniformly distributed over a circle of radius 1 centered
at the targeted point.
X, Y, and Z are at the vertices of an equilateral triangle with sides of length 1.

G is the point equidistant from X, Y, and Z at the center of the triangle.

M is the point halfway between X and Y on the line segment joining X and Y.
One of the three points, X, Y, or Z, is randomly selected, and the shooter fires a shot at this point.
The shot lands at the point S, halfway between X and M on the line segment joining X and M.
The shooter then fires a second shot at the same point targeted in the first shot.
12.27 (4B, 11/96, Q.17) (2 points) Determine the Bayesian analysis estimate of the location of
the second shot.
A. X
B. G
C. M
D. S
E. A point other than X, G, M, or S
12.28 (4B, 11/96, Q.18) (2 points) Determine the Buhlmann credibility estimate of the location of
the second shot.
A. X
B. G
C. M
D. S
E. A point other than X, G, M, or S


Use the following information for the next two questions:


A shooter is to shoot at one of three points, X, Y, or Z, on a target some distance away.
The shots of the shooter are uniformly distributed inside a circle of radius 2/3 that is
centered at the targeted point.
X, Y, and Z are at the vertices of an equilateral triangle with sides of length 1.
G is the point equidistant from X, Y, and Z at the center of the triangle.
M is the point halfway between X and Y on the line segment joining X and Y.
N is the point halfway between X and Z on the line segment joining X and Z.
P is the point halfway between M and N on the line segment joining M and N.
One of the three points, X, Y, or Z, is randomly selected, and the shooter fires a shot at this point.
The shot lands at the point M. The shooter then fires a second shot at the same point targeted in the
first shot.
12.29 (4B, 5/97, Q.14) (2 points) Determine the Bayesian analysis estimate of the location of the
second shot.
A. X
B. Y
C. Z
D. G
E. M
12.30 (4B, 5/97, Q.15) (2 points) The second shot lands at the point N.
The shooter then fires a third shot at the same point targeted in the first two shots.
Determine the Bayesian analysis estimate of the location of the third shot.
A. X
B. G
C. M
D. N
E. P
12.31 (4B, 11/97, Q.14) (2 points) You are given the following:
Four shooters are to shoot at a target some distance away that has the following design:
[Diagram of the target, divided into Areas W, X, Y, and Z, omitted.]
Shooter A hits Area W with probability 1/2 and Area X with probability 1/2.
Shooter B hits Area X with probability 1/2 and Area Y with probability 1/2.
Shooter C hits Area Y with probability 1/2 and Area Z with probability 1/2.
Shooter D hits Area Z with probability 1/2 and Area W with probability 1/2.
Three of the four shooters are randomly selected, and each of the three selected shooters fires one
shot. One shot lands in Area W, one shot lands in Area X, and one shot lands in Area Y (not
necessarily in that order). The remaining shooter (who was not among the three previously selected)
then fires a shot. Determine the probability that this shot lands in Area Z.
A. 1/4
B. 1/2
C. 2/3
D. 3/4
E. 1


12.32 (4B, 5/98 Q.12) (1 point) You are given the following:
Six shooters are to shoot at a target some distance away that has the following design:
[Diagram of the target, divided into Areas W, X, Y, and Z, omitted.]
Shooter A hits Area W with probability 1/2 and Area X with probability 1/2.
Shooter B hits Area W with probability 1/2 and Area Y with probability 1/2.
Shooter C hits Area W with probability 1/2 and Area Z with probability 1/2.
Shooter D hits Area X with probability 1/2 and Area Y with probability 1/2.
Shooter E hits Area X with probability 1/2 and Area Z with probability 1/2.
Shooter F hits Area Y with probability 1/2 and Area Z with probability 1/2.
Five of the six shooters are randomly selected, and each of the five selected shooters fires one
shot. Three shots land in Area W and two shots lands in Area Y (not necessarily in that order).
The remaining shooter (who was not among the five previously selected) then fires a shot.
Determine the probability that this shot lands in Area Y.
A. 0
B. 1/4
C. 1/3
D. 1/2
E. 1


Solutions to Problems:
12.1. D. The expected value of the process variance is 25,200. The variance of the hypothetical
means is: 46,667 - 200² = 6667. K = EPV / VHM = 25,200 / 6667 = 3.8.

Type of     A Priori Chance of       Mean    Square of    Standard     Process
Marksman    This Type of Marksman            Mean         Deviation    Variance
A           0.333                    100     10,000       60           3,600
B           0.333                    200     40,000       120          14,400
C           0.333                    300     90,000       240          57,600
Average                              200     46,667                    25,200

12.2. E. Z = N / (N + K) = 2 / (2 + 3.8) = .345.


The average observation is: (90 + 150) /2 = 120. The a priori mean = 200.
Thus the estimate of the next shot is: (.345)(120) + (1 - .345)(200) = 172.
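Here is a short Python sketch of the Buhlmann calculation in solutions 12.1 and 12.2 (a minimal illustration; the variable names are assumptions, not from the text):

means = [100.0, 200.0, 300.0]
sds = [60.0, 120.0, 240.0]
prob = [1/3] * 3

epv = sum(p * s**2 for p, s in zip(prob, sds))                      # 25,200
grand_mean = sum(p * m for p, m in zip(prob, means))                # 200
vhm = sum(p * m**2 for p, m in zip(prob, means)) - grand_mean**2    # about 6,667

k = epv / vhm                                                       # about 3.8
n = 2                                                               # two observed shots
z = n / (n + k)                                                     # about 0.345

x_bar = (90 + 150) / 2                                              # 120
estimate = z * x_bar + (1 - z) * grand_mean                         # about 172
print(round(k, 2), round(z, 3), round(estimate, 1))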
12.3. C. The density for a Normal Distribution with mean μ and standard deviation σ is given by
f(x) = exp(-0.5{(x-μ)/σ}²) / {σ√(2π)}. Thus the density function at 90 for Marksman A is:
exp(-0.5{(90-100)/60}²) / {60√(2π)} = 0.00656.

Type of    Mean   Std.   A Priori Chance   Chance of      Chance of       Chance of the    Probability   Posterior Chance of
Marksman          Dev.   of Marksman       Observing 90   Observing 150   Observation      Weight        Type of Marksman
A          100    60     0.333             0.00656        0.00470         3.081e-5         1.027e-5      78.97%
B          200    120    0.333             0.00218        0.00305         6.657e-6         2.219e-6      17.06%
C          300    240    0.333             0.00113        0.00137         1.550e-6         5.167e-7       3.97%
Overall                                                                                    1.301e-5      100.00%

12.4. A. Use the results of the previous question to weight together the a priori means:

Marksman   Posterior Chance of This Type of Risk   A Priori Mean
A          78.97%                                  100
B          17.06%                                  200
C           3.97%                                  300
Overall                                            125
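Here is a short Python sketch of the Bayesian calculation in solutions 12.3 and 12.4 (a minimal illustration; the normal_pdf helper is an assumption, not from the text):

from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

means = [100.0, 200.0, 300.0]
sds = [60.0, 120.0, 240.0]
prior = [1/3] * 3
shots = [90.0, 150.0]

# Probability weight for each marksman: prior times the likelihood of both shots.
weights = [p * normal_pdf(shots[0], m, s) * normal_pdf(shots[1], m, s)
           for p, m, s in zip(prior, means, sds)]
total = sum(weights)
posterior = [w / total for w in weights]            # about 0.790, 0.171, 0.040

bayes_estimate = sum(q * m for q, m in zip(posterior, means))   # about 125
print([round(q, 4) for q in posterior], round(bayes_estimate, 1))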


12.5. B. For convenience add the shooters around the outer edge of the diagram.
[Diagram of the target with the four shooters labeled around its outer edge omitted.]

Work out the chance of the observation given that each of the shooters was excluded.
If Shooter A was excluded, then there could have been at most one shot in Area X.
Thus the probability of the observation if shooter A was excluded is 0.
If Shooter B was excluded, then there could have been at most one shot in Area X.
Thus the probability of the observation if shooter B was excluded is 0.
If Shooter C was excluded, then Shooter D must have hit area Z, (since D could only hit Z or W,
and no shot was observed to have hit W.) Shooter A and Shooter B must have hit area X, (since
there are two shots observed in Area X.)
The probability of this observation is: (1/2)(1/2)(1/2) = 1/8.
If Shooter D was excluded, then Shooter C must have hit area Z, (since C could only hit Z or Y,
and no shot was observed to have hit Y.) Shooter A and Shooter B must have hit area X, (since
there are two shots observed in Area X.)
The probability of this observation is: (1/2)(1/2)(1/2) = 1/8.
Then as computed in the spreadsheet, there is a 50% chance that Area Z will be hit.
Excluded    A Priori Chance     Chance of the    Prob. Weight =         Posterior Chance     Chance of
Marksman    of This Situation   Observation      Product of Columns     of This Situation    hitting Area Z
A           0.25                0                0.00000                0%                   0.0
B           0.25                0                0.00000                0%                   0.0
C           0.25                0.125            0.03125                50%                  0.5
D           0.25                0.125            0.03125                50%                  0.5
Overall                                          0.0625                 100%                 50%

Comment: The posterior probabilities of the shot landing in the other areas are: X: 0, Y: 25%, and
W: 25%.
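Here is a short Python sketch of the enumeration used in solution 12.5 (a minimal illustration; the dictionary and variable names are assumptions):

from itertools import product
from collections import Counter

# Each shooter hits either of his two areas with probability 1/2.
areas = {'A': ['W', 'X'], 'B': ['X', 'Y'], 'C': ['Y', 'Z'], 'D': ['Z', 'W']}
observed = Counter(['X', 'X', 'Z'])

weights = {}
for excluded in areas:
    selected = [s for s in areas if s != excluded]
    # Probability that the three selected shooters produce exactly the observed multiset of areas.
    prob_obs = sum((1/2) ** 3
                   for outcome in product(*(areas[s] for s in selected))
                   if Counter(outcome) == observed)
    weights[excluded] = (1/4) * prob_obs            # prior of 1/4 on each possible excluded shooter

total = sum(weights.values())
posterior = {s: w / total for s, w in weights.items()}     # C: 0.5, D: 0.5, A and B: 0

# The remaining (excluded) shooter then fires; probability that his shot lands in Area Z.
p_hit_Z = sum(posterior[s] * areas[s].count('Z') / 2 for s in areas)
print(posterior, p_hit_Z)                                  # 0.5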


12.6. A. The Normal Distribution has a density function of: exp[-(x-μ)² / (2σ²)] / {σ√(2π)}.
The probability weights are the chance of the observation times the a priori probability.

Shooter    A Priori       Mean   Standard    Chance of     Probability   Posterior     Mean
           Probability           Deviation   Shot at 20    Weights       Probability
A          0.5            10     3           0.000514      0.000257      0.1250        10
B          0.5            -10    15          0.003599      0.001800      0.8750        -10
Sum                                                        0.002057      1.0000        -7.50

Comment: Note that the new estimate of -7.5 is outside the interval formed by the observed value
of 20 and the overall mean of 0. While this can happen for estimates based on Bayes Theorem,
this can not occur for estimates based on Credibility.
12.7. E. The expected value of the process variance is 117.
The variance of the hypothetical means is: 100 - 0² = 100. K = EPV / VHM = 117 / 100 = 1.17.
Z = 1 / (1 + 1.17) = 0.461. Estimate = (0.461)(20) + (1 - 0.461)(0) = 9.2.

Shooter    A Priori       Process     Mean    Square of
           Probability    Variance            Mean
A          0.5            9           10      100
B          0.5            225         -10     100
Average                   117                 100


12.8. B. The point S is at (-0.4, 0) and is at a distance of 0.4 from P, 1.4 from Q, and
{0.9² + (√3/2)²}^0.5 = 1.249 from R. Thus if the shooter is P, the probability density function at S is
1/{4π(0.4)}. If the shooter is Q, the probability density function at S is 1/{4π(1.4)}. If the shooter is R,
the probability density function at S is 1/{4π(1.249)}. The posterior probability of shooter P is
proportional to the product of the a priori probability of shooter P and the density function at S given
shots at target P, in this case (1/3)/{4π(0.4)}. Similarly, one gets probability weights for Targets Q
and R of: (1/3)/{4π(1.4)} and (1/3)/{4π(1.249)}. Thus the posterior probabilities are proportional to:
1/0.4 = 2.5, 1/1.4 = 0.714, and 1/1.249 = 0.801.
Thus the posterior probabilities are: 0.6227, 0.1779, and 0.1994.
Thus the posterior estimate is: (0.6227)P + (0.1779)Q + (0.1994)R =
(0.6227)(0, 0) + (0.1779)(1, 0) + (0.1994)(1/2, √3/2) = (0.278, 0.173).

Target    A Priori       Chance of the    Prob. Weight =        Posterior Chance of      x-value    y-value
          Probability    Observation      Product of Columns    This Type Marksman
P         33.33%         0.19894          0.06631               62.27%                   0.000      0.000
Q         33.33%         0.05684          0.01895               17.79%                   1.000      0.000
R         33.33%         0.06371          0.02124               19.94%                   0.500      0.866
Overall                                   0.10650               100.00%                  0.278      0.173

Comment: Beyond what you are likely to be asked on the exam. Since it is a weighted average of
P, Q, and R, with weights between 0 and 1, the Bayesian Estimate is within the triangle PQR;
estimates from Bayesian Analysis are always within the range of hypotheses.
12.9. C. The process variance is the expected squared distance of observation from its expected
value. If we are shooting at target P, then the process variance is the expected squared distance of
the shot from P:
∫_{θ=0}^{2π} ∫_{r=0}^{2} {r²/(4πr)} r dr dθ = {1/(4π)} ∫_{θ=0}^{2π} ∫_{r=0}^{2} r² dr dθ = {8/(12π)} ∫_{θ=0}^{2π} dθ = 4/3.

Thus the process variance for shooting at this target is 4/3. The process variance for the other two
targets is the same, so that the EPV = 4/3.
Comment: Difficult. Beyond what you are likely to be asked on the exam.
12.10. B. The Variance of the Hypothetical Means is the (weighted) average squared distance of
the targets from their grand mean of M. In this case P is at (0,0), Q is at (1,0), and R is at (1/2, √3/2),
then M is at (P/3) + (Q/3) + (R/3) = (1/2, √3/6). The squared distance of M to any of the targets is
then 1/4 + 3/36 = 1/3. Therefore VHM = 1/3.


12.11. A. Using Buhlmann Credibility the estimate of the next shot will be a weighted average of
the prior mean M and the average of the observation(s), in this case S.
Thus the new estimate will be somewhere on the line between the point
M = (1/2, √3/6) and the point S = (-0.4, 0). The Buhlmann Credibility Parameter
K = EPV / VHM = (4/3)/(1/3) = 4. Thus for one observation Z = 1/(1+K) = 1/5. Thus the estimate
of the next shot = (1/5)S + (4/5)M = (1/5)(-0.4, 0) + (4/5)(1/2, √3/6) = (0.320, 0.231).
Comment: Difficult. Beyond what you are likely to be asked on the exam; it is extremely unlikely
that you will be asked to compute the credibility in a two dimensional situation. See the following
diagram:
[Diagram showing M, S, the Buhlmann Credibility Estimate (*), and the Bayesian Estimate (+) omitted.]
12.12. E. The Buhlmann Credibility Parameter K = EPV / VHM = (4/3)/(1/3) = 4.


Thus for three observations Z = 3/(3+K) = 3/7. The average of the three observations is:
((-0.4, 0) + (-0.8, -0.9) + (-0.3, -0.7))/3 = (-0.500, -0.533). Thus the estimate of the next shot =
(3/7)(-0.500, -0.533) + (4/7)(1/2, √3/6) = (0.071, -0.063).
Comment: Note that the estimate is outside the triangle PQR. The estimates that result from
Buhlmann Credibility are occasionally outside the range of hypotheses.
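Here is a short Python sketch of the two-dimensional Buhlmann calculation in solutions 12.9 through 12.12 (a minimal illustration; the closed-form EPV of 4/3 is taken from solution 12.9, and the variable names are assumptions):

from math import sqrt

# Process variance (expected squared distance of a shot from its target) from solution 12.9.
epv = 4/3

# The three targets and their grand mean M.
P, Q, R = (0.0, 0.0), (1.0, 0.0), (0.5, sqrt(3)/2)
M = tuple(sum(c)/3 for c in zip(P, Q, R))                           # (1/2, sqrt(3)/6)
vhm = sum((p[0]-M[0])**2 + (p[1]-M[1])**2 for p in (P, Q, R)) / 3   # 1/3

k = epv / vhm                                                       # 4
n, shots = 3, [(-0.4, 0.0), (-0.8, -0.9), (-0.3, -0.7)]
z = n / (n + k)                                                     # 3/7
x_bar = tuple(sum(s[i] for s in shots) / n for i in range(2))
estimate = tuple(z * x_bar[i] + (1 - z) * M[i] for i in range(2))
print(tuple(round(c, 3) for c in estimate))                         # about (0.071, -0.063)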
12.13. D. The process variance for every marksman is assumed to be the same and equal to 15²
= 225. Thus the EPV = 225. The overall mean is 25 and the VHM is:
{(10-25)² + (20-25)² + (30-25)² + (40-25)²}/4 = 125. K = EPV / VHM = 225/125 = 1.8.
Z = 3/(3+1.8) = 62.5%. The average observation: (22+26+14)/3 = 20.67.
The estimate of the next shot is: (20.67)(62.5%) + (25)(37.5%) = 22.3.


12.14. C. The estimate of the next shot is 21.2:

Marks-   Mean   Std.   A Priori   Chance of      Chance of      Chance of      Chance of the   Probability   Posterior   Mean
man             Dev.   Chance     Observing 22   Observing 26   Observing 14   Observation     Weight        Chance
1        10     15     0.25       0.0193         0.0151         0.0257         7.4641e-6       1.87e-6       22.22%      10
2        20     15     0.25       0.0264         0.0246         0.0246         1.5889e-5       3.97e-6       47.31%      20
3        30     15     0.25       0.0231         0.0257         0.0151         8.9163e-6       2.23e-6       26.55%      30
4        40     15     0.25       0.0129         0.0172         0.0059         1.3189e-6       3.30e-7        3.93%      40
Overall                                                                                        8.40e-6       1.000       21.2

12.15. D. The a priori mean is 0. The variance of the hypothetical means is m².
The process variance for each marksman is 1² = 1. Therefore, the EPV = 1.
K = EPV/VHM = 1/m². Thus for one observation, Z = 1/(1+K) = m²/(1+m²).
Thus the new estimate is (x){m²/(1+m²)} + 0{1/(1+m²)} = xm²/(1+m²).
12.16. C. As m → ∞, xm²/(1+m²) → x. In other words, as m approaches ∞, the VHM gets large,
Z approaches 1, and the estimate approaches the observation.
12.17. B. The posterior distribution is proportional to the product of the chance of the observation
and the a priori chance of having each target. The targets are equally likely, so their a priori probability
is each 1/2. Given that the shooter is aiming at a target with mean μ, the chance of the observation is:
{1/√(2π)} exp(-(x - μ)²/2).
Thus the posterior chance of the shooter aiming at the target at +m is proportional to:
exp(-(x - m)²/2), while the posterior chance of the target at -m is proportional to:
exp(-(x + m)²/2). Thus the new estimate is:
{m exp(-(x - m)²/2) - m exp(-(x + m)²/2)} / {exp(-(x - m)²/2) + exp(-(x + m)²/2)} =
{m exp(-(x² + m²)/2)}{exp(xm) - exp(-xm)} / [{exp(-(x² + m²)/2)}{exp(xm) + exp(-xm)}] =
m{exp(mx) - exp(-mx)}/{exp(mx) + exp(-mx)}.
12.18. B. As m → ∞, if x > 0, then m{exp(mx) - exp(-mx)}/{exp(mx) + exp(-mx)}
→ m exp(mx)/exp(mx) = m. As m → ∞, if x < 0, then
m{exp(mx) - exp(-mx)}/{exp(mx) + exp(-mx)} → -m exp(-mx)/exp(-mx) = -m.
As the targets get further apart the Bayesian Analysis estimate approaches the closer target.
Comment: For an observation of zero, the Bayes Analysis estimate is zero, regardless of m.


12.19. B. 1. True. 2. True. As the targets get further apart the credibility assigned to a shot
increases. 3. False. Using a Bayesian (Buhlmann) credibility approach, the best estimate of the
location of a second shot after observing a single shot is somewhere on a line connecting the mean
of all of the clusters and the observed shot.
12.20. C. 1. True. The more shots you observe, the more weight you give to the mean
observation. 2. True. The better the marksmen, the more the credibility given the observation. 3.
False. The closer the targets, the more weight is given to the overall mean and the less weight is
given to the shot.
12.21. E. 1. T. The definition of the Bayesian Estimate. 2. T. The definition of Buhlmann Credibility
estimate, plus the fact that the Buhlmann Credibility Z = N/(N+K) which for one shot is Z = 1/(1+K) =
1/(1+ EPV/VHM) = VHM / (VHM + EPV). VHM is the variance of the mean shots of the
marksmen, while EPV = the expected variance of the marksmen's shots.
3. T. While the Buhlmann Credibility estimate is between E and X, the Bayesian estimate need not
be so. The Bayesian estimate can be further from the overall mean, E, than the observed shot, X,
let alone than the Buhlmann Estimate.
Comment: The Buhlmann Credibility estimate must be on the straight line between E and X. The
Bayesian Estimate being a weighted average of A, B, C and D (with weights between 0 and 1)
must be somewhere within or on the square formed by the targets. Depending on the particular
situation the Bayesian Analysis estimate could be either closer to or further from E than is the
Buhlmann Credibility Estimate. The situation in statement 3 could look as follows if the Bayesian
Estimate were further from E:
[Diagram showing the targets A, B, C, and D, the overall mean E, the Buhlmann Estimate (*), and the Bayesian Estimate (+) omitted.]

12.22. D. Expected Value of the Process Variance = 3² = 9. Overall mean is 5.
The variance of the Hypothetical Means = (1/4){(5-2)² + (5-4)² + (5-6)² + (5-8)²} = 5.
K = 9/5. Z = 1/(1 + 9/5) = 5/14 = 0.357.


12.23. A. A. Expected Value of the Process Variance remains = 9. Overall mean is 6.
The Variance of the Hypothetical Means = (1/4){(6-0)² + (6-4)² + (6-8)² + (6-12)²} = 20.
K = 9/20. Z = 1/(1 + 9/20) = 20/29 = 0.690.
B. New standard deviation = (3)(6/10) = 1.8. EPV = 1.8² = 3.24. VHM = 5.
K = 3.24/5 = 0.648. Z = 1/1.648 = 0.607.
C. VHM = (10-6)² = 16. K = 9/16. Z = 1/1.563 = 0.640.
D. K = 9/5. Z = 3/(3 + 1.8) = 0.625.
E. The standard deviation for the two closer marksmen = (3)(5/10) = 1.5. One needs to average
the process variances of the two closer and two further marksmen.
Expected Value of the Process Variance = (1/2)(1.5² + 3²) = 5.625.
K = 5.625/5 = 1.125. Z = 2/(2 + 1.125) = 0.640.
Comment: Too long for three points! Good review of the ways to increase Buhlmann Credibility:
A & C: Increase Variance of the Hypothetical Means,
B & E: Decrease the Expected Value of the Process Variance,
D & E: Increase Number of Observations.
12.24. C. 1. T, 2. T , 3. F. As the variance of the hypothetical means increases, the Buhlmann
credibility estimate approaches the observed shot, while the Bayesian analysis estimate
approaches the cluster closest to the shot.


12.25. B. The probability of Marksman A missing the target 20 times is proportional to 0.9²⁰.
The probability of Marksman B missing the target 20 times is proportional to 0.35²⁰.
Thus Bayes Analysis gives virtually all the weight to Marksman A.
Marksman A has a mean score of (0.01)(50) + (0.09)(10) = 1.4.

Marksman    A Priori       Chance of      Probability    Posterior      Mean
            Probability    Observation    Weights        Probability    Score
A           0.5            0.1216         0.0608         1.0000         1.4
B           0.5            7.61e-10       3.80e-10       6.26e-9        14.5
Sum         1.0000                        0.0608         1.0000         1.400

Comment: Tests understanding of the basic concept of Bayesian analysis.


If not told otherwise use Bayesian Analysis, rather than its linear approximation Buhlmann Credibility.
Note that in this case, if one had been asked to apply Buhlmann Credibility, the solution would be
as follows:
Marksman    A Priori       Process     Hypothetical    Square of
            Probability    Variance    Mean            Mean
A           0.5            32.04       1.40            1.96
B           0.5            334.75      14.50           210.25
Sum         1.00           183.395     7.950           106.105

For Marksman A the process variance is: {(0.01)(50)² + (0.09)(10)² + (0.9)(0)²} - (1.4)² = 32.04.
EPV = 183.395, VHM = 106.105 - (7.950)² = 42.9025. K = 183.395 / 42.9025 = 4.275.
For 20 observations, Z = 20/24.275 = 82.4%.
The observation is an average score of 0, and the a priori estimate is 7.95.
Thus the estimate using Buhlmann Credibility is: (0)(82.4%) + (7.95)(17.6%) = 1.399.
That the estimates based on Bayesian Analysis and Buhlmann Credibility are virtually the same is a
coincidence! For example, if instead we had observed 40 shots all of which had missed the target,
then the Bayesian Analysis estimate would again be 1.4. However, the Buhlmann Credibility would
now be 40/44.275 = 90.3%; thus the estimate using Buhlmann Credibility would now be:
(0)(90.3%) + (7.95)(9.7%) = .771.
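Here is a short Python sketch contrasting the Bayes and Buhlmann estimates of solution 12.25 (a minimal illustration; the dictionary and variable names are assumptions):

marksmen = {
    'A': {50: 0.01, 10: 0.09, 0: 0.90},
    'B': {50: 0.20, 10: 0.45, 0: 0.35},
}
prior = {'A': 0.5, 'B': 0.5}
n_misses = 20

# Bayesian analysis: likelihood of 20 straight misses (score 0) for each marksman.
weights = {m: prior[m] * probs[0] ** n_misses for m, probs in marksmen.items()}
total = sum(weights.values())
posterior = {m: w / total for m, w in weights.items()}
means = {m: sum(x * p for x, p in probs.items()) for m, probs in marksmen.items()}   # 1.4 and 14.5
bayes = sum(posterior[m] * means[m] for m in marksmen)                               # about 1.40

# Buhlmann Credibility applied to the same observation (average score of 0 over 20 shots).
pvars = {m: sum(x**2 * p for x, p in probs.items()) - means[m]**2 for m, probs in marksmen.items()}
epv = sum(prior[m] * pvars[m] for m in marksmen)                                     # 183.395
grand = sum(prior[m] * means[m] for m in marksmen)                                   # 7.95
vhm = sum(prior[m] * means[m]**2 for m in marksmen) - grand**2                       # 42.9025
z = n_misses / (n_misses + epv / vhm)                                                # about 0.824
buhlmann = z * 0 + (1 - z) * grand                                                   # about 1.40
print(round(bayes, 3), round(buhlmann, 3))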
12.26. C. As the clusters get further and further apart the chance of observing a shot goes quickly
to zero for all but the closest cluster. (For the pure Bayes approach the posterior probability weight
of a cluster is the product of the a priori chance of a cluster and the chance of the observation given
that cluster. ) Thus for the pure Bayesian approach, (assuming the a priori chance of the closest
cluster is greater than zero), as the clusters get further and further apart, the probability weight of the
closest cluster is much larger than any of the others. Thus virtually all of the weight is given to the
mean of the closest cluster, and the posterior estimate is the mean of that closest cluster.
For the Buhlmann Credibility approach, K = EPV/VHM. Assuming EPV stays constant, as the
VHM approaches infinity, K approaches zero. Therefore, Z = 1/(1 + K) approaches one, and the
Buhlmann Credibility estimate is the observed shot.


12.27. B. For shots at each target the probability density is 1/π uniform over a unit circle centered at
that target. (The area of a unit circle is π.) The point S is inside a unit circle centered at either X, Y or Z.
Thus each of the three probability density functions is 1/π at S. The posterior probability of Target X
is proportional to the product of the a priori probability of Target X and the density function at S
given shots at target X, in this case (1/3)(1/π) = 1/(3π). Similarly, one gets an equal probability weight
for Targets Y or Z. Thus the posterior probabilities are equal to 1/3. Thus the posterior estimate is:
(1/3)X + (1/3)Y + (1/3)Z = G.


12.28. E. Using Buhlmann Credibility the estimate of the next shot will be a weighted average of
the prior mean G and the average of the observation(s), in this case S.
Thus the new estimate will be somewhere on the line between the point G and the point S.
This is a point other than X, G, M, or S.
Comment: It is extremely unlikely that you will be asked to compute the credibility in a two
dimensional situation as above.
Here is a diagram showing the estimate using Buhlmann Credibility, with Z = 40%:
[Diagram showing the targets, the points G and S, and the Buhlmann Credibility estimate (*) omitted.]
The Buhlmann Credibility minimizes the expected squared errors, which are defined in terms
of the squared distances between points. The process variance is the expected squared
distance of observation from its expected value. If we are shooting at target X, then the process
variance is the expected squared distance of the shot from X. The density is 1/π over a unit
circle. Thus the expected squared distance from X is:
∫_{θ=0}^{2π} ∫_{r=0}^{1} (r²/π) r dr dθ = ∫_{θ=0}^{2π} ∫_{r=0}^{1} (r³/π) dr dθ = ∫_{θ=0}^{2π} {1/(4π)} dθ = 1/2.
Thus the process variance for shooting at this target is 1/2. The process variance for the other
two targets is the same, so that the Expected Value of the Process Variance = EPV = 1/2. The
Variance of the Hypothetical Means is the (weighted) average squared distance of the targets
from their mean of G. In this case if X is at (0,0), Y is at (1,0), and Z is at (1/2, √3/2), then G is at
(1/2, √3/6). The squared distance of G to any of the targets is then 1/4 + 3/36 = 1/3. Therefore
VHM = 1/3.
K = EPV / VHM = (1/2)/(1/3) = 3/2. Z = 1/(1+K) = 1/2.5 = 0.4.
Thus the new estimate is: 0.4S + 0.6G = (0.4)(1/4, 0) + (0.6)(1/2, √3/6) = (0.4, √3/10).


12.29. E. The posterior probabilities of the targets are proportional to the product of the chance of
the observation given each target and the a priori probability of each target. Since the a priori
probabilities of the targets are all equal, the posterior probabilities are proportional to the chance of
the observation given each target. Thus, the posterior probabilities are proportional to the density
functions at the observed shot M. M is a distance of 1/2 from either X or Y, and a distance from Z of
√3/2 = 0.866 > 2/3. Therefore, if the target is Z, then the density at M is zero.
If the target is either X or Y, then the density at M is: 9/(4π).
Therefore, the posterior probabilities are proportional to: 9/(4π), 9/(4π), and 0.
Thus the posterior probabilities are: 1/2, 1/2, and 0.
Thus the Bayesian analysis estimate of the location of a second shot is: (1/2)X + (1/2)Y + (0)Z = M.
[Diagram showing the targets X, Y, and Z and the points M, N, G, and P omitted.]
Comment: For shots at each target the probability density is 9/(4π) uniform over a circle of radius 2/3
centered at that target, since the area of a circle of radius 2/3 is π(2/3)² = 4π/9.
The triangle XMZ is a right triangle. Its hypotenuse XZ is length 1.
Side XM is length 1/2. Therefore, side MZ is of length: √{1² - (1/2)²} = √3/2.


12.30. A. The posterior probabilities of the targets are proportional to the product of the chance of
the observation given each target and the a priori probability of each target. Since the a priori
probabilities of the targets are all equal, the posterior probabilities are proportional to the chance of
the observation given each target. Thus, the posterior probabilities are proportional to the product of
the density functions at the observed shots M and N. If the target is Z, then the density at M is zero.
If the target is either X or Y, then the density at M is 9/(4π). If the target is Y, then the density at N is
zero. If the target is either X or Z, then the density at N is 9/(4π). Therefore, the posterior probabilities
are proportional to: {9/(4π)}{9/(4π)}, {9/(4π)}(0), and (0){9/(4π)} = {9/(4π)}², 0, and 0. Thus the posterior
probabilities are: 1, 0, 0. Thus the Bayesian analysis estimate of the location of the third shot is:
(1)X + (0)Y + (0)Z = X.
Comment: If the shot is at M, then the target could not have been Z, since it is too far away. If the
shot is at N, then the target could not have been Y, since it is too far away.
Thus if one observes a shot at M and a shot at N, the target must have been X.


12.31. A.

[Diagram of the target with the four shooters labeled around its outer edge omitted.]

Work out the chance of the observation given that each of the shooters was excluded.
If Shooter A was excluded, then Shooter B must have hit Area X, (since only A or B could have hit
X.) Thus Shooter C must have hit Area Y, (since only B or C could have hit Y, and B hit X.) Shooter
D must have hit area W, (since only A or D could have hit W, and A is assumed not to have shot.)
Thus the probability of the observation if shooter A was excluded is: (1/2)(1/2)(1/2) = 1/8.
If Shooter B was excluded, then Shooter A must have hit Area X. Thus Shooter D must have hit
Area W and thus Shooter C must have hit Area Y. The probability of this observation is:
(1/2)(1/2)(1/2) = 1/8.
If Shooter C was excluded, then Shooter B must have hit Area Y, (since only C or B could have
hit Y.) Shooter D must have hit area W, (since D could only hit Z or W, and no shot was observed
to have hit Z.) Thus Shooter A must have hit area X, (since A is the only shooter remaining, and X
must have been hit by someone.) The probability of this observation is: (1/2)(1/2)(1/2) = 1/8.
If Shooter D was excluded, then Shooter A must have hit area W. Shooter C must have hit
Area Y. Shooter B must have hit area X. Probability of this observation is: (1/2)(1/2)(1/2) = 1/8.
Then as computed in the spreadsheet, the posterior chances that the remaining shooter is A, B, C or
D are equally likely; there is a 25% chance that Area Z will be hit.
Excluded    A Priori Chance     Chance of the    Prob. Weight =         Posterior Chance     Chance of
Marksman    of This Situation   Observation      Product of Prior       of This Situation    hitting Area Z
                                                 Columns
A           0.25                0.125            0.03125                25%                  0.0
B           0.25                0.125            0.03125                25%                  0.0
C           0.25                0.125            0.03125                25%                  0.5
D           0.25                0.125            0.03125                25%                  0.5
Overall                                          0.125                  100%                 25%

Comment: This is an example where the observation does not alter the a priori probabilities.
This would not be true if instead there had been observed 2 shots in Area X and 1 shot in Area Z.
12.32. A. For three shots to have appeared in area W, shooters A, B, and C must have been
selected and all must have hit area W. Thus since the remaining two shooters each hit area Y, they
must have been shooters D and F. That means that shooter E is the remaining shooter. Thus the
probability that the remaining shooter hits area Y is 0.


Section 13, Die/Spinner Models141


There are simple models of pure premiums involving dice and spinners. The frequency is based on
a die roll and the severity is based on the result of a spinner. Either Buhlmann Credibility or
Bayesian Analysis can be applied to these models.
For example, assume:
Two dice, A1 and A2 , are used to determine the number of claims. Each side of both dice are
marked with either a 0 or a 1, where 0 represents no claim and 1 represents a
claim. The probability of a claim for each die is:
Die     Probability of Claim
A1      1/6
A2      4/6

In addition, there are two spinners, B1 and B2 , representing claim severity. Each spinner has two
areas marked 30 and 50. The probabilities for each claim size are:
                  Claim Size
Spinner       30         50
B1            0.75       0.25
B2            0.40       0.60
A die is selected randomly from A1 and A2 and a spinner is selected randomly from B1 and B2 .
Note that in this example the die and spinner are chosen independently of each other.142
See the problems for an example where they are chosen together. Therefore, there are four
different combinations of die and spinner. This is an example of a cross classification system.

          B1       B2
A1        25%      25%
A2        25%      25%

Four observations from the selected die and spinner yield the following claim amounts:
30, 0, 0, 30.

141 See "An Examination of Credibility Concepts," by Stephen Philbrick, PCAS 1981. Philbrick expands on an
example in "Credibility for Severity," by Charles C. Hewitt, PCAS 1970. Both the Philbrick and Hewitt papers are
excellent reading for those who wish to understand credibility.
142 This is similar to what is done in 4, 5/01, Q.28. In contrast, in 4, 5/01, Q.10-11, the frequency and severity
distributions go together; they are not selected independently.


Using Buhlmann Credibility we can determine the expected pure premium for the next observation
from the same die and spinner, without separately estimating the future frequency and severity.143
In order to calculate the EPV of 305.5, for each possible pair of die and spinner use the formula:
variance of p.p. = μf σs² + μs² σf².
Therefore, we need to compute the mean and variance of each die and each spinner.
For example, the mean severity for spinner B2 is: (0.4)(30) + (0.6)(50) = 42.
The variance of spinner B2 is: (0.4)(30 - 42)² + (0.6)(50 - 42)² = 96.
Die and     A Priori Chance    Mean     Variance    Mean        Variance    Process Variance
Spinner     of Risk            Freq.    of Freq.    Severity    of Sev.     of P.P.
A1, B1      0.250              0.167    0.139       35          75          182.6
A1, B2      0.250              0.167    0.139       42          96          261.0
A2, B1      0.250              0.667    0.222       35          75          322.2
A2, B2      0.250              0.667    0.222       42          96          456.0
Mean                                                                        305.5

One computes the mean pure premium for each possible combination of die and spinner.
Die and     A Priori Chance    Mean     Mean        Mean Pure    Square of
Spinner     of Risk            Freq.    Severity    Premium      Mean P.P.
A1, B1      0.250              0.167    35          5.83         34.03
A1, B2      0.250              0.167    42          7.00         49.00
A2, B1      0.250              0.667    35          23.33        544.44
A2, B2      0.250              0.667    42          28.00        784.00
Mean                                                16.04        352.87
Thus the Variance of the Hypothetical Means = 352.87 - 16.04² = 95.59.
Therefore, the Buhlmann Credibility Parameter for pure premium = K = EPV / VHM =
305.5 / 95.59 = 3.2. Thus the credibility for 4 observations is 4/(4 + K) = 4/7.2 = 55.6%.
The a priori mean pure premium as computed above is 16.04.
The observed pure premium is (30 + 0 + 0 + 30)/4 = 15.
Thus the estimated future pure premium is: (0.556)(15) + (1 - 0.556)(16.04) = 15.46.

143 In general the product of separate estimates of frequency and severity will not equal that gotten working directly
with the pure premiums.
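Here is a short Python sketch of the pure premium Buhlmann calculation above (a minimal illustration; the variable names are assumptions, and small differences from the rounded figures in the text are due to carrying full precision):

dice = {'A1': 1/6, 'A2': 4/6}                       # probability of a claim
spinners = {'B1': {30: 0.75, 50: 0.25}, 'B2': {30: 0.40, 50: 0.60}}

pairs = [(d, s) for d in dice for s in spinners]    # four equally likely combinations
prior = 1 / len(pairs)

def sev_moments(sp):
    mean = sum(x * p for x, p in sp.items())
    var = sum(x**2 * p for x, p in sp.items()) - mean**2
    return mean, var

epv = first = second = 0.0
for d, s in pairs:
    f_mean = dice[d]
    f_var = f_mean * (1 - f_mean)                   # Bernoulli frequency
    s_mean, s_var = sev_moments(spinners[s])
    pp_mean = f_mean * s_mean
    pp_var = f_mean * s_var + f_var * s_mean**2     # process variance of the pure premium
    epv += prior * pp_var
    first += prior * pp_mean
    second += prior * pp_mean**2

vhm = second - first**2                             # about 95.5
k = epv / vhm                                       # about 3.2
n = 4
z = n / (n + k)                                     # about 0.556
x_bar = (30 + 0 + 0 + 30) / 4
print(round(z * x_bar + (1 - z) * first, 2))        # about 15.46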


The Bayesian Analysis for the same example proceeds as follows:


Type of Die    A Priori Chance of    Chance of the    Prob. Weight =        Posterior Chance of    Mean         Mean        Mean Pure
and Spinner    This Die & Spinner    Observation      Product of Columns    this Die & Spinner     Frequency    Severity    Premium
A1, B1         0.25                  0.010851         0.0027127             0.2187                 0.1667       35          5.833
A1, B2         0.25                  0.003086         0.0007716             0.0622                 0.1667       42          7.000
A2, B1         0.25                  0.027778         0.0069444             0.5599                 0.6667       35          23.333
A2, B2         0.25                  0.007901         0.0019753             0.1592                 0.6667       42          28.000
Overall                                               0.0124040             1.000                                           19.233

For example, if one has die A1 and spinner B2 , the chance of the observation of 30, 0, 0, 30, is:
{(1/6)(0.4)}(5/6)(5/6){(1/6)(0.4)} = 0.003086.
The posterior chance of die A1 and spinner B2 is: 0.003086 / 0.0124040 = 0.0622.
The estimated future mean pure premium =
(0.2187)(5.833) + (0.0622)(7) + (0.5599)(23.333) + (0.1592)(28) = 19.23.
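Here is a short Python sketch of the Bayesian analysis above (a minimal illustration; the likelihood helper and variable names are assumptions, not from the text):

dice = {'A1': 1/6, 'A2': 4/6}
spinners = {'B1': {30: 0.75, 50: 0.25}, 'B2': {30: 0.40, 50: 0.60}}
observations = [30, 0, 0, 30]

def likelihood(p_claim, sev_probs, obs):
    # Probability of the observed sequence: no claim (0) or a claim of the observed size.
    prob = 1.0
    for x in obs:
        prob *= (1 - p_claim) if x == 0 else p_claim * sev_probs[x]
    return prob

weights = {(d, s): 0.25 * likelihood(dice[d], spinners[s], observations)
           for d in dice for s in spinners}
total = sum(weights.values())
posterior = {pair: w / total for pair, w in weights.items()}

pp_means = {(d, s): dice[d] * sum(x * p for x, p in spinners[s].items())
            for d in dice for s in spinners}
estimate = sum(posterior[pair] * pp_means[pair] for pair in posterior)
print({pair: round(q, 4) for pair, q in posterior.items()}, round(estimate, 2))   # about 19.23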


Simulation Experiments:
It can take many repetitions of the die-spinner process for the observed mean pure premium to
approach its expected value.
I simulated the four different combinations of die and spinner of this die-spinner example.
Here are examples of the average pure premiums after different numbers of simulation runs:
Die A1 and Spinner B1, with mean pure premium: (1/6)(35) = 5.83:
[Plot of the average observed pure premium after 20, 50, 100, 200, 500, and 1000 simulated trials omitted.]

Die A1 and Spinner B2, with mean pure premium: (1/6)(42) = 7:
[Plot omitted.]

Die A2 and Spinner B1, with mean pure premium: (4/6)(35) = 23.33:
[Plot omitted.]

Die A2 and Spinner B2, with mean pure premium: (4/6)(42) = 28:
[Plot omitted.]

Generally, the larger the process variance the more observations it will take until you can discern
which risk you are likely observing.
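Here is a short Python sketch of this kind of simulation experiment for Die A1 and Spinner B1 (a minimal illustration; the seed and helper names are assumptions):

import random

def simulate_mean_pp(p_claim, sev_probs, n_trials, seed=0):
    rng = random.Random(seed)
    sizes, probs = zip(*sev_probs.items())
    total = 0.0
    for _ in range(n_trials):
        if rng.random() < p_claim:                         # die roll: claim or no claim
            total += rng.choices(sizes, weights=probs)[0]  # spinner: claim size
    return total / n_trials

# Die A1 and Spinner B1, true mean pure premium (1/6)(35) = 5.83.
for n in (20, 50, 100, 200, 500, 1000):
    print(n, round(simulate_mean_pp(1/6, {30: 0.75, 50: 0.25}, n), 2))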


Problems:
Use the following information for the next 5 questions:
Two dice, A1 and A2 , are used to determine the number of claims. Each side of both dice are
marked with either a 0 or a 1, where 0 represents no claim and 1 represents a claim.
The probability of a claim for each die is:
Die     Probability of Claim
A1      1/6
A2      2/6

In addition, there are two spinners, B1 and B2 , representing claim severity. Each spinner has two
areas marked 10 and 40. The probabilities for each claim size are:
                  Claim Size
Spinner       10         40
B1            0.70       0.30
B2            0.20       0.80
A die is selected randomly from A1 and A2 and a spinner from B1 and B2.
Four observations from the selected die and spinner yield the following claim amounts in the
following order: 10, 0, 40, 10.
13.1 (3 points) Using Buhlmann Credibility, determine the expected claim frequency for the next
observation from the same die and spinner.
A. 0.28
B. 0.30
C. 0.32
D. 0.34

E. 0.36

13.2 (3 points) Using Buhlmann Credibility, determine the expected claim severity for the next
observation from the same die and spinner.
A. 22.0
B. 22.4
C. 22.8
D. 23.2

E. 23.6

13.3 (4 points) Using Buhlmann Credibility, determine the expected pure premium for the next
observation from the same die and spinner.
(Do not separately estimate the future frequency and severity.)
A. 7.0
B. 7.5
C. 8.0
D. 8.5
E. 9.0
13.4 (3 points) Using Bayesian Analysis, determine the expected pure premium for the next
observation from the same die and spinner.
A. 6.7
B. 7.0
C. 7.3
D. 7.6

E. 7.9


13.5 (2 points) A new die is selected randomly from A1 and A2 and a new spinner from B1
and B2 . For these same selected die and spinner, determine the limit of
E[Xn | X1 = X2 = . . . = Xn-1 = 0] as n goes to infinity.
A. 3.8

B. 4.0

C. 4.2

D. 4.4

E. 4.6

Use the following information for the next four questions.


Assume there are 3 types of risks each with equal probability. Whether or not there is a claim is
determined by whether a six-sided die comes up with a zero or a one, with a one indicating a claim.
If a claim occurs then its size is determined by a spinner.
Type    Number of die faces with a 1 rather than a 0    Claim Size Spinner
I       2                                               $100 70%, $200 30%
II      3                                               $100 50%, $200 50%
III     4                                               $100 30%, $200 70%
13.6 (3 points) In one observation of a risk, you observe a single claim for $200.
Use Bayes Theorem to estimate the pure premium for this risk.
A. less than 90
B. at least 90 but less than 92
C. at least 92 but less than 94
D. at least 94 but less than 96
E. at least 96
13.7 (2 points) What is the variance of the hypothetical mean pure premiums?
A. 820
B. 840
C. 860
D. 880
E. 900
13.8 (3 points) What is the expected value of the process variance of the pure premiums?
A. less than 6100
B. at least 6100 but less than 6200
C. at least 6200 but less than 6300
D. at least 6300 but less than 6400
E. at least 6400
13.9 (2 points) In one observation of a risk, you observe a single claim for $200.
Use Buhlmann credibility to estimate the pure premium for this risk.
A. less than 90
B. at least 90 but less than 92
C. at least 92 but less than 94
D. at least 94 but less than 96
E. at least 96


Use the following information for the next two questions.


Two dice, A and B, are used to determine the number of claims. The faces of each die are marked
with either a 1 or a 2, where 1 represents 1 claim and 2 represents 2 claims.
The probabilities for each die are:
Die     Probability of 1 Claim    Probability of 2 Claims
A       2/3                       1/3
B       1/3                       2/3

In addition, there are two spinners, X and Y, which are used to determine claim size. Each spinner
has two areas marked 2 and 5. The probabilities for each spinner are:
Spinner    Probability that Claim Size = 2    Probability that Claim Size = 5
X          2/3                                1/3
Y          1/3                                2/3
For the first trial, a die is randomly selected from A and B and rolled. If 1 claim occurs, spinner X is
spun. If 2 claims occur, both spinner X and spinner Y are spun. For the second trial, the same die
selected in the first trial is rolled again. If 1 claim occurs, spinner X is spun. If 2 claims occur, both
spinner X and spinner Y are spun.
13.10 (3 points) If the first trial yielded total losses of 7, use Bayesian Analysis to determine the
expected total losses for the second trial.
A. Less than 4.6
B. At least 4.6, but less than 4.9
C. At least 4.9, but less than 5.2
D. At least 5.2, but less than 5.5
E. At least 5.5
13.11 (4 points) If the first trial yielded total losses of 7, use Buhlmann Credibility to determine the
expected total losses for the second trial.
A. Less than 4.6
B. At least 4.6, but less than 4.9
C. At least 4.9, but less than 5.2
D. At least 5.2, but less than 5.5
E. At least 5.5


Use the following information for the next 4 questions:


Two dice, A1 and A2 , are used to determine the number of claims. Each side of both dice are
marked with either a 0 or a 1, where 0 represents no claim and 1 represents a claim.
The probability of a claim for each die is:
Die     Probability of Claim
A1      1/3
A2      1/2

In addition, there are two spinners, B1 and B2 , representing claim severity. Each spinner has two
areas marked 2 and 5. The probabilities for each claim size are:
                  Claim Size
Spinner       2          5
B1            0.60       0.40
B2            0.30       0.70
For Spinner B1 , the mean is 3.2, and the process variance is 2.16.
For Spinner B2 , the mean is 4.1, and the process variance is 1.89.
A die is selected randomly from A1 and A2 and a spinner from B1 and B2. Five observations from
the selected die and spinner yield the following claim amounts in the following order: 0, 2, 5, 0, 5.
13.12 (2 points) Using Bayesian Analysis, determine the expected claim frequency for the next
observation from the same die and spinner.
A. 0.40
B. 0.42
C. 0.44
D. 0.46
E. 0.48
13.13 (2 points) Using Bayesian Analysis, determine the expected claim severity for the next
observation from the same die and spinner.
A. less than 3.4
B. at least 3.4 but less than 3.5
C. at least 3.5 but less than 3.6
D. at least 3.6 but less than 3.7
E. at least 3.7
13.14 (3 points) Using Bayesian Analysis, determine the expected pure premium for the next
observation from the same die and spinner. (Do not separately estimate the future frequency and
severity.)
A. 1.60
B. 1.62
C. 1.64
D. 1.66
E. 1.68
13.15 (3 points) Using Buhlmann Credibility, determine the expected pure premium for the next
observation from the same die and spinner.
A. 1.60
B. 1.62
C. 1.64
D. 1.66
E. 1.68


Use the following information for the next two questions:

For each insured, frequency is Geometric with mean β.
For an insured picked at random, β is equally likely to be 5% or 15%.
For each insured, severity is Exponential with mean θ.
For an insured picked at random, θ is equally likely to be 10 or 20.
The distributions of β and θ are independent.
During years 1, 2, and 3, from an individual insured you observe a total of 2 claims
of sizes 5 and 15.
13.16 (3 points) Determine the Bayesian estimate of the expected value of the aggregate losses
from this same insured in year four.
A. less than 1.7
B. at least 1.7 but less than 1.8
C. at least 1.8 but less than 1.9
D. at least 1.9 but less than 2.0
E. at least 2.0
13.17 (3 points) Determine the Buhlmann Credibility estimate of the expected value of the
aggregate losses from this same insured in year four.
A. less than 1.7
B. at least 1.7 but less than 1.8
C. at least 1.8 but less than 1.9
D. at least 1.9 but less than 2.0
E. at least 2.0


Use the following information for the next two questions:


Two spinners, A1 and A2, are used to determine number of claims.
Each spinner is divided into regions marked 0 and 1, where 0 represents no claims and 1
represents a claim. The probability of a claim for each spinner is:
        Spinner         Probability of Claim
        A1              0.15
        A2              0.05
A second set of spinners, B1 and B2, represents claim severity.
Each spinner has two areas marked 20 and 40. The probabilities for each claim size are:
                        Claim Size
        Spinner         20      40
        B1              0.80    0.20
        B2              0.30    0.70
A spinner is selected randomly from A1 and A2 and a second from B1 and B2 .
Three observations from the selected spinners yield the following claim amounts in the following
order: 0, 20, 0.
13.18 (4B, 11/92, Q.6) (3 points) Use Buhlmann credibility to separately estimate the expected
number of claims and expected severity. Use these estimates to calculate the expected value of
the next observation from the same pair of spinners.
A. Less than 2.9
B. At least 2.9 but less than 3.0
C. At least 3.0 but less than 3.1
D. At least 3.1 but less than 3.2
E. At least 3.2
13.19 (4B, 11/92, Q.7) (3 points) Determine the Bayesian estimate of the expected value of the
next observation from the same pair of spinners.
A. Less than 2.9
B. At least 2.9 but less than 3.0
C. At least 3.0 but less than 3.1
D. At least 3.1 but less than 3.2
E. At least 3.2


Use the following information for the next two questions:


Two dice, A1 and A2, are used to determine the number of claims. Each side of both dice is
marked with either a 0 or a 1, where 0 represents no claim and 1 represents a claim.
The probability of a claim for each die is:
        Die     Probability of Claim
        A1      1/6
        A2      3/6
In addition, there are two spinners, B1 and B2, representing claim severity. Each spinner has two
areas marked 2 and 14. The probabilities for each claim size are:
                        Claim Size
        Spinner         2       14
        B1              5/6     1/6
        B2              3/6     3/6

A die is randomly selected from A1 and A2 and a spinner is randomly selected from B1 and B2 .
The selected die is rolled and if a claim occurs, the selected spinner is spun.
13.20 (4B, 5/93, Q.13) (2 points) Determine E[X1 ], where X1 is the first observation from the
selected die and spinner.
A. 2/3
B. 4/3
C. 2
D. 4
E. 8
13.21 (4B, 5/93, Q.14) (2 points) For the same selected die and spinner, determine the limit of
E[Xn | X1 = X2 = . . . = Xn-1 = 0] as n goes to infinity.
A. Less than 0.75
B. At least 0.75 but less than 1.50
C. At least 1.50 but less than 2.25
D. At least 2.25 but less than 3.00
E. At least 3.00


13.22 (4B, 11/93, Q.14) (3 points) There are two methods for calculating credibility estimates for
pure premium. One utilizes separate estimates for frequency and severity, and the other utilizes
only the aggregate claim amount. Let A1 and A2 be equally likely frequency distributions and let B1
and B2 be equally likely severity distributions.
        Number          Probability of                  Amount          Probability of
        of Claims       Claim for:                      of Claims       Claim Amount for:
                        A1              A2                              B1              B2
        0               0.80            0.60            100             0.40            0.80
        1               0.20            0.40            200             0.60            0.20
A state Ai, Bj is selected at random, and a claim of 100 is observed. Determine the Buhlmann
credibility estimate for the next observation from the same selected state utilizing only aggregate
claim amounts.
A. Less than 41.5
B. At least 41.5, but less than 42.5
C. At least 42.5, but less than 43.5
D. At least 43.5, but less than 44.5
E. At least 44.5


Use the following information for the next two questions:

A portfolio of independent risks is divided into two classes of equal size.

All of the risks in Class 1 have identical claim count and claim size distributions as follows:
        Class 1                                 Class 1
        Number of Claims        Probability     Claim Size      Probability
        1                       1/2             50              2/3
        2                       1/2             100             1/3
All of the risks in Class 2 have identical claim count and claim size distributions as follows:
        Class 2                                 Class 2
        Number of Claims        Probability     Claim Size      Probability
        1                       2/3             50              1/2
        2                       1/3             100             1/2
The number of claims and claim size(s) for each risk are independent.
A risk is selected at random from the portfolio, and a pure premium of 100 is observed
for the first exposure period.

13.23 (4B, 11/95, Q.14 & Course 4 Sample Exam 2000, Q.19) (3 points)
Determine the Bayesian analysis estimate of the expected number of claims for this same risk for
the second exposure period.
A. 4/3
B. 25/18
C. 41/29
D. 17/12
E. 3/2
13.24 (4B, 11/95, Q.15 & Course 4 Sample Exam 2000, Q.20) (2 points)
A pure premium of 150 is observed for this risk for the second exposure period.
Determine the Buhlmann credibility estimate of the expected pure premium for this same risk for the
third exposure period.
A. Less than 110
B. At least 110, but less than 120
C. At least 120, but less than 130
D. At least 130, but less than 140
E. At least 140


13.25 (4B, 5/96, Q.19) (2 points) Two dice, A and B, are used to determine the number of claims.
The faces of each die are marked with either a 1 or a 2, where 1 represents 1 claim and 2 represents
2 claims. The probabilities for each die are:
        Die     Probability of 1 Claim          Probability of 2 Claims
        A       2/3                             1/3
        B       1/3                             2/3
In addition, there are two spinners, X and Y, which are used to determine claim size. Each spinner
has two areas marked 2 and 5. The probabilities for each spinner are:
        Spinner         Probability that Claim Size = 2         Probability that Claim Size = 5
        X               2/3                                     1/3
        Y               1/3                                     2/3

For the first trial, a die is randomly selected from A and B and rolled. If 1 claim occurs, spinner X is
spun. If 2 claims occur, both spinner X and spinner Y are spun. For the second trial, the same die
selected in the first trial is rolled again. If 1 claim occurs, spinner X is spun. If 2 claims occur, both
spinner X and spinner Y are spun. If the first trial yielded total losses of 5, determine the expected
number of claims for the second trial.
A. Less than 1.38
B. At least 1.38, but less than 1.46
C. At least 1.46, but less than 1.54
D. At least 1.54, but less than 1.62
E. At least 1.62
13.26 (2 points) In the previous question, if the first trial yielded total losses of 7, determine the
expected losses for the second trial using Bayes Analysis.
13.27 (3 points) In 4B, 5/96, Q.19, if the first trial yielded total losses of 7, determine the expected
losses for the second trial using Buhlmann Credibility.


Use the following information for the next two questions:


Two dice, A and B, are used to determine the number of claims. The faces of each die are marked
with either a 0 or a 1, where 0 represents 0 claims and 1 represents 1 claim. The probabilities for
each die are:
        Die     Probability of 0 Claims         Probability of 1 Claim
        A       2/3                             1/3
        B       1/3                             2/3
In addition, there are two spinners, X and Y, which are used to determine claim size. Spinner X has
two areas marked 2 and 8. Spinner Y has only one area marked 2. The probabilities for each
spinner are:
        Spinner         Probability that Claim Size = 2         Probability that Claim Size = 8
        X               1/3                                     2/3
        Y               1                                       0
For the first trial, a die is randomly selected from A and B and rolled. If a claim occurs, a spinner is
randomly selected from X and Y and spun.
13.28 (4B, 11/96, Q.6) (1 point)
Determine the expected amount of total losses for the first trial.
A. Less than 1.4
B. At least 1.4, but less than 1.8
C. At least 1.8, but less than 2.2
D. At least 2.2, but less than 2.6
E. At least 2.6
13.29 (4B, 11/96, Q.7) (2 points)
For each subsequent trial, the same die selected in the first trial is rolled again.
If a claim occurs, a spinner is again randomly selected from X and Y and spun.
Determine the limit of the Bayesian analysis estimate of the expected amount of total losses for the
nth trial as n goes to infinity if the first n-1 trials each yielded total losses of 2.
A. Less than 1.4
B. At least 1.4, but less than 1.8
C. At least 1.8, but less than 2.2
D. At least 2.2, but less than 2.6
E. At least 2.6


13.30 (4, 11/00, Q.33) (2.5 points) A car manufacturer is testing the ability of safety devices to limit
damages in car accidents. You are given:
(i) A test car has either front air bags or side air bags (but not both), each type being
equally likely.
(ii) The test car will be driven into either a wall or a lake, with each accident type
being equally likely.
(iii) The manufacturer randomly selects 1, 2, 3 or 4 crash test dummies to put into a
car with front air bags.
(iv) The manufacturer randomly selects 2 or 4 crash test dummies to put into a car
with side air bags.
(v) Each crash test dummy in a wall-impact accident suffers damage randomly equal
to either 0.5 or 1, with damage to each dummy being independent of damage to
the others.
(vi) Each crash test dummy in a lake-impact accident suffers damage randomly equal
to either 1 or 2, with damage to each dummy being independent of damage to
the others.
One test car is selected at random, and a test accident produces total damage of 1.
Determine the expected value of the total damage for the next test accident, given that the kind of
safety device (front or side air bags) and accident type (wall or lake) remain the same.
(A) 2.44
(B) 2.46
(C) 2.52
(D) 2.63
(E) 3.09
13.31 (3 points) In the previous question, determine the expected value of the total damage for the
next test accident if: the number of dummies, the kind of safety device (front or side air bags), and
the accident type (wall or lake), all remain the same.
(A) 1.0
(B) 1.1
(C) 1.2
(D) 1.3
(E) 1.4


Use the following information for 4, 5/01, questions 10 and 11.


(i) The claim count and claim size distributions for risks of type A are:
        Number of Claims        Probabilities           Claim Size      Probabilities
        0                       4/9                     500             1/3
        1                       4/9                     1235            2/3
        2                       1/9
(ii) The claim count and claim size distributions for risks of type B are:
        Number of Claims        Probabilities           Claim Size      Probabilities
        0                       1/9                     250             2/3
        1                       4/9                     328             1/3
        2                       4/9
(iii) Risks are equally likely to be type A or type B.
(iv) Claim counts and claim sizes are independent within each risk type.
(v) The variance of the total losses is 296,962.
A randomly selected risk is observed to have total annual losses of 500.
13.32 (4, 5/01, Q.10) (2.5 points)
Determine the Bayesian premium for the next year for this same risk.
(A) 493
(B) 500
(C) 510
(D) 513
(E) 514
13.33 (4, 5/01, Q.11) (2.5 points)
Determine the Bühlmann credibility premium for the next year for this same risk.
(A) 493
(B) 500
(C) 510
(D) 513
(E) 514



13.34 (4, 5/01, Q.28) (2.5 points) Two eight-sided dice, A and B, are used to determine the
number of claims for an insured. The faces of each die are marked with either 0 or 1, representing the
number of claims for that insured for the year.
        Die     Pr(Claims = 0)          Pr(Claims = 1)
        A       1/4                     3/4
        B       3/4                     1/4
Two spinners, X and Y, are used to determine claim cost. Spinner X has two areas marked 12 and c.
Spinner Y has only one area marked 12.
        Spinner         Pr(Cost = 12)           Pr(Cost = c)
        X               1/2                     1/2
        Y               1                       0
To determine the losses for the year, a die is randomly selected from A and B and rolled.
If a claim occurs, a spinner is randomly selected from X and Y and spun.
For subsequent years, the same die and spinner are used to determine losses.
Losses for the first year are 12. Based upon the results of the first year, you determine that the
expected losses for the second year are 10. Calculate c.
(A) 4
(B) 8
(C) 12
(D) 24
(E) 36
13.35 (4, 11/01, Q.23 & 2009 Sample Q.70) (2.5 points) You are given the following
information on claim frequency of automobile accidents for individual drivers:
                        Business Use                    Pleasure Use
                Expected        Claim           Expected        Claim
                Claims          Variance        Claims          Variance
        Rural   1.0             0.5             1.5             0.8
        Urban   2.0             1.0             2.5             1.0
        Total   1.8             1.06            2.3             1.12
You are also given:
(i) Each driver's claims experience is independent of every other driver's.
(ii) There are an equal number of business and pleasure use drivers.
Determine the Bühlmann credibility factor for a single driver.
(A) 0.05
(B) 0.09
(C) 0.17
(D) 0.19
(E) 0.27


Solutions to Problems:
13.1. C. The EPV for frequency is .1806 and is calculated as follows:
        Type of    A Priori       Bernoulli     Process                   Square of
        Die        Probability    Parameter     Variance     Mean         Mean
        A1         0.50           0.1667        0.1389       0.1667       0.0278
        A2         0.50           0.3333        0.2222       0.3333       0.1111
        Average                                 0.1806       0.2500       0.0694
VHM = 0.0694 - 0.2500² = 0.0069.  K = EPV / VHM = 0.1806 / 0.0069 = 26.2.  Z = 4/(4 + K) = 0.132.
The a priori mean frequency is .25. The observed claim frequency is 3/4= .75.
Thus the estimated future frequency is: (.132)(.75) + (1 - .132)(.25) = 0.316.
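As a quick numerical check, here is a minimal Python sketch of the same Buhlmann frequency calculation (the variable names are mine, and the equal a priori weights on the two dice are as assumed in the solution above):

    # Minimal sketch of the Buhlmann frequency estimate in 13.1 (equal priors assumed).
    qs = [1/6, 2/6]                       # Bernoulli claim probabilities for dice A1, A2
    prior = [0.5, 0.5]
    epv = sum(p * q * (1 - q) for p, q in zip(prior, qs))      # expected process variance
    mean = sum(p * q for p, q in zip(prior, qs))
    vhm = sum(p * q * q for p, q in zip(prior, qs)) - mean**2  # variance of hypothetical means
    z = 4 / (4 + epv / vhm)                                    # four trials observed
    print(round(z * 3/4 + (1 - z) * mean, 3))                  # about 0.317 (0.316 above, after rounding)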
13.2. D. For Spinner B1, the mean is (.7)(10) + (.3)(40) = 19, the second moment is
(.7)(10²) + (.3)(40²) = 550, and the process variance is 550 - 19² = 189.
        Type of    A Priori       Mean    Second    Process
        Spinner    Probability            Moment    Variance
        B1         0.50           19      550       189
        B2         0.50           34      1300      144
        Average                                     166.5
The Expected Value of the Process Variance = (1/2)(189) + (1/2)(144) = 166.5.
        Type of    A Priori       Mean    Square of
        Spinner    Probability            Mean
        B1         0.50           19      361
        B2         0.50           34      1156
        Average                   26.5    758.5
Thus the Variance of the Hypothetical Means = 758.5 - 26.5² = 56.25.


Therefore, the Buhlmann Credibility Parameter for severity = K = EPV / VHM =
166.5 / 56.25 = 2.96. Thus the credibility for 3 claims is: 3/(3 + K) = 50.3%.
The a priori mean severity is 26.5. The observed claim severity is: (10 + 40 + 10)/3= 20.
Thus the estimated future severity is: (.503)(20) + (1 - .503)(26.5) = 23.2.
Comment: Note that the spinners are chosen independently of the dice, so frequency and severity
are independent across risk types. Thus one can ignore the frequency process in this question. One
can not do so when for example low frequency is associated with low severity, as in the questions
related to good, bad and ugly drivers.


13.3. C. For each possible pair of die and spinner, variance of p.p. = μf σS² + μS² σf².
        Die and    A Priori    Mean     Variance    Mean        Variance    Process Var.
        Spinner    Prob.       Freq.    of Freq.    Severity    of Sev.     of P.P.
        A1, B1     0.250       0.167    0.139       19          189         81.64
        A1, B2     0.250       0.167    0.139       34          144         184.56
        A2, B1     0.250       0.333    0.222       19          189         143.22
        A2, B2     0.250       0.333    0.222       34          144         304.89
        Mean                                                                178.58
        Die and    A Priori    Mean     Mean        Mean Pure    Square of
        Spinner    Prob.       Freq.    Severity    Premium      Mean P.P.
        A1, B1     0.250       0.167    19          3.17         10.03
        A1, B2     0.250       0.167    34          5.67         32.11
        A2, B1     0.250       0.333    19          6.33         40.11
        A2, B2     0.250       0.333    34          11.33        128.44
        Mean                                        6.62         52.67
Thus VHM = 52.67 - 6.62² = 8.85.  K = EPV / VHM = 178.6 / 8.85 = 20.2.
Z = 4/(4 + K) = .165. The a priori mean pure premium is 6.62.
The observed pure premium is: (10 + 0 + 40 + 10)/4 = 15.
Thus the estimated future pure premium is: (.165)(15) + (1 - .165)(6.62) = 8.00.
Comment: Note that the result is not equal to the product of the separate estimates for frequency
and severity: (.316)(23.3) = 7.36 ≠ 8.00. Neither is it equal to the estimate of pure premium using
Bayesian Analysis: 6.84 ≠ 8.00.
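Here is a minimal Python sketch of the same pure-premium Buhlmann calculation, using the per-pair means and variances from the tables above (equal 1/4 priors assumed; names are mine):

    # Sketch of the 13.3 pure-premium Buhlmann estimate.
    freq = [(1/6, (1/6)*(5/6)), (1/6, (1/6)*(5/6)), (2/6, (2/6)*(4/6)), (2/6, (2/6)*(4/6))]
    sev  = [(19, 189), (34, 144), (19, 189), (34, 144)]
    pv  = [mf*vs + ms*ms*vf for (mf, vf), (ms, vs) in zip(freq, sev)]  # process variance of p.p.
    pp  = [mf*ms for (mf, _), (ms, _) in zip(freq, sev)]               # mean pure premiums
    epv = sum(pv) / 4
    vhm = sum(x*x for x in pp) / 4 - (sum(pp) / 4)**2
    z   = 4 / (4 + epv / vhm)                   # four observations
    print(round(z*15 + (1 - z)*sum(pp)/4, 2))   # about 8.00, with observed pure premium 15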
13.4. A. There are four possible combinations of die and spinner, or four risk types.
If Die A1 and Spinner B1, then the chance of the observation of 10, 0, 40, 10, is:
{(1/6)(.7)}(5/6){(1/6)(.3)}{(1/6)(.7)} = .000567.
If Die A1 and Spinner B2, then the chance of the observation is:
{(1/6)(.2)}(5/6){(1/6)(.8)}{(1/6)(.2)} = .000123.
If Die A2 and Spinner B1, then the chance of the observation is:
{(2/6)(.7)}(4/6){(2/6)(.3)}{(2/6)(.7)} = .003630.
If Die A2 and Spinner B2, then the chance of the observation is:
{(2/6)(.2)}(4/6){(2/6)(.8)}{(2/6)(.2)} = .000790.
        Die and    A Priori    Chance of      Probability    Posterior      Mean Pure
        Spinner    Prob.       Observation    Weight         Probability    Premium
        A1, B1     0.250       0.000567       0.0001418      11.1%          3.17
        A1, B2     0.250       0.000123       0.0000308      2.4%           5.67
        A2, B1     0.250       0.003630       0.0009075      71.0%          6.33
        A2, B2     0.250       0.000790       0.0001975      15.5%          11.33
        Mean                                  0.001277       100.0%         6.74
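A short Python sketch of the same Bayesian weighting (equal 1/4 priors as above; the names are mine):

    # Sketch of the 13.4 Bayes estimate: weight each die-spinner pair by its chance
    # of producing the observation 10, 0, 40, 10.
    like = {
        ("A1", "B1"): (1/6*0.7) * (5/6) * (1/6*0.3) * (1/6*0.7),
        ("A1", "B2"): (1/6*0.2) * (5/6) * (1/6*0.8) * (1/6*0.2),
        ("A2", "B1"): (2/6*0.7) * (4/6) * (2/6*0.3) * (2/6*0.7),
        ("A2", "B2"): (2/6*0.2) * (4/6) * (2/6*0.8) * (2/6*0.2),
    }
    mean_pp = {("A1", "B1"): (1/6)*19, ("A1", "B2"): (1/6)*34,
               ("A2", "B1"): (2/6)*19, ("A2", "B2"): (2/6)*34}
    total = sum(like.values())
    print(round(sum(like[k]/total * mean_pp[k] for k in like), 2))   # about 6.74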


13.5. D. As we have more and more observations with no claim, the probability that we selected
die A1 rather than die A2 increases. Therefore the expected value of the pure premium goes to
(mean frequency for die A1)(mean severity) = (1/6)((19 + 34)/2) = 4.42. Alternately, let's assume
we have for example 100 observations with no claim; then the chance of this observation is (5/6)^100
if we have die A1 and (4/6)^100 if we have die A2.
Thus the Bayes Analysis is as follows:
        Type of Die    A Priori    Chance of        Prob.        Posterior    Mean     Mean        Mean Pure
        and Spinner    Chance      Observation      Weight       Chance       Freq.    Severity    Premium
        A1, B1         0.25        1.207e-8         3.019e-9     0.5000       0.167    19          3.17
        A1, B2         0.25        1.207e-8         3.019e-9     0.5000       0.167    34          5.67
        A2, B1         0.25        2.460e-18        6.149e-19    0.0000       0.333    19          6.33
        A2, B2         0.25        2.460e-18        6.149e-19    0.0000       0.333    34          11.33
        Overall                                     6.037e-9     1.000                             4.42
13.6. C. For a Type I Risk, the chance of observing a $200 claim is (2/6)(30%) = 10%, since there
is a 2/6 chance of observing any claim, and once a claim is observed there is 30% chance it will be
$200. Similarly for Type II : (3/6)(50%) = 25%, For Type III : (4/6)(70%) = 46.7%. Since the
types of risks are equally likely a priori, the posterior probabilities are therefore proportional to 10%,
25% and 46.7%. As calculated below this results in a posterior weighted average p.p. of 93.0.
        Type of    A Priori    Chance of        Prob.      Posterior    Avg. Pure
        Risk       Chance      Observation      Weight     Chance       Premium
        I          0.333       0.100            0.033      12.2%        43.3
        II         0.333       0.250            0.083      30.6%        75.0
        III        0.333       0.467            0.156      57.1%        113.3
        Overall                                 0.272      100.0%       93.0


13.7. A. The mean pure premiums are computed for each type of risk by multiplying the mean
frequency by the mean severity. The frequencies are : 2/6, 3/6, 4/6. The mean severities are: $130,
$150, $170. Mean p.p.: 130/3, 150/2, 340/3. Remembering that the a priori probabilities are
stated to be equal, the grand mean pure premium is $77.22. The expected value of the p.p.
squared is 6782. Thus the variance = 6782 - 77.22² = 819.

        Type of    A Priori    Avg. Claim    Avg. Claim    Avg. Pure    Square of Avg.
        Risk       Chance      Freq.         Severity      Premium      Pure Premium
        I          0.333       0.333         130           43.3         1878
        II         0.333       0.500         150           75.0         5625
        III        0.333       0.667         170           113.3        12844
        Average                                            77.2         6782
13.8. D. Use for each type the formula: variance of p.p. = μf σS² + μS² σf², to get process variances
of: 4456, 6875, and 7822. For example for Type I: μf = 1/3, σf² = (1/3)(2/3), μS = 130,
σS² = (70%)(30²) + (30%)(70²) = 2100; Type I variance of p.p. = (1/3)(2100) + (2/9)(130²) = 4456.
        Type of    A Priori    Mean     Variance    Mean        Variance    Process Var.
        Risk       Chance      Freq.    of Freq.    Severity    of Sev.     of P.P.
        I          0.333       0.333    0.222       130         2100        4456
        II         0.333       0.500    0.250       150         2500        6875
        III        0.333       0.667    0.222       170         2100        7822
        Mean                                                                6384

Comment: In this case, while for any given type of risk the frequency and severity are independent,
this is not true across risks. For example, the Type I risks are low frequency and low severity.
13.9. B. Using the solutions to the two previous questions,
K = EPV/ VHM = 6384/819 = 7.79. Thus for one observation, Z = 1 / 8.79 = 11.4%.
Estimated p.p. = (.114)($200) + (1 - .114)($77.22) = $91.2.


13.10. D. Note that unlike the usual die-spinner example, we do not pick the spinners at random
(independent of the number of claims.) Rather, the spinners depend on the number of claims, but
not the type of risk. In fact we only have two types of risk: low frequency corresponding to Die A
and high frequency corresponding to Die B. The mean of Spinner X is: (2/3)(2) +(1/3)(5) = 3, while
the mean of Spinner Y is: (1/3)(2) + (2/3)(5) = 4. If we have one claim the mean loss is E[X] = 3.
If we have two claims, then the mean loss is: E[X+Y] = E[X] + E[Y] = 3 + 4 = 7.
Thus the mean pure premium for Die A is: (2/3)(3) + (1/3)(7) = 4.333.
The mean pure premium for Die B is: (1/3)(3) + (2/3)(7) = 5.667.
In this case, if one observes total losses of 7, it must have come from two claims, one of size 2
and one of size 5. There are two ways this could occur; either the claim from Spinner X is 2 and that
from Spinner Y is 5 or vice versa. Thus if we have two claims, the chance of one being of size 2 and
the other of size 5 is the sum of these two situations: (2/3)(2/3) + (1/3)(1/3) = 5/9.
Thus if we have selected Die A, there is a (1/3)(5/9) = .1852 chance of this observation.
If we have selected Die B, there is a (2/3)(5/9)= .3704 chance of this observation.
               A Priori      Chance of      Prob.      Posterior    Mean P.P. for
        Die    Chance        Observation    Weight     Chance       This Type of Die
        A      0.500         0.1852         0.0926     33.3%        4.333
        B      0.500         0.3704         0.1852     66.7%        5.667
        Overall                             0.2778     100.0%       5.222
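A minimal Python sketch of the same Bayes step (names are mine):

    # Sketch of the 13.10 Bayes estimate: total losses of 7 force two claims (a 2 and a 5),
    # so each die's likelihood is just its probability of two claims (times a common 5/9).
    like = {"A": 1/3, "B": 2/3}
    mean_pp = {"A": 13/3, "B": 17/3}
    total = sum(like.values())
    print(round(sum(like[d]/total * mean_pp[d] for d in like), 3))   # about 5.222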


13.11. C. The mean pure premium if Die A is 4.333 and the mean pure premium if Die B is 5.667.
Thus since Die A and Die B are equally likely a priori, the overall mean is 5 and the Variance of the
Hypothetical Mean Pure Premiums is:
{(5.667 - 5)² + (4.333 - 5)²}/2 = 0.667² = 0.444.
If one has Die A, then the possible outcomes are as follows:
        Situation                      Probability    Pure Premium    Square of P.P.
        1 claim @ 2                    44.4%          2               4
        1 claim @ 5                    22.2%          5               25
        2 claims @ 2 each              7.4%           4               16
        2 claims: X @ 2 & Y @ 5        14.8%          7               49
        2 claims: X @ 5 & Y @ 2        3.7%           7               49
        2 claims @ 5 each              7.4%           10              100
        Overall                        100.0%         4.333           25.00
Thus for Die A, the process variance of the pure premiums is 25 - 4.333² = 6.225.
Similarly, if one has Die B, then the possible outcomes are as follows:
        Situation                      Probability    Pure Premium    Square of P.P.
        1 claim @ 2                    22.2%          2               4
        1 claim @ 5                    11.1%          5               25
        2 claims @ 2 each              14.8%          4               16
        2 claims: X @ 2 & Y @ 5        29.6%          7               49
        2 claims: X @ 5 & Y @ 2        7.4%           7               49
        2 claims @ 5 each              14.8%          10              100
        Overall                        100.0%         5.667           39.00
Thus for Die B, the process variance of the pure premiums is 39 - 5.667² = 6.885.
Thus since Die A and Die B are equally likely a priori, the Expected Value of the Process Variance of
the Pure Premiums is: (.5)(6.225) + (.5)(6.885) = 6.555.
Thus the Buhlmann Credibility Parameter K = EPV / VHM = 6.555 / .444 = 14.8. Thus one
observation would be given credibility of 1/(1 + 14.8) = 6.3%. The a priori mean pure premium is:
(.5)(4.333) + (.5)(5.667) = 5. Since the observed pure premium is 7, the Buhlmann Credibility
estimate of the future pure premium is: (.063)(7) + (1 - .063)(5) = 5.126.
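The final step can be checked with a short Python sketch (names mine, reusing the EPV and VHM worked out above):

    # Sketch of the 13.11 Buhlmann estimate.
    epv = 0.5*6.225 + 0.5*6.885
    vhm = ((5.667 - 5)**2 + (4.333 - 5)**2) / 2
    z = 1 / (1 + epv/vhm)                       # a single observation
    print(round(z*7 + (1 - z)*5, 3))            # about 5.127 (5.126 above, after rounding)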


13.12. C. The frequency only depends on the type of die.


We observe no claim, claim, claim, no claim, claim.
The chance of this observation is: (1 - q)² q³.
               A Priori    Chance of        Probability    Posterior      Mean
        Die    Chance      Observation      Weight         Probability    Frequency
        A1     0.500       0.016461         0.0082305      34.5%          0.3333
        A2     0.500       0.031250         0.0156250      65.5%          0.5000
        Mean                                0.023855       100.0%         0.4425
13.13. E. The severity only depends on the type of spinner.


We observe claims of size 2, 5, and 5.
For Spinner B1, this has a probability of: (.6)(.4)(.4) = 0.096.
                   A Priori    Chance of        Probability    Posterior      Mean
        Spinner    Chance      Observation      Weight         Probability    Severity
        B1         0.500       0.096000         0.0480000      39.5%          3.200
        B2         0.500       0.147000         0.0735000      60.5%          4.100
        Mean                                    0.121500       100.0%         3.744

13.14. D. There are four possible combinations of die and spinner, or four risk types.
If Die A1 and Spinner B1, then the chance of the observation of 0, 2, 5, 0, 5 is:
(2/3){(1/3)(.6)}{(1/3)(.4)}(2/3){(1/3)(.4)} = .001580.
If Die A1 and Spinner B2, then the chance of the observation is:
(2/3){(1/3)(.3)}{(1/3)(.7)}(2/3){(1/3)(.7)} = .002420.
If Die A2 and Spinner B1, then the chance of the observation is:
(1/2){(1/2)(.6)}{(1/2)(.4)}(1/2){(1/2)(.4)} = .003000.
If Die A2 and Spinner B2, then the chance of the observation is:
(1/2){(1/2)(.3)}{(1/2)(.7)}(1/2){(1/2)(.7)} = .004594.
        Die and    A Priori    Chance of        Probability    Posterior      Mean Pure
        Spinner    Chance      Observation      Weight         Probability    Premium
        A1, B1     0.250       0.001580         0.0003951      13.6%          1.067
        A1, B2     0.250       0.002420         0.0006049      20.9%          1.367
        A2, B1     0.250       0.003000         0.0007500      25.9%          1.600
        A2, B2     0.250       0.004594         0.0011484      39.6%          2.050
        Mean                                    0.002898       100.0%         1.657

Comment: Note that the result is equal to the product of the separate estimates for frequency and
severity: (.4425)(3.744) = 1.657. This is due to the fact that in this case the die and spinner are
chosen separately. This is not a general property of Bayesian Analysis.
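A short Python sketch ties 13.12 through 13.14 together; because the die and spinner are picked independently, the Bayes pure premium factors into frequency times severity (names are mine):

    # Sketch of the 13.12-13.14 Bayes estimates.
    freq_like = {"A1": (2/3)**2 * (1/3)**3, "A2": (1/2)**5}    # observed 0,1,1,0,1 claims
    sev_like  = {"B1": 0.6*0.4*0.4,         "B2": 0.3*0.7*0.7} # observed sizes 2,5,5
    freq_mean = {"A1": 1/3, "A2": 1/2}
    sev_mean  = {"B1": 3.2, "B2": 4.1}

    def post_mean(like, mean):
        t = sum(like.values())
        return sum(like[k]/t * mean[k] for k in like)

    f = post_mean(freq_like, freq_mean)
    s = post_mean(sev_like, sev_mean)
    print(round(f, 4), round(s, 3), round(f*s, 3))   # about 0.4425, 3.744, 1.657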


13.15. C. In order to calculate the EPV for each possible pair of die and spinner, use the formula:
variance of p.p. = μf σS² + μS² σf².
        Die and    A Priori    Mean     Variance    Mean        Variance    Process Var.
        Spinner    Prob.       Freq.    of Freq.    Severity    of Sev.     of P.P.
        A1, B1     0.250       0.333    0.222       3.2         2.16        2.996
        A1, B2     0.250       0.333    0.222       4.1         1.89        4.366
        A2, B1     0.250       0.500    0.250       3.2         2.16        3.640
        A2, B2     0.250       0.500    0.250       4.1         1.89        5.147
        Mean                                                                4.037
        Die and    A Priori    Mean     Mean        Mean Pure    Square of
        Spinner    Prob.       Freq.    Severity    Premium      Mean P.P.
        A1, B1     0.250       0.333    3.2         1.067        1.138
        A1, B2     0.250       0.333    4.1         1.367        1.868
        A2, B1     0.250       0.500    3.2         1.600        2.560
        A2, B2     0.250       0.500    4.1         2.050        4.202
        Mean                                        1.521        2.442
Thus VHM = 2.442 - 1.521² = 0.1286.  K = EPV / VHM = 4.037 / 0.1286 = 31.4.
Z = 5/(5 + K) = .137. The a priori mean pure premium is 1.52.
The observed pure premium is: (0 + 2 + 5 + 0 + 5)/5 = 2.4.
Thus the estimated future pure premium is: (.137)(2.4) + (1 - .137)(1.52) = 1.64.
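A minimal Python sketch of the same Buhlmann calculation (equal 1/4 priors assumed; names are mine):

    # Sketch of the 13.15 Buhlmann pure-premium estimate.
    pairs = [(1/3, 3.2, 2.16), (1/3, 4.1, 1.89), (1/2, 3.2, 2.16), (1/2, 4.1, 1.89)]
    pv = [mf*vs + ms*ms*mf*(1 - mf) for mf, ms, vs in pairs]   # Bernoulli frequency variance
    pp = [mf*ms for mf, ms, _ in pairs]
    epv = sum(pv)/4
    vhm = sum(x*x for x in pp)/4 - (sum(pp)/4)**2
    z = 5 / (5 + epv/vhm)                        # five observations
    print(round(z*2.4 + (1 - z)*sum(pp)/4, 2))   # about 1.64, with observed pure premium 2.4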


13.16. C. There are four risk types, equally likely:
β = 5% and θ = 10, β = 5% and θ = 20, β = 15% and θ = 10, β = 15% and θ = 20.
Given β and θ, the mean (annual) pure premium is βθ.
Over three years, the number of claims is Geometric with mean 3β; the density at 2 is:
(3β)² / (1 + 3β)³.
For the Exponential severity: f(5) f(15) = (1/θ) e^(-5/θ) (1/θ) e^(-15/θ) = e^(-20/θ) / θ².
Thus the chance of the observation is proportional to: {(3β)² / (1 + 3β)³} {e^(-20/θ) / θ²}.
Since the four risk types are equally likely, we can also use this as the probability weight.
        Beta    Theta    Probability    Posterior      Mean Pure
                         Weight         Probability    Premium
        5%      10       0.00000222     10.8%          0.500
        5%      20       0.00000151     7.4%           1.000
        15%     10       0.00000999     48.7%          1.500
        15%     20       0.00000679     33.1%          3.000
        Mean             0.00002051     100.0%         1.851

Comment: Analogous to a die-spinner question in which the die and spinner are chosen separately,
and thus with a cross-classification setup and 4 risk types.
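A minimal Python sketch of this Bayes calculation (names mine; the constant of proportionality cancels in the posterior):

    # Sketch of the 13.16 Bayes estimate over the four (beta, theta) risk types.
    from math import exp
    types = [(0.05, 10), (0.05, 20), (0.15, 10), (0.15, 20)]   # equally likely a priori

    def likelihood(b, t):
        geo = (3*b)**2 / (1 + 3*b)**3     # 2 claims in 3 years, Geometric with mean 3*beta
        sev = exp(-20/t) / t**2           # product of Exponential densities at 5 and 15
        return geo * sev

    w = [likelihood(b, t) for b, t in types]
    post = [x / sum(w) for x in w]
    print(round(sum(p * b * t for p, (b, t) in zip(post, types)), 3))   # about 1.851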
13.17. B. There are four risk types, equally likely:
β = 5% and θ = 10, β = 5% and θ = 20, β = 15% and θ = 10, β = 15% and θ = 20.
Given β and θ, the mean (annual) pure premium is: βθ.
Given β and θ, the process variance of the annual pure premium is: βθ² + θ²β(1 + β) = θ²(2β + β²).
        Beta    Theta    Process     Mean Pure    Square of Mean
                         Variance    Premium      Pure Premium
        5%      10       10.25       0.5          0.250
        5%      20       41.00       1.0          1.000
        15%     10       32.25       1.5          2.250
        15%     20       129.00      3.0          9.000
        Mean             53.12       1.5          3.125

EPV = 53.12.
VHM = 3.125 - 1.52 = 0.875.
K = EPV / VHM = 53.12 / 0.875 = 60.7. For three years of data, Z = 3 / (3 + 60.7) = 4.7%.
The observed annual pure premium is: (5 + 15)/3 = 20/3.
The prior mean annual pure premium is 1.5.
The estimated pure premium for year four is: (0.047)(20/3) + (1 - 0.047)(1.5) = 1.74.
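The Buhlmann version can also be checked with a short Python sketch (names mine):

    # Sketch of the 13.17 Buhlmann estimate for the same Geometric/Exponential setup.
    types = [(0.05, 10), (0.05, 20), (0.15, 10), (0.15, 20)]
    pv = [t*t*(2*b + b*b) for b, t in types]     # beta*theta^2 + theta^2*beta*(1 + beta)
    pp = [b*t for b, t in types]
    epv = sum(pv)/4
    vhm = sum(x*x for x in pp)/4 - (sum(pp)/4)**2
    z = 3 / (3 + epv/vhm)                        # three years of data
    print(round(z*(20/3) + (1 - z)*1.5, 2))      # about 1.74, with observed annual p.p. 20/3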


13.18. D. First one estimates the frequency. We have a Bernoulli process with mean q and
process variance q(1-q).

        Type of    A Priori    Mean     Square of    Process
        Spinner    Chance      Spin     Mean Spin    Variance
        A1         0.500       0.15     0.0225       0.1275
        A2         0.500       0.05     0.0025       0.0475
        Overall                0.1000   0.0125       0.0875
The variance of the hypothetical means = 0.0125 - 0.1² = 0.0025.
K = EPV / VHM = .0875 / .0025 = 35. N = 3 because we have three observations, so
Z = 3/(3 + 35) = 7.9%. The a priori estimate is .1, and the observation is 1/3 (one claim in three
trials), so the new estimate of the frequency is: (.079)(1/3) + (1 - .079)(.1) = .118.
Similarly one can estimate the severity:
        Type of    A Priori    Mean     Square of    Process
        Spinner    Chance      Spin     Mean Spin    Variance
        B1         0.500       24       576          64
        B2         0.500       34       1156         84
        Overall                29       866          74
For example, the process variance for Spinner B2 is: (0.3)(20²) + (0.7)(40²) - 34² = 84.
The variance of the hypothetical means = 866 - 29² = 25. K = EPV / VHM = 74 / 25 = 2.96.
N = 1 because we have a single claim and thus only one observation of the claim severity process.
(The B spinner was only spun a single time.) Thus Z = 1/(1 + 2.96) = 25.3%.
The a priori estimate is 29, and the observation is 20 (one claim of size 20), so the new estimate of
the severity is (.253)(20) + (1 - .253)(29) = 26.7.
Combining the separate estimates of frequency and severity, one gets an estimated pure premium
of: (.118 )(26.7) = 3.15.
Comment: Note the solution would differ if one worked directly with the pure premiums rather than
separately estimating the frequency and severity. The solution to this alternate problem is as
follows: The variance of the hypothetical means is: 10.825 - 2.9² = 2.415.
The expected value of the process variance is 83.175.
Therefore K = EPV / VHM = 83.175 / 2.415 = 34.4. For three observations,
Z = 3/(3 + 34.4) = 8.0%. The observation is 20 / 3. The a priori mean pure premium is 2.9.
Thus the new estimate of the pure premium is: (20/3)(.08) + (2.9)(1-.08) = 3.20.
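The separate-estimates approach can be sketched in a few lines of Python (names mine):

    # Sketch of 13.18: Buhlmann applied separately to frequency and severity,
    # then the two estimates multiplied together.
    def buhlmann(means, pvs, n, obs):
        mu  = sum(means) / len(means)
        epv = sum(pvs) / len(pvs)
        vhm = sum(m*m for m in means) / len(means) - mu*mu
        z = n / (n + epv/vhm)
        return z*obs + (1 - z)*mu

    freq = buhlmann([0.15, 0.05], [0.15*0.85, 0.05*0.95], n=3, obs=1/3)
    sev  = buhlmann([24, 34],     [64, 84],               n=1, obs=20)
    print(round(freq*sev, 2))     # about 3.17 (3.15 above, after rounding); answer D either way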


13.19. E.
        Type of     A Priori    Chance of        Prob.      Posterior    Mean         Mean        Mean Pure
        Spinners    Chance      Observation      Weight     Chance       Frequency    Severity    Premium
        A1, B1      0.25        0.0867           0.0217     0.5135       0.15         24          3.600
        A1, B2      0.25        0.0325           0.0081     0.1926       0.15         34          5.100
        A2, B1      0.25        0.0361           0.0090     0.2138       0.05         24          1.200
        A2, B2      0.25        0.0135           0.0034     0.0802       0.05         34          1.700
        Overall                                  0.0422     1.000                                 3.223
For example, the chance of the observation if one has spinners A1 and B2 is:
(.85)(.15)(.85)(.30) = .0325.
For example, the posterior chance of spinners A1 and B2 is: .0081 / .0422 = .1926.
For example, the mean severity if Spinner B2 is: (.30)(20) + (.70)(40) = 34.
13.20. C. The mean frequency is: (1/2){(1/6)+(3/6)} = 1/3. For spinner B1 the mean claim size is:
(5/6)(2) + (1/6)(14) = 4. For spinner B2 the mean claim size is: (3/6)(2) + (3/6)(14) = 8. Thus the
mean claim size is: (1/2)(4 + 8) = 6. The mean pure premium is the product of the mean frequency
and the mean severity: (1/3)(6) = 2.
Alternately, one can calculate the mean pure premiums for each type of risk and average:
        Type of    A Priori    Mean         Mean        Mean Pure
        Risk       Chance      Frequency    Severity    Premium
        A1, B1     0.250       0.167        4.000       0.667
        A1, B2     0.250       0.167        8.000       1.333
        A2, B1     0.250       0.500        4.000       2.000
        A2, B2     0.250       0.500        8.000       4.000
        Overall                0.333        6.000       2.000


13.21. B. As we have more and more observations with no claim, the probability that we selected
die A1 rather than die A2 increases. Therefore the expected value of the pure premium goes to
(mean frequency for die A1 ) (mean severity) = (1/6)(6) = 1.
Alternately, let's assume we have for example 100 observations with no claim; then the chance of
this observation is (5/6)^100 if we have die A1 and (3/6)^100 if we have die A2.
Thus the Bayes Analysis is as follows:
        Type of    A Priori    Chance of        Prob.        Posterior    Mean Pure
        Risk       Chance      Observation      Weight       Chance       Premium
        A1, B1     0.250       1.207e-8         3.019e-9     50.00%       0.667
        A1, B2     0.250       1.207e-8         3.019e-9     50.00%       1.333
        A2, B1     0.250       7.889e-31        1.972e-31    0.00%        2.000
        A2, B2     0.250       7.889e-31        1.972e-31    0.00%        4.000
        Overall                                 6.037e-9     1.000        1.000


13.22. E. Mean frequencies are A1 = .2, A2 = .4. (Bernoulli) Process variances of the frequencies
are: A1 = (.2)(.8) = .16, A2 = (.4)(.6) = .24. Mean severities are: B1 = 160, B2 = 120.
Calculating the second central moment, the process variances of the severities are for B1:
(.4)(100 - 160)² + (.6)(200 - 160)² = 2400, and for B2: (.8)(100 - 120)² + (.2)(200 - 120)² = 1600.
Since the frequency and severity are independent, for each type of risk the Process Variance of the
Pure Premium =
(mean frequency)(variance of severity) + (mean severity)2 (variance of frequency).
        Type of    A Priori    Mean     Variance    Mean        Variance    Mean    Variance    Sq. of
        Risk       Prob.       Freq.    of Freq.    Severity    of Sev.     P.P.    of P.P.     Mean P.P.
        A1, B1     0.25        0.2      0.16        160         2400        32      4576        1024
        A1, B2     0.25        0.2      0.16        120         1600        24      2624        576
        A2, B1     0.25        0.4      0.24        160         2400        64      7104        4096
        A2, B2     0.25        0.4      0.24        120         1600        48      4096        2304
        Mean                   0.3      0.2         140         2000        42      4600        2000
Expected Value of the Process Variance of the P.P. = 4600.
Variance of the Hypothetical Mean Pure Premiums = 2000 - 42² = 236.
Thus, K = 4600 / 236 = 19.5. Z = 1/(1 + 19.5) = 0.0488.
The new estimate = (100)(.0488) + (42)(1 - .0488) = 44.8.
Comment: Similar to a Die/Spinner question. What if instead the question had asked about the
separate estimates of frequency and severity rather than the method using only aggregate claim
amounts? In that case the solution would differ as follows. For the frequency,
EPV = .20, VHM = .1² = .01, and K = .20 / .01 = 20. Z = 1 / (1 + 20) = 1/21.
New estimated frequency = (1)(1/21) + (.3)(20/21) = 1/3.
For the severity, EPV = 2000, VHM = 20² = 400, and K = 2000 / 400 = 5. Z = 1 / (1 + 5) = 1/6.
New estimated severity = (100)(1/6) + (140)(5/6) = 133.33. The new estimated pure premium in
this case would be the product of the separate estimates of frequency and severity: (1/3)(133.33) =
44.44. Notice that this differs from the solution to the question that was asked on the exam. In
general the two methods would give different results.
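A short Python sketch contrasting the two approaches discussed in this comment (names mine; values from the table above):

    # (a) credibility on the aggregate pure premium; (b) separate freq/sev estimates.
    pp    = [32, 24, 64, 48]                      # mean pure premiums of the four risk types
    pv_pp = [4576, 2624, 7104, 4096]              # their process variances
    k = (sum(pv_pp)/4) / (sum(x*x for x in pp)/4 - (sum(pp)/4)**2)
    z = 1 / (1 + k)
    print(round(z*100 + (1 - z)*sum(pp)/4, 1))    # (a) about 44.8

    f = (1/21)*1 + (20/21)*0.3                    # separate frequency estimate
    s = (1/6)*100 + (5/6)*140                     # separate severity estimate
    print(round(f*s, 2))                          # (b) about 44.44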


13.23. C. The a priori chance of a risk from either Class is 1/2. If we have a risk from Class 1, then
there are two ways of observing a pure premium of 100. One can observe a single claim of 100,
which has a probability of (1/2)(1/3) = 1/6, or one can observe two claims each of size 50, which has
a probability of (1/2)(2/3)(2/3) = 2/9. Thus for a risk from Class 1, the chance of observing a pure
premium of 100 is: (1/6) + (2/9) = 28/72. Similarly, for a risk from Class 2, the chance of observing a
pure premium of 100 is: (1/3) + (1/12) = 30/72. By Bayes Theorem, the posterior chance of a risk
from Class 1 is proportional to the product of having made the observation if the risk had been from
Class 1 times the a priori probability of the risk having been from Class 1; this product is:
(28/72)(1/2) = 14/72. Similarly, for Class 2 the posterior probability is proportional to: (30/72)(1/2)
= 15/72. Therefore, the posterior probabilities are: (14/29) and (15/29). The mean claim frequency
for Class 1 is 3/2, while that for Class 2 is 4/3. Thus the posterior estimate of the claim frequency is:
(14/29)(3/2) + (15/29)(4/3) = (21+20)/29 = 41/29.
In spreadsheet form, being sure to retain plenty of decimal places:
        Type of    A Priori    Chance of        Prob.      Posterior    Average Claim
        Risk       Chance      Observation      Weight     Chance       Frequency
        1          0.500       0.38889          0.19444    0.48276      1.50000
        2          0.500       0.41667          0.20833    0.51724      1.33333
        Overall                                 0.40278    1.00000      1.41379
Comment: One has to be careful of rounding, since 17/12 =1.417 while 41/29 = 1.414, so that
choices D and C are very close.


13.24. A. The average pure premiums are the product of the average frequency and the average
severity. Since the average pure premium for each type is 100, the Variance of the Hypothetical
Mean Pure Premiums is zero.
                 A Priori    Avg.      Avg.        Avg. Pure    Variance    Variance    Process Var.
        Class    Chance      Freq.     Severity    Premium      of Freq.    of Sev.     of P.P.
        1        0.5         1.5000    66.6667     100.0000     0.2500      555.55      1944.44
        2        0.5         1.3333    75.0000     100.0000     0.2222      625.00      2083.21
        Overall                                                                         2013.82

Since the frequency and severity are independent, the process variance of the pure premium is:
(variance of severity)(mean frequency) + (variance of frequency)(mean severity2 ).
For example, for class 2 the process variance of pure premium = (625)(1.333) + (.2222)(75²) = 2083.
Since the Expected Value of the Process Variance is greater than zero, and
VHM = 0, we have Z = 0. (We have K = ∞, N = 2, and therefore Z = N / (N + K) = 0.)
The a priori estimate of the pure premium is: (.5)(100) + (.5)(100) = 100.
Thus the new estimate is: (125)(0) + (100)(1 - 0) = 100.
Comment: There's no need to compute the EPV in order to answer this question.
When the mean pure premiums for each class are equal, observations of the pure premium are
given no Buhlmann credibility. However, as seen in the previous question, the more exact Bayes
Analysis is able to extract some useful information from the observations, even in this case where
the mean pure premiums are equal for each class.


13.25. B. In this case, if one observes total losses of 5, it must have come from a single claim of
size 5. If we have selected Die A, there is a: (2/3)(1/3) chance of this observation.
If we have selected Die B, there is a: (1/3)(1/3) chance of this observation.
               A Priori    Chance of        Prob.      Posterior    Mean Claims for
        Die    Chance      Observation      Weight     Chance       This Type of Die
        A      0.500       0.2222           0.1111     66.7%        1.333
        B      0.500       0.1111           0.0556     33.3%        1.667
        Overall                             0.1667     1.000        1.444
Comment: Note that what is being asked for is the expected number of claims. Unlike the usual
die-spinner example, we do not pick the spinners at random (independent of the number of claims).
Rather, the spinners depend on the number of claims, but not the type of risk. In fact we only have
two types of risk: low frequency corresponding to Die A and high frequency corresponding to Die B.
13.26. In this case, if one observes total losses of 7, they must have come from two claims of sizes
2 and 5. The probability of that is proportional to the probability of having two claims, which for Die
A is 1/3 and for die B is 2/3.
Thus the posterior distribution is 1/3 @ A and 2/3 @ B.
The mean of Spinner X is: (2/3)(2) + (1/3)(5) = 3.
The mean of Spinner Y is: (1/3)(2) + (2/3)(5) = 4.
Thus the mean pure premium if die A is: (2/3)(3) + (1/3)(3 + 4) = 13/3.
The mean pure premium if die B is: (1/3)(3) + (2/3)(3 + 4) = 17/3.
Thus the Bayesian estimate of the pure premium for trial two is:
(1/3)(13/3) + (2/3)(17/3) = 47/9 = 5.222.
Comment: The types of die are the two risk types. The severity distribution is independent of which
die we pick. Thus the probability of the observation conditional on having 2 claims is independent of
the type of die chosen.
Assuming we have 2 claims, the chance of one of them being 2 and the other being 5 is:
Prob[X = 2] Prob[Y = 5] + Prob[Y = 2] Prob[X = 5] = (2/3)(2/3) + (1/3)(1/3) = 5/9.


13.27. The mean pure premium if Die A is 13/3 and the mean pure premium if Die B is 17/3.
Thus since Die A and Die B are equally likely a priori, the Variance of the Hypothetical Mean Pure
Premiums is: (2/3)² = 4/9.
If one has Die A, then the possible outcomes are as follows:
        Situation                      Probability    Pure Premium    Square of P.P.
        1 claim @ 2                    44.4%          2               4
        1 claim @ 5                    22.2%          5               25
        2 claims @ 2 each              7.4%           4               16
        2 claims: 1 @ 2 & 1 @ 5        14.8%          7               49
        2 claims: 1 @ 5 & 1 @ 2        3.7%           7               49
        2 claims @ 5 each              7.4%           10              100
        Overall                        100.0%         4.333           25.00
Thus for Die A, the process variance of the pure premiums is: 25 - (13/3)² = 56/9 = 6.222.
Similarly, if one has Die B, then the possible outcomes are as follows:
        Situation                      Probability    Pure Premium    Square of P.P.
        1 claim @ 2                    22.2%          2               4
        1 claim @ 5                    11.1%          5               25
        2 claims @ 2 each              14.8%          4               16
        2 claims: 1 @ 2 & 1 @ 5        29.6%          7               49
        2 claims: 1 @ 5 & 1 @ 2        7.4%           7               49
        2 claims @ 5 each              14.8%          10              100
        Overall                        100.0%         5.667           39.00
Thus for Die B, the process variance of the pure premiums is: 39 - (17/3)² = 62/9 = 6.889.
Thus since Die A and Die B are equally likely a priori, the Expected Value of the Process Variance of
the Pure Premiums is: (0.5)(56/9) + (0.5)(62/9) = 59/9.
Thus the Buhlmann Credibility Parameter K = EPV / VHM = (59/9) / (4/9) = 14.75.
For one observation, Z = 1 / (1+14.75) = 6.35%.
The a priori mean pure premium is: (0.5)(13/3) + (0.5)(17/3) = 5.
The Buhlmann Credibility estimate of the future pure premium is:
(0.0635)(7) + (1 - 0.0635)(5) = 5.127.
Comment: While for this observation the estimates from Buhlmann Credibility and Bayesian
Analyses are very similar, they are not equal.
13.28. C. Frequency and severity are independent. The mean frequency is:
(1/3 + 2/3)/2 = 1/2. The mean of spinner X is: (2)(1/3) + (8)(2/3) = 6.
The mean of spinner Y is 2. Thus, the mean severity is (6 + 2)/2 = 4.
Thus the mean pure premium is: (mean frequency)(mean severity) = (1/2)(4) = 2.


13.29. E. Since the spinner is randomly reselected after each trial, as n goes to infinity we continue
to assume that spinner X and Y are equally likely. However the same die is used for each trial, so
we can apply Bayes Theorem to estimate the posterior probability of each die.
We observe a claim every trial. Therefore the posterior probability of die B is proportional to
(2/3)^(n-1), while that for die A is proportional to (1/3)^(n-1). Thus as n goes to infinity the ratio of the
probability of die B compared to that of die A, goes to infinity. Since the probabilities add to unity,
the probability of die B goes to unity.
Thus the expected frequency goes to 2/3, that of die B. The expected severity remains 4.
Thus the expected pure premium goes to (2/3)(4) = 8/3 = 2.667.
Comment: What if instead neither the die nor spinner are reselected after each trial? Then in the limit
die B and spinner Y get all the probability. Thus the posterior estimate of the pure premium would
be: (2/3)(2) = 4/3 in this case.
13.30. C. There are four equally likely types of tests:
                Wall      Lake
        Front   F, W      F, L
        Side    S, W      S, L
The number of crash dummies acts as the frequency, while the amount of damage acts as the
severity. We use Bayes Analysis to predict the future pure premium.
If we have a Front and Wall test, then the total damage can be 1 if there is either one dummy with
damage of 1, or 2 dummies each with damage .5.
This has probability of: (1/4)(1/2) + (1/4)(1/2)² = 3/16.
If we have a Front and Lake test, then the total damage can be 1 if there is one dummy with damage
of 1. This has probability of: (1/4)(1/2) = 1/8.
If we have a Side and Wall test, then the total damage can be 1 if there are 2 dummies each with
damage .5. This has probability of: (1/2)(1/2)² = 1/8.
If we have a Side and Lake test, then the total damage can not be 1.
        Type    A Priori    Frequency           Mean     Severity    Mean    Chance of      Prob.      Posterior    Mean
                Prob.                           Freq.                Sev.    Observation    Weight     Prob.        P.P.
        F, W    0.25        1, 2, 3, or 4       2.5      .5 or 1     0.75    0.1875         0.04688    0.429        1.875
        F, L    0.25        1, 2, 3, or 4       2.5      1 or 2      1.5     0.1250         0.03125    0.286        3.750
        S, W    0.25        2 or 4              3.0      .5 or 1     0.75    0.1250         0.03125    0.286        2.250
        S, L    0.25        2 or 4              3.0      1 or 2      1.5     0.0000         0.00000    0.000        4.500
        SUM                                                                                 0.10938    1.000        2.518
The posterior distribution is: 3/7, 2/7, 2/7, 0.


The estimated total damage from a test of the same type is:
(3/7)(1.875) + (2/7)(3.75) + (2/7)(2.25) + (0)(4.5) = 2.518.
Comment: Mathematically similar to a Die/Spinner model of pure premium in which one separately
chooses one of two dice and one of two spinners.
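A minimal Python sketch of the same Bayes weighting (names mine; each entry pairs the chance that total damage equals 1 with the mean damage of a future test of that type):

    # Sketch of the 13.30 Bayes estimate over the four air-bag/accident combinations.
    types = {
        "Front, Wall": (3/16, 2.5*0.75),
        "Front, Lake": (1/8,  2.5*1.5),
        "Side, Wall":  (1/8,  3.0*0.75),
        "Side, Lake":  (0.0,  3.0*1.5),
    }
    total = sum(p for p, _ in types.values())
    print(round(sum(p/total * m for p, m in types.values()), 3))   # about 2.518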


13.31. D. We use Bayes Analysis to predict the expected value of the total damage for the next
test accident.
If we have a Front and Wall test, then the total damage can be 1 if there is either one dummy with
damage of 1, or 2 dummies each with damage .5.
If we have a Front and Lake test, then the total damage can be 1 if there is one dummy with damage
of 1.
If we have a Side and Wall test, then the total damage can be 1 if there is 2 dummies each with
damage .5.
If we have a Side and Lake test, then the total damage can not be 1.
        Air Bag    Type of     Number of    A Priori       Chance of      Probability    Posterior      Average
        Type       Accident    Dummies      Probability    Observation    Weight         Probability    Damage
        Front      Wall        1            0.0625         0.5            0.03125        0.286          0.75
        Front      Wall        2            0.0625         0.25           0.015625       0.143          1.50
        Front      Wall        3            0.0625         0              0              0.000          2.25
        Front      Wall        4            0.0625         0              0              0.000          3.00
        Front      Lake        1            0.0625         0.5            0.03125        0.286          1.50
        Front      Lake        2            0.0625         0              0              0.000          3.00
        Front      Lake        3            0.0625         0              0              0.000          4.50
        Front      Lake        4            0.0625         0              0              0.000          6.00
        Side       Wall        2            0.1250         0.25           0.03125        0.286          1.50
        Side       Wall        4            0.1250         0              0              0.000          3.00
        Side       Lake        2            0.1250         0              0              0.000          3.00
        Side       Lake        4            0.1250         0              0              0.000          6.00
        SUM                                                               0.109375       1.000          1.286

If there is a Wall accident the average damage per dummy is: (.5 + 1)/2 = .75.
If there is a Lake accident the average damage per dummy is: (1 + 2)/2 = 1.5.
The estimated total damage for the next test accident is:
(.286)(.75) + (.143)(1.5) + (.286)(1.5) + (.286)(1.5) = 1.286.
Comment: The difference from Course 4, 11/00, Q.33 is that in this question the number of
dummies is kept the same for the next test, in addition to the type of air bag and the type of
accident.


13.32. A. For risk type A, the chance of observing total annual losses of 500 is:
Prob(1 claim)Prob(size = 500) = (4/9)(1/3) = 12/81.
For risk type B, the chance of observing total annual losses of 500 is:
Prob(2 claims) Prob(size = 250)² = (4/9)(2/3)² = 16/81.
Since the risk types are equally likely, the posterior distribution is proportional to the chances of the
observation, 12/81 and 16/81.
Thus the posterior chance of A is: 12/(12 + 16) = 3/7 and of B is: 16/(12 + 16) = 4/7.
The mean loss for A is: (2/3)(990) = 660. The mean loss for B is: (4/3)(276) = 368.
The estimated future loss is: (3/7)(660) + (4/7)(368) = 493.
        Type of    A Priori    Chance of        Prob.      Posterior    Avg. Pure
        Risk       Chance      Observation      Weight     Chance       Premium
        A          0.500       0.148            0.074      42.9%        660.0
        B          0.500       0.198            0.099      57.1%        368.0
        Overall                                 0.173      1.000        493.1

13.33. D. VHM = 285,512 - 514² = 21,316.
        Risk    Mean     Var.     Mean    Var.       Process       Mean    Square of
        Type    Freq.    Freq.    Sev.    Sev.       Variance      P.P.    Mean P.P.
        A       0.667    0.444    990     120,050    515,633       660     435,600
        B       1.333    0.444    276     1,352      35,659        368     135,424
        Avg.                                         275,646       514     285,512
K = EPV/VHM = 275646/21316 = 12.9. Z = 1/(1 + 12.9) = 7.2%.


Estimate = (.072)(500) + (1 - .072)(514) = 513.
Comment: EPV + VHM = 275,646 + 21,316 = 296,962 = variance of the total losses.
Thus one can save time by using the given total variance and either the EPV or VHM to get the
other, or one can use the given total variance to check our work.


13.34. E. Given Die A and Spinner X, chance of the observation is: (3/4)(1/2) = 3/8.
Given Die A and Spinner Y, the chance of the observation is: (3/4)(1) = 3/4.
Given Die B and Spinner X, the chance of the observation is: (1/4)(1/2) = 1/8.
Given Die B and Spinner Y, the chance of the observation is: (1/4)(1) = 1/4.
        Type of Die    A Priori    Chance of        Prob.      Posterior    Mean     Mean        Mean Pure
        and Spinner    Chance      Observation      Weight     Chance       Freq.    Severity    Premium
        A, X           0.25        0.375            0.0938     0.2500       0.7500   6 + .5c     4.5 + 3c/8
        A, Y           0.25        0.750            0.1875     0.5000       0.7500   12          9.000
        B, X           0.25        0.125            0.0312     0.0833       0.2500   6 + .5c     1.5 + c/8
        B, Y           0.25        0.250            0.0625     0.1667       0.2500   12          3.000
        Overall                                     0.3750     1.000

The posterior mean pure premium is:


(.25)(4.5 + 3c/8) + (.50)(9) + (.0833)(1.5 + c/8) + (.1667)(3) = 6.25 + .1042c.
Setting this equal to the stated estimate of 10: 10 = 6.25 + .1042c. c = 36.
Comment: If c = 12, then the two spinners are equal. Each loss is then of size 12, and the
chance of the observation is the chance of observing 1 claim.
In this case, the estimated future pure premium would be 7.5:
        Type of    A Priori    Chance of        Prob.      Posterior    Mean     Mean    Mean Pure
        Die        Chance      Observation      Weight     Chance       Freq.    Sev.    Premium
        A          0.50        0.750            0.3750     0.7500       0.7500   12      9.000
        B          0.50        0.250            0.1250     0.2500       0.2500   12      3.000
        Overall                                 0.5000     1.000                         7.500
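Going back to the main solution, here is a minimal Python sketch of the solve-for-c step (posterior probabilities and the linear-in-c mean pure premiums are taken from the first table above; the names are mine):

    # Sketch of the 13.34 solve-for-c: the posterior mean pure premium is linear in c,
    # so set it equal to the stated second-year estimate of 10 and solve.
    post      = [0.25, 0.50, 1/12, 1/6]       # posterior probs of (A,X), (A,Y), (B,X), (B,Y)
    intercept = [4.5, 9.0, 1.5, 3.0]          # constant part of each pair's mean pure premium
    slope     = [3/8, 0.0, 1/8, 0.0]          # coefficient of c in each pair's mean pure premium
    a = sum(p*s for p, s in zip(post, slope))          # 0.1042
    b = sum(p*i for p, i in zip(post, intercept))      # 6.25
    print(round((10 - b) / a, 1))                      # c is about 36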


13.35. D. For Business Use drivers: (% rural)(1) + (% urban)(2) = 1.8.


Thus of the Business Use drivers: 80% are urban and 20% are rural.
For Pleasure Use drivers: (% rural)(1.5) + (% urban)(2.5) = 2.3.
Thus of the Pleasure Use drivers: 80% are urban and 20% are rural.
        Type of    A Priori    Mean     Square of     Variance
        Driver     Chance      Freq.    Mean Freq.    of Freq.
        B, R       0.100       1.000    1.000         0.5
        B, U       0.400       2.000    4.000         1.0
        P, R       0.100       1.500    2.250         0.8
        P, U       0.400       2.500    6.250         1.0
        Mean                   2.050    4.425         0.930
VHM = 4.425 - 2.05² = 0.2225.  EPV = 0.930.  K = EPV/VHM = 0.930/0.2225 = 4.18.
Z = 1/(1 + 4.18) = 0.193.
Comment: It is intended that there are four separate cells: Business/Rural, Business/Urban,
Pleasure/Rural, Pleasure/Urban. Each driver is in one and only one of the four cells.
                    Business    Pleasure
        Rural       10%         10%
        Urban       40%         40%
The EPV within Business Use is: (0.2)(.5) + (0.8)(1) = 0.9.
The Variance of the Hypothetical Means within Business Use is:
(0.2)(1 - 1.8)² + (0.8)(2 - 1.8)² = 0.16.
0.9 + 0.16 = 1.06, the shown total claims variance for Business Use.
This is not the way one would estimate the experience of an individual driver in this type of
situation in practical applications with classifications.
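A minimal Python sketch of the credibility factor computed above (names mine; the four cells and their weights are as in the table):

    # Sketch of the 13.35 Buhlmann credibility factor over the four driver cells.
    cells = [(0.10, 1.0, 0.5), (0.40, 2.0, 1.0), (0.10, 1.5, 0.8), (0.40, 2.5, 1.0)]
    epv  = sum(w*v for w, m, v in cells)
    mean = sum(w*m for w, m, v in cells)
    vhm  = sum(w*m*m for w, m, v in cells) - mean**2
    print(round(1 / (1 + epv/vhm), 3))        # about 0.193, answer D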


Section 14, Classification Ratemaking


An important aspect of most forms of insurance is the classification of insureds.144 One groups
together insureds with similar characteristics related to risk, so that differences in costs may be
recognized.145 Various characteristics might be used in order to group together insureds who are likely
to have a similar average loss cost.
For example, whether or not someone smokes could be used for life or health insurance. Age
is an important classification variable for life insurance. The type of business is important for
Workers Compensation Insurance; Furniture Stores are more hazardous on average than
Hardware Stores. The year and model of car are important for Automobile Collision Insurance.
The place of principal garaging, the years of driving experience, etc., might be important for
automobile insurance.
Such classes are intended to improve the estimate of the future compared to not using the
classification information. They are not intended to eliminate all uncertainty.
Generally, one would try to estimate the future pure premium or loss ratio for a classification by
applying weight Z to the observation for the class and weight 1 - Z to the estimate for a group of
classes. For example, one might estimate the Soap Manufacturing Class by weighting its
experience with that of all Manufacturing classes. In classification ratemaking, Z is quite often
calculated using Classical Credibility.146 However, ideas from Buhlmann Credibility and Empirical
Bayesian Credibility147 can also be used.148
As will be discussed in the next section, an individual policyholders experience can be used
together with his classification in order to improve the estimate of that policyholders future
experience compared to not relying on the individual experience at all. Experience Rating applies
weight Z to the observation of an individual policyholder and weight 1 - Z to the estimate for its
class. In Experience Rating, Z is usually calculated using ideas from Buhlmann Credibility and/or
Empirical Bayesian Credibility.
Experience Rating is used on top of and in addition to Classification Ratemaking. A well designed
system of Classifications and Experience Rating Plan work together in order to improve the
estimates of the future. Giving some weight to the additional information provided by the
experience of an individual policyholder, improves the estimate of the future but does not
remove all prediction error.
144 For example, see Foundations of Casualty Actuarial Science, Fourth Edition, Chapters 6 and 3.
145 For example, see the Actuarial Standards Board, Standards of Practice #12.
146 See Mahler's Guide to Classical Credibility.
147 See Mahler's Guide to Empirical Bayesian Credibility.
148 In that case the complement of credibility is applied to the loss ratio for a larger group of classes, as in
"Empirical Bayesian Credibility for Workers Compensation Ratemaking," by Glenn Meyers, PCAS 1984, or the
current relativity for a class, as in "Workers Compensation Classification Credibilities," by Howard C. Mahler,
CAS Forum, Fall 1999.


Homogeneity of Classifications:
One important feature is the homogeneity of the classifications.149 One desires classifications that
are relatively homogeneous; one desires that the insureds in a class be as similar as possible in their
expected pure premiums.150
Below is shown an example of two classes that are relatively homogeneous. The Poisson
parameters of the first class are distributed via a Gamma Distribution around a mean of 10%, while
those of the second class are distributed via a Gamma Distribution around a mean of 30%.
[Figure: densities of the Poisson parameter lambda for the two classes: a Gamma with alpha = 5 and theta = 0.02 (mean 10%), and a Gamma with alpha = 5 and theta = 0.06 (mean 30%).]
Note that there is considerable overlap between the classes. The worst insured in the low risk
class has a higher expected claim frequency than many insureds in the high risk class. It is
generally the case that classifications will exhibit such overlap. For example the safest
Furniture Store is probably of lower hazard than the least safe Hardware Store, even though
on average Furniture Stores are more hazardous.
Above, each class has a spread of insureds from more to less risky. The more homogeneous the
class the less spread there is. Below is shown an example of less homogeneous classes.

149 See pages 555-556 of Loss Models.
150 If one ignores differences in expected severity, one desires that the insureds within a class have similar
expected frequencies.


[Figure: densities of lambda for two less homogeneous classes: a Gamma with alpha = 2 and theta = 0.08, and a Gamma with alpha = 2 and theta = 0.12.]

These less homogeneous, more heterogeneous classes are the type of thing one might get if one
classified insureds according to their middle initials. One would not expect to get much or even any
distinction between the classes. The purpose of class plans is to group insureds of similar hazard.
A good class plan, such as in the first diagram, would produce class means far apart from each other
and individual means within a class tightly bunched around the class mean. We would then assign
more credibility to the average for a class and less to the overall mean. The more homogeneous
the classes, the more credibility is assigned to their data and the less to the overall
average, when determining classification rates.
In actual applications there are competing goals.151 One wants to have classifications for which
there is likely to be enough data from which to make reasonably accurate rates. Thus it is not
useful to divide the total universe of insureds into very tiny but very homogeneous classes.
Rather, one wants reasonably homogeneous classes with a reasonable amount of data in
most of them.
For example, generally one divides states into territories that are large enough to produce a
usable quantity of data, but are small enough to capture the variation of hazard across the
state. Then one would make rates for each territory as a credibility weighted average of the
experience of that territory and the combined experience for the state.
151 The American Academy of Actuaries Committee on Risk Classification report "Risk Classification Statement of Principles," June 1980, lists three statistical considerations: Homogeneity, Credibility, and Predictive Stability. Michael Walters in "Risk Classification Standards," PCAS 1981, lists the following broad desirable characteristics of classification systems: homogeneous, well-defined, and practical.

Homogeneity of Territories:
For most lines of insurance, the premium charged depends on the geographical location.
A state will be divided into many territories, with the rate depending on the territory.
Territories act mathematically like another classification dimension, and thus many of the same ideas
apply to territories as apply to classes.
Let us assume that for a line of insurance one models the costs by zipcode across a state.152
Then one could create territories by grouping together zipcodes with similar expected pure
premiums.153 One would want homogeneous territories, but also territories that are each big enough
to have sufficient data to have enough credibility for use in determining territory relativities.
One way to measure the homogeneity of territories would be to divide the total variance between
zipcodes into a variance between territories and a variance within territories. The smaller the within
variance, and thus the larger the between variance, the more homogeneous the territories.
For example, one might get a graph of the within variance similar to the following:154
[Figure: the within-territory variance as a percent of the total variance, plotted against the number of territories (roughly 10 to 40); the within variance declines as the number of territories increases, but at a decreasing rate.]

As one divides the state into more and more territories, the rate at which homogeneity improves
declines. In this case, one might choose to use about 20 territories, balancing the desire for
homogeneous territories with the desire for credible territories.
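Here is a minimal sketch (not from the study guide) of the variance split described above. The zipcode pure premiums and the territory assignment are made-up illustrative values, and each zipcode is given equal weight for simplicity.

```python
import numpy as np

zip_pp = np.array([100, 110, 95, 210, 190, 205, 320, 300, 310.0])  # modeled pure premium by zipcode
territory_of = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])               # assignment of zipcodes to territories

total_var = zip_pp.var()
terr = np.unique(territory_of)
within = np.mean([zip_pp[territory_of == t].var() for t in terr])
between = np.mean([(zip_pp[territory_of == t].mean() - zip_pp.mean()) ** 2 for t in terr])

print("within-territory variance as % of total :", round(100 * within / total_var, 1))
print("between-territory variance as % of total:", round(100 * between / total_var, 1))
```

With equally sized territories the two pieces add back up to the total variance; the within piece is what the graph above tracks as the number of territories grows.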
152 The line of insurance might be homeowners or private passenger automobile. The model might be a generalized linear model, taking into account many aspects of each zipcode. The effect on expected costs due to zipcodes could be determined having adjusted for the effect of the other rating variables.
153 Traditionally one requires that territories be contiguous, but if one relaxes that restriction one can get more homogeneous territories.
154 Adapted from "Determination of Statistically Optimal Geographic Territory Boundaries," by Klayton N. Southwood, CAS Special Interest Seminar on Predictive Modeling, October 2006.

Problems:
14.1 (1 point) Which of the following are true with respect to grouping of policies to create rating
classifications in Property/Casualty insurance?
1. Policies are occasionally grouped according to the different levels of the various
risk factors involved.
2. Such grouping should leave an insignificant level of residual heterogeneity.
3. When a significant level of residual heterogeneity would result rather than use such
a grouping one should use experience rating methods.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3
E. None of A, B, C, or D.
Use the following information for the next 15 questions:
• The random variable representing the number of claims for a single policyholder in
a year follows a Poisson distribution.
• Policyholders are divided into five distinct classes, with the following percent in
each of the classes:
Class 1: 20%, Class 2: 30%, Class 3: 25%, Class 4: 15%, Class 5: 10%.
• Within each class of policyholders, the Poisson parameters vary via
a Gamma Distribution, representing the heterogeneity of risks within that class.

Gamma Parameters
Class      α        θ
1          1.1      0.32
2          1.6      0.26
3          2.0      0.28
4          1.8      0.38
5          2.5      0.31

14.2 (1 point) For Class 1, what is the variance of the hypothetical mean frequencies of its
policyholders?
A. 0.10
B. 0.11
C. 0.12
D. 0.13
E. 0.14
14.3 (1 point) For Class 1, what is the expected value of the process variance of the frequencies of
its policyholders?
A. 0.27
B. 0.29
C. 0.31
D. 0.33
E. 0.35
14.4 (1 point) Let N be the number of claims next year for a policyholder chosen at random from
Class 1. What is the variance of N?
A. 0.46
B. 0.49
C. 0.52
D. 0.55
E. 0.58

14.5 (2 points) Define more homogeneous as a smaller variance of the hypothetical mean
frequencies. Which of the five classes is most homogeneous?
A. Class 1 B. Class 2 C. Class 3 D. Class 4 E. Class 5
14.6 (1 point) Which of the five classes is least homogeneous?
A. Class 1 B. Class 2 C. Class 3 D. Class 4 E. Class 5
14.7 (1 point) Assume you were creating an experience rating system to apply to just the
policyholders in the most homogeneous class, how much Buhlmann Credibility should be applied
to three years of experience of an individual policyholder? (The complement of credibility will be
applied to the class mean.)
A. Less than 42%
B. At least 42%, but less than 45%
C. At least 45%, but less than 48%
D. At least 48%, but less than 51%
E. At least 51%
14.8 (1 point) Assume you were creating an experience rating system to apply just to the
insureds in the least homogeneous class, how much Buhlmann Credibility should be applied to
three years of experience of an individual policyholder? (The complement of credibility will be
applied to the class mean.)
A. Less than 42%
B. At least 42%, but less than 45%
C. At least 45%, but less than 48%
D. At least 48%, but less than 51%
E. At least 51%
14.9 (1 point) Using the answer to the previous question, estimate the future annual frequency of a
policyholder chosen at random from this Class, who had 5 claims in 3 years.
A. 1.00
B. 1.05
C. 1.10
D. 1.15
E. 1.20
14.10 (1 point) Let N be the number of claims next year for a policyholder chosen at random from
this portfolio. What is the mean of N?
A. 0.43
B. 0.46
C. 0.49
D. 0.52
E. 0.55
14.11 (2 points) Let N be the number of claims next year for a policyholder chosen at random from
this portfolio. What is the variance of N?
A. 0.61
B. 0.63
C. 0.65
D. 0.67
E. 0.69

14.12 (2 points) What is the (weighted) average of the variances of the hypothetical means within
each of the classes?
A. Less than 0.16
B. At least 0.16, but less than 0.18
C. At least 0.18, but less than 0.20
D. At least 0.20, but less than 0.22
E. At least 0.22
14.13 (2 points) Assume you were creating an experience rating system to be applied to all the
policyholders, how much Buhlmann Credibility should be applied to three years of experience of an
individual policyholder? (The complement of credibility will be applied for each policyholder to the
mean of its class. Thus for each policyholder we make use of the knowledge of the class to which it
belongs.)
Hint: Use the answer to the previous question.
A. Less than 35%
B. At least 35%, but less than 40%
C. At least 40%, but less than 45%
D. At least 45%, but less than 50%
E. At least 50%
14.14 (1 point) Using the answer to the previous question, estimate the future annual frequency of a
policyholder chosen at random from Class 4, who had 5 claims in 3 years.
A. 1.00
B. 1.05
C. 1.10
D. 1.15
E. 1.20
14.15 (2 points) Assume the state passed a law banning insurers from using the above
classification system. If you create an experience rating system to be applied to all the
policyholders, how much Buhlmann Credibility should be applied to three years of experience of an
individual policyholder? (One ignores classification for predicting the future frequency and applies the
complement of credibility to the mean over all classes.)
A. Less than 35%
B. At least 35%, but less than 40%
C. At least 40%, but less than 45%
D. At least 45%, but less than 50%
E. At least 50%
14.16 (1 point) Using the answer to the previous question, estimate the future annual frequency of a
policyholder chosen at random from Class 4, who had 5 claims in 3 years.
A. 1.00
B. 1.05
C. 1.10
D. 1.15
E. 1.20

14.17 (4, 5/88, Q.38) (1 point) Which of the following statements are true?
1. Large values of credibility are always desirable.
2. A class plan with homogeneous classes will result in low credibilities for individual risk experience.
3. For good class plans the credibility of class experience will be higher than for a poorer
class plan.
A. 1
B. 2
C. 3
D. 2, 3
E. 1, 2 and 3
14.18 (4, 5/91, Q.45) (3 points) A population of insureds consists of two classifications each with
50% of the total insureds. The Buhlmann credibility for the experience of a single insured within a
classification is calculated below.
Classification    Mean Frequency    Variance of Hypothetical Means    Expected Value of Process Variance    Buhlmann Credibility
A                 0.09              0.01                              0.09                                  0.10
B                 0.27              0.03                              0.27                                  0.10
Calculate the Buhlmann credibility for the experience of a single insured selected at random from the
population if its classification is unknown.
A. Less than 0.08
B. At least 0.08 but less than 0.10
C. At least 0.10 but less than 0.12
D. At least 0.12 but less than 0.14
E. At least 0.14
14.19 (4B, 5/93, Q.23) (1 point) Which of the following statements are true concerning the use of
credibility in classification ratemaking?
1. A small standard deviation of observations within a particular class would indicate
a homogeneous group of risks within the class.
2. The credibility assigned to class data will tend to decrease as the variance of
the hypothetical means between classes increases.
3. A well-designed class plan (resulting in homogeneous classes) generally results in
high credibility assigned to the classification experience.
A. 1
B. 2
C. 1, 2
D. 1, 3
E. 1, 2, 3

14.20 (4B, 5/95, Q.24) (2 points) You are given the following:
• The random variable representing the number of claims for a single policyholder
follows a Poisson distribution.
• For each class of policyholders, the Poisson parameters follow a gamma distribution
representing the heterogeneity of risks within that class.
• For four distinct classes of risks, the random variable representing the number of claims
of a policyholder, chosen at random, follows a negative binomial distribution
with parameters r and β, as follows:

Class      r        β
1          5.88     0.2041
2          1.26     0.1111
3          10.89    0.0101
4          2.47     0.0526

• The negative binomial distribution with parameters r and β has the form:
f(x) = {r(r+1)...(r + x - 1) / x!} β^x / (1 + β)^(x + r).

The lower the standard deviation of the gamma distribution, the more homogeneous the class.
Which of the four classes is most homogeneous?
A. Class 1
B. Class 2
C. Class 3
D. Class 4
E. Cannot be determined from the given information.

Solutions to Problems:
14.1. E. 1. F. Policies are usually grouped into classifications.
2. F. The residual heterogeneity is often still considerable.
3. F. Experience rating is used in addition to classifications, rather than instead of classifications.
14.2. B. For Class 1, VHM = Var[λ] = Variance of the Gamma Distribution = αθ² = (1.1)(0.32²) = 0.1126.
14.3. E. For Class 1, EPV = E[λ] = Mean of the Gamma = αθ = (1.1)(0.32) = 0.352.
14.4. A. Total Variance = EPV + VHM = 0.352 + 0.1126 = 0.4646.
Comment: See the discussion of the Gamma-Poisson in Mahler's Guide to Conjugate Priors.
The marginal distribution is Negative Binomial with r = α = 1.1 and β = θ = 0.32, with variance:
rβ(1 + β) = (1.1)(0.32)(1.32) = 0.4646.
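As a check on 14.2-14.4, here is a minimal sketch (not from the study guide) that simulates the Gamma-Poisson mixture for Class 1 and compares the simulated total variance of claim counts to EPV + VHM.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, theta = 1.1, 0.32                                    # Class 1 Gamma parameters
lam = rng.gamma(shape=alpha, scale=theta, size=1_000_000)   # hypothetical means of the insureds
n = rng.poisson(lam)                                        # one year of claims for each insured

print("EPV = alpha*theta        =", alpha * theta)                         # 0.352
print("VHM = alpha*theta^2      =", round(alpha * theta**2, 4))            # 0.1126
print("total variance, theory   =", round(alpha * theta * (1 + theta), 4)) # 0.4646
print("total variance, simulated=", round(n.var(), 4))
```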
14.5. B. & 14.6. D. The most homogeneous class has the lowest variance of the distribution of
Poisson parameters within that class.
VHM = Var[λ] = Variance of the Gamma Distribution = αθ².

Class      α        θ        VHM
1          1.1      0.32     0.11264
2          1.6      0.26     0.10816
3          2.0      0.28     0.15680
4          1.8      0.38     0.25992
5          2.5      0.31     0.24025

Class 2 is the most homogeneous.


Class 4 is least homogeneous or the most heterogeneous.
14.7. B. The most homogeneous class is Class 2. The EPV = E[λ] = mean of the Gamma
Distribution = αθ, since we are mixing Poissons. Within Class 2 the variance of the hypothetical
means is VHM = Var[λ] = Variance of the Gamma Distribution = αθ².
Therefore, the Buhlmann Credibility parameter K = αθ / αθ² = 1/θ = 1/0.26 = 3.85.
Thus three years of experience gets credibility Z = 3/(3 + K) = 43.8%.
Comment: See the Gamma-Poisson in Mahler's Guide to Conjugate Priors.

14.8. E. The least homogeneous class is Class 4. K = αθ / αθ² = 1/θ = 1/0.38 = 2.63.
Thus three years of experience gets credibility Z = 3/(3 + K) = 53.3%.
Comment: Note that the credibility for experience rating was less for the more homogeneous class
than it is here for the more heterogeneous class. Here the classification does a worse job of
predicting the future frequency of an individual policyholder, so we give relatively more weight to the
experience of the individual policyholder.
14.9. E. From the previous solution Z = 53.3%. The mean for Class 4 is: (1.8)(0.38) = 0.684.
Estimated future annual frequency is: (0.533)(5/3) + (1 - 0.533)(0.684) = 1.208.
14.10. D. & 14.11. E. As shown below, the weighted average frequency is 0.5153.
Using analysis of variance, the variance for the whole portfolio is:
E[Variance | Class] + VAR[Mean | Class].

Class      A Priori Probability    α      θ       Class Mean    Square of Class Mean    Total Within-Class Variance
1          20%                     1.1    0.32    0.3520        0.1239                  0.4646
2          30%                     1.6    0.26    0.4160        0.1731                  0.5242
3          25%                     2.0    0.28    0.5600        0.3136                  0.7168
4          15%                     1.8    0.38    0.6840        0.4679                  0.9439
5          10%                     2.5    0.31    0.7750        0.6006                  1.0152
Average                                           0.5153        0.2853                  0.6725

VAR[Mean | Class] = variance of the hypothetical means between classes =
0.2853 - 0.5153² = 0.0198.
The Total Variance within each class is: EPV + VHM within class = αθ + αθ² = αθ(1 + θ).
E[Variance | Class] = (weighted) average of the (total) variances within each class =
(20%)(0.4646) + (30%)(0.5242) + (25%)(0.7168) + (15%)(0.9439) + (10%)(1.0152) = 0.6725.
Therefore the variance for the whole portfolio = 0.6725 + 0.0198 = 0.6923.
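The table above can be reproduced with a minimal sketch (not from the study guide):

```python
import numpy as np

prob  = np.array([0.20, 0.30, 0.25, 0.15, 0.10])
alpha = np.array([1.1, 1.6, 2.0, 1.8, 2.5])
theta = np.array([0.32, 0.26, 0.28, 0.38, 0.31])

class_mean = alpha * theta                    # hypothetical mean frequency of each class
within_var = alpha * theta * (1 + theta)      # total variance within each class
overall_mean = prob @ class_mean
between_var = prob @ class_mean**2 - overall_mean**2

print("overall mean frequency:", round(overall_mean, 4))                     # 0.5153
print("E[Variance | Class]   :", round(prob @ within_var, 4))                # 0.6725
print("VAR[Mean | Class]     :", round(between_var, 4))                      # 0.0198
print("portfolio variance    :", round(prob @ within_var + between_var, 4))  # 0.6923
```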

14.12. A. Within a class, VHM = Var[λ] = Variance of the Gamma Distribution = αθ².

Class      A Priori Probability    α      θ       VHM
1          20%                     1.1    0.32    0.1126
2          30%                     1.6    0.26    0.1082
3          25%                     2.0    0.28    0.1568
4          15%                     1.8    0.38    0.2599
5          10%                     2.5    0.31    0.2402
Average                                           0.1572

Comment: Note that one can divide the total variance for the portfolio into:
VHM Between Classes + VHM Within Classes + EPV = .0198 + .1572 + .5153 = .6923.
14.13. D. The Expected Value of the Process Variance = mean = .5153, since we are mixing
Poissons. The average of the within class variances of the hypothetical means is .1572.
Therefore, the Buhlmann Credibility parameter K = .5153 / .1572 = 3.28.
Thus three years of experience gets credibility Z = 3/(3 + K) = 47.8%.
Comment: Very difficult. This is a simplified version of what is done in real world applications. One
applies experience rating on top of and in addition to classification rating. In theory one could use
different Buhlmann Credibility Parameters for each Class, based on the EPV for that class and
variance of hypothetical means within that class.
In each case, K = αθ / αθ² = 1/θ.

Class      α      θ       K
1          1.1    0.32    3.12
2          1.6    0.26    3.85
3          2.0    0.28    3.57
4          1.8    0.38    2.63
5          2.5    0.31    3.23

However, in practice one usually uses the same Buhlmann Credibility parameter, in this case 3.28,
for insureds from every class. Thus the credibility would be determined from some average of the
variance of the hypothetical means within the classes. (Note that this average of the variance of the
hypothetical means within the classes is smaller than the total variance of hypothetical means ignoring
the class plan : .1572 < .1572 + .0198 = .1770. ) The resulting Buhlmann Credibility parameter is
generally somewhere in the range of parameters that could be calculated for each class separately.
The resulting experience rating credibilities are in the general range of those that would result from
using a separately calculated Buhlmann Credibility parameter for each class.
14.14. D. From the previous solution Z = 47.8%. The mean for Class 4 is: (1.8)(0.38) = 0.684.
Estimated future annual frequency is: (0.478)(5/3) + (1 - 0.478)(0.684) = 1.154.

14.15. E. The VHM = VHM within classes plus VHM between classes =
.0198 + .1572 = .1770. EPV = .5153. K = EPV/VHM = .5153/.1770 = 2.91.
Z = 3/(3+K) = 50.8%.
Comment: In the absence of the classification plan, the individual experience gets more weight than
it did in the presence of the classification system.
14.16. C. From the previous solution Z = 50.8%.
From a previous solution, the overall mean is .5153.
Estimated future annual frequency is: (.508)(5/3) + (1 - .508)(.5153) = 1.100.
14.17. D. 1. False. 2. True. 3. True.

14.18. D. The overall mean is: (0.5)(0.09) + (0.5)(0.27) = 0.18. In order to compute the Variance of the
Hypothetical Means, one can compute the second moment of the hypothetical means for each
class. For class A the second moment of the hypothetical means is: 0.01 + 0.09² = 0.0181.
For class B the second moment of the hypothetical means is: 0.03 + 0.27² = 0.1029.
The second moment for the whole population is the weighted average of these second moments
for each class: (0.5)(0.0181) + (0.5)(0.1029) = 0.0605.
Thus the overall variance of the hypothetical means is: 0.0605 - 0.18² = 0.0281.
The Expected Value of the Process Variance for the whole population is a weighted average of the
expected value of the process variance for each class: (0.5)(0.09) + (0.5)(0.27) = 0.18.
Thus K = EPV / VHM = 0.18 / 0.0281 = 6.406.
For one observation Z = 1 / (1 + 6.406) = 0.135.
Alternately, in order to compute the VHM, one can apply the concepts of analysis of variance. The
VAR[Mean | Class] = (0.5)(0.09 - 0.18)² + (0.5)(0.27 - 0.18)² = 0.0081.
Therefore, the variance of the hypothetical means for the whole population is:
E[Variance of the hypothetical means | Class] + VAR[Mean | Class] =
{(0.5)(0.01) + (0.5)(0.03)} + 0.0081 = 0.0281, as computed above.
Comment: Difficult! More recent exam questions assume that everyone in a class has the same
distribution; in other words that they are independent, identically distributed variables. Instead, here it
is assumed that each class is not homogeneous; looking at each class separately, there is a variance
of hypothetical means within each class. This is more realistic. Therefore, when we combine the two
classes and pick an insured at random, without knowing what class it is from, we have to do extra
work to get the overall VHM .
The definition of the expected value is such that one can weight together the expected value for a
subpopulation times the chance of being in that subpopulation.
Thus, (combined) EPV = E[Process Variance] =
E[Process Variance | Class A] Prob[Class A] + E[Process Variance | Class B] Prob[Class B]
= (EPV for Class A)(proportion in Class A) + (EPV for Class B)(proportion in Class B).
Also, E[Square of Hypothetical Means] = E[Square of Hypothetical Means | Class A]Prob[A] +
E[Square of Hypothetical Means | Class B]Prob[B].
Note that the overall Variance of the Hypothetical Means is greater than the average of that for the
individual classes: .0281 > {(.5)(.01) + (.5)(.03)} .
Note that one makes no use of the "Buhlmann Credibility" given for each class; for each class
EPV/VHM = K = 9 and thus for one observation Z = 1/(1+9) = 1/10. Note that when a risk is
chosen at random without knowing which class it is from, the credibility for one observation is
increased compared to that when we know which class the risk is from. In other words, in the
absence of the class plan we give more weight to the individual insureds observed experience. In
the absence of the class plan, the complement of credibility is given to the overall mean rather than
the mean of the relevant class. The overall mean of .18 is a worse predictor of an individuals future
experience than was the relevant class mean of either .09 or .27, and therefore it is given less
weight. Credibility is a measure of the relative value of two predictors.
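A minimal sketch (not from the study guide) verifying the numbers in this solution:

```python
probs = [0.5, 0.5]
means = [0.09, 0.27]          # hypothetical mean frequency of each class
vhm_within = [0.01, 0.03]     # variance of hypothetical means within each class
epv_within = [0.09, 0.27]     # expected process variance within each class

overall_mean = sum(p * m for p, m in zip(probs, means))
second_moment = sum(p * (v + m**2) for p, m, v in zip(probs, means, vhm_within))
vhm = second_moment - overall_mean**2                  # 0.0281
epv = sum(p * e for p, e in zip(probs, epv_within))    # 0.18
K = epv / vhm
Z = 1 / (1 + K)
print(round(vhm, 4), round(K, 3), round(Z, 3))         # 0.0281 6.406 0.135
```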

14.19. D. 1. True. 2. False. The credibility will increase not decrease. As the classes differ more
from each other, the data for each class will be given more credibility in estimating its own mean while
the overall mean will be given less credibility. 3. True.
14.20. C. The most homogeneous class has the lowest standard deviation of the Gamma
Distribution and therefore the smallest variance of the Gamma, which is the smallest Variance of the
Hypothetical Means. The Variance of the Hypothetical Means =
Total Variance - Process Variance = Variance of the Negative Binomial - Mean of the Gamma =
Variance of the Negative Binomial - Mean of the Negative Binomial = rβ(1 + β) - rβ = rβ².
It is smallest for Class 3.

Class      r        β         Variance of Gamma
1          5.88     0.0204    0.0024
2          1.26     0.1111    0.0156
3          10.89    0.0101    0.0011
4          2.47     0.0526    0.0068

Alternately, the parameters of the Gamma can be derived from those of the Negative Binomial:
α = r, θ = β. Then the Variance of the Gamma = αθ² = rβ².

Section 15, Experience Rating


Assume that an insured has had no accidents over the last decade. This provides evidence that he is
a safer than average insured; his expected claim frequency is lower than average for his class. Thus
for automobile insurance one might give him a safe driver discount off of the otherwise applicable
rate for his class.
This is an example of experience rating.155 Generally, experience rating consists of modifying the
rate charged to an insured (driver, business, etc.) based on its past experience. While such plans
can be somewhat complex in detail,156 in broad outline they all reward better than expected
experience and penalize worse than expected experience. Depending on the particular
circumstances more or less weight is put on the insureds observed experience from the recent
past.157
The new estimate of the insured's frequency or pure premium is a weighted average of that for his
classification and the observation. The amount of weight given to the observation is the credibility
assigned to the individual insured's data. How much credibility to assign to an individual insured's
data is precisely what has been covered in previous sections. In general it should depend on:
1. What is being estimated. Pure Premiums are harder to estimate than frequencies.
Total Limits losses are harder to estimate than basic limits losses.
2. The volume of data. All other things being equal, the more data the more credibility is assigned to
the observation.158
3. The Expected Value of the Process Variance. The more volatile the experience, the less
credibility is assigned to it.
4. The variance of the hypothetical means within classes; the more homogeneous the classification
the smaller this variance and the less credibility is assigned to the insureds individual experience
compared to that for the whole classification.
The more homogeneous the classes, the less credibility assigned to an individual's data
and the more to the average for the class, when performing experience rating (individual
risk rating). The credibility is a relative measure of the value of the information contained
in the observation of the individual versus the information in the class average.
155 For example, see Foundations of Casualty Actuarial Science, Fourth Edition, Chapter 4.
156 Experience Rating Plans are currently covered on the CAS Part 5 and Part 9 Exams.
157 The period of past experience used varies between the different Experience Rating Plans.
158 For example, in Workers Compensation Insurance the data from a business with $10,000 in Expected Losses would be given much less credibility for Experience Rating than the data from a business with $1 million in Expected Losses.

The more homogeneous the classes, the more value we place on the class average and the less
we place in the individuals experience.
Thus low credibility is neither good nor bad. It merely reflects the relative values of two pieces of
information. With a well designed class plan, the less we need to rely on the observations of the
individual, compared to a poorly designed class plan. In auto insurance if we classified insureds
based on their middle initials, we would expect to give the insureds individual experience a lot of
credibility. A poor class plan leads one to rely more on individual experience.
Note that the role of the class in Experience Rating has changed from its role in Classification
Ratemaking. In Experience Rating, the class experience receives the complement of credibility not
given to the individuals experience. In the case of classification rating, the class experience gets the
credibility while the complement of credibility is assigned to the experience of all classes combined.
In Experience Rating, the insured is the smaller unit while the class is the larger unit. In Classification
Ratemaking, the class is the smaller unit while the state is the larger unit. In both cases, the weight
given to the classifications experience is larger the more homogeneous the class. Thus the more
homogeneous the classes, the more credibility is given to the experience of each class for
Classification Ratemaking. The more homogeneous the class, the less credibility is assigned to the
individuals experience and therefore the more weight is given to the class experience for
Experience Rating.
A model that helps one to understand the concepts of experience rating is the Gamma-Poisson
frequency process.159 Each insureds frequency is given by a Poisson Process. The mean
frequencies of the insureds within a class are distributed via a Gamma Distribution. The variance of
this Gamma Distribution quantifies the homogeneity of the class. The smaller the variance of this
Gamma, the more homogeneous the class.
The observed experience of an insured can be used to improve the estimate of that insured's future
claim frequency. We assume a priori that the average claim frequencies of the insureds in a class are
distributed via a Gamma Distribution with α = 3 and θ = 2/3. The average frequency for the class is
(3)(2/3) = 2.
If we observe no claims in a year, then the posterior distribution of that insured's (unknown) Poisson
parameter is a Gamma distribution with α = 3 and θ = 0.4, with an average of: (3)(0.4) = 1.2.160
Thus the observation has lowered our estimate of this insured's future claim frequency.

159 See Mahler's Guide to Conjugate Priors.
160 See Mahler's Guide to Conjugate Priors. The posterior alpha is 3 + 0 = 3. The posterior theta is 1/(1 + 1/(2/3)) = 1/2.5 = 0.4.

The prior Gamma with α = 3 and θ = 2/3, and the posterior Gamma with α = 3 and θ = 0.4,
are shown:
[Figure: the prior Gamma density (α = 3, θ = 2/3) and the posterior Gamma density (α = 3, θ = 0.4); the posterior is more concentrated and centered at a lower λ.]
If instead we observe 5 claims in a year, then the posterior distribution of that insured's (unknown)
Poisson parameter is a Gamma distribution with α = 8 and θ = 0.4, with an average of: (8)(0.4) =
3.2.161 Thus this observation has raised our estimate of this insured's future claim frequency. The
posterior Gamma in the case of this alternate observation is shown below:
[Figure: the prior Gamma density and the posterior Gamma density (α = 8, θ = 0.4) after observing 5 claims in a year; the posterior is shifted toward larger λ.]

161 See Mahler's Guide to Conjugate Priors. The posterior alpha is 3 + 5 = 8. The posterior theta is 1/(1 + 1/(2/3)) = 1/2.5 = 0.4.
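A minimal sketch (not from the study guide) of the conjugate-prior updating used above; gamma_poisson_update is a hypothetical helper name.

```python
def gamma_poisson_update(alpha, theta, claims, years):
    """Gamma-Poisson update: posterior alpha = alpha + claims; posterior 1/theta = 1/theta + years.
    Returns the posterior (alpha, theta) and the posterior mean frequency."""
    post_alpha = alpha + claims
    post_theta = 1 / (1 / theta + years)
    return post_alpha, post_theta, post_alpha * post_theta

print(gamma_poisson_update(3, 2/3, claims=0, years=1))   # (3, 0.4, 1.2) - the claim-free year
print(gamma_poisson_update(3, 2/3, claims=5, years=1))   # (8, 0.4, 3.2) - the 5-claim year
```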

Problems:
The following information pertains to the next three questions.
For a large class of drivers the Variance of the Hypothetical Mean Frequencies is 0.124 and the
overall mean frequency is 0.660. Assume the claim count for each individual driver has a Poisson
distribution whose mean does not change over time.
15.1 (1 point) Use Buhlmann Credibility to estimate the expected annual claim frequency of a
driver who has had one accident-free year.
A. less than 55%
B. at least 55% but less than 58%
C. at least 58% but less than 61%
D. at least 61% but less than 64%
E. at least 64%
15.2 (1 point) Use Buhlmann Credibility to estimate the expected annual claim frequency of a driver
who has had one accident in four years.
A. less than 40%
B. at least 40% but less than 42%
C. at least 42% but less than 44%
D. at least 44% but less than 46%
E. at least 46%
15.3 (1 point) Use Buhlmann Credibility to estimate the expected annual claim frequency of a driver
who has had eight accidents in ten years.
A. less than 70%
B. at least 70% but less than 72%
C. at least 72% but less than 74%
D. at least 74% but less than 76%
E. at least 76%
15.4 (1 point) Under a certain Experience Rating Plan, an insured with $15,000 in Expected Losses
who has no claims during the experience period receives a 17% credit modification. Under this
Experience Rating Plan, how much credibility is assigned to the data of an insured with $15,000 in
Expected Losses?
A. less than 10%
B. at least 10% but less than 15%
C. at least 15% but less than 20%
D. at least 20% but less than 25%
E. at least 25%

Use the following information for the following 14 questions:162


Claim severity in the State of Confusion follows a Pareto distribution, with parameters α = 3,
θ = $20,000. There are four types of risks in this State, all with claim frequency given by a Poisson
distribution:

Type         Average Annual Claim Frequency
Excellent     5
Good         10
Bad          15
Ugly         20

15.5 (2 points) A risk is selected at random from a class made up equally of Good and Bad risks.
What is the Buhlmann Credibility assigned to this risks claim frequency observed over a single
year? (The complement of credibility will be assigned to the estimated claim frequency for the
class.)
A. less than 30%
B. at least 30% but less than 40%
C. at least 40% but less than 50%
D. at least 50% but less than 60%
E. at least 60%
15.6 (1 point) A risk is selected at random from a class made up equally of Good and Bad risks.
What is the Buhlmann Credibility assigned to this risks claim frequency observed over a three year
period? (The complement of credibility will be assigned to the estimated claim frequency for the
class.)
A. less than 30%
B. at least 30% but less than 40%
C. at least 40% but less than 50%
D. at least 50% but less than 60%
E. at least 60%

162

In my paper A Graphical Illustration of Experience Rating Credibilities, PCAS 1998, not on the syllabus, I use the
situations assumed in these problems, in order to illustrate via graphs the concepts of Experience Rating.

15.7 (2 points) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. What is the Buhlmann Credibility assigned to this risks claim frequency observed over a single
year? (The complement of credibility will be assigned to the estimated claim frequency for the
class.)
A. less than 60%
B. at least 60% but less than 70%
C. at least 70% but less than 80%
D. at least 80% but less than 90%
E. at least 90%
15.8 (2 points) A risk is selected at random from a class made up equally of all four types of risks.
What is the Buhlmann Credibility assigned to this risks claim frequency observed over a single
year?
(The complement of credibility will be assigned to the estimated claim frequency for the class.)
A. less than 60%
B. at least 60% but less than 70%
C. at least 70% but less than 80%
D. at least 80% but less than 90%
E. at least 90%
15.9 (3 points) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. What is the expected value of the process variance of the loss pure premium?
A. less than 5 billion
B. at least 5 billion but less than 5.5 billion
C. at least 5.5 billion but less than 6 billion
D. at least 6 billion but less than 6.5 billion
E. at least 6.5 billion
15.10 (2 points) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. What is the variance of the hypothetical loss pure premiums?
A. less than 5 billion
B. at least 5 billion but less than 5.5 billion
C. at least 5.5 billion but less than 6 billion
D. at least 6 billion but less than 6.5 billion
E. at least 6.5 billion

15.11 (1 point) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. What is the Buhlmann Credibility assigned to this risks loss pure premium observed over a
single year?
(The complement of credibility will be assigned to the estimated loss pure premium for the class.)
A. less than 40%
B. at least 40% but less than 50%
C. at least 50% but less than 60%
D. at least 60% but less than 70%
E. at least 70%
15.12 (1 point) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. This risk is observed to have $300,000 in losses in a single year.
Using Buhlmann Credibility what is the expected dollars of loss for this risk in a single future year?
A. less than $200,000
B. at least $200,000 but less than $205,000
C. at least $205,000 but less than $210,000
D. at least $210,000 but less than $215,000
E. at least $215,000
15.13 (1 point) Claim sizes are limited to $25,000.
What is the mean of the (limited) severity?
A. less than $6,000
B. at least $6,000 but less than $6,500
C. at least $7,000 but less than $7,500
D. at least $7,500 but less than $8,000
E. at least $8,000
15.14 (2 points) Claim sizes are limited to $25,000.
What is the second moment of the (limited) severity distribution?
A. less than 115 million
B. at least 115 million but less than 120 million
C. at least 120 million but less than 125 million
D. at least 125 million but less than 130 million
E. at least 130 million

15.15 (3 points) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. Claim sizes are limited to $25,000.
What is the expected value of the process variance of the loss pure premium?
A. less than 1.3 billion
B. at least 1.3 billion but less than 1.4 billion
C. at least 1.4 billion but less than 1.5 billion
D. at least 1.5 billion but less than 1.6 billion
E. at least 1.6 billion
15.16 (3 points) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. Claim sizes are limited to $25,000.
What is the variance of the hypothetical loss pure premium?
A. less than 3.3 billion
B. at least 3.3 billion but less than 3.4 billion
C. at least 3.4 billion but less than 3.5 billion
D. at least 3.5 billion but less than 3.6 billion
E. at least 3.6 billion
15.17 (1 point) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. Claim sizes are limited to $25,000. What is the Buhlmann Credibility assigned to this risks
(limited) loss pure premium observed over a single year? (The complement of credibility will be
assigned to the estimated (limited) loss pure premium for the class.)
A. less than 71%
B. at least 71% but less than 73%
C. at least 73% but less than 75%
D. at least 75% but less than 75%
E. at least 77%
15.18 (1 point) A risk is selected at random from a class made up equally of Excellent and Ugly
risks. Claim sizes are limited to $25,000. This risk is observed to have $200,000 in losses in a
single year. Using Buhlmann Credibility what is the expected dollars of (limited) loss for this risk in a
single future year?
A. less than $170,000
B. at least $170,000 but less than $175,000
C. at least $175,000 but less than $180,000
D. at least $180,000 but less than $185,000
E. at least $185,000

15.19 (3 points) For an experience rating plan, the credibility assigned to an insured's experience is
given by the Buhlmann Credibility formula with K = 40,000:
Z = E / (E + 40,000), with E = expected losses for the insured.
The sizes of insureds, in other words their expected losses, vary across the portfolio via a
Pareto Distribution with θ = 40,000 and α.
Determine the average credibility assigned to an insured in this portfolio.
15.20 (4, 5/84, Q.47) (1 point) Suppose you are given a class of insurance which is
homogeneous and has a Poisson claim count process for the individual risk. Assume the expected
frequency is 10% and disregard severity. What Buhlmann credibility would be assigned to the
annual experience of an individual risk taken from the class?
A. 0%
B. 10%
C. 90%
D. None of the above
E. Insufficient information given
15.21 (4, 5/85, Q.43) (1 point) An insured's loss rate is to be credibility weighted with the loss rate
of its class. Which of the following statements are true?
1) As the variance of the hypothetical means increases, the insured's credibility should increase.
2) As the expected value of the process variance increases,
the insured's credibility should decrease.
3) If all insureds in the class are identical, the insured's credibility should be zero.
A. 3
B. 2, 3
C. 1, 3
D. 1, 2
E. 1, 2, 3

Solutions to Problems:
15.1. B. Each insured's frequency process is given by a Poisson with parameter λ, with λ varying
over the group of insureds. The process variance for each insured is λ.
Thus the expected value of the process variance is estimated as follows:
E[VAR[X | λ]] = E[λ] = overall mean = 0.660.
K = EPV / VHM = 0.66 / 0.124 = 5.32. For one year Z = 1 / (1 + 5.32) = 0.158.
If there are no accidents, estimated frequency is: (0)(0.158) + (0.66)(1 - 0.158) = 0.556.
Comment: This indicates a claim-free credit for one year of 15.8%, equal to the credibility.
15.2. E. From the solution to the previous question, K = 5.32.
For four years the credibility Z = 4 / (4 + 5.32) = 42.9%. The observed frequency is 1/4 = 0.25.
The prior mean is 0.66. The new estimate = (0.25)(42.9%) + (0.66)(57.1%) = 0.484.
Comment: Under a (simplified) experience rating this insured might get a credit of:
1 - (0.484 / 0.66) = 26.7%.
15.3. D. K = 5.32. For 10 years, Z = 10 /(10 + 5.32) = 65.3%. The observed frequency is .80 and
the prior mean is .66. The new estimate is: (.8)(.653) + (.66)(.347) = 0.751.
15.4. C. Let R be the class rate. Let D be the rate based solely on the observed data for the
chosen insured. Then for credibility Z, the rate charged the insured = ZD + (1 - Z)R.
For the case where D = 0 and the rate charged the insured is: R(1 - 0.17) = 0.83 R, we have:
0.83 R = Z(0) + (1 - Z) R = (1 - Z) R. Thus Z = 0.17.
Comment: The credibility is equal to the claims free discount, in this case 17%.
15.5. B. The overall mean frequency is: (10 + 15)/2 = 12.5. Since we are mixing Poissons,
Expected Value of the Process Variance = overall mean = 12.5.
Variance of Hypothetical Mean Frequencies = {(10 - 12.5)² + (15 - 12.5)²} / 2 = 2.5² = 6.25.
K = EPV/VHM = 12.5 / 6.25 = 2. Z = 1 / (1 + 2) = 33.3%.
15.6. E. K = 12.5 / 6.25 = 2. Z = 3 / (3 + 2) = 60%.
15.7. D. The overall mean frequency is: (5 + 20)/2 = 12.5. Since we are mixing Poissons,
Expected Value of the Process Variance = overall mean = 12.5.
Variance of Hypothetical Mean Frequencies = {(5 - 12.5)² + (20 - 12.5)²} / 2 = 7.5² = 56.25.
K = EPV/VHM = 12.5 / 56.25 = 0.222. Z = 1 / (1 + 0.222) = 81.8%.

15.8. C. The overall mean frequency is: (5 + 10 + 15 + 20)/4 = 12.5.
Since we are mixing Poissons, Expected Value of the Process Variance = overall mean = 12.5.
Variance of Hypothetical Mean Frequencies
= {(5 - 12.5)² + (10 - 12.5)² + (15 - 12.5)² + (20 - 12.5)²} / 4 = 31.25.
K = 12.5 / 31.25 = 0.4. Z = 1 / (1 + 0.4) = 71.4%.
15.9. B. The process variance of the pure premium with a Poisson frequency =
(mean frequency)(second moment of the severity). The severity distribution is assumed to be the
same for all types of risks, therefore the expected value of the process variance =
(overall mean frequency)(second moment of the Pareto).
For a Pareto with parameters α = 3, θ = $20,000: Second Moment of Severity =
2θ² / {(α - 1)(α - 2)} = (2)(20,000²) / {(2)(1)} = 4 × 10^8.
Therefore, Expected Value of Process Variance = (12.5)(4 × 10^8) = 5 × 10^9.
15.10. C. For a Pareto with parameters α = 3, θ = $20,000: Mean Severity = θ/(α - 1) = 10,000.
Thus the Hypothetical Mean Pure Premiums are: (5)(10,000) and (20)(10,000).
The overall mean pure premium is: (12.5)(10,000). Thus the Variance of the Hypothetical Mean
Pure Premiums = 10,000² {(5 - 12.5)² + (20 - 12.5)²} / 2 = 5.625 × 10^9.
15.11. C. K = 5 × 10^9 / 5.625 × 10^9 = 0.889. Z = 1 / (1 + 0.889) = 52.9%.
15.12. E. The mean losses for the class = (12.5)($10,000) = $125,000. The credibility is 52.9%.
Therefore, new estimate = (0.529)(300,000) + (1 - 0.529)(125,000) = $217,575.
15.13. E. For the Pareto, E[X ∧ x] = {θ/(α - 1)} {1 - (θ/(θ + x))^(α-1)}.
E[X ∧ 25000] = (20,000 / 2) (1 - (20,000 / 45,000)²) = $8025.
15.14. C. For the Pareto: E[(X ∧ L)²] = E[X²] {1 - (1 + L/θ)^(1-α) [1 + (α - 1)L/θ]}.
E[(X ∧ 25000)²] = (4 × 10^8) {1 - (1 + 1.25)^(-2) [1 + (2)(1.25)]} = 1.235 × 10^8.
15.15. D. The process variance with a Poisson frequency =
(mean frequency)(second moment of the severity). The severity distribution is assumed to be the
same for all types of risks, therefore the expected value of the process variance =
(overall mean frequency)(second moment of severity).
From the previous problem, second moment of severity = 1.235 × 10^8.
Therefore, Expected Value of Process Variance = (12.5)(123.5 million) = 1.544 × 10^9.

15.16. E. From a previous problem, E[X ∧ 25000] = $8025.
Thus the Hypothetical Mean Pure Premiums are: (5)($8025) and (20)($8025).
Variance of the Hypothetical Mean Pure Premiums = (8025²) {(5 - 12.5)² + (20 - 12.5)²} / 2 =
3.623 × 10^9.
15.17. A. K = 1.544 × 10^9 / 3.623 × 10^9 = 0.426. Z = 1 / (1 + 0.426) = 70.1%.
15.18. B. Z = 70.1% and the overall mean loss is (12.5)($8025) = $100,313.
Thus the new estimate = (0.701)($200,000) + (1 - 0.701)($100,313) = $170,194.
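Solutions 15.13-15.18 can be reproduced with a minimal sketch (not from the study guide), using the Pareto limited-moment formulas quoted above:

```python
alpha, theta, L = 3.0, 20_000.0, 25_000.0
freqs = [5.0, 20.0]                       # Excellent and Ugly mean frequencies
mean_freq = sum(freqs) / 2                # 12.5

lim_mean = (theta / (alpha - 1)) * (1 - (theta / (theta + L)) ** (alpha - 1))          # E[X ∧ L], ~8025
ex2 = 2 * theta**2 / ((alpha - 1) * (alpha - 2))                                       # E[X²] = 4e8
lim_ex2 = ex2 * (1 - (1 + L / theta) ** (1 - alpha) * (1 + (alpha - 1) * L / theta))   # E[(X ∧ L)²], ~1.235e8

epv = mean_freq * lim_ex2
vhm = lim_mean**2 * sum((f - mean_freq) ** 2 for f in freqs) / 2
Z = 1 / (1 + epv / vhm)
estimate = Z * 200_000 + (1 - Z) * mean_freq * lim_mean
print(round(lim_mean), round(Z, 3), round(estimate))   # 8025, 0.701, and roughly $170,200 (rounding)
```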

15.19. The survival function of the sizes of insureds is: S(E) = {40,000 / (E + 40,000)}^α.
1 - S(E)^(1/α) = 1 - 40,000/(E + 40,000) = E / (E + 40,000) = Z.
Therefore, the mean Z is:
∫ {E / (E + 40,000)} f(E) dE = ∫ {1 - S(E)^(1/α)} f(E) dE = [F(E) + S(E)^(1 + 1/α) / (1/α + 1)], evaluated from E = 0 to E = ∞,
= 1 - 1/(1/α + 1) = 1/(α + 1).
Alternately, the mean Z is:
∫ {E / (E + 40,000)} f(E) dE = ∫ E α 40,000^α / (E + 40,000)^(α+2) dE
= α 40,000^α { [-E / {(α+1)(E + 40,000)^(α+1)}], evaluated from E = 0 to E = ∞, + ∫ dE / {(α+1)(E + 40,000)^(α+1)} }
= α 40,000^α {0 + 1 / {α (α+1) 40,000^α}} = 1/(α + 1).
Alternately, (α+1) 40,000^(α+1) / (x + 40,000)^(α+2) is the density of a Pareto Distribution with parameters α+1 and
40,000. This distribution has a mean of: 40,000/(α + 1 - 1) = 40,000/α. Therefore,
∫ x (α+1) 40,000^(α+1) / (x + 40,000)^(α+2) dx = 40,000/α. ⇒ ∫ x / (x + 40,000)^(α+2) dx = 1 / {α (α+1) 40,000^α}.
The mean Z is: ∫ {E / (E + 40,000)} f(E) dE = α 40,000^α ∫ E / (E + 40,000)^(α+2) dE
= α 40,000^α / {α (α+1) 40,000^α} = 1/(α + 1).
Comment: The Buhlmann credibility formula has the same mathematical form as the distribution
function of a Pareto with α = 1.
In the first solution, I used the fact that the derivative of the survival function is minus one times the
density function. In the second solution, I used integration by parts.
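A minimal sketch (not from the study guide) checking the result numerically; α = 2 is an arbitrary choice for the check.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, theta = 2.0, 40_000.0
u = rng.random(1_000_000)
E = theta * (u ** (-1 / alpha) - 1)     # inverse-CDF draws of insured size from the Pareto
Z = E / (E + theta)                     # credibility assigned to each insured
print(round(Z.mean(), 4), 1 / (alpha + 1))   # both close to 0.3333
```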
15.20. A. Since the class is (completely) homogeneous the experience of an individual is assigned
no credibility for experience rating. (The VHM is zero, so Z = 0.)
15.21. E. 1. True. 2. True. 3. True. If the class is (completely) homogeneous, then the Variance of
the Hypothetical Means is zero and the credibility assigned to the experience of an individual is
zero. (The class loss rate is given 100% credibility.)

Section 16, Loss Functions


A loss function is defined as a function of the estimate of a parameter and its true value.163
The loss function most commonly used by actuaries is the squared error loss function.164
The squared error is: (estimate - true value)².
The smaller the expected value of the loss function, the better the estimate.
Which estimator is best depends on which loss function one attempts to minimize.165
It turns out that the estimator that minimizes squared errors is the mean.
Another loss function is the absolute error: | estimate - true value|.
It turns out that the estimator that minimizes absolute errors is the median.
Bayesian Estimation:
Throughout the section on Bayesian Analysis, I used the mean of the posterior distribution of the
quantity of interest as the estimator. When doing Bayes Analysis questions, use the mean of the
posterior distribution, unless specifically stated otherwise. However, the mean is only one possible
Bayesian estimator.
Bayes Analysis using the Squared-error Loss Function just means do what we usually
do: get the posterior mean of the quantity of interest.
In general, the Bayes Estimator minimizes the expected value of the given loss function.166
Depending on the loss function one attempts to minimize, one gets the following estimators:167

163 See Definition 15.15 in Loss Models.
164 For example, in linear regression one tries to minimize squared errors.
165 In general, which estimator is best depends on which criterion one uses.
166 See Definition 15.16 in Loss Models.
167 See Theorem 15.18 in Loss Models. Note that multiplying a loss function by a constant does not change the estimator that minimizes that loss function.

Error or Loss Function                                                     Name             Bayesian Point Estimator
(estimate - true value)²                                                   Squared-error    Mean
0 if estimate = true value; 1 if estimate ≠ true value                     Zero-one         Mode
| estimate - true value |                                                  Absolute-error   Median
(1 - p) | estimate - true value |, if estimate ≥ true value (overestimate);
(p) | estimate - true value |, if estimate ≤ true value (underestimate)                     pth percentile

Exercise: Assume the following information:
• The probability of y successes in m trials is given by a Binomial distribution
with parameters m and q.
• The prior distribution of q is uniform on [0, 1].
• One success was observed in three trials.
Determine the posterior distribution of q.
[Solution: Assuming a given value of q, the chance of observing one success in three trials is
3q(1 - q)². The prior distribution of q is: π(q) = 1, 0 ≤ q ≤ 1. By Bayes Theorem, the posterior
distribution of q is proportional to the product of the chance of the observation and the prior
distribution: 3q(1 - q)². Thus the posterior distribution of q is proportional to q - 2q² + q³.
The integral of q - 2q² + q³ from 0 to 1 is: 1/2 - 2/3 + 1/4 = 1/12.
Thus the posterior distribution of q is: 12(q - 2q² + q³) = 12q - 24q² + 12q³, 0 ≤ q ≤ 1.]

In this exercise, the posterior distribution of q is a Beta Distribution with a = 2, b = 3, and θ = 1:168
[Figure: the density of the posterior Beta Distribution, 12q(1 - q)², for q from 0 to 1; it peaks near q = 1/3.]
Using the squared-error loss function, the expected future frequency is given by the mean of the
posterior distribution: θa/(a + b) = (1)(2)/(2 + 3) = 2/5.169
Using instead the zero-one loss function, the expected future frequency is given by the mode of the
posterior distribution: θ(a - 1)/(a + b - 2) = (1)(2 - 1)/(2 + 3 - 2) = 1/3.170
The graph of the density reaches a maximum at q = 1/3.
Using the absolute loss function, the expected future frequency is given by the median of the
posterior distribution.
The posterior distribution of q is: f(q) = 12q - 24q² + 12q³.
Therefore by integration, F(q) = 6q² - 8q³ + 3q⁴.
At the median the distribution function is 0.5: 6q² - 8q³ + 3q⁴ = 0.5.
One can solve numerically for q = 0.3857. In the above graph, half of the area is to the left of
q = 0.3857, while the other half of the area is to the right of q = 0.3857.

168 As shown in Appendix A attached to the exam, the density of the Beta Distribution is
f(x) = {(a + b - 1)! / ((a-1)! (b-1)!)} (x/θ)^(a-1) (1 - x/θ)^(b-1) / θ, 0 ≤ x ≤ θ.
This is a special case of a Beta-Bernoulli conjugate prior. See Mahler's Guide to Conjugate Priors.
169 See Appendix A of Loss Models. One can compute the mean by integrating q times the posterior density, from 0 to 1.
170 f(q) = 12(q - 2q² + q³). f'(q) = 12(1 - 4q + 3q²), which is zero when q = 1/3 or 1. One can confirm that the density reaches a maximum at q = 1/3.

Exercise: Use a loss function based on the absolute error, but treat underestimates as three times as
important as overestimates.
Determine the Bayesian estimate of the future claim frequency.
[Solution: The loss function is:
| estimate - true value |, if estimate ≥ true value (overestimate)
3 | estimate - true value |, if estimate ≤ true value (underestimate).
We can multiply this loss function by 1/4, without affecting which estimator is best:
(1/4) | estimate - true value |, if estimate ≥ true value (overestimate)
(3/4) | estimate - true value |, if estimate ≤ true value (underestimate).
Consulting the previous chart, minimizing the expected value of this loss function corresponds to
using the 75th percentile.
For the 75th percentile of the posterior distribution of q: 6q² - 8q³ + 3q⁴ = 0.75.
Solving numerically, q = 0.5437.
Comment: Since we really want to avoid underestimates, we choose a larger estimator, the 75th
percentile. If we instead treated overestimates as three times as important as underestimates, then
we would choose a smaller estimator, the 25th percentile.]
Thus we see depending on which loss function we use, the estimated future frequency is either the
mean of the posterior distribution 0.4, the mode of the posterior distribution 0.3333, the median of
the posterior distribution 0.3857, or the 75th percentile of the posterior distribution 0.5437.
Note that the only difference is the criterion that was used to decide which estimator is best; the a
priori assumptions and the observations are the same in each case.
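A minimal sketch (not from the study guide) recomputing these four estimates from the posterior Beta(a = 2, b = 3) with scipy:

```python
from scipy.stats import beta
from scipy.optimize import minimize_scalar

post = beta(2, 3)                       # posterior density 12q(1-q)^2 on [0, 1]
mean = post.mean()                      # 0.4     - squared-error loss
mode = minimize_scalar(lambda q: -post.pdf(q), bounds=(0, 1), method="bounded").x  # ~1/3 - zero-one loss
median = post.ppf(0.5)                  # ~0.3857 - absolute-error loss
pct75 = post.ppf(0.75)                  # ~0.5437 - underestimates three times as important
print(round(mean, 4), round(mode, 4), round(median, 4), round(pct75, 4))
```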
Connecting the Loss Functions with the Estimators:
Assume we observe a sample of size 5 from an unknown distribution: 12, 3, 38, 5, 8.
We wish to estimate the next value from this distribution.
Let us assume we wish to minimize the squared error of the estimate.
Let the estimate be y.
Then if x is the next observed value, we wish to minimize (x - y)².
Let us assume the uniform and discrete distribution on the given sample, in other words the empirical
distribution function.171 In other words, let us assume a 20% chance of: 12, 3, 38, 5 or 8.
Then the expected squared error for an estimate of y is:
0.2(12 - y)² + 0.2(3 - y)² + 0.2(38 - y)² + 0.2(5 - y)² + 0.2(8 - y)².

171 This is similar to a step used in Bootstrapping. See Mahler's Guide to Simulation.

Taking the derivative with respect to y and setting it equal to zero:
0 = 0.4{(12 - y) + (3 - y) + (38 - y) + (5 - y) + (8 - y)}.
y = (12 + 3 + 38 + 5 + 8)/5 = 11.
Thus the sample mean minimizes the squared error loss function.
More generally, for the empirical distribution function, the expected squared error using an estimate
of y is: Σ (xi - y)² / n. We minimize the expected squared error:
0 = ∂/∂y {Σ (xi - y)² / n} = -(2/n) Σ (xi - y). ⇒ y = Σ xi / n = sample mean.
Similarly, the expected absolute error using an estimate of y is:
0.2|12 - y| + 0.2|3 - y| + 0.2|38 - y| + 0.2|5 - y| + 0.2|8 - y|.
Here is a graph of the expected absolute error as a function of the estimate y:
[Figure: the expected absolute error as a function of the estimate y; it is minimized at y = 8.]
The expected absolute error is minimized for y = 8, which is the empirical median.
More generally, for the empirical distribution function, the expected absolute error for an estimate of
y is: Σ |xi - y| / n. We minimize the expected absolute error:
0 = ∂/∂y {Σ |xi - y| / n} = -Σ sgn[xi - y] / n, where
sgn(z) equals 1 when z > 0, equals -1 when z < 0, and equals 0 when z = 0.

This partial derivative is equal to zero when there is an equal chance that y > xi or y < xi, which occurs
when y is the sample median.
Thus in this example, if one wanted to estimate the next value, one might take either the empirical
mean of 11 or the empirical median of 8. Which of these two estimates is better depends on which
criterion or loss function one uses to decide the question.
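A minimal sketch (not from the study guide) that finds these two estimates by brute force over a grid:

```python
import numpy as np

sample = np.array([12, 3, 38, 5, 8.0])
grid = np.linspace(0, 40, 4001)                                   # candidate estimates y
sq_err = ((sample[None, :] - grid[:, None]) ** 2).mean(axis=1)    # expected squared error for each y
abs_err = np.abs(sample[None, :] - grid[:, None]).mean(axis=1)    # expected absolute error for each y
print("minimizes squared error :", grid[sq_err.argmin()])         # 11.0, the sample mean
print("minimizes absolute error:", grid[abs_err.argmin()])        # 8.0, the sample median
```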
Note that the squared error loss function (dashed line) counts extreme errors more heavily than does
the absolute error loss function (solid line).
[Figure: the squared error loss function (dashed) and the absolute error loss function (solid), each plotted as a function of the error.]
Therefore, it is not surprising that one would get a different best estimate, depending on which of
these two loss functions one is trying to minimize.
If y is the estimate and x is the observation, let the loss function be defined by:
error = (1 - p) | y - x |, if y ≥ x (overestimate)
        (p) | y - x |, if y ≤ x (underestimate)
If the estimate is y, then the partial derivative of the loss with respect to y is (1 - p) if x ≤ y and -p if x ≥ y.
Thus the expected value of this derivative is: Prob(x ≤ y)(1 - p) + Prob(x > y)(-p) =
Prob(x ≤ y)(1 - p) - {1 - Prob(x ≤ y)}(p) = Prob(x ≤ y) - p. Thus the expected value of the derivative
of this loss function is zero when Prob(x ≤ y) = p, in other words when y is the pth percentile of the
distribution of x.
Note that if p = 0.5, then the loss function is proportional to the absolute error loss function discussed
previously, which is minimized by the 50th percentile, in other words, the median.

Exercise: Assume that we wish to estimate Loss Reserves and that we believe errors when our
reserve turns out to be an underestimate to be 9 times as bad as when our reserve turns out to be
an overestimate. In other words, we are more concerned about the possibility that the outcome x
turns out to be greater than the estimate y, than vice versa.
What estimator should we use?
[Solution: In this case, the loss function would be:172
error = | y - x |, if y ≥ x (overestimate)
        9 | y - x |, if y ≤ x (underestimate)
If we multiply this loss function by 1/10, then this is a special case of the prior loss function, with
p = 9/10. Thus the best estimator would be the 90th percentile.]
If y is the estimate and x is the observation, let the loss function be defined by:173
error = 0 if y = x
        1 if y ≠ x
The mode is that value of x most likely to occur. In other words, the probability of not matching a
single selected value is smallest at the mode. Therefore, an estimate equal to the mode of the
distribution function of x will minimize the expected value of this loss function.

172 The loss function is subject to an arbitrary multiplicative constant.
173 This is referred to by Loss Models as the zero-one loss function. It has little application to actuarial work.

Problems:
Use the following information for the next 4 questions:
9 years of losses (in millions of dollars), ranked from smallest to largest are observed:
2, 7, 10, 18, 23, 30, 58, 72, and 617.
One wishes to estimate the losses (in millions of dollars) for the next year.
16.1 (2 points) If you are interested in minimizing the expected squared error, what is your estimate
of next year's losses (in millions of dollars)?
A. Less than 20
B. At least 20 but less than 40
C. At least 40 but less than 60
D. At least 60 but less than 80
E. At least 80
16.2 (2 points) You are interested in minimizing the expected absolute error.
What is your estimate of next year's losses (in millions of dollars)?
A. Less than 20
B. At least 20 but less than 40
C. At least 40 but less than 60
D. At least 60 but less than 80
E. At least 80
16.3 (2 points) You are doing loss reserving, and are more concerned with underestimates than with
overestimates. Each absolute value of any underestimates will be treated as four times as important
as the absolute value of any overestimates.
What is your estimate of next year's losses (in millions of dollars)?
A. Less than 20
B. At least 20 but less than 40
C. At least 40 but less than 60
D. At least 60 but less than 80
E. At least 80
16.4 (2 points) You are more concerned with overestimates than with underestimates. Each
absolute value of any overestimates will be treated as four times as important as the absolute value
of any underestimates. What is your estimate of next year's losses (in millions of dollars)?
A. Less than 20
B. At least 20 but less than 40
C. At least 40 but less than 60
D. At least 60 but less than 80
E. At least 80


Use the following information for the next 3 questions:

Let severity be given by an Exponential Distribution with mean θ: f(x) = e^(−x/θ) / θ.
In turn, let θ have the improper prior distribution π(θ) = 1/θ, 0 < θ < ∞.
One observes 3 claims from an insured, of sizes: 6, 9, 11.
You may use the following values of the Incomplete Gamma Function:
Γ[2; 1.67835] = Γ[3; 2.67406] = Γ[4; 3.67206] = Γ[5; 4.67091] = 0.5

16.5 (2 points) Estimate θ for this insured, using the squared-error loss function.
A. Less than 8
B. At least 8 but less than 10
C. At least 10 but less than 12
D. At least 12 but less than 14
E. At least 14
16.6 (2 points) Estimate θ for this insured, using the zero-one loss function.
A. Less than 8
B. At least 8 but less than 10
C. At least 10 but less than 12
D. At least 12 but less than 14
E. At least 14
16.7 (2 points) Estimate θ for this insured, using the absolute error loss function.
A. Less than 8
B. At least 8 but less than 10
C. At least 10 but less than 12
D. At least 12 but less than 14
E. At least 14

16.8 (2 points) A loss function has been defined by:
loss = 2(θ̂ − x) if x ≤ θ̂
       3(x − θ̂) if x ≥ θ̂
where θ̂ is the Bayesian point estimate of x.
Which statistic of x should θ̂ be so as to minimize the expected value of the loss function?
A. 40th percentile   B. 60th percentile   C. Mean   D. Median   E. Mode


Use the following information for the next three questions:


In a large portfolio of risks, the number of claims for one policyholder during one
year follows a Bernoulli distribution with mean q.
The number of claims for one policyholder for one year is independent of the number
of claims for the policyholder for any other year. The number of claims for one
policyholder is independent of the number of claims for any other policyholder.
The distribution of q within the portfolio has density function:
f(q) = 400q, for 0 < q ≤ 0.05, and f(q) = 40 − 400q, for 0.05 < q < 0.10.
A policyholder Phillip DeTanque is selected at random from the portfolio.
During Year 1, Phillip has one claim.
During Year 2, Phillip has no claim.
During Year 3, Phillip has no claim.
16.9 (3 points) A loss function is defined as equal to zero if the estimate equals the true value, and
one otherwise. You are interested in minimizing the expected value of this loss function. Find the
Bayesian estimate of Phillip's q.
A. 0.0400   B. 0.0424   C. 0.0500   D. 0.0576   E. 0.0600
16.10 (4 points) You are interested in minimizing the expected absolute error.
Find the Bayesian estimate of Phillip's q.
A. 0.0400 B. 0.0424 C. 0.0500 D. 0.0576 E. 0.0600
16.11 (2 points) You are interested in minimizing the expected squared error.
Find the Bayesian estimate of Phillip's q.
A. 0.0400   B. 0.0424   C. 0.0500   D. 0.0576   E. 0.0600

16.12 (4 points) Severity follows a LogNormal Distribution.


The prior density of parameters is: ( , ) = 1/.
Three losses were paid on a policy with following sizes: 1000, 2000, 5000.
Determine the Bayesian estimate of and , using the posterior mode.


Use the following information for the next two questions:

The amount of a single payment has the Single Parameter Pareto Distribution
with θ = 10 and unknown shape parameter α.
The prior distribution of α is a Gamma Distribution with α = 3 and scale parameter θ = 5.
Three losses were paid on a policy, with the following sizes: 13, 16, 21.
16.13 (3 points) With the squared error loss function, what is the Bayes estimate of the shape
parameter of the Single Parameter Pareto Distribution for this policy?
A. 3.0   B. 3.2   C. 3.4   D. 3.6   E. 3.8
16.14 (3 points) With the zero-one loss function, what is the Bayes estimate of the shape
parameter of the Single Parameter Pareto Distribution for this policy?
A. 3.0   B. 3.2   C. 3.4   D. 3.6   E. 3.8

Use the following information for the next two questions:

Size of loss is uniform on [0, c].


The improper prior of c is: π(c) = 1/c, c > 0.
A particular insured has two losses of sizes: 10, 15.
16.15 (3 points) Using the absolute error loss function,
what is the Bayesian estimate of c for this insured?
A. 15   B. 21   C. 25   D. 28   E. 30

16.16 (3 points) Using the absolute error loss function,
what is the Bayesian estimate of the size of the next loss from this insured?
A. 10.50   B. 10.75   C. 11.00   D. 11.25   E. 11.50


16.17 (4B, 11/94, Q.20) (2 points) The density function for a certain parameter, θ, is
f(θ) = 4.6^θ e^(−4.6) / θ!, θ = 0, 1, 2, ....
A loss function has been defined by:
loss = 0 if θ = θ₁
       k if θ ≠ θ₁
where θ₁ is the Bayesian point estimate of θ, and k is a positive constant.
Which statistic of θ should θ₁ be so as to minimize the expected value of the loss function?
A. 33rd percentile   B. Maximum value   C. Mean   D. Minimum value   E. Mode

16.18 (4B, 11/97, Q.13) (2 points) You are given the following:
The random variable X has the density function
f(x) = e^(−x), 0 < x < ∞.
A loss function is given by |X − k|, where k is a constant.
Determine the value of k that will minimize the expected loss.
A. ln 0.5   B. 0   C. ln 2   D. 1   E. 2
16.19 (4B, 11/99, Q.8) (2 points) You are given the following:
A loss function is given by
loss = k − X if X − k ≤ 0
       α(X − k) if X − k > 0
where X is a random variable.
The expected loss is minimized when k is equal to the 80th percentile of X.
Determine α.
A. 0.2   B. 0.8   C. 1.0   D. 2.0   E. 4.0


Solutions to Problems:
16.1. E. Using the squared loss function corresponds to using the mean as the estimator.
The observed mean is: (2 + 7 + 10 + 18 + 23 + 30 + 58 + 72 + 617)/9 = 93.
Alternately, here is what the sum of the squared errors for the observed data would have been for
various estimates:
Estimate    Sum of Squared Errors
85              313,878
90              313,383
93              313,302
95              313,338
100             313,743

16.2. B. Using the absolute loss function corresponds to using the median as the estimator.
The estimated median is the (9+1)(50%) = 5th observed loss, which is: 23.
Alternately, here is what the sum of the absolute errors for the observed data would have been for
various estimates:
Estimate    Sum of Absolute Errors
15              754
22              741
23              740
24              741
30              747

For example, for an estimate of 22, the sum of the absolute errors would have been:
|2-22| + |7-22| + |10-22| + |18-22| + |23-22| + |30 - 22| + |58-22| + |72-22| + |617-22| =
20 + 15 + 12 + 4 + 1 + 8 + 36 + 50 + 595 = 741.
16.3. D. Using a loss function proportional to:
(1 − 0.8)|estimate − true value|, if estimate ≥ true value (overestimate)
(0.8)|estimate − true value|, if estimate ≤ true value (underestimate)
corresponds to using the 80th percentile as the estimator.
The estimated 80th percentile is the (9+1)(80%) = 8th observed loss, which is: 72.
Alternately, here is what the sum of each absolute value of any underestimates multiplied by 4,
plus the absolute value of any overestimates, would have been for various estimates:
Estimate    Sum of Errors
50              2598
70              2538
72              2536
75              2548
95              2628

For example, for an estimate of 70, the sum of the errors would have been:
|2-70| + |7-70| + |10-70| + |18-70| + |23-70| + |30 - 70| + |58-70| + 4|72-70| + 4|617-70| = 2538.


16.4. A. Using a loss function proportional to:
(1 − 0.2)|estimate − true value|, if estimate ≥ true value (overestimate)
(0.2)|estimate − true value|, if estimate ≤ true value (underestimate)
corresponds to using the 20th percentile as the estimator.
The estimated 20th percentile is the (9+1)(20%) = 2nd observed loss, which is: 7.
Alternately, here is what the sum of each absolute value of any overestimates multiplied by 4,
plus the absolute value of any underestimates, would have been for various estimates:
Estimate    Sum of Errors
3               815
5               807
7               799
9               801
11              808

For example, for an estimate of 11, the sum of the errors would have been:
4|2-11| + 4|7-11| + 4|10-11| + |18-11| + |23-11| + |30 - 11| + |58-11| +|72-11| + |617-11| = 808.
Comment: Multiplying the loss function by any constant does not change the estimate of next
year's losses.
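The following short Python sketch (my addition, not part of the original solutions) reproduces the estimates and error sums used in solutions 16.1 through 16.4; the smoothed-percentile indexing follows the (N+1) rule used above.

    losses = [2, 7, 10, 18, 23, 30, 58, 72, 617]
    ranked = sorted(losses)
    n = len(losses)

    def weighted_abs(est, w_under, w_over):
        # w_under applies when the estimate is an underestimate (x > est), w_over otherwise
        return sum(w_under * (x - est) if x > est else w_over * (est - x) for x in losses)

    mean = sum(losses) / n                        # 93, minimizes the sum of squared errors (16.1)
    median = ranked[(n + 1) // 2 - 1]             # 5th of 9 = 23 (16.2)
    pct80 = ranked[int(0.8 * (n + 1)) - 1]        # 8th of 9 = 72 (16.3)
    pct20 = ranked[int(0.2 * (n + 1)) - 1]        # 2nd of 9 = 7  (16.4)
    print(sum((x - mean) ** 2 for x in losses))   # 313,302, as in the table for 16.1
    print(weighted_abs(median, 1, 1))             # 740
    print(weighted_abs(pct80, 4, 1))              # 2536
    print(weighted_abs(pct20, 1, 4))              # 799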
16.5. D. The chance of the observation is the product of the densities at the observed points:
e^(−6/θ) e^(−9/θ) e^(−11/θ) / θ³ = e^(−26/θ) / θ³. Multiplying by the prior distribution of 1/θ gives the probability
weights: e^(−26/θ) / θ⁴. The posterior distribution is proportional to this and therefore is an Inverse
Gamma Distribution, with θ = 26 (the sum of the observed claims) and α = 3 (the number of
observed claims).
Using the squared-error loss function, the estimator is the mean of the posterior distribution of
hypothetical means = mean of the posterior Inverse Gamma = θ/(α − 1) = 26/2 = 13.
16.6. A. From the previous solution, the posterior distribution is an Inverse Gamma Distribution,
with θ = 26 and α = 3. Using the zero-one loss function, the estimator is the mode of the posterior
Inverse Gamma = θ/(α + 1) = 26/4 = 6.5.
16.7. B. The posterior distribution is an Inverse Gamma Distribution, with θ = 26 and α = 3.
Using the absolute error loss function, the estimator is the median of the posterior Inverse Gamma.
The distribution function is: 1 − Γ[α; θ/x] = 1 − Γ[3; 26/x]. The median is where the distribution function
is 0.5. In other words we want Γ[3; 26/x] = 0.5. We are given that Γ[3; 2.67406] = 0.5.
Thus 26/x = 2.67406, or x = 26/2.67406 = 9.72.
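A hedged verification of 16.5 through 16.7 (my sketch, assuming SciPy is available): the posterior is Inverse Gamma with α = 3 and θ = 26, and scipy.special.gammaincinv inverts the regularized incomplete gamma function used for the median.

    from scipy.special import gammaincinv

    alpha, theta = 3, 26                       # posterior Inverse Gamma parameters
    print(theta / (alpha - 1))                 # mean = 13   (squared error loss, 16.5)
    print(theta / (alpha + 1))                 # mode = 6.5  (zero-one loss, 16.6)
    # median: solve Gamma[alpha; theta/x] = 0.5 for x
    print(theta / gammaincinv(alpha, 0.5))     # about 9.72  (absolute error loss, 16.7)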


16.8. B. The partial derivative of the loss function with respect to θ̂ is 2 if x ≤ θ̂ and −3 if x ≥ θ̂.
Thus the expected value of the derivative is: Prob(x ≤ θ̂)(2) + Prob(x > θ̂)(−3) =
Prob(x ≤ θ̂)(2) − {1 − Prob(x ≤ θ̂)}(3) = 5 Prob(x ≤ θ̂) − 3. Thus the expected value of the derivative
of this loss function is zero when Prob(x ≤ θ̂) = 3/5 = 60%, in other words when θ̂ is the 60th
percentile of the distribution of x.
Comment: The given loss function is just 5 times the loss function discussed in the text of this
section, with p = 0.6. The best estimator is the pth percentile, in this case the 60th percentile.


16.9. C. The chance of the observation given q is q(1 − q)² = q − 2q² + q³. By Bayes Theorem,
the posterior probability density function is proportional to: f(q)(q − 2q² + q³). In order to compute
the posterior density we need to divide by the integral of f(q)(q − 2q² + q³):
∫ f(q)(q − 2q² + q³) dq over [0, 0.1]
= 400 ∫ {q² − 2q³ + q⁴} dq over [0, 0.05] + 40 ∫ {q − 12q² + 21q³ − 10q⁴} dq over [0.05, 0.1]
= 400 {q³/3 − q⁴/2 + q⁵/5} evaluated from q = 0 to q = 0.05
+ 40 {q²/2 − 4q³ + 21q⁴/4 − 2q⁵} evaluated from q = 0.05 to q = 0.1
= (400)(0.000041667 − 0.000003125 + 0.000000063)
+ 40{(0.005 − 0.004 + 0.000525 − 0.00002) − (0.00125 − 0.0005 + 0.000032813 − 0.000000625)}
= 0.01544 + 0.02891 = 0.04435. Thus the posterior density is:
400{q² − 2q³ + q⁴}/0.04435, for 0 < q ≤ 0.05, and 40{q − 12q² + 21q³ − 10q⁴}/0.04435, for 0.05 < q ≤ 0.1.
For the zero-one loss function, the Bayes Estimator is the mode, where the posterior density is
maximized. Plugging in the given points one gets:
q:                          0.04    0.0424    0.05    0.0576    0.06
posterior density at q:     13.3    14.9      20.3    19.6      19.1
Comment: Ignoring the factor of 40/0.04435, the derivative of the density is:
10{2q − 6q² + 4q³}, for 0 < q ≤ 0.05, and {1 − 24q + 63q² − 40q³}, for 0.05 < q ≤ 0.1.
One can check for places where the derivative is zero, and then check the value of the density at the
endpoints 0, 0.05, and 0.1, to determine that 0.05 is the mode. The density is as follows:
[Graph of the posterior density of q, 0 < q < 0.10: the density rises to a peak of about 20 at q = 0.05 and then declines.]


16.10. D. From the previous solution, the posterior density is:
400{q² − 2q³ + q⁴}/0.04435, for 0 < q ≤ 0.05, and
40{q − 12q² + 21q³ − 10q⁴}/0.04435, for 0.05 < q ≤ 0.1.
For the absolute error loss function, the Bayes Estimator is the median.
One can integrate the density from 0 to each of the given points and determine where the
distribution function is 0.5. The median is 0.0576.
Plugging in the given points one gets:
q:                                      0.04    0.0424    0.05    0.0576    0.06
posterior distribution function at q:   0.181   0.215     0.348   0.500     0.547
For example, the posterior distribution function at 0.06 is computed as follows:
∫ 400{q² − 2q³ + q⁴}/0.04435 dq over [0, 0.05] + ∫ 40{q − 12q² + 21q³ − 10q⁴}/0.04435 dq over [0.05, 0.06]
= 400{q³/3 − q⁴/2 + q⁵/5}/0.04435 evaluated from q = 0 to q = 0.05
+ 40{q²/2 − 4q³ + 21q⁴/4 − 2q⁵}/0.04435 evaluated from q = 0.05 to q = 0.06
= 0.348 + 0.199 = 0.547.


16.11. D. From a previous solution, the posterior density is:
400{q² − 2q³ + q⁴}/0.04435 for 0 < q ≤ 0.05, and
40{q − 12q² + 21q³ − 10q⁴}/0.04435 for 0.05 < q ≤ 0.1.
For the squared error loss function, the Bayes Estimator is the mean.
Integrating q times the density from q = 0 to q = 0.1, the mean is 0.0576:
∫ 400{q² − 2q³ + q⁴} q/0.04435 dq over [0, 0.05] + ∫ 40{q − 12q² + 21q³ − 10q⁴} q/0.04435 dq over [0.05, 0.1]
= 400{q⁴/4 − 2q⁵/5 + q⁶/6}/0.04435 evaluated from q = 0 to q = 0.05
+ 40{q³/3 − 3q⁴ + 21q⁵/5 − 5q⁶/3}/0.04435 evaluated from q = 0.05 to q = 0.1
= 0.01299 + 0.04461 = 0.0576.


Comment: The mean and median are slightly different if taken out to more decimal places.
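As a brute-force check of 16.9 through 16.11 (my sketch, not part of the original text), one can evaluate the posterior on a fine grid and read off its mode, median, and mean numerically:

    import numpy as np

    q = np.linspace(1e-6, 0.0999999, 200_000)
    dq = q[1] - q[0]
    prior = np.where(q <= 0.05, 400 * q, 40 - 400 * q)
    likelihood = q * (1 - q) ** 2              # one claim, then two claim-free years
    post = prior * likelihood
    post /= post.sum() * dq                    # numerical normalization

    print(q[np.argmax(post)])                  # mode   ~ 0.05   (zero-one loss, 16.9)
    cdf = np.cumsum(post) * dq
    print(q[np.searchsorted(cdf, 0.5)])        # median ~ 0.0576 (absolute error loss, 16.10)
    print((q * post).sum() * dq)               # mean   ~ 0.0576 (squared error loss, 16.11)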

16.12. For the LogNormal Distribution, f(x) = exp[−(ln(x) − μ)² / (2σ²)] / {x σ √(2π)}.

The chance of the observation is: f(1000) f(2000) f(5000), which is proportional to:
exp[−(ln(1000) − μ)²/(2σ²)] exp[−(ln(2000) − μ)²/(2σ²)] exp[−(ln(5000) − μ)²/(2σ²)] / σ³.
The prior distribution of parameters is: π(μ, σ) = 1/σ.
Thus the density of the posterior distribution of parameters is proportional to:
exp[−{(ln(1000) − μ)² + (ln(2000) − μ)² + (ln(5000) − μ)²} / (2σ²)] / σ⁴.
The mode is where this density is largest. (The proportionality constant will not affect this.)
We can maximize this density by maximizing its log:
−{(ln(1000) − μ)² + (ln(2000) − μ)² + (ln(5000) − μ)²} / (2σ²) − 4 ln[σ].
Setting the partial derivative with respect to mu equal to zero:
0 = {(ln(1000) − μ) + (ln(2000) − μ) + (ln(5000) − μ)} / σ².
μ = {ln(1000) + ln(2000) + ln(5000)} / 3 = 7.675.
Setting the partial derivative with respect to sigma equal to zero:
0 = {(ln(1000) − μ)² + (ln(2000) − μ)² + (ln(5000) − μ)²} / σ³ − 4/σ.
σ² = {(ln(1000) − μ)² + (ln(2000) − μ)² + (ln(5000) − μ)²} / 4
= {(ln(1000) − 7.675)² + (ln(2000) − 7.675)² + (ln(5000) − 7.675)²} / 4 = 0.3259.
σ = 0.571.
Comment: Similar to Exercise 15.80 in Loss Models.
The use of the posterior mode corresponds to the zero-one loss function.
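A small Python sketch of the same calculation (my addition): the posterior mode sets μ̂ equal to the average of the log losses, and σ̂² equal to the sum of squared deviations divided by n + 1 = 4, the extra 1 coming from the 1/σ prior.

    import math

    losses = [1000, 2000, 5000]
    logs = [math.log(x) for x in losses]
    mu_hat = sum(logs) / len(logs)                                       # 7.675
    sigma2_hat = sum((v - mu_hat) ** 2 for v in logs) / (len(logs) + 1)  # 0.3259
    print(mu_hat, sigma2_hat, math.sqrt(sigma2_hat))                     # 7.675, 0.3259, 0.571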


16.13. D. & 16.14. A. For the Single Parameter Pareto Distribution, f(x | α) = α 10^α / x^(α+1).
For the Gamma, π(α) = α² e^(−α/5) / {Γ[3] 5³} = α² e^(−α/5) / 250.
The posterior distribution is proportional to: f(13 | α) f(16 | α) f(21 | α) π(α)
= (α 10^α / 13^(α+1)) (α 10^α / 16^(α+1)) (α 10^α / 21^(α+1)) (α² e^(−α/5) / 250).
This is proportional to: (α 10^α / 13^α) (α 10^α / 16^α) (α 10^α / 21^α) (α² e^(−α/5))
= α⁵ 10^(3α) e^(−α/5) / 4368^α = α⁵ e^(6.9078α) e^(−0.2α) / e^(8.3821α) = α⁵ e^(−1.6743α).
Thus the posterior distribution is a Gamma Distribution with parameters α = 6 and θ = 1/1.6743.
For the squared error loss function, the Bayes estimate is the mean of the posterior distribution,
which in this case is: αθ = 6/1.6743 = 3.58.
For the zero-one loss function, the Bayes estimate is the mode of the posterior distribution.
The mode of a Gamma distribution for α > 1 is: (α − 1)θ = (6 − 1)/1.6743 = 2.99.
Comment: Similar to Example 15.17 in Loss Models.
In general, the Gamma Distribution is a Conjugate Prior to the Single Parameter Pareto likelihood
with θ fixed.
In general, if the prior Gamma has shape parameter α and scale parameter θ, then the posterior
Gamma has parameters: α′ = α + n, and 1/θ′ = 1/θ + Σ ln[xi / θ], summed over i = 1 to n,
where the θ inside the logarithm is that of the Single Parameter Pareto.
In this case: α′ = 3 + 3 = 6, and 1/θ′ = 1/5 + ln[13/10] + ln[16/10] + ln[21/10] = 1.6743.

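The conjugate update in the comment can be checked directly (my sketch, not part of the original solution):

    import math

    alpha0, theta0 = 3, 5              # prior Gamma shape and scale
    pareto_theta = 10                  # known Single Parameter Pareto theta
    x = [13, 16, 21]

    alpha1 = alpha0 + len(x)                                         # 6
    rate1 = 1 / theta0 + sum(math.log(v / pareto_theta) for v in x)  # 1.6743 = 1/(posterior scale)
    print(alpha1 / rate1)              # posterior mean ~ 3.58 (squared error loss, 16.13)
    print((alpha1 - 1) / rate1)        # posterior mode ~ 2.99 (zero-one loss, 16.14)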

16.15. B. If c < 15, then the chance of the observation is zero.
For c ≥ 15, the chance of the observation is 1/c².
Thus the density of the posterior distribution is proportional to: (1/c)(1/c²) = 1/c³, c > 15.
∫ (1/c³) dc over [15, ∞) = (1/2)(1/15²).
Thus the density of the posterior distribution is: (2)(15²)/c³ = 450/c³, c > 15.
Integrating the density, the distribution function is: 1 − (15/c)², c > 15.
Using the absolute error loss function, we want the median of the posterior distribution.
0.5 = 1 − (15/c)². c = 15√2 = 21.2.
Comment: The posterior distribution is a Single Parameter Pareto Distribution.


16.16. D. From the previous solution, the density of the posterior distribution is: 450/c³, c > 15.
The uniform has density 1/c, 0 ≤ x ≤ c.
Therefore, for x < 15, c > x and the density of the uniform at x is 1/c,
and thus the density of the predictive distribution is:
∫ (1/c)(450/c³) dc over [15, ∞) = 2/45.
However, for x ≥ 15, the density of the uniform at x is zero unless c ≥ x,
and thus the density of the predictive distribution is:
∫ (1/c)(450/c³) dc over [x, ∞) = 150/x³.
The distribution of the next loss from this same insured is the predictive distribution.
Using the absolute error loss function, we want the median of the predictive distribution.
The predictive distribution function at 15 is: (15)(2/45) = 2/3.
Thus the median is less than 15.
The median of the predictive distribution is: {(1/2)/(2/3)}(15) = (0.75)(15) = 11.25.
Comment: We wish to minimize the expected absolute error of our estimate compared to the next
observation.
The density of the predictive distribution is: f(x | observation) = 2/45 for x < 15, and 150/x³ for x ≥ 15.
[Graph of the predictive density: constant at 2/45 ≈ 0.044 for x < 15, then declining as 150/x³ for x ≥ 15.]
Its integral from 0 to infinity is: (15)(2/45) + ∫ 150/x³ dx over [15, ∞) = 2/3 + 75/15² = 1.
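The medians in 16.15 and 16.16 follow from one-line computations (my sketch, using the posterior and predictive distributions derived above):

    # 16.15: posterior for c has F(c) = 1 - (15/c)^2 for c > 15; set F(c) = 0.5.
    print(15 * 2 ** 0.5)          # 21.2

    # 16.16: predictive density is 2/45 for x < 15, so F(15) = 2/3 > 1/2,
    # and the median solves x * (2/45) = 1/2.
    print(0.5 / (2 / 45))         # 11.25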


16.17. E. The mode is that value the distribution is most likely to assume. Therefore the probability
of not matching the single selected value is smallest at the mode. The expected value of the given
loss function is k times the probability of not matching the single selected value θ₁.
Thus θ₁ = mode will produce the smallest expected value of the loss function.
Comment: The expected value of the given loss function is a measure of the error of the point
estimate. Different such measures or criteria will yield different best point estimates. For this
particular measure of error the best estimate is the mode. This loss function is proportional to the
zero-one loss function in Loss Models, and therefore produces the same best estimator, the
mode. Note also that the particular density function given for θ is not used to solve this question.
16.18. C. This absolute value loss function is minimized by the median. The median of this
distribution is the value of x such that 0.5 = F(x) = 1 − e^(−x). Thus x = ln 2 = 0.693.
16.19. E. The Loss Function corresponding to the pth percentile is proportional to:
error = (1 − p)(k − X) if X ≤ k
        p(X − k) if X ≥ k
For the 80th percentile, p = 0.8 and the loss function is proportional to:
error = 0.2(k − X) if X − k ≤ 0
        0.8(X − k) if X − k > 0
Multiplying by 5, we obtain the given loss function, with α = (5)(0.8) = 4.
Comment: One can always multiply a loss function by any positive constant, without changing the
estimator that minimizes it. In this situation, we particularly dislike underestimates; the loss function is
four times as large when X > k than when X < k. Therefore, this loss function is minimized by an
estimator that tends to aim high, such as the 80th percentile, rather than the median.


Section 17, Least Squares Credibility


Some of the mathematics behind the Buhlmann Credibility formula will be discussed in this
section.174 It will be shown that Buhlmann Credibility is the linear estimator which minimizes
the expected squared error measured with respect to either the future observation, the
hypothetical mean, or the Bayesian Estimate.
As will be shown, the expected squared error as a function of the amount of credibility assigned to
the observations is a parabola. First we'll consider a single observation of a risk process and then
extend the result to the average of several observations.
Loss Models distinguishes between the situation where there is no variation in size or exposure, the
so-called Buhlmann Model, and the situation where there is variation in size or exposure, the
so-called Buhlmann-Straub Model.175
The Buhlmann Model:176
For a given policyholder its losses in different years, Xi, are independent, identically distributed
variables.177 178
The means and variances differ across a group of policyholders in some manner.
μ(θ) = E[Xi | θ], the hypothetical mean for risk type θ.
v(θ) = Var[Xi | θ], the process variance for risk type θ.
Then as discussed previously:
μ = E[μ(θ)].
EPV = E[v(θ)].
VHM = Var[μ(θ)].
K = EPV/VHM.
For N years of past data, Z = N/(N + K).
X̄ = ΣXi / N.
Buhlmann Credibility Premium = estimate of the future = Z X̄ + (1 − Z)μ.
174 Almost all exam questions ask you to apply the formula rather than asking about the mathematics behind it.
175 Many actuaries do not make a big deal out of this distinction.
176 See Section 20.3.5 of Loss Models.
177 X could be the number of claims rather than the aggregate loss.
As we have seen, the same mathematics can also be applied to severity, where n is the number of claims.
178 We actually need only assume that for a given policyholder the means and the variances are the same and that the
distributions are independent.


The Buhlmann-Straub Model:179


For a given policyholder its pure premiums in different years, Xi, are independent.180
In year i, the policy has exposures mi, some measure of size.181
As before, μ(θ) = E[Xi | θ].
Now we assume that the variance of the pure premium is inversely proportional to size:
Var[Xi | θ] = v(θ)/mi.
Then as discussed previously:
μ = E[μ(θ)].
EPV = E[v(θ)].
VHM = Var[μ(θ)].
K = EPV/VHM.
Let m = Σmi = total exposures.
Z = m/(m + K).
The observed pure premium is:
total losses / total exposures = Σ Xi mi / Σ mi = Σ Xi mi / m.
Buhlmann-Straub Credibility Premium = estimate of the future pure premium
= Z(observed pure premium) + (1 − Z)μ.
Now we will discuss the expected squared errors of these estimators, starting with the simpler
Buhlmann Model, without size of insured being important.
Covariance Matrix:
In order to compute expected squared errors, it is useful to work with the Covariance Matrix of
different years of data.182 The Covariance Matrix has variances of individual years down the diagonal
and covariances between different years off the diagonal. In general, the expected squared errors
and thus the least squares credibility depends on the Covariance structure of the data.

179 See Section 20.3.6 of Loss Models.
180 X could be the frequency rather than the pure premium.
181 If all of the mi = 1, then the Buhlmann-Straub model reduces to the Buhlmann model.
182 Cov[X, Y] = E[XY] − E[X]E[Y]. Cov[X, X] = Var[X].


For example, let's calculate the covariance between two separate rolls in the multi-sided die
example.183 First one calculates the expected product of the die rolls, given a certain sided die has
been picked. Then one takes the expected value over the different sided dice. Then one subtracts
the overall mean squared.
If one has selected a 4-sided die, then the (conditional) expected value of the product of two rolls is:
E[X1 X2 | 4-sided] = {(1)(1) + (1)(2) + (1)(3) + (1)(4) + (2)(1) + (2)(2) + (2)(3) + (2)(4) + (3)(1) +
(3)(2) + (3)(3) + (3)(4) + (4)(1) + (4)(2) + (4)(3) + (4)(4)} / 16 = 100/16 = 6.25 = 2.5².
Given that one has picked a 4-sided die, the two rolls are independent and the expected value of
the product is just the product of the means:
E[X1 X2 | 4-sided] = E[X1 | 4-sided] E[X2 | 4-sided] = (2.5)(2.5).
Similarly, E[X1 X2 | 6-sided] = 3.5², and E[X1 X2 | 8-sided] = 4.5².
A priori there is a 60% chance of a 4-sided die, a 30% chance of a 6-sided die, and a 10% chance of an
8-sided die. Thus:
E[X1 X2] = P(4)E[X1 X2 | 4-sided] + P(6)E[X1 X2 | 6-sided] + P(8)E[X1 X2 | 8-sided]
= (60%)(2.5²) + (30%)(3.5²) + (10%)(4.5²) = 9.45.
The overall mean is: E[X] = (60%)(2.5) + (30%)(3.5) + (10%)(4.5) = 3.
Thus the covariance between different rolls is:
Cov[X1, X2] = E[X1 X2] − E[X1]E[X2] = 9.45 − (3)(3) = 0.45.
This also is the Variance of the Hypothetical Means computed earlier for this multi-sided die
example. In fact, the arithmetic was precisely the same. The covariance between different rolls is just
the VHM.
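A short Python sketch of this calculation (my addition, not part of the original text):

    dice = [(0.60, 4), (0.30, 6), (0.10, 8)]          # (a priori probability, number of sides)
    hyp_means = [(p, (n + 1) / 2) for p, n in dice]   # hypothetical means 2.5, 3.5, 4.5

    overall_mean = sum(p * m for p, m in hyp_means)   # 3.0
    e_x1_x2 = sum(p * m * m for p, m in hyp_means)    # 9.45, rolls are conditionally independent
    print(e_x1_x2 - overall_mean ** 2)                # Cov[X1, X2] = VHM = 0.45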
As shown below, in general for the Buhlmann Model, where size is not important, the covariance
structure between the years of data is:
Cov[Xi, Xj] = σ² δij + τ², where δij is 1 for i = j and 0 for i ≠ j.
σ² is the Expected Value of the Process Variance for one exposure, and
τ² is the Variance of the Hypothetical Means.
183

Assume that there are a total of 100 multi-sided dice of which 60 are 4-sided, 30 are 6-sided, and 10 are 8-sided.
The multi-sided dice with 4 sides have 1, 2, 3 and 4 on them. The multi-sided dice with the usual 6 sides have
numbers 1 through 6 on them. The multi-sided dice with 8 sides have numbers 1 through 8 on them. For a given
die each side has an equal chance of being rolled; i.e., the die is fair.
Your friend has picked at random a multi-sided die. He then rolled the die and told you the result. You are to estimate
the result when he rolls that same die again.


Then Cov[X1, X2] = τ², and Cov[X1, X1] = Var[X1] = σ² + τ².
Thus in the multi-sided die example, in which τ² = 0.45 and σ² = 2.15, the
variance-covariance matrix between five rolls of the same die is:184

        1       2       3       4       5
1     2.60    0.45    0.45    0.45    0.45
2     0.45    2.60    0.45    0.45    0.45
3     0.45    0.45    2.60    0.45    0.45
4     0.45    0.45    0.45    2.60    0.45
5     0.45    0.45    0.45    0.45    2.60

As discussed previously, for this example the covariance between different years of data is 0.45,
while the (total) variance of a single year of data is 2.60, where one roll of the die corresponds to
one year of data.
When size of risk is important, the Buhlmann-Straub covariance structure is:
Cov[Xi, Xj] = δij (σ² / mi) + τ², where mi is the exposures for year i.
As discussed previously, the EPV is (assumed to be) inversely proportional to the size of risk.
Derivation of the Buhlmann Covariance Structure:
In general, let the different types of risks be parameterized by θ.185 Let μ(θ) be the hypothetical
mean for risks of type θ, E[X | θ] = μ(θ). Note that E[μ(θ)] = μ, the overall a priori mean.
Also τ² = VHM = E[μ(θ)²] − μ².
Then E[X1 X2] = E[ E[X1 X2 | θ] ] = E[ E[X1 | θ] E[X2 | θ] ] = E[μ(θ)μ(θ)] = E[μ(θ)²]
= τ² + μ², where we have used the fact that for a given type of risk, the first observation X1 and the
second observation X2 are independent draws from the same risk process.

184 I've only shown five rows and columns, corresponding to a total of five trials or die rolls.
185 For example, in the case of the Gamma-Poisson, θ is the Poisson parameter.
In a discrete case, θ could take on the four values: Excellent, Good, Bad, or Ugly.


When i ≠ j, for example 1 and 2, we have that the expected covariance between different years of
data is:
Cov[X1, X2] = E[X1 X2] − E[X1]E[X2] = E[E[X1 X2 | θ]] − μ² =
E[ E[X1 | θ] E[X2 | θ] ] − E[μ(θ)]² = E[μ(θ)²] − E[μ(θ)]² =
second moment of the hypothetical means − square of the overall mean =
Variance of the Hypothetical Means = τ².
When i = j we have that:
Cov[Xi, Xj] = Cov[Xi, Xi] = Var[X] = Total Variance = EPV + VHM = σ² + τ².
Thus putting together the cases when i = j and i ≠ j, Cov[Xi, Xj] = σ²δij + τ².
Squared Errors, Using a Single Observation:
Assume we have a universe of risks.186 Choose one risk at random. Let X1 be an observation of a
risk process.187 We wish to predict the next outcome X2 of the risk process for the same risk.
Let the a priori overall mean be μ. It is assumed that the a priori expected value of the future
observation equals μ: E[X2] = μ, as does that of the prior observation: E[X1] = μ.
Let the new estimate be of the linear form: F = Z X1 + (1 − Z)μ.
Then the error of the estimate compared to the posterior observation is:
F − X2 = Z X1 + (1 − Z)μ − X2 = Z(X1 − X2) + (1 − Z)(μ − X2).
Thus the expected value of the squared error as a function of Z is:
V(Z) = E[{Z(X1 − X2) + (1 − Z)(μ − X2)}²]
= Z² E[(X1 − X2)²] + 2Z(1 − Z) E[(X1 − X2)(μ − X2)] + (1 − Z)² E[(μ − X2)²].
It will be useful to write out the expected value of various product terms. Let σ² be the expected
value of the process variance of a single observation and τ² be the variance of the hypothetical
means of a single observation. Recall that the total variance of a single observation is: σ² + τ².

186 An urn filled with dice, a group of drivers, etc.
187 Roll a die, observe an individual driver for a year, etc.


Then the expected value of the square of an observation from a year is the total variance plus the
square of the overall mean: E[X1²] = E[X2²] = σ² + τ² + μ².
The expected value of a product of observations from different years is the variance of the
hypothetical means plus the square of the overall mean:
E[X1 X2] = Cov[X1, X2] + E[X1]E[X2] = τ² + μ².188
Terms involving the overall mean involve neither the EPV nor the VHM:
E[X1 μ] = μE[X1] = μ².
The terms entering into the expected value of the squared error are:
E[(X1 − X2)²] = E[X1²] − 2E[X1 X2] + E[X2²] = 2(σ² + τ² + μ²) − 2(τ² + μ²) = 2σ².
E[(X1 − X2)(μ − X2)] = μE[X1] − E[X1 X2] − μE[X2] + E[X2²] =
μ² − (τ² + μ²) − μ² + (σ² + τ² + μ²) = σ².
E[(μ − X2)²] = E[μ²] − 2μE[X2] + E[X2²] = μ² − 2μ² + (σ² + τ² + μ²) = σ² + τ².
Thus the expected value of the squared error as a function of Z is:
V(Z) = Z² E[(X1 − X2)²] + 2Z(1 − Z) E[(X1 − X2)(μ − X2)] + (1 − Z)² E[(μ − X2)²] =
2σ²Z² + 2σ²Z(1 − Z) + (σ² + τ²)(1 − Z)² = (σ² + τ²)Z² − 2τ²Z + (σ² + τ²).
Expected Value of the Squared Error = (σ² + τ²)Z² − 2τ²Z + (σ² + τ²).189
Thus the expected value of the squared error as a function of the weight given to the
observation, Z, is a parabola.
In order to minimize the expected value of the squared error, one sets its derivative equal to zero.
Setting V′(Z) = 0, and solving for Z:
Z = τ² / (σ² + τ²) = VHM / Total Variance = 1 / (1 + σ²/τ²).
If we let K = σ²/τ² = EPV/VHM, where both the VHM and EPV are for a single observation of the
risk process, then for one observation: Z = 1/(1 + K).

188 For two different years of data, there is no term involving the expected value of the process variance.
189 The squared error between the estimate and the observation.


For example, in the multi-sided die example as calculated in previous sections, for a single
observation, EPV = σ² = 2.15, and VHM = τ² = 0.45. In that case, the Buhlmann Credibility
parameter, K = 2.15/0.45 = 4.778, and for a single observation the Buhlmann Credibility was
1/(1 + 4.778) = 0.173.
The expected value of the squared error as a function of Z is in this case:
V(Z) = (σ² + τ²)Z² − 2τ²Z + (σ² + τ²) = 2.6Z² − 0.9Z + 2.6.
The mean squared error as a function of the weight applied to the observation is shown below for
1 die (solid line), 3 dice (dashed line), and 10 dice (dotted line):190
[Graph of the mean squared error (MSE) versus the weight given to the observation, for N = 1, 3, and 10 die rolls. Each curve is a parabola, with its minimum moving to larger weights as N increases.]
V(Z) for an observation of one die roll is minimized for Z = 0.173, the value of the Buhlmann
Credibility.191 Similarly, the mean squared errors for observations of 3 and 10 die rolls are minimized
for Z = 3/(3 + 4.778) = 0.386 and 10/(10 + 4.778) = 0.677.

190 As discussed below, for N dice, V(Z) = (σ²/N + τ²)Z² − 2τ²Z + (σ² + τ²) = (2.15/N + 0.45)Z² − 0.9Z + 2.6.
191 Note that for values of Z somewhat different than 0.173, the expected squared error is still relatively small. For
values near optimal, small differences in the Credibility have a relatively small effect on the expected squared error.
See "An Actuarial Note on Credibility Parameters," by Howard C. Mahler, PCAS 1986.
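The parabola and its minimum can be checked numerically (my sketch, with the die-example values σ² = 2.15 and τ² = 0.45):

    import numpy as np

    epv, vhm = 2.15, 0.45
    K = epv / vhm                                     # 4.778

    def V(Z, N):                                      # expected squared error vs. the future observation
        return (epv / N + vhm) * Z ** 2 - 2 * vhm * Z + (epv + vhm)

    Z = np.linspace(0.0, 1.0, 100_001)
    for N in (1, 3, 10):
        print(N, Z[np.argmin(V(Z, N))], N / (N + K))  # the grid minimum matches N/(N + K)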


For the cases where the observations consist of either one (solid line), three (dashed line), or ten
(dotted line) die rolls, shown below is the expected squared error as a function of k, such that the
weight to the observation = N/(N + k):
[Graph of the mean squared error versus k, for N = 1, 3, and 10 die rolls; each curve attains its minimum at k = 4.778.]

In each case, the mean squared error is minimized for k = 4.778, the Buhlmann Credibility Parameter
for this multisided die example.192
A Simulation Experiment:
Simulate the multi-sided die example, as follows:
Pick a die at random, with a 60% chance of a 4-sided die, a 30% chance of a 6-sided die, and a 10%
chance of an 8-sided die.
Roll the die.
Estimate the second roll as: w(first roll) + (1 − w)(a priori mean) = w(first roll) + 3(1 − w).
Roll this same die again.
Then compute the squared error: (predicted second roll − actual second roll)².
192

For values near the optimal value of 4.778, small differences in K have a very small effect on the expected
squared error. Generally one needs only to estimate K within about a factor of 2. In this case, for values of K
between about 3 and 7, the expected squared error is close to minimal. As discussed in a separate study guide,
Empirical Bayesian Credibility methods attempt to estimate K solely from the observed data. While the random
fluctuations in the data often produce considerable uncertainty in the estimate of K, fortunately K does not need to
be estimated very precisely.


For example, if the first die is a six-sided die, and the two rolls are a 4 and a 5, then the squared
prediction error would be: {4w + 3(1 − w) − 5}². Given a simulated pair of rolls from a die, the squared
error is a function of w, the weight applied to the observation.
This situation was simulated 1000 times, and the squared errors were averaged.
Here is a graph of the Mean Squared Error as a function of w:
[Graph of the simulated mean squared error as a function of the weight w given to the observation, with a shallow minimum near w = 0.18.]

The mean squared error between prediction and observation is minimized for about
w = 0.178, close to the Buhlmann Credibility for a single die roll of 0.173.193
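A sketch of that simulation in Python (my addition; the seed and sample size are illustrative, so the minimizing weight will vary slightly from run to run):

    import numpy as np

    rng = np.random.default_rng(0)
    sides = rng.choice([4, 6, 8], size=1000, p=[0.6, 0.3, 0.1])   # pick 1000 dice
    roll1 = rng.integers(1, sides + 1)                            # first roll of each die
    roll2 = rng.integers(1, sides + 1)                            # second roll of the same die

    w = np.linspace(0.0, 1.0, 1001)
    mse = [np.mean((wi * roll1 + (1 - wi) * 3 - roll2) ** 2) for wi in w]
    print(w[int(np.argmin(mse))])                                 # roughly 0.17 to 0.18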
[Zoomed-in graph of the simulated mean squared error for weights between 0.16 and 0.20, showing a shallow minimum near w = 0.178.]

193 The difference is due to simulating only 1000 runs.


Squared Errors, Using an Average of Several Observations:
If instead of a single observation, one uses the average of several observations, the key difference
is that the EPV of the average of several observations is inversely proportional to the number of
observations. Otherwise, the situation parallels that for a single observation.
Let X1, X2, ..., XN be N observations of a risk process.194 We wish to predict the next outcome XN+1
for the same risk. Let the average of the observations be O = X̄ = (1/N)ΣXi.
Let the a priori overall mean be μ. It's assumed that the a priori expected value of the future
observation equals μ: E[XN+1] = μ, as does that of each prior observation: E[Xi] = μ.
Let the new estimate be of the linear form: F = Z O + (1 − Z)μ.
Then the error compared to the posterior observation is:
F − XN+1 = Z O + (1 − Z)μ − XN+1 = Z(O − XN+1) + (1 − Z)(μ − XN+1).
Thus the expected value of the squared error as a function of Z is:
V(Z) = E[{Z(O − XN+1) + (1 − Z)(μ − XN+1)}²] =
Z² E[(O − XN+1)²] + 2Z(1 − Z) E[(O − XN+1)(μ − XN+1)] + (1 − Z)² E[(μ − XN+1)²].
It will be useful to write out the expected value of various product terms. As before, let σ² be the
expected value of the process variance of a single observation and τ² be the variance of the
hypothetical means of a single observation. Then the expected value of the square of an
observation from a single year is the total variance plus the square of the overall mean:
E[X1²] = E[X2²] = E[XN+1²] = σ² + τ² + μ².
O is the average of N independent draws from the same risk process, therefore its mean is μ and its
(total) variance is: σ²/N + τ². Thus E[O²] = σ²/N + τ² + μ².195
The expected value of a product of observations from different years is the variance of the
hypothetical means plus the square of the overall mean: E[X1 XN+1] = τ² + μ².
Thus E[O XN+1] = (1/N) Σ E[Xi XN+1] = (1/N)(N)(τ² + μ²) = τ² + μ².
Terms involving the observation and the year to be estimated do not contain the EPV.

194 Roll N identical dice, observe an individual driver for N years, etc.
195 The process variance of an average declines as per the number of observations.


Terms involving the overall mean involve neither the EPV nor the VHM: E[X1 μ] = μE[X1] = μ².
The terms entering into the expected value of the squared error are:
E[(O − XN+1)²] = E[O²] − 2E[O XN+1] + E[XN+1²] =
(σ²/N + τ² + μ²) − 2(τ² + μ²) + (σ² + τ² + μ²) = (1 + 1/N)σ².
E[(O − XN+1)(μ − XN+1)] = μE[O] − E[O XN+1] − μE[XN+1] + E[XN+1²] =
μ² − (τ² + μ²) − μ² + (σ² + τ² + μ²) = σ².
E[(μ − XN+1)²] = E[μ²] − 2μE[XN+1] + E[XN+1²] = μ² − 2μ² + (σ² + τ² + μ²) = σ² + τ².
Thus the expected value of the squared error as a function of Z is:
V(Z) = Z² E[(O − XN+1)²] + 2Z(1 − Z) E[(O − XN+1)(μ − XN+1)] + (1 − Z)² E[(μ − XN+1)²] =
(1 + 1/N)σ²Z² + 2σ²Z(1 − Z) + (σ² + τ²)(1 − Z)² = (σ²/N + τ²)Z² − 2τ²Z + (σ² + τ²).
V(Z) = (σ²/N + τ²)Z² − 2τ²Z + (σ² + τ²).
As was the case for a single observation, the expected value of the squared error as a function of Z
is a parabola. In order to minimize the expected value of the squared error, one sets its derivative
equal to zero. Setting V′(Z) = 0, and solving for Z:
Z = τ² / (σ²/N + τ²) = VHM / Total Variance = N/(N + σ²/τ²).
If we let K = σ²/τ² = EPV / VHM, where both the VHM and EPV are for a single observation of the
risk process, then for N observations Z = N/(N + K).
Therefore Z = N / (N + K), where K = σ²/τ² = EPV / VHM,
where EPV and VHM are each for a single observation.
For example, in the multi-sided die example, for a single observation EPV = σ² = 2.15, and
VHM = τ² = 0.45. In that case, the Buhlmann Credibility parameter K = 2.15 / 0.45 = 4.778,
and for N observations the Buhlmann Credibility is Z = N/(N + 4.778).
For example, for N = 3, the Buhlmann Credibility is: Z = 3/7.778 = 0.386.
The expected value of the squared error as a function of Z is in this case:
V(Z) = (σ²/N + τ²)Z² − 2τ²Z + (σ² + τ²) = (2.15/N + 0.45)Z² − 0.9Z + 2.6.
For N = 3, V(Z) = 1.1667Z² − 0.9Z + 2.6.
V(Z) is minimized for Z = 0.386, the value of the Buhlmann Credibility for three rolls of a die.


Z = VHM / (VHM + EPV) = VHM / Total Variance, where the Variance of the Hypothetical
Means and Expected Value of the Process Variance have already been adjusted to be
that for N observations.196
Note that the Credibility is: Z = VHM / (VHM + EPV) = (1/EPV) / {(1/EPV) + (1/VHM)},
while the complement of credibility is: 1 - Z = (1/VHM) / {(1/EPV) + (1/VHM)}.
Thus the observation is given a weight inversely proportional to its variance: EPV, while the overall
a priori mean is given a weight inversely proportional to its variance: VHM.
Thus each estimator of the quantity of interest is weighted inversely proportionally to
its variance.197 Note that credibility is a measure of how reliable one estimator is relative to the
other estimators; the larger the inverse variance of an estimator is compared to that of the other
estimators, the more credibility it is given.
Summary of Behavior with Number of Observations:
As we add up N independent observations of the risk process, the process variances add.
Therefore, if σ² is the process variance of a single observation, then Nσ² is the process variance of
the sum of N identical draws from the given risk process.
In contrast, if τ² is the variance of the hypothetical means of one observation, then the variance of the
hypothetical means of the sum of N observations is N²τ². This follows from the fact that each of the
means is multiplied by N, and that multiplying by a constant multiplies the variance by the square of
that constant.
Thus as N increases, the variance of the hypothetical means of the sum goes up as N², while the
process variance of the sum goes up only as N. Since the average is just the sum times 1/N, the
variance of the hypothetical means of the average is independent of N, while the process variance
of the average goes down as 1/N. Thus since the relative weights are inversely proportional to the
variance, the weight given the observation increases relative to that given to the prior mean, as N
increases. As N increases, the credibility given to the observation increases.
The credibility to be given is: Z = VHM / Total Variance = VHM / (VHM + EPV) =
N²τ² / (N²τ² + Nσ²) = τ² / (τ² + σ²/N) = N / (N + σ²/τ²).
196 One computes K = EPV / VHM where the EPV and VHM are for one observation.
The formula Z = N / (N + K) automatically adjusts for the number of observations N.
197 This is a special case of a general statistical situation. If two unbiased estimators are independent, then the
minimum variance unbiased estimator is the weighted average of the two estimators, with each estimator weighted in
inverse proportion to its variance.


Therefore Z = N / (N + K), where K = σ²/τ² = EPV / VHM, where EPV and VHM are each
for a single observation.198
Behavior with Size of Risk:
In lines of insurance such as Group Health, Commercial Automobile, and Workers Compensation,
one deals with insureds with differing number of exposures. For example, one fleet might consist of
10 trucks, while another consists of 100 trucks. All else being equal, one assigns more credibility to
the experience of a larger insured than to that of a smaller insured. Let m be some relevant measure
of the size of an insured such as exposures, then Z gets larger as m gets larger.
Generally what we are estimating for the insured is a quantity with some measure of size of risk in the
denominator. So for example, the frequency is number of claims divided by exposures, severity is
number of dollars divided by number of claims, pure premiums are losses divided by number of
exposures, and a loss ratio is losses divided by premiums. In each case the denominator is some
measure of the size of the insured.
If one adds up identical exposures, then σ², the process variance of losses, is multiplied by m, while
in contrast τ², the variance of the hypothetical mean losses, is multiplied by m².
However, if instead we deal with pure premiums, which are losses divided by m, then each variance
is divided by m². Thus we have that the process variance is σ²/m, while the variance of the
hypothetical means is τ². As the size of risk increases, the process variance of pure
premiums decreases while the variance of the hypothetical mean pure premiums
remains the same.199 200 This corresponds to the Buhlmann-Straub covariance structure:
Cov[Xi, Xj] = τ² + δij σ²/mi.
As the size of risk increases the process variance (noise) decreases, so we assign the observation
more credibility. Specifically the credibility Z = VHM / Total Variance =
VHM / (VHM + EPV) = τ² / (τ² + σ²/m) = m / (m + σ²/τ²).

198 On the exam, the calculation of the Credibility thus involves the calculation of these two variances for a single
observation. In general situations, one has to analyze a Covariance matrix, as discussed in the next section.
See for example, Howard Mahler's "An Example of Credibility and Shifting Risk Parameters," PCAS 1990.
199 While this behavior holds on the exam, unless specifically stated otherwise as in 4, 5/01, Q.23, it does not hold in
all practical applications. This is discussed briefly in the next section.
200 The key idea is that the pure premium is an average; pure premium is the average loss per exposure.
The same mathematics would apply to other averages such as frequency or severity.


Therefore Z = m / (m + K), where K = σ²/τ² = EPV / VHM, where EPV and VHM are each
for a single unit of m; m is measured in whatever units appear in the denominator of the
quantity of interest.201 If we are interested in frequency or pure premiums then m, the measure of
size of risk, is in units of exposures. If we are interested in severity then m, the measure of size of
risk, is in units of (expected number of) claims. If we are interested in loss ratios then m, the measure
of size of risk, is in units of premiums.
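For example (my sketch, with an illustrative K = 500 in the same units as m), the credibility grows with the size of the insured:

    K = 500                               # illustrative Buhlmann Credibility parameter
    for m in (100, 500, 1000, 5000):      # exposures, claims, or premium, as appropriate
        print(m, round(m / (m + K), 3))   # 0.167, 0.5, 0.667, 0.909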
Errors Compared to the Hypothetical Mean:
Throughout this section the expected squared error was measured by comparing the estimate to the
future observation. When dealing with models, one could instead measure the error of the
estimate with respect to the hypothetical mean for the chosen risk.202 However, these two
varieties of errors are closely related to each other.
It turns out that the expected squared error measured with respect to the future observation just
contains an extra EPV compared to the squared error with respect to the hypothetical mean.203
This can be shown as follows.
As previously, let σ² be the expected value of the process variance of a single observation and let
τ² be the variance of the hypothetical means of a single observation.
Let O be the past observation, an average of N data points.
Let F be the Credibility estimate, F = ZO + (1 − Z)μ.
Let XN+1 be the future observation.
VX(Z) = E[(XN+1 − F)²].
VHM(Z) = E[(μ(θ) − F)²], where μ(θ) is the hypothetical mean for a given risk type θ.
Then VHM(Z) = E[(μ(θ) − F)²] = E[(μ(θ) − XN+1 + XN+1 − F)²] =
E[(μ(θ) − XN+1)²] + E[(XN+1 − F)²] + 2E[(μ(θ) − XN+1)(XN+1 − F)].
Now we have E[(μ(θ) − XN+1)²] = σ², since this is just the EPV, the variance of the (single) future
observation around its hypothetical mean.
201

On the exam the calculation of the credibility thus involves the calculation of these two variances for a single
exposure. In general situations, as discussed in the next section, one has to analyze a Variance-Covariance matrix.
202
Note that in insurance applications the hypothetical mean for a given insured can not be observed.
203
This extra EPV comes from the extra random fluctuation of the future observation around the hypothetical mean
for that risk.


E[(XN+1 − μ(θ))(XN+1 − O)] = E[XN+1²] + E[μ(θ) O] − E[XN+1 O] − E[μ(θ) XN+1] =
(σ² + τ² + μ²) + (τ² + μ²) − (τ² + μ²) − (τ² + μ²) = σ².
E[(XN+1 − μ(θ))(XN+1 − μ)] = E[XN+1²] + E[μ(θ) μ] − E[XN+1 μ] − E[μ(θ) XN+1] =
(σ² + τ² + μ²) + μ² − μ² − (τ² + μ²) = σ².
Therefore, E[(μ(θ) − XN+1)(XN+1 − F)] = −E[(XN+1 − μ(θ))(XN+1 − F)] =
−E[(XN+1 − μ(θ))(XN+1 − {ZO + (1 − Z)μ})] =
−Z E[(XN+1 − μ(θ))(XN+1 − O)] − (1 − Z) E[(XN+1 − μ(θ))(XN+1 − μ)] = −Zσ² − (1 − Z)σ² = −σ².
Therefore, VHM(Z) = E[(μ(θ) − XN+1)²] + E[(XN+1 − F)²] + 2E[(μ(θ) − XN+1)(XN+1 − F)]
= σ² + VX(Z) − 2σ² = VX(Z) − σ².
Thus as stated above, the expected squared error measured with respect to the future
observation contains just an extra EPV compared to the squared error with respect to
the hypothetical mean: VX(Z) = VHM(Z) + σ².
Using the previous formula for the expected squared error with respect to the future observation,
when one uses N observations, we have:
VX(Z) = (σ²/N + τ²)Z² − 2τ²Z + (σ² + τ²).
VHM(Z) = (σ²/N + τ²)Z² − 2τ²Z + τ².
Since these two expected squared errors differ by only σ², independent of Z, the value of Z that
minimizes one also minimizes the other. Thus the Buhlmann Credibility minimizes the expected
squared error measured with respect to either the future observation or the hypothetical means.
Exercise: For the multi-sided die example, where one observes one roll of a die, what is VX(Z),
the expected squared error measured with respect to the future observation?
[Solution: VX(Z) = (σ²/N + τ²)Z² − 2τ²Z + (σ² + τ²). For this example, N = 1, τ² = 0.45, and
σ² = 2.15. Thus VX(Z) = 2.6Z² − 0.9Z + 2.6.]


Exercise: For the multi-sided die example, where one observes one roll of a die, what is VHM(Z),
the expected squared error measured with respect to the hypothetical means?
[Solution: VHM(Z) = (σ²/N + τ²)Z² − 2τ²Z + τ². For this example, N = 1, τ² = 0.45, and
σ² = 2.15. Thus VHM(Z) = 2.6Z² − 0.9Z + 0.45.]
Exercise: For the multi-sided die example, what is the Buhlmann Credibility assigned to an
observation of one roll of a die?
[Solution: Minimize either VX(Z) or VHM(Z) and obtain Z = 0.45 / 2.6 = 17.3%.]
Errors Compared to the Bayesian Estimates:
One can also consider the expected squared error with respect to the results of Bayes Analysis.
As shown in the section on Linear Regression, Buhlmann Credibility also minimizes these squared
errors.
Let E[μ(θ) | O] be the Bayesian Estimate, given the past observation O.
Let VB(Z) = E[(E[μ(θ) | O] − F)²].
Then VHM(Z) = E[(μ(θ) − F)²] = E[(μ(θ) − E[μ(θ) | O] + E[μ(θ) | O] − F)²] =
E[(E[μ(θ) | O] − μ(θ))²] + E[(E[μ(θ) | O] − F)²] − 2E[(E[μ(θ) | O] − μ(θ))(E[μ(θ) | O] − F)].
Now we have that E[(E[μ(θ) | O] − μ(θ))(E[μ(θ) | O] − F)] =
EO[ E[(E[μ(θ) | O] − μ(θ))(E[μ(θ) | O] − F) | O] ] =
EO[ (E[μ(θ) | O] − E[μ(θ) | O])(E[μ(θ) | O] − {ZO + (1 − Z)μ}) ] =
EO[ 0 (E[μ(θ) | O] − {ZO + (1 − Z)μ}) ] = EO[0] = 0.
Thus, VHM(Z) = E[(E[μ(θ) | O] − μ(θ))²] + VB(Z).
Note that the first term on the right-hand side of the equation, E[(E[μ(θ) | O] − μ(θ))²], is independent of
Z. Therefore the value of Z that minimizes the expected squared error with respect to the
hypothetical means, VHM(Z), will also minimize the expected squared error with respect to the
Bayesian Estimates, VB(Z).


Thus the Buhlmann Credibility is the linear estimator which minimizes the expected
squared error measured with respect to either the future observation, the hypothetical
mean, or the Bayesian Estimate.
Exercise: For the multi-sided die example, where one observes one roll of a die, what is
E[(E[μ(θ) | O] − μ(θ))²], the expected squared difference between the hypothetical means and the
Bayesian Estimates?
[Solution: As calculated in a previous section, the Bayesian Estimates corresponding to each of the
possible observations are:

Observation:        1       2       3       4       5      6      7      8
Bayesian Estimate:  2.853   2.853   2.853   2.853   3.7    3.7    4.5    4.5

For a given observation, we can compute the posterior chance that we have a 4, 6, or 8 sided die,
and thus μ(θ) equal to 2.5, 3.5, or 4.5. For example, as shown in a prior section, if a 3 is observed,
those posterior chances are: 70.6%, 23.5% and 5.9%.
Thus if one observes a roll of a 3, E[(E[μ(θ) | O] − μ(θ))² | 3] =
(70.6%)(2.853 − 2.5)² + (23.5%)(2.853 − 3.5)² + (5.9%)(2.853 − 4.5)² = 0.346.
For other possible observations one can do the similar computation and then weight together the
results using the a priori chances of each observation:204

              A Priori                   Posterior     Posterior     Posterior     Expected
              Chance of     Bayesian     Chance of     Chance of     Chance of     Squared
Observation   Observation   Estimate     4-sided die   6-sided die   8-sided die   Difference
1             0.212         2.853        70.6%         23.5%         5.9%          0.346
2             0.212         2.853        70.6%         23.5%         5.9%          0.346
3             0.212         2.853        70.6%         23.5%         5.9%          0.346
4             0.212         2.853        70.6%         23.5%         5.9%          0.346
5             0.062         3.700        0.0%          80.0%         20.0%         0.160
6             0.062         3.700        0.0%          80.0%         20.0%         0.160
7             0.013         4.500        0.0%          0.0%          100.0%        0.000
8             0.013         4.500        0.0%          0.0%          100.0%        0.000

Average                     3.000                                                  0.314

Thus for this example, E[(E[μ(θ) | O] − μ(θ))²] = 0.314.]
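The entries of this table can be reproduced with a few lines of Python (my sketch, not part of the original exercise):

    dice = [(0.60, 4, 2.5), (0.30, 6, 3.5), (0.10, 8, 4.5)]   # (a priori prob, sides, hypothetical mean)

    total = 0.0
    for obs in range(1, 9):
        joint = [(p / n, m) for p, n, m in dice if obs <= n]   # P(die and this observation)
        p_obs = sum(j for j, _ in joint)                       # a priori chance of the observation
        posterior = [(j / p_obs, m) for j, m in joint]
        bayes = sum(w * m for w, m in posterior)               # Bayesian estimate given the observation
        sq_diff = sum(w * (bayes - m) ** 2 for w, m in posterior)
        total += p_obs * sq_diff
    print(total)                                               # about 0.314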


Exercise: For the multi-sided die example, where one observes one roll of a die, what is
VB(Z), the expected squared error measured with respect to the Bayesian Estimates?
[Solution: VHM(Z) = 2.6Z² − 0.9Z + 0.45, and E[(E[μ(θ) | O] − μ(θ))²] = 0.314.
Thus VB(Z) = VHM(Z) − E[(E[μ(θ) | O] − μ(θ))²] = 2.6Z² − 0.9Z + 0.45 − 0.314
= 2.6Z² − 0.9Z + 0.136.]
204

These a priori chances were computed in a previous section.


Exercise: For the multi-sided die example, what is the Buhlmann Credibility assigned to an
observation of one roll of a die?
[Solution: Minimize either VX(Z), VHM(Z), or VB(Z) and obtain Z = 0.45 / 2.6 = 17.3%.]
Note that the expected squared error of the Credibility Estimates with respect to the Bayesian
Estimate is smaller than that with respect to the Hypothetical Means. In the example,
2.6Z² − 0.9Z + 0.136 < 2.6Z² − 0.9Z + 0.45. Buhlmann Credibility is a closer approximation to the
Bayesian Estimates than to the Hypothetical Mean. This makes sense, since the Bayesian
Estimates are themselves an approximation to the Hypothetical Means; the Bayesian Estimates are
closer to the Hypothetical Means than are the Credibility Estimates, except when the two estimates
are equal, since the Bayesian Estimates are not restricted to being a linear function of the
observations.


Problems:
Use the following information for the next 5 questions:
You observe the following experience for six insureds during the years 1996 to 1998 combined:
Insured             Premium in 1996-98      Losses        Loss Ratio
                    (prior to mod)          in 1996-98    in 1996-98
Acme Anvils               100                    65          65.0%
Buzz Beer                  60                    35          58.3%
Cash & Carey              130                    80          61.5%
Drum & Drummer            120                   110          91.7%
Everything Elvis          190                    90          47.4%
Frank & Stein              80                    70          87.5%
Overall                   680                   450          66.2%

Experience modifications will be calculated for each of these insureds using their experience from
1996 to 1998 together with the formulas:
Z = P / (P + K), and M = {(L/P) Z + (1 − Z)(0.662)} / 0.662,
where Z = credibility, P = premium, L = losses, M = experience modification, and 0.662 is the
observed overall loss ratio.
You subsequently observe the following experience for these same six insureds during the year
2000.

Insured             Premium in 2000      Losses     Loss Ratio
                    (prior to mod)       in 2000    in 2000
Acme Anvils                30                20        66.7%
Buzz Beer                  20                15        75.0%
Cash & Carey               50                35        70.0%
Drum & Drummer             40                30        75.0%
Everything Elvis           60                25        41.7%
Frank & Stein              35                30        85.7%
Overall                   235               155        66.0%

17.1 (1 point) If K = 70, what is the experience modification for Everything Elvis?
A. Less than 0.75
B. At least 0.75 but less than 0.78
C. At least 0.78 but less than 0.81
D. At least 0.81 but less than 0.84
E. At least 0.84

17.2 (2 points) The modified loss ratio in the year 2000 is:
losses / {(premiums) (experience modification)}.
If K = 70, what is the modified loss ratio for Drum & Drummer?
A. Less than 0.55
B. At least 0.55 but less than 0.57
C. At least 0.57 but less than 0.59
D. At least 0.59 but less than 0.61
E. At least 0.61
17.3 (2 points) If K = 70, what is the squared difference in the year 2000, between the overall loss
ratio and the modified loss ratio for Frank & Stein?
A. Less than 0.004
B. At least 0.004 but less than 0.005
C. At least 0.005 but less than 0.006
D. At least 0.006 but less than 0.007
E. At least 0.007
17.4 (3 points) If K = 70, what is the sum of the squared differences in the year 2000, between the
overall loss ratio and the modified loss ratios for these six insureds?
A. Less than 0.049
B. At least 0.049 but less than 0.050
C. At least 0.050 but less than 0.051
D. At least 0.051 but less than 0.052
E. At least 0.052
17.5 (6 points) Which of the following values of K results in the smallest sum of the squared
differences in the year 2000 between the overall loss ratio and the modified loss ratios for these six
insureds?
A. 40
B. 55
C. 70
D. 85
E. 100
17.6 (3 points) Let X1, X2, ..., XN be independent random variables with common mean μ.
Var[Xi] = σi². Let Y = Σ wiXi.
Determine the wi, such that Y is an unbiased estimator of μ, with the smallest variance.
17.7 (1 point) Let X1, X2, ..., XN be independent random variables with common mean μ.
Var[Xi] = σ²/mi. Let Y = Σ wiXi.
Determine the wi, such that Y is an unbiased estimator of μ, with the smallest variance.
Hint: Use the answer to the previous question.


17.8 (1 point) Let X1, X2, ..., XN be independent random variables with common mean μ.
Var[Xi] = b + c/mi. Let Y = Σ wiXi.
Determine the wi, such that Y is an unbiased estimator of μ, with the smallest variance.

Use the following information for the next 10 questions:


There are two urns each containing a large number of balls.
Each ball has a number written on it.
Number on Ball:      0      1      4
Urn A               50%    30%    20%
Urn B               20%    50%    30%
An urn is picked at random, and balls are drawn from that urn with replacement.
17.9 (1 point) Let X be the result of drawing a single ball.
Determine E[X | A], E[X | B], and VarU[E[X | U]].
17.10 (1 point) Let Y be the sum of drawing 3 balls from a single urn.
Determine E[Y | A], E[Y | B], and VarU[E[Y | U]].
17.11 (1 point) Let X̄ be the average of drawing 3 balls from a single urn.
Determine E[X̄ | A], E[X̄ | B], and VarU[E[X̄ | U]].
17.12 (1 point) Let X be the result of drawing a single ball.
Determine Var[X | A], Var[X | B], and EU[Var[X | U]].
17.13 (1 point) Let Y be the sum of drawing 3 balls from a single urn.
Determine Var[Y | A], Var[Y | B], and EU[Var[Y | U]].
17.14 (1 point) Let X̄ be the average of drawing 3 balls from a single urn.
Determine Var[X̄ | A], Var[X̄ | B], and EU[Var[X̄ | U]].
17.15 (1 point) Let X be the result of drawing a single ball. Determine Var[X].
17.16 (1 point) Let Y be the sum of drawing 3 balls from a single urn. Determine Var[Y].
17.17 (1 point) Let X̄ be the average of drawing 3 balls from a single urn. Determine Var[X̄].


17.18 (1 point) An urn is selected at random. Harry draws five balls, which total 5.
Then Sally draws six more balls from the same urn, which total 7. Using Bühlmann-Straub
Credibility, estimate the sum of the next 100 balls drawn from the same urn.
A. Less than 120
B. At least 120 but less than 125
C. At least 125 but less than 130
D. At least 130 but less than 135
E. At least 135

17.19 (2 points) For each policyholder, losses X1, …, Xn, conditional on Θ, are independently and identically distributed with mean,
μ(θ) = E(Xj | Θ = θ), j = 1, 2, …, n,
and variance,
v(θ) = Var(Xj | Θ = θ), j = 1, 2, …, n.
You are given:
(i) Cov(Xi, Xj) = 40, for i ≠ j.
(ii) Var(Xi) = 130.
Determine the Bühlmann credibility assigned for estimating X4 based on X1, X2, X3.
(A) Less than 60%
(B) At least 60%, but less than 65%
(C) At least 65%, but less than 70%
(D) At least 70%, but less than 75%
(E) At least 75%
17.20 (2 points) Cov[Xi, Xj] = 2 + 18δij, where δij = 0 if i ≠ j and 1 if i = j.
If X1 = 27 and X2 = 19, then using Buhlmann Credibility, the estimate of X3 is 32.
If instead X1 = 40, X2 = 61, X3 = 45, and X4 = 29, then using Buhlmann Credibility what is the estimate of X5?
A. 36    B. 37    C. 38    D. 39    E. 40


17.21 (2 points) Assume the Bühlmann-Straub covariance structure.
Xi is the aggregate losses for year i.
Ei is the exposures for year i.
The expected value of the process variance of aggregate losses for year i is: 20,000 / Ei.
Year     Exposures
1          2000
2          3000
3          3000
The credibility assigned for estimating X4 based on X1, X2, and X3 is 2/3.
Calculate Cov(X1, X1).
(A) 5    (B) 10    (C) 15    (D) 16.67    (E) None of A, B, C, or D

17.22 (3 points) Using least-squares regression and the following information, estimate the credibility of one year of claims experience.

                              Second Period Claim Count
First Period Claim Count        0       1      2     Total
            0                 8300     750     50     9100
            1                  740     100     10      850
            2                   40       8      2       50
          Total               9080     858     62    10,000

A. 1%    B. 3%    C. 5%    D. 7%    E. 9%

17.23 (2 points) A model for the claim frequency from an insurance policy is parameterized by θ.
You are given n years of claim frequencies from this policy, X1, X2, ..., Xn.
The policy has mi exposures in year i.
You are asked to use the Bühlmann-Straub credibility model to estimate the expected claim frequency in year n + 1 for this policy.
Which of conditions (A), (B), or (C) are required by the model?
(A) The Xi are independent, conditional on Θ.
(B) The Xi have a common mean.
(C) Var[Xi | Θ = θ] = v(θ)/mi.
(D) Each of (A), (B), and (C) is required.
(E) None of (A), (B), or (C) is required.


17.24 (4, 5/00, Q.18) (2.5 points) You are given two independent estimators of an unknown quantity μ:
(i) Estimator A: E(A) = 1000 and σ(A) = 400.
(ii) Estimator B: E(B) = 1200 and σ(B) = 200.
Estimator C is a weighted average of the two estimators A and B, such that:
C = w A + (1 - w) B.
Determine the value of w that minimizes σ(C).
(A) 0    (B) 1/5    (C) 1/4    (D) 1/3    (E) 1/2

17.25 (4, 11/05, Q.26 & 2009 Sample Q.236) (2.9 points) For each policyholder, losses X1, …, Xn, conditional on Θ, are independently and identically distributed with mean,
μ(θ) = E(Xj | Θ = θ), j = 1, 2, …, n,
and variance,
v(θ) = Var(Xj | Θ = θ), j = 1, 2, …, n.
You are given:
(i) The Bühlmann credibility assigned for estimating X5 based on X1, …, X4 is Z = 0.4.
(ii) The expected value of the process variance is known to be 8.
Calculate Cov(Xi, Xj), i ≠ j.
(A) Less than -0.5
(B) At least -0.5, but less than 0.5
(C) At least 0.5, but less than 1.5
(D) At least 1.5, but less than 2.5
(E) At least 2.5
17.26 (4, 5/07, Q.32) (2.5 points)
You are given n years of claim data originating from a large number of policies.
You are asked to use the Bühlmann-Straub credibility model to estimate the expected number of
claims in year n + 1.
Which of conditions (A), (B), or (C) are required by the model?
(A) All policies must have an equal number of exposure units.
(B) Each policy must have a Poisson claim distribution.
(C) There must be at least 1082 exposure units.
(D) Each of (A), (B), and (C) is required.
(E) None of (A), (B), or (C) is required.


Solutions to Problems:
17.1. C. Z = 190/(190 + 70) = .731.
M = {(.731)(.474) + (1 - .731)(.662)}/.662 = .525/.662 = 0.792.
17.2. D. Z = 120/(120 + 70) = .632. M = {(.632)(.917) + (1 - .632)(.662)}/.662 = 1.243.
For the year 2000: (L/P)/M = .750/1.243 = 0.603.
17.3. C. Z = 80/(80 + 70) = .533. M = {(.533)(.875) + (1 - .533)(.662)}/.662 = 1.172.
For the year 2000: (L/P)/M = .857/1.172 = .732. The overall loss ratio in the year 2000 is .660.
The squared difference is: (.732 - .660)² = 0.00518.
17.4. B. The sum of the squared differences is 0.04977.

                    1996-98 Premium    Loss Ratio   Experience    Loss Ratio   Modified Loss    Squared
Insured             (prior to mod)     1996-98      Mod, K = 70   2000         Ratio 2000       Difference
Acme Anvils               100            65.0%         0.990        66.7%          67.4%         0.00020
Buzz Beer                  60            58.3%         0.945        75.0%          79.3%         0.01791
Cash & Carey              130            61.5%         0.954        70.0%          73.3%         0.00545
Drum & Drummer            120            91.7%         1.243        75.0%          60.3%         0.00317
Everything Elvis          190            47.4%         0.792        41.7%          52.6%         0.01787
Frank & Stein              80            87.5%         1.172        85.7%          73.1%         0.00517
Overall                   680            66.2%                      66.0%                        0.04977


17.5. B. The smallest sum of squared differences occurs when K = 55.
(The 1996-98 premiums and loss ratios are as shown in the previous solution; only the mods, the modified year 2000 loss ratios, and the squared differences change with K.)

K = 40:
Insured             Mod     Loss Ratio 2000   Modified Loss Ratio 2000   Squared Difference
Acme Anvils        0.987         66.7%                 67.5%                  0.00025
Buzz Beer          0.929         75.0%                 80.7%                  0.02186
Cash & Carey       0.946         70.0%                 74.0%                  0.00641
Drum & Drummer     1.289         75.0%                 58.2%                  0.00603
Everything Elvis   0.765         41.7%                 54.5%                  0.01324
Frank & Stein      1.215         85.7%                 70.6%                  0.00212
Sum                                                                           0.04990

K = 55:
Insured             Mod     Loss Ratio 2000   Modified Loss Ratio 2000   Squared Difference
Acme Anvils        0.989         66.7%                 67.4%                  0.00022
Buzz Beer          0.938         75.0%                 79.9%                  0.01956
Cash & Carey       0.951         70.0%                 73.6%                  0.00588
Drum & Drummer     1.264         75.0%                 59.3%                  0.00439
Everything Elvis   0.780         41.7%                 53.4%                  0.01565
Frank & Stein      1.191         85.7%                 72.0%                  0.00362
Sum                                                                           0.04932

K = 85:
Insured             Mod     Loss Ratio 2000   Modified Loss Ratio 2000   Squared Difference
Acme Anvils        0.990         66.7%                 67.3%                  0.00018
Buzz Beer          0.951         75.0%                 78.9%                  0.01667
Cash & Carey       0.958         70.0%                 73.1%                  0.00510
Drum & Drummer     1.225         75.0%                 61.2%                  0.00226
Everything Elvis   0.804         41.7%                 51.8%                  0.01991
Frank & Stein      1.156         85.7%                 74.1%                  0.00668
Sum                                                                           0.05080

K = 100:
Insured             Mod     Loss Ratio 2000   Modified Loss Ratio 2000   Squared Difference
Acme Anvils        0.991         66.7%                 67.3%                  0.00017
Buzz Beer          0.956         75.0%                 78.5%                  0.01570
Cash & Carey       0.960         70.0%                 72.9%                  0.00480
Drum & Drummer     1.210         75.0%                 62.0%                  0.00158
Everything Elvis   0.814         41.7%                 51.2%                  0.02178
Frank & Stein      1.143         85.7%                 75.0%                  0.00813
Sum                                                                           0.05217

Comment: One desires that the loss ratios after the application of experience rating be similar for the different insureds. One way to quantify that goal is to compute, after the fact, the squared differences between the overall loss ratio and the modified loss ratios. Then the value of K that produced the smallest squared error would have worked best if it had been used. Thus here a Buhlmann Credibility Parameter of (about) 55 would have worked well in the past. An actual test would rely on much more data as well as being somewhat more complicated. See for example, "Parametrizing the Workers Compensation Experience Rating Plan," by William R. Gillam, PCAS 1992, and the Discussion by Howard Mahler in PCAS 1993.
Here is a graph of the sum of the squared errors as a function of K:
[Graph: the sum of the squared differences (vertical axis, roughly 0.0500 to 0.0520) as a function of K (horizontal axis, from 40 to 100).]
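For those who want to check these computations, here is a short Python sketch (mine, not part of the original solution) that reproduces the sums of squared differences in problems 17.4 and 17.5, using only the data given in the problem:

premium_9698 = [100, 60, 130, 120, 190, 80]
losses_9698 = [65, 35, 80, 110, 90, 70]
premium_2000 = [30, 20, 50, 40, 60, 35]
losses_2000 = [20, 15, 35, 30, 25, 30]

overall_9698 = sum(losses_9698) / sum(premium_9698)    # 0.662
overall_2000 = sum(losses_2000) / sum(premium_2000)    # 0.660

def sum_sq_diff(K):
    # For each insured: credibility, experience mod, modified 2000 loss ratio,
    # then the squared difference from the overall 2000 loss ratio.
    total = 0.0
    for P, L, p, l in zip(premium_9698, losses_9698, premium_2000, losses_2000):
        Z = P / (P + K)
        M = (Z * (L / P) + (1 - Z) * overall_9698) / overall_9698
        total += ((l / p) / M - overall_2000) ** 2
    return total

for K in [40, 55, 70, 85, 100]:
    print(K, round(sum_sq_diff(K), 5))    # the smallest sum occurs at K = 55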


17.6. Y unbiased. ⇒ μ = E[Y] = Σ wi E[Xi] = μ Σ wi. ⇒ Σ wi = 1.
Var[Y] = Σ wi² Var[Xi] = Σ wi² σi².
Use Lagrange Multipliers to minimize Var[Y], subject to the constraint: Σ wi - 1 = 0.
Set equal to zero the partial derivatives with respect to wj of: Σ wi² σi² + λ(Σ wi - 1).
0 = 2 wj σj² + λ, j = 1, ..., N. ⇒ wj = -λ/(2σj²).
In other words, each variable gets weight inversely proportional to its variance.
Σ wi = 1. ⇒ wj = (1/σj²) / Σ(1/σi²).
Comment: For the case of just two variables X1 and X2,
w1 = (1/σ1²)/(1/σ1² + 1/σ2²) = σ2²/(σ1² + σ2²), and w2 = (1/σ2²)/(1/σ1² + 1/σ2²) = σ1²/(σ1² + σ2²).
In other words, each variable gets weight inversely proportional to its variance.
Similar to Exercise 1 in "Topics in Credibility" by Dean.
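As a small numerical illustration of this result (my own, with made-up variances), the inverse-variance weights can be compared to equal weights:

sigma2 = [4.0, 1.0, 9.0]                       # assumed variances of X1, X2, X3
inv = [1 / s for s in sigma2]
w = [x / sum(inv) for x in inv]                # wj = (1/sigma_j^2) / sum(1/sigma_i^2)

def variance(weights):
    # Variance of the weighted average of independent Xi.
    return sum(wi * wi * si for wi, si in zip(weights, sigma2))

print(w, variance(w))                          # about [0.184, 0.735, 0.082] and 0.735
print(variance([1/3, 1/3, 1/3]))               # equal weights give about 1.556, which is larger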
17.7. wj = (1/σj²) / Σ(1/σi²) = (mj/σ²) / Σ(mi/σ²) = mj/m, where m = Σ mi.
Comment: If mi is the exposure associated with each Xi, and each Xi is a frequency, pure premium, loss ratio, etc., then the Buhlmann-Straub Covariance Structure assumes
Var[Xi] = σ²/mi, where σ² would be the process variance for one exposure.
Then the smallest variance of an unbiased estimator of the common mean is the exposure weighted average of the Xi. If Xi were pure premium, then Σ (mi/m)Xi = (Σ miXi)/m =
(total losses)/(total exposures). Similar to Exercise 10 in "Topics in Credibility" by Dean.
17.8. Using a previous solution, wj = (1/σj²) / Σ(1/σi²) = {mj/(b mj + c)} / Σ{mi/(b mi + c)}.
Comment: See Examples 16.7 and 16.29 in Loss Models. If mi is the exposure associated with each Xi, and each Xi is a frequency, pure premium, loss ratio, etc., then the Buhlmann-Straub Covariance Structure assumes b = 0. If b = 0, this reduces to the previous question.
17.9. E[X | A] = (50%)(0) + (30%)(1) + (20%)(4) = 1.1.
E[X | B] = (20%)(0) + (50%)(1) + (30%)(4) = 1.7.
E[X] = (1.1 + 1.7)/2 = 1.4. VarU[E[X | U]] = {(1.1 - 1.4)² + (1.7 - 1.4)²}/2 = 0.09.


17.10. The expected value of the sum of 3 balls is 3 times the expected value of a single ball.
E[Y | A] = 3E[X | A] = (3)(1.1) = 3.3. E[Y | B] = 3E[X | B] = (3)(1.7) = 5.1.
E[Y] = 4.2. VarU[E[Y | U]] = {(3.3 - 4.2)² + (5.1 - 4.2)²}/2 = 0.81.
Comment: The variance of the hypothetical means of the sum is 3² = 9 times the VHM for a single ball.
17.11. The expected value of the average of 3 balls is the expected value of a single ball.
E[X̄ | A] = E[X | A] = 1.1. E[X̄ | B] = E[X | B] = 1.7.
E[X̄] = (1.1 + 1.7)/2 = 1.4. VarU[E[X̄ | U]] = {(1.1 - 1.4)² + (1.7 - 1.4)²}/2 = 0.09.
Comment: The variance of the hypothetical means of the average is the VHM for a single ball.
17.12. E[X² | A] = (50%)(0²) + (30%)(1²) + (20%)(4²) = 3.5. Var[X | A] = 3.5 - 1.1² = 2.29.
E[X² | B] = (20%)(0²) + (50%)(1²) + (30%)(4²) = 5.3. Var[X | B] = 5.3 - 1.7² = 2.41.
EU[Var[X | U]] = (2.29 + 2.41)/2 = 2.35.
17.13. The variance of the sum of 3 balls is 3 times the variance of a single ball.
Var[Y | A] = 3 Var[X | A] = (3)(2.29) = 6.87.
Var[Y | B] = 3 Var[X | B] = (3)(2.41) = 7.23.
EU[Var[Y | U]] = (6.87 + 7.23)/2 = 7.05.
Comment: The expected value of the process variance of the sum is 3 times the EPV for a single ball.
17.14. The variance of the average of 3 balls is 1/3 the variance of a single ball.
Var[X̄ | A] = Var[X | A]/3 = 2.29/3 = 0.7633.
Var[X̄ | B] = Var[X | B]/3 = 2.41/3 = 0.8033.
EU[Var[X̄ | U]] = (0.7633 + 0.8033)/2 = 0.7833.
Comment: The expected value of the process variance of the average is 1/3 the EPV for a single ball.
17.15. For an urn picked at random, Prob[0] = 35%, Prob[1] = 40%, and Prob[4] = 25%.
E[X] = (35%)(0) + (40%)(1) + (25%)(4) = 1.4.
E[X²] = (35%)(0²) + (40%)(1²) + (25%)(4²) = 4.4. Var[X] = 4.4 - 1.4² = 2.44.
Alternately, Var[X] = VarU[E[X | U]] + EU[Var[X | U]] = 0.09 + 2.35 = 2.44.
Comment: Similar to Exercise 2 in "Topics in Credibility" by Dean.


17.16. Var[Y] = VarU[E[Y | U]] + EU[Var[Y | U]] = 0.81 + 7.05 = 7.86.
Alternately, list all of the possible outcomes.

Outcome                 Y      Y²    Prob. Given A   Prob. Given B   Probability
All 0                   0       0       0.1250          0.0080         0.0665
All 1                   3       9       0.0270          0.1250         0.0760
All 4                  12     144       0.0080          0.0270         0.0175
two @ 0 and one @ 1     1       1       0.2250          0.0600         0.1425
two @ 0 and one @ 4     4      16       0.1500          0.0360         0.0930
two @ 1 and one @ 0     2       4       0.1350          0.1500         0.1425
two @ 1 and one @ 4     6      36       0.0540          0.2250         0.1395
two @ 4 and one @ 0     8      64       0.0600          0.0540         0.0570
two @ 4 and one @ 1     9      81       0.0360          0.1350         0.0855
0, 1, 4                 5      25       0.1800          0.1800         0.1800
                       4.2    25.5      1.0000          1.0000         1.0000

Var[Y] = E[Y²] - E[Y]² = 25.5 - 4.2² = 7.86.
Comment: Var[Y] ≠ 3 Var[X] = (3)(2.44) = 7.32.
For the sum of N balls from a single urn, the variance is: 0.09N² + 2.35N, where 0.09 is the VHM for a single ball and 2.35 is the EPV for a single ball.
17.17. Var[X̄] = VarU[E[X̄ | U]] + EU[Var[X̄ | U]] = 0.09 + 0.7833 = 0.8733.
Alternately, list all of the possible outcomes.

Outcome                  X̄        X̄²     Prob. Given A   Prob. Given B   Probability
All 0                 0.0000    0.0000      0.1250          0.0080         0.0665
All 1                 1.0000    1.0000      0.0270          0.1250         0.0760
All 4                 4.0000   16.0000      0.0080          0.0270         0.0175
two @ 0 and one @ 1   0.3333    0.1111      0.2250          0.0600         0.1425
two @ 0 and one @ 4   1.3333    1.7778      0.1500          0.0360         0.0930
two @ 1 and one @ 0   0.6667    0.4444      0.1350          0.1500         0.1425
two @ 1 and one @ 4   2.0000    4.0000      0.0540          0.2250         0.1395
two @ 4 and one @ 0   2.6667    7.1111      0.0600          0.0540         0.0570
two @ 4 and one @ 1   3.0000    9.0000      0.0360          0.1350         0.0855
0, 1, 4               1.6667    2.7778      0.1800          0.1800         0.1800
                       1.4      2.8333      1.0000          1.0000         1.0000

Var[X̄] = E[X̄²] - E[X̄]² = 2.8333 - 1.4² = 0.8733.
Comment: Var[X̄] ≠ Var[X]/3 = 2.44/3 = 0.8133. See Exercise 5 in "Topics in Credibility" by Dean.
For the average of N balls from a single urn, Var[X̄] = 0.09 + 2.35/N, where 0.09 is the VHM for a single ball and 2.35 is the EPV for a single ball.


17.18. D. Using the EPV and VHM for a single ball, K = EPV/VHM = 2.35/.09 = 26.1.
Z = 11/(11 + 26.1) = .296. The a priori mean is 1.4. The observed mean is: 12/11.
Estimate of the mean is: (.296)(12/11) + (1 - .296)(1.4) = 1.309.
Estimate for 100 balls is: 130.9.
Comment: Similar to Exercise 6 in Topics in Credibility by Dean.
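A quick check of this arithmetic (my own Python sketch, using the per-ball EPV of 2.35 and VHM of 0.09 found in solutions 17.12 and 17.9):

epv, vhm, prior_mean = 2.35, 0.09, 1.4
k = epv / vhm                      # about 26.1
n = 11                             # balls observed: 5 by Harry plus 6 by Sally
z = n / (n + k)                    # about 0.296
obs_mean = (5 + 7) / n             # observed average per ball
est = z * obs_mean + (1 - z) * prior_mean
print(100 * est)                   # estimate for the next 100 balls, about 131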
17.19. A. VHM = Cov(Xi, Xj) = 40.
VHM + EPV = Var(Xi) = 130. EPV = 90.
K = EPV/VHM = 90/40 = 2.25. Z = N/(N + K) = 3/(3 + 2.25) = 57.1%.
Comment: Similar to 4, 11/05, Q.26.
17.20. B. This is the Buhlmann covariance structure, with EPV = 18 and VHM = 2.
K = EPV/VHM = 18/2 = 9. For two observations, Z = 2/(2 + 9) = 2/11.
(2/11)(27 + 19)/2 + (9/11)μ = 32. ⇒ μ = 34. For four observations, Z = 4/(4 + 9) = 4/13.
The observed mean is: (40 + 61 + 45 + 29)/4 = 43.75. The estimate is: (4/13)(43.75) + (9/13)(34) = 37.
17.21. C. The EPV that we use to calculate K is for one exposure: 20,000.
The observed exposures total 8000. 2/3 = Z = 8000/(8000 + K). K = 4000.
4000 = EPV/VHM = 20,000/VHM. VHM = 5.
Expected value of the process variance of aggregate losses for year 1 is: 20,000 / 2000 = 10.
Cov(X1 , X1 ) = Var[X1 ] =
(Expected value of the process variance for year 1) + VHM = 10 + 5 = 15.
Comment: Similar to 4, 11/05, Q.26.


17.22. C. Let X be the first year of experience and Y be the second year of experience.
X̄ = {(9100)(0) + (850)(1) + (50)(2)}/10000 = 0.0950.
Σ X²/N = {(9100)(0²) + (850)(1²) + (50)(2²)}/10000 = 0.1050.
Ȳ = {(9080)(0) + (858)(1) + (62)(2)}/10000 = 0.0982.
Σ XY/N = {(1)(100) + (2)(10 + 8) + (4)(2)}/10000 = 0.0144.
Let x = X - X̄, and y = Y - Ȳ.
Σ xi²/N = variance of X = Σ X²/N - X̄² = 0.1050 - 0.0950² = 0.0960.
Σ xiyi/N = sample covariance of X and Y = Σ XY/N - X̄ Ȳ = 0.0144 - (0.0950)(0.0982) = 0.00507.
Estimated Z = slope of the regression line = Σ xiyi / Σ xi² = 0.00507/0.0960 = 5.3%.
Comment: Beyond what you are likely to be asked on the exam. See "A Graphical Illustration of Experience Rating Credibilities," by Howard C. Mahler, PCAS 1998, or pages 315-316 of "Risk Classification" by Robert J. Finger, in Foundations of Casualty Actuarial Science.
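A sketch (mine, not part of the original solution) of the same regression computation, working directly from the cell counts in the table:

# cells[(i, j)] = number of insureds with i claims in the first period and j in the second
cells = {(0, 0): 8300, (0, 1): 750, (0, 2): 50,
         (1, 0): 740,  (1, 1): 100, (1, 2): 10,
         (2, 0): 40,   (2, 1): 8,   (2, 2): 2}

n = sum(cells.values())                                   # 10,000 insureds
x_bar = sum(i * c for (i, j), c in cells.items()) / n     # 0.0950
y_bar = sum(j * c for (i, j), c in cells.items()) / n     # 0.0982
cov_xy = sum(i * j * c for (i, j), c in cells.items()) / n - x_bar * y_bar
var_x = sum(i * i * c for (i, j), c in cells.items()) / n - x_bar ** 2
print(cov_xy / var_x)                                     # slope, about 5.3%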
17.23. D. All of these are assumptions of the Bühlmann-Straub credibility model.
Comment: See Section 16.4.5 of Loss Models.
17.24. B. The two estimators A and B are independent, therefore:
Var[C] = Var[wA + (1 - w)B] = w² Var[A] + (1 - w)² Var[B] = 400² w² + 200² (1 - w)².
d Var[C] / dw = 2w(400²) - 2(1 - w)(200²).
Setting the derivative equal to zero: 2w(400²) - 2(1 - w)(200²) = 0.
⇒ w = 200²/(400² + 200²) = 1/5.
Comment: Each of the two estimators A and B is given weight inversely proportional to its variance.
For w = 1/5, estimate C is: (1/5)(1000) + (4/5)(1200) = 1160.
17.25. C. We have four years of data, and therefore Z = 4/(4 + K). 0.4 = 4/(4 + K).
⇒ K = 6. We are given EPV = 8. Therefore, VHM = EPV/K = 8/6 = 1.333.
For the Buhlmann Covariance Structure, for i ≠ j, Cov(Xi, Xj) = VHM = 1.333.
Comment: See Equation 20.35 of Loss Models. Cov(Xi, Xi) = EPV + VHM = 9.333.


17.26. E. The Bühlmann-Straub credibility model deals with policies with different numbers of exposure units, so that A is not true.
There is no requirement of a specific type of claim count distribution, so that B is not true.
There is no minimum number of exposures required, so that C is not true.
Comment: For classical credibility, 1082 claims is a commonly used standard for Full Credibility for frequency, corresponding to frequency being Poisson, P = 90%, and k = 5%.
In Loss Models, the Bühlmann credibility model refers to the case such as individual automobile drivers, where there are no exposures, or one driver for one year is one exposure, or where each policy has the same number of exposures.


Section 18, The Normal Equations for Credibilities


In general, one can solve the normal equations for the least squares credibility. The normal equations involve the variance-covariance structure of the data.
As discussed in the previous section, for Buhlmann Credibility the Variance-Covariance structure between the years of data is as follows:205
COV[Xi, Xj] = σ² δij + τ², where δij is 1 for i = j and 0 for i ≠ j.
More generally let COV[Xi, Xj] = Cij. Assume that in order to estimate X1+N we give weight Zi to year Xi, for i = 1 to N, with weight 1 - Σ Zi given to the a priori mean μ.
Then just as in the previous section, it turns out206 that the expected value of the squared errors is a quadratic function of the credibilities Zi:207

V(Z) = Σi=1 to N Σj=1 to N Zi Zj Cij - 2 Σi=1 to N Zi Ci,1+N + C1+N,1+N.

The cases dealt with previously are a special case of this more general situation.
Specifically if we observe one year, then N = 1 and the above equation becomes:
V(Z) = Z² C11 - 2Z C12 + C22.
For the Buhlmann covariance structure, C11 = C22 = σ² + τ² and C12 = τ². Thus for one year of data,
V(Z) = Z²(σ² + τ²) - 2Zτ² + σ² + τ², as was derived previously for this covariance structure.
For the Buhlmann-Straub covariance structure, for a risk of size m,
C11 = C22 = σ²/m + τ², and C12 = τ².
Thus for one year of data, V(Z) = Z²(σ²/m + τ²) - 2Zτ² + σ²/m + τ².

205 When size of risk is important, the Buhlmann-Straub covariance structure of frequency, severity or pure premiums is: COV[Xi, Xj] = δij(σ²/m) + τ². The EPV is inversely proportional to the size of risk.
206 See for example, "A Markov Chain Model of Shifting Risk Parameters," by Howard Mahler, PCAS 1997.
207 The credibilities are a vector with N elements, one for each year of data.


In the general case, one can minimize the squared error by taking the partial derivatives of V(Z) with respect to each Zi and setting it equal to zero. This yields N linear equations in N unknowns, which can be solved by the usual matrix methods.208

Σi=1 to N Zi Cij = Cj,1+N, for j = 1, 2, 3, ..., N.

These are sometimes called the normal equations.209

In Loss Models they assume linear estimators of the form: a0 + Σ ZiXi.
Then they derive equations for the least squares linear estimator; they derive the linear estimator that minimizes the mean squared error (MSE).
The unbiasedness equation:210
E[X] = a0 + Σ Zi E[Xi],
as well as the above equations:211

Σi=1 to N Zi Cij = Cj,1+N, for j = 1, 2, 3, ..., N.

Loss Models refers to the unbiasedness equation plus the above equations as the normal equations.
In most applications, one assumes the E[Xi] are each equal to an a priori mean μ, and therefore if one takes a0 = (1 - Σ Zi)μ, then the unbiasedness equation is satisfied. Specifically, it is common to take a0 = (1 - Σ Zi)μ = (complement of credibility)(a priori mean).
For one year of data, the estimator reduces to: ZX + (1 - Z)(a priori mean).

208 See for example, "A Markov Chain Model of Shifting Risk Parameters," by Howard Mahler, PCAS 1997. The equations will hold equally well if there is a gap between the data and the year to be estimated; for example, if one uses years 1, 2, and 3 to predict year 7, then the terms on the righthand side of the equations are Cj,7.
209 See equations 20.26 in Loss Models. These are sometimes called the normal equations, but Loss Models uses that term for these equations plus the unbiasedness equation.
210 See equation 20.25 in Loss Models.
211 See equation 20.26 in Loss Models.


Equal Exposures per Year:


Assume the EPV and VHM are some function of the size of the insured, but the insured has the same number of exposures each year. Then all the EPVs are equal and all the VHMs are equal, and
Cij = VHM + δij EPV. This looks just like the Buhlmann covariance structure, and the solution to the Normal Equations is Z = N/(N + K).
The Normal Equations are in this case:

Σi=1 to N Zi (VHM + δij EPV) = VHM, for j = 1, 2, 3, ..., N.

By symmetry, each Zi is equal to the others; let Zi = Z/N, so that Σ Zi = Z.
Then each of the Normal Equations is: (Z/N)(N VHM + EPV) = VHM.
⇒ Z = N/(N + EPV/VHM) = N/(N + K), where K = EPV/VHM as usual.
Thus even if the Buhlmann-Straub covariance structure does not hold, if there are equal exposures each year, and the EPV and VHM are only functions of the size of insured, then the usual Buhlmann Credibility formula holds.
If there are the same number of exposures per year, then the Normal Equations produce the same result as Z = N/(N + K), provided N is the number of years, and K is computed based on the number of exposures in each year.
Exercise: Manny's Hat Company has 150 exposures in each of 2 years.
You expect a similar company to have an annual frequency of 0.060 claims per exposure.
You assume the variance of the hypothetical mean frequency is 0.0001, and
the expected value of the annual process variance is: 0.001 + 0.1/mi, where mi is the annual number of exposures. If Manny's Hat Company had 19 and 14 claims in the two years observed, estimate its future claims frequency per exposure.
[Solution: This is not the Buhlmann-Straub covariance structure. However, since there are the same number of exposures each year, we can use Z = N/(N + K), provided K is computed based on the number of exposures in each year, and N is the number of years.
Each year the EPV is: 0.001 + 0.1/150 = 0.00167. VHM = 0.0001.
K = EPV/VHM = 0.00167/0.0001 = 16.7.
N is the number of years, 2. For 2 years, Z = 2/(2 + K) = 0.107.
The observed frequency per exposure is: 33/300 = 0.110.
Thus the estimated future frequency is: (0.107)(0.110) + (1 - 0.107)(0.060) = 0.065.
Alternately, the covariance matrix is:
[0.00167 + 0.0001        0.0001        ]   [0.00177   0.0001 ]
[0.0001          0.00167 + 0.0001      ] = [0.0001    0.00177]
The Normal Equations are:
0.00177 Z1 + 0.0001 Z2 = 0.0001.
0.0001 Z1 + 0.00177 Z2 = 0.0001.
Adding the equations: (0.00187)(Z1 + Z2) = 0.0002. ⇒ Z1 + Z2 = 0.1070.
By symmetry Z1 = Z2 = 0.1070/2 = 0.0535.
Estimated future frequency is: (19/150)(0.0535) + (14/150)(0.0535) + (1 - 0.107)(0.060) = 0.065.]
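Here is a short numpy sketch (mine, not part of the exercise) that solves the same 2-by-2 normal equations and reproduces the estimate of about 0.065:

import numpy as np

vhm = 0.0001
epv = 0.001 + 0.1 / 150                      # the same in each year, since exposures are equal
C = np.array([[epv + vhm, vhm],
              [vhm, epv + vhm]])             # covariance matrix of the two observed years
rhs = np.array([vhm, vhm])                   # covariances of each year with the year being predicted

Z = np.linalg.solve(C, rhs)                  # about [0.0535, 0.0535]
freqs = np.array([19 / 150, 14 / 150])
print(Z, freqs @ Z + (1 - Z.sum()) * 0.060)  # about 0.065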
Buhlmann-Straub Covariance Structure:
In the case of the Buhlmann-Straub Covariance Structure, even if there are differing exposures each
year, the usual Buhlmann Credibility formula holds.
For the Buhlmann-Straub Covariance Structure, for an insured of size mi in year i:
Cij = τ² + δij σ²/mi.
The Normal Equations are in this case:

Σi=1 to N Zi (τ² + δij σ²/mi) = τ², for j = 1, 2, 3, ..., N.

⇒ τ² Σ Zi + Zj σ²/mj = τ², for j = 1, 2, 3, ..., N.
⇒ Zj = mj (τ²/σ²)(1 - Σ Zi), for j = 1, 2, 3, ..., N.
Summing these equations over j, and letting m = Σ mi:
Σ Zj = m (τ²/σ²)(1 - Σ Zj).
Therefore, Σ Zj = (m τ²/σ²) / (1 + m τ²/σ²).
Therefore, Zj = mj (τ²/σ²)(1 - Σ Zi) = mj (τ²/σ²) / (1 + m τ²/σ²) = mj / (σ²/τ² + m).
τ² is the variance of the hypothetical means, and σ² is the Expected Value of the Process Variance for one exposure. If as usual we let K = σ²/τ², then Zj = mj / (m + K).


Let us assume we are estimating for example pure premiums.212
Then our estimated future pure premium is:
Σ Zj PPj + (1 - Σ Zj) μ = Σ {mj/(m + K)} (Lj/mj) + {1 - m/(m + K)} μ = L/(K + m) + K μ/(K + m).
What if instead we just combined all the years of data and let Z = m/(m + K)?
Then the estimate of the future pure premium is:
Z (L/m) + (1 - Z) μ = L/(K + m) + K μ/(K + m), the same result as above.
We have shown that, in the case of the Buhlmann-Straub Covariance structure, even if the number of exposures varies by year, using the Normal Equations results in the usual Buhlmann Credibility formula being applied to all the years of data combined.
Exercise: Manny's Hat Company has the following data for two years:
Year:           1      2
Exposures:    100    200
Claims:        12     21
You expect a similar company to have an annual frequency of 0.060 claims per exposure.
You assume the variance of the hypothetical mean frequency is 0.0001, and the expected value of the annual process variance is: 0.1/mi, where mi is the annual number of exposures.
Estimate its future claims frequency.
[Solution: This is an example of the Buhlmann-Straub covariance structure. Therefore, we can use the usual Buhlmann Credibility Formula. K = (EPV for one Exposure)/VHM = 0.1/0.0001 = 1000.
There are 300 Exposures in total, so Z = 300/(300 + 1000) = 3/13.
The observed frequency is: 33/300 = 0.110.
Thus the estimated future frequency is: (3/13)(0.110) + (10/13)(0.060) = 0.07154.
Alternately, one can use the Normal Equations as follows.
The EPV for year 1 is: 0.1/100 = 0.0010. The EPV for year 2 is: 0.1/200 = 0.0005.
The variance of year 1 is: EPV + VHM = 0.0010 + 0.0001 = 0.0011.
The variance of year 2 is: EPV + VHM = 0.0005 + 0.0001 = 0.0006.
The covariance of different years = VHM = 0.0001. The normal equations are:
0.0011 Z1 + 0.0001 Z2 = 0.0001.  ⇒  11 Z1 + Z2 = 1.
0.0001 Z1 + 0.0006 Z2 = 0.0001.  ⇒  Z1 + 6 Z2 = 1.
Solving these two linear equations in two unknowns, Z1 = 1/13, and Z2 = 2/13.
The estimated future frequency is: (1/13)(12/100) + (2/13)(21/200) + (10/13)(0.06) = 0.07154.]
212 The example would work just as well for frequency, severity, loss ratios, etc.


Varying Exposures by Year, More General Variance-Covariance Structures:


If the number of exposures differs by year, and one does not have the Buhlmann-Straub Covariance structure, then in general one does not get the usual Buhlmann Credibility formula. However, one can solve the Normal Equations as one would any other set of linear equations.
Exercise: Manny's Hat Company has the following data for two years:
Year:           1      2
Exposures:    100    200
Claims:        12     21
You expect a similar company to have an annual frequency of 0.060 claims per exposure.
You assume the variance of the hypothetical mean frequency is 0.0001, and
the expected value of the annual process variance is 0.001 + 0.1/mi, where mi is the annual number of exposures. Estimate its future claims frequency.
[Solution: The EPV for year 1 is: 0.001 + 0.1/100 = 0.0020.
The EPV for year 2 is: 0.001 + 0.1/200 = 0.0015.
The variance of year 1 is: EPV + VHM = 0.0020 + 0.0001 = 0.0021.
The variance of year 2 is: EPV + VHM = 0.0015 + 0.0001 = 0.0016.
The covariance of different years = VHM = 0.0001.
The normal equations are:
0.0021 Z1 + 0.0001 Z2 = 0.0001  ⇒  21 Z1 + Z2 = 1.
0.0001 Z1 + 0.0016 Z2 = 0.0001  ⇒  Z1 + 16 Z2 = 1.
Solving these two linear equations in two unknowns, in matrix form:213
[Z1]   [21   1]⁻¹ [1]   [ 16/335   -1/335] [1]   [15/335]
[Z2] = [ 1  16]   [1] = [ -1/335   21/335] [1] = [20/335]
Thus Z1 = 15/335 = 3/67, and Z2 = 20/335 = 4/67.
Therefore, the estimated future frequency is:
(3/67)(12/100) + (4/67)(21/200) + (60/67)(0.06) = 0.06537.]
Note how this estimate of 0.06537 differs significantly from the estimate of 0.07154 in the previous exercise, where the exposures and the total experience were the same, but the covariance structure was the Buhlmann-Straub one, without the constant term of 0.001 in the process variance.

213

I have shown the solution in matrix form, to remind you how one would solve n linear equations in n unknowns.
If a practical application had for example 4 years of data, then one would need to invert a 4 by 4 matrix, in order to
solve the Normal Equations.
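The same 2-by-2 system can be solved directly with numpy; this sketch (mine, not part of the exercise) reproduces Z1 = 3/67, Z2 = 4/67, and the estimate of about 0.0654:

import numpy as np

vhm = 0.0001
epv = [0.001 + 0.1 / 100, 0.001 + 0.1 / 200]     # 0.0020 and 0.0015
C = np.array([[epv[0] + vhm, vhm],
              [vhm, epv[1] + vhm]])              # [[0.0021, 0.0001], [0.0001, 0.0016]]
rhs = np.full(2, vhm)

Z = np.linalg.solve(C, rhs)
freqs = np.array([12 / 100, 21 / 200])
print(Z, freqs @ Z + (1 - Z.sum()) * 0.060)      # about [0.0448, 0.0597] and 0.06537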


In one example of a different covariance structure, as in the above exercise, the expected value of the annual process variance is of the form: w + v/mi.214 This covariance structure is used to model parameter uncertainty. Parameter uncertainty involves random fluctuations in the states of the universe that affect most insureds somewhat similarly regardless of size.
As the size of risk increases, the EPV goes to a constant w, rather than zero, as assumed more commonly.
Therefore, as the size of risk approaches infinity, the credibility does not approach 1.215
Another example of a different covariance structure is to assume that the variance of the hypothetical
means depends on the size of insured: VHM = a + b/mi.
This covariance structure is used to model risk heterogeneity. Risk heterogeneity occurs when
an insured is a sum of subunits, and not all of the subunits have the same risk process.
In Workers Compensation Experience Rating, the commonly assumed covariance structure
includes both parameter uncertainty and risk heterogeneity, which leads to credibilities of the
form:
(Linear Function of Size of Insured)/ (Linear Function of Size of Insured).216
Given these or any other covariance structure, one can obtain the credibilities assigned to each year
of data by solving a set of linear equations in a similar manner.
For example assume the variance-covariance matrix is given by:217
[3.5833   0.3750   0.2837   0.2159]
[0.3750   3.5833   0.3750   0.2837]
[0.2837   0.3750   3.5833   0.3750]
[0.2159   0.2837   0.3750   3.5833]

214 See Example 20.25 in Loss Models.
215 See 4, 5/01, Q.23. See Howard Mahler's Discussion of Robin R. Gillam's "Parametrizing the Workers Compensation Experience Rating Plan," PCAS 1993, or "Credibility with Parameter Uncertainty, Risk Heterogeneity, and Shifting Risk Parameters," by Howard Mahler, PCAS 1998.
216 See Example 20.26 in Loss Models. See also Howard Mahler's discussion of "Parametrizing the Workers Compensation Experience Rating Plan," PCAS 1993, or Howard Mahler's "Credibility with Shifting Risk Parameters, Risk Heterogeneity and Parameter Uncertainty," PCAS 1998.
217 This example is taken from "A Markov Chain Model of Shifting Risk Parameters," by Howard Mahler, PCAS 1997. As with the variance-covariance matrices underlying Buhlmann Credibility, the off-diagonal elements are smaller than those along the diagonal. In the Buhlmann covariance structure, all of the off-diagonal terms are equal to τ², while all the diagonal elements equal σ² + τ². In contrast, here as the terms get further from the diagonal they decline. This decline reflects an assumption that as years of data get further apart they are less closely correlated. This is the situation one would expect when risk parameters shift over time.


Then the normal equations become, for years 1, 2 and 3 predicting year 4:
3.5833 Z1 + 0.3750 Z2 + 0.2837 Z3 = 0.2159.
0.3750 Z1 + 3.5833 Z2 + 0.3750 Z3 = 0.2837.
0.2837 Z1 + 0.3750 Z2 + 3.5833 Z3 = 0.3750.
Note the way that each equation corresponds to a row of the variance-covariance matrix.
These three linear equations in three unknowns have the solution:218
Z1 = 4.6%, Z2 = 6.4%, and Z3 = 9.4%.
Thus in this case one would give the data from Year 1 a weight of 4.6%,
that from Year 2 a weight of 6.4%,
that from Year 3 a weight of 9.4%,
with the remaining weight of 79.6% being given to the a priori mean.219
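For readers who want to reproduce these weights, here is a small numpy sketch (mine, not from the original paper) that solves the three normal equations above:

import numpy as np

C = np.array([[3.5833, 0.3750, 0.2837],
              [0.3750, 3.5833, 0.3750],
              [0.2837, 0.3750, 3.5833]])      # covariances among years 1, 2, and 3
rhs = np.array([0.2159, 0.2837, 0.3750])      # covariances of years 1, 2, and 3 with year 4

Z = np.linalg.solve(C, rhs)
print(Z, 1 - Z.sum())   # about [0.046, 0.064, 0.094], with 0.796 left for the a priori mean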

Summary:
For the Buhlmann Covariance Structure, the Normal Equations ⇒ Z = N/(N + K).
For the Buhlmann-Straub Covariance Structure, the Normal Equations ⇒ Z = m/(m + K), where m is the total number of exposures and K is computed for one exposure.

If one assumes a covariance structure different than the Buhlmann-Straub, and exposures
vary by year, then the equation Z = m/(m + K) does not hold. In these situations, one can
solve the set of linear equations, called the normal equations, for the amount of credibility to be
assigned to each year of data.

218 This would involve inverting the 3 by 3 matrix of coefficients.
219 Note the way the data from the most recent year is given more weight than that from a more distant year. This is typical when one takes into account shifting risk parameters. If the rate of shifting is relatively slow, then the assumption of stable risk parameters is reasonable to use for practical purposes.


Problems:
Use the following information for the next 5 questions:
• There are four years of data: X1, X2, X3, X4.
• Assume a priori that E[X1] = E[X2] = E[X3] = E[X4] = μ.
• The assumed variance-covariance matrix between the years of data is:
[17   5   5   5]
[ 5  17   5   5]
[ 5   5  17   5]
[ 5   5   5  17]
• Assume you will use a linear estimator, a0 + a1X1 + a2X2 + a3X3, in order to predict X4.
• You will employ the normal equations in order to determine the least squares linear estimator of X4 given X1, X2, and X3.

18.1 (1 point) What is a0?
A. 5μ/17    B. μ/3    C. 15μ/27    D. μ    E. None of the above.


18.2 (1 point) What is a1?
A. 5/27    B. 1/4    C. 5/17    D. 1/3    E. None of the above.
18.3 (1 point) What is a2?
A. 5/27    B. 1/4    C. 5/17    D. 1/3    E. None of the above.
18.4 (1 point) What is a3?
A. 5/27    B. 1/4    C. 5/17    D. 1/3    E. None of the above.

18.5 (1 point) Assume μ = 100 and X1 = 70, X2 = 110, and X3 = 90.
What is the least squares linear estimate of X4?
A. Less than 91
B. At least 91 but less than 92
C. At least 92 but less than 93
D. At least 93 but less than 94
E. At least 94


Use the following information for the next two questions:
• Let Xi be the losses in year i.
• Assume a priori that E[X1] = E[X2] = E[X3] = E[X4] = 100.
• The assumed variance-covariance matrix between the years of data is:
[17   4   3   2]
[ 4  17   4   3]
[ 3   4  17   4]
[ 2   3   4  17]
• Assume you will use a linear estimator, a0 + a1X1 + a2X2 + a3X3, in order to predict X4.
• You will employ the Normal Equations in order to determine the least squares linear estimator of X4 given X1, X2, and X3.
18.6 (3 points) If X1 = 70, X2 = 110, and X3 = 90, what is the estimate of X4 ?


A. Less than 94
B. At least 94 but less than 95
C. At least 95 but less than 96
D. At least 96 but less than 97
E. At least 97
18.7 (3 points) If X1 = 90, X2 = 110, and X3 = 70, what is the estimate of X4 ?
A. Less than 94
B. At least 94 but less than 95
C. At least 95 but less than 96
D. At least 96 but less than 97
E. At least 97
18.8 (2 points) You are given the following information about a single risk:
(i) The risk has m exposures in each year.
(ii) The risk is observed for n years.
(iii) The variance of the hypothetical means is a.
(iv) The expected value of the annual process variance is w + v/m.
Determine the limit of the Bühlmann-Straub credibility factor as n approaches infinity.
(A) m/(m + (w + v)/a)    (B) m/(m + m²w/a)    (C) m/(m + w/a)    (D) m/(m + v/a)    (E) 1


18.9 (1 point) The a priori expected value of X is 1000.


X1 = 900, X2 = 1100, X3 = 800, and X4 = 1300.
Using the Normal Equations you determine that: Z1 = 10%, Z2 = 15%, Z3 = 30%, Z4 = 20%.
What is the resulting estimate of X5 ?
A. 1000    B. 1005    C. 1010    D. 1015    E. 1020

Use the following information for the next three questions:
You are given the following data for Cohen Construction Company:
Year:             2001    2002
Exposures:          10      50
Losses:             80     750
Pure Premium:        8      15
The expected pure premium for a construction company similar to this one is 21.
18.10 (3 points) You assume that:

The variance of the hypothetical mean pure premiums is 12.

The expected value of the annual process variance is: 400/E,


where E is the number of exposures that year.
Use least squares credibility in order to estimate the future pure premium for the Cohen Construction
Company.
A. Less than 16.50
B. At least 16.50 but less than 17.00
C. At least 17.00 but less than 17.50
D. At least 17.50 but less than 18.00
E. At least 18.00
18.11 (4 points) You assume that:

The variance of the hypothetical mean pure premiums is 12.

The expected value of the annual process variance is: 4 + 400/E,


where E is the number of exposures that year.
Use least squares credibility in order to estimate the future pure premium for the Cohen Construction
Company.
A. Less than 16.50
B. At least 16.50 but less than 17.00
C. At least 17.00 but less than 17.50
D. At least 17.50 but less than 18.00
E. At least 18.00


18.12 (4 points) You assume that:
• Ei is the number of exposures for year i.
• PPi is the pure premium for year i.
• Cov[PPi, PPj] = 12 + 80/√(Ei Ej) + δij (4 + 400/Ej).
• The exposures in year 2003 are expected to be 30.
Use the normal equations in order to estimate the pure premium in 2003 for the Cohen Construction Company.
A. Less than 16.50
B. At least 16.50 but less than 17.00
C. At least 17.00 but less than 17.50
D. At least 17.50 but less than 18.00
E. At least 18.00

18.13 (2 points) You are given the following information about a single risk:
(i) The risk has 100 exposures in each year.
(ii) The risk is observed for 3 years.
(iii) The variance of the hypothetical means is 5.
(iv) Where m is the number of exposures observed during a year, the expected value of the
annual process variance is 30 + 7500/m.
Determine the credibility factor Z assigned to the sum of these three years of data.
A. 10%
B. 12.5%
C. 15%
D. 17.5%
E. 20%
18.14 (2 points) You are given the following information about a single risk:
(i) The risk has m exposures in each year.
(ii) The risk is observed for n years.
(iii) The variance of the hypothetical means is a + b/m.
(iv) The expected value of the annual process variance is v/m.
Determine the limit of the Bühlmann-Straub credibility factor as m approaches zero.
(A) 0    (B) n/(n + v/a)    (C) n/(n + w/a)    (D) n/(n + v/b)    (E) n/(n + w/b)


18.15 (8 points) Use the following information:
• Let Ri be the class relativity in year i.
• Assume a priori that E[Ri] = 1.
• Let mi be the expected losses in thousands of dollars for the class in year i, measuring the size of the class.
• δij = 0 if i ≠ j, and 1 if i = j.
• Cov[Ri, Rj] = 0.05 + 5/√(mi mj) + δij (0.005 + 25/√(mi mj)).
Year i      mi       Ri
  1         350     0.92
  2         180     0.83
  3         190     0.76
  4         290     0.98
  5         320     0.50
  6         270
Employ the Normal Equations in order to determine the least squares linear estimator of R6.
(Use a computer to help you with the computations.)
A. 0.82    B. 0.84    C. 0.86    D. 0.88    E. 0.90
18.16 (6 points) Use the following information:
• You are using data from years 1 through 5 in order to predict year 6.
• δij = 0 if i ≠ j, and 1 if i = j.
• The covariance between years of data is: Cov[Xi, Xj] = 0.9^|i-j| + 5 δij.
Employ the Normal Equations in order to determine the credibilities to assign to each of the years of data. (Use a computer to help you with the computations.)


18.17 (4, 5/01, Q.23) (2.5 points)
You are given the following information about a single risk:
(i) The risk has m exposures in each year.
(ii) The risk is observed for n years.
(iii) The variance of the hypothetical means is a.
(iv) The expected value of the annual process variance is w + v/m.
Determine the limit of the Bühlmann-Straub credibility factor as m approaches infinity.
(A) n/(n + (w + v)/a)    (B) n/(n + n²w/a)    (C) n/(n + w/a)    (D) n/(n + v/a)    (E) 1


Solutions to Problems:
18.1. E., 18.2. A., 18.3. A., 18.4. A. Write down the normal equations for the least squares linear estimator of X4 given X1, X2, and X3.
The unbiasedness equation is: a0 + (a1 + a2 + a3)μ = μ. The 3 remaining normal equations are:
17 a1 + 5 a2 + 5 a3 = 5
5 a1 + 17 a2 + 5 a3 = 5
5 a1 + 5 a2 + 17 a3 = 5
By symmetry a1 = a2 = a3. Therefore 27 a1 = 5. a1 = 5/27 = a2 = a3.
Alternately, one can solve these three linear equations in three unknowns by inverting the matrix of coefficients:
[a1]   [22  -5  -5]       [5]   [5/27]
[a2] = [-5  22  -5]/324 × [5] = [5/27]
[a3]   [-5  -5  22]       [5]   [5/27]
Using a1, a2 and a3, and the unbiasedness equation to solve for a0:
a0 = (1 - a1 - a2 - a3)μ = 12μ/27.
Comment: This is the Buhlmann Covariance Structure, with VHM = 5 and
EPV + VHM = 17. Thus, EPV = 12 and K = 12/5.
We give three years of data a credibility of 3/(3 + 12/5) = 15/27.
Thus the credibility estimate of X4 is:
(15/27)(X1 + X2 + X3)/3 + (1 - 15/27)μ = 12μ/27 + (5/27)X1 + (5/27)X2 + (5/27)X3.
18.5. E. From the previous solutions, a1 = a2 = a3 = 5/27 and a0 = 12μ/27. Thus,
X̂4 = 12μ/27 + (5/27)X1 + (5/27)X2 + (5/27)X3 =
(12/27)(100) + (5/27)(70) + (5/27)(110) + (5/27)(90) = 94.44.
Comment: This is the Buhlmann Covariance Structure, with VHM = 5 and EPV + VHM = 17.
Thus, EPV = 12 and K = 12/5.
We give three years of data a credibility of 3/(3 + 12/5) = 15/27 = 55.6%.
The a priori mean is 100. The observed average is: (70 + 110 + 90)/3 = 90.
Thus the credibility estimate of X4 is: (55.6%)(90) + (44.4%)(100) = 94.44.


18.6. E. & 18.7. B. The Normal Equations are E[X] = a0 + Σ Zi E[Xi], and
Σi=1 to 3 Zi Cij = Cj,4, for j = 1, 2, 3:
17 Z1 + 4 Z2 + 3 Z3 = 2.
4 Z1 + 17 Z2 + 4 Z3 = 3.
3 Z1 + 4 Z2 + 17 Z3 = 4.
Solving, Z1 = 17/308, Z2 = 36/308, Z3 = 61/308.
a0 = 100 - (100)(17/308 + 36/308 + 61/308) = (100)(194/308).
X̂4 = (100)(194/308) + (17/308)X1 + (36/308)X2 + (61/308)X3.
If X1 = 70, X2 = 110, and X3 = 90, then
X̂4 = (100)(194/308) + (17/308)(70) + (36/308)(110) + (61/308)(90) = 97.53.
If X1 = 90, X2 = 110, and X3 = 70, then
X̂4 = (100)(194/308) + (17/308)(90) + (36/308)(110) + (61/308)(70) = 94.68.
Comment: This type of covariance structure can occur when there are shifting risk parameters over time, in which case older years of data are given less weight than recent years.
See "A Markov Chain Model of Shifting Risk Parameters," by Howard Mahler, PCAS 1997.
18.8. E. K = EPV/VHM = (w + v/m)/a. Z = n/(n + K) = n/(n + (w + v/m)/a).
As n approaches infinity, Z approaches 1.
Comment: Similar to 4, 5/01, Q.23. This covariance structure is used to model parameter
uncertainty. Increasing the number of years observed can overcome the effects of parameter
uncertainty, by averaging over the different assumed random states of the universe in each
year. This can not be accomplished by observing more exposures from a single year.
See Credibility with Parameter Uncertainty, Risk Heterogeneity, and Shifting Risk
Parameters, by Howard Mahler, PCAS 1998.
18.9. B. Estimate of X5 =
(.1)(900) + (.15)(1100) + (.3)(800) + (.2)(1300) + (1 - .1 - .15 - .3 - .2)(1000) = 1005.


18.10. A. K = (EPV for one Exposure)/VHM = 400/12 = 100/3.
We observe a total of 60 exposures, so Z = 60/(60 + 100/3) = 9/14.
The observed pure premium is: (80 + 750)/(10 + 50) = 83/6.
Therefore, the estimated future pure premium is:
(9/14)(83/6) + (1 - 9/14)(21) = 459/28 = 16.39.
Alternately, the EPV for 2001 is: 400/10 = 40. The EPV for year 2002 is: 400/50 = 8.
The variance of year 2001 is: EPV + VHM = 40 + 12 = 52.
The variance of year 2002 is: EPV + VHM = 8 + 12 = 20.
The covariance of different years = VHM = 12.
C = [52  12]
    [12  20]
The normal equations for credibility and linear estimators of the form a0 + Σ ZiXi:
Σi=1 to n Zi Cij = Cj,1+n, for j = 1, 2, 3, ..., n,
plus the unbiasedness equation: E[X] = a0 + Σ Zi E[Xi].
Since we are assuming each year has the same expected pure premium,
a0 = μ(1 - Σ Zi) = (a priori mean)(complement of credibility).
The normal equations have coefficients from C:
52 Z1 + 12 Z2 = 12
12 Z1 + 20 Z2 = 12.
Solving, Z1 = 3/28 and Z2 = 15/28. Therefore, the estimated future pure premium is:
(8)(3/28) + (15)(15/28) + (21)(1 - (3/28 + 15/28)) = 16.39.


18.11. B. The EPV for 2001 is: 4 + 400/10 = 44. The EPV for year 2002 is: 4 + 400/50 = 12.
The variance of year 2001 is: EPV + VHM = 44 + 12 = 56.
The variance of year 2002 is: EPV + VHM = 12 + 12 = 24.
The covariance of different years = VHM = 12.
C = [56  12]
    [12  24]
The normal equations for credibility and linear estimators of the form a0 + Σ ZiXi:
Σi=1 to n Zi Cij = Cj,1+n, for j = 1, 2, 3, ..., n,
plus the unbiasedness equation: E[X] = a0 + Σ Zi E[Xi].
Since we are assuming each year has the same expected pure premium,
a0 = E[X](1 - Σ Zi) = (a priori mean)(complement of credibility).
The normal equations have coefficients from C:
56 Z1 + 12 Z2 = 12
12 Z1 + 24 Z2 = 12.
Solving, Z1 = 12% and Z2 = 44%. Therefore, the estimated future pure premium is:
(8)(12%) + (15)(44%) + (21)(1 - (12% + 44%)) = 16.80.
Comment: This covariance structure is used to model parameter uncertainty.


18.12. A. Cov[PP1, PP1] = 12 + 80/10 + 4 + 400/10 = 64.
Cov[PP1, PP2] = Cov[PP2, PP1] = 12 + 80/√((10)(50)) = 15.578.
Cov[PP2, PP2] = 12 + 80/50 + 4 + 400/50 = 25.6.
Cov[PP1, PP3] = Cov[PP3, PP1] = 12 + 80/√((10)(30)) = 16.619.
Cov[PP2, PP3] = Cov[PP3, PP2] = 12 + 80/√((50)(30)) = 14.066.
Cov[PP3, PP3] = 12 + 80/30 + 4 + 400/30 = 32.
C = [64      15.578  16.619]
    [15.578  25.6    14.066]
    [16.619  14.066  32    ]
The normal equations for credibility and linear estimators of the form a0 + Σ ZiXi:
Σi=1 to n Zi Cij = Cj,1+n, for j = 1, 2, 3, ..., n,
plus the unbiasedness equation: E[X] = a0 + Σ Zi E[Xi].
Since we are assuming each year has the same expected pure premium,
a0 = E[X](1 - Σ Zi) = (a priori mean)(complement of credibility).
The normal equations have coefficients from C:
64 Z1 + 15.578 Z2 = 16.619.
15.578 Z1 + 25.6 Z2 = 14.066.
Solving, Z1 = 14.8% and Z2 = 45.9%. Therefore, the estimated future pure premium is:
(8)(14.8%) + (15)(45.9%) + (21)(1 - (14.8% + 45.9%)) = 16.32.
Comment: This covariance structure can be used to model parameter uncertainty and risk heterogeneity. See "Credibility with Parameter Uncertainty, Risk Heterogeneity, and Shifting Risk Parameters," by Howard Mahler, PCAS 1998.
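A short numpy sketch (mine, not part of the original solution) of this calculation, building the covariances from the assumed structure and then solving the normal equations:

import numpy as np

E = [10, 50]                          # exposures in 2001 and 2002
E3 = 30                               # expected exposures in 2003
pp = np.array([8.0, 15.0])            # observed pure premiums
prior = 21.0

def cov(ei, ej, same_year):
    c = 12 + 80 / (ei * ej) ** 0.5
    return c + (4 + 400 / ej) if same_year else c

C = np.array([[cov(E[0], E[0], True),  cov(E[0], E[1], False)],
              [cov(E[1], E[0], False), cov(E[1], E[1], True)]])
rhs = np.array([cov(E[0], E3, False), cov(E[1], E3, False)])

Z = np.linalg.solve(C, rhs)                # about [0.148, 0.459]
print(pp @ Z + (1 - Z.sum()) * prior)      # about 16.32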


18.13. B. There are equal exposures each year, and the EPV and VHM are only functions of the
size of insured, and therefore the usual Buhlmann Credibility formula holds.
Z = N/(N + K), provided N is the number of years, and K is computed based on the number of
exposures in each year.
K = EPV/VHM = (30 + 7500/100)/5 = 105/5 = 21. Z = N/(N + K) = 3/(3 + 21) = 12.5%.
Alternately, set up the normal equations for credibility and linear estimators of the form a0 + Σ ZiXi:
Σi=1 to 3 Zi Cij = Cj,4, for j = 1, 2, 3,
plus the unbiasedness equation: E[X] = a0 + Σ Zi E[Xi].
The covariance of years i and j is: VHM = 5 for i ≠ j,
and VHM + EPV = 5 + 30 + 7500/100 = 110 for i = j.
Thus the Normal Equations are:
110 Z1 + 5 Z2 + 5 Z3 = 5.
5 Z1 + 110 Z2 + 5 Z3 = 5.
5 Z1 + 5 Z2 + 110 Z3 = 5.
By symmetry, each Zi is equal to the others.
Then each equation becomes 120 Z1 = 5. Z1 = 5/120.
The total weight given to the three years of data is: Z1 + Z2 + Z3 = 3 Z1 = 5/40 = 12.5%.
Comment: Somewhat similar to 4, 5/01, Q.23.


18.14. D. K = EPV/VHM = (v/m)/(a + b/m) = v/(ma + b).
Z = n/(n + K) = n/(n + v/(ma + b)).
As m approaches zero, Z approaches n/(n + v/b).
Alternately, set up the normal equations for credibility and linear estimators of the form a0 + Σ ZiXi:
Σi=1 to n Zi Cij = Cj,1+n, for j = 1, 2, 3, ..., n,
plus the unbiasedness equation: E[X] = a0 + Σ Zi E[Xi].
The covariance of years i and j is: VHM = a + b/m for i ≠ j,
and VHM + EPV = a + b/m + v/m for i = j.
Therefore, Cij = a + b/m + δij(v/m).
By symmetry, each Zi is equal to the others; let Zi = Z/n, so that Σ Zi = Z.
Then (Z/n)(na + nb/m + v/m) = a + b/m. ⇒ Z = n/(n + (v/m)/(a + b/m)) = n/(n + v/(ma + b)).
As m approaches zero, Z approaches n/(n + v/b).
Comment: Similar to 4, 5/01, Q.23.
Assuming that each year has the same expected value, a0 = (1 - Z)E[X].
This covariance structure is used to model risk heterogeneity. Risk heterogeneity occurs when an insured is a sum of subunits, and not all of the subunits have the same risk process. For this covariance structure, as the size of risk approaches zero, the credibility approaches a positive constant. This covariance structure can be refined to remove this feature. See "Credibility with Parameter Uncertainty, Risk Heterogeneity, and Shifting Risk Parameters," by Howard Mahler, PCAS 1998.


18.15. B. Var[Ri] = Cov[Ri, Ri] = 0.05 {1 + 100/mi + 0.1 + 500/mi} = 0.05 (1.1 + 600/mi).
For example, Var[R1] = 0.05 (1.1 + 600/350) = 0.140714.
For i ≠ j, Cov[Ri, Rj] = 0.05 + 5/√(mi mj).
For example, Cov[R1, R2] = 0.05 + 5/√((350)(180)) = 0.0699205.
The Normal Equations, Σi=1 to 5 Zi Cij = Cj,6 for j = 1, 2, 3, 4, 5, are:
0.140714 Z1 + 0.0699205 Z2 + 0.0693892 Z3 + 0.0656941 Z4 + 0.0649404 Z5 = 0.066265.
0.0699205 Z1 + 0.221667 Z2 + 0.0770369 Z3 + 0.0718844 Z4 + 0.0708333 Z5 = 0.0726805.
0.0693892 Z1 + 0.0770369 Z2 + 0.212895 Z3 + 0.0713007 Z4 + 0.0702777 Z5 = 0.0720755.
0.0656941 Z1 + 0.0718844 Z2 + 0.0713007 Z3 + 0.158448 Z4 + 0.0664133 Z5 = 0.0678685.
0.0649404 Z1 + 0.0708333 Z2 + 0.0702777 Z3 + 0.0664133 Z4 + 0.14875 Z5 = 0.0670103.
Solving, Z1 = 0.195, Z2 = 0.113, Z3 = 0.118, Z4 = 0.167, Z5 = 0.181.
Σ Zi = 0.195 + 0.113 + 0.118 + 0.167 + 0.181 = 0.774.
The remaining weight of 1 - 0.774 = 0.226 is given to the a priori mean relativity of 1.
The estimated relativity for year 6 is:
(0.195)(0.92) + (0.113)(0.83) + (0.118)(0.76) + (0.167)(0.98) + (0.181)(0.50) + (0.226)(1) = 0.843.
Comment: Well beyond what you will be asked on your exam.
A very simplified version of "Workers Compensation Classification Credibilities," by Howard C. Mahler, CAS Forum, Fall 1999.
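A numpy sketch (mine, not part of the original solution) of the computation, building the five-by-five covariance matrix and the covariances with year 6 from the assumed structure:

import numpy as np

m = np.array([350, 180, 190, 290, 320, 270], dtype=float)   # sizes for years 1 through 6
R = np.array([0.92, 0.83, 0.76, 0.98, 0.50])                # observed relativities, years 1-5

def cov(i, j):
    c = 0.05 + 5 / (m[i] * m[j]) ** 0.5
    if i == j:
        c += 0.005 + 25 / (m[i] * m[j]) ** 0.5
    return c

C = np.array([[cov(i, j) for j in range(5)] for i in range(5)])
rhs = np.array([cov(i, 5) for i in range(5)])                # covariances with year 6

Z = np.linalg.solve(C, rhs)            # about [0.195, 0.113, 0.118, 0.167, 0.181]
print(R @ Z + (1 - Z.sum()) * 1.0)     # about 0.843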


18.16. Var[X] = Cov[X, X] = 1 + 5 = 6.
Cov[X1, X2] = 0.9.
Cov[X1, X3] = 0.9² = 0.81.
Cov[X1, X4] = 0.9³ = 0.729.
The covariance matrix between the years of data is (rows are years 1 through 5; columns are years 1 through 6):
[6        0.9      0.81     0.729    0.6561   0.59049]
[0.9      6        0.9      0.81     0.729    0.6561 ]
[0.81     0.9      6        0.9      0.81     0.729  ]
[0.729    0.81     0.9      6        0.9      0.81   ]
[0.6561   0.729    0.81     0.9      6        0.9    ]
Therefore, the Normal Equations are:
6 Z1 + 0.9 Z2 + 0.81 Z3 + 0.729 Z4 + 0.6561 Z5 = 0.59049.
0.9 Z1 + 6 Z2 + 0.9 Z3 + 0.81 Z4 + 0.729 Z5 = 0.6561.
0.81 Z1 + 0.9 Z2 + 6 Z3 + 0.9 Z4 + 0.81 Z5 = 0.729.
0.729 Z1 + 0.81 Z2 + 0.9 Z3 + 6 Z4 + 0.9 Z5 = 0.81.
0.6561 Z1 + 0.729 Z2 + 0.81 Z3 + 0.9 Z4 + 6 Z5 = 0.9.
Solving: Z1 = 6.86%, Z2 = -5.21%, Z3 = 8.83%, Z4 = 10.22%, Z5 = 12.16%.
6.86% - 5.21% + 8.83% + 10.22% + 12.16% = 32.86%.
The remaining weight of 67.14% is given to the a priori mean.
Comment: Beyond what you will be asked on your exam.
The older years are less correlated with year 6, the year we wish to estimate, and thus their data is given less weight.
The Normal Equations can have solutions where some of the credibilities are negative or greater than one.
In this case, giving negative weight to the data from year 2 allows us to give more weight to the data from year 1, resulting in a smaller expected squared error.
Year 1 is correlated with year 0, etc., and therefore contains valuable information about prior years.
See "A Markov Chain Model of Shifting Risk Parameters," by Howard Mahler, PCAS 1997, not on the syllabus.


18.17. B. K = EPV/VHM = (w + v/m)/a. Z = n/(n + K) = na/(na + w + v/m).
As m approaches infinity, Z approaches: na/(na + w) = n/(n + w/a).
Alternately, set up the normal equations for credibility and linear estimators of the form a0 + Σ ZiXi:
Σi=1 to n Zi Cij = Cj,1+n, for j = 1, 2, 3, ..., n,
plus the unbiasedness equation: E[X] = a0 + Σ Zi E[Xi].
The covariance of years i and j is: VHM = a for i ≠ j,
and VHM + EPV = a + w + v/m for i = j.
Therefore, Cij = a + δij(w + v/m).
By symmetry, each Zi is equal to the others; let Zi = Z/n, so that Σ Zi = Z.
Then (Z/n)(na + w + v/m) = a. ⇒ Z = na/(na + w + v/m).
As m approaches infinity, Z approaches na/(na + w) = n/(n + w/a).
Comment: Not as hard as it looks! This is an example where the Buhlmann-Straub covariance structure does not hold. Even if the Buhlmann-Straub covariance structure does not hold, if there are equal exposures each year, and the EPV and VHM are only functions of the size of insured, then the usual Buhlmann Credibility formula holds. If there are the same number of exposures per year, then the Normal Equations produce the same result as
Z = N/(N + K), provided N is the number of years, and K is computed based on the number of exposures in each year.
Assuming each year has the same expected value, a0 = (1 - Z)E[X].
This covariance structure is used to model parameter uncertainty. As the size of risk increases, the EPV goes to a constant w, rather than zero, as assumed more commonly. Therefore, as the size of risk approaches infinity, the credibility does not approach one, assuming w > 0. See Howard Mahler's Discussion of Robin R. Gillam's "Parametrizing the Workers Compensation Experience Rating Plan," PCAS 1993, or "Credibility with Parameter Uncertainty, Risk Heterogeneity, and Shifting Risk Parameters," by Howard Mahler, PCAS 1998.


Section 19, Important Formulas and Ideas


Here are what I believe are the most important formulas and ideas from this study guide to know for
the exam.
Conditional Distributions (Section 2):

P[A | B] = P[A and B] / P[B].
E[X | B] = Σy y P[X = y | B].
P[A] = Σ over the Bi of P[A | Bi] P[Bi].
E[X] = Σ over the Bi of E[X | Bi] P[Bi].

Covariances and Correlations (Section 3):


Cov[X,Y] ≡ E[XY] - E[X]E[Y].
Cov[X, X] = Var[X].
Var[X + Y] = Var[X] + Var[Y] + 2Cov[X,Y].
If X and Y are independent then: Cov[X, Y] = 0 and Var[X + Y] = Var[X] + Var[Y].
Corr[X,Y] ≡ Cov[X, Y] / √(Var[X] Var[Y]).

The correlation is always in the interval [-1, +1].

Bayesian Analysis (Sections 4, 5, and 6):


Bayes Theorem: P(A | B) = P(B | A) P(A) / P(B).

P(Risk Type | Observation) = P(Observation | Risk Type) P(Risk Type) / P(Observation).

Unless stated otherwise, the estimate resulting from Bayesian Analysis is the mean of the posterior
distribution, (corresponding to using the squared error loss function.)
The result of Bayesian Analysis is always within the range of hypotheses.
The estimates that result from Bayesian Analysis are always in balance:
The sum of the product of the a priori chance of each outcome times its posterior Bayesian estimate
is equal to the a priori mean.


If π(θ) is the prior distribution of the parameter θ,
then the posterior distribution of θ is proportional to: π(θ) P(Observation | θ).
The posterior distribution of θ is:
π(θ) Prob[Observation | θ] / ∫ π(θ) Prob[Observation | θ] dθ.
The Bayes estimate is:
∫ (Mean given θ) π(θ) Prob[Obs. | θ] dθ / ∫ π(θ) Prob[Obs. | θ] dθ.
When there is a continuous distribution of risk types, one can use the posterior distribution to get
Bayesian Interval Estimates.
Buhlmann Credibility (Sections 7, 8, 9, and 10):
v = EPV = Expected Value of the Process Variance = E[Var[X | θ]].
a = VHM = Variance of the Hypothetical Means = Var[E[X | θ]].
EPV + VHM = Total Variance.
Buhlmann Credibility Parameter = K = EPV / VHM,
where the Expected Value of the Process Variance and the Variance of the Hypothetical Means are
each calculated for a single observation of the risk process.
One calculates the EPV, VHM, and K prior to knowing the particular observation!
If one is estimating claim frequencies or pure premiums, then N is in exposures.
If one is estimating claim severities, then N is in number of claims.
For N observations, the Buhlmann Credibility Factor is: Z = N / (N + K).

K is the number of observations needed for 50% credibility.

Estimate of the future = (Z) (Observation) + (1 - Z) (Prior Mean).


In the use of credibility, the estimate is always between the a priori estimate and the observation.
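A tiny numerical illustration (the numbers are hypothetical, not from the text): with K = 8, the credibility factor Z = N/(N + K) reaches 50% at N = K and keeps rising, but never reaches one.

    K = 8.0
    for N in (1.0, K, 4 * K, 100 * K):
        print(N, N / (N + K))   # Z = 0.111, 0.500, 0.800, 0.990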


The Buhlmann Credibility estimate is a linear function of the observation.


If exposures do not vary, the estimates that result from Buhlmann Credibility are in balance:
The weighted average of the Buhlmann Credibility estimates over the possible observations for a
given situation, using as weights the a priori chances of the possible observations, is equal to the a
priori mean.
Linear Regression (Section 11):
The line formed by the Buhlmann Credibility estimates is the weighted least squares line to the
Bayesian estimates, with the a priori probability of each outcome acting as the weights. Buhlmann
Credibility is the Least Squares approximation to the Bayesian Estimates. The slope of
the weighted least squares line to the Bayesian Estimates is the Buhlmann Credibility.
Classification and Experience Rating (Sections 14 and 15):
The more homogeneous the classes, the more credibility is assigned to the class data and the less
to the overall average, when determining classification rates. The more homogeneous the classes,
the less credibility assigned an individual's data and the more to the average for the class, when
performing experience rating (individual risk rating.) The credibility is a relative measure of the
value of the information contained in the observation of the individual versus the information
in the class average.
Loss Functions (Section 16):
Bayes Analysis using the Squared-error Loss Function, just means do what we usually
do, get the posterior mean of the quantity of interest.
Error or Loss Function                                       Name             Bayesian Point Estimator

(estimate - true value)²                                     Squared-error    Mean

0 if estimate = true value, 1 if estimate ≠ true value       Zero-one         Mode

|estimate - true value|                                      Absolute-error   Median

(1-p)|estimate - true value|, if estimate ≥ true value (overestimate)          pth percentile
(p)|estimate - true value|, if estimate ≤ true value (underestimate)


Least Squares Credibility (Section 17):


Buhlmann Credibility is the linear estimator which minimizes the expected squared error measured
with respect to either the future observation, the hypothetical mean, or the Bayesian Estimate.
The expected value of the squared error as a function of the weight applied to the observation is a
parabola.
Buhlmann Covariance Structure: COV[Xi, Xj] = EPV δij + VHM.
The Buhlmann-Straub Model:
For a given policyholder its data (frequency, severity, or pure premium) in different years, Xi, are
independent. In year i, the policy has exposures mi, some measure of size.
μ(θ) = E[Xi | θ].  Var[Xi | θ] = v(θ)/mi.
Buhlmann-Straub Covariance Structure: COV[Xi, Xj] = (EPV / Size) δij + VHM.

Normal Equations (Section 18):


If there are equal exposures each year, and the EPV and VHM are only functions of the size of
insured, then the Normal Equations produce the same result as Z = N/(N + K), provided N is the
number of years, and K is computed based on the number of exposures in each year.
For the Buhlmann Covariance Structure, the Normal Equations ⇒ Z = N / (N + K).
For the Buhlmann-Straub Covariance Structure, the Normal Equations ⇒ Z = m / (m + K).
If one assumes a covariance structure different than the Buhlmann-Straub, and exposures vary by
year, then the equation Z = m/(m + K) does not hold. In these situations, one can solve the set of
linear equations, called the normal equations, for the amount of credibility to be assigned to each
year of data.
Assume linear estimators of the form: a0 + ΣZiXi.
Variance-Covariance Matrix: Cij.
Then the least squares estimator satisfies the equations:
The unbiasedness equation: E[X] = a0 + ΣZiE[Xi].
Σ_{i=1}^{N} Zi Cij = C_{j, N+1}, for j = 1, 2, 3, ..., N.

Mahler's Guide to

Conjugate Priors
Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-10


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching


Mahler's Guide to Conjugate Priors


Copyright 2013 by Howard C. Mahler.
Information in bold or sections whose title is in bold are more important for passing the exam.
Larger bold type indicates it is extremely important.
Information presented in italics should not be needed to directly answer exam questions and should
be skipped on first reading. It is provided to aid the reader's overall understanding of the subject,
and to be useful in practical applications.
Solutions to the problems in each section are at the end of that section.1

Section #    Pages      Section Name
1            3          Introduction
2            4-60       Mixing Poissons
3            61-67      Gamma Function and Distribution
4            68-152     Gamma-Poisson
5            153-159    Beta Distribution
6            160-203    Beta-Bernoulli
7            204-231    Beta-Binomial
8            232-237    Inverse Gamma Distribution
9            238-265    Inverse Gamma-Exponential
10           266-287    Normal-Normal
11           288-303    Linear Exponential Families
12           304-343    Overview of Conjugate Priors
13           344-353    Important Formulas and Ideas

Note that problems include both some written by me and some from past exams. The latter are copyright by the
CAS and SOA and are reproduced here solely to aid students in studying for exams. In some cases I've rewritten
these questions in order to match the notation in the current Syllabus. In some cases the material covered is
preliminary to the current Syllabus; you will be assumed to know it in order to answer exam questions, but it will not
be specifically tested. The solutions and comments are solely the responsibility of the author; the CAS and SOA
bear no responsibility for their accuracy. While some of the comments may seem critical of certain questions, this is
intended solely to aid you in studying and in no way is intended as a criticism of the many volunteers who work
extremely long and hard to produce quality exams.


Course 4 Exam Questions by Section of this Study Aid2

[Chart: past exam questions covered in this study aid, arranged by section and by exam date, for the
exams 5/00, 11/00, 5/01, 11/01, 11/02, 11/03, 11/04, 5/05, 11/05, 11/06, and 5/07.]

The CAS/SOA did not release the 5/02, 5/03, 5/04, 5/06, 11/07 and subsequent exams.

2 Excluding any questions that are no longer on the syllabus.


Section 1, Introduction
Bayesian Analysis and Buhlmann Credibility will be applied to particular situations that commonly
occur.
First will be discussed situations in which each insured has a Poisson frequency, but with different
means. Mixing Poissons can involve either discrete risk types or a continuous distribution of λ. This
section serves as a useful review of how to apply Bayesian Analysis and Buhlmann Credibility, as
well as a way to prepare for the important Gamma-Poisson conjugate prior situation.
If the prior and posterior distributions have the same type of distribution, then the prior distribution is
called a conjugate prior. For example, the Gamma Distribution is a conjugate prior to the Poisson
Distribution. This Gamma-Poisson frequency process, in which each insured is Poisson and has a
Gamma Distribution, is very important to learn well.
This Study Aid will review in detail four common conjugate prior situations, in decreasing order of
importance: Gamma-Poisson, Beta-Bernoulli, Inverse Gamma-Exponential, and Normal-Normal.
In all four of these cases, the estimates from Bayesian Analysis and Buhlmann Credibility are equal.
These situations are examples of what Loss Models refers to as exact credibility.
Also, some important results for linear exponential families will be discussed in Section 10.


Section 2, Mixing Poissons


This section presents a simple frequency example as a precursor to the important Gamma-Poisson
frequency process. Most of the important features of the Gamma-Poisson are present in the
example in this section. Study this example closely and then go back and forth between it and the
Gamma-Poisson. Even those who know the Gamma-Poisson very well, should find this a useful
example of Bayesian Analysis and Buhlmann Credibility.
Prior Distribution:3
Assume there are four types of risks or insureds, all with claim frequency given by a Poisson
distribution:
Type         A Priori Probability    Average Annual Claim Frequency (Poisson Parameter)
Excellent    40%                     1
Good         30%                     2
Bad          20%                     3
Ugly         10%                     4
These four different Poisson distributions are shown below, through eight claims:

The first portion of this example is in Mahler's Guide to Frequency Distributions.


However, here we introduce observations and then apply Bayes Analysis and Buhlmann Credibility.


For a Poisson Distribution with parameter λ the chance of having n claims is given by:
f(n) = λ^n e^(−λ) / n!. So for example for an Ugly risk with λ = 4, the chance of n claims is:
4^n e^(−4) / n!. For an Ugly risk the chance of 6 claims is: 4^6 e^(−4) / 6! = 10.4%.
Similarly the chances of 6 claims for Excellent, Good, or Bad risks are: 0.05%, 1.20%, and 5.04%,
respectively.
Marginal Distribution (Prior Mixed Distribution):
If we have a risk but do not know what type it is, we weight together the 4 different chances of
having 6 claims, using the a priori probabilities of each type of risk in order to get the chance of
having 6 claims: (0.4)(0.05%) + (0.3)(1.20%) + (0.2)(5.04%) + (0.1)(10.42%) = 2.43%.
The table below displays similar values for other numbers of claims. The probabilities in the final
column represent the marginal distribution (also referred to as the prior mixed distribution), which is
the assumed distribution of the number of claims for the entire portfolio of risks, prior to any
observations.4
Number of    Probability for    Probability for    Probability for    Probability for    Probability for
Claims       Excellent Risks    Good Risks         Bad Risks          Ugly Risks         All Risks
0            36.79%             13.53%             4.98%              1.83%              19.95%
1            36.79%             27.07%             14.94%             7.33%              26.56%
2            18.39%             27.07%             22.40%             14.65%             21.42%
3            6.13%              18.04%             22.40%             19.54%             14.30%
4            1.53%              9.02%              16.80%             19.54%             8.63%
5            0.31%              3.61%              10.08%             15.63%             4.78%
6            0.05%              1.20%              5.04%              10.42%             2.43%
7            0.01%              0.34%              2.16%              5.95%              1.13%
8            0.00%              0.09%              0.81%              2.98%              0.49%
9            0.00%              0.02%              0.27%              1.32%              0.19%
10           0.00%              0.00%              0.08%              0.53%              0.07%
11           0.00%              0.00%              0.02%              0.19%              0.02%
12           0.00%              0.00%              0.01%              0.06%              0.01%
13           0.00%              0.00%              0.00%              0.02%              0.00%
14           0.00%              0.00%              0.00%              0.01%              0.00%
SUM          100.00%            100.00%            100.00%            100.00%            100.00%

Prior Mean:
Note that the overall (a priori) mean can be computed in either one of two ways.
First one can weight together the means for each type of risks, using the a priori probabilities:
(0.4)(1) + (0.3)(2) + (0.2)(3) + (0.1)(4) = 2. Alternately, one can compute the mean of the marginal
distribution: (0)(0.1995) + (1)(0.2656) + (2)(0.2142) + ... = 2.
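The mixing arithmetic is easy to reproduce; here is a short Python sketch (an illustration, not part of the original text) that rebuilds the marginal distribution and the a priori mean from the four Poisson types:

    from math import exp, factorial

    types = [(0.4, 1), (0.3, 2), (0.2, 3), (0.1, 4)]   # (a priori probability, Poisson mean)

    def poisson_pmf(n, lam):
        return lam ** n * exp(-lam) / factorial(n)

    marginal = [sum(p * poisson_pmf(n, lam) for p, lam in types) for n in range(15)]
    print(round(marginal[6], 4))                                    # about 0.0243, the chance of 6 claims
    print(round(sum(n * pr for n, pr in enumerate(marginal)), 3))   # very close to the a priori mean of 2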
4

While the marginal distribution is easily computed by weighting together the four Poisson distributions, in this case
it is not itself a Poisson nor another well known distribution.


Prior Total Variance:

Number of Claims    Probability for All Risks    Square of Number of Claims
0                   0.1995                       0
1                   0.2656                       1
2                   0.2142                       4
3                   0.1430                       9
4                   0.0863                       16
5                   0.0478                       25
6                   0.0243                       36
7                   0.0113                       49
8                   0.0049                       64
9                   0.0019                       81
10                  0.0007                       100
11                  0.0002                       121
12                  0.0001                       144
13                  0.0000                       169
14                  0.0000                       196

Mean number of claims = 2.000. Mean square of the number of claims = 7.000.

Variances can be computed. The total variance is the variance of the claim distribution for the entire
portfolio. As seen above, the total variance = 7 - 2² = 3.
Prior Expected Value of the Process Variance:
The process variance for an individual risk is its Poisson parameter λ, since the frequency for each risk
is Poisson. Therefore, the expected value of the process variance = the expected value of λ
= the a priori mean frequency = 2.
Prior Variance of the Hypothetical Means:
The variance of the hypothetical means is computed as follows:

Type of Risk    A Priori Probability    Mean    Mean Squared
Excellent       0.4                     1       1
Good            0.3                     2       4
Bad             0.2                     3       9
Ugly            0.1                     4       16
Overall                                 2.00    5.00

Variance of the Hypothetical Means = 5 - 2² = 1.

The Expected Value of the Process Variance + Variance of the Hypothetical Means =
2 + 1 = 3 = Total Variance.
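Continuing the illustrative sketch above, the decomposition of the total variance checks out in a few lines of Python:

    types = [(0.4, 1), (0.3, 2), (0.2, 3), (0.1, 4)]        # (a priori probability, Poisson mean)
    epv = sum(p * lam for p, lam in types)                   # the process variance of a Poisson is its mean
    vhm = sum(p * lam ** 2 for p, lam in types) - epv ** 2   # here the overall mean equals the EPV
    print(epv, vhm, epv + vhm)                               # 2.0, 1.0, 3.0 = total variance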


Observations:
Let us now introduce the concept of observations. A risk is selected at random and it is observed to
have 5 claims in one year.
Posterior Distribution:
We can employ Bayesian analysis to compute what the chances are that the selected risk was of
each type:
Type of Risk    A Priori        Chance of      Probability Weight =        Posterior Probability =
                Probability     Observation    (A Priori Prob.)(Chance)    (Weight) / (Sum of Weights)
Excellent       0.4             0.0031         0.00124                     2.59%
Good            0.3             0.0361         0.01083                     22.63%
Bad             0.2             0.1008         0.02016                     42.12%
Ugly            0.1             0.1563         0.01563                     32.66%
SUM                                            0.04786                     100.00%

While the posterior probabilities can be calculated in a straightforward manner, they do not come
from some named well-known distribution.
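The Bayes update is mechanical; the following Python lines (illustrative only) reproduce the posterior probabilities in the table:

    from math import exp, factorial

    types = [("Excellent", 0.4, 1), ("Good", 0.3, 2), ("Bad", 0.2, 3), ("Ugly", 0.1, 4)]
    weights = [p * lam ** 5 * exp(-lam) / factorial(5) for _, p, lam in types]   # 5 claims observed
    total = sum(weights)
    for (name, _, _), w in zip(types, weights):
        print(name, round(w / total, 4))   # about 0.026, 0.226, 0.421, 0.327, matching the table up to rounding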
Predictive Distribution:
Using these posterior probabilities, one can compute the predictive distribution; i.e., the posterior
analog of the marginal distribution:
Number of    Probability for    Probability for    Probability for    Probability for    Probability for
Claims       Excellent Risks    Good Risks         Bad Risks          Ugly Risks         All Risks
0            36.79%             13.53%             4.98%              1.83%              6.71%
1            36.79%             27.07%             14.94%             7.33%              15.76%
2            18.39%             27.07%             22.40%             14.65%             20.82%
3            6.13%              18.04%             22.40%             19.54%             20.06%
4            1.53%              9.02%              16.80%             19.54%             15.54%
5            0.31%              3.61%              10.08%             15.63%             10.18%
6            0.05%              1.20%              5.04%              10.42%             5.80%
7            0.01%              0.34%              2.16%              5.95%              2.93%
8            0.00%              0.09%              0.81%              2.98%              1.33%
9            0.00%              0.02%              0.27%              1.32%              0.55%
10           0.00%              0.00%              0.08%              0.53%              0.21%
11           0.00%              0.00%              0.02%              0.19%              0.07%
12           0.00%              0.00%              0.01%              0.06%              0.02%
13           0.00%              0.00%              0.00%              0.02%              0.01%
14           0.00%              0.00%              0.00%              0.01%              0.00%
SUM          100.00%            100.00%            100.00%            100.00%            100.00%


For example, in the year subsequent to our observation, for this same risk the chance of having 6
claims is: (0.0259)(.05%) + (0.2263)(1.20%) + (0.4212)(5.04%) + (0.3266)(10.42%) = 5.80%.
After having observed 5 claims in a year our new estimate of the chance of that same risk having 6
claims the next year is 5.8% rather than only 2.4% as it was prior to any observation.
Below are displayed both the posterior predictive distribution (squares) and the prior marginal
distribution (triangles), through 8 claims:


Posterior Mean:
One can compute the means and variances posterior to the observations. The posterior mean can
be computed either by weighting together the means of the different types of risks using the
posterior weights or by by computing the mean of the predictive distribution. The former gives
(2.59%)(1)+ (22.63%)(2) + (42.12%)(3) + (32.66% )(4) = 3.05. Alternately, the mean of the
predictive distribution is: (0.0671)(0) + (0.1576)(1) + (0.2082) (2) + ... = 3.05.
Thus the new estimate posterior to the observations for this risk using Bayesian Analysis is 3.05.
This compares to the a priori estimate of 2. In general, the observations provide information about
the given risk, which allows one to make a better estimate of the future experience of that risk.
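The predictive distribution and its mean can be checked the same way (again an illustrative sketch, reusing the posterior probabilities just computed):

    from math import exp, factorial

    posterior = [(0.0259, 1), (0.2263, 2), (0.4212, 3), (0.3266, 4)]   # (posterior probability, lambda)

    def poisson_pmf(n, lam):
        return lam ** n * exp(-lam) / factorial(n)

    predictive = [sum(p * poisson_pmf(n, lam) for p, lam in posterior) for n in range(15)]
    print(round(predictive[6], 4))                                    # about 0.058, the chance of 6 claims next year
    print(round(sum(n * pr for n, pr in enumerate(predictive)), 2))   # about 3.05, the posterior mean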


Mixing Poissons, Bayesian Analysis:

Prior Distribution of λ   → (mixing) →   Marginal Distribution (Number of Claims)
          ↓ Observations
Posterior Distribution of λ   → (mixing) →   Predictive Distribution (Number of Claims)
Posterior Expected Value of the Process Variance:
Just as prior to the observations, posterior to the observations one can compute three variances:
the expected value of the process variance, the variance of the hypothetical means, and the
total variance. The process variance for an individual risk is its Poisson parameter λ, since the
frequency for each risk is Poisson. Therefore the expected value of the process variance =
the expected value of λ = the posterior mean frequency = 3.05.
Posterior Variance of the Hypothetical Means:
The variance of the hypothetical means is computed as follows:

Type of Risk    Posterior Probability    Mean    Mean Squared
Excellent       0.0259                   1       1
Good            0.2263                   2       4
Bad             0.4212                   3       9
Ugly            0.3266                   4       16
Overall                                  3.05    9.95

Variance of the Hypothetical Means = 9.95 - 3.05² = 0.65.


Note how after the observation the variance of the hypothetical means is less than prior, since the
observations have allowed us to narrow down the possibilities.


Posterior Total Variance:

The EPV + VHM = 3.05 + 0.65 = 3.70 = Total Variance. Calculating the total variance directly from
the predictive distribution as shown below, Total Variance = 12.99 - 3.05² = 3.69, which matches
the sum of the expected value of the process variance and the variance of the hypothetical means,
except for rounding.

Number of Claims    Posterior Predictive Distribution    Square of Number of Claims
0                   0.0671                               0
1                   0.1576                               1
2                   0.2082                               4
3                   0.2006                               9
4                   0.1554                               16
5                   0.1018                               25
6                   0.0580                               36
7                   0.0293                               49
8                   0.0133                               64
9                   0.0055                               81
10                  0.0021                               100
11                  0.0007                               121
12                  0.0002                               144
13                  0.0001                               169
14                  0.0000                               196

Mean number of claims = 3.05. Mean square of the number of claims = 12.99.

Buhlmann Credibility:
Next, let's apply Buhlmann Credibility to this example. The Buhlmann Credibility parameter K = the
expected value of the process variance / variance of the hypothetical means = 2 / 1 = 2. Note that K
can be computed prior to any observation and doesn't depend on the observations. Having observed 5 claims
in one year, Z = 1 / (1 + 2) = 1/3. The observation = 5. The a priori mean = 2. Therefore, the new
estimate = (1/3)(5) + (1 - 1/3)(2) = 3. Note that in this case the estimate from Buhlmann Credibility
does not match the estimate from Bayesian Analysis.
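In code the whole calculation is one line of arithmetic (illustrative):

    EPV, VHM, prior_mean, observation = 2.0, 1.0, 2.0, 5.0
    K = EPV / VHM                                    # 2
    Z = 1.0 / (1.0 + K)                              # one year of data
    print(Z * observation + (1 - Z) * prior_mean)    # 3.0, versus 3.05 from Bayesian Analysis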
Mixing Poissons, Buhlmann Credibility:
EPV = Overall A Priori Mean = E[λ] = Mean of Prior Distribution of λ
= Mean of Marginal Distribution.
VHM = Var[λ] = Variance of Prior Distribution of λ.
Total Variance = Variance of Marginal Distribution = EPV + VHM =
Mean of Prior Distribution + Variance of Prior Distribution.5
5

Variance of the Marginal Distribution = Mean of the Marginal Distribution + Variance of Prior Distribution. Therefore,
Variance of the Marginal Distribution > Mean of the Marginal Distribution. See Equation 6.45 in Loss Models.


Extending the Example:


There is nothing unique about assuming four types of risks. If one had assumed for example 100
different types of risks, with mean frequencies from 0.1 to 10, then there would have been no
change in the conceptual complexity of the situation, although the computational complexity would
have been increased.
Assume that a priori the probability of each of these 100 types of risks with mean frequency λ was
given approximately by:
(10)(1.5³) λ² e^(−1.5λ) / Γ(3) = 16.875 λ² e^(−1.5λ),    λ = 0.1, 0.2, 0.3, ..., 9.9, 10.
(This is 10 times6 a Gamma density with α = 3 and θ = 2/3. Recall that Γ(3) = 2! = 2.) Note that the
Gamma distribution isn't being used here as a size of loss distribution. There has been no mention
of claim severity; only claim frequency has been dealt with here.
One would compute the overall mean frequency by taking the sum from λ = 0.1 to λ = 10 of the
product of the Gamma density times λ times Δλ: Σ (1.6875 λ³ e^(−1.5λ))(0.1) ≈ the
mean of a Gamma distribution with α = 3 and θ = 2/3.
Thus the overall mean is approximately: (3)(2/3) = 2.
A further extension of this discrete example to a continuous case would give the Gamma-Poisson
situation discussed in a subsequent section. The same type of questions as were asked in this
example can be asked about the Gamma-Poisson situation. Due to the mathematical properties of
the Gamma and Poisson there are some specific relationships in the case of the Gamma-Poisson in
addition to those in this example.

One multiplies by 10 in order to compensate for having selected Δλ = 1/10.
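The discretized Gamma just described can be checked numerically; the sketch below (illustrative, NumPy assumed) normalizes the 100 discrete probabilities and confirms that the overall mean is close to 2:

    import numpy as np

    lams = np.arange(0.1, 10.0 + 1e-9, 0.1)                      # the 100 values of lambda
    density = (1.5 ** 3) * lams ** 2 * np.exp(-1.5 * lams) / 2   # Gamma density with alpha = 3, theta = 2/3
    probs = density * 0.1                                        # density times the spacing of 0.1
    probs = probs / probs.sum()                                  # normalize; the sum is already very close to 1
    print(round(float((probs * lams).sum()), 3))                 # close to alpha * theta = 2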


A General Result for Mixing Poissons:


Assume there is a mixture of different individuals each of which has frequency given by a Poisson,
but with different means. For each individual risk the Process Variance is the mean, since this is the
case for the Poisson. Therefore the Expected Value of the Process Variance is the (a priori) overall
mean frequency. The total variance is the sum of the Expected Value of the Process Variance plus
the Variance of the Hypothetical Means.
Thus one could estimate the EPV and VHM as:7
EPV = estimated mean.
VHM = estimated total variance - EPV.
Therefore, the Buhlmann Credibility parameter K = EPV / VHM =
A Priori Mean / (A Priori Total Variance - A Priori Mean). Therefore:
Buhlmann Credibility Parameter = A Priori Mean / Excess Variance.
In the example, K = 2 / (3 - 2) = 2, which matches the result above. The denominator is the extra
variance beyond the Poisson, that is introduced by the mixing of individual risks with different
expected means. Note that the credibility assigned to one observation is:
Z = 1/(1+K) = (A Priori Total Variance - A Priori Mean) / A Priori Total Variance =
1 - (A Priori Mean / A Priori Total Variance).
In general, when mixing Poissons with means λ via a distribution g(λ), the Expected Value of the
Process Variance is the mean of g, and the Variance of the Hypothetical Means is the variance of g.
K = E[λ]/Var[λ]. This general result for mixing Poissons is the idea behind semiparametric estimation
with Poissons, as discussed in Mahler's Guide to Semiparametric Estimation.
In the important special case where g is a Gamma Distribution, one gets the Gamma-Poisson
frequency process, with the EPV = mean of the Gamma = αθ,
while the VHM = variance of the Gamma = αθ²,
so that K = (αθ) / (αθ²) = 1/θ, the inverse of the scale parameter of the Gamma.

See Mahler's Guide to Semiparametric Estimation.
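As a sketch of this idea (illustrative only; the claim counts below are made up), one can estimate K directly from one year of data on a portfolio of insureds, each assumed to be Poisson:

    claims = [0, 1, 0, 2, 0, 0, 1, 3, 0, 1, 0, 0, 2, 1, 0]      # hypothetical one-year claim counts
    n = len(claims)
    mean = sum(claims) / n
    total_var = sum((x - mean) ** 2 for x in claims) / n        # biased estimator, kept simple for illustration
    epv = mean                                                  # each insured is Poisson, so EPV = overall mean
    vhm = total_var - epv                                       # the excess variance beyond the Poisson
    K = epv / vhm if vhm > 0 else float("inf")
    print(round(mean, 3), round(total_var, 3), round(K, 2))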

2013-4-10,

Conjugate Priors 2 Mixing Poissons,

HCM 10/21/12,

Problems:
Use the following information for the next 5 questions:
Each insured's claim frequency follows a Poisson process.
There are three types of insureds as follows:

Type    A Priori Probability    Mean Annual Claim Frequency (Poisson Parameter)
A       60%                     1
B       30%                     2
C       10%                     3

2.1 (1 point) What is the chance of a single individual having 4 claims in a year?
A. 4.7%    B. 4.9%    C. 5.1%    D. 5.3%    E. 5.5%

2.2 (3 points) You observe 4 claims by an individual in a single year.


Use Buhlmann Credibility to predict that individual's future claim frequency.
A. less than 1.7
B. at least 1.7 but less than 1.8
C. at least 1.8 but less than 1.9
D. at least 1.9 but less than 2.0
E. at least 2.0
2.3 (3 points) You observe 4 claims by an individual in a single year.
Use Bayesian Analysis to predict that individual's future claim frequency.
A. less than 2.0
B. at least 2.0 but less than 2.1
C. at least 2.1 but less than 2.2
D. at least 2.2 but less than 2.3
E. at least 2.3
2.4 (2 points) You observe 37 claims by an individual in ten years.
Use Buhlmann Credibility to predict that individual's future claim frequency.
A. less than 2.9
B. at least 2.9 but less than 3.1
C. at least 3.1 but less than 3.3
D. at least 3.3 but less than 3.5
E. at least 3.5
2.5 (3 points) You observe 37 claims by an individual in ten years.
Use Bayesian Analysis to predict that individual's future claim frequency.
A. 2.8
B. 3.0
C. 3.2
D. 3.4
E. 3.6


Use the following information for the next four questions:


Each insured has its accident frequency given by a Poisson Process with mean λ.
For a portfolio of insureds, λ is distributed uniformly on the interval from 3 to 7.
2.6 (1 point) What is the Expected Value of the Process Variance?
A. 1 B. 2 C. 3 D. 4 E. 5
2.7 (2 points) What is the Variance of the Hypothetical Means?
A. less than 1.0
B. at least 1.0 but less than 1.5
C. at least 1.5 but less than 2.0
D. at least 2.0 but less than 2.5
E. at least 2.5
2.8 (2 points) An individual insured from this portfolio is observed to have 7 accidents in a single
year.
Use Buhlmann Credibility to estimate the future accident frequency of that insured.
A. less than 5.5
B. at least 5.5 but less than 5.7
C. at least 5.7 but less than 5.9
D. at least 5.9 but less than 6.1
E. at least 6.1
2.9 (4 points) An individual insured from this portfolio is observed to have 7 accidents in a single
year. Use Bayesian Analysis to estimate the future accident frequency of that insured.
A. 7 [Γ(8; 7) − Γ(8; 3)] / [Γ(7; 7) − Γ(7; 3)]
B. 8 [Γ(8; 7) − Γ(8; 3)] / [Γ(7; 7) − Γ(7; 3)]
C. 7 [Γ(9; 7) − Γ(9; 3)] / [Γ(8; 7) − Γ(8; 3)]
D. 8 [Γ(9; 7) − Γ(9; 3)] / [Γ(8; 7) − Γ(8; 3)]
E. None of the above.

2.10 (2 points) Each insured has its accident frequency given by a Poisson distribution with mean λ.
Over a portfolio of insureds, λ is distributed via g(λ).
g has a mean of 0.09 and a variance of 0.003.
For an insured, two claims are observed in three years.
Use Buhlmann Credibility to estimate the future claim frequency of this insured.
A. 10%
B. 11%
C. 12%
D. 13%
E. 14%


Use the following information for the next 8 questions:


(i) An individual automobile insured has annual claim frequencies that follow a Poisson distribution
with mean λ.
(ii) An actuary's prior distribution for the parameter λ has probability density function:
π(λ) = (0.7)(20e^(−20λ)) + (0.3)(10e^(−10λ)).
2.11 (1 point) What is the a priori expected annual frequency?
(A) 5.0%
(B) 5.5%
(C) 6.0%
(D) 6.5%
(E) 7.0%
2.12 (1 point) For an insured picked at random, what is the probability that λ > 10%?
A. less than 16%
B. at least 16% but less than 18%
C. at least 18% but less than 20%
D. at least 20% but less than 22%
E. at least 22%
2.13 (3 points) If in the first policy year, no claims were observed for an insured, what is the
probability that for this insured, λ > 10%?
A. 14%

B. 16%

C. 18%

D. 20%

E. 22%

2.14 (3 points) If in the first policy year, no claims were observed for an insured,
determine the expected number of claims in the second policy year.
(A) 4.0%
(B) 4.5%
(C) 5.0%
(D) 5.5%
(E) 6.0%
2.15 (3 points) If in the first policy year, one claim was observed for an insured,
determine the expected number of claims in the second policy year.
A. 11%
B. 13%
C. 15%
D. 17%
E. 19%
2.16 (3 points) If in the first policy year, two claims were observed for an insured,
determine the expected number of claims in the second policy year.
(A) 19%
(B) 20%
(C) 21%
(D) 22%
(E) 23%
2.17 (3 points) Determine the Buhlmann credibility parameter, K.
(A) 10
(B) 12
(C) 14
(D) 16
(E) 18
2.18 (1 point) If in the first policy year, two claims were observed for an insured,
using Buhlmann Credibility, what is the expected number of claims in the second policy year?
(A) 19%
(B) 20%
(C) 21%
(D)22%
(E) 23%


Use the following information for the next seven questions:


Each insured has its accident frequency given by a Poisson Process with mean λ.
For a portfolio of insureds, λ is distributed as follows on the interval from a to b:
f(λ) = (d + 1) λ^d / (b^(d+1) − a^(d+1)),    0 ≤ a ≤ λ ≤ b.
You may use the following values of the Incomplete Gamma Function Γ(α; x):

x       α = 2.5    α = 3.5    α = 4.5
1.2     0.209      0.066      0.017
2.4     0.559      0.316      0.149
3.6     0.794      0.592      0.384

You may also use the following values of the Complete Gamma Function:
Γ(2.5) = 1.329, Γ(3.5) = 3.323, Γ(4.5) = 11.632
2.19 (1 point) What is the expected mean value of λ, prior to any observations?
A. (1 + 1/d) (b^d − a^d) / (b^(d+1) − a^(d+1))
B. (d + 1) (b^(d+2) − a^(d+2)) / (b^(d+1) − a^(d+1))
C. [(d + 1)/(d + 2)] (b^(d+2) − a^(d+2)) / (b^(d+1) − a^(d+1))
D. (b^d − a^d) / (b^(d+1) − a^(d+1))
E. (d + 1) / (b − a)

2.20 (2 points) An insured is randomly selected from the portfolio and we observe C claims in Y
years. Which of the following is the posterior distribution of λ for this insured?
A. Y^(d+C) λ^(d+C) e^(−λY) / {[Γ(d+C; bY) − Γ(d+C; aY)] Γ(d+C)}
B. Y^(d+C) λ^(d+C+1) e^(−λY) / [Γ(d+C; bY) − Γ(d+C; aY)]
C. Y^(d+C+1) λ^(d+C+1) e^(−λY) / [Γ(d+C+1; bY) − Γ(d+C+1; aY)]
D. Y^(d+C+1) λ^(d+C+1) e^(−λY) / [Γ(d+C; bY) − Γ(d+C; aY)]
E. Y^(d+C+1) λ^(d+C) e^(−λY) / {[Γ(d+C+1; bY) − Γ(d+C+1; aY)] Γ(d+C+1)}


2.21 (2 points) An insured is randomly selected from the portfolio and we observe C claims in Y
years. Which of the following is the Bayesian Analysis estimate of λ for this insured?
A. [(d+C)/Y] [Γ(d+C; bY) − Γ(d+C; aY)] / [Γ(d+C+1; bY) − Γ(d+C+1; aY)]
B. [(d+C)/Y] [Γ(d+C+1; bY) − Γ(d+C+1; aY)] / [Γ(d+C; bY) − Γ(d+C; aY)]
C. [(d+C)/Y] [Γ(d+C+2; bY) − Γ(d+C+2; aY)] / [Γ(d+C; bY) − Γ(d+C; aY)]
D. [(d+C+1)/Y] [Γ(d+C; bY) − Γ(d+C; aY)] / [Γ(d+C+1; bY) − Γ(d+C+1; aY)]
E. [(d+C+1)/Y] [Γ(d+C+2; bY) − Γ(d+C+2; aY)] / [Γ(d+C+1; bY) − Γ(d+C+1; aY)]

2.22 (1 point) If the parameter d = -1/2, and if a = 0.2 and b = 0.6, what is the expected mean value
of λ, prior to any observations?
A. 0.34

B. 0.35

C. 0.36

D. 0.37

E. 0.38

2.23 (2 points) The parameter d = -1/2, and a = 0.2 and b = 0.6.
An individual insured from this portfolio is observed to have 2 claims over 6 years.
Which of the following, with support from 0.2 to 0.6, is the posterior distribution of λ for this insured?
A. 113.4 λ^1.5 e^(−6λ)    B. 113.4 λ^1.5 e^(−7λ)    C. 113.4 λ^2.5 e^(−6λ)
D. 113.4 λ^2.5 e^(−7λ)    E. None of the above

2.24 (2 points) The parameter d = -1/2, and a = 0.2 and b = 0.6.
An individual insured from this portfolio is observed to have 2 claims over 6 years.
Which of the following is the Bayesian Analysis estimate of λ for this insured?
A. less than 0.35
B. at least 0.35 but less than 0.36
C. at least 0.36 but less than 0.37
D. at least 0.37 but less than 0.38
E. at least 0.38
2.25 (2 points) The parameter d = -1, and a = 0 and b = ∞.
An individual insured from this portfolio is observed to have 2 claims over 6 years.
Which of the following is the Bayesian Analysis estimate of λ for this insured?
A. less than 0.35
B. at least 0.35 but less than 0.36
C. at least 0.36 but less than 0.37
D. at least 0.37 but less than 0.38
E. at least 0.38


Use the following information for the next 7 questions:


A group of drivers have their expected annual claim frequency uniformly distributed over the interval
2% to 8%.
Each driver's observed number of claims per year follows a Poisson distribution.
2.26 (3 points) A particular driver from this group is observed to have two claims over the most
recent five year period. Using Buhlmann credibility, what is the estimate of this driver's future annual
claim frequency?
A. 6.0%
B. 6.2%
C. 6.4%
D. 6.6%
E. 6.8%
2.27 (2 points) A particular driver from this group is observed to have no claims over the most
recent five year period. Using Bayesian Analysis, what is the estimate of this driver's future annual
claim frequency?
A. less than 4.6%
B. at least 4.6% but less than 4.7%
C. at least 4.7% but less than 4.8%
D. at least 4.8% but less than 4.9%
E. at least 4.9%
2.28 (2 points) A particular driver from this group is observed to have no claims over the most
recent five year period.
What is the probability that this driver's annual Poisson parameter is less than 5%?
A. less than 53%
B. at least 53% but less than 55%
C. at least 55% but less than 57%
D. at least 57% but less than 59%
E. at least 59%
2.29 (3 points) A particular driver from this group is observed to have one claim over the most
recent five year period. Using Bayesian Analysis, what is the estimate of this driver's future annual
claim frequency?
A. 5.1%
B. 5.3%
C. 5.5%
D. 5.7%
E. 5.9%
2.30 (3 points) A particular driver from this group is observed to have one claim over the most
recent five year period. What is the probability that this driver's annual Poisson parameter is less
than 5%?
A. less than 33%
B. at least 33% but less than 35%
C. at least 35% but less than 37%
D. at least 37% but less than 39%
E. at least 39%


2.31 (2 points) A particular driver from this group is observed to have no claims over the most
recent five year period.
What is the probability that this driver has no claims the following year?
A. 93%
B. 94%
C. 95%
D. 96%
E. 97%
2.32 (2 points) A particular driver from this group is observed to have no claims over the most
recent five year period.
What is the probability that this driver has one claim the following year?
A. 4.6%
B. 4.7%
C. 4.8%
D. 4.9%
E. 5.0%

Use the following information for the next three questions:


(i)
The number of claims experienced in a given year by each insured follows
a Poisson distribution.
(ii)

The mean value of the Poisson distribution is distributed across the population
according to a Single Parameter Pareto Distribution with α = 3 and θ = 0.8.
(iii) λ is constant for each insured over time.
(iv) An insured is picked at random and has 4 claims in 7 years.
An insured is picked at random and has 4 claims in 7 years.

2.33 (3 points) What is the Buhlmann credibility estimate of the future expected annual claim
frequency for this particular insured?
A. 74%
B. 76%
C. 78%
D. 80%
E. 82%
2.34 (1 point) What is the a priori chance of observing 4 claims in 7 years?
A. 7%
B. 8%
C. 9%
D. 10%
E. 11%
2.35 (3 points) Use Bayes Theorem in order to estimate the future expected annual claim frequency
for this particular insured.
A. Less than 80%
B. At least 80%, but less than 85%
C. At least 85%, but less than 90%
D. At least 90%, but less than 95%
E. 95% or more


Use the following information for the next 6 questions:


(i) Claim counts for individual insureds follow a Poisson distribution.
(ii) Half of the insureds have expected annual claim frequency of 40%.
(iii) The other half of the insureds have expected annual claim frequency of 60%.
2.36 (2 points) A randomly selected insured has made 2 claims in each of the first two policy years.
Determine the Bayesian estimate of this insured's claim count in the next (third) policy
year.
(A) 0.554
(B) 0.556
(C) 0.558
(D) 0.560
(E) 0.562
2.37 (2 points) A randomly selected insured has made 1 claim in the first policy year and 3 claims in
the second policy year. Determine the Bayesian estimate of this insured's claim count in the next
(third) policy year.
(A) 0.554
(B) 0.556
(C) 0.558
(D) 0.560
(E) 0.562
2.38 (2 points) A randomly selected insured has made a total of 4 claims in the first two policy
years. Determine the Bayesian estimate of this insured's claim count in the next (third) policy year.
(A) 0.554
(B) 0.556
(C) 0.558
(D) 0.560
(E) 0.562
2.39 (3 points) A randomly selected insured had at most 2 claims in each of the first two policy
years. Determine the Bayesian estimate of this insured's claim count in the next (third) policy year.
(A) 0.490
(B) 0.492
(C) 0.494
(D) 0.496
(E) 0.498
2.40 (3 points) A randomly selected insured has made at most a total of 2 claims in the first two
policy years. Determine the Bayesian estimate of this insured's claim count in the next (third) policy
year.
(A) 0.490
(B) 0.492
(C) 0.494
(D) 0.496
(E) 0.498
2.41 (3 points) A randomly selected insured had at least 2 claims in each of the first two policy
years. Determine the Bayesian estimate of this insured's claim count in the next (third) policy year.
(A) 0.54
(B) 0.55
(C) 0.56
(D) 0.57
(E) 0.58

2.42 (3 points) You are given:


(i) The annual number of claims for an individual risk follows a Poisson distribution with mean λ.
(ii) For 80% of the risks, λ = 4.
(iii) For 20% of the risks, λ = 7.
A randomly selected risk had r claims in Year 1.
The Bayesian estimate of this risk's expected number of claims in Year 2 is 5.97. Determine r.
(A) 8
(B) 9
(C) 10
(D) 11
(E) 12


Use the following information for the next two questions:


(i) The conditional distribution of the number of claims per policyholder is Poisson with mean λ.
(ii) The variable λ has a Weibull distribution with parameters τ and θ.
(iii) A policyholder has 1 claim in Year 1, 2 claims in Year 2, and 3 claims in Year 3.
2.43 (3 points) Which of the following is equal to the mean of the posterior distribution of λ?
(A) ∫₀^∞ λ^(τ+6) exp[−{3λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+5) exp[−{3λ + (λ/θ)^τ}] dλ
(B) ∫₀^∞ λ^(τ+3) exp[−{6λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+3) exp[−{5λ + (λ/θ)^τ}] dλ
(C) ∫₀^∞ λ^(τ+3) exp[−{6λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+5) exp[−{3λ + (λ/θ)^τ}] dλ
(D) ∫₀^∞ λ^(τ+6) exp[−{3λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+3) exp[−{5λ + (λ/θ)^τ}] dλ
(E) None of A, B, C, or D
2.44 (3 points) Which of the following is equal to the probability of three claims in year 4 from this
policyholder?
(A) (1/6) ∫₀^∞ λ^(τ+8) exp[−{3λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+5) exp[−{3λ + (λ/θ)^τ}] dλ
(B) (1/6) ∫₀^∞ λ^(τ+8) exp[−{4λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+5) exp[−{3λ + (λ/θ)^τ}] dλ
(C) ∫₀^∞ λ^(τ+8) exp[−{3λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+5) exp[−{3λ + (λ/θ)^τ}] dλ
(D) ∫₀^∞ λ^(τ+8) exp[−{4λ + (λ/θ)^τ}] dλ / ∫₀^∞ λ^(τ+5) exp[−{3λ + (λ/θ)^τ}] dλ
(E) None of A, B, C, or D


2.45 (3 points) For an individual high-tech company, the number of Workers Compensation
Insurance claims per employee per year is Poisson with mean λ.
Over the high-tech industry, λ is uniformly distributed from 0.002 to 0.008.
You have the following experience for Initech, a high-tech company:

Year    Number of Employees    Number of Workers Compensation Insurance Claims
1       2000                   15
2       2400                   19
3       1600                   12

During year 4, Initech will have 1400 employees. Using Bühlmann-Straub Credibility, estimate the
number of Workers Compensation Insurance Claims Initech will have in year 4.
(A) 7
(B) 8
(C) 9
(D) 10
(E) 11
2.46 (3 points) You are given the following information for health insurance:
(i) Annual claim counts for individual insureds follow a Poisson distribution.
(ii) Three quarters of the insureds have expected annual claim frequency of 2.
(iii) The other quarter of the insureds have expected annual claim frequency of 4.
(iv) A particular insured has the following experience over 6 years:
Number of Claims    Number of Years
0 or 1              1
2 or 3              2
more than 3         3
Determine the Bayesian expected number of claims for the insured in the next year.
(A) 2.7
(B) 2.9
(C) 3.1
(D) 3.3
(E) 3.5
2.47 (3 points) The number of claims each year for an individual insured has a Poisson distribution
with parameter λ. The expected annual claim frequencies of the entire population of insureds are
distributed by: 3λ²/8 for 0 < λ < 2.
Chip Monk had 2 claims during the past 3 years.
Using Buhlmann credibility, what is the estimate of Chips future annual claim frequency?
A. 1.25
B. 1.27
C. 1.29
D. 1.31
E. 1.33


Use the following information for the next three questions:


(i) The number of claims incurred in a month by any employee of an insured follows
a Poisson distribution with mean λ.
(ii) For a given insured, λ is the same for all employees.
(iii) The numbers of claims for different employees are independent.
(iv) For a particular employer, you have the following experience:

Month    Number of Employees    Number of Claims
1        200                    16
2        210                    19
3        230                    21
4        270                    25

2.48 (2 points) The prior distribution of λ is: λ = 0.1 and λ = 0.2 equally likely.
Determine the Bühlmann-Straub credibility estimate of the number of claims in the next 12 months
for 400 employees.
(A) Less than 460
(B) At least 460, but less than 500
(C) At least 500, but less than 540
(D) At least 540, but less than 580
(E) At least 580
2.49 (2 points) The prior distribution of λ is: λ = 0.1 and λ = 0.2 equally likely.
Determine the Bayes Analysis estimate of the number of claims in the next 12 months for 400
employees.
(A) Less than 460
(B) At least 460, but less than 500
(C) At least 500, but less than 540
(D) At least 540, but less than 580
(E) At least 580
2.50 (2 points) Using classical credibility, you wish the estimated mean frequency to be within 10%
of its true value 95% of the time.
Similar employers have a mean frequency of 15% per employee per month.
Estimate of the number of claims in the next 12 months for 400 employees.
(A) Less than 460
(B) At least 460, but less than 500
(C) At least 500, but less than 540
(D) At least 540, but less than 580
(E) At least 580


2.51 (4, 5/84, Q.36) (2 points) There is a new brand of chocolate chip cookies. You buy a box of
50 of the cookies, and discover that there are a total of 265 chips. Your prior research has led you to
expect 4.80 chips per cookie, with a variance between brands of 0.20. You assume that for a given
brand the number of chips in each cookie varies randomly and is given by a Poisson distribution.
Using Buhlmann credibility, estimate the average number of chips per cookie for this new brand.
A. Less than 4.9
B. At least 4.9, but less than 5.0
C. At least 5.0, but less than 5.1
D. At least 5.1, but less than 5.2
E. 5.2 or more
2.52 (4, 5/86, Q.42) (3 points) A group of drivers have their expected annual claim frequency
uniformly distributed over the interval (0.10, 0.30). Each driver's observed number of claims per
year follows a Poisson distribution. A particular driver from this group is observed to have three
claims over the most recent five year period. Using Buhlmann credibility, what is the estimate of this
driver's future claim frequency?
A. Less than 0.21
B. At least 0.21, but less than 0.22
C. At least 0.22, but less than 0.23
D. At least 0.23, but less than 0.24
E. 0.24 or more.
2.53 (4, 5/88, Q.43) (2 points) The number of claims each year for an individual insured has a
Poisson distribution. The expected annual claim frequencies of the entire population of insureds are
uniformly distributed over the interval (0.0, 1.0). An individual's expected annual claim frequency is
constant through time. An insured is selected at random. The insured is then observed to have no
claims during a year. What is the posterior density function of the expected annual claim frequency
for this insured?
A. λ is uniformly distributed over (0, 1)
B. 3(1 − λ)² for 0 < λ < 1
C. e^(−λ) / (1 − e⁻¹) for 0 < λ < 1
D. e^(−λ) / (e − 1) for 0 < λ < 1
E. λ has a beta distribution


2.54 (4, 5/88, Q.44) (2 points) The number of claims each year for an individual insured has a
Poisson distribution. The expected annual claim frequencies of the entire population of insureds are
uniformly distributed over the interval (0.0, 1.0). An individual's expected annual claim frequency is
constant through time.
A particular insured had four claims during the prior three years.
Using Buhlmann credibility, what is the estimate of this insured's future annual claim frequency?
A. Less than 0.65
B. At least 0.65, but less than 0.70
C. At least 0.70, but less than 0.75
D. At least 0.75, but less than 0.80
E. 0.80 or more
2.55 (4, 5/89, Q.41) (3 points) Assume an individual insured is selected at random from a
population of insureds. The number of claims experienced in a given year by each insured follows a
Poisson distribution. The mean value of the Poisson distribution is distributed across the
population according to the following distribution:
f(λ) = 3λ⁻⁴ over the interval (1, ∞).
Given that a particular insured experienced a total of 20 claims in the previous 2 years, what is the
Buhlmann credibility estimate of the future expected annual claim frequency for this particular insured?
(Assume frequency is constant for each insured over time.)
A. Less than 2
B. At least 2, but less than 4
C. At least 4, but less than 6
D. At least 6, but less than 8
E. 8 or more
2.56 (4, 5/90, Q.41) (2 points) Assume that the number of claims made by an individual insured
follows a Poisson distribution. Assume also that the expected number of claims, λ, for insureds in
the population has the probability density function f(λ) = 4λ⁻⁵ for 1 < λ < ∞.
What is the value of K used in Buhlmann's credibility formula for estimating the expected number of
claims for an individual insured?
A. K < 5.7 B. 5.7 < K < 5.8 C. 5.8 < K < 5.9 D. 5.9 < K < 6.0 E. 6.0 < K


2.57 (4, 5/90, Q.52) (2 points) The number of claims each year for an individual insured has a
Poisson distribution. The expected annual claim frequency of the entire population of insureds is
uniformly distributed over the interval (0, 1). An individuals expected claim frequency is constant
through time. A particular insured had 3 claims during the prior three years. Using Buhlmann
credibility, what is the estimate of this insured's future annual claim frequency?
A. Less than 0.60
B. At least 0.60 but less than 0.65
C. At least 0.65 but less than 0.70
D. At least 0.70 but less than 0.75
E. At least 0.75
Use the following information for the next two questions:

The claim count N for an individual insured has a Poisson distribution with mean λ.
λ is uniformly distributed between 1 and 3.
2.58 (4, 5/91, Q.42) (2 points) Find the probability that a randomly selected insured will have no
claims.
A. Less than 0.11
B. At least 0.11 but less than 0.13
C. At least 0.13 but less than 0.15
D. At least 0.15 but less than 0.17
E. At least 0.17
2.59 (4, 5/91, Q.43) (2 points) If an insured has one claim during a first period, use Buhlmann's
credibility formula to estimate the expected number of claims for that insured in the next period.
A. Less than 1.20
B. At least 1.20 but less than 1.40
C. At least 1.40 but less than l.60
D. At least 1.60 but less than 1.80
E. At least 1.80


2.60 (4B, 11/96, Q.20) (3 points) You are given the following:

The number of claims for a single risk follows a Poisson distribution with mean mθ.
m and θ have a prior probability distribution with joint density function
f(m, θ) = 1, 0 < m < 1, 0 < θ < 1.

Determine the value of Buhlmann's k.


A. Less than 5.5
B. At least 5.5, but less than 6.5
C. At least 6.5, but less than 7.5
D. At least 7.5, but less than 8.5
E. At least 8.5
Use the following information for the next two questions:
You are given the following:

A large portfolio of automobile risks consists solely of youthful drivers.


The number of claims for one driver during one exposure period follows a Poisson distribution
with mean 4-g, where g is the grade point average of the driver.
The distribution of g within the portfolio is uniform on the interval [0,4].
A driver is selected at random from the portfolio. During one exposure period, no claims are
observed for this driver.
2.61 (4B, 5/97, Q.4) (2 points) Determine the posterior probability that the selected driver has a
grade point average greater than 3.
A. Less than 0.15
B. At least 0.15, but less than 0.35
C. At least 0.35, but less than 0.55
D. At least 0.55, but less than 0.75
E. At least 0.75
2.62 (4B, 5/97, Q.5) (2 points) Determine the Buhlmann credibility estimate of the expected
number of claims for this driver during the next exposure period.
A. Less than 0.375
B. At least 0.375, but less than 0.425
C. At least 0.425, but less than 0.475
D. At least 0.475, but less than 0.525
E. At least 0.525


2.63 (4B, 5/98, Q.2) (1 point) You are given the following:

The number of claims for a single insured follows a Poisson distribution with mean λ.
λ varies by insured and follows a Poisson distribution with mean μ.
Determine the value of Buhlmann's k.
A. 1        B. λ        C. μ        D. λ/μ        E. μ/λ

2.64 (4, 11/00, Q.3) (2.5 points) You are given the following for a dental insurer:
(i) Claim counts for individual insureds follow a Poisson distribution.
(ii) Half of the insureds are expected to have 2.0 claims per year.
(iii) The other half of the insureds are expected to have 4.0 claims per year.
A randomly selected insured has made 4 claims in each of the first two policy years.
Determine the Bayesian estimate of this insured's claim count in the next (third) policy
year.
(A) 3.2
(B) 3.4
(C) 3.6
(D) 3.8
(E) 4.0
2.65 (2 points) In the previous question, 4, 11/00, Q.3, what is the probability that this insured's
claim count in the next (third) policy year is 1?
A. Less than 8%
B. At least 8%, but less than 9%
C. At least 9%, but less than 10%
D. At least 10%, but less than 11%
E. At least 11%
2.66 (2 points) In 4, 11/00, Q.3, using Buhlmann Credibility, estimate this insured's claim count in the
next (third) policy year.
(A) 3.2
(B) 3.4
(C) 3.6
(D) 3.8
(E) 4.0
2.67 (4, 5/01, Q.18) (2.5 points) You are given:
(i) An individual automobile insured has annual claim frequencies that follow
a Poisson distribution with mean λ.
(ii) An actuary's prior distribution for the parameter λ has probability density function:
π(λ) = (0.5)5e^(−5λ) + (0.5)e^(−λ/5)/5.
(iii) In the first policy year, no claims were observed for the insured.
Determine the expected number of claims in the second policy year.
(A) 0.3
(B) 0.4
(C) 0.5
(D) 0.6
(E) 0.7
2.68 (3 points) In the previous question, using Buhlmann Credibility, determine the expected
number of claims in the second policy year.
(A) 0.26
(B) 0.28
(C) 0.30
(D) 0.32
(E) 0.34


Use the following information for 4, 5/01, questions 37 and 38.


You are given the following information about workers compensation coverage:
(i) The number of claims for an employee during the year follows a Poisson distribution
with mean (100 - p)/100,
where p is the salary (in thousands) for the employee.
(ii) The distribution of p is uniform on the interval (0, 100].
2.69 (4, 5/01, Q.37) (2.5 points) An employee is selected at random.
No claims were observed for this employee during the year.
Determine the posterior probability that the selected employee has salary greater than
50 thousand.
(A) 0.5
(B) 0.6
(C) 0.7
(D) 0.8
(E) 0.9
2.70 (4, 5/01, Q.38) (2.5 points) An employee is selected at random.
During the last 4 years, the employee has had a total of 5 claims.
Determine the Bühlmann credibility estimate for the expected number of claims the employee will
have next year.
(A) 0.6
(B) 0.8
(C) 1.0
(D) 1.1
(E) 1.2

2.71 (4, 5/05, Q.6 & 2009 Sample Q.177) (2.9 points) You are given:
(i) Claims are conditionally independent and identically Poisson distributed with mean θ.
(ii) The prior distribution function of θ is:
F(θ) = 1 − 1/(1 + θ)^2.6,    θ > 0.

Five claims are observed.


Determine the Bühlmann credibility factor.
(A) Less than 0.6
(B) At least 0.6, but less than 0.7
(C) At least 0.7, but less than 0.8
(D) At least 0.8, but less than 0.9
(E) At least 0.9


2.72 (4, 5/05, Q.14 & 2009 Sample Q.184) (2.9 points) You are given:
(i) Annual claim frequencies follow a Poisson distribution with mean λ.
(ii) The prior distribution of λ has probability density function:
π(λ) = (0.4)e^(−λ/6)/6 + (0.6)e^(−λ/12)/12,    λ > 0.
Ten claims are observed for an insured in Year 1.
Determine the Bayesian expected number of claims for the insured in Year 2.
(A) 9.6
(B) 9.7
(C) 9.8
(D) 9.9
(E) 10.0
2.73 (3 points) In the previous question, use Buhlmann Credibility in order to determine the
expected number of claims for the insured in Year 2.
(A) 9.6
(B) 9.7
(C) 9.8
(D) 9.9
(E) 10.0

2.74 (CAS3, 5/05, Q.17) (2.5 points) An insurer selects risks from a population that consists of
three independent groups.

The claims generation process for each group is Poisson.


The first group consists of 50% of the population.
These individuals are expected to generate one claim per year.

The second group consists of 35% of the population.


These individuals are expected to generate two claims per year.

Individuals in the third group are expected to generate three claims per year.
A certain insured has two claims in year 1.
What is the probability that this insured has more than two claims in year 2?
A. Less than 21%
B. At least 21%, but less than 25%
C. At least 25%, but less than 29%
D. At least 29%, but less than 33%
E. 33% or more


2.75 (SOA M, 5/05, Q.39 & 2009 Sample Q.170) (2.5 points)
In a certain town the number of common colds an individual will get in a year follows a Poisson
distribution that depends on the individuals age and smoking status.
The distribution of the population and the mean number of colds are as follows:
                      Proportion of population    Mean number of colds
Children              0.30                        3
Adult Non-Smokers     0.60                        1
Adult Smokers         0.10                        4
Calculate the conditional probability that a person with exactly 3 common colds in a year is
an adult smoker.
(A) 0.12
(B) 0.16
(C) 0.20
(D) 0.24
(E) 0.28
2.76 (4, 11/05, Q.19 & 2009 Sample Q.230) (2.9 points)
For a portfolio of independent risks, the number of claims for each risk in a year follows a Poisson
distribution with means given in the following table:
Class    Mean Number of Claims per Risk    Number of Risks
1        1                                 900
2        10                                90
3        20                                10
You observe x claims in Year 1 for a randomly selected risk.
The Bühlmann credibility estimate of the number of claims for the same risk in Year 2 is 11.983.
Determine x.
(A) 13
(B) 14
(C) 15
(D) 16
(E) 17
2.77 (4, 11/06, Q.2 & 2009 Sample Q.247) (2.9 points)
An insurance company sells three types of policies with the following characteristics:
Type of Policy    Proportion of Total Policies    Annual Claim Frequency
I                 5%                              Poisson with λ = 0.25
II                20%                             Poisson with λ = 0.50
III               75%                             Poisson with λ = 1.00

A randomly selected policyholder is observed to have a total of one claim for Year 1 through
Year 4.
For the same policyholder, determine the Bayesian estimate of the expected number of
claims in Year 5.
(A) Less than 0.4
(B) At least 0.4, but less than 0.5
(C) At least 0.5, but less than 0.6
(D) At least 0.6, but less than 0.7
(E) At least 0.7


2.78 (4, 11/06, Q.19 & 2009 Sample Q.263) (2.9 points) You are given:
(i) The number of claims incurred in a month by any insured follows a Poisson distribution
with mean λ.
(ii) The claim frequencies of different insureds are independent.
(iii) The prior distribution of λ is Weibull with θ = 0.1 and τ = 2.
(iv) Some values of the gamma function are
Γ(0.5) = 1.77245, Γ(1) = 1, Γ(1.5) = 0.88623, Γ(2) = 1.
(v)
Month    Number of Insureds    Number of Claims
1        100                   10
2        150                   11
3        250                   14
Determine the Bühlmann-Straub credibility estimate of the number of claims in the next 12 months
for 300 insureds.
(A) Less than 255
(B) At least 255, but less than 275
(C) At least 275, but less than 295
(D) At least 295, but less than 315
(E) At least 315
2.79 (4, 11/06, Q.23 & 2009 Sample Q.267) (2.9 points) You are given:

(i) The annual number of claims for an individual risk follows a Poisson distribution with mean λ.
(ii) For 75% of the risks, λ = 1.
(iii) For 25% of the risks, λ = 3.
A randomly selected risk had r claims in Year 1.
The Bayesian estimate of this risk's expected number of claims in Year 2 is 2.98.
Determine the Bühlmann credibility estimate of the expected number of claims for this risk in Year 2.
(A) Less than 1.9
(B) At least 1.9, but less than 2.3
(C) At least 2.3, but less than 2.7
(D) At least 2.7, but less than 3.1
(E) At least 3.1

2.80 (4, 5/07, Q.6) (2.5 points)


An insurance company sells two types of policies with the following characteristics:
Type of Policy    Proportion of Total Policies    Poisson Annual Claim Frequency
      I                        θ                           λ = 0.50
     II                      1 - θ                         λ = 1.50

A randomly selected policyholder is observed to have one claim in Year 1.
For the same policyholder, determine the Bühlmann credibility factor Z for Year 2.
(A)

- 2
1.5 - 2

(B)

1.5 -
1.5 - 2

C)

2.25 - 2
1.5 - 2

(D)

2 - 2
1.5 - 2

(E)

2.25 - 22
1.5 - 2

Solutions to Problems:
2.1. D. The chance of observing 4 accidents is λ^4 e^(-λ) / 24.
Weight the chances of observing 4 accidents by the a priori probability of λ:

Type    A Priori Probability    Poisson Parameter    Chance of 4 Claims
  A             0.6                     1                  1.53%
  B             0.3                     2                  9.02%
  C             0.1                     3                 16.80%

Average                                                    5.31%

2.2. E. Since we are mixing Poissons, EPV = Overall Mean = 1.5.
VHM = 2.7 - 1.5^2 = 0.45. K = EPV/VHM = 3.33. Z = 1/(1 + 3.33) = 23.1%.

Type       A Priori Probability    Poisson Parameter    Square of Mean
  A                0.6                     1                   1
  B                0.3                     2                   4
  C                0.1                     3                   9
Average                                   1.5                 2.7

New Estimate = (23.1%)(4) + (76.9%)(1.5) = 2.08.


2.3. C. Chance of observing 4 accidents is 4e / 24.
Type
A
B
C

A Priori
Probability
0.6
0.3
0.1

Poisson
Parameter
1
2
3

Sum

1.50

Chance of
4 Claims
0.0153
0.0902
0.1680

Probability
Weights
0.0092
0.0271
0.0168

Posterior
Probability
0.1733
0.5101
0.3166

Mean
1
2
3

0.0531

1.0000

2.14

2.4. C. EPV = 1.5. VHM = 2.7 - 1.52 = .45. K = 3.33. (Same as in the solution before last.)
Z= 10/ (10 + 3.33) = 75%. New Estimate = (75%)(3.7) + (25%)(1.5) = 3.15.
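For readers who like to verify this arithmetic with software, here is a short Python sketch (the variable names are illustrative only) that reproduces the Bayes Analysis and Bühlmann Credibility estimates in solutions 2.1 through 2.4:

    from math import exp, factorial

    # Three types of risks: a priori probabilities and Poisson means.
    priors = [0.6, 0.3, 0.1]
    lambdas = [1.0, 2.0, 3.0]

    def poisson_pmf(n, lam):
        return lam**n * exp(-lam) / factorial(n)

    # Bayes Analysis: observe 4 claims in one year (solutions 2.1 and 2.3).
    likelihoods = [poisson_pmf(4, lam) for lam in lambdas]
    weights = [p * L for p, L in zip(priors, likelihoods)]
    total = sum(weights)                       # marginal chance of 4 claims, about 5.31%
    posterior = [w / total for w in weights]
    bayes_estimate = sum(q * lam for q, lam in zip(posterior, lambdas))   # about 2.14

    # Buhlmann Credibility (solutions 2.2 and 2.4).
    mean = sum(p * lam for p, lam in zip(priors, lambdas))                # 1.5
    second_moment = sum(p * lam**2 for p, lam in zip(priors, lambdas))    # 2.7
    epv = mean                                 # mixing Poissons: EPV = overall mean
    vhm = second_moment - mean**2              # 0.45
    k = epv / vhm                              # 3.33

    z1 = 1 / (1 + k)                           # one year of data: 23.1%
    est_1yr = z1 * 4 + (1 - z1) * mean         # about 2.08

    z10 = 10 / (10 + k)                        # ten years of data: 75%
    est_10yr = z10 * 3.7 + (1 - z10) * mean    # 3.7 = 37/10 observed; about 3.15

    print(total, bayes_estimate, est_1yr, est_10yr)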
2.5. B. Chance of observing 37 accidents in ten years is (10)37e10 / 37!.
Type
A
B
C

A Priori
Probability
0.6
0.3
0.1

Poisson
Parameter
1
2
3

Sum

1.50

Chance of
37 Claims
3.299e-11
2.058e-4
3.061e-2

Probability
Weights
0.00000
0.00006
0.00306

Posterior
Probability
0.0000
0.0198
0.9802

Mean
1
2
3

0.00312

1.0000

2.98

2.6. E. EPV = Expected Value of = overall mean = 5.


2.7. B. For the uniform distribution on the interval (a, b), the variance = (b-a)2 /12.
In this case with a = 3 and b = 7, the variance is: (7-3)2 /12 = 4/3.
2.8. A. K = EPV / VHM = 5/(4/3) = 3.75. For one year, Z = 1 / (1 + 3.75) = .211.
New estimate is: (7)(.211) + (5)(1 - .211) = 5.42.
2.9. D. The posterior distribution of is proportional to the product of the chance of the observation
of 7 claims in one year times the a priori probability of . The former is 7 e/7! for a Poisson.
The latter is .25 for between 3 and 7, and zero elsewhere. The posterior distribution can be
obtained by dividing .257 e/7! by its integral from 3 to 7.
7

(.25/7!) 7 e d = (.25/7!) (8) {(8; 7) - (8; 3)}.


3

Therefore, the posterior density of is given by: 7 e / {(8)((8; 7) - (8; 3))}.


The posterior mean is:
7

7e d / {{(8)((8; 7) - (8; 3))} = (9)((9; 7) - (9; 3)) / {((8){(8; 7) - (8; 3))} =


3

= 8{(9; 7) - (9; 3)} / {(8; 7) - (8; 3)}.


Comment: This is a difficult question. By use of a computer:
8{(9; 7) - (9; 3)} / {(8; 7) - (8; 3)} = (8)(.2709 - .0038)/(.4013 - .0119) = 5.49.
2.10. E. For each insured, its variance is , thus EPV= E[] = mean of g = .09.
VHM = Var[] = variance of g = .003. K = EPV/VHM = .09/.003 = 30.
Z = 3/(3 + K) = 3/33 = 1/11. The observed frequency is 2/3.
Estimate future frequency = (1/11)(2/3) + (10/11)(.09) = 0.142.
Comment: In order to perform Buhlmann Credibility, we do not need to know the form of g, but only
its first two moments, or equivalently its mean and variance.
2.11. D. E[] = the mean of the prior mixed exponential = weighted average of the means of the
two exponential distributions = (.7)(1/20) + (.3)(1/10) = 6.5%.

2.12. D. ∫ from 0.1 to ∞ of π(λ) dλ = ∫ from 0.1 to ∞ of 14 e^(-20λ) dλ + ∫ from 0.1 to ∞ of 3 e^(-10λ) dλ
= 14 e^(-2)/20 + 3 e^(-1)/10 = 20.51%.
Alternately, S(0.1) for the mixed distribution is the weighted average of S(0.1) for the two Exponentials:
(0.7)e^(-(20)(0.1)) + (0.3)e^(-(10)(0.1)) = 0.7 e^(-2) + 0.3 e^(-1) = 20.51%.
2.13. C. Posterior distribution is: ()e / ()ed =

{14e21 + 3e11}/{14e21d + 3e11d} = {14e21 + 3e11}/{(14/21) + (3/11)} =


0

{14e21 + 3e11} / 0.9394.

(1.0645) {14e21 + 3e11} d = (1.0645){(14e-2.1/21) + (3e-1.1/11)} = 18.35%.


0.1

2.14. E. Given , the chance of the observation is: e.

Therefore, by Bayes Theorem the posterior distribution is: ()e / ()ed.

Therefore the posterior mean is: ()ed / ()ed =

{14 e21d + 3 e11d} / {14 e21d + 3 e11d} =


0

{(14 / 212 ) + (3/112 )} / {(14/21) + (3/11)} = 0.05654/0.9394 = 6.02%.


Comment: Similar to 4, 5/01, Q.18.

2.15. B. Given , the chance of the observation is: e.

Therefore, by Bayes Theorem the posterior distribution is: ()e / ()ed.

Therefore the posterior mean is: 2()ed / ()ed =

{14 2e21d + 3 2e11d} / {14 e21d + 3 e11d} =


0

{(2)(14/213 ) + (2)(3/113 )} / {(14/212 ) + (3/112 )} = .007531/.05654 = 13.3%.


Comment: For Gamma type integrals, as discussed in the section on the Gamma Function:

t1 et/ dt = (), or tn e-ct dt =


0

n! / cn+1.

Here is a graph of the prior (dashed) and posterior distributions of lambda: [graph omitted]

2.16. D. Given , the chance of the observation is: 2e/2.

Therefore, by Bayes Theorem the posterior distribution is: () 2e / () 2ed.

Therefore the posterior mean is: 3 () e d / 2 () e d =

{14 3e21d + 3 3e11d} / {14 2e21d + 3 2e11d} =


0

{(6)(14/214 ) + (6)(3/114 )} / {(2)(14/213 ) + (2)(3/113 )} = 0.001661/0.007531 = 22.1%.


Comment: If n claims are observed, then the posterior distribution is:
{14 n e21 + 3 n e11 }/ {(14)((n+1) /21n+1) + (3)((n+1) /11n+1)}
If n claims are observed, then the posterior mean is:
{(14)((n+2) /21n+2) + (3)((n+2) /11n+2)}/ {(14)((n+1) /21n+1) + (3)((n+1) /11n+1)} =
(n+1){(14 /21n+2) + (3 /11n+2)}/ {(14 /21n+1) + (3 /11n+1)}.
For example, for n = 0, 1, 2, 3, 4, and 5, the posterior means are:
0.060, 0.133, 0.221, 0.319, 0.421, and 0.523.
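These posterior means can be checked quickly in Python; the following sketch (illustrative only) evaluates the closed-form expression quoted in the comment for the prior π(λ) = 14 e^(-20λ) + 3 e^(-10λ) and n claims observed in one year:

    def posterior_mean(n):
        # (n+1) {14/21^(n+2) + 3/11^(n+2)} / {14/21^(n+1) + 3/11^(n+1)}
        return (n + 1) * (14 / 21**(n + 2) + 3 / 11**(n + 2)) / (14 / 21**(n + 1) + 3 / 11**(n + 1))

    for n in range(6):
        print(n, round(posterior_mean(n), 3))   # 0.060, 0.133, 0.221, 0.319, 0.421, 0.523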
2.17. B. Since we are mixing Poissons, EPV = a priori mean = .065.

second moment of the hypothetical means = 2 () d =

14 2e20d + 3 2e10d = (14)(2/203 ) + (3)(2/103 ) = .0095.


0

VHM = .0095 - .0652 = .005275.


K = EPV/VHM = .065/.005275 = 12.3.
Alternately, the distribution of is a mixed exponential, with 2nd moment a weighted average of
those of the individual distributions: (.7)(2/202 ) + (.3)(2/102 ) = .0095. Proceed as before.
2.18. C. One year of data is given a credibility of: 1/(1+12.3) = 7.5%.
The a priori mean is .065. If 2 claims are observed, the estimated future frequency is:
(.075)(2) + (.925)(.065) = 0.210.
Comment: If n claims are observed, then the estimated future frequency is:
.075n + (.925)(.065).
Number of claims
Buhlmann Credibility Estimate
Bayes Analysis Estimate

0
6.0%
6.0%

1
13.5%
13.3%

2
21.0%
22.1%

3
28.5%
31.9%

4
36.0%
42.1%

5
43.5%
52.3%

f() d = {(d+1)/{bd+1 - ad+1}} d+2 / (d+2) =

2.19. C.
a

{(d+1)/ (d+2)}{bd + 2 - ad + 2} / {bd + 1 - ad + 1}.


2.20. E. For an insured with a Poisson annual frequency of , over Y years his frequency is Poisson
with mean Y. Therefore the chance of the observation is: e-Y (Y)C / C!.
The posterior distribution is proportional to the product of the a priori probability and the chance of
the observation: {(d+1) d / {bd+1 - ad+1} } e-Y (Y)C / C!.
Therefore, the posterior distribution is proportional to d+C e-Y.
Since the density has support [a,b], we must integrate from a to b in order to calculate the constant
we must divide by in order that the posterior density integrates to unity as required.
b

d+Ce-Yd = d+Ce-Yd - d+Ce-Yd =


a

{(d+C+1; bY) - (d+C+1; aY)} (d+C+1) /Yd+C+1.


Thus the posterior distribution is:
Yd + C + 1 d + C e-Y / [{(d+C+1;bY) - (d+C+1;aY) }(d+C+1) ].
Comment: See the next section in order to see how to do the required integrals, related to the
Gamma Distribution and the Incomplete Gamma Function.
2.21. E. We want the mean of the posterior distribution determined in the previous question.
We need to integrate the posterior distribution times , over its support [a,b].
b

{Yd+C+1 / [{(d+C+1;bY) -(d+C+1;aY) }(d+C+1) ] } d+C+1e-Yd =


a

{Yd+C+1 / [{(d+C+1;bY) - (d+C+1;aY) }(d+C+1) ] }{(d+C+2;bY) -(d+C+2;aY) }(d+C+2) / Yd+C+2

= {{d+C+1)/Y}{(d+C+2; bY) - (d+C+2; aY) } / {(d+C+1; bY) - (d+C+1; aY) }.


2.22. E. From the solution to a prior question, the prior mean is:
{(d+1)/ (d+2)}{bd+2 - ad+2 } / {bd+1 - ad+1} = (.5/1.5){.61.5 - .21.5 ) / (.6.5 - .2.5 ) = 0.382.

2.23. A. From the solution to a previous question, the posterior distribution is:
Y d+C+1 d+C e-Y / [{(d+C+1;bY) - (d+C+1;aY) }(d+C+1) ] =
62.5 1.5 e-6 / [{(2.5;3.6) - (2.5;1.2) }(2.5) ] = 88.182 1.5 e-6 / {(0.794 - 0.209)1.329} =
113.4 1 . 5 e-6 , with support [0.2, 0.6].
2.24. D. From a previous solution, the mean of the posterior distribution is:
{{d+C+1)/Y}{(d+C+2; bY) - (d+C+2; aY) } / {(d+C+1; bY) - (d+C+1; aY) } =
(2.5/6){(3.5;3.6) - (3.5;1.2)} / {(2.5;3.6) - (2.5;1.2)} = (.4167)(.592 - .066)/(.794 - .209) =
0.375.
Comment: The observation of a frequency of 1/3 has reduced the posterior estimate to .375 from
the prior estimate of .382.
2.25. A. From the solution to a previous question, the mean of the posterior distribution is:
{{d+C+1)/Y}{(d+C+2;bY) - (d+C+2;aY) } / {(d+C+1;bY) - (d+C+1;aY) } =
(C/Y){(3; ) - (3; 0)} / {(2; ) - (2; 0)} = (C/Y)(1 - 0)/(1 - 0) = C/Y = 1/3.
Comments: Note that 1/ on [0,) is not a proper density, since it has an infinite integral.
Nevertheless, such improper priors are used in Bayesian Analysis. Note that the posterior
estimate is the same as the observed frequency of 2/6 = 1/3. This will always be the case for this
choice of a, b and d; this prior is therefore referred to as the non-informative prior or vague prior
for a Poisson process. Note that (; ) = 1 and (; 0) = 0; this is why the Gamma Distribution with
support from 0 to can be defined by F(x) = (; x/).
2.26. A. EPV = E[] = overall mean = .05. VHM = Var[] = (8% - 2%)2 /12 = .0003.
K = EPV / VHM = .05 / .0003 = 166.7. For five years, Z = 5 / (5 + 166.7) = 2.9%.
The a priori frequency is 5%, while the observed frequency is: 2/5 = 40%.
Thus the new estimate is: (2.9%)(40%) + (97.1%) (5%) = 6.0%.

2.27. D. The chance of observing no claims over five years, given λ, is e^(-5λ).
The prior density of λ is 16.67 for 0.02 < λ < 0.08. Therefore, the posterior density of λ is:
16.67 e^(-5λ) / ∫ from 0.02 to 0.08 of 16.67 e^(-5λ) dλ = 21.32 e^(-5λ), for 0.02 < λ < 0.08.
The mean of the posterior distribution is:
∫ from 0.02 to 0.08 of 21.32 λ e^(-5λ) dλ = -21.32 (λ e^(-5λ)/5 + e^(-5λ)/25), evaluated from 0.02 to 0.08, = 4.85%.
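This kind of posterior mean is easy to confirm numerically. A minimal Python sketch (using the real scipy.integrate.quad routine; the setup is as described above, the names are mine):

    from math import exp
    from scipy.integrate import quad

    prior = lambda lam: 1 / 0.06          # uniform density 16.67 on (0.02, 0.08)
    like = lambda lam: exp(-5 * lam)      # chance of no claims in five years

    norm, _ = quad(lambda lam: prior(lam) * like(lam), 0.02, 0.08)
    post_mean, _ = quad(lambda lam: lam * prior(lam) * like(lam) / norm, 0.02, 0.08)
    print(post_mean)                      # about 0.0485, i.e. 4.85%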

2.28. B. The posterior density is: 21.32e5 for .02 < < .08.
.05

.05

Prob[ < .05] = 21.32e5 d = -21.32e5/5] = 53.7%.


.02

.02

2.29. C. The chance of the observing one claim over five years, given , is 5e5.
The prior density of is 16.67 for .02 < < .08.
Therefore, the posterior density of is:
.08

16.67(5e5)/ 16.67(5e5) d = 16.67(5e5)/0.1896 = 439.6e5 for .02 < < .08.


.02

The mean of the posterior distribution is:


.08

.08

439.6e5 d = -439.6(2e5/5 + 2e5/25 + 2e5/125) ] = 5.47%.


.02

.02

2.30. D. The posterior density is: 439.6e5 for .02 < < .08.
.05

.05

Prob[ < .05] = 439.6e5 d = -439.6(e5/5 + e5/25)] = 38.4%.


.02

.02

2.31. C. The posterior density is: 21.32e5 for .02 < < .08.
The probability of observing no claims given is e.
The probability of observing no claims is:
.08

.08

e21.32e5 d = -21.32(e6/6)] = 95.3%.


.02

.02

Comment: While in this case the probability of observing no claims is close to


exp[-future mean frequency], they are not equal.
To more decimal places, the future mean frequency is 0.048502. e-.048502 = .95266.
On the other hand, the probability of observing no claims is 0.95280.
2.32. A. The posterior density is: 21.32e5 for .02 < < .08.
The probability of observing one claim given is e.
The probability of observing one claim is:
.08

.08

e21.32e5 d = -21.32(e6/6 + e6/36)] = 4.59%.


.02

.02

2.33. A. E[] = ( / ( 1)) = (3/2)(0.8) = 1.2. E[2] = 2 /( 2) = 1.92.


Var[] = 1.92 - (1.2)2 = .48. EPV = E[] = 1.2. VHM = Var[] = .48.
Thus K = EPV / VHM = (1.2) /(.48) = 2.5. Z = 7/(7 + K) = .737.
Estimated future annual frequency = (.737)(4/7) + (1 - .737)(1.2) = 73.6%.
Comment: Note that the distribution of has support > 0.8, and therefore the estimated future
annual frequency is outside the range of hypotheses. This can happen when using Buhlmann
Credibility, a linear approximation to Bayes Analysis, but this can not happen when using Bayes
Analysis.

2.34. B. Prob(Observation | ) = (7)4 e7 / 4! = 100.04 4 e7.


Prior density of is: / +1 = 1.536/ 4, > .8.

Prob(observation) =

(100.04 4 e7) (1.536/ 4) d = 153.7e7d = 8.1%.

.8

.8

Comment: If in this case one observed for example 6 claims in 7 years, the answer would involve
Incomplete Gamma Functions. However, if one observed for example 3 claims in 7 years, the
answer would involve doing Exponential Integrals; Exponential Integrals involve an integral of
e-x times a negative integral power of x from t to . See Handbook of Mathematical Functions.
Here is the chance of the observation for various numbers of claims observed over 7 years:
# claims

Prob. of Obser.

# claims

Prob. of Obser.

# claims

Prob. of Obser.

0
1
2
3
4
5
6
7

0.12%
0.75%
2.36%
5.01%
8.12%
10.72%
12.06%
11.96%

8
9
10
11
12
13
14

10.73%
8.92%
7.01%
5.30%
3.93%
2.89%
2.13%

15
16
17
18
19
20
21

1.59%
1.20%
0.92%
0.72%
0.57%
0.45%
0.37%

2.35. D. Prob(Observation | ) = (7)4 e7 / 4! = 100.04 4 e7.


Prior density of is: / +1 = 1.536/ 4, > .8.
Therefore the posterior density is proportional to: (100.04 4 e7) (1.536/ 4) ~ e7, > .8.

Posterior density is: e7 / e7 d = 1893e7, > .8.


.8

Mean of posterior distrib. = 1893e7 d = 1893(e7/7 + e7/49) = 94.3%.


.8

.8

Comment: If in this case one observed for example 6 claims in 7 years, the estimate would involve
Incomplete Gamma Functions. However, if one observed for example 3 claims in 7 years, the
estimate would involve doing Exponential Integrals; Exponential Integrals involve an integral
of e-x times a negative integral power of x from t to . See Handbook of Mathematical
Functions. Here are the estimated future annual frequencies for various numbers of claims
observed over 7 years, with Bayes Analysis as the dots and Buhlmann Credibility as the straight
line:
[Graph omitted.]

2.36. A. The chance of observing 2 claims in a year is: 2e/2!.


Therefore, the chance of observing 2 claims in each of the first two years is: (2e/2!)2 .
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.00288
0.00976

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.00144
0.00488

0.228
0.772

0.400
0.600

0.0063

1.000

0.554

Comment: Similar to 4, 11/00, Q.3.


2.37. A. The chance of observing 1 claim in a year is: e.
The chance of observing 3 claims in a year is: 3e/3!.
Therefore, the chance of the observation is: e 3e/3! = 4e2/6.
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.00192
0.00651

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.00096
0.00325

0.228
0.772

0.400
0.600

0.0042

1.000

0.554

Comment: Since the chances of observation are proportional to those in the previous question, we
get the same posterior distribution and the same answer.
2.38. A. Over two years we have a Poisson with mean 2.
Therefore, the chance of the observation is: e2(2)4/4! = (2/3)4e2.
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.00767
0.02602

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.00383
0.01301

0.228
0.772

0.400
0.600

0.0168

1.000

0.554

Comment: Since the chances of observation are proportional to those in the previous questions, we
get the same posterior distribution and the same answer. In this question we have somewhat less
information about what occurred, than in the previous questions. In general, one must be careful to
use the exact wording of the observation, even though in these particular questions the result of
Bayesian Analysis only depended on the sum of the claims over the first two years.

2.39. E. The chance of observing at most 2 claims in a year is: f(0) + f(1) + f(2) =
e + e + 2e/2!. Therefore, the chance of observing at most 2 claims in each of the first two
years is: (e + e + 2e/2!)2 .
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.98421
0.95430

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.49211
0.47715

0.508
0.492

0.400
0.600

0.9693

1.000

0.498

2.40. D. Over two years we have a Poisson with mean 2.


Therefore, the chance of the observation is: e2 + 2e2 + (2)2e2/2!.
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.95258
0.87949

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.47629
0.43974

0.520
0.480

0.400
0.600

0.9160

1.000

0.496

2.41. C. The chance of observing at least 2 claims in a year is: 1 - {f(0) + f(1)} = 1 - {e + e}.
Therefore, the chance of observing at least 2 claims in each of the first two years is:
(1 - {e + e})2 .
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.00379
0.01486

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.00189
0.00743

0.203
0.797

0.400
0.600

0.0093

1.000

0.559

2.42. B. Probabilities of the Observation: 4re-4/r!, 7re-7/r!.


Probability Weights: .8 4re-4/r!, .2 7re-7/r!.
Posterior Distribution: .8 4re-4/(.8 4re-4 + .2 7re-7), .2 7re-7/(.8 4re-4 + .2 7re-7).
Let w = posterior probability that = 4. We are given that: 5.97 = (4)w + (7)(1 - w). w = 0.343.

0.343 = .8 4re-4/(.8 4re-4 + .2 7re-7) = 4/{4 + (4/7)re-3}.


0.343 (4/7)r e-3 = 2.628. r = ln(153.9)/ln(4/7) = 9.
Comment: Similar to 4, 11/06, Q.23.
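One can also find r by brute force. A short Python sketch (names are illustrative) for the setup above, where 80% of risks are Poisson with mean 4 and 20% are Poisson with mean 7, and the Bayes estimate after r claims is 5.97:

    from math import exp, factorial

    def bayes_estimate(r):
        w4 = 0.8 * 4**r * exp(-4) / factorial(r)
        w7 = 0.2 * 7**r * exp(-7) / factorial(r)
        return (4 * w4 + 7 * w7) / (w4 + w7)

    for r in range(16):
        if abs(bayes_estimate(r) - 5.97) < 0.01:
            print(r, bayes_estimate(r))   # r = 9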
2.43. C. The chance of the observation given is: (e)(2e/2)(3e/6) = 6e3/12.
() = (/) exp[-(/)] / = 1 exp[-(/)] / .
Therefore, the posterior distribution of is proportional to:
(6e3){1 exp[-(/)]} = +5 exp[-{3 + (/)}].
The mean of the posterior distribution of is:
(integral of times the probability weight) / (integral of the probability weight)

+ 6 exp[{3 + ( / ) }] d /

+ 5 exp[{3 +

( / ) }] d .

2.44. B. The chance of 3 claims given is: 3e/6.


From the previous solution, the posterior distribution of is proportional to: +5 exp[-{3 + (/)}].
The probability of three claims in year 4 from this policyholder is:
(integral of 3e/6 times the probability weight) / (integral of the probability weight)

= (1/6)

+ 8 exp[{4 + ( / ) }] d /

+ 5 exp[{3 +

( / ) }] d .

2.45. D. EPV = E[] = .005. VHM = Var[] = (.008 - .002)2 /12 = .000003.
K = EPV/VHM = .005/.000003 = 1667.
The total number of employees observed is 6000. Z = 6000/(6000 + 1667) = 78.3%.
Observed frequency is: 46/6000 = .00767. Prior mean is .005.
Estimated future frequency is: (.783)(.00767) + (1 - .783)(.005) = .00709.
Estimated number of claims for year four is: (.00709)(1400) = 9.9.
Comment: Similar to Exercise 7 in Topics in Credibility by Dean.

2.46. E. The probability covered by the first interval is: f(0) + f(1) = (1+)e.
The probability covered by the second interval is: f(2) + f(3) = (2/2 + 3/6)e.
The probability covered by the last interval is: 1 - {f(0) + f(1) + f(2) + f(3)}.
For = 2, these three probabilities are: 0.4060, 0.4511, and 0.1429.
Thus for = 2 the Iikelihood is: (0.4060)(0.45112 )(0.14293 ) = 0.000241.
For = 4, these three probabilities are: 0.0916, 0.3419, and 0.5665.
Thus for = 2 the Iikelihood is: (0.0916)(0.34192 )(0.56653 ) = 0.001947.
Thus the probability weights are: (3/4)(0.000241), and (1/4)(0.001947).
Therefore, the posterior distribution of lambda is: 27.1% and 72.9%.
The expected value of lambda for this insured is: (2)(27.1%) + (4)(72.9%) = 3.46.
2

2.47. D. EPV = E[] =

0 32 / 8 d = 3/2.

E[2] =

0 2 32 / 8 d = 12/5.

VHM = Var[] = E[2] - E[]2 = 12/5 - (3/2)2 = 3/20.


K = EPV / VHM = (3/2) / (3/20) = 10.
Z = 3 / (3 + K) = 3/13.
Estimated future frequency for Chip is: (3/13)(2/3) + (10/13)(3/2) = 102/78 = 1.308.
2.48. A. The EPV = prior mean = 0.15, VHM = 0.052 = 0.0025, K = 0.15/0.0025 = 60.
Z = 910/(910 + K) = 910/970. Observed mean is: 81/910.
Estimated future frequency is:
(910/970)(81/910) + (60/970)(0.15) = 0.09278 per month, per employee.
(12)(400)(0.09728) = 445.4 claims.
Comment: An example of where for Buhlmann Credibility, the estimated future frequency of
0.09278 is outside the range of hypothesis: 0.1 to 0.2.

2.49. B. The sum of 910 independent, identically distributed Poissons is a Poisson with mean
910. Therefore, the chance of the observation given lambda is: (910)81 exp[-910] / 81!.
This is proportional to: 81 e-910.
Since the two types are equally likely, the probability weights are proportional to:
0.181 e-91, and 0.281 e-182.
These in turn are proportional to: e91 and 281.
The first weight is much, much bigger then the second; their ratio is 1.37 x 1015.
Thus the posterior distribution is (subject to rounding): 100% and 0%.
Estimated future frequency is: 0.1 per month, per employee.
(12)(400)(0.1) = 480 claims.
2.50. E. P = 95%. y = 1.960. n0 = (1.960/0.10)2 = 384 claims.
Z=

81
= 45.9%.
384

(45.9%)(81/910) + (1 - 45.9%)(0.15) = 0.1220 per month, per employee.


(12)(400)(0.1220) = 586 claims.
2.51. D. Prior to any observations, the Expected Value of the Process Variance is the (a priori)
Overall Mean of 4.8, since we are mixing Poissons. The Variance of the Hypothetical Means
(between brands) is .20. K = EPV/VHM = 4.8 / .2 = 24.
We observe 50 cookies, so Z = 50/(50 + 24) = .676. The observed frequency is 265/50 = 5.3.
The new estimate is: (.676)(5.3) + (1 - .676)(4.8) = 5.14 chips per cookie.
Comment: Note the way we use our a priori model to compute K, prior to any observations. Note
that the exposure unit in this case is cookies; we are attempting to estimate the chips per cookie.
2.52. D. EPV = overall mean frequency = (.1 + .3)/2 = .2.
.3

The 2nd moment of the mean frequencies =

x2 dx

/ (.3 - .1) = .008667 /.2 = .04333.

.1

Therefore the Variance of the Hypothetical Mean Frequencies = .04333 - .22 = .00333.
K = EPV / VHM = .2 / .003333 = 60. For five years, Z = 5 / (5 + 60) = 1/13.
The a priori frequency is .2, while the observed frequency is 3/5 = .6.
Thus the estimated future annual frequency is: (.6)(1/13) + (.2) (12/13) = 0.231.
Comment: For the uniform distribution on the interval (a,b), the Variance = (b - a)2 /12.
In this case with a = 0.1 and b = 0.3, the variance is .22 /12 = .00333, the VHM.

2.53. C. By Bayes Theorem the posterior density is proportional to the product of the prior
density and the chance of the observation given . The chance of the observation given is e.
Thus the posterior density is proportional to (1)e for 0 < < 1. In order to convert to a density
function, one must divide by its integral from 0 to 1, which is (1 - e-1).
Thus the posterior density is: e / (1 - e- 1), for 0 < < 1.
2.54. D. Let the mean claim frequency for each insured be , then f() = 1 for 0 1. Then the
second moment is the integral from 0 to 1 of 2f()d, which is 3 /3 from zero to one, or 1/3. The
Expected Value of the Process Variance = E[] = 1/2. The variance of the hypothetical means =
VAR[] = second moment - mean2 = (1/3) - (1/2)2 = 1/12.
Therefore, K = EPV / VHM = 1/2 / (1/12) = 6. For 3 years Z = 3 / (3 + K) = 3/9 = 1/3.
The prior estimate is 1/2 and the observed frequency is 4/3 .
Thus the new estimate = (1/3)(4/3) + (1 - 1/3)(1/2) = 0.777.
Comment: The variance of the uniform distribution on the interval (a,b) is (b-a)2 /12, which in this
case is 1/12.
2.55. C. The (prior) distribution of has a mean of:

f() d = 33 d = (3/2)2 ] = 1.5. The second moment is:


1

2 f() d =

32 d = (3)2 ] = 3.
1

Thus the variance of the hypothetical means is 3 - 1.52 = .75. The Expected Value of the Process
Variance is E[] = 1.5. Thus K = EPV/VHM = 1.5 / .75 = 2. For two years of data
Z = 2/(2+K) = 1/2. The observation is a frequency of 20/2 =10. The a priori mean is 1.5.
Thus the estimate of this insureds future frequency is: (.5)(10) + (1-.5)(1.5) = 5.75.
Comment: The distribution of is a Single Parameter Pareto Distribution with = 3 and =1.
It has mean = ( / ( 1)) = 3/2, and variance = ( / [ ( 2) ( 1)2 ])2 = 3 /{(1)(22)} = 3/4.

2.56. E. The distribution of is a Single Parameter Pareto with = 1 and = 4.


The mean = ( / ( 1)) = 4/3. The second moment = 2 /( 2) = 2.
The variance = 2 - (4/3)2 = 2/9.
EPV = E[] = mean of f() = 4/3. VHM = Var[] = 2/9. K = EPV / VHM = (4/3) /(2/9) = 6.
2.57. C. Let the mean claim frequency for each insured be , then f() = 1 for 0 1. Then the
second moment is the integral from 0 to 1 of 2f()d, which is 3 /3 from zero to one, or 1/3.
Then the Expected Value of the Process Variance = E[] = 1/2. The variance of the hypothetical
means = VAR[] = second moment - mean2 = (1/3) - (1/2)2 = 1/12.
Therefore, K = EPV / VHM = (1/2) / (1/12) = 6. For 3 years Z = 3 / (3+K) = 3/9 = 1/3.
The prior estimate is 1/2 and the observed frequency is 3/3 = 1.
Thus the new estimate = (1/3)(1) + (1 - 1/3)(1/2) = 2/3.
2.58. D. The chance of no claims for a Poisson is e.
We average over the possible values of :
3

(1/2) e d = (1/2)(-e) ] = (1/2)(e-1 - e-3) = (1/2)(.368 - .050) = 0.159.


1

2.59. E. Expected Value of the Process Variance = E[] = 2.


Variance of the Hypothetical Means = VAR[] = (3 - 1)2 /12 = 1/3.
K = EPV / VHM = 2 / (1/3) = 6. For one observation, Z = 1/(1+6) = 1/7.
The prior estimate is E[] = 2. The observation is a frequency of 1.
Thus the new estimate is: (1/7)(1)+(1 - 1/7)(2) = 13/7 = 1.857.
2.60. A. The mean frequency is E[] = E[]E[] = (1/2)(1/2) = 1/4.
Since we are mixing Poissons, the EPV = mean frequency = 1/4.
The Variance of the Hypothetical Means = VAR[] = E[()2 ] - E2 [] = E[2]E[2] - 1/42 =
(1/3)(1/3) - 1/16 = 7/144.
Thus K = EPV / VHM = (1/4) / (7/144) = 36/7 = 5.143.
Comment: Note that since and are independent, the expected values can be separated into
products of separate expected values. The uniform distribution from zero to one has first moment of
1/2 and second moment of 1/3.

2.61. D. The posterior probabilities are proportional to the product of the chance of the observation
given each grade point average and the a priori probability of each grade point average.
The density of g is 1/4 on [0,4]. The chance of observing zero claims from a Poisson with mean is:
e = e-(4-g) = eg-4. Thus, the posterior density is proportional to (1/4) eg-4 on [0,4]. Dividing by the
integral from zero to 4, (1-e-4)/4, the posterior density is: eg-4 / (1-e-4). The posterior probability
that the selected driver has a grade point average greater than 3, is the integral of the posterior
density from 3 to infinity: (1-e-1) / (1- e-4) = 0.632/0.982 = 0.644.
Comment: The integrals used are as follows:
g=4

g=4

e g-4 dg = eg-4 ] = 1-e-4


g=0

g=0

g=4

g=4

eg-4 dg = eg-4 ] = 1 - e-1.


g=3

g=3

2.62. C. For a Poisson process the variance is equal to the mean, in this case 4-g.
Thus the Expected Value of the Process Variance is equal to the overall mean of 2:
4

(4 g) f(g) dg = (4 g) (1/ 4) dg =

g-

g= 4
2
g /8 ]
g=0

= 2.

The Variance of the Hypothetical Means is the variance of the Uniform Distribution on [0,4], which is
(4 - 0)2 /12 = 16/12 = 4/3. Thus K = EPV / VHM = 2/(4/3) = 1.5. For 5 exposures,
Z = 5/(5+1.5) = .769. Expected number of claims is: (0)(.769) + (2)(1-.769) = 0.462.
Comments: E[4 - g] = 4 - E[g] = 4 - 2 = 2. VAR[4 - g] = VAR[4] + VAR[g] = 0 + 4/3 = 4/3.
The variance of the uniform distribution on [a,b] is (b-a)2 /12.
2.63. A. Expected Value of the Process Variance = overall mean = .
Variance of the Hypothetical Means = Var[] = . K = EPV/ VHM = / = 1.
Comment: Each individual insured has a Poisson frequency.
The distribution of hypothetical means is given by a second Poisson.
Var[] = Variance of the second Poisson = Mean of the second Poisson = .
Since acts as a sort of dummy variable, the solution cant depend on , thus eliminating choices B,
D, and E.

2.64. C. The chance of observing 4 claims in a year is: 4e/4!.


Therefore, the chance of observing 4 claims in each of the first two years is: (4e/4!)2 .
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.00814
0.03817

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.00407
0.01908

0.176
0.824

2
4

0.0232

1.000

3.65

Comment: If instead the observation had been a total of 8 claims over two years,
then the chance of the observation would have been: (2)8e2/8!.
A

Class

A Priori
Chance of
This Class

Chance
of the
Observation

A
B

0.5000
0.5000

0.02977
0.13959

Overall

Prob. Weight =
Posterior
Product of
Chance of
Columns B & C This Class

Mean
Frequency

0.01489
0.06979

0.176
0.824

2
4

0.0847

1.000

3.65

While in this case one would get the same answer, in general in using Bayes Theorem, it is
important to use all of the information in the observation exactly as given.
2.65. D. The posterior probabilities of the two risk types are: .176 and .824.
For = 2, f(1) = 2 e-2 = 0.271. For = 4, f(1) = 4 e-4 = 0.073.
(.176)(.271) + (.824)(.073) = 10.8%.
2.66. B. Mean = (.5)(2) + (.5)(4) = 3. Mixing Poissons EPV = mean = 3.
Second Moment of the Hypothetical Means = (.5)(22 ) + (.5)(42 ) = 10.
VHM = 10 - 32 = 1. K = EPV/VHM = 3/1 = 3. Z = 2/(2+K) = 2/5.
Observed Frequency = 8/2 = 4. Estimated Frequency = (2/5)(4) + (3/5)(3) = 3.4.

2.67. A. Given , the chance of the observation is: e.


Therefore, by Bayes Theorem the posterior distribution of is proportional to:
()e = 2.5e6 + 0.1e1.2 = (5/12)6e6 + (1/12)1.2e1.2.
This is proportional to the mixed exponential distribution: (5/6)6e6 + (1/6)1.2e1.2,
which must therefore be the posterior distribution of . The expected number of claims =
mean of the posterior distribution: (5/6)(1/6) + (1/6)(1/1.2) = 0.278.
Comment: When not told which one to use, use Bayes Analysis, rather than Buhlmann Credibility
which is a linear approximation to Bayes Analysis,
The a priori mean is: (1/2)(1/5) + (1/2)(5) = 2.6.

The posterior distribution is: ()e / ()ed =

{2.5e6 + 0.1e1.2} / { 2.5e6 + 0.1e1.2 d} =


0

{2.5e6 + 0.1e1.2}/{2.5/6 + 0.1/1.2} = {2.5e6 + 0.1e1.2}/(1/2) = 5e6 + 0.2e1.2.

{5e6 + 0.2e1.2} d = 5 e6d + 0.2e1.2 d =

posterior mean =
0

5/62 + 0.2/1.22 = 0.278. If the prior distribution of had been an exponential rather than mixed
exponential, then this would have been a special case of the Gamma-Poisson.
Here is a graph of the prior (dashed) and posterior distributions of lambda: [graph omitted]

2.68. D. first moment of the hypothetical means = () d =

2.5 e5d + .1 e/5d = (2.5)(1/52 ) + (.1)(52 ) = 2.6.


0

second moment of the hypothetical means = 2 () d =

2.5 2e5d + .1 2e/5d = (2.5)(2/53 ) + (.1)(2)(53 ) = 25.04.


0

Since we are mixing Poissons, EPV = a priori mean = 2.6.


VHM = 25.04 - 2.62 = 18.28. K = EPV/VHM = 2.6/18.28 = 0.142. Z = 1/(1 + K) = 0.876.
Estimated frequency = (0.876)(0) + (1 - 0.876)(2.6) = 0.322.
Alternately, the distribution of is a mixed exponential, with 50% weight to an exponential with
mean 1/5 and 50% weight to an exponential with mean 5.
First moment of the mixture = (.5)(.2) + (.5)(5) = 2.6.
Second moment of the mixture = (.5)(2)(.22 ) + (.5)(2)(52 ) = 25.04. Proceed as before.
2.69. B. The chance of the observation of no claims is: e = e.01p -1.
The prior density of p is .01, 0 < p < 100. Therefore, by Bayes Theorem, the posterior density is
proportional to: .01e.01p -1 = .01e-1e.01p, 0 < p < 100.
100

.01e-1e.01p dp = .01e-1{100(e - 1)}.


0

The posterior distribution is: .01e-1e.01p/{.01e-1100(e - 1)} = e.01p /{100(e - 1)}, 0 < p < 100.
The posterior probability that p > 50 is:
100

(1/ 100(e - 1)) e.01pdp = (e - e.5)/(e - 1) = 0.622.


50

Comment: One can approximate the solution, by use of discrete risk types.
Divide the employees into two equally likely groups: p < 50 and p > 50.
Then the groups have average salaries of 25 and 75, with average lambdas of .75 and .25.
Therefore, the chances of the observation for the two groups are approximately:
e-.75 = .472 and e-.25 = .779. Since the two groups are equally likely a priori, the posterior
distribution is: .472/(.472 + .779) = .377 and .779/(.472 + .779) = 0.623.
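The exact posterior probability can also be checked by numerical integration. A minimal Python sketch for the setup above (salaries p uniform on (0, 100), λ = 1 - 0.01p, no claims observed; the variable names are mine):

    from math import exp
    from scipy.integrate import quad

    weight = lambda p: 0.01 * exp(-(1 - 0.01 * p))   # prior density times chance of no claims

    norm, _ = quad(weight, 0, 100)
    prob, _ = quad(weight, 50, 100)
    print(prob / norm)                               # about 0.622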

2.70. B. = 1 - .01p. VHM = Var[] = (.012 )Var[p] = (.012 )(1002 /12) = 1/12.
EPV = E[Process Variance | ] = E[] = 1 - .01E[p] = 1 - (.01)(50) = 1/2.
K = EPV/VHM = (1/2)/(1/12) = 6. Z = 4 /(4+6) = 40%.
Estimated frequency = (40%)(5/4) + (60%)(1/2) = 0.8.
Comment: There is no reason why the salaries could not have been uniform from for example
(20, 100], rather than (0, 100]. The variance of the uniform distribution on [a, b] is: (b-a)2 /12.
2.71. E. (See Comment) F() is Pareto with parameters 2.6 and 1.
It has mean: 1/(2.6 - 1) = 0.625, second moment: (2)(12 )/{(2.6 - 1)(2.6 - 2)} = 2.0833, and variance:
2.0833 - .6252 = 1.6927.
EPV = E[] = 0.625. VHM = Var[] = 1.6927. K = EPV/VHM = .625/1.6927 = .37.
We observe 5 claims, so n = 5. Z = 5/(5 + K) = 5/5.37 = 93.1%.
Comment: It is intended that the claim severity is Poisson, although the question should have made
this much clearer. The CAS/SOA accepted both choices C and E.
Apparently, they allowed Z = 1/(1 + K) = 1/1.37 = 73.0%. This is incorrect, since when dealing with
severity the number of draws from the risk process is the number of claims observed.
However, the question should have made it clearer that we were dealing with severity.
For example, Claim sizes are conditionally independent and identically Poisson ...

2.72. D. The probability of observing 10 claims given is proportional to: 10e.


Therefore the posterior distribution of is proportional to:
(10e){(0.4)e/6/6 + (0.6)e/12/12} = .06666710e7/6+ 0.0510e13/12.
The posterior mean is:

(.06666710e7/6+ 0.0510e13/12) d / .06666710e7/6+ 0.0510e13/12 d =


0

{(.066667)11!(6/7)12/ + (0.05)11!(12/13)12}/{(.066667)10!(6/7)11/ + (0.05)10!(12/13)11} =


(11){(.066667)(6/7)12 + (0.05)(12/13)12}/{(.066667)(6/7)11 + (0.05)(12/13)11} = 9.885.
Comment: Gamma type integrals, discussed in Mahlers Guide to Conjugate Priors.
This is not a Gamma-Poisson situation since () is a mixture of two Exponentials.
In general, let () be a mixture of two Exponentials: () = we// + (1 - w)e//.
If one observes C claims in Y years, then applying Bayes Theorem as in this solution, the posterior
w(Y + 1/ )C + 2 + (1- w)(Y+ 1/ )C + 2
mean turns out to be: (C + 1)
.
w(Y + 1/ )C + 2 (Y + 1/ ) + (1- w)(Y + 1/ )C + 2(Y + 1/ )
If () were an Exponential with mean 6, then = 1 + 10 = 11 and 1/ = 1/6 + 1 = 7/6, and the
expected number of claims in Year 2 would be = 11/(7/6) = 66/7.
If instead () were an Exponential with mean 12, then = 1 + 10 = 11 and 1/ = 1/12 + 1 =
13/12, and the expected number of claims in Year 2 would be = 11/(13/12) = 132/13.
As an approximation to the exact answer, one could weight these two results together:
(.4)(66/7) + (.6)(132/13) = 9.864, very close to the correct answer in this case, but not always.
2.73. E. The distribution of is a 40%-60% mixture of Exponential Distributions with means 6 and
12. This mixture has a mean of: (40%)(6) + (60%)(12) = 9.6, a second moment of:
(40%)(2)(62 ) + (60%)(2)(122 ) = 201.6, and variance of: 201.6 - 9.62 = 109.44.
EPV = E[Process Variance | ] = E[] = 9.6. VHM = Var[] = 109.44.
K = EPV/VHM = 9.6/109.44 = .0877. Z = 1/(1 + .0877) = .919.
Estimated future frequency: (.919)(10) + (1 - .919)(9.6) = 9.97.

2.74. C. The chance of the observation is: e 2/2.


Group
I
II
III

A priori
Probability
0.50
0.35
0.15

lambda

Chance of
Observation

Probability
Weight

Posterior
Probability

Prob.
N>2

1
2
3

0.18394
0.27067
0.22404

0.09197
0.09473
0.03361

0.4175
0.4300
0.1525

8.03%
32.33%
57.68%

0.22031

26.05%

For example: (.5)(.18394) = .09197. .09197/.22031 = .4175.


Prob[N > 2] = 1 - e - e - e 2/2.
(.4175)(8.03%) + (.4300)(32.33%) + (.1525)(57.68%) = 26.05%.
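For those who prefer to check the table with software, here is a short Python sketch of the same calculation (illustrative only, using just the standard library):

    from math import exp, factorial

    priors = [0.50, 0.35, 0.15]
    lambdas = [1, 2, 3]

    def poisson_pmf(n, lam):
        return lam**n * exp(-lam) / factorial(n)

    # Posterior probabilities of each group after observing 2 claims in year 1.
    weights = [p * poisson_pmf(2, lam) for p, lam in zip(priors, lambdas)]
    posterior = [w / sum(weights) for w in weights]

    # Probability of more than two claims in year 2 for each group, then weight them together.
    def prob_more_than_2(lam):
        return 1 - sum(poisson_pmf(n, lam) for n in range(3))

    answer = sum(q * prob_more_than_2(lam) for q, lam in zip(posterior, lambdas))
    print(answer)   # about 0.2605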
2.75. B. The chance of the observation is: e 3/6.
Type
Children
Nonsmoker
Smoker

A priori
Probability

lambda

Chance of
Observation

Probability
Weight

Posterior
Probability

30%
60%
10%

3
1
4

0.22404
0.06131
0.19537

0.06721
0.03679
0.01954

54.4%
29.8%
15.8%

0.12354

100.0%

100%

By Bayes Theorem,
Prob[Smoker | 3 colds] = Prob[3 colds | smoker] Prob[smoker] / Prob[3 colds] =
(0.19537)(0.1) / {(0.22404)(0.3) + (0.06131)(0.6) + (0.19537)(0.1)} =
0.01954/0.12354 = 15.8%.
2.76. B. The overall mean is: {(1)(900) + (10)(90) + (20)(10)}/1000 = 2.
Since we are mixing Poissons, EPV = overall mean = 2.
The 2nd moment of the hypothetical means is: {(12 )(900) + (102 )(90) + (202 )(10)}/1000 = 13.9.
VHM = 13.9 - 22 = 9.9. K = EPV/VHM = 2/9.9 = 0.202. Z = 1/(1 + K) = 1/1.202 = .832.
11.983 = estimate = .832x + (1 - .832)(2). x = 14.
Comment: Given the output, one needs to determine the missing input.

2.77. D. Over 4 years, Type I is Poisson with = 1.00, Type II is Poisson with = 2.00,
and Type III is Poisson with = 4.00. f(1) = e.
A

A Priori
Type of Chance of
Chance
Risk
This Type
of the
of Risk Observation
I
II
III

0.05
0.20
0.75

Overall

0.3679
0.2707
0.0733

Prob. Weight =
Product
of Columns
B&C

Posterior
Chance of
This Type
of Risk

Mean
Annual
Freq.

0.01839
0.05413
0.05495

14.43%
42.47%
43.10%

0.250
0.500
1.000

0.12748

1.000

0.679

2.78. B. EPV = E[] = Mean of Weibull = [1 + 1/] = (.1) (1.5) = 0.088623.


Second Moment of Weibull = 2 [1 + 2/] = (.01) (2) = 0.01.
VHM = Var[] = Variance of Weibull = 0.01 - 0.0886232 = 0.002146.
K = EPV/VHM = 0.088623/0.002146 = 41.3. Z = 500/(500 + 41.3) = .924.
A Priori Mean = E[] = Mean of Weibull = 0.088623.
Estimated Future Frequency: (.924)(35/500) + (1 - .924)(0.088623) = 0.0714 per month.
Estimated number of claims for 300 insureds for 12 months: (12)(300)(0.0714) = 257.
Comment: While we observe 3 months of experience, note that we predict the next 12 months.
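A quick Python check of this Bühlmann-Straub calculation (a sketch under the stated Weibull assumptions; math.gamma is the complete Gamma function):

    from math import gamma

    theta, tau = 0.1, 2                     # Weibull parameters of the prior distribution of lambda
    mean = theta * gamma(1 + 1 / tau)       # 0.088623
    second = theta**2 * gamma(1 + 2 / tau)  # 0.01
    vhm = second - mean**2                  # 0.002146
    epv = mean                              # mixing Poissons: EPV = E[lambda]
    k = epv / vhm                           # about 41.3

    exposures = 100 + 150 + 250             # 500 insured-months observed
    claims = 10 + 11 + 14                   # 35 claims observed
    z = exposures / (exposures + k)         # about 0.924

    freq = z * claims / exposures + (1 - z) * mean   # about 0.0714 claims per insured-month
    print(12 * 300 * freq)                  # about 257 claims over the next 12 months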
2.79. E. Probabilities of Observation: e-1/r!, 3re-3/r!. Probability Weights: .75e-1/r!, .25 3re-3/r!.
Posterior Distribution: .75e-1/(.75e-1 + .25 3re-3), .25 3re-3/(.75e-1 + .25 3re-3).
We are given that: 2.98 = (1).75e-1/(.75e-1 + .25 3re-3) + (3).25 3re-3/(.75e-1 + .25 3re-3).

2.235e-1+ 0.745 3re-3 = .75e-1+ .75 3re-3. 3r = 1.49e-1/(.005e-3) = 298e2 .


r = {ln(298) + 2}/ln(3) = 7.
EPV = E[] = (.75)(1) + (.25)(3) = 1.5 = a priori mean.
VHM = Var[] = (.75)(1 - 1.5)2 + (.25)(3 - 1.5)2 = 0.75.
K = EPV/VHM = 2. Z = 1/(1 + 2) = 1/3. (1/3)(7) + (2/3)(1.5) = 3.33.
Alternately, let w = posterior probability that = 1.
We are given that: 2.98 = (1)w + (3)(1 - w). w = 0.01.

0.01 = .75e-1/(.75e-1 + .25 3re-3) = 1/(1 + 3r-1e-2).


3r-1e-2 = 99. r = 7. Proceed as before.
Comment: Long!

2.80. A. Since we are mixing Poissons, EPV = mean = 0.50 + (1 - )(1.50) = 1.5 - .
Second moment of the hypothetical means is: 0.502 + (1 - )(1.502 ) = 2.25 - 2.
VHM = 2.25 - 2 (1.5 - )2 = 2.
K = EPV/VHM = (1.5 - )/( 2). Z = 1/(1 + K) = ( 2) / (1.5 - 2).
Comment: The number of claims observed in year one does not affect Z.
the Bhlmann credibility factor Z for Year 2 is the Bhlmann credibility factor Z used for predicting
the number of claims in Year 2.
One could take for example = 0.3, do the problem numerically, and then see which of the given
choice matches your solution.
Type
1
2
Average

A Priori
Probability
0.3
0.7

Poisson
Parameter
0.50
1.50

Square of
Mean
0.25
2.25

1.20

1.65

EPV = 1.2. VHM = 1.65 - 1.22 = 0.21. K = 1.2/0.21 = 5.714. Z= 1/(1 + K) = 1/6.714 = 0.1489.
Choice A gives: (.3 - .32 )/(1.5 - .32 ) = 0.21/ 1.41 = 0.1489.

Section 3, Gamma Function and Distribution8


The quantity x^(α-1) e^(-x) is finite for x ≥ 0 and α ≥ 1.
Since it declines quickly to zero as x approaches infinity, its integral from zero to infinity exists.
This is the much studied and tabulated (complete) Gamma Function.

Γ(α) = ∫ from 0 to ∞ of t^(α-1) e^(-t) dt = θ^(-α) ∫ from 0 to ∞ of t^(α-1) e^(-t/θ) dt, for α > 0, θ > 0.

Γ(α) = (α-1)! for integer α.        Γ(α) = (α-1) Γ(α-1).

Γ(1) = 1. Γ(2) = 1. Γ(3) = 2. Γ(4) = 6. Γ(5) = 24. Γ(6) = 120. Γ(7) = 720. Γ(8) = 5040.
One does not need to know how to compute the complete Gamma Function for noninteger alpha.
Many computer programs will give values of the complete Gamma Function.

Γ(1/2) = √π.    Γ(3/2) = 0.5√π.    Γ(-1/2) = -2√π.    Γ(-3/2) = (4/3)√π.

For α ≥ 10:
ln Γ(α) ≈ (α - 0.5) ln α - α + ln(2π)/2 + 1/(12α) - 1/(360α^3) + 1/(1260α^5) - 1/(1680α^7)
+ 1/(1188α^9) - 691/(360,360α^11) + 1/(156α^13) - 3617/(122,400α^15) - ...

For α < 10 use the recursion relationship Γ(α) = (α-1) Γ(α-1).

The Gamma function is undefined at the negative integers and zero.
For large α: Γ(α) ≈ e^(-α) α^(α - 1/2) √(2π), which is Stirling's formula.10
The ratios of two Gamma functions with arguments that differ by an integer can be computed in
terms of a product of factors, just as one would with a ratio of factorials.
Exercise: What is Γ(8) / Γ(5)?
[Solution: Γ(8) / Γ(5) = 7! / 4! = (7)(6)(5) = 210.]
Exercise: What is Γ(8.3) / Γ(5.3)?
[Solution: Γ(8.3) / Γ(5.3) = 7.3! / 4.3! = (7.3)(6.3)(5.3) = 243.747.]

8 See Appendix A of Loss Models. Also see the Handbook of Mathematical Functions, by M. Abramowitz, et. al.
9 See Appendix A of Loss Models, and the Handbook of Mathematical Functions, by M. Abramowitz, et. al.
10 See the Handbook of Mathematical Functions, by M. Abramowitz, et. al.

Note that even when the arguments are not integer, the ratio still involves a product of factors.
The solution of the last exercise depended on the fact that 8.3 - 5.3 = 3 is an integer.
Integrals involving e^(-x) and powers of x can be written in terms of the Gamma function:

∫ from 0 to ∞ of t^(α-1) e^(-t/θ) dt = Γ(α) θ^α,   or for integer n:   ∫ from 0 to ∞ of t^n e^(-ct) dt = n! / c^(n+1).

Exercise: What is the integral from 0 to of: t4 e-t/10?


[Solution: With = 5 and = 10, this integral is: (5) 105 = (4!) (100,000) = 2,400,000.]
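A quick numerical confirmation of this gamma-type integral (an illustrative sketch using the real scipy.integrate.quad routine):

    from math import exp, factorial
    from scipy.integrate import quad

    value, _ = quad(lambda t: t**4 * exp(-t / 10), 0, float("inf"))
    print(value, factorial(4) * 10**5)   # both are 2,400,000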
This formula for gamma-type integrals is very useful for working with anything involving the Gamma
distribution, for example the Gamma-Poisson process. It follows from the definition of the Gamma
function and a change of variables.
The Gamma density in the Appendix of Loss Models is: x1 ex/ / ().
Since this probability density function must integrate to unity, the above formula for gamma-type
integrals follows. This is a useful way to remember this formula on the exam.

Incomplete Gamma Function:


As shown in Appendix A of Loss Models, the Incomplete Gamma Function is defined as:
( ; x) =

t - 1 e- t

dt / ().

( ; 0) = 0. ( ; ) = ()/() = 1. As discussed below, the Incomplete Gamma Function with


the introduction of a scale parameter is the Gamma Distribution.
Exercise: Via integration by parts, put (2 ; x) in terms of Exponentials and powers of x.
[Solution: (2 ; x) =

t e - t dt / (2) =

t =x

t e - t dt = e - t - t e - t ]

t =0

= 1 - e-x - xe-x.]

One can prove via integration by parts that Γ(α ; x) = Γ(α-1 ; x) - x^(α-1) e^(-x) / Γ(α).11
This recursion formula for integer alpha is: Γ(n ; x) = Γ(n-1 ; x) - x^(n-1) e^(-x) / (n-1)!.
Combined with the fact that Γ(1 ; x) = ∫ from 0 to x of e^(-t) dt = 1 - e^(-x), this leads to the following
formula for the Incomplete Gamma for positive integral alpha:12

Γ(n ; x) = 1 - Σ from i = 0 to n-1 of x^i e^(-x) / i!.
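This finite-sum formula is easy to verify with software. A small Python sketch (scipy.special.gammainc is the regularized lower incomplete gamma function, i.e. Γ(α ; x) in this notation):

    from math import exp, factorial
    from scipy.special import gammainc

    def incomplete_gamma_sum(n, x):
        return 1 - sum(x**i * exp(-x) / factorial(i) for i in range(n))

    for n, x in [(2, 1.0), (5, 3.5), (8, 12.5)]:
        print(gammainc(n, x), incomplete_gamma_sum(n, x))   # the two columns agree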

Integrals Involving Exponentials times Powers:


One can use the incomplete Gamma Function to handle integrals involving te-t/.
x

t e- t / dt =

x/

s e-s ds = 2 se-s ds = 2(2 ; x/)(2) = 2{1 - e-x/ - (x/)e-x/}.

t e - t / dt

= 2 {1 - e-x/ - (x/)e-x/ }.

Exercise: What is the integral from 0 to 4.3 of: te-t/5?


[Solution: (52 ) {1 - e-4.3/5 - (4.3/5)e-4.3/5} = 5.32.]
Such integrals can also be done via integration by parts, or one can make use of the formula for the
Limited Expected Value of an Exponential Distribution:13
x

t e- t / dt = t e- t / / dt = {E[X

x] - xS(x)} =

{(1 - e-x/) - xe-x/} = 2{1 - e-x/ - (x/)e-x/}.

11

See for example, Formula 6.5.13 in the Handbook of Mathematical Functions, by Abramowitz, et. al.
See Theorem A.1 in Appendix A of Loss Models. One can also establish this result by computing the waiting time
until the nth claim for a Poisson Process, as shown in Mahlers Guide to Stochastic Processes, on another exam.
13
See Appendix A of Loss Models.
12

When the upper limit is infinity, the integral simplifies:

t e - t dt = 2.

14

In a similar manner, one can use the incomplete Gamma Function to handle integrals involving
tn e-t/, for n integer:
x

tn

e- t/

dt

= n+1

x /

sn

e -s

ds

= n+1(n+1; x/)(n+1)

= n!

n+1{1

x ei!

-x

}.

i =0

Exercise: What is the integral from 0 to 4.3 of: t2 e-t/5?


x

[Solution:

x /

0 t2 e - t / dt = 3 0 s2 e - s ds = 3 (3 ; x/) (3) =

23 {1 - e-x/ - (x/)e-x/ - (x/)2 e-x//2}.


For = 5 and x = 4.3, this is:
250 {1 - e-0.86 - 0.86e-0.86 - 0.862 e-0.86/2} = 14.108.]
x

In general,

0 t2 e- t / dt = 23 {1 - e-x/ - (x/)e-x/ - (x/)2 e-x//2}.

If one divided by , then the integrand would be t times the density of an Exponential Distribution.
Therefore, the given integral is (mean of an Exponential Distribution) = 2.
14

Gamma Distribution:15
The Gamma Distribution can be defined in terms of the Incomplete Gamma Function,
F(x) = ( ; x/ ). Note that (; ) = () / () = 1 and (; 0) = 0, so we have as required for a
distribution function F() = 1 and F(0) = 0.

f(x) =

(x / ) e - x /
x 1 e - x /
=
, x > .
x ( )
(a)

Exercise: What is the mean of a Gamma Distribution?


[Solution:

x f(x) dx =

x-1

e - x/

()

x e - x/ dx

dx = 0

()

(+ 1) + 1
(+ 1)
=
q = .]

()
()

Exercise: What is the nth moment of a Gamma Distribution?


[Solution:

xn f(x) dx = xn
0

x- 1

e - x/

()

xn + 1 e - x/ dx

dx = 0

()

(+ n) + n (+ n) n
=

( )
()

= (+n-1)(+n-2)....() n .
Comment: This is the formula shown in Appendix A of Loss Models.]
Exercise: What is the 3rd moment of a Gamma Distribution with = 6 and = 10?
[Solution: (+n-1)(+n-2)....()n = (6+3-1)(6+3-2)(6)(103 ) = (8)(7)(6)(1000) = 336,000.]
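The moment formula can be checked both ways with a short Python sketch (illustrative only; the closed form is the product (α)(α+1)...(α+n-1) θ^n, and the numerical value comes from integrating x^n times the Gamma density):

    from math import exp, gamma
    from scipy.integrate import quad

    alpha, theta, n = 6, 10, 3

    def density(x):
        return x**(alpha - 1) * exp(-x / theta) / (gamma(alpha) * theta**alpha)

    numerical, _ = quad(lambda x: x**n * density(x), 0, float("inf"))

    closed_form = theta**n
    for i in range(n):
        closed_form *= alpha + i         # (alpha)(alpha+1)...(alpha+n-1)

    print(numerical, closed_form)        # both are 336,000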

15

See Mahlers Guide to Loss Distributions.

Problems:
3.1 (1 point) What is the value of the integral from zero to infinity of: x6 e-4x?
A. less than 0.04
B. at least 0.04 but less than 0.05
C. at least 0.05 but less than 0.06
D. at least 0.06 but less than 0.07
E. at least 0.07
3.2 (1 point) What is the density at x = 15 of the Gamma distribution with parameters
= 4 and = 10?
A. less than 0.012
B. at least 0.012 but less than 0.013
C. at least 0.013 but less than 0.014
D. at least 0.014 but less than 0.015
E. at least 0.015
3.3 (1 point) What is the value of the integral from zero to infinity of: x-5 e-7/x?
A. less than 0.002
B. at least 0.002 but less than 0.003
C. at least 0.003 but less than 0.004
D. at least 0.004 but less than 0.005
E. at least 0.005
3.4 (2 points) What is the integral from 3 to 12 of: x e-x/3?
A. 5.2

B. 5.4

C. 5.6

D. 5.8

E. 6.0

3.5 (1 point) What is the density at x = 70 of the Gamma distribution with parameters
= 5 and = 20?
A. less than 0.008
B. at least 0.008 but less than 0.009
C. at least 0.009 but less than 0.010
D. at least 0.010 but less than 0.011
E. at least 0.011

Solutions to Problems:
3.1. B.

t1 et/ dt = (). Set - 1= 6 and = 1/4.


0

t1 et/ dt = (6+1) / 46+1 = 6! / 47 = 0.0439.


0

3.2. B. x1 ex/ / () = (10-4) 153 e-1.5 / (4) = 0.0126.


3.3. B. The density of the Inverse Gamma is: e/x /{x+1 ()}, 0 < x < .
Since this density integrates to one, x(+1) e/x integrates to ().
Thus taking = 4 and = 7, x-5e-7/x integrates to: 7-4 (4) = 6 / 74 = 0.0025.
Comment: Alternately, one can make the change of variables y = 1/x and convert this to the integral
of a Gamma density, rather than that of an Inverse Gamma Density.
3.4. D.

te-t/ dt = 2{1 - e-x/ - (x/)e-x/}.

Set = 3.

0
12

12

x = 12

te-t/3 dt = te-t/3 dt - te-t/3 dt = (32){1 - e-x/3 - (x/3)e-x/3}] =


3

(9){e-1 + (1)e-1 - e-4 - (4)e-4} = 5.80.


Comment: Can also be done using integration by parts.
3.5. C. (x/) ex/ / {x ()} = (70/20)5 e-70/20 / {70 (5)}
= (3.5)5 e-3.5 / {(70) (24)} = 0.00944.

x=3

Section 4, Gamma-Poisson
The number of claims a particular policyholder makes in a year is assumed to be Poisson with mean
. Recall that for a Poisson Distribution with parameter the chance of having n claims is given by:
n e / n!. For example the chance of having 6 claims is given by: 6 e / 6!.
Prior Distribution:16
Assume the values of the portfolio of policyholders are Gamma distributed with = 3 and
= 2/3, and therefore probability density function:17
f(λ) = 1.6875 λ^2 e^(-1.5λ),   λ ≥ 0.

The prior Gamma is displayed below: [graph omitted]

The Prior Distribution Function is given in terms of the Incomplete Gamma Function:18
F() = (3; 1.5). So for example, the a priori chance that the value lies between
4 and 5 is: F(5) - F(4) = (3; 7.5) - (3; 6) = 0.9797 - 0.9380 = 0.0417.19
Graphically, this is the area between 4 and 5 and under the prior Gamma.
16

The first portion of this example is also in Mahlers Guide to Frequency Distributions.
However, here we introduce observations and then apply Bayes Analysis and Buhlmann Credibility.
17
For the Gamma Distribution, f(x) = x1e -x//{() }. One can look up the formulas for the density and
distribution function of a Gamma Distribution in the tables attached to the exam.
18
For the Gamma Distribution, F(x) = (; x/).
19
These values of the Incomplete Gamma Function were calculated on a computer using Mathematica.

Marginal Distribution (Prior Mixed Distribution):


If we have a risk and do not know what type it is, in order to get the chance of having 6 claims, one
would weight together the chances of having 6 claims, using the a priori probabilities and integrating
from zero to infinity20:

6 -

6 e-
e
2 e- 1.5 d = 0.00234375
f()
d
=
1.6875

6!
6!
8 e- 2.5 d .
0
0
0

This integral can be written in terms of the Gamma function, as was shown in a previous section:

1 e- /

d = () .

Thus

8 e- 2.5 d = (9) 2.5-9 = (8!) (0.4)9 10.57.


0

Thus the probability of having 6 claims (0.00234375)(10.57) 2.5%.


More generally, if the distribution of Poisson parameters is given by a Gamma distribution
f() = 1 e/ / (), and we compute the chance of having n accidents by integrating from
zero to infinity:

6 -

n e-
e
1 e- /
1
n + 1 e- (1 + 1 / ) d =
d =
n! f() d = 6!
()
()

n!

0
0
0

n
1
(n + )
( + 1)...( + n -1)
=
.
(1 + )n +
n! () (1 + 1/ )n +
n!
The mixed distribution is in the form of the Negative Binomial distribution with parameters
r = and = :
Probability of x accidents =

x
r(r +1)...(r + x - 1)
.
(1+ ) x + r
x!

For the specific case dealt with previously: r = = 3 and, = = 2/3.

Note the way both the Gamma and the Poisson have terms involving powers of and e and these similar terms
combine in the product.
20

This marginal Negative Binomial is displayed below, through 10 claims: [graph omitted]

The chance of having 6 claims is:
{(3)(4)(5)(6)(7)(8) / 6!} (2/3)^6 / (1 + 2/3)^(6+3) = 2.477%.
This is the same result as calculated above.


On the exam, one should not go through this calculation above.
Rather remember that for the Gamma-Poisson the (prior) marginal distribution is always a
Negative Binomial, with r = = shape parameter of the (prior) Gamma and = = scale
parameter of the (prior) Gamma.21
r goes with alpha, beta rhymes with theta.
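A short numerical illustration of this fact (not needed on the exam; the sketch mixes the Poisson against the prior Gamma by numerical integration and compares with the Negative Binomial probability):

    from math import exp, gamma, factorial
    from scipy.integrate import quad

    alpha, theta = 3, 2 / 3          # prior Gamma parameters
    r, beta = alpha, theta           # resulting Negative Binomial parameters

    def gamma_density(lam):
        return lam**(alpha - 1) * exp(-lam / theta) / (gamma(alpha) * theta**alpha)

    def mixed_prob(n):               # integrate the Poisson chance of n claims against the prior
        value, _ = quad(lambda lam: lam**n * exp(-lam) / factorial(n) * gamma_density(lam),
                        0, float("inf"))
        return value

    def neg_binomial_prob(n):
        coeff = 1.0
        for i in range(n):
            coeff *= (r + i) / (i + 1)      # r(r+1)...(r+n-1) / n!
        return coeff * beta**n / (1 + beta)**(n + r)

    print(mixed_prob(6), neg_binomial_prob(6))   # both are about 0.02477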
Prior Mean:
Note that the overall (a priori) mean can be computed in either one of two ways.
First one can weight together the means for each type of risk, using the a priori probabilities. This is
E[] = the mean of the prior Gamma = = 3(2/3) = 2. Alternately, one can compute the mean of
the marginal distribution: the mean of a Negative Binomial is r = 3(2/3) = 2.
Of course the two results match.

goes with , and they rhyme, leaving r to go with . If integer, is the number of identical Exponentials one adds
in order to get a Gamma; while, if integer, r is the number of identical Geometric variables one adds in order to get a
Negative Binomial.
21

Prior Expected Value of the Process Variance:


The process variance for an individual risk is its Poisson parameter since the frequency for each risk
is Poisson. Therefore the expected value of the process variance = the expected value of = the a
priori mean frequency = = 3(2/3) = 2.
Prior Variance of the Hypothetical Means:
The variance of the hypothetical means is the variance of = Var[] =
Variance of the Prior Gamma = 2 = 3 (2/3)2 = 1.33.
Prior Total Variance:
The total variance is the variance of the marginal distribution, which for the Negative Binomial equals
r(1+) = 3(2/3)(5/3) = 3.33. The Expected Value of the Process Variance +
Variance of the Hypothetical Means = 2 + 1.33 = 3.33 = Total Variance.
In general, The Expected Value of the Process Variance + Variance of the Hypothetical Means =
+ 2 = (1+) = r(1+) = Total Variance.
For the Gamma-Poisson we have: Variance of the Gamma = 2 = r2 =

r(1+) =
r(1+) =
(Variance of the marginal Negative Binomial).
1+
1+
1+
Mean of the Gamma = = r =

1
1
r(1+) =
r(1+) =
1+
1+

1
(Variance of the marginal Negative Binomial).
1+
Therefore, Variance of the Gamma + Mean of the Gamma =

1
(
+
) (Variance of the marginal Negative Binomial) =
1+ 1+
Variance of the marginal Negative Binomial.
Which is just another way of saying that: EPV + VHM = Total Variance.
VHM = the variance of the Gamma.
Total Variance = the variance of the Negative Binomial = EPV + VHM.
αθ^2 = Variance of Gamma < Variance of Negative Binomial = rβ(1+β) = αθ + αθ^2.
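The split of the total variance into EPV plus VHM can also be seen by simulation. A simulation sketch for this α = 3, θ = 2/3 example (illustrative only, using NumPy's random generator):

    import numpy as np

    rng = np.random.default_rng(seed=1)
    alpha, theta = 3, 2 / 3

    lams = rng.gamma(shape=alpha, scale=theta, size=1_000_000)   # hypothetical means
    claims = rng.poisson(lams)                                   # one year of claims per risk

    print(claims.mean())    # close to the a priori mean of 2
    print(claims.var())     # close to the total variance of 3.33 = EPV (2) + VHM (1.33)
    print(lams.var())       # close to the VHM of 1.33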

Observations:
Let us now introduce the concept of observations.
A risk is selected at random and it is observed to have 5 claims in one year.
Posterior Distribution:
We can employ Bayesian analysis to compute what the chances are that the selected risk had a
given Poisson Parameter. Given a Poisson with parameter λ, the chance of observing 5 claims is:
λ^5 e^(-λ) / 5!. The a priori probability of λ is the Prior Gamma distribution: f(λ) = 1.6875 λ² e^(-1.5λ).
Thus the posterior chance of λ is proportional to the product of the chance of observation and the
a priori probability: λ^7 e^(-2.5λ).
This is proportional to the density for a Gamma distribution with α = 8 and θ = 1/2.5 = 2/5.
For an observation of 5 claims, the posterior Gamma is displayed below:

[Graph of the posterior Gamma density; horizontal axis: Poisson Parameter.]

The Posterior Distribution Function is given in terms of the Incomplete Gamma Function:
F(λ) = Γ(8; 2.5λ). So for example, the posterior chance that the value lies between 4 and 5 is:
F(5) - F(4) = Γ(8; 12.5) - Γ(8; 10) = 0.9302 - 0.7798 = 0.1504.
Graphically, this is the area between 4 and 5 and under this posterior Gamma. Note how observing
5 claims in one year has increased the chance of the Poisson parameter being in the interval from 4
to 5, from 0.0417 to 0.1504. This is an example of a Bayesian Interval Estimate.
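Those with a computer handy can confirm these interval probabilities directly from the Gamma distribution functions; here is a brief Python sketch (assuming scipy is installed):

from scipy import stats

posterior = stats.gamma(a=8, scale=0.4)     # posterior Gamma
prior = stats.gamma(a=3, scale=2.0/3.0)     # prior Gamma

print(posterior.cdf(5) - posterior.cdf(4))  # about 0.1504
print(prior.cdf(5) - prior.cdf(4))          # about 0.0417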


For an observation of 5 claims, the posterior Gamma with α = 8 and θ = 2/5, and the prior Gamma
with α = 3 and θ = 2/3, are compared below:
[Graph comparing the Prior and Posterior Gamma densities; horizontal axis: lambda.]

After observing 5 claims in a year, the probability that this risk has a small Poisson parameter has
decreased, while the probability that it has a large Poisson parameter has increased.
In general, if one observes C claims for E exposures, we have that the chance of the observation
given λ is proportional to (Eλ)^C e^(-Eλ).22 This is proportional to λ^C e^(-Eλ). The prior Gamma is
proportional to λ^(α-1) e^(-λ/θ). Thus the posterior probability for λ is proportional to the product:
λ^(C+α-1) e^(-λ(E+1/θ)). This is proportional to the density for a Gamma distribution with a shape
parameter of: C + α, and scale parameter of: 1/(E + 1/θ) = θ/(1 + Eθ).
Exercise: A risk is selected at random and it is observed to have 0 rather than 5 claims in one year.
Determine the posterior probability that the mean future expected frequency for this risk lies
between 4 and 5.
[Solution: The posterior distribution is a Gamma distribution with shape parameter of
α' = C + α = 0 + 3 = 3 and 1/θ' = 1/θ + E = 3/2 + 1 = 5/2, so θ' = 2/5. In other words,
F(λ) = Γ(3; λ/(2/5)) = Γ(3; 2.5λ). So the posterior chance that the value lies between 4 and 5 is:
F(5) - F(4) = Γ(3; 12.5) - Γ(3; 10) = 0.99966 - 0.99723 = 0.00243.
Comment: Graphically, the solution to this exercise is the area between 4 and 5 and under this
posterior Gamma. Note how observing 0 claims in one year has decreased the chance of the
Poisson parameter being in the interval from 4 to 5.]
22 The Poisson parameter for E exposures is Eλ.


For an observation of 0 claims, the posterior Gamma with α = 3 and θ = 2/5, and the prior Gamma
with α = 3 and θ = 2/3, are compared below:
[Graph comparing the Prior and Posterior Gamma densities; horizontal axis: lambda.]

After observing no claims in a year, the probability that this risk has a small Poisson parameter has
increased, while the probability that it has a large Poisson parameter has decreased.
For the Gamma-Poisson the posterior density function is also a Gamma.
This posterior Gamma has a shape parameter =
prior shape parameter + the number of claims observed.
This posterior Gamma has a scale parameter =
1 / {1/(Prior scale parameter) + number of exposures (usually years) observed}.
The updating formulas are:
Posterior α = Prior α + C.
1/(Posterior θ) = 1/(Prior θ) + E.
For example, in the case where we observed 5 claims in 1 year, C = 5 and E = 1.


The prior shape parameter was 3 while the prior scale parameter was 2/3.
Therefore the posterior shape parameter = 3 + 5 = 8,
while the posterior scale parameter = 1/(3/2 + 1) = 2/5, matching the result obtained above.
The fact that the posterior distribution is of the same form as the prior distribution is why the Gamma
is a Conjugate Prior Distribution for the Poisson.
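The updating formulas are easy to automate; here is a minimal Python sketch of such a helper (the function name is just for illustration):

def update_gamma_poisson(alpha, theta, claims, exposures):
    # Posterior shape = prior shape + C; 1/(posterior scale) = 1/(prior scale) + E.
    post_alpha = alpha + claims
    post_theta = 1.0 / (1.0 / theta + exposures)
    return post_alpha, post_theta

print(update_gamma_poisson(3, 2/3, 5, 1))   # (8, 0.4), as in the example above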


Predictive Distribution:
Since the posterior distribution is also a Gamma distribution, the same analysis that led to a
Negative Binomial (prior) marginal distribution, will lead to a (posterior) predictive distribution that is
Negative Binomial. However, the parameters of the predictive Negative Binomial are related to the
posterior Gamma.
For the Gamma-Poisson the (posterior) predictive distribution is always a Negative
Binomial, with r = shape parameter of the posterior Gamma, and
β = scale parameter of the posterior Gamma.
Thus for the Predictive Negative Binomial:
r = shape parameter of the prior Gamma + number of claims observed, while
β = 1 / {1/(Scale parameter of the Prior Gamma) + number of exposures observed}.
In the particular example, r = 3 + 5 = 8, and β = 1/(1/(2/3) + 1) = 2/5 = 0.4. Thus posterior to the
observation of 5 claims, the chance of observing n claims in a year is given by:
{(8)(9)...(8 + n - 1) / n!} 0.4^n / 1.4^(n + 8).
Therefore posterior to having observed 5 claims in one year, the chance of observing 6 claims in a
future year is:
{13! / (7! 6!)} 0.4^6 / 1.4^(6 + 8) ≅ 6.3%.

Our estimate of the chance of having 6 claims has been increased by the observations from 2.5% to
6.3%.
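Again as a check, a couple of lines of Python reproduce this 6.3% from the predictive Negative Binomial:

from math import comb

r, beta = 8, 0.4
n = 6
prob = comb(r + n - 1, n) * beta**n / (1.0 + beta)**(n + r)
print(prob)   # about 0.063, versus about 0.025 from the prior marginal distribution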


Below the prior marginal Negative Binomial distribution (triangles) and posterior predictive Negative
Binomial distribution (squares) are compared:


Observing 5 claims has increased the probability of seeing a large number of claims from this risk in
the future.

Posterior Mean:
One can compute the means and variances posterior to the observations. The posterior mean can
be computed in either one of two ways.
First one can weight together the means for each type of risk, using the posterior probabilities.
This is E[λ] = the mean of the posterior Gamma = αθ = 8/2.5 = 3.2.
Alternately, one can compute the mean of the predictive distribution: the mean of a Negative
Binomial is: rβ = 8/2.5 = 3.2. Of course the two results match.
Thus the new estimate posterior to the observations for this risk using Bayesian Analysis is 3.2. This
compares to the a priori estimate of 2. In general, the observations provide information about the
given risk, which allows one to make a better estimate of the future experience of that risk. Not
surprisingly observing 5 claims in a single year has raised the estimated frequency from 2 to 3.2.


Posterior Expected Value of the Process Variance:


Just as prior to the observations, posterior to the observations one can compute three variances:
the expected value of the process variance, the variance of the hypothetical pure premiums, and the
total variance. The process variance for an individual risk is its Poisson parameter λ since the
frequency for each risk is Poisson. Therefore the expected value of the process variance =
the expected value of λ = the posterior mean frequency = 3.2.
Posterior Variance of the Hypothetical Means:
The variance of the hypothetical means is:
the variance of λ = Var[λ] = Variance of the Posterior Gamma = αθ² = (8)(2/5)² = 1.28.
After the observation the variance of the hypothetical means is less than prior (1.28 < 1.33)
since the observations have allowed us to narrow down the possibilities.23
Posterior Total Variance:
The total variance is the variance of the predictive distribution. The variance of the Negative Binomial
equals: rβ(1+β) = (8)(0.4)(1.4) = 4.48.
The Expected Value of the Process Variance + Variance of the Hypothetical Means = 3.2 + 1.28 =
4.48 = Total Variance.
In general, EPV + VHM = αθ + αθ² = rβ(1+β) = Total Variance.

23 While the posterior VHM is usually less than the prior VHM, when the observation is sufficiently far from our prior
expectations, the posterior VHM can be larger than the prior VHM. For example, with a prior Gamma with α = 2 and
θ = 1/10, if we observe 5 claims in one year, then the posterior Gamma has parameters: α = 7 and θ = 1/11.
The posterior VHM is 7/11² = 0.058, which is greater than the prior VHM = 2/10² = 0.020.
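The posterior variances, and the footnote's counterexample, can also be checked with a short Python sketch:

def gamma_poisson_moments(alpha, theta):
    epv = alpha * theta          # expected value of the process variance
    vhm = alpha * theta**2       # variance of the hypothetical means
    return epv, vhm, epv + vhm   # the sum is the variance of the predictive Negative Binomial

print(gamma_poisson_moments(8, 0.4))    # (3.2, 1.28, 4.48)

# Footnote example: prior Gamma alpha = 2, theta = 1/10; 5 claims in 1 year gives alpha = 7, theta = 1/11.
print(2 * (1/10)**2, 7 * (1/11)**2)     # 0.020 versus about 0.058, so here the posterior VHM is larger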


Buhlmann Credibility:
Next, let's apply Buhlmann Credibility to this example. The Buhlmann Credibility parameter K =
the (prior) expected value of the process variance / the (prior) variance of the hypothetical means =
2 / (4/3) = 1.5. Note that K can be computed prior to any observations and doesn't depend on them.
Specifically, both variances are for a single insured for one year.
For the Gamma-Poisson in general, K = EPV / VHM = αθ / (αθ²) = 1/θ.
For the Gamma-Poisson the Buhlmann credibility parameter K is equal to the inverse of
the scale parameter of the Prior Gamma.
For the example, K = 1/θ = 1/(2/3) = 1.5.
Having observed 5 claims in one year, Z = 1 / (1+ 1.5) = 0.4. The observation = 5.
The a priori mean = 2. Therefore, the new estimate = (0.4)(5) + (1 - 0.4)(2) = 3.2.
Note that in this case the estimate from Buhlmann Credibility matches the estimate from Bayesian
Analysis.
For the Gamma-Poisson the estimates from using Bayesian Analysis and
Buhlmann Credibility are equal.24
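A few lines of Python make this agreement concrete for the example above (a sketch, using the same numbers):

alpha, theta = 3.0, 2.0/3.0
C, E = 5, 1                                        # 5 claims observed in 1 year

K = 1.0 / theta                                    # EPV / VHM = (alpha*theta) / (alpha*theta**2)
Z = E / (E + K)                                    # credibility assigned to one year of data
buhlmann = Z * (C / E) + (1 - Z) * alpha * theta   # credibility-weighted estimate
bayes = (alpha + C) / (E + 1.0 / theta)            # mean of the posterior Gamma

print(K, Z, buhlmann, bayes)                       # 1.5, 0.4, 3.2, 3.2: the two estimates agree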
Summary:
The many different aspects of the Gamma-Poisson are summarized in the diagram below. It would be
a good idea to know everything on that diagram for the exam.
The Gamma distribution is a distribution of parameters, while the Negative Binomial is a distribution
of number of claims.
Be sure to be able to clearly distinguish between the situation prior to observations and that
posterior to the observations.
It is important to note that the Exponential distribution is a special case of the Gamma distribution, for
α = 1. Therefore, many exam questions involving the Exponential-Poisson can be answered
quickly as a special case of the Gamma-Poisson.

24 As discussed in a subsequent section, this is a special case of the general results for conjugate priors of members
of linear exponential families. This is an example of what Loss Models refers to as exact credibility.


Gamma-Poisson Frequency Process

Poisson parameters of the individuals making up the entire portfolio are distributed via a Gamma
Distribution with parameters α and θ: f(x) = θ^(-α) x^(α-1) e^(-x/θ) / Γ[α], mean = αθ, variance = αθ².

Gamma Prior (Distribution of Parameters):
Shape parameter = alpha = α. Scale parameter = theta = θ.

Mixing the Poisson process over the prior Gamma gives the Negative Binomial Marginal Distribution
(Number of Claims):
r = shape parameter of the Prior Gamma = α.
β = scale parameter of the Prior Gamma = θ.
Mean = rβ = αθ. Variance = rβ(1+β) = αθ + αθ².

Observations: # claims = C, # exposures = E.

Gamma Posterior (Distribution of Parameters):
Posterior Shape parameter: α' = α + C.
Posterior Scale parameter: 1/θ' = 1/θ + E.

Mixing the Poisson process over the posterior Gamma gives the Negative Binomial Predictive
Distribution (Number of Claims):
r = shape parameter of the Posterior Gamma = α' = α + C.
β = scale parameter of the Posterior Gamma = θ' = 1/(E + 1/θ).
Mean = rβ = (α + C)/(E + 1/θ).
Variance = rβ(1+β) = (α + C)/(E + 1/θ) + (α + C)/(E + 1/θ)².

Gamma is a Conjugate Prior; Poisson is a Member of a Linear Exponential Family.
Buhlmann Credibility Estimate = Bayes Analysis Estimate.
Buhlmann Credibility Parameter, K = 1/θ.


Comparing the Gamma and Negative Binomial Distributions:


Gamma Prior (Distribution of Parameters): Shape parameter = α. Scale parameter = θ.
Mixing the Poisson process over this Gamma gives the Negative Binomial Marginal Distribution
(# of Claims), with r = α and β = θ.

The Gamma is a distribution of each insured's mean frequency.
The a priori mean frequency is the mean of the Gamma.
The mean of the Negative Binomial is also the a priori mean.
αθ = Mean of Gamma = Mean of Negative Binomial = rβ.
VHM = the variance of the Gamma.
Total Variance = variance of the Negative Binomial.
Total Variance = EPV + VHM > VHM.
αθ² = Variance of Gamma < Variance of Negative Binomial = rβ(1+β) = αθ + αθ².


Predictive Distribution for More than One Year:


In the example, the posterior distribution of λ was Gamma with α = 8 and θ = 0.4. Therefore, the
predictive distribution for the next year was Negative Binomial with r = 8 and β = 0.4. This predictive
distribution was used to determine the probability of having a certain number of claims during the
next year.
Sometimes one is interested in the number of claims over several future years. For example, let us
determine the probability of having a certain number of claims during the next two years.
The number of claims over one year is Poisson with mean λ. Therefore, the number of claims over
two years is Poisson with mean 2λ. The posterior distribution of 2λ is Gamma with α = 8 and
θ = (2)(0.4) = 0.8. Thus mixing this Poisson by this Gamma, the distribution of the number of claims
for the next two years is Negative Binomial with r = 8 and β = 0.8.
Exercise: What is the probability that this insured has 3 claims over the next two years?
[Solution: f(3) = {r(r+1)(r+2)/3!} β³/(1+β)^(3+r) = {(8)(9)(10)/6} 0.8³/1.8^11 = 9.56%.]
In general, if the posterior distribution is Gamma with parameters α and θ, then over Y future years
the predictive distribution is a Negative Binomial with parameters r = α and β = Yθ.25
Note that we do not add the predictive Negative Binomial for one year to itself. This would be the
correct thing to do if we assumed each year had a different lambda picked at random. Here we are
assuming that each year has the same unknown lambda.
Exercise: Alan and Bob each have 5 claims over one year.
What is the probability that they have in total 3 claims over the next year?
[Solution: Each of Alan and Bob has a posterior distribution of λ which is Gamma with α = 8 and
θ = 0.4. Therefore, the predictive distribution for the next year for each of them is Negative Binomial
with r = 8 and β = 0.4. The numbers of claims Alan and Bob will have are independent.
Therefore, the sum of their claims next year is the sum of their Negative Binomials, a Negative
Binomial Distribution with r = (2)(8) = 16 and β = 0.4.
f(3) = {r(r+1)(r+2)/3!} β³/(1+β)^(3+r) = {(16)(17)(18)/6} 0.4³/1.4^19 = 8.74%.
Comment: While the Negative Binomial with r = 16 and β = 0.4 has the same mean as the one with
r = 8 and β = 0.8, it does not have the same probabilities. 8.74% ≠ 9.56%.]
25 For the Gamma-Poisson, the mixed distribution for Y years of data is given by a Negative Binomial Distribution, with
parameters r = α and β = Yθ. See Mahler's Guide to Frequency Distributions.
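The distinction drawn in the last exercise is easy to check numerically; this brief Python sketch evaluates both Negative Binomial densities at 3 claims:

from math import comb

def nb_pmf(n, r, beta):
    return comb(r + n - 1, n) * beta**n / (1.0 + beta)**(n + r)

print(nb_pmf(3, 8, 0.8))    # about 0.0956: one insured over two years (same unknown lambda)
print(nb_pmf(3, 16, 0.4))   # about 0.0874: Alan plus Bob over one year (independent lambdas)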


Problems:
4.1 (2 points) For an insurance portfolio the distribution of the number of claims a particular
policyholder makes in a year is Poisson with mean λ. The λ-values of the policyholders follow the
Gamma distribution, with parameters α = 5 and θ = 1/8. The probability that a policyholder chosen at
random will experience x claims is given by which of the following?
A. {(5)(6) ... (4 + x) / x!} 8^5 / 9^x
B. {(5)(6) ... (4 + x) / x!} 8^5 / 9^(x+5)
C. {(8)(9) ... (7 + x) / x!} 8^5 / 9^x
D. {(8)(9) ... (7 + x) / x!} 8^5 / 9^(x+5)
E. None of A, B, C, or D.

Use the following information to answer the following two questions:


Let the likelihood of a claim be given by a Poisson distribution with parameter λ.
The prior density function of λ is given by f(λ) = 3125 λ^4 e^(-5λ) / 24.
You observe 1 claim in 2 years.
4.2 (2 points) The posterior density function of λ is proportional to which of the following?
A. λ^3 e^(-5λ)
B. λ^4 e^(-6λ)
C. λ^5 e^(-7λ)
D. λ^6 e^(-8λ)
E. None of A, B, C, or D.
4.3 (2 points) What is the Buhlmann credibility estimate of the posterior mean claim frequency?
A. less than 0.70
B. at least 0.70 but less than 0.75
C. at least 0.75 but less than 0.80
D. at least 0.80 but less than 0.85
E. at least 0.85


Use the following information to answer the next 14 questions:


The number of claims a particular policyholder makes in a year is Poisson with mean λ.
The λ values of the portfolio of policyholders have probability density function:
f(λ) = (100,000 / 24) e^(-10λ) λ^4.
You are given the following values of the Incomplete Gamma Function Γ(α; y):

  y      α = 4    α = 5    α = 6    α = 7
 3.0     0.353    0.185    0.084    0.034
 4.0     0.567    0.371    0.215    0.111
 5.4     0.787    0.627    0.454    0.298
 7.2     0.928    0.844    0.724    0.580

4.4 (1 point) What is the mean claim frequency for the portfolio?
A. less than 35%
B. at least 35% but less than 45%
C. at least 45% but less than 55%
D. at least 55% but less than 65%
E. at least 65%
4.5 (1 point) What is the probability that an insured picked at random from this portfolio will have a
Poisson parameter between 0.3 and 0.4?
A. less than 15%
B. at least 15% but less than 16%
C. at least 16% but less than 17%
D. at least 17% but less than 18%
E. at least 18%
4.6 (2 points) What is the probability that a policyholder chosen at random will experience
3 claims in a year?
A. less than 0.6%
B. at least 0.6% but less than 0.9%
C. at least 0.9% but less than 1.2%
D. at least 1.2% but less than 1.5%
E. at least 1.5%


4.7 (1 point) What is the variance of the claim frequency for the portfolio?
A. less than 0.54
B. at least 0.54 but less than 0.56
C. at least 0.56 but less than 0.58
D. at least 0.58 but less than 0.60
E. at least 0.60
4.8 (1 point) What is the expected value of the process variance?
A. less than 0.48
B. at least 0.48 but less than 0.50
C. at least 0.50 but less than 0.52
D. at least 0.52 but less than 0.54
E. at least 0.54
4.9 (1 point) What is the variance of the hypothetical mean frequencies?
A. less than 0.042
B. at least 0.042 but less than 0.044
C. at least 0.044 but less than 0.046
D. at least 0.046 but less than 0.048
E. at least 0.048
4.10 (1 point) An insured has 2 claims over 8 years.
Using Buhlmann Credibility what is the estimate of this insured's expected claim frequency?
A. less than 40%
B. at least 40% but less than 45%
C. at least 45% but less than 50%
D. at least 50% but less than 55%
E. at least 55%
4.11 (2 points) An insured has 2 claims over 8 years.
What is the posterior probability density function for this insured's Poisson parameter λ?
A. f(λ) = 850305.6 e^(-12λ) λ^8
B. f(λ) = 850305.6 e^(-12λ) λ^6
C. f(λ) = 850305.6 e^(-18λ) λ^8
D. f(λ) = 850305.6 e^(-18λ) λ^6
E. None of A, B, C, or D


4.12 (1 point) An insured has 2 claims over 8 years.


What is the mean of the posterior distribution of λ?
A. less than 40%
B. at least 40% but less than 45%
C. at least 45% but less than 50%
D. at least 50% but less than 55%
E. at least 55%
4.13 (1 point) An insured has 2 claims over 8 years.
What is the probability that this insured has a Poisson parameter between 0.3 and 0.4?
A. 26%
B. 28%
C. 30%
D. 32%
E. 34%
4.14 (2 points) An insured has 2 claims over 8 years.
What is the variance of the posterior distribution of λ?
A. less than 0.014
B. at least 0.014 but less than 0.016
C. at least 0.016 but less than 0.018
D. at least 0.018 but less than 0.020
E. at least 0.020
4.15 (1 point) An insured has 2 claims over 8 years. What is the probability that this insured has a
Poisson parameter between 0.3 and 0.4? Use the Normal Approximation.
A. less than 27%
B. at least 27% but less than 29%
C. at least 29% but less than 31%
D. at least 31% but less than 33%
E. at least 33%
4.16 (2 points) An insured has 2 claims over 8 years.
What is the probability that this insured will experience 3 claims in the next year?
A. 0.6%
B. 0.8%
C. 1.0%
D. 1.2%
E. 1.4%
4.17 (2 points) An insured has 2 claims over 8 years.
What is the variance of the predictive distribution?
A. 0.35
B. 0.37
C. 0.39
D. 0.41

E. 0.43

4.18 (2 points) An insured has 2 claims over 8 years.


What is the probability that this insured will experience 4 claims in the next three years?
A. 2.2%
B. 2.4%
C. 2.6%
D. 2.8%
E. 3.0%


Use the following information for the next 2 questions:


Prior to any observations you assume each group health policyholder has a frequency distribution
which is Poisson, with mean (λ)(number of individuals in that group). You assume that λ is
distributed across the different group health policyholders via a Gamma Distribution.
You observe the following data for a portfolio of group health policyholders:
                                          Year
Policyholder                  1997      1998      1999       Sum
1    # claims                   17        20        16        53
     # in group                  9        10        13        32
2    # claims                   19        23        17        59
     # in group                 11         8         7        26
3    # claims                   26        30        35        91
     # in group                 14        17        18        49
Sum  # claims                   62        73        68       203
     # in group                 34        35        38       107

4.19 (2 points) Prior to any observations, you assume the Gamma Distribution of λ has parameters
α = 9 and θ = 0.2. You expect Policyholder 1 to have 14 individuals in the year 2001.
What is the average number of claims expected in the year 2001 from Policyholder 1?
A. 23.5
B. 24.0
C. 24.5
D. 25.0
E. 25.5
4.20 (2 points) Prior to any observations, you assume the Gamma Distribution of λ has parameters
α = 9 and θ = 0.2. You expect Policyholder 2 to have 7 individuals in the year 2001.
You expect the average claim to cost $800 in the year 2001.
What is the Buhlmann credibility premium in the year 2001 for Policyholder 2?
A. $11,250
B. $11,500
C. $11,750
D. $12,000

E. $12,250

4.21 (3 points) The number of robberies of a given convenience store during the month is assumed
to be Poisson distributed with an unknown mean that varies by store via an Exponential
Distribution with mean 0.015.
The Big Apple Convenience Store on Main Street has had 4 robberies over the last 36 months.
What is the probability that this store will have two robberies over the next 12 months?
A. less than 9%
B. at least 9% but less than 10%
C. at least 10% but less than 11%
D. at least 11% but less than 12%
E. at least 12%


Use the following information to answer the next 6 questions:


The number of claims a particular policyholder makes in a year is Poisson. The values of the Poisson
parameter (for annual claim frequency) for the individual policyholders in a portfolio follow a Gamma
distribution, with parameters α = 3 and θ = 1/12.
4.22 (2 points) What is the chance that an insured picked at random from the portfolio will have no
claims over the next three years?
A. 45%
B. 47%
C. 49%
D. 51%
E. 53%
4.23 (2 points) What is the chance that an insured picked at random from the portfolio will have one
claim over the next three years?
A. 30%
B. 35%
C. 40%
D. 50%
E. 55%
4.24 (2 points) How much credibility would be assigned to three years of data from an insured
picked at random from the portfolio?
A. 10%
B. 15%
C. 20%
D. 25%
E. 30%
4.25 (1 point) An insured picked at random from the portfolio is observed for three years and has
no claims. Use Buhlmann credibility to estimate its future annual claim frequency.
A. 0.14
B. 0.16
C. 0.18
D. 0.20
E. 0.22
4.26 (1 point) An insured picked at random from the portfolio is observed for three years and has
one claim. Use Buhlmann credibility to estimate its future annual claim frequency.
A. less than 0.19
B. at least 0.19 but less than 0.21
C. at least 0.21 but less than 0.23
D. at least 0.23 but less than 0.25
E. at least 0.25
4.27 (3 points) Use Bayesian Analysis to predict the future annual claim frequency of those insureds
who have fewer than two claims over a three year period.
A. less than 0.19
B. at least 0.19 but less than 0.21
C. at least 0.21 but less than 0.23
D. at least 0.23 but less than 0.25
E. at least 0.25


4.28 (3 points) The conditional distribution of the annual number of accidents per driver is Poisson
with mean λ. λ is constant for a particular driver, but varies between different drivers.
The variable λ has a gamma distribution with parameters α = 1.5 and θ = 0.03.
A particular driver, Green Acker, has had a total of 2 accidents over the last 3 years.
What is the probability that Green Acker will have a total of 1 accident over the next 3 years?
(A) 20%
(B) 21%
(C) 22%
(D) 23%
(E) 24%
Use the following information for the next 2 questions:

The random variable representing the number of claims for a single policyholder follows
a Poisson distribution.
For a portfolio of policyholders, the Poisson parameters follow a Gamma distribution
representing the heterogeneity of risks within that portfolio.
The random variable representing the number of claims in a year of a policyholder,
chosen at random, follows a Negative Binomial distribution with parameters:
r = 4 and β = 3/17.
4.29 (1 point) Determine the variance of the Gamma distribution.
(A) 0.110    (B) 0.115    (C) 0.120    (D) 0.125    (E) 0.130
4.30 (2 points) For a policyholder chosen at random from this portfolio, determine the chance of
observing 2 claims over 5 years.
(A) 17.0% (B) 17.5% (C) 18.0% (D) 18.5% (E) 19.0%
Use the following information for the next three questions:
(i) The annual number of claims for each policyholder follows a Poisson distribution with mean λ.
(ii) The distribution of λ across all policyholders has probability density function:
f(λ) = 100λe^(-10λ), λ > 0.
A randomly selected policyholder is known to have had at least one claim last year.
4.31 (3 points) What is the expected future claim frequency of this policyholder?
(A) 0.29
(B) 0.31
(C) 0.33
(D) 0.35
(E) 0.37
4.32 (2 points) Determine the posterior probability that this same policyholder will have no claims
this year.
(A) 0.72
(B) 0.74
(C) 0.76
(D) 0.78
(E) 0.80
4.33 (3 points) Determine the posterior probability that this same policyholder will have at least 2
claims this year.
(A) 3.0%
(B) 3.5%
(C) 4.0%
(D) 4.5%
(E) 5.0%


Use the following information for the next 2 questions:


Prior to any observations you assume each group health policyholder has a frequency distribution
which is Poisson, with mean (λ)(number of individuals in that group).
You assume that λ is distributed across the different group health policyholders via a Gamma
Distribution, with parameters α = 9 and θ = 0.2.
You observe the following data for a portfolio of group health policyholders:
                                          Year
Policyholder                  1997        1998        1999         Sum
1    $ of Loss              $8,700     $11,800     $11,100     $31,600
     # in group                  9          10          13          32
2    $ of Loss             $13,000     $18,200     $27,600     $58,800
     # in group                 14          17          18          49
Sum  $ of Loss             $21,700     $30,000     $38,700     $90,400
     # in group                 23          27          31          81

4.34 (2 points) Assume that the average claim is $600.


What is the expected pure premium for policyholder 2 in the year 2001?
A. 900
B. 1000
C. 1100
D. 1200
E. 1300
4.35 (3 points) Assume that in the year 2001 the average claim cost will be $800.
Assume 7% annual inflation.
Assuming you expect 16 individuals in group 1 in the year 2001, what is the expected cost for
policyholder 1?
A. Less than $19,500
B. At least $19,500, but less than $20,000
C. At least $20,000, but less than $20,500
D. At least $20,500, but less than $21,000
E. At least $21,000

4.36 (2 points) You are given:


(i) The number of claims incurred in a month by any insured has a Poisson distribution with mean λ.
(ii) The claim frequencies of different insureds are independent.
(iii) The prior distribution of λ is exponential with mean 1/8.
(iv) A randomly selected insured has 1 claim in the final quarter of 2004 and 3 claims in 2005.
Determine the credibility estimate of the number of claims for this insured during 2006.
(A) 2.4
(B) 2.5
(C) 2.6
(D) 2.7
(E) 2.8


Use the following information to answer the next 12 questions:


The number of claims a particular policyholder makes in a year is Poisson. The values of the Poisson
parameter (for annual claim frequency) for the individual policyholders in a portfolio of 10,000 follow a
Gamma distribution, with parameters α = 4 and θ = 0.1.
You observe this portfolio for one year and divide it into three groups based on how many claims
you observe for each policyholder:
Group A: Those with no claims.
Group B: Those with one claim.
Group C: Those with two or more claims.
4.37 (1 point) What is the expected size of Group A?
(A) 6200
(B) 6400
(C) 6600
(D) 6800
(E) 7000
4.38 (1 point) What is the expected size of Group B?
(A) 2400
(B) 2500
(C) 2600
(D) 2700
(E) 2800
4.39 (1 point) What is the expected size of Group C?
(A) 630
(B) 650
(C) 670
(D) 690
(E) 710
4.40 (1 point) What is the expected future claim frequency for a member of Group A?
(A) 36%

(B) 38%

(C) 40%

(D) 42%

(E) 44%

4.41 (1 point) What is the expected future claim frequency for a member of Group B?
(A) 37%
(B) 39%
(C) 41%
(D) 43%
(E) 45%
4.42 (3 points) What is the expected future claim frequency for a member of Group C?
(A) 52%
(B) 54%
(C) 56%
(D) 58%
(E) 60%
4.43 (1 point) What is the chance next year of 0 claims from an insured in Group A?
(A) 65%
(B) 67%
(C) 69%
(D) 71%
(E) 73%
4.44 (1 point) What is the chance next year of 0 claims from an insured in Group B?
(A) 65%
(B) 67%
(C) 69%
(D) 71%
(E) 73%
4.45 (3 points) What is the chance next year of 0 claims from an insured in Group C?
(A) 54%
(B) 56%
(C) 58%
(D) 60%
(E) 62%
4.46 (2 points) What is the chance next year of 2 or more claims from an insured in Group A?
(A) 5.1%    (B) 5.3%    (C) 5.5%    (D) 5.7%    (E) 5.9%


4.47 (2 points) What is the chance next year of 2 or more claims from an insured in Group B?
(A) 7.5%
(B) 7.7%
(C) 7.9%
(D) 8.1%
(E) 8.3%
4.48 (4 points) What is the chance next year of 2 or more claims from an insured in Group C?
(A) 12%
(B) 13%
(C) 14%
(D) 15%
(E) 16%

4.49 (3 points) You are given the following:

A portfolio consists of a number of independent risks.

The number of claims per year for each risk follows a Poisson distribution with mean λ.
The prior distribution of λ among the risks in the portfolio is assumed to be a Gamma distribution.

During several years, a positive number of claims are observed for a particular insured
from this portfolio.
Which of the following statements are true?
1. For this insured, the posterior distribution of λ can not be an Exponential.
2. For this insured, the coefficient of variation of the posterior distribution of λ is less than
the coefficient of variation of the prior distribution of λ.
3. The coefficient of variation of the posterior distribution of the number of claims per year for
this insured is less than the coefficient of variation of the a priori distribution of
the number of claims per year.
A. 1, 2
B. 1, 3
C. 2, 3
D. 1, 2, 3
E. None of A, B, C or D
4.50 (2 points) You are given the following information:

For a given group health policy, the number of claims for each member follows
a Poisson distribution with parameter λ.
λ is the same for each member of a given group.
However, λ varies between groups, via a Gamma distribution with mean = 0.08
and variance = 0.0004.

For a particular group, during the latest three years a total of 120 claims has been observed.
In each of the three years, this group had 300 members.
Determine the Bayesian estimate of lambda for this group based upon the recent observations.
A. less than 0.110
B. at least 0.110 but less than 0.115
C. at least 0.115 but less than 0.120
D. at least 0.120 but less than 0.125
E. at least 0.125


Use the following information for the next two questions:


(i) The number of claims incurred in a year by any insured has a Poisson distribution with mean λ.
(ii) For an individual insured, λ is constant over time.
(iii) The claim frequencies of different insureds are independent.
(iv) The prior density of λ is gamma with: f(λ) = (200λ)^4 e^(-200λ) / (6λ).
(v) Preferred homeowners in Territory 5 whose homeowners insurance is written by
ABC Insurance Company are assumed to each have the same mean frequency.
(vi) Recent experience for such homeowners insureds has been as follows:
Year    Number of Insureds    Number of Claims
1              200                    3
2              250                    2
3              300                    3
4              350                    ?
4.51 (2 points) Determine the Buhlmann-Straub credibility estimate of the number of claims in Year 4.
(A) 4.0
(B) 4.2
(C) 4.4
(D) 4.6
(E) 4.8
4.52 (3 points) What is the probability of observing at most 2 claims in Year 4?
(A) 18%
(B) 20%
(C) 23%
(D) 27%
(E) 30%
4.53 (3 points) You are given:
(i) The number of claims per auto insured follows a Poisson distribution with mean λ.
(ii) The prior distribution for λ has the following probability density function:
f(λ) = (300λ)^40 e^(-300λ) / {λ Γ(40)}
(iii) Randy observes the following claims experience:
                              Year 1    Year 2
Number of claims                  60
Number of autos insured          500       600
(iv) Let Randy's estimate of the expected number of claims in year 2 be R.
(v) Randy rotates to another area of his insurer's actuarial department.
Andy takes over Randy's old job. Andy observes the following claims experience:
                              Year 1    Year 2    Year 3
Number of claims                  60        90
Number of autos insured          500       600       700
(vi) Let Andy's estimate of the expected number of claims in year 3 be A.
Determine R + A.
(A) 170
(B) 172
(C) 174
(D) 176
(E) 178


For the next 3 questions, use the following information on the number of accidents over a six year
period for two sets of drivers:
Number of Accidents    Number of Female Drivers    Number of Male Drivers
0                            19,634                      21,800
1                             3,573                       6,589
2                               558                       1,476
3                                83                         335
4                                19                          69
5                                 4                          16
6                                 1                           4
7                                 0                           2
8                                 0                           1
9                                 0                           1
Total                        23,872                      30,293

4.54 (5 points) Fit a Negative Binomial to the data for Females using the method of moments.
Test that fit by using the Chi-Square Goodness of Fit Test. Group the data using the largest
number of groups such that the expected number of drivers in each group is at least 5.
4.55 (5 points) Fit a Negative Binomial to the data for Males using the method of moments.
Test that fit by using the Chi-Square Goodness of Fit Test. Group the data using the largest
number of groups such that the expected number of drivers in each group is at least 5.
4.56 (3 points) For each of the previous questions, assume that each insured has a Poisson
frequency with mean that is constant over time. Assume that the means of the Poisson distributions
are Gamma Distributed across each of the groups of drivers. In each case, using the fitted Negative
Binomial Distribution, how much credibility would be given to three years of data from a single driver.

4.57 (2 points) You are given for automobile insurance:


(i) Each driver has a frequency that is Poisson with mean λ.
(ii) Across the portfolio, λ has a Gamma Distribution with parameters α and θ.
(iii) A given driver is observed to have no claims in three years.
(iv) The posterior estimate of the future claim frequency for this insured is 85% of the prior estimate.
What is the value of θ?
(A) 1/20    (B) 1/17    (C) 1/15    (D) 1/10    (E) Can not be determined.


Use the following information for the next two questions:


(i) The conditional distribution of the number of claims per policyholder is Poisson with
mean λ.
(ii) The variable λ has a gamma distribution with parameters α and θ.
(iii) A policyholder has 1 claim in Year 1, 2 claims in year 2, and 3 claims in year 3.
4.58 (3 points) Which of the following is equal to the mean of the posterior distribution of λ?
(A) ∫₀∞ λ^(α+4) exp[-(3 + 1/θ)λ] dλ / ∫₀∞ λ^(α+3) exp[-(3 + 1/θ)λ] dλ
(B) ∫₀∞ λ^(α+4) exp[-(6 + 1/θ)λ] dλ / ∫₀∞ λ^(α+3) exp[-(6 + 1/θ)λ] dλ
(C) ∫₀∞ λ^(α+6) exp[-(3 + 1/θ)λ] dλ / ∫₀∞ λ^(α+5) exp[-(3 + 1/θ)λ] dλ
(D) ∫₀∞ λ^(α+6) exp[-(6 + 1/θ)λ] dλ / ∫₀∞ λ^(α+5) exp[-(6 + 1/θ)λ] dλ
(E) None of A, B, C, or D
4.59 (3 points) Which of the following is equal to the probability of two claims in year 4 from this
policyholder?
(A) ∫₀∞ λ^(α+5) exp[-(3 + 1/θ)λ] dλ / ∫₀∞ λ^(α+3) exp[-(3 + 1/θ)λ] dλ
(B) ∫₀∞ λ^(α+5) exp[-(4 + 1/θ)λ] dλ / ∫₀∞ λ^(α+3) exp[-(6 + 1/θ)λ] dλ
(C) ∫₀∞ λ^(α+7) exp[-(4 + 1/θ)λ] dλ / ∫₀∞ λ^(α+5) exp[-(3 + 1/θ)λ] dλ
(D) ∫₀∞ λ^(α+7) exp[-(3 + 1/θ)λ] dλ / ∫₀∞ λ^(α+5) exp[-(6 + 1/θ)λ] dλ
(E) None of A, B, C, or D


4.60 (4 points) You are given:


(i) The number of accidents per taxicab follows a Poisson distribution with mean λ.
(ii) λ is the same for each taxicab owned by a particular company,
but varies between the companies.
(iii) The prior distribution for λ has the following probability density function:
f(λ) = (50λ)^5 e^(-50λ) / {λ Γ(5)}.
(iv) Calloway Cab Company has the following claims experience:
                            Year 1    Year 2
Number of accidents              7         6
Number of taxicabs              20        25
The Calloway Cab Company expects to have 30 cabs in each of Years 3 and 4.
Determine the probability of a total of 9 accidents in Years 3 and 4.
(A) 8%
(B) 9%
(C) 10%
(D) 11%
(E) 12%
4.61 (4 points) The number of medical malpractice claims from each doctor is Poisson with mean λ.
The improper prior distribution is: π(λ) = 1, λ > 0.
(a) Dr. Phil Fine has no claims in year 1. What is his expected number of claims in year 2?
(b) Dr. Phil Fine now has 2 claims in year 2. What is the variance of his posterior distribution of λ?
(c) Dr. Phil Fine now has 1 claim in year 3. What is the variance of his predictive distribution?
4.62 (5 points) Prior to any observations, you assume the Gamma Distribution of λ has parameters
α = 10 and θ unknown. You assume the claims experience of the different policyholders is
independent. Which of the following equations should be solved in order to estimate θ via maximum
likelihood from the observed data?
A. 3θ = 54/(32 + 1/θ) + 60/(26 + 1/θ) + 92/(49 + 1/θ)
B. 30θ = 63/(32 + 1/θ) + 69/(26 + 1/θ) + 101/(49 + 1/θ)
C. 3θ = 54/(32 + 10/θ) + 60/(26 + 10/θ) + 92/(49 + 10/θ)
D. 30θ = 63/(32 + 10/θ) + 69/(26 + 10/θ) + 101/(49 + 10/θ)
E. None of the above.


Use the following information for the next two questions:


The number of claims in a year for an individual follows a Poisson Distribution with parameter λ.
λ follows a Gamma Distribution with α = 2 and θ = 0.10.
For 4 individuals picked at random you observe a total of 6 claims during a year.
4.63 (3 points) Determine the Bayesian estimate of the expected total number of claims next year
for this group of 4 individuals.
A. 1.2
B. 1.3
C. 1.4
D. 1.5
E. 1.6
4.64 (2 points) Determine the probability that this group of 4 individuals will have a total of 3 claims
in the next year.
A. 8.0%
B. 8.5%
C. 9.0%
D. 9.5%
E. 10.0%

Use the following information for the next two questions:


You are given the following information:
The number of claims in a year for an individual follows a Poisson Distribution with parameter λ.
λ follows a Gamma Distribution with α = 2 and θ = 0.10.
For an individual you observe a total of 6 claims during 4 years.
4.65 (2 points) Determine the Bayesian estimate of the posterior annual claim frequency rate for this
individual.
A. 0.4
B. 0.5
C. 0.6
D. 0.7
E. 0.8
4.66 (2 points) Determine the probability that this individual will have 3 claims in the next year.
A. 2.0%
B. 2.5%
C. 3.0%
D. 3.5%
E. 4.0%

4.67 (2 points) You are given:


(i) The number of accidents an individual has in a year follows a Poisson distribution with mean λ.
(ii) λ varies between individuals via a Gamma Distribution with α = 3 and θ = 1/100.
(iii) David had two accidents in five years.
Use the Bayes estimate that corresponds to the zero-one loss function,
in order to predict Davids future mean annual frequency.
A. 3.0%
B. 3.8%
C. 4.0%
D. 4.8%
E. 5.0%


Use the following information for the next two questions:

Pan-Global Airways is running a contest.


They will pick at random one of the flights purchased to fly on them this year,
and award that customer free air travel on Pan-Global for the rest of their life.

The number of flights purchased each year by each of their customers is Poisson with mean λ.
λ is distributed across their customers via a Gamma Distribution with α = 1/3 and θ = 6.
4.68 (3 points) What is the expected number of flights purchased by the lucky customer during the
year of the contest?
A. 5
B. 6
C. 7
D. 8
E. 9
4.69 (3 points) Assume that the annual number of flights taken by the lucky customer will still be
Poisson, but with a mean 1.5 times what it had been.
What is the expected number of future flights taken per year by the contest winner?
A. 10
B. 11
C. 12
D. 13
E. 14
4.70 (2 points) Claim frequency follows a Poisson distribution with parameter λ.
λ is distributed according to: g(λ) = 25 λ e^(-5λ).
An insured selected at random from this population has two claims during the past year.
Find the posterior density function for λ.
Find the predictive distribution.
4.71 (2 points) Claim frequency follows a Poisson distribution with mean λ.
λ is constant for an individual insured, but varies over an insured population via
a Gamma Distribution with α = 6 and θ = 0.04.
An insured is selected at random and is claim free for n years.
Determine n if the posterior estimate of λ for this insured is 0.15.
A. 5    B. 10    C. 15    D. 20    E. 25

4.72 (3 points) Claim frequency follows a Poisson distribution with parameter λ.
λ follows a gamma distribution with a mean equal to 1.
During the next year, 10 policies produced 20 claims.
The predictive distribution has a variance of 16/9.
Determine the variance of the posterior distribution of λ.
A. 1/12

B. 1/11

C. 1/10

D. 1/9

E. 1/8


4.73 (4, 11/82, Q.49) (3 points) You are given the following probability density functions:
Poisson: f(d|h) = e^(-h) h^d / d!, d = 0, 1, 2, ...   Mean = h. Variance = h.
Gamma: g(h) = a^r e^(-ah) h^(r-1) / Γ(r), 0 ≤ h ≤ ∞.   Γ(r+1)/Γ(r) = r, r ≥ 1.   Mean = r/a. Variance = r/a².
The probability distribution of claims per year (d) is specified by a Poisson distribution with
parameter h.
The prior distribution of h in a class is given by a Gamma distribution with parameters a, r.
Given an observation of c claims in a one-year period, determine the posterior probability
distribution for h. Use Bayes' Theorem.
A. (a+c)^r e^(-(a+c)h) h^(r-1) / Γ(r)
B. (a+1)^(r+c+1) e^(-(a+1)h) h^(r+c) / Γ(r+c+1)
C. a^(r+c) e^(-ah) h^(r+c-1) / Γ(r+c)
D. (a+1)^(r+c) e^(-(a+1)h) h^(r+c-1) / Γ(r+c)
E. (a+c)^(r+1) e^(-(a+c)h) h^r / Γ(r+1)


4.74 (4, 5/86, Q.37) (1 point) The claim frequency rate Q has a gamma distribution with
parameters α and θ. If N policies are written, the number of claims Y will be Poisson distributed with
mean NQ.
If we observe y claims in one year for N policies, what is the Bayesian update of Q?
A. (1-Z)(αθ) + Z(y/N); Z = N / (N + 1/θ)
B. (1-Z)(αθ) + Z(y/N); Z = (1/θ) / (N + 1/θ)
C. (α + N) / (1/θ + y)
D. (α + 1/θ) / (N + y)
E. None of the above


4.75 (165, 11/86, Q.9) (1.8 points) You are using a Bayesian method to estimate the mean of the
number X of accidents per year in a certain group.
You assume that X has a Poisson distribution with mean m.
Your prior opinion concerning m is that it has an exponential distribution,
f(m) = a e^(-am), for some a.
In 1985, you observed 14 accidents in the group and revise your opinion concerning m.
The mean of the posterior distribution of m is 10.
Determine 1/a, the mean of the prior distribution.
(A) 2
(B) 4
(C) 6
(D) 7
(E) 12


4.76 (4, 5/87, Q.42) (2 points) If n is the observed number of claims of m independent trials from
a Poisson distribution with mean q, then the probability density function of n is given by:
f(n) = exp(-qm) (qm)^n / n!, n = 0, 1, 2, ....
The prior distribution of q is Gamma, which has the following distribution:
h(q) = exp(-bq) b^a q^(a-1) / (a-1)!, q > 0.
The mean of this Gamma distribution is a/b and its variance is a/b².
What is the Buhlmann credibility given to an observation of n claims in m trials?
A. ma / (ma + b²)    B. m / (m + b)    C. ma / (ma + b)    D. m / (m + a)    E. m / (m + b²)

4.77 (4, 5/87, Q.47) (2 points) Let D be a random variable which represents the number of claims
experienced in a one year period.
The probability density function of D is Poisson with parameter h.
That is, P(D=d | h) = e^(-h) h^d / d!; d = 0, 1, 2, ....
E(D|h) = VAR(D | h) = h.
The random variable H, which represents the parameter of the Poisson distribution,
follows the following probability distribution: g(h) = e^(-h); 0 ≤ h. E(H) = VAR(H) = 1.
What is the probability of observing 2 claims in the next year?
A. 1/16
B. 1/12
C. 1/8
D. 1/6
E. 1/3
4.78 (4, 5/87, Q.56) (1 point) Suppose that the distribution of the number of claims for an
individual insured is Poisson. Suppose further that an insurer has a number of independent such
individuals. Assume, however, that the expected number of claims for individuals from this
population follows a gamma distribution.
What distribution will the insurer's claims follow?
A. Binomial
B. Negative Binomial
C. Poisson
D. Gamma
E. Cannot be determined

2013-4-10,

Conjugate Priors 4 Gamma-Poisson,

HCM 10/21/12,

Page 100

4.79 (4, 5/89, Q.40) (2 points) The number of claims X for a given insured follows a Poisson
distribution, P[X = x] = λ^x e^(-λ) / x!.
The expected annual mean of the Poisson distribution over the population of insureds follows the
distribution f(λ) = e^(-λ) over the interval (0, ∞).
An insured is selected from the population at random.
Over the last year this particular insured had no claims.
What is the posterior density function of λ for the selected insured?
A. e^(-λ), for λ > 0
B. λe^(-λ), for λ > 0
C. 2e^(-2λ), for λ > 0
D. 4λe^(-2λ), for λ > 0
E. None of A, B, C, or D.

Use the following information for the next two questions:


The probability distribution function of claims per year for an individual risk is a Poisson distribution
with parameter h.
The prior distribution of h is a gamma distribution given by g(h) = h e^(-h) for 0 < h < ∞.
4.80 (4, 5/90, Q.46) (2 points) Given an observation of 1 claim in a one-year period, what is the
posterior distribution for h?
A. e^(-h)    B. h² e^(-h) / 2    C. 4h² e^(-2h)    D. h e^(-h)    E. 4h e^(-2h)

4.81 (4, 5/90, Q.47) (1 point) What is the Buhlmann credibility to be assigned to a single
observation?
A. 1/4
B. 1/3
C. 1/2
D. 2/3
E. 1/(1+h)

4.82 (4, 5/90, Q.48) (2 points) An automobile insurer entering a new territory assumes that each
individual car's claim count has a Poisson distribution with parameter λ. The insurer also assumes that
λ has a gamma distribution with probability density function: f(λ) = e^(-λ/θ) (λ/θ)^(α-1) / {θ Γ(α)}.
Initially, the parameters of the gamma distribution are assumed to be α = 50 and θ = 1/500.
During the subsequent two year period the insurer covered 750 and 1100 cars for the first and
second years, respectively.
The insurer incurred 65 and 112 claims in the first and second years, respectively.
What is the coefficient of variation of the posterior gamma distribution?
A. 0.066
B. 0.141
C. 0.520
D. 1.000
E. Not enough information


4.83 (4, 5/91, Q.31) (2 points) The number of claims a particular policyholder makes in a year has
a Poisson distribution with mean q. The q-values for policyholders follow a gamma distribution with
variance equal to 0.2. The resulting distribution of policyholders by number of claims is a negative
binomial with parameters r and β, such that the mean is equal to rβ,
and the variance equal to rβ(1 + β) = 0.5.
What is the value of r(1 + β)?
A. Less than 0.6
B. At least 0.6 but less than 0.8
C. At least 0.8 but less than 1.0
D. At least 1.0 but less than 1.2
E. At least 1.2
4.84 (4, 5/91, Q.49) (2 points) The parameter λ is the mean of a Poisson distribution.
λ has a prior gamma distribution with parameters α and θ, and a sample x1, x2, ..., xn from the
Poisson distribution is available. Which of the following is the formula for the Bayes estimator of λ
(i.e. the mean of the posterior distribution)?
n

n
ln(xi)
1
A.
+

n + 1/ i=1 n
n / + 1

n
B.
n + 1/ 2

C.

n
xi
1
+

n + 1 i=1 n
n + 1
n

+
E.

xi
i=1

n + 1/

xni +
i=1

D.

1
n / 2

+ 1

1/
xi
n
+

n + 1/ i=1 n
n + 1/


4.85 (4B, 5/92, Q.11) (2 points) You are given the following information:

Number of claims follows a Poisson distribution with parameter λ.
The claim frequency rate, λ, has a Gamma distribution with mean = 0.14
and variance = 0.0004.

During the latest two-year period, 110 claims have been observed.

In each of the two years, 310 policies were in force.


Determine the Bayesian estimate of the posterior claim frequency rate based upon the latest
observations.
A. Less than 0.14
B. At least 0.14 but less than 0.15
C. At least 0.15 but less than 0.16
D. At least 0.16 but less than 0.17
E. At least 0.17
Use the following information for the next two questions:
The number of claims for an individual risk in a single year follows a Poisson distribution with
parameter λ. The parameter λ has for a prior distribution the following Gamma density function with
parameters α = 1 and θ = 2: f(λ) = (1/2) e^(-λ/2), λ > 0.
You are given that three claims arose in the first year.
4.86 (4B, 5/92, Q.28) (2 points) Determine the posterior distribution of λ.
A. (1/2) e^(-3λ/2)
B. (1/12) λ³ e^(-λ/2)
C. (1/4) λ³ e^(-λ/2)
D. (27/32) λ³ e^(-3λ/2)
E. (1/12) λ² e^(-3λ/2)
4.87 (4B, 5/92, Q.29) (2 points) Determine the Buhlmann credibility estimate for the expected
number of claims in the second year.
A. Less than 2.25
B. At least 2.25 but less than 2.50
C. At least 2.50 but less than 2.75
D. At least 2.75 but less than 3.00
E. At least 3.00


4.88 (4B, 11/92, Q.9) (2 points) You are given the following:

Number of claims for a single insured follows a Poisson distribution with mean .

The claim frequency rate, , has a gamma distribution with mean 0.10
and variance 0.0003.

During the last three-year period 150 claims have occurred.

In each of the three years, 200 policies were in force.


Determine the Bayesian estimate of the posterior claim frequency rate based upon the latest
observations.
A. Less than 0.100
B. At least 0.100 but less than 0.130
C. At least 0.130 but less than 0.160
D. At least 0.160 but less than 0.190
E. At least 0.190
4.89 (4B, 11/92, Q.16) (3 points) You are given the following:

Number of claims follows a Poisson distribution with mean λ.
λ has the Gamma distribution f(λ) = 3e^(-3λ), λ > 0.

The random variable Y, representing claim size, has the gamma distribution:
p(y) = exp(-y/2500) / 2500, y > 0.
Determine the variance of the pure premium.
A. Less than 2,500,000
B. At least 2,500,000 but less than 3,500,000
C. At least 3,500,000 but less than 4,500,000
D. At least 4,500,000 but less than 5,500,000
E. At least 5,500,000

4.90 (4B, 5/93, Q.32) (2 points) You are given the following:
The number of claims for a class of business follows a Poisson distribution.
The prior distribution for the expected claim frequency rate of individuals belonging to
this class of business is a Gamma distribution with mean = 0.10 and variance = 0.0025.
During the next year, 6 claims are sustained by the 20 risks in the class.
Determine the variance of the posterior distribution for the expected claim frequency rate of
individuals belonging to this class of business.
A. Less than 0.0005
B. At least 0.0005 but less than 0.0015
C. At least 0.0015 but less than 0.0025
D. At least 0.0025 but less than 0.0050
E. At least 0.0050


4.91 (4B, 11/93, Q.2) (1 point) You are given the following:
Number of claims follows a Poisson distribution with parameter λ.
Prior to the first year of coverage, λ is assumed to have the Gamma distribution
f(λ) = 1000^150 λ^149 e^(-1000λ) / Γ(150), λ > 0.
In the first year, 300 claims are observed on 1,500 exposures.
In the second year, 525 claims are observed on 2,500 exposures.
After two years, what is the Bayesian probability estimate of E[λ]?
A. Less than 0.17
B. At least 0.17, but less than 0.18
C. At least 0.18, but less than 0.19
D. At least 0.19, but less than 0.20
E. At least 0.20
Use the following information for the next two questions:
For an individual risk in a population, the number of claims for a single exposure period
follows a Poisson distribution with mean λ.
For the population, λ is distributed according to an exponential distribution with mean 0.1:
g(λ) = 10e^(-10λ), λ > 0.

An individual risk is selected at random from the population.


After one exposure period, one claim has been observed.

4.92 (4B, 5/94, Q.25) (3 points) Determine the density function of the posterior distribution of λ for
the selected risk.
A. 11e^(-11λ)    B. 10e^(-11λ)    C. 121λe^(-11λ)    D. (1/10)e^(-9λ)    E. (11e^(-11λ)) / 2

4.93 (4B, 5/94, Q.26) (2 points) Determine the Buhlmann credibility factor, z, assigned to the
number of claims for a single exposure period.
A. 1/10
B. 1/11
C. 1/12
D. 1/14
E. None of A, B, C, or D

4.94 (4B, 11/94, Q.3) (2 points) You are given the following:
The number of claims for a single risk follows a Poisson distribution with mean m. m is a random
variable having a prior Gamma distribution with mean = 0.5. The value of k in Buhlmann's partial
credibility formula is 10. After five exposure periods, the posterior distribution is Gamma with mean
0.6. Determine the number of claims observed in the five exposure periods.
A. 3
B. 4
C. 5
D. 6
E. 10


4.95 (4B, 11/94, Q.24) (2 points) You are given the following:
r is a random variable that represents the number of claims for an individual risk and has the Poisson
density function f(r) = t^r e^(-t) / r!, r = 0, 1, 2, ...
The parameter t has a prior Gamma distribution with density function h(t) = 5 e^(-5t), t > 0.
A portfolio consists of 100 independent risks, each having identical density functions.
In one year, 10 claims are experienced by the portfolio.
Use the Buhlmann credibility method to determine the expected number of claims in the second
year for the portfolio.
A. Less than 6
B. At least 6, but less than 8
C. At least 8, but less than 10
D. At least 10, but less than 12
E. At least 12
Use the following information for the next two questions:
For an individual risk in a population, the number of claims for a single exposure period follows a
Poisson distribution with parameter λ. For the population, λ is distributed according to an exponential
distribution: h(λ) = 5 e^(-5λ), λ > 0. An individual risk is randomly selected from the population. After
two exposure periods, one claim has been observed.
4.96 (4B, 11/94, Q.25) (2 points) For the selected risk, subsequent to the observation, determine
the expected value of the process variance.
A. 0.04
B. 0.20
C. 0.29
D. 5.00
E. 25.00
4.97 (4B, 11/94, Q.26) (3 points) Determine the density function of the posterior distribution of λ
for the selected risk.
A. 7e^(-7λ)    B. 5e^(-7λ)    C. 49λe^(-7λ)    D. 108λ² e^(-6λ)    E. 270λ² e^(-6λ)


Use the following information for the next two questions:


A portfolio of insurance risks consists of two classes, 1 and 2, that are equal in size.
For a Class 1 risk, the number of claims follows a Poisson distribution with mean λ1.
λ1 varies by insured and follows an exponential distribution with mean 0.3.
For a Class 2 risk, the number of claims follows a Poisson distribution with mean λ2.
λ2 varies by insured and follows an exponential distribution with mean 0.7.
Hint: The exponential distribution is a special case of the Gamma distribution with α = 1.
4.98 (4B, 5/95, Q.7) (2 points) Two risks are randomly selected, one from each class.
What is total variance of the number of claims observed for both risks combined?
A. Less than 0.70
B. At least 0.70, but less than 0.95
C. At least 0.95, but less than 1.20
D. At least 1.20, but less than 1.45
E. At least 1.45
4.99 (4B, 5/95, Q.8) (2 points) Of the risks that have no claims during a single exposure period,
what proportion can be expected to be from Class 1?
A. Less than 0.53
B. At least 0.53, but less than 0.58
C. At least 0.58, but less than 0.63
D. At least 0.63, but less than 0.68
E. At least 0.68


4.100 (4B, 5/95, Q.12) (2 points) You are given the following:
A portfolio consists of 1,000 identical and independent risks.
The number of claims for each risk follows a Poisson distribution with mean λ.
Prior to the latest exposure period, λ is assumed to have a gamma distribution,
with parameters α = 250 and θ = 1/2000.
During the latest exposure period, the following loss experience is observed:
Number of Claims    Number of Risks
0                        906
1                         89
2                          4
3                          1
Total                  1,000
Determine the mean of the posterior distribution of λ.
A. Less than 0.11
B. At least 0.11, but less than 0.12
C. At least 0.12, but less than 0.13
D. At least 0.13, but less than 0.14
E. At least 0.14
4.101 (4B, 11/95, Q.7) (2 points) You are given the following:

The number of claims per year for a given risk follows a Poisson distribution with mean λ.
The prior distribution of λ is assumed to be a gamma distribution with
coefficient of variation 1/6.
Determine the coefficient of variation of the posterior distribution of λ after 160 claims have been
observed for this risk.
A. Less than 0.05
B. At least 0.05, but less than 0.10
C. At least 0.10, but less than 0.15
D. At least 0.15
E. Cannot be determined from the given information.


4.102 (4B, 5/96, Q.21) (2 points) You are given the following:
The number of claims per year for a given risk follows a Poisson distribution with mean λ.
The prior distribution of λ is assumed to be a gamma distribution with mean 1/2 and variance 1/8.
Determine the variance of the posterior distribution of λ after a total of 4 claims have been observed for this risk in a 2-year period.
A. 1/16
B. 1/8
C. 1/6
D. 1/2
E. 1

4.103 (4B, 11/97, Q.2) (2 points) You are given the following:
A portfolio consists of 100 identical and independent risks.
The number of claims per year for each risk follows a Poisson distribution with mean λ.
The prior distribution of λ is assumed to be a gamma distribution with mean 0.25 and variance 0.0025.
During the latest year, the following loss experience is observed:
Number of Claims     Number of Risks
0                    80
1                    17
2                    3
Determine the variance of the posterior distribution of λ.
A. Less than 0.00075
B. At least 0.00075, but less than 0.00125
C. At least 0.00125, but less than 0.00175
D. At least 0.00175, but less than 0.00225
E. At least 0.00225
4.104 (4B, 5/98, Q.4) (2 points) You are given the following:
A portfolio consists of 10 identical and independent risks.
The number of claims per year for each risk follows a Poisson distribution with mean λ.
The prior distribution of λ is assumed to be a gamma distribution with mean 0.05 and variance 0.01.
During the latest year, a total of n claims are observed for the entire portfolio.
The variance of the posterior distribution of λ is equal to the variance of the prior distribution of λ.
Determine n.
A. 0   B. 1   C. 2   D. 3   E. 4


Use the following information for the next two questions:

The number of errors that a particular baseball player makes in any given game follows a Poisson distribution with mean λ.
λ does not vary by game.
The prior distribution of λ is assumed to follow a Gamma distribution with mean 1/10, variance αθ², and density function f(λ) = λ^(α-1) e^(-λ/θ) / (θ^α Γ(α)), 0 < λ < ∞.
The player is observed for 60 games and makes one error.

4.105 (4B, 5/99, Q.23) (2 points) If the prior distribution is constructed so that the credibility of the
observations is very close to zero, determine which of the following is the largest.
A. f(0)
B. f(1/100)
C. f(1/20)
D. f(1/10)
E. f(1)
4.106 (4B, 5/99, Q.24) (2 points) If the prior distribution is constructed so that the variance of the
hypothetical means is 1/400, determine the expected number of errors that the player will make in
the next 60 games.
A. Less than 0.5
B. At least 0.5, but less than 2.5
C. At least 2.5, but less than 4.5
D. At least 4.5, but less than 6.5
E. At least 6.5
Use the following information for the next two questions:
The number of claims for a particular insured in any given year follows a Poisson distribution with mean λ.
λ does not vary by year.
The prior distribution of λ is assumed to follow a distribution with mean 10/m, variance 10/m², and density function f(λ) = e^(-mλ) m^10 λ^9 / Γ(10), 0 < λ < ∞, where m is a positive integer.
4.107 (4B, 11/99, Q.23) (2 points) The insured is observed for m years, after which the posterior distribution of λ has the same variance as the prior distribution.
Determine the number of claims that were observed for the insured during these m years.
A. 10
B. 20
C. 30
D. 40
E. 50
4.108 (4B, 11/99, Q.24) (2 points) As the number of years of observation becomes larger and larger, the ratio of the variance of the predictive (negative binomial) distribution to the mean of the predictive (negative binomial) distribution approaches what value?
A. 0
B. 1
C. 2
D. 4
E. ∞


4.109 (Course 4 Sample Exam 2000, Q.4) An individual automobile insured has a claim count distribution per policy period that follows a Poisson distribution with parameter λ. For the overall population, λ follows a distribution with density function according to an exponential distribution:
h(λ) = 5e^(-5λ), λ > 0.
One insured is selected at random from the population and is observed to have a total of one claim during two policy periods. Determine the expected number of claims that this same insured will have during the third policy period.
4.110 (4, 5/00, Q.30) (2.5 points) You are given:
(i) An individual automobile insured has an annual claim frequency distribution that follows a Poisson distribution with mean λ.
(ii) λ follows a gamma distribution with parameters α and θ.
(iii) The first actuary assumes α = 1 and θ = 1/6.
(iv) The second actuary assumes the same mean for the gamma distribution,
but only half the variance.
(v) A total of one claim is observed for the insured over a three year period.
(vi) Both actuaries determine the Bayesian premium for the expected number of claims
in the next year using their model assumptions.
Determine the ratio of the Bayesian premium that the first actuary calculates to the Bayesian
premium that the second actuary calculates.
(A) 3/4
(B) 9/11
(C) 10/9
(D) 11/9
(E) 4/3
4.111 (4, 5/01, Q.2) (2.5 points) You are given:
(i) Annual claim counts follow a Poisson distribution with mean λ.
(ii) The parameter λ has a prior distribution with probability density function: f(λ) = (1/3)e^(-λ/3), λ > 0.
Two claims were observed during the first year.
Determine the variance of the posterior distribution of λ.
(A) 9/16   (B) 27/16   (C) 9/4   (D) 16/3   (E) 27/4

4.112 (2 points) In the previous question, 4, 5/01, Q.2, determine the variance of the predictive
distribution of the number of claims in the second year.
(A) 2.0
(B) 2.5
(C) 3.0
(D) 3.5
(E) 4.0


4.113 (4, 11/01, Q.3 & 2009 Sample Q.58) (2.5 points) You are given:
(i) The number of claims per auto insured follows a Poisson distribution with mean λ.
(ii) The prior distribution for λ has the following probability density function:
f(λ) = (500λ)^50 e^(-500λ) / {λ Γ(50)}
(iii) A company observes the following claims experience:
                          Year 1    Year 2
Number of claims          75        210
Number of autos insured   600       900
The company expects to insure 1100 autos in Year 3.
Determine the expected number of claims in Year 3.
(A) 178
(B) 184
(C) 193
(D) 209
(E) 224
4.114 (4, 11/01, Q.34 & 2009 Sample Q.76) (2.5 points) You are given:
(i) The annual number of claims for each policyholder follows a Poisson distribution with mean λ.
(ii) The distribution of λ across all policyholders has probability density function:
f(λ) = λe^(-λ), λ > 0.
(iii) ∫0∞ λe^(-nλ) dλ = 1/n², for n > 0.

A randomly selected policyholder is known to have had at least one claim last year.
Determine the posterior probability that this same policyholder will have at least one claim this year.
(A) 0.70
(B) 0.75
(C) 0.78
(D) 0.81
(E) 0.86
4.115 (3 points) In the previous question, what is the expected future annual frequency for this
policyholder?
(A) 13/6
(B) 11/5
(C) 9/4
(D) 7/3
(E) 5/2

4.116 (4, 11/02, Q.3 & 2009 Sample Q. 32) (2.5 points) You are given:
(i) The number of claims made by an individual insured in a year has a Poisson distribution with mean λ.
(ii) The prior distribution for λ is gamma with parameters α = 1 and θ = 1.2.
Three claims are observed in Year 1, and no claims are observed in Year 2.
Using Bühlmann credibility, estimate the number of claims in Year 3.
(A) 1.35
(B) 1.36
(C) 1.40
(D) 1.41
(E) 1.43


4.117 (4, 11/03, Q.27 & 2009 Sample Q.21) (2.5 points) You are given:
(i) The number of claims incurred in a month by any insured has a Poisson distribution with mean λ.
(ii) The claim frequencies of different insureds are independent.
(iii) The prior distribution is gamma with probability density function:
f(λ) = (100λ)^6 e^(-100λ) / (120λ).
(iv) Month   Number of Insureds   Number of Claims
     1       100                  6
     2       150                  8
     3       200                  11
     4       300                  ?
Determine the Bühlmann-Straub credibility estimate of the number of claims in Month 4.
(A) 16.7
(B) 16.9
(C) 17.3
(D) 17.6
(E) 18.0
4.118 (2 points) In the previous question, using the Normal Approximation, determine the
probability that the number of claims observed in month 4 is more than 20.
(A) 16%
(B) 19%
(C) 21%
(D) 24%
(E) 30%

4.119 (4, 5/05, Q.21 & 2009 Sample Q.191) (2.9 points) You are given:
(i) The annual number of claims for a policyholder follows a Poisson distribution with mean λ.
(ii) The prior distribution of λ is gamma with probability density function:
f(λ) = (2λ)^5 e^(-2λ) / (24λ), λ > 0.
An insured is selected at random and observed to have x1 = 5 claims during Year 1 and x2 = 3 claims during Year 2.
Determine E(λ | x1 = 5, x2 = 3).
(A) 3.00   (B) 3.25   (C) 3.50   (D) 3.75   (E) 4.00

4.120 (1 point) In the previous question, for this insured what is the probability of observing 4
claims in year 3?
(A) 14%
(B) 15%
(C) 16%
(D) 17%
(E) 18%
4.121 (2 points) In 4, 5/05, Q.21, for this insured what is the probability of observing a total of 6
claims in years 3, 4, and 5?
(A) 5%
(B) 6%
(C) 7%
(D) 8%
(E) 9%


4.122 (4, 11/05, Q.2 & 2009 Sample Q.215) (2.9 points) You are given:
(i) The conditional distribution of the number of claims per policyholder is Poisson with mean λ.
(ii) The variable λ has a gamma distribution with parameters α and θ.
(iii) For policyholders with 1 claim in Year 1, the credibility estimate for the number of claims in Year 2 is 0.15.
(iv) For policyholders with an average of 2 claims per year in Year 1 and Year 2, the credibility estimate for the number of claims in Year 3 is 0.20.
Determine θ.
(A) Less than 0.02
(B) At least 0.02, but less than 0.03
(C) At least 0.03, but less than 0.04
(D) At least 0.04, but less than 0.05
(E) At least 0.05
4.123 (4, 11/06, Q.10 & 2009 Sample Q.254) (2.9 points) You are given:
(i) A portfolio consists of 100 identically and independently distributed risks.
(ii) The number of claims for each risk follows a Poisson distribution with mean λ.
(iii) The prior distribution of λ is:
π(λ) = (50λ)^4 e^(-50λ) / (6λ), λ > 0.
During Year 1, the following loss experience is observed:
Number of Claims     Number of Risks
0                    90
1                    7
2                    2
3                    1
Total                100
Determine the Bayesian expected number of claims for the portfolio in Year 2.
(A) 8
(B) 10
(C) 11
(D) 12
(E) 14


Solutions to Problems:
4.1. B. The Gamma-Poisson has a Negative Binomial mixed distribution, with parameters r = α = 5 and β = θ = 1/8. f(x) = {r(r+1)...(r+x-1)/x!} β^x / (1+β)^(x+r) = {(5)(6)...(4+x)/x!} (1/8)^x / (9/8)^(x+5) = {(5)(6)...(4+x)/x!} 8^5 / 9^(x+5).
4.2. C. For the Gamma-Poisson, if the prior Gamma has parameters α = 5, θ = 1/5, then the Posterior Gamma has parameters: α = 5 + 1 = 6 and 1/θ = 5 + 2 = 7.
Posterior Gamma = λ^(α-1) e^(-λ/θ) / (θ^α Γ(α)) = 7^6 λ^5 e^(-7λ) / 5! = 980.4 λ^5 e^(-7λ).
4.3. E. For the Gamma-Poisson, Buhlmann credibility gives the same answer as the mean of the posterior distribution. The mean of the posterior Gamma with parameters α = 6 and θ = 1/7 is: 6/7.
Alternately, K = 1/θ = 5. Z = 2/(2+K) = 2/7. Prior estimate = αθ = (5)(1/5) = 1.
Observed frequency = 1/2. Therefore, new estimate = (2/7)(1/2) + (1 - 2/7)(1) = 6/7 = 0.857.
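As a quick cross-check of solutions 4.2 and 4.3 (a sketch, not part of the original study guide; the function name gamma_poisson_update is just an illustrative label), the posterior update and the matching Buhlmann estimate can be computed in a few lines of Python:

def gamma_poisson_update(alpha, theta, claims, exposures):
    # Posterior (alpha, theta) for a Gamma(alpha, theta) prior on a Poisson mean.
    return alpha + claims, 1.0 / (1.0 / theta + exposures)

alpha_post, theta_post = gamma_poisson_update(5, 1/5, claims=1, exposures=2)
print(alpha_post * theta_post)            # posterior mean = 6/7 = 0.857

K = 1 / (1/5)                             # Buhlmann credibility parameter = 1/theta = 5
Z = 2 / (2 + K)
print(Z * (1/2) + (1 - Z) * (5 * 1/5))    # same 0.857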
4.4. C. The prior distribution is a Gamma with α = 5 (the power to which λ is taken in the density function is α - 1) and θ = 1/10 (the value multiplying λ where it is exponentiated in the density function is 1/θ = 10). The Gamma-Poisson has a marginal distribution (prior to any observations) of a Negative Binomial with parameters: r = α = 5, β = θ = 1/10, with mean = rβ = 5/10 = 0.5. Alternately, each insured's expected mean frequency is distributed via the Gamma distribution. Therefore, the overall expected mean for the portfolio is the mean of the Gamma distribution with parameters 5 and 1/10: 5/10 = 0.5.
4.5. E. The prior distribution is a Gamma with α = 5 and θ = 1/10.
Thus F(x) = Γ(α; x/θ) = Γ(5; 10x). F(0.4) - F(0.3) = Γ(5; 4) - Γ(5; 3) = 0.371 - 0.185 = 0.186.
4.6. E. The Gamma-Poisson has a marginal distribution (prior to any observations) of a Negative Binomial with parameters: β = θ = 1/10, r = α = 5.
f(3) = (5)(6)(7)β³ / {(3!)(1+β)^(3+r)} = (35)(0.001)/(1.1^8) = 1.6%.
4.7. B. Variance of the Negative Binomial = mean(1+β) = (0.5)(11/10) = 0.55.
Alternately one can use the solutions to the next two questions: the total variance = Expected Value of the Process Variance + Variance of the Hypothetical Means = 0.5 + 0.05 = 0.55.


4.8. C. Expected value of the process variance = expected value of the variance of the Poisson = expected value of λ = mean of the Gamma = 0.5.
4.9. E. VHM = variance of λ = variance of the Gamma = (5)(1/10²) = 0.05.
4.10. A. Using the solutions to the previous two questions, K= .5 / .05 = 10.
Z = 8/(8 + 10) = 4 /9. Prior estimate = .5. Observed frequency = 2/8.
Therefore, the new estimate = (4/9)(1/4) + (5/9)(1/2) = 7/18 = 0.39.
4.11. D. Posterior Gamma has: α = α + C = 5 + 2 = 7, 1/θ = 1/θ + E = 10 + 8 = 18.
f(λ) = (18^7/6!) e^(-18λ) λ^6 = 850,305.6 λ^6 e^(-18λ).
4.12. A. Mean of the posterior Gamma is 7/18 = 0.389.
Comment: Note that the posterior mean is the same as the estimate using Buhlmann credibility,
7/18, since the Gamma is a Conjugate Prior for the Poisson, which is a member of a linear
exponential family.
4.13. B. The posterior distribution is a Gamma with α = 7 and θ = 1/18.
Thus F(x) = Γ(α; x/θ) = Γ(7; 18x). F(0.4) - F(0.3) = Γ(7; 7.2) - Γ(7; 5.4) = 0.580 - 0.298 = 0.282.
Comment: An example of a Bayesian interval estimation.

Note that Γ(7; 7.2) = Σ_{i=7 to ∞} 7.2^i e^(-7.2)/i! = 1 - Σ_{i=0 to 6} 7.2^i e^(-7.2)/i! = 1 - 0.420 = 0.580.
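A quick numerical check of this incomplete Gamma value (a sketch, assuming SciPy is available):

from scipy.stats import gamma, poisson

print(gamma.cdf(7.2, a=7))          # Gamma(7; 7.2) = 0.580
print(1 - poisson.cdf(6, mu=7.2))   # same value via the Poisson identity above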

4.14. E. Variance of the posterior Gamma: 7/18² = 0.022.


Comment: This is less than the .050 variance of the prior; the observation has narrowed the
possibilities for this insured.
4.15. A. The posterior distribution is a Gamma, with α = 7 and θ = 1/18, mean 0.389, and variance of 7/18². The standard deviation is: √7/18 = 0.147.
Thus F(0.4) - F(0.3) ≅ Φ[(0.4 - 0.389)/0.147] - Φ[(0.3 - 0.389)/0.147] = 0.530 - 0.272 = 0.258.


Comment: Note that this differs from the exact answer of .282 obtained as the solution to a previous
question using values of the Incomplete Gamma Functions.
4.16. B. The predictive distribution is Negative Binomial with parameters: β = θ = 1/18, r = α = 7.
f(3) = (7)(8)(9)β³ / {(3!)(1+β)^(3+r)} = (84)(1/18³)/(19/18)^10 = 0.8%.


4.17. D. The predictive distribution is Negative Binomial with parameters: β = θ = 1/18, r = α = 7.
The variance of the predictive distribution is: rβ(1 + β) = (7)(1/18)(1 + 1/18) = 0.410.
4.18. E. The posterior distribution of λ is Gamma with parameters: α = 7, θ = 1/18.
This is for one future year. Over three future years, frequency is Poisson with mean 3λ.
The posterior distribution of 3λ is Gamma with parameters: α = 7, θ = 3/18 = 1/6.
Thus, the number of claims over three years is Negative Binomial with r = 7 and β = 1/6.
f(4) = {(7)(8)(9)(10)/4!} β^4 / (1+β)^(4+r) = (210)(1/6^4)/(7/6)^11 = 2.97%.
Comment: For the Gamma-Poisson, the mixed distribution for Y years of data is given by a Negative Binomial Distribution, with parameters r = α and β = Yθ.
See Mahler's Guide to Frequency Distributions.
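A short sketch (assuming SciPy) of the predictive Negative Binomial in solution 4.18; note that SciPy parameterizes the Negative Binomial by the success probability p = 1/(1+β):

from scipy.stats import nbinom

r, beta = 7, 1/6                 # r = alpha, beta = 3*theta for three future years
p = 1 / (1 + beta)
print(nbinom.pmf(4, r, p))       # about 0.0297, matching the 2.97% above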
4.19. A. We observe 53 claims from 32 exposures. The posterior Gamma has parameters α = 9 + 53 = 62 and θ = 0.2/(1 + (0.2)(32)) = 0.02703, with mean (62)(0.02703) = 1.676.
With 14 individuals we expect: (14)(1.676) = 23.5 claims.
Alternately, the Buhlmann Credibility Parameter K = 1/θ = 1/0.2 = 5.
For 32 exposures, Z = 32/(32+5) = 32/37. The observed frequency is 53/32.
The a priori frequency is αθ = (9)(0.2) = 1.8.
Thus the future estimated frequency is: (32/37)(53/32) + (1.8)(5/37) = 62/37 = 1.676.
Thus we expect (14)(1.676) = 23.5 claims.
Comment: During 2000, we have data from 1997 to 1999 and are trying to predict the year 2001.
4.20. E. We observe 59 claims from 26 exposures. The posterior Gamma has parameters α = 9 + 59 = 68 and θ = 1/(5 + 26) = 0.032258, with mean (68)(0.032258) = 2.1935.
With 7 individuals we expect (7)(2.1935) = 15.355 claims.
At $800 per claim, this is $12,284.
Alternately, the Buhlmann Credibility Parameter K = 1/θ = 1/0.2 = 5.
For 26 exposures, Z = 26/(26+5) = 26/31. The observed frequency is 59/26.
The a priori frequency is αθ = (9)(0.2) = 1.8.
Thus the future estimated frequency is: (26/31)(59/26) + (1.8)(5/31) = 68/31 = 2.1935.
Then proceed as before.


4.21. B. The prior distribution of λ per month is Gamma with α = 1, and θ = 0.015.
The posterior distribution of λ is Gamma with α = 1 + 4 = 5 and 1/θ = 1/0.015 + 36 = 102.667.
This is for one future month. Over 12 future months, frequency is Poisson with mean 12λ.
The posterior distribution of 12λ is Gamma with α = 5, and θ = 12/102.667 = 0.11688.
Thus, for 12 months, the distribution of the number of accidents is Negative Binomial with r = 5, and β = 0.11688.
f(2) = {r(r+1)/2} β²/(1+β)^(2+r) = {(5)(6)/2} (0.11688)²/(1 + 0.11688)^7 = 9.45%.
Alternately, if λ is the mean over 1 month for a particular store, then 12λ is the mean over 1 year.
Multiplying an Exponential (Gamma) by a constant we multiply theta by that constant.
Converting to years, the prior distribution is Gamma: α = 1, and θ = (12)(0.015) = 0.18.
We observe 4 robberies over 3 years.
The posterior distribution is Gamma with α = 1 + 4 = 5, and 1/θ = 1/0.18 + 3 = 8.556.
For 1 year, the predictive distribution is Negative Binomial with r = 5 and β = 1/8.556.
f(2) = {r(r+1)/2} β²/(1+β)^(2+r) = {(5)(6)/2} (1/8.556)²/(1 + 1/8.556)^7 = 9.45%.
Comment: While the predictive distribution for one month is Negative Binomial with r = 5 and β = 1/102.667, we can not add up 12 copies of this distribution in order to get the distribution of the number of accidents for 12 months. This would be what we would do if each month had a different lambda picked at random from the Gamma Distribution; here each month from a single store has the same lambda. See Mahler's Guide to Frequency Distributions.
For the Gamma-Poisson, the mixed distribution for Y years of data is given by a Negative Binomial Distribution, with parameters r = α and β = Yθ.
4.22. D. The Poisson parameters over three years are three times those on an annual basis.
Therefore they are given by a Gamma distribution with α = 3 and θ = 3/12 = 1/4.
(The mean frequency is now 3/4 per three years rather than 3/12 = 1/4 on an annual basis.
It might be helpful to recall that θ is the scale parameter for the Gamma Distribution.)
The marginal distribution for the Gamma-Poisson is a Negative Binomial, with parameters r = α = 3 and β = θ = 1/4. f(0) = 1/(1 + β)^r = 1/(5/4)³ = 0.512.
4.23. A. Negative Binomial, with parameters r = α = 3 and β = θ = 1/4.
Therefore f(1) = rβ/(1 + β)^(r+1) = (3)(1/4)/(5/4)^4 = 0.3072.


4.24. C. For the Gamma-Poisson, the Buhlmann Credibility Parameter K = 1/θ = 12.
Z = 3/(3 + 12) = 1/5 = 20%.
Alternately, one can use the θ of 1/4 from the Gamma distribution of Poisson parameters for 3 years and get Z = 1/(1 + 4) = 1/5, where there is a single three-year period.
Comment: I'd recommend sticking to the first approach under exam conditions.
4.25. D. The a priori mean annual frequency is the mean of the prior Gamma = 3/12 = 1/4.
Z = .2 from the previous solution. The new estimate is: (.2)(0) + (1 - .2)(.25) = 0.2.
4.26. E. The a priori mean annual frequency is the mean of the prior Gamma = 3/12 = 1/4.
The observed annual frequency is 1/3. The credibility is .2.
The new estimate = (.2)(1/3) + (1 - .2)(.25) = 0.267.
4.27. C. For the Gamma-Poisson the result of Bayesian Analysis equals that of Buhlmann credibility. The estimates if one observes zero or one claim over three years are given by previous solutions as: 0.2 and 0.267. The probabilities of observing zero or one claim over three years are given by previous solutions as: 0.512 and 0.3072. Thus the combined estimate for insureds with either zero or one claim over three years is the weighted average:
{(0.2)(0.512) + (0.267)(0.3072)} / (0.512 + 0.3072) = 0.225.
Alternately, for a Poisson parameter of λ, the chance of zero or one claim over three years is:
e^(-3λ) + 3λe^(-3λ) = e^(-3λ)(1 + 3λ). The a priori probability of λ is proportional to λ²e^(-12λ).
Thus the posterior distribution of λ is proportional to: e^(-3λ)(1 + 3λ)λ²e^(-12λ) = e^(-15λ)(λ² + 3λ³).
In order to get the posterior distribution we must divide by the integral from zero to infinity.
Twice using the formula for Gamma integrals, this integral is: 15^(-3)Γ(3) + 3(15^(-4))Γ(4) = 0.0009481.
The posterior distribution is: 1055 e^(-15λ)(λ² + 3λ³).
The mean of this distribution is the integral from zero to infinity of λ times f(λ):
∫0∞ 1055 e^(-15λ)(λ³ + 3λ⁴) dλ = (1055){15^(-4)Γ(4) + 3(15^(-5))Γ(5)} = (1055)(0.0002133) = 0.225.
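A numerical check of the 0.225 estimate (a sketch, assuming SciPy), using the prior proportional to λ²e^(-12λ) and the likelihood of zero or one claim in three years:

from scipy.integrate import quad
import numpy as np

prior = lambda lam: lam**2 * np.exp(-12 * lam)
like = lambda lam: np.exp(-3 * lam) * (1 + 3 * lam)   # Prob[0 or 1 claim in 3 years | lambda]

num, _ = quad(lambda lam: lam * like(lam) * prior(lam), 0, np.inf)
den, _ = quad(lambda lam: like(lam) * prior(lam), 0, np.inf)
print(num / den)                                      # about 0.225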


4.28. A. The posterior distribution is Gamma with α = 1.5 + 2 = 3.5, and 1/θ = 1/0.03 + 3 = 36.33.
Thus, the posterior distribution of 3λ is Gamma with α = 3.5, and θ = 3/36.33 = 0.0826.
Thus, for a total of 3 years, the distribution of the number of accidents is Negative Binomial with r = 3.5 and β = 0.0826. f(1) = rβ/(1+β)^(1+r) = (3.5)(0.0826)/(1 + 0.0826)^4.5 = 20.2%.
Comment: For the Gamma-Poisson, the mixed distribution for Y years of data is given by a Negative Binomial Distribution, with parameters r = α and β = Yθ.
See Mahler's Guide to Frequency Distributions.


4.29. D. The variance of the Gamma = VHM = Total Variance - EPV =
Variance of the Negative Binomial - Mean of the Gamma =
Variance of the Negative Binomial - Mean of the Negative Binomial = rβ(1+β) - rβ = rβ² = (4)(3/17)² = 0.125.
Alternately, the parameters of the Gamma can be gotten from those of the Negative Binomial, α = r = 4, θ = β = 3/17. Then the Variance of the Gamma = αθ² = 0.125.
4.30. B. The Gamma distribution of Poisson parameters for 1 year from the previous solution has parameters α = 4 and θ = 3/17. The distribution of Poisson parameters for 5 years has the same α = 4, but θ = (5)(3/17) = 15/17.
The Negative Binomial for 5 years of data has parameters: r = α = 4 and β = θ = 15/17.
Therefore the chance of 2 accidents is: f(2) = {r(r + 1)/2} β²/(1 + β)^(r+2) =
{(4)(5)/2} (15/17)² /(1 + 15/17)^6 = (10)(0.7785)/(44.484) = 17.5%.
4.31. A. Prob[observe 1 or more claim | λ] = 1 - Prob[0 claims | λ] = 1 - e^(-λ).
∫0∞ (1 - e^(-λ)) 100λ e^(-10λ) dλ = 1 - ∫0∞ 100λ e^(-11λ) dλ = 1 - 100/11² = 21/121.
By Bayes Theorem, the posterior distribution of λ is: (1 - e^(-λ))100λe^(-10λ) / (21/121) =
(12100/21)(λe^(-10λ) - λe^(-11λ)).
Therefore, the expected future claim frequency of this policyholder =
∫0∞ λ (12100/21)(λe^(-10λ) - λe^(-11λ)) dλ = (12100/21) ∫0∞ (λ²e^(-10λ) - λ²e^(-11λ)) dλ =
(12100/21){2/10³ - 2/11³} = 0.287.
Comment: Gamma type integral: ∫0∞ t^(α-1) e^(-t/θ) dt = Γ(α)θ^α, or for integer n:
∫0∞ t^n e^(-ct) dt = n!/c^(n+1).


4.32. C. From the previous solution, the posterior distribution of λ is:
(12100/21)(λe^(-10λ) - λe^(-11λ)). Therefore, the posterior probability that this same policyholder will have 0 claims this year is:
∫0∞ e^(-λ)(12100/21)(λe^(-10λ) - λe^(-11λ)) dλ = (12100/21) ∫0∞ (λe^(-11λ) - λe^(-12λ)) dλ =
(12100/21)(1/11² - 1/12²) = 0.761.
Comment: Similar to 4, 11/01, Q.34.
4.33. C. From a previous solution, the posterior distribution of λ is:
(12100/21)(λe^(-10λ) - λe^(-11λ)). Prob[2 or more claims | λ] = 1 - e^(-λ) - λe^(-λ).
Therefore, the posterior probability that this same policyholder will have at least 2 claims this year is:
∫0∞ (1 - e^(-λ) - λe^(-λ))(12100/21)(λe^(-10λ) - λe^(-11λ)) dλ =
1 - (12100/21) ∫0∞ (λe^(-11λ) + λ²e^(-11λ) - λe^(-12λ) - λ²e^(-12λ)) dλ =
1 - (12100/21)(1/11² + 2/11³ - 1/12² - 2/12³) = 0.0405.
4.34. D. Since we are not given the data on the number of claims, we divide the dollars of loss by
the average claim cost. For policyholder 2 that is $58,800/$600 = 98 claims. There are 49
exposures observed for policyholder 2. Thus the posterior Gamma has parameters α = 9 + 98 = 107 and θ = 0.2/(1 + (0.2)(49)) = 0.01852, with mean (107)(0.01852) = 1.9815.
With a frequency of 1.9815 at $600 per claim, the expected pure premium is: (1.9815)(600) =
$1189 per person.
Alternately, one could use Buhlmann Credibility with K = 1/θ = 5, and Z = 49/(49 + 5) = 90.7%.
The observed frequency is: 98/49 = 2. The a priori mean frequency is: (9)(.2) = 1.8.
The estimated future frequency is: (.907)(2) + (1 - .907)(1.8) = 1.9814.
The expected pure premium is: (1.9814)(600) = $1189 per person.
Comment: During 2000, we have data from 1997 to 1999 and are trying to predict the year 2001.


4.35. B. Since we are not given the data on the number of claims, we divide the dollars of loss by
the average claim cost. In 1997 the average claim cost is 800/(1.07)4 = 610. Thus the $8700 in
losses correspond to 8700/610 = 14.26 claims. Similarly, 11,800/ (800/(1.07)3 ) = 18.07, and
11,100/ (800/(1.07)2 ) = 15.89. Thus we have 14.26 + 18.07 + 15.89 = 48.22 claims and 32
exposures. The posterior Gamma has parameters: α = 9 + 48.22 = 57.22 and
1/θ = 1/0.2 + 32 = 37, with mean 57.22/37 = 1.547. With 16 individuals we expect (16)(1.547) =
24.75 claims. At $800 per claim, this is $19,800.
Alternately, one could use Buhlmann Credibility with K = 1/θ = 5. With 32 exposures,
Z= 32/37. The inferred, observed frequency is 48.22/32. The a priori frequency is (9)(.2) = 1.8.
Thus the estimated future frequency is: (32/37)(48.22/32) + (5/37)(1.8) = 57.22/37 = 1.546.
Then proceed as before.
4.36. C. This is a Gamma-Poisson with α = 1 and θ = 1/8.
We observe 3 + 12 = 15 months and 4 claims.
α = 1 + 4 = 5 and 1/θ = 8 + 15 = 23. Mean of the posterior Gamma is: 5/23.
Expected number of claims during 2006 (12 months) is: (12)(5/23) = 60/23 = 2.61.
Alternately, K = 1/θ = 8. Z = 15/(15 + 8) = 15/23. The a priori mean is 1/8.
Estimated future frequency is: (4/15)(15/23) + (1/8)(8/23) = 5/23.
Expected number of claims during 2006 (12 months) is: (12)(5/23) = 2.61.
4.37. D, 4.38. B, & 4.39. D.
The mixed distribution is a Negative Binomial with r = α = 4 and β = θ = 0.1.
f(0) = (1+β)^(-r) = 1.1^(-4) = 0.6830. Expected size of group A: 6830.
f(1) = rβ(1+β)^(-(r+1)) = (4)(0.1)(1.1^(-5)) = 0.2484. Expected size of group B: 2484.
Expected size of group C: 10000 - (6830 + 2484) = 686.
4.40. A. α = α + 0 = 4. 1/θ = 1/θ + 1 = 11. Posterior mean = αθ = 4/11 = 0.364.
4.41. E. α = α + 1 = 5. 1/θ = 1/θ + 1 = 11. Posterior mean = αθ = 5/11 = 0.455.


4.42. C. Expected Number of claims for this portfolio is: (10000)(4)(0.1) = 4000.
Expected number of claims from Group A is: (4/11)(6830) = 2484.
Expected number of claims from Group B is: (5/11)(2484) = 1129.
Therefore, expected number of claims from Group C is: 4000 - (2484 + 1129) = 387.
Expected claim frequency for Group C is: 387 / 686 = 0.564.
Alternately, by Bayes Theorem, the posterior distribution of λ is proportional to:
(10000 λ³ e^(-10λ) / 6)(1 - e^(-λ) - λe^(-λ)).
∫0∞ (1 - e^(-λ) - λe^(-λ))(10000 λ³ e^(-10λ) / 6) dλ =
1 - (10000/6) ∫0∞ (λ³e^(-11λ) + λ⁴e^(-11λ)) dλ = 1 - (10000/6)(6/11^4 + 24/11^5) = 0.068618.
Therefore, the posterior distribution is: 24289 λ³ e^(-10λ)(1 - e^(-λ) - λe^(-λ)), with mean:
∫0∞ λ · 24289 λ³ e^(-10λ)(1 - e^(-λ) - λe^(-λ)) dλ = 24289(24/10^5 - 24/11^5 - 120/11^6) = 0.564.
Comment: The future claim frequency for those with exactly 2 claims is: 6/11 = 0.545.
However, group C also includes some policyholders who had more than 2 claims, and therefore with even higher expected claim frequencies.
4.43. D. r = α = 4, β = θ = 1/11 = 0.09091. For the predictive Negative Binomial:
f(0) = 1.09091^(-4) = 70.6%.
4.44. A. r = α = 5, β = θ = 1/11 = 0.09091. For the predictive Negative Binomial:
f(0) = 1.09091^(-5) = 64.7%.


4.45. C. From a previous solution, the posterior distribution is:
24289 λ³ e^(-10λ)(1 - e^(-λ) - λe^(-λ)). Chance of 0 claims is:
∫0∞ 24289 λ³ e^(-10λ)(1 - e^(-λ) - λe^(-λ)) e^(-λ) dλ = 24289 ∫0∞ (λ³e^(-11λ) - λ³e^(-12λ) - λ⁴e^(-12λ)) dλ =
24289(6/11^4 - 6/12^4 - 24/12^5) = 58.3%.


Comment: Prior to any observation, we expect 6830 policyholders with no claims in the first year. In
the next year, we expect to see: (.706)(6830) + (.647)(2484) + (.583)(686) = 6829 policyholders
with no claims, the same number as the first year, subject to rounding. A better estimate of the mean
frequency for each insured results from observing; however, the distribution for the whole portfolio
remains the same, assuming the original model was okay.
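The bookkeeping in solutions 4.37 through 4.45 can be verified with a few lines of Python (a sketch, assuming SciPy); the group sizes come from the mixed Negative Binomial, and the predictive probabilities of no claims use the posterior parameters derived above:

from scipy.stats import nbinom

r, beta = 4, 0.1                              # mixed distribution: r = alpha, beta = theta
p = 1 / (1 + beta)
fA = nbinom.pmf(0, r, p)                      # 0.6830 -> about 6830 insureds in Group A
fB = nbinom.pmf(1, r, p)                      # 0.2484 -> about 2484 insureds in Group B
fC = 1 - fA - fB                              # 0.0686 -> about  686 insureds in Group C
print(round(10000*fA), round(10000*fB), round(10000*fC))

# Predictive chance of 0 claims next year for Groups A and B (solutions 4.43 and 4.44):
print(nbinom.pmf(0, 4, 1/(1 + 1/11)), nbinom.pmf(0, 5, 1/(1 + 1/11)))   # 0.706 and 0.647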
4.46. E. r = α = 4, β = θ = 1/11 = 0.09091. For the predictive Negative Binomial:
1 - (f(0) + f(1)) = 1 - {1.09091^(-4) + (4)(0.09091)(1.09091^(-5))} = 5.86%.
4.47. E. r = α = 5, β = θ = 1/11 = 0.09091. For the predictive Negative Binomial:
1 - (f(0) + f(1)) = 1 - {1.09091^(-5) + (5)(0.09091)(1.09091^(-6))} = 8.31%.
4.48. A. From a previous solution, the posterior distribution is:
24289 λ³ e^(-10λ)(1 - e^(-λ) - λe^(-λ)). Chance of 2 or more claims is:
∫0∞ 24289 λ³ e^(-10λ)(1 - e^(-λ) - λe^(-λ))(1 - e^(-λ) - λe^(-λ)) dλ =
24289 ∫0∞ (λ³e^(-10λ) + 2λ⁴e^(-12λ) + λ³e^(-12λ) + λ⁵e^(-12λ) - 2λ³e^(-11λ) - 2λ⁴e^(-11λ)) dλ =
24289(6/10^4 + 48/12^5 + 6/12^4 + 120/12^6 - 12/11^4 - 48/11^5) = 11.6%.
Alternately, for the whole portfolio we expect 686 policyholders to have 2 or more claims.
We expect (5.86%)(6830) = 400 from Group A, (8.31%)(2484) = 206 from Group B, and therefore 686 - (400 + 206) = 80 from Group C. 80/686 = 11.7%.


4.49. A. For the Gamma-Poisson the posterior distribution of λ is a Gamma.
Posterior α = Prior α + number of claims observed ≥ Prior α + 1 > 1. (α > 0.) Thus while the posterior distribution of λ is a Gamma it can't be an Exponential, which would be a Gamma with α = 1. Thus Statement #1 is true.
For the Gamma Distribution, the coefficient of variation is √(αθ²)/(αθ) = 1/√α. Since the Posterior α = Prior α + number of claims observed ≥ Prior α + 1 > Prior α, the coefficient of variation of the posterior distribution of λ is less than the coefficient of variation of the prior distribution of λ. Statement #2 is true.
The distribution of the number of claims is a Negative Binomial. The Negative Binomial has a coefficient of variation of: √(rβ(1+β))/(rβ) = √{(1+β)/(rβ)}. However, r = α and β = θ. Thus the prior CV² is: (1+β)/(rβ) = (1+θ)/(αθ) = (1 + 1/θ)/α.
Let C claims be observed in E years. Posterior α = Prior α + C. 1/Posterior θ = E + 1/Prior θ. Thus the posterior CV² is: (1 + E + 1/θ)/(C + α). Therefore, depending on the values of C and E, the posterior CV of the distribution of the number of claims can be either larger or smaller than the prior CV. Statement #3 is not true.
4.50. D. The scale parameter of a Gamma is the variance/mean,
which in this case for the prior Gamma is: 0.0004 / 0.08 = 1/200.
Then the shape parameter is the mean divided by the scale parameter,
which for the prior Gamma is: (0.08)/(1/200) = 16.
For the Gamma-Poisson, the posterior Gamma has shape parameter =
prior shape parameter + number of claims observed = 16 + 120 = 136,
and inverse posterior scale parameter =
inverse prior scale parameter + the number of observed exposures = 200 + 900 = 1100.
Thus the posterior scale parameter of the posterior Gamma is 1/1100.
The Bayesian estimate is the mean of the posterior Gamma =
posterior shape parameter times the posterior scale parameter = 136/1100 = 0.1236.
Alternately, for the Gamma-Poisson the Bayes estimate is equal to the Buhlmann Credibility
estimate. For the Gamma-Poisson, the Buhlmann Credibility parameter is the inverse scale
parameter of the prior Gamma. Thus the Buhlmann Credibility parameter K = 200.
We have observed (3)(300) = 900 member-years,
so that the credibility Z = N / (N+K) = 900 / (900 + 200) = 9/11.
The observed frequency is 120 / 900 = 0.1333.
The prior estimate of the frequency is the mean of the prior Gamma, which is 0.08.
Thus the new estimate of the frequency = (9/11)(0.1333) + (2/11)(0.08) = 0.1236.


4.51. C. Gamma-Poisson with prior α = 4 and prior θ = 1/200.
α = α + (3 + 2 + 3) = 12. 1/θ = 1/θ + (200 + 250 + 300) = 950.
Estimated future frequency = mean of the posterior Gamma = αθ = 12/950.
Estimate of the number of claims in Year 4: (350)(12/950) = 4.42.
Alternately, K = 1/θ = 200. Observed frequency = (3 + 2 + 3)/(200 + 250 + 300) = 8/750.
Prior mean frequency = mean of the prior Gamma = αθ = 4/200 = 0.02.
Z = 750/(750 + K) = 0.789. Estimated future frequency = (0.789)(8/750) + (0.211)(0.02) = 0.01264.
Estimate of the number of claims in Year 4: (350)(0.01264) = 4.42.
Comment: Similar to 4, 11/03, Q.27.
4.52. C. Posterior to Year 3, the distribution of λ for one exposure is Gamma with α = 12 and θ = 1/950.
For 350 exposures, the mean frequency is Poisson with mean 350λ.
The distribution of 350λ is Gamma with α = 12 and θ = 350/950 = 7/19.
The number of claims in Year 4 is Negative Binomial with r = 12 and β = 7/19.
f(0) = 1/(1 + β)^r = (26/19)^(-12) = 0.02319.
f(1) = f(0) rβ/(1+β) = (0.02319)(12)(7/26) = 0.07492.
f(2) = f(1) {(r+1)/2} β/(1+β) = (0.07492)(13/2)(7/26) = 0.13113.
f(0) + f(1) + f(2) = 22.9%.
Comment: For the Gamma-Poisson, the mixed distribution for Y exposures is given by a Negative Binomial Distribution, with parameters r = α and β = Yθ.
See Mahler's Guide to Frequency Distributions.
4.53. A. The prior distribution of λ is Gamma with α = 40 and θ = 1/300.
After one year: α = α + C = 40 + 60 = 100. 1/θ = 1/θ + E = 300 + 500 = 800.
Estimated future frequency = Mean of the posterior Gamma = αθ = 100/800 = 1/8.
Expected number of claims in year 2: (600)(1/8) = 75.
Updating for one more year of data:
α = α + C = 100 + 90 = 190. 1/θ = 1/θ + E = 800 + 600 = 1400.
Estimated future frequency = Mean of the posterior Gamma = αθ = 190/1400 = 0.1357.
Expected number of claims in year 3: (700)(0.1357) = 95.
R + A = 75 + 95 = 170.
Alternately, Andy can update for both years at once:
α = α + C = 40 + 60 + 90 = 190. 1/θ = 1/θ + E = 300 + 500 + 600 = 1400. Proceed as before.


4.54. For female drivers, the mean is: 0.211126. The second moment is: 0.292895.
rβ = 0.211126. rβ(1+β) = 0.292895 - 0.211126² = 0.248321.
1 + β = 0.248321/0.211126 = 1.17617. β = 0.17617. r = 0.211126/0.17617 = 1.1984.

Number of Claims   Observed   Method of Moments Negative Binomial   Expected   Chi-Square
0                  19,634     0.8232820                             19,653.4   0.019
1                   3,573     0.1477789                              3,527.8   0.580
2                     558     0.0243305                                580.8   0.896
3                      83     0.0038853                                 92.7   1.025
4 and over             24     0.0007233                                 17.3   2.625
Sum                23,872                                           23,872.0   5.145

Where the last group is 4 and over, since 5 and over would have had 2.7 expected drivers.
With 2 fitted parameters, we have 5 - 1 - 2 = 2 degrees of freedom.
For the Chi-Square with 2 d.f. the critical value for 10% is 4.605 and for 5% is 5.991.
4.605 < 5.145 < 5.991. Thus we reject the fit at 10%, and do not reject the fit at 5%.
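The chi-square comparison is easy to reproduce (a sketch, assuming SciPy); the observed and expected counts are those tabulated above:

from scipy.stats import chi2

observed = [19634, 3573, 558, 83, 24]
expected = [19653.4, 3527.8, 580.8, 92.7, 17.3]
stat = sum((o - e)**2 / e for o, e in zip(observed, expected))
print(stat)                                          # about 5.1, matching the table up to rounding
print(chi2.ppf(0.90, df=2), chi2.ppf(0.95, df=2))    # critical values 4.605 and 5.991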
4.55. For male drivers, the mean is: 0.361701. The second moment is: 0.574357.
rβ = 0.361701. rβ(1+β) = 0.574357 - 0.361701² = 0.443529.
1 + β = 0.443529/0.361701 = 1.22623. β = 0.22623. r = 0.361701/0.22623 = 1.5988.

Number of Claims   Observed   Method of Moments Negative Binomial   Expected   Chi-Square
0                  21,800     0.7217573                             21,864.2   0.188
1                   6,589     0.2128941                              6,449.2   3.030
2                   1,476     0.0510369                              1,546.1   3.175
3                     335     0.0112953                                342.2   0.150
4                      69     0.0023959                                 72.6   0.176
5 and over             24     0.0006205                                 18.8   1.441
Sum                30,293                                            30,293.0   8.162

Where the last group is 5 and over, since 6 and over would have had 3.8 expected drivers.
With 2 fitted parameters, we have 6 - 1 - 2 = 3 degrees of freedom.
For the Chi-Square with 3 d.f. the critical value for 5% is 7.815 and for 2.5% is 9.348.
7.815 < 8.162 < 9.348. Thus we reject the fit at 5%, and do not reject the fit at 2.5%.


4.56. For each group, we have assumed a Gamma-Poisson, with r = α and β = θ.
The data is for six years, so each mean Poisson frequency is 6λ, where λ is the mean for a single year. Thus each Gamma Distribution has 6 times the scale parameter for one year.
Therefore, the θ for one year is: the fitted β divided by 6.
The Buhlmann Credibility parameter for the Gamma-Poisson is 1/θ for one year.
K = 6/(fitted beta). For three years of data, Z = 3/(3 + K).

Data Set   Fitted Beta   K      Credibility for Three Years of Data
Female     0.17617       34.1   8.1%
Male       0.22623       26.5   10.2%

Comment: The male drivers are a less homogeneous group than the female drivers.
Thus the experience of a male driver is given more credibility, compared to the mean of all male drivers, than is the case for female drivers. The credibility assigned to an individual driver's experience would be less if one took into account classifications and territories.
Data for 1969-1974 California Drivers, taken from Table 1 and Table A2 of "The Distribution of Automobile Accidents - Are Relativities Stable Over Time?", by Emilio C. Venezian, PCAS 1990.
The means for each driver are not constant over time. See "A Markov Chain Model of Shifting Risk Parameters", by Howard C. Mahler, PCAS 1997. This results in less credibility being given to older years of data.
4.57. B. This is a Gamma-Poisson. The prior estimate is: αθ.
Posterior α = α + C = α + 0 = α. 1/Posterior θ = 1/θ + E = 1/θ + 3. Posterior θ = θ/(1 + 3θ).
The posterior estimate is: αθ/(1 + 3θ).
(Posterior estimate)/(prior estimate) = 1/(1 + 3θ).
Posterior estimate is 85% of prior estimate. 0.85 = 1/(1 + 3θ). θ = 1/17.
Alternately, Z = the claims-free discount = 1 - 85% = 15%.
For the Gamma-Poisson, K = 1/θ. Z = 3/(3 + 1/θ) = 3θ/(3θ + 1) = 15%. θ = 1/17.


4.58. D. The chance of the observation given λ is: (λe^(-λ))(λ²e^(-λ)/2)(λ³e^(-λ)/6) = λ⁶e^(-3λ)/12.
π(λ) = λ^(α-1) e^(-λ/θ) / (θ^α Γ(α)).
Therefore, the posterior distribution of λ is proportional to: (λ⁶e^(-3λ))(λ^(α-1) e^(-λ/θ)) = λ^(α+5) e^(-λ(3+1/θ)).
The mean of the posterior distribution of λ is:
(integral of λ times the probability weight)/(integral of the probability weight)
= ∫0∞ λ^(α+6) exp[-(3 + 1/θ)λ] dλ / ∫0∞ λ^(α+5) exp[-(3 + 1/θ)λ] dλ.
Comment: The integrals are of the Gamma type.
∫0∞ λ^(α+6) exp[-(3 + 1/θ)λ] dλ = Γ[α+7] / (3 + 1/θ)^(α+7).
∫0∞ λ^(α+5) exp[-(3 + 1/θ)λ] dλ = Γ[α+6] / (3 + 1/θ)^(α+6).
{Γ[α+7] / (3 + 1/θ)^(α+7)} / {Γ[α+6] / (3 + 1/θ)^(α+6)} = (α + 6)/(3 + 1/θ).
The posterior distribution is Gamma with α' = α + 6, and 1/θ' = 1/θ + 3.
The mean of this Gamma is: α'θ' = (α + 6)/(3 + 1/θ).


4.59. E. The chance of 2 claims given λ is: λ²e^(-λ)/2.
From the previous solution, the posterior distribution of λ is proportional to: λ^(α+5) e^(-λ(3+1/θ)).
The probability of two claims in year 4 from this policyholder is:
(integral of λ²e^(-λ)/2 times the probability weight)/(integral of the probability weight)
= 0.5 ∫0∞ λ^(α+7) exp[-(4 + 1/θ)λ] dλ / ∫0∞ λ^(α+5) exp[-(3 + 1/θ)λ] dλ.
Comment: The integrals are of the Gamma type.
∫0∞ λ^(α+7) exp[-(4 + 1/θ)λ] dλ = Γ[α+8] / (4 + 1/θ)^(α+8).
∫0∞ λ^(α+5) exp[-(3 + 1/θ)λ] dλ = Γ[α+6] / (3 + 1/θ)^(α+6).
(0.5){Γ[α+8] / (4 + 1/θ)^(α+8)} / {Γ[α+6] / (3 + 1/θ)^(α+6)} =
{(α+6)(α+7)/2} (3 + 1/θ)^(α+6) / (4 + 1/θ)^(α+8).
The posterior distribution is Gamma with α' = α + 6, and 1/θ' = 1/θ + 3.
The predictive distribution is Negative Binomial with r = α + 6, and β = 1/(1/θ + 3).
The density at two of this Negative Binomial is:
{r(r+1)/2!} β²/(1+β)^(2+r) = {(α + 6)(α + 7)/2} {1/(3 + 1/θ)}² / {(4 + 1/θ)/(3 + 1/θ)}^(α + 8) =
{(α+6)(α+7)/2} (3 + 1/θ)^(α+6) / (4 + 1/θ)^(α+8).


4.60. B. The prior distribution is Gamma with α = 5 and θ = 1/50.
The posterior distribution of λ is Gamma with α = 5 + 7 + 6 = 18, and 1/θ = 50 + 20 + 25 = 95.
For 60 exposures, the frequency is Poisson with mean 60λ.
The posterior distribution of 60λ is Gamma with α = 18, and θ = 60/95 = 12/19.
Thus, for a total of 60 exposures in years 3 and 4, the distribution of the number of accidents is Negative Binomial with r = 18 and β = 12/19.
f(9) = {(18)(19)(20)(21)(22)(23)(24)(25)(26)/9!} β⁹ / (1+β)^(9+r) = 9.1%.
Comment: For the Gamma-Poisson, the mixed distribution for Y exposures of data is given by a Negative Binomial Distribution, with parameters r = α and β = Yθ.
See Mahler's Guide to Frequency Distributions.
A graph of the distribution of the total number of accidents in years 3 and 4: f(x) versus x. [Figure not reproduced here.]
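A one-line check of f(9) in solution 4.60 (a sketch, assuming SciPy):

from scipy.stats import nbinom

print(nbinom.pmf(9, 18, 1/(1 + 12/19)))   # about 0.091, i.e. 9.1%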


4.61. a. The posterior distribution is proportional to: π(λ) f(0) = e^(-λ), an Exponential with mean 1.
b. The posterior distribution is proportional to: π(λ) f(0) f(2) = λ² e^(-2λ)/2. This is proportional to a Gamma Distribution with α = 3 and θ = 1/2, which must be the posterior distribution.
It has a variance of: αθ² = 3/4.
c. The posterior distribution is proportional to: π(λ) f(0) f(2) f(1) = λ³ e^(-3λ)/2. This is proportional to a Gamma Distribution with α = 4 and θ = 1/3, which must be the posterior distribution.
Mixing Poissons via this Gamma produces a Negative Binomial predictive distribution, with r = 4 and β = 1/3. This Negative Binomial has a variance of: rβ(1+β) = (4)(1/3)(4/3) = 16/9.
Alternately, posterior EPV = posterior mean = (4)(1/3) = 4/3.
Posterior VHM = Variance of the posterior gamma = (4)(1/3)² = 4/9.
Posterior Total Variance = EPV + VHM = 4/3 + 4/9 = 16/9.


4.62. B. The contributions to the likelihood from each policyholder will multiply, since their experience is assumed to be independent. For the first policyholder, if it has a Poisson parameter of λ, then the likelihood from 1997 is f(17) for a Poisson with mean 9λ: (9λ)^17 e^(-9λ) / 17!.
Multiplying the likelihoods from the 3 years, the likelihood for the first policyholder is:
{(9λ)^17 e^(-9λ)/17!}{(10λ)^20 e^(-10λ)/20!}{(13λ)^16 e^(-13λ)/16!}.
Ignoring annoying constants, which will not affect the maximum likelihood, this is:
λ^(17+20+16) e^(-(9+10+13)λ) = λ^53 e^(-32λ).
Given that in turn λ is distributed via a Gamma with α = 10 and θ unknown, we can calculate the expected value of this likelihood:
∫0∞ λ^53 e^(-32λ) λ^9 e^(-λ/θ) θ^(-10)/Γ(10) dλ = {θ^(-10)/Γ(10)} ∫0∞ λ^62 e^(-(32+1/θ)λ) dλ =
{θ^(-10)/Γ(10)} Γ(63)(32 + 1/θ)^(-63) = {Γ(63)/Γ(10)} θ^(-10) (32 + 1/θ)^(-63).
The first policyholder had 53 claims and 32 exposures and the likelihood was proportional to:
θ^(-10) (32 + 1/θ)^(-63) = θ^(-α) (E + 1/θ)^(-(C+α)).
Thus for the second policyholder with 59 claims and 26 exposures, the likelihood is proportional to: θ^(-10) (26 + 1/θ)^(-69). For the third policyholder with 91 claims and 49 exposures, the likelihood is proportional to: θ^(-10) (49 + 1/θ)^(-101).
The product of the likelihoods is: θ^(-30) (32 + 1/θ)^(-63) (26 + 1/θ)^(-69) (49 + 1/θ)^(-101).
The loglikelihood is: -30 ln(θ) - 63 ln(32 + 1/θ) - 69 ln(26 + 1/θ) - 101 ln(49 + 1/θ).
Setting the derivative with respect to θ equal to 0:
-30/θ - (-1/θ²)63/(32 + 1/θ) - (-1/θ²)69/(26 + 1/θ) - (-1/θ²)101/(49 + 1/θ) = 0.
30θ = 63/(32 + 1/θ) + 69/(26 + 1/θ) + 101/(49 + 1/θ).
Comment: In this case, the maximum likelihood estimate of θ = 0.192. Very difficult!
More generally assume we have N policyholders, with policyholder i having Ci claims and Ei exposures. Then if α is known, the maximum likelihood equation for θ is:
αθ = (1/N) Σ (α + Ci)/(Ei + 1/θ). If we performed Bayesian analysis on each policyholder separately, then policyholder i has future expected frequency: (α + Ci)/(Ei + 1/θ).
Thus the maximum likelihood equation for θ is equivalent to:
overall average future claim frequency =
average of the future claim frequencies for the individual policyholders.
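The maximum likelihood equation above can be solved numerically (a sketch, assuming SciPy):

from scipy.optimize import brentq

g = lambda t: 30*t - 63/(32 + 1/t) - 69/(26 + 1/t) - 101/(49 + 1/t)
print(brentq(g, 0.01, 1.0))   # about 0.192, the value quoted in the comment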


4.63. B. & 4.64. D. The number of claims for the group is the sum of 4 independent Poissons, which in turn is also Poisson with mean equal to: λ1 + λ2 + λ3 + λ4.
The lambdas are independent draws from a Gamma Distribution with α = 2 and θ = 0.10.
Thus their sum is a Gamma Distribution with α = 8 and θ = 0.10.
Thus we have a Gamma-Poisson, with 6 claims in one year.
α = α + C = 8 + 6 = 14. 1/θ = 1/θ + E = 10 + 1 = 11.
Posterior mean is: 14/11 = 1.273.
The predictive distribution is Negative Binomial with r = 14 and β = 1/11.
f(3) = {r(r+1)(r+2)/6} β³/(1 + β)^(r+3) = (14)(15)(16)(1/11)³ / {6 (12/11)^17} = 9.58%.
Comment: The average number of claims next year per individual is: 1.273/4 = 0.318.
The mathematics of looking at 4 random individuals for one year is somewhat different than when looking at one individual for 4 years.
4.65. C. & 4.66. A. We have a Gamma-Poisson.
α = α + C = 2 + 6 = 8. 1/θ = 1/θ + E = 10 + 4 = 14.
Posterior mean is: 8/14 = 0.571.
The predictive distribution is Negative Binomial with r = 8 and β = 1/14.
f(3) = {r(r+1)(r+2)/6} β³/(1 + β)^(r+3) = (8)(9)(10)(1/14)³ / {6 (15/14)^11} = 2.05%.

4.67. B. The posterior distribution of λ is Gamma with α = 3 + 2 = 5, and 1/θ = 100 + 5 = 105.
The zero-one loss function corresponds to the mode.
The mode of this Gamma is: θ(α - 1) = (1/105)(4) = 3.8%.


4.68. E. Let x be the number of flights purchased by customers during the contest year.
X follows the mixed distribution, a Negative Binomial with r = 1/3 and β = 6.
The chance of a customer being picked is proportional to his number of flights x.
Thus the chance that the customer picked had purchased x flights is proportional to:
x Prob[X = x] = x f(x).
Now the sum of x f(x) is the mean of the Negative Binomial: rβ = (1/3)(6) = 2.
Let Y be the number of flights purchased by the lucky customer.
Then g(y) = y f(y) / 2, where f(y) is the Negative Binomial with r = 1/3 and β = 6.
E[Y] = Σ y g(y) = Σ y² f(y) / 2 = (1/2)(second moment of f) =
(1/2){(variance of f) + (mean of f)²} = (1/2){(1/3)(6)(7) + 2²} = 9.


4.69. C. Let Y be the number of flights purchased by the lucky customer.
Then g(y) = y f(y) / 2, where f(y) is the Negative Binomial with r = 1/3 and β = 6.
Given a customer who took y flights in a year, his posterior distribution of lambda is:
Gamma with α = 1/3 + y, and 1/θ = 1/6 + 1 = 7/6.
Thus the posterior mean is: (1/3 + y)(6/7) = 2/7 + 6y/7.
Thus assuming no change in behavior, the mean number of flights next year is:
Σ (2/7 + 6y/7) g(y) = (2/7) Σ g(y) + (6/7) Σ y g(y) = (2/7)(1) + (6/7) Σ y² f(y)/2 =
2/7 + (3/7)(second moment of f) = (2/7) + (3/7){(1/3)(6)(7) + 2²} = 8.
Given that the lucky customer on average will increase his flights by 50%, the expected number of future flights taken per year by the contest winner is: (1.5)(8) = 12.
Comment: A customer who takes more than his average number of flights during the contest year is more likely to have won the contest. Therefore, the winning customer is likely to have taken more than his average number of flights during the contest year. Therefore, the average lambda for the contest winner is 8, which is less than 9, the average number of flights taken during the contest year by the contest winner.
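A quick numerical confirmation of the size-biased means in solutions 4.68 and 4.69 (a sketch, assuming SciPy); the sum is truncated at a point where the Negative Binomial tail is negligible:

from scipy.stats import nbinom

r, beta = 1/3, 6
p = 1 / (1 + beta)
ys = range(2000)
f = [nbinom.pmf(y, r, p) for y in ys]

second_moment = sum(y*y*fy for y, fy in zip(ys, f))
print(second_moment / 2)                                        # E[Y] = 9 (solution 4.68)

winner_lambda = sum((2/7 + 6*y/7) * y * fy / 2 for y, fy in zip(ys, f))
print(winner_lambda)                                            # 8 (solution 4.69)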
4.70. This is a Gamma-Poisson with α = 2 and θ = 1/5.
The posterior distribution is Gamma with α = 2 + 2 = 4, and 1/θ = 5 + 1 = 6. θ = 1/6.
Thus, the predictive distribution, the posterior mixed distribution, is Negative Binomial with r = 4 and β = 1/6.


4.71. C. The posterior distribution is Gamma with α = 6 + 0 = 6, and 1/θ = 25 + n.
Thus, the mean of the posterior distribution is: 6/(25 + n).
Set: 6/(25 + n) = 0.15. n = 15.
Alternately, the prior mean is: (6)(0.04) = 0.24.
Z(0) + (1 - Z)(0.24) = 0.15. Z = 3/8.
For the Gamma-Poisson, K = 1/θ = 1/0.04 = 25.
3/8 = n/(n + 25). n = 15.


4.72. D. The prior mean is αθ = 1, so θ = 1/α and 1/θ = α.
The posterior distribution is Gamma with α' = α + 20, and 1/θ' = 1/θ + 10 = α + 10.
The predictive distribution, the posterior mixed distribution, is Negative Binomial with r = α + 20, and β = 1/(α + 10).
The variance of the predictive distribution is: rβ(1 + β) = (α + 20){1/(α + 10)}{(α + 11)/(α + 10)}.
16/9 = (α + 20){1/(α + 10)}{(α + 11)/(α + 10)}. (16)(α + 10)² = (9)(α + 20)(α + 11).
7α² + 41α - 380 = 0. α = 5. Posterior α = 25, and θ = 1/15.
Thus the variance of the posterior distribution is: αθ² = 25/15² = 1/9.
4.73. D. p(h | c) is proportional to f(c | h)g(h) = (e^(-h) h^c/c!) a^r e^(-ah) h^(r-1)/Γ(r) =
a^r e^(-(a+1)h) h^(r+c-1) / (Γ(r) c!). This is proportional to e^(-(a+1)h) h^(r+c-1). One can recognize that this is proportional to a Gamma Distribution with parameters (a+1), (r+c).
Therefore, p(h|c) = (a+1)^(r+c) e^(-(a+1)h) h^(r+c-1) / Γ(r+c).
Comment: Note that r is the shape parameter of the prior Gamma, corresponding to α in Loss Models, while a is the inverse scale parameter, corresponding to 1/θ in Loss Models.


4.74. A. For the Gamma-Poisson, the posterior Gamma has parameters: α + y, and 1/(1/θ + N). The Bayesian update is the mean of the posterior distribution: (α + y)/(1/θ + N) =
{(1/θ)/(N + 1/θ)}(αθ) + {N/(N + 1/θ)}(y/N) = {1 - N/(N + 1/θ)}(αθ) + {N/(N + 1/θ)}(y/N).
Alternately, for the Gamma-Poisson the estimates from Buhlmann Credibility and Bayesian Analysis are equal. For the Gamma-Poisson the Buhlmann Credibility parameter is equal to the inverse of the scale parameter of the (prior) Gamma, which is 1/θ in this case.
Thus Z = N/(N + 1/θ). The prior estimate is the mean of the prior Gamma, or αθ.
The observed frequency is y/N. Thus the new estimate is: Z(y/N) + (1 - Z)(αθ).
4.75. A. Gamma-Poisson with α = 1 and 1/θ = a. Posterior α = 1 + 14 = 15. Posterior 1/θ = a + 1.
10 = αθ = 15/(a + 1). a = 1/2. 1/a = 2.
4.76. B. The expected value of the process variance = E[q] = a/b. The variance of the hypothetical
means = VAR[q] = a/b2. Therefore, the Buhlmann Credibility Parameter = K = (a/b) / (a/b2) = b.
Thus m trials is given credibility of m/(m+K) = m / (m+b).
4.77. C. P(D=2) = ∫0∞ P(D=2 | h) g(h) dh = ∫0∞ (e^(-h) h²/2!) e^(-h) dh = (1/2) ∫0∞ h² e^(-2h) dh =
(1/2) Γ(3)/2³ = 2/16 = 1/8.
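A numerical check of P(D = 2) = 1/8 (a sketch, assuming SciPy), mixing the Poisson over the Exponential prior g(h) = e^(-h):

from scipy.integrate import quad
from scipy.stats import poisson
import numpy as np

val, _ = quad(lambda h: poisson.pmf(2, h) * np.exp(-h), 0, np.inf)
print(val)   # 0.125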


4.78. B. The marginal distribution for the Gamma-Poisson is a Negative Binomial.
4.79. C. The prior Gamma has α = 1 (an Exponential Distribution), and θ = 1. We observe 0 claims in 1 year. The posterior Gamma has α = prior α + number of claims = 1 + 0 = 1 (an Exponential Distribution) and 1/θ = prior 1/θ + number of exposures = 1 + 1 = 2.
That is, the posterior density function is: 2e^(-2λ).
Comment: Given λ, the chance of observing zero claims is: λ⁰ e^(-λ)/0! = e^(-λ). The posterior distribution is proportional to the chance of the observation and the a priori distribution of λ:
(e^(-λ))(e^(-λ)) = e^(-2λ). Dividing by the integral of e^(-2λ) from 0 to ∞ gives the posterior distribution: e^(-2λ)/(1/2) = 2e^(-2λ).
2013-4-10,

Conjugate Priors 4 Gamma-Poisson,

HCM 10/21/12,

Page 137

4.80. C. The prior Gamma Distribution has parameters α = 2 and θ = 1.
Thus the posterior Gamma has parameters α = prior α + number of claims = 2 + 1 = 3,
1/posterior θ = 1/prior θ + number of exposures = 1 + 1 = 2. Thus the posterior Gamma has density
f(h) = h^(α-1) e^(-h/θ) / (θ^α Γ(α)) = 2³ h^(3-1) e^(-2h)/Γ(3) = 8h² e^(-2h)/2 = 4h² e^(-2h).
Comment: For the Gamma, f(h) = h^(α-1) e^(-h/θ) / (θ^α Γ(α)). Thus for α = 2 and θ = 1,
f(h) = 1^(-2) h^(2-1) e^(-h)/Γ(2) = h e^(-h), the given prior distribution.
4.81. C. The prior Gamma Distribution has parameters α = 2 and θ = 1.
For the Gamma-Poisson the Buhlmann Credibility Parameter K = 1/(prior θ) = 1.
Thus for one observation, Z = 1/(1 + K) = 1/2.
4.82. A. For the Gamma-Poisson, the posterior Gamma has shape parameter
α = prior α + number of claims observed = 50 + 65 + 112 = 227.
For the Gamma Distribution, the mean is αθ, while the variance is αθ².
Thus the coefficient of variation is: √variance / mean = √(αθ²)/(αθ) = 1/√α.
The CV of the posterior Gamma is: 1/√227 = 0.066.
Comment: 1/θ = prior 1/θ + number of exposures = 500 + 750 + 1100 = 2350. Thus the posterior θ is 1/2350. One could go from the prior Gamma to the Gamma posterior of both years of observations in two steps, by first computing the Gamma posterior of one year by just adding in the exposures and claims observed over the first year.
4.83. B. The variance of the mixed Negative Binomial Distribution is equal to the total variance of the portfolio = the EPV + VHM = Mean of the Gamma + Variance of the Gamma.
Thus Mean of the Gamma = 0.5 - 0.2 = 0.3.
Solving for the parameters of the Gamma: mean = αθ = 0.3, and variance = αθ² = 0.2.
Thus θ = 0.2/0.3 = 2/3 and α = (0.3)/(2/3) = 0.45.
Now r = α = 0.45, while β = θ = 2/3. Thus r(1+β) = (0.45)(5/3) = 0.75.
Alternately, since β = θ and r = α, the variance of the Gamma = αθ² = rβ². Thus since we are given the variance of the Gamma is 0.2, rβ² = 0.2. Also we are given that rβ(1+β) = 0.5.
Therefore (1+β)/β = 0.5/0.2, and β = 2/3. Therefore r = 0.2/(2/3)² = 0.45.
r(1+β) = (0.45)(5/3) = 0.75.


4.84. E. For the Gamma-Poisson, the posterior Gamma has shape parameter = α + Σxi, and inverse of the posterior scale parameter = 1/θ + n.
The Bayesian estimate is the mean of the posterior Gamma =
posterior shape parameter times the posterior scale parameter = {α + Σxi} / (n + 1/θ).
Alternately, for the Gamma-Poisson the Bayes estimate is equal to the Buhlmann Credibility estimate. For the Gamma-Poisson, the Buhlmann Credibility parameter is 1/θ, the (inverse) scale parameter of the prior Gamma. Thus Z = n/(n + 1/θ), and 1 - Z = (1/θ)/(n + 1/θ).
The observed mean is (1/n)Σxi and the prior estimate is the mean of the prior Gamma: αθ.
Thus the new estimate is: {(1/n)Σxi}{n/(n + 1/θ)} + {αθ}{(1/θ)/(n + 1/θ)} = {α + Σxi}/(n + 1/θ).
4.85. D. The prior Gamma has mean = αθ = 0.14, and variance = αθ² = 0.0004.
Thus α = 0.14²/0.0004 = 49 and θ = 0.0004/0.14 = 1/350. The posterior Gamma has parameters
α = prior α + number of claims observed = 49 + 110 = 159,
and 1/θ = prior 1/θ + number of exposures observed = 350 + (2)(310) = 970. Thus θ = 1/970.
The Bayesian estimate is the mean of the posterior Gamma = αθ = 159/970 = 0.164.
Alternately, one can calculate the Buhlmann Credibility estimate, which for the Gamma-Poisson is equal to the Bayesian Estimate. EPV = Expected Value of the Poisson Parameter =
Mean of the Prior Gamma = 0.14. VHM = Variance of the Poisson Parameters =
Variance of the Prior Gamma = 0.0004. K = EPV / VHM = 0.14/0.0004 = 350 = prior 1/θ.
We have observed (2)(310) = 620 policy-years of exposures, so Z = 620/(620 + 350) = 0.639.
The prior estimate is 0.14, while the observation is 110/620 = 0.177.
Thus the new estimate is: (0.639)(0.177) + (1 - 0.639)(0.14) = 0.164.
Comments: Note that the number of exposures observed is: (2)(310) rather than 310 since the
frequency is expressed per year.
4.86. D. α = α + C = 1 + 3 = 4, and 1/θ = 1/θ + E = 1/2 + 1 = 1.5. Posterior Gamma:
f(x) = x^(α-1) e^(-x/θ) / (θ^α Γ(α)) = 1.5^4 x^(4-1) e^(-1.5x)/Γ(4) = (81/16) x³ e^(-1.5x)/6 = (27/32) x³ e^(-1.5x).


4.87. C. For the Gamma-Poisson, the Buhlmann Credibility estimate is equal to the Bayesian Estimate. The latter is equal to the mean of the Posterior Gamma (which equals the mean of the Poisson parameters of the individual insureds). Based on the solution to the previous question, the posterior Gamma has parameters α = 4 and θ = 1/1.5, and thus a mean of: 4/1.5 = 2.67.
Alternately, for the Gamma-Poisson the Buhlmann Credibility Parameter is the inverse of the prior scale parameter. Thus K = 0.5. Z = N/(N+K) = 1/(1 + 0.5) = 2/3.
The observation is 3/1. The prior estimate is the mean of the prior Gamma = 1/0.5 = 2.
Thus the posterior estimate = (2/3)(3) + (1/3)(2) = 8/3.
4.88. E. The Gamma distribution has a mean of αθ and a variance of αθ².
We are given 0.1 = αθ and 0.0003 = αθ², so that θ = 0.0003/0.1 = 0.003, and α = 33.33.
For the Gamma-Poisson, the posterior Gamma has shape parameter = prior shape parameter plus the number of claims observed = 33.33 + 150 = 183.33. The posterior Gamma has the inverse of the scale parameter equal to the inverse of the prior scale parameter plus the number of exposures observed = 333.33 + (3)(200) = 933.33. Thus the posterior scale parameter is 1/933.33. The Bayes Estimate is the mean of the posterior Gamma Distribution = 183.33 / 933.33 = 0.1964.
Comment: Note that there are 200 exposures for each of three years, which counts as observing
600 exposures (because the frequencies are numbers of claims per year per exposure.)
4.89. D. The severity is given by an Exponential Distribution, with mean of 2500 and variance of 2500². The distribution of λ is Exponential; E[λ] = 1/3, VAR[λ] = 1/3² = 1/9.
The hypothetical mean frequencies differ, but the hypothetical mean severities do not.
The hypothetical mean pure premium is 2500λ. Thus the variance of the hypothetical mean pure premiums is VAR[2500λ] = 2500² VAR[λ] = (6.25 million)(1/9) = 0.694 million.
The process variance of the pure premium is given by:
(mean frequency)(variance of the severity) + (mean severity)²(variance of frequency) =
(λ)(2500²) + (2500)²(λ) = 12.5 million λ. Therefore the EPV of the pure premium =
E[λ](12.5 million) = (1/3)(12.5 million) = 4.167 million. The total variance of the pure premiums is:
EPV + VHM = 4.167 + 0.694 million = 4.861 million.
Comment: One has to assume that the frequency and severity are independent.
The frequency is given by a Gamma-Poisson process, with a Gamma with parameters α = 1 and θ = 1/3. The mixed distribution is Negative Binomial with parameters r = 1 and β = 1/3, mean rβ = 1/3, and variance rβ(1+β) = (1)(1/3)(4/3) = 4/9. The variance of the pure premium is given by: (mean freq.)(variance of the severity) + (mean severity)²(variance of frequency) = (1/3)(2500²) + (2500)²(4/9) = 4,861,111. However, this alternative only works when the frequency is Poisson and the variance of the hypothetical mean severities is zero.


4.90. D. Prior Gamma has mean = 0.1 = αθ and variance = 0.0025 = αθ². Therefore
θ = 0.0025/0.1 = 1/40, and α = 0.1/θ = 4. For the Gamma-Poisson Conjugate Prior, the Posterior Gamma has shape parameter equal to the shape parameter of the Prior Gamma + Number of Claims Observed = 4 + 6 = 10. The inverse of the scale parameter of the Posterior Gamma = inverse of the scale parameter of the Prior Gamma + number of exposures = 40 + 20 = 60.
Thus θ = 1/60. The variance of the Posterior Gamma = αθ² = 10/60² = 0.00278.
4.91. D. Prior Gamma has scale parameter θ = 1/1000 and shape parameter α = 150.
After the first year of observations: the new inverse scale parameter = old inverse scale parameter
+ number of exposures = 1000 + 1500 = 2500, and the new shape parameter = old shape
parameter + number of claims = 150 + 300 = 450.
Similarly, after the second year of observations: new inverse scale parameter =
old inverse scale parameter + number of exposures = 2500 + 2500 = 5000,
and the new shape parameter = old shape parameter + number of claims = 450 + 525 = 975.
The Bayesian estimate = the mean of the posterior Gamma =
(Posterior Shape parameter) (Posterior scale parameter) = (975)(1/ 5000) = 0.195.
Comment: One can go directly from the prior Gamma to the Gamma posterior of both years of
observations, by just adding in the exposures and claims observed over the whole period of time.
One would obtain the same result.
4.92. C. Since the exponential is a special case of the Gamma, this is a Gamma-Poisson Process.
The new shape parameter = prior shape parameter + # claims = 1 + 1 = 2.
Posterior inverse scale parameter = prior inverse scale parameter + # expos. = 10 + 1 = 11.
Therefore the Posterior density = λ^(α-1) e^(-λ/θ) / {θ^α Γ(α)} = 11² λ^(2-1) e^(-11λ) / Γ(2) = 121 λ e^(-11λ).
Alternately, one can apply Bayes Theorem and integrate.
The chance of observing one claim in one exposure period given λ is λe^(-λ).

∫₀^∞ λe^(-λ) g(λ) dλ = ∫₀^∞ 10λ e^(-11λ) dλ = [(-10/11)λ e^(-11λ) - (10/121) e^(-11λ)]₀^∞ = 10/121.

By Bayes Theorem, the posterior density of λ is:
(the a priori probability)(chance of the observation), divided by the above integral.
Therefore the density equals: (10 e^(-10λ))(λ e^(-λ)) / (10/121) = 121 λ e^(-11λ).

4.93. B. For the Gamma-Poisson, the Buhlmann Credibility parameter K = the inverse of the scale
parameter of the prior Gamma = 10. Thus Z = 1 / (1+10) = 1/11.
Alternately, for the Gamma-Poisson the Buhlmann credibility estimate equals the Bayes Analysis
estimate. The latter is the mean of the posterior Gamma (with α = 2, θ = 1/11) calculated in the
previous question: 2/11.
The mean of the prior Gamma is 1/10. The observation is 1.
Thus (2/11) = (Z)(1) + (1 - Z)(1/10). Therefore Z = 1/11.
Alternately, EPV = E[λ] = Mean of Prior Gamma = 1/10.
VHM = VAR[λ] = Variance of the Prior Gamma = 1/10².
Therefore K = (1/10) / (1/100) = 10. Therefore Z = 1/11.
4.94. B. Since K = 10, for 5 exposure periods, credibility = 5 / (5 + 10) = 1/3.
For the Gamma-Poisson the Buhlmann Credibility result is equal to the result of Bayesian Analysis,
the mean of the posterior distribution, which is given as 0.6.
Therefore we know that the Buhlmann credibility estimate of the mean is 0.6.
If x is the number of claims observed in the five exposure periods, then the observed frequency is
x/5. The prior estimate is 0.5, the mean of the prior Gamma distribution.
Therefore we have: 0.6 = Buhlmann Credibility Estimate = (1/3)(x/5) + (2/3)(1/2).
Therefore x = 4.
Alternately, for the Poisson-Gamma, the value of K in the Buhlmann credibility formula is equal to the
inverse of the scale parameter of the prior Gamma distribution. Thus θ = 1/10. The mean of the prior
Gamma is αθ = 0.5. Thus the prior Gamma distribution has a shape parameter α = 5. For the
Gamma-Poisson, the posterior Gamma has new inverse scale parameter equal to the prior inverse
scale parameter plus the number of exposures. Thus the posterior inverse scale parameter is equal
to 10 + 5 = 15.
The posterior shape parameter is equal to the prior shape parameter of 5 plus the number of claims
observed, or 5 + x. However, we are given that the posterior Gamma has a mean of 0.6.
But this equals the ratio of its shape parameter to its inverse scale parameter.
Thus 0.6 = (5 + x)/15. Therefore x = 4.

4.95. D. One solution uses the Bayesian result for the Gamma-Poisson and the fact that for the
Gamma-Poisson the Bayes result equals the Buhlmann Credibility result.
The prior Gamma has parameters α = 1 and θ = 1/5, and we observe 10 claims for 100 exposures.
Therefore, the posterior Gamma has parameters: α = 1 + 10 = 11, θ = 1/(5 + 100) = 1/105.
Therefore the Posterior Gamma has mean 11/105 = 10.5%, so 100 risks are expected to have
10.5 claims.
Alternately, the observed is 10 claims. Prior mean of Gamma is 1/5 = 0.2, so 100 risks are expected
to have 100/5 = 20 claims. The process variance for the Poisson is λ, which has expected value =
mean of prior Gamma = 1/5 = 0.2. The variance of the hypothetical means is the variance of the prior
Gamma = αθ² = 1/5² = 0.04. Buhlmann K = EPV/VHM = 0.2 / 0.04 = 5.
Z = 100 / (100 + 5) = 0.95. New Estimate = (10)(95%) + (20)(5%) = 10.5.
Comment: Once you know that the prior estimate is 20 and that the observation is 10, then the
estimate based on Buhlmann Credibility must be between 10 and 20 and therefore only choices D
and E are possible answers.
4.96. C. The prior Gamma has parameters α = 1 and θ = 1/5. The posterior Gamma has parameters:
α = 1 + 1 = 2, θ = 1/(5 + 2) = 1/7. It has mean 2/7. For the Poisson, the process variance is the mean λ.
So the expected value of the process variance for this risk after the observation is the expected
value of λ, or the mean of the posterior Gamma: 2/7 = 0.29.
Comment: Note that the Exponential-Poisson is a special case of the Gamma-Poisson.
4.97. C. The prior Gamma has parameters α = 1 and θ = 1/5. The posterior Gamma has parameters:
α = 1 + 1 = 2, θ = 1/(5 + 2) = 1/7. f(λ) = 7² λ^(2-1) exp(-7λ) / Γ(2) = 49 λ exp(-7λ).
Alternately, since over 2 periods we have a Poisson with parameter 2λ, the chance of observing
one claim over two periods is equal to: 2λ exp(-2λ).
So by Bayes Theorem, the posterior chance of various values of lambda is proportional to this
chance times h(λ): 5 exp(-5λ) 2λ exp(-2λ) = 10λ exp(-7λ).
We need only divide by its integral in order to convert to a probability density function.
One can either compute this integral or realize it must be the Gamma Distribution with shape
parameter 2 and scale parameter 1/7.

4.98. E. The variance for the sum of both risks combined, is the sum of the individual variances.
(Variances add for independent variables.)
For each risk, its total variance is the sum of the EPV and VHM.
For each Class, the EPV = E[Poisson Variance] = E[Poisson Mean] = Mean for the Class.
The Variance of the Hypothetical Means for risks from a class is:
Var[Poisson Means] = Variance of Exponential Distribution = θ² = (Mean for the Class)².

Class    Mean    EPV     VHM     Total Variance
1        0.3     0.3     0.09    0.39
2        0.7     0.7     0.49    1.19
SUM                              1.58

Comment: The Exponential Distribution has a mean of θ and a variance of θ². Alternately, one can
get the total variance for each risk from the mixed Negative Binomial distributions. As shown in the
solution to the next question, the mixed distribution for Class 1 has parameters
r = 1 and β = 0.3. The variance for the Negative Binomial is rβ(1+β) = 0.39.
Similarly the variance of a risk from Class 2, with r = 1 and β = 0.7, is rβ(1+β) = 1.19.
The sum of the two variances is therefore: 0.39 + 1.19 = 1.58.

4.99. B. For the Gamma-Poisson the mixed distribution is a Negative Binomial with parameters
r = shape parameter of the Gamma = 1, and β = θ. For risks from Class 1, β = θ = 0.3.
For risks from Class 2, β = θ = 0.7. For the Negative Binomial distribution, f(0) = 1/(1+β)^r = 1/(1+β).
Thus the chances of observing zero claims (over one exposure period) for the two classes are: 0.769
and 0.588. Therefore:

(A)        (B)            (C)             (D)                      (E)
Class      A priori       Chance of       Probability Weight =     Posterior Probability =
           Probability    Observation     Col. B x Col. C          Col. D / Sum of Col. D
1          0.5            0.769           0.3845                   0.5667
2          0.5            0.588           0.2940                   0.4333
Overall                                   0.6785                   1.0000

The probability of a risk being from Class 1 if no claims are observed (over one exposure period)
is 56.7%.
Comment: One can work out the probabilities of observing zero claims given a risk from one of the
Classes. The chance of zero claims for a Poisson with mean λ is e^(-λ).
Integrating this probability over the values of lambda gives:
∫₀^∞ e^(-λ) f(λ) dλ = ∫₀^∞ e^(-λ) e^(-λ/θ)/θ dλ = (1/θ) ∫₀^∞ e^(-λ(1 + 1/θ)) dλ = (1/θ) / (1 + 1/θ) = 1/(1 + θ).
Which for theta equal to 0.3 and 0.7 respectively for the two Classes gives probabilities of 0.769 and
0.588 as obtained above. The remainder of the solution proceeds as above.
4.100. B. The number of observed claims is: (89)(1) + (4)(2) + (3)(1) = 100, for 1000 observed
risks. For the Gamma-Poisson the posterior Gamma has parameters:
α = prior α + number of claims observed = 250 + 100 = 350, 1/θ = 1/(prior θ) + number of risks
observed = 2000 + 1000 = 3000. The mean of the posterior Gamma is αθ = 350/3000 = 0.1167.
Alternately, use the fact that for the Gamma-Poisson the Buhlmann Credibility estimate is equal to
the Bayesian Estimate. For the Gamma-Poisson, the Buhlmann Credibility parameter is the inverse
of the scale parameter of the prior Gamma, 1/θ = 2000.
Therefore, the credibility for 1000 observed risks is Z = 1000 / (1000 + 2000) = 1/3.
The prior estimate is the mean of the prior Gamma = αθ = 250/2000 = 0.125.
The observed frequency is: 100 /1000 = 0.100.
Therefore, the Buhlmann Credibility estimate is: (1/3)(0.100) + (2/3)(0.125) = 0.1167.

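Here is a brief Python sketch (my own illustration, not part of the original solution) confirming that the Bayes and Buhlmann answers in solution 4.100 agree:

    # Sketch only: Gamma-Poisson update for solution 4.100.
    prior_alpha, prior_inv_theta = 250, 2000        # prior Gamma: alpha, 1/theta
    claims, exposures = 100, 1000                   # (89)(1) + (4)(2) + (3)(1) = 100 claims

    post_alpha = prior_alpha + claims               # 350
    post_inv_theta = prior_inv_theta + exposures    # 3000
    bayes = post_alpha / post_inv_theta             # mean of posterior Gamma

    K = prior_inv_theta                             # Buhlmann K = 1/theta for the Gamma-Poisson
    Z = exposures / (exposures + K)                 # 1/3
    buhlmann = Z * (claims / exposures) + (1 - Z) * (prior_alpha / prior_inv_theta)
    print(round(bayes, 4), round(buhlmann, 4))      # both 0.1167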
4.101. B. For the Gamma Distribution, the mean is αθ, while the variance is αθ².
Thus the coefficient of variation is: √variance / mean = √(αθ²) / (αθ) = 1/√α.
Thus for the Gamma Distribution, α = 1/CV². Thus the prior Gamma has: α = 1 / (1/6)² = 36.
For the Gamma-Poisson, the posterior Gamma has
shape parameter = prior α + number of claims observed = 36 + 160 = 196.
The CV of the posterior Gamma = 1/√196 = 1/14 = 0.0714.

4.102. C. The Prior Gamma Distribution has mean = 1/2 = αθ and variance = 1/8 = αθ².
Therefore θ = (1/8)/(1/2) = 1/4, and α = (mean)/(θ) = (1/2)/(1/4) = 2.
The Posterior Gamma has α = prior α + number of claims observed = 2 + 4 = 6.
The Posterior Gamma has 1/θ = 1/(prior θ) + number of exposures observed = 4 + 2 = 6.
Therefore the Posterior Gamma has a variance of: αθ² = 6 / 6² = 1/6.
4.103. B. One can solve for the parameters of the prior Gamma, α and θ, via the Method of
Moments: mean = αθ = 0.25, variance = αθ² = 0.0025. Thus θ = 1/100, and α = 25.
The parameters of the posterior Gamma are:
posterior alpha = prior alpha + number of observed claims = 25 + 23 = 48.
posterior theta = 1/{(1/prior theta) + number of observed exposures} = 1/(100 + 100) = 1/200.
Variance of the posterior Gamma is: (posterior alpha)(posterior theta)² = 48/200² = 0.0012.
4.104. C. Let the parameters of the prior Gamma be α and θ. Then the prior Gamma has a mean
αθ = 0.05 and variance αθ² = 0.01. Therefore α = 0.25 and θ = 1/5. Let the parameters of the
posterior Gamma be α′ and θ′. Then α′ = α + n = 0.25 + n and 1/θ′ = 1/θ + 10 = 15.
The posterior Gamma has a variance of: 0.01 = α′θ′² = (0.25 + n) / 15². Solving, n = 2.
4.105. D. Mean of the prior Gamma is 1/10 = α/β. If Z → 0, then K → ∞. But for the Gamma-
Poisson, the Buhlmann Credibility Parameter = K = β. Thus β → ∞. However, the variance of the
prior Gamma is α/β² = 1/(10β) → 0. Thus most of the probability of the prior Gamma is close to its
mean of 1/10. f(1/10) is large.
Comment: The credibility of the observation is small because the VHM is small. Note that in this
question the parameter β for the Gamma Distribution corresponds to 1/θ in Loss Models.

4.106. C. VHM = α/β² = 1/400. Also 1/10 = α/β. Therefore, α = 4 and β = 40.
The parameters of the posterior Gamma are 4 + 1 = 5 and 40 + 60 = 100. The posterior mean is
5/100 = 0.05. Therefore, the expected number of errors that the player will make in the next 60
games is (0.05)(60) = 3.
Alternately, K = β = 40. Z = 60 / (60 + 40) = 60%. Prior mean is 1/10. Observation = 1/60. The
estimate = (0.6)(1/60) + (0.4)(1/10) = 0.05. (0.05)(60) = 3.
Comment: Note that in this question the parameter β for the Gamma Distribution corresponds to 1/θ
in Loss Models. A better model would classify players by position. For example, first basemen
have a higher average fielding percentage than shortstops. This is similar to the classification
schemes for insurance, which use characteristics of the insureds to divide the universe into
more homogeneous groups.
4.107. C. The Prior Gamma has parameters α = 10 and θ = 1/m.
If we observe C claims in m years, then the posterior Gamma has parameters:
α = 10 + C, and 1/θ = m + m = 2m.
Variance of posterior Gamma = (10 + C)/(2m)².
Variance of prior Gamma = 10/m².
We want: 10/m² = (10 + C)/(2m)². Thus, 40 = 10 + C. ⇒ C = 30.
4.108. B. For any Negative Binomial, mean = rβ, variance = rβ(1+β), variance / mean = 1 + β.
The predictive Negative Binomial after Y years has β = posterior θ = 1/(Y + 1/(prior θ)) =
1/(Y + m). 1 + β = 1 + (posterior θ) = 1 + 1/(m + Y), if we observe for Y years.
As Y goes to infinity, 1 + β goes to 1.
Comment: For a Negative Binomial Distribution, the variance is always greater than the mean, thus
Choice A (of 0) can be eliminated. As we observe more and more years, the predictive distribution
gets closer and closer to a Poisson Distribution, that of the individual insured we are observing, and
thus the ratio of the variance to the mean approaches one.
4.109. This is a Gamma-Poisson, with prior Gamma with parameters α = 1 and θ = 1/5.
The posterior Gamma has parameters: α = 1 + 1 = 2, and 1/θ = 5 + 2 = 7.
Thus the posterior Gamma has a mean of αθ = 2/7.
Alternately, one can use Buhlmann Credibility. K = 1/(prior θ) = 5. Z = 2/(2 + 5) = 2/7.
Observed frequency = 1/2. A priori estimate is the mean of the prior Gamma, 1/5.
New Estimate = (2/7)(1/2) + (5/7)(1/5) = 2/7.

4.110. C. The first actuary has a posterior Gamma with parameters:
α′ = α + C = 1 + 1 = 2 and 1/θ′ = 1/θ + E = 1/(1/6) + 3 = 9.
So for the first actuary, the posterior Gamma has a mean of: α′θ′ = 2/9.
The second actuary has a prior Gamma with a mean the same as that of the first actuary's:
(1)(1/6) = 1/6, and variance half that of the first actuary's: (1/2){(1)(1/6)²} = 1/72.
Thus as per fitting via the method of moments, for the second actuary:
mean = αθ = 1/6 and variance = αθ² = 1/72. Thus the second actuary has a prior Gamma with
parameters: θ = (1/72)/(1/6) = 1/12 and α = (1/6)/(1/12) = 2.
The second actuary has a posterior Gamma with parameters:
α′ = α + C = 2 + 1 = 3 and 1/θ′ = 1/θ + E = 1/(1/12) + 3 = 15.
So for the second actuary, the posterior Gamma has a mean of: α′θ′ = 3/15 = 1/5.
Therefore, the ratio of the Bayesian premium that the first actuary calculates to the Bayesian premium
that the second actuary calculates is: (2/9)/(1/5) = 10/9.
Alternately, one can work with credibilities. For the first actuary, K = 1/θ = 6.
For three years Z = 3/9 = 1/3. Prior mean = mean of prior Gamma = αθ = 1/6. Observation = 1/3.
Estimate = (1/3)(1/3) + (1 - 1/3)(1/6) = 2/9. For the second actuary, K = 1/θ = 12.
For three years Z = 3/15 = 1/5. Prior mean = mean of prior Gamma = αθ = 1/6. Observation = 1/3.
Estimate of 2nd actuary = (1/5)(1/3) + (1 - 1/5)(1/6) = 1/5.
The ratio of their estimates is: (2/9)/(1/5) = 10/9.
Comment: The second actuary assumes there is less variation between the insureds, and therefore
applies less weight to the observation than does the first actuary.
Bullet (iv) in the question applies to the prior Gamma, rather than the posterior Gamma.
4.111. B. This is a Gamma-Poisson, with α = 1 (Exponential) and θ = 3.
We observe 2 claims in 1 year. Therefore, the posterior distribution is Gamma with
α′ = α + C = 1 + 2 = 3, and 1/θ′ = 1/θ + E = 1/3 + 1 = 4/3. ⇒ θ′ = 3/4.
The variance of the posterior Gamma Distribution is: α′θ′² = (3)(3/4)² = 27/16.
4.112. E. The posterior distribution is Gamma with α = 3 and θ = 3/4.
Therefore, the predictive distribution is Negative Binomial with r = 3, β = 3/4, and variance:
rβ(1 + β) = (3)(3/4)(7/4) = 63/16 = 3.9375.

4.113. B. The prior distribution of λ is Gamma with α = 50 and θ = 1/500.
α′ = α + C = 50 + 75 + 210 = 335. 1/θ′ = 1/θ + E = 500 + 600 + 900 = 2000.
Estimated future frequency = Mean of posterior Gamma = α′θ′ = 335/2000 = 0.1675.
Expected number of claims = (1100)(0.1675) = 184.25.
Alternately, K = 1/θ = 500. Z = 1500 / (1500 + 500) = 3/4.
Prior mean = mean of prior Gamma = αθ = 50 / 500 = 0.10.
Observation = 285 / 1500 = 0.19.
Estimate = (3/4)(0.19) + (1 - 3/4)(0.10) = 0.1675.
Expected number of claims in Year 3 = (1100)(0.1675) = 184.25.
Comment: Note that posterior to Year 1, we have a Gamma with:
α = 50 + 75 = 125, and 1/θ = 500 + 600 = 1100.
This acts as the Gamma prior to Year 2. Then adding in the experience for Year 2:
α = 125 + 210 = 335, and 1/θ = 1100 + 900 = 2000.
4.114. D. Prob[observe 1 or more claims | λ] = 1 - Prob[0 claims | λ] = 1 - e^(-λ).

∫₀^∞ (1 - e^(-λ)) λe^(-λ) dλ = ∫₀^∞ λe^(-λ) dλ - ∫₀^∞ λe^(-2λ) dλ = 1 - 1/2² = 3/4.

By Bayes Theorem, the posterior distribution of λ is: (1 - e^(-λ)) λe^(-λ) / (3/4).
Thus, the posterior probability that this same policyholder will have at least one claim this year is:

∫₀^∞ (1 - e^(-λ)) (4/3)(1 - e^(-λ)) λe^(-λ) dλ = (4/3) ∫₀^∞ λe^(-λ) - 2λe^(-2λ) + λe^(-3λ) dλ
= (4/3)(1 - 2/2² + 1/3²) = 0.815.

Alternately, Prob[observe 0 claims this year | λ] = e^(-λ).
Therefore, given the posterior distribution of λ, the density at 0 of the predictive distribution is:

∫₀^∞ e^(-λ) (4/3)(1 - e^(-λ)) λe^(-λ) dλ = (4/3) ∫₀^∞ λe^(-2λ) - λe^(-3λ) dλ = (4/3)(1/2² - 1/3²) = 5/27.

Prob[at least one claim this year] = 1 - 5/27 = 22/27 = 0.815.

Alternately, the prior Gamma, f(λ) = λe^(-λ), λ > 0, has α = 2 and θ = 1.
The marginal distribution is Negative Binomial with r = α = 2 and β = θ = 1.
Therefore, Prob(n claims in Year 1) = (n + 1)/2^(2+n).
If one observes n claims in Year 1, the posterior Gamma has α = 2 + n and
1/θ = 1 + 1 = 2, i.e. θ = 1/2.
The predictive distribution is Negative Binomial with r = α = 2 + n and β = θ = 0.5.
Therefore, Prob(0 claims in Year 2 | n claims in Year 1) = 1/1.5^(2+n).
Prob(0 claims in Year 2 | 1 or more claims in Year 1) =
Σ_{n=1}^∞ Prob(n claims in Yr 1) Prob(0 claims in Yr 2 | n claims in Yr 1) / Σ_{n=1}^∞ Prob(n claims in Yr 1)
= Σ_{n=1}^∞ {(n + 1)/2^(2+n)} {1/1.5^(2+n)} / Σ_{n=1}^∞ (n + 1)/2^(2+n)
= (1/9){Σ n/3^n + Σ 1/3^n} / (3/4) = (4/3)(1/9)(3/4 + 1/2) = 5/27.
Prob(1 or more claims in Year 2 | 1 or more claims in Year 1) = 1 - 5/27 = 22/27 = 0.815.
Comment: Bullet number iii is a special case of the formula for Gamma type integrals.
It also follows from the formula for the mean of an Exponential Distribution:
∫₀^∞ x n e^(-nx) dx = (Mean of an Exponential Dist. with density n e^(-nx)) = 1/n. ⇒ ∫₀^∞ x e^(-nx) dx = 1/n².
In the alternate solution, Σ_{n=1}^∞ n/3^n = 1.5 Σ_{n=1}^∞ n (0.5^n)/1.5^(n+1)
= 1.5 (mean of a Geometric Distribution with β = 0.5) = (1.5)(0.5) = 3/4.

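The integrals in solution 4.114 reduce to the Gamma-type integral ∫₀^∞ λ e^(-nλ) dλ = 1/n²; the following Python sketch (mine, for illustration only, not from the original solution) uses that closed form to confirm the key numbers:

    # Sketch only: closed-form check of solution 4.114.
    def I(n):                     # integral of lambda * exp(-n*lambda) over (0, infinity)
        return 1 / n**2

    prob_claim_year1 = I(1) - I(2)                 # integral of (1 - e^-lam) lam e^-lam = 3/4
    pred_density_at_0 = (4/3) * (I(2) - I(3))      # 5/27
    prob_claim_year2 = 1 - pred_density_at_0       # 22/27
    print(prob_claim_year1, pred_density_at_0, round(prob_claim_year2, 3))   # 0.75, 0.185..., 0.815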
4.115. D. From the previous solution, the posterior distribution of λ is: (4/3)(1 - e^(-λ)) λe^(-λ).
Therefore, the expected future annual frequency is:

∫₀^∞ λ (4/3)(1 - e^(-λ)) λe^(-λ) dλ = (4/3) ∫₀^∞ λ²e^(-λ) - λ²e^(-2λ) dλ = (4/3){Γ(3)/1³ - Γ(3)/2³} = (4/3)(2! - 2!/8) = 2.33.

Alternately, assume you start with 1000 policyholders. The marginal distribution is Negative
Binomial with r = α = 2 and β = θ = 1, with density at 0 of: 1/(1 + 1)² = 1/4.
Therefore, we expect 250 out of these 1000 policyholders to have 0 claims and thus 750 to have at
least one claim. K = 1/θ = 1/1 = 1. Z = 1/(1 + K) = 1/2. The a priori mean is: αθ = (2)(1) = 2.
Therefore, the expected future annual frequency for the 250 who had no claims is:
(1/2)(0) + (1 - 1/2)(2) = 1. Thus we expect these 250 policyholders to have 250 claims next year.
We expect the 1000 policyholders to have (1000)(2) = 2000 claims in total. Thus the 750 who had
at least one claim are expected to have: 2000 - 250 = 1750 claims.
Their expected annual frequency is: 1750/750 = 7/3.
Alternately, we expect 2000 claims from 1000 policyholders. On average 250 had no claim.
Thus the observed frequency for the 750 policyholders with at least one claim is: 2000/750.
K = 1/θ = 1/1 = 1. Z = 1/(1 + K) = 1/2. The a priori mean is: αθ = (2)(1) = 2.
Therefore, the expected future annual frequency for the 750 who had at least 1 claim is:
(1/2)(2000/750) + (1 - 1/2)(2) = 7/3.
4.116. D. For the Gamma-Poisson, K = 1/θ = 1/1.2 = 0.833. Z = 2/(2 + K) = 0.706.
Observed frequency is 3/2. Prior mean frequency is the mean of the Gamma: αθ = 1.2.
Estimated future frequency is: (0.706)(3/2) + (1 - 0.706)(1.2) = 1.412.
Alternately, for the Gamma-Poisson Buhlmann Credibility gives the same result as Bayesian
Analysis. α′ = α + C = 1 + 3 = 4. 1/θ′ = 1/θ + E = 1/1.2 + 2 = 2.833. θ′ = 0.353.
Mean of the posterior Gamma is: α′θ′ = (4)(0.353) = 1.412.

4.117. B. Gamma-Poisson with prior α = 6 and prior θ = 1/100.
α′ = α + (6 + 8 + 11) = 31. 1/θ′ = 1/θ + (100 + 150 + 200) = 550.
Estimated future frequency = mean of the posterior Gamma = α′θ′ = 31/550.
Estimate of the number of claims in Month 4: (300)(31/550) = 16.9.
Alternately, K = 1/θ = 100. Observed frequency = (6 + 8 + 11)/(100 + 150 + 200) = 1/18.
Prior mean frequency = mean of the prior Gamma = αθ = 6/100 = 0.06.
Z = 450/(450 + K) = 9/11. Estimated future frequency = (9/11)(1/18) + (2/11)(0.06) = 0.0564.
Estimate of the number of claims in Month 4: (300)(0.0564) = 16.9.
Comment: It is not clear to me exactly what is going on in this exam question. Perhaps what was
intended is that the prior gamma is for a very large group of insureds, while the experience given is
for a particular type of insured, each of whom is assumed to have the same mean frequency.
4.118. D. The posterior distribution of λ for a single insured is Gamma with α = 31 and θ = 1/550.
The frequency for a sum of 300 insureds is Poisson with mean 300λ.
The posterior distribution of 300λ is Gamma with α = 31 and θ = 300/550 = 6/11.
The number of claims from 300 insureds is Negative Binomial with r = 31 and β = 6/11.
The mean is: (31)(6/11) = 16.91. The variance is: (16.91)(1 + 6/11) = 26.13.
Prob[more than 20 claims] ≅ 1 - Φ[(20.5 - 16.91)/√26.13] = 1 - Φ[0.70] = 24.2%.
Comment: For the Gamma-Poisson, the mixed distribution for Y exposures is given by a Negative
Binomial Distribution, with parameters r = α and β = Yθ.
See Mahler's Guide to Frequency Distributions.
4.119. B. This is a Gamma-Poisson with α = 5 and θ = 1/2.
α′ = α + C = 5 + 8 = 13. 1/θ′ = 1/θ + E = 2 + 2 = 4. Posterior mean = α′θ′ = 13/4 = 3.25.
Comment: For the Gamma Distribution in Loss Models: f(x) = (x/θ)^α e^(-x/θ) / {x Γ(α)}, x > 0.
4.120. C. The predictive distribution is Negative Binomial with r = α = 13 and β = θ = 1/4.
f(4) = {(13)(14)(15)(16) / 4!} (1/4)^4 / (5/4)^(13+4) = 16.0%.

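The predictive Negative Binomial density in solution 4.120 can be evaluated directly; the sketch below (my own, not part of the guide) does so in Python:

    # Sketch only: predictive Negative Binomial density for solution 4.120,
    # r = 13, beta = 1/4, evaluated at 4 claims.
    from math import comb

    def neg_binomial_pmf(n, r, beta):
        # f(n) = C(n + r - 1, n) beta^n / (1 + beta)^(n + r)
        return comb(n + r - 1, n) * beta**n / (1 + beta)**(n + r)

    print(round(neg_binomial_pmf(4, 13, 0.25), 4))   # about 0.160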
4.121. D. Posterior to the observations, over one year λ follows a Gamma with α = 13 and θ = 1/4.
Over 3 years for this single individual, 3λ follows a Gamma with α = 13 and θ = (3)(1/4) = 3/4.
Therefore, the posterior mixed distribution for three years is Negative Binomial
with r = 13 and β = 3/4.
f(6) = {(13)(14)(15)(16)(17)(18) / 6!} (3/4)^6 / (1 + 3/4)^19 = 7.97%.
4.122. A. For the Gamma-Poisson, K = 1/θ. The prior mean frequency is αθ.
For the first case, Z = 1/(1 + K) = 1/(1 + 1/θ) = θ/(1 + θ).
0.15 = Z(1) + (1 - Z)(αθ) = θ/(1 + θ) + αθ{1/(1 + θ)} = θ(1 + α)/(1 + θ). ⇒ 0.15 = 0.85θ + αθ.
For the second case, Z = 2/(2 + K) = 2/(2 + 1/θ) = 2θ/(1 + 2θ).
0.20 = Z(2) + (1 - Z)(αθ) = (2)(2θ)/(1 + 2θ) + αθ{1/(1 + 2θ)} = θ(4 + α)/(1 + 2θ). ⇒ 0.20 = 3.6θ + αθ.
Subtracting the first equation from the second: 0.05 = 2.75θ. ⇒ θ = 0.0182. α = 7.4.
Alternately, using Bayes Analysis, for the first situation:
α′ = α + 1, 1/θ′ = 1/θ + 1. (α + 1)/(1/θ + 1) = 0.15. ⇒ α + 1 = 0.15/θ + 0.15. ⇒ 0.15 = 0.85θ + αθ.
For the second situation, there are 4 claims in 2 years:
α′ = α + 4, 1/θ′ = 1/θ + 2. (α + 4)/(1/θ + 2) = 0.20. ⇒ α + 4 = 0.20/θ + 0.4. ⇒ 0.20 = 3.6θ + αθ.
θ = 0.0182. α = 7.4.
4.123. D. A Gamma-Poisson with α = 4 and θ = 1/50.
α′ = α + C = 4 + (0)(90) + (1)(7) + (2)(2) + (3)(1) = 18. 1/θ′ = 1/θ + E = 50 + 100 = 150.
Posterior mean frequency is: α′θ′ = 18/150.
Expected Number of Claims for 100 exposures is: (100)(18/150) = 12.
Alternately, K = 1/θ = 50. Z = 100/(100 + 50) = 2/3.
Estimated future frequency is: (2/3)(14/100) + (1/3)(4/50) = 0.12. (100)(0.12) = 12.

Section 5, Beta Distribution


The quantity x^(a-1) (1-x)^(b-1) for a > 0, b > 0, has a finite integral from 0 to 1. This integral is called the
(complete) Beta Function. The value of this integral clearly depends on the choices of the
parameters a and b.26 This integral is: (a - 1)! (b - 1)! / (a + b - 1)! = Γ(a) Γ(b) / Γ(a + b).
The Complete Beta Function is a combination of three Complete Gamma Functions:

β[a, b] = ∫₀¹ x^(a-1) (1-x)^(b-1) dx = (a - 1)! (b - 1)! / (a + b - 1)! = Γ(a) Γ(b) / Γ(a + b).

Note that β(a, b) = β(b, a).

Exercise: What is the integral from zero to 1 of x^5 (1-x)^3?
[Solution: β(6, 4) = Γ(6) Γ(4) / Γ(10) = 5! 3! / 9! = 1/504 = 0.001984.]
One can turn the complete Beta Function into a distribution on the interval [0, 1] in a manner similar to
how the Gamma Distribution was created from the (complete) Gamma Function on [0, ∞).
The Incomplete Beta Function involves an integral from 0 to x < 1:27

β[a, b; x] = ∫₀ˣ t^(a-1) (1-t)^(b-1) dt / β[a, b].

The Incomplete Beta Function is zero at x = 0 and one at x = 1. The latter follows from:

β[a, b; 1] = ∫₀¹ t^(a-1) (1-t)^(b-1) dt / β[a, b] = β[a, b] / β[a, b] = 1.

The following relationship is sometimes useful: β(a, b; x) = 1 - β(b, a; 1-x).

The two-parameter Incomplete Beta Function is a special case of what Loss Models calls the Beta
distribution, for θ = 1.
The Beta Distribution in Loss Models has an additional parameter θ which determines its support:
F(x) = β(a, b; x/θ), 0 ≤ x ≤ θ.
For use in the Beta-Bernoulli frequency process, θ is always equal to one.
For θ = 1, f(x) = {(a + b - 1)! / ((a - 1)! (b - 1)!)} x^(a-1) (1 - x)^(b-1), 0 ≤ x ≤ 1.

26 The results have been tabulated and this function is widely used in many applications.
See for example the Handbook of Mathematical Functions, by Abramowitz, et. al.
27 As shown in Appendix A of Loss Models.

β(a, b; x) has mean: a/(a + b), second moment: a(a + 1)/{(a + b)(a + b + 1)},
and variance: ab/{(a + b)²(a + b + 1)}.

The mean is between zero and one; for b < a the mean is greater than 0.5.
For a fixed ratio of a/b the mean is constant and for a and b large β(a,b;x) approaches a Normal
Distribution. As a or b get larger the variance decreases. For either a or b extremely large, virtually all
the probability is concentrated at the mean.
Here are various Beta Distributions with θ = 1:

[Figure: four Beta densities, for a = 1, b = 5; a = 2, b = 4; a = 4, b = 2; and a = 5, b = 1.]

For a > b the Beta Distribution is skewed to the left. For a < b it is skewed to the right.
For a = b it is symmetric. For a ≤ 1, the Mode = 0. For b ≤ 1, the Mode = 1.
If 0 < a < 1, then f(0) = ∞. If 0 < b < 1, then f(1) = ∞.
β(a,b;x), the Beta distribution for θ = 1, is closely connected to the Binomial Distribution.
The Binomial parameter q varies from zero to one, the same domain as the Incomplete Beta
Function. The Beta density is proportional to the chance of success to the power a-1, times the
chance of failure to the power b-1. The constant in front of the Beta density is (a+b-1) times the
binomial coefficient for (a+b-2) and a-1.

The Incomplete Beta Function is a conjugate prior distribution for the Binomial.28
The Incomplete Beta Function for integer parameters can be used to compute the sum of terms
from the Binomial Distribution.29

Summary of the Beta Distribution:

Support: 0 ≤ x ≤ θ.     Parameters: a > 0 (shape parameter), b > 0 (shape parameter),
θ > 0 (similar to a scale parameter, determines the support).

F(x) = β(a, b; x/θ) = {(a + b - 1)! / ((a - 1)! (b - 1)!)} ∫₀^(x/θ) t^(a-1) (1 - t)^(b-1) dt.

f(x) = {1/β(a, b)} (x/θ)^a (1 - x/θ)^(b-1) / x = {Γ(a + b) / (Γ(a) Γ(b))} (x/θ)^a (1 - x/θ)^(b-1) / x
= {(a + b - 1)! / ((a - 1)! (b - 1)!)} (x/θ)^(a-1) (1 - x/θ)^(b-1) / θ, 0 ≤ x ≤ θ.

For a = 1, b = 1, the Beta Distribution is the uniform distribution from [0, θ].

E[X^n] = θ^n Γ(a + b) Γ(a + n) / {Γ(a + b + n) Γ(a)} = θ^n (a + b - 1)! (a + n - 1)! / {(a + b + n - 1)! (a - 1)!}
= θ^n a (a + 1) ... (a + n - 1) / {(a + b) (a + b + 1) ... (a + b + n - 1)}.

Mean = θ a / (a + b).      E[X²] = θ² a (a + 1) / {(a + b) (a + b + 1)}.

Variance = θ² ab / {(a + b)² (a + b + 1)}.

Coefficient of Variation = Standard Deviation / Mean = √(b / {a (a + b + 1)}).

Skewness = 2 (b - a) √(a + b + 1) / {(a + b + 2) √(ab)}.

Mode = θ (a - 1) / (a + b - 2), for a > 1 and b > 1.

Limited Expected Value = E[X ∧ x] = θ {a/(a + b)} β(a+1, b; x/θ) + x {1 - β(a, b; x/θ)}.

28 The Beta-Bernoulli discussed in the next section is a special case of the Beta-Binomial.
29 See Mahler's Guide to Frequency Distributions. On the exam you should either compute the sum of binomial
terms directly or via the Normal Approximation. Note that the use of the Beta Distribution is an exact result, not an
approximation. See for example the Handbook of Mathematical Functions, by Abramowitz, et. al.

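As an illustration (my own sketch, not part of the summary above), the closed-form moments can be coded directly; the values for a = 4, b = 6 match problems 5.2 and 5.3 below:

    # Sketch only: mean and variance of the Loss Models Beta Distribution.
    def beta_moments(a, b, theta=1.0):
        mean = theta * a / (a + b)
        second = theta**2 * a * (a + 1) / ((a + b) * (a + b + 1))
        variance = second - mean**2      # equals theta^2 ab / {(a+b)^2 (a+b+1)}
        return mean, variance

    print(beta_moments(4, 6))            # (0.4, 0.0218...)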
Beta Distribution for a = 3, b = 3, and θ = 1:

[Figure: density of the Beta Distribution with a = 3, b = 3, θ = 1, symmetric about 0.5.]
Uniform Distribution:
The Uniform Distribution from 0 to θ is a Beta Distribution with a = 1 and b = 1.
Specifically, the Uniform Distribution from 0 to 1 is a Beta Distribution with a = 1, b = 1,
and θ = 1.
DeMoivre's Law is a Beta Distribution with a = 1, b = 1, and θ = ω.
The future lifetime of a life aged x under DeMoivre's Law is a Beta Distribution with a = 1,
b = 1, and θ = ω - x.

Problems:
5.1 (1 point) For a Beta Distribution with parameters a = 4, b = 6, and θ = 1, what is the density
function at x = 0.4?
A. 0.5   B. 1.0   C. 1.5   D. 2.0   E. 2.5

5.2 (1 point) For a Beta Distribution with parameters a = 4, b = 6, and θ = 1, what is the mean?
A. 0.2   B. 0.3   C. 0.4   D. 0.5   E. 0.6

5.3 (1 point) For a Beta Distribution with parameters a = 4, b = 6, and θ = 1, what is the variance?
A. 0.005   B. 0.01   C. 0.02   D. 0.03   E. 0.04

5.4 (1 point) For a Beta Distribution with parameters a = 4, b = 6, and θ = 1, what is the mode?
A. 0.350   B. 0.375   C. 0.400   D. 0.425   E. 0.450

5.5 (2 points) A Beta Distribution with θ = 1 has a mean of 50%
and a coefficient of variation of 20%.
Determine its parameters a and b.

5.6 (IOA 101, 4/00, Q.3) (1.5 points) In an investigation into the proportion (q) of lapses in the first
year of a certain type of policy, the uncertainty about q is modeled by taking q to have a beta
distribution with parameters a = 1, b = 9, and θ = 1, that is, with density:
f(q) = 9(1 - q)^8, 0 < q < 1.
Using this distribution, calculate the probability that q exceeds 0.2.

Solutions to Problems:
5.1. E. f(x) = {(a+b-1)! / ((a-1)! (b-1)!)} (x/θ)^(a-1) {1 - (x/θ)}^(b-1) / θ = {9! / (3! 5!)} 0.4³ 0.6⁵ =
504 (0.064)(0.07776) = 2.508.
5.2. C. Mean = a/(a+b) = 4 / 10 = 0.4.
5.3. C. Second moment = θ² a(a+1)/{(a+b)(a+b+1)} = (4)(5)/{(10)(11)} = 0.1818.
Variance = 0.1818 - 0.4² = 0.0218.
Alternately, Variance = θ² ab / {(a+b)² (a+b+1)} = (4)(6) / {(10²)(11)} = 0.0218.
5.4. B. f(x) is proportional to: x³ (1 - x)⁵.
Setting the derivative with respect to x equal to zero: 0 = 3x² (1 - x)⁵ - 5x³ (1 - x)⁴.
⇒ 3(1 - x) = 5x. ⇒ x = 3/8 = 0.375.
Comment: In general, the mode of a Beta Distribution is: (a - 1) / (a + b - 2), for a > 1 and b > 1.
[Figure: a graph of the density of this Beta Distribution, near the mean of 4/(4 + 6) = 0.4.]
5.5. 50% = a / (a + b). ⇒ a = b.
(Second Moment) / (Mean)² = 1 + CV² = 1.04. Second Moment = (1.04)(0.5²) = 0.26.
0.26 = a (a + 1) / {(a + b)(a + b + 1)} = a (a + 1) / {(a + a)(a + a + 1)} = (a + 1) / {2 (2a + 1)}.
(0.52)(2a + 1) = a + 1. ⇒ a = 12. b = 12.

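For illustration (not part of the original solution), solution 5.5 can be reproduced in a couple of lines of Python:

    # Sketch only: recover the Beta parameters of solution 5.5 from mean = 0.5, CV = 0.2.
    mean, cv = 0.5, 0.2
    # mean = a/(a+b) = 0.5 forces a = b; then CV^2 = b/{a(a+b+1)} = 1/(2a+1).
    a = (1 / cv**2 - 1) / 2
    b = a
    print(a, b)   # 12.0, 12.0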
5.6. Prob[q > 0.2] = ∫_{0.2}^{1} 9(1 - q)^8 dq = -(1 - q)^9 ]_{q=0.2}^{q=1} = 0.8^9 = 0.134.

Section 6, Beta-Bernoulli
The Beta-Bernoulli is another example of a conjugate prior situation. As with the Gamma-Poisson, it
involves a mixture of claim frequency parameters across a portfolio of risks.
Here rather than a Poisson process, the number of claims a particular policyholder makes in a year or
single trial is assumed to be Bernoulli with mean q. For a Bernoulli Distribution with parameter q the
chance of having 1 claim is q, and of zero claims is 1-q. In a single Bernoulli trial there is either zero or
one claim.30 The parameter q is greater than or equal to zero and less than or equal to 1; 0 ≤ q ≤ 1.
The mean of the Bernoulli is q and its variance is q(1-q).
Prior Distribution:
Assume that the values of q are given by a Beta Distribution with a = 5, b = 7, and θ = 1,
β(5,7; x), with probability density function:31
f(q) = 2310 q^4 (1-q)^6, 0 ≤ q ≤ 1.
This prior Beta Distribution is displayed below:

[Figure: density of the prior Beta Distribution as a function of the Bernoulli parameter q.]

30 The sum of m independent Bernoulli trials with the same parameter q is a Binomial Distribution with parameters m
and q. The Bernoulli is a special case of the Binomial for m = 1.
31 For the Beta-Bernoulli, the value of θ is always 1, so that q goes from 0 to 1.

Marginal Distribution (Prior Mixed Distribution):

If we have a risk and do not know what type it is, in order to get the chance of having a claim, one
would weight together the chances of having a claim, using the a priori probabilities of the Bernoulli
parameter q and integrating from zero to one:

∫₀¹ q f(q) dq = 2310 ∫₀¹ q^5 (1 - q)^6 dq = 2310 (5! 6!)/12! = 2310 / 5544 = 5/12 = 0.417.

Where we have used the fact from the previous section:

∫₀¹ x^(a-1) (1 - x)^(b-1) dx = (a - 1)! (b - 1)! / (a + b - 1)! = Γ(a) Γ(b) / Γ(a + b).

Thus the chance of having one claim is 0.417. Regardless of what type of risk we have chosen from
the portfolio, the only other possibility is having no claims and therefore the chance of having no
claims is 0.583. The (prior) marginal distribution is a Bernoulli with parameter q = 0.417.
In general if one has q given by β(a,b; x), then the marginal distribution is a Bernoulli with parameter
given by the integral from zero to one of q times f(q). This is just the mean of the β(a,b; x)
distribution. Thus, if the Bernoulli parameters q are distributed by β(a,b; x), then the
marginal distribution is also a Bernoulli with parameter a/(a+b), the mean of β(a,b; x).
marginal distribution is also a Bernoulli with parameter a/(a+b), the mean of (a,b; x).
Note that for the particular case a = 5 and b = 7 we get a marginal distribution with Bernoulli
parameter of 5/(5+7) = 5/12, which matches the result obtained above.
Prior Expected Value of the Process Variance:
The process variance for an individual risk is: q(1-q) = q - q2 since the frequency for each risk is
Bernoulli. Therefore the expected value of the process variance
= the expected value of q minus the expected value of q2
= the a priori mean frequency - second moment of the frequency.32
The former is the mean of (a,b): a/(a+b).
The latter is the second moment of (a,b):
32

a (a +1)
.
(a + b) (a + b + 1)

This relationship holds generally for mixing Bernoullis or Binomials, whether or not q follows a Beta Distribution.
See Example 20.37 in Loss Models.

2013-4-10,

Conjugate Priors 6 Beta-Bernoulli,

HCM 10/21/12,

Page 162

For a = 5 and b = 7, the mean of (5,7) is 5/12 = 0.4167,


while the second moment of (5,7) is:

(5)(6)
= 0.1923.
(12)(13)

Thus the expected value of the process variance is: 0.4167 - 0.1923 = 0.224.
In general, the expected value of the process variance is :
first moment of Beta Distribution - second moment of the Beta Distribution =
a (a +1)
ab
a
=
.
a + b (a + b) (a + b + 1) (a + b) (a + b + 1)
For a = 5 and b = 7 this equals: (5)(7) / {(12)(13)} = 0.224, which matches the previous result.
Prior Variance of the Hypothetical Means:
The variance of the hypothetical means is the variance of q = Var[q] =
ab
Variance of the Prior Beta =
. For a = 5 and b = 7 this is: 0.0187.
2
(a + b) (a + b + 1)
Prior Total Variance:
The total variance is the variance of the marginal distribution. The variance of the Bernoulli is the
chance of success times the chance of failure. The marginal distribution is a Bernoulli with chance of
ab
a
a
a
success
. Thus the total variance is:
{1 }=
.
(a + b)2
a + b
a + b
a + b
For a = 5 and b = 7 this equals: (5)(7)/122 = 0.243.
The Expected Value of the Process Variance + Variance of the Hypothetical Means =
0.224 + 0.019 = 0.243 = Total Variance.

EPV + VHM =

ab
ab
ab
+
=
= Total Variance.
2
(a + b) (a + b + 1)
(a + b) (a + b + 1) (a + b)2

VHM = the variance of the Prior Beta.


Total Variance = the variance of the Marginal Bernoulli = EPV + VHM.
ab
ab

= Variance of Prior Beta < Variance of Marginal Bernoulli =


.
2
(a + b) (a + b + 1)
(a + b)2

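Here is a short Python sketch (my own illustration, not from the text) verifying that EPV + VHM equals the total variance of the marginal Bernoulli for a = 5 and b = 7:

    # Sketch only: EPV + VHM = total variance for the Beta-Bernoulli prior.
    a, b = 5, 7
    mean = a / (a + b)
    second_moment = a * (a + 1) / ((a + b) * (a + b + 1))

    epv = mean - second_moment            # E[q(1-q)] = 0.224
    vhm = second_moment - mean**2         # Var[q]    = 0.019
    total = mean * (1 - mean)             # variance of the marginal Bernoulli = 0.243
    print(round(epv, 3), round(vhm, 3), round(epv + vhm, 3), round(total, 3))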
Observations:
Let us now introduce the concept of observations. A risk is selected at random and it is observed to
have 13 claims in 19 trials (or years.) Note that for the Bernoulli the number of claims is less than or
equal to the number of trials. (One can describe this case of the Bernoulli as observing 13
successes in 19 trials or equivalently 6 failures in 19 trials.)
Posterior Distribution:

We can employ Bayesian analysis to compute what the chances are that the selected risk had a
given Bernoulli Parameter. Given a Bernoulli with parameter q, the chance of observing 13 claims in
19 trials is Binomial:33 27,132 q^13 (1-q)^6. The a priori probability of q is the Prior Beta distribution:
π(q) = 2310 q^4 (1-q)^6, 0 ≤ q ≤ 1. Thus the posterior chance of q is proportional to the product of the
chance of observation and the a priori probability: q^17 (1-q)^12. This is proportional to the density for
a Beta with a = 18, b = 13, and θ = 1: f(q) = 1,556,878,050 q^17 (1-q)^12, 0 ≤ q ≤ 1.
In general, if one observes r claims for n trials, we have that the chance of this observation given q is
proportional to q^r (1-q)^(n-r). The prior Beta is proportional to q^(a-1) (1-q)^(b-1).
Note the way that both the Beta and the Bernoulli have q to a power and (1-q) to another power.
The posterior probability for q is proportional to their product q^(a+r-1) (1-q)^(b+n-r-1).
This is proportional to the density for β(a+r, b+n-r; x).
Thus for the Beta-Bernoulli the posterior density function is also a Beta.
This posterior Beta has a first parameter = prior first parameter plus the number of claims
observed. This posterior Beta has a second parameter = prior second parameter plus
the number of trials (usually years) minus the number of claims observed.34
The updating formulas are:
a′ = a + r.     b′ = b + (n - r).
For example, in the case where we observed 13 claims in 19 trials, r = 13 and n = 19.
The prior first parameter was 5 while the prior second parameter was 7.
Therefore the posterior first parameter = 5 + 13 = 18, while the posterior second parameter =
7 + 19 - 13 = 13, matching the result obtained above, β(18, 13; x).

33 The constant in front of the Binomial is 19! / (13! 6!) = 27,132.
34 The posterior second parameter = prior second parameter + the number of failures observed. The posterior first
parameter = prior first parameter + the number of successes observed. In all cases the third parameter, θ, is 1.

The prior distribution of q is:

π(q) = β(a, b; q) = {(a + b - 1)! / ((a - 1)! (b - 1)!)} q^(a-1) (1 - q)^(b-1), 0 ≤ q ≤ 1.

If for example we were modeling the testing of missiles, then q is associated with successes,
while 1-q is associated with the failures.
We add the number of successes to a, which is in the exponent of q: a′ = a + r.
We add the number of failures to b, which is in the exponent of 1-q: b′ = b + (n - r).
The fact that the posterior distribution is of the same form as the prior distribution is why the Beta is a
Conjugate Prior Distribution for the Bernoulli.
Below are compared the prior β(5, 7; x) (solid) and the posterior β(18, 13; x) (dashed):

[Figure: prior and posterior densities as functions of q.]

Observing 13 claims in 19 trials has increased the probability of a large Bernoulli parameter and
decreased the probability of a small Bernoulli parameter.
Exercise: If the prior Beta has a = 5 and b = 7, and 3 claims are observed in 19 trials, what is the
posterior Beta?
[Solution: a′ = 5 + 3 = 8. b′ = 7 + 19 - 3 = 23. Posterior Beta is β(8, 23; x).
Comment: The posterior density is: f(q) = 46,823,400 q^7 (1-q)^22.]

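For illustration (my own sketch, not part of the text), the updating formulas can be written as a small function and checked against the two examples above:

    # Sketch only: Beta-Bernoulli updating, a' = a + r, b' = b + (n - r).
    def update_beta(a, b, r, n):
        """Posterior Beta parameters after observing r claims in n Bernoulli trials."""
        return a + r, b + (n - r)

    print(update_beta(5, 7, 13, 19))   # (18, 13), as in the text
    print(update_beta(5, 7, 3, 19))    # (8, 23), as in the exercise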
Below are compared the prior β(5, 7; x) (solid) and this posterior β(8, 23; x) (dashed):

[Figure: prior and posterior densities as functions of q.]

Observing 3 claims in 19 trials has decreased the probability of a large Bernoulli parameter and
increased the probability of a small Bernoulli parameter.

Predictive Distribution (Posterior Mixed):


Since the posterior distribution is also a Beta distribution, the same analysis that led to a Bernoulli
(prior) marginal distribution will lead to a (posterior) predictive distribution that is Bernoulli. However,
the parameters are related to the posterior Beta. For the Beta-Bernoulli the (posterior)
predictive distribution is always a Bernoulli, with
q = (first parameter of the posterior Beta) /
(1st parameter of posterior Beta + 2nd parameter of posterior Beta).
Thus q = (first parameter of the prior Beta + number of claims observed) /
(first parameter of the prior Beta + second parameter of the prior Beta + number of trials
observed).
In the particular example with a posterior distribution of β(18, 13; x), the parameter of the posterior
Bernoulli predictive distribution is q = 18 / (18 + 13) = 0.5806.
Alternatively, one can compute this in terms of the prior β(5, 7; x) and the observations of 13 claims
in 19 trials: q = (5+13) / (5+7+19) = 0.5806.

Posterior Mean:
One can compute the means and variances posterior to the observations. The posterior mean can
be computed in either one of two ways. First one can weight together the means for each type of
risk, using the posterior probabilities. This is E[q] = the mean of the posterior Beta = 18/(18+13) =
0.5806 . Alternately, one can compute the mean of the predictive distribution: the mean of the
predictive Bernoulli is q = 0.5806. Of course the two results match.
Thus posterior to the observations, for this risk, the new estimate using Bayesian Analysis is
0.5806. This compares to the a priori estimate of 0.4167. In general, the observations provide
information about the given risk, which allows one to make a better estimate of the future experience
of that risk. Not surprisingly observing 13 claims in 19 trials (for a frequency of 0.6842) has raised the
estimated frequency from 0.4167 to 0.5806.
Posterior Expected Value of the Process Variance:
The process variance for an individual risk is q(1-q) = q - q², where q is its Bernoulli parameter.
Therefore the expected value of the process variance = the expected value of q - the expected
value of q² = the posterior mean frequency - second moment of the frequency. The former is the
mean of β(a,b): a/(a+b). The latter is the second moment of β(a,b): a(a+1) / {(a+b)(a+b+1)}.
The expected value of the process variance is:
1st moment of Beta Distribution - 2nd moment of the Beta Distribution =
a/(a + b) - a(a + 1)/{(a + b)(a + b + 1)} = ab / {(a + b)(a + b + 1)}.
For a = 18 and b = 13 this equals: (18)(13) / {(31)(32)} = 0.2359.

Posterior Variance of the Hypothetical Means:

The variance of the hypothetical means is the variance of q = Var[q] =
variance of the Posterior Beta = ab / {(a + b)² (a + b + 1)} = (18)(13) / {(31²)(32)} = 0.00761.
Note how after the observation the variance of the hypothetical means is less than prior
(0.00761 < 0.0187) since the observations have allowed us to narrow down the possibilities.35

35 The posterior VHM is usually but not always less than the prior VHM. When the observation corresponds to a low
prior expectation, then the posterior VHM can be larger than the prior VHM. For example with a = 21 and b = 1, the a
priori mean is 21/22 = 0.955. If one observes one claim in 25 trials, then the posterior β(22, 25; x) has a variance of
0.0052, greater than the 0.0019 variance of the prior β(21, 1; x).

Posterior Total Variance:

The total variance is the variance of the predictive distribution.
The variance of the Bernoulli equals q(1-q) = (0.5806)(1 - 0.5806) = 0.2435.
The Expected Value of the Process Variance + Variance of the Hypothetical Means =
0.2359 + 0.0076 = 0.2435 = Total Variance, as per the general result.

Buhlmann Credibility:

Next, let's apply Buhlmann Credibility to this example. The Buhlmann Credibility parameter K =
the (prior) expected value of the process variance / the (prior) variance of the hypothetical means =
0.2244 / 0.0187 = 12. Note that K can be computed prior to any observations and doesn't depend
on them. Specifically both variances are for a single insured for one trial.

In general K = (prior EPV) / (prior VHM) = [ab / {(a + b)(a + b + 1)}] / [ab / {(a + b)² (a + b + 1)}] = a + b.

For the Beta-Bernoulli in general, the Buhlmann Credibility parameter
K = a + b, where β(a, b; x) is the prior distribution.
For the example, K = 5 + 7 = 12.
Having observed 13 claims in 19 trials, Z = 19 / (19+ 12) = 0.6129.
The observation = 13/19.
The a priori mean = 5/12 = 0.4167.
Therefore the new estimate = (0.6129)(13/19) + (1 - 0.6129)(0.4167) = 0.5806.
Note that in this case the estimate from Buhlmann Credibility matches the estimate from Bayesian
Analysis. For the Beta-Bernoulli the estimates from using Bayesian Analysis and
Buhlmann Credibility are equal.36
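
The following Python sketch (mine, for illustration only) applies K = a + b to the running example and reproduces the estimate of 0.5806:

    # Sketch only: Buhlmann credibility for the Beta-Bernoulli (a = 5, b = 7, 13 claims in 19 trials).
    def beta_bernoulli_estimate(a, b, r, n):
        K = a + b                       # Buhlmann credibility parameter
        Z = n / (n + K)
        prior_mean = a / (a + b)
        return Z * (r / n) + (1 - Z) * prior_mean

    print(round(beta_bernoulli_estimate(5, 7, 13, 19), 4))   # 0.5806, same as (5+13)/(5+7+19)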
Summary:
The many different aspects of the Beta-Bernoulli are summarized below. Be sure to be able to
clearly distinguish between the situation prior to observations and that posterior to the observations.
Note the parallels to the Gamma-Poisson as summarized previously.
36 As discussed in a subsequent section, this is a special case of the general results for conjugate priors of members
of linear exponential families.

Beta-Bernoulli Frequency Process

Beta Prior (Distribution of Parameters), with parameters a and b:
Bernoulli Parameters of individuals making up the entire portfolio are distributed via a
Beta Distribution: f(x) = (a+b-1)! x^(a-1) (1-x)^(b-1) / {(a-1)!(b-1)!}, 0 ≤ x ≤ 1,
mean = a/(a+b), variance = ab/{(a+b+1)(a+b)²}.

Mixing the Bernoulli Process (Number of Claims) over the Beta Prior gives the
Bernoulli Marginal Distribution:
Bernoulli parameter q = mean of Bernoulli = a/(a+b) = mean of prior Beta.
Variance = q(1-q) = ab/(a+b)².

Observations: # claims = # successes = r, # exposures = # of trials = n.

Beta Posterior (Distribution of Parameters):
Posterior 1st parameter = a + r. Posterior 2nd parameter = b + n - r.

Mixing the Bernoulli Process (Number of Claims) over the Beta Posterior gives the
Bernoulli Predictive Distribution:
Bernoulli parameter q = mean of Bernoulli = (a+r)/(a+b+n) = mean of posterior Beta.
Variance = q(1-q) = (a+r)(b+n-r)/(a+b+n)².

Beta is a Conjugate Prior; Bernoulli is a Member of a Linear Exponential Family.
Buhlmann Credibility Estimate = Bayes Analysis Estimate.
Buhlmann Credibility Parameter, K = a + b.

Uniform-Bernoulli:

Since the Uniform Distribution is a special case of the Beta Distribution with a = 1 and b = 1,
the Uniform-Bernoulli is a special case of the Beta-Bernoulli.37

When the Parameter θ is Less Than One:

Assume each insured has Bernoulli frequency with parameter q, and q is distributed via a Beta
Distribution with parameters a, b, and θ < 1.38
The prior distribution of q is proportional to: q^(a-1) (1 - q/θ)^(b-1), 0 ≤ q ≤ θ < 1.
If we observe r claims in n years, the probability of the observation is proportional to:
q^r (1 - q)^(n-r). Therefore, using Bayes Theorem, the posterior distribution of q is proportional to:
q^r (1 - q)^(n-r) q^(a-1) (1 - q/θ)^(b-1) = q^(r+a-1) (1 - q)^(n-r) (1 - q/θ)^(b-1), 0 ≤ q ≤ θ < 1.
Unless n = r, the posterior distribution of q is not a Beta Distribution.39
When θ < 1, we do not have a Conjugate Prior situation.
We can apply Buhlmann Credibility to this situation in the usual manner.
Process variance is q(1 - q) = q - q².
EPV = E[q - q²] = E[q] - E[q²] = First moment of Beta - Second Moment of Beta =
θa/(a + b) - θ² a(a + 1)/{(a + b)(a + b + 1)} = {θa/(a + b)} {1 - θ(a + 1)/(a + b + 1)}.
VHM = Var[q] = Variance of Beta = Second Moment of Beta - Square of First Moment of Beta
= θ² a(a + 1)/{(a + b)(a + b + 1)} - θ² a²/(a + b)² = θ² {a/(a + b)} {(a + 1)/(a + b + 1) - a/(a + b)}.
K = EPV/VHM = {1 - θ(a + 1)/(a + b + 1)} / [θ {(a + 1)/(a + b + 1) - a/(a + b)}] = (a + b) {(a + b + 1)/θ - (a + 1)} / b.40
When θ = 1, K = a + b, as obtained previously.

37 See 4, 5/89, Q.49; 4B, 5/96, Q.30; 4B, 5/97, Q.9; 4B, 11/97, Q.19; 4B, 11/98, Q.14; 4, 11/00, Q.11.
38 For a = 1 and b = 1, this is the special case of a uniform distribution from 0 to θ.
39 4, 11/03, Q.19, where the prior distribution of q is a uniform from 0 to 0.5, is an example of this exception; we see a
claim every year, and thus the posterior distribution is a Beta. However, you should treat 4, 11/03, Q.19 as just
another continuous risk type Bayes question.
40 I would not memorize this formula.

Problems:
Use the following information to answer the next 14 questions:
The number of claims r that a particular policyholder makes in a year is Bernoulli with mean q.
The q values of the portfolio of policyholders have probability density function:
g(q) = 280 q^3 (1 - q)^4, 0 ≤ q ≤ 1.
You are given the following values of the Incomplete Beta Function:

y        β(4,5; y)    β(10,11; y)    β(11,10; y)
0.45     0.523        0.409          0.249
0.50     0.637        0.588          0.412
0.55     0.740        0.751          0.591
0.60     0.826        0.872          0.755
6.1 (1 point) What is the mean claim frequency for the portfolio?
A. less than 45%
B. at least 45% but less than 46%
C. at least 46% but less than 47%
D. at least 47% but less than 48%
E. at least 48%
6.2 (1 point) What is the chance that an insured picked at random from this portfolio has a Bernoulli
parameter between 0.50 and 0.55?
A. less than 10%
B. at least 10% but less than 11%
C. at least 11% but less than 12%
D. at least 12% but less than 13%
E. at least 13%

6.3 (2 points) The probability that a policyholder chosen at random will experience n claims in a year
is given by which of the following?
A. C(1,n) 3^n 6^(1-n) / 9, n = 0, 1
B. C(1,n) 6^n 3^(1-n) / 9, n = 0, 1
C. C(1,n) 4^n 5^(1-n) / 9, n = 0, 1
D. C(1,n) 5^n 4^(1-n) / 9, n = 0, 1
E. None of A, B, C, or D
6.4 (3 points) What is the expected value of the process variance?
A. less than 0.20
B. at least 0.20 but less than 0.22
C. at least 0.22 but less than 0.24
D. at least 0.24 but less than 0.26
E. at least 0.26
6.5 (2 points) What is the variance of the hypothetical mean frequencies?
A. less than 0.018
B. at least 0.018 but less than 0.020
C. at least 0.020 but less than 0.022
D. at least 0.022 but less than 0.024
E. at least 0.024
6.6 (1 point) What is the variance of the claim frequency for the portfolio?
A. less than 0.23
B. at least 0.23 but less than 0.24
C. at least 0.24 but less than 0.25
D. at least 0.25 but less than 0.26
E. at least 0.26

6.7 (2 points) An insured has 7 claims over 12 years.


Using Buhlmann Credibility what is the estimate of this insured's expected claim frequency?
A. less than 51%
B. at least 51% but less than 53%
C. at least 53% but less than 55%
D. at least 55% but less than 57%
E. at least 57%
6.8 (2 points) An insured has 7 claims over 12 years. The posterior probability density function for
this insured's Bernoulli parameter q is proportional to which of the following?
A. q^9 (1-q)^10   B. q^10 (1-q)^9   C. q^8 (1-q)^11   D. q^11 (1-q)^8   E. None of A, B, C, or D

6.9 (2 points) An insured has 7 claims over 12 years.


What is the mean of the posterior distribution?
A. less than 49%
B. at least 49% but less than 50%
C. at least 50% but less than 51%
D. at least 51% but less than 52%
E. at least 52%
6.10 (1 point) An insured has 7 claims over 12 years.
What is the chance that this insured has a Bernoulli parameter between 0.50 and 0.55?
A. less than 16%
B. at least 16% but less than 17%
C. at least 17% but less than 18%
D. at least 18% but less than 19%
E. at least 19%
6.11 (2 points) An insured has 7 claims over 12 years.
What is the variance of the posterior distribution?
A. less than 0.010
B. at least 0.010 but less than 0.012
C. at least 0.012 but less than 0.014
D. at least 0.014 but less than 0.016
E. at least 0.016

6.12 (1 point) An insured has 7 claims over 12 years.


What is the chance that this insured has a Bernoulli parameter between 0.50 and 0.55?
Use the Normal Approximation.
A. 13%
B. 15%
C. 17%

D. 19%

E. 21%

6.13 (2 points) An insured has 7 claims over 12 years. What is the probability density function for
the predictive distribution of the number of claims per year for this insured?
A. C(1,n) 9^n 12^(1-n) / 21, n = 0, 1
B. C(1,n) 12^n 9^(1-n) / 21, n = 0, 1
C. C(1,n) 8^n 13^(1-n) / 21, n = 0, 1
D. C(1,n) 13^n 8^(1-n) / 21, n = 0, 1
E. None of A, B, C, or D
6.14 (4 points) An insured has 7 claims over 12 years.
What is the probability that this same insured will have 7 claims over the next 12 years?
A. 15%
B. 17%
C. 19%
D. 21%
E. 23%

6.15 (2 points) You are given the following:

The number of claims for a single insured is a Bernoulli with parameter q,


where q varies between insureds.

The overall average frequency is 0.6.

The Expected Value of the Process Variance is 0.2.

Determine the Variance of the Hypothetical Mean Frequencies.


A. less than 0.03
B. at least 0.03 but less than 0.05
C. at least 0.05 but less than 0.07
D. at least 0.07 but less than 0.09
E. at least 0.09

Use the following information for the next two questions:


The number of claims that a particular policyholder makes in a year is Bernoulli
with mean q.
The q values over the portfolio of policyholders are uniformly distributed from 0 to 1.

A policyholder is observed to have 2 claims in 7 years.


6.16 (2 points) What is the expected future annual claim frequency for this policyholder?
A. 1/3
B. 3/8
C. 3/7
D. 4/9
E. 5/11
6.17 (3 points) What is the probability that this policyholder has a q parameter less than 0.4?
A. less than 70%
B. at least 70% but less than 72%
C. at least 72% but less than 74%
D. at least 74% but less than 76%
E. at least 76%

Use the following information for the next two questions:


You have your back to a pool table.
Your friend places the cue ball at random on the table.
He places another ball at random on the table,
and tells you that it is to the left of the cue ball.
He places yet another ball at random on the table,
and tells you that it is to the right of the cue ball.
He places yet another ball at random on the table,
and tells you that it is to the left of the cue ball.
6.18 (2 points) Using Bayes Analysis, estimate the fraction of the way the cue ball is from the left
end of the table towards the right end.
6.19 (2 points) Using Bayes Analysis, estimate the probability that the cue ball is less than one
fourth of the way from the left end of the table towards the right end.

Use the following information for the next four questions:


Professor Zweistein of the Institute of Basic Studies in Kingston, N. J. has determined that the
chance of getting a head when flipping a U. S. penny is a Bernoulli process, with the expected
number of heads distributed among the different pennies via the Incomplete Beta Function
β(499,501; x).
6.20 (1 point) A penny is chosen at random and flipped. What is the chance of a head?
A. less than 49.4%
B. at least 49.4% but less than 49.6%
C. at least 49.6% but less than 49.8%
D. at least 49.8% but less than 50.0%
E. at least 50.0%
6.21 (2 points) A penny is chosen at random and flipped 2000 times.
1010 heads are observed.
Use Buhlmann Credibility to estimate the future chance of a head when flipping this penny.
A. 50.1%
B. 50.2%
C. 50.3%
D. 50.4%
E. 50.5%
6.22 (3 points) A penny is chosen at random and flipped 2000 times.
1010 heads are observed.
What is the chance that this penny has a Bernoulli parameter greater than .500?
Hint: Approximate a Beta Distribution by a Normal Distribution.
A. less than 60%
B. at least 60% but less than 65%
C. at least 65% but less than 70%
D. at least 70% but less than 75%
E. at least 75%
6.23 (2 points) A penny is chosen at random and flipped 2000 times.
1010 heads are observed.
Which of the following is a 90% confidence interval for the Bernoulli parameter of this penny?
Hint: Approximate a Beta Distribution by a Normal Distribution.
A. (0.479, 0.527)
B. (0.482, 0.524)
C. (0.485, 0.521)
D. (0.488, 0.518)
E. (0.491, 0.515)

Use the following information about a missile defense system for the next three questions:
• Each trial has chance of success q, independent of any other trial.
• A priori you assume that q is Beta distributed, with a = 5 and b = 3.
• You observe one success in the first six trials.
6.24 (2 points) Estimate the chance of success on the next trial.


(A) 1/5
(B) 2/5
(C) 3/7
(D) 1/2
(E) None of A, B, C, or D
6.25 (1 point) What is the variance of the predictive distribution?
(A) 15/64
(B) 6/25
(C) 35/144 (D) 12/49
(E) None of A, B, C, or D
6.26 (3 points) Estimate the chance of having a failure on each of the next three trials.
A. less than 18%
B. at least 18% but less than 19%
C. at least 19% but less than 20%
D. at least 20% but less than 21%
E. at least 21%

6.27 (8 points) Over the last decade in the Duchy of Grand Fenwick there have been 111 boys
born and 97 girls born.
You assume that the natural proportion of boys born to human beings is 52%.
Use Bayes Analysis to determine the probability that the future longterm proportion of boys born in
the Duchy of Grand Fenwick will be greater than 52%.
(a) (2 points) Assume that the longterm proportion of boys born varies between populations.
It is 48% for 1/4 of the populations, 52% for 1/2 of the populations,
and 56% for the remaining 1/4 of populations.
(b) (4 points) Assume that the longterm proportion of boys born varies between populations,
uniformly from 48% to 56%.
(c) (2 points) Assume that the longterm proportion of boys born varies between populations,
following a Beta Distribution with a = 13, b = 12, and θ = 1.

Use the following information for the next 3 questions:
• The probability that a baseball player gets a hit in any given attempt is q.
• The results of attempts are independent of each other.
• For a particular ballplayer, q does not vary by attempt.
• The prior distribution of q is assumed to follow a distribution with density function
proportional to: q^134 (1-q)^349, 0 ≤ q ≤ 1.
6.28 (1 point) What is the probability that a ballplayer chosen at random will get a hit on his next
attempt?
A. less than 26%
B. at least 26% but less than 27%
C. at least 27% but less than 28%
D. at least 28% but less than 29%
E. at least 29%
6.29 (2 points) Flash Phillips is observed for 100 attempts and gets 40 hits.
How many hits do you expect Flash to get in his next 100 attempts?
A. 29
B. 30
C. 31
D. 32
E. 33
6.30 (2 points) Flash Phillips is observed for 100 more attempts for a total of 200,
and gets 5 more hits, for a total of 45.
Estimate the chance that Flash will get a hit in his next attempt.
A. less than 0.255
B. at least 0.255 but less than 0.260
C. at least 0.260 but less than 0.265
D. at least 0.265 but less than 0.270
E. at least 0.270

6.31 (2 points) Lucy van Pelt will hold a football for Charlie Brown to kick.
The probability that Lucy will pull the football away just as Charlie tries to kick it is q.
You assume that the probability q will be the same for each trial, and that the results of each trial are
independent. Prior to any observations, you had assumed that q has a Beta Distribution with
parameters a = 1, b = 5, and θ = 1.
Over the years, 47 times in a row, Lucy has pulled the football away just before Charlie tried to kick
it, and Charlie landed flat on his back.
The 48th time, what is the probability that Lucy pulls the football away just before Charlie tries to kick
it?
A. 80%

B. 90%

C. 95%

D. 99%

E. 99.9%

6.32 (4 points) A scientist, Lucky Tom, finds coins on his 60 minute walk to work at a Poisson rate of
0.5 coins/minute. The denominations are randomly distributed:
(i)
60% of the coins are worth 1;
(ii)
20% of the coins are worth 5; and
(iii)
20% of the coins are worth 10.
One of Tom's fellow scientists accidentally released a tyrannosaur, which eats only scientists.
Each scientist has a chance of being eaten of q per day.
For an individual scientist, his value of q remains constant as long as he remains alive.
Initially, over all scientists, q is distributed via a Beta Distribution with a = 2, b = 150, and θ = 1.
Since the tyrannosaur was released, Lucky Tom has survived 300 days without being eaten.
What is the expected amount of money Lucky Tom finds in the future before being eaten?
A. 35,000
B. 40,000
C. 45,000
D. 50,000
E. 55,000
Use the following information for the next two questions:
Baseball teams play 162 games in a year.
The probability that a baseball team wins any given game is q.
The results of games are independent of each other.
For a particular team, q does not vary during a year.
Over the different teams, q is distributed via a Beta Distribution with
a = 15, b = 15, and θ = 1.
The Durham Bulls baseball team wins 40 of its first 60 games this year.
6.33 (2 points) What is the expected total number of games the Durham Bulls will win this year?
(A) 96
(B) 98
(C) 100
(D) 102
(E) 104
6.34 (3 points) What is the variance of the total number of games the Durham Bulls will win this
year?
(A) 48
(B) 50
(C) 52
(D) 54
(E) 56

6.35 (3 points) For each mother, each child has a chance q of being a girl, independent of the
gender of her other children. The value of q varies across the population via a Beta Distribution with
parameters a = 10, b = 10, and θ = 1.
Mrs. Molly Weasley has had six children, all sons.
What is the probability that her next child will be a girl?
A. less than 40%
B. at least 40% but less than 42%
C. at least 42% but less than 44%
D. at least 44% but less than 46%
E. at least 46%

Use the following information for the next three questions:
• At halftime of a basketball game they will choose someone from the crowd at random.
• The person chosen will get a chance to make a shot from half court.
• Let q be the chance of making the shot.
• You assume that q is distributed across attendees via a Beta Distribution with a = 1, b = 19, and θ = 1.
• In honor of retiring the uniform number 5 of their former star player Archibald Andrews, today the team will allow the lucky person chosen five chances to make the half court shot.
• At today's game, Steven Quincy Urkel is chosen out of the crowd.
• Steve misses his first four attempts at making the shot.
6.36 (2 points) Estimate Steve's chance of making his last shot, using the Bayes estimate for the
squared error loss function.
(A) 0
(B) 1%
(C) 2%
(D) 3%
(E) 4%
6.37 (2 points) Estimate Steve's chance of making his last shot, using the Bayes estimate for the
absolute error loss function.
(A) 0
(B) 1%
(C) 2%
(D) 3%
(E) 4%
6.38 (2 points) Estimate Steve's chance of making his last shot, using the Bayes estimate for the
zero-one loss function.
(A) 0
(B) 1%
(C) 2%
(D) 3%
(E) 4%

6.39 (2 points) Use the following information:
• Baseball teams play 162 games in a year.
• The probability that a baseball team wins any given game is q.
• The results of games are independent of each other.
• For a particular team, q varies between the games during a year.
• The Hadley Saints baseball team has an expected winning percentage this year of 60%.
Determine the standard deviation of the total number of games the Hadley Saints will win this year.

Use the following information for the next two questions:
• Each policyholder will have zero or one claim in a year.
• The probability of having a claim is equal to q.
• The q values over the portfolio of policyholders are uniformly distributed from 0 to 1.
6.40 (2 points) A particular policyholder has no claims over n years.
Determine the expected number of claims this policyholder will have in the following year.
6.41 (2 points) A different policyholder has one claim in each of n years.
Determine the expected number of claims this policyholder will have in the following year.

6.42 (4, 5/89, Q.38) (2 points) The prior distribution of your hypothesis about the unknown value
of H is given by
P(H = 1/4) = 4/5
P(H = 1/2) = 1/5.
The data from a single experiment is distributed according to
P(D = d | H = h) = h^d (1-h)^(1-d), for d = 0, 1.
If the result of a single experimental outcome is d = 1, what is the posterior distribution of H?
A. P(D = d | H = h) = h^(d/2) (1-h)^(1-d/2), for d = 0, 1
B. P(H = 1/4) = 2/3, P(H = 1/2) = 1/3
C. P(H = 1/4) = 1/2, P(H = 1/2) = 1/2
D. P(D = d | H = h) = h^(2d/3) (1-h)^(1-d/3), for d = 0, 1
E. P(H = 1/4) = 1/3, P(H = 1/2) = 2/3
6.43 (4, 5/89, Q.49) (3 points) The probability of y successes in n trials is given by the binomial
distribution with p.d.f.:
f(y; θ) = C(n, y) θ^y (1 - θ)^(n-y).
The prior distribution of θ is a uniform distribution:
g(θ) = 1, 0 ≤ θ ≤ 1.
Given that one success was observed in two trials, what is the Bayesian estimate for the
probability that the unknown parameter θ is in the interval [0.45, 0.55]?
A. Less than 0.10
B. At least 0.10, but less than 0.20
C. At least 0.20, but less than 0.30
D. At least 0.30, but less than 0.40
E. 0.40 or more

6.44 (165, 11/89, Q.9) (1.7 points) You are using the Bayesian process to estimate a binomial
probability. The prior distribution is Beta with θ = 1. You are given:
(i) The mean of the prior distribution is 1/10.
(ii) The mode of the prior distribution is 1/20.
(iii) The mean of the posterior distribution is 19/115.
(iv) Five trials of an experiment produce h successes.
Determine h.
Hint: The mode of a Beta Distribution is (a - 1) / (a + b - 2).
(A) 1

(B) 2

(C) 3

(D) 4

(E) 5

6.45 (4B, 11/92, Q.27) (2 points) You are given the following:
• The distribution for the number of claims is Bernoulli with parameter q.
• The prior distribution of q is the beta distribution: f(q) = {(3 + 4 + 1)!/(3! 4!)} q^3 (1-q)^4, 0 ≤ q ≤ 1.
• 2 claims are observed in 3 trials.


Determine the mean of the posterior distribution of q.
A. Less than 0.45
B. At least 0.45 but less than 0.55
C. At least 0.55 but less than 0.65
D. At least 0.65 but less than 0.75
E. At least 0.75
Use the following information for the next two questions:
• The probability of an individual having exactly one claim in one exposure period is q, while the probability of no claims is 1-q.
• q is a random variable with the Beta density function f(q) = 6q(1-q), 0 ≤ q ≤ 1.
6.46 (4B, 5/94, Q.2) (3 points) Determine the Buhlmann credibility factor, z, for the number of
observed claims for one individual for one exposure period.
A. 1/12
B. 1/6
C. 1/5
D. 1/4
E. None of A, B, C, or D
6.47 (4B, 5/94, Q.3) (2 points) You are given the following:
An individual is selected at random and observed for 12 exposure periods.
During the 12 exposure periods, the selected individual incurs 3 claims.
Determine the probability that the same individual will have one
claim in the next exposure period.
A. 1/4
B. 1/7
C. 2/7
D. 3/8
E. 5/16

Use the following information for the next two questions:
• The probability that a single insured will produce exactly one claim during one exposure period is q, while the probability of no claim is 1-q.
• q varies by insured and follows a beta distribution with density function f(q) = 6q(1-q), 0 ≤ q ≤ 1.
6.48 (4B, 11/95, Q.24) (3 points) Two insureds are randomly selected. During the first two
exposure periods, one insured produces a total of two claims (one in each exposure period) and
the other insured does not produce any claims.
Determine the probability that each of the two insureds will produce one claim during the third
exposure period.
A. 2/9
B. 1/4
C. 4/9
D. 1/2
E. 2/3
6.49 (4B, 11/95, Q.25) (2 points) Determine the number of exposure periods of loss experience
of a single insured needed to give a Buhlmann credibility factor, Z, of 0.75.
A. 2
B. 4
C. 6
D. 12
E. 24

6.50 (165, 5/96, Q.11) (1.9 points)
Fifteen successes have been observed in an experiment with n trials.
You are applying the Bayesian process to estimate the true probability of success, which you know
is a binomial probability.
You also know that the form of the prior distribution is Beta, with θ = 1.
However, you are undecided as to which parameters to use.
Prior distribution I has parameters a1 and b1, while prior distribution II has parameters a2 and b2.
You are given:
(i) a1 = 5;
(ii) b1 = b2 ;
(iii) the mode of prior distribution I is 1/7;
(iv) the mean of prior distribution II is 6/11; and
(v) the mean of posterior distribution I is 16/31 of the mean of posterior distribution II.
Determine n.
(A) 75
(B) 90
(C) 100
(D) 125
(E) 165

6.51 (4B, 5/96, Q.30) (3 points) A number x is randomly selected from a uniform distribution on
the interval [0, 1].
Four Bernoulli trials are to be performed with probability of success x.
The first three are successes.
What is the probability that a success will occur on the fourth trial?
A. Less than 0.675
B. At least 0.675, but less than 0.725
C. At least 0.725, but less than 0.775
D. At least 0.775, but less than 0.825
E. At least 0.825
6.52 (4B, 5/97, Q.9 & Course 4 Sample Exam 2000, Q.34) (3 points)
You are given the following:
• The number of claims for Risk 1 during a single exposure period follows a Bernoulli distribution with mean q.
• The prior distribution for q is uniform on the interval [0, 1].
• The number of claims for Risk 2 during a single exposure period follows a Poisson distribution with mean λ.
• The prior distribution for λ has the density function f(λ) = θe^(-θλ), 0 < λ < ∞, θ > 0.
• The loss experience of both risks is observed for an equal number of exposure periods.
Determine all values of θ for which the Buhlmann credibility of the loss experience of Risk 2 will be
greater than the Buhlmann credibility of the loss experience of Risk 1.
Hint: ∫0^∞ λ^2 θe^(-θλ) dλ = 2/θ^2.
A. θ > 0
B. θ < 1
C. θ > 1
D. θ < 2
E. θ > 2

6.53 (4B, 5/97, Q.25) (2 points) You are given the following:
The number of claims for a single insured is 1 with probability q and 0 with probability 1-q,
where q varies by insured.
The expected value of the process variance is 0.10.
The average of the hypothetical means is 0.30.
Determine the variance of the hypothetical means.
A. 0.01
B. 0.09
C. 0.10
D. 0.11

E. 0.19

6.54 (4B, 11/97, Q.19) (3 points) You are given the following:
• The number of claims for a single insured follows a Bernoulli distribution with mean q.
• q varies by insured and follows a uniform distribution on the interval [0, s], where 0 < s < 1.
Determine the value of Buhlmann's k.
A. 2
B. 8
C. s^2 / 12
D. s(3 - 2s) / 6
E. 2(3 - 2s) / s

6.55 (4B, 11/98, Q.14) (2 points) You are given the following:
• The probability that a risk has at least one loss during any given month is q.
• q does not vary by month.
• The prior distribution of q is assumed to be uniform on the interval (0, 1).
• This risk is observed for n months.
• At least one loss is observed during each of these n months.
• After this period of observation, the mean of the posterior distribution of q for this risk is 0.95.
Determine n.
A. 8
B. 9
C. 10
D. 18
E. 19
Use the following information for the next two questions:
• The probability that a particular baseball player gets a hit in any given attempt is q.
• q does not vary by attempt.
• The prior distribution of q is assumed to follow a distribution with mean 1/3, variance ab/{(a + b)^2 (a + b + 1)}, and density function f(q) = {Γ(a + b)/(Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1), 0 ≤ q ≤ 1.
• The player is observed for nine attempts and gets four hits.

6.56 (4B, 11/98, Q.23) (2 points) If the prior distribution is constructed so that the credibility of the
observations is arbitrarily close to zero, determine which of the following is the largest.
A. f(0)
B. f(1/3)
C. f(1/2)
D. f(2/3)
E. f(1)
6.57 (4B, 11/98, Q.24) (3 points) If the prior distribution is constructed so that the variance of the
hypothetical means is 1/45, determine the probability that the player gets a hit in the tenth attempt.
A. 1/3
B. 13/36
C. 7/18
D. 5/12
E. 4/9

6.58 (4, 11/00, Q.11) (2.5 points) For a risk, you are given:
(i) The number of claims during a single year follows a Bernoulli distribution with mean p.
(ii) The prior distribution for p is uniform on the interval [0,1].
(iii) The claims experience is observed for a number of years.
(iv) The Bayesian premium is calculated as 1/5 based on the observed claims.
Which of the following observed claims data could have yielded this calculation?
(A) 0 claims during 3 years
(B) 0 claims during 4 years
(C) 0 claims during 5 years
(D) 1 claim during 4 years
(E) 1 claim during 5 years

Solutions to Problems:
6.1. A. Beta-Bernoulli. Beta has a = 4 and b = 5 (and theta = 1.) Mean of Bernoulli is q.
Mean of the portfolio = E[q] = Mean of Beta Distribution = a / (a+b) = 4/9 = 0.444.
6.2. B. The prior distribution is a Beta with a = 4 (the exponent of q)
and b = 5 (the exponent of 1-q).
Thus F(x) = β(4,5; x). F(0.55) - F(0.50) = β(4,5; 0.55) - β(4,5; 0.5) = 0.740 - 0.637 = 0.103.
6.3. C. The (prior) marginal distribution is a Bernoulli with mean 4/9. This is the case because the
chance of one claim is the integral of q g(q), which is the mean of
g(q), which is the mean of the prior Beta = 4/9, as per the previous question.
The chance of no claim is the integral of (1-q)g(q) =
{integral of g(q)} - {integral of q g(q)} = 1 - mean of the prior Beta = 1 - 4/9 = 5/9.
Note that for each value of q the Bernoulli can have only zero or one claim; therefore, these are the
only two possibilities when we integrate over all values of q.
6.4. C. Variance of the Bernoulli is: q(1-q) = (q - q2 ). Expected value of the process variance of the
portfolio = E[q] - E[q2 ]. E[q] = Mean of Beta Distribution = a / (a+b) = 4 / 9.
E[q2 ] = 2nd moment of Beta Distribution = a(a+1)/((a+b)(a+b+1)) = (4)(5)/((9)(10)) = .2222.
Therefore, expected value of the process variance of the portfolio = E[q] - E[q2 ] =
.4444 - .2222 = 0.2222.
6.5. E. The mean of the Bernoulli is q. Therefore, the variance of the hypothetical mean frequencies
= Var[q] = Variance of Beta = 2/81 = 0.0247.
6.6. C. Using the solutions of the previous two problems, the total variance =
Expected value of the process variance + Variance of the hypothetical means =
0.2222 + 0.0247 = 0.2469.
Alternately, the variance of the marginal Bernoulli (with a mean of 4/9) is: (4/9)(1 - 4/9) = .247.
6.7. B. K = .2222 / .0247 = 9.0. Z = 12 / (12 + 9) = 57.1%.
Estimated claim frequency = (.571)(7/12) + (.429)(.444) = 0.524.
Comment: For the Beta-Bernoulli, K = a + b = 4 + 5 = 9.
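As a purely numerical check (not exam material), the Buhlmann calculation above can be reproduced in a few lines of Python; this is a sketch assuming the prior Beta with a = 4 and b = 5 and the observation of 7 claims in 12 years from the preceding problems:

    # Beta-Bernoulli Buhlmann credibility: K = a + b, Z = n/(n + K)
    a, b = 4.0, 5.0
    years, claims = 12, 7
    K = a + b                                   # Buhlmann credibility parameter
    Z = years / (years + K)                     # credibility given to 12 years of data
    prior_mean = a / (a + b)
    estimate = Z * (claims / years) + (1 - Z) * prior_mean
    print(K, round(Z, 3), round(estimate, 3))   # 9.0, 0.571, 0.524 (= 11/21)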
6.8. B. For the Beta-Bernoulli, the posterior distribution is Beta with new parameters equal to
(a + # claims observed), (b + # years - # claims observed)) =
(4 + 7), (5 + 12 - 7) = 11, 10. β(11,10; x) is proportional to: q^10 (1-q)^9.

6.9. E. The mean of the posterior Beta with parameters 11, 10 is: 11 / (11 + 10) =
11 / 21 = 0.524. Alternately, the estimated frequency using Buhlmann Credibility was .52, and this
Buhlmann Credibility result must be equal to the result of Bayesian Analysis, since the Beta is a
conjugate prior of the Bernoulli.
6.10. C. The posterior distribution is a Beta with a = 11 and b = 10. Thus F(x) = β(11,10; x).
F(0.55) - F(0.50) = β(11,10; 0.55) - β(11,10; 0.5) = 0.591 - 0.412 = 0.179.
Comment: An example of a Bayesian Interval Estimate.
6.11. B. The posterior Beta with parameters 11 and 10 has mean = 11/21 = 0.5238, second
moment a(a+1)/{(a+b)(a+b+1)} = (11)(12)/{(21)(22)} = 0.2857, and variance = 0.2857 - 0.5238^2 = 0.0113.
Comment: The variance of the posterior Beta is considerably less than the variance of the prior Beta
distribution. The observations allow us to narrow the distribution of possibilities.
6.12. D. The posterior distribution of q is a Beta with a = 11 and b = 10, with mean of 0.524 and
standard deviation of √0.0113 = 0.106. Thus F(0.55) - F(0.50) ≅
Φ[(0.55 - 0.524)/0.106] - Φ[(0.5 - 0.524)/0.106] = Φ(0.25) - Φ(-0.23) = 0.5987 - 0.4090 = 0.1897.


Comment: Note that this differs from the exact answer of .179 obtained as the solution to a previous
question using values of the Incomplete Beta Functions.
6.13. E. The predictive distribution is a Bernoulli with mean 11/21. This is the case because the
chance of one claim is the integral of q g(q), which is the mean of g(q), which is the mean of the
posterior Beta = 11/21, as per a previous question. The chance of no claim is the integral of
(1-q)g(q) = {integral of g(q)} - {integral of q g(q)} = 1 - mean of the posterior Beta = 1 - 11/21 =
10/21. Note that for each value of q the Bernoulli can have only zero or one claim; therefore, these
are the only two possibilities when we integrate over all values of q.
Thus the predictive density is: f(n) = 11^n 10^(1-n) / 21, n = 0, 1.

6.14. B. Given q, the probability of 7 claims in 12 years is: C(12,7) q^7 (1-q)^5 = 792 q^7 (1-q)^5.
The posterior distribution of q is a Beta with a = 11 and b = 10, with density
f(q) = {20!/(10! 9!)} q^10 (1-q)^9 = 1,847,560 q^10 (1-q)^9, 0 ≤ q ≤ 1.
Therefore, the probability of 7 claims over the next 12 years is:
∫0^1 1,847,560 q^10 (1-q)^9 792 q^7 (1-q)^5 dq = 1,463,267,520 ∫0^1 q^17 (1-q)^14 dq
= (1,463,267,520)(17! 14!/32!) = 1,463,267,520/8,485,840,800 = 17.24%.
Comment: Beyond what you are likely to be asked on your exam. Involves Beta type integrals.
The estimated future claim frequency is 11/21 and the predictive distribution for the next year is a
Bernoulli with mean 11/21.
However, the number of claims over the next several years is given by a mixture of Binomial
Distributions via a Beta, a Beta-Binomial Distribution, rather than by a Binomial Distribution.
The probability of 7 claims from a Binomial with m = 12 and q = 11/21 is:
{12!/(7! 5!)} (11/21)^7 (10/21)^5 = 21.0%, which is not the correct answer to this question.
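The 17.24% above can also be checked numerically. The sketch below assumes the posterior Beta(11, 10) and uses scipy's log-beta function; the predictive count over 12 years is Beta-Binomial, and the Binomial(12, 11/21) density is shown only for contrast:

    from math import exp
    from scipy.special import betaln, comb
    from scipy.stats import binom

    a, b, m, k = 11, 10, 12, 7
    # P(k claims in m years) = C(m, k) B(a + k, b + m - k) / B(a, b)
    print(comb(m, k) * exp(betaln(a + k, b + m - k) - betaln(a, b)))   # about 0.1724
    print(binom.pmf(k, m, a / (a + b)))                                # about 0.210, not the answer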
6.15. B. For a Bernoulli the mean is q and the process variance is: q(1-q) = q - q2 .
Thus the Expected Value of the Process Variance is: E[q] - E[q2 ].
Thus E[q2 ] = E[q] - EPV. EPV = 0.2.
0.6 = overall average = average of the hypothetical means = E[q].
Thus E[q^2] = 0.6 - 0.2 = 0.4. VHM = VAR[q] = E[q^2] - E[q]^2 = 0.4 - 0.6^2 = 0.4 - 0.36 = 0.04.
Comment: Note that while the question does not assume that the q values are Beta Distributed,
one can assume so anyway and solve for the a and b parameters. In that case, the overall mean =
mean of the Beta = a/(a+b) = .6. The EPV = ab/{(a+b)(a+b+1)} = .2.
One can solve: a = 3 and b = 2. Then the VHM = ab/{(a+b)^2 (a+b+1)} = 0.04.
6.16. A. The uniform distribution is a special case of the Beta, with a = 1 and b = 1.
Posterior a' = a + r = 1 + 2 = 3. Posterior b' = b + n - r = 1 + 5 = 6.
Posterior Beta has mean: a'/(a' + b') = 3/(3 + 6) = 1/3.
Alternately, K = a + b = 1 + 1 = 2. Z = 7/(7 + K) = 7/9. Prior mean = 0.5.
Estimated future annual frequency = (7/9)(2/7) + (2/9)(0.5) = 1/3.

6.17. A. Posterior Beta has a = 3, b = 6, θ = 1.
f(q) = q^(3-1) (1-q)^(6-1) 8!/(2! 5!) = 168 q^2 (1-q)^5.
Prob[q ≤ 0.4] = ∫0^0.4 168 q^2 (1-q)^5 dq. Substituting y = 1 - q:
= 168 ∫0.6^1 (1-y)^2 y^5 dy = 168 ∫0.6^1 (y^5 - 2y^6 + y^7) dy
= 168 {y^6/6 - 2y^7/7 + y^8/8}, evaluated from y = 0.6 to y = 1, = (168)(0.00407503) = 68.46%.
Comment: F(x) = β(3, 6; x/1). F(0.4) = β(3, 6; 0.4) = 68.46%.
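For those who prefer software to the integral, the same figure follows from the incomplete beta function; a minimal sketch assuming the posterior Beta(3, 6):

    from scipy.stats import beta
    print(beta.cdf(0.4, 3, 6))   # about 0.6846, i.e. 68.46%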


6.18. & 6.19. Let q be the fraction of the way the cue ball is from the left end of the table towards
the right end. Then the probability that a ball is to the left of the cue ball is q.
Assume that "your friend places the cue ball at random on the table" means that q is uniformly
distributed from 0 to 1.
Thus we have a Uniform-Bernoulli, a special case of the Beta-Bernoulli with a = 1 and b = 1.
Since q is the probability of a ball being to the left of the cue ball, we have two successes
in three trials.
Thus the posterior distribution of q is Beta with:
a' = 1 + 2 = 3, and b' = 1 + 1 = 2.
The posterior mean is: a'/(a' + b') = 3/5 = 0.6.
The posterior density of q is: q^2 (1-q) 4!/(2! 1!) = 12q^2 - 12q^3, 0 ≤ q ≤ 1.
The posterior probability that q < 1/4 is:
∫0^0.25 (12q^2 - 12q^3) dq = (4)(0.25^3) - (3)(0.25^4) = 5.08%.

Comment: Similar to the situation originally discussed by the Reverend Thomas Bayes.
His work was edited by Richard Price and published posthumously in 1764 as
An Essay towards solving a Problem in the Doctrine of Chances.
http://rstl.royalsocietypublishing.org/content/53/370.full.pdf
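A short sketch of the same pool-table calculation, assuming a uniform (Beta(1,1)) prior for q and the observed 2 "left" balls out of 3:

    from scipy.stats import beta
    a_post, b_post = 1 + 2, 1 + 1              # posterior Beta(3, 2)
    print(beta.mean(a_post, b_post))           # 0.6, the Bayes estimate of q
    print(beta.cdf(0.25, a_post, b_post))      # about 0.0508, i.e. 5.08%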
6.20. D. Beta-Bernoulli, with a = 499 and b = 501.
Mean of prior Beta is: a/(a+b) = 499 / (499+501) = 0.499.
6.21. C. For the Beta-Bernoulli, K = a + b = 1000. Z = 2000 / (2000 + 1000) = 2/3.
estimate: (2/3)(1010/2000) + (1/3)(.499) = 0.503.
Alternately, Posterior Beta has parameters:
499 + 1010 = 1509 and 501 + 2000 - 1010 = 1491.
Mean of posterior Beta: 1509 / (1509 + 1491) = 1509/3000 = 0.503.

6.22. B. The 2nd moment of the posterior Beta is:
a(a+1)/{(a+b)(a+b+1)} = (1509)(1510)/{(3000)(3001)} = 0.2530923.
Variance = 0.2530923 - 0.503^2 = 0.0000833.
Prob(q > 0.500) ≅ 1 - Φ[(0.500 - 0.503)/√0.0000833] = 1 - Φ(-0.33) = 0.6293.
Comment: The Beta posterior is a distribution of Bernoulli parameters.


6.23. D. From the solution to the previous question, the posterior Beta(1509, 1491) has mean
0.503 and standard deviation 0.00913. On the Normal Distribution, since Φ(1.645) - Φ(-1.645)
= 0.95 - 0.05 = 0.90, the mean ± 1.645 standard deviations covers a probability of 90%.
Thus 0.503 ± (1.645)(0.00913) = (0.488, 0.518) is an approximate 90% confidence interval for the
Bernoulli parameter.
Comment: This is an example of using the posterior distribution in order to find a confidence interval
for the value of the true parameter(s) around its Bayesian estimator. In this case an approximate 95%
confidence interval for the Bernoulli parameter around its Bayesian estimate of 0.503 would be
0.503 ± (1.960)(0.00913) = (0.485, 0.521).
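Both the Normal approximations and the exact Beta values in 6.22 and 6.23 can be compared directly; this sketch assumes the posterior Beta(1509, 1491) from 6.21:

    from scipy.stats import beta, norm
    a, b = 1509, 1491
    mu, sd = beta.mean(a, b), beta.std(a, b)
    print(1 - norm.cdf((0.500 - mu) / sd))       # about 0.63, Prob[q > 0.5] via the Normal
    print(beta.sf(0.500, a, b))                  # exact survival function, very close
    print(mu - 1.645 * sd, mu + 1.645 * sd)      # approximate 90% interval, about (0.488, 0.518)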
6.24. C. The posterior distribution of q is Beta with a = 5 + 1 = 6 and b = 3 + 6 - 1 = 8.
The posterior mean is: a/(a + b) = 6/(6 + 8) = 6/14 = 3/7.
6.25. D. The predictive distribution is Bernoulli, with q = 3/7.
Its variance is: (3/7)(1 - 3/7) = 12/49.

6.26. E. From the previous solution, the posterior distribution of q is a Beta Distribution with
a = 6 and b = 8. g(q) = {13!/(5! 7!)} q^(6-1) (1-q)^(8-1) = 10,296 q^5 (1-q)^7, 0 ≤ q ≤ 1.
Given q, the probability of 3 failures in three trials is: (1-q)^3.
f(0) = ∫0^1 (1-q)^3 10,296 q^5 (1-q)^7 dq = 10,296 ∫0^1 q^5 (1-q)^10 dq = 10,296 β(6, 11)
= (10,296)(5! 10!/16!) = 10,296/48,048 = 21.4%.
Alternately, in order to calculate the probability of 3 failures in 3 Bernoulli trials, proceed sequentially,
one trial at a time.
The posterior distribution of q is Beta with a = 6 and b = 8, with mean 6/(6 + 8) = 3/7.
The chance of a failure in the first trial is: 1 - 3/7 = 4/7.
Posterior to one trial with a failure, we get a Beta with a = 6 and b = 9.
The conditional probability of a failure in the second trial is: 9/(6 + 9) = 0.6.
Posterior to two trials each with a failure, we get a Beta with a = 6 and b = 10.
The conditional probability of a failure in the third trial is: 10/(6 + 10) = 5/8.
Therefore, the probability of 3 failures in 3 trials is: (4/7)(0.6)(5/8) = 21.4%.
Comment: Beyond what you are likely to be asked on your exam.
The predictive distribution is not a Binomial Distribution with m = 3 and q = 3/7,
with density at 0 of: (4/7)^3 = 0.187. Instead, the predictive distribution is a Beta-Binomial with
m = 3, a = 6, and b = 8. See Exercise 15.82 in Loss Models.
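Both approaches in the solution above can be checked in a few lines; this sketch assumes the posterior Beta(6, 8) and three future trials:

    from math import exp
    from scipy.special import betaln

    a, b = 6, 8
    print(exp(betaln(a, b + 3) - betaln(a, b)))   # E[(1-q)^3] = B(6,11)/B(6,8), about 0.214

    p = 1.0
    for j in range(3):                            # sequential: failure probability, updating b each time
        p *= (b + j) / (a + b + j)
    print(p)                                      # (4/7)(9/15)(10/16), also about 0.214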
6.27. (a) Let q be the longterm proportion of boys born.
Given q, the probability of the observation is proportional to: q^111 (1-q)^97.
For q = 48% this is: 1.1752 x 10^-63.
For q = 52% this is: 3.6034 x 10^-63.
For q = 56% this is: 2.9092 x 10^-63.
Thus the probability weights are: 1.1752/4, 3.6034/2, and 2.9092/4.
The posterior probabilities are: 10.4%, 63.8%, and 25.8%.
Thus the posterior chance that q > 52%, in other words that q = 56%, is 25.8%.

(b) Given q, the probability of the observation is proportional to: q^111 (1-q)^97.
Since q is uniform from 0.48 to 0.56, the probability weights are also proportional to: q^111 (1-q)^97.
Thus the desired probability that q > 52% is:
{∫0.52^0.56 q^111 (1-q)^97 dq} / {∫0.48^0.56 q^111 (1-q)^97 dq}.
Now q^111 (1-q)^97 is proportional to a Beta Distribution, with a = 112, b = 98, and θ = 1.
This Beta Distribution has a mean of: 112/(112 + 98) = 0.53333.
This Beta Distribution has a second moment of: (112)(113)/{(210)(211)} = 0.28562.
Thus this Beta Distribution has a variance of: 0.28562 - 0.53333^2 = 0.001179.
Ignore the constants in front of the density of the Beta, since they will cancel in the ratio of integrals
we want. Then the integral in the numerator is (proportional to) the difference between the Beta
Distribution at 0.56 and 0.52.
Use the Normal Approximation: Φ[(0.56 - 0.53333)/√0.001179] - Φ[(0.52 - 0.53333)/√0.001179] =
Φ[0.78] - Φ[-0.39] = 0.7823 - 0.3483 = 0.4340.
The integral in the denominator is (proportional to) the difference between the Beta Distribution at
0.56 and 0.48.
Use the Normal Approximation: Φ[(0.56 - 0.53333)/√0.001179] - Φ[(0.48 - 0.53333)/√0.001179] =
Φ[0.78] - Φ[-1.55] = 0.7823 - 0.0606 = 0.7217.
Thus the desired probability is: 0.4340/0.7217 = 60.1%.
(Using a computer, the exact answer is 60.02%.)

(c) For this Beta-Bernoulli, the posterior Beta has:
a' = 13 + 111 = 124, and b' = 12 + 97 = 109.
Thus the posterior probability that q > 0.52 is the survival function of this Beta at 0.52.
This Beta Distribution has a mean of: 124/(124 + 109) = 0.53219.
This Beta Distribution has a second moment of: (124)(125)/{(233)(234)} = 0.28429.
Thus this Beta Distribution has a variance of: 0.28429 - 0.53219^2 = 0.001064.
Use the Normal Approximation: 1 - Φ[(0.52 - 0.53219)/√0.001064] = 1 - Φ[-0.37] = Φ[0.37] = 0.6443.
(Using a computer, the exact answer is 64.62%.)


Comment: Notice the way that the posterior distribution depends on the prior distribution assumed.
All of the priors have a mean of 52%.
Similar to a situation analyzed by Pierre Simon Laplace.
Laplace developed the modern form of what is now called Bayes Theorem:
Prior times likelihood is proportional to the posterior.
Unlike in Laplace's day, currently in some countries the percentage of boys born differs
significantly from the natural rate, due to the use of ultrasound to choose abortions based on
the gender of the fetus.
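The "using a computer" figures quoted in parts (b) and (c) can be reproduced with the incomplete beta function; a sketch assuming the 111 boys and 97 girls observed:

    from scipy.stats import beta

    # (b) uniform prior on [0.48, 0.56]: posterior proportional to q^111 (1-q)^97 on that interval,
    # so P[q > 0.52] is a ratio of incomplete beta functions with a = 112, b = 98.
    num = beta.cdf(0.56, 112, 98) - beta.cdf(0.52, 112, 98)
    den = beta.cdf(0.56, 112, 98) - beta.cdf(0.48, 112, 98)
    print(num / den)                  # about 0.600

    # (c) Beta(13, 12) prior: the posterior is Beta(124, 109).
    print(beta.sf(0.52, 124, 109))    # about 0.646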
6.28. C. The prior distribution of q is a Beta Distribution with a = 135, b = 350, and θ = 1.
(a - 1 = 134 and b - 1 = 349.)
The mean of the (prior) Beta Distribution is: 135/(135 + 350) = 0.278.
The marginal distribution is Bernoulli with q = .278.
6.29. B. For the Beta-Bernoulli, the Buhlmann Credibility parameter = a + b = 485.
Therefore Z = 100 / (100+485) = 100/585.
The prior mean is a/(a+b) = 135/485 and the observation is 40/100.
Therefore the new estimate of q is: (100/585)(40/100) + (485/585)(135/485) = 175/585 = .299.
Thus over his next 100 attempts we expect (100)(.299) = 29.9 hits.
Alternately, the posterior Beta has parameters: 135 + 40 = 175 and 350 + 60 = 410.
The mean of the posterior Beta is 175 / (175+ 410) = 175/585 = .299.
(This is also the mean of the predictive Bernoulli distribution.)
Thus over his next 100 attempts we expect (100)(.299) = 29.9 hits.

6.30. C. The distribution of q prior to any observations is a Beta Distribution with a = 135 and
b = 350. For the Beta-Bernoulli, the Buhlmann Credibility parameter = a + b = 485.
Therefore Z = 200 / (200+485) = 200/685.
The prior mean is a/(a+b) = 135/485 and the observation is 45/200.
Therefore the new estimate of q is: (200/685)(45/200) + (485/685)(135/485) = 180/685 = 0.263.
Alternately, the posterior Beta has parameters: 135 + 45 = 180 and 350 + 155 = 505. The mean
of the posterior Beta is 180 / (180+ 505) = 180/685 = 0.263.
Alternately, one can start with the distribution of q after the first 100 attempts, which from a previous
solution is a Beta with parameters a = 175 and b = 410. Then using only the
observation of the second 100 attempts to update this (intermediate) Beta, gives a Beta posterior
to all the observations, with parameters 175 + 5 = 180 and 410 + 95 = 505. Then proceed as
before.
Comment: One can either update in two smaller steps or one big step.
6.31. B. The posterior Beta has a = 1 + 47 = 48 and b = 5 + 0.
The mean of the posterior Beta is: 48/(48 + 5) = 90.6%.
Comment: The mean of the prior Beta is: 1/(1 + 5) = 1/6. Based on the comic strip Peanuts.
In 1952, it might not have occurred to someone that Lucy could be so mean; even 1/6 would
have been a rather high estimate of the probability of someone doing something like this.
6.32. D. Beta-Bernoulli, with chance of success (for the tyrannosaur) of q.
The posterior Beta for Tom's q has: a = 2 + 0 = 2, and b = 150 + 300 = 450.
The number of future days Tom stays alive is the number of failures for the tyrannosaur prior to his
first success, which is Geometric with
β = (chance of failure for the tyrannosaur)/(chance of success for the tyrannosaur) = (1-q)/q = 1/q - 1.
Expected number of days alive = E[β] = E[1/q] - 1.
For Tom's posterior Beta Distribution, E[X^-1] = Γ(a + b) Γ(a - 1)/{Γ(a) Γ(a + b - 1)} =
(a + b - 1)/(a - 1) = (2 + 450 - 1)/(2 - 1) = 451. Therefore, E[β] = E[1/q] - 1 = 451 - 1 = 450 days.
Tom expects to find (60)(.5){(.6)(1) + (.2)(5) + (.2)(10)} = 108 worth of coins each day.
(450 days alive in the future)(108 per day) = 48,600.
Comment: Difficult! Beyond what you are likely to be asked on the exam.
One could add half a walk, 48,600 + 54 = 48,654, to take into account the probability that Tom
completes his walk on the day he is eaten.
Note, each scientist can be eaten only once in total. Therefore, this is not your typical Beta-Bernoulli.
However, one can update the Beta in the same manner as usual for a scientist who represents 300
failures and no successes for the tyrannosaur.

6.33. D. a = 15 + 40 = 55. b = 15 + (60 - 40) = 35.


Posterior mean is: a/(a + b) = 55/90 = 11/18.
So we expect the Durham Bulls to win (11/18)(102) = 62 of their remaining 102 games.
Expected total wins: 40 + 62 = 102.
Alternately, using Buhlmann Credibility, K = a + b = 15 + 15 = 30, Z = 60/(60 + K) = 2/3.
Prior mean = a/(a + b) = 15/30 = 1/2.
Future frequency = (2/3)(40/60) + (1/3)(1/2) = 11/18. Proceed as before.
6.34. C. The posterior distribution of q is Beta; a' = 15 + 40 = 55, b' = 15 + (60 - 40) = 35.
Posterior mean is: a'/(a' + b') = 55/90 = 0.6111.
2nd moment of the posterior Beta is: a'(a' + 1)/{(a' + b')(a' + b' + 1)} = (55)(56)/{(90)(91)} = 0.3761.
The total number of wins is 40 plus the additional wins in the remaining 102 games.
Thus the variance of the total number of wins is the variance of the remaining wins.
The number of remaining wins is Binomial with m = 102 and q.
Posterior EPV = E[102q(1-q)] = (102)(E[q] - E[q^2]) = (102)(0.6111 - 0.3761) = 24.0.
Posterior VHM = Var[102q] = (102^2)Var[q] = (102^2)(E[q^2] - E[q]^2) = (102^2)(0.3761 - 0.6111^2) =
27.6. Total Variance = EPV + VHM = 24.0 + 27.6 = 51.6.
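A sketch of the same EPV plus VHM computation, assuming the posterior Beta(55, 35) and 102 remaining games; small differences from the rounded figures above are just rounding:

    a, b, games = 55, 35, 102
    Eq = a / (a + b)
    Eq2 = a * (a + 1) / ((a + b) * (a + b + 1))   # second moment of the posterior Beta
    epv = games * (Eq - Eq2)                      # expected Binomial process variance
    vhm = games**2 * (Eq2 - Eq**2)                # variance of the hypothetical means, Var[102q]
    print(epv, vhm, epv + vhm)                    # roughly 24 + 27, total near 51-52, answer C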
6.35. A. K = a + b = 20. Z = 6/(6 + 20) = 3/13. Observed frequency of girls = 0/6 = 0.
A priori mean = a/(a + b) = 10/(10 + 10) = 1/2.
Estimate = (Z)(0) + (1 - Z)(1/2) = (10/13)(1/2) = 5/13 = 38.5%.
Alternately, a = a + r = 10 + 0 = 10. b = b + n - r = 10 + 6 = 16.
Mean of posterior Beta is: a/(a + b) = 10/(10 + 16) = 5/13 = 38.5%.
6.36. E., 6.37. D., 6.38. A. The posterior distribution of q is Beta with a = 1 and b = 19 + 4 = 23.
The estimate corresponding to the squared error loss function is the mean.
The mean of the posterior Beta is 1/(1 + 23) = 1/24 = 4.17%.
The estimate corresponding to the absolute loss function is the median.
The density of the posterior Beta is: f(q) = 23(1-q)^22, 0 < q < 1.
Therefore, the survival function is S(q) = (1-q)^23, 0 < q < 1.
Set 0.5 = (1-q)^23. q = 2.97%. The median of the posterior Beta is 2.97%.
The estimate corresponding to the zero-one loss function is the mode.
The density of the posterior Beta is: f(q) = 23(1-q)^22, 0 < q < 1.
This is a decreasing function of q, and thus the mode is at q = 0.
Comment: Loss functions are discussed in Mahler's Guide to Buhlmann Credibility.
The zero-one loss function is kind of silly for actuarial work.
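The three point estimates can be confirmed numerically; a sketch assuming the posterior Beta(1, 23):

    from scipy.stats import beta
    a, b = 1, 23
    print(beta.mean(a, b))      # 1/24, about 0.0417 (squared-error loss)
    print(beta.median(a, b))    # about 0.0297 (absolute-error loss)
    # zero-one loss: the density 23(1-q)^22 is decreasing, so the mode is q = 0.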

6.39. The mean of the mixture for a single game is: 0.6.
For a given value of q, for a single game, the second moment of the Bernoulli is: (1-q)q + q^2 = q.
Therefore, the second moment of the mixture is: E[q] = 0.6.
Therefore, the variance of the mixture is: 0.6 - 0.6^2 = 0.24.
The variance of the number of games won during a season is: (0.24)(162) = 38.88.
The standard deviation of the number of games won during a season is 6.24.
Alternately, for a single game EPV = E[q(1-q)] = E[q] - E[q^2],
and VHM = Var[q] = E[q^2] - E[q]^2.
For a single game, the (total) variance is: EPV + VHM = E[q] - E[q]^2 = 0.6 - 0.6^2 = 0.24.
The variance of the number of games won during a season is: (0.24)(162) = 38.88.
The standard deviation of the number of games won during a season is 6.24.
Comment: The usual assumption for a Beta-Binomial would be that for a given team q is the same
throughout the year. That is not the case here.
Note that in this case, the answer does not depend on the distribution of q.
6.40. This uniform is a Beta with a = 1 and b = 1. This is a Beta-Bernoulli with a = 1 and b = 1.
a' = 1 + 0 = 1, and b' = 1 + n = n + 1.
Posterior mean is: a'/(a' + b') = 1/(n + 2).
6.41. This uniform is a Beta with a = 1 and b = 1. This is a Beta-Bernoulli with a = 1 and b = 1.
a' = 1 + n = n + 1, and b' = 1 + 0 = 1.
Posterior mean is: a'/(a' + b') = (n + 1)/(n + 2).
6.42. B. The posterior probability that H = 1/4 is: 0.2/0.3 = 2/3.
The posterior probability that H = 1/2 is: 0.1/0.3 = 1/3.

H        A Priori Chance       Chance of the    Prob. Weight =          Posterior Chance
         of This Value of H    Observation      Product of Columns      of This Value of H
1/4      0.800                 0.2500           0.2000                  66.7%
1/2      0.200                 0.5000           0.1000                  33.3%
Overall                                         0.3000                  1.000

Comment: A mixture of Bernoulli Distributions.

6.43. B. Assuming a given value of θ, the chance of observing one success in two trials is: 2θ(1-θ).
The prior distribution of θ is: g(θ) = 1, 0 ≤ θ ≤ 1. By Bayes Theorem, the posterior distribution of θ is
proportional to the product of the chance of the observation and the prior distribution: θ(1-θ). Thus
the posterior distribution of θ is proportional to θ - θ^2. The integral of θ - θ^2 from 0 to 1 is 1/2 - 1/3 =
1/6. We must divide by 1/6 in order to have the integral of the posterior distribution equal to unity.
Thus the posterior distribution of θ is 6(θ - θ^2).
The posterior chance of θ in [0.45, 0.55] is:
∫0.45^0.55 6(θ - θ^2) dθ = [3θ^2 - 2θ^3], evaluated from θ = 0.45 to θ = 0.55, = 0.3 - 0.1505 = 0.1495.
Comment: This is an example of a Bayesian interval estimate. A Beta-Binomial conjugate prior
situation, since the uniform distribution is a Beta distribution with a = 1 and b = 1.
The posterior distribution is Beta(2, 2; θ) = {3!/(1! 1!)} θ^(2-1) (1-θ)^(2-1) = 6(θ - θ^2).
6.44. B. Mean of the Beta is: a/(a+b) = a/(a+b). 1/10 = a/(a + b). b = 9a.
Mode = 1/20 = (a - 1)/(a + b - 2). b = 19a - 18. a = 1.8. b = 16.2.
a = 1.8 + h, b = 16.2 + 5 - h = 21.2 - h, and the posterior mean is: (1.8 + h)/23.

19/115 = (1.8 + h)/23. h = 2.


6.45. B. The given Beta Distribution has parameters: a = 4 and b = 5. The posterior distribution is
also a Beta Distribution, but with new first parameter, a, equal to the prior first parameter, a, plus the
number of claims observed = 4 + 2 = 6, and new second parameter, b, equal to the prior second
parameter, b, plus the number of trials minus the number of claims observed = 5 + 3 - 2 = 6.
The mean of a Beta distribution is a /(a + b).
Thus the mean of the posterior distribution is: 6 / (6 + 6) = 0.5.
6.46. C. This is a Beta-Bernoulli conjugate prior, with prior Beta Distribution with a = 2 and b = 2.
For this situation, the Buhlmann Credibility parameter K = a + b = 4.
For one exposure period: Z = 1 / (1 + 4) = 1/5.

6.47. E. Since we have a Bernoulli process with either zero or one claim per exposure period, the
chance of having one claim in the next exposure period is the posterior mean. Since this is a
conjugate prior situation with a member of a linear exponential family, the Bayes Analysis result is
equal to the Buhlmann credibility result. From the previous question K = 4.
Therefore, Z = 12 / (12 + 4) = 75%. The observed mean is 3/12 = .25.
The prior mean is the mean of the (prior) Beta = a / (a + b) = 2 / 4 = .5.
Thus the new estimate = (75%)(.25) + (25%)(.5) = .3125 = 5/16.
Alternately, for the Beta-Bernoulli, the parameters of the posterior Beta are:
posterior a = prior a + number of claims observed = 2 + 3 = 5, and
posterior b = prior b + (number of exposures - number of claims observed) = 2 + 9 = 11.
Mean of the posterior Beta = ( posterior a ) / ( posterior a + posterior b ) = 5 / (5 + 11 ) = 5/16.
6.48. A. The prior distribution is a Beta with parameters a = 2 and b = 2.
For the Beta-Bernoulli the (posterior) predictive distribution is a Bernoulli, with
q = ( a + number of claims observed) / (a + b + number of trials observed)
= (2 + number of claims observed) / (2 + 2 + 2) = (2 + number of claims observed) / 6.
For the first insured, which had two claims in two trials, q = 4/6 = 2/3.
For the second insured, with no claims in two trials, q = 2/6 = 1/3.
Thus posterior to the observations, we have two insureds each with Bernoulli distributions, one with
q = 2/3 and one with q = 1/3. The chance that they each have a claim is: (2/3)(1/3) = 2/9.
6.49. D. For the Beta-Bernoulli, the Buhlmann Credibility Parameter is K = a + b = 2 + 2 = 4.
For N exposure periods, Z = N / (N + 4). Setting Z = .75 and solving for N, N = 12.

6.50. D. The mode of the Beta is where the density is largest.
f(x) is proportional to: (x/θ)^(a-1) (1 - x/θ)^(b-1), 0 ≤ x ≤ θ. Setting the derivative of f(x) equal to 0:
0 = (a - 1)(x/θ)^(a-2) (1 - x/θ)^(b-1)/θ - (b - 1)(x/θ)^(a-1) (1 - x/θ)^(b-2)/θ.
(a - 1)(1 - x/θ) = (b - 1)(x/θ). x = θ(a - 1)/(a + b - 2).
Checking against the endpoints, this is the maximum of the density when a > 1 and b > 1.
The mode is: θ(a - 1)/(a + b - 2) = (a - 1)/(a + b - 2), for a > 1 and b > 1.
The mode of prior distribution I is 1/7. 1/7 = (5 - 1)/(5 + b1 - 2). b1 = 25. b2 = 25.
Mean of the Beta is: aθ/(a + b) = a/(a + b).
The mean of prior distribution II is 6/11. 6/11 = a2/(a2 + 25). a2 = 30.
After 15 successes in n trials, using prior distribution I,
a' = 5 + 15 = 20, b' = 25 + n - 15 = 10 + n, and the posterior mean is: 20/(30 + n).
After 15 successes in n trials, using prior distribution II,
a' = 30 + 15 = 45, b' = 25 + n - 15 = 10 + n, and the posterior mean is: 45/(55 + n).
The mean of posterior distribution I is 16/31 of the mean of posterior distribution II.
20/(30 + n) = (16/31) 45/(55 + n). 31n + 1705 = 36n + 1080. n = 125.
6.51. D. Given x, the chance of observing three successes out of three trials is x^3.
The a priori density function is f(x) = 1, 0 < x < 1.
The posterior probability is proportional to the product of the chance of the observation and the a
priori probability: (x^3)(1) = x^3. The integral from zero to one of x^3 is 1/4.
In order to get a density function we must divide x^3 by this integral; therefore the posterior density is
4x^3.
The mean of the posterior density is the integral from 0 to 1 of (x)(4x^3).
The integral from 0 to 1 of 4x^4 is: 4/5 = 0.80.
Alternately, the uniform distribution is a Beta distribution with a = 1 and b = 1.
For a Beta-Bernoulli, the posterior mean is:
(a + number of successes)/(a + b + number of trials) = (1 + 3)/(1 + 1 + 3) = 4/5 = 0.80.

6.52. D. Risk 1 follows a Uniform Distribution, a Beta-Bernoulli, with a = 1 and b = 1.
Thus Risk 1 has Buhlmann Credibility Parameter of: a + b = 2.
Risk 2 follows a Gamma-Poisson with α = 1 and Gamma scale parameter 1/θ.
Thus Risk 2 has Buhlmann Credibility Parameter of: 1/(1/θ) = θ.
If for an equal number of exposures more credibility is assigned to Risk 2 than Risk 1, then the
Buhlmann Credibility Parameter for Risk 2 is less than that for Risk 1. In other words, θ < 2.
Alternately, one can work out the two Buhlmann Credibility Parameters.
For Risk 1, the EPV is E[q - q^2] = 1/2 - 1/3 = 1/6.
For Risk 1, the VHM is VAR[q] = 1/12. Thus for Risk 1, K = (1/6)/(1/12) = 2.
For Risk 2, the EPV is E[λ] = 1/θ.
Using the hint, the VHM is VAR[λ] = E[λ^2] - E[λ]^2 = 2/θ^2 - 1/θ^2 = 1/θ^2.
Thus for Risk 2, K = (1/θ)/(1/θ^2) = θ.
Comment: This question is a combination of two simpler questions asking you to compute the
credibility for each of two different (conjugate prior) situations.
Note that since Z = N/(N + K), a smaller value of K corresponds to a larger Z.
6.53. D. For a Bernoulli the mean is q and the process variance is q(1-q) = q - q2 .
Thus the Expected Value of the Process Variance is: E[q] - E[q2 ].
Thus E[q2 ] = E[q] - EPV.
We are given that: average of the hypothetical means = E[q] = .3 and EPV = .1.
Thus E[q^2] = 0.3 - 0.1 = 0.2.
Now VHM = VAR[q] = E[q^2] - E[q]^2 = 0.2 - 0.3^2 = 0.2 - 0.09 = 0.11.

6.54. E. Since q is uniform from 0 to s, E[q] = s/2, and Var[q] = s^2/12.
E[q^2] = s^2/12 + (s/2)^2 = s^2/3.
For fixed q, the process variance of the Bernoulli is q(1-q).
EPV = E[q(1-q)] = E[q] - E[q^2] = s/2 - s^2/3.
The hypothetical mean of the Bernoulli is q.
VHM = Var[q] = s^2/12.
Thus the Buhlmann Credibility Parameter K = EPV/VHM = (s/2 - s^2/3)/(s^2/12) = (6 - 4s)/s = 2(3 - 2s)/s.
Alternately, the density of q is: f(q) = 1/s for 0 ≤ q ≤ s.
Thus the Expected Value of the Process Variance is:
∫0^s q(1-q) f(q) dq = (1/s) ∫0^s (q - q^2) dq = (1/s)(s^2/2 - s^3/3) = s/2 - s^2/3.
For fixed q, the mean of the Bernoulli is q. The overall mean is thus s/2.
Thus the Variance of the Hypothetical Means is:
∫0^s (q - s/2)^2 f(q) dq = (1/s) ∫0^s (q - s/2)^2 dq = (1/s)(q - s/2)^3/3, evaluated from q = 0 to q = s,
= s^2/24 - (-s^2/24) = s^2/12.
Proceed as before.
Alternately, this uniform distribution is a Beta with a = 1, b = 1, and θ = s.
Thus this is a Beta-Bernoulli, with θ < 1.
K = (a + b){(a + b + 1)/θ - (a + 1)}/b = 2(3/s - 2)/1 = 2(3 - 2s)/s.
Comment: q is uniform on [0, s], with the value of s fixed but unknown. Some might find it easier to
select a value of s such as 0.4 (not 1), compute K, and then see which letter solution could be right.
If s = 1, then we would have a uniform distribution on [0, 1]; this is a special case of a Beta-Bernoulli
with a = 1, b = 1, (and θ = 1). For this case, K = a + b = 2.
Only choices A and E approach this result as s approaches 1.

6.55. D. Given q, the chance of the observation is q^n. Since the a priori distribution of q is uniform
on [0, 1], the posterior distribution of q is proportional to q^n.
∫0^1 q^n dq = q^(n+1)/(n+1), evaluated from 0 to 1, = 1/(n+1).
Thus the posterior distribution of q is: q^n/{1/(n+1)} = (n+1) q^n. Thus the mean of the posterior
distribution is:
∫0^1 q (n+1) q^n dq = (n+1) q^(n+2)/(n+2), evaluated from 0 to 1, = (n+1)/(n+2).
Setting the posterior mean equal to the given 0.95, one solves for n.
0.95 = (n+1)/(n+2). 0.95n + 1.9 = n + 1. 0.9 = 0.05n. n = 18.
Alternately, this is a Beta-Bernoulli, with a = 1 and b = 1. The posterior Beta has parameters:
1 + n and 1 + (n - n). This posterior Beta, with parameters n + 1 and 1, has a mean of
(n+1)/{(n+1) + 1} = (n+1)/(n+2). Then proceed as above to solve for n.
Alternately, this is a Beta-Bernoulli, with a = 1 and b = 1. Thus the result of Bayesian Analysis is
equal to that of Buhlmann Credibility. K = a + b = 2. Z = n/(n + 2). Observation = 1.
Prior mean = 1/2. New estimate = (1)(n/(n + 2)) + (1/2)(2/(n + 2)) = (n + 1)/(n + 2).
Then proceed as above to solve for n.
Comment: Backwards. Given an output, solve for a missing input.
6.56. B. We are given that the mean of the Beta Distribution is 1/3, so a/(a+b) = 1/3 or b = 2a. For
comparing values of f(q) between values of q (for fixed a and b) we don't care about the constant in
front. f(q) is proportional to: q^(a-1) (1-q)^(b-1) = q^(a-1) (1-q)^(2a-1). We can maximize this by maximizing its
log: (a-1)ln(q) + (2a-1)ln(1-q). Taking the derivative of this log, and setting it equal to zero:
(a-1)/q - (2a-1)/(1-q) = 0. (2a - 1)q = (a - 1) - (a - 1)q. q = (a - 1)/(3a - 2).
For the Beta-Bernoulli, the Buhlmann Credibility Parameter K = a + b = 3a.
For small credibility, K is large, so a is large. As a approaches infinity, the value at which f(q) is
largest, (a - 1)/(3a - 2), approaches 1/3.
Comment: As the credibility gets infinitesimal, the prior density has its mode at its mean of 1/3.
As a goes to infinity, the variance of the prior density goes to zero, and the probability is
concentrated at the mean.

6.57. C. We are given that ab/{(a+b+1)(a+b)^2} = 1/45.
Also from the solution to the previous question we have b = 2a.
Thus a(2a)/{(a+2a+1)(a+2a)^2} = 2a^2/{(3a+1)(9a^2)} = 2/{9(3a+1)} = 1/45.
Thus 1/(3a+1) = 1/10. Thus a = 3 and b = 6.
The Buhlmann Credibility parameter K = a + b = 9. Thus for 9 observations,
Z = 9/(9 + 9) = 50%. The prior mean is 1/3. The observation is 4/9.
Thus the new estimate is: (4/9)(1/2) + (1/3)(1 - 1/2) = 4/18 + 3/18 = 7/18.
Alternately, the posterior Beta has parameters: 3 + 4 = 7 and 6 + (9 - 4) = 11.
Thus the mean of the posterior Beta is 7/(7 + 11) = 7/18.
Comment: Gives you two of the intermediate results and asks you to solve for a and b.
Check: a/(a+b) = 3/(3+6) = 1/3. ab/{(a+b+1)(a+b)^2} = (3)(6)/{(10)(9^2)} = 1/45.
6.58. A. Assume one observes C claims in Y years. Then the chance of the observation is
proportional to: p^C (1-p)^(Y-C). The a priori density of p is 1, 0 ≤ p ≤ 1.
Thus by Bayes Theorem, the posterior density of p is proportional to: p^C (1-p)^(Y-C), 0 ≤ p ≤ 1.
Therefore, the mean of the posterior density is:
∫0^1 p p^C (1-p)^(Y-C) dp / ∫0^1 p^C (1-p)^(Y-C) dp = β(C+2, Y+1-C)/β(C+1, Y+1-C) =
{Γ(C+2) Γ(Y+1-C)/Γ(Y+3)} / {Γ(C+1) Γ(Y+1-C)/Γ(Y+2)} = (C + 1)/(Y + 2).
Of the combinations given, the posterior mean is 1/5 for C = 0 and Y = 3.
Alternately, this is a special case of the Beta-Bernoulli Conjugate Prior, with a = 1 and b = 1.
a' = a + C = C + 1. b' = b + Y - C = Y + 1 - C.
The mean of the posterior Beta is: a'/(a' + b') = (C + 1)/(Y + 2). Proceed as before.
Alternately, since for the Beta-Bernoulli Conjugate Prior, Buhlmann Credibility gives the same result
as Bayes Analysis, one could use Buhlmann Credibility with K = a + b = 2. Z = Y/(Y + 2).
Posterior estimate = (C/Y) Y/(Y+2) + (1/2) 2/(Y+2) = (C + 1)/(Y + 2).

Section 7, Beta-Binomial41
The Beta Distribution is also a conjugate prior to the Binomial Distribution with m fixed.
Since a Binomial is a series of m independent Bernoulli trials, one can apply similar ideas to a
Beta-Binomial as to a Beta-Bernoulli.
Exercise: The number of claims in a year from an individual policyholder is Binomial with parameters
m = 10 and q. The prior distribution of q is Beta with a = 3, b = 6, and θ = 1.
An individual policyholder has 4 claims the first year, and 5 claims the second year.
What is the expected number of claims from this policyholder in the third year?
[Solution: We observed two years with the equivalent of 10 Bernoulli trials each;
we observed (2)(10) = 20 Bernoulli trials, with 4 + 5 = 9 claims.
a' = a + r = 3 + 9 = 12, and b' = b + n - r = 6 + 20 - 9 = 17.
Mean of the posterior Beta Distribution of q is: a'/(a' + b') = 12/(12 + 17) = 0.414.
Expected future annual frequency is: mq = (10)(0.414) = 4.14.]
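The updating in this exercise amounts to two lines of arithmetic; a sketch assuming m = 10, the Beta(3, 6) prior, and 9 claims observed in 2 years (20 Bernoulli trials):

    m, years = 10, 2
    a, b = 3, 6
    claims = 4 + 5
    a_post = a + claims                          # a' = a + number of claims
    b_post = b + m * years - claims              # b' = b + m(number of years) - number of claims
    q_hat = a_post / (a_post + b_post)
    print(a_post, b_post, round(m * q_hat, 2))   # 12, 17, about 4.14 expected claims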
The Beta Distribution is a conjugate prior to the Binomial Distribution for fixed m.
If for fixed m, the q parameter of the Binomial is distributed over a portfolio by a Beta, then the
posterior distribution of q parameters is also given by a Beta with parameters:
a' = a + number of claims.
b' = b + m (number of years) - number of claims.
Exercise: The number of claims in a year from an individual policyholder is Binomial with parameters
m = 10 and q. The prior distribution of q is Beta with a = 3, b = 6, and θ = 1.
An individual policyholder has 4 claims the first year, and 5 claims the second year.
What is the posterior distribution of q?
[Solution: The posterior distribution is also Beta, with a' = a + number of claims = 3 + 4 + 5 = 12, and
b' = b + m(number of years) - number of claims = 6 + (10)(2) - (4 + 5) = 17.
Comment: Mean of the posterior Beta Distribution of q is: a'/(a' + b') = 12/(12 + 17) = 0.414.
Expected future annual frequency is: mq = (10)(0.414) = 4.14.]
This is the same result as obtained previously by thinking of the Binomial as the sum of 10
independent Bernoullis. The two approaches are mathematically equivalent. Use whichever
approach you prefer.
The number of claims per year is Binomial. We observe several years. This is mathematically the
same as if we had independent Bernoulli trials, each with mean q. The distribution of q is Beta.
This is mathematically equivalent to a Beta-Bernoulli. Therefore, the estimates from Buhlmann
Credibility and Bayes Analysis are equal.
41 For m = 1, we get the special case of the Beta-Bernoulli.

Translate from Binomial Land to Bernoulli Land.
Update for Observations.
Translate back to Binomial Land.
This trick works to get the posterior mean.
It does not work to get the predictive distribution.
As will be discussed, the mixed distribution is a Beta-Binomial.
Alternately, one can use the following updating formulas for the Beta-Binomial case, in order to get
the posterior Beta:
a' = a + number of claims.
b' = b + m (number of years) - number of claims.
It turns out that for the Beta-Binomial: K = (a + b)/m, with Bayes = Buhlmann.

Beta-Binomial Frequency Process
Beta is a Conjugate Prior for the Binomial Likelihood.
Binomial with m fixed is a Member of a Linear Exponential Family.
Buhlmann Credibility Estimate = Bayes Analysis Estimate.
Buhlmann Credibility Parameter, K = (a + b)/m.
[Diagram: a Beta Prior on the parameters (a, b), mixed over the Binomial process, gives the Beta-Binomial marginal distribution of the number of claims. Observations of C claims in Y years update the prior to a Beta Posterior with a' = a + C and b' = b + mY - C, which, mixed over the Binomial process, gives the Beta-Binomial predictive distribution of the number of claims.]

Beta-Binomial, Prior Expected Value of the Process Variance:
Since the frequency for each risk is Binomial, the process variance for an individual risk is:
mq(1-q) = mq - mq^2.
Therefore the expected value of the process variance = m E[q] - m E[q^2].
E[q] = the mean of the Beta Distribution = a/(a + b).
E[q^2] = the second moment of the Beta Distribution = a(a + 1)/{(a + b)(a + b + 1)}.
EPV = m a/(a + b) - m a(a + 1)/{(a + b)(a + b + 1)} = m ab/{(a + b)(a + b + 1)}.

Beta-Binomial, Prior Variance of the Hypothetical Means:
Variance of the Prior Beta Distribution = a(a + 1)/{(a + b)(a + b + 1)} - a^2/(a + b)^2 = ab/{(a + b)^2 (a + b + 1)}.
Since the frequency for each risk is Binomial, each hypothetical mean is mq.
VHM = Var[mq] = m^2 Var[q] = m^2 ab/{(a + b)^2 (a + b + 1)}.

Beta-Binomial, Buhlmann Credibility:
K = EPV/VHM = (a + b)/m.42
As with the Beta-Bernoulli, Buhlmann = Bayes.
EPV + VHM = m ab/{(a + b)(a + b + 1)} + m^2 ab/{(a + b)^2 (a + b + 1)}
= m ab(m + a + b)/{(a + b)^2 (a + b + 1)} = Variance of the Mixed Beta-Binomial.

42 For m = 1, this reduces to the Beta-Bernoulli.
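A short numerical sketch of these formulas, using the m = 10, a = 3, b = 6 example from the start of this section (the specific numbers are only an illustration):

    m, a, b = 10, 3, 6
    epv = m * a * b / ((a + b) * (a + b + 1))          # m ab / {(a+b)(a+b+1)}
    vhm = m**2 * a * b / ((a + b)**2 * (a + b + 1))    # m^2 ab / {(a+b)^2 (a+b+1)}
    print(epv, vhm, epv / vhm, (a + b) / m)            # K = EPV/VHM = (a + b)/m = 0.9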

Mixed Beta-Binomial Distribution:


In the case of a Beta-Binomial, the mixed distribution is not a Binomial. Therefore, while one can use
the idea that a Binomial is a sum of independent Bernoulli trials to get the mean future frequency, one
can not use this idea in general to predict the probability of the insured having a certain number of
claims in the future. Rather, one has to perform the mixing.
The predictive distribution is a mixture of Binomial Distributions via the posterior Beta.
For example, in the previous exercise, the posterior Beta for this policyholder has a = 12,
b = 17, and θ = 1. Therefore, the posterior density of q is:
g(q) = {28! / (11! 16!)} q^11 (1-q)^16 = 365,061,060 q^11 (1-q)^16, 0 ≤ q ≤ 1.

Given q, the probability of 6 claims next year is the density at 6 for a Binomial with m = 10:
{10!/(6! 4!)} q^6 (1-q)^4 = 210 q^6 (1-q)^4.
Therefore, mixing over q, the probability of 6 claims next year for this policyholder is:
∫_0^1 210 q^6 (1-q)^4 365,061,060 q^11 (1-q)^16 dq = 76,662,822,600 ∫_0^1 q^17 (1-q)^20 dq.

This is a Beta type integral discussed in the previous section.
∫_0^1 x^(a-1) (1-x)^(b-1) dx = (a-1)! (b-1)! / (a+b-1)! = Γ(a) Γ(b) / Γ(a+b) = β(a, b).

Thus, with a = 18 and b = 21, this integral is: 17! 20! / 38! = 1 / 604,404,010,980.
Therefore, mixing over q, the probability of 6 claims next year for this policyholder is:
76,662,822,600 / 604,404,010,980 = 12.68%.
This is not the same as the density at 6 of a Binomial with m = 10 and q = posterior mean = 12/29:
{10!/(6! 4!)} (12/29)^6 (17/29)^4 = 210 (12/29)^6 (17/29)^4 = 12.45%.
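As a numerical check of this mixing calculation, the following Python sketch is my own addition (the helper log_beta and the use of math.lgamma are assumptions of the sketch, not anything from the guide):

import math

def log_beta(a, b):
    """Log of the complete Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

m, a, b = 10, 12, 17      # posterior Beta(12, 17); Binomial with m = 10
x = 6                     # number of claims next year

# Mixed (Beta-Binomial) density: the Binomial density at x mixed over the posterior Beta.
log_f6 = (math.log(math.comb(m, x))
          + log_beta(a + x, b + m - x) - log_beta(a, b))
print(round(math.exp(log_f6), 4))     # 0.1268

# Not the same as plugging the posterior mean q = 12/29 into a Binomial:
q = a / (a + b)
plug_in = math.comb(m, x) * q**x * (1 - q)**(m - x)
print(round(plug_in, 4))              # 0.1245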

This mixed distribution is an example of what is sometimes called a Beta-Binomial Distribution.43


f(x) = β(a + x, b + m - x) / {(m+1) β(a, b) β(x + 1, m + 1 - x)}, x = 0, 1, ... , m.44
Exercise: In this example, what is the probability that the insured has zero claims next year?
[Solution: In this example, the predictive distribution is a Beta-Binomial with m = 10, a = 12, and
b = 17. f(0) = β(12+0, 17+10-0) / {(10+1) β(12, 17) β(0+1, 10+1-0)} = β(12, 27) / {(11) β(12, 17) β(1, 11)}.
β(12, 27) = Γ(12) Γ(27) / Γ(39) = 11! 26! / 38!.
β(12, 17) = Γ(12) Γ(17) / Γ(29) = 11! 16! / 28!.
β(1, 11) = Γ(1) Γ(11) / Γ(12) = (1) 10! / 11! = 1/11.
f(0) = {11! 26!/38!} / {(11) {11! 16!/28!} / 11} = (26!/16!) (28!/38!)
= {(17)(18) ... (26)} / {(29)(30) ... (38)} = 1.12%.
Alternately, given q, the probability of 0 claims next year is the density at 0 for a Binomial with
m = 10: (1-q)^10. The posterior distribution of q is Beta with a = 12, b = 17, and θ = 1:
g(q) = {28!/(11! 16!)} q^11 (1-q)^16 = 365,061,060 q^11 (1-q)^16, 0 ≤ q ≤ 1.
f(0) = ∫_0^1 (1-q)^10 365,061,060 q^11 (1-q)^16 dq = 365,061,060 ∫_0^1 q^11 (1-q)^26 dq
= 365,061,060 β(12, 27) = 365,061,060 (11! 26! / 38!) = 365,061,060 / 32,489,701,776 = 1.12%.

Alternately, treat this as 10 separate Bernoulli trials each with no claim.


The distribution of q prior to next year is Beta with a = 12, and b = 17 with mean 12/29.
The chance of no claims in the first trial is 17/29.
Posterior to one trial with no claim, we get a Beta with a = 12 and b = 18.
The conditional probability of no claims in the second trial is: 18/(12 + 18) = 18/30.
Posterior to two trials with no claim, we get a Beta with a = 12 and b = 19.
The conditional probability of no claims in the third trial is: 19/(12 + 19) = 19/31.
Proceeding in this manner, the probability of no claims in all 10 trials is:
(17/29)(18/30)(19/31)(20/32)(21/33)(22/34)(23/35)(24/36)(25/37)(26/38) = 1.12%.
Comment: The final sequential technique only works when one is trying to get the probability of
having either no claims in the future, or the maximum number of claims in the future, in this case 10
per year. If one instead was calculating the probability of 6 claims next year, one would not know
which of the 10 trials had a claim and which did not.]
43 See Exercise 15.82 in Loss Models. This distribution is also sometimes called a Binomial-Beta,
Negative Hypergeometric, or Polya-Eggenberger Distribution. It has three parameters: m, a, and b.
See for example Kendall's Advanced Theory of Statistics by Stuart and Ord.
44 β(a, b) = Γ(a) Γ(b) / Γ(a + b).

In this example, the predictive distribution is a Beta-Binomial with m = 10, a = 12, and b = 17, with
densities at 0 to 10 of: 0.0112362, 0.0518594, 0.121351, 0.188768, 0.215442, 0.188022,
0.126840, 0.0652322, 0.0244621, 0.00604002, 0.00074612.45
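These eleven densities can be reproduced with a short Python sketch of my own; it also illustrates the sequential Bernoulli trick from the comment above for the two endpoints (0 claims and 10 claims). The helper log_beta and the loop structure are my assumptions, not anything from the guide:

import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

m, a, b = 10, 12, 17

# Closed-form Beta-Binomial densities at 0 and at m (comb(m, 0) = comb(m, m) = 1).
f0 = math.exp(log_beta(a, b + m) - log_beta(a, b))
fm = math.exp(log_beta(a + m, b) - log_beta(a, b))
print(round(f0, 7), round(fm, 7))    # 0.0112362  0.0007461

# Sequential Bernoulli trick: probability of no claims in all 10 trials.
prob_none, bb = 1.0, b
for _ in range(m):
    prob_none *= bb / (a + bb)       # chance of no claim, then b increases by 1
    bb += 1
print(round(prob_none, 7))           # 0.0112362

# Sequential trick for a claim in all 10 trials.
prob_all, aa = 1.0, a
for _ in range(m):
    prob_all *= aa / (aa + b)        # chance of a claim, then a increases by 1
    aa += 1
print(round(prob_all, 7))            # 0.0007461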
Variance of the Mixed Beta-Binomial Distribution:46
The number of claims in a year from an individual policyholder is Binomial with parameters m = 10
and q. The prior distribution of q is Beta with a = 3, b = 6, and θ = 1.
Exercise: Determine the mean, second moment, and variance of this prior Beta distribution.
[Solution: E[q] = Mean of the Beta = a / (a + b) = 3/9 = 1/3.
E[q^2] = Second Moment of the Beta = a(a + 1) / {(a + b)(a + b + 1)} = (3)(4) / {(9)(10)} = 2/15.
Var[q] = Variance of the Beta = 2/15 - (1/3)^2 = 1/45.]
Then there are two different techniques one could use to determine the variance of the marginal
(prior mixed) distribution.
Process Variance of the Binomial = 10q(1 - q) = 10q - 10q^2.
EPV = E[10q - 10q^2] = 10 E[q] - 10 E[q^2] = (10)(1/3 - 2/15) = 2.
Hypothetical Mean of the Binomial = 10q.
VHM = Var[10q] = 10^2 Var[q] = (100)(1/45) = 2.222.
Variance of the marginal (mixed) distribution is: EPV + VHM = 2 + 2.222 = 4.22.
Alternately, the mean of the mixture is the mixture of the means:47
E[10q] = 10 E[q] = 10/3 = 3.333.
The second moment of the mixture is the mixture of the second moments:48
E[10q(1-q) + (10q)^2] = 10 E[q] + (9)(10) E[q^2] = 10/3 + (90)(2/15) = 15.333.
Variance of the mixture is: 15.333 - 3.333^2 = 4.22.
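Both techniques, as well as the closed-form Beta-Binomial variance quoted below, can be verified with a few lines of Python (my own sketch, not from the guide):

a, b, m = 3.0, 6.0, 10

# Moments of the prior Beta(3, 6).
Eq   = a / (a + b)                              # 1/3
Eq2  = a * (a + 1) / ((a + b) * (a + b + 1))    # 2/15
Varq = Eq2 - Eq**2                              # 1/45

# Technique 1: EPV + VHM.
epv = m * Eq - m * Eq2          # E[ mq(1-q) ]
vhm = m**2 * Varq               # Var[ mq ]
print(epv, vhm, epv + vhm)      # 2.0  2.222...  4.222...

# Technique 2: mix the first two moments, then take the variance.
mean   = m * Eq
second = m * Eq + m * (m - 1) * Eq2
print(second - mean**2)         # 4.222...

# Closed form for the Beta-Binomial variance.
print(m * a * b * (m + a + b) / ((a + b)**2 * (a + b + 1)))   # 4.222...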

In general, the Beta-Binomial has mean: m


45

a b (m+ a + b) 49
a
, and variance: m
.
(a + b)2 (a+ b +1)
a + b

These densities are not equal to those of a Binomial with m = 10 and q = posterior mean =12/29:
0.00479192, 0.0338253, 0.107445, 0.202249, 0.249838, 0.211627, 0.124487, 0.0502131, 0.0132917,
0.00208497, 0.000147174.
46
See 4, 5/07, Q.3.
47
The mean of the Binomial is 10q.
48
The second moment of the Binomial is: variance of the Binomial + (mean of the Binomial)2 .
49
With = 1 for the Beta. When m = 1, the Beta-Binomial is just a Bernoulli Distribution.

The Beta-Binomial has the same mean as for a Binomial with m and q = a/(a+b).
However, for m > 1, this is a larger variance than that of this Binomial: m {a/(a + b)} {b/(a + b)}.
For this example, the prior distribution of q is Beta with a = 3, b = 6, and θ = 1. m = 10.
The mean of the mixed Beta-Binomial is: m a/(a + b) = (10) 3/(3 + 6) = 3.333,
and the variance is: m ab (m + a + b) / {(a + b)^2 (a + b + 1)} = (10)(3)(6)(10 + 3 + 6) / {(3 + 6)^2 (3 + 6 + 1)} = 4.22,
matching the previous results.


If an individual policyholder has 4 claims the first year, and 5 claims the second year, then as
discussed previously, the posterior distribution is also Beta, with
a′ = a + number of claims = 3 + 4 + 5 = 12, and
b′ = b + m(number of years) - number of claims = 6 + (10)(2) - (4 + 5) = 17.
Exercise: Determine the mean, second moment, and variance of this posterior Beta distribution.
[Solution: E[q] = Mean of the Beta = a / (a + b) = 12/29 = 0.4138.
E[q^2] = Second Moment of the Beta = a(a + 1) / {(a + b)(a + b + 1)} = (12)(13) / {(29)(30)} = 0.1793.
Var[q] = Variance of the Beta = 0.1793 - 0.4138^2 = 0.00807.]
Exercise: Determine the variance of the predictive (posterior mixed) distribution.
[Solution: Process Variance of the Binomial = 10q(1 - q) = 10q - 10q^2.
Posterior EPV = E[10q - 10q^2] = 10 E[q] - 10 E[q^2] = (10)(0.4138 - 0.1793) = 2.345.
Hypothetical Mean of the Binomial = 10q.
Posterior VHM = Var[10q] = 10^2 Var[q] = (100)(0.00807) = 0.807.
Variance of the predictive (mixed) distribution is: EPV + VHM = 2.345 + 0.807 = 3.152.
Alternately, the mean of the mixture is the mixture of the means:
E[10q] = 10 E[q] = (10)(0.4138) = 4.138.
The second moment of the mixture is the mixture of the second moments:
E[10q(1-q) + (10q)^2] = 10 E[q] + (9)(10) E[q^2] = (10)(0.4138) + (90)(0.1793) = 20.275.
Variance of the mixture is: 20.275 - 4.138^2 = 3.152.
Comment: Parallel to the calculation of the variance of the marginal (prior mixed) distribution, except
using the posterior Beta rather than the prior Beta.]

Problems:
7.1 (3 points) The number of claims in a year from an individual policyholder is Binomial with
parameters m = 3 and q. The prior distribution of q is Beta with a = 2, b = 4, and θ = 1.
An individual policyholder has 1 claim the first year, 2 claims the second year, and 1 claim the third
year. What is the expected number of claims from this policyholder in the fourth year?
A. 1
B. 6/5
C. 4/3
D. 3/2
E. 8/5
7.2 (2 points) The annual number of claims is Binomial with m = 4 and q.
π(q) = 60 q^2 (1-q)^3, 0 < q < 1.
You observe 8 claims in 3 years.
Compare the estimate based on Buhlmann Credibility to that from Bayes Analysis.
7.3 (2 points) You are given:
(i) For Q = q, X1 , X2 ,..., Xm are independent, identically distributed Bernoulli random variables with
parameter q.
(ii) The prior distribution of Q is beta with a = 20, b = 28, and θ = 1.
(iii) 12 claims are observed.
Determine the smallest value of m such that the mean of the posterior distribution of Q is less than or
equal to 0.25.
(A) 80
(B) 90
(C) 100
(D) 110
(E) 120
7.4 (3 points) You are given:
(i) The number of claims for each policyholder has a binomial distribution with parameters m = 10
and q.
(ii) The prior distribution of q is beta with parameters a = 4, b = unknown, and θ = 1.
(iii) A randomly selected policyholder had the following claims experience:
Year    Number of Claims
1       3
2       2
3       y
(iv) The Bayesian credibility estimate for the expected number of claims in Year 3 based
on the Year 1 and Year 2 experience is 3.3333.
(v) The Bayesian credibility estimate for the expected number of claims in Year 4 based
on the Year 1, Year 2 and Year 3 experience is 3.7838.
Determine y.
(A) 4
(B) 5
(C) 6
(D) 7
(E) 8

Use the following information for the next five questions:


The number of claims in a year from an individual policyholder is Binomial
with parameters m = 5 and q.
The prior distribution of q is: (q) = 60q3 (1- q)2 , 0 q 1.
7.5 (1 point) What is the mean of the marginal distribution?
A. 2.7
B. 2.9
C. 3.1
D. 3.3
E. 3.5
7.6 (2 points) What is the variance of the marginal distribution?
A. less than 1.2
B. at least 1.2 but less than 1.4
C. at least 1.4 but less than 1.6
D. at least 1.6 but less than 1.8
E. at least 1.8
7.7 (2 points) An individual policyholder has 4 claims in a year.
What is the expected future annual claim frequency for this policyholder?
A. 3.3
B. 3.5
C. 3.7
D. 3.9
E. 4.1
7.8 (2 points) An individual policyholder has 4 claims in a year.
What is the variance of the predictive distribution?
A. less than 1.2
B. at least 1.2 but less than 1.4
C. at least 1.4 but less than 1.6
D. at least 1.6 but less than 1.8
E. at least 1.8
7.9 (3 points) An individual policyholder has 3 claims in a year.
Use the Bayes estimate that corresponds to the zero-one loss function,
in order to predict this insured's frequency next year.
A. 2.8
B. 2.9
C. 3.0
D. 3.1
E. 3.2

7.10 (3 points) You are given:


(i) Conditional on Q = q, the random variables X1, X2, ..., Xm are independent
and follow a Bernoulli distribution with parameter q.
(ii) Sm = X1 + X2 + ... + Xm.
(iii) The distribution of Q is beta with a = 7, b = 3, and θ = 0.6.
Determine the variance of the marginal distribution of S40.
(A) 12

(B) 14

(C) 16

(D) 18

(E) 20

Use the following information for the next 7 questions:


Assume that given an inherent claim frequency q, the number of claims observed for one risk in m
trials is given by a Binomial distribution with mean mq and variance mq(1-q).
Also assume that the parameter q varies between 0 and 1 for the different risks, with q following a
Beta distribution with mean a/(a + b) and variance ab / {(a + b)^2 (a + b + 1)}.
7.11 (2 points) You observe d claims in m trials for an individual insured.
Which of the following is the expected value of the posterior distribution of q for this insured?
A. (a+m) / (a+b+d)
B. (b+m) / (a+b+d)
C. (a+d) / (a+b+m)
D. (b+d) / (a+b+m)
E. None of the above
7.12 (3 points) What is the Expected Value of the Process Variance (for a single trial)?
A. ab / {(a + b)(a + b + 1)}
B. ab / (a + b)
C. ab / (a + b)^2
D. ab / {(a + b)^2 (a + b + 1)}
E. None of the above


7.13 (1 point) What is the Variance of the Hypothetical Means (for a single trial)?
A. ab / {(a + b)(a + b + 1)}
B. ab / (a + b)
C. ab / (a + b)^2
D. ab / {(a + b)^2 (a + b + 1)}
E. None of the above


7.14 (4, 5/83, Q.41a) (1 point)
What is the Buhlmann Credibility assigned to the observation of m trials?
A. m / (m+a)
B. m / (m+b)
C. m / (m+a+b)
D. m / (m+a+ab+b)
E. None of the above

7.15 (1 point) You observe d claims in m trials for an individual insured.
Using Buhlmann Credibility, what is the estimated future claim frequency for this insured?
A. (a+m) / (a+b+d)
B. (b+m) / (a+b+d)
C. (a+d) / (a+b+m)
D. (b+d) / (a+b+m)
E. None of the above
7.16 (4 points) Let d be such that 0 ≤ d ≤ m.
What is the probability of observing d claims in m trials for an individual insured?
Let the complete Beta Function be defined as β(r, s) = Γ(r) Γ(s) / Γ(r+s).
A. β(a + d, b) / {(m+1) β(a, b + m - d) β(d + 1, m + 1 - d)}
B. β(a + d, b) / {(m+1) β(a + m - d, b) β(d, m)}
C. β(a + d, b + m - d) / {(m+1) β(a + m, b + d) β(d, m)}
D. β(a + d, b + m - d) / {(m+1) β(a, b) β(d + 1, m + 1 - d)}
E. None of the above.


7.17 (2 points) If a = 2 and b = 4, then what is the probability of observing 5 claims in 7 trials for an
individual insured?
A. 6.8%
B. 7.0%
C. 7.2%
D. 7.4%
E. 7.6%

Use the following information for the next two questions:


(i) Conditional on Q = q, the random variables X1, X2, ..., Xm are independent and follow a
Bernoulli distribution with parameter q.
(ii) Sm = X1 + X2 + ... + Xm.
(iii) The distribution of Q is beta with a = 3, b = 11, and θ = 1.
7.18 (3 points) Determine the variance of the marginal distribution of S50.
(A) 34

(B) 36

(C) 38

(D) 40

(E) 42

7.19 (2 points) We observe that S50 = 7.


Determine the variance of the predictive distribution for X51.
(A) 0.10

(B) 0.13

(C) 0.16

(D) 0.19

(E) 0.22

Use the following information for the next three questions:

Frequency follows a binomial distribution with parameters 4 and q.


The prior distribution of q is: 6(q - q^2), 0 ≤ q ≤ 1.
During the next year there is one claim.
7.20 (2 points) Find the posterior distribution of q.
7.21 (1 point) Estimate the future annual frequency for this insured.
A. 1.5
B. 1.6
C. 1.7
D. 1.8
E. 1.9
7.22 (2 points) Find the posterior probability that q is more than 0.5.
A. 23%
B. 25%
C. 27%
D. 29%
E. 31%

7.23 (4, 5/86, Q.47) (1 point) The beta distribution is a conjugate prior distribution to the binomial
distribution. Explain briefly what is meant by this.
Use the following information for the next two questions:
Assume that the number of claims, r, made by an individual insured in one year follows a binomial
distribution:
p(r) = {3!/(r! (3-r)!)} θ^r (1 - θ)^(3-r), r = 0, 1, 2, 3.
Also assume that the parameter, θ, has the following p.d.f.:
g(θ) = 6(θ - θ^2), 0 < θ < 1.
7.24 (4, 5/91, Q.32) (2 points) Given an observation of one claim in a one year period,
what is the posterior distribution of θ?
A. 30 θ^2 (1 - θ)^2
B. 10 θ^2 (1 - θ)^3
C. 6 θ^2 (1 - θ)^2
D. 60 θ^2 (1 - θ)^3
E. 105 θ^2 (1 - θ)^4
7.25 (4, 5/91, Q.33) (3 points) What is the Buhlmann credibility assigned to a single observation?
A. 3/8
B. 3/7
C. 1/2
D. 3/5
E. 3/4

7.26 (165, 5/91, Q.12) (1.9 points) A Bayesian process is being used to estimate the mortality
rate at age x. You are given the following:
(i) The mortality rate is assumed to be a random variable with a Beta distribution with θ = 1.
(ii) The mean of the prior distribution is 0.20.
(iii) 10 lives were observed.
(iv) 1 life died before attaining age x + 1.
(v) The number of deaths has a binomial distribution.
(vi) You have twice as much confidence in the prior mean as in the observed mortality rate.
(vii) The mode of any Beta distribution is (a - 1) / (a + b - 2).
Determine the mode of the posterior distribution.
(A) 1/13
(B) 1/9
(C) 1/7
(D) 1/6

(E) 1/5

Use the following information for the next two questions:


The number of claims for an individual risk in one year follows the Binomial Distribution with
parameters m = 5 and q. The parameter q has a prior distribution in the form of a beta:
f(q) = 60 q^3 (1-q)^2, 0 ≤ q ≤ 1. No claims occurred in the first year.
7.27 (4B, 5/93, Q.4) (1 point) The posterior distribution of q is proportional to which of the
following?
A. q^3 (1-q)^2
B. q^3 (1-q)^7
C. q^8 (1-q)^2
D. q^7 (1-q)^3
E. q^2 (1-q)^8

7.28 (4B, 5/93, Q.5) (3 points) Determine the Buhlmann credibility estimate for the expected
number of claims in the second year.
A. Less than 1.40
B. At least 1.40 but less than 1.50
C. At least 1.50 but less than 1.60
D. At least 1.60 but less than 1.70
E. At least 1.70
7.29 (4B, 5/93, Q.29) (2 points) You are given the following:
The distribution for number of claims is binomial with parameters q and m = 1.
The prior distribution of q has mean = 0.25 and variance = 0.07.
Determine the Buhlmann credibility to be assigned to a single observation of one risk.
A. Less than 0.20
B. At least 0.20 but less than 0.25
C. At least 0.25 but less than 0.30
D. At least 0.30 but less than 0.35
E. At least 0.35

7.30 (4, 11/03, Q.7 & 2009 Sample Q.5) (2.5 points) You are given:
(i) The annual number of claims for a policyholder has a binomial distribution with
probability function:
p(x | q) = {2!/(x! (2-x)!)} q^x (1-q)^(2-x), x = 0, 1, 2.
(ii) The prior distribution is: π(q) = 4q^3, 0 < q < 1.
This policyholder had one claim in each of Years 1 and 2.
Determine the Bayesian estimate of the number of claims in Year 3.
(A) Less than 1.1
(B) At least 1.1, but less than 1.3
(C) At least 1.3, but less than 1.5
(D) At least 1.5, but less than 1.7
(E) At least 1.7
7.31 (5 points) In the previous question, 4, 11/03, Q.7,
estimate the probability of having 0 claims in Year 3,
the probability of having 1 claim in Year 3,
and the probability of having 2 claims in Year 3.

7.32 (4, 11/04, Q.1 & 2009 Sample Q.133) (2.5 points) You are given:
(i) The annual number of claims for an insured has probability function:
p(x) = {3!/(x! (3-x)!)} q^x (1 - q)^(3-x), x = 0, 1, 2, 3.
(ii) The prior density is π(q) = 2q, 0 < q < 1.
A randomly chosen insured has zero claims in Year 1.
Using Buhlmann credibility, estimate the number of claims in Year 2 for the selected insured.
(A) 0.33
(B) 0.50
(C) 1.00
(D) 1.33
(E) 1.50
7.33 (4, 11/06, Q.9 & 2009 Sample Q.253) (2.9 points) You are given:
(i) For Q = q, X1 , X2 ,..., Xm are independent, identically distributed Bernoulli random variables with
parameter q.
(ii) Sm = X1 + X2 +...+ Xm
(iii) The prior distribution of Q is beta with a = 1, b = 99, and θ = 1.
Determine the smallest value of m such that the mean of the marginal distribution of Sm is greater
than or equal to 50.
(A) 1082
(B) 2164

(C) 3246

(D) 4950

(E) 5000

7.34 (4, 11/06, Q.29 & 2009 Sample Q.272) (2.9 points) You are given:
(i) The number of claims made by an individual in any given year has a binomial distribution
with parameters m = 4 and q.
(ii) The prior distribution of q has probability density function
π(q) = 6q(1 - q), 0 < q < 1.
(iii) Two claims are made in a given year.
Determine the mode of the posterior distribution of q.
(A) 0.17
(B) 0.33
(C) 0.50
(D) 0.67

(E) 0.83

7.35 (4, 5/07, Q.3) (2.5 points) You are given:


(i) Conditional on Q = q, the random variables X1, X2, ..., Xm are independent
and follow a Bernoulli distribution with parameter q.
(ii) Sm = X1 + X2 + ... + Xm.
(iii) The distribution of Q is beta with a = 1, b = 99, and θ = 1.
Determine the variance of the marginal distribution of S101.
(A) 1.00

(B) 1.99

(C) 9.09

(D) 18.18

(E) 25.25

7.36 (4, 5/07, Q.15) (2.5 points) You are given:


(i) The number of claims for each policyholder has a binomial distribution with parameters m = 8
and q.
(ii) The prior distribution of q is beta with parameters a (unknown), b = 9, and θ = 1.
(iii) A randomly selected policyholder had the following claims experience:
Year    Number of Claims
1       2
2       k
(iv) The Bayesian credibility estimate for the expected number of claims in Year 2 based
on the Year 1 experience is 2.54545.
(v) The Bayesian credibility estimate for the expected number of claims in Year 3 based
on the Year 1 and Year 2 experience is 3.73333.
Determine k.
(A) 4
(B) 5
(C) 6
(D) 7
(E) 8

Solutions to Problems:
7.1. B. The prior distribution of q is proportional to: q(1-q)^3. The chance of the observation is
proportional to: {q(1-q)^2} {q^2(1-q)} {q(1-q)^2}. Thus the posterior distribution of q is proportional to:
q(1-q)^3 {q(1-q)^2}{q^2(1-q)}{q(1-q)^2} = q^5 (1-q)^8. Therefore, the posterior distribution of q is a Beta
with a = 6, b = 9, and θ = 1, with mean 6/(6 + 9) = 0.4.
The expected future annual frequency is: (0.4)(3) = 1.2.
Alternately, each year is a Binomial with m = 3, the sum of three independent Bernoullis.
Thus three years is the sum of 9 independent Bernoullis, each with the same q.
There are a total of 4 claims in 9 Bernoulli trials so:
a′ = a + r = 2 + 4 = 6, and b′ = b + n - r = 4 + 9 - 4 = 9.
Posterior mean of q is: 6/(6 + 9). Expected future annual frequency is: (6/15)(3) = 1.2.
7.2. The number of claims per year is Binomial with m = 4 and q. We observe 3 years.
This is mathematically the same as if we had (4)(3) independent Bernoulli trials, each with mean q.
The distribution of q is Beta with a = 3 and b = 4.
Therefore, this is mathematically equivalent to a Beta-Bernoulli.
Therefore, the estimates from Buhlmann Credibility and Bayes Analysis are equal.
Alternately, the number of claims over 3 years is Binomial with m = 12 and q.
The posterior distribution of q is proportional to:
f(8 | q) π(q) ∝ q^8 (1-q)^4 q^2 (1-q)^3 = q^10 (1-q)^7.
Therefore, the posterior distribution of q is Beta with a = 11 and b = 8.
E[X | q] = 4q. Bayes Analysis estimate = E[4q] = 4E[q] = 4(11/19) = 44/19.
Var[X | q] = 4q(1 - q) = 4q - 4q^2. The prior distribution of q is Beta with a = 3 and b = 4.
E[q] = 3/7. E[q^2] = (3)(4)/{(7)(8)} = 3/14.
(Prior) EPV = E[4q - 4q^2] = 4(3/7) - (4)(3/14) = 6/7.
(Prior) VHM = Var[4q] = 16 Var[q] = (16){3/14 - (3/7)^2} = 24/49.
K = EPV/VHM = (6/7)/(24/49) = 7/4. Z = 3/(3+K) = 12/19.
A priori mean = E[4q] = (4)(3/7) = 12/7.
Buhlmann Credibility estimate = (12/19)(8/3) + (7/19)(12/7) = 44/19.
Therefore, the estimates from Buhlmann Credibility and Bayes Analysis are equal.
7.3. A. The Posterior Beta has parameters: a′ = 20 + 12 = 32, and b′ = 28 + m - 12 = 16 + m.
The mean of the Posterior Beta is: a′/(a′ + b′) = 32/(48 + m) ≤ 0.25. ⇒ m ≥ 80.
Comment: Similar to 4, 11/06, Q.9.

7.4. B. A Beta-Binomial; Bayes Analysis equals Buhlmann Credibility in this case.
Based on the first two years of data: a′ = a + 3 + 2 = 9, and b′ = b + (10 - 3) + (10 - 2) = b + 15.
Therefore, the Bayesian credibility estimate for the future frequency is
a′/(a′ + b′) = 9/(9 + b + 15) = 9/(b + 24). The estimate of the number of claims is 10 times that:
3.3333 = 90/(b + 24). ⇒ b = 3.
Based on the first three years of data: a′ = a + 3 + 2 + y = 9 + y,
and b′ = b + (10 - 3) + (10 - 2) + (10 - y) = 28 - y.
The estimate of the number of claims is:
3.7838 = 10(9 + y)/(9 + y + 28 - y) = 10(9 + y)/37. ⇒ y = 5.
Comment: Similar to 4, 5/07, Q.15. Given the usual outputs, solve for missing inputs.

7.5. B. & 7.6. E. The prior distribution is Beta with a = 4, b = 3, and θ = 1,
with mean 4/(4 + 3) = 4/7, second moment (4)(5)/{(7)(8)} = 5/14,
and variance 5/14 - (4/7)^2 = 3/98.
The mean number of claims is E[5q] = 5 E[q] = 5(4/7) = 2.857.
Prior to observations, the EPV = E[mq(1-q)] = 5E[q] - 5E[q^2] = (5)(4/7) - (5)(5/14) = 15/14.
Prior to observations, the VHM = Var[5q] = 25 Var[q] = (25)(3/98) = 75/98.
Variance of the marginal distribution = Prior EPV + Prior VHM = 15/14 + 75/98 = 90/49 = 1.837.
Comment: The mixed distribution is not a Binomial; it is a Beta-Binomial. Similar to 4, 5/07 Q.3.
7.7. A. The prior distribution of q is proportional to: q^3 (1-q)^2.
The chance of the observation is proportional to: q^4 (1-q).
Thus the posterior distribution of q is proportional to: q^3 (1-q)^2 q^4 (1-q) = q^7 (1-q)^3.
Therefore, the posterior distribution of q is a Beta with a = 8, b = 4, and θ = 1,
with mean 8/(8 + 4) = 2/3.
The expected future annual frequency is: (2/3)(5) = 10/3.
Alternately, a year is a Binomial with m = 5, the sum of five independent Bernoullis.
The prior distribution is Beta with a = 4, b = 3, and θ = 1.
There are a total of 4 claims in 5 Bernoulli trials so:
a′ = a + r = 4 + 4 = 8, and b′ = b + n - r = 3 + 5 - 4 = 4. Proceed as before.

7.8. C. The posterior distribution of q is a Beta with a = 8, b = 4, and θ = 1,
with mean 8/(8 + 4) = 2/3, second moment (8)(9)/{(12)(13)} = 0.4615,
and variance: 0.4615 - (2/3)^2 = 0.01709.
Posterior to observations, the EPV = E[mq(1-q)] = 5E[q] - 5E[q^2] = (5)(2/3) - (5)(0.4615) = 1.026.
Posterior to observations, the VHM = Var[5q] = 25 Var[q] = (25)(0.01709) = 0.427.
Variance of the predictive distribution = Posterior EPV + Posterior VHM = 1.026 + 0.427 = 1.453.
Comment: The mixed distribution is not a Binomial; it is a Beta-Binomial.
Similar to 4, 5/07 Q.3, which deals with the marginal distribution prior to observations, rather than the
predictive distribution posterior to observations.
7.9. C. The prior distribution of q is proportional to: q^3 (1-q)^2.
The chance of the observation is proportional to: q^3 (1-q)^2.
Thus the posterior distribution of q is proportional to: q^3 (1-q)^2 q^3 (1-q)^2 = q^6 (1-q)^4.
Therefore, the posterior distribution of q is a Beta with a = 7, b = 5, and θ = 1.
The zero-one loss function corresponds to the mode. We wish to maximize this density.
Setting the derivative with respect to q equal to zero: 6q^5 (1-q)^4 - 4q^6 (1-q)^3 = 0.
⇒ 6(1-q) = 4q. ⇒ q = 0.6. Thus the estimated frequency is: (5)(0.6) = 3.0.
Comment: In general, the mode of a Beta Distribution is: (a - 1)/(a + b - 2), for a > 1 and b > 1.

7.10. E. E[q] = Mean of the Beta = θ a/(a + b) = (0.6)(7/10) = 0.42.
E[q^2] = Second Moment of the Beta = θ^2 a(a + 1) / {(a + b)(a + b + 1)} = (0.6^2)(56/110) = 0.183273.
Var[q] = Variance of the Beta = 0.183273 - 0.42^2 = 0.006873.
Process Variance = 40q(1 - q) = 40q - 40q^2.
EPV = 40 E[q] - 40 E[q^2] = (40)(0.42) - (40)(0.183273) = 9.469.
Hypothetical Mean = 40q.
VHM = 40^2 Var[q] = (1600)(0.006873) = 10.996.
Variance of the marginal (mixed) distribution is: EPV + VHM = 9.469 + 10.996 = 20.465.
Alternately, E[q] = Mean of the Beta = θ a/(a + b) = (0.6)(7/10) = 0.42.
The mean of the mixture is the mixture of the means: E[40q] = 40 E[q] = (40)(0.42) = 16.80.
E[q^2] = Second Moment of the Beta = θ^2 a(a + 1) / {(a + b)(a + b + 1)} = (0.6^2)(56/110) = 0.183273.
The second moment of the mixture is the mixture of the second moments:
E[40q(1-q) + (40q)^2] = 40 E[q] + (1560) E[q^2] = (40)(0.42) + (1560)(0.183273) = 302.705.
Variance of the mixture is: 302.705 - 16.80^2 = 20.465.
Comment: The mixed distribution is not a Binomial.
Here, θ < 1. For use in the Beta-Binomial conjugate prior situation, θ = 1.
7.11. C. For the Beta-Bernoulli, the posterior distribution is Beta with new parameters equal to
(a + # claims observed) and (b + # trials - # claims observed) = (a+d) and (b+m-d).
The mean of this posterior Beta is: (a+d)/{(a+d)+(b+m-d)} = (a+d) / (a+b+m).
7.12. B. The process variance for an individual risk (for one trial) is q(1-q) = q - q^2, since the
frequency for each risk is Bernoulli. EPV = E[q] - E[q^2] = mean of Beta - second moment of Beta =
a/(a+b) - a(a+1)/{(a+b)(a+b+1)} = ab / {(a+b)(a+b+1)}.
7.13. D. The variance of the hypothetical means (for one trial) is the variance of q = Var[q] =
Variance of the Prior Beta = a(a+1)/{(a+b)(a+b+1)} - a^2/(a+b)^2 = ab / {(a+b)^2 (a+b+1)}.
7.14. C. K = the (prior) expected value of the process variance / the (prior) variance of the
hypothetical means = {ab/((a+b)(a+b+1))} / {ab/((a+b)^2 (a+b+1))} = a + b.
Z = N/(N + K) = m/(m + a + b).

7.15. C. The prior mean frequency is the mean of the prior Beta Distribution, which is a/(a+b).
The observation is d/m. From the prior solution, Z = m/(m+a+b).
Thus the estimate using Buhlmann Credibility is:
{m/(a+b+m)}(d/m) + {1 - m/(a+b+m)}{a/(a+b)} = d/(a+b+m) + a/(a+b+m) = (a + d) / (a + b + m).
7.16. D. The probability density of q is a Beta Distribution with parameters a and b:
{(a+b-1)! / ((a-1)! (b-1)!)} q^(a-1) (1-q)^(b-1) = {Γ(a+b) / (Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1).
One can compute the unconditional density at d via integration:
f(d) = ∫_0^1 f(d | q) {Γ(a+b) / (Γ(a) Γ(b))} q^(a-1) (1-q)^(b-1) dq
= ∫_0^1 {Γ(a+b) / (Γ(a) Γ(b))} {m! / (d! (m-d)!)} q^d (1-q)^(m-d) q^(a-1) (1-q)^(b-1) dq
= {1/β(a,b)} {Γ(m+1) / (Γ(d+1) Γ(m+1-d))} ∫_0^1 q^(a+d-1) (1-q)^(b+m-d-1) dq
= {1/β(a,b)} {Γ(m+2) / ((m+1) Γ(d+1) Γ(m+1-d))} {Γ(a+d) Γ(b+m-d) / Γ(a+b+m)}
= {1/β(a,b)} {1/(m+1)} {1/β(d+1, m+1-d)} β(a+d, b+m-d)
= β(a+d, b+m-d) / {(m+1) β(a,b) β(d+1, m+1-d)}.
Comment: This (prior) marginal distribution is sometimes called a Binomial-Beta, Negative
Hypergeometric, or Polya-Eggenberger Distribution. See for example Kendall's Advanced Theory
of Statistics by Stuart and Ord. It has three parameters: m, a and b. It has mean ma/(a+b) and
variance: abm(m+a+b) / {(a+b+1)(a+b)^2}.

7.17. E. Using the solution to the previous question:
f(d) = β(a+d, b+m-d) / {(m+1) β(a,b) β(d+1, m+1-d)}. Here d = 5, a = 2, b = 4, and m = 7.
f(5) = β(2+5, 4+7-5) / {(7+1) β(2,4) β(5+1, 7+1-5)} = β(7, 6) / {(8) β(2, 4) β(6, 3)}.
β(7, 6) = Γ(7) Γ(6) / Γ(13) = (6!)(5!)/(12!) = 1/5544. β(2, 4) = Γ(2) Γ(4) / Γ(6) = (1!)(3!)/(5!) = 1/20.
β(6, 3) = Γ(6) Γ(3) / Γ(9) = (5!)(2!)/(8!) = 1/168.
Therefore, f(5) = β(7, 6) / {(8) β(2, 4) β(6, 3)} = {(20)(168)} / {(8)(5544)} = 5/66 = 0.07576.
Comment: One can also compute the solution by doing integrals similar to those in the solution to
the previous question. The probability of observing other numbers of claims in 7 trials is as follows:
d    f(d)      F(d)
0    0.15152   0.15152
1    0.21212   0.36364
2    0.21212   0.57576
3    0.17677   0.75253
4    0.12626   0.87879
5    0.07576   0.95455
6    0.03535   0.98990
7    0.01010   1.00000
This is an example of the Binomial-Beta distribution with: a = 2, b = 4, and m = 7.
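The table above can be regenerated with a short Python sketch (my own addition; the helper log_beta is an assumption of the sketch, not anything from the guide):

import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

a, b, m = 2, 4, 7
cum = 0.0
for d in range(m + 1):
    # Binomial-Beta density: C(m, d) Beta(a+d, b+m-d) / Beta(a, b).
    f = math.exp(math.log(math.comb(m, d))
                 + log_beta(a + d, b + m - d) - log_beta(a, b))
    cum += f
    print(d, round(f, 5), round(cum, 5))
# d = 5 gives f(5) = 0.07576 = 5/66, and the cumulative probability reaches 1.00000 at d = 7.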


7.18. B. E[q] = Mean of the Beta = a/(a + b) = 3/14 = 0.2143.
E[q^2] = Second Moment of the Beta = a(a + 1)/{(a + b)(a + b + 1)} = (3)(4)/{(14)(15)} = 0.05714.
Var[q] = Variance of the Beta = 12/{(14)(15)} - (3/14)^2 = 0.01122.
Process Variance = 50q(1 - q) = 50q - 50q^2.
EPV = 50 E[q] - 50 E[q^2] = (50)(0.2143 - 0.0571) = 7.86.
Hypothetical Mean = 50q. VHM = 50^2 Var[q] = (2500)(0.01122) = 28.05.
Variance of the marginal (mixed) distribution is: EPV + VHM = 7.86 + 28.05 = 35.9.
Alternately, E[q] = Mean of the Beta = a/(a + b) = 3/14 = 0.2143.
The mean of the mixture is the mixture of the means: E[50q] = 50 E[q] = (50)(0.2143) = 10.715.
E[q^2] = Second Moment of the Beta = a(a + 1)/{(a + b)(a + b + 1)} = (3)(4)/{(14)(15)} = 0.05714.
The second moment of the mixture is the mixture of the second moments:
E[50q(1-q) + (50q)^2] = 50 E[q] + (49)(50) E[q^2] = (50)(0.2143) + (49)(50)(0.05714) = 150.7.
Variance of the mixture is: 150.7 - 10.715^2 = 35.9.
Comment: Similar to 4, 5/07, Q.3. The mixed distribution is not a Binomial; it is a Beta-Binomial.
7.19. B. The posterior distribution is Beta with a = 3 + 7 = 10 and b = 11 + (50 - 7) = 54.
This Beta has mean: 10/(10 + 54) = 0.156.
Therefore, the predictive distribution is Bernoulli, with q = 0.156.
Its variance is: (0.156)(1 - 0.156) = 0.132.

7.20. The chance of the observation given q is: 4q(1-q)^3.
Thus the posterior distribution is proportional to: (q - q^2) q (1-q)^3 = q^2 (1-q)^4, 0 ≤ q ≤ 1.
This is a Beta Distribution with a = 3, b = 5, and θ = 1.
The constant in front is: Γ(3 + 5) / {Γ(3) Γ(5)} = 7! / {(2!)(4!)} = 105.
The posterior distribution of q is: 105 q^2 (1-q)^4, 0 ≤ q ≤ 1.
Alternately, the prior distribution of q is Beta with a = 2, b = 2, and θ = 1.
Thus this is a Beta-Binomial, and the posterior distribution is Beta with parameters:
a′ = 2 + 1 = 3, b′ = 2 + (4)(1) - 1 = 5, and θ = 1.
The posterior distribution of q is: 105 q^2 (1-q)^4, 0 ≤ q ≤ 1.
7.21. A. The mean of the posterior Beta is: 3 / (3 + 5) = 3/8.
Thus the estimated future frequency is: (4)(3/8) = 1.5.
Alternately, K = (a+b)/m = (2 + 2)/4 = 1.
Z = 1/(1+K) = 1/2.
Prior mean is: m a / (a +b) = 4 (2)/(2 + 2) = 2.
Thus the estimated future frequency is: (1/2)(1) + (1 - 1/2)(2) = 1.5.
7.22. A. Let x = 1 - q. Then, 105 q^2 (1-q)^4 = 105 (1 - x)^2 x^4 = 105 (x^4 - 2x^5 + x^6).
q > 0.5. ⇔ x < 0.5.
105 ∫_0^0.5 (x^4 - 2x^5 + x^6) dx = (105) {0.5^5/5 - (2)(0.5^6/6) + 0.5^7/7} = 22.66%.

7.23. If the prior distribution is a Beta Distribution and the likelihood is a Binomial Distribution, then
the posterior distribution is also a Beta Distribution.
7.24. D. By Bayes Theorem, the posterior distribution of θ is proportional to p(1 | θ) g(θ) =
3θ(1-θ)^2 6(θ - θ^2) = 18 θ^2 (1-θ)^3. Thus the posterior distribution is a Beta Distribution with parameters:
a = 3 and b = 4. The constant in front is: Γ(3 + 4) / {Γ(3) Γ(4)} = 6! / {2! 3!} = 60.
Thus the posterior distribution is 60 θ^2 (1-θ)^3.
Comment: The prior distribution is a Beta with a = 2 and b = 2. We observe 1 claim in 3 trials. The
posterior 1st parameter, a′, is the prior 1st parameter, a, + # claims = 2 + 1 = 3. The posterior 2nd
parameter, b′, is the prior 2nd parameter, b, + # trials - # claims = 2 + 3 - 1 = 4.

7.25. B. For the Beta-Bernoulli, the Buhlmann Credibility parameter is a + b (the sum of the
parameters of the Prior Beta), which in this case is 2 + 2 = 4. In this case one observation is equal
to 3 Bernoulli Trials (m = 3 in the Binomial.) Therefore, Z = N / (N + K) = 3/(3 + 4) = 3/7.
Alternately, the process variance for an individual risk is: 3θ(1-θ) = 3θ - 3θ^2, since the frequency for
each risk is Binomial with m = 3. Therefore the expected value of the process variance =
the expected value of 3θ - the expected value of 3θ^2 = 3E[θ] - 3E[θ^2].
E[θ] = the mean of the Beta β(a, b; θ): a/(a+b) = 2/(2+2) = 0.5. E[θ^2] = second moment of Beta =
a(a+1)/{(a+b)(a+b+1)} = (2)(3)/{(4)(5)} = 0.3. Therefore the EPV = 3(0.5 - 0.3) = 0.6.
The Variance of the Hypothetical Means is the variance of 3θ = 9 Var[θ] =
9 (Variance of the Prior Beta) = 9(0.3 - 0.5^2) = 0.45. K = EPV / VHM = 0.6 / 0.45 = 4/3. In this case, one
observation is one draw from the Binomial, therefore N = 1. Z = 1/(1 + 4/3) = 3/7.
Comment: Note the care that must be taken over the meaning of N. One must be consistent with
the definition of an exposure in the calculation of the variances. The two solutions given here use
different such definitions, but still produce the same answer for the Credibility.
7.26. C. Mean of the Beta is: θa/(a+b) = a/(a+b). 0.2 = a/(a + b). ⇒ b = 4a.
You have twice as much confidence in the prior mean as in the observed mortality rate.
⇒ Z = 1/3. Z = 10/(10 + K) = 1/3. ⇒ K = 20. For the Beta-Bernoulli, K = a + b.
⇒ a + b = 20. 5a = 20. ⇒ a = 4. b = 16. a′ = a + 1 = 5. b′ = b + 10 - 1 = 25.
Posterior Mode = (a′ - 1)/(a′ + b′ - 2) = 4/28 = 1/7.
Alternately, the posterior estimate is: (2/3)(prior mean) + (1/3)(observed mortality) =
(2/3)(1/5) + (1/3)(1/10) = 1/6.
But the posterior estimate is the mean of the posterior Beta distribution:
a′/(a′ + b′) = (a + 1)/(a + 1 + b + 9) = (a + 1)/(a + b + 10) = (a + 1)/(5a + 10).
⇒ (a + 1)/(5a + 10) = 1/6. ⇒ a = 4. b = 16. Proceed as before.


7.27. B. By Bayes Theorem the posterior is proportional to:
(a priori chance of the value of q) (chance of observation given q) =
f(q) p(0 | q) = {60 q^3 (1-q)^2} {(1-q)^5}, which is proportional to: q^3 (1-q)^7.
Comment: The Beta and the Binomial are Conjugate Priors. (The Beta-Bernoulli is a special case.)
The posterior is also a Beta Distribution, with a = 4 and b = 8 and therefore with constant in front of:
(a+b-1)! / {(a-1)! (b-1)!} = 11! / {(3!)(7!)} = 1320.

7.28. D. One can use the fact that for the Beta-Bernoulli the Bayes and Buhlmann estimates are
equal. The mean of the Posterior Distribution β(4, 8; q) is: a′/(a′ + b′) = 4/(4 + 8) = 4/12 = 1/3.
The expected number of claims for 5 trials is then: 5(1/3) = 5/3 = 1.667.
Alternately, for the Beta-Bernoulli, the Buhlmann Credibility parameter K = a + b = 4 + 3 = 7.
(Where a and b are the parameters of the prior Beta.) Thus for 5 trials,
Z = 5 / (5 + 7) = 5/12.
The prior estimate (for 5 Bernoulli Trials) is 5E[q] = (5)(Mean of β(4, 3; q)) = (5)(4/7) = 20/7.
Buhlmann estimate upon observing 0 claims is: (5/12)(0) + (7/12)(20/7) = 5/3 = 1.667.
7.29. E. We have E[q] = .25 and Var[q] = E[q2 ] - E[q]2 = .07. Therefore, E[q2 ] = .1325. For the
Binomial we have a process variance of mq(1-q) = q(1-q) = q - q2 . Therefore, the Expected Value
of the Process Variance = E[q] - E[q2 ] = .25 - .1325 = .1175. For the Binomial the mean is mq = q.
Therefore the Variance of the Hypothetical Means = Var[q] = .07.
Therefore K = EPV / VHM = .1175 / .07 = 1.6786.
For one observation Z = 1 / (1 + 1.6786) = 0.373.
Comment: Since the Buhlmann credibility does not depend on the form of the prior distribution (but
just on its mean and variance), one could assume it was a Beta Distribution. One can then solve for
the parameters a and b of the Beta: mean = .25 = a / (a+b), and
variance = .07 = ab / {(a+b+1)(a+b)2 }. Therefore b = 3a, 4a+1 = (.25)(.75) / .07 = 2.679.
a = .420, b = 1.260.
The Buhlmann Credibility parameter for the Beta-Bernoulli is a + b = 1.68.
For one observation, Z = 1 / (1 + 1.68) = 0.373.

7.30. C. π(q) = 4q^3 (1 - q)^0, 0 < q < 1. f(x) = {(a+b-1)! / ((a-1)! (b-1)!)} x^(a-1) (1-x)^(b-1), 0 < x < 1.
The prior distribution is Beta with a = 4 and b = 1.
Each year is a Binomial with m = 2. Two independent Bernoulli Trials.
Thus 2 years ⇔ 4 independent Bernoulli Trials. Observe 2 claims in 4 trials.
a′ = a + 2 = 6. b′ = b + (4 - 2) = 3. Mean of Posterior Beta is: a′/(a′ + b′) = 6/(6 + 3) = 2/3.
Expected claims in Year 3: E[mq] = 2E[q] = (2)(2/3) = 4/3.
Alternately, the chance of the observation is: {2q(1-q)} {2q(1-q)} = 4q^2 (1-q)^2.
By Bayes Theorem, the posterior distribution of q is proportional to:
π(q) 4q^2 (1-q)^2 = 16 q^5 (1-q)^2, 0 ≤ q ≤ 1. This is proportional to a Beta Distribution with a = 6 and
b = 3, which therefore must be the posterior distribution of q. Proceed as before.
Alternately, for the Beta-Binomial:
a′ = a + number of claims = 4 + 2 = 6.
b′ = b + m (number of years) - number of claims = 1 + (2)(2) - 2 = 3. Proceed as before.
Alternately, the prior distribution is Beta with a = 4 and b = 1. 4 independent Bernoulli Trials.
K = a + b = 5.
Prior mean of q is: 4 / (4 + 1) = 0.8. Observe 2 claims in 4 trials.
Z = 4 / (4 + 5) = 4/9.
Estimate of q: (4/9)(2/4) + (5/9)(0.8) = 2/3.
Expected claims in Year 3: E[mq] = 2 E[q] = (2)(2/3) = 4/3.
Alternately, the prior distribution is Beta with a = 4 and b = 1. Binomial with m = 2.
K = (a + b) / m = (4 + 1) / 2 = 2.5.
Prior mean frequency: (2) 4/(4 + 1) = 1.6. Observe 2 claims in 2 years.
Z = 2 / (2 + 2.5) = 4/9.
Expected claims in Year 3: (4/9)(2/2) + (5/9)(1.6) = 4/3.

7.31. From the previous solution, posterior to Year 2, the distribution of q is a Beta Distribution with
a = 6 and b = 3. g(q) = {8!/(5! 2!)} q^(6-1) (1-q)^(3-1) = 168 q^5 (1-q)^2, 0 ≤ q ≤ 1.
Given q, the probability of 2 claims in Year 3 is the density at 2 for a Binomial with m = 2: q^2.
f(2) = ∫_0^1 q^2 168 q^5 (1-q)^2 dq = 168 ∫ q^7 (1-q)^2 dq = 168 β(8, 3) = (168)(7! 2!/10!) = 168/360.
Given q, the probability of 1 claim in Year 3 is: 2q(1-q).
f(1) = ∫_0^1 2q(1-q) 168 q^5 (1-q)^2 dq = 336 ∫ q^6 (1-q)^3 dq = 336 β(7, 4) = (336)(6! 3!/10!) = 336/840.
Given q, the probability of 0 claims in Year 3 is: (1-q)^2.
f(0) = ∫_0^1 (1-q)^2 168 q^5 (1-q)^2 dq = 168 ∫ q^5 (1-q)^4 dq = 168 β(6, 5) = (168)(5! 4!/10!) = 168/1260.
The probabilities of having either 0, 1, or 2 claims in Year 3 are respectively:
168/1260 = 2/15 = 13.33%, 336/840 = 2/5 = 40%, and 168/360 = 7/15 = 46.67%.
Alternately, in order to calculate the probability of 0 claims in Year 3, treat this as 2 separate Bernoulli
trials each with no claim.
The distribution of q prior to Year 3 is Beta with a = 6 and b = 3 with mean 6/9 = 2/3.
The chance of no claims in the first trial is: 1 - 2/3 = 1/3.
Posterior to one trial with no claim, we get a Beta with a = 6 and b = 4.
The conditional probability of no claims in the second trial is: 4/(6 + 4) = 0.4.
Therefore, the probability of no claims in both trials is: (1/3)(.4) = 13.33%.
In order to calculate the probability of 2 claims in Year 3, treat this as 2 separate Bernoulli trials each
with a claim.
The distribution of q prior to Year 3 is Beta with a = 6 and b = 3 with mean 6/9 = 2/3.
The chance of a claim in the first trial is 2/3.
Posterior to one trial with a claim, we get a Beta with a = 7 and b = 3.
The conditional probability of a claim in the second trial is: 7/(7 + 3) = 0.7.
Therefore, the probability of a claim in both trials is: (2/3)(.7) = 46.67%.
Therefore, the probability of 1 claim in Year 3 is: 1 - 13.33% - 46.67% = 40%.
Comment: Beyond what you are likely to be asked on your exam.
The predictive distribution is not a Binomial Distribution with m = 2 and q = 2/3,
with densities: (1/3)^2 = 1/9, 2(1/3)(2/3) = 4/9, (2/3)^2 = 4/9.
Instead, the predictive distribution is a Beta-Binomial with m = 2, a = 6, and b = 3:
f(x) = β(a+x, b+m-x) / {(m+1) β(a, b) β(x+1, m+1-x)} = β(6+x, 5-x) / {(3) β(6, 3) β(x+1, 3-x)}.

f(0) = β(6, 5) / {(3) β(6, 3) β(1, 3)} = (1/1260) / {(3)(1/168)(1/3)} = 168/1260.
f(1) = β(7, 4) / {(3) β(6, 3) β(2, 2)} = (1/840) / {(3)(1/168)(1/6)} = 336/840.
f(2) = β(8, 3) / {(3) β(6, 3) β(3, 1)} = (1/360) / {(3)(1/168)(1/3)} = 168/360.
See Exercise 15.82 in Loss Models.
7.32. C. This is a Beta-Binomial situation with m = 3, a = 2 and b = 1.
One year of experience is three Bernoulli trials. We have zero claims.
a′ = a + 0 = 2 + 0 = 2. b′ = b + (3 - 0) = 1 + 3 = 4.
Mean of the Posterior Beta = a′/(a′ + b′) = 2/(2 + 4) = 1/3.
Number of claims expected in Year 2: (3)(1/3) = 1.
Alternately, K = a + b = 2 + 1 = 3. Z = 3/(3 + 3) = 1/2. Mean of prior Beta: a/(a + b) = 2/(2 + 1).
Estimated future frequency per Bernoulli trial: (1/2)(0/3) + (1/2)(2/3) = 1/3.
Number of claims expected in Year 2: (3)(1/3) = 1.
7.33. E. E[Xi] = E[q] = the mean of the Beta = a/(a + b) = 1/100.
E[Sm] = m E[Xi] = m/100 ≥ 50. ⇒ m ≥ 5000.
Comment: No observations and no application of Bayes Theorem to get a posterior distribution.
7.34. C. A Beta-Binomial with a = 2 and b = 2. 4 trials with 2 successes.
a′ = a + 2 = 4. b′ = b + 4 - 2 = 4. Posterior Distribution of q is Beta with a = 4 and b = 4.
f(q) is proportional to: q^3 (1-q)^3. We wish to maximize this density.
Setting the derivative with respect to q equal to zero: 0 = 3q^2 (1-q)^3 - 3q^3 (1-q)^2. ⇒ q = 0.5.
Comment: The mode is where the density is largest. The question refers to the posterior distribution
of q, in other words after observations, in order to distinguish it from the prior distribution of q.
This posterior Beta is symmetric around its mean of: 4/(4 + 4) = 0.5, which is also the mode.
[Graph of the posterior Beta density of q, peaking at q = 0.5.]

7.35. B. E[q] = Mean of the Beta = a/(a + b) = 1/100.
E[q^2] = Second Moment of the Beta = a(a + 1) / {(a + b)(a + b + 1)} = 2/{(100)(101)}.
Var[q] = Variance of the Beta = 2/{(100)(101)} - 1/100^2 = 99/{(100^2)(101)}.
Process Variance = 101q(1 - q) = 101q - 101q^2.
EPV = 101 E[q] - 101 E[q^2] = 101/100 - 2/100 = 0.99.
Hypothetical Mean = 101q.
VHM = 101^2 Var[q] = (101^2) 99/{(100^2)(101)} = 0.9999.
Variance of the marginal (mixed) distribution is: EPV + VHM = 0.99 + 0.9999 = 1.9899.
Alternately, E[q] = Mean of the Beta = a/(a + b) = 1/100.
The mean of the mixture is the mixture of the means:
E[101q] = 101 E[q] = 101/100 = 1.01.
E[q^2] = Second Moment of the Beta = a(a + 1) / {(a + b)(a + b + 1)} = 2/{(100)(101)}.
The second moment of the mixture is the mixture of the second moments:
E[101q(1-q) + (101q)^2] = 101 E[q] + (100)(101) E[q^2] = 101/100 + (100)(101)(2)/{(100)(101)} = 101/100 + 2 = 3.01.
Variance of the mixture is: 3.01 - 1.01^2 = 1.9899.
Comment: The mixed distribution is not a Binomial; it is a Beta-Binomial.
7.36. D. A Beta-Binomial; Bayes Analysis equals Buhlmann Credibility in this case.
Based on the first year of data: a′ = a + 2, and b′ = b + (8 - 2) = 9 + 6 = 15.
Therefore, the estimate for q is: a′/(a′ + b′) = (a + 2)/(a + 2 + 15).
The estimate of the number of claims is 8 times that:
2.54545 = 8(a + 2)/(a + 2 + 15). ⇒ a = 5.
Based on the first two years of data: a′ = a + 2 + k = 7 + k, and b′ = b + 6 + (8 - k) = 23 - k.
The estimate of the number of claims is:
3.73333 = 8(7 + k)/(7 + k + 23 - k) = 8(7 + k)/30. ⇒ k = 7.
Comment: Given the usual outputs, solve for missing inputs.

Section 8, Inverse Gamma Distribution50


If X follows a Gamma Distribution, with parameters α and 1, then θ/X follows an Inverse Gamma
Distribution with parameters α and θ. (Thus an Inverse Gamma Distribution is no more complicated
conceptually than the Gamma Distribution.) α is the shape parameter and θ is the scale parameter,
as parameterized in Loss Models.
The Distribution Function is: F(x) = 1 - Γ(α ; θ/x),
while the probability density function is: f(x) = θ^α e^(-θ/x) / {Γ(α) x^(α+1)}.
Note that the density has an exponential of 1/x times a negative power of x. This is how one
recognizes an Inverse Gamma density.51 The scale parameter θ is divided by x in the exponential.
The negative power of x has an absolute value one more than the shape parameter α.
Exercise: A probability density function is proportional to e^(-11/x) x^(-2.5). What distribution is this?
[Solution: This is an Inverse Gamma Distribution with α = 1.5 and θ = 11. The proportionality
constant in front of the density is 11^1.5 / Γ(1.5) = 36.48 / 0.8862 = 41.16. Note that there is no
requirement that α be an integer. However, if α is non-integral then one needs access to a software
package that computes the (complete) Gamma Function.]
The Distribution Function is similar to that of a Gamma Distribution: Γ(α ; x/θ).
If x/θ follows an Inverse Gamma Distribution with a scale parameter of one, then θ/x follows a Gamma
Distribution with a scale parameter of one. The Inverse Gamma is heavy-tailed, as can be seen by
the lack of the existence of certain moments.52 The nth moment of an Inverse Gamma only exists for
n < α.
Note that the Inverse Gamma density function integrates to unity from zero to infinity.53
∫_0^∞ e^(-θ/x) / x^(α+1) dx = Γ(α) / θ^α, α > 0.
This fact will be useful for working with the Inverse Gamma Distribution.
50 See Appendix A of Loss Models.
51 The Gamma density has an exponential of x times x to a power.
52 In the extreme tail its behavior is similar to that of a Pareto distribution with the same shape parameter α.
53 This follows from substituting y = 1/x in the definition of the Gamma Function. Remember it via the fact that all
probability density functions integrate to unity over their support.
For example, one can compute the moments of the Inverse Gamma Distribution:
E[X^n] = ∫ x^n f(x) dx = ∫ x^n θ^α e^(-θ/x) / {Γ(α) x^(α+1)} dx
= {θ^α/Γ(α)} ∫_0^∞ e^(-θ/x) x^(-(α+1-n)) dx
= {θ^α/Γ(α)} Γ(α - n) / θ^(α-n) = θ^n Γ(α - n) / Γ(α), α - n > 0.
Alternately, the moments of the Inverse Gamma also follow from the moments of the Gamma
Distribution, which are E[X^n] = θ^n Γ(α+n) / Γ(α).
(This formula works for n positive or negative.)
If X follows a Gamma, with unity scale parameter, Γ(α ; x), then Z = θ/X has Distribution Function:
F(z) = 1 - Γ(α ; θ/z).
This is the Inverse Gamma Distribution, as parameterized by Loss Models.
Thus the Inverse Gamma has moments: E[Z^n] = E[(θ/X)^n] = θ^n E[X^(-n)] = θ^n Γ(α - n) / Γ(α).
Specifically, the mean of the Inverse Gamma = E[θ/X] = θ Γ(α - 1) / Γ(α) = θ/(α - 1).
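The relationship between the Gamma and the Inverse Gamma is easy to check by simulation. The following Python sketch is my own illustration (not from the guide); the parameter values α = 6 and θ = 15, the seed, and the 200,000 simulations are arbitrary choices:

import random

alpha, theta, n = 6.0, 15.0, 200_000
rng = random.Random(1)

# Draw X ~ Gamma(alpha, scale 1); then theta / X ~ Inverse Gamma(alpha, theta).
sample = [theta / rng.gammavariate(alpha, 1.0) for _ in range(n)]

mean = sum(sample) / n
var = sum((z - mean) ** 2 for z in sample) / n

print(round(mean, 3))   # close to theta/(alpha - 1) = 3
print(round(var, 3))    # close to theta^2 / {(alpha-1)^2 (alpha-2)} = 2.25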

Inverse Gamma Distribution

Support: x > 0.     Parameters: α > 0 (shape parameter), θ > 0 (scale parameter).

D.f.:   F(x) = 1 - Γ(α ; θ/x).

P.d.f.: f(x) = θ^α e^(-θ/x) / {Γ(α) x^(α+1)} = (θ/x)^α e^(-θ/x) / {x Γ(α)}.

Moments: E[X^k] = θ^k Γ(α - k) / Γ(α) = θ^k / {(α-1)(α-2)...(α-k)}, α > k.

Mean = θ / (α - 1), α > 1.

Second Moment = θ^2 / {(α - 1)(α - 2)}, α > 2.

Variance = θ^2 / {(α - 1)^2 (α - 2)}, α > 2.

Mode = θ / (α + 1).

Coefficient of Variation = Standard Deviation / Mean = 1 / √(α - 2), α > 2.

Skewness = 4 √(α - 2) / (α - 3), α > 3.

Kurtosis = 3 (α - 2)(α + 5) / {(α - 3)(α - 4)}, α > 4.

Limited Expected Value: E[X ∧ x] = {θ/(α-1)} {1 - Γ[α-1; θ/x]} + x Γ[α; θ/x], α > 1.

R(x) = Excess Ratio = Γ[α-1; θ/x] - (α-1)(x/θ) Γ[α; θ/x], α > 1.

e(x) = Mean Excess Loss = θ Γ[α-1; θ/x] / {(α-1) Γ[α; θ/x]} - x, α > 1.

X ~ Gamma(α, 1) ⇔ θ/X ~ Inverse Gamma(α, θ).

Displayed below are various Inverse Gamma Distributions:


[Three density plots of the Inverse Gamma, each for x from 0 to 10,000: α = 0.8 and θ = 500; α = 1.5 and θ = 1500; α = 3 and θ = 6000.]

Problems:
Use the following information for the next 3 questions:
You have an Inverse Gamma Distribution with parameters α = 4 and θ = 9.
8.1 (1 point) What is the density function at x = 7?
A. less than 0.01
B. at least 0.01 but less than 0.02
C. at least 0.02 but less than 0.03
D. at least 0.03 but less than 0.04
E. at least 0.04
8.2 (1 point) What is the mean?
A. 2.0
B. 2.5
C. 3.0

D. 3.5

E. 4.0

8.3 (1 point) What is the variance?


A. 4.0
B. 4.5
C. 5.0

D. 5.5

E. 6.0

8.4 (1 point) What is the integral from zero to infinity of e^(-6/x) x^(-11)?
A. 0.006

B. 0.008

C. 0.010

D. 0.012

E. 0.014

Solutions to Problems:
8.1. B. f(x) = θ^α e^(-θ/x) / {Γ(α) x^(α+1)}. f(7) = 9^4 e^(-9/7) / {Γ(4) 7^5} = 0.0180.
8.2. C. Mean = θ/(α - 1) = 9 / (4 - 1) = 3.
8.3. B. Variance = θ^2 / {(α-1)^2 (α-2)} = (9^2) / {(3^2)(2)} = 4.5.
Alternately, the second moment is θ^2 Γ(α-2) / Γ(α) = θ^2 / {(α-1)(α-2)} = (9^2) / {(3)(2)} = 13.5.
Thus the variance = 13.5 - 3^2 = 4.5.
8.4. A. ∫_0^∞ e^(-θ/x) x^(-(α+1)) dx = Γ(α) / θ^α. Letting θ = 6 and α = 10,
the integral from zero to infinity of e^(-6/x) x^(-11) is: Γ(10) / 6^10 = 9! / 6^10 = 0.006001.
Comment: e^(-6/x) x^(-11) is proportional to the density of an Inverse Gamma Distribution with
θ = 6 and α = 10. Thus its integral from zero to infinity is the inverse of the constant in front of the
Inverse Gamma Density, since the density itself must integrate to unity.
Alternately, one could let y = 6/x and convert the integral to a complete Gamma Function.
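All four answers can be reproduced numerically; the following Python sketch is my own addition, and the grid-based check of 8.4 (using the substitution y = 1/x) is my choice of method, not anything from the guide:

import math

alpha, theta = 4.0, 9.0

# 8.1: density of the Inverse Gamma at x = 7.
f7 = theta**alpha * math.exp(-theta / 7) / (math.gamma(alpha) * 7**(alpha + 1))
print(round(f7, 4))                                  # about 0.018

# 8.2 and 8.3: mean and variance.
print(theta / (alpha - 1))                           # 3.0
print(theta**2 / ((alpha - 1)**2 * (alpha - 2)))     # 4.5

# 8.4: integral of exp(-6/x) x^(-11) over (0, inf) equals Gamma(10)/6^10.
print(round(math.gamma(10) / 6**10, 6))              # 0.006001

# Crude numerical check of 8.4: after y = 1/x the integrand becomes y^9 exp(-6y).
n, top = 100_000, 40.0
h = top / n
approx = sum(((i + 0.5) * h)**9 * math.exp(-6 * (i + 0.5) * h) for i in range(n)) * h
print(round(approx, 6))                              # about 0.006001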

Section 9, Inverse Gamma - Exponential


The Inverse Gamma - Exponential is a third example of a conjugate prior situation. Unlike the
previous two examples, the Inverse Gamma - Exponential involves a mixture of severity rather than
frequency parameters across a portfolio of risks.
The sizes of loss for a particular policyholder are assumed to be Exponential with mean
δ. Given δ, the distribution function of the size of loss is 1 - e^(-x/δ), while the density of the
size of loss distribution is: e^(-x/δ) / δ.
The mean of this Exponential is δ and its variance is δ^2.
Note this is not the parameterization of the exponential used in Loss Models; I have used δ rather
than θ, so as to not confuse the scale parameter of the Exponential with that of the Inverse Gamma,
which is θ.
So for example, the density of a loss being of size 8 is: (1/δ) e^(-8/δ).
If δ = 2 this density is: (1/2)e^(-4) = 0.009, while if δ = 20 this density is (1/20)e^(-0.4) = 0.034.54
Prior Distribution:
Assume that the values of δ are given by an Inverse Gamma distribution with α = 6 and θ = 15,
with probability density function:55
g(δ) = 94,921.875 e^(-15/δ) / δ^7, 0 < δ < ∞.

54 The first portion of this example is the same as in Mahler's Guide to Loss Distributions.
However, here we introduce observations and then apply Bayes Analysis and Buhlmann Credibility.
55 The constant in front is: θ^α / Γ(α) = 15^6 / Γ(6) = 11,390,625 / 120 = 94,921.875.

Displayed below is this distribution of Exponential parameters:
[Graph of the Inverse Gamma density of δ, for α = 6 and θ = 15.]
Note that this distribution is an Inverse Gamma which has a mean of: θ/(α-1) = 15 / (6-1) = 3.
This is the a priori estimate of claim severity.
Marginal Distribution (Prior Mixed Distribution):
If we have a policyholder and do not know its expected mean severity, in order to get the density of
the next loss being of size 8, one would weight together the densities of having a loss of size 8
given δ, using the a priori probabilities of δ:
g(δ) = 94,921.875 e^(-15/δ) / δ^7, and integrating from zero to infinity:
f(8) = ∫ {e^(-8/δ) / δ} g(δ) dδ = 94,921.875 ∫ e^(-23/δ) / δ^8 dδ = 94,921.875 (6!) / 23^7 = 0.0201.
Where we have used the fact: ∫_0^∞ e^(-θ/x) / x^(α+1) dx = Γ(α) / θ^α = (α-1)! / θ^α.
0

More generally, if the distribution of Exponential means is given by an Inverse Gamma distribution
g() = e/ / {() +1}, and then we compute the density at size x by integrating from zero to
infinity:56
Both the Exponential and the Inverse Gamma have terms involving powers of e1/ and 1/; note how these terms
combine when one takes the product.
56

f(x) = ∫_0^∞ {e^(-x/δ) / δ} g(δ) dδ = ∫_0^∞ {e^(-x/δ) / δ} θ^α e^(-θ/δ) / {Γ(α) δ^(α+1)} dδ
= {θ^α / Γ(α)} ∫_0^∞ e^(-(θ+x)/δ) / δ^(α+2) dδ = {θ^α / Γ(α)} Γ(α+1) / (θ + x)^(α+1) = α θ^α / (θ + x)^(α+1).
Thus the (prior) mixed distribution is in the form of the Pareto distribution.
Note that the shape parameter and scale parameter of the mixed Pareto distribution are the same as
those of the Inverse Gamma distribution.
For the specific case dealt with previously: α = 6 and θ = 15.
Thus the density at size x is: 6 (15^6) (15 + x)^(-7). For x = 8 this density is: 6 (15^6) (23)^(-7) = 0.0201.
This is the same result as calculated above.
For the Inverse Gamma-Exponential the (prior) marginal distribution is always a Pareto,
with α = shape parameter of the (prior) Inverse Gamma and
θ = scale parameter of the prior Inverse Gamma.
The marginal Pareto is a size of loss distribution, while the prior Inverse Gamma is a distribution of
parameters. Since the Inverse Gamma is a distribution of each insured's mean severity, the a priori
mean severity is the mean of the Inverse Gamma. The mean of the Pareto is also the a priori mean.
Mean of the Inverse Gamma = Mean of the Pareto.
In this particular case we get a marginal Pareto distribution with parameters of α = 6 and θ = 15,
which has a mean of 15/(6 - 1) = 3, which matches the mean of the prior Inverse Gamma.
Note that the formula for the mean of an Inverse Gamma and a Pareto are both θ/(α-1).
Exercise: Each insured has an Exponential severity with mean δ. The values of δ are distributed via
an Inverse Gamma with parameters α = 2.3 and θ = 1200. An insured is picked at random.
What is the chance that his next claim will be greater than 1000?
[Solution: The marginal distribution is a Pareto with parameters α = 2.3 and θ = 1200.
S(1000) = {θ/(θ + x)}^α = {1200/(1000 + 1200)}^2.3 = 24.8%.]
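The two-stage process in this exercise can be simulated directly. The following Python sketch is my own check (the seed and the 200,000 trials are arbitrary choices, not anything from the guide):

import random

alpha, theta, n = 2.3, 1200.0, 200_000
rng = random.Random(7)

count = 0
for _ in range(n):
    delta = theta / rng.gammavariate(alpha, 1.0)     # delta ~ Inverse Gamma(2.3, 1200)
    x = rng.expovariate(1.0 / delta)                 # loss ~ Exponential with mean delta
    if x > 1000:
        count += 1

print(round(count / n, 3))                           # close to 0.248
print(round((theta / (theta + 1000)) ** alpha, 3))   # Pareto survival function: 0.248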

Prior Expected Value of the Process Variance:
The process variance of the severity for an individual risk is δ^2, since the severity for each risk is
Exponential. Therefore the expected value of the process variance =
the expected value of δ^2 = second moment of the distribution of δ =
second moment of the Inverse Gamma Distribution = θ^2 / {(α-1)(α-2)}.
Thus for α = 6 and θ = 15, the expected value of the process variance is: 15^2 / {(5)(4)} = 11.25.
Prior Variance of the Hypothetical Means:
The variance of the hypothetical mean severities is the variance of δ = Var[δ] =
Variance of the Prior Inverse Gamma =
2nd moment of Inverse Gamma - square of mean of Inverse Gamma =
θ^2 / {(α-1)(α-2)} - θ^2/(α-1)^2 = θ^2 / {(α-1)^2 (α-2)}.
For α = 6 and θ = 15, VHM = 15^2 / {(6-1)^2 (6-2)} = 9/4 = 2.25.

Prior Total Variance:


The total variance = the variance of the marginal Pareto distribution =
2nd moment of the Pareto - square of the mean of the Pareto =
2θ²/{(α-1)(α-2)} - θ²/(α-1)² = αθ²/{(α-1)²(α-2)}.
For α = 6 and θ = 15 this equals: (6)(15²)/{(6-2)(6-1)²} = 13.5.

The Expected Value of the Process Variance + Variance of the Hypothetical Means =
2.25 + 11.25 = 13.5 = Total Variance.
VHM = the variance of the Inverse Gamma. Total Variance = the variance of the Pareto.
Total Variance = EPV + VHM. Variance of the Inverse Gamma < Variance of the Pareto.
Variance of the Inverse Gamma: θ²/{(α-1)²(α-2)}, α > 2.
Variance of the Pareto: αθ²/{(α-1)²(α-2)}, α > 2.
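As a quick numerical sanity check, the three prior variances for α = 6 and θ = 15 can be computed directly from the moment formulas above; this is a small plain-Python sketch added here for illustration, not part of the original text:

alpha, theta = 6.0, 15.0

ig_mean = theta / (alpha - 1)                                      # 3
ig_second_moment = theta**2 / ((alpha - 1) * (alpha - 2))          # 11.25
epv = ig_second_moment                                             # E[delta^2] = 11.25
vhm = ig_second_moment - ig_mean**2                                # 2.25

pareto_second_moment = 2 * theta**2 / ((alpha - 1) * (alpha - 2))  # 22.5
total_variance = pareto_second_moment - ig_mean**2                 # 13.5
print(epv, vhm, epv + vhm, total_variance)                         # 11.25 2.25 13.5 13.5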


Observations:
Let us now introduce the concept of observations. A risk is selected at random and it is observed to
have 3 claims of size: 8, 5, and 4, in that order.57
Posterior Distribution:
We can employ Bayesian analysis to compute the probability that the selected risk had a given
Exponential parameter. Given an Exponential with parameter δ, the density at size 8 is:
e^(-8/δ)/δ. The density of the first claim being of size 8, the second of size 5, and the third of size 4 is
the product:
{e^(-8/δ)/δ} {e^(-5/δ)/δ} {e^(-4/δ)/δ} = e^(-17/δ)/δ³.
The a priori probability of δ is the Prior Inverse Gamma distribution:
g(δ) = 94,921.875 e^(-15/δ)/δ^7, 0 < δ < ∞. Thus the posterior density of δ is proportional to the product
of the density of the observation and the a priori probability:
e^(-32/δ)/δ^10.
This is proportional to the density for an Inverse Gamma distribution with α = 9 and θ = 32.
This posterior Inverse Gamma (dashed) is compared to the prior Inverse Gamma (solid):
[Graph comparing the two densities of δ omitted.]
57 It will turn out that for the forthcoming analysis the answer will only depend on the number of claims, 3, and their total, 17.


In general, if one observes C claims totaling L losses, the density of the observation given δ is
proportional to a product of terms equal to: e^(-L/δ)/δ^C. The prior Inverse Gamma distribution
is proportional to: e^(-θ/δ)/δ^(α+1). Note that both the density of the observation and the Inverse Gamma have
a term involving the exponential of 1/δ and a term involving δ to a negative power. The posterior
probability for δ is proportional to: e^(-(θ+L)/δ)/δ^(α+1+C). This is proportional to the density of an Inverse
Gamma distribution with new shape parameter = α + C and new scale parameter = θ + L.
Thus for the Inverse Gamma - Exponential the posterior density function is also an
Inverse Gamma. This posterior Inverse Gamma has a shape parameter = prior shape
parameter plus the number of claims observed. This posterior Inverse Gamma has a
scale parameter = prior scale parameter plus the total cost of the claims observed.
Posterior α = prior α + C.    Posterior θ = prior θ + L.

For example, in the case where we observed 3 claims totaling 17, C = 3 and L = 17. The prior
shape parameter was 6 while the prior scale parameter was 15. Therefore the posterior shape
parameter = 6 + 3 = 9, while the posterior scale parameter = 15 + 17 = 32, matching the result
obtained above.
The fact that the posterior distribution is of the same form as the prior distribution is why the Inverse
Gamma is a Conjugate Prior Distribution for the Exponential.
Predictive Distribution:
Since the posterior distribution is also an Inverse Gamma distribution, the same analysis that led to a
Pareto (prior) marginal distribution, will lead to a (posterior) predictive distribution that is Pareto.
However, the parameters are related to the posterior Inverse Gamma.
For the Inverse Gamma-Exponential the (posterior) predictive distribution is always a
Pareto, with parameters: = shape parameter of the posterior Inverse Gamma and
= scale parameter of the posterior Inverse Gamma.
Thus posterior = prior + C and posterior = prior + L.
In the particular example with a posterior Inverse Gamma distribution with parameters 9 and 32, the
parameters of the posterior Pareto predictive distribution are also 9 and 32. Alternatively, one can
compute this in terms of the prior Inverse Gamma with parameters 6 and 15 and the observations
of 3 claims totaling 17; = 6 + 3 = 9 and = 15 + 17 = 32.
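The updating rule is simple enough to express in a couple of lines of Python; the sketch below is an illustration added here (the function name is my own, not from the text):

def update_inverse_gamma_exponential(alpha, theta, num_claims, total_losses):
    """Posterior Inverse Gamma (and predictive Pareto) parameters: alpha + C and theta + L."""
    return alpha + num_claims, theta + total_losses

alpha_post, theta_post = update_inverse_gamma_exponential(6, 15, 3, 17)
print(alpha_post, theta_post)                 # 9 32
print(theta_post / (alpha_post - 1))          # posterior (and predictive) mean = 4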


Below are compared the marginal Pareto (solid) and the posterior predictive Pareto (dashed):
[Graph comparing the two Pareto densities omitted.]
Having observed 3 claims totaling 17, an average severity of 5.67 compared to the a priori mean of
3, has increased the estimated probability of a large claim in the future.
Exercise: Each insured has an Exponential severity with mean δ.
The values of δ are distributed via an Inverse Gamma with parameters α = 2.3 and θ = 1200.
An insured is picked at random and observed to have 5 claims totaling 3000.
What is the chance that his next claim will be greater than 1000?
[Solution: The predictive distribution is a Pareto with parameters α = 2.3 + 5 = 7.3 and
θ = 1200 + 3000 = 4200. S(1000) = {θ/(θ + x)}^α = {4200/(1000 + 4200)}^7.3 = 21.0%.]
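The prior and predictive probabilities in the last two exercises can be checked with a few lines of Python (an illustrative sketch added here, not part of the original solutions):

def pareto_survival(x, alpha, theta):
    return (theta / (theta + x)) ** alpha

print(pareto_survival(1000, 2.3, 1200))   # prior (marginal) Pareto: about 0.248
print(pareto_survival(1000, 7.3, 4200))   # predictive Pareto after 5 claims totaling 3000: about 0.210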
Posterior Mean:
One can compute the means and variances posterior to the observations. The posterior mean can
be computed in either one of two ways. First one can weight together the means for each type of
risk, using the posterior probabilities. This is E[δ] = the mean of the posterior Inverse Gamma =
32/(9-1) = 4. Alternately, one can compute the mean of the predictive Pareto distribution:
θ/(α-1) = 32/(9-1) = 4. Of course the two results match.
Thus the new estimate posterior to the observations for this risk's average severity using Bayesian
Analysis is 4. This compares to the a priori estimate of 3. In general, the observations provide
information about the given risk, which allows one to make a better estimate of the future experience
of that risk. Not surprisingly observing 3 claims totaling 17, for an observed average severity of
5.67, has raised the estimated severity from 3 to 4.


Posterior Expected Value of the Process Variance:


Just as prior to the observations, posterior to the observations one can compute three variances:
the expected value of the process variance, the variance of the hypothetical pure premiums, and the
total variance.
The process variance of the severity for an individual risk is δ², since the severity for each risk is
Exponential. Therefore the expected value of the process variance = the expected value of δ² =
second moment of the posterior Inverse Gamma = θ²/{(α-1)(α-2)}.
Thus for the Posterior Inverse Gamma with α = 9 and θ = 32, the expected value of the process
variance is: 32²/{(8)(7)} = 18.29.
Posterior Variance of the Hypothetical Means:
The variance of the hypothetical mean severities is the variance of δ = Var[δ] =
Variance of the Posterior Inverse Gamma = θ²/{(α-1)(α-2)} - θ²/(α-1)² = θ²/{(α-1)²(α-2)}.
For α = 9 and θ = 32 this is: 32²/{(9-1)²(9-2)} = 16/7 = 2.29.

Posterior Total Variance:


The total variance = the variance of the predictive Pareto distribution =
2nd moment of the Pareto - square of the mean of the Pareto =
2θ²/{(α-1)(α-2)} - θ²/(α-1)² = αθ²/{(α-1)²(α-2)}.
For α = 9 and θ = 32, total variance = (9)(32²)/{(8²)(7)} = 20.57.

The Expected Value of the Process Variance + Variance of the Hypothetical Means =
2.29 + 18.29 = 20.58 = Total Variance, subject to rounding.


Buhlmann Credibility:
Next, let's apply Buhlmann Credibility to this example. The Buhlmann Credibility parameter K =
the (prior) expected value of the process variance / the (prior) variance of the hypothetical means =
11.25 / 2.25 = 5. Note that K can be computed prior to any observations and doesn't depend on
them. Specifically, both variances are for a single insured for one trial.
In general, EPV = θ²/{(α-1)(α-2)}, α > 2, and VHM = θ²/{(α-1)²(α-2)}, α > 2.

K = prior EPV / prior VHM = [θ²/{(α-1)(α-2)}] / [θ²/{(α-1)²(α-2)}] = α - 1, α > 2.

For the Inverse Gamma-Exponential in general, the Buhlmann Credibility parameter
K = α - 1, where α > 2 is the shape parameter of the prior Inverse Gamma distribution.
For this example, K = 6 - 1 = 5. Having observed 3 claims, Z = 3 / (3 + 5) = 3/8 = 0.375.
The observed severity = 17/3. The a priori mean = 15/(6-1) = 3.
Thus the new estimate = (3/8)(17/3) + (5/8)(3) = 4.
Note that in this case the estimate from Buhlmann Credibility matches the estimate from Bayesian
Analysis. For the Inverse Gamma-Exponential the estimates from using Bayesian
Analysis and Buhlmann Credibility are equal.58
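Here is a small Python sketch (added for illustration, not from the original text) that carries out the Buhlmann calculation for this example and confirms it matches the Bayesian posterior mean:

alpha, theta = 6.0, 15.0
epv = theta**2 / ((alpha - 1) * (alpha - 2))          # 11.25
vhm = theta**2 / ((alpha - 1)**2 * (alpha - 2))       # 2.25
K = epv / vhm                                         # alpha - 1 = 5

C, L = 3, 17.0
Z = C / (C + K)                                       # 3/8
prior_mean = theta / (alpha - 1)                      # 3
buhlmann_estimate = Z * (L / C) + (1 - Z) * prior_mean
bayes_estimate = (theta + L) / (alpha + C - 1)        # posterior Inverse Gamma mean
print(K, Z, buhlmann_estimate, bayes_estimate)        # 5.0 0.375 4.0 4.0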
Summary:
The many different aspects of the Inverse Gamma-Exponential are summarized below.
Be sure to be able to clearly distinguish between the situation prior to observations and that
posterior to the observations.

58 As discussed in a subsequent section, this is a special case of the general results for conjugate priors of members of linear exponential families. This is another example of what Loss Models calls exact credibility.
It should be noted that for α ≤ 2, one should not use Buhlmann Credibility. In these cases one can still use Bayesian Analysis, since α > 0 and C ≥ 1 ⇒ α' = α + C > 1 ⇒ the posterior Inverse Gamma has a finite mean.


Inverse Gamma-Exponential Severity Process

Inverse Gamma Prior (Distribution of Parameters): shape parameter = alpha = α, scale parameter = theta = θ.
Mixing with the Exponential Severity Process (Size of Loss) gives the
Pareto Marginal Distribution: α = shape parameter of the Prior Inverse Gamma, θ = scale parameter of the Prior Inverse Gamma.
Mean = θ/(α-1). Second moment = 2θ²/{(α-1)(α-2)}. Variance = αθ²/{(α-2)(α-1)²}.

Observations: $ of Loss = L, # of claims = C.

Inverse Gamma Posterior (Distribution of Parameters): posterior shape parameter = α' = α + C, posterior scale parameter = θ' = θ + L.
Mixing with the Exponential Severity Process (Size of Loss) gives the
Pareto Predictive Distribution: α' = shape parameter of the Posterior Inverse Gamma = α + C, θ' = scale parameter of the Posterior Inverse Gamma = θ + L.
Mean = θ'/(α'-1). Second moment = 2θ'²/{(α'-1)(α'-2)}. Variance = α'θ'²/{(α'-2)(α'-1)²}.

The Inverse Gamma is a Conjugate Prior; the Exponential is a Member of a Linear Exponential Family.
Buhlmann Credibility Estimate = Bayes Analysis Estimate.
Buhlmann Credibility Parameter, K = α - 1.

The Exponential parameters (means) of the individuals making up the entire portfolio are distributed via an Inverse
Gamma Distribution with parameters α and θ: f(x) = θ^α e^(-θ/x)/{x^(α+1) Γ[α]},
mean = θ/(α-1), second moment = θ²/{(α-1)(α-2)}, variance = θ²/{(α-2)(α-1)²}.


Comparing the Inverse Gamma and Pareto Distributions:

Mixing the Inverse Gamma Prior (Distribution of Parameters, α and θ) with the Exponential Process produces the Pareto Marginal (Size of Loss), with the same α and θ.

Since the Inverse Gamma is a distribution of each insured's mean severity,
the a priori mean severity is the mean of the Inverse Gamma.
The mean of the Pareto is also the a priori mean.
Mean of Inverse Gamma = Mean of Pareto = θ/(α-1).

VHM = the variance of the Inverse Gamma.
Total Variance = the variance of the Pareto.
Total Variance = EPV + VHM > VHM.
Variance of Inverse Gamma = θ²/{(α-1)²(α-2)} < Variance of Pareto = αθ²/{(α-1)²(α-2)}, α > 2.


Hazard Rates of Exponentials Distributed via a Gamma:59


The way that I have presented things, the mean of each Exponential Severity was δ, and δ followed
an Inverse Gamma Distribution.
If the hazard rate of the Exponential, λ, is distributed via a Gamma(α, θ), then the mean 1/λ is
distributed via an Inverse Gamma(α, 1/θ), and therefore the mixed distribution is Pareto.
If the Gamma has parameters α and θ, then the mixed Pareto has parameters α and 1/θ.
Relationship to the Gamma-Poisson:
Assume as before that δ, the mean of each Exponential, follows an Inverse Gamma Distribution with
parameters α = 6 and θ = 15. Then, F(δ) = 1 - Γ[6; 15/δ].
If λ = 1/δ, then F(λ) = Γ[6; 15λ]. λ follows a Gamma with parameters α = 6 and θ = 1/15.
This is mathematically the same as Exponential interarrival times each with mean 1/λ,
or a Poisson Process with intensity λ.
Prob[X > x] = Prob[Waiting time to 1st claim > x] = Prob[no claims by time x].
From time 0 to x we have a Poisson Frequency with mean λx. λx has a Gamma Distribution with
parameters 6 and x/15. This is mathematically a Gamma-Poisson, with mixed distribution that is
Negative Binomial with r = 6 and β = x/15.
Prob[X > x] = Prob[no claims by time x] = f(0) = 1/(1 + x/15)^6 = 15^6/(15 + x)^6.
This is the survival function at x of a Pareto Distribution, with parameters α = 6 and θ = 15, as
obtained previously.
As before, a risk is selected at random and it is observed to have 3 claims of size: 8, 5, and 4, in that
order. This is mathematically equivalent to having three interarrival times in the Poisson Process of
lengths 8, 5, and 4. In other words, we see a total of 3 claims in a length of time: 8 + 5 + 4 = 17.
Recall that the prior Gamma had parameters α = 6 and θ = 1/15. Thus using the updating formula for
the Gamma-Poisson, the posterior Gamma has parameters 6 + 3 = 9 and 1/(15 + 17) = 1/32.
Translating back to the Inverse Gamma-Exponential, the Posterior Inverse Gamma has parameters
α = 9 and θ = 1/(1/32) = 32, matching the result obtained previously.
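The equivalence can also be confirmed numerically; the following sketch (an illustration added here, not from the original text; it assumes scipy is available) compares the Negative Binomial probability of no claims by time x with the Pareto survival function:

from scipy.stats import nbinom

alpha, theta, x = 6, 15.0, 8.0
beta = x / theta                                      # Negative Binomial r = alpha, beta = x/theta
# Loss Models' (r, beta) parameterization corresponds to scipy's nbinom(n=r, p=1/(1+beta))
prob_no_claims = nbinom.pmf(0, alpha, 1.0 / (1.0 + beta))
pareto_survival = (theta / (theta + x)) ** alpha
print(prob_no_claims, pareto_survival)                # both (15/23)^6, about 0.077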
59 See for example, 4B, 11/93, Q.6.


Problems:
Use the following information to answer the next 15 questions:
The size of claim distribution for any particular policyholder is exponential with mean δ.
The δ values of the portfolio of policyholders have probability density function:
g(δ) = 216 e^(-6/δ)/δ^5.
You are given the following values of the Incomplete Gamma Function, as per the Appendix of
Loss Models:

Γ(α ; y)
  y      α = 4    α = 5    α = 6    α = 7
 1.0     0.019    0.004    0.001    0.000
 1.5     0.066    0.019    0.004    0.001
 2.0     0.143    0.053    0.017    0.005
 3.0     0.353    0.185    0.084    0.034
 3.5     0.463    0.275    0.142    0.065
 6.0     0.849    0.715    0.554    0.394
 7.0     0.918    0.827    0.699    0.550
9.1 (1 point) What is the mean claim size for the portfolio?
A. less than 1.6
B. at least 1.6 but less than 1.8
C. at least 1.8 but less than 2.0
D. at least 2.0 but less than 2.2
E. at least 2.2
9.2 (1 point) What is the probability that an insured picked at random from this portfolio will have an
Exponential parameter between 2 and 4?
A. less than 25%
B. at least 25% but less than 35%
C. at least 35% but less than 45%
D. at least 45% but less than 55%
E. at least 55%


9.3 (2 points) The probability density function that a claim chosen at random will be of size x is given
by which of the following?
A. 279,936 (6 + x)^(-7)
B. 38,880 (6 + x)^(-6)
C. 5184 (6 + x)^(-5)
D. 648 (6 + x)^(-4)
E. None of A, B, C, or D
9.4 (1 point) An insured is picked at random. What is the probability that his next claim will be
greater than 3 and less than 5?
A. less than 8%
B. at least 8% but less than 10%
C. at least 10% but less than 12%
D. at least 12% but less than 14%
E. at least 14%
9.5 (2 points) What is the variance of the claim severity for the portfolio?
A. less than 3
B. at least 3 but less than 5
C. at least 5 but less than 7
D. at least 7 but less than 9
E. at least 9
9.6 (2 points) What is the expected value of the process variance for the claim severity?
A. less than 3
B. at least 3 but less than 5
C. at least 5 but less than 7
D. at least 7 but less than 9
E. at least 9
9.7 (2 points) What is the variance of the hypothetical mean severities?
A. less than 3
B. at least 3 but less than 5
C. at least 5 but less than 7
D. at least 7 but less than 9
E. at least 9


9.8 (2 points) An insured has 2 claims of sizes 3 and 5.


Using Buhlmann Credibility what is the estimate of this insured's expected future claim severity?
A. less than 2.3
B. at least 2.3 but less than 2.5
C. at least 2.5 but less than 2.7
D. at least 2.7 but less than 2.9
E. at least 2.9
9.9 (2 points) An insured has 2 claims of sizes 3 and 5 . Which of the following is proportional to the
posterior probability density function for this insured's exponential parameter?
A. e^(-10/δ)/δ^5
B. e^(-12/δ)/δ^6
C. e^(-14/δ)/δ^7
D. e^(-16/δ)/δ^8
E. None of A, B, C, or D
9.10 (1 point) An insured has 2 claims of sizes 3 and 5.
What is the mean of the posterior severity distribution?
A. less than 2.3
B. at least 2.3 but less than 2.5
C. at least 2.5 but less than 2.7
D. at least 2.7 but less than 2.9
E. at least 2.9
9.11 (1 point) An insured has 2 claims of sizes 3 and 5.
What is the probability that this insured has an Exponential parameter between 2 and 4?
A. 40%

B. 44%

C. 48%

D. 52%

E. 56%

9.12 (2 points) An insured has 2 claims of sizes 3 and 5.


What is the variance of the posterior distribution of exponential parameters?
A. less than 2.0
B. at least 2.0 but less than 2.1
C. at least 2.1 but less than 2.2
D. at least 2.2 but less than 2.3
E. at least 2.3


9.13 (1 point) An insured has 2 claims of sizes 3 and 5.


What is the probability that this insured has an Exponential parameter between 2 and 4?
Use the Normal Approximation.
A. 40%
B. 44%
C. 48%

D. 52%

E. 56%

9.14 (2 points) An insured has 2 claims of sizes 3 and 5. Which of the following is proportional to
the density of the predictive distribution of the severity for this insured?
A. (14 + x)^(-8)
B. (13 + x)^(-7)
C. (12 + x)^(-6)
D. (11 + x)^(-5)
E. None of A, B, C, or D
9.15 (1 point) An insured has 2 claims of sizes 3 and 5. What is the probability that his next claim will
be greater than 2 and less than 4?
A. less than 18%
B. at least 18% but less than 20%
C. at least 20% but less than 22%
D. at least 22% but less than 24%
E. at least 24%

9.16 (2 points) You are given the following:


The size of the single claim follows a distribution: f(x) = λe^(-λx), x > 0.
The parameter λ is a random variable with probability density function: g(λ) = 100e^(-100λ), λ > 0.
Two claims are observed. They have sizes 40 and 80.
What is the expected value of the size of the next claim?
A. 70
B. 80
C. 90
D. 100
E. 110
9.17 (3 points) You are given the following:
The amount of an individual claim has an Exponential Distribution with mean q
The parameter q is distributed via an Inverse Gamma distribution with shape parameter = 4
and scale parameter = 1000.
From an individual insured you observe 3 claims of sizes: 100, 200, and 500.
For the zero-one loss function, what is the Bayes estimate of q for this insured?
A. 225
B. 250
C. 275
D. 300
E. 325


Use the following information for the next six questions:

The time in minutes between text messages sent by an individual girl is Exponential with mean δ.
For an individual, the times between sending text messages are independent.
You assume that δ is distributed between different girls via
an Inverse Gamma Distribution with α = 4 and θ = 100.

Samantha sent her 5th text message exactly 20 minutes after you started observing her.
You may use the following values of the incomplete Gamma Function:
Γ[8; 7.67] = 0.500. Γ[9; 8.67] = 0.500. Γ[10; 9.67] = 0.500.
9.18 (2 points) Estimate Samantha's mean time in minutes between text messages,
using the Bayes estimate that minimizes the squared error loss function.
(A) 0
(B) 10
(C) 12
(D) 14
(E) 15
9.19 (2 points) Estimate Samantha's mean time in minutes between text messages,
using the Bayes estimate that minimizes the absolute error loss function.
(A) 0
(B) 10
(C) 12
(D) 14
(E) 15
9.20 (2 points) Estimate Samantha's mean time in minutes between text messages,
using the Bayes estimate that minimizes the zero-one loss function.
(A) 0
(B) 10
(C) 12
(D) 14
(E) 15
9.21 (2 points) Estimate the time in minutes until Samantha sends her next text message,
using the Bayes estimate that minimizes the squared error loss function.
(A) 0
(B) 10
(C) 12
(D) 14
(E) 15
9.22 (2 points) Estimate the time in minutes until Samantha sends her next text message,
using the Bayes estimate that minimizes the absolute error loss function.
(A) 0
(B) 10
(C) 12
(D) 14
(E) 15
9.23 (2 points) Estimate the time in minutes until Samantha sends her next text message,
using the Bayes estimate that minimizes the zero-one loss function.
(A) 0
(B) 10
(C) 12
(D) 14
(E) 15

9.24 (3 points) The size of each loss is Exponential with mean δ.
The improper prior distribution is: π(δ) = 1, δ > 0.
You observe losses of size: 10, 5, 15, and 20. What is the variance of the predictive distribution?


9.25 (4B, 11/93, Q.6) (2 points) You are given the following:
An individual risk has exactly one claim each year.
The size of the single claim follows an exponential distribution with parameter t:
f(x) = t e^(-tx), x > 0.
The parameter t is a random variable with probability density function h(t) = t e^(-t), t > 0.
A claim of $5 is observed in the current year.
Determine the posterior distribution of t.
A. t² e^(-5t)    B. (125/2) t² e^(-5t)    C. t² e^(-6t)    D. 108 t² e^(-6t)    E. 36 t² e^(-6t)

Use the following information for the next two questions:

Use the following information for the next two questions:

Claim sizes for a given risk follow a distribution with density function
f(x) = e^(-x/θ)/θ, 0 < x < ∞, θ > 0.
The prior distribution of θ is assumed to follow a distribution with mean 50 and
density function g(θ) = 500,000 e^(-100/θ)/θ^4, 0 < θ < ∞.

9.26 (4B, 5/98, Q.28) (2 points) Determine the variance of the hypothetical means.
A. Less than 2,000
B. At least 2,000, but less than 4,000
C. At least 4,000, but less than 6,000
D. At least 6,000, but less than 8,000
E. At least 8,000
9.27 (4B, 5/98, Q.29) (2 points) Determine the density function of the posterior distribution of θ
after 1 claim of size 50 has been observed for this risk.
A. 62,500 e^(-50/θ)/θ^4
B. 500,000 e^(-100/θ)/θ^4
C. 1,687,500 e^(-150/θ)/θ^4
D. 50,000,000 e^(-100/θ)/(3θ^5)
E. 84,375,000 e^(-150/θ)/θ^5


9.28 (4, 11/00, Q.23) (2.5 points) You are given:


(i) The parameter Θ has an inverse gamma distribution with probability density function:
g(θ) = 500 θ^(-4) e^(-10/θ), θ > 0.
(ii) The size of a claim has an exponential distribution with probability density function:
f(x | Θ = θ) = θ^(-1) e^(-x/θ), x > 0, θ > 0.
For a single insured, two claims were observed that totaled 50.
Determine the expected value of the next claim from the same insured.
(A) 5
(B) 12
(C) 15
(D) 20
(E) 25
9.29 (2 points) In the previous question, what is the probability that the next claim from the same
insured is greater than 30?
(A) 5%
(B) 7%
(C) 9%
(D) 11%
(E) 13%
9.30 (2 points) In 4, 11/00, Q.23, determine the variance of the distribution of the size of
the next claim from the same insured.

9.31 (4, 5/07, Q.30) (2.5 points) You are given:


(i) Conditionally, given θ, an individual loss X follows the exponential distribution with
probability density function: f(x | θ) = exp(-x/θ)/θ, 0 < x < ∞.
(ii) The prior distribution of θ is inverse gamma with probability density function:
π(θ) = c² exp(-c/θ)/θ³, 0 < θ < ∞.
(iii) ∫₀^∞ exp(-a/y)/y^n dy = (n-2)!/a^(n-1), n = 2, 3, 4, ....
Given that the observed loss is x, calculate the mean of the posterior distribution of θ.
(A) 1 / (x + c)

(B) 2 / (x + c)

(C) (x + c) / 2

(D) x + c

(E) 2(x + c)


Solutions to Problems:
9.1. D. g(δ) is an Inverse Gamma density function, with parameters θ = 6 and α = 4.
(α + 1 is the power to which 1/δ is raised, while θ is the number divided by δ in the exponential term.)
It has mean θ/(α - 1) = 6/3 = 2.
Each exponential severity has a mean of δ.
Thus, Mean of the portfolio = E[δ] = Mean of the Inverse Gamma Distribution = 2.
9.2. B. The prior distribution is an Inverse Gamma with θ = 6 and α = 4.
Thus F(x) = 1 - Γ(α; θ/x) = 1 - Γ(4; 6/x). F(4) - F(2) = {1 - Γ(4; 1.5)} - {1 - Γ(4; 3)} =
Γ(4; 3) - Γ(4; 1.5) = 0.353 - 0.066 = 0.287.
9.3. C. The (prior) marginal distribution is given by a Pareto with θ = 6 and α = 4.
(Note the mean is θ/(α - 1) = 2, which matches the answer to a previous question.)
Pareto density = αθ^α (θ + x)^(-(α+1)) = (4)(6^4)/(6 + x)^5 = 5184/(6 + x)^5.
9.4. C. The (prior) marginal distribution is given by a Pareto with θ = 6 and α = 4.
F(x) = 1 - {θ/(θ + x)}^α. F(3) = 1 - (6/9)^4 = 0.8025. F(5) = 1 - (6/11)^4 = 0.9115.
F(5) - F(3) = 0.9115 - 0.8025 = 10.9%.
9.5. D. For the Pareto, Variance = αθ²/{(α-2)(α-1)²}. Since θ = 6 and α = 4, Variance = 8.
Alternately, the total variance equals the expected value of the process variance plus the variance of
the hypothetical means = 6 + 2 = 8. (Using the solutions to the next two questions.)
9.6. C. The process variance for an exponential is δ². Therefore the expected value of the process
variance = E[δ²] = second moment of the Inverse Gamma = θ²/{(α-1)(α-2)} = 6²/{(4-1)(4-2)} = 6.
9.7. A. The mean for an exponential is δ. Therefore the variance of the hypothetical means
= Var[δ] = variance of the Inverse Gamma = 2nd moment - mean² = 6 - 2² = 2.


9.8. D. Using the solutions to the prior 2 questions, K = 6/2 = 3. (Alternately, K = α - 1 = 3.)
Thus, Z = 2/(2 + 3) = 40%. Prior estimate is 2. Observation is (3 + 5)/2 = 4.
Thus the new estimate = (40%)(4) + (60%)(2) = 2.8.
9.9. C. The Posterior Distribution is an Inverse Gamma with θ = 6 + 3 + 5 = 14 and
α = 4 + 2 = 6. f(δ) = θ^α e^(-θ/δ)/{Γ(α) δ^(α+1)} = (14^6) e^(-14/δ)/{Γ(6) δ^7}, which is proportional to e^(-14/δ) δ^(-7).
9.10. D. Mean of the Posterior Inverse Gamma is 14/(6 - 1) = 2.8.
9.11. E. The posterior distribution is an Inverse Gamma with θ = 14 and α = 6.
Thus F(x) = 1 - Γ(α; θ/x) = 1 - Γ(6; 14/x).
F(4) - F(2) = {1 - Γ(6; 3.5)} - {1 - Γ(6; 7)} = Γ(6; 7) - Γ(6; 3.5) = 0.699 - 0.142 = 0.557.
9.12. A. Variance of the Posterior Inverse Gamma = 2nd moment - mean² =
14²/{(6-1)(6-2)} - {14/(6-1)}² = 9.8 - 2.8² = 1.96.
9.13. D. The posterior distribution is an Inverse Gamma with θ = 14 and α = 6, with mean of 2.8
and standard deviation of √1.96 = 1.4.
Thus F(4) - F(2) ≅ Φ[(4 - 2.8)/1.4] - Φ[(2 - 2.8)/1.4] = Φ(0.86) - Φ(-0.57) = 0.8051 - 0.2843 = 0.5208.
Comment: Note that this differs from the exact answer of 0.557 obtained as the solution to a
previous question using values of the Incomplete Gamma Functions.
9.14. E. The (posterior) predictive distribution is Pareto with θ = 14 and α = 6.
Pareto density = αθ^α (θ + x)^(-(α+1)) = (6)(14^6)/(14 + x)^7 = 45,177,216/(14 + x)^7.
9.15. D. The (posterior) predictive distribution is given by a Pareto with θ = 14 and α = 6.
F(x) = 1 - {θ/(θ + x)}^α. F(2) = 1 - (14/16)^6 = 0.5512. F(4) = 1 - (14/18)^6 = 0.7786.
F(4) - F(2) = 0.7786 - 0.5512 = 22.7%.
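The table look-ups and the Normal approximation in 9.11 and 9.13 can be reproduced with a short Python sketch (an illustration added here, assuming scipy is available):

from scipy.special import gammainc   # regularized lower incomplete Gamma, i.e. Loss Models' Gamma(alpha; y)
from scipy.stats import norm

exact = gammainc(6, 14/2) - gammainc(6, 14/4)               # 9.11: about 0.557
approx = norm.cdf((4 - 2.8)/1.4) - norm.cdf((2 - 2.8)/1.4)  # 9.13: about 0.521
print(exact, approx)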


9.16. E. The posterior distribution is proportional to: e^(-100λ) λe^(-40λ) λe^(-80λ) = λ² e^(-220λ).
E[X | λ] = 1/λ. Therefore, the expected size of the next claim is:
∫₀^∞ (1/λ) λ² e^(-220λ) dλ / ∫₀^∞ λ² e^(-220λ) dλ = {220^(-2) Γ(2)}/{220^(-3) Γ(3)} = 220/2 = 110.
Alternately, let δ = 1/λ. λ = 1/δ. dλ/dδ = -1/δ². Then severity is Exponential with mean δ.
h(δ) = g(λ) |dλ/dδ| = 100 e^(-100/δ)/δ²,
an Inverse Gamma with α = 1 and θ = 100 (an Inverse Exponential).
α' = α + C = 1 + 2 = 3. θ' = θ + L = 100 + 120 = 220. θ'/(α' - 1) = 220/2 = 110.
Comment: A somewhat disguised Inverse Gamma-Exponential.
Since α = 1 ≤ 2, one can not apply Buhlmann Credibility.
The prior Inverse Gamma has no finite mean, and neither the EPV nor the VHM exist.
9.17. A. For the Exponential, f(x | q) = e^(-x/q)/q.
For the Inverse Gamma, π(q) = 1000^4 e^(-1000/q)/{Γ[4] q^(4+1)} = 1000^4 e^(-1000/q)/{3! q^5}.
The posterior distribution is proportional to: f(100 | q) f(200 | q) f(500 | q) π(q),
which is proportional to: (e^(-100/q)/q)(e^(-200/q)/q)(e^(-500/q)/q)(e^(-1000/q)/q^5) = e^(-1800/q)/q^8.
This is an Inverse Gamma Distribution with parameters 7 and 1800.
For the zero-one loss function, the Bayes estimate is the mode of the posterior distribution.
The mode of the Inverse Gamma distribution is: θ/(α + 1) = 1800/(7 + 1) = 225.
Comment: This is an example of an Inverse Gamma-Exponential.
α' = α + C = 4 + 3 = 7, and θ' = θ + L = 1000 + 800 = 1800.
For the squared error loss function, the Bayes estimate is the mean of the posterior distribution,
which in this case is: θ/(α - 1) = 1800/(7 - 1) = 300.


9.18. E., 9.19. D., 9.20. C. We are trying to estimate δ for Samantha.
The number of text messages of 5 acts like the number of claims.
The total time of 20 minutes acts like the total dollars of loss.
The posterior distribution of δ is Inverse Gamma with α = 4 + 5 = 9, and θ = 100 + 20 = 120.
After observations, this is the distribution of hypothetical means.
The estimate corresponding to the squared error loss function is the mean.
The mean of the posterior Inverse Gamma is: θ/(α - 1) = 120/(9 - 1) = 15.
The estimate corresponding to the absolute loss function is the median.
The distribution function of the posterior Inverse Gamma is: F(δ) = 1 - Γ[9; 120/δ].
Setting this equal to 50%: 0.50 = Γ[9; 120/δ]. ⇒ 120/δ = 8.67. ⇒ δ = 120/8.67 = 13.84.
The estimate corresponding to the zero-one loss function is the mode.
The mode of the posterior Inverse Gamma is: θ/(α + 1) = 120/(9 + 1) = 12.
Comment: Loss functions are discussed in Mahler's Guide to Buhlmann Credibility.
The zero-one loss function is kind of silly for actuarial work.
9.21. E., 9.22. B., 9.23. A. We are trying to estimate the next observation from Samantha.
The number of text messages of 5 acts like the number of claims.
The total time of 20 minutes acts like the total dollars of loss.
The posterior distribution of δ is Inverse Gamma with α = 4 + 5 = 9, and θ = 100 + 20 = 120.
Thus the predictive distribution is Pareto with α = 9, and θ = 120.
After observations, this is the distribution of observed lengths of time between text messages.
The estimate corresponding to the squared error loss function is the mean.
The mean of the predictive Pareto is: θ/(α - 1) = 120/(9 - 1) = 15.
The estimate corresponding to the absolute loss function is the median.
The median of the predictive Pareto is: θ{(1-p)^(-1/α) - 1} = (120)(0.5^(-1/9) - 1) = 9.607.
The estimate corresponding to the zero-one loss function is the mode.
The mode of the predictive Pareto is 0.
Comment: Loss functions are discussed in Mahler's Guide to Buhlmann Credibility.
The zero-one loss function is kind of silly for actuarial work. While the density of the predictive Pareto
is largest at zero, that is not a sensible estimate of the next observation.
The mean of the Inverse Gamma and the Pareto are equal; however, their modes and medians are
not.
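The three point estimates can also be checked with scipy's built-in distributions (an illustrative sketch added here, not part of the original solution; scipy's lomax plays the role of the Pareto):

from scipy.stats import invgamma, lomax

alpha, theta = 9, 120.0
posterior = invgamma(alpha, scale=theta)        # posterior distribution of delta
predictive = lomax(alpha, scale=theta)          # predictive Pareto distribution

print(posterior.mean(), posterior.median())     # 15.0 and about 13.84
print(theta / (alpha + 1))                      # mode of the posterior Inverse Gamma = 12
print(predictive.mean(), predictive.median())   # 15.0 and about 9.61; the predictive mode is 0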


9.24. The posterior distribution is proportional to: π(δ) f(10) f(5) f(15) f(20) = e^(-50/δ)/δ^4.
This is proportional to an Inverse Gamma Distribution with α = 3 and θ = 50, which must be the
posterior distribution.
Mixing Exponentials via this Inverse Gamma produces a Pareto predictive distribution, with
α = 3 and θ = 50. This Pareto has mean = θ/(α-1) = 25, second moment = (2)(50²)/2 = 2500, and
variance = 2500 - 25² = 1875.
Alternately, posterior EPV = E[δ²] = second moment of the posterior Inverse Gamma = (50²)/2 = 1250.
Posterior VHM = Variance of the posterior Inverse Gamma = 1250 - 25² = 625.
Posterior Total Variance = EPV + VHM = 1250 + 625 = 1875.
9.25. D. Using Bayes Theorem, the posterior probability weight for t is:
(chance of observation given t)(a priori probability) = (te^(-5t))(te^(-t)) = t² e^(-6t).
One has to normalize the probability weights so that they integrate to unity. This is proportional to a
Gamma density, with α = 3 and θ = 1/6. Therefore in order to be a true probability density function,
the constant in front must be 1/{θ^α Γ(α)} = 6³/Γ(3) = 216/2! = 108.
Therefore, the posterior distribution is: 108 t² e^(-6t).
Comment: Choices C and E do not integrate to unity; they are proportional to the correct choice D,
which does integrate to unity. While this is in fact a special case of the Inverse Gamma-Exponential
conjugate prior, this fact doesn't seem to aid in the speed of solution. One would have to convert the
exponential to the parameterization δ = 1/t, in which case h(t) becomes an Inverse Gamma with
respect to δ.


9.26. B. f(x) is an Exponential Distribution with mean θ. g(θ) is an Inverse Gamma Distribution with
parameters α = 3 and θ = 100. The Variance of the Hypothetical Means = Var[θ] =
Variance of the Inverse Gamma = 100²/{(α-1)²(α-2)} = 100²/{(2²)(1)} = 2500.
Alternately, one can compute the first and second moments of g, by integrating from zero to infinity
and using the change of variables y = 100/θ:
E[θ] = ∫ θ 500,000 e^(-100/θ)/θ^4 dθ = ∫ 500,000 (100/y) e^(-y) (y/100)^4 (100/y²) dy =
∫ 50 y e^(-y) dy = 50 Γ(2) = 50(1!) = 50.
E[θ²] = ∫ θ² 500,000 e^(-100/θ)/θ^4 dθ = ∫ 500,000 (100/y)² e^(-y) (y/100)^4 (100/y²) dy =
∫ 5000 e^(-y) dy = 5000. Thus the VHM = Var[θ] = 5000 - 50² = 2500.
Alternately, one can compute the first and second moments of g, by computing the -1 and -2
moments of the corresponding Gamma Distribution. If λ = 1/θ, then the distribution of λ is:
g(θ) |dθ/dλ| = 500,000 e^(-100λ) λ^4 |(-1/λ²)| = 500,000 e^(-100λ) λ², which is a Gamma Distribution with
parameters α = 3 and θ = 1/100. The nth moment of a Gamma is: θ^n Γ(α+n)/Γ(α).
The -1 moment of this Gamma is: θ^(-1) Γ(α-1)/Γ(α) = 1/{θ(α-1)} = 100/2 = 50.
The -2 moment of this Gamma is: θ^(-2) Γ(α-2)/Γ(α) = θ^(-2)/{(α-1)(α-2)} = 100²/2 = 5000.
Thus as above, the VHM = 5000 - 50² = 2500.
Comment: In this case, we were given the first moment of g(θ).


9.27. E. The prior distribution of θ, g(θ), is an Inverse Gamma Distribution with parameters
α = 3 and scale 100. The posterior distribution of θ is also an Inverse Gamma Distribution, but with
parameters α = 3 + 1 = 4 and scale = 100 + 50 = 150.
That is a density of: 150^4 e^(-150/θ)/{Γ(4) θ^5} = (506,250,000/6) e^(-150/θ)/θ^5 = 84,375,000 e^(-150/θ)/θ^5.
Alternately, if one observes 1 claim of size 50, the chance of the observation given θ
is: e^(-50/θ)/θ. The posterior distribution of θ is proportional to:
(e^(-50/θ)/θ) g(θ) = (e^(-50/θ)/θ) 500,000 e^(-100/θ)/θ^4 = 500,000 e^(-150/θ) θ^(-5).
To get the posterior distribution, we need to divide by the integral of this quantity:
∫₀^∞ (e^(-50/θ)/θ) g(θ) dθ = ∫₀^∞ 500,000 e^(-150/θ) θ^(-5) dθ.
Using the change of variables y = 150/θ, the integral becomes:
∫₀^∞ 500,000 e^(-y) (y/150)^5 (150/y²) dy = (500,000/150^4) ∫₀^∞ y³ e^(-y) dy = (500,000/150^4) Γ(4).
Thus the posterior distribution of θ is:
500,000 e^(-150/θ) θ^(-5) / {(500,000/150^4) Γ(4)} = 84,375,000 e^(-150/θ)/θ^5.


9.28. C. This is an Inverse Gamma - Exponential Conjugate Prior. The prior Inverse Gamma has
parameters: α = 3 and θ = 10. There are 2 claims and $50 of loss, thus the posterior Inverse
Gamma has parameters: α = 3 + 2 = 5 and θ = 10 + 50 = 60.
The mean of the posterior Inverse Gamma is: θ/(α-1) = 60/4 = 15.
Alternately, the predictive distribution is Pareto with α = 5 and θ = 60, and mean:
θ/(α - 1) = 60/4 = 15.
Alternately, for the Inverse Gamma-Exponential, Buhlmann Credibility produces the same estimate
as Bayesian Analysis. K = α - 1 = 2. Z = 2/(2 + K) = 1/2.
Prior mean is: θ/(α-1) = 10/2 = 5. Estimate = (1/2)(50/2) + (1/2)(5) = 15.
Alternately, the sum of two independent identically distributed Exponential Distributions each with
mean θ is a Gamma distribution with α = 2 and scale θ. Therefore, the chance of the observation of 2
claims totaling $50 is the density at 50 of a Gamma distribution with α = 2 and scale θ,
which is proportional to θ^(-2) e^(-50/θ). By Bayes Theorem the posterior distribution of θ is
proportional to: (θ^(-4) e^(-10/θ))(θ^(-2) e^(-50/θ)) = θ^(-6) e^(-60/θ). This is an Inverse Gamma Distribution with
parameters: α = 5 and θ = 60, and mean: 60/(5-1) = 15.
9.29. E. This is an Inverse Gamma - Exponential Conjugate Prior. The prior Inverse Gamma has
parameters: α = 3 and θ = 10. There are 2 claims and $50 of loss,
thus the posterior Inverse Gamma has parameters: α = 3 + 2 = 5, and θ = 10 + 50 = 60.
The predictive distribution is Pareto with α = 5 and θ = 60.
For the predictive Pareto Distribution, S(30) = {60/(60 + 30)}^5 = 0.132.
9.30. The predictive distribution is Pareto with α = 5 and θ = 60.
For this predictive Pareto Distribution:
mean = 60/(5 - 1) = 15. 2nd moment = (2)(60²)/{(5 - 1)(5 - 2)} = 600. Variance = 600 - 15² = 375.
Comment: We have computed the variance of the posterior mixed distribution.


9.31. C. An Inverse Gamma - Exponential, with α = 2 and θ = c.
The posterior Inverse Gamma has parameters:
α' = α + C = 2 + 1 = 3, and θ' = θ + L = c + x.
The mean of the posterior Inverse Gamma is: θ'/(α' - 1) = (x + c)/2.
Alternately, one can apply Bayes Theorem.
The posterior distribution of θ is: f(x | θ) π(θ) / ∫₀^∞ f(x | θ) π(θ) dθ.
∫₀^∞ f(x | θ) π(θ) dθ = c² ∫₀^∞ exp[-(c + x)/θ]/θ^4 dθ = c² 2!/(x + c)³, using the hint with n = 4 and a = x + c.
∫₀^∞ θ f(x | θ) π(θ) dθ = c² ∫₀^∞ exp[-(c + x)/θ]/θ³ dθ = c² 1!/(x + c)², using the hint with n = 3 and a = x + c.
The mean of the posterior distribution of θ is:
∫₀^∞ θ f(x | θ) π(θ) dθ / ∫₀^∞ f(x | θ) π(θ) dθ = (x + c)/2.
Comment: Since x is a loss, we would expect x to appear in the numerator rather than the
denominator of the expected future loss, eliminating choices A and B.


Section 10, Normal-Normal60


The Normal-Normal is a fourth example of a conjugate prior situation. Like the Inverse Gamma-Exponential, it involves a mixture of claim severity rather than frequency parameters across a
portfolio of risks. Unfortunately, unlike the Gamma-Poisson, where the Gamma, Poisson, and
Negative Binomial Distributions each take different roles, in the Normal-Normal the Normal
Distribution takes on all of these roles. Thus this is not the first Conjugate Prior situation one should
learn.
The sizes of claims a particular policyholder makes are assumed to be Normal with mean m and
known fixed variance s².61
Given m, the distribution function of the size of loss is: Φ[(x - m)/s], while the density of
the size of loss distribution is: φ[(x - m)/s]/s = exp[-(x - m)²/(2s²)]/{s√(2π)}.
So for example if s = 3, then the probability density of a claim being of size 8 is:
exp(-(8 - m)²/18)/{3√(2π)}.
If m = 2 this density is: exp(-2)/{3√(2π)} = 0.018,
while if m = 20 this density is: exp(-8)/{3√(2π)} = 0.000045.
Prior Distribution:
Assume that the values of m are given by another Normal Distribution with mean 7 and
standard deviation of 2, with probability density function:
f(m) = exp[-(m - 7)²/8]/{2√(2π)}, -∞ < m < ∞.
Note that the mean of this distribution, 7, is the a priori estimate of claim severity.

60 The Normal-Normal is discussed in Loss Models at Examples 5.5, 20.13, and Exercise 20.35.
61 Note I've used roman letters for the parameters of the Normal likelihood, in order to distinguish them from the parameters of the Normal prior distribution discussed below.


Below is displayed this prior distribution of hypothetical mean severities:62


[Graph of the prior density of the hypothetical mean severities omitted.]
62 There is a very small but positive chance that the mean severity will be negative. There is always a positive chance that the mean severity will be negative for the Normal-Normal conjugate prior.

Marginal Distribution (Prior Mixed Distribution):


If we have a risk and do not know what type it is, in order to get the chance of the next claim being of
size 8, one would weight together the chances of having a claim of size 8 given m:
exp(-(8-m)²/18)/{3√(2π)}, using the a priori probabilities of m:
f(m) = exp(-(m-7)²/8)/{2√(2π)}, and integrating from minus infinity to infinity:

∫ exp(-(8-m)²/18)/{3√(2π)} f(m) dm = ∫ exp(-(8-m)²/18)/{3√(2π)} exp(-(m-7)²/8)/{2√(2π)} dm

= ∫ {1/(6√(2π))} exp[-{(8-m)²/18 + (m-7)²/8}]/√(2π) dm

= ∫ {1/(6√(2π))} exp[-{13m² - 190m + 697}/72]/√(2π) dm

= ∫ {1/(6√(2π))} exp[-{m² - (190/13)m + (95/13)² + (697/13) - (95/13)²}/(72/13)]/√(2π) dm

= exp[(-36/13²)/(72/13)]/{6√(2π)} ∫ exp[-(m - 95/13)²/{2(6/√13)²}]/√(2π) dm

= {exp(-1/26)/(6√(2π))} (6/√13) = exp(-1/26)/{√13 √(2π)} = 0.1065.

Where we have used the fact that a Normal Density integrates to unity:63

∫ exp[-(m - 95/13)²/{2(6/√13)²}]/{(6/√13)√(2π)} dm = 1.

More generally, if the distribution of hypothetical means m is given by a Normal Distribution
f(m) = exp(-(m-μ)²/(2σ²))/{σ√(2π)}, and we compute the chance of having a claim of size x by
integrating from minus infinity to infinity:64

∫ exp(-(x-m)²/(2s²))/{s√(2π)} f(m) dm = ∫ exp(-(x-m)²/(2s²))/{s√(2π)} exp(-(m-μ)²/(2σ²))/{σ√(2π)} dm

= ∫ {1/(sσ√(2π))} exp[-{(x-m)²/(2s²) + (m-μ)²/(2σ²)}]/√(2π) dm

= ∫ {1/(sσ√(2π))} exp[-{(s² + σ²)m² - (xσ² + μs²)2m + (x²σ² + μ²s²)}/(2s²σ²)]/√(2π) dm.

63 With mean of 95/13 and standard deviation of 6/√13.
64 Note that I've used Greek letters for the parameters of the prior Normal Distribution, while I used roman letters for the parameters of the Normal likelihood.


Let ω² = s²σ²/(s² + σ²), ν = (xσ² + μs²)/(s² + σ²), and ψ = (x²σ² + μ²s²)/(s² + σ²),
then this integral equals:

∫ {1/(sσ√(2π))} exp[-(m² - 2νm + ψ)/(2ω²)]/√(2π) dm

= ∫ {1/(sσ√(2π))} exp[-{m² - 2νm + ν² - ν² + ψ}/(2ω²)]/√(2π) dm

= {1/(sσ√(2π))} exp[-(ψ - ν²)/(2ω²)] ∫ exp[-(m - ν)²/(2ω²)]/√(2π) dm

= {1/(sσ√(2π))} exp[-(ψ - ν²)/(2ω²)] ω = exp[-(ψ - ν²)/(2ω²)]/{√(s² + σ²) √(2π)}.

Where we have used the fact that a Normal Density integrates to unity:65

∫ exp[-(m - ν)²/(2ω²)]/{ω√(2π)} dm = 1.

Note that ν² - ψ = (x²σ⁴ + 2xμσ²s² + μ²s⁴ - {x²s²σ² + x²σ⁴ + μ²s⁴ + μ²σ²s²})/(s² + σ²)²
= (2xμσ²s² - x²s²σ² - μ²σ²s²)/(s² + σ²)² = -(x - μ)²σ²s²/(s² + σ²)².
Thus, (ν² - ψ)/ω² = {-(x - μ)²σ²s²/(s² + σ²)²}(s² + σ²)/(s²σ²) = -(x - μ)²/(s² + σ²).
Thus the marginal distribution can be put back in terms of x, s, μ, and σ:
{1/(√(s² + σ²)√(2π))} exp[(ν² - ψ)/(2ω²)] = {1/(√(s² + σ²)√(2π))} exp[-(x - μ)²/{2(s² + σ²)}].
This is a Normal Distribution with mean μ and variance s² + σ².
Thus if the likelihood is a Normal Distribution with variance s² (fixed and known),
and the prior distribution of the hypothetical means of the likelihood is also a Normal,
but with mean μ and variance σ²,
then the Marginal Distribution (Prior Mixed Distribution) is yet
a third Normal Distribution with mean μ and variance s² + σ².
65 With mean of ν and standard deviation of ω.


As with the other Conjugate Priors discussed above, the mean of the likelihood is what is varying
among the insureds in the portfolio. Therefore, the mean of the marginal distribution is equal to that of
the prior distribution, in this case μ.
For the specific case dealt with previously: s = 3, μ = 7, and σ = 2, the marginal distribution is a
Normal Distribution with a mean of 7 and variance of: 3² + 2² = 13.
Thus the chance of having a claim of size x is: exp[-(x - 7)²/26]/{√13 √(2π)}.
For x = 8 this chance is: exp(-1/26)/{√13 √(2π)} = 0.1065.


This is the same result as calculated above.
For the Normal-Normal the marginal distribution is always a Normal, with mean equal to
that of the prior Normal66 and variance equal to the sum of the variances of the prior
Normal and the Normal likelihood.67
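The marginal density at x = 8 can be confirmed numerically; the following is a minimal Python sketch added here for illustration (not from the original text; it assumes scipy is available):

from scipy import integrate
from scipy.stats import norm

s, mu, sigma, x = 3.0, 7.0, 2.0, 8.0
mixed, _ = integrate.quad(lambda m: norm.pdf(x, loc=m, scale=s) * norm.pdf(m, loc=mu, scale=sigma), -60, 60)
closed_form = norm.pdf(x, loc=mu, scale=(s**2 + sigma**2) ** 0.5)
print(mixed, closed_form)    # both approximately 0.1065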
Prior Expected Value of the Process Variance:
The process variance of the severity for an individual risk is s², since the severity for each risk is a
Normal with fixed variance s². Therefore the expected value of the process variance = E[s²] = s².
Thus for s = 3, EPV = 3² = 9.
Prior Variance of the Hypothetical Means:
The variance of the hypothetical mean severities is the variance of m = Var[m] =
Variance of the Prior Normal = σ² = 2² = 4.
Prior Total Variance:
The total variance = the variance of the marginal Normal Distribution = s² + σ² = 3² + 2² = 13.
Expected Value of the Process Variance + Variance of the Hypothetical Means = 9 + 4
= 13 = Total Variance.
The fact that the EPV + VHM = Total Variance
is one way to remember the variance of the marginal Normal.
66 This fact follows from the fact that the prior distribution is parametrizing the mean severities of the likelihoods.
67 As will be discussed below, the EPV is the variance of the Normal Likelihood, the VHM is the variance of the prior Normal, and the total variance is the variance of the marginal distribution. Thus this relationship follows from the general fact that the total variance is the sum of the EPV and VHM.


EPV = s² = variance of the Normal Likelihood.
VHM = σ² = variance of the Normal Prior.
Variance of the marginal Normal = Total Variance = EPV + VHM = s² + σ².
Observations:
Let us now introduce the concept of observations. A risk is selected at random and it is observed to
have 5 claims of sizes: 8, 7, 5, 4 and 3. Note that for the forthcoming analysis all that will be
important is that there were 5 claims totaling 27.
Posterior Distribution:
We can employ Bayesian analysis to compute what the chances are that the selected risk had a
given hypothetical mean.
Given a Normal severity distribution with mean m and variance 9, the chance of observing a claim of
size 8 is: exp(-(8 - m)²/18)/{3√(2π)}.
The chance of having 5 claims of sizes 8, 7, 5, 4 and 3 is the product of five likelihoods, which is
proportional to:
exp[-(8-m)²/18] exp[-(7-m)²/18] exp[-(5-m)²/18] exp[-(4-m)²/18] exp[-(3-m)²/18]
= exp[-{(8-m)² + (7-m)² + (5-m)² + (4-m)² + (3-m)²}/18] = exp[-{163 - 54m + 5m²}/18].
The a priori probability of m is the Prior Normal distribution: f(m) = exp(-(m-7)²/8)/{2√(2π)}.
Thus the posterior chance of m is proportional to the product of the chance of observation and the a
priori probability:
exp[-{163 - 54m + 5m²}/18] exp[-(m-7)²/8] = exp[-{1093 - 342m + 29m²}/72]
= exp[-{(1093/29) - (342/29)m + m²}/{2(36/29)}].
This is proportional to the density for a Normal distribution with mean 171/29 and variance 36/29.


Below the prior Normal with μ = 7 and σ = 2, and this posterior Normal with μ = 171/29 and
σ = 6/√29, are compared. [Graph omitted.]

In general, if one observes C claims totaling L losses, we have that the chance of the observation
given m is a product of terms proportional to: exp[-{-2Lm + Cm²}/(2s²)].
The prior Normal distribution is proportional to: exp[-(-2μm + m²)/(2σ²)].
The posterior probability for m is therefore proportional to:
exp[-{-2Lm + Cm²}/(2s²)] exp[-(-2μm + m²)/(2σ²)]
= exp[-{-2(Lσ² + μs²)m + (Cσ² + s²)m²}/(2s²σ²)]
= exp[-{-2((Lσ² + μs²)/(Cσ² + s²))m + m²}/{2s²σ²/(Cσ² + s²)}].
This is proportional to the density of a Normal distribution with
new mean = (Lσ² + μs²)/(Cσ² + s²) and new variance = s²σ²/(Cσ² + s²).
Thus for the Normal-Normal the posterior density function is also a Normal.
This posterior Normal has a mean = (Lσ² + μs²)/(Cσ² + s²), and variance = s²σ²/(Cσ² + s²).


For example, in the case where we observed 5 claims totaling 27, C = 5 and L = 27, the prior
Normal had mean μ = 7 and variance σ² = 4, and the Normal Likelihoods had variance s² = 9;
the posterior Normal has a mean of:
(Lσ² + μs²)/(Cσ² + s²) = {(27)(4) + (7)(9)}/{(5)(4) + 9} = 171/29,
and variance of: s²σ²/(Cσ² + s²) = (4)(9)/{(5)(4) + 9} = 36/29, matching the result obtained above.
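In Python the update is only a few lines (a sketch added here for illustration, not from the original text):

s2, mu, sigma2 = 9.0, 7.0, 4.0          # likelihood variance, prior mean, prior variance
C, L = 5, 27.0                          # number of claims and total losses

post_mean = (L * sigma2 + mu * s2) / (C * sigma2 + s2)    # 171/29, about 5.897
post_var = (s2 * sigma2) / (C * sigma2 + s2)              # 36/29, about 1.241
pred_var = s2 + post_var                                  # 297/29, about 10.241
print(post_mean, post_var, pred_var)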
The fact that the posterior distribution is of the same form as the prior distribution is why the Normal is
a Conjugate Prior Distribution for the Normal (fixed severity).
Posterior Mean:
One can compute the means and variances posterior to the observations. The posterior mean can
be computed by weighting together the means for each type of risk, using the posterior
probabilities. This is E[m] = the mean of the posterior Normal = (Lσ² + μs²)/(Cσ² + s²) =
{(27)(4) + (7)(9)}/{(5)(4) + 9} = 171/29.
Thus the new estimate posterior to the observations for this risk using Bayesian Analysis is 171/29.
This compares to the a priori estimate of 7. In general, the observations provide information about
the given risk, which allows one to make a better estimate of the future experience of that risk. Not
surprisingly observing 5 claims totaling 27 (for an average severity of 5.4) has lowered the
estimated future mean severity from 7 to 171/29 = 5.9.
Posterior Expected Value of the Process Variance:
Just as prior to the observations, posterior to the observations one can compute three variances:
the expected value of the process variance, the variance of the hypothetical pure premiums, and the
total variance.
The process variance of the severity for an individual risk is s², since the severity for each risk is a
Normal with fixed variance s². Therefore the expected value of the process variance =
the expected value of s² = s² = 9.
Posterior Variance of the Hypothetical Means:
The variance of the hypothetical mean severities is the variance of m = Var[m] =
Variance of the Posterior Normal = s²σ²/(Cσ² + s²) = 36/29.


Posterior Total Variance:


The posterior total variance = the posterior Expected Value of the Process Variance +
the posterior Variance of the Hypothetical Means = Variance of the Normal Likelihood +
Variance of the Posterior Normal = s² + s²σ²/(Cσ² + s²).
For this example: s² + s²σ²/(Cσ² + s²) = 9 + 36/29 = 297/29.
This total variance is the variance of the predictive distribution.


Predictive Distribution:
Since the posterior distribution is also a Normal distribution, the same analysis that led to a Normal
(prior) marginal distribution, will lead to a (posterior) predictive distribution that is Normal. However,
the parameters are related to the posterior Normal.
For the Normal-Normal the predictive distribution is always a Normal with
mean = (Lσ² + μs²)/(Cσ² + s²), and variance = s² + s²σ²/(Cσ² + s²).
In the particular example, the predictive distribution is a Normal with mean 171/29 and variance
297/29. Below are compared the prior marginal Normal with μ = 7 and σ = √13,
and this posterior predictive Normal with μ = 171/29 and σ = √(297/29). [Graph omitted.]


Buhlmann Credibility:
Next, let's apply Buhlmann Credibility to this example.
The Buhlmann Credibility parameter
K = the (prior) expected value of the process variance / the (prior) variance of the hypothetical means = 9/4.
Note that K can be computed prior to any observation and doesn't depend on them.
Specifically, both variances are for a single insured for one trial.
For the Normal-Normal in general, the Buhlmann Credibility parameter
K = s²/σ², where σ² is the variance of the prior Normal and s² is the variance of each of the
Normal Likelihoods.
For this example, K = 9/4.
Having observed 5 claims, Z = 5 / {5 + (9/4)} = 20/29 = 0.690.
The observed severity = 27/5.
The a priori mean = 7.
Thus the estimated future severity is: (20/29)(27/5) + (9/29)(7) = 171/29.
Note that in this case the estimate from Buhlmann Credibility matches the estimate from
Bayesian Analysis.
For the Normal-Normal the estimates from using Bayesian Analysis and Buhlmann
Credibility are equal.68
For the Normal-Normal, it is easier to apply Buhlmann Credibility than Bayes Analysis.
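A short Python sketch (added here for illustration, not part of the original text) showing that the Buhlmann estimate reproduces the Bayesian answer for this example:

s2, sigma2, mu = 9.0, 4.0, 7.0
C, L = 5, 27.0

K = s2 / sigma2                             # 2.25
Z = C / (C + K)                             # 20/29
buhlmann_estimate = Z * (L / C) + (1 - Z) * mu
print(K, Z, buhlmann_estimate)              # 2.25, about 0.690, 171/29 = 5.897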
Summary:
The many different aspects of the Normal-Normal Conjugate Prior Severity Process are
summarized below. Be sure to be able to clearly distinguish between the situation prior to
observations and that posterior to the observations. The Normal-Normal is far and away the least
important of the four conjugate prior situations to learn for your exam.
While applying Bayes Analysis to this situation is very difficult, one can apply Buhlmann credibility
to this situation without memorizing anything. One can determine the EPV and VHM by just
applying their definitions.
68 As discussed in a subsequent section, this is a special case of the general results for conjugate priors of members of linear exponential families. This is another example of what Loss Models calls exact credibility.


Normal-Normal Severity Process

Normal Prior (Distribution of Parameters): f(m) = φ((m - μ)/σ)/σ, mean = μ, variance = σ².
Mixing with the Normal Severity Process (fixed variance s², mean m; Size of Loss) gives the
Normal Marginal Distribution: Mean = μ = mean of the prior Normal Distribution. Variance = s² + σ².

Observations: $ of Loss = L, # of claims = C.

Normal Posterior (Distribution of Parameters): Mean = (Lσ² + μs²)/{Cσ² + s²}. Variance = s²σ²/{Cσ² + s²}.
Mixing with the Normal Severity Process (fixed variance s², mean m; Size of Loss) gives the
Normal Predictive Distribution: Mean = (Lσ² + μs²)/{Cσ² + s²} = mean of the posterior Normal Distribution.
Variance = s² + s²σ²/{Cσ² + s²}.

The Normal is a Conjugate Prior; the Normal (fixed variance) is a Member of a Linear Exponential Family.
Buhlmann Credibility Estimate = Bayes Analysis Estimate.
K = Variance of Normal Likelihood / Variance of Normal Prior = s²/σ².

The Means of the Normal Severity Distributions of the individuals making up the entire portfolio are distributed via a
Normal Distribution with parameters μ and σ: f(m) = exp[-(m - μ)²/(2σ²)]/{σ√(2π)}.


Problems:
Use the following information to answer the next 15 questions:
The size of claim distribution for any particular policyholder is Normal with mean m and variance 25.
The m values of the portfolio of policyholders have probability density function:
f(m) = exp[-(m - 100)²/288]/{12√(2π)}, -∞ < m < ∞.

10.1 (1 point) What is the mean claim size for the portfolio?
A. less than 20
B. at least 20 but less than 40
C. at least 40 but less than 80
D. at least 80 but less than 160
E. at least 160
10.2 (1 point) What is the total variance of the claim severity for the portfolio?
A. less than 135
B. at least 135 but less than 145
C. at least 145 but less than 155
D. at least 155 but less than 165
E. at least 165
10.3 (1 point) What is the probability that an insured picked at random from this portfolio will have an
expected mean severity m between 110 and 120?
A. less than 15%
B. at least 15% but less than 25%
C. at least 25% but less than 35%
D. at least 35% but less than 45%
E. at least 45%
10.4 (1 point) What is the value of the probability density function for the claim sizes of the entire
portfolio?
A. exp[-(m - 100)²/338]/{13√(2π)}
B. exp[-(m - 100)²/288]/{12√(2π)}
C. exp[-(m - 100)²/98]/{7√(2π)}
D. exp[-(m - 100)²/50]/{5√(2π)}
E. None of A, B, C, or D


10.5 (1 point) What is the probability that a claim picked at random from this portfolio will be of size
between 100 and 120?
A. less than 15%
B. at least 15% but less than 25%
C. at least 25% but less than 35%
D. at least 35% but less than 45%
E. at least 45%
10.6 (1 point) What is the expected value of the process variance for the claim severity?
A. less than 20
B. at least 20 but less than 40
C. at least 40 but less than 80
D. at least 80 but less than 160
E. at least 160
10.7 (1 point) What is the variance of the hypothetical mean severities?
A. less than 20
B. at least 20 but less than 40
C. at least 40 but less than 80
D. at least 80 but less than 160
E. at least 160
10.8 (2 points) An insured has 3 claims of sizes 95, 115, and 120.
Using Buhlmann Credibility what is the estimate of this insured's expected future claim severity?
A. less than 97
B. at least 97 but less than 100
C. at least 100 but less than 103
D. at least 103 but less than 106
E. at least 106
10.9 (1 point) An insured has 3 claims of sizes 95, 115, and 120.
What is the mean of the posterior severity distribution?
A. less than 97
B. at least 97 but less than 100
C. at least 100 but less than 103
D. at least 103 but less than 106
E. at least 106


10.10 (2 points) An insured has 3 claims of sizes 95, 115, and 120.
What is the variance of the posterior severity distribution?
A. less than 8.0
B. at least 8.0 but less than 8.1
C. at least 8.1 but less than 8.2
D. at least 8.2 but less than 8.3
E. at least 8.3
10.11 (1 point) An insured has 3 claims of sizes 95, 115, and 120.
Which of the following is proportional to the posterior probability density function for this insured's
mean expected severity?
A. exp[-(m - 100)²/15.75]
B. exp[-(m - 105)²/15.75]
C. exp[-(m - 100)²/338]
D. exp[-(m - 105)²/338]
E. None of A, B, C, or D
10.12 (1 point) An insured has 3 claims of sizes 95, 115, and 120. What is the probability that this
insured will have an expected future mean severity m between 110 and 115?
A. less than 15%
B. at least 15% but less than 25%
C. at least 25% but less than 35%
D. at least 35% but less than 45%
E. at least 45%
10.13 (2 points) An insured has 3 claims of sizes 95, 115, and 120.
Which of the following is proportional to the predictive distribution of the severity for this insured?
A. exp[-(m - 100)^2 / 65.76]
B. exp[-(m - 100)^2 / 15.75]
C. exp[-(m - 108.78)^2 / 65.76]
D. exp[-(m - 108.78)^2 / 15.75]
E. None of A, B, C, or D

10.14 (1 point) An insured has 3 claims of sizes 95, 115, and 120.
What is the probability that the next claim from this insured will be of size between 115 and 120?
A. less than 15%
B. at least 15% but less than 25%
C. at least 25% but less than 35%
D. at least 35% but less than 45%
E. at least 45%
10.15 (2 points) An insured has 3 claims of sizes 95, 115, and 120.
Using Bayesian Analysis what is the estimate of this insured's expected future claim severity?
A. less than 97
B. at least 97 but less than 100
C. at least 100 but less than 103
D. at least 103 but less than 106
E. at least 106

Use the following information for the next two questions:

The size of claim distribution for any particular policyholder is LogNormal, with parameters μ and σ = 1.7.
The μ values of the portfolio of policyholders have a Normal Distribution with mean 3.8 and variance 2.25.
For a particular policyholder you observe 5 claims of sizes: 120, 160, 210, 270, and 380.

10.16 (4 points) Use Bayes Analysis to estimate the expected future claim severity for this
policyholder.
Hint: Work with the log claim sizes and apply Bayes Analysis to the resulting Normal-Normal
Conjugate Prior in order to get the predictive Normal distribution. Then convert the predictive
Normal Distribution to a LogNormal Distribution.
A. 400
B. 500
C. 600
D. 700
E. 800
10.17 (5 points) Use Buhlmann Credibility to estimate the expected future claim severity for this
policyholder.
A. 400
B. 500
C. 600
D. 700
E. 800

Use the following information to answer the next 4 questions:

The size of claim distribution for any particular Manufacturing Class is assumed
to be Normal with mean m and variance 1.2 million.

The hypothetical means, m, are assumed to be normally distributed over the Manufacturing Industry Group:
f(m) = exp[-(m - 6000)^2 / 68,450] / {185 √(2π)}, -∞ < m < ∞.
For the Widget Manufacturing Class, one observes 72 claims for a total of 350,000.

10.18 (2 points) Using Buhlmann Credibility, what is the estimated future average claim severity for
the Widget Manufacturing Class?
Hint: The density of the Normal Distribution is: f(x) = exp[-(x - μ)^2 / (2σ^2)] / {σ √(2π)}, -∞ < x < ∞.

A. Less than 5100
B. At least 5100, but less than 5200
C. At least 5200, but less than 5300
D. At least 5300, but less than 5400
E. At least 5400

10.19 (1 point) What is the mean of the posterior distribution of m?


A. Less than 5100
B. At least 5100, but less than 5200
C. At least 5200, but less than 5300
D. At least 5300, but less than 5400
E. At least 5400
10.20 (1 point) What is the variance of the posterior distribution of m?
A. Less than 8,000
B. At least 8,000, but less than 9,000
C. At least 9,000, but less than 10,000
D. At least 10,000, but less than 11,000
E. At least 11,000
10.21 (2 points) What is the probability that the Widget Manufacturing Class has an expected
future mean severity m between 5000 and 5500?
A. 94%
B. 95%
C. 96%
D. 97%
E. 98%

10.22 (2 points) You are given the following:


The IQs of actuaries are normally distributed with mean 135 and standard deviation 10.
Each actuary's score on an IQ test is normally distributed around his true IQ,
with standard deviation of 15.
Abbie the actuary scores a 155 on an IQ test.
Using Buhlmann Credibility, what is the estimate of Abbie's IQ?
A. 139
B. 141
C. 143
D. 145
E. 147

Solutions to Problems:
10.1. D. f(m) is a Normal density function with mean of 100 (and variance of 144.)
Each Normal severity has a mean of m.
Thus, Mean of the portfolio = E[m] = Mean of the prior Normal Distribution = 100.
10.2. E. f(m) is a Normal density function with mean of 100 and variance of 144.
The variance of the portfolio is the sum of the variance of the prior Normal and the (fixed) variance of
the Normal Severity Processes = 144 + 25 = 169.
10.3. B. f(m) is a Normal density function with mean of 100 and standard deviation of 12.
Thus the probability that an insured picked at random from this portfolio will have a mean severity m
between 110 and 120 is: Φ[(120 - 100)/12] - Φ[(110 - 100)/12] =
Φ(1.67) - Φ(0.83) = 0.9525 - 0.7967 = 0.1558.
Comment: Note that we picked an insured at random and asked about its expected mean severity.
This is different than picking a claim at random and asking about its size. The former is the
hypothetical mean with a distribution with a variance equal to the VHM (by definition),
while the latter has a distribution with variance equal to the total variance.
10.4. A. The (prior) marginal distribution is a Normal, with mean of 100 and variance of:
144 + 25 = 169. Thus the probability density function that a claim chosen at random will be of size x
is given by: exp[-(x - 100)^2 / 338] / {13 √(2π)}.
10.5. D. The (prior) marginal distribution is a Normal, with μ = 100 and σ^2 = 144 + 25 = 169.
The standard deviation is √169 = 13. Thus the chance of a claim being in the interval from 100 to 120 is:
Φ[(120 - 100)/13] - Φ[(100 - 100)/13] = Φ(1.54) - Φ(0) = 0.9382 - 0.5 = 0.4382.


10.6. B. We are given that each insured has a severity process with variance 25.
Thus each process variance is 25 and so is their expected value over the portfolio.
10.7. D. The hypothetical mean severity for each insured is m. The distribution of m is given as
Normal with mean of 100 and variance of 144. Thus the variance of the hypothetical mean severities
is 144.
Comment: Note that the total variance of 169, the variance of the (prior) marginal distribution, is equal
to the sum of the (prior) EPV of 25 and the (prior) VHM of 144.

10.8. E. The Buhlmann Credibility Parameter K = EPV / VHM =
(Variance of the Normal Likelihood) / (Variance of the Normal Prior Distribution) = 25/144.
Thus for three claims, Z = 3/(3 + 25/144) = 432/457.
The prior mean is 100. The observed mean is (95 + 115 + 120)/3 = 110.
The Buhlmann Credibility Estimate = (110)(432/457) + (100)(25/457) = 50,020/457 = 109.45.
10.9. E. The posterior distribution is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(330)(144) + (100)(25)} / {(3)(144) + 25} = 50,020/457 = 109.45.
(Here s^2 = 25 is the variance of each insured's Normal severity, σ^2 = 144 and μ = 100 are the variance and mean of the prior Normal, C = 3 is the number of claims, and L = 95 + 115 + 120 = 330 is their sum.)
10.10. A. The posterior distribution is a Normal, with variance equal to:
s^2 σ^2 / (s^2 + C σ^2) = (25)(144) / {25 + (3)(144)} = 7.8775.
10.11. E. The posterior distribution is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(330)(144) + (100)(25)} / {(3)(144) + 25} = 109.45,
and variance equal to: s^2 σ^2 / (s^2 + C σ^2) = (25)(144) / {25 + (3)(144)} = 7.8775.
Thus the posterior density is: exp[-(m - 109.45)^2 / 15.75] / {2.807 √(2π)}.
10.12. D. The posterior distribution is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(330)(144) + (100)(25)} / {(3)(144) + 25} = 109.45,
and variance equal to: s^2 σ^2 / (s^2 + C σ^2) = (25)(144) / {25 + (3)(144)} = 7.8775.
Thus the probability that this insured will have an expected future mean severity between 110 and
115 is: Φ[(115 - 109.45)/2.807] - Φ[(110 - 109.45)/2.807] =
Φ(1.98) - Φ(0.20) = 0.9761 - 0.5793 = 0.3968.
10.13. E. The (posterior) predictive distribution is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(330)(144) + (100)(25)} / {(3)(144) + 25} = 109.45,
and variance equal to: s^2 + s^2 σ^2 / (s^2 + C σ^2) = 25 + ((25)(144) / {25 + (3)(144)}) = 32.8775.
Thus the (posterior) predictive density is: exp[-(m - 109.45)^2 / 65.76] / {5.734 √(2π)}.
Comment: The posterior total variance of 32.8775, the variance of the (posterior) predictive
distribution, is equal to the sum of the posterior EPV of 25 and the posterior VHM of 7.8775.
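As a cross-check on Solutions 10.8 through 10.14, here is a minimal numerical sketch of the Normal-Normal updating. It is not part of the original solutions, and it assumes scipy is available for the Normal CDF.

```python
# Not part of the original Guide: a numerical sketch of the Normal-Normal updating
# used in Solutions 10.8 to 10.14; assumes scipy is installed.
from scipy.stats import norm

s2 = 25.0                      # process variance of each insured's Normal severity
mu, sigma2 = 100.0, 144.0      # prior Normal on m: mean and variance
claims = [95.0, 115.0, 120.0]
C, L = len(claims), sum(claims)

post_mean = (L * sigma2 + mu * s2) / (C * sigma2 + s2)   # 109.45
post_var = s2 * sigma2 / (s2 + C * sigma2)               # 7.8775
pred_var = s2 + post_var                                 # 32.8775

# Probability the insured's mean severity is between 110 and 115 (Solution 10.12):
p_mean = norm.cdf(115, post_mean, post_var**0.5) - norm.cdf(110, post_mean, post_var**0.5)
# Probability the next claim is between 115 and 120 (Solution 10.14):
p_claim = norm.cdf(120, post_mean, pred_var**0.5) - norm.cdf(115, post_mean, pred_var**0.5)

print(round(post_mean, 2), round(post_var, 4), round(pred_var, 4))
print(round(p_mean, 4), round(p_claim, 4))   # close to the 0.3968 and 0.1331 above,
                                             # which round the z-values before lookup
```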

10.14. A. The (posterior) predictive distribution is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(330)(144) + (100)(25)} / {(3)(144) + 25} = 109.45,
and variance equal to: s^2 + s^2 σ^2 / (s^2 + C σ^2) = 25 + ((25)(144) / {25 + (3)(144)}) = 32.8775.
The standard deviation is: √32.8775 = 5.734.
Thus the chance of a claim being in the interval from 115 to 120 is:
Φ[(120 - 109.45)/5.734] - Φ[(115 - 109.45)/5.734] = Φ(1.84) - Φ(0.97) =
0.9671 - 0.8340 = 0.1331.
10.15. E. The (posterior) predictive distribution is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(330)(144) + (100)(25)} / {(3)(144) + 25} = 109.45.
Comment: This mean of the (posterior) predictive distribution = Bayes Analysis Estimate =
Buhlmann Credibility Estimate.
10.16. E. For any particular policyholder, the log of the claim sizes follows a Normal distribution with
standard deviation of 1.7. The hypothetical means of these distributions are in turn Normally
Distributed. Thus this is mathematically a Normal-Normal conjugate prior situation.
The sum of the observed log claim sizes is:
ln(120) + ln(160) + ln(210) + ln(270) + ln(380) = 4.787 + 5.075 + 5.347 + 5.598 + 5.940 = 26.747.
Thus the (posterior) predictive distribution of the log claim sizes is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(26.747)(2.25) + (3.8)(1.7^2)} / {(5)(2.25) + 1.7^2} = 71.16/14.14 = 5.03,
and variance equal to: s^2 + s^2 σ^2 / (s^2 + C σ^2) = 1.7^2 + ((1.7^2)(2.25) / {1.7^2 + (5)(2.25)}) = 2.89 + 0.460 = 3.35.
Thus the standard deviation is: √3.35 = 1.83.
The mean of the corresponding LogNormal Distribution with parameters 5.03 and 1.83 is:
exp[5.03 + (0.5)(1.83^2)] = exp(6.70) = 816.
Comment: Very difficult. For an example of the use of the Normal-Normal Conjugate Prior in the
context of a LogNormal claim severity, see for example pages 279-280 of "Credibility Using Semi-Parametric Models," by Virginia R. Young, ASTIN Bulletin, Volume 27, No. 2, November 1997.
The answer seems peculiar, since it is much larger than any of the observed claims.
However, with long-tailed distributions, the overwhelming majority of claims are less than the mean.
For example, the median of the posterior predictive LogNormal is exp(5.03) = 152.
For this LogNormal, the probability of a claim being less than the mean of 816 is:
Φ[(6.70 - 5.03)/1.83] = Φ(0.91) = 82%.

10.17. B. The process variance given μ is:
Second Moment of LogNormal - Square of First Moment of LogNormal =
Exp[2μ + (2)(1.7^2)] - Exp[μ + (1.7^2)/2]^2 = Exp[2μ] (e^5.78 - e^2.89) = 305.77 Exp[2μ].
Therefore, EPV = 305.77 E[e^(2μ)]. μ is Normal with mean 3.8 and standard deviation 1.5.
Therefore, 2μ is Normal with mean 7.6 and standard deviation 3.
Therefore, e^(2μ) is LogNormal with parameters 7.6 and 3.
Therefore, E[e^(2μ)] is the mean of this LogNormal: Exp[7.6 + 3^2/2] = e^12.1 = 179,872.
Therefore, EPV = (305.77)(179,872) = 55.00 million.
The hypothetical mean given μ is: First Moment of LogNormal = Exp[μ + (1.7^2)/2] = e^μ e^1.445.
Therefore, VHM = Var[e^μ] (e^1.445)^2 = Var[e^μ] e^2.89.
μ is Normal with mean 3.8 and standard deviation 1.5.
Therefore, e^μ is LogNormal with parameters 3.8 and 1.5.
Therefore, Var[e^μ] is the variance of this LogNormal:
Exp[(2)(3.8) + (2)(2.25)] - Exp[3.8 + 2.25/2]^2 = e^12.1 - e^9.85 = 160,914.
Therefore, VHM = (160,914) e^2.89 = 2.895 million.
K = EPV/VHM = 55.00/2.895 = 19.0. Z = 5 / (5 + 19.0) = 20.8%.
The hypothetical mean given μ is: e^μ e^1.445. e^μ is LogNormal with parameters 3.8 and 1.5.
Therefore, E[e^μ] is the mean of this LogNormal: Exp[3.8 + 1.5^2/2] = e^4.925.
Thus the prior mean is: E[e^μ] e^1.445 = e^4.925 e^1.445 = 584.
Observed mean is: (120 + 160 + 210 + 270 + 380)/5 = 228.
Estimate = (20.8%)(228) + (1 - 20.8%)(584) = 510.
Comment: The estimate using Buhlmann Credibility is not equal to the estimate using Bayes
Analysis. While Buhlmann is equal to Bayes in "Normal land", they are not equal in "LogNormal
land". This is the case because linearity is not preserved under exponentiation. This is also why the
mean of a LogNormal is not equal to the exponential of the mean of the underlying Normal.
Working in Normal land, K = 2.89/2.25 = 1.284, n = 5, and Z = 79.6%.
The mean of the log observed claims sizes is: 5.35.
The prior mean of the log claim sizes is 3.8, the mean of the prior Normal.
(5.35)(0.796) + (3.8)(1 - 0.796) = 5.03.
This 5.03 is equal to the mean of the predictive Normal as given in my solution to the previous
question. Then as per my solution to the previous question, one could get the variance of the
predictive Normal, and proceed as I did to get the predictive LogNormal Distribution and the Bayes
estimate of the future severity.
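The two estimates above can be reproduced numerically. The following sketch is not from the original solution; it works on the log scale for the Bayes estimate and directly with the claim sizes for the Buhlmann estimate, using only the Python standard library.

```python
# Not from the original Guide: a numerical sketch of Solutions 10.16 and 10.17.
import math

sigma = 1.7                            # LogNormal sigma, fixed for every policyholder
prior_mean, prior_var = 3.8, 2.25      # Normal prior on mu
claims = [120, 160, 210, 270, 380]
n = len(claims)
log_sum = sum(math.log(c) for c in claims)      # about 26.747

s2 = sigma**2
# Bayes: predictive Normal for ln(X), then the matching LogNormal mean.
pred_mu = (log_sum * prior_var + prior_mean * s2) / (n * prior_var + s2)   # about 5.03
pred_var = s2 + s2 * prior_var / (s2 + n * prior_var)                      # about 3.35
bayes_estimate = math.exp(pred_mu + 0.5 * pred_var)                        # about 816, up to rounding

# Buhlmann, working directly with the claim sizes (Solution 10.17):
EPV = (math.exp(2 * s2) - math.exp(s2)) * math.exp(2 * prior_mean + 2 * prior_var)
VHM = (math.exp(prior_var) - 1) * math.exp(2 * prior_mean + prior_var) * math.exp(s2)
Z = n / (n + EPV / VHM)                                                    # about 20.8%
prior_severity = math.exp(prior_mean + prior_var / 2 + s2 / 2)             # about 584
buhlmann_estimate = Z * (sum(claims) / n) + (1 - Z) * prior_severity       # about 510

print(round(bayes_estimate), round(buhlmann_estimate))
```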

10.18. C. The process variance for each class is 1.2 million. The EPV is 1.2 million. f(m) is Normal,
with μ = 6000 and σ = 185. The VHM is the variance of f(m), which is 185^2 = 34,225.
Thus K = EPV/VHM = 1,200,000/ 34,225 = 35.06. Z = 72 / (72+35.06) = 0.673.
The prior estimate is the mean of f(m) = 6000. The observed severity is 350000/72 = 4861.
Thus the posterior estimate is: (4861)(0.673) + (6000)(1 - 0.673) = 5233.
Comment: This is a Conjugate Prior situation with the likelihood a member of a linear exponential
family, a Normal with fixed variance. Therefore, the Buhlmann Credibility estimate equals that from
Bayesian Analysis, the mean of the posterior distribution of m, which is shown in the next solution.
10.19. C. The posterior distribution of m is a Normal, with mean equal to:
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(350,000)(34,225) + (6000)(1,200,000)} / {(72)(34,225) + 1,200,000}
= 19,178,750,000 / 3,664,200 = 5234.
10.20. E. The posterior distribution of m is a Normal, with variance equal to:
s^2 σ^2 / (s^2 + C σ^2) = (1,200,000)(34,225) / {1,200,000 + (72)(34,225)} = 11,208.
10.21. E. The posterior distribution of m is a Normal, with mean equal to 5234, and variance equal
to 11,208. Thus it has standard deviation of √11,208 = 106. Thus the probability that this class has
an expected future mean severity between 5000 and 5500 is:
Φ[(5500 - 5234)/106] - Φ[(5000 - 5234)/106] = Φ(2.51) - Φ(-2.21) = 0.9940 - (1 - 0.9864) =
0.9804.
Comment: Thus (5000, 5500) is a reasonable interval estimate for the expected severity for the
Widget Manufacturing Class.
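For readers who want to check the Widget Manufacturing Class numbers in Solutions 10.18 to 10.21, here is a short sketch. It is not part of the original solutions and assumes scipy is available.

```python
# Not from the original Guide: a quick numerical check of Solutions 10.18 to 10.21.
from scipy.stats import norm

s2 = 1.2e6                      # process variance within each class
mu, sigma2 = 6000.0, 185.0**2   # prior Normal on the class mean m
C, L = 72, 350_000.0

K = s2 / sigma2                 # about 35.06
Z = C / (C + K)                 # about 0.673
buhlmann = Z * (L / C) + (1 - Z) * mu                    # about 5234 (5233 with the solution's rounding)

post_mean = (L * sigma2 + mu * s2) / (C * sigma2 + s2)   # about 5234
post_var = s2 * sigma2 / (s2 + C * sigma2)               # about 11,208
p = norm.cdf(5500, post_mean, post_var**0.5) - norm.cdf(5000, post_mean, post_var**0.5)

print(round(buhlmann), round(post_mean), round(post_var), round(p, 4))   # p is about 0.98
```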
10.22. B. The Expected Value of the Process Variance is: 15^2 = 225.
The Variance of the Hypothetical Means is 10^2 = 100. Thus K = EPV/VHM = 225/100 = 2.25.
Z = 1/(1 + 2.25) = 0.308. The prior estimate is 135 and the observation is 155.
Thus the posterior estimate is: (155)(0.308) + (135)(1 - 0.308) = 141.
Comment: Since this is a Conjugate Prior situation with the likelihood a member of a linear
exponential family (a Normal with fixed variance), the Buhlmann Credibility estimate equals that from
Bayesian Analysis. The Bayesian Analysis estimate is the mean of the posterior distribution of m,
(L σ^2 + μ s^2) / (C σ^2 + s^2) = {(155)(100) + (135)(225)} / {(1)(100) + 225} = 141,
resulting in the same solution.

Section 11, Linear Exponential Families69


Linear exponential families include: Exponential, Poisson, Normal for fixed σ, Binomial
for m fixed (including the special case of the Bernoulli, m = 1), Negative Binomial for r
fixed (including the special case of the Geometric, r = 1), the Gamma for α fixed, and the
Inverse Gaussian for θ fixed.
Definition:
A linear exponential family is a family of probability density functions defined on a fixed
interval such that f(x; ν) = p(x) e^(r(ν) x) / q(ν), for the parameter ν in a fixed interval.70 71
r(ν) is called the canonical parameter.
Thus the log density is of the form: ln f(x; ν) = x r(ν) + ln[p(x)] - ln[q(ν)];
the log density is the sum of three terms: x times the canonical parameter, a function of x, and a
function of the parameter ν.
Note that f can be continuous, for example Exponential, or discrete, for example Bernoulli.
Note that a constant multiplying f(x; ν) can be absorbed into either p(x) or q(ν).
There are different ways to parameterize the density.
The log density could be the sum of three terms: x times minus ν, a function of x, and a function of
the parameter. In that case, ν would be called the natural parameter.72
In other words, if we take r(ν) = -ν, then f(x; ν) = p(x) e^(-νx) / q(ν).
Any single parameter family of distributions that can be put in this form (and that has fixed support
that does not depend on the parameter) is a linear exponential family. The so-called natural
parameter is just a particular way of parametrizing such densities, which happens to be convenient
for deriving various results for linear exponential families in general.
69

As discussed in Sections 5.4 and 15.5.3 of Loss Models. Linear Exponential Families come up in other
applications, for example Generalized Linear Models (GLIM). See for example A Primer on the Exponential Family of
Distributions, by David R. Clark and Charles Thayer, CAS 2004 Discussion Paper Program.
70
The fixed interval for x can be -∞ to +∞, 0 to 1, 0 to ∞, etc.
The key thing is that the boundaries can not depend in any way on ν.
71
I have used ν rather than θ, in order to avoid confusion where Loss Models has already used θ as a parameter of a
distribution.
72
Different authors use different terminology.

For the Poisson with parameter λ, f(x; λ) = e^(-λ) λ^x / x!
Therefore, ln f(x; λ) = x ln(λ) - ln(x!) - λ.
Therefore, the Poisson distribution is a member of a linear exponential family of the discrete type.
If we set ν = -ln(λ), then we have the Poisson in terms of its natural parameter:
ln f(x; ν) = -xν - ln(x!) - e^(-ν) = -xν + ln[p(x)] - ln[q(ν)], with p(x) = 1/x! and ln[q(ν)] = e^(-ν).
For the Exponential distribution: f(x; θ) = e^(-x/θ) / θ.
ln f(x; θ) = -x/θ - ln θ.
Therefore, the Exponential distribution is a member of a linear exponential family of the continuous
type.73
Exercise: Show that the Normal Distribution with σ fixed and single parameter μ is a member of a
linear exponential family.
[Solution: For the Normal Distribution with σ fixed and single parameter μ,
f(x; μ) = exp[-(x - μ)^2 / (2σ^2)] / {σ √(2π)}.
ln f(x; μ) = -0.5{(x - μ)/σ}^2 - ln σ - 0.5 ln[2π] =
xμ/σ^2 - 0.5x^2/σ^2 - 0.5μ^2/σ^2 - ln σ - 0.5 ln[2π]. This has only one term where x and the single
parameter μ appear together, and in that term x is linear; therefore, the Normal distribution for fixed
variance is a member of a linear exponential family of the continuous type. If one lets
ν = -μ/σ^2, then ν is the natural parameter of the Normal Distribution with σ fixed.]
For the Binomial Distribution with m fixed and single parameter q,
f(x; q) = m! q^x (1-q)^(m-x) / {x!(m-x)!}.
ln f(x; q) = x ln[q] + (m-x) ln[1-q] + ln[m!] - ln[x!] - ln[(m-x)!] =
x {ln[q] - ln[1-q]} - ln[x!] - ln[(m-x)!] + m ln[1-q] + ln[m!].
Therefore, the Binomial Distribution with m fixed is a member of a linear exponential family of the
discrete type. Specifically, with m = 1, the Bernoulli is a member of a linear exponential family of the
discrete type.
Exercise: What is the natural parameter for the Binomial Distribution with m fixed and single
parameter q?
[Solution: ln f(x; q) = x {ln[q] - ln[1-q]} - ln[x!] - ln[(m-x)!] + m ln[1-q] + ln[m!].
Thus the natural parameter is: -{ln[q] - ln[1-q]} = ln[(1-q)/q] = ln[1/q - 1].]
73
With natural parameter ν = 1/θ, q(ν) = 1/ν and p(x) = 1. 0 < x < ∞, 0 < ν < ∞.

For the Negative Binomial Distribution with r fixed and single parameter β,
f(x; β) = (x+r-1)! β^x / {(1+β)^(x+r) x! (r-1)!}.
ln f(x; β) = x ln β - (x+r) ln(1+β) + ln[(x+r-1)!] - ln x! - ln[(r-1)!] =
x {ln β - ln(1+β)} + ln[(x+r-1)!] - ln x! - ln[(r-1)!] - r ln(1+β).
Therefore, the Negative Binomial Distribution with r fixed is a member of a linear exponential family
of the discrete type. Specifically, with r = 1, the Geometric Distribution is a member of a linear
exponential family of the discrete type.
Exercise: What is the natural parameter for the Negative Binomial Distribution with r fixed and single
parameter β?
[Solution: ln f(x; β) = x {ln β - ln(1+β)} + ln[(x+r-1)!] - ln x! - ln[(r-1)!] - r ln(1+β).
Thus the natural parameter is: -{ln[β] - ln[1+β]} = ln[(1+β)/β] = ln[1 + 1/β].]

Exercise: Let f(x; μ) = exp[-5(x - μ)^2 / (μ^2 x)] √(5 / (π x^3)), x > 0.
Is this density a member of a linear exponential family?
[Solution: ln f(x) = -5(x - μ)^2 / (μ^2 x) - (3/2) ln(x) + ln(5/π)/2 =
-5x/μ^2 + 10/μ - 5/x - (3/2) ln(x) + ln(5/π)/2. The term involving both x and μ is: -5x/μ^2.
Since this is linear in x, this is a member of a linear exponential family.
Comment: This is an Inverse Gaussian Distribution, with θ = 10.]

Exercise: Let f(x; λ) = λ exp[-(1 - λx)^2 / (20x)] / √(20π x), x > 0.
Is this density a member of a linear exponential family?
[Solution: ln f(x) = ln(λ) - (1 - λx)^2 / (20x) - ln(20π x)/2 =
ln(λ) - 1/(20x) + λ/10 - λ^2 x/20 - ln(20π x)/2.
This is indeed a member of an Exponential Family, and the term involving both x and λ is: -λ^2 x/20.
Since this is linear in x, this is a member of a linear exponential family.
Comment: This is a Reciprocal Inverse Gaussian Distribution, with one parameter fixed, as
discussed in Insurance Risk Models by Panjer & Willmot, not on the syllabus.]

Cases where Bayes = Buhlmann:


If the likelihood density is a member of a linear exponential family and the conjugate prior
distribution is used as the prior distribution, then the Buhlmann Credibility estimate is
equal to the corresponding Bayesian estimate (for the squared error loss function.)74
Specifically, this applies to the Gamma-Poisson, Beta-Bernoulli,
Inverse Gamma-Exponential, and the Normal-Normal (fixed variance).75
Loss Models uses the term "exact credibility" to refer to situations where the Buhlmann Credibility
estimates are identical to the corresponding Bayesian estimates (for a squared error loss function).
Since the Buhlmann Credibility estimates are the least squares line fit to the Bayesian estimates, the
two are identical if and only if the Bayesian estimates are along a straight line.
Mean and Variance of a Linear Exponential Family:
One can derive general formulas for the mean and variance76 of any member of a linear exponential
family. Assume one has a likelihood from a linear exponential family:
f(x; ν) = p(x) e^(-νx) / q(ν), with x in some fixed interval independent of ν.
For example, if f(x; ν) is a Poisson, then ν = -ln(λ), p(x) = 1/x! and ln[q(ν)] = e^(-ν),
x = 0, 1, 2, ... Thus λ = e^(-ν) and q(ν) = exp[e^(-ν)] = e^λ.
q(ν) is the normalizing constant such that the density function f(x) integrates to unity over its domain77:
1 = ∫ f(x; ν) dx = ∫ p(x) e^(-νx) / q(ν) dx = (1/q(ν)) ∫ p(x) e^(-νx) dx.
Therefore, q(ν) = ∫ p(x) e^(-νx) dx.

Exercise: Verify the above equation for the Poisson.
However, since the Poisson is discrete, substitute summation for integration.
[Solution: q(ν) = exp[e^(-ν)] = e^λ.  Σ_(x=0 to ∞) p(x) e^(-νx) = Σ_(x=0 to ∞) λ^x / x! = e^λ = q(ν).]
74
This result is demonstrated subsequently.
75
It also applies for example to the Beta-Binomial (fixed m), and the Beta-Negative Binomial (fixed r).
76
As well as higher moments.
77
For convenience I have written this in terms of integrals. For discrete density functions such as the Poisson,
summation is substituted for integration.

Differentiating the above equation with respect to the natural parameter ν, one obtains:78
q'(ν) = -∫ p(x) x e^(-νx) dx.
The mean of f(x; ν) depends on ν and is computed as follows:
μ(ν) = ∫ x f(x; ν) dx = {1/q(ν)} ∫ x p(x) e^(-νx) dx = -q'(ν) / q(ν).
Therefore:79
μ(ν) = -q'(ν) / q(ν) = -d ln(q(ν)) / dν.
For example, for the Poisson, ln(q(ν)) = e^(-ν), and -d ln(q(ν)) / dν = e^(-ν) = λ,
which is in fact the mean of the Poisson Distribution.
Thus as shown in Loss Models, we have computed the mean of a linear exponential family in terms
of q(ν), its normalizing function in terms of its natural parameter. In a similar manner we can obtain
formulas for the second moment and the variance.
We had: q'(ν) = -∫ p(x) x e^(-νx) dx.
Differentiating again with respect to ν we obtain:
q''(ν) = ∫ p(x) x^2 e^(-νx) dx = q(ν) ∫ f(x; ν) x^2 dx = q(ν) E[X^2].
Therefore, the second moment is: E[X^2] = q''(ν) / q(ν).
Therefore the variance = v(ν) = E[X^2] - E[X]^2 = q''(ν)/q(ν) - {q'(ν)/q(ν)}^2 =
{q''(ν) q(ν) - q'(ν)^2} / q(ν)^2 = d{q'(ν)/q(ν)}/dν = d^2 ln[q(ν)] / dν^2.

v(ν) = d^2 ln[q(ν)] / dν^2 = -μ'(ν).80

78
Note that for a member of a linear exponential family, the domain of integration does not depend on the parameter
ν. If it did, the derivative with respect to ν of the integral would contain additional terms and the result obtained here
would not apply.
79
See Equation 5.8 in Loss Models, with r(ν) = -ν.
80
See Equation 5.9 in Loss Models, with r(ν) = -ν. Since the variance is > 0, μ'(ν) < 0, and μ(ν) is strictly decreasing
in ν. For example, for the Exponential Distribution the mean is the inverse of the natural parameter.

Exercise: Verify the above equation for the variance in the case of a Poisson.
[Solution: For the Poisson, ln[q(ν)] = e^(-ν), and d^2 ln[q(ν)] / dν^2 = e^(-ν) = λ,
which is in fact the variance of the Poisson.]

More generally, the rth cumulant, κ_r = (-1)^r d^r ln(q(ν)) / dν^r = (-1)^(r-1) d^(r-1) μ(ν) / dν^(r-1)
= -dκ_(r-1) / dν.81

Exercise: Use the above relationship to determine the skewness of a Gamma Distribution,
x^(α-1) ν^α e^(-νx) / Γ(α), with natural parameter ν and mean α/ν.
[Solution: κ_2 = -d(α/ν)/dν = α/ν^2. κ_3 = -d(α/ν^2)/dν = 2α/ν^3.
Skewness = κ_3 / κ_2^1.5 = 2/√α.]

81
See Volume 1 of Kendall's Advanced Theory of Statistics, by Stuart and Ord.
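The cumulant relationships above can be checked symbolically. The following sketch is not from the Guide; it assumes sympy is available and uses q(ν) = ν^(-α) for the Gamma with α fixed and natural parameter ν = 1/θ.

```python
# Not from the original Guide: a symbolic check that the cumulant formulas reproduce the
# Gamma Distribution's mean, variance, and skewness; assumes sympy is installed.
import sympy as sp

nu, alpha = sp.symbols("nu alpha", positive=True)
ln_q = sp.log(nu**(-alpha))               # for the Gamma with alpha fixed, q(nu) = nu^(-alpha)

mean = -sp.diff(ln_q, nu)                 # mu(nu) = -d ln q / d nu
kappa2 = sp.diff(ln_q, nu, 2)             # variance = d^2 ln q / d nu^2
kappa3 = -sp.diff(ln_q, nu, 3)            # third cumulant = (-1)^3 d^3 ln q / d nu^3
skewness = sp.simplify(kappa3 / kappa2**sp.Rational(3, 2))

print(mean)      # alpha/nu   (= alpha * theta)
print(kappa2)    # alpha/nu**2
print(skewness)  # 2/sqrt(alpha)
```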

Equality of Maximum Likelihood & Method of Moments:


For Linear Exponential Families, the Methods of Maximum Likelihood and Moments
produce the same result when applied to ungrouped data, in the absence of truncating or
censoring.
Thus there are many cases where one can apply the method of maximum likelihood to ungrouped
data by instead performing the simpler method of moments:
Exponential, Poisson, Normal for fixed σ, Binomial for m fixed (including the special case
of the Bernoulli, m = 1), Negative Binomial for r fixed (including the special case of the
Geometric, r = 1), and the Gamma for α fixed.
Demonstration that Maximum Likelihood Equals Method Moments:
This useful fact is demonstrated as follows.
Assume one has a likelihood from a linear exponential family:
f(x; ν) = p(x) e^(-νx) / q(ν), with x in some fixed interval independent of ν.
Then as shown above, the mean is given by: μ(ν) = -q'(ν) / q(ν).
Since we have a single parameter ν, the method of moments consists of setting this equal to the
observed mean. In other words, (1/n) Σ x_i = -q'(ν) / q(ν).
Now the Method of Maximum Likelihood consists of maximizing the sum of the log densities.
Σ ln f(x_i) = -ν Σ x_i + Σ ln[p(x_i)] - n ln[q(ν)].
∂ Σ ln f(x_i) / ∂ν = -Σ x_i - n q'(ν) / q(ν).
Setting the partial derivative, with respect to the single parameter ν, of the loglikelihood equal to
zero:
0 = Σ x_i + n q'(ν) / q(ν). Therefore, (1/n) Σ x_i = -q'(ν) / q(ν).
This is the same equation as obtained for the method of moments.
The mean is therefore a sufficient statistic for any linear exponential family. In other words, all
of the information in the data useful for estimating the parameter is contained in the mean of
the data. If we wish to estimate the parameter, once we have the mean, we can ignore the
individual data points.
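As a quick numerical illustration of this equality (not from the Guide), the following sketch fits an Exponential by maximizing the loglikelihood directly and compares the answer to the sample mean; it assumes scipy is available and the data are made up.

```python
# Not part of the original Guide: for the Exponential (a linear exponential family),
# maximum likelihood and the method of moments agree on ungrouped data.
import math
from scipy.optimize import minimize_scalar

data = [3.2, 0.7, 5.1, 2.4, 1.9, 8.0]    # made-up losses

def negloglike(theta):
    # Exponential density: f(x) = exp(-x/theta)/theta
    return sum(x / theta + math.log(theta) for x in data)

mle = minimize_scalar(negloglike, bounds=(0.01, 100), method="bounded").x
moments = sum(data) / len(data)          # method of moments: set the mean equal to x-bar

print(round(mle, 4), round(moments, 4))  # the two agree, up to the solver's tolerance
```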

A Table of Some Linear Exponential Families:

Name; Density; Natural Parameter ν (82); q(ν); p(x)

Poisson; e^(-λ) λ^x / x!; ν = -ln(λ); q(ν) = exp[e^(-ν)]; p(x) = 1/x!
Exponential; e^(-x/θ) / θ; ν = 1/θ; q(ν) = 1/ν; p(x) = 1
Gamma, α fixed; x^(α-1) e^(-x/θ) / {θ^α Γ(α)}; ν = 1/θ; q(ν) = 1/ν^α; p(x) = x^(α-1)/Γ(α)
Normal, σ fixed; exp(-0.5{(x-μ)/σ}^2) / {σ √(2π)}; ν = -μ/σ^2; q(ν) = exp(0.5 σ^2 ν^2); p(x) = exp(-0.5 x^2/σ^2) / {σ √(2π)}
Bernoulli (83); q^x (1-q)^(1-x); ν = ln[(1-q)/q]; q(ν) = 1 + e^(-ν); p(x) = 1
Binomial, fixed m; m! q^x (1-q)^(m-x) / {x!(m-x)!}; ν = ln[(1-q)/q]; q(ν) = (1 + e^(-ν))^m; p(x) = m! / {x!(m-x)!}
Geometric; β^x / (1+β)^(x+1); ν = ln(1 + 1/β); q(ν) = 1/(1 - e^(-ν)); p(x) = 1
Negative Binomial, fixed r; (x+r-1)! β^x / {(1+β)^(x+r) x!(r-1)!}; ν = ln(1 + 1/β); q(ν) = (1 - e^(-ν))^(-r); p(x) = (x+r-1)! / {x!(r-1)!}
Inverse Gaussian, θ fixed; √(θ/(2π)) x^(-1.5) exp(-θx/(2μ^2) + θ/μ - θ/(2x)); ν = θ/(2μ^2); q(ν) = exp[-√(2θν)]; p(x) = √(θ/(2π)) x^(-1.5) exp(-θ/(2x))

82
Loss Models uses θ in order to designate the natural parameter. I have used ν instead, in order to avoid confusion
with the use of θ as a parameter in the Exponential, Gamma and Inverse Gaussian.
83
In the Bernoulli and the Binomial do not confuse the parameter q with the function q(ν), used to describe the
linear exponential family.

Conjugate Priors:
Assume one has a likelihood from a linear exponential family:
f(x; ν) = p(x) e^(-νx) / q(ν), with x in some fixed interval independent of ν.
Then one can write down a corresponding conjugate prior distribution of the natural parameter ν:84
π(ν) = q(ν)^(-K) e^(-νKμ) / c(μ, K).
Where c is a normalizing constant, which depends on μ and K, the two parameters of the
Conjugate Prior distribution π(ν):
c(μ, K) = ∫ q(ν)^(-K) e^(-νKμ) dν.
It will turn out that K is the Buhlmann Credibility Parameter, while μ is the a priori mean = E[μ(ν)].
For example, if we have a Poisson likelihood, with parameter λ, then the natural parameter
ν = -ln(λ) and q(ν) = exp[e^(-ν)] = e^λ.
Thus π(ν) = q(ν)^(-K) e^(-νKμ) / c(μ, K) = exp[-Ke^(-ν)] exp[-νKμ] / c(μ, K).
In order to put this density function in terms of λ rather than ν, we need to change variables,
remembering to multiply by |dν/dλ| = 1/λ.
exp[-Ke^(-ν)] = e^(-Kλ). exp[-νKμ] = exp[Kμ ln(λ)] = λ^(Kμ).
Thus, in terms of lambda, this supposed conjugate prior is: e^(-Kλ) λ^(Kμ - 1) / c(μ, K).
Which we recognize as a Gamma Distribution with parameters θ = 1/K and α = Kμ.
In this case, the normalizing constant is:
c(μ, K) = Γ(α) θ^α = Γ(μK) K^(-μK).
As was discussed previously, the Gamma is in fact a conjugate prior to the Poisson.
The Buhlmann Credibility Parameter K is in fact 1/θ.
μ is in fact the a priori mean, which is equal to that of the Gamma, αθ = μK/K = μ.
84
See Theorem 15.24 of Loss Models. It will be shown that as claimed π(ν) is in fact a conjugate prior for f.
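The change of variables above can be checked numerically. The sketch below is not part of the Guide; it evaluates the generic conjugate prior π(ν), re-expressed in terms of λ, at a few points and compares it to the corresponding Gamma density (scipy is assumed, and the values of K and μ are made up).

```python
# Not from the original Guide: for the Poisson likelihood, the generic conjugate prior
# pi(nu) = q(nu)^(-K) exp(-nu*K*mu)/c, written in terms of lambda = exp(-nu), is the
# Gamma with theta = 1/K and alpha = K*mu; assumes scipy is installed.
import math
from scipy.stats import gamma

K, mu = 20.0, 0.10                       # made-up values: K = 1/theta = 20, a priori mean mu = 0.10
alpha, theta = K * mu, 1.0 / K           # the matching Gamma parameters

def prior_in_lambda(lam):
    # q(nu)^(-K) * exp(-nu*K*mu) with nu = -ln(lambda), times |d nu / d lambda| = 1/lambda,
    # divided by the normalizing constant c(mu, K) = Gamma(K*mu) * K^(-K*mu).
    c = math.gamma(K * mu) * K ** (-K * mu)
    return math.exp(-K * lam) * lam ** (K * mu) / lam / c

for lam in (0.02, 0.10, 0.25):
    print(round(prior_in_lambda(lam), 6), round(gamma.pdf(lam, a=alpha, scale=theta), 6))
    # the two columns agree, confirming the identification of the conjugate prior
```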

Marginal Distribution:
The marginal distribution is the integral of the likelihood times the prior distribution.
∫ f(x; ν) π(ν) dν = ∫ {p(x) e^(-νx) / q(ν)} q(ν)^(-K) e^(-νKμ) / c(μ, K) dν =
{p(x)/c(μ, K)} ∫ e^(-ν(x + Kμ)) q(ν)^(-(K+1)) dν = p(x) c((x + Kμ)/(K+1), K+1) / c(μ, K).
Where we have used the fact that c(a, b) = ∫ q(ν)^(-b) e^(-νab) dν.
Thus the marginal distribution is: p(x) c((x + Kμ)/(K+1), K+1) / c(μ, K).
For example, in the case of the Gamma-Poisson, p(x) = 1/x! and c(μ, K) = Γ(μK) K^(-μK).
Thus, p(x) c((x + Kμ)/(K+1), K+1) / c(μ, K) = (1/x!) {Γ(x + μK) (K+1)^(-(x + μK))} / {Γ(μK) K^(-μK)}
= {Γ(x + μK) / (x! Γ(μK))} (K+1)^(-(x + μK)) K^(μK). Substituting α = μK and θ = 1/K, this equals
{Γ(x + α) / (x! Γ(α))} θ^x / (1 + θ)^(x + α) = {(x + α - 1)! / (x! (α-1)!)} θ^x / (1 + θ)^(x + α).
This is a Negative Binomial Distribution, with r = α and β = θ,
which as discussed previously is in fact the marginal distribution of the Gamma-Poisson.
A Priori Mean:
We can determine the a priori mean, which is the mean of the marginal distribution as well as E[μ(ν)].
Since μ(ν) = -q'(ν)/q(ν),
E[μ(ν)] = ∫ μ(ν) π(ν) dν = -∫ {q'(ν)/q(ν)} q(ν)^(-K) e^(-νKμ) / c(μ, K) dν =
(-1/c(μ, K)) ∫ q'(ν) q(ν)^(-(K+1)) e^(-νKμ) dν. Using integration by parts85 this equals:
(-1/c(μ, K)) { -q(ν)^(-K) e^(-νKμ) / K ] - μ ∫ q(ν)^(-K) e^(-νKμ) dν } = π(ν)/K ] + μ ∫ π(ν) dν.
The first term is the difference of π(ν) at its limits of support. For all of the examples we are dealing
with, π(ν) is equal (and in fact zero) at the endpoints of its support.
Thus this first term vanishes and therefore, E[μ(ν)] = μ ∫ π(ν) dν = (μ)(1) = μ.
85
With du = q'(ν) q(ν)^(-(K+1)) dν and v = e^(-νKμ).

Thus μ is in fact the a priori mean.86 Note that we have also shown that the mean of the marginal
distribution is μ.
Posterior Distribution:
Next let us determine the posterior distribution of ν. Assume we have n observations:
x_1, ..., x_n, which sum to S. Then by Bayes Theorem the posterior distribution is proportional to:
π(ν) f(x_1; ν) ... f(x_n; ν). This is proportional to:
q(ν)^(-K) e^(-νKμ) {exp[-νx_1]/q(ν)} ... {exp[-νx_n]/q(ν)} = q(ν)^(-(K+n)) e^(-ν(Kμ + S)). Thus the posterior
distribution has the same form as the prior distribution, but with new parameters:
K' = K + n = K + number of observations, and μ' = (Kμ + S) / (K + n).
Thus π(ν) is indeed a conjugate prior for the linear exponential family f(x; ν).
For example, in the case of the Gamma-Poisson, K = 1/θ and μ = αθ.
As was discussed previously, the parameters of the posterior Gamma are 1/θ' = 1/θ + E = K + n,
and α' = α + C = Kμ + S. Thus K' = 1/θ' = 1/θ + E = K + number of observations.
μ' = α'θ' = (α + C) / (1/θ + E) = (Kμ + S) / (K + n).
So the general formula to update the parameters of the conjugate prior distribution works in the
particular case of the Gamma-Poisson.
Why Buhlmann = Bayesian for Conjug. Priors with Likelihoods from Linear Expon. Families:87
Since the posterior density of ν has the same form as the prior density, one can obtain this Bayesian
Estimate by substituting in the expression for the expected value of μ(ν) the posterior parameters
rather than the prior parameters.
Recall that prior to any observations the expected value of μ(ν) is μ.
Since posterior to observations μ' = (Kμ + S) / (K + n), the posterior expected value of μ(ν) is:
(Kμ + S) / (K + n).
86
That is why the letter μ was chosen for this parameter of the prior distribution.
87
See Theorem 15.24 in Loss Models. This is also discussed in "A Teacher's Remark on Exact Credibility" by Hans
Gerber, ASTIN Bulletin, November 1995.

For example, in the case of the Gamma-Poisson, this posterior estimate is:
(α + number of claims) / (1/θ + number of exposures).
The estimate based solely on the observations is:
(sum of observations)/(number of observations) = S/n.
The estimate prior to observations is the prior expected value of μ(ν), which is μ.
We can rewrite the posterior Bayesian estimate as a linear combination of these two estimates:
Posterior Bayesian Estimate = (Kμ + S) / (K + n) = {n/(K + n)} (S/n) + {K/(K + n)} μ =
{n / (n + K)} {estimate based on observations} + (1 - {n / (n + K)}) (prior estimate).
For example, in the case of the Gamma-Poisson the posterior estimate =
(α + C) / (1/θ + E) = {E / (E + 1/θ)} (C/E) + {1 - (E/(E + 1/θ))} (αθ).88
Thus the Bayesian estimator is a linear function of the observation and the prior estimate. In general,
the Buhlmann Credibility estimate is the least squares linear approximation to the Bayesian
estimate.89 When as is the case here, the Bayesian estimate is itself linear, then it is fit exactly by the
least squares line, and the Buhlmann Credibility estimate is equal to the Bayesian estimate.
Thus it has been shown that if the likelihood density is a member of a linear exponential family and
the conjugate prior distribution is used as the prior distribution, then the Buhlmann Credibility
estimate is equal to the corresponding Bayesian estimate. Thus this is an example of what Loss
Models refers to as exact credibility.
The credibility is the weight given to the observations; Z = n / (n + K). Thus the Buhlmann Credibility
parameter K is equal to one of the parameters used to construct the Conjugate Prior π(ν; μ, K)
above.90

88
Where C = number of claims observed, E = number of exposures observed.
89
Linear here refers to a linear function of the observation and the prior estimate.
90
This was why this parameterization of the conjugate prior was used.
In the case of the Gamma-Poisson, K = 1/θ and Z = E/(E + 1/θ).

Problems:
11.1 (1 point) Which of the following statements are true?
1. If H is some hypothesis and B is an event, then the posterior probability of H is proportional to
the product of the conditional probability of B given H and the prior probability of H.
2. When the conditional distribution of the current observations is given by a Binomial distribution,
the Beta distribution is a conjugate prior distribution.
3. A probability density function which is non-zero on an interval and such that f(x; θ) is
proportional to exp[sin(θ)x + 3x^4 + 2x^3 + cos(θ)] is a linear exponential family.
A. 1, 2

B. 1, 3

C. 2, 3

D. 1, 2, and 3

E. None of A, B, C, or D.

11.2 (2 points) Which of the following are linear exponential families?


A. A LogNormal Distribution with μ unknown and σ = 3.
B. A Pareto Distribution with θ unknown and α = 4.
C. A Gamma Distribution with θ unknown and α = 5.
D. A Weibull Distribution with θ unknown and τ = 6.
E. None of A, B, C, or D.
11.3 (4, 5/85, Q.42) (1 point) You are given that the likelihood density distribution is a member of
a linear exponential family and that the conjugate prior distribution is used as the prior distribution.
Which of the following statements are true?
1) It is possible that the likelihood density distribution is a Poisson distribution,
and the prior distribution is a Gamma distribution.
2) It is possible that the likelihood density distribution is a negative binomial distribution,
and the prior distribution is a beta distribution.
3) The Buhlmann credibility estimate is equal to the corresponding Bayesian estimate.
A. 1
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3
11.4 (4, 5/86, Q.41). (1 point) Which of the following statements are true?
1. Buhlmann's credibility estimates are the best linear approximation to Bayesian estimates.
2. The normal distribution is a member of a linear exponential family.
3. In the case of a normal likelihood and a normal conjugate prior distribution,
the Buhlmann and Bayesian credibility estimates are equal.
A. 1
B. 1, 2
C. 1, 3
D. 2, 3
E. 1, 2, 3.

11.5 (4, 5/87, Q.43) (1 point) Let u be in the interval c < u < d. Let x be in the interval a < x < b.
Which of the following define the probability density form of a linear exponential family?
A. f(x; u) = exp[p(u)A(x) + B(x) + q(u)]
B. f(x; u) = exp[p(u)x + B(x) + q(u)]
C. f(x; u) = exp[p(u)A(x) + B(x)]
D. f(x; u) = exp[p(u)x] + B(x) + q(u)
E. f(x; u) = exp[B(x) + q(u) ]
11.6 (4B, 11/92, Q.28) (2 points) You are given the following:

The conditional distribution f(x|θ) is a member of the linear exponential family.
The prior distribution of θ, g(θ), is a conjugate prior with f(x|θ).
The expected value of the process variance, E[Var(X | θ)] = 3.
E[X] = 1.
E[X | X1 = 4] = 2, where X1 is the value of a single observation.
Determine the variance of the hypothetical means, Var(E[X | θ]).


A.
B.
C.
D.
E.

At least 0 but less than 2


At least 2 but less than 5
At least 5 but less than 8
At least 8
Cannot be determined.

11.7 (4B, 11/94, Q.2) (3 points) You are given the following:
The density function g(x|θ) is a member of the linear exponential family.
The prior distribution of θ, h(θ), is a conjugate prior distribution with the density function, g(x|θ).
X1 and X2 are distributed via g(x|θ), and E[X1] = 0.5 and E[X2 | X1 = 3] = 1.00.
The variance of the hypothetical means, Var(E[X | θ]) = 6.
Determine the expected value of the process variance, E[Var(X | θ)].
A. Less than 2.0
B. At least 2.0, but less than 12.0
C. At least 12.0, but less than 22.0
D. At least 22.0
E. Cannot be determined from the given information.

Solutions to Problems:
11.1. D. 1. T. Bayes Theorem. 2. T. 3. T. It is an exponential family. Inside the exponential, the
term involving both x and the single parameter is linear in x.
11.2. C. A. For the LogNormal, f(x) = exp[-0.5 ({ln(x) - μ}/σ)^2] / (xσ√(2π)). For σ = 3,
ln f(x) = -0.5 ({ln(x) - μ}/3)^2 - ln(3x√(2π)). The term involving both x and μ is: μ ln(x)/9, which is not
linear in x. Thus the LogNormal for fixed σ is not a member of a linear exponential family.
B. For the Pareto, f(x) = αθ^α (θ + x)^(-(α + 1)). For α = 4, ln f(x) = ln(4) + 4 ln(θ) - 5 ln(θ + x). The term
involving both x and θ is -5 ln(θ + x), which is not linear in x. Thus the Pareto for fixed α is not a
member of a linear exponential family. C. For the Gamma, f(x) = x^(α-1) e^(-x/θ) / {θ^α Γ(α)}.
For α = 5, ln f(x) = -5 ln(θ) + 4 ln(x) - x/θ - ln 24. The term involving both x and θ is -x/θ, which is linear
in x. (The distribution is defined on the fixed interval 0 < x < ∞, and θ is in the fixed interval
0 < θ < ∞.) Thus the Gamma for fixed α is a member of a linear exponential family.
D. For the Weibull, f(x) = τ(x/θ)^τ exp(-(x/θ)^τ) / x. For τ = 6, ln f(x) = ln(6) + 5 ln(x) - 6 ln(θ) - (x/θ)^6.
The term involving both x and θ is -(x/θ)^6, which is not linear in x. Thus the Weibull for fixed τ ≠ 1 is not
a member of a linear exponential family.
Comment: The Gamma for α = 1 is an Exponential Distribution, a linear exponential family.
The Weibull for τ = 1 is an Exponential Distribution, a linear exponential family.
The Weibull for =1 is an Exponential Distribution, a linear exponential family.
11.3. E. 1. True. The Gamma-Poisson is an example of a conjugate prior and the Poisson is a
member of a linear exponential family. 2. True. The Negative Binomial with r fixed is a member of a
linear exponential family and the Beta Distribution is a conjugate prior to it. 3. True.
11.4. E. 1. True. 2. True for a Normal with known variance. 3. True.
11.5. B. f(x; u) = exp[p(u)x + B(x) + q(u)].
Comment: Choice A is the form of an exponential family; for A(x) = x, one has a linear exponential
family. The form in Loss Models parameterizes using the natural parameter, rather than p(u).
A linear exponential family may be parameterized using the natural parameter, but it does not have to be
parameterized in that convenient manner.

11.6. A. For the given situation, the estimate based on Bayesian Analysis is equal to that from
Buhlmann Credibility. After a single observation of 4, the new Bayesian Estimate is 2, which is also
the Buhlmann Credibility estimate. The prior mean is 1.
Thus, 2 = Z(4) + (1-Z)(1). Therefore Z = 1/3.
With one observation Z = 1 / (1+K). Thus since Z = 1/3, K = 2.
But K = EPV / VHM and we are given EPV = 3. Thus VHM = 3/2.
Comment: The outer E refers to the expected value averaging over the different values of θ.
Similarly, the outer Var refers to taking the second central moment over all the different values of θ.
11.7. D. Since we are dealing with conjugate priors and a member of a linear exponential family,
the Bayesian estimate is equal to the Buhlmann Credibility Estimate.
Our prior mean is 0.5 and if our observation is 3, then the new estimate is 1.
Therefore: 1 = 3Z + 0.5(1 - Z). Z = 20%. For one observation Z = 1/(1 + K). K = 4.
But K = Expected Value of the Process Variance / Variance of the Hypothetical Means.
Therefore 4 = Expected Value of the Process Variance / 6.
Therefore the Expected Value of the Process Variance = 24
Comment: Multiple concepts, difficult. X1 and X2 are intended to be the prior and posterior
variables. Thus E[X1 ] is the mean prior to any observation. E[X2 | X1 = 3] is the mean after an
observation of 3.
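The arithmetic common to these last two solutions can be written as a tiny helper (not from the Guide): given the prior mean, a single observation, and the updated estimate, solve for Z and then K.

```python
# Not from the original Guide: because Bayes = Buhlmann in these problems, one
# observation gives Z = 1/(1 + K) with K = EPV/VHM.

def backout(prior_mean, observation, new_estimate):
    # Solve new_estimate = Z*observation + (1 - Z)*prior_mean for Z, then K = 1/Z - 1.
    Z = (new_estimate - prior_mean) / (observation - prior_mean)
    return Z, 1.0 / Z - 1.0

Z1, K1 = backout(1.0, 4.0, 2.0)     # 4B, 11/92, Q.28: Z = 1/3, K = 2
Z2, K2 = backout(0.5, 3.0, 1.0)     # 4B, 11/94, Q.2:  Z = 0.2, K = 4
print(Z1, K1, 3.0 / K1)             # VHM = EPV/K = 3/2
print(Z2, K2, K2 * 6.0)             # EPV = K*VHM = 24
```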

Section 12 Overview of Conjugate Priors


Four examples of conjugate priors have been discussed: the Gamma-Poisson and Beta-Bernoulli
for frequency, and the Inverse Gamma-Exponential and the Normal-Normal for severity.
In each case the posterior distribution is of the same form as the prior distribution.

Updating:
Thus in each case one can update this distribution from one year to the next, with one year's
posterior distribution becoming the next year's prior distribution. For example, for a Gamma-Poisson
assume we start with a prior Gamma with parameters: α = 2 and θ = 1/20. If during the first year we
observe 3 claims on 16 exposures,91 then the posterior Gamma has α = 2 + 3 = 5, and
1/θ = 20 + 16 = 36. This becomes the prior distribution for the second year.
If one now observes in the second year 5 claims on 18 exposures, the posterior Gamma has
α = 5 + 5 = 10, and 1/θ = 36 + 18 = 54. One can continue in this manner. Note that one can also get
the posterior Gamma after two years by using the observation in 2 years of 8 claims on
34 exposures to update the original prior Gamma and get: α = 2 + 8 = 10 and 1/θ = 20 + 34 = 54.

Gamma-Poisson, Updating One Year at a Time:
[Graphs of the prior and successive posterior Gamma densities of λ:]
Prior: α = 2, 1/θ = 20. Mean Frequency: 10.0%.
Year 1: 3 Claims, 16 Exposures. Posterior: α = 5, 1/θ = 36. Mean Frequency: 13.9%.
Year 2: 5 Claims, 18 Exposures. Posterior: α = 10, 1/θ = 54. Mean Frequency: 18.5%.
Year 3: 4 Claims, 20 Exposures. Posterior: α = 14, 1/θ = 74. Mean Frequency: 18.9%.
Year 4: 4 Claims, 22 Exposures. Posterior: α = 18, 1/θ = 96. Mean Frequency: 18.8%.
Year 5: 3 Claims, 20 Exposures. Posterior: α = 21, 1/θ = 116. Mean Frequency: 18.1%.
91
Perhaps these are the robbery claims from a small chain of convenience stores.
Other Conjugate Priors:


As discussed previously, the Beta distribution is a conjugate prior to the Binomial Distribution for
fixed m. If for fixed m, the q parameter of the Binomial is distributed over a portfolio by a Beta, then
the posterior distribution of q parameters is also given by a Beta. For m = 1, we get the special
case of the Beta-Bernoulli.
The Beta distribution is also a conjugate prior to the Negative Binomial Distribution for fixed r.
If for fixed r, 1/(1+β) of the Negative Binomial is distributed over a portfolio by a Beta,92
then the posterior distribution of 1/(1+β) parameters is also given by a Beta.93
For r = 1, one gets the special case of the Beta-Geometric.
92
One could equally well have β/(1+β) = 1 - 1/(1+β), distributed via a Beta.
93
The marginal distribution is called a Generalized Waring Distribution, with density:
f(x) = Γ(k+x) Γ(a+b) Γ(a+k) Γ(b+x) / {Γ(k) x! Γ(a) Γ(b) Γ(a+k+b+x)}, x = 0, 1, 2, ...

The Inverse Gamma Distribution is a conjugate prior to the Gamma Distribution. (The
Inverse Gamma-Exponential is a special case.) The marginal distribution and the predictive
distribution are each Generalized Pareto Distributions.
There are many other conjugate prior situations.94 Here are some examples, involving continuous
mixtures of severity distributions. In each case the scale parameter is being mixed, with the other
parameters in the severity distribution held fixed.
Severity | Prior Distribution | Marginal Distribution
Exponential | Inverse Gamma: α, θ | Pareto: α, θ
Inverse Exponential | Gamma: α, θ | Inverse Pareto: τ = α, θ
Weibull, τ = t | Inverse Transformed Gamma: α, θ, τ = t | Burr: α, θ, γ = t
Inverse Weibull, τ = t | Transformed Gamma: α, θ, τ = t | Inverse Burr: τ = α, θ, γ = t
Gamma, α = a | Inverse Gamma: α, θ | Generalized Pareto: α, θ, τ = a
Inverse Gamma, α = a | Gamma: α, θ | Generalized Pareto: α = a, θ, τ = α
Transformed Gamma, α = a, τ = t | Inverse Transformed Gamma: α, θ, τ = t | Transformed Beta: α, θ, γ = t, τ = a
Inverse Transformed Gamma, α = a, τ = t | Transformed Gamma: α, θ, τ = t | Transformed Beta: α = a, θ, γ = t, τ = α

94
See the problems for illustrations of some of these additional examples. In Foundations of Casualty Actuarial
Science, 3rd edition and earlier, at the end of his Credibility Chapter, Gary Venter has an excellent chart displaying a
large number of conjugate distributions.

Exercise: The severity for each insured is a Transformed Gamma Distribution with parameters
α = 3.9, θ = q, and τ = 5. Over the portfolio, q varies via an Inverse Transformed Gamma Distribution
with parameters α = 2.4, θ = 17, and τ = 5.
What is the severity distribution for the portfolio as a whole?
[Solution: Using the above chart, the mixed distribution is a Transformed Beta Distribution with
parameters α = 2.4, θ = 17, γ = 5, and τ = 3.9.]
Let C be the number of claims observed and let L be the dollars of loss observed.
Let I be the sum of the inverses of the loss sizes observed.
Let T be the sum of the loss sizes observed each to the power t; if t = 1, then T = L.
Let J be the sum of the loss sizes observed each to the power -t; if t = 1, then J = I.
For these conjugate priors, here are the parameters of the posterior distribution:

Severity | Prior Distribution | Parameters of Posterior Distribution95
Exponential96 | Inverse Gamma, α, θ | α' = α + C, θ' = θ + L
Inverse Exponential | Gamma: α, θ | α' = α + C, 1/θ' = 1/θ + I
Weibull, τ = t97 | Inverse Transformed Gamma: α, θ, τ = t | α' = α + C, θ'^t = θ^t + T, τ = t
Inverse Weibull, τ = t | Transformed Gamma: α, θ, τ = t | α' = α + C, θ'^(-t) = θ^(-t) + J, τ = t
Gamma, α = a | Inverse Gamma: α, θ | α' = α + aC, θ' = θ + L
Inverse Gamma, α = a | Gamma: α, θ | α' = α + aC, 1/θ' = 1/θ + I
Transformed Gamma, α = a, τ = t | Inverse Transformed Gamma: α, θ, τ = t | α' = α + aC, θ'^t = θ^t + T, τ = t
Inverse Transformed Gamma, α = a, τ = t | Transformed Gamma: α, θ, τ = t | α' = α + aC, θ'^(-t) = θ^(-t) + J, τ = t

95
In each case the posterior distribution has the same form as the prior distribution.
96
This is the Inverse Gamma-Exponential Conjugate Prior, discussed previously. Note that it is a special case of the
Inverse Transformed Gamma-Weibull, the Inverse Gamma-Gamma, and Inverse Transformed Gamma-Transformed
Gamma Conjugate Priors.
97
If one instead parameterized the Weibull in terms of θ^t, then θ^t follows an Inverse Gamma Distribution,
and the Inverse Gamma is a conjugate prior to the Weibull.
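As an illustration of the first row of this table (not from the Guide), the sketch below updates an Inverse Gamma prior for an Exponential severity and confirms that the resulting Bayes estimate matches the Buhlmann estimate with K = α - 1; the inputs are made up.

```python
# Not from the original Guide: the Inverse Gamma-Exponential update alpha' = alpha + C,
# theta' = theta + L, with the Bayes estimate of the mean severity theta'/(alpha' - 1).
def inverse_gamma_exponential(alpha, theta, C, L):
    a_post, t_post = alpha + C, theta + L
    bayes = t_post / (a_post - 1)
    K = alpha - 1                                    # Buhlmann K for this conjugate pair
    Z = C / (C + K)
    buhlmann = Z * (L / C) + (1 - Z) * (theta / (alpha - 1))
    return bayes, buhlmann

print(inverse_gamma_exponential(alpha=3.0, theta=20.0, C=4, L=60.0))   # both about 13.33
```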

Dirichlet-Multinomial:
For a life aged x, let q_x = y1, and 1|q_x = y2. Assume a priori, y1 and y2 have a joint distribution
f(y1, y2) proportional to: y1^2 y2^4 (1 - y1 - y2)^3. In order to integrate to one, the appropriate
constant in front turns out to be: 11!/(2! 4! 3!) = 138,600.
Then, f(y1, y2) = 138,600 y1^2 y2^4 (1 - y1 - y2)^3, 0 ≤ y1 ≤ 1, 0 ≤ y2 ≤ 1, 0 ≤ y1 + y2 ≤ 1.
The Dirichlet Distribution is a generalization of the Beta Distribution (with θ = 1):98
f(y1, y2, y3, ..., yk) = Γ(a) Π_(j=0 to k) {yj^(aj - 1) / Γ(aj)}, 0 ≤ yj ≤ 1,
where y0 = 1 - Σ_(j=1 to k) yj, and a = Σ_(j=0 to k) aj.99

The above prior is a Dirichlet Distribution with parameters a0 = 4, a1 = 3, and a2 = 5.
It can be shown that for the Dirichlet Distribution, E[yj] = aj/a.
Exercise: For the above Dirichlet Distribution with parameters a0 = 4, a1 = 3, and a2 = 5, determine
E[y1 ] and E[y2 ].
[Solution: a = 4 + 3 + 5 = 12. E[y1 ] = a1 /a = 3/12 = 1/4. E[y2 ] = a2 /a = 5/12.
Comment: Therefore, the prior estimates of qx and 1 |qx are 1/4 and 5/12 respectively.]
Assume we observe 30 lives all aged x that are assumed to follow the same mortality function.
6 lives die during the first year and 8 lives die during the second year.
The remaining 16 lives survive beyond the second year.
Then applying Bayes Theorem, the posterior distribution is proportional to:
{y1^2 y2^4 (1 - y1 - y2)^3} {y1^6 y2^8 (1 - y1 - y2)^16} = y1^8 y2^12 (1 - y1 - y2)^19.
This posterior distribution is also a Dirichlet Distribution, with a0 = 20, a1 = 9, and a2 = 13.
Exercise: Posterior to the observation, determine E[y1 ] and E[y2 ].
[Solution: a = 20 + 9 + 13 = 42. E[y1 ] = a1 /a = 9/42. E[y2 ] = a2 /a = 13/42.
Comment: Therefore, the posterior estimates of qx and 1 |qx are 9/42 and 13/42 respectively.]

98
See for example, Kendall's Advanced Theory of Statistics. For example, a Dirichlet Distribution with parameters
a0 = 4 and a1 = 3 is a Beta Distribution with parameters a = 3 and b = 4. "A Multivariate Bayesian Claim Count
Development Model With Closed Form Posterior and Predictive Distributions," by Stephen Mildenhall, CAS Forum
Winter 2006, combines the use of the Dirichlet-Multinomial and the Gamma-Poisson.
99
The constant, Γ(a) / Π_(j=0 to k) Γ(aj), is required so that the density integrates to one over its support.

In general, let there be dj deaths observed during year j, d0 lives survive beyond year k, and
d = Σ_(j=0 to k) dj = original number of lives. Then the posterior distribution is Dirichlet with parameters
aj' = aj + dj. The Dirichlet Distribution is conjugate prior to the Multinomial.
The posterior estimate of E[yj] = E[ j-1|q_x ] is: aj'/a' = (aj + dj)/(a + d).
(aj + dj)/(a + d) = {d/(a + d)} (dj/d) + {a/(a + d)} (aj/a) = Z (observed rate) + (1 - Z) (prior rate),
where Z = d/(a + d).
Thus the estimate from Bayes Analysis is equal to that from Buhlmann Credibility, with the
Buhlmann Credibility Parameter, K = a = Σ_(j=0 to k) aj.100
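The Dirichlet-Multinomial arithmetic above can be reproduced in a few lines (not from the Guide):

```python
# Not from the original Guide: the Dirichlet-Multinomial update for the mortality example.
prior = {"a0": 4, "a1": 3, "a2": 5}                   # a1 goes with q_x, a2 with 1|q_x, a0 with survivors
deaths = {"a0": 16, "a1": 6, "a2": 8}                 # 30 lives: 6 die year 1, 8 die year 2, 16 survive

a = sum(prior.values())            # 12
d = sum(deaths.values())           # 30
posterior = {k: prior[k] + deaths[k] for k in prior}  # a0 = 20, a1 = 9, a2 = 13

Z = d / (a + d)                    # credibility given to the observed rates; K = a = 12
for k in ("a1", "a2"):
    bayes = posterior[k] / (a + d)
    credibility = Z * (deaths[k] / d) + (1 - Z) * (prior[k] / a)
    print(k, round(bayes, 4), round(credibility, 4))   # identical: 9/42 and 13/42
```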

Limits:
The Incomplete Gamma Function, Γ(a; y), can be obtained as a limit of an appropriate sequence of
Incomplete Beta Functions β(a, b; x), with b = 1 + y/x, as x goes to zero.101
Also the Poisson with parameter λ can be obtained as a limit of an appropriate sequence of
Binomial Distributions, with m = λ/q as q goes to zero.
Therefore, the Gamma-Poisson Conjugate Prior can be obtained as a limit of an appropriate
sequence of Beta-Binomial Conjugate Priors.
Since the Poisson can also be obtained as a limit of Negative Binomial Distributions, the
Gamma-Poisson Conjugate Prior can also be obtained as a limit of an appropriate sequence
of Beta-Negative Binomial Conjugate Priors.
Pure Premiums:
One can combine a frequency assumption and a severity assumption in order to get a model of total
costs; i.e., pure premiums. Problems 1 to 5 below, combine a Gamma-Poisson frequency
assumption with an Inverse Gamma-Exponential severity assumption.

100
For the Beta-Bernoulli, a0 = b, a1 = a, and K = a0 + a1 = a + b.
101
See Mahler's Guide to Loss Distributions.

Variance of the Prior versus the Marginal Distribution:


In the Gamma-Poisson, the prior Gamma and the marginal Negative Binomial have the same mean:
αθ = rβ.
However, the variance of the Negative Binomial is greater than the variance of the Gamma.
For β > 0, rβ(1 + β) > rβ^2 = αθ^2.
This follows from the fact that the variance of the Gamma is the VHM, while the variance of the
Negative Binomial is the Total Variance = EPV + VHM > VHM.
In general for our conjugate prior examples, the variance of the Marginal Distribution (prior mixed) is
greater than the variance of the prior distribution.
Thus the variance of the mixed Pareto is greater than that of the prior Inverse Gamma.
The variance of the mixed Bernoulli is greater than that of the prior Beta.
Similarly, the variance of the Predictive Distribution (posterior mixed) is greater than the variance of
the posterior distribution.
Some General Ideas:
If the likelihood density is a member of a linear exponential family and the conjugate prior distribution
is used as the prior distribution, then the Buhlmann Credibility estimate is equal to the corresponding
Bayesian estimate (for a squared error loss function.) This was the case for all four of the examples
discussed in detail.
However, it is important to note that while a lot of time has been spent going over the details of the
four conjugate prior examples, these are very special examples. In general, one does not have the
posterior distribution of the same type as the prior distribution.102 Even when one has a conjugate
prior situation, the Bayesian estimate is not necessarily equal to the Buhlmann Credibility estimate.
If the likelihood density is not a member of a linear exponential family, even if a conjugate prior
distribution is used as the prior distribution, the Buhlmann Credibility estimate is usually not equal to
the corresponding Bayesian estimate (for a squared error loss function.)

102

See for example the mixing Poisson example presented in Section 2, as well as the many examples in Mahlers
Guide to Buhlmann Credibility and Bayesian Analysis.

Problems:
Use the following information for the next 5 questions:
The number of claims a particular policyholder makes in a year is Poisson with mean λ.
The values of the Poisson parameter λ (for annual claim frequency) for the individual policyholders in
a portfolio follow a Gamma distribution with parameters: α = 6 and θ = 2.5.
The size of claim distribution for any particular policyholder is exponential with mean δ.
The values of δ for the portfolio of policyholders have an Inverse Gamma distribution with parameters:
α = 3 and θ = 20.
For an individual insured the frequency and severity process are independent of each other.
In addition, the distributions across the portfolio of λ and δ are independent of each other.
12.1 (3 points) What is the Expected Value of the Process Variance of the (annual) Pure Premiums
for this portfolio?
A. less than 6100
B. at least 6100 but less than 6200
C. at least 6200 but less than 6300
D. at least 6300 but less than 6400
E. at least 6400
12.2 (3 points) What is the Variance of the Hypothetical Mean (annual) Pure Premiums for this
portfolio?
A. less than 28,000
B. at least 28,000 but less than 29,000
C. at least 29,000 but less than 30,000
D. at least 30,000 but less than 31,000
E. at least 31,000
12.3 (2 points) An individual risk from this portfolio is observed to have claims totaling 900 in cost
over 3 years.
Use Buhlmann Credibility to estimate the future (annual) pure premium of this individual.
A. less than 288
B. at least 288 but less than 289
C. at least 289 but less than 290
D. at least 290 but less than 291
E. at least 291

12.4 (3 points) An individual risk from this portfolio is observed to have 50 claims over 3 years.
Use Buhlmann Credibility to estimate the future (annual) claim frequency of this individual.
A. less than 16.3
B. at least 16.3 but less than 16.4
C. at least 16.4 but less than 16.5
D. at least 16.5 but less than 16.6
E. at least 16.6
12.5 (3 points) An individual risk from this portfolio is observed to have 50 claims totaling 900 in
cost over 3 years.
Use Buhlmann Credibility to estimate the future average claim severity of this individual.
A. less than 17.2
B. at least 17.2 but less than 17.4
C. at least 17.4 but less than 17.6
D. at least 17.6 but less than 17.8
E. at least 17.8

12.6 (5 points) The number of claims a particular policyholder makes in a year follows a distribution:
f(x) = (x + 5)! 6^6 δ^x / {x! 5! (6 + δ)^(6+x)}, x = 0, 1, 2, ...., with mean δ.
The means δ for the individual policyholders in a portfolio follow a Generalized Pareto Distribution,
with parameters α, θ = 6, and τ. An insured is chosen at random from this portfolio.
What is the probability of observing n claims over the coming year for this insured?
Hint: The density of a Generalized Pareto Distribution is:
f(x) = Γ(α + τ) θ^α x^(τ-1) / {Γ(α) Γ(τ) (θ + x)^(α+τ)}, x > 0.

A. Γ(α+6) Γ(α+τ) Γ(6+x) / {Γ(6) Γ(α) Γ(α+x+6+τ) Γ(x+1)}
B. Γ(α+6) Γ(α+τ) Γ(τ+x) / {Γ(α) Γ(τ) Γ(α+x+6+τ) Γ(x+1)}
C. Γ(α+τ) Γ(τ+x) Γ(6+x) / {Γ(6) Γ(τ) Γ(α) Γ(x+1)}
D. Γ(α+6) Γ(τ+x) Γ(6+x) / {Γ(6) Γ(τ) Γ(α) Γ(α+x+6+τ)}
E. Γ(α+6) Γ(α+τ) Γ(τ+x) Γ(6+x) / {Γ(6) Γ(α) Γ(τ) Γ(α+x+6+τ) Γ(x+1)}

12.7 (3 points) f(x | λ) = x² λ³ e^(-λx) / 2, x > 0. π(λ) = (8/3) λ³ e^(-2λ), λ > 0.
You observe n claims, x1, x2, ..., xn.
Compare the estimate based on Buhlmann Credibility to that from Bayes Analysis.

Use the following information for the next 3 questions:


• The number of claims a particular policyholder makes in a year follows a distribution with parameter
p: f(x) = p(1 - p)^x, x = 0, 1, 2, ....
• The values of the parameter p for the individual policyholders in a portfolio follow a Beta distribution,
with parameters a = 4, b = 5, θ = 1: g(p) = 280 p³(1 - p)⁴, 0 ≤ p ≤ 1.
12.8 (2 points) What is the a priori mean annual claim frequency for the portfolio?
A. less than 1.5
B. at least 1.5 but less than 1.6
C. at least 1.6 but less than 1.7
D. at least 1.7 but less than 1.8
E. at least 1.8
12.9 (3 points) You observe an individual policyholder to have 6 claims in 10 years.
Assuming the observation of the separate years are independent, what is the posterior distribution
of the parameter p for this policyholder?
A. 19,612,560 p^14 (1-p)^9
B. 27,457,584 p^13 (1-p)^10
C. 32,449,872 p^12 (1-p)^11
D. 62,403,600 p^11 (1-p)^13
E. 49,031,400 p^10 (1-p)^14
12.10 (2 points) You observe an individual policyholder to have 6 claims in 10 years.
Assuming the observation of the separate years are independent, what is the posterior estimate of
the average annual claim frequency for this policyholder?
A. less than 0.85
B. at least 0.85 but less than 0.90
C. at least 0.90 but less than 0.95
D. at least 0.95 but less than 1.00
E. at least 1.00

Use the following information for the next eight questions:


• The size of claims for any particular policyholder follows an Inverse Gamma distribution with density:
f(x | θ) = θ³ e^(-θ/x) / (2x⁴), x > 0.
• The values of the parameter θ for the individual policyholders in a portfolio follow a Gamma distribution
with density: g(θ) = θ⁴ e^(-θ/10) / 2,400,000, θ > 0.
Hint: The density of a Generalized Pareto Distribution is:
f(x) = Γ(α + τ) θ^α x^(τ-1) / {Γ(α) Γ(τ) (θ + x)^(α+τ)}, x > 0.
The nth moment of a Generalized Pareto Distribution is:
E[X^n] = θ^n Γ(α - n) Γ(τ + n) / {Γ(α) Γ(τ)}, α > n.

12.11 (3 points) Prior to any observations, what is the density function for the claims severity from
an insured picked at random from this portfolio?
A. 105,000 x^4 / (x+10)^8
B. 2,800,000 x^4 / (x+10)^9
C. 63,000,000 x^4 / (x+10)^10
D. 1,260,000,000 x^4 / (x+10)^11
E. None of the above.
12.12 (2 points) Prior to any observations, what is the expected claim severity for an insured picked
at random from this portfolio?
A. less than 26
B. at least 26 but less than 27
C. at least 27 but less than 28
D. at least 28 but less than 29
E. at least 29
12.13 (2 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
What is the posterior density function for the parameter θ for this insured?
A. 4.78 × 10^-31 θ^16 e^(-0.1θ)
B. 4.78 × 10^-14 θ^16 e^(-θ)
C. 2.81 × 10^14 θ^16 e^(-43θ)
D. 9.82 × 10^15 θ^16 e^(-53θ)
E. None of the above.

12.14 (3 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
What is the density function for the future claim severity from this insured?
A. 105,000 x^4 / (x+10)^8
B. 163.1 x^10 / (x+0.575)^14
C. 15,291 x^16 / (x+1.739)^20
D. 5.313 x^20 / (x+0.1)^24
E. None of the above.
12.15 (2 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
Using Bayesian Analysis what is the expected future average claim severity from this insured?
A. less than 15
B. at least 15 but less than 16
C. at least 16 but less than 17
D. at least 17 but less than 18
E. at least 18
12.16 (3 points) What is the Expected Value of the Process Variance, prior to any observations?
A. less than 100
B. at least 100 but less than 250
C. at least 250 but less than 500
D. at least 500 but less than 1000
E. at least 1000
12.17 (2 points) What is the Variance of the Hypothetical Mean Severities, prior to any
observations?
A. 75
B. 100
C. 125
D. 150
E. 175
12.18 (2 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
Using Buhlmann Credibility what is the expected future average claims severity from this insured?
A. less than 15
B. at least 15 but less than 16
C. at least 16 but less than 17
D. at least 17 but less than 18
E. at least 18

Use the following information for the next nine questions:


The size of claims for any particular policyholder follows a Gamma distribution with density:
f(x | δ) = δ⁶ x⁵ e^(-δx) / 120, x > 0. The values of the parameter δ for the individual policyholders in a portfolio
follow a Gamma distribution with density: g(δ) = 3456 δ³ e^(-12δ), δ > 0.
12.19 (3 points) Prior to any observations, what is the density function for the claims severity from
an insured picked at random from this portfolio?
A. 5,806,080 x^4 (x+12)^-9
B. 10,450,944 x^5 (x+12)^-10
C. 156,764,160 x^4 (x+12)^-10
D. 313,528,320 x^5 (x+12)^-11
E. None of the above.
12.20 (2 points) Prior to any observations, what is the expected claim severity for an insured picked
at random from this portfolio?
A. less than 22
B. at least 22 but less than 23
C. at least 23 but less than 24
D. at least 24 but less than 25
E. at least 25
12.21 (2 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
What is the posterior density function for the parameter δ for this insured?
A. 5.013 × 10^17 δ^27 e^(-43δ)
B. 1.636 × 10^18 δ^30 e^(-43δ)
C. 4.934 × 10^20 δ^27 e^(-55δ)
D. 3.370 × 10^21 δ^30 e^(-55δ)
E. None of the above.
12.22 (3 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
What is the density function for the future claim severity from this insured?
A. 1.917 × 10^49 x^4 (x + 55)^-30
B. 1.150 × 10^50 x^5 (x + 55)^-31
C. 5.409 × 10^54 x^4 (x + 55)^-33
D. 3.570 × 10^55 x^5 (x + 55)^-34
E. None of the above.

12.23 (2 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
Using Bayesian Analysis what is the expected future average claim severity from this insured?
A. less than 13
B. at least 13 but less than 14
C. at least 14 but less than 15
D. at least 15 but less than 16
E. at least 16
12.24 (3 points) Use the Bayesian central limit theorem to construct a 95% credibility interval
(confidence interval) for the estimate in the previous question.
12.25 (3 points) What is the Expected Value of the Process Variance, prior to any observations?
A. less than 125
B. at least 125 but less than 130
C. at least 130 but less than 135
D. at least 135 but less than 140
E. at least 140
12.26 (2 points) What is the Variance of the Hypothetical Mean Severities, prior to any
observations?
A. less than 270
B. at least 270 but less than 280
C. at least 280 but less than 290
D. at least 290 but less than 300
E. at least 300
12.27 (2 points) From a particular insured you observe four claims of sizes: 5, 8, 10, 20.
Using Buhlmann Credibility what is the expected future average claims severity from this insured?
A. less than 13
B. at least 13 but less than 14
C. at least 14 but less than 15
D. at least 15 but less than 16
E. at least 16

Use the following information for the next four questions:


• The severity distribution of each risk in a portfolio is given by a Weibull Distribution:
F(x) = 1 - exp[-(x/θ)^(1/3)].
• θ varies over the portfolio via an Inverse Transformed Gamma Distribution with
α = 2.5, θ = 343, and τ = 1/3.
• For the Inverse Transformed Gamma Distribution:
F(x) = 1 - Γ[α; (θ/x)^τ], x > 0.
f(x) = τ (θ/x)^(ατ) exp[-(θ/x)^τ] / {x Γ(α)}.
E[X^k] = θ^k Γ(α - k/τ) / Γ(α), k < ατ.
Mode = θ {τ / (ατ + 1)}^(1/τ).
• You may use the following values of the Incomplete Gamma Function:
Γ[5.5; 5.1705] = Γ[6.5; 6.16988] = Γ[7.5; 7.16943] = Γ[8.5; 8.16909] = 0.5.
• For a given risk you observe four claims of sizes: 1, 8, 27, and 125.
12.28 (3 points) Determine the posterior distribution of θ for this risk.
12.29 (1 point) You are interested in minimizing the squared error loss function.
Find the Bayesian estimate of the hypothetical mean claim severity for this risk.
A. 50
B. 100
C. 200
D. 300
E. 400
12.30 (1 point) A loss function is defined as equal to zero if the estimate equals the true value, and
one otherwise. You are interested in minimizing the expected value of this loss function.
Find the Bayesian estimate of the hypothetical mean claim severity for this risk.
A. Less than 50
B. At least 50, but less than 100
C. At least 100, but less than 200
D. At least 200, but less than 400
E. At least 400
12.31 (2 points) You are interested in minimizing the expected absolute error.
Find the Bayesian estimate of the hypothetical mean claim severity for this risk.
A. 50
B. 100
C. 150
D. 200
E. 250

12.32 (3 points) Severity is Normal with mean 1000 and variance v.


The prior distribution of v is Inverse Gamma with α = 4 and θ = 50,000.
You observe 2 claims of sizes 800 and 900.
What is the Bayesian estimate of v; i.e., the mean of the posterior distribution of v?
A. Less than 17,000
B. At least 17,000, but less than 17,500
C. At least 17,500, but less than 18,000
D. At least 18,000, but less than 18,500
E. At least 18,500
12.33 (3 points) The Dirichlet Distribution is:
f(y1, y2, ..., yk) = Γ(a) ∏_{j=0}^{k} {y_j^(a_j - 1) / Γ(a_j)}, 0 ≤ y_j ≤ 1, where y0 = 1 - Σ_{j=1}^{k} y_j, and a = Σ_{j=0}^{k} a_j.
For the Dirichlet Distribution, E[y_j] = a_j/a.
For a life aged x, let q_x = y1, 1|q_x = y2, 2|q_x = y3, and 3|q_x = y4.
A priori, y1, y2, y3, and y4 have a joint distribution which is Dirichlet with parameters:
a0 = 20, a1 = 3, a2 = 4, a3 = 5, and a4 = 6.
50 lives all aged x are assumed to follow the same mortality function.
3 lives die during the first year, 2 lives die during the second year, 4 lives die during the third year,
and 6 lives die during the fourth year.
The remaining 35 lives survive beyond the fourth year.
Determine the posterior estimate of 3q_x.
A. 0.21    B. 0.22    C. 0.23    D. 0.24    E. 0.25

12.34 (2 points) Losses follow a distribution function: F(x) = 1 - exp[-c x²], x > 0, c > 0.
The prior distribution of c follows a Gamma Distribution with parameters α = 10 and θ = 0.02.
A sample of 4 values is observed: 1, 2, 3, 5.
Determine the form of the posterior distribution of c.
12.35 (3 points) You are given the following:
• The amount of an individual claim has an Inverse Gamma distribution with shape parameter α = 6
and scale parameter q.
• The parameter q is distributed via an Exponential Distribution with mean 100.
From an individual insured you observe 3 claims of sizes: 10, 20, and 50.
For the zero-one loss function, what is the Bayes estimate of q for this insured?
A. 90
B. 95
C. 100
D. 105
E. 110

12.36 (3 points) You are given the following:


• F(x | β) = 1 - (10/x)^β, x > 10.
• The prior distribution of β is Gamma with α = 4 and θ = 0.5.
• From an individual insured you observe 3 claims of sizes: 15, 25, and 60.
What is the posterior distribution of β?
12.37 (4, 5/85, Q.55) (3 points) Losses have the following probability density function:
f(x; β) = β exp(-β √x) / (2√x), x > 0. One observes four losses of sizes: 1, 4, 9 and 64.
The prior distribution of β is Gamma with α = 0.7 and θ = 1/2.
In which of the following ranges is the Bayesian estimate of β (i.e., the mean of the posterior
distribution of β)?
A. Less than 0.28
B. At least 0.28, but less than 0.29
C. At least 0.29, but less than 0.30
D. At least 0.30, but less than 0.31
E. At least 0.31

12.38 (4, 5/87, Q.63) (2 points) The parameter α is to be estimated for a Pareto distribution with
shape parameter α and scale parameter 1. It is assumed that α has a prior distribution g(α) that is
Gamma distributed with parameters 2.2 and θ = 1/2.
The posterior distribution is Gamma with parameters:
2.2 + n, and 1/θ′ = 1/θ + Σ_{i=1}^{n} ln(1 + x_i), where n is the sample size.
If the square of the error is to be minimized, what is the Bayes estimate of α given the following
sample: 3, 9, 12, 6, 4?
A. 1.1
B. 7.2
C. 2 + ln(34)
D. 7.2 / {2 + ln(34)}
E. 7.2 / {2 + ln(18,200)}

12.39 (4, 5/88, Q.45) (1 point) Which of the following are true?
1. One major advantage of the use of conjugate prior distributions is that the prior distribution
for one year can be used as the posterior distribution for the next year.
2. If a set of data can be assumed to have a binomial distribution, and a beta distribution is
employed as the prior distribution, then the mean of the posterior distribution is equal
to the corresponding Buhlmann credibility estimate of the binomial distribution parameter.
3. If n independent Bernoulli trials are performed with constant probability of success q,
and the prior distribution of q is a beta distribution,
then the posterior distribution of q is a beta distribution.
A. 2    B. 1, 2    C. 1, 3    D. 2, 3    E. 1, 2, and 3

12.40 (165, 5/90, Q.10) (1.7 points) A mortality study involves 50 insureds, each age x, with
similar medical profiles. 7 insureds die during the first year, 12 insureds die during the second year,
11 insureds die during the third year, and 11 insureds die during the fourth year. The observed
mortality rates are graduated by a Bayesian method. You are given:
(i) The prior joint distribution of t_i = i-1|q_x is Dirichlet:
f_T(t1, t2, t3, t4) = Γ(a) ∏_{j=0}^{4} {t_j^(a_j - 1) / Γ(a_j)}, 0 ≤ t_j ≤ 1, where t0 = 1 - Σ_{j=1}^{4} t_j, and a = Σ_{j=0}^{4} a_j.
(ii) The parameters of the prior Dirichlet Distribution are: a0 = 4, a1 = 7, a2 = 4, a3 = 15, a4 = 9.
(iii) For the Dirichlet Distribution, E[t_j] = a_j/a.
(iv) The vector of graduated values v is the mean vector of the posterior distribution for T.
Determine the graduated value v3 (estimating 2|q_x).
(A) 0.29    (B) 0.31    (C) 0.42    (D) 0.52    (E) 0.54

12.41 (165, 11/90, Q.10) (1.9 points)
A complete study is performed on 40 patients with a serious disease.
In the first year after diagnosis, 4 deaths occur. In the second year, 24 deaths occur.
You wish to do a Bayesian graduation of t_i = i-1|q_0 using a Dirichlet prior with parameters
a0 = 2, a1 = 8, a2 = 6, and a3 = 4, so that f_T(t1, t2, t3) is proportional to: t1^7 t2^5 t3^3 (1 - t1 - t2 - t3).
For the Dirichlet Distribution, f_T(t1, t2, ..., tk) = Γ(Σa_i) ∏_{j=0}^{k} {t_j^(a_j - 1) / Γ(a_j)}, 0 ≤ t_j ≤ 1,
where t0 = 1 - Σ_{j=1}^{k} t_j.
For the Dirichlet Distribution, E[t_j] = a_j / Σa_i.
Use the graduated values to estimate the mortality rate q1 as: E[1|q_0] / (1 - E[q_0]).
(A) 1/5    (B) 1/2    (C) 5/8    (D) 2/3    (E) 5/6

12.42 (4B, 11/93, Q.7) (1 point) For each of the following pairs, give the corresponding predictive
density function.

Likelihood Function     Conjugate Prior      Predictive Density
Bernoulli               Beta                 1
Poisson                 Gamma                2
Exponential             Inverse Gamma        3

A. 1 = Bernoulli, 2 = Negative Binomial, 3 = Pareto
B. 1 = Beta, 2 = Gamma, 3 = Inverse Gamma
C. 1 = Weibull, 2 = Gamma, 3 = Inverse Gamma
D. 1 = Weibull, 2 = Negative Binomial, 3 = Pareto
E. 1 = Pareto, 2 = Gamma, 3 = Pareto
12.43 (165, 11/94, Q.10) (1.9 points)
A complete study is performed on 100 patients with a serious illness.
The observed mortality rates are graduated by a Bayesian method.
A Dirichlet Distribution with parameters a0, a1, ..., ak has joint density
f_T(t1, t2, ..., tk) = Γ(a) ∏_{j=0}^{k} {t_j^(a_j - 1) / Γ(a_j)}, 0 ≤ t_j ≤ 1, where t0 = 1 - Σ_{j=1}^{k} t_j, and a = Σ_{j=0}^{k} a_j.
For the Dirichlet Distribution, E[t_j] = a_j/a.
You are given:
(i) t_i = i-1|q_0, using a Dirichlet prior distribution with parameters a1 = 50, a2 = 40, a3 = 30, and
a = Σ_{j=0}^{k} a_j = 150.
(ii) The following table of estimated mortality rates was constructed from the graduated values:
i:    0       1      2
q_i:  9/25    2/5    7/12
Determine the number of patients who died during the third year.
(A) 20    (B) 22    (C) 24    (D) 26    (E) 28

12.44 (165, 11/97, Q.12) (1.9 points) A Bayesian graduation has been performed on mortality
rate data for 200 patients having a certain operation.
A Dirichlet Distribution is used for both the prior and posterior distributions.
The parameters of the prior joint Dirichlet distribution of t_i = i-1|q_x are a0, a1, ..., ak:
f_T(t1, t2, ..., tk) = Γ(a) ∏_{j=0}^{k} {t_j^(a_j - 1) / Γ(a_j)}, 0 ≤ t_j ≤ 1, where t0 = 1 - Σ_{j=1}^{k} t_j, and a = Σ_{j=0}^{k} a_j.
For the Dirichlet Distribution, E[t_j] = a_j/a.
The graduated mortality rate for each year is the weighted average of the prior rate and the
observed rate for that year, with a weight of 5/9 to the prior rate.
The prior rate is 0.06 for year 1.
Determine a1, the second parameter of the prior Dirichlet distribution.
(A) 7    (B) 10    (C) 12    (D) 15    (E) 22

12.45 (4, 5/00, Q.10) (2.5 points) You are given:
• The size of a claim for an individual insured follows an inverse exponential distribution with the
following probability density function:
f(x | θ) = θ e^(-θ/x) / x², x > 0
• The parameter θ has a prior distribution with the following probability density function:
g(θ) = e^(-θ/4) / 4, θ > 0
• One claim of size 2 has been observed for a particular insured.
Which of the following is proportional to the posterior distribution of θ?
(A) θ e^(-θ/2)    (B) θ e^(-3θ/4)    (C) θ e^(-θ)    (D) θ² e^(-θ/2)    (E) θ² e^(-9θ/4)

12.46 (3 points) In the previous question, 4, 5/00, Q.10, what is the predictive distribution?
(A) Pareto with α = 2 and θ = 4/3.
(B) Inverse Pareto with τ = 2 and θ = 4/3.
(C) Gamma with α = 2 and θ = 4/3.
(D) Inverse Gamma with α = 2 and θ = 4/3.
(E) None of the above.

Solutions to Problems:
12.1. A. Take an individual risk with Poisson parameter λ and Exponential parameter δ. The mean
frequency is λ and the process variance of the frequency is λ. The mean severity is δ and the
process variance of the severity is δ². Since for an individual risk the frequency and severity are
independent, the Process Variance of the pure premium = μ_f σ_s² + μ_s² σ_f² =
λδ² + δ²λ = 2λδ². Since the distributions of λ and δ are independent, the EPV of the Pure Premium
= E[2λδ²] = 2 E[λ] E[δ²] =
2 (mean of the Gamma Distribution)(2nd moment of the Inverse Gamma Distribution).
The mean of the Gamma = αθ = (6)(2.5) = 15. The second moment of the Inverse Gamma =
θ² / {(α-1)(α-2)} = 400 / {(2)(1)} = 200. Therefore, the Expected Value of the Process Variance of
the Pure Premium = (2)(15)(200) = 6000.
12.2. D. Take an individual risk with Poisson parameter λ and Exponential parameter δ.
The mean frequency is λ and the mean severity is δ. Mean pure premium = λδ.
VHM of Pure Premiums = Var[λδ] = E[(λδ)²] - E²[λδ].
Since the distributions of λ and δ are independent, the VHM = E[λ²]E[δ²] - E²[λ]E²[δ] =
(second moment of the Gamma Dist.)(second moment of the Inverse Gamma Dist.) -
(square of the mean of the Gamma Dist.)(square of the mean of the Inverse Gamma Dist.).
The mean of the Gamma = αθ = (6)(2.5) = 15. The second moment of the Gamma = α(α+1)θ² =
262.5. The mean of the Inverse Gamma = θ/(α - 1) = 20/(3 - 1) = 10. The second moment of the
Inverse Gamma = θ² / {(α-1)(α-2)} = 400 / {(2)(1)} = 200. Therefore, Variance of the Hypothetical
Mean Pure Premiums = (262.5)(200) - (15²)(10²) = 30,000.
12.3. D. From the two previous solutions, K = EPV/VHM = 6000 / 30000 = 1/5 = 0.2.
Z = 3/3.2 = 93.75%. Prior mean p.p. is (mean frequency)(mean severity) = (15)(10) = 150.
Observed pure premium = 900/3 = 300.
New estimate = (0.9375)(300) + (1 - 0.9375)(150) = 290.63.
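The arithmetic of solutions 12.1 to 12.3 can be checked with a short Python sketch (my own, using only the parameter values given in the problem):

alpha_f, theta_f = 6, 2.5      # Gamma distribution of the Poisson means lambda
alpha_s, theta_s = 3, 20       # Inverse Gamma distribution of the Exponential means delta

mean_lam = alpha_f * theta_f                                 # 15
second_lam = alpha_f * (alpha_f + 1) * theta_f**2            # 262.5
mean_del = theta_s / (alpha_s - 1)                           # 10
second_del = theta_s**2 / ((alpha_s - 1) * (alpha_s - 2))    # 200

epv = 2 * mean_lam * second_del                              # 6000   (12.1)
vhm = second_lam * second_del - (mean_lam * mean_del) ** 2   # 30,000 (12.2)
K = epv / vhm                                                # 0.2
Z = 3 / (3 + K)                                              # 0.9375
estimate = Z * (900 / 3) + (1 - Z) * mean_lam * mean_del     # 290.6  (12.3)
print(epv, vhm, K, Z, estimate)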
12.4. C. For the Gamma-Poisson, the Buhlmann Credibility Parameter is the reciprocal of the scale
parameter of the Gamma Distribution: K = 1/θ = 0.4. Z = 3/(3 + K) = 3/3.4 = 0.882.
The prior mean frequency is the mean of the Gamma = αθ = 15.
The observed frequency is 50/3 = 16.67.
Thus the new estimated frequency is: (0.882)(16.67) + (1 - 0.882)(15) = 16.47.

12.5. D. For the Inverse Gamma-Exponential, the Buhlmann Credibility Parameter is the shape
parameter of the Inverse Gamma Distribution minus one: K = α - 1 = 2. Z = 50/(50 + 2) = 0.962. (Note that
for severity one uses the number of claims to compute the credibility, not the number of years.) The
prior mean severity is the mean of the Inverse Gamma = θ/(α-1) = 10. The observed severity is
900/50 = 18.
Thus the estimated future severity is: (0.962)(18) + (1 - 0.962)(10) = 17.70.
Comment: Note that the product of the separate Buhlmann Credibility estimates of frequency and
severity from these two questions does not match the Buhlmann credibility estimate of the pure
premium (even though the observed data is the same): (16.47)(17.70) = 291.52 ≠ 290.63. Thus it
makes a difference whether one analyzes pure premiums or separately analyzes frequency and
severity.

12.6. E. The density of the Generalized Pareto Distribution of δ is:
g(δ) = Γ(α+τ) 6^α δ^(τ-1) / {Γ(α) Γ(τ) (6+δ)^(α+τ)}.
The chance of x claims is ∫ f(x | δ) g(δ) dδ =
∫_0^∞ {(x+5)!/(x! 5!)} 6^6 δ^x / (6+δ)^(6+x) Γ(α+τ) 6^α δ^(τ-1) / {Γ(α) Γ(τ) (6+δ)^(α+τ)} dδ =
{Γ(x+6)/(Γ(x+1)Γ(6))} 6^(6+α) Γ(α+τ)/{Γ(α)Γ(τ)} ∫_0^∞ δ^(x+τ-1) / (6+δ)^(6+x+α+τ) dδ =
{Γ(x+6)/(Γ(x+1)Γ(6))} 6^(6+α) Γ(α+τ)/{Γ(α)Γ(τ)} 6^-(6+α) Γ(τ+x)Γ(6+α)/Γ(α+x+6+τ) =
{Γ(α+6) Γ(α+τ) / (Γ(6) Γ(α) Γ(τ))} {Γ(τ+x) Γ(6+x) / (Γ(α+x+6+τ) Γ(x+1))}.
Comment: Very difficult. The integrand is of the same form as a Generalized Pareto Distribution.
Since the Generalized Pareto density integrates to unity we know that:
∫_0^∞ Γ(α+τ) θ^α δ^(τ-1) / {Γ(α) Γ(τ) (θ+δ)^(α+τ)} dδ = 1. Therefore,
∫_0^∞ δ^(τ-1) / (θ+δ)^(α+τ) dδ = θ^-α Γ(α)Γ(τ)/Γ(α+τ). This can be rewritten as:
∫_0^∞ δ^(s-1) / (θ+δ)^t dδ = θ^(s-t) Γ(s)Γ(t-s)/Γ(t).
This is an example of a Generalized Waring frequency distribution, with parameter r = 6.
Each insured has a Negative Binomial Distribution, with fixed parameter r; in this case r = 6.
For example, if α = 3.4 and τ = 0.7, then the first few densities of the mixed distribution are:
x:     0       1       2       3       4       5       6       7       8
f(x):  0.4814  0.2002  0.1073  0.0639  0.0406  0.0271  0.0187  0.0134  0.0098

12.7. The distribution of x is Gamma with α = 3 and θ = 1/λ.
This is mathematically the same as if we had 3n claims from Exponentials each with mean 1/λ,
summing to Σx_i.
The distribution of λ is Gamma with α = 4 and θ = 1/2.
Therefore, this is mathematically equivalent to an Inverse Gamma-Exponential (the distribution of
1/λ, the mean of each Exponential, is Inverse Gamma).
Therefore, the estimates from Buhlmann Credibility and Bayes Analysis are equal.
Alternately, the posterior distribution is proportional to:
f(x1) ... f(xn) π(λ) ∝ λ^(3n) exp[-λ Σx_i] λ³ e^(-2λ) = λ^(3n+3) exp[-λ(2 + Σx_i)].
Therefore, the posterior distribution of λ is Gamma with α = 4 + 3n, and θ = 1/(2 + Σx_i).
E[X | λ] = 3/λ. Bayes Analysis estimate = E[3/λ] = 3 E[1/λ] =
(3)(negative first moment of the posterior distribution) = (3)(2 + Σx_i)/(3 + 3n) = (2 + Σx_i)/(n + 1).
Var[X | λ] = 3/λ². The prior distribution of λ is Gamma with α = 4 and θ = 1/2.
(Prior) EPV = E[3/λ²] = 3 (negative second moment of the prior distribution) = (3)(2²)/{(3)(2)} = 2.
(Prior) VHM = Var[3/λ] = 9 Var[1/λ] = (9){E[1/λ²] - E[1/λ]²} = (9){2/3 - (2/3)²} = 2.
K = EPV/VHM = 2/2 = 1. Z = n/(n+1). A priori mean = E[3/λ] = (3)(2/3) = 2.
Buhlmann Credibility estimate = {n/(n+1)} Σx_i/n + {1/(n+1)}(2) = (2 + Σx_i)/(n + 1).
Therefore, the estimates from Buhlmann Credibility and Bayes Analysis are equal.
Comment: If f(x) is Gamma with parameters a and 1/λ, and the prior distribution of λ is Gamma with
parameters α > 2 and θ, then the posterior distribution of λ is Gamma with parameters
α′ = α + an, and 1/θ′ = 1/θ + Σx_i.
The estimated future severity using Bayes Analysis is: E[a/λ] = a(1/θ + Σx_i) / (α + an - 1).
EPV = E[a/λ²] = a/{θ²(α-1)(α-2)}.
VHM = Var[a/λ] = a² Var[1/λ] = (a²){E[1/λ²] - E[1/λ]²} = (a²){1/{θ²(α-1)(α-2)} - (1/{θ(α-1)})²} =
a²/{θ²(α-1)²(α-2)}.
K = EPV/VHM = (α - 1)/a. Z = n/{n + (α-1)/a}. A priori mean = E[a/λ] = a/{θ(α-1)}.
Buhlmann Credibility estimate = {n/{n + (α-1)/a}} Σx_i/n + {((α-1)/a)/{n + (α-1)/a}} a/{θ(α-1)} =
(1/θ + Σx_i)/(n + (α-1)/a) = a(1/θ + Σx_i)/(α + an - 1), the same as from using Bayes Analysis.

12.8. C. This is a Geometric Distribution (a Negative Binomial with r = 1), parameterized somewhat
differently than in Loss Models, with p = 1/(1+β). Therefore, for a given value of p the mean is:
μ(p) = β = (1-p)/p. In order to get the average mean over the whole portfolio we need to take the
integral of μ(p) g(p) dp.
∫_0^1 μ(p) g(p) dp = ∫_0^1 {(1-p)/p} 280 p³(1-p)⁴ dp = 280 ∫_0^1 p²(1-p)⁵ dp = 280 Γ(3)Γ(6)/Γ(3+6)
= 280 (2!)(5!)/8! = 5/3.
Comment: Difficult! Special case of the Beta-Negative Binomial (for r fixed) Conjugate Prior.
For the Beta-Negative Binomial in general, the a priori mean turns out to be rb/(a-1).
For r = 1, b = 5 and a = 4, the a priori mean is (1)(5)/3 = 5/3.
12.9. B. The sum of ten independent Geometric Distributions is a Negative Binomial with
r = 10. Therefore, the number of claims over the ten years has a Negative Binomial Distribution, with
parameters r = 10 and β = (1-p)/p: f(x) = {(10+x-1)!/(9! x!)} p^10 (1-p)^x.
f(6) = (15!) p^10 (1-p)^6 / (9! 6!) = 5005 p^10 (1-p)^6.
By Bayes Theorem, the posterior distribution of the parameter p for this policyholder is proportional
to the product of f(6) times g(p). f(6)g(p) = 5005 p^10 (1-p)^6 280 p³(1-p)⁴, which is proportional to
p^13 (1-p)^10, and is therefore a Beta Distribution with parameters a = 14, b = 11. Thus the posterior
distribution of p is: {Γ(25)/(Γ(14)Γ(11))} p^13 (1-p)^10 =
{(24!)/((13!)(10!))} p^13 (1-p)^10 = 27,457,584 p^13 (1-p)^10.
Comment: For the Beta-Negative Binomial in general, the distribution of p posterior to
observing C claims in E years is a Beta with parameters: a + rE and b + C.
For C = 6, E = 10, a = 4, b = 5 and r = 1,
the posterior Beta has parameters: 4 + (1)(10) = 14, and 5 + 6 = 11.

12.10. A. From the solution to the previous question, the posterior distribution of p is:
{Γ(25)/(Γ(14)Γ(11))} p^13 (1-p)^10. The Geometric Distribution with parameter β = (1-p)/p has mean
μ(p) = (1-p)/p. In order to get the posterior mean frequency we take the integral of μ(p) times the
posterior density of p:
∫_0^1 {(1-p)/p} {Γ(25)/(Γ(14)Γ(11))} p^13 (1-p)^10 dp =
{Γ(25)/(Γ(14)Γ(11))} ∫_0^1 p^12 (1-p)^11 dp = {Γ(25)/(Γ(14)Γ(11))} Γ(13)Γ(12)/Γ(13+12)
= {Γ(12)/Γ(11)} {Γ(13)/Γ(14)} = 11/13.
Comment: Difficult! For the Beta-Negative Binomial in general, the estimated mean annual frequency
posterior to observing C claims in E years is: r(b + C)/(a + rE - 1). Therefore, the posterior
estimated mean annual frequency is (1)(5+6)/{4 + (1)(10) - 1} = 11/13.
For the Beta-Negative Binomial in general, the Buhlmann Credibility parameter is: (a-1)/r.
The credibility assigned to the observation of 6/10 is Z = E/{E + (a-1)/r} = 10/{10 + (4-1)/1} =
10/13. The prior estimate is 5/3. The Buhlmann Credibility estimate is:
(10/13)(6/10) + (1 - 10/13)(5/3) = 6/13 + 5/13 = 11/13, equal to the Bayesian estimate.
This is yet another example of the general result when the likelihood is a member of a linear
exponential family in a conjugate prior situation.

12.11. A. The marginal distribution is:
∫_0^∞ f(x | θ) g(θ) dθ = ∫_0^∞ {θ³ e^(-θ/x) / (2x⁴)} {θ⁴ e^(-θ/10) / 2,400,000} dθ =
{x^-4 / 4,800,000} ∫_0^∞ θ⁷ e^(-θ(1/x + 1/10)) dθ = x^-4 {Γ(8)/4,800,000} (1/x + 1/10)^-8 =
x^-4 {5040/4,800,000} {x⁸ 10⁸ / (x + 10)⁸} = 105,000 x⁴ / (x+10)⁸.
Comment: The integral is reduced to one involving the Gamma function. The conditional distribution
of x is an Inverse Gamma Distribution, with shape parameter 3 and scale parameter θ.
The prior distribution of θ is a Gamma Distribution, with shape parameter 5 and scale parameter 10.
The marginal distribution is a Generalized Pareto Distribution, with α = 3, θ = 10, τ = 5.
In general the marginal distribution will be a Generalized Pareto with parameters: α = shape
parameter of the conditional Inverse Gamma Distribution, θ = scale parameter of the prior Gamma
Distribution, τ = shape parameter of the prior Gamma Distribution. Thus we see how one can obtain
a Generalized Pareto as a mixture of an Inverse Gamma Distribution via a Gamma Distribution.
As a special case, one can obtain a Pareto Distribution as a mixture of an Inverse Exponential via a
Gamma Distribution. As another special case, one can obtain an Inverse Pareto Distribution as a
mixture of an Inverse Gamma via an Exponential Distribution.
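As a numerical cross-check (my own sketch, not part of the original solution; it assumes scipy is available), one can integrate the Inverse Gamma severity against the Gamma prior and compare with the closed form 105,000 x^4/(x+10)^8:

from scipy.integrate import quad
import math

def f_x_given_theta(x, theta):              # Inverse Gamma severity, shape 3, scale theta
    return theta**3 * math.exp(-theta / x) / (2 * x**4)

def g_theta(theta):                         # Gamma prior, shape 5, scale 10
    return theta**4 * math.exp(-theta / 10) / 2400000

for x in (5.0, 10.0, 25.0):
    mixed, _ = quad(lambda t: f_x_given_theta(x, t) * g_theta(t), 0, math.inf)
    closed_form = 105000 * x**4 / (x + 10)**8
    print(x, mixed, closed_form)            # the two columns agree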

12.12. A. From the previous solution, the marginal distribution is: 105,000 x⁴/(x+10)⁸.
This is a Generalized Pareto Distribution with α = 3, θ = 10, τ = 5.
Its mean is: θτ/(α-1) = (10)(5)/2 = 25.
Alternately, given θ, one has an Inverse Gamma severity distribution with parameters α = 3 and θ,
with mean θ/(α-1) = θ/2. Thus the overall mean is E[θ/2] = E[θ]/2 =
(mean of the Gamma distribution of θ)/2 = (5)(10)/2 = 25.
12.13. E. Using Bayes Theorem, the posterior distribution is proportional to the product of the
chance of the observation and the prior distribution of θ: f(5) f(8) f(10) f(20) g(θ), which is proportional
to: θ³e^(-θ/5) θ³e^(-θ/8) θ³e^(-θ/10) θ³e^(-θ/20) θ⁴e^(-θ/10) = θ^16 e^(-0.575θ).
This is proportional to a Gamma Distribution with α = 17 and scale parameter 1/0.575:
(0.575^17) θ^16 e^(-0.575θ) / Γ(17) = 3.92 × 10^-18 θ^16 e^(-0.575θ).

12.14. C. From the previous solution, the posterior distribution is: (0.575^17) θ^16 e^(-0.575θ) / Γ(17),
and thus the predictive distribution is:
∫_0^∞ {θ³ e^(-θ/x) / (2x⁴)} {(0.575^17) θ^16 e^(-0.575θ) / Γ(17)} dθ =
{0.575^17 / (2 Γ(17))} x^-4 ∫_0^∞ θ^19 e^(-θ(1/x + 0.575)) dθ =
{0.575^17 / (2 Γ(17))} x^-4 Γ(20) (1/x + 0.575)^-20 =
{Γ(20)/(2 Γ(17))} (0.575^17) x^-4 x^20 0.575^-20 (x + 1.739)^-20 =
(2907) x^16 (0.575^-3) (x + 1.739)^-20 = 15,291 x^16 / (x + 1.739)^20.
Comment: The integral is reduced to one involving the Gamma function. The predictive distribution
is a Generalized Pareto Distribution with α = 3, θ = 1/0.575 = 1.739, τ = 17.
12.15. A. From the previous solution, the predictive distribution is a Generalized Pareto Distribution
with α = 3, θ = 1/0.575 = 1.739, τ = 17. This has a mean of θτ/(α-1) = (17)(1/0.575)/2 = 14.78.
12.16. D. Given θ, one has an Inverse Gamma severity distribution with shape parameter 3 and
variance θ²/{(3-1)²(3-2)} = θ²/4. EPV = E[θ²/4] = E[θ²]/4 =
(second moment of the prior Gamma distribution of θ)/4 = (500 + 50²)/4 = 3000/4 = 750.
12.17. C. Given θ, one has an Inverse Gamma severity distribution with shape parameter 3 and
mean θ/(3-1) = θ/2. Thus the Variance of the Hypothetical Mean Severities = Var[θ/2] = Var[θ]/2² =
(variance of the prior Gamma distribution of θ)/4 = (5)(10²)/4 = 125.
12.18. E. From previous solutions, EPV = 750, VHM = 125, and thus K = 750/125 = 6.
Z = 4/(4 + 6) = 0.4. From a previous solution, the a priori mean is 25. The observation is
(5+8+10+20)/4 = 10.75. Thus the estimate is: (10.75)(0.4) + (25)(0.6) = 19.3.
Comment: Note that the estimates from Buhlmann Credibility and Bayesian Analysis are not equal.
While the Gamma is a Conjugate Prior for the Inverse Gamma severity, the Inverse Gamma is not
a member of a linear exponential family.
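A short Python sketch (mine, simply restating solutions 12.15 to 12.18) makes the contrast explicit: with an Inverse Gamma severity, the Bayes and Buhlmann estimates do not match.

claims = [5, 8, 10, 20]

# Bayes (12.13-12.15): posterior Gamma for theta has alpha = 17 and rate 0.575;
# the predictive Generalized Pareto mean is theta*tau/(alpha - 1) = (17/0.575)/2.
bayes = (17 / 0.575) / 2                        # 14.78

# Buhlmann (12.16-12.18): EPV = 750, VHM = 125, so K = 6.
K = 750 / 125
Z = len(claims) / (len(claims) + K)             # 0.4
buhlmann = Z * sum(claims) / len(claims) + (1 - Z) * 25   # 19.3
print(bayes, buhlmann)                          # 14.78 versus 19.3 -- not equal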

12.19. B. The marginal distribution is:
∫_0^∞ f(x | δ) g(δ) dδ = ∫_0^∞ {δ⁶ x⁵ e^(-δx) / 120} {3456 δ³ e^(-12δ)} dδ =
28.8 x⁵ ∫_0^∞ δ⁹ e^(-δ(x+12)) dδ = 28.8 x⁵ Γ(10) (x+12)^-10 = 10,450,944 x⁵ (x+12)^-10.
Comment: The integral is reduced to one involving the Gamma function. The conditional distribution
of x is a Gamma Distribution with shape parameter 6 and scale parameter 1/δ.
The prior distribution of δ is another Gamma Distribution with shape parameter 4 and scale
parameter 1/12. The marginal distribution is a Generalized Pareto Distribution with α = 4,
θ = 12, τ = 6. In general the marginal distribution will be a Generalized Pareto with parameters:
α = shape parameter of the prior Gamma Distribution, θ = 1 / scale parameter of the prior Gamma
Distribution, τ = shape parameter of the conditional Gamma Distribution. Thus we see how one can
obtain a Generalized Pareto as a mixture of a Gamma Distribution via another Gamma Distribution.
As a special case, one can obtain a Pareto Distribution as a mixture of an Exponential via a Gamma
Distribution. (Reformulated, this is the Inverse Gamma-Exponential Conjugate Prior.)
12.20. D. From the previous solution, the marginal distribution is: 10,450,944 x⁵ (x+12)^-10.
This is a Generalized Pareto Distribution with parameters α = 4, θ = 12, τ = 6.
Its mean is: θτ/(α-1) = (12)(6)/3 = 24.
Alternately, given δ, one has a Gamma severity distribution with mean 6/δ.
Thus the overall mean is E[6/δ] = 6 E[1/δ] =
(6)(the negative first moment of the Gamma distribution of δ, with α = 4 and θ = 1/12)
= (6){(1/12)^-1 / (4-1)} = 24.
12.21. C. Using Bayes Theorem, the posterior distribution is proportional to the product of the
chance of the observation and the prior distribution of δ: f(5) f(8) f(10) f(20) g(δ),
which is proportional to: δ⁶e^(-5δ) δ⁶e^(-8δ) δ⁶e^(-10δ) δ⁶e^(-20δ) δ³e^(-12δ) = δ^27 e^(-55δ).
This is proportional to a Gamma Distribution with α = 28 and θ = 1/55:
(55^28) δ^27 e^(-55δ) / Γ(28) = 4.934 × 10^20 δ^27 e^(-55δ).

12.22. D. From the previous solution, the posterior distribution of δ is: (55^28) δ^27 e^(-55δ) / Γ(28),
and thus the predictive distribution is:
∫_0^∞ {δ⁶ x⁵ e^(-δx) / 120} {(55^28) δ^27 e^(-55δ) / Γ(28)} dδ =
{55^28 / (120 Γ(28))} x⁵ ∫_0^∞ δ^33 e^(-δ(x + 55)) dδ = {55^28 / (120 Γ(28))} x⁵ Γ(34) (x + 55)^-34
= 3.570 × 10^55 x⁵ (x + 55)^-34.
Comment: The integral is reduced to one involving the Gamma function. The predictive distribution
is a Generalized Pareto Distribution with α = 28, θ = 55, and τ = 6.
12.23. A. From the previous solution, the predictive distribution is a Generalized Pareto Distribution
with α = 28, θ = 55, τ = 6. This has a mean of θτ/(α-1) = (6)(55)/27 = 12.22.
Alternately, given δ the severity is Gamma with α = 6 and θ = 1/δ, and thus mean 6/δ.
From a previous solution, the posterior distribution of δ is: (55^28) δ^27 e^(-55δ) / Γ(28),
a Gamma Distribution with α = 28 and θ = 1/55.
Thus the posterior mean of 1/δ is: θ^-1 Γ(α-1)/Γ(α) = 55 Γ(27)/Γ(28) = 55/27.
Thus the posterior mean severity is: 6 E[1/δ] = (6)(55)/27 = 12.22.
Comment: For a Gamma Distribution, E[X^k] = θ^k Γ(α + k)/Γ(α), k > -α.

12.24. From a previous solution, the posterior distribution of δ is: (55^28) δ^27 e^(-55δ) / Γ(28),
a Gamma Distribution with α = 28 and θ = 1/55.
Thus posterior, E[1/δ²] = θ^-2 Γ(α-2)/Γ(α) = 55² Γ(26)/Γ(28) = 55²/{(26)(27)}.
Var[6/δ] = 36 Var[1/δ] = 36{E[1/δ²] - E[1/δ]²} = (36){55²/{(26)(27)} - (55/27)²} = 5.745.
From the previous solution, E[6/δ] = (6)(55)/27 = 12.22.
Thus using the Normal Approximation, a 95% confidence interval is:
12.22 ± 1.960 √5.745 = 12.22 ± 4.70 = [7.52, 16.92].
Comment: Similar to Exercise 15.81 in Loss Models.
For a Gamma Distribution, E[X^k] = θ^k Γ(α + k)/Γ(α), k > -α.
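A small Python sketch (my own restatement of this solution's arithmetic):

from math import sqrt

alpha, rate = 28, 55                   # posterior Gamma of delta (rate = 1/theta = 55)
e_inv = rate / (alpha - 1)             # E[1/delta] = 55/27
e_inv2 = rate**2 / ((alpha - 1) * (alpha - 2))   # E[1/delta^2] = 55^2/(27*26)

mean = 6 * e_inv                       # 12.22
var = 36 * (e_inv2 - e_inv**2)         # 5.745
print(mean, var, (mean - 1.96 * sqrt(var), mean + 1.96 * sqrt(var)))   # interval [7.52, 16.92]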

12.25. E. Given δ, one has a Gamma severity distribution with variance 6/δ².
Thus the Expected Value of the Process Variance is E[6/δ²] = 6 E[1/δ²] =
6 (negative second moment of the Gamma distribution of δ) = (6){(1/12)^-2 / ((4-1)(4-2))} = 144.
12.26. C. Given δ, one has a Gamma severity distribution with mean 6/δ.
Thus the Variance of the Hypothetical Mean Severities is Var[6/δ] = 6² Var[1/δ] =
(36){E[1/δ²] - E²[1/δ]} = 36{24 - 4²} = 288.
12.27. A. From previous solutions, EPV = 144, VHM = 288, and thus K = 144/288 = 1/2.
Z = 4/(4 + 1/2) = 8/9. From a previous solution, the a priori mean is 24. The observation is:
(5 + 8 + 10 + 20)/4 = 10.75. Thus the estimate is: (10.75)(8/9) + (24)(1/9) = 110/9 = 12.2.
Comment: Note that the estimate from Buhlmann Credibility is equal to that from Bayesian
Analysis. The Gamma is a Conjugate Prior for the Gamma severity, and the Gamma severity, with
fixed shape parameter, is a member of a linear exponential family.
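Combining solutions 12.23 and 12.27 in a short Python sketch (mine) confirms the equality for the Gamma severity with fixed shape parameter:

claims = [5, 8, 10, 20]

# Bayes (12.21-12.23): posterior Gamma(alpha = 28, rate = 55); estimate = 6 * E[1/delta].
bayes = 6 * 55 / 27                         # 12.22

# Buhlmann (12.25-12.27): EPV = 144, VHM = 288, so K = 1/2.
K = 144 / 288
Z = len(claims) / (len(claims) + K)         # 8/9
buhlmann = Z * sum(claims) / len(claims) + (1 - Z) * 24   # 110/9 = 12.22
print(bayes, buhlmann)                      # equal, as expected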
12.28. The posterior density is proportional to the product of the prior distribution and the densities
at the observations.
The prior Inverse Transformed Gamma density, with θ taking the place of x, is proportional to:
θ^(-11/6) exp(-7/θ^(1/3)).
The Weibull density at observation x is proportional to: θ^(-1/3) exp(-x^(1/3)/θ^(1/3)).
Thus at x = 1, 8, 27, 125, the densities are proportional to:
θ^(-1/3) exp(-1/θ^(1/3)), θ^(-1/3) exp(-2/θ^(1/3)), θ^(-1/3) exp(-3/θ^(1/3)), and θ^(-1/3) exp(-5/θ^(1/3)).
Thus the posterior distribution is proportional to:
θ^(-11/6) exp(-7/θ^(1/3)) θ^(-1/3) exp(-1/θ^(1/3)) θ^(-1/3) exp(-2/θ^(1/3)) θ^(-1/3) exp(-3/θ^(1/3)) θ^(-1/3) exp(-5/θ^(1/3)) =
θ^(-19/6) exp(-18/θ^(1/3)).
This is proportional to an Inverse Transformed Gamma distribution with α = 13/2 = 6.5, θ = 18³ and τ = 1/3.
Note that then: ατ + 1 = (1/3)(13/2) + 1 = 19/6.
Comment: The Inverse Transformed Gamma is a Conjugate Prior to the Weibull likelihood with tau
fixed; the τ parameters of the Weibull and the Inverse Transformed Gamma have to be equal.
If one instead parameterized the Weibull in terms of δ = θ^τ, then δ follows an Inverse Gamma
Distribution, and the Inverse Gamma is a conjugate prior to the Weibull.

12.29. E. As determined in the previous solution, the posterior distribution of θ is:
an Inverse Transformed Gamma distribution with α = 6.5, θ = 18³ and τ = 1/3.
It has mean: 18³ Γ(6.5 - 3) / Γ(6.5) = 18³ / {(5.5)(4.5)(3.5)} = 67.32.
Given a Weibull with parameters θ and τ = 1/3, the mean severity is: θ Γ(1 + 1/τ) = θ Γ(1 + 3) = 6θ.
The posterior expected value of the severity is: E[6θ] = 6 E[θ] = (6)(67.32) = 404.
Comment: Using the squared error loss function the Bayesian estimate is the mean.
Loss functions are discussed in Mahler's Guide to Buhlmann Credibility.
12.30. A. As determined in a previous solution, the posterior distribution of θ is:
an Inverse Transformed Gamma distribution with α = 6.5, θ = 18³ and τ = 1/3.
Using the zero-one loss function, the Bayesian estimate of θ is the mode of the posterior Inverse
Transformed Gamma distribution: θ {τ / (ατ + 1)}^(1/τ) = 18³ {(1/3)/(13/6 + 1)}³ = 6.80.
Given a Weibull with parameters θ and τ = 1/3, the mean severity is: θ Γ(1 + 1/τ) = θ Γ(1 + 3) = 6θ.
Thus the estimated hypothetical mean severity is: (6)(6.80) = 40.8.
12.31. C. As determined in a previous solution, the posterior distribution of θ is:
an Inverse Transformed Gamma distribution with α = 6.5, θ = 18³ and τ = 1/3.
Using the absolute error loss function, the Bayesian estimate of θ is the median of the posterior
distribution.
The Inverse Transformed Gamma Distribution function is: 1 - Γ[α; (θ/x)^τ] = 1 - Γ[6.5; 18/x^(1/3)], so the median
is such that Γ[6.5; 18/x^(1/3)] = 0.5.
We are given that Γ[6.5; 6.16988] = 0.5.
Therefore, 18/x^(1/3) = 6.16988. Median = (18/6.16988)³ = 24.8.
Given a Weibull with parameters θ and τ = 1/3, the mean severity is: θ Γ(1 + 1/τ) = 6θ.
Thus the estimate of the hypothetical mean severity is: (6)(24.8) = 149.
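The three loss functions of 12.29 to 12.31 can be summarized in one Python sketch (mine; it assumes scipy for the Gamma function):

from scipy.special import gamma as G

alpha, scale, tau = 6.5, 18**3, 1/3          # posterior Inverse Transformed Gamma of theta

post_mean = scale * G(alpha - 1/tau) / G(alpha)            # 67.32  (squared error loss)
post_mode = scale * (tau / (alpha * tau + 1)) ** (1/tau)   # 6.80   (zero-one loss)
post_median = (18 / 6.16988) ** 3                          # 24.8   (absolute error loss),
                                                           # from the given incomplete Gamma value
print(6 * post_mean, 6 * post_mode, 6 * post_median)       # about 404, 41, 149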

12.32. E. The probability of the observation is proportional to:
{exp[-(800 - 1000)²/(2v)] / √v} {exp[-(900 - 1000)²/(2v)] / √v} = exp[-25,000/v] / v.
The prior distribution of v is proportional to: e^(-50,000/v) / v⁵.
Therefore, the posterior distribution of v is proportional to:
{exp[-25,000/v] / v} e^(-50,000/v) / v⁵ = e^(-75,000/v) / v⁶.
This is proportional to an Inverse Gamma Distribution with α = 5 and θ = 75,000.
The posterior distribution of v is an Inverse Gamma Distribution with α = 5 and θ = 75,000.
The mean of the posterior distribution of v is: 75,000/(5 - 1) = 18,750.
12.33. B. Applying Bayes Theorem, the posterior distribution is proportional to:
{y1² y2³ y3⁴ y4⁵ (1 - y1 - y2 - y3 - y4)^19} {y1³ y2² y3⁴ y4⁶ (1 - y1 - y2 - y3 - y4)^35} =
y1⁵ y2⁵ y3⁸ y4^11 (1 - y1 - y2 - y3 - y4)^54.
This posterior distribution is also a Dirichlet Distribution, with a0 = 55, a1 = 6, a2 = 6, a3 = 9, and
a4 = 12. a = Σa_j = 55 + 6 + 6 + 9 + 12 = 88. E[y_j] = E[j-1|q_x] = a_j/a.
E[q_x] = a1/a = 6/88 = 0.06818. E[1|q_x] = a2/a = 6/88 = 0.06818.
E[2|q_x] = a3/a = 9/88 = 0.10227. E[3|q_x] = a4/a = 12/88 = 0.13636.
The posterior estimate of 3p_x is:
(1 - q_x)(1 - 1|q_x)(1 - 2|q_x) = (1 - 0.06818)(1 - 0.06818)(1 - 0.10227) = 0.7795.
The posterior estimate of 3q_x is: 1 - 0.7795 = 0.2205.
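The Dirichlet update in this solution is mechanical enough to code directly; here is a minimal Python sketch (mine):

prior = {"a0": 20, "a1": 3, "a2": 4, "a3": 5, "a4": 6}
deaths = {"a1": 3, "a2": 2, "a3": 4, "a4": 6}    # deaths in years 1 through 4
survivors = 35                                    # added to a0

post = dict(prior)
post["a0"] += survivors
for key, d in deaths.items():
    post[key] += d
total = sum(post.values())                        # 88

q = [post[k] / total for k in ("a1", "a2", "a3")]
three_p_x = (1 - q[0]) * (1 - q[1]) * (1 - q[2])
print(round(1 - three_p_x, 4))                    # 0.2205, answer B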
12.34. F(x) = 1 - exp[-c x²]. f(x) = 2 c x exp[-c x²].
Thus the chance of the observation is proportional to: c⁴ exp[-c(1² + 2² + 3² + 5²)] = c⁴ exp[-39c].
The prior distribution is proportional to the density of the Gamma prior: c⁹ exp[-50c].
Thus the posterior distribution is proportional to: c⁹ exp[-50c] c⁴ exp[-39c] = c^13 exp[-89c].
Thus the posterior distribution is also a Gamma, with parameters α = 14 and θ = 1/89.
Comment: The severity distribution is a Weibull parameterized differently than in Loss Models.

12.35. C. For the Inverse Gamma, f(x | q) = q^α e^(-q/x) / {Γ(α) x^(α+1)} = q⁶ e^(-q/x) / {5! x⁷}.
For the Exponential, π(q) = e^(-q/100)/100.
The posterior distribution is proportional to: f(10 | q) f(20 | q) f(50 | q) π(q),
which is proportional to: q⁶e^(-q/10) q⁶e^(-q/20) q⁶e^(-q/50) e^(-q/100) = q^18 e^(-0.18q).
This is a Gamma Distribution with parameters 19 and 1/0.18 = 5.556.
For the zero-one loss function, the Bayes estimate is the mode of the posterior distribution.
The mode of a Gamma distribution for α > 1 is: θ(α - 1) = (5.556)(19 - 1) = 100.
Comment: This is an example of an Exponential-Inverse Gamma,
a special case of the Gamma-Inverse Gamma with α = 1.
α′ = α + aC = 1 + (6)(3) = 19, and 1/θ′ = 1/θ + Σ(1/x_i) = 1/100 + 1/10 + 1/20 + 1/50 = 0.18.
For the squared error loss function, the Bayes estimate is the mean of the posterior distribution,
which in this case is: αθ = (5.556)(19) = 105.6.

12.36. By differentiating with respect to x: f(x | β) = β 10^β / x^(β+1), x > 10.
The chance of the observation given β is: f(15 | β) f(25 | β) f(60 | β).
This is proportional to: β³ (10/15)^β (10/25)^β (10/60)^β = β³ 1.5^-β 2.5^-β 6^-β = β³ 22.5^-β.
The prior density of β is a Gamma Distribution and is proportional to: β³ e^(-2β).
Therefore, the posterior density of β is proportional to:
β³ e^(-2β) β³ 22.5^-β = β⁶ exp[-β(2 + ln(22.5))] = β⁶ exp[-β / 0.1956].
Thus the posterior distribution of β is Gamma with α = 7 and θ = 0.1956.
Comment: The severity distribution is a Single Parameter Pareto with α = β and θ = 10.
Note that for β ≤ 1, the mean severity does not exist. Since the Gamma has support starting at zero,
neither the marginal nor the predictive distributions have a finite mean.
In general, the Gamma Distribution is a Conjugate Prior to the Single Parameter Pareto likelihood
with θ fixed. In general, if the prior Gamma has shape parameter α and scale parameter θ, then the
posterior Gamma has parameters: α′ = α + n, and 1/θ′ = 1/θ + Σ_{i=1}^{n} ln[x_i / 10], where 10 is the
fixed scale parameter of the Single Parameter Pareto.
In this case: α′ = 4 + 3 = 7, and 1/θ′ = 1/0.5 + ln[15/10] + ln[25/10] + ln[60/10] = 5.1135.
[Figure: a graph comparing the prior and posterior Gamma densities of β (density versus β); the posterior is more concentrated than the prior.]

12.37. C. By Bayes Theorem, the posterior distribution of β is proportional to the product of the
prior distribution of β and the chance of the observation given β.
The chance of the observation is a product of four density functions: f(1; β) f(4; β) f(9; β) f(64; β),
which is proportional to: β⁴ exp(-β) exp(-2β) exp(-3β) exp(-8β) = β⁴ e^(-14β).
The a priori distribution of β is: 2^0.7 β^(0.7-1) e^(-2β) / Γ(0.7).
Thus the posterior distribution of β is proportional to: β⁴ e^(-14β) β^(-0.3) e^(-2β) = β^3.7 e^(-16β).
This is proportional to a Gamma distribution with α = 4.7 and θ = 1/16.
Thus the mean of the posterior distribution is αθ = 4.7/16 = 0.294.
Comment: The Gamma Distribution is a conjugate prior distribution to the Weibull Distribution.
Assume that the Weibull is parameterized in a somewhat different way than in Loss Models, with
parameters τ (fixed and known) and β; β here corresponds to θ^-τ in Loss Models.
If the prior Gamma distribution of β has parameters α and θ, then the posterior Gamma distribution
of β has parameters: α′ = α + n, and 1/θ′ = 1/θ + Σ x_i^τ.
In this case, α′ = 0.7 + 4 = 4.7, and 1/θ′ = 2 + (1^(1/2) + 4^(1/2) + 9^(1/2) + 64^(1/2)) = 16.
One could instead put this all in terms of an Inverse Gamma Distribution of 1/β.
Integrating the probability weight β^3.7 e^(-16β) from zero to infinity one gets Γ(4.7) / 16^4.7.
Thus dividing by this constant will give the posterior distribution, which must integrate to unity.
The posterior distribution is: 16^4.7 β^3.7 e^(-16β) / Γ(4.7),
a Gamma distribution with α = 4.7 and θ = 1/16.

12.38. E. The posterior distribution of α is a Gamma with parameters: shape = n + 2.2 = 5 + 2.2 = 7.2,
and 1/θ′ = 1/θ + Σ ln(1 + x_i) = 2 + ln(4) + ln(10) + ln(13) + ln(7) + ln(5) =
2 + ln((4)(10)(13)(7)(5)) = 2 + ln(18,200). θ′ = 1/{2 + ln(18,200)}.
The mean of the posterior Gamma is: 7.2/{2 + ln(18,200)}.
Alternately, not using the hint, the chance of the observation is a product of probability density
functions of the Pareto: f(x) = α(1 + x)^-(α+1).
The a priori distribution of α is given by the prior Gamma Distribution:
g(α) = α^(2.2-1) e^(-α/θ) / {θ^2.2 Γ(2.2)} = 2^2.2 α^1.2 e^(-2α) / Γ(2.2), which is proportional to α^1.2 e^(-2α).
Thus the posterior distribution of α is proportional to:
α(1+3)^-(α+1) α(1+9)^-(α+1) α(1+12)^-(α+1) α(1+6)^-(α+1) α(1+4)^-(α+1) α^1.2 e^(-2α) =
α^6.2 (18,200)^-(α+1) e^(-2α) = (18,200)^-1 α^6.2 e^(-α(2 + ln(18,200))).
This is proportional to a Gamma probability density with parameters:
shape 7.2 and scale 1/{2 + ln(18,200)}.
The mean of the posterior Gamma is: 7.2/{2 + ln(18,200)}.
Comment: When one is minimizing the squared error, the Bayes estimator is the mean of the
posterior distribution. The posterior distribution of α is proportional to the product of the chance of
the observation given α and the a priori chance of α.
The Gamma is a Conjugate Prior for the Pareto (with a fixed scale parameter).
12.39. D. 1. False. The posterior distribution for one year can be used as the prior distribution for
the next year. 2. True. 3. True. The Beta Distribution is a conjugate prior distribution to the Binomial
Distribution with fixed n.
12.40. A. Applying Bayes Theorem, the posterior distribution is proportional to:
{t1⁶ t2³ t3^14 t4⁸ (1 - t1 - t2 - t3 - t4)³} {t1⁷ t2^12 t3^11 t4^11 (1 - t1 - t2 - t3 - t4)⁹} =
t1^13 t2^15 t3^25 t4^19 (1 - t1 - t2 - t3 - t4)^12.
This posterior distribution is also a Dirichlet Distribution, with a0 = 13, a1 = 14, a2 = 16,
a3 = 26, and a4 = 20. a = Σa_j = 13 + 14 + 16 + 26 + 20 = 89. E[t_j] = E[j-1|q_x] = a_j/a.
E[2|q_x] = a3/a = 26/89 = 0.292.
Comment: E[q_x] = a1/a = 14/89. E[1|q_x] = a2/a = 16/89. E[3|q_x] = a4/a = 20/89.

12.41. C. Assume that x lives die in year 3. Then 40 - 4 - 24 - x = 12 - x lives survive beyond
year 3. Applying Bayes Theorem, the posterior distribution is proportional to:
{t1⁷ t2⁵ t3³ (1 - t1 - t2 - t3)} {t1⁴ t2^24 t3^x (1 - t1 - t2 - t3)^(12-x)} = t1^11 t2^29 t3^(3+x) (1 - t1 - t2 - t3)^(13-x).
This posterior distribution is also a Dirichlet Distribution, with a0 = 14 - x, a1 = 12, a2 = 30,
and a3 = 4 + x.
E[q0] = 12/(14 - x + 12 + 30 + 4 + x) = 12/60 = 1/5.
E[1|q0] = 30/(14 - x + 12 + 30 + 4 + x) = 30/60 = 1/2.
q1 = 1|q0 / (1 - q0) = (1/2)/(1 - 1/5) = 5/8.
12.42. A. For the Beta-Bernoulli the predictive distribution is a Bernoulli, the predictive distribution for
the Gamma-Poisson is the Negative Binomial, and for the Inverse Gamma-Exponential the
predictive distribution is the Pareto.
12.43. D. The prior distribution is proportional to: t1^49 t2^39 t3^29 ....
Assume that x die in the first year, y die in the second year, and z die in the third year.
Applying Bayes Theorem, the posterior distribution is proportional to: t1^(49+x) t2^(39+y) t3^(29+z) ....
This posterior distribution is also a Dirichlet Distribution, with a1′ = 50 + x, a2′ = 40 + y,
a3′ = 30 + z. In general, a_j′ = a_j + d_j, where d_j is the number who die during year j and d_0 is the number who
survive beyond year k. a′ = Σa_j′ = Σa_j + Σd_j = 150 + 100 = 250. E[t_j] = E[j-1|q_x] = a_j′/a′.
E[2|q_x] = a3′/a′ = (30 + z)/250.
The posterior estimate of 2|q_x = (1 - q0)(1 - q1) q2 = (16/25)(3/5)(7/12) = 0.224.
(30 + z)/250 = 0.224. z = 26.
12.44. D. The prior rate for the first year is: a1/a = 0.06. a1 = 0.06a.
Let there be d_j deaths observed during year j, let d_0 lives survive beyond year k, and let
d = Σ_{j=0}^{k} d_j = the original number of lives. Then applying Bayes Theorem, the posterior distribution is
Dirichlet with parameters a_j′ = a_j + d_j. The posterior estimate of t_i = i-1|q_x is:
a_i′/a′ = (a_i + d_i)/(a + d) = (a_i/a){a/(a + d)} + (d_i/d){d/(a + d)}.
The weight to the prior rate, a_i/a, is: a/(a + d) = a/(a + 200) = 5/9. a = 250.
a1 = (0.06)(250) = 15.
Comment: The estimate from Bayes Analysis is equal to that from Buhlmann Credibility, with
K = a = 250, Z = d/(d + a), and 1 - Z = a/(d + a) = 250/(200 + 250) = 5/9.

12.45. B. By Bayes Theorem, the posterior distribution of θ is proportional to the product of the a
priori probability of θ and the chance of the observation given θ:
g(θ) f(2 | θ) = {e^(-θ/4)/4} {θ e^(-θ/2)/2²} = θ e^(-3θ/4)/16, which is proportional to: θ e^(-3θ/4).
Comment: The posterior distribution of θ is proportional to: θ e^(-θ/(4/3)).
This is a Gamma Distribution with α = 2 and scale parameter 4/3. Looking in the Appendix A
attached to the exam, the constant is: 1/{Γ(α) θ^α} = 1/{Γ(2) (4/3)²} = 9/16.
Therefore, the posterior distribution is: (9/16) θ e^(-3θ/4).
This is an example of a Gamma - Inverse Exponential Conjugate Prior. The severity distribution is
an Inverse Exponential Distribution with scale parameter θ. The prior distribution of θ is a Gamma
Distribution with α = 1 and scale parameter 4. One observes one claim of size 2
(C = 1, sum of inverses of loss sizes = 1/2). The posterior distribution of θ is also a Gamma
Distribution, but with α = 1 + C = 1 + 1 = 2 and scale parameter: 1/(1/4 + 1/2) = 4/3.
12.46. B. The predictive distribution is the Inverse Exponential severity mixed via the posterior
Gamma:
∫_0^∞ {θ e^(-θ/x) / x²} {(9/16) θ e^(-3θ/4)} dθ = {9/(16x²)} ∫_0^∞ θ² e^(-θ(3/4 + 1/x)) dθ =
{9/(16x²)} 2! / (3/4 + 1/x)³ = (8x/3)(x + 4/3)^-3.
This is an Inverse Pareto as per Appendix A of Loss Models, with τ = 2 and θ = 4/3.
Comment: Note that neither the Inverse Exponential nor the Inverse Pareto have a finite mean.
Therefore, one cannot use Bayesian Analysis or Buhlmann Credibility to predict the future severity
of this insured. The density of the Inverse Pareto is: τ θ x^(τ-1) / (x + θ)^(τ+1).

Section 13, Important Formulas and Ideas


Here are what I believe are the most important formulas and ideas from this study guide to know for
the exam.

Mixing Poissons (Section 2):
When mixing Poissons: EPV = E[λ] = the a priori mean frequency.

Gamma-Poisson Conjugate Prior (Section 4):
Estimate using Buhlmann Credibility = Estimate using Bayes Analysis.
Inputs: Prior Gamma Distribution has parameters α and θ.
Poisson Likelihood (Model Distribution) has mean λ that varies across the portfolio via a Gamma
Distribution.
Marginal Distribution: Negative Binomial Distribution with parameters r = α, β = θ.
r goes with alpha, and beta rhymes with theta.
Buhlmann Credibility Parameter = 1/θ.
Observations: C claims for E exposures.
Posterior Gamma Distribution has parameters α′ = α + C, 1/θ′ = E + 1/θ.
Predictive Distribution: Negative Binomial with parameters r = α′ and β = θ′.
The Exponential is a special case of the Gamma with α = 1.
For the Exponential-Poisson, the marginal distribution is Geometric with β = θ.
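A minimal Python sketch (my own; the function name and the example numbers are mine, the formulas are those above) that carries out the Gamma-Poisson update:

def gamma_poisson_update(alpha, theta, claims, exposures):
    alpha_post = alpha + claims
    theta_post = 1 / (exposures + 1 / theta)
    K = 1 / theta                              # Buhlmann credibility parameter
    Z = exposures / (exposures + K)
    estimate = alpha_post * theta_post         # equals Z*(C/E) + (1 - Z)*alpha*theta
    return alpha_post, theta_post, Z, estimate

# Illustrative example: prior Gamma(3, 2/3), 5 claims in 4 exposures.
print(gamma_poisson_update(3, 2/3, 5, 4))      # posterior Gamma(8, 2/11); estimate = 1.4545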

Beta-Bernoulli and Beta-Binomial Conjugate Priors (Sections 6 to 7):
Estimate using Buhlmann Credibility = Estimate using Bayes Analysis.
Inputs: Prior Beta has parameters a, b, and θ = 1.
For use in the Beta-Bernoulli, the parameter θ in the Beta is always equal to one.
For θ = 1, f(x) = {(a + b - 1)! / ((a - 1)! (b - 1)!)} x^(a-1) (1 - x)^(b-1), 0 ≤ x ≤ 1.
Bernoulli Likelihood has mean q that varies across the portfolio via the Beta.
Marginal Distribution: Bernoulli with parameter q = a/(a + b).
Buhlmann Credibility Parameter = a + b.
Observations: r claims (successes) for n exposures (trials).
Posterior Beta has parameters a′ = a + r and b′ = b + n - r.
Predictive Distribution: Bernoulli with parameter q = a′/(a′ + b′).
The Uniform Distribution on (0, 1) is a special case of the Beta Distribution, with a = 1 and
b = 1 (and θ = 1). Thus the case of Bernoulli parameters uniformly distributed on (0, 1) is a special
case of the Beta-Bernoulli, with a = 1 and b = 1.
If for fixed m, the q parameter of the Binomial is distributed over a portfolio by a Beta,
then the posterior distribution of q parameters is also given by a Beta with parameters:
a′ = a + number of claims. b′ = b + m(number of years) - number of claims.
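A matching Python sketch (mine, using the notation above) for the Beta-Bernoulli update, showing the Buhlmann and Bayes answers agree:

def beta_bernoulli_update(a, b, successes, trials):
    a_post, b_post = a + successes, b + trials - successes
    bayes = a_post / (a_post + b_post)                        # predictive Bernoulli q
    Z = trials / (trials + a + b)                             # K = a + b
    buhlmann = Z * successes / trials + (1 - Z) * a / (a + b)
    return bayes, buhlmann                                    # always equal

print(beta_bernoulli_update(1, 1, 3, 10))   # uniform prior, 3 claims in 10 trials -> (1/3, 1/3)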

Inverse Gamma-Exponential Conjugate Prior (Section 9):
Estimate using Buhlmann Credibility = Estimate using Bayes Analysis.
Inputs: Prior Inverse Gamma has parameters α and θ.
Exponential Likelihood has mean that varies across the portfolio via an Inverse Gamma
Distribution with parameters α and θ.
Marginal Distribution: Pareto with the same parameters as the Prior Inverse Gamma.
Buhlmann Credibility Parameter = α - 1.
Observations: L dollars of loss on C claims.
Posterior Inverse Gamma has parameters α′ = α + C and θ′ = θ + L.
Predictive Distribution: Pareto with the same parameters as the Posterior Inverse Gamma.

Normal-Normal Conjugate Prior (Section 10):
Estimate using Buhlmann Credibility = Estimate using Bayes Analysis.
Inputs: Prior Normal has parameters μ and σ; Normal Likelihood has fixed variance s², with mean
m that varies across the portfolio via the prior Normal.
Marginal Distribution: Normal with mean μ, and variance σ² + s².
Buhlmann Credibility Parameter = EPV/VHM = s²/σ².
Observations: L dollars of loss on C claims.
Posterior Normal has mean = (Lσ² + μs²)/(Cσ² + s²) and variance = s²σ²/(Cσ² + s²).
Predictive Distribution: Normal with the same mean as the posterior Normal and variance s² more than
that of the posterior Normal.
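And a corresponding Python sketch (mine, using the notation above; the example losses are illustrative) for the Normal-Normal case with fixed process variance s²:

def normal_normal_update(mu, sigma2, s2, losses):
    C, L = len(losses), sum(losses)
    post_mean = (L * sigma2 + mu * s2) / (C * sigma2 + s2)
    post_var = s2 * sigma2 / (C * sigma2 + s2)
    Z = C / (C + s2 / sigma2)                        # K = s^2 / sigma^2
    buhlmann = Z * L / C + (1 - Z) * mu              # equals post_mean
    return post_mean, post_var, post_var + s2, buhlmann   # third value: predictive variance

print(normal_normal_update(mu=100, sigma2=25, s2=400, losses=[90, 110, 130]))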

Linear Exponential Families (Section 11):
ln f(x; θ) = x r(θ) + ln[p(x)] - ln[q(θ)]; with r(θ) the natural parameter.
Examples of Linear Exponential Families: Poisson, Exponential, Gamma for fixed α,
Normal for fixed σ, Bernoulli, Binomial for fixed m, Geometric, Negative Binomial for
fixed r, Inverse Gaussian for fixed θ.
If the likelihood density is a member of a linear exponential family and the conjugate prior
distribution is used as the prior distribution, then the Buhlmann Credibility estimate is
equal to the corresponding Bayesian estimate (for a squared error loss function).
Specifically, this applies to the Gamma-Poisson,
Beta-Bernoulli, Inverse Gamma-Exponential, and the Normal-Normal (fixed variance).
For Linear Exponential Families, the Methods of Maximum Likelihood and Moments
produce the same result when applied to ungrouped data.

Updating Formulas, Conjugate Priors

Gamma-Poisson:              α′ = α + C                 1/θ′ = 1/θ + E
Beta-Bernoulli:             a′ = a + r                 b′ = b + n - r
Beta-Binomial:              a′ = a + # of claims       b′ = b + m(# of years) - # of claims
Inverse Gamma-Exponential:  α′ = α + C                 θ′ = θ + L
Normal-Normal:              mean′ = (Lσ² + μs²)/(Cσ² + s²)      variance′ = s²σ²/(Cσ² + s²)

Buhlmann Credibility Parameters, Conjugate Priors

Gamma-Poisson:              K = 1/θ
Beta-Bernoulli:             K = a + b
Beta-Binomial:              K = (a + b)/m
Inverse Gamma-Exponential:  K = α - 1
Normal-Normal:              K = s²/σ²

Estimate using Buhlmann Credibility = Estimate using Bayesian Analysis.


Marginal Distributions, Conjugate Priors

Gamma-Poisson:              Negative Binomial: r = α, β = θ
Beta-Bernoulli:             Bernoulli: q = a/(a + b)
Beta-Binomial:              Beta-Binomial
Inverse Gamma-Exponential:  Pareto: α = α, θ = θ
Normal-Normal:              Normal: mean = μ, variance = σ² + s²


Gamma-Poisson Frequency Process

Gamma Prior (Distribution of Parameters): shape parameter = α, scale parameter = θ.
The Poisson parameters of the individuals making up the entire portfolio are distributed via a Gamma
Distribution with parameters α and θ: f(x) = θ^-α x^(α-1) e^(-x/θ) / Γ(α), mean = αθ, variance = αθ².

Mixing the Poisson process over the prior Gamma gives the Negative Binomial Marginal Distribution
(Number of Claims): r = shape parameter of the Prior Gamma = α; β = scale parameter of the Prior Gamma = θ.
Mean = rβ = αθ. Variance = rβ(1+β) = αθ + αθ².

Gamma is a Conjugate Prior, and the Poisson is a Member of a Linear Exponential Family, so the
Buhlmann Credibility Estimate = Bayes Analysis Estimate, with Buhlmann Credibility Parameter K = 1/θ.

Observations: # claims = C, # exposures = E.

Gamma Posterior (Distribution of Parameters): Posterior shape parameter = α' = α + C.
Posterior scale parameter: 1/θ' = 1/θ + E.

Mixing the Poisson process over the posterior Gamma gives the Negative Binomial Predictive Distribution
(Number of Claims): r = shape parameter of the Posterior Gamma = α' = α + C;
β = scale parameter of the Posterior Gamma = θ' = 1/(E + 1/θ).
Mean = rβ = (α + C)/(E + 1/θ). Variance = rβ(1+β) = (α + C)/(E + 1/θ) + (α + C)/(E + 1/θ)².
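As a numerical illustration of this updating (the prior parameters and observations below are assumed, not taken from the text), a few lines of code verify that the Buhlmann estimate of the annual mean matches the predictive mean:

```python
# Hypothetical illustration of the Gamma-Poisson updating diagrammed above.
alpha, theta = 3.0, 0.1      # prior Gamma (assumed values)
C, E = 2, 4                  # observed: C claims in E exposures (years)

alpha_post = alpha + C
theta_post = 1.0 / (E + 1.0 / theta)
pred_mean = alpha_post * theta_post      # mean of the predictive Negative Binomial

K = 1.0 / theta                          # Buhlmann credibility parameter
Z = E / (E + K)
buhlmann = Z * (C / E) + (1 - Z) * alpha * theta
assert abs(buhlmann - pred_mean) < 1e-9  # Buhlmann estimate = Bayes estimate
print(alpha_post, theta_post, pred_mean)
```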


Beta-Bernoulli Frequency Process

Beta Prior, parameters a and b (Distribution of Parameters): the Bernoulli parameters q of the individuals
making up the entire portfolio are distributed via a Beta Distribution with parameters a and b:
f(x) = (a+b-1)! x^(a-1) (1-x)^(b-1) / {(a-1)!(b-1)!}, 0 ≤ x ≤ 1,
mean = a/(a+b), variance = ab/{(a+b+1)(a+b)²}.

Mixing the Bernoulli process over the prior Beta gives the Bernoulli Marginal Distribution (Number of Claims):
Bernoulli parameter q = mean of Bernoulli = a/(a+b) = mean of prior Beta. Variance = q(1-q) = ab/(a+b)².

Beta is a Conjugate Prior, and the Bernoulli is a Member of a Linear Exponential Family, so the
Buhlmann Credibility Estimate = Bayes Analysis Estimate, with Buhlmann Credibility Parameter K = a + b.

Observations: # claims = # successes = r, # exposures = # of trials = n.

Beta Posterior (Distribution of Parameters): Posterior 1st parameter = a + r. Posterior 2nd parameter = b + n - r.

Mixing the Bernoulli process over the posterior Beta gives the Bernoulli Predictive Distribution (Number of Claims):
Bernoulli parameter q = mean of Bernoulli = (a+r)/(a+b+n) = mean of posterior Beta.
Variance = q(1-q) = (a+r)(b+n-r)/(a+b+n)².


Beta-Binomial Frequency Process

Beta is a Conjugate Prior for the Binomial Likelihood;
the Binomial with m fixed is a Member of a Linear Exponential Family.
Buhlmann Credibility Estimate = Bayes Analysis Estimate.
Buhlmann Credibility Parameter, K = (a + b)/m.

Beta Prior, parameters a and b (Distribution of Parameters).
Mixing the Binomial process over the prior Beta gives the Beta-Binomial Marginal Distribution (Number of Claims).

Observations: # claims = C, # years = Y.

Beta Posterior (Distribution of Parameters): a' = a + C. b' = b + mY - C.
Mixing the Binomial process over the posterior Beta gives the Beta-Binomial Predictive Distribution (Number of Claims).


Inverse Gamma-Exponential Severity Process

Inverse Gamma Prior (Distribution of Parameters): shape parameter = α, scale parameter = θ.
The Exponential parameters (means) of the individuals making up the entire portfolio are distributed via an
Inverse Gamma Distribution with parameters α and θ: f(x) = θ^α e^(-θ/x) / {x^(α+1) Γ(α)},
mean = θ/(α-1), variance = θ²/{(α-2)(α-1)²}.

Mixing the Exponential process over the prior Inverse Gamma gives the Pareto Marginal Distribution (Size of Loss):
α = shape parameter of the Prior Inverse Gamma; θ = scale parameter of the Prior Inverse Gamma.
Mean = θ/(α-1). Variance = αθ²/{(α-2)(α-1)²}.

Inverse Gamma is a Conjugate Prior, and the Exponential is a Member of a Linear Exponential Family, so the
Buhlmann Credibility Estimate = Bayes Analysis Estimate, with Buhlmann Credibility Parameter K = α - 1.

Observations: $ of Loss = L, # claims = C.

Inverse Gamma Posterior (Distribution of Parameters): Posterior shape parameter = α' = α + C.
Posterior scale parameter = θ' = θ + L.

Mixing the Exponential process over the posterior Inverse Gamma gives the Pareto Predictive Distribution
(Size of Loss): α' = α + C, θ' = θ + L. Mean = θ'/(α'-1). Variance = α'θ'²/{(α'-2)(α'-1)²}.


Normal-Normal Severity Process

Normal Prior (Distribution of Parameters): the means m of the Normal Severity Distributions of the individuals
making up the entire portfolio are distributed via a Normal Distribution with parameters μ and σ:
f(m) = exp[-(m-μ)²/(2σ²)] / {σ √(2π)}, mean = μ, variance = σ².
The Normal Severity Process for each individual has fixed variance s² and mean m.

Mixing the Normal severity process over the prior Normal gives the Normal Marginal Distribution (Size of Loss):
Mean = μ = mean of the prior Normal Distribution. Variance = s² + σ².

Normal is a Conjugate Prior, and the Normal (fixed variance) is a Member of a Linear Exponential Family, so the
Buhlmann Credibility Estimate = Bayes Analysis Estimate.
K = Variance of Normal Likelihood / Variance of Normal Prior = s²/σ².

Observations: $ of Loss = L, # claims = C.

Normal Posterior (Distribution of Parameters): Mean = (Lσ² + μs²) / (Cσ² + s²). Variance = s²σ² / (Cσ² + s²).

Mixing the Normal severity process over the posterior Normal gives the Normal Predictive Distribution (Size of Loss):
Mean = (Lσ² + μs²) / (Cσ² + s²) = mean of the posterior Normal Distribution.
Variance = s² + s²σ² / (Cσ² + s²).

Mahler's Guide to

Semiparametric Estimation
Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-11


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching


Mahler's Guide to Semiparametric Estimation


Copyright 2013 by Howard C. Mahler.
This Study Aid will discuss the important technique of semiparametric estimation.
Information in bold or sections whose title is in bold are more important for passing the exam.
Larger bold type indicates it is extremely important. Information presented in italics should not be
needed to directly answer exam questions and should be skipped on first reading. It is provided
to aid the reader's overall understanding of the subject, and to be useful in practical applications.
Solutions to the problems in each section are at the end of that section.1

Section #    Pages     Section Name
    1            3     Introduction
    2         4-31     Poisson Frequency
    3        32-36     Negative Binomial with β Fixed
    4        37-40     Geometric Frequency
    5        41-44     Negative Binomial with r Fixed
    6        45-54     Overview
    7        55-58     Other Distributions
    8           59     Important Ideas & Formulas

Note that problems include both some written by me and some from past exams. The latter are copyright by the
CAS and SOA and are reproduced here solely to aid students in studying for exams. In some cases I've rewritten
these questions in order to match the notation in the current Syllabus. In some cases the material covered is
preliminary to the current Syllabus; you will be assumed to know it in order to answer exam questions, but it will not
be specifically tested. The solutions and comments are solely the responsibility of the author; the CAS and SOA
bear no responsibility for their accuracy. While some of the comments may seem critical of certain questions, this is
intended solely to aid you in studying and in no way is intended as a criticism of the many volunteers who work
extremely long and hard to produce quality exams.


Course 4 Exam Questions by Section of this Study Aid 2

Exam:        Sample   5/00   11/00   5/01   11/01   11/02   11/03   11/04   5/05   11/05   11/06   5/07
Section 2:     39      33      7      -       -       -       -      37      28     30      -      25
(No released questions from these exams fall in the other sections of this Study Aid.)

The CAS/SOA did not release the 5/02, 5/03, 5/04, 5/06, 11/07 and subsequent exams.
2 Excluding any questions that are no longer on the syllabus.

2013-4-11,

Semiparametric Estimation 1 Introduction,

HCM 10/22/12,

Page 3

Section 1, Introduction
The application of Buhlmann Credibility to the type of situation in which one assumes the likelihood
has a certain form, is referred to by Loss Models as semiparametric empirical Bayes
estimation.3 Initially, the likelihood of the frequency for each insured will be assumed to be a
Poisson, by far the most common application. Subsequently, I discuss semiparametric estimation
using the assumptions of either a Geometric, or Negative Binomial frequency.
In each case, the Expected Value of the Process Variance and Total Variance will be estimated,
and then we will back out the Variance of the Hypothetical Means.
Total Variance = EPV + VHM, so VHM = Total Variance - EPV.
Semiparametric vs. Nonparametric vs. Full Parametric Estimation:
Semiparametric estimation assumes a particular form of the frequency distribution, but differs from
full parametric estimation via Buhlmann Credibility4 where in addition one assumes a particular
distribution of types of insureds or mixing distribution. Full parametric estimation assumes a
complete model and K is calculated with no reference to any observations.
For example, if one assumes Good Drivers are 80% of the portfolio and each have a Poisson
frequency with = 0.03, while Bad Drivers are 20% of the portfolio and each have a Poisson
frequency with = 0.07, then one can calculate the Buhlmann Credibility Parameter, K, with no
reference to any data.
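To make the contrast concrete, here is a minimal sketch (my own code, using only the assumed Good/Bad driver model just described) of that full parametric calculation of K, with no data involved:

```python
# Full parametric: K computed from the assumed model alone (example above).
weights = [0.80, 0.20]      # Good and Bad drivers
lambdas = [0.03, 0.07]      # Poisson means for each class

mean = sum(w * l for w, l in zip(weights, lambdas))                 # 0.038
epv = mean                                                          # Poisson: process variance = lambda
vhm = sum(w * l * l for w, l in zip(weights, lambdas)) - mean ** 2  # 0.000256
K = epv / vhm                                                       # roughly 148
print(mean, epv, vhm, K)
```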
In semiparametric estimation, one relies on the data to help calculate K.
Semiparametric estimation assumes a particular form of the frequency distribution, which differs
from Nonparametric estimation5 where no such assumption is made. One can use semiparametric
estimation with only one year of data from a portfolio of insureds. Nonparametric estimation
requires data separately for each of at least two years from several insureds.
From the least complete to most complete modeling assumptions:
Nonparametric, semiparametric, full parametric.
From the least reliance on data in order to calculate K to the most reliance on data to calculate K:
Full parametric, semiparametric, nonparametric.
3 See Section 20.4.2 of Loss Models, in particular Example 20.36.
See pages 8-47 to 8-49 of Credibility by Mahler and Dean. See also Topics in Credibility by Dean.
4 See Mahler's Guide to Buhlmann Credibility.
5 See Mahler's Guide to Empirical Bayesian Credibility.


Section 2, Poisson Frequency


As discussed in Mahler's Guide to Conjugate Priors, when each insured has a Poisson
frequency, EPV = E[Process Variance | λ] = E[λ] = overall mean.
In semiparametric estimation, when one assumes each exposure has a Poisson Distribution, one
estimates the mean and total variance and then:
estimated EPV = estimated mean = X̄.
estimated VHM = estimated total variance - estimated EPV = s² - X̄.
Exercise: Assume that one observes that the claim count distribution during a year is as follows for
a group of 10,000 insureds:6
Total Claim Count:       0      1     2    3   4   5   6   7   >7
Number of Insureds:   8116   1434   329   87  24   7   2   1    0
Assume in addition that the claim count for each individual insured has a Poisson
distribution. Estimate the Buhlmann Credibility parameter K.
[Solution: X̄ = 0.2503 and s² = (10,000/9999)(0.4213 - 0.2503²) = 0.3587,
where I have used the sample variance, in order to have an unbiased estimate of the variance.7

Number of Claims   Probability   Claims x Prob.   Claims² x Prob.
       0             0.8116         0.0000            0.0000
       1             0.1434         0.1434            0.1434
       2             0.0329         0.0658            0.1316
       3             0.0087         0.0261            0.0783
       4             0.0024         0.0096            0.0384
       5             0.0007         0.0035            0.0175
       6             0.0002         0.0012            0.0072
       7             0.0001         0.0007            0.0049
     Sum                            0.2503            0.4213

estimated EPV = X̄ = 0.2503.
estimated VHM = Total Variance - estimated EPV = s² - X̄ = 0.3587 - 0.2503 = 0.1084.
K = EPV/VHM = 0.2503/0.1084 = 2.3.]
6 For example, 329 of these 10,000 insureds happened to have 2 claims over the previous year. Some of these 329
were very unlucky, but many of these 329 have a worse than average future expected claim frequency.
7 The sample variance has 10,000 - 1 = 9999 in the denominator, rather than 10,000. For a smaller number of
insureds, using the sample variance rather than the usual variance could make a significant difference. Loss Models
uses the sample variance in this situation, while Mahler & Dean do not. On the exam, I would use the sample
variance in this situation.
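For readers who like to check this sort of arithmetic by machine, the following short sketch (my own code, not part of the original text) reproduces the estimates in the exercise above from the grouped counts:

```python
# Semiparametric (Poisson) estimation from the grouped data in the exercise above.
counts   = [0, 1, 2, 3, 4, 5, 6, 7]
insureds = [8116, 1434, 329, 87, 24, 7, 2, 1]
n = sum(insureds)                                              # 10,000

mean = sum(c * w for c, w in zip(counts, insureds)) / n        # 0.2503
m2 = sum(c * c * w for c, w in zip(counts, insureds)) / n      # 0.4213
s2 = (n / (n - 1)) * (m2 - mean ** 2)                          # sample variance, 0.3587

epv = mean                  # Poisson: process variance = lambda
vhm = s2 - epv              # 0.1084
K = epv / vhm               # about 2.3
Z = 1 / (1 + K)             # credibility of one year from one insured
print(mean, s2, K, Z)
```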


Several Years of Data:8


Assume we have three years of data for 5 insureds:9

                  Number of Claims
Insured    Year 1    Year 2    Year 3    Total
   1          1         0         0        1
   2          2         1         3        6
   3          0         1         2        3
   4          0         0         1        1
   5          0         3         0        3

If we assume each insured is Poisson and that for each insured his Poisson parameter is the same
for each of the three years, then it makes sense to treat the sum of the claims from each insured as
a single observation of the risk process.
The mean 3-year frequency is: (1 + 6 + 3 + 1 + 3) /5 = 14/5 = 2.8 = estimated EPV.
The sample variance of the three year totals is:
{(1 - 2.8)² + (6 - 2.8)² + (3 - 2.8)² + (1 - 2.8)² + (3 - 2.8)²} / (5 - 1) = 4.2.
Estimated VHM = 4.2 - 2.8 = 1.4.
K = EPV/VHM = 2.8/1.4 = 2.
Exercise: Estimate the future number of claims from insured #2 over the next 3 years.
[Solution: K was calculated assuming 3 years of data from a single insured was one draw from the
risk process, and therefore, Z = 1/(1 + K) = 1/3.
Estimated 3 year frequency = (1/3)(6) + (2/3)(2.8) = 3.87.]
The estimated future annual frequency for insured #2 is: 3.87 / 3 = 1.29.
One could get the same result, by instead treating one year as a single observation of the risk
process. The mean annual frequency is 14/15 = 0.9333 = estimated EPV (annual.) The sum of
three years from one insured is the sum of three independent, identically distributed variables, with
three times the process variance. Therefore, the EPV for three years would be (3)(.9333) = 2.8.
The sample variance for the three year totals is 4.2. VHM for 3 years = Var[H.M. for 3 years] =
Var[(3)(annual H.M.)] = 3² Var[annual H.M.] = 9 (VHM annual).
4.2 = 2.8 + (9)(VHM annual).
Estimated VHM annual = (4.2 - 2.8)/9 = 0.1556. K = EPV/VHM = 6 on an annual basis.
For 3 years of data, Z = 3/(3+6) = 1/3.
The estimated future annual frequency for insured #2 is: (1/3)(6/3) + (2/3)(0.9333) = 1.29.
8 See for example, 4, 5/05, Q.28, and 4, 5/07, Q.25.
9 Practical applications of this technique would involve many more than 5 insureds.


Exercise: During a single 3-year period, 5 insureds had the following experience:
Number of Claims in Year 1 through Year 3    Number of Insureds
                 1                                   2
                 3                                   2
                 6                                   1
The number of claims per year follows a Poisson distribution, with λ constant for a given insured.
For the insured who had 6 claims over the period, using semiparametric empirical Bayes
estimation, determine the Bhlmann estimate for the number of claims in Year 4.
[Solution: X̄ = {(2)(1) + (2)(3) + (1)(6)}/5 = 2.8.
E[X²] = {(2)(1²) + (2)(3²) + (1)(6²)}/5 = 11.2.
Sample Variance = (5/4)(11.2 - 2.8²) = 4.2.
Estimated EPV = X̄ = 2.8. Estimated VHM = 4.2 - 2.8 = 1.4.
K = EPV/VHM = 2.8/1.4 = 2.
Throughout we have taken 3 years as one draw from the risk process, so N = 1.
Z = 1/(1 + 2) = 1/3.
Observed frequency per year for this policyholder is: 6/3 = 2.
Overall mean frequency per year is: 2.8/3 = 0.9333.
(1/3)(2) + (1 - 1/3)(0.9333) = 1.29.
Comment: This is a summarized version of the previous data, and therefore we get the previous
result. Similar to 4, 5/05, Q.28 and 4, 5/07, Q. 25.]
One could get a somewhat different result by treating the data as 15 separate observations when
calculating K.
Mean = 14/15 = 0.9333 = estimated EPV.
However, the sample variance for 15 separate observations is:
{7(0 - 0.9333)² + 4(1 - 0.9333)² + 2(2 - 0.9333)² + 2(3 - 0.9333)²}/(15 - 1) = 1.2095.
Estimated VHM = 1.2095 - 0.9333 = 0.2762. K = 0.9333/0.2762 = 3.38.
Z = 3/(3 + 3.38) = .470.
The estimated future annual frequency for insured #2 is:
(0.470)(6/3) + (0.530)(0.9333) = 1.43.
In general these two somewhat different methods of treating several years of data produce
somewhat different results.10 The first method, using the sum for each insured, is preferable.
Treating this data as 15 separate observations would ignore the assumption that for an individual
each year of data is assumed to come from the same Poisson distribution.11
10 As shown subsequently via a simulation experiment, for a larger number of insureds the estimated values of K
will be in the same general range.
11 If instead one assumes that for an individual his Poisson parameter shifts over time, then one should specifically
take that into account. See for example, "A Markov Chain Model of Shifting Risk Parameters," by Howard Mahler,
PCAS 1997.
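A small sketch (my own code, using the 5-insured data above) reproduces both treatments and the two values of K discussed above:

```python
# The two treatments of several years of data discussed above (5 insureds, 3 years each).
data = [[1, 0, 0], [2, 1, 3], [0, 1, 2], [0, 0, 1], [0, 3, 0]]

def sample_var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Treatment 1: each insured's 3-year total is one draw from the risk process.
totals = [sum(row) for row in data]
epv1 = sum(totals) / len(totals)          # 2.8
vhm1 = sample_var(totals) - epv1          # 1.4
K1 = epv1 / vhm1                          # 2.0

# Treatment 2: the 15 annual counts are treated as separate observations.
annual = [x for row in data for x in row]
epv2 = sum(annual) / len(annual)          # 0.9333
vhm2 = sample_var(annual) - epv2          # 0.2762
K2 = epv2 / vhm2                          # about 3.38
print(K1, K2)
```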


Problems:
Use the following information for the next two questions:
You observe a portfolio of risks for a single year. Assume each individual risk's claim frequency is
given by a Poisson process. The number of claims observed is as follows:
Number of Claims    Number of Insureds
       0                  210
       1                  293
       2                  235
       3                  141
       4                   70
       5                   31
       6                   12
       7                    5
       8                    2
       9                    1
     Total               1000

2.1 (3 points) What is the Buhlmann credibility of a single year of data from an individual risk from
this portfolio?
A. less than 19%
B. at least 19% but less than 20%
C. at least 20% but less than 21%
D. at least 21% but less than 22%
E. at least 22%
2.2 (2 points) What is the Buhlmann credibility of three years of data from an individual risk from
this portfolio?
A. less than 45%
B. at least 46% but less than 47%
C. at least 47% but less than 48%
D. at least 48% but less than 49%
E. at least 49%


2.3 (3 points) The claim count distribution during a year is as follows for a large group of insureds:
Claim Count:                0      1      2      3      4      5    >5
Percentage of Insureds:  60.0%  24.0%   9.8%   3.9%   1.6%   0.7%   0%
Assume the claim count for each individual insured has a Poisson distribution whose expected
mean does not change over time.
What is the estimated future annual frequency for an insured who had 4 claims during the year?
(A) 1.6
(B) 1.7
(C) 1.8
(D) 1.9
(E) 2.0
2.4 (2 points) You have data from 10,000 insureds for one year.
Xi is the number of claims from the ith insured.

ΣXi = 431.    ΣXi² = 534.    ΣXi³ = 736.
Xi3 = 736.

The number of claims of a given insured during the year is assumed to be Poisson
distributed with an unknown mean that varies by insured.
Determine the semiparametric empirical Bayes estimate of the expected number of
claims next year of an insured that reported two claims during the studied year.
(A) Less than 0.32
(B) At least 0.32, but less than 0.35
(C) At least 0.35, but less than 0.38
(D) At least 0.38, but less than 0.41
(E) At least 0.41

The following information pertains to the next two questions.


The claim count distribution is as follows for a large sample of insureds.
Total Claim Count:          0     1    2    3    4   >4
Percentage of Insureds:    65%   25%   6%   3%   1%   0%

Assume the claim count for each individual insured has a Poisson distribution whose expected
mean does not change over time.
2.5 (1 point) What is the expected value of the process variance?
A. 0.42
B. 0.44
C. 0.46
D. 0.48
E. 0.50
2.6 (1 point) What is the variance of the hypothetical means?
A. 0.16
B. 0.17
C. 0.18
D. 0.19
E. 0.20


2.7 (3 points) You are given the following data for private passenger automobile insurance for a
year:
Number of Claims per Policy    Number of Policies
            0                       103,704
            1                        14,075
            2                         1,766
            3                           255
            4                            45
            5                             6
            6                             2
        7 and over                        0
          Total                     119,853
Assuming the number of claims per year for each policyholder follows a Poisson distribution,
using semiparametric empirical Bayes estimation, determine the Bhlmann credibility factor, Z,
for one year of data.
A. Less than 0.10
B. At least 0.10 but less than 0.13
C. At least 0.13 but less than 0.16
D. At least 0.16 but less than 0.19
E. At least 0.19
2.8 (3 points) The claim count distribution during a year is as follows for a group of 27,000
insureds:
Total Claim Count:        0      1     2    3   4   5   6   7 or more
Number of Insureds:    25422   1410   131   24  9   3   1       0
Joe Smith, an insured from this portfolio, is observed to have two claims in six years.
Assume each insured's claim count follows a Poisson Distribution.
Using semiparametric estimation, what is Joe's estimated future annual frequency?
A. Less than 0.10
B. At least 0.10, but less than 0.15
C. At least 0.15, but less than 0.20
D. At least 0.20, but less than 0.25
E. At least 0.25


Use the following information for the next two questions:


An insurer has data on the number of claims for 700 policyholders for five years.
Let Xij be the number of claims from the ith policyholder for year j.
You are given:
Σ_{j=1 to 5} X1j = 1.
Σ_{i=1 to 700} Σ_{j=1 to 5} Xij = 268.
Σ_{i=1 to 700} Σ_{j=1 to 5} Xij² = 319.
Σ_{i=1 to 700} (Σ_{j=1 to 5} Xij)² = 471.
The frequency for each policyholder is assumed to be Poisson.


2.9 (3 points) Use semiparametric estimation, treating the 5 years of data from each policyholder
as one draw from the risk process, in order to estimate the number of claims for policyholder #1
over the next year.
(A) 0.09
(B) 0.10
(C) 0.11
(D) 0.12
(E) 0.13
2.10 (3 points) Use semiparametric estimation, treating the data as 3500 separate observations,
in order to estimate the number of claims for policyholder #1 over the next year.
(A) 0.09
(B) 0.10
(C) 0.11
(D) 0.12
(E) 0.13


2.11 (2 points) A group of 1000 drivers is observed for a year to determine how many claims
each driver experiences. The data is as follows:
# of Claims    # of Drivers
     0              960
     1               32
     2                6
     3                2
Assuming each insured is Poisson, estimate the credibility to be assigned to one year of
frequency data from a driver.
A. 10%
B. 15%
C. 20%
D. 25%
E. 30%
2.12 (3 points) You are given:
(i) During a 3-year period, 500 policies had the following claims experience:
Total Claims in Years 1, 2, and 3    Number of Policies
               0                            405
               1                             50
               2                             30
               3                             10
               4                              5
(ii) The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire 3-year period.
A randomly selected policyholder had 2 claims over the 3-year period.
Using semiparametric empirical Bayes estimation, determine the Bhlmann estimate for the
number of claims in Year 4 for the same policyholder.
A. Less than 0.20
B. At least 0.20, but less than 0.25
C. At least 0.25, but less than 0.30
D. At least 0.30, but less than 0.35
E. 0.35 or more
2.13 (3 points) The claim count distribution during a year is as follows for a group of 9461 insureds:
Total Claim Count:       0      1     2    3   4   5   6   7   8 & more
Number of Insureds:    7840   1317   239   42  14   4   4   1      0
Assume each insured's claim count follows a Poisson Distribution.
How much credibility would be given to three years of data from an insured?
A. Less than 45%
B. At least 45%, but less than 50%
C. At least 50%, but less than 55%
D. At least 55%, but less than 60%
E. At least 60%


2.14 (3 points)You are given:


(i) During a single 3-year period, 30,293 policies had the following total claims experience:
Number of Claims in Year 1 through Year 3    Number of Policies
                 0                                  25,480
                 1                                   4,198
                 2                                     537
                 3                                      67
                 4                                       8
                 5                                       3
(ii) The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire period.
A randomly selected policyholder had 2 claims over the period.
Using semiparametric empirical Bayes estimation, determine the Bhlmann estimate for the
number of claims in Year 4 for the same policyholder.
(A) 0.10
(B) 0.12
(C) 0.14
(D) 0.16
(E) 0.18
2.15 (3 points)You are given:
(i) During a single 6-year period, 23,872 policies had the following total claims experience:
Number of Claims in Year 1 through Year 6    Number of Policies
                 0                                  19,634
                 1                                   3,573
                 2                                     558
                 3                                      83
                 4                                      19
                 5                                       4
                 6                                       1
(ii) The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire period.
A randomly selected policyholder had 2 claims over the period.
Using semiparametric empirical Bayes estimation, determine the Bhlmann estimate for the
claim frequency in Year 7 for the same policyholder.
(A) 4%
(B) 5%
(C) 6%
(D) 7%
(E) 8%


2.16 (2 points) You are given:


(i) Each of 4000 policyholders was insured for 2 years.
(ii) For each policyholder, you assume that the number of claims per year follows
a Poisson distribution.
(iii) Let Xi be the number of claims that the ith policyholder had over the two years.
(iv) Σ_{i=1 to 4000} Xi = 260.
(v) Σ_{i=1 to 4000} Xi² = 320.

A similar policyholder had two claims over a 5-year period.


Using semiparametric empirical Bayes estimation, determine the Bhlmann estimate for the
annual claim frequency for this policyholder.
(A) 13%
(B) 14%
(C) 15%
(D) 16%
(E) 17%


Use the following information for the next 3 questions:


The claim count distribution is as follows for a large sample of insureds.
Total Claim Count:          0       1      2      3      4     >4
Percentage of Insureds:   61.9%   28.4%   7.8%   1.6%   0.3%    0%
Assume the claim count for each individual insured has a Poisson distribution which does not
change over time.
2.17 (4, 5/85, Q.36) (1 point) What is the expected value of the process variance?
A. Less than 0.45
B. At least 0.45, but less than 0.55
C. At least 0.55, but less than 0.65
D. At least 0.65, but less than 0.75
E. 0.75 or more
2.18 (4, 5/85, Q.37) (1 point) What is the variance of the hypothetical means?
A. Less than 0.01
B. At least 0.01, but less than 0.02
C. At least 0.02, but less than 0.03
D. At least 0.03, but less than 0.04
E. 0.04 or more
2.19 (4, 5/85, Q.38) (1 point) Find the expected claim frequency of an insured who has had one
accident free year.
A. Less than 0.425
B. At least 0.425, but less than 0.450
C. At least 0.450, but less than 0.475
D. At least 0.475, but less than 0.500
E. 0.500 or more


2.20 (4, 5/89, Q.39) (2 points) A group of 340 insureds in a high crime area submit the following
210 theft claims in a one year period:
Number of Claims    Number of Insureds
       0                   200
       1                    80
       2                    50
       3                    10
Each insured is assumed to have a Poisson distribution for the number of thefts, but the mean of
such distribution may vary from one insured to another.
If a particular insured experienced 2 claims in the observation period, what is the Buhlmann
credibility estimate of the number of claims for this insured in the next year?
(Use the observed data to estimate the expected value of the process variance and the variance
of the hypothetical means.)
A. Less than 0.75
B. At least 0.75, but less than 0.85
C. At least 0.85, but less than 0.95
D. At least 0.95, but less than 1.20
E. 1.20 or more
2.21 (4, 5/91, Q.35) (2 points) The number of claims for each insured in a population has a
Poisson distribution. The distribution of insureds by number of actual claims in a single year is
shown below.
Number of Claims    Number of Insureds
       0                   900
       1                    90
       2                     7
       3                     2
       4                     1
     Total                1000
Calculate the Buhlmann estimate of credibility to be assigned to the observed number of claims
for an insured in a single year.
A. Less than 0.10
B. At least 0.10 but less than 0.13
C. At least 0.13 but less than 0.16
D. At least 0.16 but less than 0.19
E. At least 0.19


2.22 (4B, 11/97, Q.7 & Course 4 Sample Exam 2000, Q.39) (3 points)
You are given the following:
The number of losses arising from m+4 individual insureds over a single period of
observation is distributed as follows:
Number of Losses    Number of Insureds
       0                    m
       1                    3
       2                    1
   3 or more                0

The number of losses for each insured follows a Poisson distribution,


but the mean of each such distribution may be different for individual insureds.
The variance of the hypothetical means is to be estimated from the above observations.
Determine all values of m for which the estimate of the variance of the hypothetical means will be
greater than 0.
A. m ≥ 0
B. m ≥ 1
C. m ≥ 3
D. m ≥ 7
E. m ≥ 9
2.23 (4B, 11/98, Q.11) (2 points) You are given the following:

The number of losses arising from 500 individual insureds over a single period
of observation is distributed as follows:
Number of Losses    Number of Insureds
       0                   450
       1                    30
       2                    10
       3                     5
       4                     5
   5 or more                 0

The number of losses for each insured follows a Poisson distribution, but the
mean of each such distribution may be different for individual insureds.
Determine the Buhlmann credibility of the experience of an individual insured over a single period.
A. Less than 0.20
B. At least 0.20, but less than 0.30
C. At least 0.30, but less than 0.40
D. At least 0.40, but less than 0.50
E. At least 0.50


2.24 (4, 5/00, Q.33) (2.5 points) The number of claims a driver has during the year is assumed to
be Poisson distributed with an unknown mean that varies by driver.
The experience for 100 drivers is as follows:
Number of Claims during the Year    Number of Drivers
              0                            54
              1                            33
              2                            10
              3                             2
              4                             1
Determine the credibility of one year's experience for a single driver using semiparametric empirical
Bayes estimation.
(A) 0.046
(B) 0.055
(C) 0.061
(D) 0.068
(E) 0.073
2.25 (4, 11/00, Q.7) (2.5 points) The following information comes from a study of robberies of
convenience stores over the course of a year:
(i) Xi is the number of robberies of the ith store, with i = 1, 2,..., 500.
(ii) ΣXi = 50
(iii) ΣXi² = 220
(iv) The number of robberies of a given store during the year is assumed to be Poisson
distributed with an unknown mean that varies by store.
Determine the semiparametric empirical Bayes estimate of the expected number of
robberies next year of a store that reported no robberies during the studied year.
(A) Less than 0.02
(B) At least 0.02, but less than 0.04
(C) At least 0.04, but less than 0.06
(D) At least 0.06, but less than 0.08
(E) At least 0.08


2.26 (4, 11/04, Q.37 & 2009 Sample Q.159) (2.5 points)
For a portfolio of motorcycle insurance policyholders, you are given:
(i) The number of claims for each policyholder has a conditional Poisson distribution.
(ii) For Year 1, the following data are observed:
Number of Claims    Number of Policyholders
       0                    2000
       1                     600
       2                     300
       3                      80
       4                      20
     Total                  3000
Determine the credibility factor, Z, for Year 2.
(A) Less than 0.30
(B) At least 0.30, but less than 0.35
(C) At least 0.35, but less than 0.40
(D) At least 0.40, but less than 0.45
(E) At least 0.45
2.27 (4, 5/05, Q.28 & 2009 Sample Q.197) (2.9 points) You are given:
(i) During a 2-year period, 100 policies had the following claims experience:
Total Claims in Years 1 and 2    Number of Policies
             0                           50
             1                           30
             2                           15
             3                            4
             4                            1
(ii) The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire 2-year period.
A randomly selected policyholder had one claim over the 2-year period.
Using semiparametric empirical Bayes estimation, determine the Bhlmann estimate for the
number of claims in Year 3 for the same policyholder.
(A) 0.380
(B) 0.387
(C) 0.393
(D) 0.403
(E) 0.443


2.28 (4, 11/05, Q.30 & 2009 Sample Q.240) (2.9 points)
For a group of auto policyholders, you are given:
(i) The number of claims for each policyholder has a conditional Poisson distribution.
(ii) During Year 1, the following data are observed for 8000 policyholders:
Number of Claims    Number of Policyholders
       0                    5000
       1                    2100
       2                     750
       3                     100
       4                      50
      5+                       0
A randomly selected policyholder had one claim in Year 1.
Determine the semiparametric empirical Bayes estimate of the number of claims in Year 2 for the
same policyholder.
(A) Less than 0.15
(B) At least 0.15, but less than 0.30
(C) At least 0.30, but less than 0.45
(D) At least 0.45, but less than 0.60
(E) At least 0.60
2.29 (4, 5/07, Q.25) (2.5 points) You are given:
(i) During a single 5-year period, 100 policies had the following total claims experience:
Number of Claims in Year 1 through Year 5    Number of Policies
                 0                                   46
                 1                                   34
                 2                                   13
                 3                                    5
                 4                                    2
(ii) The number of claims per year follows a Poisson distribution.
(iii) Each policyholder was insured for the entire period.
A randomly selected policyholder had 3 claims over the period.
Using semiparametric empirical Bayes estimation, determine the Bhlmann estimate for the
number of claims in Year 6 for the same policyholder.
(A) Less than 0.25
(B) At least 0.25, but less than 0.50
(C) At least 0.50, but less than 0.75
(D) At least 0.75, but less than 1.00
(E) At least 1.00


Solutions to Problems:
2.1. C. Mean = 1.753, 2nd moment = 5.283.
Total Variance (adjusted for degrees of freedom) = (1000/999)(5.283 - 1.753²) = 2.212.
EPV = Mean = 1.753. VHM = Total Variance - EPV = .459.
K = EPV / VHM = 1.753 / .459 = 3.82. Z = 1 / (1 + 3.82) = 20.7%.

Number of Insureds   Number of Claims   Square of # of Claims
       210                  0                     0
       293                  1                     1
       235                  2                     4
       141                  3                     9
        70                  4                    16
        31                  5                    25
        12                  6                    36
         5                  7                    49
         2                  8                    64
         1                  9                    81
      1000            1753 (weighted sum)   5283 (weighted sum)

Comment: In this case we are given data, therefore first we need to compute the total variance and
back out the VHM. The fact that the claim frequency is Poisson is sufficient to allow an estimate of
the EPV, but without the data we could not estimate the VHM. We are not given a complete
model of the risk process, as for example in the case of a Gamma-Poisson.
2.2. A. For the previous solution, K = 3.82.
For three years of data, Z = 3 / (3 + 3.82) = 44.0%.
2.3. C. Mean = .652 and the total variance = 1.414 - .652² = .989.

Number of Claims   A Priori Probability   Claims x Prob.   Claims² x Prob.
       0                 0.600                0.000             0.000
       1                 0.240                0.240             0.240
       2                 0.098                0.196             0.392
       3                 0.039                0.117             0.351
       4                 0.016                0.064             0.256
       5                 0.007                0.035             0.175
     Sum                                      0.652             1.414

EPV = overall mean = .652. VHM = Total Variance - EPV = .989 - .652 = .337.
K = EPV/VHM = .652/.337 = 1.93. Z = 1/(1 + K) = .341.
Estimated future annual frequency = (.341)(4) + (1 - .341)(.652) = 1.79.


2.4. C. Sample Mean = 431/10000 = .0431. Second Moment = 534/10000 = .0534.
Sample Variance = (10000/9999)(.0534 - .0431²) = .05155.
EPV = Mean = .0431. VHM = Variance - EPV = .05155 - .0431 = .00845.
K = EPV/VHM = .0431/.00845 = 5.10. Z = 1/(1+K) = 16.4%.
Estimated frequency = (.164)(2) + (1 - .164)(.0431) = 0.364.
Comment: Similar to 4, 11/00, Q.7. One does not make use of ΣXi³ = 736.

2.5. E. & 2.6. B. Each insured's frequency process is given by a Poisson with parameter λ, with λ
varying over the group of insureds. Then the process variance for each insured is λ.
Thus the expected value of the process variance is estimated as follows:
E[VAR[X | λ]] = E[λ] = overall mean = 0.50.

Number of Claims   A Priori Probability   Claims x Prob.   Claims² x Prob.
       0                  0.65                0.00              0.00
       1                  0.25                0.25              0.25
       2                  0.06                0.12              0.24
       3                  0.03                0.09              0.27
       4                  0.01                0.04              0.16
     Sum                                      0.50              0.92

The mean = 0.50 and the total variance = 0.92 - 0.50² = 0.67. Thus we estimate the Variance of
the Hypothetical Means as: Total Variance - EPV = 0.67 - 0.50 = 0.17.

2.7. C. Mean = 18594/119853 = 0.1551. 2nd moment = 24376/119853 = 0.2034.

Number of Claims   Number of Policies   Contribution to 1st Moment   Contribution to 2nd Moment
       0               103,704                       0                            0
       1                14,075                  14,075                       14,075
       2                 1,766                   3,532                        7,064
       3                   255                     765                        2,295
       4                    45                     180                          720
       5                     6                      30                          150
       6                     2                      12                           72
     Total             119,853                  18,594                       24,376

Sample Variance = (119,853/119,852)(0.2034 - 0.1551²) = 0.1793.
EPV = Mean = 0.1551. VHM = Sample Variance - EPV = 0.1793 - 0.1551 = 0.0242.
K = EPV / VHM = 0.1551 / .0242 = 6.41. Z = 1 / (1 + 6.41) = 13.5%.
Comment: Taken from pages 107-108 of Mathematical Methods in Risk Theory by Hans
Buhlmann. The same data is shown in Table 16.20 in Loss Models.
2.8. D. The estimated mean is: 1801/27000 = .06670.
The estimated second moment is: 2405/27000 = .08907.
The sample variance is: (27000/26999)(.08907 - .06670²) = .08462.

Number of Claims   Number of Insureds   Claims x Insureds   Claims² x Insureds
       0                25422                   0                    0
       1                 1410                1410                 1410
       2                  131                 262                  524
       3                   24                  72                  216
       4                    9                  36                  144
       5                    3                  15                   75
       6                    1                   6                   36
     Sum                27000                1801                 2405

Since we have assumed each insured is Poisson:
EPV = E[λ] = overall mean = .06670.
VHM = Total Variance - EPV = .08462 - .06670 = .01792.
K = EPV/VHM = .06670/.01792 = 3.72.
For six years of data, Z = 6/(6+3.72) = 61.7%.
The estimated overall mean is .06670 and the observed frequency for Joe is 2/6 = 1/3.
Therefore, Joe's estimated future frequency is:
(.617)(1/3) + (1 - .617)(.06670) = 0.231.


2.9. C. The overall mean 5-year frequency is: 268/700 = .3829.
The second moment of the 5-year claim frequency is: 471/700 = .6729.
Therefore, the sample variance is: (700/699)(.6729 - .3829²) = .5270.
Since each policyholder is Poisson, the estimated EPV (5 year) = mean = .3829.
The estimated VHM (5 year) = Total variance - EPV = .5270 - .3829 = .1441.
K (5 year) = EPV/VHM = .3829/.1441 = 2.66.
Remembering that 5 years of data was defined as one draw from the risk process,
Z = 1/(1 + 2.66) = 27.3%. Over 5 years policyholder #1 had 1 claim.
Thus the estimated number of claims for policyholder #1 over the next 5 years is:
(27.3%)(1) + (72.7%)(.3829) = .551.
Thus the estimated future annual frequency for policyholder #1 is: .551/5 = 0.110.
2.10. D. The overall mean annual frequency is: 268/3500 = .0766.
The second moment of the annual claim frequency is: 319/3500 = .0911.
Therefore, the sample variance is: (3500/3499)(.0911 - .0766²) = .0853.
Since each policyholder is Poisson, the estimated EPV = mean = .0766.
The estimated VHM = Total variance - EPV = .0853 - .0766 = .0087.
K = EPV/VHM = .0766/.0087 = 8.8. For 5 years of data, Z = 5/(5 + 8.8) = 36.2%.
The mean annual frequency of policyholder #1 is 1/5 = .2.
Thus the estimated future annual frequency for policyholder #1 is:
(36.2%)(.2) + (63.8%)(.0766) = 0.121.
Comment: Note the somewhat different estimates in this and the previous solution.
2.11. E. First Moment is: {(0)(960) + (1)(32) + (2)(6) + (3)(2)}/1000 = .050.
Second Moment is: {(0²)(960) + (1²)(32) + (2²)(6) + (3²)(2)}/1000 = .074.
Estimated Total Variance = .074 - .050² = .0715.
Process Variance = λ. Expected Value of the Process Variance = E[λ] = .050.
Estimated Variance of the Hypothetical Means = Total Variance - EPV = .0715 - .05 = .0215.
K = EPV/VHM = .050/.0215 = 2.33.
For one year of data, Z = 1/(1 + K) = 1/3.33 = 30.0%.
Comment: If instead you use the sample variance of: (1000/999)(.0715) = .0716, then
VHM = .0716 - .05 = .0216. K = .050/.0216 = 2.31. Z = 1/(1 + K) = 1/3.31 = 30.2%.


2.12. E. Treat three years from a single insured as one draw from the risk process.
Estimated EPV = X̄ = {(405)(0) + (50)(1) + (30)(2) + (10)(3) + (5)(4)}/500 = 0.32.
Second Moment = {(405)(0²) + (50)(1²) + (30)(2²) + (10)(3²) + (5)(4²)}/500 = 0.68.
Sample Variance = (500/499)(.68 - .32²) = .5788.
Estimated VHM = .5788 - .32 = .2588. K = EPV/VHM = .32/.2588 = 1.236.
N = 1, since we observe an insured for a three year period, one draw from the risk process.
Z = 1/(1 + K) = .447.
Estimated number of claims for three years is: (.447)(2) + (1 - .447)(.32) = 1.071.
Estimated number of claims for one year is: 1.071/3 = 0.357.
Comment: Similar to 4, 5/05, Q.28.
2.13. C. The mean is:
{(0)(7840) + (1)(1317) + (2)(239) + (3)(42) + (4)(14) + (5)(4) + (6)(4) + (7)(1)} / 9461 = 0.2144.
The second moment is:
{(0)(7840) + (1)(1317) + (4)(239) + (9)(42) + (16)(14) + (25)(4) + (36)(4) + (49)(1)} / 9461 = 0.3348.
The estimated variance is: (9461/9460)(.3348 - .2144²) = .2889.
EPV = estimated mean = 0.2144. VHM = total variance - EPV = 0.2889 - 0.2144 = 0.0745.
K = EPV/VHM = 0.2144/0.0745 = 2.88. For three years, Z = 3/(3 + 2.88) = 51%.


2.14. B. X̄ = {(25,480)(0) + (4,198)(1) + (537)(2) + (67)(3) + (8)(4) + (3)(5)}/30,293 = 0.1822.
E[X²] = {(25,480)(0²) + (4,198)(1²) + (537)(2²) + (67)(3²) + (8)(4²) + (3)(5²)}/30,293 = 0.2361.
Sample Variance = (30,293/30,292)(0.2361 - 0.1822²) = 0.2029.
Estimated EPV = X̄ = 0.1822. Estimated VHM = 0.2029 - 0.1822 = 0.0207.
K = EPV/VHM = 0.1822/.0207 = 8.80.
Throughout we have taken 3 years as one draw from the risk process, so N = 1.
Z = 1/(1 + 8.80) = 10.2%.
Observed frequency per year for this policyholder is 2/3.
Overall mean frequency per year is: 0.1822/3 = 0.0607.
(10.2%)(2/3) + (1 - 10.2%)(0.0607) = 0.123.
Comment: Similar to 4, 5/07, Q. 25.
Data for male drivers in California from 1969 to 1971, taken from Table A2 of
"The Distribution of Automobile Accidents - Are Relativities Stable Over Time?,"
by Emilio C. Venezian, PCAS 1990.
If instead we want K for one year being one draw from the risk process, then the previously
determined EPV is divided by 3 and the previously determined VHM is divided by 3².
EPV = 0.1822/3 = 0.0607. VHM = 0.0207/9 = 0.0023. K = 0.0607/0.0023 = 26.4.
Now N = 3, and Z = 3/(3 + 26.4) = 10.2%, matching the credibility determined previously.
2.15. E. X̄ = {(19,634)(0) + (3,573)(1) + (558)(2) + (83)(3) + (19)(4) + (4)(5) + (1)(6)}/23,872 = 0.2111.
E[X²] = {(19,634)(0²) + (3,573)(1²) + (558)(2²) + (83)(3²) + (19)(4²) + (4)(5²) + (1)(6²)}/23,872 = 0.2929.
Sample Variance = (23,872/23,871)(0.2929 - 0.2111²) = 0.2483.
Estimated EPV = X̄ = 0.2111. Estimated VHM = 0.2483 - 0.2111 = 0.0372.
K = EPV/VHM = 0.2111/.0372 = 5.67.
Throughout we have taken 6 years as one draw from the risk process, so N = 1.
Z = 1/(1 + 5.67) = 15.0%. Observed frequency per year for this policyholder is: 2/6 = 1/3.
Overall mean frequency per year is: 0.2111/6 = 0.0352.
(15.0%)(1/3) + (1 - 15.0%)(0.0352) = 0.080.
Comment: Similar to 4, 5/07, Q. 25.
Data for female driver in California from 1969 to 1974, taken from Table 1 of
The Distribution of Automobile Accidents - Are Relativities Stable Over Time?,
by Emilio C. Venezian, PCAS 1990.


2.16. B. Treat two years from a single insured as one draw from the risk process.
Estimated EPV = X̄ = 260/4000 = 0.065.
Second Moment = 320/4000 = 0.08.
Sample Variance = (4000/3999)(0.08 - 0.065²) = 0.07579.
Estimated VHM = 0.07579 - 0.065 = 0.01079.
K = EPV/VHM = 0.065/0.01079 = 6.022.
We observe a policyholder for 5 years; N = 2.5, since one draw from the risk process is two years.
Z = 2.5/(2.5 + K) = 29.3%.
Observed two-year frequency for this policyholder is: (2)(2/5) = 0.8.
Estimated number of claims for two years is: (29.3%)(0.8) + (70.7%)(0.065) = 0.280.
Estimated number of claims for one year is: 0.280/2 = 0.140.
Alternately, the EPV for a single year is the mean annual frequency: 0.065/2 = 0.0325.
The hypothetical means for a single year are half of those for two years.
Therefore, the VHM for one year is 1/2² = 1/4 that for two years: 0.01079/4 = 0.002698.
If one year is defined as a single draw from the risk process, then K = 0.0325/0.002698 = 12.05.
We observe a policyholder for 5 years; N = 5, since one draw from the risk process is one year.
Z = 5/(5 + K) = 29.3%.
The observed annual claim frequency for this insured is: 2/5 = 0.4.
Estimated future annual frequency for this insured is: (29.3%)(0.4) + (70.7%)(0.0325) = 0.140.
Comment: The definition of what is one draw from the risk process must be consistent between
the calculation of the EPV and VHM, and the determination of N.
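As a check on the consistency point in the Comment, this small sketch (my own code, using the values computed in the solution above) produces the same estimate under both definitions of a "draw":

```python
# Solution 2.16 both ways: one draw = two years vs. one year (values from the solution above).
epv2, vhm2 = 0.065, 0.01079          # per two-year draw
K2 = epv2 / vhm2
Z = 2.5 / (2.5 + K2)                 # 5 observed years = 2.5 two-year draws
print((Z * 0.8 + (1 - Z) * epv2) / 2)        # about 0.140 per year

epv1, vhm1 = epv2 / 2, vhm2 / 4      # annual basis: mean halves, VHM divides by 2^2
K1 = epv1 / vhm1
Z1 = 5 / (5 + K1)
print(Z1 * 0.4 + (1 - Z1) * epv1)            # same 0.140 per year
```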
2.17. B.

Number of Claims   A Priori Probability   Claims x Prob.   Claims² x Prob.
       0                 0.619                0.000             0.000
       1                 0.284                0.284             0.284
       2                 0.078                0.156             0.312
       3                 0.016                0.048             0.144
       4                 0.003                0.012             0.048
     Sum                                      0.500             0.788

Each insured's frequency process is given by a Poisson with parameter λ, with λ varying over the
group of insureds. Then the process variance for each insured is λ. Thus the expected value of the
process variance is estimated as follows: E[VAR[X | λ]] = E[λ] = overall mean = 0.5.


2.18. D. Following the solution to the previous question, one can estimate the total mean as 0.5
and the total variance as: 0.788 - 0.5² = 0.538. Thus we estimate the Variance of the Hypothetical
Means as: Total Variance - EPV = 0.538 - 0.5 = 0.038.
2.19. C. Using the results of the previous two questions, K = EPV / VHM = .5 / .038 = 13.16.
For one year Z = 1 / (1+13.16) = .071. The a priori mean is .5. Thus if there are no accidents, then
the expected future claim frequency is: (0)(.071) + (.5)(1-.071) = 0.465.
Comment: This indicates a claim free credit for one year of 7.1%, equal to the credibility.
2.20. B. Each insured's frequency process is given by a Poisson with parameter λ, with λ varying
over the group of insureds. The process variance for each insured is λ. Thus the expected value of
the process variance is estimated as follows: E[VAR[X | λ]] = E[λ] = overall mean = .6176.

Number of Claims   Number of Insureds   Probability   Claims x Prob.   Claims² x Prob.
       0                   200             0.58824        0.00000          0.00000
       1                    80             0.23529        0.23529          0.23529
       2                    50             0.14706        0.29412          0.58824
       3                    10             0.02941        0.08824          0.26471
     Sum                   340                            0.6176           1.08824

One can estimate the total variance (adjusting for degrees of freedom) as:
(340/339)(1.0882 - .6176²) = .7089. Thus we estimate the Variance of the Hypothetical Means
as: Total Variance - EPV = .7089 - .6176 = .0913.
Thus the Buhlmann Credibility parameter K = EPV/VHM = .6176/.0913 = 6.76.
For one observation of a single insured, Z = 1/(1 + 6.76) = 12.9%. The observed frequency is 2 and
the prior mean is .6176. Thus the new estimate is (.129)(2) + (1 - .129)(.6176) = 0.796.


2.21. D.

Number of Insureds   Number of Claims   Square of Number of Claims
       900                  0                       0
        90                  1                       1
         7                  2                       4
         2                  3                       9
         1                  4                      16
      1000            0.114 (mean)             0.152 (2nd moment)

Expected Value of the Process Variance = Mean of the Poissons = 114/1000 = .114.
The Total Variance (adjusting for degrees of freedom) = (1000/999)(.152 - .114²) = .139.
The estimate of the Variance of the Hypothetical Means = Total Variance - EPV = .139 - .114 = .025.
K = .114/.025 = 4.56. For one observation, Z = 1/(1+4.56) = 0.180.
Comment: Note that when mixing Poissons, the credibility assigned to one observation is
Z = 1/(1+K) = (A Priori Total Variance - A Priori Mean) / A Priori Total Variance =
1 - (A Priori Mean / A Priori Total Variance) = 1 - (.114/.139) = .180.
2.22. D. The estimated first moment is: (number of claims)/(number of exposures) =
{(0)m + (1)(3) + (2)(1)} / (m+3+1) = 5/(m+4).
Similarly, the estimated second moment is:
{(0²)m + (1²)(3) + (2²)(1)} / (m+3+1) = 7/(m+4).
Thus the estimated total variance (adjusted for degrees of freedom) is:
{(m + 4)/(m + 3)} {7/(m + 4) - 25/(m + 4)²} = 7/(m + 3) - 25/{(m + 4)(m + 3)}.
Since we are mixing Poissons, the estimate of the Expected Value of the Process Variance is
equal to the overall mean of 5/(m + 4).
Thus the estimated Variance of the Hypothetical Means is:
Total Variance - Estimated EPV = 7/(m + 3) - 25/{(m + 4)(m + 3)} - 5/(m + 4) =
(2m - 12)/{(m + 4)(m + 3)}.
Since the denominator is never negative, this expression is positive for m > 12/2 = 6.
Since m is an integer number of exposures, the estimated VHM is positive for m ≥ 7.
Comment: Note that when this technique or the more sophisticated Empirical Credibility
techniques are used in actual applications, it is an important concern that the estimated VHM could
be very small or even negative.
If one does not adjust for degrees of freedom, then the estimated total variance is:
7/(m+4) - 25/(m+4)². Then the estimated Variance of the Hypothetical Means is:
Total Variance - Estimated EPV = 7/(m+4) - 25/(m+4)² - 5/(m+4) = (2m-17)/(m+4)².
Since the denominator is never negative, this expression is positive for m > 17/2 = 8.5.
Since m is an integer number of exposures, the estimated VHM is positive for m ≥ 9.
(This was the intended solution, when this question was originally asked and the Syllabus
readings were different.)


2.23. E. The observed mean is .17, while the estimated variance (adjusted for degrees of
freedom) is: (500/499)(.39 - .17²) = .3618.

Number of Insureds   Number of Losses   Square of Number of Losses
       450                  0                       0
        30                  1                       1
        10                  2                       4
         5                  3                       9
         5                  4                      16
       500            0.170 (mean)             0.390 (2nd moment)

Since we are mixing Poissons the EPV = mean = .17.
The VHM = Total Variance - EPV = .3618 - .17 = .1918. K = EPV / VHM = .17 / .1918 = .886.
For one observation, Z = 1/(1+K) = 1/1.886 = 0.53.
2.24. E. The estimated first moment is 63/100 = .63.
The estimated second moment is 107/100 = 1.07.
The sample variance is: (100/99)(1.07 - .63²) = .680.

Number of Insureds   Number of Claims   Square of # of Claims
        54                  0                     0
        33                  1                     1
        10                  2                     4
         2                  3                     9
         1                  4                    16
       100            63 (weighted sum)     107 (weighted sum)

EPV = Mean = .630. VHM = Total Variance - EPV = .680 - .630 = .050.
K = EPV / VHM = .63 / .050 = 12.63. Z = 1 / (1 + 12.63) = 7.3%.
Comment: As per Loss Models, the sample variance is used to estimate the total variance;
otherwise the estimated variance would be biased. If one used the regular variance rather than
the sample variance, one would instead get: VHM = .673 - .630 = .043,
K = 14.62, and Z = 6.4%, not the intended answer out of Loss Models.
However, Mahler & Dean, now also on the syllabus, would use the biased estimator of the variance.
2.25. B. EPV = Mean = 50/500 = .1. Second moment = 220/500 = .44.
Sample Variance = (500/499)(.44 - .1²) = .4309.
VHM = Total Variance - EPV = .4309 - .1 = .3309.
K = EPV/VHM = .1/.3309 = .302.
For one store for one year, Z = 1/(1 + K) = 1/1.302 = .768.
The observation is 0 and the overall mean is .1, so the estimated future frequency for this store is:
(0)(.768) + (.1)(.232) = 0.023.


2.26. A. Mean = 1520/3000 = .5067. Second Moment = 2840/3000 = .9467.
Sample Variance = (3000/2999)(.9467 - .5067²) = .6902.
Estimated EPV = .5067. Estimated VHM = .6902 - .5067 = .1835.
K = EPV/VHM = .5067/.1835 = 2.762. For one policyholder, Z = 1/(1 + 2.762) = 0.266.
Comment: "The number of claims for each policyholder has a conditional Poisson distribution"
means the frequency for each policyholder is Poisson, but the values of λ (may) vary across the
portfolio. In other words, f(n | λ) is Poisson with mean λ.
2.27. C. Treat two years from a single insured as one draw from the risk process.
Estimated EPV = X̄ = {(50)(0) + (30)(1) + (15)(2) + (4)(3) + (1)(4)}/100 = 0.76.
Second Moment = {(50)(0²) + (30)(1²) + (15)(2²) + (4)(3²) + (1)(4²)}/100 = 1.42.
Sample Variance = (100/99)(1.42 - .76²) = .8509. Estimated VHM = .8509 - .76 = .0909.
K = EPV/VHM = .76/.0909 = 8.36.
N = 1, since we observe an insured for a two year period, one draw from the risk process.
Z = 1/(1 + K) = 10.7%.
Estimated number of claims for two years is: (.107)(1) + (1 - .107)(.76) = .786.
Estimated number of claims for one year is: .786/2 = 0.393.
Alternately, the EPV for a single year is the mean annual frequency: 0.76/2 = 0.38.
The hypothetical means for a single year are half of those for two years.
Therefore, the VHM for one year is 1/2² = 1/4 that for two years: .0909/4 = .0227.
If one year is defined as a single draw from the risk process, then K = 0.38/.0227 = 16.74.
We observe two years of data, two draws from the risk process, and therefore N = 2.
Z = 2/(2 + K) = 2/18.74 = 10.7%.
The observed annual claim frequency for this insured is: 1/2.
Estimated future annual frequency for this insured is: (.107)(1/2) + (1 - .107)(0.38) = 0.393.
Comment: The definition of what is one draw from the risk process must be consistent between
the calculation of the EPV and VHM, and the determination of N.
2.28. D. Mean = {(0)(5000) + (1)(2100) + (2)(750) + (3)(100) + (4)(50)}/8000 = 4100/8000 = .5125.
Second Moment = {(0²)(5000) + (1²)(2100) + (2²)(750) + (3²)(100) + (4²)(50)}/8000 = 6800/8000 = .85.
Sample Variance is: (8000/7999)(.85 - .5125²) = .5874.
Estimated EPV = mean = .5125.
Estimated VHM = Sample Variance - Mean = .5874 - .5125 = .0749.
K = EPV/VHM = .5125/.0749 = 6.84. Z = 1/(1 + K) = 1/7.84 = .1275.
Estimate is: (.1275)(1) + (1 - .1275)(.5125) = 0.575.


2.29. A. X̄ = {(46)(0) + (34)(1) + (13)(2) + (5)(3) + (2)(4)}/100 = 0.83.
E[X²] = {(46)(0²) + (34)(1²) + (13)(2²) + (5)(3²) + (2)(4²)}/100 = 1.63.
Sample Variance = (100/99)(1.63 - 0.83²) = 0.951.
Estimated EPV = X̄ = 0.83. Estimated VHM = 0.951 - 0.83 = 0.121. K = EPV/VHM = 6.86.
Throughout we have taken 5 years as one draw from the risk process, so N = 1.
Z = 1/(1 + 6.86) = 12.7%.
Observed frequency per year for this policyholder is 3/5 = 0.6.
Overall mean frequency per year is: 0.83/5 = 0.166.
(12.7%)(0.6) + (1 - 12.7%)(0.166) = 0.221.
Comment: The answer has to be between 0.6 and 0.166, so choices D and E make no sense.


Section 3, Negative Binomial with β Fixed

One can assume a Negative Binomial Distribution with fixed known β parameter, with the r
parameter varying between the insureds. In semiparametric estimation, when one assumes each
exposure has a Negative Binomial Distribution with known β parameter, one estimates the mean
and total variance and then:12
EPV = (1+β)(estimated mean).
VHM = estimated total variance - EPV.
Exercise: Derive the above formula for the EPV.
[Solution: Since the mean frequency for each exposure is rβ, the overall mean is:
βE[r]. Therefore, E[r] = (estimated mean)/β.
Since the process variance of each exposure is rβ(1+β), the EPV is:
E[rβ(1+β)] = β(1+β) E[r] = β(1+β)(estimated mean)/β = (1+β)(estimated mean). ]
Exercise: Assume that one observes that the claim count distribution during a year is as follows for
a group of 10,000 insureds:
Total Claim Count:       0      1     2    3   4   5   6   7   >7
Number of Insureds:   8116   1434   329   87  24   7   2   1    0
Assume in addition that the claim count for each individual insured has a Negative Binomial
distribution, with β = 0.3 and r varying over the portfolio of insureds.
Estimate the Buhlmann Credibility parameter K.
[Solution: In a previous exercise we estimated the total mean as 0.2503,
and the total variance as 0.3587.
EPV = (1+β)(estimated mean) = (1.3)(0.2503) = 0.3254.
VHM = Total Variance - EPV = 0.3587 - 0.3254 = 0.0333.
K = EPV/VHM = 0.3254/0.0333 = 9.8. ]

12 As β → 0, the Negative Binomial → Poisson, and the EPV → estimated mean.
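As an illustration (my own code; it simply redoes the exercise above with β = 0.3), the Negative Binomial version of the semiparametric calculation is:

```python
# Semiparametric estimation under a Negative Binomial (beta fixed) assumption,
# using the grouped data from the exercise above and beta = 0.3.
beta = 0.3
counts   = [0, 1, 2, 3, 4, 5, 6, 7]
insureds = [8116, 1434, 329, 87, 24, 7, 2, 1]
n = sum(insureds)

mean = sum(c * w for c, w in zip(counts, insureds)) / n
m2 = sum(c * c * w for c, w in zip(counts, insureds)) / n
s2 = (n / (n - 1)) * (m2 - mean ** 2)

epv = (1 + beta) * mean          # 0.3254
vhm = s2 - epv                   # 0.0333
print(epv, vhm, epv / vhm)       # K is about 9.8
```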


Problems:
3.1 (2 points) The following information comes from a study of robberies of convenience stores
over the course of a year:
(i) Xi is the number of robberies of the ith store, with i = 1, 2,..., 500.
(ii) ΣXi = 50
(iii) ΣXi² = 220
(iv) The number of robberies of a given store during the year is assumed to be
Negative Binomial with β = 2.7, and r that varies by store.
Determine the semiparametric empirical Bayes estimate of the expected number of
robberies next year of a store that reported no robberies during the studied year.
(A) Less than 0.02
(B) At least 0.02, but less than 0.04
(C) At least 0.04, but less than 0.06
(D) At least 0.06, but less than 0.08
(E) At least 0.08

3.2 (2 points) The number of claims a driver has during the year is assumed to be Negative
Binomial with β = 0.6, and r that varies by driver.
The number of losses arising from 500 individual insureds over a single period of observation is
distributed as follows:
Number of Losses    Number of Insureds
       0                   450
       1                    30
       2                    10
       3                     5
       4                     5
   5 or more                 0
Determine the credibility of one year's experience for a single driver using semiparametric empirical
Bayes estimation.
A. Less than 0.20
B. At least 0.20, but less than 0.30
C. At least 0.30, but less than 0.40
D. At least 0.40, but less than 0.50
E. At least 0.50


3.3 (3 points) The claim count distribution during a year is as follows for a group of 27,000
insureds:
Total Claim Count:       0     1    2   3   4   5   6   7 or more
Number of Insureds:  25422  1410  131  24   9   3   1      0
Joe Smith, an insured from this portfolio, is observed to have two claims in six years.
Assume each insured's claim count follows a Negative Binomial Distribution.
Assume β = 0.25 for each insured, but r varies across the portfolio of insureds.
Using semiparametric estimation, what is Joe's estimated future annual frequency?
A. Less than 0.10
B. At least 0.10, but less than 0.15
C. At least 0.15, but less than 0.20
D. At least 0.20, but less than 0.25
E. At least 0.25


Solutions to Problems:
3.1. E. Mean = 50/500 = .1. Second moment = 220/500 = .44.
Sample Variance = (500/499)(.44 - .1²) = .4309. EPV = (1+β)mean = (3.7)(.1) = .37.
VHM = Total Variance - EPV = .4309 - .37 = .0609. K = EPV/VHM = .37/.0609 = 6.08.
For one store for one year, Z = 1/(1 + K) = 1 /7.08= .141.
The observation is 0 and the overall mean is .1, so the estimated future frequency for this store is:
(0)(.141) + (.1)(1 - .141) = 0.086.
Comment: Similar to 4, 11/00, Q.7, except that there a Poisson is assumed.
3.2. B. The observed mean is .17, while the estimated variance (adjusted for degrees of
freedom) is: (500/499)(.39 - .17²) = .3618.
Number of Insureds   Number of Losses   Square of Number of Losses
       450                  0                       0
        30                  1                       1
        10                  2                       4
         5                  3                       9
         5                  4                      16

       500                0.170                   0.390    (total; weighted averages)

Estimated EPV = (1+β)mean = (1.6)(.170) = .272.


Estimated VHM = Estimated Total Variance - Estimated EPV = .3618 - .272 = .0898.
K = EPV / VHM = .272 / .0898 = 3.03. Z = 1 / (1+ 3.03) = 24.8%.
Comment: Similar to 4B, 11/98, Q.11, except that there a Poisson is assumed.


3.3. A. The estimated mean is: 1801/27000 = .06670.


The estimated second moment is: 2405/27000 = .08907.
The sample variance is: (27000/26999)(.08907 - .06670²) = .08462.
    A                B
Number of       Number of       Col. A times    Square of Col. A
 Claims          Insureds          Col. B         times Col. B
   0              25422               0                 0
   1               1410            1410              1410
   2                131             262               524
   3                 24              72               216
   4                  9              36               144
   5                  3              15                75
   6                  1               6                36

  Sum             27000            1801              2405

Since we have assumed each insured has a Negative Binomial frequency with
β = 0.25 for each insured, the process variance = rβ(1+β) = rβ(1.25) = (1.25)(mean).
EPV = E[Process Variance] = E[(1.25)(mean)] = (1.25)E[mean] = (1.25)(overall mean) =
(1.25)(.06670) = .08338. VHM = Total Variance - EPV = .08462 - .08338 = .00124.
K = EPV/VHM = .08338/.00124 = 67. For six years of data, Z = 6/(6+67) = 8.2%.
The estimated overall mean is .06670 and the observed frequency for Joe is 2/6 = 1/3.
Therefore, Joe's estimated future frequency is: (.082)(1/3) + (1 - .082)(.06670) = 0.089.


Section 4, Geometric Frequency


Assume that each exposure has a Geometric distribution, with the β parameters varying over the
portfolio of exposures. Let the mixing distribution be g(β).13
Then since the mean frequency for each exposure is β, the overall mean is the mean of g. Since
the process variance of each exposure is β(1+β) = β + β², the EPV is: E[β + β²] =
E[β] + E[β²] = mean of g + second moment of g.
Since the mean of each exposure is β, the VHM is by definition the variance of g =
second moment of g - (mean of g)².
Thus if we let the overall mean be μ, we have the following relationships when mixing Geometric
Distributions:14
μ = mean of g.
EPV = μ + second moment of g.
VHM = second moment of g - μ².
Total Variance = EPV + VHM = 2(second moment of g) + μ - μ².
Therefore, EPV - VHM = μ + μ². Let the total variance = σ² = EPV + VHM =
EPV + (EPV - μ² - μ) = 2(EPV) - μ² - μ. Therefore, EPV = (σ² + μ + μ²)/2.
Thus VHM = (σ² - μ - μ²)/2.
Thus like the situation of mixing Poissons, the mean and total variance involve the EPV and VHM.
Therefore assuming one is mixing Geometric Distributions, one can estimate the EPV and VHM
from the estimated mean X and estimated variance s²:

EPV = (s² + X + X²)/2.

VHM = s² - EPV = (s² - X - X²)/2.

Note that for some data sets, s² < X + X², and therefore the estimated VHM would be negative.
In this case, either the assumption of a mixture of Geometric Distributions is not appropriate, or
there is a lot of random fluctuation which affected the estimate. It should also be noted that as β
gets very small, the Geometric approaches a Poisson. Therefore, when the mean claim frequency
is small, there is very little difference between assuming a mixture of Geometric Distributions and
Poissons.
13 For example, g(β) could be a Beta Distribution.
14 See Exercise 20.73 in Loss Models.



In semiparametric estimation, when one assumes each exposure has a Geometric Distribution,
one estimates the mean and total variance and then:
EPV = (estimated total variance + estimated mean + estimated mean²)/2.
VHM = estimated total variance - EPV.
Exercise: Assume that one observes that the claim count distribution during a year is as follows for
a group of 10,000 insureds:
Total Claim Count:      0     1    2   3   4   5   6   7   >7
Number of Insureds:  8116  1434  329  87  24   7   2   1    0

Assume in addition that the claim count for each individual insured has a Geometric
distribution.
Estimate the Buhlmann Credibility parameter K.
[Solution: In a previous exercise we estimated the total mean as 0.2503,
and the total variance (adjusted for degrees of freedom) as 0.3587.
EPV = (estimated total variance + estimated mean + estimated mean²)/2 =
(0.3587 + 0.2503 + 0.2503²)/2 = 0.3358.
VHM = Total Variance - EPV = 0.3587 - 0.3358 = 0.0229.
K = EPV/VHM = 0.3358/0.0229 = 14.7.
Comment: When we had assumed instead that each individual insured had a Poisson Distribution,
then we estimated K = 2.3. Here we assumed more of the total variance was due to the process
variance and less was due to differences between the insureds. Therefore, here less credibility
would be assigned to the experience of an individual insured.]
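For readers who like to check such results by computer, a minimal sketch of the Geometric case (the function name is illustrative):

    def geometric_semiparametric_K(mean, sample_var):
        """Buhlmann K when each insured is Geometric with beta varying across the
        portfolio; inputs are the estimated overall mean and the sample (total) variance."""
        epv = (sample_var + mean + mean ** 2) / 2.0
        vhm = sample_var - epv
        if vhm <= 0:
            return None          # negative estimated VHM: set Z = 0
        return epv / vhm

    # Using the estimates from the exercise above:
    print(geometric_semiparametric_K(0.2503, 0.3587))   # ≈ 14.7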



Problems:
4.1 (3 points) The following information comes from a study of robberies of convenience stores
over the course of a year:
(i) Xi is the number of robberies of the ith store, with i = 1, 2,..., 500.
(ii) ΣXi = 50
(iii) ΣXi² = 220
(iv) The number of robberies of a given store during the year is assumed to be
Geometric with unknown mean that varies by store.
Determine the semiparametric empirical Bayes estimate of the expected number of
robberies next year of a store that reported no robberies during the studied year.
(A) Less than 0.02
(B) At least 0.02, but less than 0.04
(C) At least 0.04, but less than 0.06
(D) At least 0.06, but less than 0.08
(E) At least 0.08

4.2 (2 points) The number of claims a driver has during the year is assumed to be Geometric with
mean that varies by driver.
The number of losses arising from 500 individual insureds over a single period of observation is
distributed as follows:
Number of Losses    Number of Insureds
       0                   450
       1                    30
       2                    10
       3                     5
       4                     5
   5 or more                 0
Determine the credibility of one year's experience for a single driver using semiparametric empirical
Bayes estimation.
A. Less than 0.20
B. At least 0.20, but less than 0.30
C. At least 0.30, but less than 0.40
D. At least 0.40, but less than 0.50
E. At least 0.50



Solutions to Problems:
4.1. D. Mean = 50/500 = .1. Second moment = 220/500 = .44.
Sample Variance = (500/499)(.44 - .1²) = .4309.
EPV = (estimated total variance + estimated mean + estimated mean²)/2 =
(.4309 + .1 + .1²)/2 = .2705.
VHM = Total Variance - EPV = .4309 - .2705 = .1604. K = EPV/VHM = .2705/.1604 = 1.69.
For one store for one year, Z = 1/(1+K) = 1 /2.69 = .372.
The observation is 0 and the overall mean is .1, so the estimated future frequency for this store is:
(0)(.372) + (.1)(1 - .372) = 0.063.
Comment: Similar to 4, 11/00, Q.7, except there a Poisson is assumed.
4.2. B. The observed mean is .17, while the estimated variance (adjusted for degrees of
freedom) is: (500/499)(.39 - .17²) = .3618.
Number of Insureds   Number of Losses   Square of Number of Losses
       450                  0                       0
        30                  1                       1
        10                  2                       4
         5                  3                       9
         5                  4                      16

       500                0.170                   0.390    (total; weighted averages)

Estimated EPV = (estimated total variance + estimated mean + estimated mean²)/2 =
(.3618 + .17 + .17²)/2 = .2804.
Estimated VHM = Estimated Total Variance - Estimated EPV = .3618 - .2804 = .0814.
K = EPV / VHM = .2804 / .0814 = 3.44. Z = 1 / (1 + 3.44) = 22.5%.
Comment: Similar to 4B, 11/98, Q.11, except there a Poisson is assumed.


Section 5, Negative Binomial with r Fixed


Instead of assuming a Geometric Distribution, one can assume a Negative Binomial Distribution
with fixed known r parameter, with the β parameter varying between the insureds. In
semiparametric estimation, when one assumes each exposure has a Negative Binomial Distribution
with known r parameter, one estimates the mean and total variance and then:
EPV = (estimated total variance + (r)(estimated mean) + estimated mean²)/(1+r)
= (s² + rX + X²)/(1+r).
VHM = estimated total variance - EPV = s² - EPV.
Exercise: Derive the above formula for the EPV.
[Solution: Let the mixing distribution be g(β).
Then since the mean frequency for each exposure is rβ, the overall mean is:
μ = r(the mean of g).
Since the process variance of each exposure is rβ(1+β) = rβ + rβ², the EPV is:
E[rβ + rβ²] = rE[β] + rE[β²] = r(mean of g) + r(second moment of g).
Since the mean of each exposure is rβ, the VHM is by definition:
Var[rβ] = r²Var[β] = (r²)(the variance of g) = (r²)(second moment of g) - (r²)(mean of g)².
Then σ² = total variance = EPV + VHM =
r(mean of g) + r(second moment of g) + (r²)(second moment of g) - (r²)(mean of g)² =
μ - μ² + (r + r²)(second moment of g).
Therefore, second moment of g = (σ² + μ² - μ)/(r + r²).
Therefore, EPV = r(mean of g) + r(second moment of g) =
μ + (σ² + μ² - μ)/(1 + r) = (σ² + rμ + μ²)/(1 + r).
If we estimate μ by X and σ² by s², then the estimated EPV is: (s² + rX + X²)/(1+r).
Comment: For r = 1, one has a Geometric and as before, EPV = (σ² + μ + μ²)/2.]


Exercise: Assume that one observes that the claim count distribution during a year is as follows for
a group of 10,000 insureds:
Total Claim Count:      0     1    2   3   4   5   6   7   >7
Number of Insureds:  8116  1434  329  87  24   7   2   1    0
Assume in addition that the claim count for each individual insured has a Negative Binomial
distribution, with r = 2, and β varying over the portfolio of insureds.
Estimate the Buhlmann Credibility parameter K.
[Solution: In a previous exercise we estimated the total mean as 0.2503,
and the total variance as 0.3587.
EPV = (estimated total variance + r(estimated mean) + estimated mean²)/(1+r)
= {0.3587 + 2(0.2503) + 0.2503²}/3 = 0.3073.
VHM = Total Variance - EPV = 0.3587 - 0.3073 = 0.0514.
K = EPV/VHM = 0.3073/0.0514 = 6.0.]
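A similar sketch for the fixed-r case; note that setting r = 1 reproduces the Geometric result of the previous section (again, the function name is illustrative):

    def negbin_fixed_r_K(mean, sample_var, r):
        """Semiparametric K when each insured is Negative Binomial with the same r
        and beta varying across the portfolio."""
        epv = (sample_var + r * mean + mean ** 2) / (1.0 + r)
        vhm = sample_var - epv
        return epv / vhm if vhm > 0 else None

    print(negbin_fixed_r_K(0.2503, 0.3587, r=2))   # ≈ 6.0
    print(negbin_fixed_r_K(0.2503, 0.3587, r=1))   # ≈ 14.7, the Geometric case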


Problems:
5.1 (3 points) The following information comes from a study of robberies of convenience stores
over the course of a year:
(i) Xi is the number of robberies of the ith store, with i = 1, 2,..., 500.
(ii) ΣXi = 50
(iii) ΣXi² = 220
(iv) The number of robberies of a given store during the year is assumed to be
Negative Binomial with r = 2, and β that varies by store.
Determine the semiparametric empirical Bayes estimate of the expected number of
robberies next year of a store that reported no robberies during the studied year.
(A) Less than 0.02
(B) At least 0.02, but less than 0.04
(C) At least 0.04, but less than 0.06
(D) At least 0.06, but less than 0.08
(E) At least 0.08

5.2 (2 points) The number of claims a driver has during the year is assumed to be
Negative Binomial with r = 3, and β that varies by driver.
The number of losses arising from 500 individual insureds over a single period of observation is
distributed as follows:
Number of Losses    Number of Insureds
       0                   450
       1                    30
       2                    10
       3                     5
       4                     5
   5 or more                 0
Determine the credibility of one year's experience for a single driver using semiparametric empirical
Bayes estimation.
A. Less than 0.20
B. At least 0.20, but less than 0.30
C. At least 0.30, but less than 0.40
D. At least 0.40, but less than 0.50
E. At least 0.50


Solutions to Problems:
5.1. C. Mean = 50/500 = .1. Second moment = 220/500 = .44.
Sample Variance = (500/499)(.44 - .1²) = .4309.
EPV = (estimated total variance + (r)(estimated mean) + estimated mean²)/(1+r) =
(.4309 + (2)(.1) + .1²)/(1+2) = .2136.
VHM = Total Variance - EPV = .4309 - .2136 = .2173.
K = EPV/VHM = .2136/.2173 = .98. For one store for one year, Z = 1/(1+K) = 1 /1.98 = .505.
The observation is 0 and the overall mean is .1, so the estimated future frequency for this store is:
(0)(.505) + (.1)(1 - .505) = 0.050.
Comment: Similar to 4, 11/00, Q.7, except there instead a Poisson is assumed.
5.2. C. The observed mean is .17, while the estimated variance (adjusted for degrees of
freedom) is: (500/499)(.39 - .17²) = .3618.
Number of Insureds   Number of Losses   Square of Number of Losses
       450                  0                       0
        30                  1                       1
        10                  2                       4
         5                  3                       9
         5                  4                      16

       500                0.170                   0.390    (total; weighted averages)

EPV = (estimated total variance + r(estimated mean) + estimated mean²)/(1+r) =
(.3618 + (3)(.17) + .17²)/(1 + 3) = .2252.
Estimated VHM = Estimated Total Variance - Estimated EPV = .3618 - .2252 = .1366.
K = EPV / VHM = .2252 / .1366 = 1.65. Z = 1 / (1+ 1.65) = 37.7%.
Comment: Similar to 4B, 11/98, Q.11, except there a Poisson is assumed.


Section 6, Overview
Comparisons Between Types of Distributions:
In four exercises the same data has been used to estimate the Buhlmann Credibility Parameter, K,
using semiparametric estimation. However, the result depended on the assumed type of
frequency distribution for each insured:
Distribution Type:   Poisson   Geometric   Negative Binomial, r = 2   Negative Binomial, β = 0.3
K:                     2.3       14.7                6.0                        9.8

In the case of a shorter-tailed Poisson, more of the total variation is assumed to be due to variation
between the insureds and less from random fluctuation in the risk process of each individual
insured. Therefore, for the Poisson assumption, K is smaller and the credibility assigned to an
individual is larger. In the case of a longer-tailed Geometric, less of the total variation is assumed to
be due to variation between the insureds and more from random fluctuation in the risk process of
each individual insured. Therefore, for the Geometric assumption, K is larger and Z is smaller.
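This comparison is easy to reproduce by computer; a brief sketch, applying each EPV formula to the same estimated mean and total variance (the labels are illustrative):

    mean, var = 0.2503, 0.3587   # estimates from the 10,000-insured example

    epv_formulas = {
        "Poisson":                mean,
        "Geometric":              (var + mean + mean ** 2) / 2,
        "Neg. Bin., r = 2":       (var + 2 * mean + mean ** 2) / 3,
        "Neg. Bin., beta = 0.3":  1.3 * mean,
    }

    for model, epv in epv_formulas.items():
        K = epv / (var - epv)
        print(f"{model:22s} K = {K:5.1f}   Z(1 year) = {1 / (1 + K):.1%}")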
Mixing Bernoullis:
One can not perform semiparametric estimation if we assume a Bernoulli frequency.
Assume that each exposure has a Bernoulli distribution, with the parameters q varying over the
portfolio of exposures. Let the mixing distribution be g(q).15
Then since the mean frequency for each exposure is q, the overall mean is the mean of g.
Since the process variance of each exposure is q(1 - q) = q - q², the EPV is:
E[q - q²] = E[q] - E[q²] = mean of g - second moment of g.
Since the mean of each exposure is q, the VHM is by definition the variance of g =
second moment of g - (mean of g)².
Thus if we let the overall mean be μ, we have the following relationships when mixing Bernoullis:
μ = mean of g.
EPV = μ - second moment of g.
VHM = second moment of g - μ².
Total Variance = EPV + VHM = μ - μ².
15 The important special case in which g(q) is a Beta Distribution is discussed in Mahler's Guide to Conjugate Priors.


Thus unlike the situation of mixing Poissons, the mean and total variance do not usefully involve the
EPV and VHM. Thus in the case of mixing Bernoullis, one can not estimate the EPV and VHM
from the observed mean and total variance.
Each exposure has a Bernoulli distribution if and only if each exposure has either zero or one claim
(per year). So it is easy to tell whether a Bernoulli assumption is appropriate. If each exposure has a
Bernoulli distribution, then the observed variance is always the observed mean - the square of the
observed mean. Thus the observed variance does not provide significant help in estimating either
the EPV or VHM.16
A Simulation Experiment:
The difference between full parametric and semiparametric estimation can be illustrated via
simulation. For example, let us simulate a model involving mixing Poissons. Assume there are four
types of risks, all with claim frequency given by a Poisson distribution:
             Average Annual      Number
Type         Claim Frequency     of Risks
Excellent           1               40
Good                2               30
Bad                 3               20
Ugly                4               10
As shown in Mahler's Guide to Buhlmann Credibility, for this full parametric model, EPV = 2,
VHM = 1, and therefore, K = 2. Therefore, for one year of data, for the full parametric model,
Z = 1/3. We determine that one year of data gets a credibility of 1/3, prior to seeing any data. In
contrast, for semiparametric estimation, Z depends on the observed data for a portfolio.
I simulated 1 year of data from these 100 insureds:

# of claims:     0   1   2   3   4   5   6   7   8   9
# of insureds:  28  27  12  19   7   5   1   0   0   1

16 Since the total variance is EPV + VHM, and each of the VHM and EPV is positive, we know each must be less
than the total variance. Thus if one has enough data to estimate the total variance with some accuracy, then this
estimate serves as an upper bound on each of the EPV and VHM.


Exercise: What is the mean and sample variance for the above data?
[Solution: The mean is 1.76. The sample variance is: (100/99)(6.00 - 1.76²) = 2.932.]
If instead of the full parametric model, I merely assume that each insured is Poisson and apply
semiparametric estimation to the simulated data, then K = 1.76/(2.932 - 1.76) = 1.50.
Therefore, Z = 1/2.50 = 40%.
Exercise: For an insured who had 5 claims in one year, what is his estimated frequency next year?
[Solution: It depends on whether one uses full parametric or semiparametric estimation. Using the
full parametric model, the a priori mean is 2, Z = 1/3, and the estimate = (1/3)(5) + (2/3)(2) = 3.
Using semiparametric estimation, the observed mean is 1.76, Z = 40%,
and the estimate = (40%)(5) + (60%)(1.76) = 3.056.]
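A few lines of Python make the contrast concrete (illustrative variable names):

    x = 5                      # observed claims in one year for this insured

    # Full parametric model: K = 2 and the a priori mean of 2 are known before seeing data.
    Z_full = 1 / (1 + 2)
    est_full = Z_full * x + (1 - Z_full) * 2          # = 3.0

    # Semiparametric: K and the mean come from the simulated portfolio (mean 1.76, s^2 = 2.932).
    K_semi = 1.76 / (2.932 - 1.76)
    Z_semi = 1 / (1 + K_semi)
    est_semi = Z_semi * x + (1 - Z_semi) * 1.76       # ≈ 3.06

    print(round(est_full, 3), round(est_semi, 3))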
I ran a total of ten simulations of the above 100 insureds, and used semiparametric estimation in
order to estimate K. The results were: 1.50, 1.98, 1.15, 2.60, 1.20, 2.11, 1.63, 1.76, 1.67, and
3.08. Thus one sees that with only 100 insureds17, the estimates of the Buhlmann Credibility
parameter using semiparametric estimation are subject to considerable random fluctuation.
However, the resulting estimates are subject to less random fluctuation. For example, for the
final simulation, the overall mean was 2.21, K = 3.08, and the estimate for an insured with 5
claims, would be (1/4.08)(5) + (3.08/4.08)(2.21) = 2.894. This is not that dissimilar from the
estimate from the first simulation, 3.056 as calculated above.
When I instead simulated the same situation, but with 10,000 rather than 100 insureds18, the
estimates of K using semiparametric estimation were less subject to random fluctuation. For ten
simulations the estimates of K were: 2.02, 1.90, 2.06, 2.07, 2.04, 1.94, 1.94, 1.96, 2.06, 1.95.
These are all relatively close to the true underlying Buhlmann Credibility Parameter,
K = 2.19
I next altered the mean frequencies to 3%, 6%, 9%, and 12%.20 With 10,000 insureds21,
for ten simulations the estimates of K were: 37, 124, 184, 33, 25, 35, 41, 32, 126, 49.22

17 With an overall mean frequency of 2, for 200 total expected claims.
18 And thus had a total of 20,000 rather than 200 expected claims.
19 In actual applications of semiparametric estimation we would not know the true underlying model or the true
value of K. The amount of credibility assigned to the data is relatively insensitive to K, and therefore small
differences in K like these usually do not make a significant difference in the estimates of future claim frequency.
20 This is more similar to a situation in private passenger automobile insurance.
21 With a mean frequency of 4%, for 400 total expected claims.
22 While in this case none of the ten simulated K values were negative, due to a negative estimate of the VHM,
recall that if the estimated VHM < 0, we set Z = 0.


Exercise: For 10,000 insureds, each of which has a Poisson frequency, 4000 have a mean
frequency of 3%, 3000 have a mean of 6%, 2000 have a mean of 9%, and 1000 have a mean of
12%. What is the Buhlmann Credibility Parameter, K?
[Solution: EPV = mean = (40%)(3%) + (30%)(6%) + (20%)(9%) + (10%)(12%) = .06.
VHM = (40%)(3% - 6%)² + (30%)(6% - 6%)² + (20%)(9% - 6%)² + (10%)(12% - 6%)² = .0009.
K = .06/.0009 = 66.7.
Comment: This is a full parametric model, as per Mahler's Guide to Buhlmann Credibility.]
Thus for these ten simulations, the values of K vary considerably around the true value of 66.7.
The corresponding credibility assigned to one year of data varies from 1/185 = 0.5% to
1/26 = 3.8%, compared to the correct amount of 1/67.7 = 1.5%.
Simulating Several Years of Data:
Rather than just simulating one year of data from each insured, for example one can simulate three
years of data from each insured. I simulated 100 insureds with mean frequencies of 1, 2, 3, and 4,
as per the original example.
The number of claims by year for the 100 insureds were:
1 0 0, 2 1 3, 0 1 2, 0 0 1, 0 3 0, 2 1 0, 0 0 0, 1 0 2, 0 2 1, 1 2 3, 0 1 0, 0 3 0, 0 2 5, 0 0 1,
0 0 1, 0 0 0, 2 1 1, 0 1 2, 0 1 1, 0 1 1, 0 2 0, 1 2 0, 1 0 1, 1 2 0, 2 0 0, 4 0 2, 1 2 2, 0 1 0,
0 3 0, 0 1 0, 0 2 2, 1 0 1, 2 3 0, 1 0 0, 0 1 0, 0 0 2, 2 1 0, 2 1 2, 0 0 0, 1 0 1, 0 2 3, 2 3 6,
6 6 2, 2 4 2, 2 4 0, 2 0 0, 3 3 2, 2 1 1, 2 1 0, 1 2 2, 0 0 3, 3 2 2, 1 2 1, 3 0 2, 0 1 0, 0 2 0,
1 2 1, 0 1 2, 2 3 2, 3 3 3, 2 3 2, 3 2 2, 1 4 4, 1 2 0, 2 2 2, 0 1 1, 0 2 1, 2 3 0, 2 2 3, 3 6 4,
6 3 2, 6 2 0, 5 4 3, 0 2 2, 7 4 1, 3 3 2, 3 3 1, 3 3 2, 0 6 3, 3 3 4, 5 2 1, 3 5 2, 3 1 2, 3 2 5,
4 4 1, 3 4 3, 3 3 4, 3 1 3, 2 7 4, 2 2 1, 5 6 2, 2 5 5, 3 2 3, 3 5 4, 6 4 4, 7 3 7, 5 3 8, 0 3 3,
9 6 7, 4 6 3.
As discussed previously, one can treat the data in two somewhat different ways. It is preferable to
treat the sum of the claims from each insured over the three years as a single observation.23
Then the mean (3 year) frequency is: 589/100 = 5.89 = EPV.
The second moment is: Σ(total for insured i)² / 100 = 5325/100 = 53.25.
Thus the sample variance is: (100/99)(53.25 - 5.89²) = 18.745.
The estimated VHM (3 year) = 18.745 - 5.89 = 12.855.
Estimated K (3 year) = EPV / VHM = 5.89/12.855 = 0.46.
23 See 4, 5/05, Q.28.


Exercise: Using the above estimated K, what is the estimated future annual frequency for the final
insured, with 4, 6 and 3 claims?
[Solution: We have calculated K assuming that one draw from the risk process is three years of
data from a single insured. Therefore, Z = 1/(1 + 0.46) = 68.5%.
Estimated three year frequency = (68.5%)(13) + (31.5%)(5.89) = 10.76.
Estimated future annual frequency = 10.76 /3 = 3.59.]
If instead one treats each year of data from each insured as a separate observation, then the mean
annual frequency is: 589/300 = 1.963. The second moment of the frequency is:
2117/300 = 7.057. The sample variance = (300/299)(7.057 - 1.963²) = 3.214. Thus the
estimated VHM = 3.214 - 1.963 = 1.251. The estimated K = 1.963/1.251 = 1.57.
Exercise: Using semi-parametric estimation, treating each year of data from each insured as a
separate observation, what is the estimated future annual frequency for the final insured,
with 4, 6 and 3 claims?
[Solution: Z = 3/(3 + 1.57) = 65.6%. Estimate = (65.6%)(13/3) + (34.4%)(1.963) = 3.52.]
More generally, let Xij be the number of claims from the ith insured for year j. If we have n insureds
and Y years of data, then treating each year of data from each insured as a separate observation:

estimated mean frequency = Σi Σj Xij / (nY).

estimated second moment of the frequency = Σi Σj Xij² / (nY).

Estimated K = EPV / VHM = mean / { [nY/(nY - 1)](second moment - mean²) - mean }.

We note that depending on how one uses the data in semiparametric estimation, one gets somewhat
different amounts of credibility applied to the data, and slightly different estimates of the future
annual frequency. It is preferable not to treat each year of data from an individual insured
separately, but rather to work with the total for each insured, since for an individual each year of data
is assumed to come from the same Poisson distribution.
In each case, the credibility differs from that for the full parametric case, 3/(3+2) = 60%.
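A brief sketch of the two treatments, applied for illustration to just the first five insureds listed above (the full calculation in the text uses all 100):

    # Each row: three years of claim counts for one insured (first five insureds above).
    data = [[1, 0, 0], [2, 1, 3], [0, 1, 2], [0, 0, 1], [0, 3, 0]]
    n, Y = len(data), len(data[0])

    def sample_var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Treatment 1: each insured's 3-year total is a single draw from the risk process.
    totals = [sum(row) for row in data]
    mean_tot = sum(totals) / n
    K_tot = mean_tot / (sample_var(totals) - mean_tot)      # EPV = mean (Poisson); N = 1

    # Treatment 2: each insured-year is a separate observation.
    yearly = [x for row in data for x in row]
    mean_yr = sum(yearly) / (n * Y)
    K_yr = mean_yr / (sample_var(yearly) - mean_yr)         # then Z = Y/(Y + K_yr)

    print(round(K_tot, 2), round(K_yr, 2))   # 2.0 and about 3.38 for these five insureds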


Alternately one could apply nonparametric estimation to this situation.24


si² = the usual sample variance for the data from a single insured i.
EPV = average of the si².
VHM = (the sample variance of the means for the insureds) - EPV / (# years of data).
For the simulated data shown above, s1² = 1/3,25 EPV = 1.710, mean for the first insured is 1/3,
VHM = 1.513 and K = 1.710/1.513 = 1.13. Thus using nonparametric estimation, three years of
data are given a credibility of: 3/(3+1.13) = 72.6%.
For ten simulations, here is the credibility given to three years of data for these four different techniques:

Simulation      Full          Semiparametric,        Semiparametric,       Nonparametric
    #           Parametric    Each Year              Working with
                              Treated Separately     Insured's Totals
    1             60.0%            65.6%                  68.5%                72.6%
    2             60.0%            69.0%                  69.8%                70.9%
    3             60.0%            56.7%                  56.9%                57.0%
    4             60.0%            59.8%                  60.4%                61.0%
    5             60.0%            68.2%                  65.9%                62.2%
    6             60.0%            46.3%                  54.3%                61.5%
    7             60.0%            50.9%                  46.8%                42.4%
    8             60.0%            68.6%                  64.4%                57.5%
    9             60.0%            59.6%                  59.8%                59.8%
   10             60.0%            39.7%                  46.7%                52.3%

  Avg.            60.0%            58.4%                  59.4%                59.7%

The average credibility assigned is similar for the four techniques.26 However, the random
fluctuations in the data have a large effect on the nonparametric estimates of K, which rely solely on
the data. In contrast, the full parametric method does not rely on the data in order to estimate K, and
therefore the estimate of K is unaffected by the random fluctuations in the data. The semiparametric
estimates of K are usually affected somewhat less by random fluctuations than nonparametric
estimation.

24 See Mahler's Guide to Empirical Bayesian Credibility.
25 The mean of 1, 0, 0 is 1/3. The sample variance is: {(0 - 1/3)² + (0 - 1/3)² + (1 - 1/3)²}/(3 - 1) = 1/3.
26 This is due to the fact that the model that generated the simulated data was the same one used for full parametric
estimation. In actual actuarial applications we do not have a perfect model of reality.


Therefore, the Full Parametric Method has the advantage of a stable estimate of K. Thus one
would prefer the Full Parametric Method, provided the assumptions in the model are close to
reality.27 The nonparametric model has the advantage of making very few assumptions, but if there
is not a very large data set, the estimate of K is subject to significant random fluctuation. The
semiparametric estimates of K share some of the advantages and disadvantages of the other two
methods.

27 This is a very big caveat. However, in many situations the actuary can use information from other similar situations
(other years, other states, other insurers, etc.) to help him formulate a model.


Problems:
Use the following information for the next 4 questions:
The claim count distribution during a year is as follows for a group of 10,000 insureds:
Total Claim Count:      0     1    2    3   4   5   6   7   8   9   >9
Number of Insureds:  6788  2099  704  253  95  37  15   6   2   1    0

6.1 (3 points) Assume each insured's claim count follows a Poisson Distribution.
How much credibility would be given to three years of data from an insured?
A. 60%
B. 65%
C. 70%
D. 75%
E. 80%
6.2 (3 points) Assume each insured's claim count follows a Geometric Distribution.
How much credibility would be given to three years of data from an insured?
A. Less than 15%
B. At least 15%, but less than 20%
C. At least 20%, but less than 25%
D. At least 25%, but less than 30%
E. At least 30%
6.3 (3 points) Assume each insured's claim count follows a Negative Binomial Distribution,
with r = 3, and β varying across the portfolio of insureds.
How much credibility would be given to three years of data from an insured?
A. Less than 40%
B. At least 40%, but less than 45%
C. At least 45%, but less than 50%
D. At least 50%, but less than 55%
E. At least 55%
6.4 (3 points) Assume each insured's claim count follows a Negative Binomial Distribution,
with β = 0.2, and r varying across the portfolio of insureds.
How much credibility would be given to three years of data from an insured?
A. Less than 40%
B. At least 40%, but less than 45%
C. At least 45%, but less than 50%
D. At least 50%, but less than 55%
E. At least 55%


Solutions to Problems:
6.1. B. The estimated mean is: 4988/10000 = .4988.
The estimated second moment is: 10680/10000 = 1.0680.
The sample variance is: (10000/9999)(1.0680 - .4988²) = .8193.
    A                B
Number of       Number of       Col. A times    Square of Col. A
 Claims          Insureds          Col. B         times Col. B
   0               6788               0                 0
   1               2099            2099              2099
   2                704            1408              2816
   3                253             759              2277
   4                 95             380              1520
   5                 37             185               925
   6                 15              90               540
   7                  6              42               294
   8                  2              16               128
   9                  1               9                81

  Sum             10000            4988             10680

EPV = overall mean = .4988.


VHM = Total Variance - EPV = .8193 - .4988 = .3205.
K = EPV/VHM = .4988/.3205 = 1.56.
For three years, Z = 3/4.56 = 65.8%.
Comment: One year of experience would be given a credibility of Z = 1/(1+K) = 1/2.56 = .391.
Thus the estimated future claim frequency (Buhlmann credibility premium) for an insured who had
for example 5 claims in one year is: (.391)(5) + (1 - .391)(.4988) = 2.26.
6.2. A. In the previous solution we estimated the total mean as .4988 and the estimated total
variance as .8193.
EPV = (estimated total variance + estimated mean + estimated mean²)/2 =
(.8193 + .4988 + .4988²)/2 = .7835.
VHM = Total Variance - EPV = .8193 - .7835 = .0358.
K = EPV/VHM = .7835/.0358 = 21.9.
For three years, Z = 3/24.9 = 12.0%.
Comment: One year of experience would be given a credibility of Z = 1/(1+K) = 1/22.9 = .044.
Thus the estimated future claim frequency (Buhlmann credibility premium) for an insured who had
for example 7 claims in one year is: (.044)(7) + (1 - .044)(.4988) = .785. Here we assumed more
of the total variance was due to the process variance and less was due to differences between the
insureds, than was the case when we assumed each insured was Poisson. Therefore, here much
less credibility would be assigned to the experience of an individual insured.


6.3. C. In a previous solution we estimated the total mean as .4988 and the estimated total
variance as .8193.
EPV = (estimated total variance + r(estimated mean) + estimated mean²)/(1+r) =
(.8193 + 3(.4988) + .4988²)/4 = .6411.
VHM = Total Variance - EPV = .8193 - .6411 = .1782.
K = EPV/VHM = .6411/.1782 = 3.60.
For three years, Z = 3/6.60 = 45.5%.
6.4. D. In a previous solution we estimated the total mean as .4988 and the estimated total
variance as .8193.
EPV = (1+β)(estimated mean) = (1.2)(.4988) = .5986.
VHM = Total Variance - EPV = .8193 - .5986 = .2207.
K = EPV/VHM = .5986/.2207 = 2.71.
For three years, Z = 3/5.71 = 52.5%.


Section 7, Other Distributions


While semiparametric estimation is usually applied to frequency distributions, particularly the
Poisson Distribution, it is possible to apply these same ideas to other distributions.
For example, assume each insured has an Exponential Severity with mean θ, and θ varies across
the portfolio.
μ = E[Mean | θ] = E[θ].
EPV = E[Var | θ] = E[θ²].
VHM = Var[Mean | θ] = Var[θ] = E[θ²] - E[θ]².
Total Variance = EPV + VHM = 2E[θ²] - E[θ]² = 2E[θ²] - μ².
EPV = E[θ²] = (Total Variance + μ²)/2.
Therefore, we can estimate the EPV as: (S² + X²)/2.
VHM = S² - EPV.
Exercise: For 100 claims from this portfolio of insureds, ΣXi = 2000, ΣXi² = 90,000.
Estimate K, using semiparametric estimation.
[Solution: X = 2000/100 = 20. E[X²] = 900. S² = (900 - 20²)(100/99) = 505.
EPV = (S² + X²)/2 = (505 + 20²)/2 = 452.5.
VHM = 505 - 452.5 = 52.5. K = 452.5/52.5 = 8.6.]
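A quick numerical check of this exercise (illustrative names):

    # Semiparametric estimation with an Exponential severity, theta varying by insured.
    n, sum_x, sum_x2 = 100, 2000, 90000

    xbar = sum_x / n                                   # 20
    second_moment = sum_x2 / n                         # 900
    s2 = (n / (n - 1)) * (second_moment - xbar ** 2)   # ≈ 505

    epv = (s2 + xbar ** 2) / 2                         # estimate of E[theta^2]
    vhm = s2 - epv
    print(round(epv / vhm, 1))                         # K ≈ 8.6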


Problems:
7.1 (3 points) Each insured has a Gamma Severity with parameters α = 3 and θ, with θ varying
across the portfolio.
For 1000 claims from this portfolio of insureds, ΣXi = 20,000, ΣXi² = 550,000.
Estimate the Buhlmann Credibility Parameter K, using semiparametric estimation.
A. 11
B. 13
C. 15
D.17
E. 19
7.2 (3 points) The annual aggregate losses for each policy are uniform from 0 to θ, with θ varying
across the portfolio. For 200 policies from this portfolio, ΣXi = 20,000, ΣXi² = 3,000,000.
Estimate the amount of credibility to be applied to one year of data from one policy, using
semiparametric estimation.
A. 5%
B. 10%
C. 15%
D.20%
E. 25%
7.3 (3 points) Each insured has an Exponential Severity with mean θ, and θ varies across the
portfolio. For 1000 claims from this portfolio of insureds, ΣXi = 50,000, ΣXi² = 5,100,000.
Estimate the Buhlmann Credibility Parameter K, using semiparametric estimation.
A. 40
B. 50
C. 60
D.70
E. 80
7.4 (3 points) The annual aggregate losses for each policy are Normal with σ = 170, with μ varying
across the portfolio. For 100 policies from this portfolio, ΣXi = 20,000, ΣXi² = 7,000,000.
Using semiparametric estimation, estimate the future annual aggregate losses for a policy that had
a total of 1300 in loss in 3 years.
A. 215
B. 220
C. 225
D.230
E. 235


Solutions to Problems:
7.1. A. μ = E[3θ] = 3E[θ]. EPV = E[3θ²] = 3E[θ²]. VHM = Var[3θ] = 9Var[θ] = 9E[θ²] - 9E[θ]².
Total Variance = EPV + VHM = 12E[θ²] - 9E[θ]² = 12E[θ²] - μ².
EPV = 3E[θ²] = (Total Variance + μ²)/4.
X = 20000/1000 = 20. E[X²] = 550. S² = (550 - 20²)(1000/999) = 150.15.
EPV = (S² + X²)/4 = (150.15 + 20²)/4 = 137.54.
VHM = 150.15 - 137.54 = 12.61. K = 137.54/12.61 = 10.9.
Comment: For fixed α and θ varying across the portfolio, EPV = (S² + X²)/(α + 1).
7.2. E. μ = E[θ/2] = E[θ]/2. EPV = E[θ²/12] = E[θ²]/12.
VHM = Var[θ/2] = Var[θ]/4 = E[θ²]/4 - E[θ]²/4.
Total Variance = EPV + VHM = E[θ²]/3 - E[θ]²/4 = E[θ²]/3 - μ².
EPV = E[θ²]/12 = (Total Variance + μ²)/4.
X = 20000/200 = 100. E[X²] = 3,000,000/200 = 15,000.
S² = (15000 - 100²)(200/199) = 5025.13.
EPV = (S² + X²)/4 = (5025.13 + 100²)/4 = 3756.3.
VHM = 5025.13 - 3756.3 = 1268.8. K = 3756.3/1268.8 = 2.96.
Z = 1/(1 + K) = 25.3%.
7.3. B. μ = E[θ]. EPV = E[θ²]. VHM = Var[θ] = E[θ²] - E[θ]².
Total Variance = EPV + VHM = 2E[θ²] - E[θ]² = 2E[θ²] - μ².
EPV = E[θ²] = (Total Variance + μ²)/2.
X = 50000/1000 = 50. E[X²] = 5100. S² = (5100 - 50²)(1000/999) = 2602.6.
EPV = (S² + X²)/2 = (2602.6 + 50²)/2 = 2551.3.
VHM = 2602.6 - 2551.3 = 51.3. K = 2551.3/51.3 = 49.7.



7.4. D. EPV = E[σ²] = E[170²] = 28,900.
X = 20000/100 = 200. E[X²] = 7,000,000/100 = 70,000.
S² = (70000 - 200²)(100/99) = 30,303.
VHM = 30,303 - 28,900 = 1403. K = 28,900/1403 = 20.6.
Z = 3/(3 + K) = 12.7%.
Estimate = (12.7%)(1300/3) + (87.3%)(200) = 229.6.



Section 8, Important Ideas & Formulas


Assume we have one year of data from a group of insureds.
Assume each insured's frequency distribution is of the same type, but a parameter varies across
the group of insureds.
Let X = observed mean, and
s² = sample variance = estimate of the total variance.
In each case, the estimated VHM = s² - estimated EPV.
Poisson: EPV = X = observed mean.
Negative Binomial (fixed β): EPV = (1 + β)X.
Geometric: EPV = (s² + X + X²)/2.

Negative Binomial (fixed r): EPV = (s² + rX + X²)/(1 + r).

Mahler's Guide to

Empirical Bayesian Credibility


Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-12


Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching


Mahler's Guide to Empirical Bayesian Credibility


Copyright 2013 by Howard C. Mahler.
The concepts in Section 20.4 of Loss Models, by Klugman, Panjer and Willmot, related to
Nonparametric Estimation are demonstrated.
Information in bold or sections whose title is in bold are more important for passing the exam.
Larger bold type indicates it is extremely important.
Information presented in italics (and sections whose title is in italics) should not be needed to directly
answer exam questions and should be skipped on first reading. It is provided to aid the reader's
overall understanding of the subject, and to be useful in practical applications.
Solutions to the problems in each section are at the end of that section.
Note that problems include both some written by me and some from past exams.1
I have assigned point values to the questions based on 100 points corresponds to a four hour
actuarial exam.

Section #     Pages       Section Name
    1          3-5        Introduction
    2          6-38       No Varying Exposures
    3         39-51       Differing Numbers of Years, No Varying Exposures
    4         52-93       Varying Exposures
    5         94-106      Differing Numbers of Years, Varying Exposures
    6        107-113      Using an A Priori Mean
    7        114-124      Using an A Priori Mean, Varying Exposures
    8        125-138      Assuming a Poisson Frequency
    9        139-143      Important Formulas and Ideas

In some cases I've rewritten these questions in order to match the notation in the current Syllabus. Past exam
questions are copyright by the Casualty Actuarial Society and Society of Actuaries and are reproduced here solely
to aid students in studying for exams. The solutions and comments are solely the responsibility of the author; the
CAS and SOA bear no responsibility for their accuracy. While some of the comments may seem critical of certain
questions, this is intended solely to aid you in studying and in no way is intended as a criticism of the many
volunteers who work extremely long and hard to produce quality exams.


Course 4 Exam Questions by Section of this Study Aid2


Section Sample
1
2
3
4
5
6
7
8

31

5/00

11/00

15

16
27

5/01

32

11/01

11/02

11/03

11

15

30

11/04

17

5/05

11/05

11/06

5/07

27

11

25

11
22

13

The CAS/SOA did not release the 5/02, 5/03, 5/04, 5/06, 11/07 and subsequent exams.

Excluding any questions that are no longer on the syllabus.


Section 1, Introduction
Section 20.4.1 of Loss Models discusses estimating the Expected Value of the Process Variance
and the Variance of the Hypothetical Means from data.3 This study guide will explain such Empirical
Bayesian Credibility techniques. First we will contrast these nonparametric methods to full
parametric and semiparametric methods of estimation.
Multi-sided Dice Example:
Mahler's Guide to Buhlmann Credibility, Mahler's Guide to Conjugate Priors, Loss Models, and
Credibility by Mahler and Dean have many examples of different models for which the Bühlmann
Credibility assigned to an observation is determined. In these Full Parametric Models, prior to an
observation, all aspects of the risk process are specified. One such model involves multi-sided
dice:4
There are a total of 100 multi-sided dice of which 60 are 4-sided, 30 are 6-sided and 10 are 8-sided.
The multi-sided dice with 4 sides have 1, 2, 3, and 4 on them. The multi-sided dice with the usual 6
sides have numbers 1 through 6 on them. The multi-sided dice with 8 sides have numbers 1
through 8 on them. For a given die each side has an equal chance of being rolled; i.e., the die is fair.
Your friend has picked at random a multi-sided die. He then rolled the die and told you the result.
You are to estimate the result when he rolls that same die again.
It was shown that for this example:
A Priori Mean Die Roll = 3.
Expected Value of the Process Variance = EPV = 2.15.
Variance of the Hypothetical Means = VHM = .45.
Bühlmann Credibility Parameter = K = EPV/VHM = 2.15 / .45 = 4.778.
Credibility assigned to the observation of one die roll = Z = 1/(1+K) = 17.3%.
Bühlmann Credibility Estimate of the next roll from the same die =
(0.173)(Observation) + (0.827)(3).

3 See also Topics in Credibility by Dean. This is also discussed briefly in Section 6.6 of Credibility by Mahler and Dean, not on the Syllabus.
4 See Mahler's Guide to Buhlmann Credibility, or Sections 3.1 and 3.2 of Credibility by Mahler and Dean, the 4th
Edition of Foundations of Casualty Actuarial Science.


Example of Mixing Poissons:


In Mahler's Guide to Semiparametric Estimation, there is an example involving Poisson
frequencies.5 Assume that one observes that the claim count distribution is as follows for a large
group of insureds:
Total Claim Count:          0      1      2      3      4      5     >5
Percentage of Insureds:   60.0%  24.0%   9.8%   3.9%   1.6%   0.7%    0%

One can estimate the total mean as 0.652 and the total variance as: 1.414 - 0.652² = 0.989.
Assume in addition that the claim count for each individual insured has a Poisson distribution which
does not change over time. In other words each insured's frequency process is given by a Poisson
with parameter λ, with λ varying over the group of insureds. Then the process variance for each
insured is λ.
Thus the expected value of the process variance is estimated as follows:
E[VAR[X | λ]] = E[λ] = overall mean = 0.652.
Thus we estimate the Variance of the Hypothetical Means as:
Total Variance - EPV = 0.989 - 0.652 = 0.337.
Then K = EPV / VHM = 0.652/0.337 = 1.93.
Exercise: An insured from this portfolio is observed to have 1 claim in five years.
Estimate the future claim frequency for this insured.
[Solution: Z = 5/(5+K) = 5/6.93 = 72.2%.
The estimated future frequency = (72.2%)(1/5) + (1 - 72.2%)(0.652) = 0.326.
Comment: Once we have estimated K, we can use it for similar insureds to those in this portfolio.
For example, for a similar insured observed for 4 years, Z = 4/(4 + 1.93) = 67.5%.]
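A few lines of Python reproduce this arithmetic (illustrative names):

    # Claim count distribution for the portfolio, each insured assumed Poisson.
    counts = [0, 1, 2, 3, 4, 5]
    probs = [0.600, 0.240, 0.098, 0.039, 0.016, 0.007]

    mean = sum(c * p for c, p in zip(counts, probs))           # 0.652
    second = sum(c * c * p for c, p in zip(counts, probs))     # 1.414
    total_var = second - mean ** 2                             # ≈ 0.989

    epv = mean                     # Poisson: process variance equals the mean
    vhm = total_var - epv          # ≈ 0.337
    K = epv / vhm                  # ≈ 1.94 (the text's 1.93 uses rounded intermediate values)

    Z = 5 / (5 + K)                                            # five years of data
    estimate = Z * (1 / 5) + (1 - Z) * mean
    print(round(K, 2), round(estimate, 3))                     # ≈ 1.94, 0.326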
This is an example of what is called Semiparametric Estimation, or Semiparametric Empirical Bayes
Estimation. While the form of the risk process for each insured is specified, the distribution of risk
parameters among the insureds is not specified. Therefore, in addition to the modeling assumptions,
one must rely on observed data in order to estimate the Bühlmann Credibility Parameter.

5 See also page 48 of Credibility by Mahler and Dean.


Nonparametric Estimation:
In contrast to the two previous examples, one can rely on the data itself, without any specific model
of the risk process, to estimate the EPV and VHM. An important set of techniques has been
developed that tries to estimate the Bühlmann Credibility Parameter, K, from observed data.6 This
is sometimes referred to as nonparametric estimation. For nonparametric estimation, you need either
a number of individuals, classes, etc., each observed over a number of years, or a number of
individuals, classifications, etc., observed across a larger group.7 Examples of these techniques will
be discussed in this study guide.

Semiparametric vs. Nonparametric vs. Full Parametric Estimation:


Semiparametric estimation assumes a particular form of the frequency distribution, which differs from
nonparametric estimation where no such assumption is made.
One can use semiparametric estimation with only one year of data from a portfolio of insureds.
Nonparametric estimation requires data separately for each of at least two years from several
insureds.8
Semiparametric estimation assumes a particular form of the frequency distribution, but differs from full
parametric estimation via Bühlmann Credibility, where in addition one assumes a particular
distribution of types of insureds or mixing distribution. Full parametric estimation assumes a
complete model, and the Bühlmann Credibility Parameter, K, is calculated with no reference to any
observations.
From the least complete to most complete modeling assumptions:
Nonparametric, semiparametric, full parametric.
From the least reliance on data in order to calculate K to the most reliance on data to calculate K:
Full parametric, semiparametric, nonparametric.

6 References include: Loss Models, by Klugman, Panjer, and Willmot; Institute of Actuaries Study Note, An
Introduction to Credibility Theory, by H. R. Waters; Report of the Credibility Subcommittee: Development and
Testing of Empirical Bayes Credibility Procedures for Classification Ratemaking, I.S.O., September 1980; Gary
Venter's Credibility Chapter of Editions 1 to 3 of Foundations of Casualty Actuarial Science; Advanced Risk Theory
by De Vylder.
7 Four or more risks, classes, etc., is recommended in order to apply these techniques in practical applications.
However, in order to reduce the time needed to perform calculations, exam questions may have only 2 or 3
individuals or classes.
8 As discussed subsequently, if one relies on an a priori estimate of the mean, one can apply these techniques to a
single insured, class, etc.


Section 2, No Varying Exposures (No Variation in Years)


There are a number of situations to which Nonparametric Empirical Bayesian Estimation can be
applied. The simplest case is that in which the insureds each have the same volume of data in each
year. Loss Models refers to this case as the Bühlmann case, as opposed to the Bühlmann-Straub
case with varying exposures. This case with no varying exposures can be split in turn into a simpler
case in which each insured has the same number of years of data, and a more complex case in which
insureds have different numbers of years of data.
The simpler case, which is a special case of the more complex case, will be discussed first. In this
simplest case, we have a rectangular array of data (number of claims, dollars of loss, etc.) with no
empty cells and no specific mention of exposures. Each class (or individual, region, etc.) has the
same number of years of data.
A Three Driver Example:
For example, assume there are 3 drivers in a particular rating class.
For each of 5 years, we have the number of claims for each of these 3 drivers:9
Hugh     0   0   0   0   0
Dewey    0   1   0   0   0
Louis    0   0   2   1   0

Exercise: Calculate the mean and sample variance of each driver.


Calculate the overall mean and the sample variance of the drivers' means.
[Solution: The sample variance for Louis is:10
{(0 - 0.6)² + (0 - 0.6)² + (2 - 0.6)² + (1 - 0.6)² + (0 - 0.6)²} / (5 - 1) = 0.80.

Hugh
Dewey
Louis
Mean
Variance

Mean

0
0
0

0
1
0

0
0
2

0
0
1

0
0
0

0.00
0.20
0.60
0.2667
0.0933

Sample
Variance
0.00
0.20
0.80
0.333

The drivers' means are: 0, 0.2, and 0.6; their sample variance is:
{(0 - 0.2667)² + (0.2 - 0.2667)² + (0.6 - 0.2667)²} / (3 - 1) = 0.09333.]
Using nonparametric empirical Bayesian estimation, one estimates the EPV as the average of the
sample variances for each driver = (0 + 0.2 + 0.8) / 3 = 0.33333.
9 I always put the observations going across, and the individuals, classes, groups, regions, etc. going down.
10 Note that in the sample variance one divides by n-1 = 4, rather than n = 5.


Then one estimates the VHM as: (sample variance of the drivers' means) - EPV / (# years of data)
= 0.09333 - 0.33333/5 = 0.0267.11
We divide the estimated EPV and VHM in order to estimate K.
K = EPV/VHM = 0.3333 / 0.0267 = 12.5.
Z = 5 / (5 + 12.5) = 28.6%.
Overall mean = (0 + 0.2 + 0.6)/3 = 0.2667
Using nonparametric empirical Bayesian estimation to estimate the future claim frequency of each
driver:
Hugh: (0.286)(0) + (1 - 0.286)(0.2667) = 0.190.
Dewey: (0.286)(0.2) + (1 - 0.286)(0.2667) = 0.248.
Louis: (0.286)(0.6) + (1 - 0.286)(0.2667) = 0.362.
Usually one would not employ this technique in practical applications, unless one had many
more drivers.
We note that the resulting estimates are in balance:
(0.190 + 0.248 + 0.362) / 3 = 0.2667 = the observed mean claims frequency.
Formulas, no Variation in Exposures or Years:
This is an example of Empirical Bayesian estimation.
In general this technique would proceed as follows:
Assume we have Y years of data from each of C classes.
Let Xit be the data (die roll, frequency, severity, pure premium, etc.) observed for class (or risk) i,
in year t, for i = 1,...,C, and t = 1,...,Y.12
Let X̄i = Σt Xit / Y = average of the data for class i.

Let X̄ = Σi Σt Xit / (CY) = overall average of the observed data.

11 A correction term is subtracted from the sample variance of the drivers' means in order to make the estimate of the
VHM unbiased, as discussed below.
12 Here we have assumed the same number of years of data for each class or risk.


Let si² = Σt (Xit - X̄i)² / (Y - 1) = sample variance = estimated process variance for class i.

The estimated EPV = Σi si² / C = Σi Σt (Xit - X̄i)² / {C(Y - 1)}.13

The estimated VHM = Σi (X̄i - X̄)² / (C - 1) - EPV/Y.14

K = EPV / VHM = estimated Bühlmann Credibility Parameter.


In the above example: C = 3, Y = 5, X̄2 = 0.2, X̄ = 0.267, s2² = 0.20, EPV = 0.333,
VHM = 0.093 - 0.333/5 = 0.0264, and K = 0.333 / 0.0264 = 12.6.
Thus 5 years of data would be given a credibility of: 5 / (5 + 12.6) = 28.4%.
The remaining 71.6% weight would be assigned to the observed overall mean of 0.267.
Thus for risk 2, Dewey, one would estimate the future annual frequency as:
(28.4%)(0.2) + (71.6%)(0.267) = 0.248.
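A compact sketch of these formulas, checked against the three-driver data (the function name is illustrative):

    def empirical_bayes_K(data):
        """data[i] = list of Y observations for class i (same Y for every class).
        Returns (EPV, VHM, K) per the formulas above; a VHM <= 0 means Z = 0."""
        C, Y = len(data), len(data[0])
        class_means = [sum(row) / Y for row in data]
        grand_mean = sum(class_means) / C
        epv = sum(sum((x - m) ** 2 for x in row)
                  for row, m in zip(data, class_means)) / (C * (Y - 1))
        vhm = sum((m - grand_mean) ** 2 for m in class_means) / (C - 1) - epv / Y
        return epv, vhm, (epv / vhm if vhm > 0 else float("inf"))

    data = [[0, 0, 0, 0, 0],    # Hugh
            [0, 1, 0, 0, 0],    # Dewey
            [0, 0, 2, 1, 0]]    # Louis
    print(empirical_bayes_K(data))   # EPV = 0.333, VHM ≈ 0.027, K ≈ 12.5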

13 Loss Models uses the notation v for the EPV, and thus the estimated EPV would be v̂.
14 Loss Models uses the notation a for the VHM, and thus the estimated VHM would be â.


Using the Functions of the Calculator to Compute Sample Means and Variances:
Using the TI-30X-IIS Multiview, one could work as follows with the data for Louis, (0, 0, 2, 1, 0):
DATA
DATA
Clear L1 ENTER
0 ENTER
0 ENTER
2 ENTER
1 ENTER
0 ENTER
(The five claim counts for Louis should now be in the column labeled L1.)
2nd STAT
1-VAR ENTER (If necessary use the arrows on the big button at the upper right to select 1-VAR.)
DATA L1 ENTER (If necessary use the arrows on the big button at the upper right to select DATA L1.)
FRQ ONE ENTER (If necessary use the arrows to select FRQ ONE.)
CALC ENTER (Use the arrows on the big button at the upper right to select CALC.)
Various outputs are displayed. Use the arrows on the big button to scroll through them.
n = 5 (number of data points.)
X̄ = 0.6 (sample mean of X.)
Sx = 0.89443 (square root of the sample variance of X.)
σx = 0.8 (square root of the variance of X, computed with n in the denominator.)
ΣX = 3
ΣX² = 5

To exit stat mode, hit 2ND QUIT.


To display the outputs again:
2nd STAT
STATVAR ENTER (Use the arrows on the big button at the upper right to select STATVAR.)
To get Sx², scroll down to Sx, hit ENTER, and hit the x² key. The sample variance is: 4/5 = 0.8.

2013-4-12

Empirical Bayesian Cred. 2 No Varying Exposures, HCM 10/22/12, Page 10

Using the TI-30X-IIS, one could work as follows with the data for Louis, (0, 0, 2, 1, 0):
2nd STAT
CLRDATA ENTER
2nd STAT
1-VAR ENTER (Use the arrow key if necessary to select 1-VAR rather than 2-VAR.)
DATA
X1 = 0
Freq = 1
X2 = 0
Freq = 1
X3 = 2
Freq = 1
X4 = 1
Freq = 1
X5 = 0
Freq = 1 ENTER
STATVAR
Various outputs are displayed. Use the arrow keys to scroll through them.
n = 5 (number of data points.)
X̄ = 0.6 (sample mean of X.)
Sx = 0.89443 (square root of the sample variance of X.)
σx = 0.8 (square root of the variance of X, computed with n in the denominator.)
ΣX = 3
ΣX² = 5

Sx² = 0.89443² = 0.8.
Alternately, one could have entered the data for Louis, (0, 0, 2, 1, 0), as follows:
X1 = 0
Freq = 3
X2 = 1
Freq = 1
X3 = 2
Freq = 1 ENTER


Using the BA II Plus Professional, as follows for the data for Louis, (0, 0, 2, 1, 0):
2nd DATA
2nd CLR WORK
X1 0 ENTER
X2 0 ENTER
X3 2 ENTER
X4 1 ENTER
X5 0 ENTER
2nd STAT
If necessary press 2nd SET until 1-V is displayed (for one variable)
Various outputs are displayed. Use the arrow keys to scroll through them.
n = 5 (number of data points.)
X̄ = 0.6 (sample mean of X.)
Sx = 0.89443 (square root of the sample variance of X.)
σx = 0.8 (square root of the variance of X, computed with n in the denominator.)
ΣX = 3
ΣX² = 5

Sx² = 0.89443² = 0.8.


Multisided Die Simulation:


For example, assume we have the following data generated by a simulation of the multi-sided die
example:
Risk    Die                                                         Sample
Number  Type        Trial Number (12 rolls)               Mean      Variance
  1      4      1  1  2  4  3  2  3  1  1  1  4  1       2.000       1.455
  2      4      4  1  3  1  2  2  4  3  4  1  4  3       2.667       1.515
  3      4      3  2  3  4  1  2  1  4  4  3  2  3       2.667       1.152
  4      4      4  4  3  3  1  3  1  3  2  1  3  2       2.500       1.182
  5      4      1  4  3  2  2  3  3  4  3  2  2  4       2.750       0.932
  6      4      2  1  4  1  2  3  3  2  4  4  1  2       2.417       1.356
  7      6      5  5  1  2  5  3  6  1  3  3  3  2       3.250       2.750
  8      6      1  2  3  1  3  1  1  6  3  3  2  2       2.333       2.061
  9      6      4  2  6  4  1  5  2  5  6  3  6  1       3.750       3.659
 10      8      8  3  6  2  2  4  2  7  8  7  5  7       5.083       5.720

Average                                                   2.942       2.178
Variance                                                  0.8056

Assume that we only have the observed rolls and do not know what sided die generated each risk.
Then one can attempt to estimate the EPV and VHM, and thus the Bühlmann Credibility constant K,
as follows.
For Risk 1 separately we can estimate the mean as 2 and the process variance as:15
{(1-2)² + (1-2)² + (2-2)² + (4-2)² + (2-2)² + (3-2)² + (1-2)² + (1-2)² + (3-2)² + (1-2)² + (4-2)² + (1-2)²} / (12 - 1)
= 1.455. We have calculated the usual sample variance of the first row of data.
Exercise: Based on the above data for risk 9 what is the estimated process variance?
[Solution: The estimated mean is 3.75 and the estimated process variance is 3.659.
Comment: We have estimated the sample mean and variance for a row of the data. ]
In this manner one gets a separate estimate of the process variance of each of the ten risks. By
taking an average of these ten values, one gets an estimate of the (overall) expected value of the
process variance, in this case 2.178.16 17

15

Note that this estimate of the process variance only involves data from a single row. We have divided by the
number of columns minus one; these Empirical Bayes methodologies are careful to adjust for degrees of freedom in
order to attempt to get unbiased estimators. With an unknown mean, one divides by n-1 in order to get the sample
variance, an unbiased estimate of the variance.
16
Note that the EPV for the underlying model is in this case 2.10; thus 2.178 is a pretty good estimate.
17
Note that this is the within variance from analysis of variance.


Now by taking an average of the means estimated for each of the ten risks, one gets an estimate of
the overall average, in this case 2.942. Then in order to estimate the VHM one could sum the
squared deviations of the individual means from the overall mean.
In this case that sum is: (2 - 2.942)² + (2.667 - 2.942)² + (2.667 - 2.942)² +
(2.5 - 2.942)² + (2.75 - 2.942)² + (2.417 - 2.942)² + (3.25 - 2.942)² + (2.333 - 2.942)² +
(3.75 - 2.942)² + (5.083 - 2.942)² = 7.251. Dividing this sum by one less than the number of risks,
7.251 / (10 - 1) = 0.8057 is the sample variance of the hypothetical means.18
However, as discussed subsequently, this would be a biased estimate of the VHM. It contains a
little too much random variation due to the use of the observed means rather than the actual
hypothetical means. It turns out that if we subtract out the estimated EPV divided by the number of
years, the resulting estimator will be an unbiased estimator of the VHM:
0.8057 - (2.178/12) = 0.6242
In this example: C = 10, Y = 12, X̄1 = 2, X̄ = 2.942, s1² = 1.455, EPV = 2.178,
VHM = 7.251/(10 - 1) - 2.178/12 = 0.6242, and K = 2.178/0.6242 = 3.5.
Thus twelve trials of data would be given a credibility of 12 / (12+3.5) = 77.4%.19
The remaining 22.6% weight would be assigned to the observed overall mean of 2.942.
Thus for risk 1 one would estimate a future roll as: (77.4%)(2.000) + (22.6%)(2.942) = 2.21.
Exercise: Assuming we are unaware of which sided die was used to generate each risk, what is the
estimate of a future roll from risk 9?
[Solution: (77.4%)(3.750) + (22.6%)(2.942) = 3.57.]
Note that the computations of the estimated EPV and VHM would have been much faster if one
had been given summary statistics. In this case:

Σi Σj (Xij - X̄i)² = 239.583.
Σi (X̄i - X̄)² = 7.251.
Estimated EPV = Σi Σj (Xij - X̄i)² / {C(Y - 1)} = 239.583 / {(10)(12 - 1)} = 2.178.
Estimated VHM = Σi (X̄i - X̄)² / (C - 1) - EPV/Y = 7.251 / (10 - 1) - 2.178/12 = 0.6241.
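For readers who like to verify such computations by computer, here is a minimal Python sketch (standard library only) that reproduces the above estimates from the simulated grid of rolls; the variable names are arbitrary:

from statistics import mean, variance

# The 10 x 12 grid of simulated die rolls tabulated above (one row per risk).
rolls = [
    [1, 1, 2, 4, 2, 3, 1, 1, 3, 1, 4, 1],   # Risk 1  (4-sided)
    [4, 1, 3, 1, 2, 2, 4, 3, 4, 1, 4, 3],   # Risk 2  (4-sided)
    [3, 2, 3, 4, 1, 2, 1, 4, 4, 3, 2, 3],   # Risk 3  (4-sided)
    [4, 4, 3, 3, 1, 3, 1, 3, 2, 1, 3, 2],   # Risk 4  (4-sided)
    [1, 4, 3, 2, 2, 3, 3, 4, 3, 2, 2, 4],   # Risk 5  (4-sided)
    [2, 1, 4, 1, 2, 3, 3, 2, 4, 4, 1, 2],   # Risk 6  (4-sided)
    [5, 5, 1, 2, 5, 3, 6, 1, 3, 3, 3, 2],   # Risk 7  (6-sided)
    [1, 2, 3, 1, 3, 1, 1, 6, 3, 3, 2, 2],   # Risk 8  (6-sided)
    [4, 2, 6, 4, 1, 5, 2, 5, 6, 3, 6, 1],   # Risk 9  (6-sided)
    [8, 3, 6, 2, 2, 4, 2, 7, 8, 7, 5, 7],   # Risk 10 (8-sided)
]
C, Y = len(rolls), len(rolls[0])                      # 10 risks, 12 years
risk_means = [mean(row) for row in rolls]
epv = mean(variance(row) for row in rolls)            # average sample variance: 2.178
vhm = variance(risk_means) - epv / Y                  # 0.8057 - 0.1815 = 0.624
k = epv / vhm                                         # about 3.5
z = Y / (Y + k)                                       # about 77.4%
grand_mean = mean(risk_means)                         # 2.942 (equal years, so same as overall mean)
estimates = [z * m + (1 - z) * grand_mean for m in risk_means]
print(round(epv, 3), round(vhm, 4), round(k, 1), round(z, 3))
print([round(e, 2) for e in estimates])               # 2.21, 2.73, ..., 4.60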

18 This is the between variance from analysis of variance.
19 If instead one had used the value of K of 4.67 from the full parametric model, then 12 trials would be assigned a credibility of 12/(12 + 4.67) = 72.0%. In the full parametric model, one would weight the observation with the model overall mean of 3.000. For risk 1, with an observation of 2.000, that would result in an estimate of a future mean die roll of: (72.0%)(2.000) + (28.0%)(3.000) = 2.28.

Here are the estimated mean future die rolls for each of the ten risks:
Risk      Observed Mean    Estimated Future
Number    Die Roll         Mean Die Roll
   1         2.000              2.21
   2         2.667              2.73
   3         2.667              2.73
   4         2.500              2.60
   5         2.750              2.79
   6         2.417              2.54
   7         3.250              3.18
   8         2.333              2.47
   9         3.750              3.57
  10         5.083              4.60

Random Fluctuations:
Note that the estimated value of K of 3.5 differs somewhat from the value of K = 2.1/.45 = 4.67 for
the full parametric multisided die example. The difference is due to a lack of information here; in the
full parametric model we were given or we assumed that there was a 60% chance that the die was
4-sided, 30% chance the die was 6-sided, and 10% chance that the die was 8-sided.
In contrast, here we have made no assumptions about the types of dice or their probabilities. In
other words, we have made no assumptions about the structure parameters that specify the risk
process.
Unfortunately, while this estimate of the EPV is pretty good, that of the VHM is subject to
considerable random fluctuation. A significant difficulty with this technique is that the estimated VHM
can be negative, even though as a variance the VHM cannot actually be negative. In the case of a
negative estimate of the VHM it is generally set equal to zero; this results in no credibility being
given to the data, which makes sense if the actual VHM is very small.
If the estimated VHM ≤ 0, then set Z = 0.
In any case, the random fluctuation in the estimate of the VHM can lead to considerable random
fluctuation in the estimate of K. In this case, when a series of ten different simulations was run, the
estimated values of K ranged from 2.9, to over 100, to negative values (which would correspond to
K being set equal to infinity when the VHM is set equal to zero).
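This instability can be seen directly by rerunning the simulation. Below is a rough Python sketch; it assumes the risk process is a fair n-sided die, with n equal to 4, 6, or 8 with probabilities 60%, 30%, and 10%, as in the full parametric model described above, and the function name and seeds are arbitrary:

import random
from statistics import mean, variance

def estimated_k(n_risks=10, n_years=12, seed=None):
    """Simulate one data grid from the multi-sided die model and return the
    estimated Buhlmann credibility parameter K, or None when the estimated
    VHM is negative (in which case Z would be set to 0)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_risks):
        sides = rng.choices([4, 6, 8], weights=[0.6, 0.3, 0.1])[0]
        rows.append([rng.randint(1, sides) for _ in range(n_years)])
    epv = mean(variance(r) for r in rows)
    vhm = variance([mean(r) for r in rows]) - epv / n_years
    return None if vhm <= 0 else epv / vhm

print([estimated_k(seed=s) for s in range(10)])   # the spread of K is typically very wide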


Explanation of the Correction Term for the VHM:


The formula for estimating the Variance of the Hypothetical Means has a correction term involving the
(estimated) Expected Value of the Process Variance:
VHM = (the sample variance of the class means) - EPV / (# years of data).
The basic reason why we need to subtract EPV/Y is because we are working with the sample
means rather than the hypothetical means. Each sample mean has a variance of:
(Process Variance)/Y, due to random fluctuation.20 Thus the sample variance of the sample means
contains this extra random fluctuation and is too large. We subtract EPV/ Y to correct for this and
make the estimate of the Variance of the Hypothetical Means unbiased.
More formally, using the analysis of variance formula to break Var[X̄i] into two pieces:
Var[X̄i] = Var[E[X̄i | Class i]] + E[Var[X̄i | Class i]] = Var[μi] + E[σi² / Y] = VHM + EPV/Y.
⇒ VHM = Var[X̄i] - EPV/Y.21
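A small simulation illustrates why the correction term is needed. The sketch below again assumes the 60%/30%/10% mix of fair 4-, 6-, and 8-sided dice (for which the Variance of the Hypothetical Means is 0.45); across many simulated grids the uncorrected sample variance of the class means runs roughly EPV/Y (about 0.18) above 0.45, while the corrected estimator does not:

import random
from statistics import mean, variance

rng = random.Random(0)
C, Y, n_sims = 10, 12, 2000
raw, corrected = [], []
for _ in range(n_sims):
    rows = []
    for _ in range(C):
        sides = rng.choices([4, 6, 8], weights=[0.6, 0.3, 0.1])[0]
        rows.append([rng.randint(1, sides) for _ in range(Y)])
    epv_hat = mean(variance(r) for r in rows)
    var_of_means = variance([mean(r) for r in rows])
    raw.append(var_of_means)                      # biased upward by roughly EPV/Y
    corrected.append(var_of_means - epv_hat / Y)  # the estimator used in the text

print(round(mean(raw), 3), round(mean(corrected), 3))   # roughly 0.63 versus 0.45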


Assumptions for Nonparametric Empirical Bayes Estimation:
This technique is referred to as nonparametric empirical Bayes estimation, because we have
made no specific assumptions about the structure parameters. In fact, we make some assumptions;
as will be discussed, we implicitly or explicitly assume the usual Bhlmann covariance structure
between the data from years, classes, etc.
Of course, in addition we assumed that the years of data from Louis were drawn from a single
distribution, the years of data from Dewey were drawn from a single distribution, etc. We did not
specify the form of each of these distributions; they could each have a different form or the same
form. The important point is that the whole computation would make no sense if each row of the
data matrix were not a random sample from its own distribution, whatever it is.

20 In general, the variance of an average of independent, identically distributed variables is the variance of a single variable divided by the number of variables: Var[mean] = Var[X]/n.
21 See page 19 in Topics in Credibility by Dean.


The Assumed Covariance Structure:

The Empirical Bayes methods shown here assume the usual Bühlmann covariance structure.22
When the size of risk is not important, for two years of data from a single risk,
the variance-covariance structure between the years of data is as follows:
Cov[Xt, Xu] = σ² δtu + τ², where δtu is 1 for t = u and 0 for t ≠ u.
Cov[X1, X2] = τ² = VHM, and Cov[X1, X1] = Var[X1] = σ² + τ² = EPV + VHM.
Assume one has data from a number of risks over several years. Let Xit be the data (die roll,
frequency, severity, pure premium, etc.) observed for class (or risk) i, in year t.
Then the assumed covariance structure is:
Cov[Xit, Xju] = δij δtu σ² + δij τ².
In other words, two observations from different risks have a covariance of zero,
two observations of the same risk in different years have a covariance of τ², the VHM, while the
observation of a risk in a single year has a variance of τ² + σ², the VHM plus the EPV.
Let M be the overall grand mean, M = E[Xit]. Then we have from the assumed covariance structure
that the expected value of the product of two observations is:
E[Xit Xju] = Cov[Xit, Xju] + E[Xit] E[Xju] = δij δtu σ² + δij τ² + M².
Then for example, for a single risk in a single year: E[Xit²] = σ² + τ² + M².
Let X̄i = (Σt=1 to Y Xit) / Y = average of the data for class i.
Then E[Xit X̄i] = (1/Y) Σu=1 to Y E[Xit Xiu] = (1/Y) Σu=1 to Y (δtu σ² + τ² + M²) = τ² + M² + σ²/Y.

22 This covariance structure is discussed in Mahler's Guide to Buhlmann Credibility. The Bühlmann-Straub covariance structure is assumed when exposures vary by year, as is discussed subsequently.
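The covariance structure itself can be checked by simulation. The following sketch, again assuming the 60%/30%/10% mix of fair 4-, 6-, and 8-sided dice, estimates the covariance between two years of data from the same risk and the variance of a single year of data; the former should come out near the VHM of 0.45 and the latter near EPV + VHM:

import random
from statistics import mean, variance

rng = random.Random(0)
n_risks = 200_000
x1, x2 = [], []
for _ in range(n_risks):
    sides = rng.choices([4, 6, 8], weights=[0.6, 0.3, 0.1])[0]
    x1.append(rng.randint(1, sides))   # year 1 for this risk
    x2.append(rng.randint(1, sides))   # year 2 for the same risk

m1, m2 = mean(x1), mean(x2)
cov_12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2)) / (n_risks - 1)
print(round(cov_12, 3), round(variance(x1), 3))   # near tau^2 = VHM, and near sigma^2 + tau^2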



Also E[X̄i²] = E[X̄i X̄i] = E[Σt=1 to Y Xit X̄i / Y] = (1/Y) Σt=1 to Y (τ² + M² + σ²/Y)
= τ² + M² + σ²/Y = E[Xit X̄i].

An Unbiased Estimate of the EPV:

In order to estimate the EPV, one looks at the squared differences between the observations for a
class and the average for that class:
Let si² = Σt=1 to Y (Xit - X̄i)² / (Y - 1).
Then (Y - 1) E[si²] = Σt=1 to Y E[(Xit - X̄i)²] = Σt=1 to Y {E[Xit²] - 2 E[Xit X̄i] + E[X̄i²]}
= Σt=1 to Y {σ² + τ² + M² - (τ² + M² + σ²/Y)} = σ² (Y - 1).
Thus, E[si²] = σ² (Y - 1) / (Y - 1) = σ².
si², the sample variance, is an unbiased estimator of the process variance.
Then any weighted average of the si² for the various classes would be an unbiased estimator of the
EPV.
In particular, EPV = (1/C) Σi=1 to C si² is an unbiased estimator of the EPV.


An Unbiased Estimate of the VHM:

In order to estimate the VHM, one looks at the squared differences between the average for each
class and the overall average: Σi=1 to C (X̄i - X̄)².
For two different risks, i ≠ j:
E[X̄i X̄j] = E[(1/Y²) Σt=1 to Y Σu=1 to Y Xit Xju] = (1/Y²) Σt Σu E[Xit Xju] = (1/Y²) Σt Σu M² = M².
Therefore, since previously we had that E[X̄i²] = M² + τ² + σ²/Y, we have that:
E[X̄i X̄j] = M² + δij (τ² + σ²/Y).
E[X̄i X̄] = E[(1/C) Σj=1 to C X̄i X̄j] = (1/C) Σj=1 to C E[X̄i X̄j] = (1/C) Σj=1 to C {M² + δij (τ² + σ²/Y)}
= (1/C) {CM² + τ² + σ²/Y} = M² + τ²/C + σ²/(CY).
E[X̄²] = E[X̄ X̄] = E[(1/C) Σi=1 to C X̄i X̄] = (1/C) Σi=1 to C E[X̄i X̄] = (1/C) Σi=1 to C {M² + τ²/C + σ²/(CY)}
= M² + τ²/C + σ²/(CY).
Then E[Σi=1 to C (X̄i - X̄)²] = Σi=1 to C E[(X̄i - X̄)²] = Σi=1 to C {E[X̄i²] - 2 E[X̄i X̄] + E[X̄²]}
= Σi=1 to C {M² + τ² + σ²/Y - (M² + τ²/C + σ²/(CY))} = (τ² + σ²/Y)(C - 1).

Thus, E[Σi=1 to C (X̄i - X̄)² / (C - 1) - σ²/Y] = τ². Therefore if one takes:
EPV = (1/C) Σi=1 to C si², which is an unbiased estimator of σ²,
then an unbiased estimator of the Variance of the Hypothetical Means, τ², is:
VHM = Σi=1 to C (X̄i - X̄)² / (C - 1) - EPV/Y.

Summary of Empirical Bayesian Estimation (no variation in years or exposures):

For this case, Empirical Bayesian Estimation can be summarized as follows:
si² is just the usual sample variance for the data from a single class i:
si² = Σt=1 to Y (Xit - X̄i)² / (Y - 1).
EPV = average of the sample variances for each class
= Σi=1 to C si² / C = Σi=1 to C Σt=1 to Y (Xit - X̄i)² / {C (Y - 1)}.
VHM = (the sample variance of the class means) - EPV / (# years of data)
= Σi=1 to C (X̄i - X̄)² / (C - 1) - EPV/Y.
If the estimated VHM ≤ 0, then set Z = 0.
Otherwise, as usual, K = EPV / VHM, and Z = (# years of data) / (# years of data + K).23

23 More generally, in the absence of exposures, the amount of data is the number of observations or the number of columns in the data grid.
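The summary above translates directly into a small reusable function. Below is a hedged Python sketch implementing exactly these formulas (equal years of data, one exposure per cell); the function name and the tiny two-class data set at the end are made up purely for illustration:

from statistics import mean, variance

def buhlmann_empirical_bayes(data):
    """data: a list of classes, each a list of Y observations (equal Y, one exposure each).
    Returns (EPV, VHM, K, Z) per the summary above; Z is 0 when the estimated VHM <= 0."""
    Y = len(data[0])
    class_means = [mean(row) for row in data]
    epv = mean(variance(row) for row in data)
    vhm = variance(class_means) - epv / Y
    if vhm <= 0:
        return epv, vhm, None, 0.0
    k = epv / vhm
    return epv, vhm, k, Y / (Y + k)

# A made-up two-class, three-year example, purely to show the mechanics:
print(buhlmann_empirical_bayes([[18, 20, 22], [12, 14, 10]]))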


Problems:
2.1 (2 points) You are given the following information about 3 groups over 2 years:
Group   Loss in Year 1   Loss in Year 2
  1           18               20
  2           25               30
  3           16               10
What is the credibility assigned to two years of data?
(A) 70%
(B) 80%
(C) 90%
(D) 95%
(E) 99%

2.2 (3 points) An insurer has data on losses for seven policyholders for ten years. Xij is the loss
from the ith policyholder for year j. You are given:
X̄1 = 13.1.
X̄ = 10.3.
Σi=1 to 7 Σj=1 to 10 (Xij - X̄i)² = 253.
Σi=1 to 7 (X̄i - X̄)² = 17.

Estimate the losses for policyholder #1 over the next year.


(A) Less than 11.0
(B) At least 11.0, but less than 11.5
(C) At least 11.5, but less than 12.0
(D) At least 12.0, but less than 12.5
(E) At least 12.5
2.3 (3 points) Use the following information:

Phil and Sylvia are competitors in the light bulb business.


You test 100 bulbs from each of them.
Let Xi be the lifetime for light bulb i.
For Phil's 100 light bulbs: ΣXi = 85,217 and ΣXi² = 82,239,500.
For Sylvia's 100 light bulbs: ΣXi = 90,167 and ΣXi² = 90,372,400.
Use Nonparametric Empirical Bayes estimation in order to estimate the average lifetime of a
randomly selected light bulb from Phil.
A. 860
B. 865
C. 870
D. 875
E. 880


Use the following information for the next two questions:


Hi Blumfield has just moved cross country to Massachusetts, in order to take a new job. During His
first week, His coworker Wild Bill Brown drove Hi to and from work. His car finally arrived from
California over the next weekend. So during his second week, Hi drove his own car to and from
work. His two way commute times were in minutes:
Day          Riding With Bill   Driving His Own Car
Monday             45                   53
Tuesday            46                   50
Wednesday          39                   51
Thursday           42                   50
Friday             43                   46
2.4 (3 points) Using nonparametric Empirical Bayes Estimation, what is Bhlmanns credibility
parameter, K?
A. Less than 0.5
B. At least 0.5 but less than 1.0
C. At least 1.0 but less than 1.5
D. At least 1.5 but less than 2.0
E. At least 2.0
2.5 (1 point) In his third week, Hi will again drive his own car into work.
Using nonparametric Empirical Bayes Estimation, predict His average commute time during his third
week.
A. Less than 49.1
B. At least 49.1 but less than 49.3
C. At least 49.3 but less than 49.5
D. At least 49.5 but less than 49.7
E. At least 49.7
2.6 (3 points) Medical inflation is 5% per year. You have the following data for health insurance claim
costs for two insureds for two years.
Policyholder   2000   2001
Gilbert         500    200
Sullivan        400    900
Estimate Gilbert's claim costs during 2002, using Nonparametric Empirical Bayes estimation.
A. Less than 490
B. At least 490 but less than 490
C. At least 500 but less than 510
D. At least 510 but less than 520
E. At least 520


Use the following information to answer the next four questions:


For each of 5 employees for each of 6 years, displayed below are the dollars of health insurance
claims made:
          Year 1   Year 2   Year 3   Year 4   Year 5   Year 6     Mean    Sample Variance
Alice       138       73      320      102      782      270     280.83        69,679
Asok        112      103      129       93      104      171     118.67           802
Dilbert     135      155      121      123       77      139     125.00           704
Howard       91      206      109      211      116       81     135.67         3,341
Wally       467      133      265      193      118       89     210.83        19,679

Assume that each year of data has been brought to the current cost level.
2.7 (1 point) Use nonparametric Empirical Bayes techniques to estimate the Expected Value of the
Process Variance.
A. Less than 18,000
B. At least 18,000 but less than 19,000
C. At least 19,000 but less than 20,000
D. At least 20,000 but less than 21,000
E. At least 21,000
2.8 (2 points) Use nonparametric Empirical Bayes techniques to estimate the Variance of the
Hypothetical Means.
A. Less than 1500
B. At least 1500 but less than 1600
C. At least 1600 but less than 1700
D. At least 1700 but less than 1800
E. At least 1800
2.9 (1 point) Using the results of the previous questions, what is the credibility given to the
observed data from Wally?
A. Less than 30%
B. At least 30% but less than 35%
C. At least 35% but less than 40%
D. At least 40% but less than 45%
E. At least 45%
2.10 (1 point) What is the estimated future annual cost for Howard?
A. Less than 150
B. At least 150 but less than 155
C. At least 155 but less than 160
D. At least 160 but less than 165
E. At least 165


Use for the next three questions the past claims data on a portfolio of policyholders:
                    Year
Policyholder    1     2     3
     1         85    50    65
     2         60    55    70
     3         80    95    75
2.11 (2 points) What is the Bhlmann credibility premium for policyholder 1 for year 4?
A. Less than 62
B. At least 62 but less than 67
C. At least 67 but less than 72
D. At least 72 but less than 77
E. At least 77
2.12 (2 points) What is the Bhlmann credibility premium for policyholder 2 for year 4?
A. Less than 62
B. At least 62 but less than 67
C. At least 67 but less than 72
D. At least 72 but less than 77
E. At least 77
2.13 (2 points) What is the Bhlmann credibility premium for policyholder 3 for year 4?
A. Less than 62
B. At least 62 but less than 67
C. At least 67 but less than 72
D. At least 72 but less than 77
E. At least 77

2.14 (3 points) Survival times are available for six insureds, three from Class A and three from
Class B. The three from Class A died at times t = 17, t = 32, and t = 39.
The three from Class B died at times t = 12, t = 20, and t = 24.
Nonparametric Empirical Bayes estimation is used to estimate the mean survival time for each class.
Unbiased estimators of the expected value of the process variance and the variance of the
hypothetical means are used.
Estimate Z, the Bhlmann credibility factor.
(A) 0
(B) 1/4
(C) 1/2
(D) 3/4
(E) 1


2.15 (2 points) Two insureds produced the following losses over a three-year period:
                 Annual Losses
Insured   Year 1   Year 2   Year 3
   1        25       25       25
   2        27       24       33
Using the nonparametric empirical Bayes method, determine the Bhlmann credibility
premium for Insured 1.
A. 25.4
B. 25.6
C. 25.8
D. 26.0
E. 26.2
2.16 (3 points) Two vehicles were selected at random from a population and the following claim
counts were observed:
           Number of Claims during Year
Vehicle   Year 1   Year 2   Year 3   Year 4
   1         1        0        0        0
   2         1        3        0        1
Use empirical Bayesian estimation procedures to calculate the credibility weighted estimate of the
annual claims frequency for the first vehicle.
A. 0.48
B. 0.50
C. 0.52
D. 0.54
E. 0.56
2.17 (2 points) Two medium-sized insurance policies produced the following losses over a three-year period:
                 Annual Losses
Insured   Year 1   Year 2   Year 3
   1        25       34       23
   2        15       26       17
Assuming that the annual exposures for each policy are equal and remain constant through time, use
empirical Bayesian estimation procedures to calculate the credibility assigned to the data from the
first insured.
A. 50%
B. 55%
C. 60%
D. 65%
E. 70%
2.18 (3 points) Lucky Tom finds coins on his walk to work. He can take one of two routes to work.
The last three times he took the first route he found coins worth: 123, 89, and 101.
The last three times he took the second route he found coins worth: 80, 112, and 67.
Estimate the worth of the coins he finds tomorrow if Lucky Tom takes the second route to work
tomorrow, using Nonparametric Empirical Bayes estimation.
A. Less than 89
B. At least 89 but less than 91
C. At least 91 but less than 93
D. At least 93 but less than 95
E. At least 95


2.19 (4 points) At Hogwarts School of Witchcraft and Wizardry, the number of points in the annual
competition for the house cup has been over the last three years:
House         Year 1   Year 2   Year 3
Gryffindor      421      450      428
Hufflepuff      442      403      382
Ravenclaw       405      398      456
Slytherin       461      487      475
Estimate Slytherin's points in Year 4, using Nonparametric Empirical Bayes estimation.
A. 445
B. 450
C. 455
D. 460
E. 465
Use the following information for the next two questions:
Cookie Monster is testing brands of chocolate chip cookies.
Being diligent, he eats 1000 cookies from each of four brands:
Trollhouse, Little Glorias, Chip Off The Old Block, and Hello Mr. Chips.
Let Xi be the number of chips for cookie i, i = 1 to 1000.
For Trollhouse: ΣXi = 10,211 and ΣXi² = 114,681.
For Little Glorias: ΣXi = 10,593 and ΣXi² = 123,397.
For Chip Off The Old Block: ΣXi = 10,363 and ΣXi² = 117,531.
For Hello Mr. Chips: ΣXi = 11,312 and ΣXi² = 139,382.
2.20 (2 points) What is the estimated Expected Value of the Process Variance?
A. 10.6
B. 10.8
C. 11.0
D. 11.2
E. 11.4
2.21 (3 points) Use Nonparametric Empirical Bayes estimation in order to estimate the average
number of chips per cookie for Hello Mr. Chips.
A. 11.0
B. 11.1
C. 11.2
D. 11.3
E. 11.4

2.22 (3 points) For each of 5 years, we have the number of claims for each of 2 insureds:
Insured   Year 1   Year 2   Year 3   Year 4   Year 5
   A         0        0        1        0        0
   B         1        0        0        0        2

Estimate the number of claims for insured B over the next 5 years, using Nonparametric Empirical
Bayes estimation.
A. 2.0
B. 2.2
C. 2.4
D. 2.6
E. 2.8


2.23 (3 points) Two insureds produced the following losses over a three-year period:
                    Annual Losses
Insured   Year 1    Year 2    Year 3
   1      $3500     $3100     $3300
   2      $2700     $3400     $2900
Inflation is 5% per year.
Using the nonparametric empirical Bayes method, estimate the losses in year 5 for Insured 2.
A. Less than 3600
B. At least 3600 but less than 3610
C. At least 3610 but less than 3620
D. At least 3620 but less than 3630
E. At least 3630
2.24 (3 points) Two insureds produced losses over a three-year period:
                 Annual Losses
Insured   Year 1   Year 2   Year 3
   A        38       29       41
   B        ??       ??       ??
For insured B, the losses in Year 2 were 8 more than in Year 1.
For insured B, the losses in Year 3 were 2 more than in Year 2.
In Year 3, the losses for Insured B are greater than those for Insured A.
Using nonparametric empirical Bayes estimation, the Bhlmann credibility factor for an individual
policyholder is 10.68%.
What were the losses observed for insured B in Year 1?
A. 33
B. 34
C. 35
D. 36
E. 37
2.25 (4B, 11/93, Q.25) (1 point) You are given the following:
A random sample of losses taken from policy year 1992 has sample variance, s2 = 16.
The losses are sorted into 3 classes, A, B, and C, of equal size.
The sample variances for each of the classes are:
sA2 = 4

sB2 = 5

sC2 = 6

Estimate the variance of the hypothetical means.


A. Less than 4
B. At least 4, but less than 8
C. At least 8, but less than 12
D. At least 12
E. Not enough information to estimate


2.26 (Course 4 Sample Exam 2000, Q.31) You wish to determine the relationship between
sales (Y) and the number of radio advertisements broadcast (X).
Data collected on four consecutive days is shown below.
Day   Sales   Number of Radio Advertisements
 1      10                  2
 2      20                  2
 3      30                  3
 4      40                  3
Using the method of least squares, you determine the estimated regression line:
Ŷ = -25 + 20X
You perform an Empirical Bayes nonparametric credibility analysis by treating the first two days, on
which two radio advertisements were broadcast, as one group, and the last two days, on which
three radio advertisements were broadcast, as another group.
Determine the estimated credibility, Z, of the data from each group.
2.27 (4, 5/00, Q.15 and 4, 11/02, Q.11 & 2009 Sample Q. 38) (2.5 points)
An insurer has data on losses for four policyholders for seven years. Xij is the loss from the ith
policyholder for year j.
You are given:
Σi=1 to 4 Σj=1 to 7 (Xij - X̄i)² = 33.60
Σi=1 to 4 (X̄i - X̄)² = 3.30

Calculate the Bhlmann credibility factor for an individual policyholder using nonparametric empirical
Bayes estimation.
(A) Less than 0.74
(B) At least 0.74, but less than 0.77
(C) At least 0.77, but less than 0.80
(D) At least 0.80, but less than 0.83
(E) At least 0.83


2.28 (4, 11/00, Q.16) (2.5 points) Survival times are available for four insureds, two from Class A
and two from Class B.
The two from Class A died at times t = 1 and t = 9.
The two from Class B died at times t = 2 and t = 4.
Nonparametric Empirical Bayes estimation is used to estimate the mean survival time for each class.
Unbiased estimators of the expected value of the process variance and the variance of the
hypothetical means are used.
Estimate Z, the Bhlmann credibility factor.
(A) 0
(B) 2/19
(C) 4/21
(D) 8/25
(E) 1
2.29 (4, 11/03, Q.15 & 2009 Sample Q.12) (2.5 points)
You are given total claims for two policyholders:
                      Year
Policyholder    1     2     3     4
     X        730   800   650   700
     Y        655   650   625   750
Using the nonparametric empirical Bayes method, determine the Bhlmann credibility
premium for Policyholder Y.
(A) 655
(B) 670
(C) 687
(D) 703
(E) 719
2.30 (4, 11/06, Q.27 & 2009 Sample Q.270) (2.9 points)
Three individual policyholders have the following claim amounts over four years:
Policyholder   Year 1   Year 2   Year 3   Year 4
     X            2        3        3        4
     Y            5        5        4        6
     Z            5        5        3        3
Using the nonparametric empirical Bayes procedure, calculate the estimated variance of the
hypothetical means.
(A) Less than 0.40
(B) At least 0.40, but less than 0.60
(C) At least 0.60, but less than 0.80
(D) At least 0.80, but less than 1.00
(E) At least 1.00


2.31 (4, 5/07, Q.11) (2.5 points)


Three policyholders have the following claims experience over three months:
Policyholder   Month 1   Month 2   Month 3   Mean   Variance
     I             4         6         5       5        1
    II             8        11         8       9        3
   III             5         7         6       6        1
Nonparametric empirical Bayes estimation is used to estimate the credibility premium in Month 4.
Calculate the credibility factor Z.
(A) 0.57
(B) 0.68
(C) 0.80
(D) 0.87
(E) 0.95


Solutions to Problems:
2.1. C. EPV = average of the sample variances for each class = 10.833.
VHM = Sample Variance of the Means - EPV/(# Years) = 53.083 - 10.833/2 = 47.667.
Group   Loss in Year 1   Loss in Year 2    Mean   Sample Variance
  1           18               20          19.0         2.0
  2           25               30          27.5        12.5
  3           16               10          13.0        18.0
Average of the means: 19.833.  Sample variance of the means: 53.083.
Average of the sample variances: 10.833.
K = EPV/VHM = 10.833/47.667 = .227. For two years of data, Z = 2/(2 + .227) = 90%.
2.2. E. Estimated EPV = average of the sample variances for each policyholder
= (1/7) Σi=1 to 7 {(1/9) Σj=1 to 10 (Xij - X̄i)²} = (1/7)(1/9)(253) = 4.016.
Estimated VHM = sample variance of the X̄i - EPV/(number of years)
= (1/6) Σi=1 to 7 (X̄i - X̄)² - (4.016/10) = 17/6 - 0.4016 = 2.432.
K = EPV / VHM = 4.016 / 2.432 = 1.651.
With 10 years of data, Z = 10/(10 + K) = 10/11.651 = 85.8%.
The estimate for policyholder #1 is: Z(observed mean for #1) + (1 - Z)(observed overall mean)
= (0.858) X̄1 + (1 - 0.858) X̄ = (0.858)(13.1) + (0.142)(10.3) = 12.7.
2.3. C. The mean for Phil is 852.17. The second moment for Phil is 822,395.
The sample variance for Phil is: (100/99)(822,395 - 852.172 ) = 97,173.
The mean for Sylvia is 901.67. The second moment for Sylvia is 903,724.
The sample variance for Sylvia is: (100/99)(903,724 - 901.672 ) = 91,632.
EPV = (97,173 + 91,632)/2 = 94,403.
Overall Mean = (85217 + 90167)/200 = 876.92.
Sample Variance of their means is: {(852.17 - 876.92)2 + (901.67 - 876.92)2 } / (2 -1) = 1225.1.
Estimated VHM = 1225.1 - 94,403/100 = 281.1.
K = EPV/VHM = 94,403/281.1 = 335.8. Z = 100/(100 + 335.8) = 22.9%.
Estimated lifetime for Phils bulbs is: (0.229)(852.17) + (0.771)(876.92) = 871.3.
Comment: Estimate for Sylvias bulbs is: (0.229)(901.67) + (0.771)(876.92) = 882.6.

2013-4-12

Empirical Bayesian Cred. 2 No Varying Exposures, HCM 10/22/12, Page 31

2.4. A. Bill has a mean time of (45 + 46 + 39 + 42 + 43)/5 = 43.


Bill has a sample variance of:
{(45 - 43)2 + (46 - 43)2 + (39 - 43)2 + (42 - 43)2 + (43 - 43)2 } / (5 - 1) =7.5.
Hi has a mean time of (53 + 50 + 51 + 50 + 46)/5 = 50.
Hi has a sample variance of:
{(53 - 50)2 + (50 - 50)2 + (51 - 50)2 + (50 - 50)2 + (46 - 50)2 } / (5 - 1) =6.5.
The mean time for both weeks is: (43 + 50)/2 = 46.5.
The estimated EPV = (7.5 + 6.5)/2 = 7. The estimated VHM is:
{(43 - 46.5)2 + (50 - 46.5)2 } / (2 - 1) - EPV/5 = 23.1. K = 7/23.1 = 0.30.
2.5. E. Z = 5/5.30 = 94.3%.
The estimate of His future commute time is:
(0.943)(50) + (1 - 0.943)(46.5) = 49.8 minutes.
Comment: The concept is that part of the variation in commute times is random; i.e., due to
variations in the day of the week, the time of day, weather, delays, etc. Only a portion of the variation
is due to who is driving and which car they are driving. Thus we supplement the limited data from Hi
driving himself, with some data from Bill over the same route.
Note that the estimate could be rewritten as:
(0.943)(50) + (1 - 0.943)(50 + 43)/2 = (0.943 + 0.057/2)(50) + (0.057/2)(43) =
(97.1%)(His mean) + (2.9%)(Bills mean).
We give more weight to His mean because we assume it is more relevant to predicting the future
commuting time when Hi drives, (and the two data sources have the same volume of data.)
If instead, Bill were driving Hi to work during the third week, then the estimate would be:
(0.943)(43) + (1 - 0.943)(46.5) = 43.2.
2.6. E. First put all of the costs on the year 2002 level by adjusting for inflation.
For example: (1.052 )(500) = 551.
Policyholder

2000

2001

Mean

Sample
Variance

Gilbert
Sullivan

551
441

210
945

380
693

58,140
127,008

537
48828

92,574

Average
Sample Variance

EPV = 92,574. Estimated VHM = 48828 - (92574/2) = 2541.


K = 92,574/2541 = 36.4. Z = 2/(2 + 36.4) = 5.2%.
Estimate for Gilbert is: (380)(0.052) + (537)(0.948) = 529.


2.7. B. One takes the average of the given sample variances for each individual;
the estimate of the EPV is: 18,841.
Alice
Asok
Dilbert
Howard
Wally

Mean

Sample Variance

138
112
135
91
467

73
103
155
206
133

320
129
121
109
265

102
93
123
211
193

782
104
77
116
118

270
171
139
81
89

280.83
118.67
125.00
135.67
210.83

69,679
802
704
3,341
19,679

174.20
4925.52

18,841

Mean
Sample Variance

2.8. D. One takes the sample variance of the 5 individual means:


{(281-174)2 + (119-174)2 + (125-174)2 + (136-174)2 + (211-174)2 } / (5 - 1) = 4926.
The estimate of the VHM is then: 4926 - EPV/(# of years) = 4926 - 18841/6 = 1786.
2.9. C. K = EPV/VHM = 18841/1786 = 10.5. For 6 years of data Z = 6/(6 + 10.5) = 36.4%.
Comment: Since they each have 6 years of data, each individuals data is given the same credibility.
(There are not differing numbers of exposures per year.)
2.10. D. The estimate of Howards future annual costs is:
(135.67)(0.364) + (174.20)(1 - 0.364) = 160.
Comment: The estimates of each individuals future annual costs are:
Mean

Estimate

Alice
Asok
Dilbert
Howard
Wally

280.83
118.67
125.00
135.67
210.83

213
154
156
160
188

Mean

174.20

174

2.11. C. EPV = 158.33. VHM = 128.70 - (158.33/3) = 75.9.


Policyholder

Year
2

1
2
3

85
60
80

50
55
95

Average
Sample Variance

Mean

Sample
Variance

65
70
75

66.67
61.67
83.33

308.33
58.33
108.33

70.56
128.70

158.33

K = 158.33/75.9 = 2.09. Z = 3/(3 + 2.09) = 59.0%.


The mean for policyholder 1 is 66.67 and the overall mean is 70.56.
Thus the estimate for policyholder 1 for year 4 is: (0.590)(66.67) + (0.410)(70.56) = 68.26.


2.12. B. The estimate for policyholder 2 for year 4 is:


(0.590)(61.67) + (0.410)(70.56) = 65.31.
2.13. E. The estimate for policyholder 3 for year 4 is:
(0.590)(83.33) + (0.410)(70.56) = 78.09.
2.14. C. EPV = average of the sample variances for each class = 81.833.
VHM = Sample Variance of the Means - EPV/(# Years) = 56.889 - 81.833/3 = 29.611.
Class

First
Survival Time

Second
Survival Time

Third
Survival Time

Mean

Sample
Variance

A
B

17
12

32
20

39
24

29.333
18.667

126.333
37.333

24.000
56.889

81.833

Average
Sample Variance

Estimated K = 81.833/29.611 = 2.76. Z = 3 / (3 + 2.76) = 0.521.


Comment: Similar to 4, 11/00, Q.16.
2.15. E. Insured #1 has sample mean of 25, and sample variance of 0.
Insured #2 has sample mean of 28, and sample variance of 21.
EPV = (0 + 21)/2 = 10.5.
VHM = {(25 - 26.5)2 + (28 - 26.5)2 }/(2 - 1) - EPV/3 = 1.
K = EPV/VHM = 10.5. Z = 3/(3 + K) = 22.2%.
Estimate for Insured #1 is: (22.2%)(25) + (1 - 22.2%)(26.5) = 26.17.
2.16. A. EPV = (0.25 + 1.583)/2 = 0.917.

A
B
Mean
Sample Variance

Mean

1
1

0
3

0
0

0
1

0.250
1.250
0.750
0.500

Sample
Variance
0.250
1.583
0.917

VHM = (the sample variance of the means) - EPV / (# years of data) = 0.500 - 0.917/4 = 0.271.
K = EPV/VHM = 0.917/0.271 = 3.38. Z = 4 / (4 + K) = 0.542.
Estimated frequency for first vehicle: (0.25)(0.542) + (1 - 0.542)(0.75) = 0.479.
Comment: Similar to Exercise 12 in Topics in Credibility by Dean.


2.17. D. EPV = (34.333 + 34.333)/2 = 34.333.

A
B

Mean

25
15

34
26

23
17

27.333
19.333

Mean
Sample Variance

23.333
32.000

Sample
Variance
34.333
34.333
34.333

VHM = (the sample variance of the means) - EPV / (# years of data) = 32 - 34.333/3 = 20.56.
K = EPV/VHM = 34.33/20.56 = 1.67. Z = 3/(3 + K) = 0.642.
Comment: Similar to Exercise 13 in Topics in Credibility by Dean.
The credibility assigned to the data from each insured is the same.
Since the exposures are the same for each policy and for each year, we can use the formulas that do
not involve exposures.
2.18. D. EPV = 416.83. Estimated VHM = 162 - (416.83/3) = 23.06.
Route

Year
2

1
2

123
80

89
112

Mean

Sample
Variance

101
67

104.33
86.33

297.33
536.33

95.33
162.00

416.83

Average
Sample Variance

K = 416.83/23.06 = 18.1. Z = 3/(3 + 18.1) = 14.2%.


The estimate for route 1 is: (0.142)(86.33) + (0.858)(95.33) = 94.05.
2.19. E. EPV = average of the sample variances = 581.9.
VHM = sample variance of the means - EPV/(# years ) = 819.4 - 581.9/3 = 625.4.
House

Year
2

Mean

Sample
Variance

Gryffindor
Hufflepuff
Ravenclaw
Slytherin

421
442
405
461

450
403
398
487

428
382
456
475

433.0
409.0
419.7
474.3

229.0
927.0
1002.3
169.3

434.0
819.4

581.9

Average
Sample Variance

K = EPV/VHM = 581.9/625.4 = 0.93.


For three years of data, Z = 3/(3 + 0.93) = 76.3%.
The observed annual points for Slytherin is 474.3 and the overall mean is 434.0.
The estimated future annual points for Slytherin is:
(76.3%)(474.3) + (1 - 76.3%)(434.0) = 464.7.


2.20. B. For Trollhouse, the mean is 10.211, and the second moment is 114.681.
The sample variance is: (1000/999)(114.681 - 10.2112 ) = 10.43.
For Little Glorias, the mean is 10.593, and the second moment is 123.397.
The sample variance is: (1000/999)(123.397 - 10.5932 ) = 11.20.
For Chip Off The Old Block, the mean is 10.363, and the second moment is 117,531.
The sample variance is: (1000/999)(117.531 - 10.3632 ) = 10.15.
For Hello Mr. Chips, the mean is 11.312, and the second moment is 139.382.
The sample variance is: (1000/999)(139.382 - 11.3122 ) = 11.43.
Estimated EPV = (10.43 + 11.20 + 10.15 + 11.43)/4 = 10.80.
2.21. D. Overall Mean = (10.211 + 10.593 + 10.363 + 11.312)/4 = 10.620.
Sample Variance of their means is:
{(10.211 - 10.62)2 + (10.593 - 10.62)2 + (10.363 - 10.62)2 + (11.312 - 10.62)2 } / (4 -1) = 0.240.
Estimated VHM = 0.240 - 10.80/1000 = 0.229.
K = EPV/VHM = 10.80/0.229 = 47.2. Z = 1000/(1000 + 47.2) = 95.5%.
Estimated average number of chips per cookie for Hello Mr. Chips is:
(0.955)(11.312) + (0.045)(10.62) = 11.28.
2.22. A. EPV = (0.2 + 0.8)/2 = 0.5.

A
B
Mean
Sample Variance

Mean

0
1

0
0

1
0

0
0

0
2

0.20
0.60
0.400
0.080

Sample
Variance
0.20
0.80
0.500

VHM = (the sample variance of the means) - EPV / (# years of data) = 0.08 - 0.5/5 = -0.02.
Since the estimated VHM is negative, set Z = 0, equivalent to assuming the VHM is extremely
small. Overall mean = 4/10 = 0.4.
Estimated frequency for insured B: (0)(3/5) + (1 - 0)(0.4) = 0.4. (5)(0.4) = 2.0.


2.23. C. First, inflate all of the losses up to the year 5 level. For example, (1.054 )(3500) = 4254.
Insured

Year
2

1
2

4254
3282

3589
3936

Mean

Sample
Variance

3638
3197

3827
3472

137,502
163,432

3649
63,145

150,467

Average
Sample Variance

Estimated EPV = 150,467. Estimated VHM = 63,145 - (150,467/3) = 12,989.


K = 150,467/12,989 = 11.6. Z = 3 / (3 + 11.6) = 20.5%.
The estimate for insured 2 is: (0.205)(3472) + (1 - 0.205)(3649) = 3613.
2.24. C. 0.1068 = Z = 3 / (3 + K). K = 25.09.
Let x be the losses observed for insured B in Year 1.
For Insured A, mean = 36, sample variance = (22 + 72 + 52 )/2 = 39.
For Insured B, mean = (x + x + 8 + x + 10)/3 = x + 6, sample variance = (62 + 22 + 42 )/2 = 28.
EPV = average of the sample variances for each insured = (39 + 28)/2 = 33.5.
25.09 = K = EPV/VHM. VHM = 33.5/25.09 = 1.335.
1.335 = VHM = Sample Variance of the Means - EPV/(# Years).

Sample Variance of the Means = 1.335 + 33.5/3 = 12.50.


Overall mean = (36 + x + 6)/2 = 21 + x/2.
12.50 = Sample Variance of the Means = (15 - x/2)2 + (x/2 - 15)2 .

(x/2 - 15)2 = 6.25. x/2 = 15 2.5. x = 35 or 25.


However, we are told that in Year 3, the losses for Insured B are greater than those for Insured A.
35 + 10 = 45 > 41. 25 + 10 = 35 < 41.
Therefore, the losses in Year 1 for Insured B are 35.
Comment: Here is the calculation using the missing inputs:
Insured

Loss in
Year 1

Loss in
Year 2

Loss in
Year 3

Mean

Sample
Variance

A
B

38
35

29
43

41
45

36
41

39
28

38.500
12.500

33.500

Average
Sample Variance

EPV = 33.5. VHM = 12.5 - 33.5/3 = 1.333.


K = EPV/VHM = 33.5/1.333 = 25.1. For three years of data, Z = 3 / (3 + 25.1) = 10.68%.


2.25. C. Variance of the Hypothetical Means = Total Variance - EPV


= 16 - (4 + 5 + 6) / 3 = 16 - 5 = 11.
Comment: The estimated EPV is the weighted average of the individual estimated process
variances; in this case the classes are equally likely so they each have probability 1/3.
The estimated total variance is the sample variance of the losses, 16.
2.26. Group A has a mean sales of (10 + 20)/2 = 15.
Group A has a sample variance of: {(10 - 15)2 + (20 - 15)2 } / (2 - 1) = 50.
Group B has a mean sales of (30 + 40)/2 = 35.
Group B has a sample variance of: {(30 - 35)2 + (40 - 35)2 } / (2 - 1) = 50.
Group

Mean

Sample Variance

A
B

10
30

20
40

15
35

50
50

25
200

50

Mean
Sample Variance

The estimated EPV = (50 + 50)/2 = 50.


The mean for both groups is: (15+35)/2 = 25. The estimated VHM is:
{(15 - 25)2 +(35 - 25)2 } /(2 - 1) - EPV/2 = 200 - 50/2 = 175.
K = EPV / VHM = 50/175 = 2/7. For two days of data, Z = 2/(2 + 2/7) = 7/8.
Comment: This question is not treating each day as if it had a different number of exposures.
In fact once you divide the days into two groups, one ignores the number of advertisements per
day. The estimate of future sales on days with two advertisements (Group A) is:
(7/8)(15) + (1/8)(25) = 16.25. The estimate of future sales on days with three advertisements
(Group B) is: (7/8)(35) + (1/8)(25) = 33.25.
2.27. D. Estimated EPV = average of the sample variances for each policyholder
= (1/4) Σi=1 to 4 {(1/6) Σj=1 to 7 (Xij - X̄i)²} = (1/4)(1/6)(33.60) = 1.4.
Estimated VHM = (sample variance of the X̄i) - EPV / (number of years)
= (1/3) Σi=1 to 4 (X̄i - X̄)² - (1.4/7) = (3.30/3) - 0.2 = 0.9.
K = EPV / VHM = 1.4/0.9 = 1.556. With 7 years of data,
Bühlmann credibility factor = Z = 7 / (7 + K) = 7/8.556 = 81.8%.


2.28. A. EPV = average of the sample variances for each class = 17.
VHM = Sample Variance of the Means - EPV/(# Years) = 2 - 17/2 = -6.5.
Class

First
Survival Time

Second
Survival Time

Mean

Sample
Variance

A
B

1
2

9
4

5
3

32
2

4.000
2.000

17.000

Average
Sample Variance

Since the estimate of the VHM is negative, one treats it as if VHM = 0, which implies zero credibility.
Z = 0.
Comment: Since Z = 0, the estimated survival time for each class is equal to the overall mean of 4.
The Variance of Hypothetical Means is never negative, just as with any other variance.
2.29. C. EPV = average of the sample variances for each class = 3475.
VHM = Sample Variance of the Means - EPV/(# Years) = 1250 - 3475/4 = 381.
Policyholder

Year
2

Mean

Sample
Variance

X
Y

730
655

800
650

650
625

700
750

720
670

3,933
3,017

695
1250

3,475

Average
Sample Variance

K = EPV/VHM = 3475/381 = 9.1. For four years of data, Z = 4 / (4 + 9.1) = 30.5%.


Estimate for Policyholder Y is: (0.305)(670) + (1 - 0.305)(695) = 687.
Comment: The answer must be between the mean for policyholder Y of 670 and the overall mean
of 695. Thus only choice C and perhaps choice B are possible.
2.30. C. X1 = 3. Sample Variance for X is: {(2 - 3)2 + (3 - 3)2 + (3 - 3)2 + (4 - 3)2 } / (4 - 1) = 2/3.
X2 = 5. Sample Variance for Y is: {(5 - 5)2 + (5 - 5)2 + (4 - 5)2 + (6 - 5)2 } / (4 - 1) = 2/3.
X 3 = 4. Sample Variance for Z is: {(5 - 4)2 + (5 - 4)2 + (3 - 4)2 + (3 - 4)2 } /( 4 - 1) = 4/3.
Estimated EPV = (2/3 + 2/3 + 4/3) / 3 = 8/9. X = (3 + 5 + 4)/3 = 4.
Estimated VHM = {(3 - 4)2 + (5 - 4)2 + (4 - 4)2 } / (3 - 1) - EPV/4 = 1 - 2/9 = 7/9 = 0.778.
Comment: K = EPV/VHM = (8/9)/(7/9) = 8/7. Z = 4/(4 + K) = 7/9.
2.31. D. Estimated EPV = (1 + 3 + 1)/3 = 5/3.
Overall Mean is: (5 + 9 + 6)/3 = 20/3.
Estimated VHM = {(5 - 20/3)2 + (9 - 20/3)2 + (6 - 20/3)2 } / (3 - 1) - (5/3)/3 = 3.778.
K = EPV/VHM = (5/3)/3.778 = 0.441. Z = 3 / (3 + K) = 87.2%.
Comment: They gave you the sample mean and sample variance for each policyholder.


Section 3, Differing Numbers of Years, No Varying Exposures


The method can be generalized in order to deal with differing numbers of years of data.24
Rather than take an average of each insured's sample variance, one takes a weighted average in
order to estimate the EPV. One uses weights equal to the number of degrees of freedom for each
class, that is the number of years of data for that class minus 1.
Yi = the number of years of data for class i.
si² = Σt=1 to Yi (Xit - X̄i)² / (Yi - 1) = the usual sample variance for the data from a single class i.25
EPV = weighted average of these sample variances
= Σi=1 to C (Yi - 1) si² / Σi=1 to C (Yi - 1) = Σi=1 to C Σt=1 to Yi (Xit - X̄i)² / Σi=1 to C (Yi - 1).
Then besides using the altered estimate of the EPV, the estimate of the VHM is more complicated:
VHM = {Σi=1 to C Yi (X̄i - X̄)² - (C - 1) EPV} / {Σi=1 to C Yi - (Σi=1 to C Yi²) / (Σi=1 to C Yi)},
where X̄i = Σt=1 to Yi Xit / Yi = average of the data for class i, and
X̄ = Σi=1 to C Σt=1 to Yi Xit / Σi=1 to C Yi = overall average.

24 These formulas are special cases of the formulas presented in the next sections, that deal with situations with varying exposures. In those more general formulas, set the number of exposures equal to 1 whenever the insured has data for that year, in order to get the formulas shown here.
25 Or from a single insured i, depending on the data set being analyzed.


Exercise: Verify that the above equation for the VHM reduces to the prior one when each class has
the same number of years of data Y.
[Solution: Σ Yi - (Σ Yi²)/(Σ Yi) = CY - CY²/(CY) = Y(C - 1).
VHM = {Σ Yi (X̄i - X̄)² - (C - 1)EPV} / {Y(C - 1)} = {Y Σ (X̄i - X̄)² - (C - 1)EPV} / {Y(C - 1)}
= Σ (X̄i - X̄)² / (C - 1) - EPV/Y
= (sample variance of the class means) - EPV / (# of years).]


Exercise: There are 3 drivers in a particular rating class. For each year in which they were licensed to
drive, we have the number of claims for each driver. Hugh was licensed for the last 3 years, Dewey
was licensed for the last 5 years, while Louis was licensed for the last 4 years.
Use nonparametric empirical Bayesian estimation to estimate the EPV.

          Year 1   Year 2   Year 3   Year 4   Year 5
Hugh                           0        0        0
Dewey        1        0        0        0        0
Louis                 0        2        1        0

[Solution: The EPV = weighted average of the sample variances for each driver, with weights equal
to the number of years minus one:

          Year 1   Year 2   Year 3   Year 4   Year 5    Mean   Sample Variance
Hugh                           0        0        0      0.00       0.0000
Dewey        1        0        0        0        0      0.20       0.2000
Louis                 0        2        1        0      0.75       0.9167

EPV = {(0)(2) + (0.2)(4) + (0.9167)(3)} / (2 + 4 + 3) = 3.55 / 9 = 0.394.
Comment: Note that in computing the numerator of the EPV, one could save a little time by just
computing the numerator of each sample variance. For example, for Louis this would be:
(0 - 0.75)² + (2 - 0.75)² + (1 - 0.75)² + (0 - 0.75)² = 2.75 = (4 - 1)(0.9167).]


Exercise: For this same data, use nonparametric empirical Bayesian estimation to estimate the VHM.
[Solution: Σ Yi - (Σ Yi²)/(Σ Yi) = (3 + 5 + 4) - (3² + 5² + 4²)/(3 + 5 + 4) = 12 - 50/12 = 7.83.
X̄ = overall average of the observed data = 4 claims / 12 years = 1/3.
The means for the three individual drivers are: 0, 0.20, and 0.75.
VHM = {Σ Yi (X̄i - X̄)² - (C - 1)EPV} / 7.83
= {(3)(0 - 0.333)² + (5)(0.20 - 0.333)² + (4)(0.75 - 0.333)² - (2)(0.394)} / 7.83 = 0.042.]
Exercise: For the above data, use nonparametric empirical Bayesian estimation to estimate the
future claim frequency for each driver.
[Solution: K = EPV / VHM = 0.394/0.042 = 9.4.
The numbers of years of data are 3, 5, and 4.
Thus the credibilities are: 3/12.4 = 24%, 5/14.4 = 35%, and 4/13.4 = 30%.
The estimates of future claim frequency are:
Hugh: (24%)(0) + (76%)(0.333) = 0.25.
Dewey: (35%)(0.2) + (65%)(0.333) = 0.29.
Louis: (30%)(0.75) + (70%)(0.333) = 0.46.]
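As a further check on the mechanics, here is a Python sketch of this section's formulas for differing numbers of years (the function name is made up); applied to the Hugh, Dewey, and Louis data above it reproduces, up to rounding, EPV = 0.394, VHM = 0.042, and K = 9.4:

from statistics import mean, variance

def empirical_bayes_differing_years(data):
    """data: list of classes, each a list of observations; classes may have
    different numbers of years.  Returns (EPV, VHM, K) per this section's formulas."""
    Yi = [len(row) for row in data]
    C, n = len(data), sum(Yi)
    grand_mean = sum(sum(row) for row in data) / n
    class_means = [mean(row) for row in data]
    epv = sum((len(row) - 1) * variance(row) for row in data if len(row) > 1) / sum(y - 1 for y in Yi)
    denom = n - sum(y * y for y in Yi) / n
    vhm = (sum(y * (m - grand_mean) ** 2 for y, m in zip(Yi, class_means)) - (C - 1) * epv) / denom
    return epv, vhm, (epv / vhm if vhm > 0 else None)

hugh, dewey, louis = [0, 0, 0], [1, 0, 0, 0, 0], [0, 2, 1, 0]
print(empirical_bayes_differing_years([hugh, dewey, louis]))   # roughly (0.394, 0.042, 9.4)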


Problems:
Use the following information for the next three questions:
Past claims data on a portfolio of policyholders are given below:
                    Year
Policyholder    1     2     3
     1         85    50    65
     2         60    55    70
     3               95    75

3.1 (3 points) What is the Bhlmann credibility premium for policyholder 1 for year 4?
A. Less than 62
B. At least 62 but less than 67
C. At least 67 but less than 72
D. At least 72 but less than 77
E. At least 77
3.2 (2 points) What is the Bhlmann credibility premium for policyholder 2 for year 4?
A. Less than 62
B. At least 62 but less than 67
C. At least 67 but less than 72
D. At least 72 but less than 77
E. At least 77
3.3 (3 points) What is the Bhlmann credibility premium for policyholder 3 for year 4?
A. Less than 62
B. At least 62 but less than 67
C. At least 67 but less than 72
D. At least 72 but less than 77
E. At least 77


3.4 (4 points) Use the following information:

Phil and Sylvia are competitors in the light bulb business.


You test light bulbs from each of them.
Let Xi be the lifetime for light bulb i.
For Phil's 100 light bulbs: ΣXi = 85,217 and ΣXi² = 82,239,500.
For Sylvia's 200 light bulbs: ΣXi = 178,102 and ΣXi² = 165,218,000.
Use Nonparametric Empirical Bayes estimation in order to estimate the average lifetime of a
randomly selected light bulb from Sylvia.
A. Less than 880
B. At least 880 but less than 882
C. At least 882 but less than 884
D. At least 884 but less than 886
E. At least 886
3.5 (5 points) For each of 5 employees, displayed below are the dollars of health insurance claims
made:
          Year 1   Year 2   Year 3   Year 4   Year 5   Year 6    Sum
Alice       138       73      320      102      782      270     1685
Asok                          129       93      104      171      497
Dilbert     135      155      121      123       77      139      750
Howard       91      206      109      211      116       81      814
Wally       467      133      265      193      118       89     1265
Sum         831      567      944      722     1197      750     5011

Assume that each year of data has been brought to the current cost level.
There was no data from Asok for the first two years. (You may assume that Asok was not
employed at this company during years 1 and 2.)
What is the estimated future annual cost for Asok?
A. Less than 150
B. At least 150 but less than 155
C. At least 155 but less than 160
D. At least 160 but less than 165
E. At least 165


Use the following information for the next two questions:


Survival times are available for five insureds, three from Class A and two from Class B.
The three from Class A died at times t = 17, t = 32, and t = 39.
The two from Class B died at times t = 12 and t = 24.
Nonparametric Empirical Bayes estimation is used to estimate the mean survival time for each class.
Unbiased estimators of the expected value of the process variance and the variance of the
hypothetical means are used.
3.6 (2 points) Estimate the survival time for an insured picked at random from Class A.
A. Less than 22
B. At least 22 but less than 24
C. At least 24 but less than 26
D. At least 26 but less than 28
E. At least 28
3.7 (2 points) Estimate the survival time for an insured picked at random from Class B.
A. Less than 22
B. At least 22 but less than 24
C. At least 24 but less than 26
D. At least 26 but less than 28
E. At least 28
3.8 (4 points) You have the following data for the number of medical malpractice claims filed against
each of five pediatricians.

                             Year 1   Year 2   Year 3   Year 4    Mean   Sample Variance
Dr. George Burns                1        3        2        2        2         0.667
Dr. Flo Schotte                          1        0        2        1         1.000
Dr. Major Payne                                   1        1        1         0.000
Dr. Vera Sharpe-Needles         2        4        1        1        2         2.000
Dr. Hy Fever                    0        0        2        0       0.5        1.000

Use Nonparametric Empirical Bayes estimation in order to estimate the expected number of claims
from each doctor in year 5.


3.9 (3 points) Past claims data on four insureds are given below:
Insured   Year 1   Year 2
   1         75       60
   2         60       55
   3                  85
   4         50       40

Use Nonparametric Empirical Bayes estimation in order to estimate the Bhlmann credibility
premium for insured 3.
A. Less than 74
B. At least 74 but less than 76
C. At least 76 but less than 78
D. At least 78 but less than 80
E. At least 80


Solutions to Problems:
3.1. C., 3.2. B., 3.3. D.
Estimated EPV = Σ (Yi - 1) si² / Σ (Yi - 1)
= {(3 - 1)(308.33) + (3 - 1)(58.33) + (2 - 1)(200)} / (2 + 2 + 1) = 186.67.

                    Year                            Sample
Policyholder    1     2     3    Total    Mean     Variance
     1         85    50    65     200     66.67     308.33
     2         60    55    70     185     61.67      58.33
     3               95    75     170     85.00     200.00

Σ Yi - (Σ Yi²)/(Σ Yi) = 8 - (3² + 3² + 2²)/8 = 5.25.
X̄ = (200 + 185 + 170)/8 = 69.38.
Estimated VHM = {Σ Yi (X̄i - X̄)² - (C - 1)EPV} / 5.25
= {(3)(66.67 - 69.38)² + (3)(61.67 - 69.38)² + (2)(85 - 69.38)² - (186.67)(3 - 1)} / 5.25
= (688.33 - 373.34) / 5.25 = 60.0.
K = 186.67/60.0 = 3.11.
Policyholder #1 has three years of data, so Z = 3/(3 + 3.11) = 49.1%. Thus the estimate for
policyholder 1 for year 4 is: (0.491)(66.67) + (0.509)(69.38) = 68.05.
Policyholder #2 has three years of data, so Z = 3/(3 + 3.11) = 49.1%.
The estimate for policyholder 2 for year 4 is: (0.491)(61.67) + (0.509)(69.38) = 65.59.
Policyholder #3 has two years of data, so Z = 2/(2 + 3.11) = 39.1%.
The estimate for policyholder 3 for year 4 is: (0.391)(85.00) + (0.609)(69.38) = 75.49.
Comment: The estimate of 68.05 for policyholder #1 differs from the estimate of 68.28 in the
previous section for a similar setup, but in which each policyholder has 3 years of data.


3.4. D. The mean for Phil is 852.17. The second moment for Phil is 822,395.
The sample variance for Phil is: (100/99)(822,395 - 852.172 ) = 97,173.
The mean for Sylvia is 178102/200 = 890.51.
The second moment for Sylvia is 165,218,000/200 = 826,090.
The sample variance for Sylvia is: (200/199)(826,090 - 890.512 ) = 33,248.
C

(Yi

- 1)si2

EPV = i=1C

(Yi

(99)(97,173) + (199)(33,248)
= 54,485.
99 + 199

- 1)

i=1

Overall Mean = (85217 + 178102)/300 = 877.73.


C

Yi i=1

C
Yi2
i=1
C
Yi
i =1

= 300 -

1002 + 200 2
= 133.33.
300

Yi ( Xi
VHM = i=1

- X )2 - (C -1)EPV

(100)(852.17 - 877.73)2 + (200)(890.51 - 877.73)2 - (2 - 1)(54,485)


= 326.4.
133.33
K = EPV/VHM = 54485/326.4 = 166.9.
For Sylvia, Z = 200/(200 + 166.9) = 54.5%.
Estimate for Sylvias bulbs is: (.545)(890.51) + (.455)(877.73) = 884.7.
Comment: For Phil, Z = 100/(100 + 166.9) = 37.5%.
Estimated lifetime for Phils bulbs is: (.375)(852.17) + (.625)(877.73) = 868.1.


3.5. E. One calculates the sample variance for each individual.


Now one takes the weighted average; each persons sample variance gets a weight proportional to
its number of degrees of freedom = number of year - 1.
So in this case the weights are: 5/23, 3/23, 5/23, 5/23, 5/23.
The estimate of the EPV is: 20,461.

Alice
Asok
Dilbert
Howard
Wally

138

73

135
91
467

155
206
133

320
129
121
109
265

102
93
123
211
193

782
104
77
116
118

270
171
139
81
89

Number of
Years -1
5
3
5
5
5

Process
Variance
69,679
1,198
704
3,341
19,679

23

20,461

Weighted Average
C

Yi i=1

C
Yi2
i=1
C
Yi
i =1

= (6 + 4 + 6 + 6 + 6) -

62 + 42 + 62 + 62 + 62
= 22.29.
28

X = overall average of the observed data = 5011/ (28 man-years) = 178.96.


The means for the five individual are: 280.83, 124.25, 125 , 135.67, and 210.83.
C

Yi ( Xi
VHM = i=1

- X )2 - (C -1)EPV

{(6)(280.83 - 178.96)2 + (4)(124.25 - 178.96)2 + (6)(125 - 178.96)2 +


(6)(135.67 - 178.96)2 + (6)(210.83 - 178.96)2 - (20461)(4)} / 22.29 = 1220 .
K = 20461/1220 = 16.8. The credibilities are: 6/22.8 = 26.3%, 4/20.8 = 19.2%,
6/22.8 = 26.3%,6/22.8 = 26.3%, and 6/22.8 = 26.3%.
The estimate of Asoks future annual cost is: (0.192)(124.25) + (1 - 0.192)(178.96) = 168.
Comment: The estimates of each individuals future annual costs are:
Mean
Estimate

Alice
280.83
206

Asok
124.25
168

Dilbert
125.00
165

Howard
135.67
168

Wally
210.83
187



C

(Yi

- 1)si2

3.6. D. & 3.7. B. Estimated EPV = i=1C

(Yi

(3 - 1)(126.333) + (2 - 1)(72)
= 108.22.
2 + 1

- 1)

i=1

Class

First
Survival Time

Second
Survival Time

Third
Survival Time

Mean

Sample
Variance

A
B

17
12

32
24

39

29.333
18.000

126.333
72.000

Yi i=1

C
Yi2
i=1
C
Yi
i =1

= 5 - (32 + 22)/5 = 2.4.

X = 124 / 5 = 24.8.
C

Yi ( Xi
Estimated VHM = i=1

- X )2 - (C -1)EPV

(3)(29.333 - 24.8)2 + (2)(18 - 24.8)2 - (108.22)(2 - 1)


= 19.13.
2.4
K = 108.22/19.13 = 5.66.
Class A has three years of data, so Z = 3/(3 + 5.66) = 34.6%.
Thus the estimated survival time for Class A is: (0.346)(29.333) + (0.654)(24.8) = 26.4.
Class B has two years of data, so Z = 2/(2 + 5.66) = 26.1%.
Thus the estimated survival time for Class B is: (0.261)(18) + (0.739)(24.8) = 23.0.

2013-4-12

Empirical Bayesian Cred. 3 Differing Number Years, HCM 10/22/12, Page 50


C

(Yi

- 1)si2

3.8. EPV = i=1C

(Yi

(0.667)(3) + (1)(2) + (0)(1) + (2)(3) + (1)(3)


= 13/12.
3 + 2 + 1 + 3 + 3

- 1)

i=1

Yi i=1

C
Yi2
i=1
C
Yi
i =1

= (4 + 3 + 2 + 4 + 4) - (42 + 32 + 22 + 42 + 42 )/17 = 13.41.

X = 23/17 = 1.353.

Y i ( X i - X )2 =
(4)(2 - 1.353)2 + (3)(1 - 1.353)2 + (2)(1 - 1.353)2 + (4)(2 - 1.353)2 + (4)(.5 - 1.353)2 = 6.882.
C

Yi ( Xi
VHM = i=1

- X )2 - (C -1)EPV
=

6.882 - (13 / 12)(5 - 1)


= 0.190.
13.41

K = EPV/VHM = (13/12)/0.190 = 5.70.


Year 4Mean
1
2
3
Dr. Burns
Dr. Schotte
Dr. Payne
Dr. Sharpe-Needles
Dr. Fever

2
1
1
2
0.5

Years
of Data

Estimated
Number of Claims

4
3
2
4
4

41.2%
34.5%
26.0%
41.2%
41.2%

1.62
1.23
1.26
1.62
1.00

For example, (41.2%)(2) + (1 - 41.2%)(1.353) = 1.62.



C

(Yi

- 1)si2

3.9. D. Estimated EPV = i=1C

(Yi

- 1)

i=1

(2 - 1)(112.5) + (2 - 1)(12.5) + (1 - 1)(N.A.) + (2 - 1)(50)


= 58.33.
1 + 1 + 1
Insured
1
2
3
4
C

Yi i=1

Year
1

Year
2

Total

Mean

Sample
Variance

75
60

60
55
85
40

135
115
85
90

67.50
57.50
85.00
45.00

112.50
12.50
N/A
50.00

50
C
Yi2
i=1
C
Yi
i =1

= 7 - (22 + 22 + 12 + 22 )/7 = 5.143.

X = (135 + 115 + 85 + 170)/7 = 60.71.


C

Yi ( Xi
Estimated VHM = i=1

- X )2 - (C -1)EPV

(2)(67.5 - 60.71)2 + (2)(57.5 - 60.71)2 + (1)(85 - 60.71)2 + (2)(45 - 60.71)2 - (58.33)(4 - 1)


5.143

= 198.6. K = 58.33/198.6 = 0.294.


For insured 3 with one year of data, Z = 1/(1 + 0.294) = 77.3%.
Thus the estimate for insured 3 is: (77.3%)(85) + (1 - 77.3%)(60.71) = 79.49.
Comment: Since there is only one year of data from insured number 3, the sample variance is not
defined. However, that is okay since its contribution to the estimate of the EPV includes a factor of:
1 - 1 = 0.


Section 4, Varying Exposures, (No Differing Numbers of Years)26


In the example discussed previously with three drivers, each year from each driver was one
exposure. In this and the other examples in the prior sections, each year or trial of data represented
the same volume of data. In many insurance applications there would be varying amounts of data
per year per either insured or class.27 28
The Bühlmann-Straub Method is a generalization of the previous Bühlmann method, in order to
deal with those circumstances in which there are different numbers of exposures or premiums by
class and/or year. Again we will first deal with the simpler case where each class has the same
number of years of data.
A Two Class Example:
Assume you have data for 2 classes over three years and wish to use nonparametric Bayesian
Estimation to estimate the future pure premium for each class.
                        Exposures                       Losses
Class              1     2     3    Total        1      2      3    Total
Poultry Farms     41    37    29     107       232     33    237     502
Dairy Farms       58    59    53     170       104     60    151     315

                      Pure Premium
Class              1      2      3     Total
Poultry Farms     5.66   0.89   8.17    4.69
Dairy Farms       1.79   1.02   2.85    1.85

The mean pure premium for Poultry is: 502/107 = 4.69.
The mean pure premium for Dairy is: 315/170 = 1.85.
Let Xit = Pure Premium for Class i in year t.
mit = exposures for class i in year t.  mi = Σt=1 to Y mit.

26 The situation with varying exposures is more general than the situation without varying exposures.
27 For example, in group health insurance, different employers have different numbers of employees, and the number of employees varies over time. In commercial automobile insurance, the number of insured cars in the fleet varies from policyholder to policyholder and over time.
28 A classification is a set of similar insureds that are grouped together for purposes of estimating an average rate to charge them. See for example, Risk Classification, Robert Finger in Foundations of Casualty Actuarial Science. The number of exposures varies between classes and over time.


X̄i = Σt=1 to Y mit Xit / mi = weighted average pure premium for class i.29
Then one estimates the process variance of each class as: vi = Σt=1 to Y mit (Xit - X̄i)² / (Y - 1).30
For Poultry the estimated process variance is:
{(41)(5.66 - 4.69)² + (37)(0.89 - 4.69)² + (29)(8.17 - 4.69)²} / (3 - 1) = 462.0.
Similarly, for Dairy the estimated process variance is:
{(58)(1.79 - 1.85)² + (59)(1.02 - 1.85)² + (53)(2.85 - 1.85)²} / (3 - 1) = 46.9.
Then the estimate of the EPV is: (v1 + v2)/2 = (462.0 + 46.9)/2 = 254.
One estimates the VHM as follows:31
m - Σi=1 to C mi²/m = 277 - (107² + 170²)/(107 + 170) = 131.3.
X̄ = overall average = (502 + 315) / (107 + 170) = 2.95.
Σi=1 to C mi (X̄i - X̄)² = (107)(4.69 - 2.95)² + (170)(1.85 - 2.95)² = 530.
Then the estimated VHM = {Σi=1 to C mi (X̄i - X̄)² - (C - 1)EPV} / {m - Σ mi²/m}
= {530 - (2 - 1)(254)} / 131.3 = 2.10.

29 This weighted average pure premium for a class is the sum of the losses for that class divided by the sum of the exposures for that class.
30 This is analogous to a sample variance; however, its computation involves exposures and the weighted average. As will be discussed, we are estimating the process variance for a single exposure. The notation is taken from 4, 11/01, Q.30.
31 The special formula for the denominator of the VHM is analogous to that when the number of years varied.


Exercise: Determine the credibility to be used in order to estimate the future pure premium for each
class.
[Solution: K = EPV / VHM = 254 / 2.10 = 121.
For the Poultry class: Z = 107 / (107 + 121) = 46.9%.
For the Dairy class: Z = 170 / (170 + 121) = 58.4%.]
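The following is a minimal Python sketch of the calculation just carried out, assuming the Poultry/Dairy data above; the function and variable names are illustrative rather than anything from the syllabus readings.

```python
# Minimal sketch of the Buhlmann-Straub calculation for the two-class example above.
def buhlmann_straub(exposures, losses):
    """exposures, losses: dicts of class -> list of yearly values (same number of years per class)."""
    C = len(exposures)
    Y = len(next(iter(exposures.values())))
    m_i = {c: sum(exposures[c]) for c in exposures}            # exposures by class
    m = sum(m_i.values())                                      # total exposures
    xbar_i = {c: sum(losses[c]) / m_i[c] for c in exposures}   # class pure premiums
    xbar = sum(sum(losses[c]) for c in losses) / m             # overall pure premium

    # EPV: average over classes of the exposure-weighted sample variances.
    v_i = {c: sum(e * (l / e - xbar_i[c]) ** 2
                  for e, l in zip(exposures[c], losses[c])) / (Y - 1)
           for c in exposures}
    epv = sum(v_i.values()) / C

    # VHM: numerator corrected by (C-1)*EPV, denominator m - sum(m_i^2)/m.
    numer = sum(m_i[c] * (xbar_i[c] - xbar) ** 2 for c in exposures) - (C - 1) * epv
    denom = m - sum(mi ** 2 for mi in m_i.values()) / m
    vhm = max(numer / denom, 0.0)                              # if the estimate is negative, Z = 0

    K = epv / vhm if vhm > 0 else float("inf")
    Z = {c: m_i[c] / (m_i[c] + K) for c in exposures}
    return epv, vhm, K, Z

exposures = {"Poultry": [41, 37, 29], "Dairy": [58, 59, 53]}
losses    = {"Poultry": [232, 33, 237], "Dairy": [104, 60, 151]}
print(buhlmann_straub(exposures, losses))
```

Up to rounding, this reproduces EPV of about 254, VHM of about 2.10, K of about 121, and credibilities of about 46.9% and 58.4%.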
This is an example of the Bühlmann-Straub Method, one of the Empirical Bayesian techniques.
In general the Bühlmann-Straub Method would proceed as follows:
Assume we have Y years of data from each of C classes.
Let Xit be the data (die roll, frequency, severity, pure premium, loss ratio, etc.) observed for class
(or risk) i, in year t, for i = 1,...,C, and t =1,...,Y.
Let mit be the measure of size (premiums, exposures, number of die rolls, number of members in a
group, etc.) for class (or risk) i, in year t, for i = 1, ... , C, and t =1, ... , Y.
Let mi = Σ_{t=1}^{Y} mit = sum of exposures for class i.

Let m = Σ_{i=1}^{C} Σ_{t=1}^{Y} mit = overall sum of exposures.

Let X̄i = Σ_{t=1}^{Y} mit Xit / mi = weighted average of the data for class i.

Let X̄ = Σ_{i=1}^{C} Σ_{t=1}^{Y} mit Xit / m = overall weighted average of the observed data.

Let vi = Σ_{t=1}^{Y} mit (Xit - X̄i)² / (Y - 1) = estimated (process) variance of the data for class i.



estimated EPV (for a single exposure of the risk process) = Σ_{i=1}^{C} vi / C
= Σ_{i=1}^{C} Σ_{t=1}^{Y} mit (Xit - X̄i)² / {C (Y - 1)}.

Let the denominator of the VHM estimator be: m - Σ_{i=1}^{C} mi² / m = total exposures adjusted for degrees of freedom.32

estimated VHM = {Σ_{i=1}^{C} mi (X̄i - X̄)² - EPV (C - 1)} / {m - Σ_{i=1}^{C} mi² / m}.33

If the estimated VHM ≤ 0, set Z = 0.

K = EPV / VHM = estimated Bühlmann Credibility Parameter.
As always we wish to estimate the EPV and VHM for one draw from the risk process, in other
words for one exposure for one year. Each cell usually has more than one exposure, and thus a
smaller random fluctuation in the observed average frequency, pure premium, etc., than if it had had
only one exposure. Therefore, we multiply (Xit - X i)2 by mit, the exposures in that cell, in order to
increase the random fluctuations back up to the level they would have been at with one exposure.
The corresponding adjustment is made in the first term in the numerator of the VHM.34

32 Loss Models does not use any notation or terminology for this intermediate step in the calculation of the
estimated VHM.
33 See Equation 20.76 in Loss Models, by Klugman, Panjer, and Willmot, or Part III, Chapter 3, Theorem 20, in
Advanced Risk Theory by De Vylder.
34 These formulas are derived in the next section.


Preserving the Total Losses:


For each class, one could take the credibility weighted average of that class's pure premium and the
overall mean pure premium.
For the Poultry class, the estimated future pure premium would be:
(46.9%) (4.69) + (1 - 46.9%) (2.95) = 3.77.
For the Dairy class, the estimated future pure premium would be:
(58.4%) (1.85) + (1 - 58.4%) (2.95) = 2.31.
Class             Exposures    Losses    Pure Premium        Z    Estimated P.P.
Poultry Farms           107       502            4.69    46.9%              3.77
Dairy Farms             170       315            1.85    58.4%              2.31
Overall                 277       817            2.95

However, there is a problem with this approach! The estimated pure premiums of 3.77 and 2.31,
correspond to total estimated losses of: (107)(3.77) + (170)(2.31) = 796.
This differs from the observed total losses of 817. Our fancy actuarial technique has lowered the
overall rate level, without our intending to. In general, proceeding in the above manner, the
observed total losses will not equal the total estimated losses.
In order to preserve the total losses, one applies the complement of credibility to the
credibility weighted average pure premium.35
Exercise: You have data for 2 classes over three years. Use nonparametric Bayesian Estimation to
estimate the future pure premium for each class, using the method that preserves total losses.
                        Exposures                    Losses
Class               1       2       3            1       2       3
Poultry Farm       41      37      29          232      33     237
Dairy Farm         58      59      53          104      60     151

[Solution: From a prior solution, for the Poultry class Z = 46.9%, and for the Dairy class Z = 58.4%.
The credibility weighted pure premium is: {(46.9%)(4.69) + (58.4%)(1.85)} / (46.9% + 58.4%) = 3.12.
Thus the estimated pure premiums are, Poultry: (46.9%) (4.69) + (1 - 46.9%) (3.12) = 3.86, and
Dairy: (58.4%) (1.85) + (1 - 58.4%) (3.12) = 2.38.
Comment: If for example, we expected 50 exposures from Dairy Farms, we might charge a
premium of: (50)(2.38) = 119.]
Note that now the estimated total losses are: (107)(3.86) + (170)(2.38) = 818, matching the
observed total losses of 817, subject to rounding.
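A short sketch of this preserve-total-losses step, assuming the credibilities and class pure premiums computed above; the variable names are illustrative.

```python
# Complement of credibility is applied to the credibility-weighted average pure premium.
Z   = {"Poultry": 0.469, "Dairy": 0.584}
pp  = {"Poultry": 4.69,  "Dairy": 1.85}
exp = {"Poultry": 107,   "Dairy": 170}

cred_wtd_avg = sum(Z[c] * pp[c] for c in Z) / sum(Z.values())          # about 3.12
estimate = {c: Z[c] * pp[c] + (1 - Z[c]) * cred_wtd_avg for c in Z}    # about 3.86 and 2.38

total_estimated = sum(exp[c] * estimate[c] for c in exp)
print(cred_wtd_avg, estimate, total_estimated)   # total is about 817, matching the observed losses
```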
35 All past exam questions of this type have used this method that preserves total losses.


Derivation of the Method of Preserving the Total Losses:


If one applies the complement of credibility to the credibility weighted average pure premiums,
provided the credibility is of the form Z = E/(E + K), then as will be shown the total losses are
preserved.36 Let Li be the losses for class i, mi be the exposures for class i, pi be the estimated
pure premium for class i, K = the estimated Bühlmann Credibility Parameter, and Zi be the credibility
for class i, Zi = mi / (mi + K).

The credibility weighted average pure premium
= Σi Zi (Li / mi) / Σi Zi = Σi {mi / (mi + K)}(Li / mi) / Σi {mi / (mi + K)} = {Σi Li / (mi + K)} / {Σi mi / (mi + K)}.

The estimated pure premium for class i = pi
= Zi (Li / mi) + (1 - Zi)(credibility weighted average pure premium)
= Li / (mi + K) + {K / (mi + K)} {Σj Lj / (mj + K)} / {Σj mj / (mj + K)}.

The estimated total losses = Σi mi pi
= Σi Li mi / (mi + K) + K {Σi mi / (mi + K)} {Σj Lj / (mj + K)} / {Σj mj / (mj + K)}
= Σi Li mi / (mi + K) + K Σi Li / (mi + K) = Σi Li (mi + K) / (mi + K) = Σi Li = the total observed losses.
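As a check on the algebra above, here is a sketch that verifies the identity symbolically for three classes using sympy; the three-class setup and the symbol names are just for illustration.

```python
# Symbolic check that applying the complement of credibility to the credibility-weighted
# average preserves total losses, assuming Z_i = m_i/(m_i + K).
import sympy as sp

K = sp.symbols("K", positive=True)
m = sp.symbols("m1:4", positive=True)      # exposures m1, m2, m3
L = sp.symbols("L1:4", positive=True)      # losses L1, L2, L3

Z = [mi / (mi + K) for mi in m]
cred_wtd_avg = sum(Zi * Li / mi for Zi, Li, mi in zip(Z, L, m)) / sum(Z)
p = [Zi * Li / mi + (1 - Zi) * cred_wtd_avg for Zi, Li, mi in zip(Z, L, m)]

total_estimated = sum(mi * pi for mi, pi in zip(m, p))
print(sp.simplify(total_estimated - sum(L)))   # 0: estimated total losses equal observed total losses
```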

Multisided Die Example with Differing Volumes of Data:


As in the previous example, the data would consist of the number of exposures per year for each
risk, as well as the outcome each year for each risk. For example, the outcome could be the number
of accidents or dollars of loss. This technique is set up to be applied to the accident frequencies,
pure premiums, severities, the average of a bunch of die rolls, etc.

36 Note the result does not depend on where K came from or what its numerical value is.


For example, assume we have the following data generated by a simulation of the multi-sided die
example, where each trial consists of the sum of many rolls of the selected die:
Risk      Die                  Number of Die Rolls, by Trial
Number    Type        1       2       3       4       5       6
1           4       150     150     150     150     150     150
2           4        80      90     100     100     110     120
3           4       120     110     100     100      90      80
4           4        40      70     100     100     130     160
5           4       160     130     100     100      70      40
6           4        50      50      50      50      50      50
7           6        40      70     100     100     130     160
8           6       160     130     100     100      70      40
9           6       120     110     100     100      90      80
10          8        80      90     100     100     110     120

Risk                          Sum of Die Rolls, by Trial
Number                1       2       3       4       5       6
1                   378     371     394     346     386     360
2                   205     222     256     246     290     288
3                   281     296     263     244     217     196
4                   104     178     270     257     339     410
5                   396     323     253     240     173     101
6                   121     119     122     132     128     125
7                   136     260     341     346     467     546
8                   555     478     352     364     258     124
9                   394     391     362     352     319     279
10                  400     385     461     425     491     535

Assume that we only have the observed data and do not know what sided die generated each risk.
Then as was the case in the absence of varying exposures, one can attempt to estimate the EPV
and VHM or the Bühlmann Credibility Parameter K using the average die rolls and number of die
rolls for each trial and risk.37

Risk                     Average of Die Rolls, by Trial                       Mean    Process
#              1        2        3        4        5        6                        Variance
1          2.520    2.473    2.627    2.307    2.573    2.400               2.483      2.047
2          2.562    2.467    2.560    2.460    2.636    2.400               2.512      0.819
3          2.342    2.691    2.630    2.440    2.411    2.450               2.495      1.993
4          2.600    2.543    2.700    2.570    2.608    2.562               2.597      0.309
5          2.475    2.485    2.530    2.400    2.471    2.525               2.477      0.195
6          2.420    2.380    2.440    2.640    2.560    2.500               2.490      0.470
7          3.400    3.714    3.410    3.460    3.592    3.413               3.493      1.378
8          3.469    3.677    3.520    3.640    3.686    3.100               3.552      2.688
9          3.283    3.555    3.620    3.520    3.544    3.487               3.495      1.523
10         5.000    4.278    4.610    4.250    4.464    4.458               4.495      6.449

Weighted average of the means: 3.009.   Average of the process variances: 1.787.
(The numbers of die rolls by trial, which serve as the weights, are as in the previous table.)

For Risk 2 separately we can estimate the mean as:38
{(80)(2.562) + (90)(2.467) + (100)(2.56) + (100)(2.46) + (110)(2.636) + (120)(2.4)} / 600 = 2.512.

37 In an insurance application one would work with the frequency (or severity, pure premium, loss ratio, etc.) and
number of exposures (or claims, premiums, etc.) for each year and each risk or class.
38 Assuming one had the sum of the die rolls for each trial, one could do this calculation in an equivalent way (except
for rounding) as: (205 + 222 + 256 + 246 + 290 + 288) / 600 = 1507/600 = 2.512.


For Risk 2 separately we can estimate the process variance (per die roll):39
{(80)(2.562 - 2.512)² + (90)(2.467 - 2.512)² + (100)(2.56 - 2.512)² + (100)(2.46 - 2.512)²
+ (110)(2.636 - 2.512)² + (120)(2.4 - 2.512)²} / (6 - 1) = 0.819.
For each trial, we multiplied its squared difference by its exposures, since it is assumed that the
process variance for a trial is that for a single die roll divided by the number of dice.40 Recall that we
are trying to estimate the EPV for a single die roll.41
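A small sketch of the Risk 2 calculation, assuming the roll counts and per-trial averages from the table above.

```python
# Exposure-weighted mean and process variance per die roll for Risk 2.
rolls    = [80, 90, 100, 100, 110, 120]
averages = [2.562, 2.467, 2.560, 2.460, 2.636, 2.400]

mean_2 = sum(n * x for n, x in zip(rolls, averages)) / sum(rolls)                        # about 2.512
var_2  = sum(n * (x - mean_2) ** 2 for n, x in zip(rolls, averages)) / (len(rolls) - 1)  # about 0.82
print(mean_2, var_2)
```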
Exercise: Based on the data for risk 9 what is the estimated variance?
[Solution: The estimated mean is 3.495 and the estimated variance is 1.523.]
In this manner one gets a separate estimate of the process variance of each of the ten risks.
By taking an average of these ten values, one gets an estimate of the (overall) expected value of
the process variance, in this case 1.787.42
By taking a weighted average of the means estimated for each of the ten risks, using the exposures
for each risk as the weight, one gets an estimate of the overall average, in this case 3.008. Then in
order to estimate the VHM one could take a weighted average of the squared deviations of the
individual means from the overall mean, using the exposures for each class as the weight.
The first part of the numerator of the estimated VHM is: Σ mi (X̄i - X̄)² =
(900)(2.483 - 3.008)² + (600)(2.512 - 3.008)² + (600)(2.495 - 3.008)² + (600)(2.597 - 3.008)²
+ (600)(2.477 - 3.008)² + (300)(2.49 - 3.008)² + (600)(3.493 - 3.008)² + (600)(3.552 - 3.008)²
+ (600)(3.495 - 3.008)² + (600)(4.495 - 3.008)² = 2692.5.
However, as derived below this estimate would be biased; it contains too much random variation
due to the use of the observed means rather than the actual hypothetical means. It turns out that if
we subtract out from the numerator the estimated EPV times the number of classes minus one,
then this estimator will be an unbiased estimator of the VHM.

39 Note that this estimate of the process variance only involves data from a single row.
We have divided by the number of columns minus one.
With an unknown mean, one divides by n - 1 in order to get an unbiased estimate of the variance.
40 The assumption that the process variance of the frequency is inversely proportional to the size of risk is the same
as used in the derivation of the Bühlmann Credibility formula: Z = E/(E + K).
41 More generally, we will be estimating the EPV and VHM for a single exposure.
The Bühlmann Credibility formula will adjust for the number of exposures.
42 One could take any weighted average of the separate estimates of the process variance. The Bühlmann-Straub
method, in the case where each class has the same number of years of data, gives each of the ten estimates of the
process variance equal weight. Note that the EPV for the underlying model is in this case 2.10.


Numerator of estimated VHM = 2692.5 - (1.787)(10 - 1) = 2676.4.

As derived below, the denominator should be less than the sum of the exposures.43 The
denominator will be the sum of the exposures minus a term which is the sum of the squares of the
exposures for each class divided by the overall number of exposures: m - Σ mi² / m =
6000 - (900² + 600² + 600² + 600² + 600² + 300² + 600² + 600² + 600² + 600²) / 6000
= 6000 - 630 = 5370.

Dividing the numerator by this denominator produces an unbiased estimate of the VHM.
Estimated VHM = 2676.4 / 5370 = 0.498.
Thus for this example, one would estimate that K = 1.787 / 0.498 = 3.6.
Note that this estimated value of K differs somewhat from the value of K = 2.15/0.45 = 4.778 for the
multi-sided die example, as discussed in the introductory section. However, unlike here, there we
made use of the parameters of the full parametric model in order to estimate the EPV and VHM.
In summary, for the simulation of the multi-sided die example:
C = 10, Y = 6, X̄1 = 2.483, X̄ = 3.008, v1 = 2.047,

estimated EPV = Σ_{i=1}^{C} vi / C = Σ_{i=1}^{C} Σ_{t=1}^{Y} mit (Xit - X̄i)² / {C (Y - 1)} = 1.787,

m1 = 900, m = 6000, denominator = m - Σ_{i=1}^{C} mi² / m = 6000 - 630 = 5370,

estimated VHM = {Σ_{i=1}^{C} mi (X̄i - X̄)² - EPV (C - 1)} / 5370 = {2692.5 - (1.787)(10 - 1)} / 5370 = 0.498.

estimated K = EPV / VHM = 1.787 / 0.498 = 3.6.

43 In the case of one roll per trial, this is equivalent to dividing by the number of risks minus one, in order to adjust for
the number of degrees of freedom.


Negative Estimate of the VHM:


Unfortunately, the estimate of the VHM produced by the Bühlmann-Straub or other Empirical
Bayes estimators is often subject to considerable random fluctuation. A significant difficulty is that the
estimated VHM can be negative, even though we know that as a variance it cannot be negative.
In the case of negative estimates of the VHM, set the VHM = 0. This results in no
credibility being given to the data, which makes sense if the actual VHM is very small but
positive.
In any case, the random fluctuation in the estimate of the VHM can lead to considerable random
fluctuation in the estimate of K. In the case of the multi-sided dice, with relatively large amounts of
data, when ten additional simulation experiments were run, the estimated values of K ranged from
about 3 to about 7.
Formulas, Variation in Exposures, No Variation in Years:44

vi = Σ_{t=1}^{Y} mit (Xit - X̄i)² / (Y - 1).

EPV = Σ_{i=1}^{C} vi / C = Σ_{i=1}^{C} Σ_{t=1}^{Y} mit (Xit - X̄i)² / {C (Y - 1)}.

Denominator for the VHM: m - Σ_{i=1}^{C} mi² / m.

VHM = {Σ_{i=1}^{C} mi (X̄i - X̄)² - EPV (C - 1)} / {m - Σ_{i=1}^{C} mi² / m}.
If the estimated VHM < 0, set Z = 0.

K = EPV / VHM.

Zi = mi / (mi + K).

The complement of credibility is given to the credibility weighted average.

44 Note that if there are missing data cells, but there are the same number of years of data for each class, one can still
use these formulas, rather than the more complicated formulas in the next section. See 4, 11/04, Q.17.


No Variation in Exposures:
The formulas in this section consider exposures. If one takes the number of exposures for each
class for each year equal to 1, then one obtains the formulas for the case without exposures,
discussed previously.
Let mit = 1 for all i and t. Then:

EPV = Σ_{i=1}^{C} Σ_{t=1}^{Y} mit (Xit - X̄i)² / {C (Y - 1)} = Σ_{i=1}^{C} Σ_{t=1}^{Y} (Xit - X̄i)² / {C (Y - 1)}.

Denominator for the VHM: m - Σ_{i=1}^{C} mi² / m = YC - C Y² / (YC) = YC - Y = (C - 1) Y.

VHM = {Σ_{i=1}^{C} mi (X̄i - X̄)² - EPV (C - 1)} / {(C - 1) Y}
= {Y Σ_{i=1}^{C} (X̄i - X̄)² - EPV (C - 1)} / {(C - 1) Y}
= Σ_{i=1}^{C} (X̄i - X̄)² / (C - 1) - EPV / Y.


Exercise: Use the formulas that consider exposures in order to estimate the EPV and VHM for the
previously discussed three driver example, which did not involve exposures.

Year     Hugh   Dewey   Louis
1           0       0       0
2           0       1       0
3           0       0       2
4           0       0       1
5           0       0       0

[Solution: Just take the number of exposures for each class for each year equal to 1.
mit = 1 for all i and t.

EPV = Σ_{i=1}^{C} Σ_{t=1}^{Y} mit (Xit - X̄i)² / {C (Y - 1)}
= (1/3)(1/4){(0 - 0)² + (0 - 0)² + (0 - 0)² + (0 - 0)² + (0 - 0)² + (0 - 0.2)² + (1 - 0.2)² + (0 - 0.2)²
+ (0 - 0.2)² + (0 - 0.2)² + (0 - 0.6)² + (0 - 0.6)² + (2 - 0.6)² + (1 - 0.6)² + (0 - 0.6)²} = 4/12 = 1/3.

Denominator for the VHM: m - Σ_{i=1}^{C} mi² / m = 15 - (5² + 5² + 5²)/15 = 10.

VHM = {Σ_{i=1}^{C} mi (X̄i - X̄)² - EPV (C - 1)} / 10
= {(5)(0 - 0.267)² + (5)(0.2 - 0.267)² + (5)(0.6 - 0.267)² - (1/3)(3 - 1)} / 10 = 0.0267.

Comment: This matches the result obtained previously using the simpler formulas that do not
involve exposures.]
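As a quick check, the buhlmann_straub sketch given earlier reproduces these numbers when every exposure is set to 1; the driver data below is taken from the exercise.

```python
# Setting every exposure to 1 recovers the no-exposure formulas: EPV = 1/3, VHM = 0.0267.
exposures = {"Hugh": [1] * 5, "Dewey": [1] * 5, "Louis": [1] * 5}
losses    = {"Hugh": [0, 0, 0, 0, 0], "Dewey": [0, 1, 0, 0, 0], "Louis": [0, 0, 2, 1, 0]}
epv, vhm, K, Z = buhlmann_straub(exposures, losses)
print(epv, vhm)   # approximately 0.333 and 0.0267
```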


Problems:
4.1 (3 points) You are given the following experience for two insured groups:
                                             Year
Group                                  1       2       3     Total
1      Number of members              13      12      14        39
       Average loss per member        46     125     114     94.72
2      Number of members              20      22      25        67
       Average loss per member        45      70      22     44.63
Total  Number of members                                       106
       Average loss per member                                63.06

Σ_{i=1}^{2} Σ_{j=1}^{3} mij (Xij - X̄i)² = 74,030.

Σ_{i=1}^{2} mi (X̄i - X̄)² = 61,849.

Determine the nonparametric Empirical Bayes credibility premium for group 2, using the method that
preserves total losses.
(A) 49   (B) 50   (C) 51   (D) 52   (E) 53

4.2 (2 points) You are given the following information for four territories:
Territory     Exposures     Losses
1                   235     30,400
2                   103     12,200
3                   130     12,800
4                    47      3,000
Total               515     58,400

The Bühlmann Credibility Parameter, K = 200.
Using the method that preserves total losses, estimate the future pure premium for Territory #4.
(A) 100   (B) 101   (C) 102   (D) 103   (E) 104


Use the following information to answer the next four questions:


For each of 6 classes for each of 3 years one has the number of exposures and the observed claim
frequency (the number of claims divided by the number of exposures):
Exposures
Claim Frequency
Class
Year
Year
Weighted
1
2
3
Sum
1
2
3
Average
Cracker Mfg.

14

20

13

47

10

14

9.40

Creamery

10

27

26

12

17

17.81

Bakery

24

22

19

65

14

9.49

Macaroni Mfg.

12

10

18

40

12

18

11.25

Ice Cream Mfg.

25

18

25

68

6.57

Grain Milling

11

26

24

28

15

21.27

Sum

91

87

95

273

13.68

9.17

9.97

10.95

4.3 (3 points) Use nonparametric Empirical Bayes techniques to estimate the Expected Value of
the Process Variance.
A. 300   B. 325   C. 350   D. 375   E. 400
4.4 (3 points) Use nonparametric Empirical Bayes techniques to estimate the Variance of the
Hypothetical Means.
A. 17.5
B. 18.0
C. 18.5
D. 19.0
E. 19.5
4.5 (1 point) Using the results of the previous questions, what is the credibility given to the three
years of observed data from the Cracker Manufacturing class?
A. 68%
B. 70%
C. 72%
D. 74%
E. 76%
4.6 (2 points) What is the estimated future claim frequency for the Creamery class?
Use the method that preserves total claims.
A. 14.5   B. 14.7   C. 14.9   D. 15.1   E. 15.3


4.7 (3 points) One is given the data from 27 insurance agencies writing commercial fire insurance on
behalf of your company in the same state, over the last 3 years.
Let Xij be the loss ratio for agency i in year j.
Let mij be the premium for agency i in year j.
Σ_{i=1}^{27} Σ_{j=1}^{3} mij (Xij - X̄i)² = 0.600.        Σ_{i=1}^{27} mi = 93.2.

Σ_{i=1}^{27} mi (X̄i - X̄)² = 0.343.        Σ_{i=1}^{27} mi² = 872.3.

Determine the Buhlmann Credibility factor, Z, for estimating the future loss ratio of an insurance
agency with a total of 8.0 in premiums over 5 years, using nonparametric empirical Bayes
estimation.
A. Less than 35%
B. At least 35% but less than 40%
C. At least 40% but less than 45%
D. At least 45% but less than 50%
E. At least 50%
4.8 (2 points) You are given the following information for three policyholders:
Policyholder     Premiums     Losses     Loss Ratio
1                       5          2          40.0%
2                       6          3          50.0%
3                      10          9          90.0%
Total                  21         14          66.7%

The Bühlmann Credibility Parameter, K = 10.


Using the method that preserves total losses, estimate the future loss ratio for
Policyholder #1.
(A) 56%
(B) 57%
(C) 58%
(D) 59%
(E) 60%


For the next two questions use the following experience for four insurers writing private passenger
automobile insurance in the nation of Hiber:
                                      Year
Insurer                        1        2        3     Total
Freedom      Premium          10        9        8        27
             Loss Ratio      82%      85%      86%     84.1%
Victoria     Premium          12       11       12        35
             Loss Ratio      77%      84%      75%     78.5%
Enterprise   Premium          16       18       20        54
             Loss Ratio      76%      75%      79%     76.8%
Security     Premium           8        8        8        24
             Loss Ratio      82%      77%      78%     79.0%
Total        Premium          46       46       48       140
             Loss Ratio    78.6%    79.5%    79.0%     79.0%

Xij denotes the loss ratio for insurer i and year j, where i = 1, 2, 3, 4 and j = 1, 2, 3.
Premium is the money charged to the insured in order to buy insurance.
Loss ratio = losses / premium.
Corresponding to each loss ratio is the amount of premium, mij, measuring exposure.

Σ_{i=1}^{4} Σ_{j=1}^{3} mij (Xij - X̄i)² = 864.15        Σ_{i=1}^{4} mi (X̄i - X̄)² = 1000.79

4.9 (3 points) Determine the nonparametric Empirical Bayes credibility factor for Freedom Insurance.
A. Less than 65%
B. At least 65% but less than 70%
C. At least 70% but less than 75%
D. At least 75% but less than 80%
E. At least 80%
4.10 (2 points) Determine the nonparametric Empirical Bayes credibility premium for Freedom
Insurance, using the method that preserves total losses.
A. Less than 80%
B. At least 80% but less than 81%
C. At least 81% but less than 82%
D. At least 82% but less than 83%
E. At least 83%


Use the following data on two classes over three years for the next four questions:
Exposures
Year        Standard    Preferred     Total
2001             100           50       150
2002             120           40       160
2003             150           30       180
Total            370          120       490

Losses
Year        Standard    Preferred     Total
2001          12,000        3,200    15,200
2002          13,000        4,500    17,500
2003          14,000        1,700    15,700
Total         39,000        9,400    48,400

Pure Premium
Year        Standard    Preferred     Total
2001          120.00        64.00    101.33
2002          108.33       112.50    109.38
2003           93.33        56.67     87.22
Total         105.41        78.33     98.78
Assume that the losses in each year have been adjusted to the cost level for the year 2005.
4.11 (3 points) Use nonparametric Empirical Bayes techniques to estimate
the Expected Value of the Process Variance.
(A) 25,000 (B) 26,000 (C) 27,000 (D) 28,000 (E) 29,000
4.12 (3 points) Use nonparametric Empirical Bayes techniques to estimate
the Variance of the Hypothetical Mean Pure Premiums.
A. Less than 170
B. At least 170 but less than 180
C. At least 180 but less than 190
D. At least 190 but less than 200
E. At least 200


4.13 (1 point) How much credibility is assigned to the data for the Standard Class?
A. Less than 75%
B. At least 75% but less than 80%
C. At least 80% but less than 85%
D. At least 85% but less than 90%
E. At least 90%
4.14 (2 points) Using the method that preserves total losses,
estimate the pure premium in the year 2005 for the Preferred Class.
A. Less than 87
B. At least 87 but less than 88
C. At least 88 but less than 89
D. At least 89 but less than 90
E. At least 90

4.15 (3 points) One is given the data for homeowners insurance from 11 branch offices of your
insurance company over the last 6 years.
Let Xij be the loss ratio for branch office i in year j.
Let mij be the premium for branch office i in year j.
Σ_{i=1}^{11} Σ_{j=1}^{6} mij (Xij - X̄i)² = 1.843.        Σ_{i=1}^{11} mi = 77.2.

m1 = 9.0.        Σ_{i=1}^{11} mi (X̄i - X̄)² = 0.361.        Σ_{i=1}^{11} mi² = 1105.

Determine the Buhlmann Credibility factor, Z, for estimating the future loss ratio of the first branch
office, using nonparametric empirical Bayes estimation.
A. 5%
B. 10%
C. 15%
D. 20%
E. 25%


Use the following information for the next 4 questions:


You observe loss ratios for each of 5 years for each of 10 classes.
Let Xij be the loss ratio for class i in year j, adjusted to the current level.
Let mij be the premiums for class i in year j.
Let X̄i be the weighted average loss ratio for class i.
Let X̄ be the overall weighted average loss ratio.
Let mi be the total premiums for class i.

Class:      1      2      3      4      5      6      7      8      9     10
mi        100    351     63     27    193     33    125    178    162    142
X̄i      77.2%  82.5%  74.0%  67.3%  73.1%  61.0%  80.2%  72.8%  77.4%  68.6%

Σ_{i=1}^{10} Σ_{j=1}^{5} mij (Xij - X̄i)² = 102,397.        Σ_{i=1}^{10} mi = 1374.

Σ_{i=1}^{10} mi (X̄i - X̄)² = 38,429.        X̄ = 76.09%.

4.16 (1 point) Determine the nonparametric empirical Bayes estimate for


the Expected Value of the Process Variance.
(A) 2000
(B) 2200
(C) 2400
(D) 2600
(E) 2800
4.17 (2 points) Determine the nonparametric empirical Bayes estimate for
the Variance of the Hypothetical Mean Loss Ratios.
(A) 13
(B) 14
(C) 15
(D) 16
(E) 17
4.18 (1 point) What is the credibility assigned to the observed data for class 6?
(A) 6%
(B) 8%
(C) 10%
(D) 12%
(E) 14%
4.19 (4 points) Determine the nonparametric Empirical Bayes credibility premium for class 6,
using the method that preserves total losses.
(A) 71%
(B) 72%
(C) 73%
(D) 74%
(E) 75%


Use the following information for the next 5 questions:


The Cars-R-Us Insurance Company has insured the following 7 taxicab companies in a city,
with the following numbers of cabs in their fleets and the following aggregate dollars of loss for each
of 5 years:

Company

Number of Cabs
1991
1992
1993

1994

1995

Sum

Arriba Taxis
Bowery
Canarsie Cabs
Downtown
East River
Flushing
Gandhi

80
5
20
10
15
10
25

80
5
20
10
20
11
29

83
6
22
9
25
10
29

85
5
23
9
27
11
30

88
5
26
10
30
12
30

416
26
111
48
117
54
143

Sum

165

175

184

190

201

915

Company

Aggregate Losses
1991
1992
1993

1994

1995

Sum

Arriba Taxis
Bowery
Canarsie Cabs
Downtown
East River
Flushing
Gandhi

$1,000,000
$20,000
$40,000
$50,000
$0
$60,000
$140,000

$570,000
$20,000
$120,000
$0
$40,000
$0
$40,000

$1,800,000
$100,000
$150,000
$110,000
$30,000
$0
$330,000

$200,000
$70,000
$0
$10,000
$0
$0
$70,000

$380,000
$0
$20,000
$40,000
$20,000
$170,000
$100,000

$3,950,000
$210,000
$330,000
$210,000
$90,000
$230,000
$680,000

Sum

$1,310,000

$790,000

$2,520,000

$350,000

$730,000

$5,700,000

1994

1995

1991
to 1995

$2,353
$14,000
$0
$1,111
$0
$0
$2,333

$4,318
$0
$769
$4,000
$667
$14,167
$3,333

$9,495
$8,077
$2,973
$4,375
$769
$4,259
$4,755

Company
1991
Arriba Taxis
Bowery
Canarsie Cabs
Downtown
East River
Flushing
Gandhi
7

Losses per Cab


1992
1993

$12,500
$4,000
$2,000
$5,000
$0
$6,000
$5,600

$7,125
$4,000
$6,000
$0
$2,000
$0
$1,379

$21,687
$16,667
$6,818
$12,222
$1,200
$0
$11,379

mij (Xij - Xi )2
i=1 j=1

= 26,723,482,801.

mi (X i
i=1

- X )2 = 9,876,143,619.


4.20 (1 point) Use nonparametric Empirical Bayes techniques to estimate the Expected Value of
the Process Variance.
A. Less than 930 million
B. At least 930 million but less than 940 million
C. At least 940 million but less than 950 million
D. At least 950 million but less than 960 million
E. At least 960 million
4.21 (2 points) Use nonparametric Empirical Bayes techniques to estimate the Variance of the
Hypothetical Means.
A. Less than 6.5 million
B. At least 6.5 million but less than 7.0 million
C. At least 7.0 million but less than 7.5 million
D. At least 7.5 million but less than 8.0 million
E. At least 8.0 million
4.22 (1 point) Using the results of the previous questions, what is the credibility given to the five
years of observed data from the Downtown Company?
A. Less than 21%
B. At least 21% but less than 22%
C. At least 22% but less than 23%
D. At least 23% but less than 24%
E. At least 24%
4.23 (2 points) Using the method that preserves total losses, what is the estimated future pure
premium for the Canarsie Cab Company?
A. Less than $4200
B. At least $4200 but less than $4400
C. At least $4400 but less than $4600
D. At least $4600 but less than $4800
E. At least $4800
4.24 (1 point) Using the method that preserves total losses, assuming 5 cabs in 1997, what are the
expected aggregate losses in 1997 for the Bowery Company?
A. Less than $28,000
B. At least $28,000 but less than $30,000
C. At least $30,000 but less than $32,000
D. At least $32,000 but less than $34,000
E. At least $34,000


For the next two questions, use the following information for two classes and three years.

Number of Claims
Class     Year 1    Year 2    Year 3    Total
A              2         4         3        9
B              4         9         6       19

Number of Exposures
Class     Year 1    Year 2    Year 3    Total
A            100       150       200      450
B            200       200       200      600

Frequency
Class      Year 1     Year 2     Year 3      Total
A         0.02000    0.02667    0.01500    0.02000
B         0.02000    0.04500    0.03000    0.03167
4.25 (3 points) Determine the Buhlmann Credibility parameter, K, to be used for estimating future
frequency, using nonparametric empirical Bayes estimation.
A. 300
B. 600
C. 900
D. 1200
E. 1500
4.26 (1 point) Using nonparametric empirical Bayes credibility, estimate the future frequency for the
Class A, using the method that preserves total claims.
A. 2.28%
B. 2.32%
C. 2.36%
D. 2.40%
E. 2.44%

4.27 (3 points) You are given the following commercial automobile policy experience:
Company                              Year 1     Year 2     Year 3      Total
I      Losses                             ?    100,000    120,000    220,000
       Number of Automobiles              ?        200        300        500
II     Losses                        30,000     50,000          ?     80,000
       Number of Automobiles            100        200          ?        300
III    Losses                       160,000          ?    150,000    310,000
       Number of Automobiles            400          ?        300        700

Determine the nonparametric empirical Bayes credibility factor, Z, for Company II.
(A) Less than 0.4
(B) At least 0.4, but less than 0.5
(C) At least 0.5, but less than 0.6
(D) At least 0.6, but less than 0.7
(E) At least 0.7


4.28 (2 points) You are given the following information for two group policyholders:
Group                          Year 1    Year 2
A    Aggregate losses             600       600
     Number of members             10        12
B    Aggregate losses            1000       900
     Number of members             25        20

Using nonparametric empirical Bayes estimation, calculate the credibility factor, Z,
used for Group B's experience.
(A) 85%   (B) 87%   (C) 89%   (D) 91%   (E) 93%
4.29 (4, 11/00, Q.27) (2.5 points) You are given the following information on towing losses for two
classes of insureds, adults and youths:

Exposures
Year      Adult    Youth    Total
1996       2000      450     2450
1997       1000      250     1250
1998       1000      175     1175
1999       1000      125     1125
Total      5000     1000     6000

Pure Premium
Year                  Adult    Youth    Total
1996                      0       15    2.755
1997                      5        2    4.400
1998                      6       15    7.340
1999                      4        1    3.667
Weighted average          3       10    4.167

You are also given that the estimated variance of the hypothetical means is 17.125.
Determine the nonparametric empirical Bayes credibility premium for the youth class,
using the method that preserves total losses.
(A) Less than 5
(B) At least 5, but less than 6
(C) At least 6, but less than 7
(D) At least 7, but less than 8
(E) At least 8


4.30 (4, 5/01, Q.32) (2.5 points) You are given the following experience for two insured groups:
                                           Year
Group                                1       2       3    Total
1      Number of members             8      12       5       25
       Average loss per member      96      91     113       97
2      Number of members            25      30      20       75
       Average loss per member     113     111     116      113
Total  Number of members                                     100
       Average loss per member                               109

Σ_{i=1}^{2} Σ_{j=1}^{3} mij (xij - x̄i)² = 2020

Σ_{i=1}^{2} mi (x̄i - x̄)² = 4800

Determine the nonparametric Empirical Bayes credibility premium for group 1, using the method that
preserves total losses.
(A) 98   (B) 99   (C) 101   (D) 103   (E) 104
4.31 (4, 11/01, Q.30) (2.5 points) You are making credibility estimates for regional rating factors.
You observe that the Bühlmann-Straub nonparametric empirical Bayes method can be applied,
with rating factor playing the role of pure premium.
Xij denotes the rating factor for region i and year j, where i = 1, 2, 3 and j = 1, 2, 3, 4.
Corresponding to each rating factor is the number of reported claims, mij, measuring
exposure. You are given:

mi = Σ_{j=1}^{4} mij        X̄i = Σ_{j=1}^{4} mij Xij / mi        vi = (1/3) Σ_{j=1}^{4} mij (Xij - X̄i)²

i        mi        X̄i        vi        mi (X̄i - X̄)²
1        50     1.406     0.536                0.887
2       300     1.298     0.125                0.191
3       150     1.178     0.172                1.348

Determine the credibility estimate of the rating factor for region 1 using the method that
preserves Σ_{i=1}^{3} mi X̄i.
(A) 1.31   (B) 1.33   (C) 1.35   (D) 1.37   (E) 1.39


4.32 (4, 11/04, Q.17 & 2009 Sample Q.145) (2.5 points)
You are given the following commercial automobile policy experience:
Company                              Year 1     Year 2     Year 3
I      Losses                        50,000     50,000          ?
       Number of Automobiles            100        200          ?
II     Losses                             ?    150,000    150,000
       Number of Automobiles              ?        500        300
III    Losses                       150,000          ?    150,000
       Number of Automobiles             50          ?        150

Determine the nonparametric empirical Bayes credibility factor, Z, for Company III.
(A) Less than 0.2
(B) At least 0.2, but less than 0.4
(C) At least 0.4, but less than 0.6
(D) At least 0.6, but less than 0.8
(E) At least 0.8
4.33 (4, 5/05, Q.25 & 2009 Sample Q.194) (2.9 points) You are given:
Group                         Year 1     Year 2     Year 3      Total
1      Total Claims                      10,000     15,000     25,000
       Number in Group                       50         60        110
       Average                              200        250     227.27
2      Total Claims           16,000     18,000                 34,000
       Number in Group           100         90                    190
       Average                   160        200                 178.95
Total  Total Claims                                             59,000
       Number in Group                                             300
       Average                                                  196.67

You are also given â = 651.03.
Use the nonparametric empirical Bayes method to estimate the credibility factor for
Group 1.
(A) 0.48   (B) 0.50   (C) 0.52   (D) 0.54   (E) 0.56


Solutions to Problems:
4.1. B. EPV = Σ_{i=1}^{C} Σ_{t=1}^{Y} mit (Xit - X̄i)² / {C (Y - 1)} = (1/2){1/(3 - 1)}(74,030) = 18,508.

Denominator: m - Σ_{i=1}^{C} mi² / m = 106 - (39² + 67²)/106 = 49.3.

VHM = {Σ_{i=1}^{C} mi (X̄i - X̄)² - EPV (C - 1)} / 49.3 = {61,849 - (2 - 1)(18,508)} / 49.3 = 879.

K = EPV/VHM = 18,508/879 = 21.1.

Z1 = 39 / (39 + 21.1) = 0.649.  Z2 = 67 / (67 + 21.1) = 0.760.

Credibility weighted mean = {(94.72)(0.649) + (44.63)(0.760)} / (0.649 + 0.760) = 67.70.

Estimate for group 2 is: (44.63)(0.760) + (67.70)(1 - 0.760) = 50.17.

Comment: Similar to 4, 5/01, Q.32.
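A sketch that reproduces this solution starting from the two given summary sums; the group labels and variable names are illustrative.

```python
# Verifying solution 4.1 from the two given sums and the group totals.
sum_within  = 74_030    # sum over i,j of m_ij (X_ij - Xbar_i)^2
sum_between = 61_849    # sum over i of m_i (Xbar_i - Xbar)^2
m_i  = {"Group 1": 39, "Group 2": 67}
xbar = {"Group 1": 94.72, "Group 2": 44.63}
C, Y = 2, 3
m = sum(m_i.values())

epv = sum_within / (C * (Y - 1))                                                  # 18,508
vhm = (sum_between - (C - 1) * epv) / (m - sum(v ** 2 for v in m_i.values()) / m)  # about 879
K = epv / vhm
Z = {g: m_i[g] / (m_i[g] + K) for g in m_i}
cred_wtd = sum(Z[g] * xbar[g] for g in Z) / sum(Z.values())                        # about 67.7
estimate_2 = Z["Group 2"] * xbar["Group 2"] + (1 - Z["Group 2"]) * cred_wtd
print(round(estimate_2, 2))   # about 50.2, answer (B)
```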
4.2. B. Z = E/(E + K) = E/(E + 200).
In order to preserve the total losses, apply the complement of credibility to the credibility weighted
pure premium of:
(54.02)(129.36) + (33.99)(118.45) + (39.39)(98.46) + (19.03)(63.83)
= 110.00.
54.02 + 33.99 + 39.39 + 19.03
The estimated pure premium for Territory 4 is: (0.1903)(63.83) + (1 - 0.1903)(110.00) = 101.21.
Territory

Exposures

Losses

Pure
Premium

Credibility

Estimated
P.P.

1
2
3
4
Sum

235
103
130
47
515

30,400
12,200
12,800
3,000
58,400

129.36
118.45
98.46
63.83
113.40

54.02%
33.99%
39.39%
19.03%

120.46
112.87
105.45
101.21

Cred. Weighted

110.00

Comment: (120.46)(235) + (112.87)(103) + (105.45)(130) + (101.21)(47) = 58,399,


the observed total losses, subject to rounding.


4.3. C. From the data from the first class the estimated process variance is:
{(14)(10 - 9.4)2 + (20)(6 - 9.4)2 + (13)(14 - 9.4)2 } / (3 - 1) = 255.7. Similarly, for each of the
remaining classes the estimated process variance would be: 440.0, 468.1, 393.8, 137.3, 404.6.
Thus the estimate of the Expected Value of the Process Variance is:
(255.7 + 440.0 + 468.1 + 393.8 + 137.3 + 404.6) / 6 = 349.9.
6

4.4. A.

mi (X i - X)2 = 47(9.4 - 10.95)2 + 27(17.81 - 10.95)2 + 65(9.49 - 10.95)2 +


i=1

40(11.25 - 10.95)2 + 68(6.57 - 10.95)2 + 26(21.27 - 10.95)2 = 5597.5.


6

=m-

mi2 / m = 273 - {472 + 272 + 652 + 402 + 682 + 262}/273 = 221.48


i=1

Thus the estimated VHM = {5597.5 - (349.9)(6 - 1)} / 221.48 = 17.4.


4.5. B. K = 349.9/17.4 = 20.1. For the Cracker Manufacturing class the three years of data have a
total of 47 exposures. Thus Z = 47/(47 + 20.1) = 70.0%.


4.6. E. The credibility assigned to the data from the Creamery Class is: 27/(27 + 20.1) = 57.3%.
The credibilities assigned to the classes are: 0.700, 0.573, 0.764, 0.666, 0.772, 0.564.
The observed frequencies for the classes are given as: 9.40, 17.81, 9.49, 11.25, 6.57, 21.27.
Thus the credibility weighted pure premium is:
(0.700)(9.40) + (0.573)(17.81) + (0.764)(9.49) + (0.666)(11.25) + (0.772)(6.57) + (0.564)(21.27)
0.700 + 0.573 + 0.764 + 0.666 + 0.772 + 0.564
= 12.03.
The observed claim frequency for the Creamery Class is 17.81.
Thus the estimated future claim frequency for the Creamery Class is:
(57.3%)(17.81) + (42.7%)(12.03) = 15.34.
Comment: Using the method that preserves the overall number of claims:
Class

Exposures
Exposures
Claim Frequency
Claim Frequency
Year (Sum of 3 Years)
Year (over 3 years)

Cracker Mfg.
Creamery
Bakery
Macaroni Mfg.
Ice Cream Mfg.
Grain Milling

47
27
65
40
68
26

9.40
17.81
9.49
11.25
6.57
21.27

Overall
Cred. Weighted

273

10.95
12.03

Credibility
(K = 20.1)

Estimated Future
Claim Frequency

0.700
0.573
0.764
0.666
0.772
0.564

10.19
15.35
10.09
11.51
7.82
17.24

The observed claim frequency over the three years is given in the table in the question.
For Creamery it was calculated as: {(8)(26) + (10)(12) + (9)(17)} / (8 + 10 + 9) = 17.81.
The numerator is the number of claims observed over the three years.
The denominator is the number of exposures observed over the three years.
The mean observed claim frequency over all years and classes of 10.95 was calculated similarly.
If one did not use the method that preserves total claims, then the estimated future claim frequencies
for each of the six classes would be calculated as follows:
Class

Exposures
Exposures
Claim Frequency
Claim Frequency
Year (Sum of 3 Years)
Year (over 3 years)

Cracker Mfg.
Creamery
Bakery
Macaroni Mfg.
Ice Cream Mfg.
Grain Milling

47
27
65
40
68
26

9.40
17.81
9.49
11.25
6.57
21.27

Overall

273

10.95

Credibility
(K = 20.1)

Estimated Future
Claim Frequency

0.700
0.573
0.764
0.666
0.772
0.564

9.87
14.89
9.84
11.15
7.57
16.77


1
1
4.7. A. EPV =
27 3 - 1

27 3

mij (Xij - Xi)2 = 0.600/54 = 0.0111.


i=1 j=1

27

=m-

mi2 / m = 93.2 - (872.3/93.2) = 83.84.


i=1

27

VHM = {

mi (X i - X)2 - EPV(27 - 1)} / = {0.343 - (26)(0.0111)}/83.84 = 0.000649.


i=1

K = EPV/VHM = 0.0111/0.000649 = 17.1. Z = 8/(8 + 17.1) = 31.9%.


Comment: Loss Ratio = Losses / Premium takes the place of pure premium = Losses / Exposures.
Thus premiums take the place of exposures.
We are applying the same mathematics to a different situation. See 4, 11/01, Q.30.
It is usual to have the same number of years for the individual whose future you are predicting, as
was used to estimate K. However, that is not a requirement. Once K has been estimated, in this
case using 3 years of data from each of 27 agencies, this Buhlmann Credibility Parameter can be
used in the formula Z = N(N + K) with N equal to any reasonable value.
In this case, we happen to have more years of data for the agency we wish to estimate the future.
In other sections, I discuss how to estimate K using data that does not have the same number of
years of data from each individual or class.
4.8. A. The credibilities are: 5/(5 + 10) = 33.3%, 37.5%, and 50.0%.
{(0.333)(0.4) + (0.375)(0.5) + (0.500)(0.9)}
The Credibility Weighted Loss Ratio is:
= 0.638.
0.333 + 0.375 + 0.500
The estimated future loss ratio for policyholder #1 is: (0.333)(0.4) + (0.667)(0.638) = 55.9%.
1 1
4.9. A. EPV =
4 3 - 1
i

mij (Xij - Xi )2

= 864.15 / 8 = 108.02.

i=1 j=1
j

=m-

mi2 /m = 140 - (272 + 352 + 542 + 242)/140 = 101.1.


i=1
4

VHM = {

mi (X i - X)2 - (C - 1)EPV}/ = {1000.79 - (4-1)(108.02)} / 101.1 = 6.694.


i=1

K = EPV/VHM = 108.02/6.694 = 16.14.


Freedom has total premiums of 27 and credibility of: Z1 = 27/(27 + 16.14) = 0.626.


4.10. D. K = 16.14. Z1 = 27/(27 + 16.14) = 0.626.


Z 3 = 54/(54 + 16.14) = 0.770.

Z 2 = 35/(35 + 16.14) = 0.684.

Z 4 = 24/(24 + 16.14) = 0.598. Credibility weighted mean =

(0.626)(84.1%) + (0.684)(78.5%) + (0.770)(76.8%) + (0.598)(79.0%)


= 79.4%.
0.626 + 0.684 + 0.770 + 0.598
Estimated future loss ratio for Freedom is: (0.626)(84.1%) + (1 - 0.626)(79.4%) = 82.3%.
Comment: Here loss ratio takes the place of pure premium, while premiums takes the place of
exposures measuring the size of insured.
Just as in the more common case with pure premiums, here:

mij Xij is the total losses for insured j.


j

See 4, 11/01, Q.30 for yet another parallel case involving in that case regional rating factors.
1
4.11. E. v i =
Y - 1
1
Estimated EPV =
C
Class
Standard
Preferred

mit (Xit - Xi )2 = estimated process variance for class i.


t=1
C

vi = (22094.6 + 35525.0)/2 = 28,810.


i=1

(# of Exposures)(Loss per Exposure - 3 year average)^2


2001
2002
2003
21300.2
10272.2

1028.7
46694.4

21860.2
14083.3

Average

22094.6
35525.0
28809.8

120.00
64.00

Loss per Exposure


108.33
112.50

Standard
Preferred
Standard
Preferred

Standard
Preferred

Process
Variance

93.33
56.67

Weighted Avg.
105.41
78.33

$12,000
$3,200

Aggregate Losses
$13,000
$4,500

$14,000
$1,700

Sum
$39,000
$9,400

100
50

Number of Exposures
120
40

150
30

Sum
370
120

Comment: In this question, during 2004 we are using data from the years 2001-2003 in order to
predict what will happen in 2005. This is a very common situation in practical applications.



C

4.12. E. Let = m -

mi2 / m = 490 - (3702 + 1202) / 490 = 181.2.


i=1

X = overall average loss per exposure = 48,400/490 = 98.78.


C

mi ( Xi -

X)2 - EPV (C - 1)

estimated VHM = i=1

{370(105.41 - 98.78)2 + 120(78.33 - 98.78)2 - (2-1)(28810)} / 181.2 = 207.7.


4.13. A. K = EPV/VHM = 28810/207.7 = 139.
For the Standard Class, Z = 370/(370 + 139) = 72.7%.
4.14. B. For the Preferred Class, Z = 120/(120+139) = 46.3%. In order to preserve the total
losses apply the complement of credibility to the credibility weighted average pure premium:
{(0.727)(105.41) + (.0463)(78.33)} / (0.727 + 0.463) = 94.87.
Therefore, the estimated pure premium for the Preferred Class is:
(0.463)(78.33) + (1 - 0.463)(94.87) = 87.21.
1
1
4.15. B. EPV =
11 6 - 1

11 6

mij (Xij - Xi)2 = 1.843/55 = .0335.


i=1 j=1

11

=m-

mi2 / m = 77.2 - (1105/77.2) = 62.89.


i=1

11

VHM = { mi (X i - X )2 - EPV(11 - 1)} / = {0.361 - (0.0335)(10)} / 62.89 = 0.00041.


i=1

K = EPV/VHM = 0.0335/0.00041 = 82. Z = 9/(9 + 82) = 9.9%.


1
1
4.16. D. EPV =
C Y - 1

mij (Xij - Xi)2 = 101


i=1 j=1

1
(102,397) = 2560.
5 - 1



C

4.17. A. = m -

mi2 / m = 1374 - 269,954/1374 = 1177.5.


i=1

mi ( Xi -

X)2 - EPV (C - 1)

VHM = i=1

= {38,429 - (10-1)(2560)} / 1177.5 = 13.07.

4.18. E. K = EPV/VHM = 2560/13.07 = 195.9.


Z 6 = 33/(33+195.9) = 0.144.
4.19. C. Credibility weighted mean = 280.0/3.723 = 75.2.
Class

Exposures

Credibility

Loss Ratio

Product

Estimate

1
2
3
4
5
6
7
8
9
10

100
351
63
27
193
33
125
178
162
142

33.8%
64.2%
24.3%
12.1%
49.6%
14.4%
39.0%
47.6%
45.3%
42.0%

77.2
82.5
74.0
67.3
73.1
61.0
80.2
72.8
77.4
68.6

26.1
52.9
18.0
8.2
36.3
8.8
31.2
34.7
35.0
28.8

75.9
79.9
74.9
74.3
74.2
73.2
77.2
74.1
76.2
72.4

Sum

372.3%

280.0

Estimated future loss ratio for class 6 is: (61.0%)(.144) + (75.2%)(1 - 0.144) = 73.2%.
Comment: If the overall loss ratio were adequate, then the manual rate for class 6 might be
changed by: 73.2%/76.09% - 1 = -3.8%, ignoring fixed expense loadings, etc. Similarly, class 2
with a higher than average estimated loss ratio of 79.9%, might have its manual rate increased
by: 79.9%/76.09% - 1 = 5.0%.


1
4.20. D. vi =
Y - 1

mit (Xit - Xi )2 = estimated process variance of data for company i.


t=1

1
1
Estimated EPV = (1/C) vi =
C Y - 1

mij (Xij - Xi)2 = (26,723,482,801) / {(7)(4)}


i=1 j=1

= 954.4 million.
Comment: Here is how one would calculate
Company

Arriba Taxis
Bowery
Canarsie
Downtown
East River
Flushing
Gandhi
Sum

mij (Xij - X i)2:

(Number of Cabs)(Loss per Cab-5 year average)^2


1991
1992
1993
1994
1995
7.223e+8
8.311e+7
1.893e+7
3.906e+6
8.876e+6
3.030e+7
1.784e+7

4.494e+8
8.311e+7
1.833e+8
1.914e+8
3.030e+7
1.996e+8
3.305e+8

1.234e+10
4.427e+8
3.253e+8
5.542e+8
4.639e+6
1.814e+8
1.272e+9

4.336e+9
1.754e+8
2.033e+8
9.588e+7
1.598e+7
1.996e+8
1.760e+8

2.359e+9
3.262e+8
1.263e+8
1.406e+6
3.156e+5
1.178e+9
6.065e+7

Sum
2.020e+10
1.111e+9
8.570e+8
8.468e+8
6.010e+7
1.789e+9
1.857e+9
2.672e+10



C

4.21. A. Let = m -

mi2 / m =
i=1

915 - (4162 + 262 + 1112 + 482 + 1172 + 542 + 1432 )/915 = 915 - 244 = 668.6.
C

mi ( Xi -

estimated VHM = i=1

X)2 - EPV (C - 1)
=

{9876 million - (7 - 1)(954 million)} / 668.6 = 6.21 million.


Comment: X = overall average loss per cab = 5.7 million/915 = $6230.
Here is how one would calculate mi ( X i - X )2 :
Arriba Taxis
Bowery
Canarsie
Downtown
East River
Flushing
Gandhi

Mean
9495
8077
2973
4375
769
4259
4755

Exposures
416
26
111
48
117
54
143

Contribution to numerator of the VHM


4,434,653,600
88,696,634
1,177,493,439
165,169,200
3,489,234,957
209,781,414
311,114,375

Overall

6230

915

9,876,143,619

4.22. D. K = EPV/VHM = 954 /6.21 = 154. There are 48 exposures.


Z = 48/(48+154) = 23.8%.
4.23. B. Calculate the credibility weighted pure premium of 5218:
Arriba Taxis
Bowery
Canarsie
Downtown
East River
Flushing
Gandhi
Sum

Exposures
416
26
111
48
117
54
143

Z
73.0%
14.4%
41.9%
23.8%
43.2%
26.0%
48.1%
270.4%

Pure Premium
$9495
$8077
$2973
$4375
$769
$4259
$4755

Extension
$6930
$1167
$1245
$1040
$332
$1106
$2289
$14108

Estimate
$8340
$5631
$4278
$5018
$3297
$4969
$4995
$5218

The mean pure premium for Canarsie is $2973.


Canarsie has 111 exposures, and K = 154, so Z = 111/(111+154) = 41.9%.
Thus the estimated future pure premium is: (0.419)($2973) + (0.581)($5218) = $4277.


4.24. B. The mean pure premium for Bowery is $8077.


Bowery has 26 exposures and K = 154, so Z = 26/(26+154) = 14.4%.
Thus the estimated future pure premium is: (.144)($8077) + (.856)($5218) = $5630.
The expected aggregate losses for 5 cabs are: (5)(5630) = $28,150.
1
4.25. B. vi =
Y - 1

mit (Xit - Xi )2 = estimated process variance for class i.


t=1

Estimated EPV = (1/C) vi = (.00583 + .03167)/2 = 0.01875.


Class
A
B

(# of Exposures)(Frequency - 3 year average)^2


1
2
3

Process
Variance

0.00000
0.02722

0.00583
0.03167

0.00667
0.03556

0.00500
0.00056

Average

0.01875

A
B

0.02000
0.02000

Frequency
0.02667
0.04500

A
B

2
4

Number of Claims
4
9

100
200

Number of Exposures
150
200

A
B

0.01500
0.03000

Weighted Avg.
0.02000
0.03167

3
6

Sum
9
19

200
200

Sum
450
600

= m - mi2 /m = 1050 - (4502 + 6002 )/1050 = 514.3.


X1 = 9/450 = 0.02000. X 2 = 19/600 = 0.03167.
X = overall average frequency = 28/1050 = 0.02667.
C

mi ( Xi -

VHM = i=1

X)2 - EPV (C - 1)

{450(0.02000 - 0.02667)2 + 600(0.03167 - 0.02667)2 - (2 - 1)(0.01875)} / 514.3 = 0.0000316.


K = EPV/VHM = 0.01875/0.0000316 = 593.


4.26. C. Z A = 450/(450 + 593) = 0.431. ZB = 600/(600 + 593) = 0.503.


Credibility weighted mean = {(0.431)(0.0200) + (0.503)(0.03167)} / (0.431 + 0.503) = 0.0263.
Estimated frequency for Class A is: (.431)(.0200) + (1 - .431)(.0263) = 0.0236.
Comment: If one did not use the method that preserves total losses, in this case one would get a
very similar result: (0.431)(0.0200) + (1 - 0.431)(0.02667) = 0.0238.
4.27. D. Xit = pure premiums = losses/(number of automobiles):
Company

Pure Premium

I
II
III

500
300
400

Exposures
200
300
100
200
400
300

I
II
III

1
vi =
Y - 1

400
250
500

XBari

vi

440.00
266.67
442.86

1,200,000
166,667
1,714,286

Average

1,026,984

Total
500
300
700

mit (Xit - Xi )2 = estimated process variance for class i.


t=1

Estimated EPV = (1/C) vi = (1200 + 167 + 1714)/3 = 1027 thousand.


Let = m - mi2 / m = 1500 - (5002 + 3002 + 7002 )/1500 = 946.7.
X = overall average loss per exposure = 610,000/1500 = 406.67.

mi ( X i - X )2 = 500(440 - 406.67)2 + 300(266.67 - 406.67)2 + 700(442.86 - 406.67)2 =


7352 thousand.
C

mi ( Xi -

Estimated VHM = i=1

X)2 - EPV (C - 1)

{7352 thousand - (3 - 1)(1027 thousand)}/946.7 = 5596.


K = EPV/VHM = (1027 thousand) / 5596 = 184.
For Company II, Z = 300 / (300 + 184) = 62.0%.
Comment: Similar to 4, 11/04, Q.17. Note that even though there are missing data cells, each
company has the same number of years of data, two. Therefore, one can still use the formulas
without varying numbers of years of data, rather than the more complicated formulas with varying
number of years. Using the latter formulas would result in the same answer.


4.28. B. The pure premiums by year for Group A are: 60 and 50.
For Group A, overall p.p. is 1200/22 = 54.55.
The pure premiums by year for Group B are: 40 and 45. Overall p.p. is 1900/45 = 42.22.
1
vi =
Y - 1

mit (Xit - Xi )2 .
t=1

Estimated process variance for Group A:


(1/1) {(10)(60 - 54.55)2 + (12)(50 - 54.55)2 } = 545.46.
Estimated process variance for Group B:
(1/1) {(25)(40 - 42.22)2 + (20)(45 - 42.22)2 } = 277.78.
Estimated EPV is: (545.46 + 277.78)/2 = 411.62.
C

m-

mi2 / m = 67 - (222 + 452)/67 = 29.55.


i=1

The overall mean is: 3100 / 67 = 46.27.


C

mi ( Xi -

X)2 = (22)(54.55 - 46.27)2 + (45)(42.22 - 46.27)2 = 2246.40.

i=1

mi ( Xi -

estimated VHM = i=1

X)2 - EPV (C - 1)
C

m -

mi2 / m
i=1

K = EPV / VHM = 411.62 / 62.09 = 6.63.


For Group B, Z = 45 / (45 + K) = 87.2%.

= {2246.40 - (411.62)(2 - 1)} / 29.55 = 62.09.


1
4.29. E. v i =
Y - 1

mit (Xit - Xi )2 = estimated process variance for class i.


t=1

Estimated EPV = (1/C) vi = (10667 + 13917)/2 = 12,292.


Class
Adult
Youth

(# of Exposures)(Loss per Exposure - 4 year average)^2


1996
1997
1998
1999
18000
11250

4000
16000

9000
4375

1000
10125

Average

10667
13917
12292

Adult
Youth

$0
$15

Loss per Exposure


$5
$2

Adult
Youth

$0
$6,750

Aggregate Losses
$5,000
$500

2000
450

Number of Exposures
1000
250

Adult
Youth

Process
Variance

$6
$15

$4
$1

Weighted Avg.
$3
$10

$6,000
$2,625

$4,000
$125

Sum
15000
10000

1000
125

Sum
5000
1000

1000
175

K = EPV/VHM = 12292/17.125 = 718.


For Youth, Z = 1000/(1000 + K) = 0.582.
For Adult, Z = 5000/(5000 + K) = 0.874.
In order to preserve the total losses apply the complement of credibility to the credibility weighted
average p.p.: {(0.582)(10) + (0.874)(3)} / (0.582 + 0.874) = 5.80.
Therefore, the estimated pure premium for the Youth Class is:
(0.582)(10) + (1 - 0.582)(5.80) = 8.24.
Comment: The estimated pure premium for the Adult Class is:
(.874)(3) + (1 - .874)(5.80) = 3.35. The combined estimated losses corresponding to the
observed exposures are: (5000)(3.35) + (1000)(8.24) = 24,990, equal to the observed losses of
25,000, subject to rounding.
If one had been asked to calculate the VHM, one would proceed as follows.
Let = m - m2 / m = total exposures adjusted for degrees of freedom
= 6000 - (50002 + 10002 )/6000 = 1666.67.
X = overall average loss per exposure = 25000/6000 = 4.167.
C

mi ( Xi -

estimated VHM = i=1

X)2 - EPV (C - 1)

{(5000(3 - 4.167)2 + 1000(10 - 4.167)2 ) - (2 - 1)12292)} / 1666.67 = 17.125.


1
1
4.30. A. EPV =
C Y - 1
i

mij (Xij - Xi)2 = (1/2){1/(3-1)} (2020) = 505.


i=1 j=1

= m - mi2 /m = 100 - (252 + 752 )/100 = 37.5.


C

mi ( Xi -

X)2 - EPV (C - 1)

VHM = i=1

= {4800 - (2-1)(505)}/37.5 = 114.5.

K = EPV/VHM = 505/114.5 = 4.41.


Z 1 = 25/(25+4.41) = .850. Z2 = 75/(75 + 4.41) = .944.
Credibility weighted mean = {(97)(.850) + (113)(.944)}/(.850 + .944) = 105.4.
Estimate for group 1 is: (97)(.850) + (105.4)(1 - .850) = 98.3.
Comment: mi( X i - X )2 = (25)(97-109)2 + (75)(113-109)2 = 4800.
2

mij (Xij - Xi)2 =


i=1 j=1

(8)(96-97)2 + (12)(91-97)2 + (5)(113-97)2 + (25)(113-113)2


+ (30)(111-113)2 + (20)(116-113)2 = 2020.
Estimate for group 1 is: (97) (85.0%) + (105.4) (1 - 85.0%) = 98.3.
Estimate for group 2 is: (113) (94.4%) + (105.4) (1 - 94.4%) = 112.6.
Reported exposures for group 1: 25. Reported exposures for group 2: 75.
(25) (98.3) + (75) (112.6) = 10,903, matching total reported losses subject to rounding.
Reported total losses: (25) (97) + (75) (113) = 10,900.


4.31. C. EPV = (v1 + v2 + v3 )/3 = (0.536 + 0.125 + 0.172)/3 = 0.278.


= m - mi2 / m = 500 - (502 + 3002 + 1502 )/500 = 270.
C

mi ( Xi -

VHM = i=1

X)2 - EPV (C - 1)

= {(0.887 + 0.191 + 1.348) - (0.278)(3-1)} / 270 =

1.878/270 = .00693. K = EPV/VHM = 0.278/0.00693 = 40.1.


Z 1 = 50 / (50 + 40.1) = 0.555. Z2 = 300 / (300 + 40.1) = 0.882. Z3 = 150 / (150 + 40.1) = 0.789.
The credibility weighted rating factor is:
{(0.555)(1.406) + (0.882)(1.298) + (0.789)(1.178)} / (0.555 + 0.882 + 0.789) = 1.282.
Estimated rating factor for region 1: (0.555)(1.406) + (1 - 0.555)(1.282) = 1.351.
Comment: A situation not specifically covered on the Syllabus. On the exam, dont worry about the
situation, just apply the mathematics. A rating factor would be used to help set the rate to be
charged in a geographical region compared to some overall average rate. The higher the
rating factor, the higher the rate charged in that region.
For example, for automobile insurance an urban area would typically have a higher rating
factor than a rural area. See for example,Ratemaking by Charles McClenahan and Risk
Classification by Robert Finger, in Foundations of Casualty Actuarial Science.


4.32. B. Xit = pure premiums = losses/(number of automobiles):


Company

Pure Premium

I
II
III

500
300
3000

Exposures
100
200
500
300
50
150

I
II
III

1
vi =
Y - 1

250
500
1000

XBari

vi

333.33
375
1500

4,166,667
7,500,000
150,000,000

Average

53,888,889

Total
300
800
200

mit (Xit - Xi )2 = estimated process variance for class i.


t=1

Estimated EPV = (1/C) vi = (4.167 + 7.5 + 150)/3 = 53.89 million.


Let = m - mi2 / m = 1300 - (3002 + 8002 + 2002 )/1300 = 707.7.
X = overall average loss per exposure = 700,000/1300 = 538.46.
C

mi ( Xi -

Estimated VHM = i=1

X)2 - EPV (C - 1)

{300(333.33 - 538.46)2 + 800(375 - 538.46)2 + 200(1500 - 538.46)2 - (3-1)(53.89 mil)}/707.7 =


0.1570 million.
K = EPV / VHM = 53.89 million/ 0.1570 million = 343.
For Company III, Z = 200 / (200 + 343) = 36.8%.
Comment: Note that even though there are missing data cells, each company has the same number
of years of data, two. Therefore, one can still use the formulas without varying numbers of years of
data, rather than the more complicated formulas with varying number of years. Using the latter
formulas would result in the same answer.


4.33. B. Xit = pure premiums = (total claims) / (number in group).

Group     Exposures     Pure Premiums     X̄i        vi
1          50,  60        200, 250        227.27     68,182
2         100,  90        160, 200        178.95     75,789
Total        300                             Average: 71,986

vi = Σt mit (Xit - X̄i)² / (Yi - 1) = estimated process variance for group i.
Estimated EPV = (1/C) Σi vi = (68,182 + 75,789)/2 = 71,986.
The estimated VHM is given as 651.03. K = EPV/VHM = 71,986/651.03 = 110.6.
Group 1 has 110 exposures, so its data is given credibility of: 110/(110 + 110.6) = 49.9%.
Comment: a is Loss Models notation for the VHM. Thus â is the estimated VHM.
Presumably, those students who used the syllabus readings "Credibility" by Mahler and Dean, and
"Topics in Credibility" by Dean, which do not use that notation, had to guess that â was the
estimated VHM, or had to do this calculation themselves!
Even though there are missing years of data, each group has two years of data.
Therefore, there is no need to use the formulas involving differing numbers of years of data.
Group 2 has 190 exposures, so its data is given credibility of 190/(190 + 110.6) = 63.2%.
To verify the given VHM: m - Σi mi²/m = 300 - (110² + 190²)/300 = 139.33.
X̄ = overall average loss per exposure = 59,000/300 = 196.67.
Estimated VHM = {110(227.27 - 196.67)² + 190(178.95 - 196.67)² - (2 - 1)(71,986)} / 139.33
= 651.0, as given.


Section 5, Varying Exposures, Differing Numbers of Years45

As was the case with no variation in exposures, when one has variation in exposures the method
can be generalized in order to deal with differing numbers of years of data.
Just as in the case without exposures, rather than take an average of each insured's estimated
variance, one takes a weighted average in order to estimate the EPV. One uses weights equal to
the number of degrees of freedom for each class, that is the number of years of data for that class
minus 1. Yi = the number of years of data for class i.
Estimated EPV = weighted average of the estimated variances for each class
= Σi (Yi - 1) vi / Σi (Yi - 1) = Σi Σt mit (Xit - X̄i)² / Σi (Yi - 1).
When each class has the same number of years of data Y, then we just take a straight average of
the sample variances for each class, and the above equation for the EPV reduces to the prior one
when the years do not vary.
One uses the same formulas for estimating the VHM as when the years did not vary:
VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σi mi²/m).

45 The formulas in this section have those in the previous sections as special cases.


Exercise: You have data for 2 classes. You have no data from the Dairy Class for year one;
you have data from both classes for the remaining two years.
Use nonparametric Bayesian Estimation with the method that preserves total losses in order to
estimate the future pure premium for each class.

                          Exposures                  Losses
Class                  Yr 1   Yr 2   Yr 3        Yr 1   Yr 2   Yr 3
Poultry Farms           41     37     29          232     33    237
Dairy Farms            ---     59     53          ---     60    151

[Solution: The mean pure premium for Poultry is: (232 + 33 + 237) / (41 + 37 + 29) = 502/107 = 4.69.
The mean pure premium for Dairy is: (60 + 151) / (59 + 53) = 211/112 = 1.88.
EPV = Σi Σt mit (Xit - X̄i)² / Σi (Yi - 1). For Poultry the contribution to the numerator is:
(41)(5.66 - 4.69)² + (37)(0.89 - 4.69)² + (29)(8.17 - 4.69)² = 924.
Similarly, for Dairy: (59)(1.02 - 1.88)² + (53)(2.85 - 1.88)² = 94.
Thus the estimate of the Expected Value of the Process Variance is: (924 + 94) / (2 + 1) = 339.
m - Σi mi²/m = 219 - (107² + 112²)/219 = 109.4.
X̄ = overall average = (502 + 211) / (107 + 112) = 713/219 = 3.26.
Σi mi (X̄i - X̄)² = (107)(4.69 - 3.26)² + (112)(1.88 - 3.26)² = 432.
Thus the estimated VHM = {432 - (339)(2 - 1)} / 109.4 = 0.85. K = 339/0.85 = 399.
For the Poultry class the three years of data have a total of 107 exposures.
Thus Z = 107 / (107 + 399) = 21.1%.
For the Dairy class the two years of data have a total of 112 exposures.
Thus Z = 112 / (112 + 399) = 21.9%.
Credibility weighted pure premium is: {(21.1%)(4.69) + (21.9%)(1.88)} / (21.1% + 21.9%) = 3.26.
The estimated pure premium for Poultry is: (21.1%)(4.69) + (1 - 21.1%)(3.26) = 3.56.
The estimated pure premium for Dairy is: (21.9%)(1.88) + (1 - 21.9%)(3.26) = 2.96.
Comment: In this case, since the two classes have very similar volumes of data, the credibility
weighted pure premium is similar to the overall mean.]
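The arithmetic above can be organized as a short computation. Below is a minimal Python sketch
(my own illustration, not part of the syllabus readings; the variable names are mine) that
reproduces this Poultry/Dairy exercise, including the method that preserves total losses.

# Bühlmann-Straub nonparametric empirical Bayes with varying exposures and
# differing numbers of years; data are from the Poultry/Dairy exercise above.
data = {  # class -> list of (exposures, losses) by year; Dairy has no year-1 data
    "Poultry": [(41, 232), (37, 33), (29, 237)],
    "Dairy":   [(59, 60), (53, 151)],
}

# Per-class exposures and weighted mean pure premiums X-bar_i
m_i = {c: sum(m for m, _ in yrs) for c, yrs in data.items()}
xbar_i = {c: sum(L for _, L in yrs) / m_i[c] for c, yrs in data.items()}
m = sum(m_i.values())
xbar = sum(L for yrs in data.values() for _, L in yrs) / m      # overall mean, about 3.26

# EPV: weighted squared deviations divided by total degrees of freedom Σ(Yi - 1)
epv_num = sum(mt * (L / mt - xbar_i[c]) ** 2
              for c, yrs in data.items() for mt, L in yrs)
epv = epv_num / sum(len(yrs) - 1 for yrs in data.values())      # about 339

# VHM = {Σ mi (X-bar_i - X-bar)² - EPV (C - 1)} / (m - Σ mi²/m)
C = len(data)
vhm_num = sum(m_i[c] * (xbar_i[c] - xbar) ** 2 for c in data) - epv * (C - 1)
vhm = vhm_num / (m - sum(v ** 2 for v in m_i.values()) / m)     # about 0.85

K = epv / vhm                                                    # about 400
Z = {c: m_i[c] / (m_i[c] + K) for c in data}

# Method that preserves total losses: the complement of credibility goes to the
# credibility-weighted average of the class means.
cw_mean = sum(Z[c] * xbar_i[c] for c in data) / sum(Z.values())
estimates = {c: Z[c] * xbar_i[c] + (1 - Z[c]) * cw_mean for c in data}
print(estimates)   # roughly {'Poultry': 3.56, 'Dairy': 2.96}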


Assumptions for Nonparametric Empirical Bayes Estimation:

This technique is referred to as nonparametric empirical Bayes estimation, because we have
made no specific assumptions about the structure parameters. In fact, we make some assumptions;
as will be discussed, we implicitly or explicitly assume a certain covariance structure between the
data from years, classes, etc. To the extent the assumed covariance structure does not match the
particular situation to which the method is being applied, the resulting credibilities will be less than
optimal.46
The Assumed Covariance Structure:
The Empirical Bayes method shown here assumes the usual Bühlmann or Bühlmann-Straub
covariance structure, as discussed in Mahler's Guide to Buhlmann Credibility.
As discussed previously, when the size of risk m is not important, for two years of data from a single
risk, the Variance-Covariance structure between the years of data is as follows:
COV[Xt, Xu] = σ² δtu + τ², where δtu is 1 for t = u and 0 for t ≠ u.
COV[X1, X2] = τ² = VHM, and COV[X1, X1] = VAR[X1] = σ² + τ² = EPV + VHM.
When the size of risk m is important, the EPV is assumed to be inversely proportional to the size of
risk and the covariance structure of frequency, severity, or pure premiums is assumed to be:
COV[Xt, Xu] = δtu (σ²/m) + τ².
Assume one has data from a number of risks over several years. Let Xit be the data (die roll,
frequency, severity, pure premium, etc.) observed for class (or risk) i, in year t, and let mit be the
measure of size (premiums, exposures, number of die rolls, etc.) for class (or risk) i, in year t.
Then the assumed covariance structure is:
COV[Xit, Xju] = δij δtu (σ²/mit) + δij τ².
In other words, two observations from different random risks have a covariance of zero,
two observations of the same risk in different years have a covariance of τ², the VHM,
while the observation of a risk in a single year has a variance of (σ²/mit) + τ²,
the VHM plus the EPV (for a risk of size 1) divided by the size of risk.

46 Examples of other covariance structures are discussed near the end of Mahler's Guide to Buhlmann Credibility,
and in Examples 20.25 and 20.26 in Loss Models.
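For reference, the same covariance structure can be written compactly in LaTeX; this is only a
restatement of the formulas above (σ² = EPV for one exposure, τ² = VHM, δ = Kronecker delta),
not anything additional from the syllabus:

% Assumed Buhlmann-Straub covariance structure, restating the text above
\[
  \operatorname{Cov}[X_{it}, X_{ju}]
    = \delta_{ij}\,\delta_{tu}\,\frac{\sigma^2}{m_{it}} + \delta_{ij}\,\tau^2 ,
  \qquad
  \operatorname{Var}[X_{it}] = \frac{\sigma^2}{m_{it}} + \tau^2 .
\]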


Let M be the overall grand mean, M = E[Xit]. Then we have from the assumed covariance structure
that the expected value of the product of two observations is:
E[Xit Xju] = COV[Xit, Xju] + E[Xit] E[Xju] = δij δtu (σ²/mit) + δij τ² + M².
Then for example for a single risk in a single year: E[Xit²] = (σ²/mit) + τ² + M².
Let X̄i = (1/mi) Σt mit Xit = weighted average of data for class i.
Then E[Xit X̄i] = (1/mi) Σu miu E[Xit Xiu] = (1/mi) Σu miu {δtu (σ²/mit) + τ² + M²}
= (1/mi) Σu miu {τ² + M²} + (1/mi) mit (σ²/mit) = τ² + M² + σ²/mi.
Also E[X̄i²] = E[X̄i X̄i] = E[(1/mi) Σt mit Xit X̄i] = (1/mi) Σt mit E[Xit X̄i]
= (1/mi) Σt mit (τ² + M² + σ²/mi) = τ² + M² + σ²/mi = E[Xit X̄i].
An Unbiased Estimate of the EPV (for a single exposure):

In order to estimate the EPV, one looks at the squared differences between the observations for a
class and the weighted average observations for that class:
Let vi = Σt mit (Xit - X̄i)² / (Y - 1).
Then, (Y - 1) E[vi] = Σt mit E[(Xit - X̄i)²] = Σt mit {E[Xit²] - 2 E[Xit X̄i] + E[X̄i²]}
= Σt mit {(σ²/mit) + τ² + M² - (τ² + M² + σ²/mi)} = σ² Σt (1 - mit/mi) = σ² (Y - 1).
Thus, E[vi] = σ² (Y - 1) / (Y - 1) = σ². vi is an unbiased estimator of the (expected value of the)
process variance. Then any weighted average of the vi for the various classes would be an
unbiased estimator of the EPV.
In particular, EPV = (1/C) Σi vi
is an unbiased estimator of the EPV (for a single exposure of the risk process).
An Unbiased Estimate of the VHM:
In order to estimate the VHM, one looks at the squared differences between the weighted average
observations for each class and the overall weighted average, Σi mi (X̄i - X̄)².
For two different risks, i ≠ j:
E[X̄i X̄j] = E[(1/(mi mj)) Σt Σu mit mju Xit Xju] = (1/(mi mj)) Σt Σu mit mju E[Xit Xju]
= (1/(mi mj)) Σt Σu mit mju M² = M².
Therefore, since previously we had that E[X̄i²] = M² + τ² + σ²/mi, we have that:
E[X̄i X̄j] = M² + δij (τ² + σ²/mi).
E[X̄i X̄] = E[(1/m) Σj mj X̄i X̄j] = (1/m) Σj mj E[X̄i X̄j] = (1/m) Σj mj {M² + δij (τ² + σ²/mi)}
= (1/m) {m M² + mi τ² + σ²} = M² + τ² mi/m + σ²/m.
E[X̄²] = E[X̄ X̄] = E[(1/m) Σi mi X̄i X̄] = (1/m) Σi mi E[X̄i X̄]
= (1/m) Σi mi (M² + τ² mi/m + σ²/m) = M² + τ² Σi mi²/m² + σ²/m.
Then E[Σi mi (X̄i - X̄)²] = Σi mi E[(X̄i - X̄)²] = Σi mi {E[X̄i²] - 2 E[X̄i X̄] + E[X̄²]}
= Σi mi {M² + τ² + σ²/mi - 2(M² + τ² mi/m + σ²/m) + M² + τ² Σj mj²/m² + σ²/m}
= τ² Σi mi (1 - 2mi/m + Σj mj²/m²) + σ² Σi (1 - mi/m)
= τ² (m - Σj mj²/m) + σ² (C - 1).
Thus, E[ {Σi mi (X̄i - X̄)² - σ² (C - 1)} / (m - Σi mi²/m) ] = τ².

Therefore if one takes EPV = (1/C) Σi vi, which is an unbiased estimator of σ², then
VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σi mi²/m)
is an unbiased estimator of the Variance of the Hypothetical Means, τ².
General Remarks:
The arrays of data have two dimensions. In my formulas the first dimension, the rows of the array,
are different classes. However, rather than classes they could equally well be different policyholders,
different territories, different states, different groups, different experiments, etc. While exam
questions often have for simplicity only two or three classes, these techniques are usually not
applied in practical applications unless there are at least 4 classes.47
In my formulas the second dimension, the columns of the array, are different years. However, rather
than years they could equally well be different observations, different policyholders, etc.
It should be noted that while we have obtained unbiased estimators of the EPV and the VHM, their
ratio is not necessarily an unbiased estimator of the Bühlmann Credibility Parameter
K = EPV/VHM.48 Also it should be noted that the Bühlmann-Straub technique assumes a certain
behavior of the covariances with size of risk and that the risk parameters are (approximately) stable
over time. While these are the most commonly used assumptions, they do not hold in some real
world applications.49 Thus as with all techniques, in practical applications one must be careful to
only apply the Bühlmann-Straub empirical Bayes technique in appropriate circumstances.

47 See for example Gary Venter's Credibility Chapter of Editions 1 to 3 of Foundations of Casualty Actuarial Science.
48 See for example Gary Venter's Credibility Chapter of Editions 1 to 3 of Foundations of Casualty Actuarial Science.
49 See for example William R. Gillam's "Parametrizing the Workers Compensation Experience Rating Plan," PCAS
1992 and Howard Mahler's Discussion in PCAS 1993, "A Markov Chain Model of Shifting Risk Parameters," by
Howard Mahler, PCAS 1997, and "Credibility with Shifting Risk Parameters, Risk Heterogeneity and Parameter
Uncertainty," by Howard Mahler, PCAS 1998.

Summary of Bühlmann-Straub Empirical Bayes Estimation:

Name                                                  Symbol / Formula
Number of Classes50                                   C
Frequency for Class i in Year t51                     Xit
Exposures for Class i in Year t52                     mit
Number of Years of data for Class i                   Yi
Sum of Exposures for Class i                          mi = Σt mit
Overall Sum of Exposures                              m = Σi Σt mit
Weighted Average for Class i                          X̄i = Σt mit Xit / mi
Overall Weighted Average                              X̄ = Σi Σt mit Xit / m
Expected Value of Process Variance                    EPV = Σi Σt mit (Xit - X̄i)² / Σi (Yi - 1)
Total Exposures Adjusted for Degrees of Freedom       m - Σi mi²/m
Variance of the Hypothetical Means                    VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σi mi²/m)
Bühlmann Credibility Parameter                        K = EPV / VHM
Credibility For Class i                               Zi = mi / (mi + K)

50 Or number of policyholders or number of groups.
51 Or pure premium or loss ratio.
52 Or premiums or number of members in a group.


Problems:
Use the following information for the next 3 questions:
Past data for a portfolio of group health policyholders are given below:
                                Year 1    Year 2    Year 3    Sum
Policyholder 1   Losses           ---        20        16      36
                 # in group       ---        10        13      23
Policyholder 2   Losses            19        23        17      59
                 # in group        11         8         7      26
Policyholder 3   Losses            26        30        35      91
                 # in group        14        17        18      49
Sum              Losses            45        73        68     186
                 # in group        25        35        38      98

Note that there is no data from policyholder #1 in year 1.


5.1 (5 points) Estimate the Bhlmann-Straub credibility premium to be charged policyholder #1 in
year 4, if you expect 15 in the group.
A. Less than 28
B. At least 28 but less than 30
C. At least 30 but less than 32
D. At least 32 but less than 34
E. At least 34
5.2 (1 point) Estimate the Bhlmann-Straub credibility premium to be charged policyholder #2 in
year 4, if you expect 10 in the group.
A. Less than 17
B. At least 17 but less than 18
C. At least 18 but less than 19
D. At least 19 but less than 20
E. At least 20
5.3 (1 point) Estimate the Bhlmann-Straub credibility premium to be charged policyholder #3 in
year 4, if you expect 20 in the group.
A. Less than 36
B. At least 36 but less than 37
C. At least 37 but less than 38
D. At least 38 but less than 39
E. At least 39


Use the following information for the next three questions:


There is data from private passenger automobile insurance policies, divided into 4 classes.
Let Xij be the pure premium for policy j in class i.
Let mij be the exposures for policy j in class i.
Then Lij = mij Xij is the loss for policy j in class i.
Class     Number of Policies     Number of Exposures     Losses     Σj mij Xij²
1               1140                   1741               1266        12,883
2               1000                   1514               1390        16,157
3                960                   1456               1359        15,133
4               1060                   1609               1846        22,805
Total           4160                   6320               5861        66,978

5.4 (3 points) Use nonparametric empirical Bayes methods to estimate the Expected Value of the
Process Variance.
A. 11
B. 12
C. 13
D. 14
E. 15
5.5 (3 points) Use nonparametric empirical Bayes methods to estimate the Variance of the
Hypothetical Means.
A. 0.010
B. 0.015
C. 0.020
D. 0.025
E. 0.030
5.6 (2 points) Determine the nonparametric empirical Bayes credibility premium for class 4, using
the method that preserves total losses.
A. 1.02
B. 1.04
C. 1.06
D. 1.08
E. 1.10
5.7 (4 points) ABC Insurance Company offers a policy for maid services that is rated on a per
employee basis. The two insureds shown in the table below were randomly selected
from ABCs policyholder database. Over a four-year period the following was observed:
                                  Year
Insured                      1     2     3     4
A      Number of Claims      1     2     1     3
       No. of Employees     20    22    20    18
B      Number of Claims      1     0     1   ---
       No. of Employees     14    15    16   ---
Estimate the expected annual claim frequency per employee for insured A using the
empirical Bayes Bhlmann-Straub estimation model with the method that preserves total claims.
A. 7.5%
B. 7.7%
C. 7.9%
D. 8.1%
E. 8.3%


Solutions to Problems:
5.1. A, 5.2. E, & 5.3. C. The mean pure premium for policyholder #1 is: 36/23 = 1.565.
The mean pure premium for policyholder #2 is: 59/26 = 2.269.
The mean pure premium for policyholder #3 is: 91/49 = 1.857.
EPV = Σi Σj mij (Xij - X̄i)² / Σi (Yi - 1) =
({(10)(2 - 1.565)² + (13)(1.231 - 1.565)²} + {(11)(1.727 - 2.269)² + (8)(2.875 - 2.269)² +
(7)(2.429 - 2.269)²} + {(14)(1.857 - 1.857)² + (17)(1.765 - 1.857)² + (18)(1.944 - 1.857)²}) /
(1 + 2 + 2) = 1.994.
m - Σi mi²/m = 98 - {23² + 26² + 49²}/98 = 61.2.
X̄ = overall average = 186/98 = 1.898.
Σi mi (X̄i - X̄)² = (23)(1.565 - 1.898)² + (26)(2.269 - 1.898)² + (49)(1.857 - 1.898)² = 6.211.
Thus the estimated VHM = {6.211 - (1.994)(3 - 1)} / 61.2 = 0.0363.
K = 1.994/0.0363 = 54.9.

Policyholder   Exposures   Losses   Pure Premium      Z      Estimated P.P.   Year 4 Expos.   Premium
1                  23         36        1.565        29.5%       1.800             15          27.00
2                  26         59        2.269        32.1%       2.017             10          20.17
3                  49         91        1.857        47.2%       1.879             20          37.57
Overall            98        186        1.898

Comment: The losses could be in units of $1000.

5.4. E, 5.5. C, 5.6. D.
Σj mij (Xij - X̄i)² = Σj {mij Xij² - 2 mij Xij X̄i + mij X̄i²} = Σj mij Xij² - 2 Li X̄i + mi X̄i²
= Σj mij Xij² - Li²/mi.
Σi Σj mij (Xij - X̄i)² = Σi {Σj mij Xij² - Li²/mi}
= 66,978 - {1266²/1741 + 1390²/1514 + 1359²/1456 + 1846²/1609} = 61,395.
EPV = Σi Σj mij (Xij - X̄i)² / (total policies - C) = 61,395/(4160 - 4) = 14.77.
m - Σi mi²/m = 6320 - (1741² + 1514² + 1456² + 1609²)/6320 = 4732.6.
X̄ = 5861/6320 = 0.927. X̄1 = 1266/1741 = 0.727. X̄2 = 1390/1514 = 0.918.
X̄3 = 1359/1456 = 0.933. X̄4 = 1846/1609 = 1.147.
Σi mi (X̄i - X̄)² = (1741)(0.727 - 0.927)² + (1514)(0.918 - 0.927)² + (1456)(0.933 - 0.927)²
+ (1609)(1.147 - 0.927)² = 147.7.
VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σi mi²/m) = {147.7 - (14.77)(4 - 1)}/4732.6 = 0.0218.
K = EPV/VHM = 14.77/0.0218 = 678.
Estimated P.P. for Class 4: (70.4%)(1.147) + (1 - 70.4%)(0.930) = 1.083.

Class     Exposures   Losses    P.P.       Z       Estimate
1           1741       1266    0.727     72.0%      0.784
2           1514       1390    0.918     69.1%      0.922
3           1456       1359    0.933     68.2%      0.932
4           1609       1846    1.147     70.4%      1.083
Credibility weighted pure premium: 0.930.

Comment: The number of policies acts as the number of observations for each class.
Taken from "Credible Risk Classification," by Benjamin Joel Turner, Winter 2004 CAS Forum; the
losses and pure premiums have been all divided by 1000.
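Since this solution leans on the shortcut Σj mij Xij² - Li²/mi, here is a small Python check (my own
illustration, not part of the original solution; the variable names are mine) that reproduces the
numbers above from the class totals given in the problem.

# Check of the shortcut Σj mij Xij² - Li²/mi and the resulting EPV, VHM, K, Z.
classes = {
    # class: (number of policies, exposures mi, losses Li, Σj mij Xij²)
    1: (1140, 1741, 1266, 12883),
    2: (1000, 1514, 1390, 16157),
    3: (960, 1456, 1359, 15133),
    4: (1060, 1609, 1846, 22805),
}

ss = sum(s - L * L / mi for (_, mi, L, s) in classes.values())       # ≈ 61,395
dof = sum(n for (n, _, _, _) in classes.values()) - len(classes)     # 4160 - 4 = 4156
epv = ss / dof                                                        # ≈ 14.77

m = sum(mi for (_, mi, _, _) in classes.values())                     # 6320
xbar = sum(L for (_, _, L, _) in classes.values()) / m                # ≈ 0.927
between = sum(mi * (L / mi - xbar) ** 2 for (_, mi, L, _) in classes.values())   # ≈ 147.7
denom = m - sum(mi ** 2 for (_, mi, _, _) in classes.values()) / m    # ≈ 4732.6
vhm = (between - epv * (len(classes) - 1)) / denom                    # ≈ 0.0218
K = epv / vhm                                                         # ≈ 678
Z4 = classes[4][1] / (classes[4][1] + K)                              # ≈ 70.4%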


5.7. A. The mean frequency for insured A is: 7/80 = 0.0875.
The mean frequency for insured B is: 2/45 = 0.0444.
EPV = Σi Σj mij (Xij - X̄i)² / Σi (Yi - 1) =
{(20)(0.05 - 0.0875)² + (22)(0.0909 - 0.0875)² + (20)(0.05 - 0.0875)² + (18)(0.1667 - 0.0875)²
+ (14)(0.0714 - 0.0444)² + (15)(0 - 0.0444)² + (16)(0.0625 - 0.0444)²} / (3 + 2) = 0.0429.
m - Σi mi²/m = 125 - {80² + 45²}/125 = 57.6.
X̄ = overall average = 9/125 = 0.072.
Σi mi (X̄i - X̄)² = (80)(0.0875 - 0.072)² + (45)(0.0444 - 0.072)² = 0.0535.
Thus the estimated VHM = {0.0535 - (0.0429)(2 - 1)} / 57.6 = 0.000184.
K = 0.0429/0.000184 = 233.
The credibilities are: 80/(80 + 233) = 0.256, and 45/(45 + 233) = 0.162.
Credibility weighted average is: {(0.256)(0.0875) + (0.162)(0.0444)}/(0.256 + 0.162) = 0.0708.
Estimated frequency for Insured A: (0.256)(0.0875) + (1 - 0.256)(0.0708) = 0.0751.
Estimated frequency for Insured B: (0.162)(0.0444) + (1 - 0.162)(0.0708) = 0.0665.
Comment: Similar to Exercise 15 in "Topics in Credibility" by Dean.


Section 6, Using an A Priori Mean

One can use similar techniques to those discussed in the previous sections, but employing an a
priori estimate of the overall mean, μ.53
In that case, one uses the same estimator for the EPV, but uses a different and somewhat more
stable estimator for the VHM.
Also, (1 - Z) is applied to the a priori mean, μ.
No Variation in Exposures:
When there is no variation in exposures (or years), then when using an a priori estimate of the
overall mean, μ, the estimate of the VHM is:
VHM = {Σi (X̄i - μ)²} / C - EPV/Y.
Exercise: You have data for 2 policies over three years. Both policies are currently charged a rate
based on an expected annual loss of $587. Use nonparametric Bayesian Estimation to estimate the
future annual loss for each policy.
                 Losses
Policy      Yr 1    Yr 2    Yr 3
A            404     433     537
B            632     551     660

53 Loss Models sometimes refers to this prior estimate, μ, as the manual rate.


[Solution: The EPV is estimated as 4048, in the usual manner making no use of the a priori estimate
of the mean:

Policy      Yr 1    Yr 2    Yr 3     Mean     Sample Variance
A            404     433     537     458.0         4,891
B            632     551     660     614.3         3,204
                                        Average:   4,048

The mean for Policy A is 458. The mean for Policy B is 614.3. We are given μ = 587.
VHM = {Σi (X̄i - μ)²}/C - EPV/Y = {(458 - 587)² + (614.3 - 587)²}/2 - 4048/3 = 7344.
K = 4048 / 7344 = 0.55. There are 3 years of data. Thus Z = 3 / (3 + 0.55) = 84.5%.
The estimated future annual loss for Policy A is: (0.845)(458) + (0.155)(587) = 478.
The estimated future annual loss for Policy B is: (0.845)(614.3) + (0.155)(587) = 610.
Comment: Note that we apply the complement of credibility to the a priori mean of 587, rather than
the observed overall mean of 536.]
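Here is a minimal Python sketch (my own illustration; the names are mine, not from the syllabus
readings) of the a-priori-mean procedure just used, reproducing the exercise above.

# Nonparametric empirical Bayes with an a priori (manual) mean, no variation in exposures.
from statistics import mean, variance

mu = 587                                   # a priori (manual) annual loss
data = {"A": [404, 433, 537], "B": [632, 551, 660]}
Y = 3                                      # years of data per policy

epv = mean(variance(x) for x in data.values())              # average sample variance ≈ 4048
xbar = {p: mean(x) for p, x in data.items()}
vhm = mean((xbar[p] - mu) ** 2 for p in data) - epv / Y     # ≈ 7344
K = epv / vhm                                                # ≈ 0.55
Z = Y / (Y + K)                                              # ≈ 84.5%

# Complement of credibility goes to mu, not to the observed overall mean.
estimates = {p: Z * xbar[p] + (1 - Z) * mu for p in data}
print(estimates)   # roughly {'A': 478, 'B': 610}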
Summary:
μ = a priori estimate of the mean.
The estimate of the EPV is the same as without an a priori mean:
EPV = Σi si² / C = Σi Σt (Xit - X̄i)² / {C (Y - 1)}.
The estimate of the VHM differs from that without an a priori mean:
VHM = {Σi (X̄i - μ)²} / C - EPV/Y; if the estimated VHM is negative, set Z = 0.
Estimate is: Z(observation) + (1 - Z) μ.


Working with Only One Policy, No Variation in Exposures:

In the case where you use an assumed a priori mean, one could even apply this technique to a
single policy or class.
With C = 1 and no variation in exposures, the estimators for the EPV and VHM simplify to:
EPV = Σt (Xt - X̄)² / (Y - 1).
VHM = (X̄ - μ)² - (EPV / Y).
Exercise: You have data for a policy over three years. The policy is currently charged a rate based
on an expected annual loss of $587. Use nonparametric Bayesian Estimation to estimate the future
annual loss for this policy.
          Losses
Yr 1    Yr 2    Yr 3
 404     433     537
[Solution: The EPV is estimated as 4891, from the sample variance, making no use of the a priori
estimate of the mean. The observed mean is 458. We are given μ = 587.
VHM = (X̄ - μ)² - EPV/Y = (458 - 587)² - 4891/3 = 15,011.
K = 4891/15,011 = 0.33. There are 3 years of data. Thus Z = 3 / (3 + 0.33) = 90.1%.
The estimated future annual loss for this policy is: (0.901)(458) + (0.099)(587) = 471.]


Problems:
6.1 (2 points) For a particular policyholder, the manual rate is 15 per year.
The past claims experience is:
Year
1
2
3
Claims
13
21
19
Estimate the Bhlmann credibility premium for the next year for the policyholder.
A. Less than 15
B. At least 15 but less than 16
C. At least 16 but less than 17
D. At least 17 but less than 18
E. At least 18
6.2 (3 points) An insurer has data on losses for four policyholders for seven years.
Xij is the loss from the ith policyholder for year j.
4

(Xij - Xi )2

= 33.60

i=1 j=1

X1 = 1.21, X 2 = 2.98, X 3 = 0.49, X 4 = 1.72.


Each policyholder is charged an annual rate based on 1.70 in expected losses.
Calculate the Bhlmann credibility factor for an individual policyholder using nonparametric empirical
Bayes estimation.
(A) Less than 0.74
(B) At least 0.74, but less than 0.77
(C) At least 0.77, but less than 0.80
(D) At least 0.80, but less than 0.83
(E) At least 0.83
6.3 (3 points) You have data for 3 policies over four years.
Policy
A
B
C

1
198
203
177

Losses
2
249
227
210

Total

578

686

3
205
220
185

4
212
231
192

Total
864
881
764

610

635

2509

For each policy the manual rate is $210 per year.


Estimate the Bhlmann credibility premium for the next year for policy C.
A. 193
B. 195
C. 197
D. 199
E. 201


6.4 (3 points) For each of 3 years, we have the number of claims for each of 2 insureds:
A
B

100
90

110
80

120
100

A priori you expected 95 claims per year for each insured.


Estimate the number of claims for insured A next year, using Nonparametric Empirical Bayes
estimation.
A. 103 or less
B. 104
C. 105
D. 106
E. 107 or more
6.5 (3 points) Prior to any observations you assume the average survival time is 20 years.
Survival times are available for six insureds, three from Class A and three from Class B.
The three from Class A died at times t = 17, t = 32, and t = 39.
The three from Class B died at times t = 12, t = 20, and t = 24.
Nonparametric Empirical Bayes estimation is used to estimate the mean survival time for each class.
What is the estimated mean survival time for Class A?
(A) Less than 23
(B) At least 23, but less than 24
(C) At least 24, but less than 25
(D) At least 25, but less than 26
(E) At least 26
6.6 (3 points) A priori you project $400 in losses in 2005 for each insured.
Two insureds produced the following losses over a three-year period:
Annual Losses
Insured
2001
2002
2003
Thelma
$350
$230
$300
Louise
$270
$510
$390
Inflation is 10% per year.
Using the nonparametric empirical Bayes method, estimate the losses in 2005 for Louise.
A. 415
B. 420
C. 425
D. 430
E. 435


Solutions to Problems:
6.1. B. The mean annual claims are: (13 + 21 + 19)/3 = 17.67.
The estimated EPV = sample variance =
{(13 - 17.67)2 + (21 - 17.67)2 + (19 - 17.67)2 } / (3 - 1) = 17.3.
The estimated VHM = (17.67 - 15)2 - 17.3 / 3 = 1.36.
K = 17.3/1.36 = 12.7. For the three years of data, Z = 3/(3 + 12.7) = 19.1%.
The estimate for the next year is: (0.191)(17.67) + (0.809)(15) = 15.50.
6.2. B. Estimated EPV = average of the sample variances for each policyholder =
1
1
C Y - 1

(Xij - X i)2 = (1/4)(1/6)(33.60) = 1.4.


i=1 j=1

1
Estimated VHM =
C

(Xi - )2 -

i=1

_
EPV
= (1/4) ( Xi - 1.72)2 - 1.4/7 =
Y

(1/4) {(1.21 - 1.70)2 + (2.98 - 1.70)2 + (0.49 - 1.70)2 + (1.72 - 1.70)2 } - 0.2 = 0.636.
K = EPV/VHM = 1.4/0.636 = 2.20.
With 7 years of data, Bhlmann credibility factor = Z = 7/(7 + K) = 7/9.2 = 76.1%.
6.3. D. The EPV is estimated as 289, in the usual manner making no use of the a priori estimate of
the mean:
Policy
1
198
203
177

A
B
C

Losses
2
249
227
210

3
205
220
185

Average

1
VHM =
C

4
212
231
192

Mean

Sample Variance

216.0
220.2
191.0

517
153
198
289

=
(Xi - )2 - EPV
Y
i=1

{(216 - 210)2 + (220.2 - 210)2 + (191 - 210)2 }/3 - 289/4 = 94.8.


K = 289/94.8 = 3.05. There is 4 years of data. Thus Z = 4/(4 + 3.05) = 56.7%.
The estimated future annual loss for Policy C is: (0.567)(191) + (0.433)(210) = 199.2.
Comment: Since we are given an a priori mean, the formula for the estimated VHM is different.


6.4. D. EPV = (100 + 100)/2 = 100.

A
B

Mean

100
90

110
80

120
100

110.00
90.00

Sample
Variance
100.00
100.00

100.000

100.000

Mean

1
VHM =
C

= {(110 - 95)2 + (90 - 95)2 }/2 - 100/3 = 91.67.


(Xi - )2 - EPV
Y
i=1

K = EPV/VHM = 100/91.67 = 1.09. Z = 3/(3 + 1.09) = 0.733.


Estimated frequency for insured A: (0.733)(110) + (1 - 0.733)(95) = 106.0.
6.5. B. EPV = average of the sample variances for each class = 81.833.
Class

First
Survival Time

Second
Survival Time

Third
Survival Time

Mean

Sample
Variance

A
B

17
12

32
20

39
24

29.333
18.667

126.333
37.333

24.000

81.833

Average

1
Estimated VHM =
C

=
(Xi - )2 - EPV
Y
i=1

{(29.333 - 20)2 + (18.667 - 20)2 }/2 - 81.833/3 = 17.16.


Estimated K = 81.833/17.16 = 4.77. Z = 3/(3 + 4.77) = 0.386.
Estimated future mean survival time for class A: (0.386)(29.333) + (1 - 0.386)(20) = 23.6.
6.6. B. First, inflate all of the losses up to the 2005 level. For example, (1.14 )(350) = 512.
Insured

2001

Year
2002

2003

Mean

Sample
Variance

Thelma
Louise

512
395

306
679

363
472

394
515

11,354
21,509

Average

Estimated EPV = 16,432.


Estimated VHM = {(394 - 400)2 + (515 - 400)2 }/2 - 16,432/3 = 1153.
K = 16,432/1153 = 14.25. Z = 3/(3 + 14.25) = 17.4%.
The estimate for Louise is: (0.174)(515) + (1 - 0.174)(400) = 420.

16,432


Section 7, Using an A Priori Mean, Variation in Exposures54

When one employs an a priori estimate of the overall mean and there is variation in exposures, then
one uses the same estimator for the EPV as was used in the absence of employing an a priori
mean, but uses an estimator for the VHM somewhat different than that used in the absence of
employing an a priori mean:
VHM = {Σi mi (X̄i - μ)² - (C)(EPV)} / m.
Exercise: You have data for 2 classes over three years. Both classes are currently charged a rate
based on a pure premium of 2.50.
Use nonparametric Bayesian Estimation to estimate the future pure premium for each class.

                          Exposures                  Losses
Class                  Yr 1   Yr 2   Yr 3        Yr 1   Yr 2   Yr 3
Poultry Farms           41     37     29          232     33    237
Dairy Farms             58     59     53          104     60    151

[Solution: The EPV was previously estimated as 254.
The mean pure premium for Poultry is: (232 + 33 + 237) / (41 + 37 + 29) = 502/107 = 4.69.
The mean pure premium for Dairy is: (104 + 60 + 151) / (58 + 59 + 53) = 315/170 = 1.85.
m = overall exposures = 277.
Σi mi (X̄i - μ)² = (107)(4.69 - 2.50)² + (170)(1.85 - 2.50)² = 585.
Thus the estimated VHM = {585 - (2)(254)} / 277 = 0.278. K = 254 / 0.278 = 914.
For the Poultry class the three years of data have a total of 107 exposures.
Thus Z = 107 / (107 + 914) = 10.5%.
For the Dairy class the three years of data have a total of 170 exposures.
Thus Z = 170 / (170 + 914) = 15.7%.
The estimated future pure premium for Poultry Farms is: (0.105)(4.69) + (0.895)(2.50) = 2.73.
The estimated future pure premium for Dairy Farms is: (0.157)(1.85) + (0.843)(2.50) = 2.40.

Class             Exposures   Losses   Pure Premium      Z      Estimated P.P.
Poultry Farms        107        502        4.69        10.5%         2.73
Dairy Farms          170        315        1.85        15.7%         2.40

Comment: Note that we apply the complement of credibility to the a priori mean pure premium of
2.50, rather than the observed overall mean of 2.95.
Loss Models would refer to this a priori mean of 2.50 as the manual rate.]

54 See pages 629-630, including Example 20.35, in Loss Models.
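Below is a minimal Python sketch (my own illustration; the class names and variables are mine) of
the a-priori-mean formulas of this section with varying exposures, reproducing the Poultry/Dairy
exercise above.

# A priori mean with varying exposures: VHM = {Σ mi (X-bar_i - mu)² - C·EPV} / m.
mu = 2.50                                  # a priori (manual) pure premium
data = {  # class -> list of (exposures, losses) by year
    "Poultry": [(41, 232), (37, 33), (29, 237)],
    "Dairy":   [(58, 104), (59, 60), (53, 151)],
}
epv = 254                                  # estimated previously from the same data

m_i = {c: sum(m for m, _ in yrs) for c, yrs in data.items()}
xbar_i = {c: sum(L for _, L in yrs) / m_i[c] for c, yrs in data.items()}
m, C = sum(m_i.values()), len(data)

vhm = (sum(m_i[c] * (xbar_i[c] - mu) ** 2 for c in data) - C * epv) / m   # ≈ 0.278
K = epv / vhm                                                              # ≈ 914 (up to rounding)
Z = {c: m_i[c] / (m_i[c] + K) for c in data}

# Complement of credibility goes to the a priori mean mu.
estimates = {c: Z[c] * xbar_i[c] + (1 - Z[c]) * mu for c in data}
print(estimates)   # roughly {'Poultry': 2.73, 'Dairy': 2.40}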

Comparison to the Previous Case with No A Priori Mean:
In a previous section, when we did not have an a priori mean, the formulas for estimating the VHM
were more complicated:
VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σi mi²/m).
The use of m - Σi mi²/m in the denominator rather than m, and the multiplication of the EPV
by C - 1 rather than C, were needed in order to make the estimator of the VHM unbiased.
They are analogous to the denominator of N - 1 in the sample variance. In order to get an unbiased
estimator of the variance, when we use X̄ rather than μ, we adjust for the number of degrees of
freedom, and use N - 1 rather than N in the denominator of the sample variance.
If we know the mean μ, then we can estimate the variance instead by: Σ (Xi - μ)² / N.
Similarly, here when we are given an a priori mean, in the formula to estimate the VHM there is no
need to adjust for an analog of the number of degrees of freedom.
Summary:
μ = a priori estimate of the mean.
The estimate of the EPV is the same as without an a priori mean:
EPV = Σi vi / C = Σi Σt mit (Xit - X̄i)² / {C (Y - 1)}.
The estimate of the VHM differs from that without an a priori mean:
VHM = {Σi mi (X̄i - μ)² - (C)(EPV)} / m; if the estimated VHM is negative, set Z = 0.
Estimate is: Z(observation) + (1 - Z) μ.

Working with Only One Class, Variation in Exposures:
In the case where exposures vary and you use an assumed a priori mean, one can apply this
technique to a single class. With C = 1, the estimators for the EPV and VHM simplify to:
EPV = Σt mt (Xt - X̄)² / (Y - 1).
VHM = (X̄ - μ)² - EPV/m.
Exercise: You have data from one class over three years. The class is currently charged a rate
based on a pure premium of 2.50. Use nonparametric Bayesian Estimation to estimate the future
pure premium for the class.

                          Exposures                  Losses
Class                  Yr 1   Yr 2   Yr 3        Yr 1   Yr 2   Yr 3
Poultry Farms           41     37     29          232     33    237

[Solution: The mean pure premium for Poultry is:
(232 + 33 + 237) / (41 + 37 + 29) = 502/107 = 4.69.
For Poultry the estimated process variance is:
{(41)(5.66 - 4.69)² + (37)(0.89 - 4.69)² + (29)(8.17 - 4.69)²} / (3 - 1) = 462.0.
Thus the estimated VHM = (4.69 - 2.50)² - 462/107 = 0.478.
K = 462 / 0.478 = 967.
For the Poultry class the three years of data have a total of 107 exposures.
Thus Z = 107 / (107 + 967) = 10.0%.
The estimated future pure premium for Poultry Farms is: (0.100)(4.69) + (0.900)(2.50) = 2.72.]

Problems:
Use the following information for the next four questions:
You have data for 3 classes over four years:
                    Exposures                             Losses
Class     2001   2002   2003   2004          2001     2002     2003     2004
A          11     14     15     18          $2442    $2926    $2970    $4230
B          20     21     19     20          $4020    $4725    $4028    $4120
C          15     14     12     12          $2880    $2632    $2472    $2328

Each class is currently charged a rate based on expected losses of $200 per exposure.
7.1 (4 points) Use nonparametric Bayesian Estimation to estimate the expected value of the
process variance of the pure premiums.
A. 1800
B. 2000
C. 2200
D. 2400
E. 2600
7.2 (2 points) Use nonparametric Bayesian Estimation to estimate the variance of the hypothetical
mean pure premiums.
A. Less than 110
B. At least 110 but less than 120
C. At least 120 but less than 130
D. At least 130 but less than 140
E. At least 140
7.3 (1 point) Use nonparametric Bayesian Estimation to estimate the amount of credibility to be
assigned to the data for Class A.
A. Less than 65%
B. At least 65% but less than 70%
C. At least 70% but less than 75%
D. At least 75% but less than 80%
E. At least 80%
7.4 (1 point) Use nonparametric Bayesian Estimation to estimate the expected losses in the year
2006 for Class B, assuming exposures of 21 for Class B in the year 2006.
A. Less than 4250
B. At least 4250 but less than 4300
C. At least 4300 but less than 4350
D. At least 4350 but less than 4400
E. At least 4400

7.5 (2 points) You are given the following data for a single group:
Year 1
Year 2
Aggregate Losses
110,000
160,000
Number of Members
400
500
Using an a priori mean, the variance of the hypothetical means has been estimated as 120.
Using the nonparametric empirical Bayes method, determine the credibility factor to be assigned to
the data for this group for purposes of predicting the pure premium of this group in Year 3.
(A) 10%
(B) 15%
(C) 20%
(D) 25%
(E) 30%
7.6 (3 points) Use the following data for 200 classes over 5 years:
Pure Premium for Class i in year t
Xit =
= relativity for class i in year t.
Pure premium in Year t for all classes
mit = exposures for class i in year t.
5

mi =

mit = total exposures over 5 years for class i.


t=1
5

mit Xit
_
Xi = t=1
= weighted average relativity for class i.
mi
200 5

mit (Xit - Xi

)2

200

= 8.350 x

1011.

i=1 t=1

i=1
200

200

mi (Xi
i=1

mi = 2.351 x 108.

1)2

= 2.072 x

1012.

mi2 = 9.423 x 1012.


i=1

You estimate the mean future pure premium for all classes to be $3.15.
Using nonparametric Bayesian Estimation, what is the estimated future pure premium for a class with
a weighted average relativity of 0.872 and a total of 183,105 exposures over five years?
Hint: The a priori mean relativity compared to average is 1.
A. Less than $2.95
B. At least $2.95 but less than $3.00
C. At least $3.00 but less than $3.05
D. At least $3.05 but less than $3.10
E. At least $3.10

7.7 (3 points) For a group policyholder, we have the following data available:
2001
2002
2003
2004
Losses
$38,000
$45,000
$42,000
Number in Group
100
110
90
80 (estimate)
If the manual rate per person is $370 per year, estimate the total credibility premium for
the year 2004.
A. 31,750
B. 32,000
C. 32,250
D. 32,500
E. 32,750
7.8 (3 points) Use the following information for two classes.
Number of Claims
Class
Year 1
Year 2
Year 3
Total
A
2
4
3
9
B
4
9
6
19
Number of Exposures
Class
Year 1
Year 2
Year 3
Total
A
100
150
200
450
B
200
200
200
600
Frequency
Class
Year 1
Year 2
Year 3
Total
A
0.02000
0.02667
0.01500
0.02000
B
0.02000
0.04500
0.03000
0.03167
A priori, one assumes each class has a mean frequency of 3%.
In year 4, there are 300 exposures for Class A and 200 exposures for Class B.
Using nonparametric empirical Bayes credibility, estimate the number of claims in year 4 for
Class A.
A. 6.5
B. 7.0
C. 7.5
D. 8.0
E. 8.5

7.9 (3 points) You are a pricing actuary in the group insurance line at Big Ticket Insurance.
The Chief Actuary requests that you perform a credibility analysis and determine a claims rate for
Product XYZ.
You are provided the following information on Product XYZ for the previous calendar year:
There was data on 110 different Members.
Average Number of Members per month: 100
Claims Cost Per Member Per Month (PMPM): $300
Manual Rate Cost PMPM: $550
mi = the number of months of data for member i
Xij is the claims cost for member i in month j.
X i is the average claim cost for member i.
110 mi

(Xit - X i )2

= 8000 million.

i=1 t=1
110

mi (X i

- 550)2 = 820 million.

i=1

Using nonparametric Empirical Bayes credibility, estimate the claims cost (PMPM) associated with
Product XYZ.
(A) Less than 400
(B) At least 400, but less than 420
(C) At least 420, but less than 440
(D) At least 440, but less than 460
(E) At least 460
7.10 (4, 11/05, Q.11 & 2009 Sample Q.223) (2.9 points) You are given the following data:
Year 1
Year 2
Total Losses
12,000
14,000
Number of Policyholders
25
30
The estimate of the variance of the hypothetical means is 254.
Determine the credibility factor for Year 3 using the nonparametric empirical Bayes method.
(A) Less than 0.73
(B) At least 0.73, but less than 0.78
(C) At least 0.78, but less than 0.83
(D) At least 0.83, but less than 0.88
(E) At least 0.88

Solutions to Problems:
7.1. D. The EPV is estimated as 2369 in the usual manner making no use of the a priori estimate of
1
the mean: vi =
Y - 1

mit (Xit - Xi )2 .
t=1

Estimated EPV = (1/C) vi = (1/3)(4137 + 2211 + 758) = 2369.


Class
C
B
A

2001
15
20
11

A
B
C

2001
$222.00
$201.00
$192.00

A
B
C

Exposures
2002
14
21
Pure Premium
2002
$209.00
$225.00
$188.00

2003
12
19
15

2004
12
20
18

$2880
$4020
$2442
2001

2003
$198.00
$212.00
$206.00

2004
$235.00
$206.00
$194.00

Overall
$216.690
$211.162
$194.566

(exposures)(p.p. - overall p.p.)^2


310.2
827.8
5239.5
2065.5
4021.0
13.3
98.8
603.6
1568.8

6034.8
533.0
3.8

(sigma i)^2
4137
2211
758

Losses

2369

7.2. A. Using the EPV, the pure premiums by class, and exposures by class, all calculated in the
previous solution:
C

mi (X i - )2 - (C)(EPV)

VHM = i=1

{{(58)(216.69 - 200)2 + (80)(211.16 - 200)2 + (53)(194.57 - 200)2 } - (3)(2369)} / 191 = 107.7.


7.3. C. K = 2369 / 107.7 = 22.0. Class A had 58 exposures. Z = 58/(58 + 22) = 72.5%.
7.4. D. K = 2369 / 107.7 = 22.0. Class B had 80 exposures. Z = 80/(80 + 22.0) = 78.4%.
Estimated P.P. = (0.784)(211.16) + (0.216)(200) = 208.75.
Thus the estimated losses for 21 exposures are: (21)(208.75) = $4384.

7.5. C. The pure premiums for the two years are: 110000/400 = 275 and
160000/500 = 320. The overall pure premium for two years is: 270000/900 = 300.
1
vi =
Y - 1

mit (Xit - Xi )2 = (1/(2 - 1)) {(400)(275 - 300)2 + (500)(320 - 300)2} = 450,000.


t=1

Since there is only one group this 450,000 is the estimate of the EPV.
K = EPV/ VHM = 450,000/120 = 3750. There are 900 exposures in the two years of data.
Z = 900/(900 + 3750) = 19.4%.
Comment: Similar to 4, 11/05, Q.11.
1
1
7.6. A. EPV =
C Y - 1

mij (Xij - Xi)2 = 8.350 x 1011/ {(200)(5-1)} =


i=1 j=1

1.044 x 109 . A priori we assume a relativity of 1 for a class chosen at random.


C

mi (X i -

VHM = i=1

)2
m

- (C)(EPV)

mi (X i - 1)2 - (C)(EPV)

= i=1

{2.072 x 1012 - (200)(1.044 x 109 )} / (2.351 x 108 ) = 7925.


Estimated K = 1.044 x 109 / 7925 = 131,735.
For given class, Z = 183105/(183105 + 131735) = 58.2%.
Estimated future relativity for given class = (0.582)(0.872) + (0.418)(1) = 0.926.
Estimated future pure premium for given class = (0.926)($3.15) = $2.92.
7.7. C. The mean dollars per person are: ($38000 + $45000 + $42000)/(100 + 110 + 90) =
$416.67. The observed pure premiums by year are: 38000/100 = 380, 45000/110 = 409.09,
and 42000/90 = 466.67. Therefore, the estimated EPV =
{(100)(380 - 416.67)2 + (110)(409.09 - 416.67)2 + (90)(466.67 - 416.67)2 }/(3 - 1) = 182,879.
The estimated VHM = (416.67 - 370)2 - 182,879 / (100 + 110 + 90) = 1568.
K = 182,879 / 1568 = 116.6. A total of 300 exposures, Z = 300/(300 + 116.6) = 72.0%.
The estimated pure premium for the next year is: (0.720)(416.67) + (0.280)(370) = 403.60.
With 80 persons expected in 2004, the estimated premium is: (80)(403.60) = 32,288.

1
7.8. E. v i =
Y - 1

mit (Xit - Xi )2 = estimated process variance for class i.


t=1

Estimated EPV = (1/C) vi = (.00583 + .03167)/2 = 0.01875.


Class
A
B

(# of Exposures)(Frequency - 3 year average)^2


1
2
3

Process
Variance

0.00000
0.02722

0.00583
0.03167

0.00667
0.03556

0.00500
0.00056

Average

0.01875
0.02000
0.02000

Frequency
0.02667
0.04500

A
B
A
B

A
B

0.01500
0.03000

Weighted Avg.
0.02000
0.03167

2
4

Number of Claims
4
9

3
6

Sum
9
19

100
200

Number of Exposures
150
200

200
200

Sum
450
600

X1 = 9/450 = 0.02000. X 2 = 19/600 = 0.03167. = 0.03.


C

mi (X i - )2 - (C)(EPV)

VHM = i=1

{(450(0.02000 - 0.03)2 + 600(0.03167 - 0.03)2 ) - (2)(0.01875))} / 1050 = 0.00000874.


K = EPV/VHM = 0.01875/0.00000874 = 2145. ZA = 450/(450 + 2145) = 0.173.
Estimated frequency for Class A is: (0.173)(0.0200) + (1 - 0.173)(0.0300) = 0.0283.
(300 exposures)(0.0283) = 8.49 claims.
Comment: Note that since we are given an a priori mean, the formula for the estimated VHM is
different than when we did not have an a priori mean.

7.9. A.

110

i=1

i=1

(mi - 1) = (mi - 1) = m - 110 = (12)(100) - 110 = 1090.


C mi

(Xij - Xi )2
EPV =

i=1 j=1
C

(mi -

8000 million
= 7.34 million.
1090

1)

i=1

mi (X i - )2 - (C)(EPV)

VHM = i=1

820 million - (110)(7.34 million)


= 0.0105 million.
(12)(100)

K = EPV/VHM = 7.34 / 0.0105 = 699.


We have: (12)(100) = 1200 member-months of data.
Z = 1200/(1200 + 699) = 63.2%.
Estimated claims cost PMPM is: (300)(63.2%) + (550)(1 - 63.2%) = $392.
Comment: Loosely based on Q. 18d of the Fall 2008 Group and Health - Design and Pricing Exam
of the SOA.
7.10. D. The pure premiums for the two years are: 12000/25 = 480 and 14000/30 = 466.67.
The overall pure premium for two years is: 26000/55= 472.73.
1
vi =
Y - 1

mit (Xit - Xi )2 = (1/(2 - 1)) {(25)(480 - 472.73)2 + (30)(466.67 - 472.73)2}


t=1

= 2423. Since there is only one group this 2423 is the estimate of the EPV.
K = EPV/ VHM = 2423/254 = 9.5.
There are 55 exposures in the two years of data.
Z = 55/(55 + 9.5) = 85.3%.
Comment: I assumed that by the credibility factor for Year 3 they meant the credibility assigned to
the two years of data, in order to estimate year 3. It is unclear what other estimate of the pure
premium for year 3 would be getting weight 1 - Z; perhaps it would be given to an a priori mean not
mentioned in the question. One can not use nonparametric empirical Bayes methods to estimate the
VHM from the data for one group, unless one is given an a priori mean; therefore I have included it in
this section of the study guide.
In my opinion, this is a poorly written exam question.


Section 8, Assuming a Poisson Frequency55

As in Semiparametric Estimation, one can assume a Poisson frequency, in which case the
estimated EPV is equal to the observed overall mean.56
No Exposures, Same Number of Years of Data:
As previously, assume there are 3 drivers in a particular rating class.
For each of 5 years, we have the number of claims for each of these 3 drivers:

          Year 1   Year 2   Year 3   Year 4   Year 5
Hugh        0        0        0        0        0
Dewey       0        1        0        0        0
Louis       0        0        2        1        0

However, now assume each driver has a Poisson frequency.
Then, the estimated EPV = X̄ = 4/15 = 0.2667.
As previously, the estimated VHM = (the sample variance of the class means) - EPV/Y
= Σi (X̄i - X̄)² / (C - 1) - EPV/Y = 0.0933 - 0.2667/5 = 0.0400.57
K = EPV/VHM = 0.2667/0.0400 = 6.667.58
Z = 5/(5 + K) = 0.429.
The estimated future claim frequency of each driver:
Hugh: (0.429)(0) + (1 - 0.429)(0.2667) = 0.152.
Dewey: (0.429)(0.2) + (1 - 0.429)(0.2667) = 0.238.
Louis: (0.429)(0.6) + (1 - 0.429)(0.2667) = 0.410.
We note that the resulting estimates are in balance:
(0.152 + 0.238 + 0.410) / 3 = 0.2667 = the observed mean claims frequency.

55 See page 29 and Exercise 20 in "Topics in Credibility" by Dean.
56 See "Mahler's Guide to Semiparametric Estimation" and "Mahler's Guide to Conjugate Priors."
57 The means are: 0, 0.2, and 0.6; their sample variance is:
{(0 - 0.2667)² + (0.2 - 0.2667)² + (0.6 - 0.2667)²}/(3 - 1) = 0.0933.
58 In the absence of the Poisson assumption, the estimated K was 12.5.
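Here is a minimal Python sketch (my own illustration; variable names are mine) of the Poisson
shortcut above, in which the EPV is estimated by the overall mean claim frequency, reproducing
the three-driver example.

# Empirical Bayes with a Poisson frequency assumption: EPV = overall mean.
from statistics import mean, variance

claims = {"Hugh": [0, 0, 0, 0, 0], "Dewey": [0, 1, 0, 0, 0], "Louis": [0, 0, 2, 1, 0]}
Y = 5

xbar_i = {d: mean(c) for d, c in claims.items()}
epv = mean(x for yrs in claims.values() for x in yrs)      # 4/15 = 0.2667
vhm = variance(xbar_i.values()) - epv / Y                  # 0.0933 - 0.0533 = 0.0400
K = epv / vhm                                              # 6.667
Z = Y / (Y + K)                                            # 0.429

overall = epv                                              # complement goes to the overall mean
estimates = {d: Z * xbar_i[d] + (1 - Z) * overall for d in claims}
print(estimates)   # roughly {'Hugh': 0.152, 'Dewey': 0.238, 'Louis': 0.410}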

Connection to Semiparametric Estimation:59
In this situation with no exposures and without differing numbers of years of data, one could get the
same result via Semiparametric Estimation.
Treat five years of data from one driver as one draw from the risk process.
Over 5 years we have one driver with 0 claims, one driver with 1 claim and one driver with 3 claims.
Estimated EPV = X̄ = (0 + 1 + 3)/3 = 4/3.
The sample variance is: {(0 - 4/3)² + (1 - 4/3)² + (3 - 4/3)²} / (3 - 1) = 7/3.
Estimated VHM = 7/3 - 4/3 = 1.
K = EPV/VHM = (4/3)/1 = 4/3, where one draw from the risk process is five years of data.
Z = 1/(1 + K) = 3/7 = 0.429, matching the previous result.
If one had one year of data from each insured, with one exposure from each insured, then using
Empirical Bayes Estimation:
VHM = Σi (Xi - X̄)²/(C - 1) - EPV/Y = Σi (Xi - X̄)²/(C - 1) - EPV = sample variance - EPV.
This matches what is done in Semiparametric Estimation.

59 See "Mahler's Guide to Semiparametric Estimation."
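As a quick check of the semiparametric shortcut just described, here are a few lines of Python
(my own illustration) treating each driver's five years as a single draw from the risk process.

from statistics import mean, variance

totals = [0, 1, 3]                 # five-year claim counts for the three drivers
epv = mean(totals)                 # 4/3
vhm = variance(totals) - epv       # 7/3 - 4/3 = 1
Z = 1 / (1 + epv / vhm)            # 3/7 ≈ 0.429, matching the year-by-year result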


Differing Number of Years of Data:
As in a previous section, assume there are 3 drivers with differing number of years of data.

Hugh
Dewey
Louis

1
0

0
0
2

0
0
1

0
0
0

Assume each driver has a Poisson frequency.


Then, the estimated EPV = X = 4/12 = 1/3.
The VHM is estimated as when done previously with differing number of years of data:
3

Yi i=1

59

3
Yi2
i= 1
3
Yi
i= 1

= (3 + 5 + 4) - (32 + 52 + 42 ) / (3 + 5 + 4) = 12 - 50/12 = 7.833.

See Mahlers Guide to Semiparametric Estimation.

2013-4-12 Empirical Bayesian Cred. 8 Assuming Poisson Freq., HCM 10/22/12, Page 127
The means for the three individual drivers are: 0, 0.20, and 0.75.
C

mi (X i

- X)2 - EPV (C -1)

VHM = i=1

({(3)(0 - 1/3)2 + (5)(0.20 - 1/3)2 + (4)(0.75 - 1/3)2 } - (1/3)(2)) / 7.833 = 0.05745


K = EPV/VHM = (1/3)/0.05745 = 5.80.60
Exercise: Estimate the future claim frequency for each driver.
[Solution: The credibilities are: 3/8.8 = 34.1%, 5/10.8 = 46.3%, and 4/9.8 = 40.8%.
The estimates of future claim frequency are:
Hugh: (34.1%)(0) + (1 - 34.1%)(1/3) = 0.220.
Dewey: (46.3%)(0.2) + (1 - 46.3%)(1/3) = 0.272.
Louis: (40.8%)(0.75) + (1 - 40.8%)(1/3) = 0.503.]
Exposures:
The situation with exposures can be handled in a manner parallel to that with differing numbers of
years of data. Assume we have three policies with differing numbers of vehicles over four years:

Policy                 Year 1   Year 2   Year 3   Year 4   Total
A        Claims           1        0        0        0        1
         Vehicles         2        2        3        3       10
B        Claims           0        0        2        1        3
         Vehicles         1        1        2        2        6
C        Claims           0        1        0        0        1
         Vehicles         1        1        1        1        4

Assume that for each policy, each vehicle has a Poisson frequency.61
Then, the estimated EPV = X̄ = 5/20 = 0.25.
When there are exposures, the VHM is estimated using the method from a previous section:
m - Σi mi²/m = 20 - (10² + 6² + 4²)/20 = 12.4.
The means for the three policies are: 0.10, 0.50, and 0.25.
VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σi mi²/m)
= {(10)(0.1 - 0.25)² + (6)(0.5 - 0.25)² + (4)(0.25 - 0.25)² - (0.25)(2)} / 12.4
= (0.6 - 0.5)/12.4 = 0.008065.
K = EPV/VHM = 0.25 / 0.008065 = 31.0.
Exercise: Estimate the future claim frequency per vehicle for each policy.
[Solution: The credibilities are: 10/41 = 24.4%, 6/37 = 16.2%, and 4/35 = 11.4%.
The estimates of future claim frequency are:
A: (24.4%)(0.1) + (1 - 24.4%)(0.25) = 0.213.
B: (16.2%)(0.5) + (1 - 16.2%)(0.25) = 0.291.
C: (11.4%)(0.25) + (1 - 11.4%)(0.25) = 0.250.]
As discussed in a previous section, one could use the method that preserves total claims, and give
the complement of credibility to the credibility weighted mean frequency:
{(24.4%)(0.1) + (16.2%)(0.5) + (11.4%)(0.25)} / {24.4% + 16.2% + 11.4%} = 0.2575.
In this case, the estimates of future claim frequency are very similar to before:
A: (24.4%)(0.1) + (1 - 24.4%)(0.2575) = 0.219.
B: (16.2%)(0.5) + (1 - 16.2%)(0.2575) = 0.297.
C: (11.4%)(0.25) + (1 - 11.4%)(0.2575) = 0.257.

61 The vehicles on a single policy are all assumed to have the same Poisson Distribution.
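Here is a minimal Python sketch (my own illustration; the policy labels and variable names are
mine) of the Poisson-frequency case with exposures, reproducing the three-policy example above,
including the method that preserves total claims.

# Poisson assumption with exposures: EPV = overall mean claim frequency.
policies = {"A": (1, 10), "B": (3, 6), "C": (1, 4)}   # policy -> (claims, vehicles)

m = sum(v for _, v in policies.values())               # 20
epv = sum(c for c, _ in policies.values()) / m         # overall mean = X-bar = 0.25
xbar_i = {p: c / v for p, (c, v) in policies.items()}

C = len(policies)
denom = m - sum(v * v for _, v in policies.values()) / m                    # 12.4
vhm = (sum(v * (xbar_i[p] - epv) ** 2 for p, (_, v) in policies.items())
       - epv * (C - 1)) / denom                                             # 0.008065
K = epv / vhm                                                               # ≈ 31

Z = {p: v / (v + K) for p, (_, v) in policies.items()}
# Method that preserves total claims: complement goes to the credibility-weighted mean.
cw_mean = sum(Z[p] * xbar_i[p] for p in policies) / sum(Z.values())         # ≈ 0.2575
estimates = {p: Z[p] * xbar_i[p] + (1 - Z[p]) * cw_mean for p in policies}
print(estimates)   # roughly {'A': 0.219, 'B': 0.297, 'C': 0.257}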

Problems:
Use the following information for the next three questions:
A portfolio of 8 policyholders had the following number of claims by year:
Policyholder

2001

Year
2002

1
2
3
4
5
6
7
8

1
0
1
1
0
0
0
0

1
0
0
0
2
0
0
0

0
0
3
0
0
0
1
0

2
0
4
1
2
0
1
0

Total

10

2003

Total

8.1 (2 points) Assume that 75% of the policyholders in the portfolio have a Poisson frequency with
annual mean of 0.3, while the remaining 25% have a Poisson frequency with annual mean of 0.5.
Using Buhlmann Credibility, what is the estimated future annual claim frequency for policyholder 3?
A. 0.4
B. 0.5
C. 0.6
D. 0.7
E. 0.8
8.2 (3 points) Assume each insureds frequency is Poisson distributed.
The mean of each insureds Poisson Distribution does not change over time.
What is the estimated future annual claim frequency for policyholder 3?
Treat the 3 years of data from an individual policyholder as a single observation for purposes of
estimating the Buhlmann Credibility Parameter.
A. 0.4
B. 0.5
C. 0.6
D. 0.7
E. 0.8
8.3 (4 points) Using nonparametric Bayesian Estimation, what is the estimated future annual claim
frequency for policyholder 3?
A. 0.4
B. 0.5
C. 0.6
D. 0.7
E. 0.8
8.4 (2 points) For each of 7 years, we have the number of claims for each of 2 insureds:
A
B

Total

0
1

0
0

1
0

0
0

0
2

0
0

0
1

1
4

The number of claims for each insured each year has a Poisson distribution.
Estimate the number of claims for insured B over the next 7 years, using Nonparametric Empirical
Bayes estimation.
A. 2.6
B. 2.8
C. 3.0
D. 3.2
E. 3.4

8.5 (3 points) Two insurance policies produced the following claims during a 4 year period:
Year
Insured
2000 2001 2002 2003
A
Number of Claims 2
0
1
0
Insured Vehicles
2
2
2
1
B
Number of Claims 0
1
0
--Insured Vehicles
3
3
3
--Assume that the number of claims for each vehicle each year has a Poisson distribution and
that each vehicle on a policy has the same expected claim frequency.
Estimate the expected annual number of claims per vehicle for Insured A.
Use the method that preserves total claims.
A. 0.28
B. 0.30
C. 0.32
D. 0.34
E. 0.36
8.6 (2 points) You are given:
(i) A region is comprised of four territories. Claims experience is as follows:
Territory
Number of Exposures
Number of Claims
A
100
3
B
200
5
C
400
4
D
300
3
(ii) The number of claims for each exposure each year has a Poisson distribution.
(iii) Each exposure in a territory has the same expected claim frequency.
Determine the empirical Bayes estimate of the credibility to be assigned to the data from Territory
D, for purposes of estimating the claim frequency for that territory next year.
(A) 30%
(B) 35%
(C) 40%
(D) 45%
(E) 50%

8.7 (3 points) You are given:
(i) Over a four-year period, the following claim experience was observed for three insureds:
Year
Insured
1
2
3
4
A
Number of Claims
0
0
0
0
B
Number of Claims
0
0
0
1
C
Number of Claims
0
2
1
0
(ii) The number of claims for each insured each year follows a Poisson distribution.
Determine the semiparametric empirical Bayes estimate of the claim frequency for Insured C
in Year 5.
(A) Less than 0.40
(B) At least 0.40, but less than 0.45
(C) At least 0.45, but less than 0.50
(D) At least 0.50, but less than 0.55
(E) At least 0.55
8.8 (3 points) Use the following information for two classes and three years.
Number of Claims
Class
Year 1
Year 2
Year 3
Total
A
2
4
3
9
B
4
9
6
19
Number of Exposures
Class
Year 1
Year 2
Year 3
Total
A
100
150
200
450
B
200
200
200
600
Assume that the number of claims for each exposure each year has a Poisson distribution and that
each exposure in a class has the same expected claim frequency.
Using nonparametric Empirical Bayes credibility, estimate the future frequency for Class A,
using the method that preserves total claims.
A. 2.2%
B. 2.3%
C. 2.4%
D. 2.5%
E. 2.6

8.9 (4 points) PrizeCo sells hole-in-one insurance.
(The sponsor of a golf tournament will pay a cash prize to any amateur golfer who makes a
hole-in-one during the tournament. PrizeCo will reimburse the sponsor for any prize payments.)
For six golf courses at which PrizeCo has provided such insurance for many events over several
years:
Golf Course
Number of Amateur Golfers
Numbers of Holes-in-One
Amelia Island
8000
6
Clovernook
10,000
9
Donegal Highlands
20,000
8
Foxford Hills
12,000
6
Garden City
5000
1
Victoria
3000
3
Two hundred amateurs will be in an insured tournament at Garden City.
The prize for a hole-in-one is $100,000.
Determine the empirical Bayes estimate of the amount PrizeCo expects to pay.
A. $9000
B. $9500
C. $10,000
D. $10,500
E. $11,000
8.10 (4, 11/05, Q.22 & 2009 Sample Q.233) (2.9 points) You are given:
(i) A region is comprised of three territories. Claims experience for Year 1 is as follows:
Territory
Number of Insureds
Number of Claims
A
10
4
B
20
5
C
30
3
(ii) The number of claims for each insured each year has a Poisson distribution.
(iii) Each insured in a territory has the same expected claim frequency.
(iv) The number of insureds is constant over time for each territory.
Determine the Bhlmann-Straub empirical Bayes estimate of the credibility factor Z for
Territory A.
(A) Less than 0.4
(B) At least 0.4, but less than 0.5
(C) At least 0.5, but less than 0.6
(D) At least 0.6, but less than 0.7
(E) At least 0.7

8.11 (4, 11/06, Q.13 & 2009 Sample Q.257) (2.9 points) You are given:
(i) Over a three-year period, the following claim experience was observed for two
insureds who own delivery vans:
Year
Insured
1
2
3
A
Number of Vehicles
2
2
1
Number of Claims
1
1
0
B
Number of Vehicles
N/A 3
2
Number of Claims
N/A 2
3
(ii) The number of claims for each insured each year follows a Poisson distribution.
Determine the semiparametric empirical Bayes estimate of the claim frequency per vehicle
for Insured A in Year 4.
(A) Less than 0.55
(B) At least 0.55, but less than 0.60
(C) At least 0.60, but less than 0.65
(D) At least 0.65, but less than 0.70
(E) At least 0.70

Solutions to Problems:
8.1. A. The Expected Value of the Process Variance = 0.35.
Type of
Risk
A
B
Average

A Priori
Probability
0.75
0.25

Process
Variance
0.3
0.5

Mean
0.3
0.5

Square of
Mean
0.09
0.25

0.350

0.350

0.130

The Variance of the Hypothetical Means = 0.130 - 0.352 = 0.0075.


Thus the Bhlmann credibility parameter, K = EPV / VHM = 0.35 / 0.0075 = 46.7.
Therefore, for 3 years of data Z = 3 / (3 + 46.7) = 6.0%.
The estimated future annual frequency for policyholder 3 is:
(6.0%)(4/3) + (1 - 6.0%)(0.350) = 0.409.
8.2. D. Define the unit of data as one insured over three years.
3 Year Claim Count :
0
1
2
3
4
5 or more
Number of Insureds:
3
2
2
0
1
0
For the observed data the mean frequency (per 3 years) is:
{(3)(0) + (2)(1) + (2)(2) + (1)(4)} / 8 = 1.25.
The second moment is: {(3)(02 ) + (2)(12 ) + (2)(22 ) + (1)(42 )} / 8 = 3.25.
Thus the sample variance = (8/7)(3.25 - 1.252 ) = 1.929.
EPV = mean = 1.25. VHM = Total Variance - EPV = 1.929 - 1.25 = 0.679.
K = EPV/VHM = 1.25/0.679 = 1.84.
Three years of data has been defined as N = 1, therefore, Z = 1 / (1 + 1.84) = 35.2%.
The observed 3 year frequency for policyholder 3 is 4 and the overall mean is 1.25.
The estimated future 3 year frequency for policyholder 3 is:
(35.2%)(4) + (1 - 35.2%)(1.25) = 2.22.
The estimated future annual frequency is: 2.22/3 = 0.739.
Alternately, EPV = mean annual frequency = 10/24 = 5/12.
The individual mean annual frequencies are: 2/3, 0, 4/3, 1/3, 2/3, 0, 1/3, 0.
The sample variance of the class means is:
{3(0 - 5/12)2 + 2(1/3 - 5/12)2 + 2(2/3 - 5/12)2 + (4/3 - 5/12)2 }/(8 - 1) = 3/14.
VHM = (the sample variance of the class means) - EPV / Y = 3/14 - (5/12)/3 = 19/252.
K = EPV/VHM = (5/12) / (19/252) = 105/19.
For 3 years of data, Z = 3/(3 + K) = 57/162 = 35.2%. Proceed as before.
Comment: Over three years, each insured's frequency is Poisson.

2013-4-12 Empirical Bayesian Cred. 8 Assuming Poisson Freq., HCM 10/22/12, Page 135
8.3. B. EPV = average of the sample variances = 0.5833.
VHM = sample variance of the means - EPV / (# years ) = 0.2143 - 0.5833/3 = 0.0199.
                                          Policyholder
                     1        2        3        4        5        6        7        8
Year 2001            1        0        1        1        0        0        0        0
Year 2002            1        0        0        0        2        0        0        0
Year 2003            0        0        3        0        0        0        1        0
Mean              0.6667   0.0000   1.3333   0.3333   0.6667   0.0000   0.3333   0.0000
Sample Variance   0.3333   0.0000   2.3333   0.3333   1.3333   0.0000   0.3333   0.0000

Mean of the policyholder means = 0.4167.  Sample variance of those means = 0.2143.
Average sample variance = 0.5833.
K = EPV/VHM = 0.5833/0.0199 = 29.3. For three years of data, Z = 3/(3 + 29.3) = 9.3%.
The observed annual frequency for policyholder 3 is 4/3 and the overall mean is 0.4167.
Estimated future annual freq. for policyholder 3 is: (9.3%)(4/3) + (1 - 9.3%)(0.4167) = 0.502.
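As an illustration (not part of the original solution), here is a minimal Python sketch of the 8.3 calculation, using the data in the table above and the standard library's statistics module:

# A quick numerical check of 8.3 (a sketch; not part of the original solution).
from statistics import mean, variance

data = [[1, 1, 0], [0, 0, 0], [1, 0, 3], [1, 0, 0],
        [0, 2, 0], [0, 0, 0], [0, 0, 1], [0, 0, 0]]   # claims by policyholder and year
Y = 3                                                 # years of data per policyholder
class_means = [mean(row) for row in data]
EPV = mean(variance(row) for row in data)             # average sample variance = 0.5833
VHM = variance(class_means) - EPV / Y                 # 0.2143 - 0.1944 = 0.0198
K = EPV / VHM                                         # about 29.4 (29.3 with the rounded VHM)
Z = Y / (Y + K)                                       # about 9.3%
estimate = Z * class_means[2] + (1 - Z) * mean(class_means)
print(round(estimate, 3))                             # about 0.502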
8.4. D. EPV = overall mean = 5/14.
VHM = {(1/7 - 5/14)2 + (4/7 - 5/14)2 }/(2 - 1) - EPV/(# years of data)
= 0.09184 - (5/14)/7 = 0.0408.
K = (5/14)/0.0408 = 8.75. Z = 7/(7 + 8.75) = 0.444
Estimated frequency for insured B: (0.444)(4/7) + (1 - 0.444)(5/14) = 0.452. (7)(0.452) = 3.16.
8.5. C. Estimated EPV = overall mean = 4/16 = 1/4.
m - Σmi²/m = 16 - (7² + 9²)/16 = 7.875.  X̄1 = 3/7.  X̄2 = 1/9.
VHM = {Σ mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σmi²/m)
    = {7(3/7 - 1/4)² + 9(1/9 - 1/4)² - (2 - 1)(1/4)} / 7.875 = 0.01864.
K = EPV/VHM = .25/0.01864 = 13.4. ZA = 7/(7 + 13.4) = 0.343. ZB = 9/(9 + 13.4) = 0.402.
Credibility weighted mean = {(0.343)(3/7) + (0.402)(1/9)} / (0.343 + 0.402) = 0.257.
Estimated frequency for Insured A is: (0.343)(3/7) + (1 - 0.343)(0.257) = 0.316.
Comment: Similar to Exercise 20 in Topics in Credibility by Dean.
The fact that the insureds have different numbers of years of data, does not complicate the
calculation. If we had not assumed a Poisson frequency, then the computation of the EPV would
have had to take into account the differing number of years of data.

2013-4-12 Empirical Bayesian Cred. 8 Assuming Poisson Freq., HCM 10/22/12, Page 136
8.6. A. Since we have assumed a Poisson frequency, the estimated EPV = X̄ = 15/1000 = 0.015.
m - Σmi²/m = 1000 - (100² + 200² + 400² + 300²)/1000 = 700.
VHM = {Σ mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σmi²/m)
    = {(100)(0.03 - 0.015)² + (200)(0.025 - 0.015)² + (300)(0.01 - 0.015)² + (400)(0.01 - 0.015)²
       - (4 - 1)(0.015)} / 700 = 0.00002143.
K = EPV/VHM = 0.015/0.00002143 = 700.
Territory D has 300 exposures, and therefore Z = 300/(300 + 700) = 30%.
Comment: Similar to 4, 11/05, Q.22.
8.7. D. Since the frequency is assumed to be Poisson,
estimated EPV = mean = (0 + 1 + 3)/(4 + 4 + 4) = 4/12 = 1/3.
X1 = 0. X 2 = 0.25. X 3 = 0.75.
Estimated VHM = {(0 - 1/3)2 + (0.25 - 1/3)2 + (0.75 - 1/3)2 }/(3 - 1) - EPV/4
= .1458 - 1/12 = 0.0625.
K = (1/3)/0.0625 = 5.33. Z1 = 4/(4 + 5.33) = 42.9%.
Estimated frequency for Insured C is: (42.9%)(0.75) + (1 - 42.9%)(1/3) = 0.512.
8.8. D. Estimated EPV = overall mean = (9 + 19)/(450 + 600) = 28/1050 = 0.02667.
m - Σmi²/m = 1050 - (450² + 600²)/1050 = 514.3.
X̄1 = 9/450 = 0.02000.  X̄2 = 19/600 = 0.03167.
VHM = {Σ mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σmi²/m)
    = {450(0.02000 - 0.02667)² + 600(0.03167 - 0.02667)² - (2 - 1)(0.02667)} / 514.3 = 0.00001624.
K = EPV/VHM = 0.02667/0.00001624 = 1642.
Z A = 450/(450 + 1642) = 0.215. ZB = 600/(600 + 1642) = 0.268.
Credibility weighted mean = {(0.215)(0.0200) + (0.268)(0.03167)} / (0.215 + 0.268) = 0.0265.
Estimated frequency for Class A is: (0.215)(0.0200) + (1 - 0.215)(0.0265) = 0.0251.

2013-4-12 Empirical Bayesian Cred. 8 Assuming Poisson Freq., HCM 10/22/12, Page 137
8.9. E. One can either assume that the number of holes-in-one is Poisson or Bernoulli; since the
mean frequency per golfer is so small, there is no practical difference.
For convenience, take the unit of exposure as 1000 golfers.
Then, the overall observed mean frequency is: 33/58 = 56.9% = EPV.
The six observed frequencies are: 75%, 90%, 40%, 50%, 20%, and 100%.
m - Σmi²/m = 58 - (8² + 10² + 20² + 12² + 5² + 3²)/58 = 45.207.
Σ mi (X̄i - X̄)² = (8)(75% - 56.9%)² + (10)(90% - 56.9%)² + (20)(40% - 56.9%)² +
   (12)(50% - 56.9%)² + (5)(20% - 56.9%)² + (3)(100% - 56.9%)² = 3.224.
VHM = {Σ mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σmi²/m) = {3.224 - (0.569)(6 - 1)}/45.207 = 0.00838.
K = EPV/VHM = 0.569/0.00838 = 67.9.
The data for Garden City had 5000 golfers, or 5 exposure units. Z = 5/(5 + 67.9) = 6.9%.
Estimated future frequency for Garden City is: (6.9%)(20%) + (1 - 6.9%)(56.9%) = 54.4%.
The given tournament has 200 golfers, or 0.2 exposures.
The expected number of holes-in-one is: (0.2)(54.4%) = 0.109.
Expected payment is: ($100,000)(0.109) = $10,900.
8.10. A. Since we have assumed a Poisson frequency, the estimated EPV = X̄ = 12/60 = 1/5.
m - Σmi²/m = 60 - (10² + 20² + 30²)/60 = 36.67.
VHM = {Σ mi (X̄i - X̄)² - EPV (C - 1)} / (m - Σmi²/m)
    = {(10)(0.4 - 0.2)² + (20)(0.25 - 0.2)² + (30)(0.1 - 0.2)² - (0.2)(3 - 1)} / 36.67 = 0.00954.
K = EPV/VHM = 0.20/.00954 = 21.0.
Territory A has 10 exposures, and therefore Z = 10/(10 + 21) = 32.3%.
Comment: The wording of this past exam question should have been better.
One has to use a combination of semiparametric and empirical Bayes estimation.
If there are no exposures and no differing number of years of data, then what I have done is similar
to semiparametric estimation.
However, in this question we have differing numbers of years of data and we have exposures.
For your exam, I would be on the lookout for exposures.

2013-4-12 Empirical Bayesian Cred. 8 Assuming Poisson Freq., HCM 10/22/12, Page 138
8.11. C. Since the frequency is assumed to be Poisson,
estimated EPV = mean = (1 + 1 + 0 + 2 + 3)/(2 + 2 + 1 + 3 + 2) = 7/10 = 0.7.
X̄1 = (1 + 1 + 0)/(2 + 2 + 1) = 0.4.  X̄2 = (2 + 3)/(3 + 2) = 1.
Estimated VHM = {5(0.4 - 0.7)² + 5(1 - 0.7)² - (2 - 1)(0.7)} / {10 - (5² + 5²)/10} = 0.2/5 = 0.04.
K = 0.70/.04 = 17.5. Z1 = 5/(5 + 17.5) = 0.222.
Estimated frequency for A is: (0.222)(0.4) + (1 - 0.222)(0.7) = 0.633.
Comment: Notice the Poisson assumption in this past exam question. Thus one uses a combination
of semiparametric and empirical Bayes estimation.

2013-4-12 Empirical Bayesian Cred. 9 Important Ideas, HCM 10/22/12, Page 139

Section 9, Important Formulas and Ideas

No Variation in Exposures (Section 2)

si² = sample variance for the data from a single class i = Σt (Xit - X̄i)² / (Y - 1).

EPV = average of the si² = Σi si² / C = Σi Σt (Xit - X̄i)² / {C (Y - 1)}.

VHM = (the sample variance of the class means) - EPV / (# years of data)
    = Σi (X̄i - X̄)² / (C - 1) - EPV / Y.

If estimated VHM is negative, set Z = 0.

2013-4-12 Empirical Bayesian Cred. 9 Important Ideas, HCM 10/22/12, Page 140

No Variation in Exposures, Variation in Years (Section 3)

Yi = the number of years of data for class i.

si² = Σt (Xit - X̄i)² / (Yi - 1) = the usual sample variance for the data from a single class i.

EPV = weighted average of these sample variances
    = Σi (Yi - 1) si² / Σi (Yi - 1) = Σi Σt (Xit - X̄i)² / Σi (Yi - 1).

VHM = {Σi Yi (X̄i - X̄)² - (C - 1) EPV} / {Σi Yi - (Σi Yi²) / (Σi Yi)},
if estimated VHM is negative, set Z = 0.

2013-4-12 Empirical Bayesian Cred. 9 Important Ideas, HCM 10/22/12, Page 141

Variation in Exposures (Section 4)

Xit = pure premium (or frequency) for class i and year t.
Observe C classes each for Y years.
mit = exposures for class i and year t.
mi = Σt mit = total exposures for class i.
m = Σi mi = total exposures.

X̄i = Σt mit Xit / mi.     X̄ = Σi Σt mit Xit / m.

vi = Σt mit (Xit - X̄i)² / (Y - 1).

EPV = Σi vi / C = Σi Σt mit (Xit - X̄i)² / {C (Y - 1)}.

VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / {m - (Σi mi²)/m},
if the estimated VHM is negative, set Z = 0.

K = EPV / VHM.     Zi = mi / (mi + K).

When there is variation in exposures, in order to preserve the total losses, apply the
complement of credibility, 1 - Z, to the credibility weighted average of the class means.

2013-4-12 Empirical Bayesian Cred. 9 Important Ideas, HCM 10/22/12, Page 142

Variation in Years and Exposures (Section 5)

EPV = Σi (Yi - 1) vi / Σi (Yi - 1) = Σi Σt mit (Xit - X̄i)² / Σi (Yi - 1).

VHM = {Σi mi (X̄i - X̄)² - EPV (C - 1)} / {m - (Σi mi²)/m},
if estimated VHM is negative, set Z = 0.

The formulas in previous sections are special cases of the formulas in this section; the
formulas using an a priori mean are not.

2013-4-12 Empirical Bayesian Cred. 9 Important Ideas, HCM 10/22/12, Page 143

Using an A Priori Mean (Section 6)

μ = a priori estimate of the mean. The estimate of the EPV is the same as that in the
absence of relying on an a priori estimate of the mean.
(1 - Z) is applied to the a priori mean, μ.

No Variation in Exposures, More than One Class:
VHM = Σi (X̄i - μ)² / C - EPV / Y, if estimated VHM is negative, set Z = 0.

No Variation in Exposures, One Class:
EPV = Σt (Xt - X̄)² / (Y - 1).   VHM = (X̄ - μ)² - (EPV / Y), if estimated VHM is negative, set Z = 0.

Using an A Priori Mean, Variation in Exposures (Section 7)

VHM = {Σi mi (X̄i - μ)² - (C)(EPV)} / m, if the estimated VHM is negative, set Z = 0.

Variation in Exposures, One Class:
EPV = Σt mt (Xt - X̄)² / (Y - 1).   VHM = (X̄ - μ)² - EPV / m, if estimated VHM < 0, set Z = 0.

Assuming a Poisson Frequency (Section 8)

EPV = X̄. Estimate the VHM using the same formula as one would otherwise.
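As an illustration (not part of the original study guide), here is a minimal Python sketch of this recipe applied to the data of 4, 11/05, Q.22 worked in Section 8: with a Poisson frequency, EPV = X̄, and the VHM uses the usual formula with exposures.

# Sketch (assuming Python) of empirical Bayes estimation with a Poisson frequency.
exposures = [10, 20, 30]            # insureds by territory
claims = [4, 5, 3]                  # claims by territory

m = sum(exposures)                                       # total exposures = 60
xbar_i = [c / e for c, e in zip(claims, exposures)]      # 0.4, 0.25, 0.1
xbar = sum(claims) / m                                   # overall mean = 0.2
C = len(exposures)

EPV = xbar                                               # Poisson assumption
numerator = sum(e * (xi - xbar) ** 2
                for e, xi in zip(exposures, xbar_i)) - (C - 1) * EPV
denominator = m - sum(e ** 2 for e in exposures) / m
VHM = numerator / denominator                            # about 0.00954
K = EPV / VHM                                            # about 21.0
Z_A = exposures[0] / (exposures[0] + K)                  # about 0.323
print(round(VHM, 5), round(K, 1), round(Z_A, 3))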

Mahler's Guide to Simulation
Joint Exam 4/C

prepared by
Howard C. Mahler, FCAS
Copyright 2013 by Howard C. Mahler.

Study Aid 2013-4-13

Howard Mahler
hmahler@mac.com
www.howardmahler.com/Teaching

2013-4-13, Simulation, HCM 10/25/12, Page 1

Mahler's Guide to Simulation

Copyright 2013 by Howard C. Mahler.
Information in bold, and sections whose titles are in bold, are more important to pass your exam.
Larger bold type indicates it is extremely important. Information presented in italics (and sections
whose titles are in italics) should not be needed to directly answer exam questions and should be
skipped on first reading. It is provided to aid the reader's overall understanding of the subject, and to
be useful in practical applications.
Highly Recommended problems are double underlined.
Recommended problems are underlined. Solutions to the problems in each section are at the end
of that section.1
Section #   Pages     Section Name
1           4-5       Introduction
2           6-9       Uniform Random Numbers
3           10-30     Continuous Distributions, Inversion Method
4           31-63     Discrete Distributions, Inversion Method
5           64-79     Simulating Normal and LogNormal Distributions
6           80-90     Simulating Brownian Motion
7           91-111    Simulating Lifetimes
8           112-118   Miscellaneous, Inversion Method
9           119-136   Simulating a Poisson Process
10          137-158   Simulating a Compound Poisson Process
11          159-185   Simulating Aggregate Losses and Compound Models
12          186-218   Deciding How Many Simulations to Run
13          219-229   Simulating a Gamma and Related Distributions
14          230-248   Simulating Mixtures of Models
15          249-251   Simulating Splices
16          252-277   Bootstrapping
17          278-308   Bootstrapping via Simulation
18          309-337   Estimating p-values via Simulation
19          338-341   An Example of a Simulation Experiment
20          342-351   Summary and Important Ideas

Note that problems include both some written by me and some from past exams. The latter are copyright by the
CAS and SOA and are reproduced here solely to aid students in studying for exams. The solutions and comments
are solely the responsibility of the author; the CAS and SOA bear no responsibility for their accuracy. While some of
the comments may seem critical of certain questions, this is intended solely to aid you in studying and in no way is
intended as a criticism of the many volunteers who work extremely long and hard to produce quality exams. In some
cases I've rewritten past exam questions in order to match the notation in the current Syllabus.

2013-4-13,

Simulation,

HCM 10/25/12,

Page 2

Course 3 Exam Questions by Section of this Study Aid

Section Sample

5/00

1
2
3
4
5
6
7
8

11/00

5/01

11/01

11

13

8
12

32

11/02

CAS

SOA

CAS

CAS

SOA

11/03

11/03

5/04

11/04

11/04

30

40
39
38

38-40

32
22

9
10
11
13
14
15

19

43-44
37

33
6 34
5

10
40

Questions no longer on the syllabus: Sample 3, Q.2, Q.7; 3, 5/00, Q.14; 11/00, Q.37-38;
11/01, Q.14; 11/02 Q. 23. CAS3, 11/03, Q.36-37.
I have rewritten: CAS3, 11/03, Q.40; CAS3, 11/04, Q.38-39.
The CAS/SOA did not release the 5/02 and 5/03 Course 3 exams.
The SOA did not release its 5/04 Course 3 exam.
From 5/00 to 5/03, the Course 3 Exam was jointly administered by the CAS and SOA.
Starting in 11/03, the CAS and SOA gave separate exams.

Course 4 Exam Questions by Section of this Study Aid


Section

Sample

12

14

16

38

5/00

11/00

5/01

11/01

11/02

11/03

35

26

11/04

17
17

26

17
18

Questions no longer on the syllabus: 4, 5/01, Q.29; 4, 11/03, Q.10.


The CAS/SOA did not release the 5/02, 5/03, and 5/04 Course 4 exams.

16

2013-4-13,

Simulation,

HCM 10/25/12,

Joint Exam 4/C Questions by Section of this Study Aid


Section
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

5/05

11/05

11/06

5/07

32
12
34

27

21

8
16

4
11

9
37

29

The CAS/SOA did not release the 5/06, 11/07, and subsequent exams.

Page 3

2013-4-13,

Simulation 1 Introduction,

HCM 10/25/12,

Page 4

Section 1, Introduction
The simulation concepts in chapter 21 of Loss Models are demonstrated.
Some additional simulation concepts are discussed in Mahler's Guide to Risk Measures.
Many of you will find this study guide a good review of different ideas; many actuaries understand a
concept much more clearly once they know how to simulate it.
Simulation allows the actuary to analyze complex situations that cannot be handled in closed form, or
that would be difficult to handle by analytical techniques. As long as one can simulate each piece of a
process, one can simulate a complicated model, such as a model of an insurer. Once a simulation
model is created, little creative thought is required.2
Simulation can also be useful to check the results of work done by other techniques.
Simulation experiments can help an actuary to develop his intuition, so he will be better able to
apply actuarial judgement to real world problems.
If one wishes to simulate a random variable S, then there are 4 steps:3
1. Build a model for S which depends on random variables.
We need to know the distributions and dependencies of these random variables.
2a. Simulate the variables from step one.4
2b. Compute S for this simulated set of values.
2c. Repeat steps 2a and 2b many times, recording the output value of S each time.5
3. Use the outputs from step 2 as an approximation to the distribution of S.6
4. Estimate quantities of interest such as the mean of S, the variance of S, the 99th percentile
of S, the limited expected value of S at 7, the probability that S is greater than 10, etc.
Hopefully, this will all be much more concrete after going through the many examples in this study
guide. For example, S might be the aggregate losses of an insured for a given year.
2 See Section 21.1.1 of Loss Models.
3 See Section 21.1.1 of Loss Models. These steps describe in a general way many, but not all, uses of simulation by actuaries.
4 Some of these variables may have to be simulated more than once.
5 Each time you did steps 2a and 2b would be referred to as one simulation run. For a given application, one might
perform 1000 simulation runs.
6 If one has done the modeling correctly, and one has performed enough simulation runs, one should have a good
approximation to the true distribution of S.

2013-4-13,

Simulation 1 Introduction,

HCM 10/25/12,

Page 5

For example, assume that frequency is Binomial with m = 5 and q = 0.1,
and severity is Weibull with θ = 100 and τ = 2.
Frequency and severity are independent.
The size of each claim is independent of that of any other claim.7
Then one would simulate the annual aggregate losses as follows:
I. Simulate the number of claims from the Binomial Distribution.8
II. Simulate the size of each claim separately from the Weibull Distribution.9
III. S, the annual aggregate losses for this simulation run, is the sum of the values from step II.
IV. Return to step I and repeat the process.
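As an illustration (not from Loss Models), here is a minimal Python sketch of steps I through IV for this Binomial/Weibull example; how to simulate the Binomial and the Weibull is covered in detail in later sections.

# Sketch (assuming Python) of steps I-IV for the Binomial/Weibull example above.
import math
import random

def simulate_aggregate(m=5, q=0.1, theta=100.0, tau=2.0):
    # Step I: simulate the number of claims, summing m Bernoulli(q) draws.
    n = sum(1 for _ in range(m) if random.random() < q)
    # Step II: simulate each claim size from the Weibull by inversion.
    sizes = [theta * (-math.log(1.0 - random.random())) ** (1.0 / tau) for _ in range(n)]
    # Step III: the aggregate losses for this simulation run are the sum of the claim sizes.
    return sum(sizes)

# Step IV: repeat many times; the outputs approximate the distribution of S.
random.seed(1)
runs = [simulate_aggregate() for _ in range(10000)]
print(sum(runs) / len(runs))   # should be near E[N] E[X] = (0.5)(100) Gamma(1.5) = 44.3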

7 This is an example of the usual collective risk model of aggregate losses.
See Mahler's Guide to Aggregate Distributions.
8 How to simulate a random draw from a Binomial will be discussed in a later section.
9 How to simulate a random draw from a Weibull will be discussed in a later section. If one had simulated 3 claims in
step I, one would simulate 3 independent identically distributed Weibulls. If zero claims were simulated, then S = 0.

2013-4-13,

Simulation 2 Uniform Random Numbers,

HCM 10/25/12,

Page 6

Section 2, Uniform Random Numbers


Uniform random numbers from the interval [0,1] are the basis of all the simulation
techniques covered on the exam.
While in practice these random numbers are produced by an algorithm, they are intended to be a
reasonable approximation to independent draws from the interval [0,1], where any number is as
likely to come up as any other number.10 11
If [c, d] is any arbitrary subinterval of [0,1], then the chance of the next random draw being in the
interval [c, d] should be equal to its length (d - c). You are not responsible for knowing how these
random numbers are produced. In exam questions, you will given a random number or a series of
random numbers from [0, 1] to use.
Other Intervals than [0,1]
Uniform random numbers from [0,1] can be used to easily generate uniform random numbers from
an arbitrary interval [r, s]. One needs to multiply by (s - r), the width of the desired interval, and then
add the constant r, which shifts the interval to the desired location.
To get a uniform random number from [r, s], take (s - r) u + r, where u is a random number
from [0, 1].
For example, assume we are given 0.839 as a random number from [0,1].
If instead we needed numbers from the interval [-10, +15], then one multiplies by 25, the width of
the desired interval, and then one adds -10: (25)(0.839) - 10 = 10.975.
Exercise: 0.516, 0.854, 0.129, and 0.731, are four uniform random numbers from [0, 1].
Use these to produce four uniform random numbers from [100, 500].
[Solution: (400)(0.516) + 100 = 306.4, (400)(0.854) + 100 = 441.6, (400)(0.129) + 100 = 151.6,
and (400)(0.731) + 100 = 392.4.]
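A one-line version of this rescaling, as a Python sketch (an illustration, not from the syllabus readings):

# Sketch (assuming Python): rescale uniform [0,1] draws to the interval [r, s].
def rescale(u, r, s):
    return (s - r) * u + r

print([rescale(u, 100, 500) for u in (0.516, 0.854, 0.129, 0.731)])
# approximately [306.4, 441.6, 151.6, 392.4], matching the exercise above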

10

These are pseudorandom numbers. While they will be treated as if they are random, since they are
deterministically generated they can not really be random; there is always some pattern remaining. Although I will
refer to random numbers for simplicity, I should really say pseudorandom numbers.
11
There are practical difficulties of assuring that consecutive draws or series of consecutive draws do not have a
pattern. In practical applications any such pattern may have to be overcome by some sort of shuffling. Unfortunately,
one popular technique, congruential generators, generally exhibit such patterns. See for example, pages 317-320
of A New Kind of Science by Stephen Wolfram. Therefore, the Mathematica software package, developed by
Dr. Wolfram, uses instead a cellular automation rule in order to produce random numbers.

2013-4-13,

Simulation 2 Uniform Random Numbers,

HCM 10/25/12,

Page 7

Use of Uniform Random Numbers from [0, 1]:


Every simulation technique discussed subsequently will use as the input either a single random
number from [0, 1] or a series of independent random numbers from [0, 1] .
When given a series of random numbers from [0, 1], use them in the order given, and use
each one at most once. Once a random number is used, it is used up.12

12

Reusing a random number would normally make the output nonrandom and/or would introduce a dependence
between pieces of the output. While in some particular situations there exist clever methods of avoiding this
problem, they are not covered on the syllabus .

2013-4-13,

Simulation 2 Uniform Random Numbers,

HCM 10/25/12,

Page 8

Problems:
2.1 (1 point) You are given that .426 is a random number from the interval [0, 1].
Generate a random number in the interval [20, 30].
A. less than 21
B. at least 21 but less than 22
C. at least 22 but less than 23
D. at least 23 but less than 24
E. at least 24
2.2 (3 points) Let X and Y be independent, identically distributed variables, each uniformly
distributed on [0, 100].
Let Z = Minimum[X, Y].
Simulate 10 values of Z using the following 20 random numbers from [0, 1]:
0.574, 0.079, 0.803, 0.382, 0.507, 0.848, 0.090, 0.631, 0.246, 0.724, 0.968, 0.372, 0.653,
0.736, 0.329, 0.757, 0.915, 0.177, 0.770, 0.403.
(Use the first pair to simulate one value of Z, the second pair to simulate a second value of Z, etc.)
What is the average of the ten simulated values of Z?
A. 30
B. 32
C. 34
D. 36
E. 38
2.3 (CAS3, 11/04, Q.40) (2.5 points)
An actuary uses the following algorithm, where U is a random number generated from the
uniform distribution on [0,1], to simulate a random variable X:
(1) If U < 0.40, set X = 2, then stop.
(2) If U < 0.65, set X = 1, then stop.
(3) If U < 0.85. set X = 3, then stop.
(4) Otherwise, set X = 4, then stop.
What are the probabilities for X = 1, 2, 3, 4, respectively?
A. 0.40, 0.25, 0.15, 0.20
B. 0.25, 0.40, 0.20, 0.15
C. 0.15, 0.25, 0.20, 0.40
D. 0.15, 0.20, 0.40, 0.25
E. 0.20, 0.25, 0.15, 0.40

2013-4-13,

Simulation 2 Uniform Random Numbers,

HCM 10/25/12,

Page 9

Solutions to Problems:
2.1. E. Multiply by the width of the desired interval and then add a constant to translate the interval.
In this case, v = 10u + 20 = 4.26 + 20 = 24.26.
2.2. B. Using the first pair of .574 and .079, one gets a simulated x = (100)(.574) = 57.4 and
y = (100)(.079) = 7.9. z = Min[x, y] = 7.9.
One obtains the remaining simulated values of Z similarly, and the average is:
(7.9 + 38.2 + 50.7 + 9.0 + 24.6 + 37.2 + 65.3 + 32.9 + 17.7 + 40.3)/10 = 32.38.
Comment: The mean of either X or Y is 50. However, we expect the minimum of two identically
distributed values to have a smaller mean. (We would expect the maximum of two identically
distributed variables to have a larger mean.) The distribution of the smallest of two identical,
independent variables is: 1 - S(t)². In this example, the distribution of Z is:
1 - (1 - z/100)² = z/50 - z²/10000, 0 < z < 100.
The density of Z is: f(z) = 1/50 - z/5000, 0 < z < 100.
[Figure: graph of f(z), decreasing linearly from 0.02 at z = 0 to 0 at z = 100.]
Z has a mean of 100/3 = 33.333 < 50.


2.3. B. There is a 40% chance we stop at step 1, and thus a 40% chance of a 2.
There is a 65% chance we stop at step 2 or before; therefore, there is a 65% - 40% = 25% chance
that we stop at step 2. There is a 25% chance of a 1.
Probability of a 3 is: 85% - 65% = 20%. Probability of a 4 is: 100% - 85% = 15%.
The probabilities for X = 1, 2, 3, 4, respectively are: 0.25, 0.40, 0.20, 0.15.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 10

Section 3, Continuous Distributions, Inversion Method


There are two related simulation methods called the inversion method. One applies to continuous
distributions, which will be discussed in this section, while the other applies to discrete distributions,
to be discussed in the next section.
Let's assume we wish to simulate a random draw from an Exponential Distribution with mean 1000,
F(x) = 1 - e-x/1000.
Let 0.90 be a random number from [0,1].
Set F(x) = 0.9 and solve for x.
0.9 = 1- e-x/1000 x = -1000 ln(1 - 0.9) = 2303.
If instead we set S(x) = 0.9, then
0.9 = e-x/1000 x = -1000 ln(0.9) = 105.
F(x) and S(x) are shown below:
[Figure: graph of F(x) rising from 0 toward 1, and S(x) falling from 1 toward 0,
for x from 0 to 3000.]
2303 is where the Distribution Function is 0.9, while 105 is where the Survival Function is 0.9.
Either technique is valid. In all problems involving simulation via inversion, there are always (at least)
two ways to proceed.
This is an example of simulation of a continuous distribution by the inversion method.
The key idea is that for any continuous distribution y = F(x) is uniformly distributed on the
interval [0,1].13
13

y = S(x) is also uniformly distributed on the interval [0,1].

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 11

Assume u is a random draw from the uniform distribution on [0,1].


To simulate a random draw from the distribution function F(x), we either set u = F(x) or u = S(x), and
solve for x.
Note that if we set u = F(x), large random numbers correspond to large simulated losses.
If instead we set u = S(x), then large random numbers correspond to small simulated losses.
For the Exponential Distribution, if one sets u = F(x) = 1 - e-x/θ, then x = -θ ln(1-u).
If one instead sets u = 1 - F(x) = S(x), then x = -θ ln(u).
Let u be a random number from (0,1).
While ideally exam questions should specify whether large random numbers correspond to large or
small losses, letting you know whether to set u equal to F(x) or S(x), recent questions do not.
If in an exam question it is not stated which way to perform the method of inversion,
set F(x) = u, since this is the manner shown in Loss Models, and then solve for x.14
Setting F(x) = u is the same mathematics as solving for a percentile.
Therefore, for those cases where formulas are given in Appendix A, we can
determine VaRp(X), for p = u.
For example, for the Exponential Distribution, VaRp(X) = -θ ln(1-p).
Letting p = u, we get x = -θ ln(1-u), matching the previous result.
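As an illustration (not from Loss Models), here is a short Python sketch of the inversion method for the Exponential above and for a Pareto, which appears in a later exercise in this section:

# Sketch (assuming Python) of simulation by inversion: set u = F(x) and solve for x.
import math

def sim_exponential(u, theta=1000.0):
    # u = 1 - exp(-x/theta)  =>  x = -theta ln(1 - u)
    return -theta * math.log(1.0 - u)

def sim_pareto(u, alpha=3.0, theta=500.0):
    # u = 1 - {theta/(theta + x)}^alpha  =>  x = theta {(1 - u)^(-1/alpha) - 1}
    return theta * ((1.0 - u) ** (-1.0 / alpha) - 1.0)

print(round(sim_exponential(0.90)))   # 2303, as in the Exponential example above
print(round(sim_pareto(0.85)))        # 441, as in the Pareto exercise later in this section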

Exercise: Loss sizes are distributed via F(x) = 1 / {1 + (1000/x)^0.7}.
Let 0.312 be a random draw from the uniform distribution on [0,1].
Use this random draw to simulate a random loss size.
Assume large random numbers correspond to large losses.
[Solution: 0.312 = F(x) = 1 / {1 + (1000/x)^0.7}. Solve for x = 1000 {(1/0.312) - 1}^(-1/0.7) = 323.
Alternately, this is a LogLogistic Distribution with parameters γ = 0.7 and θ = 1000.
For the LogLogistic Distribution, VaRp(X) = θ {p⁻¹ - 1}^(-1/γ) = (1000) {1/0.312 - 1}^(-1/0.7) = 323.
Comment: One can check as follows: F(323) = 1 / {1 + (1000/323)^0.7} = 0.312.]

14 In practical applications, use whichever manner you wish.
In practical applications, use whichever manner you wish.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 12

This inversion method makes sense, since both u and F(x) are uniformly distributed on the
interval [0,1]. By setting u = F(x), the chance of x being in an arbitrary subinterval [c,d] of [0,1],
will be equal to the chance that u is in the interval [F(c), F(d)]. This chance is F(d) - F(c), since u
is uniformly distributed on [0,1]. This is precisely what is desired; thats what is meant by x
being a random draw from F(x).
Setting u = F(x) is only useful provided one can solve for x. We could do so in the case of the
Exponential and LogLogistic Distributions, because these Distribution functions can be inverted (in
closed algebraic form.)
Simulation by the inversion method will work for any Distribution Function, F(x), that
can be algebraically inverted.15
Examples of distributions that may be simulated in this manner include the: Exponential,
Inverse Exponential, Weibull, Inverse Weibull, Pareto, Inverse Pareto, Burr, Inverse Burr,
ParaLogistic, Inverse ParaLogistic, LogLogistic, and Single Parameter Pareto.16
Exercise: Let 0.85 be a random number from [0,1].
Using the inversion method, with large random numbers corresponding to large losses, simulate a
random loss from a Pareto Distribution, with parameters α = 3 and θ = 500.
[Solution: Set 0.85 = F(x) = 1 - {500/(500 + x)}³. Solve for x. (1 - 0.85)^(-1/3) = (500 + x)/500.
x = (500)(0.15^(-1/3)) - 500 = 441.
Alternately, for the Pareto Distribution,
VaRp(X) = θ {(1 - p)^(-1/α) - 1} = (500) {(1 - 0.85)^(-1/3) - 1} = (500)(0.8821) = 441.
Comment: 1 - {500/(500 + 441)}³ = 0.85.]

15
16

Also called Simulation by Inversion or Simulation by Algebraic Inversion.


Each of these distributions has a formula for VaRp (X) in Appendix A attached to the exam.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 13

F(x) is uniform on [0,1]:


For example, take an Exponential Distribution with = 1000 and let y = 1 - e-x/1000.
Then the chance of y being in [0.65, 0.66] will be equal to the chance that x is such that
0.65 1 - e-x/1000 0.66. Since x was assumed to follow an Exponential Distribution with
= 1000, by the definition of the cumulative distribution function, this chance is: 0.66 - 0.65 = 0.01.
Thus the chance of y being in the subinterval [0.65, 0.66] is equal to the width of that interval.
More generally assume x follows the Distribution Function F(x). Then for y = F(x), the chance of y
being in an arbitrary subinterval [c, d] of [0, 1], will be equal to the chance that x is such that:
c F(x) d. By the definition of the cumulative distribution function, this chance is: d - c.
Thus the chance of y being in any subinterval of [0, 1] is proportional to the width of that subinterval,
with the same proportionality constant of 1 regardless of the subinterval. Thus y is uniformly
distributed, with density function h(y) = 1 for 0 y 1.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 14

Problems:
3.1 (1 point) Using the inversion method, you are simulating a random loss from the exponential
distribution F(x) = 1 - exp(-x/1000).
You draw the value 0.988 from the uniform distribution on [0,1].
What is the corresponding size of loss?
A. less than 4100
B. at least 4100 but less than 4200
C. at least 4200 but less than 4300
D. at least 4300 but less than 4400
E. at least 4400
3.2 (1 point) An Inverse Weibull Distribution has parameters = 15 and = 3.
Simulate a random draw from this distribution using the random number 0.215.
A. 9
B. 10
C. 11
D. 12
E. 13
3.3 (1 point) You wish to simulate via the inversion method a random variable, X, with probability
density function: f(x) = 3000 x-4 , 10 < x < .
To do so, you use a random variable Y, with probability density function: f(y) = 1; 0 < y < 1.
Which simulated value of X corresponds to a sample value of Y equal to 0.35?
A. Less than 11.0
B. At least 11.0, but less than 11.2
C. At least 11.2, but less than 11.4
D. At least 11.4, but less than 11.6
E. 11.6 or more
3.4 (1 point) An Inverse Paralogistic Distribution has parameters = 6 and = 2000.
Simulate a random draw from this distribution using the random number 0.396.
A. 1900
B. 2100
C. 2300
D. 2500
E. 2700
3.5 (1 point) You wish to model a Pareto distribution with = 1.5 and = 3 via simulation.
A value of 0.6 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value from the Pareto distribution corresponds to this 0.6?
A. Less than 2.6
B. At least 2.6, but less than 2.7
C. At least 2.7, but less than 2.8
D. At least 2.8, but less than 2.9
E. 2.9 or more

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 15

3.6 (1 point) An Inverse Exponential Distribution has = 400.


Simulate a random draw from this distribution using the random number 0.092.
A. 150
B. 170
C. 190
D. 210
E. 230
3.7 (1 point) Suppose you wish to model a distribution of the form f(x) = qxq-1, 0 < x < 1,
with q = 0.3, by simulation.
A value of 0.8 is randomly generated from a uniform distribution on the interval (0,1).
What simulated value corresponds to this 0.8?
A. Less than 0.45
B. At least 0.45, but less than 0.50
C. At least 0.50, but less than 0.55
D. At least 0.55, but less than 0.60
E. 0.60 or more
3.8 (1 point) An Inverse Pareto Distribution has parameters = 4 and = 300.
Simulate a random draw from this distribution using the random number 0.774.
A. 3000
B. 3500
C. 4000
D. 4500
E. 5000
3.9 (1 point) You want to generate a random variable, X, with probability density function:
f(x) = 0.25 x3 for 0 < x < 2.
To do so, you first use a random number, Y, which is uniformly distributed on (0,1).
What simulated value of X corresponds to a sample value of Y equal to 0.125?
A. 0.750
B. 0.879
C. 1.000
D. 1.189
E. 1.250
3.10 (1 point) An Inverse Burr Distribution has parameters = 5, = 700, and = 3.
Simulate a random draw from this distribution using the random number 0.871.
A. 1900
B. 2100
C. 2300
D. 2500
E. 2700
3.11 (1 point)
You wish to model a Weibull distribution with parameters = 464 and = 3 by simulation.
A value of 0.4 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value from the Weibull distribution corresponds to this 0.4?
A. Less than 360
B. At least 360, but less than 380
C. At least 380, but less than 400
D. At least 400, but less than 420
E. 420 or more

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 16

3.12 (1 point) You wish to simulate a LogLogistic Distribution as per Loss Models with parameters
= 250 and = 4.
A value of 0.37 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value from the LogLogistic distribution corresponds to this 0.37?
A. Less than 220
B. At least 220, but less than 225
C. At least 225, but less than 230
D. At least 230, but less than 235
E. 235 or more
3.13 (1 point) You wish to simulate a random value from the distribution
F(x) = 1 - exp[-0.1(1.05x - 1)].
A value of 0.61 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value from this distribution corresponds to this 0.61?
A. Less than 35
B. At least 35, but less than 40
C. At least 40, but less than 45
D. At least 45, but less than 50
E. 50 or more
3.14 (1 point) You wish to simulate a ParaLogistic Distribution as per Loss Models with parameters
= 6 and = 50.
A value of 0.89 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value from the ParaLogistic distribution corresponds to this 0.89?
A. Less than 35
B. At least 35, but less than 40
C. At least 40, but less than 45
D. At least 45, but less than 50
E. 50 or more
3.15 (3 points) Using the inversion method, simulate two random draws from the distribution:
1
x
F(x) = +
, - < x < . Use the random numbers 0.274 and 0.620.
2
8 + 4x2
What is the sum of these two simulated values?
A. Less than -0.4
B. At least -0.4, but less than -0.3
C. At least -0.3, but less than -0.2
D. At least -0.2, but less than -0.1
E. -0.1 or more

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 17

3.16 (3 points) You are given the following graph of a cumulative distribution function:
[Figure: F(x) rises linearly from (0, 0) to (1000, 0.7), then linearly to (5000, 0.9),
then linearly to (10000, 1.0).]

You simulate 4 losses from this distribution using the following random numbers from (0, 1):
0.812, 0.330, 0.626, 0.941.
For a policy with a $500 deductible, determine the simulated average payment per payment.
A. Less than 2500
B. At least 2500, but less than 3000
C. At least 3000, but less than 3500
D. At least 3500, but less than 4000
E. 4000 or more

1
3.17 (1 point) You wish to model a Burr distribution, F(x) = 1 -
, by simulation.
1 + (x / )
The Burr distribution has parameters = 3, = 100, and = 2.
A value of 0.75 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value from the Burr distribution corresponds to this 0.75?
A. Less than 77
B. At least 77, but less than 78
C. At least 78, but less than 79
D. At least 79, but less than 80
E. 80 or more

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 18

3.18 (3 points) Annual prescription drug costs are modeled by a two-parameter Pareto distribution
with = 2000 and = 2.
A prescription drug plan pays annual drug costs for an insured member subject to the
following provisions:
(i) The insured pays 100% of costs up to the ordinary annual deductible of 250.
(ii) The insured then pays 25% of the costs between 250 and 2250.
(iii) The insured pays 100% of the costs above 2250 until the insured has paid 3600 in total.
This occurs when annual costs reach 5100.
(iv) The insured then pays 5% of the remaining costs.
Use the following random numbers to simulate four years: 0.58, 0.94, 0.13, 0.80.
What are the total plan payments for the four simulated years?
A. Less than 4500
B. At least 4500, but less than 4600
C. At least 4600, but less than 4700
D. At least 4700, but less than 4800
E. 4800 or more
3.19 (3 points) The Cauchy Distribution has density f(x) =

1
, - < x < .
{1 + (x - )2 }

Using the inversion method, simulate two random draws from a Cauchy Distribution with = 2.
Use the random numbers 0.313 and 0.762.
What is the product of these two simulated values?
d tan -1(x)
1
Hint:
=
.
dx
1 + x2
A. 2.5

B. 3.0

C. 3.5

D. 4.0

E. 4.5

3.20 (2 points) Loss sizes for liability risks follow a Pareto distribution,
with parameters = 300 and = 4.
Loss sizes for property risks follow a Pareto distribution,
with parameters = 1,000 and = 3.
Using the inversion method, a loss of each type is simulated. Use the random number 0.733 to
simulate the liability loss and the random number 0.308 to simulate the property loss.
What is the size of the larger simulated loss?
A. Less than 120
B. At least 120, but less than 125
C. At least 125, but less than 130
D. At least 130, but less than 135
E. At least 135

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 19

3.21 (2 points) Losses follow a LogLogistic Distribution with parameters = 2000 and = 3.
The following are three uniform (0,1) random numbers:
0.5217
0.1686
0.9485
Using these numbers and the inversion method, simulate three losses.
Calculate the average payment per payment for a contract with a deductible of 1000.
A. Less than 2000
B. At least 2000, but less than 2250
C. At least 2250, but less than 2500
D. At least 2500, but less than 2750
E. 2750 or more
3.22 (2 points) Use the following:

Two actuaries each simulate one loss from a Weibull Distribution with parameters
= 2 and = 1500.

Each uses the random number 0.312 from (0, 1) and the Inverse Transform Method.
Laurel has large random numbers correspond to large losses; her simulated loss is L.
Hardy has large random numbers correspond to small losses; his simulated loss is H.
What is H/L?
A. Less than 1.60
B. At least 1.60, but less than 1.65
C. At least 1.65, but less than 1.70
D. At least 1.70, but less than 1.75
E. 1.75 or more
3.23 (2 points) X follows a probability density function: 3 / exp[3x + 12], x > -4.
Using the random number 0.2203 and the inversion method, simulate a random value of X.
A. -3.9
B. -3.8
C. -3.7
D. -3.6
E. -3.5
3.24 (4, 5/86, Q.53) (1 point) You wish to generate, via the method of simulation, a random
variable, X, with probability density function: f(x) = 20,000 x-3; 100 < x < .
To do so, you use a random variable Y, with probability density function: f(y) = 1; 0 < y < 1.
Which simulated value of X corresponds to a sample value of Y equal to 0.25?
100
100
100
A.
B.
C.
D. (80,000)1/3
E. None of the above.
0.75
0.25
0.25

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 20

3.25 (4, 5/87, Q.49) (1 point) You wish to model a Pareto distribution by simulation.
The Pareto distribution has parameters = 0.50 and = 3.
A value of 0.5 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value from the Pareto distribution corresponds to this 0.5?
A. Less than 7.5
B. At least 7.5, but less than 8.5
C. At least 8.5, but less than 9.5
D. At least 9.5, but less than 10.5
E. 10.5 or more
3.26 (4, 5/88, Q.52) (1 point) Suppose you wish to model a distribution of the form
f(x) = qxq-1, 0 < x < 1, with q = 0.5, by simulation.
A value of 0.75 is randomly generated from a uniform distribution on the interval (0,1).
What simulated value corresponds to this 0.75?
A. Less than 0.45
B. At least 0.45, but less than 0.50
C. At least 0.50, but less than 0.55
D. At least 0.55, but less than 0.60
E. 0.60 or more
3.27 (4, 5/90, Q.26) (1 point) You want to generate a random variable, X, with probability density
function f(x) = (3/8) x2 for 0 < x < 2 by simulation.
To do so, you first use a random number, Y, which is uniformly distributed on (0,1).
What simulated value of X corresponds to a sample value of Y equal to 0.125?
A. 0.125
B. 0.250
C. 0.500
D. 0.750
E. 1.000
3.28 (4, 5/91, Q.24) (1 point) A random number generator producing numbers in the unit interval
[0,1] is used to produce an Exponential distribution with parameter = 20.
The number produced by the random number generator is y = 0.21.
Use the inversion method, to calculate the corresponding value of the exponential distribution.
A. x = 4.71
B. x = 0.01
C. x = 28.70
D. x = 0.24
E. x = 1.18
3.29 (4B, 11/92, Q.29) (1 point) You are given the following:

The random variable X has the density function f(x) = 4x-5 , x > 1.

The random variable Y having the uniform distribution on (0,1) is used to simulate
outcomes for X.
Determine the simulated value of X that corresponds to an observed value of Y = 0.250.
A. 0.931
B. 0.944
C. 1.000
D. 1.059
E. 1.075

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 21

3.30 (4B, 5/94, Q.11) (2 points) You are given the following:

The random variable X for the amount of an individual claim has the density function
f(x) = 2x-3, x 1.

The random variable R for the ratio of loss adjustment expense (LAE) to loss is
uniformly distributed on the interval [0.01, 0.21].
The amount of an individual claim is independent of the ratio of LAE to loss.
The random variable Y having the uniform distribution on [0,1) is used to simulate
outcomes for X and R, respectively.
Observed values of Y are Y1 = 0.636 and Y2 = 0.245.

Using Y1 to simulate X and Y2 to simulate R, determine the simulated value for X(1+R).
A.
B.
C.
D.
E.

Less than 0.65


At least 0.65, but less than 1.25
At least 1.25, but less than 1.85
At least 1.85, but less than 2.45
At least 2.45

3.31 (4B, 11/94, Q.18) (1 point) You are given the following:
The random variable X has the density function f(x) = 2 x-3, x > 1.
The random variable Y having the uniform distribution on [0,1] is used to simulate outcomes for X.
Determine the simulated value of X that corresponds to an observed value of Y = 0.50.
A. 0.630
B. 0.707
C. 1.414
D. 1.587
E. 2.000
3.32 (4B, 11/96, Q.27) (2 points) You are given the following:

1
The random variable X has a Burr distribution, F(x) = 1 -
,
1 + (x / )
with parameters = 1,000,000, = 2, and = 0.5.

A random number is generated from a uniform distribution on the interval (0, 1).
The resulting number is 0.36.
Determine the simulated value of X.
A. Less than 50,000
B. At least 50,000, but less than 100,000
C. At least 100,000, but less than 150,000
D. At least 150,000, but less than 200,000
E. At least 200,000

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 22

3.33 (4B, 5/97, Q.26) (2 points) You are given the following:

The random variable X has a distribution that is a mixture of a distribution with density function
f1 (x) = 4x3 , 0 < x < 1 and a distribution with density function
f2 (x) = 12x2 (1 - x), 0 < x < 1.

f1 (x) has a weight of 0.75 and f2 (x) has a weight of 0.25.

A random number is generated from a uniform distribution on the interval (0,1).


The resulting number is 0.064.
Determine the simulated value of X.
A. Less than 0.15
B. At least 0.15, but less than 0.25
C. At least 0.25, but less than 0.35
D. At least 0.35, but less than 0.45
E. At least 0.45
3.34 (4B, 5/98, Q.1) (1 point) You are given the following:

The random variable X has cumulative distribution function


1
F(x) = 1 , - < x< .
1 + exp[(x -1) / 2]

A random number is generated from a uniform distribution on the interval (0, 1).
The resulting number is 0.4.
Determine the simulated value of X.
A. Less than 0.25
B. At least 0.25, but less than 0.75
C. At least 0.75, but less than 1.25
D. At least 1.25, but less than 1.75
E. At least 1.75
3.35 (CAS3, 11/04, Q.39) (2.5 points)
The Gumbel distribution is given by: F(x) = exp[-e-x], for all x.
Use the random number from a uniform distribution on the interval [0,1], 0.9833, in order to simulate
a random value from the Gumbel distribution.
Use the inversion method, with large random numbers corresponding to large results.
A. Less than -2.0
B. At least -2.0, but less than 0
C. At least 0, but less than +2.0
D. At least +2.0, but less than +4.0
E. At least +4.0
Note: I have rewritten this exam question in order to match the current syllabus.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 23

3.36 (4, 11/06, Q.32 & 2009 Sample Q.275) (2.9 points)
A dental benefit is designed so that a deductible of 100 is applied to annual dental charges.
The reimbursement to the insured is 80% of the remaining dental charges subject to an annual
maximum reimbursement of 1000.
You are given:
(i) The annual dental charges for each insured are exponentially distributed with mean 1000.
(ii) Use the following uniform (0, 1) random numbers and the inversion method to generate
four values of annual dental charges: 0.30 0.92 0.70 0.08
Calculate the average annual reimbursement for this simulation.
(A) 522
(B) 696
(C) 757
(D) 947
(E) 1042

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 24

Solutions to Problems:
3.1. E. u = F(x) = 1 - e-x /1000 . Therefore, x = - ln(1 - u) = -1000 ln (1 - 0.988) = 4423.
Alternately, for the Exponential VaRp (X) = - ln(1-p) = -1000 ln (1 - 0.988) = 4423.
3.2. E. For the Inverse Weibull Distribution,
VaRp (X) = {-ln(p)}1/ = (15) {-ln(0.215)}-1/3 = 13.0.
Alternately, F(x) = exp[-(/x)] = exp[-(15/x)3 ].
0.215 = exp[-(15/x)3 ]. 1.5371 = (15/x)3 . x = 13.0.
3.3. D. F(x) = 1 - 1000x⁻³, 10 < x. Set 0.35 = F(x) = 1 - 1000x⁻³.
x = {1000/(1 - 0.35)}^(1/3) = 11.54.
Alternately, this is a Single Parameter Pareto Distribution with α = 3 and θ = 10.
For the Single Parameter Pareto Distribution, VaRp(X) = θ(1 - p)^(-1/α) = (10)(1 - 0.35)^(-1/3) = 11.54.
3.4. E. For the Inverse Paralogistic Distribution,
VaRp(X) = θ{p^(-1/τ) - 1}^(-1/τ) = (2000){0.396^(-1/6) - 1}^(-1/6) = 2695.
Alternately, F(x) = {(x/θ)^τ / (1 + (x/θ)^τ)}^τ = {(x/2000)⁶ / (1 + (x/2000)⁶)}⁶.
0.396 = {(x/2000)⁶ / (1 + (x/2000)⁶)}⁶.  0.85694 = (x/2000)⁶ / (1 + (x/2000)⁶).
0.16694 = (x/2000)⁻⁶.  x = 2695.


3.5. A. F(x) = 1 - {θ/(θ + x)}^α = 1 - {3/(3 + x)}^1.5.
Set 0.6 = F(x) = 1 - {3/(3 + x)}^1.5.  3/(3 + x) = 0.4^(1/1.5) = 0.543.  x = 2.52.
Alternately, for the Pareto Distribution,
VaRp(X) = θ{(1 - p)^(-1/α) - 1} = (3){(1 - 0.6)^(-1/1.5) - 1} = (3)(0.8420) = 2.52.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 25

3.6. B. For the Inverse Exponential Distribution,


VaRp (X) = {-ln(p)}-1 = (400) {-ln(0.092)}-1 = 168.
Alternately, F(x) = exp[-/x] = exp[-400/x].
0.092 = exp[-400/x]. x = 168.
3.7. B. F(x) = xq , 0< x <1. Set 0.8 = F(x) = xq = x.3. Thus x = .81/.3 = 0.475.
Comment: This is a Beta Distribution with parameters a = q, b = 1, and =1, although this doesnt
help in the solution. While we can simulate this particular Beta Distribution via the Inverse
Transform Algorithm, in general one would use the rejection method, not on the syllabus.
3.8. D. For the Inverse Pareto Distribution,
VaRp (X) = {p-1/ - 1}-1 = (300) {0.774-1/4 - 1}-1 = 4536.
Alternately, F(x) = {x/(x+)} = {x/(x+300)}4 .
0.774 = {x/(x+300)}4 . x = 4536.
3.9. D. F(x) = x4 / 16. Set .125 = F(x) = x4 / 16. x = 2.25 = 1.189.
3.10. C. For the Inverse Burr Distribution,
VaRp(X) = θ{p^(-1/τ) - 1}^(-1/γ) = (700){0.871^(-1/5) - 1}^(-1/3) = 2305.
Alternately, F(x) = {(x/θ)^γ / (1 + (x/θ)^γ)}^τ = {(x/700)³ / (1 + (x/700)³)}⁵.
0.871 = {(x/700)³ / (1 + (x/700)³)}⁵.  0.97276 = (x/700)³ / (1 + (x/700)³).
0.028003 = (x/700)⁻³.  x = 2305.
3.11. B. F(x) = 1 - exp(-(x/)) = 1 - exp(-(x/464)3).
Set 0.4 = F(x) = 1 - exp(-(x/.464)3). x = (464)(- ln.6)1/3 = 371.
Alternately, for the Weibull Distribution, VaRp (X) = {-ln(1-p)}1/ = (464) {- ln(1 - 0.4)}1/3 = 371.
3.12. A. F(x) = 1 - 1/{1 + (x/)} = 1 - 1/{1 + (x/250)4 }.
Set 0.37 = F(x)= 1 - 1/{1 + (x/250)4 }. x = (250) {1/.63 - 1}1/4 = 219.
Alternately, for the Loglogistic Distribution, VaRp (X) = {p-1 - 1}-1/ = (250) {1/.37 - 1}-1/4 = 219.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 26

3.13. D. Set .61 = F(x) = 1 - exp(-0.1( 1.05x -1)). x = ln(1 - ln(1-.61)/0.1) /ln(1.05) = 48.0.
Comment: A Gompertz Distribution as per Actuarial Mathematics, with c = 1.05 and
m = .1, ( B = .1 ln(1.05) = .0049.) We have simulated an age at death of 48.
3.14. C. F(x) = 1 - 1 / (1 + (x/)) = 1 - 1 / (1 + (x/50)6 )6 .
Set 0.89 = F(x)= 1 - 1 / (1 + (x/50)6 )6 . x = (50) {(1 - 0.89)-1/6 - 1}1/6 = 43.7.
Alternately, for the ParaLogistic Distribution,
VaRp (X) = {(1-p)-1/ - 1}1/ = (50) {(1 - 0.89)-1/6 - 1}1/6 = 43.7.
3.15. B. u = F(x) = 1/2 + x/√(8 + 4x²).  x = (u - 1/2)√(8 + 4x²).  x² = (2u - 1)²(2 + x²).
x² = 2(2u - 1)²/{1 - (2u - 1)²}.  x = (2u - 1)/√(2u - 2u²).
Note that this has the right property that when u < 1/2, x < 0, and when u > 1/2, x > 0.
The simulated values are: (2(0.274) - 1)/√(2(0.274) - 2(0.274)²) = -0.717,
and (2(0.62) - 1)/√(2(0.62) - 2(0.62)²) = 0.350.  Their sum is: -0.717 + 0.350 = -0.367.
Comment: Check that F(-0.717) = 0.274 and F(0.350) = 0.620.
This is the t-distribution with two degrees of freedom.
One cannot in general use the method of inversion to simulate a t-distribution.
3.16. C. Linearly interpolating, we find where F(x) = 0.812;
0.812 corresponds to a loss of: 1000 + (.112/.2)(4000) = 3240.
0.330 corresponds to a loss of: (.330/.7)(1000) = 471.
0.626 corresponds to a loss of: (.626/.7)(1000) = 894.
0.941 corresponds to a loss of: 5000 + (.041/.1)(5000) = 7050.
Payments are: 2740, 0, 394, 6550.
Average payment per (non-zero) payment is: (2740 + 394 + 6550)/3 = 3228.
3.17. A. F(x) = 1 - {1/(1 + (x/θ)^γ)}^α = 1 - {1/(1 + (x/100)²)}³.
Set 0.75 = F(x) = 1 - {1/(1 + (x/100)²)}³.  x = (100){(0.25)^(-1/3) - 1}^(1/2) = 76.6.
Alternately, for the Burr Distribution,
VaRp(X) = θ{(1-p)^(-1/α) - 1}^(1/γ) = (100){(1 - 0.75)^(-1/3) - 1}^(1/2) = 76.6.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 27

3.18. C. For the Pareto Distribution, VARp [X] = {(1-p)1/ - 1} = (2000){(1-p)-1/2 - 1}.
Thus the drug costs for the first simulated year is: (2000){(1 - 0.58)-1/2 - 1} = 1086.
The plan pays: (75%)(1086 - 250) = 627.
The drug costs for the second simulated year is: (2000){(1 - 0.94)-1/2 - 1} = 6165.
The plan pays: (75%)(2250 - 250) + (95%)(6165 - 5100) = 2512.
The drug costs for the third simulated year is: (2000){(1 - 0.13)-1/2 - 1} = 144.
The plan pays nothing.
The drug costs for the fourth simulated year is: (2000){(1 - 0.80)-1/2 - 1} = 2472.
The plan pays: (75%)(2250 - 250) = 1500
Total plan payments for the four simulated years: 627 + 2512 + 0 + 1500 = 4639.
Comment: Setup taken from SOA3, 11/04, Q.7 in Mahlers Guide to Loss Distributions.
3.19. D. F(x) = (1/π) ∫ 1/{1 + (x - θ)²} dx = (1/π) tan⁻¹(x - θ), evaluated from -∞ to x,
= (1/π){tan⁻¹(x - θ) - (-π/2)} = 1/2 + (1/π) tan⁻¹(x - θ) = 1/2 + (1/π) tan⁻¹(x - 2).
u = F(x) = 1/2 + (1/π) tan⁻¹(x - 2).  x = 2 + tan[π(u - 1/2)].
The simulated values are: 2 + tan[π(0.313 - 0.5)] = 2 - 0.666 = 1.334,
and 2 + tan[π(0.762 - 0.5)] = 2 + 1.078 = 3.078.  Their product is: (1.334)(3.078) = 4.11.
Comment: Note that F(1.334) = 1/2 + (1/π) tan⁻¹(-0.666) = 1/2 - 0.5875/3.14159 = 0.313, and
F(3.078) = 1/2 + (1/π) tan⁻¹(1.078) = 1/2 + 0.8229/3.14159 = 0.762.
An integral you are unlikely to need to know how to do for your exam.
The Cauchy Distribution for θ = 0 is a t-distribution with one degree of freedom.
3.20. D. Set u = F(x) =1 - (1+ x/). x = {(1-u)1/ - 1).
The liability loss is: 300{(1 - .733)-1/4 - 1) = 117.
The property loss is: 1000{(1 - .308)-1/3 - 1) = 131.
The larger simulated loss is 131.
Comment: In each case, we could use that for the Pareto, VaRp (X) = {(1 - p)1/ - 1}.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 28

3.21. A. F(x) = 1 - 1/{1 + (x/)} = 1 - 1/{1 + (x/2000)3 } = u. x = (2000) {1/(1-u)- 1}1/3.


x = (2000)(1/.4783 - 1)1/3 = 2059. x = (2000)(1/.8314 - 1)1/3 = 1175.
x = (2000)(1/.0515 - 1)1/3 = 5282.
Payments are: 1059, 175, 4282.
Average payment per (non-zero) payment is: (175 + 1059 + 4282)/3 = 1839.
Comment: If the smallest simulated loss had been somewhat smaller and less than 1000, then there
would have been only two non-zero payments.
In this case, there are 3 losses that result in 3 (non-zero) payments.
When all of the losses in a sample result in a non-zero payment,
the average payment per payment = mean - deductible.
3.22. E. Laurel sets u = F(x) = 1 - exp[-(x/θ)^τ].
Solving, x = θ(-ln[1 - u])^(1/τ) = 1500(-ln[1 - 0.312])^(1/2) = 917.
Hardy sets u = S(x) = exp[-(x/θ)^τ].
Solving, x = θ(-ln[u])^(1/τ) = 1500(-ln[0.312])^(1/2) = 1619.
H/L = 1619/917 = 1.77.
Comment: The answer does not depend on theta.
3.23. A. f(x) = 3 e-3x e-12, x > -4.
By integration: F(x) = 1 - e-3x e-12, x > -4.
Set: 0.2203 = 1 - e-3x e-12. ln[0.7797] = -3x - 12. x = -3.917.
3.24. E. F(x) = 1 - 10,000 x⁻². Setting 0.25 = F(x) = 1 - 10,000 x⁻². x⁻² = 0.75/10,000.
Therefore, x = 100/√0.75.
Alternately, this is a Single Parameter Pareto Distribution with α = 2 and θ = 100, with
VaRp(X) = θ(1 - p)^(-1/α) = (100)(1 - 0.25)^(-1/2) = 100/√0.75.

3.25. C. F(x) = 1 - {/(+x)} = 1 - {3/(3+x)}0.5.


Setting F(x) = the uniform random number from (0,1): .5 = 1 - {3/(3+x)}0.5.
Therefore 3/(3+x) = 0.52 = 0.25. Thus x = 9.
Alternately, for the Pareto Distribution,
VaRp (X) = {(1 - p)1/ - 1} = (3) {(1 - 0.5)-1/0.5 - 1} = (3)(3) = 9.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 29

3.26. D. Integrating f(x), F(x) = x = x.5. Setting F(x) =.75, x = .752 = 0.5625.
3.27. E. Since f(x) = (3/8)x2 for 0 < x < 2, then by integration F(x) = x3 / 8.
Set .125 = F(x) = x3 / 8. Solving x = {(.125)(8)}1/3 = 1.
3.28. A. Set the Distribution function for the exponential with = 20, equal to the random number
from [0,1]. 0.21 = F(x) = 1 - e-x/20 . Therefore, ln(.79) = - x / 20. x = - 20 ln(.79) = 4.71.
Alternately, for the Exponential VaRp (X) = - ln(1-p) = -20 ln (1 - 0.21) = 4.71.
3.29. E. Integrating f(t) from t = 1 to x, F(x) = 1 - x-4 , x > 1. Simulate via the method of inversion by
setting Y = F(x). .250 = 1 - x-4 . x = .75-1/4 = 1.075.
3.30. C. The Distribution Function for losses is obtained from the given density function via
integration: F(x) = 1 - x-2 , x 1.
Set F(x) = Y1. 1 - x-2 = 0.636. Thus x = 1.66.
To get a random number in the interval [.01,.21] from a random number in the interval [0,1], one must
change the width of the latter interval and translate it until it covers the former interval. This is achieved
by multiplying by .2 and adding .01.
Thus the simulated value of R is: .2Y2 + .01 = .059.
Thus the simulated value of X(1 + R) is: (1.66)(1 + .059) = 1.76.
Comment: The size of loss distribution is a Single Parameter Pareto Distribution with = 2 and
= 1. For the Single Parameter Pareto Distribution,
VaRp (X) = (1- p) - 1/ = (1) (1 - 0.636)-1/2 = 1.66
This question is a simplified form of a possible real world use of simulation.
3.31. C. f(x) = 2 x -3, x > 1 and therefore integrating from 1 to x: F(x) = 1 - 1/x2 , x > 1.
For a random number on [0,1] of .5, using the method of inversion we set .5 = F(x).
.5 = 1 - 1/x2 and therefore x = 1.414.
Alternately, this is a Single Parameter Pareto Distribution with = 2 and = 1, with
VaRp (X) = (1- p) - 1/ = (1) (1 - 0.5)-1/2 = 2 = 1.414.

2013-4-13,

Simulation 3 Continuous Dist. Inversion Method, HCM 10/25/12, Page 30

3.32. B. Set F(x) = 0.36 and solve for x:
0.36 = F(x) = 1 - {1/(1 + (x/θ)^γ)}^α = 1 - {1/(1 + (x/1,000,000)^0.5)}².
1000/(1000 + x^0.5) = (1 - 0.36)^(1/2) = 0.8.  200 = 0.8 x^0.5.  x = 250² = 62,500.
Alternately, for the Burr Distribution,
VaRp(X) = θ{(1-p)^(-1/α) - 1}^(1/γ) = (1,000,000){(1 - 0.36)^(-1/2) - 1}^(1/0.5) = 62,500.
3.33. D. Via integration, the two Distribution Functions are: F1 (x) = x4 and F2 (x) = 4x3 - 3x4 .
Thus the mixed Distribution Function is: .75 F1 (x) + .25F2 (x) = .75x4 + .25(4x3 - 3x4 ) = x3 .
Simulating by algebraic inversion, we set the random number equal to the (mixed) Distribution
Function: .064 = x3 . Thus x = 0.4.
Comment: Since the mixed distribution simplifies in this case, we can simulate it via algebraic
inversion using only one random number. The general method of simulating mixed distributions is
discussed in a subsequent section.
3.34. A. Set .4 = F(x) and solve for x. 0.4 = 1 - 1/{1+exp((x-1)/2)}.
1 + exp((x-1)/2) = 1/.6. (x-1)/2 = ln(.6667). x = 0.189.
Comment: Check, F(.189) = 1 - 1/{1+exp(-.4055)} = 1 - 1/1.667 = .4.
3.35. E. Set 0.9833 = F(x) = exp[-e^(-x)]. e^(-x) = 0.01684. x = 4.08.
Comment: While the Gumbel Distribution is in Appendix A.4 of Loss Models, you are very unlikely
to be asked about it on your exam.
The Gumbel Distribution is a special case of the extreme value distribution.
Note that F(-∞) = exp[-∞] = 0 and F(∞) = exp[0] = 1. f(x) = e^(-x) exp[-e^(-x)] > 0.
3.36. A. Set u = F(x) = 1 - e^(-x/1000). x = -1000 ln(1 - u).
The four simulated annual dental charges are:
-1000 ln(0.7) = 357, -1000 ln(0.08) = 2526, -1000 ln(0.3) = 1204, -1000 ln(0.92) = 84.
Reimbursements are: (0.8)(257) = 206, 1000, (0.8)(1104) = 883, 0.
Average annual reimbursement is: (206 + 1000 + 883 + 0)/4 = 522.
Comment: For the Exponential Distribution, VaR_p(X) = -θ ln(1-p).
Each year there is a total amount reimbursed, including the possibility of zero.
In order to get the average per year, we calculate the total reimbursements over four years and
divide by four. We are taking an average per year, rather than an average per reimbursement.

Section 4, Discrete Distributions, the Inversion Method


The inversion method can be applied to simulate any discrete distribution.17 In particular it can be
used for any frequency distribution, including the Binomial,18 Poisson,19 Geometric, and Negative
Binomial.20 One random number produces one random draw from the discrete distribution.
Bernoulli Distribution:
Assume we wish to simulate a random draw from a Bernoulli Distribution with mean .2.
Then if 0 ≤ u ≤ 0.8 there is no claim, while if 1 ≥ u > 0.8 there is a claim.21 In general for a Bernoulli
with parameter q: for u ≤ 1 - q there is no claim, while for u > 1 - q there is a claim.
Exercise: You wish to simulate via the inversion method, random draws from a Bernoulli Distribution
with q = 0.2.
You are given the following sequence of random numbers from [0,1]:
0.6, 0.2, 0.05, 0.9, 0.995, 0.4, 0.5.
Determine the corresponding sequence of random draws from the Bernoulli Distribution.
[Solution: f(0) = 0.8, f(1) = 0.2. F(0) = 0.8, F(1) = 1. If the random number is less than or equal to .8
then there is no claim; if the random number is greater than .8 then there is a claim.
The corresponding sequence of random draws from the Bernoulli Distribution is: 0, 0, 0, 1, 1, 0, 0.]
General Case of a Discrete Distribution:
In general, for a discrete distribution, one first constructs a table of the Distribution Function,
by cumulating densities. Then one looks for the first place the Distribution Function is greater than the
random number.
We want the smallest x, such that F(x) > u.22
Using this technique, large random numbers correspond to large simulated values.
17 As discussed subsequently, one can also apply the inversion method to a life table, in order to simulate times of
death and the present values of benefits paid for life insurances and annuities. The inversion method is also called
the inverse transform method. I refer to the application to discrete distributions as the Table Look-Up Method.
I refer to the application to continuous distributions as algebraic inversion.
18 One can also simulate a Binomial with parameters m and q, by summing the result of simulating m independent
random draws from a Bernoulli Distribution with parameter q.
19 One can also simulate the Poisson via the special algorithm for the Poisson Process to be described in a
subsequent section.
20 One can also simulate a Negative Binomial with parameters r and β, for r integer, by summing the result of
simulating r independent random draws from a Geometric Distribution with parameter β.
21 Note that one could also simulate the same Bernoulli via a claim when u < 0.2 and no claim for u ≥ 0.2.
In general there are many ways to simulate a given distribution.
22 The method in Loss Models is F(x) > u, although the explanation is far from clear. It would be equally valid to
instead require F(x) ≥ u. While Loss Models discusses this distinction as if it were important, there should be no
practical difference, since using a computer we would not expect to get a random number u exactly equal to F(x).

While for discrete distributions one usually does things in this manner, if instead one wants small
random numbers corresponding to large simulated values, then we want the smallest x such that
F(x) > 1 - u, or equivalently S(x) < u.
Binomial Distribution:23
For example take a Binomial Distribution with q = 0.2 and m = 3.
We can compute a table of F(x), and then invert by looking up the proper values in this table:24
Number of Claims    Probability Density Function    Cumulative Distribution Function
0                   51.200%                         0.51200
1                   38.400%                         0.89600
2                    9.600%                         0.99200
3                    0.800%                         1.00000

For example if u is 0.95, then we simulate 2 claims, because 2 is the first value such that
F(x) > 0.95. For u in the interval [0.896, 0.992), we will simulate 2 claims. Since u is uniformly
distributed on [0,1], the chance of simulating 2 claims will be: 0.992 - 0.896 = 0.096 as desired.
Exercise: You wish to simulate via the inversion method, random draws from a Binomial Distribution
with m = 3 and q = 0.2. Large random numbers correspond to a large number of claims.
You are given the following sequence of random numbers from [0,1]:
0.6, 0.2, 0.8, 0.05, 0.9, 0.995, 0.4, 0.5.
Determine the corresponding sequence of random draws from the Binomial Distribution.
[Solution: Using the table calculated previously above: 1, 0, 1, 0, 2, 3, 0, 0.]
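For those who want to see the table look-up as computer code, here is a minimal sketch in Python (computer code is not on the syllabus; the function and variable names are my own illustration, not from Loss Models). It rebuilds the table above and reproduces the random draws in the exercise.

import math

def binomial_cdf_table(m, q):
    # Build a list of (x, F(x)) by cumulating the densities f(x) = C(m, x) q^x (1-q)^(m-x).
    table = []
    cumulative = 0.0
    for x in range(m + 1):
        cumulative += math.comb(m, x) * q**x * (1 - q)**(m - x)
        table.append((x, cumulative))
    return table

def table_lookup(table, u):
    # The inversion method: return the smallest x such that F(x) > u.
    for x, F in table:
        if F > u:
            return x
    return table[-1][0]

table = binomial_cdf_table(3, 0.2)
print([table_lookup(table, u) for u in [0.6, 0.2, 0.8, 0.05, 0.9, 0.995, 0.4, 0.5]])
# [1, 0, 1, 0, 2, 3, 0, 0], matching the exercise above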
(a, b, 0) Relationship:
Note that one could successively calculate the Binomial densities by using the relationship
f(x+1) = f(x) (m - x) q / {(x + 1) (1 - q)} = f(x) {-q/(1 - q) + (m + 1) q / [(x + 1) (1 - q)]},
together with f(0) = (1 - q)^m.
In some cases, this method of calculating densities can be faster.
This technique can be applied to any member of the (a, b, 0) class of frequency distributions as
defined in Loss Models.

23
24

The Bernoulli Distribution is a special case of the Binomial Distribution, with m = 1.


For example, the chance of two claims is (3)(0.22 )(0.8) = 0.096.

Recall that for each member of the (a, b, 0) class of frequency distributions:
f(x+1) / f(x) = a + {b/(x+1)}, x = 0, 1, 2, 3,...
where a and b depend on the parameters of the distribution:25
Distribution           a               b                   f(0)
Binomial              -q / (1-q)      (m+1)q / (1-q)      (1-q)^m
Poisson                0               λ                   e^(-λ)
Negative Binomial      β / (1+β)      (r-1)β / (1+β)      1 / (1+β)^r
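Here is a brief Python sketch (again my own illustration) of how the (a, b, 0) recursion can be used to build the distribution function needed for the table look-up; the a, b, and f(0) inputs come from the table above.

def abo_cdf_table(a, b, f0, tolerance=1e-10, max_terms=1000):
    # f(x+1) = f(x) * (a + b/(x+1)), starting from f(0); cumulate to get F(x).
    f, cumulative, x = f0, f0, 0
    table = [(0, cumulative)]
    while cumulative < 1 - tolerance and x < max_terms:
        f *= a + b / (x + 1)
        if f <= 0:               # for the Binomial the recursion reaches 0 at x = m
            break
        x += 1
        cumulative += f
        table.append((x, cumulative))
    return table

# Binomial with m = 3 and q = 0.2, using a, b, and f(0) from the table above:
q, m = 0.2, 3
print(abo_cdf_table(-q/(1 - q), (m + 1)*q/(1 - q), (1 - q)**m))
# approximately [(0, 0.512), (1, 0.896), (2, 0.992), (3, 1.0)], matching the earlier Binomial table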

Discrete Severity Distributions:


The inversion method can also be used with any severity distribution that takes on only a finite number of
values.26 For example, assume we have the distribution:
Size of Claim    Probability    Cumulative Distribution Function
$100             30%            0.30
$500             60%            0.90
$2500            10%            1.00
Then if u = 0.72, we simulate a claim of size $500, since F($500) > 0.72 and $500 is the first such
value of x. If instead u = 0.94, we simulate a claim of size $2500.
If u = 0.270367103, we simulate a claim of size $100.
Efficiency:
The simulation of this severity example can be written as an algorithm:
Generate a random number U from (0, 1).
If U < 0.3, set X = $100 and stop.
If U < 0.9, set X = $500 and stop.
Otherwise set X = $2500.
One could perform this simulation in a more efficient manner.
The most efficient way to simulate X, doing the fewest comparisons on average, is to test
for the largest probabilities first.
In this case, $500 has the largest probability, so we would test for it first.
$100 has the second largest probability, so we would test for it second.
25 See Mahler's Guide to Frequency Distributions.
26 One can approximate numerically any continuous distribution by a discrete distribution; the more values of x, the
better the approximation can be. One can then simulate the approximating discrete distribution via the inversion
method (Table Look-up) and use the result as an approximate simulation of the original continuous distribution. This
can be a useful technique in some practical applications.

An equally valid, but more efficient algorithm:


Generate a random number U from (0, 1).
If U < 0.6, set X = $500 and stop.
If U < 0.9, set X = $100 and stop.
Otherwise set X = $2500.
Exercise: Using this more efficient algorithm, and the random numbers: 0.72, and 0.94,
simulate two random severity values.
[Solution: $100, and $2500.]
While the two algorithms usually produce different results for a short list of random numbers, for a
very long list of random numbers, the distribution of simulated claim sizes will be the same.
Therefore, the two algorithms are equally valid.
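A short Python sketch of this more efficient ordering (the names are mine, not from the textbook): sort the outcomes by descending probability, cumulate, and compare the random number against the cumulative sums in that order.

def make_efficient_sampler(outcomes):
    # outcomes: list of (value, probability); test the largest probabilities first.
    ordered = sorted(outcomes, key=lambda vp: vp[1], reverse=True)
    thresholds = []
    cumulative = 0.0
    for value, p in ordered:
        cumulative += p
        thresholds.append((cumulative, value))

    def simulate(u):
        for cum, value in thresholds:
            if u < cum:
                return value
        return thresholds[-1][1]

    return simulate

simulate = make_efficient_sampler([(100, 0.30), (500, 0.60), (2500, 0.10)])
print(simulate(0.72), simulate(0.94))   # 100 2500, as in the exercise above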
Geometric Distribution:
For a Geometric Distribution with β = 3, f(0) = 1/(1+β) = 1/4. f(x+1) = f(x) β/(1+β) = (3/4) f(x).
x     f(x)       F(x)       S(x)
0     0.25000    0.25000    0.75000
1     0.18750    0.43750    0.56250
2     0.14062    0.57812    0.42188
3     0.10547    0.68359    0.31641
4     0.07910    0.76270    0.23730
5     0.05933    0.82202    0.17798
6     0.04449    0.86652    0.13348
7     0.03337    0.89989    0.10011
8     0.02503    0.92492    0.07508
9     0.01877    0.94369    0.05631
10    0.01408    0.95776    0.04224
Exercise: For the random numbers 0.84 and 0.49, simulate two random frequency values, with large
random numbers corresponding to large numbers of claims.
[Solution: 0.84 is first exceeded by F(6). 0.49 is first exceeded by F(2).
We simulate 6 and 2 claims.]
We note that for a Geometric, S(x) = {β/(1+β)}^(x+1). For example, S(6) = (3/4)^7 = 0.13348.
If large random numbers are to correspond to large numbers of claims, then given a random number
u, we want the first x such that F(x) > u ⇔ S(x) < 1 - u ⇔ {β/(1+β)}^(x+1) < 1 - u ⇔
x + 1 > ln[1 - u] / ln[β/(1+β)] ⇔ x = largest integer in: ln[1 - u] / ln[β/(1+β)].

For example, for u = 0.84, largest integer in {ln(1 - 0.84) / ln(3/4)} = largest integer in 6.37 = 6,
matching the number of claims simulated above.
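This closed form lets one skip the table entirely. A two-line Python sketch (my own illustration):

import math

def simulate_geometric(beta, u):
    # Largest integer in ln(1-u) / ln(beta/(1+beta)): the smallest x with F(x) > u.
    return math.floor(math.log(1 - u) / math.log(beta / (1 + beta)))

print(simulate_geometric(3, 0.84), simulate_geometric(3, 0.49))   # 6 2, matching the table look-up above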
Series of Bernoulli Trials:
For a series of independent identical Bernoulli trials, the chance of the first success following x failures
is given by a Geometric Distribution with mean:
β = chance of a failure / chance of a success.
The number of trials = 1 + number of failures = 1 + Geometric.27
Therefore, if one wants to simulate the number of trials until the first success, with large random
numbers corresponding to large numbers of trials,
x = 1 + largest integer in: ln[1-u] / ln[β/(1+β)] = 1 + largest integer in: ln[1-u] / ln[chance of failure].
Exercise: There is a series of independent trials, each with an 80% success rate. Let X represent the
number of trials until the first success. Use the inversion method to simulate the random variable, X,
where large numbers correspond to a high number of trials.
Use the following random number: 0.98.
[Solution: x = 1 + largest integer in: ln[1-u] / ln[chance of failure] =
1 + largest integer in: ln(0.02) / ln(0.2) = 1 + largest integer in: 2.43 = 3.
Alternately, the number of failures is Geometric with β = 0.2/0.8 = 1/4.
x    f(x)       F(x)
0    0.80000    0.80000
1    0.16000    0.96000
2    0.03200    0.99200
3    0.00640    0.99840
4    0.00128    0.99968
5    0.00026    0.99994
6    0.00005    0.99999
7    0.00001    1.00000
We need to see where u is first exceeded by F(x). 0.98 is first exceeded by F(2).28
Thus we simulate 2 failures or 3 trials.]

27

1 + a Geometric Distribution is a zero-truncated Geometric Distribution.


See Mahlers Guide to Frequency Distributions.
28
Equivalently, u = 0.02 is first exceeded by S(2).

Uniform Discrete Variables:


A special case of a discrete distribution, is a variable with equal probability on the integers
from 1 to n.
Exercise: X is distributed with a 20% chance of each of 1, 2, 3, 4, and 5. You are given the following
sequence of random numbers from [0,1]: 0.65, 0.21, 0.83, 0.05, 0.92, 0.43.
Determine the corresponding sequence of random draws from X.
[Solution: F(1) = 0.2, F(2) = 0.4, F(3) = 0.6, F(4) = 0.8, F(5) = 1.
Since F(4) > 0.65, 0.65 corresponds to 4. 0.21 corresponds to 2. 0.83 corresponds to 5.
0.05 corresponds to 1. 0.92 corresponds to 5. 0.43 corresponds to 3.
Equivalently, in each case take 1 + (largest integer in 5u).
For example, when u = 0.43, we get 1 + (largest integer in 2.15) = 1 + 2 = 3.]
In general, in order to simulate a variable with equal probability on the integers from 1 to n,
let x = 1 + largest integer in nu.
Random Permutations and Subsets:
The algorithm to simulate random uniform discrete variables can be used to simulate random
permutations of the numbers from 1 to n.
Exercise: You are given the following sequence of random numbers from [0,1]: 0.65, 0.29, 0.83,
0.05. Simulate a random permutation of the numbers from 1 to 5.
[Solution: First simulate a random integer from 1 to 5. 1 + largest integer in (5)(0.65) = 4.
Now exchange 4 with 5 to get: 1, 2, 3, 5, 4. Now simulate a random integer from 1 to 4:
1 + largest integer in (4)(0.29) = 2. Exchange the number in the 2nd position with the number in the
4th position to get: 1, 5, 3, 2, 4. 1 + largest integer in (3)(0.83) = 3. Exchange the 3rd position with
the 3rd position; the sequence remains the same at: 1, 5, 3, 2, 4.
1 + largest integer in (2)(0.05) = 1. Exchange the 1st position with the 2nd position to get:
5, 1, 3, 2, 4. The simulated random permutation is: 5, 1, 3, 2, 4.]
One can simulate random subsets of the numbers from 1 to n, by just stopping partway through the
algorithm to simulate a random permutation of the numbers from 1 to n.
Exercise: You are given the following sequence of random numbers from [0,1]: 0.35, 0.59.
Simulate a random subset of size 2 from the integers from 1 to 5.
[Solution: First simulate a random number from 1 to 5: 1 + largest integer in (5)(0.35) = 2.
Now exchange 2 with 5 to get: 1, 5, 3, 4, 2. Now simulate a random number from 1 to 4:
1 + largest integer in (4)(0.59) = 3. Now exchange the number in the 3rd position with the number
in the 4th position to get: 1, 5, 4, 3, 2. The random subset is the last 2 numbers, {3, 2}.]

One could go through the exact same steps to simulate a random subset of size 3 from the
integers from 1 to 5. However, at the final step, we would take the first 3 numbers {1,5,4}.
By proceeding in this manner we use only 2 random numbers rather than 3 in order to simulate
a random subset of size 3. In general, in order to simulate a random subset of size r from the
integers from 1 to n, one needs either r or n-r random numbers, whichever is smaller.
Exercise: Your company has 17 branch offices, numbered from 1 to 17. You wish to visit 3
branch offices at random in order to observe how they are using a new underwriting tool you
helped to develop. 29 You are given the following sequence of random numbers from [0,1]:
0.53, 0.42, 0.13. Which branch offices do you visit?
[Solution: First simulate a random number from 1 to 17: 1 + largest integer in (17)(0.53) = 10.
Now exchange 10 with 17. Now simulate a random number from 1 to 16:
1 + largest integer in (16)(0.42) = 7. Now exchange the number in the 7th position with the
number in the 16th position. Now simulate a random number from 1 to 15:
1 + largest integer in (15)(0.13) = 2. Now exchange the number in the 2nd position with the
number in the 15th position to get: 1, 15, 3, 4, 5, 6, 16, 8, 9, 17, 11, 12, 13, 14, 2, 7, 10.
The random subset is the last 3 numbers: {2, 7, 10}.]
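Here is a compact Python sketch of the permutation and subset algorithm just described (the function name is mine); each step picks a position among the first k entries via 1 + largest integer in ku and swaps it to the back.

import math

def partial_shuffle(n, uniforms):
    # Swap a randomly chosen position among the first k entries with position k,
    # for k = n, n-1, ..., using one uniform (0,1) number per swap.
    values = list(range(1, n + 1))
    k = n
    for u in uniforms:
        j = 1 + math.floor(k * u)      # random integer from 1 to k (u strictly between 0 and 1)
        values[j - 1], values[k - 1] = values[k - 1], values[j - 1]
        k -= 1
    return values

# Full permutation of 1 to 5 from the earlier exercise (only n - 1 = 4 swaps are ever needed):
print(partial_shuffle(5, [0.65, 0.29, 0.83, 0.05]))   # [5, 1, 3, 2, 4]

# Random subset of size 2 from 1 to 5: stop after 2 swaps and keep the last 2 entries.
print(partial_shuffle(5, [0.35, 0.59])[-2:])          # [3, 2]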

Poisson Distribution:
One can simulate the Poisson Distribution using the inversion method, as well as via a special
algorithm based on interarrival times to be discussed in a subsequent section.
Exercise: You wish to simulate via the inversion method, random draws from a Poisson Distribution
with λ = 3.2.
You are given the following sequence of random numbers from [0,1]:
0.6, 0.2, 0.8, 0.05, 0.9, 0.995, 0.4, 0.5.
Large random numbers correspond to large numbers of claims.
Determine the corresponding sequence of random draws from the Poisson Distribution.

29 You do not visit all 17 offices due to time and money constraints. You might have to spend a week at each office in
order to estimate the effectiveness of the new underwriting tool. An average of these three estimates could serve
as an estimate of the average effectiveness of the new tool throughout your whole company.

[Solution: The first step is to calculate a table of densities for this Poisson,30 using the relationship
f(x+1) = f(x) {λ / (x+1)} = f(x) 3.2/(x+1), and f(0) = e^(-λ) = e^(-3.2) = 0.04076.
Number of Claims    Probability Density Function    Cumulative Distribution Function
0                    4.076%                         0.04076
1                   13.044%                         0.17120
2                   20.870%                         0.37990
3                   22.262%                         0.60252
4                   17.809%                         0.78061
5                   11.398%                         0.89459
6                    6.079%                         0.95538
7                    2.779%                         0.98317
8                    1.112%                         0.99429
9                    0.395%                         0.99824
10                   0.126%                         0.99950

One cumulates the densities to get the distribution function. The first random number is 0.6; one
sees the first time F(x) > 0.6, which occurs when x = 3 and F(3) = 0.60252.
Similarly, F(2) = 0.37990 > 0.2. Proceeding similarly, the complete sequence of random draws from
this Poisson Distribution, corresponding to 0.6, 0.2, 0.8, 0.05, 0.9, 0.995, 0.4, 0.5, is:
3, 2, 5, 1, 6, 9, 3, 3.]
One could write the above simulation as an algorithm:
Generate a random number U from (0, 1).
If U < 0.04076, set X = 0 and stop.
If U < 0.17120, set X = 1 and stop.
If U < 0.37990, set X = 2 and stop.
etc.
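As computer code, the naive inversion for the Poisson might look like the following Python sketch (my own illustration, not part of the syllabus); it builds the densities with the recursion f(x+1) = f(x) λ/(x+1) as it goes.

import math

def simulate_poisson(lam, u):
    # Naive inversion: walk up from x = 0 until F(x) > u, for u strictly between 0 and 1.
    f = math.exp(-lam)          # f(0)
    F = f
    x = 0
    while F <= u:
        f *= lam / (x + 1)      # f(x+1) = f(x) * lambda/(x+1)
        x += 1
        F += f
    return x

print([simulate_poisson(3.2, u) for u in [0.6, 0.2, 0.8, 0.05, 0.9, 0.995, 0.4, 0.5]])
# [3, 2, 5, 1, 6, 9, 3, 3], matching the exercise above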
Efficiency for the Poisson:
First comparing u to F(0), then F(1), etc., is not an efficient way to program this algorithm. We would
require 1 + x comparisons. Therefore, the expected number of comparisons is: 1 + λ. This could
slow down execution of the computer program for large λ.
Most of the probability is concentrated near the mean of λ, so our computer program should start
our search at the largest integer in λ, in this case 3. We compare u to F(3). If F(3) > u, we then
compare u to F(2). If instead F(3) ≤ u, then we compare u to F(4). We proceed until we have
determined the smallest x such that F(x) > u.
30 I have only displayed values up to 10. In practical applications, one should include values up to a point where the
cumulative distribution function is sufficiently close to 1.

One could perform this simulation in a more efficient manner:


If U < F(3) = 0.60252 then X ≤ 3, and we then check successively whether X ≤ 2, etc.
If U ≥ F(3) = 0.60252 then X ≥ 4, and we then check successively whether X ≥ 5, etc.
So for example, if the random number is 0.8: 0.8 ≥ 0.60252 ⇒ X > 3. 0.8 ≥ 0.78061 ⇒ X > 4.
0.8 < 0.89459 ⇒ X ≤ 5. X = 5. If instead the random number is 0.2: 0.2 < 0.60252 ⇒ X ≤ 3.
0.2 < 0.37990 ⇒ X ≤ 2. 0.2 ≥ 0.17120 ⇒ X > 1 ⇒ X = 2.
This is more efficient, since the largest probabilities for a Poisson density are near its mean.
In general, a more efficient algorithm would start the search at F(largest integer in λ).31
Starting to make comparisons near the mean of the distribution will be more efficient whenever
applying the Inversion Method to any discrete distribution whose graph resembles that of the
Poisson Distribution, such as either a Binomial Distribution with a large mean or a Negative Binomial
Distribution with a large mean.
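A Python sketch of this idea (illustrative only): compute the distribution function once, then start each search at the largest integer in λ and step down or up.

import math

def poisson_cdf(lam, x_max):
    # F(0), F(1), ..., F(x_max), built with f(x+1) = f(x) * lam/(x+1).
    f = math.exp(-lam)
    cdf = [f]
    for x in range(x_max):
        f *= lam / (x + 1)
        cdf.append(cdf[-1] + f)
    return cdf

def simulate_poisson_from_mean(lam, u, cdf):
    x = int(lam)                    # start the comparisons near the mean
    if u < cdf[x]:
        while x > 0 and u < cdf[x - 1]:
            x -= 1                  # step down while F(x-1) still exceeds u
    else:
        while u >= cdf[x]:
            x += 1                  # step up until F(x) > u
    return x

cdf = poisson_cdf(3.2, 30)
print([simulate_poisson_from_mean(3.2, u, cdf) for u in [0.8, 0.2]])   # [5, 2], as above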
Estimating Averages:
For example, assume you had a list of 1000 equally likely interest rate scenarios over the next 10
years. For each interest rate scenario, it would take one minute to calculate the expected profits on a
large portfolio of GICs. If you had 1000 minutes, or almost 17 hours, you could calculate the
expected profits under each scenario.
However, if one had only half an hour, one could pick 30 scenarios at random, by simulating a
random subset of size 30 from the integers from 1 to 1000. Then one could average the expected
profits for these 30 scenarios, and use this as an estimate of the expected profits for this portfolio.
This is an example of a more general technique. One can simulate a random set of outcomes, in order
to estimate the average, variance, percentiles, etc., of a given random process.32 33

Using this more efficient method, as gets large, the number of comparison goes up as rather than .
Such a simulation experiment is shown in a subsequent section. Estimating the probability of ruin via simulation is
discussed in a subsequent section. Bootstrapping via simulation is discussed in a subsequent section.
33
There can be advantages to taking a stratified random sample. See for example Implications of Sampling Theory
for Package Policy Ratemaking, by Jeffrey Lange, PCAS 1966.
31
32

Problems:
4.1 (2 points) You are given the following:
P(k) is a cumulative probability distribution function of a Binomial distribution with q = 0.10
and m = 5.
An observation from the random variable U, having the uniform distribution on [0, 1], is 0.995.
Use the inversion method to determine the simulated number of claims from the Binomial
distribution.
A. 0
B. 1
C. 2
D. 3
E. 4
4.2 (2 points) You are using the inversion method to simulate Z, the present value random variable
for a special two-year term insurance on (60).
You are given:
(i) (60) is subject to only two causes of death, with
k    k|q60^(1)    k|q60^(2)
0    0.05         0.03
1    0.06         0.04
(ii) Death benefits, payable at the end of the year of death, are:
During year    Benefit for Cause 1    Benefit for Cause 2
1              1000                   2000
2              1100                   2200

(iii) i = 0.05.
(iv) For this trial your random number, from the uniform distribution on [0, 1], is 0.923.
(v) High random numbers correspond to high values of Z.
Calculate the simulated value of Z for this trial.
(A) 0
(B) 952
(C) 998
(D) 1905
(E) 1995
4.3 (3 points) You are given the following:
P(k) is a cumulative probability distribution function of a Negative Binomial Distribution as in
Loss Models with parameters β = 2/3 and r = 8.
An observation from the random variable U, having the uniform distribution on [0, 1], is 0.35.
Use the inversion method to determine the simulated number of claims.
A. 2 or less
B. 3
C. 4
D. 5
E. 6 or more

4.4 (2 points) You wish to simulate via the inversion method three independent random draws from
a Poisson Distribution with λ = 0.7. You are given the following sequence of independent random
numbers from (0,1): 0.681, 0.996, 0.423.
Determine the sum of the three random draws from the Poisson Distribution.
A. 2 or less
B. 3
C. 4
D. 5
E. 6 or more
4.5 (2 points) A discrete empirical distribution X is created from the following observations:
$X of Loss    Frequency
100           3
200           4
400           6
700           3
1,200         2
2,000         2
(Assume that these six loss values are the only possible loss amounts.)
Using this distribution and random numbers Y from the interval (0,1), random losses can be
simulated.
Calculate the sum of the three independent losses generated by the random Y values of
0.78, 0.31 and 0.60.
A. Less than 800
B. At least 800, but less than 1000
C. At least 1000, but less than 1200
D. At least 1200, but less than 1400
E. 1400 or more
4.6 (1 point) Severity is equally likely to be $1000, $2000, $3000 or $4000.
A value of 0.61 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated severity corresponds to this 0.61?
A. $1000
B. $2000
C. $3000
D. $4000
E. None of the above.
4.7 (2 points) You wish to model a distribution by simulation.
f(x) = (0.4)(0.6^x), x = 0, 1, 2, 3, ...
A value of 0.80 is randomly generated from a distribution which is uniform on the interval (0,1).
What simulated value corresponds to this 0.80?
A. 0
B. 1
C. 2
D. 3
E. 4 or more

4.8 (1 point) You wish to simulate a random value from a Binomial Distribution as per Loss Models
with m = 4 and q = 0.3. You will do so by first simulating four independent Bernoulli random
variables via the inversion method. You use the following four values: 0.1, 0.9, 0.2, 0.6,
independently randomly generated from a distribution which is uniform on the interval (0,1).
What is the resulting random value from the Binomial Distribution?
A. 0
B. 1
C. 2
D. 3
E. 4
4.9 (2 points) You are given the following sequence of five random numbers from [0,1]: 0.125,
0.027, 0.614, 0.850, 0.261. You simulate a random permutation of the numbers from 1 to 5.
What is the first of the five values in this permutation?
A. 1 B. 2 C. 3 D. 4 E. 5
4.10 (3 points) Mortality follows the Illustrative Life Table from Actuarial Mathematics:
l45 = 9164051, l65 = 7533964. For a group of 10 independent lives each age 45, the number of
deaths by age 65 is simulated from the binomial distribution using the inversion method.
Using the random number 0.93, how many deaths are there by age 65?
(A) 2 or fewer
(B) 3
(C) 4
(D) 5
(E) 6 or more
Use the following information for the next two questions:
A discrete random variable X has the following distribution:
k    Prob(X=k)
1    0.05
2    0.20
3    0.30
4    0.35
5    0.10
A random number from (0, 1) is 0.83.
4.11 (1 point) Using the inversion method, simulate a random value of X.
A. 1 B. 2 C. 3 D. 4 E. 5
4.12 (2 points) Using the most efficient algorithm, simulate a random value of X.
A. 1 B. 2 C. 3 D. 4 E. 5
4.13 (3 points) Hobbs is a baseball player. The probability that Hobbs gets a hit in any given
attempt is 30%. The results of his attempts are independent of each other.
Let X represent the number of attempts until his first hit.
Use the inversion method in order to simulate the random variable X.
Generate the total number of attempts until four hits result, by simulating 4 independent values of X.
Use the following random numbers: 0.745, 0.524, 0.941, 0.038.
A. fewer than 9 B. 9, 10, or 11 C. 12, 13, or 14 D. 15, 16, or 17 E. more than 17

4.14 (3 points) You wish to simulate via the inversion method, four random draws from a Poisson
Distribution with λ = 5.4. You are given the following sequence of four random numbers from [0,1]:
0.5859, 0.9554, 0.1620, 0.3532. Calculate the sample variance of the corresponding sequence of
random draws from the Poisson Distribution.
A. less than 9
B. at least 9, but less than 10
C. at least 10, but less than 11
D. at least 11, but less than 12
E. at least 12
4.15 (3 points) Using the inversion method, a Negative Binomial random variable with r = 4 and
β = 2 is generated, with 0.64 as a random number from (0, 1).
Determine the simulated result.
A. 6 or less   B. 7   C. 8   D. 9   E. 10 or more

4.16 (2 points) You are given the following information about credit scores of individuals:
Interval        Probability
400 to 499      2%
500 to 549      5%
550 to 599      8%
600 to 649      12%
650 to 699      15%
700 to 749      18%
750 to 799      27%
800             13%
Credit scores are integers.
Assume that on each interval scores are distributed uniformly.
Simulate 3 credit scores using the following random numbers from (0, 1): 0.528, 0.342, 0.914.
4.17 (2 points) Weather is modeled by a Markov chain, with State 1 rain, and State 2 no rain.

The probability that it rains on a day is 0.50 if it rained on the prior day.
The probability that it rains on a day is 0.20 if it did not rain on the prior day.
It is raining today, Sunday. You simulate the weather for Monday through Saturday.
Use the following random numbers from (0, 1):
0.661, 0.529, 0.301, 0.132, 0.378, 0.792, 0.995.
How many simulated days did it rain during Monday through Saturday?
(A) 1 or less
(B) 2
(C) 3
(D) 4
(E) 5 or more

4.18 (2 points) Use the following information:

Annual aggregate losses can take on one of five values: 0, 1000, 2000, 3000, 4000,
with probabilities 15%, 25%, 35%, 20%, 5%, respectively.

Each year is independent of every other year.

Use the following ten random numbers from (0, 1): 0.679, 0.519, 0.148, 0.206, 0.824,
0.249, 0.392, 0.980, 0.501, 0.844, in order to simulate this model.
What are the total aggregate losses paid in ten years?
A. 18,000
B. 19,000
C. 20,000
D. 21,000
E. 22,000
4.19 (3 points) A machine is in one of four states (F, G, H, I) and migrates once per day among
them according to a Markov process with transition matrix:
          F      G      H      I
F        0.20   0.80   0      0
G        0.50   0      0.50   0
H        0.75   0      0      0.25
I        1      0      0      0
The daily production of the machine depends on the state:
State:         F      G      H      I
Production:    100    90     70     0
The machine is in State F on day 0.
Days 1 to 7 are simulated, using the following random numbers from [0,1]:
0.834, 0.588, 0.315, 0.790, 0.941, 0.510, 0.003.
What is the total production of the machine on these 7 days?
(A) 600
(B) 610
(C) 620
(D) 630

(E) 640

4.20 (2 points) Use the following table of distribution function values for the annual number of claims:
x     F(x)
0     0.015625
1     0.062500
2     0.144531
3     0.253906
4     0.376953
5     0.500000
6     0.612793
7     0.709473
8     0.788025
9     0.849121
10    0.894943
11    0.928268
12    0.951874
13    0.968216
14    0.979305
15    0.986698
16    0.991550
17    0.994689
18    0.996695
19    0.997961
20    0.998753
21    0.999243
22    0.999544
23    0.999727
24    0.999838
25    0.999904
26    0.999943
27    0.999967
28    0.999981
29    0.999989
30    0.999994

Simulate five years, using the following random numbers: 0.325, 0.072, 0.956, 0.565, 0.899.
What is the sum of the simulated number of claims for the five years?
(A) 31 or less
(B) 32
(C) 33
(D) 34
(E) 35 or more
4.21 (3 points) Claims follow a Zero-Modified Poisson Distribution with p0^M = 30% and λ = 1.8.
Use the inversion method to simulate the number of claims.
Do this three times using:
u1 = 0.98
u2 = 0.37
u3 = 0.68
Calculate the sum of the simulated values.
(A) 5   (B) 6   (C) 7   (D) 8   (E) 9

4.22 (3 points) The size of a family follows a Zero-Truncated Negative Binomial Distribution with r = 6
and β = 0.4.
Use the inversion method to simulate the size of a family.
Do this three times using:
u1 = 0.08
u2 = 0.75
u3 = 0.47
Calculate the sum of the simulated values.
(A) 5   (B) 6   (C) 7   (D) 8   (E) 9

4.23 (3 points) N follows a Zero-Modified Binomial Distribution with p0^M = 20%, m = 10 and q = 0.3.
Use the inversion method to simulate N.
Do this three times using:
u1 = 0.4768
u2 = 0.9967
u3 = 0.3820
Calculate the sum of the simulated values.
(A) 10   (B) 11   (C) 12   (D) 13   (E) 14

4.24 (4, 5/89, Q.43) (1 point) A discrete empirical distribution X is created from the following
observations:
$X of Loss    Frequency
100           2
200           5
400           5
600           4
1,000         4
(Assume that these five loss values are the only possible loss amounts.)
Using this distribution and random numbers Y from the interval (0,1), random losses can be
simulated. Calculate the sum of the two independent losses generated by the random Y values of
0.3 and 0.7.
A. 400
B. 500
C. 800
D. 1200
E. None of the above
4.25 (4B, 5/93, Q.2) (2 points) You are given the following:
P(k) is a cumulative probability distribution function for the binomial distribution where
P(k) = Σ_{j=0}^{k} C(m, j) q^j (1 - q)^(m-j), k = 0, 1, ..., m.
q = 0.25 and m = 5.
An observation from the random variable U, having the uniform distribution on [0, 1], is 0.7.
Use the inversion method to determine the simulated number of claims from the binomial
distribution.
A. 0
B. 1
C. 2
D. 3
E. 4

4.26 (3, 5/01, Q.11) (2.5 points)


You are using the inversion method to simulate Z, the present value random variable for a special
two-year term insurance on (70).
You are given:
(i) (70) is subject to only two causes of death, with
k    k|q70^(1)    k|q70^(2)
0    0.10         0.10
1    0.10         0.50
(ii) Death benefits, payable at the end of the year of death, are:
During year    Benefit for Cause 1    Benefit for Cause 2
1              1000                   1100
2              1100                   1200

(iii) i = 0.06
(iv) For this trial your random number, from the uniform distribution on [0, 1], is 0.35.
(v) High random numbers correspond to high values of Z.
Calculate the simulated value of Z for this trial.
(A) 943
(B) 979
(C) 1000
(D) 1038
(E) 1068
4.27 (3, 11/01, Q.13) (2.5 points) We have 100 independent lives age 70.
You are given:
(i) Mortality follows the Illustrative Life Table in Actuarial Mathematics: q70 = 0.03318.
(ii) i = 0.08
(iii) A life insurance pays 10 at the end of the year of death.
The number of claims in the first year is simulated from the binomial distribution using the inversion
method (where smaller random numbers correspond to fewer deaths).
The random number for the first trial, generated using the uniform distribution on [0, 1], is 0.18.
Calculate the simulated claim amount.
(A) 0
(B) 10
(C) 20
(D) 30
(E) 40
Note: I have rewritten the original exam question.
4.28 (CAS3, 11/03, Q.38) (2.5 points) Using the inversion method, a Binomial (10, 0.20) random
variable is generated, with 0.65 from U(0 ,1) as the initial random number.
Determine the simulated result.
A. 0 B. 1 C. 2 D. 3 E. 4

4.29 (CAS3, 11/03, Q.39) (2.5 points) When generating random variables, it is important to
consider how much time it takes to complete the process. Consider a discrete random variable
X with the following distribution:
k    Prob(X=k)
1    0.15
2    0.10
3    0.25
4    0.20
5    0.30
Of the following algorithms, which is the most efficient way to simulate X?
A.
If U<0.15 set X = 1 and stop.
If U<0.25 set X = 2 and stop.
If U<0.50 set X = 3 and stop.
If U<0.70 set X = 4 and stop.
Otherwise set X = 5 and stop.
B.
If U<0.30 set X = 5 and stop.
If U<0.50 set X = 4 and stop.
If U<0.75 set X = 3 and stop.
If U<0.85 set X = 2 and stop.
Otherwise set X = 1 and stop.
C.
If U<0.10 set X =2 and stop.
If U<0.25 set X = 1 and stop.
If U<0.45 set X = 4 and stop.
If U<0.70 set X = 3 and stop.
Otherwise set X = 5 and stop.
D.
If U<0.30 set X = 5 and stop.
If U<0.55 set X = 3 and stop.
If U<0.75 set X = 4 and stop.
If U<0.90 set X = 1 and stop.
Otherwise set X = 2 and stop.
E.
If U<0.20 set X = 4 and stop.
If U<0.35 set X = 1 and stop.
If U<0.45 set X = 2 and stop.
If U<0.75 set X = 5 and stop.
Otherwise set X = 3 and stop.
4.30 (CAS3, 11/03, Q.40) (2.5 points) W is a geometric random variable with β = 7/3.
Three uniform random numbers from (0, 1) are: 0.68, 0.08, and 0.48.
Use the inversion method. Calculate W3, the third randomly generated value of W.
Comment: I have revised the original exam question.

4.31 (CAS3, 5/04, Q.30) (2.5 points) A scientist performs experiments, each with a 60% success
rate. Let X represent the number of trials until the first success. Use the inversion method to simulate
the random variable, X, and the following random numbers (where low numbers correspond to a
high number of trials): 0.15, 0.62, 0.37, 0.78.
Generate the total number of trials until three successes result.
A. 3
B. 4
C. 5
D. 6
E. 7
4.32 (CAS3, 11/04, Q.38) (2.5 points) A uniform random number from [0, 1] is 0.7885.
The inversion method is used to compute the number of failures, F, before a success in a series of
independent trials each with success probability p = 0.70.
Low random numbers correspond to a small number of failures.
Evaluate F.
A. 0 B. 1 C. 2 D. 3 E. 4
Comment: I have revised the original exam question in order to match the current syllabus.
4.33 (4, 5/05, Q.12 & 2009 Sample Q.182) (2.9 points)
A company insures 100 people age 65.
The annual probability of death for each person is 0.03. The deaths are independent.
Use the inversion method to simulate the number of deaths in a year.
Do this three times using:
u1 = 0.20
u2 = 0.03
u3 = 0.09
Calculate the average of the simulated values.
(A) 1/3
(B) 1
(C) 5/3
(D) 7/3

(E) 3

Solutions to Problems:
4.1. D. Calculate a table of values for the distribution function and then determine the first value at
which F(x)>.995. This first occurs at x = 3.
Number of Claims    Probability Density Function    Cumulative Distribution Function
0                   59.049%                         0.59049
1                   32.805%                         0.91854
2                    7.290%                         0.99144
3                    0.810%                         0.99954
4                    0.045%                         0.99999
5                    0.001%                         1.00000
Comment: In order to construct the table, one could use the relationship: f(x+1) / f(x) =
a + b / (x+1), with for the Binomial a = -q/(1-q), b = (m+1)q/(1-q), and f(0) = (1-q)^m.
4.2. C. The present values of the benefits are either 0, 1000/1.05 = 952, 1100/1.05^2 = 998,
2000/1.05 = 1905, or 2200/1.05^2 = 1995. Arrange these from smallest to largest:
P.V. of Benefits    Probability    Cumulative Distribution
0                   0.82           0.82
952                 0.05           0.87
998                 0.06           0.93
1905                0.03           0.96
1995                0.04           1.00

See where the cumulative distribution first exceeds the random number .923.
F(998) = .93 > .923, so the simulated value is 998.
Comment: Similar to 3, 5/01, Q.11.

4.3. C. Calculate a table of values for the distribution function and then determine the first value at
which F(x)>.35. This first occurs at x = 4.
Number of Claims    f(x)         F(x)
0                   0.0167962    0.01680
1                   0.0537477    0.07054
2                   0.0967459    0.16729
3                   0.1289945    0.29628
4                   0.1418940    0.43818
5                   0.1362182    0.57440
6                   0.1180558    0.69245
7                   0.0944446    0.78690
Comment: I've only displayed the first part of the Distribution function; for this problem one need
only calculate up to F(4). In order to calculate the densities, one could use the relationship:
f(x+1) / f(x) = a + b/(x+1), with for the Negative Binomial
a = β/(1+β), b = (r-1)β/(1+β), and f(0) = 1/(1+β)^r.
4.4. D. The first step is to calculate a table of densities for this Poisson, using the relationship
f(x+1) = f(x) {λ / (x+1)} = 0.7 f(x) / (x+1), and f(0) = e^(-λ) = e^(-0.7) = 0.49659.
Number of Claims    Probability Density Function    Cumulative Distribution Function
0                   49.659%                         0.49659
1                   34.761%                         0.84420
2                   12.166%                         0.96586
3                    2.839%                         0.99425
4                    0.497%                         0.99921
5                    0.070%                         0.99991
6                    0.008%                         0.99999

One cumulates the densities to get the distribution function. The first random number is 0.681; one
sees the first time F(x) > 0.681, which occurs when x = 1.
Similarly, F(4) > 0.996. Finally F(0) > 0.423.
Thus the random draws from this Poisson Distribution are: 1, 4, 0. Their sum is: 1 + 4 + 0 = 5.
Comment: If one were only interested in the total number of claims over three years, one could
instead simulate a single draw from a Poisson with mean (3)(.7) = 2.1. However, that is not what you
were asked to do here.

4.5. D. Calculate a table of the Distribution Function. Since $700 is the first value at which the
distribution function is >.78, $700 is the first simulated claim. Since $200 is the first value at which the
distribution function is >.31, $200 is the 2nd simulated claim. Since $400 is the first value at which
the distribution function is >.60, $400 is the 3rd simulated claim.
$700 + $200 + $400 = $1300.
Size of Loss    Number Observed    f(x)     F(x)
100             3                  0.15     0.15000
200             4                  0.20     0.35000
400             6                  0.30     0.65000
700             3                  0.15     0.80000
1200            2                  0.10     0.90000
2000            2                  0.10     1.00000

4.6. C. In thousands of dollars take: 1 + largest integer in: (4)(.61) = 1 + 2 = 3.


Alternately, F(1000) = .25, F(2000) = .5, F(3000) = .75, F(4000) = 1.
Since F(3000) > .61, .61 corresponds to $3000.
4.7. D. This is a Geometric Distribution. One can just calculate a table of values for f(x) and then take
the cumulative sum to get F(x). F(x) first exceeds 0.8 when x is 3.
x    f(x)      F(x)
0    0.4000    0.4000
1    0.2400    0.6400
2    0.1440    0.7840
3    0.0864    0.8704
4    0.0518    0.9222
Alternately, one can sum up the power series:
F(x) = Σ_{i=0}^{x} f(i) = (0.4) Σ_{i=0}^{x} 0.6^i = (0.4) (1 - 0.6^(x+1)) / (1 - 0.6) = 1 - 0.6^(x+1).
Want 0.8 < F(x) = 1 - 0.6^(x+1). Solving, x > 2.15. Smallest integer x > 2.15 is 3.
Therefore, we simulate 3 claims.
4.8. B. One needs to generate four random draws from a Bernoulli Distribution with the same q
parameter as the Binomial. For q = .3 the Bernoulli Distribution has F(0) = .7 and
F(1) = 1. Therefore, the four random draws corresponding to: .1, .9, .2, .6, are: 0, 1, 0, and 0.
The sum of the four Bernoullis is 0 + 1 + 0 + 0 = 1, the random draw from a Binomial.
Comment: This is an alternate method of generating a random draw from a Binomial Distribution.
One adds up m independent random draws from a Bernoulli Distribution with parameter q.

4.9. D. First simulate a random integer from 1 to 5: 1 + largest integer in (5)(.125) = 1. Now
exchange 1 with 5 to get: 5,2,3,4,1. Now simulate a random integer from 1 to 4: 1 + largest integer
in (4)(.027) = 1. Now exchange the number in the 1st position with the number in the 4th position to
get: 4, 2, 3, 5, 1. Now simulate a random integer from 1 to 3: 1 + largest integer in (3)(.614) = 2.
Now exchange the number in the 2nd position with the number in the 3rd position to get: 4,3,2,5,1.
Now simulate a random integer from 1 to 2: 1 + largest integer in (2)(.850) = 2.
Now exchange the number in the 2nd position with the number in the 2nd position; the sequence
remains the same at: 4,3,2,5,1.
The simulated random permutation is: 4, 3, 2, 5, 1.
Comment: We only use the first four random numbers. If we had been asked to simulate a random
subset of size two, without replacement, the answer would have been {5, 1}, and we could have
stopped after using just two random numbers.
4.10. C. The probability of death is: q = 1 - l65/l45 = 1 - 7533964/9164051 = 0.1779.
m = 10. The Distribution Function of the Binomial first exceeds 0.93 when there are 4 deaths.
Number of Deaths    Probability Density Function    Cumulative Distribution Function
0                   14.100889%                      0.1410089
1                   30.513904%                      0.4461479
2                   29.714033%                      0.7432883
3                   17.146743%                      0.9147557
4                    6.493382%                      0.9796895
5                    1.686178%                      0.9965513
6                    0.304070%                      0.9995920
7                    0.037600%                      0.9999680
8                    0.003051%                      0.9999985
9                    0.000147%                      1.0000000
10                   0.000003%                      1.0000000
4.11. D. F(x) first exceeds 0.83 at x = 4.
x    f(x)    F(x)
1    0.05    0.05
2    0.20    0.25
3    0.30    0.55
4    0.35    0.90
5    0.10    1.00

4.12. B. The most efficient way to simulate X, doing the fewest comparisons on average, is to test
for the largest probabilities first. In this case, f(4) > f(3) > f(2) > f(5) > f(1).
So we take the cumulative sums of f(4), f(3), f(2), f(5), and f(1):
x    f(x)    F(x)
4    0.35    0.35
3    0.30    0.65
2    0.20    0.85
5    0.10    0.95
1    0.05    1.00
Let U be a random number from (0, 1).
If U < 0.35 set X = 4 and stop. If U < 0.65 set X = 3 and stop. If U < 0.85 set X = 2 and stop.
If U < 0.95 set X = 5 and stop. Otherwise set X = 1 and stop.
In this case, we stop when .83 < .85 and X = 2.
Comment: Similar to CAS3, 11/03, Q.39.

4.13. D. First construct a table of the distribution of X.


f(1) = .3, f(2) = (.3){1 - f(1)} = (.3)(.7) = .21. f(3) = .7 f(2) = .147. f(x+1) = .7 f(x).
x    f(x)       F(x)
1    0.30000    0.30000
2    0.21000    0.51000
3    0.14700    0.65700
4    0.10290    0.75990
5    0.07203    0.83193
6    0.05042    0.88235
7    0.03529    0.91765
8    0.02471    0.94235

We need to see where u is first exceeded by F(x). 0.745 is first exceeded by F(4).
0.524 is first exceeded by F(3). 0.941 is first exceeded by F(8). 0.038 is first exceeded by F(1).
Total number of attempts until four hits result: 4 + 3 + 8 + 1 = 16.
Alternately, X is 1 plus a Geometric with β = probability of failure / probability of success = 7/3.
For a Geometric, S(n) = {β/(1+β)}^(n+1). Therefore, for X, S(x) = Prob[X > x] = {β/(1+β)}^x = 0.7^x.
Given a random number u, we want the first x such that F(x) > u ⇔ 1 - 0.7^x > u ⇔ 1 - u > 0.7^x ⇔
ln(1 - u) > x ln(0.7). Since ln(0.7) < 0, ln(1 - u) > x ln(0.7) ⇔ x > ln(1-u)/ln(0.7).
Therefore we want: x = 1 + largest integer in: ln(1-u)/ln(0.7).
ln(0.255)/ln(0.7) = 3.83 ⇒ x = 4. ln(0.476)/ln(0.7) = 2.08 ⇒ x = 3.
ln(0.059)/ln(0.7) = 7.94 ⇒ x = 8. ln(0.962)/ln(0.7) = 0.11 ⇒ x = 1.
Total number of simulated attempts until four hits result: 4 + 3 + 8 + 1 = 16.
Comment: Similar to CAS3, 5/04, Q.30. Note that X is the number of trials rather than the number
of failures until the first success. X is a zero-truncated Geometric; X = 1, 2, 3, ....
The number of failures, X - 1, is a Geometric with β = 7/3.

4.14. B. The first step is to calculate a table of densities for this Poisson, using the relationship
f(x+1) = f(x) {λ / (x+1)} = f(x) 5.4 / (x+1), and f(0) = e^(-λ) = e^(-5.4) = 0.004517.
Number of Claims    Probability Density Function    Cumulative Distribution Function
0                    0.4517%                        0.004517
1                    2.4390%                        0.028906
2                    6.5852%                        0.094758
3                   11.8533%                        0.213291
4                   16.0020%                        0.373311
5                   17.2821%                        0.546132
6                   15.5539%                        0.701671
7                   11.9987%                        0.821659
8                    8.0991%                        0.902650
9                    4.8595%                        0.951245
10                   2.6241%                        0.977486
11                   1.2882%                        0.990368

One cumulates the densities to get the distribution function. The first random number is 0.5859; one
sees the first time F(x) > 0.5859, which occurs when x = 6 and F(6) = .701671.
Similarly, F(10) = 0.977486 > 0.9554. Proceeding similarly, the four random draws from this
Poisson Distribution are: 6, 10, 3, and 4. The sample mean is: (6 + 10 + 3 + 4)/4 = 5.75.
The sample variance is: {(6 - 5.75)2 + (10 - 5.75)2 + (3 - 5.75)2 + (4 - 5.75)2 }/(4 - 1) = 9.58.
4.15. D. Calculate a table of values for the distribution function and then determine the first value at
which F(x) > 0.64. This first occurs at x = 9.
Number of Claims    f(x)         F(x)
0                   0.0123457    0.01235
1                   0.0329218    0.04527
2                   0.0548697    0.10014
3                   0.0731596    0.17330
4                   0.0853528    0.25865
5                   0.0910430    0.34969
6                   0.0910430    0.44074
7                   0.0867076    0.52744
8                   0.0794820    0.60693
9                   0.0706507    0.67758
Comment: I've only displayed the first part of the Distribution function; for this problem one need
only calculate up to F(9). In order to calculate the densities, one could use the relationship:
f(x+1) / f(x) = a + b/(x+1), with for the Negative Binomial a = β/(1+β), b = (r-1)β/(1+β),
and f(0) = 1/(1+β)^r. f(x+1) / f(x) = {β/(1+β)}(x + r)/(x + 1). In this case, a = 2/3, b = 6/3,
f(x+1) = f(x){2/3 + (6/3)/(x+1)} = f(x)(2/3)(x + 4)/(x + 1).

4.16. The cumulative distribution function is:
Credit score:    499    549    599    649    699    749    799    800
F:               2%     7%     15%    27%    42%    60%    87%    100%
Therefore, .528 corresponds to the interval from 700 to 749.
(.528 - 42%)/18% = 0.6. Thus .528 corresponds to: 700 + (50)(.6) = 730.
.342 corresponds to the interval from 650 to 699.
(.342 - 27%)/15% = 0.48. Thus .342 corresponds to: 650 + (50)(.48) = 674.
.914 is greater than 87%, and corresponds to a credit score of 800.
Comment: Depending on details, your simulated credit scores could differ by 1 from what I have.
4.17. B. State 1 is rain and State 2 is no rain. The transition matrix is:
0.5 0.5
0.2 0.8

Since the chain starts in state 1, we compare 0.661 to the cumulative sums across the 1st row of the
transition matrix. 0.5 ≤ 0.661 < 1.0, so the chain goes to state 2. It does not rain on Monday.
Since the chain is now in state 2, we compare 0.529 to the cumulative sums across the 2nd row of the
transition matrix. 0.2 ≤ 0.529 < 1.0, so the chain remains in state 2. It does not rain on Tuesday.
0.2 ≤ 0.301, so it does not rain on Wednesday.
0.132 < 0.2, so it rains on Thursday.
Since the chain is now in state 1, and 0.378 < 0.5, it remains in state 1. It rains Friday.
0.5 ≤ 0.792, so it does not rain on Saturday. It rains Thursday and Friday, 2 days.
4.18. B.
L:      0      1000    2000    3000    4000
F(L):   0.15   0.40    0.75    0.95    1.00
Year    Random Number    Aggregate Loss
1       0.679            2000
2       0.519            2000
3       0.148            0
4       0.206            1000
5       0.824            3000
6       0.249            1000
7       0.392            1000
8       0.980            4000
9       0.501            2000
10      0.844            3000
Total                    19,000

4.19. C. In State F there is a 20% chance of staying in F and an 80% chance of going to G.
The first random number, 0.834 ≥ 0.20, so day one is in State G.
In State G there is a 50% chance of going to F and a 50% chance of going to H.
The next random number, 0.588 ≥ 0.50, so day two is in State H.
In State H there is a 75% chance of going to F and a 25% chance of going to I.
The next random number, 0.315 < 0.75, so day three is in State F.
In State F there is a 20% chance of staying in F and an 80% chance of going to G.
The next random number, 0.790 ≥ 0.20, so day four is in State G.
In State G there is a 50% chance of going to F and a 50% chance of going to H.
The next random number, 0.941 ≥ 0.50, so day five is in State H.
In State H there is a 75% chance of going to F and a 25% chance of going to I.
The next random number, 0.510 < 0.75, so day six is in State F.
In State F there is a 20% chance of staying in F and an 80% chance of going to G.
The next random number, 0.003 < 0.20, so day seven is in State F.
Thus the machine is in: G, H, F, G, H, F, F.
The total production is: 90 + 70 + 100 + 90 + 70 + 100 + 100 = 620.
4.20. E. F(4) = 0.376953 > 0.325. F(2) > 0.072. F(13) > 0.956. F(6) > 0.565. F(11) > 0.899.
4 + 2 + 13 + 6 + 11 = 36.
Comment: Based on a Negative Binomial Distribution with r = 10 and β = 1.
4.21. D. A table of the Zero-Modified Poisson Distribution with p0^M = 30% and λ = 1.8:
Number of Claims    Probability      Distribution Function
0                   30.000000%       30.000000%
1                   24.952237%       54.952237%
2                   22.457013%       77.409250%
3                   13.474208%       90.883458%
4                    6.063394%       96.946852%
5                    2.182822%       99.129673%
6                    0.654847%       99.784520%
7                    0.168389%       99.952909%
8                    0.037888%       99.990797%

0.98 is first exceeded when n = 5.


0.37 is first exceeded when n = 1.
0.68 is first exceeded when n = 2.
5 + 1 + 2 = 8.

4.22. C. For the non-truncated Negative Binomial Distribution, f(0) = 1/1.4^6 = 0.13281.
p_k^T = p_k / (1 - 0.13281) = {(6)(7)...(5+k) / k!} {0.4^k / 1.4^(6+k)} / (1 - 0.13281).
A table of the Zero-Truncated Negative Binomial Distribution with r = 6 and β = 0.4:
Size of Family    Probability      Distribution Function
1                 26.254327%       26.254327%
2                 26.254327%       52.508653%
3                 20.003297%       72.511950%
4                 12.859262%       85.371212%
5                  7.348150%       92.719362%
6                  3.849031%       96.568393%
7                  1.885240%       98.453632%
8                  0.875290%       99.328922%
9                  0.389018%       99.717940%

0.08 is first exceeded when n = 1.


0.75 is first exceeded when n = 4.
0.47 is first exceeded when n = 2.
1 + 4 + 2 = 7.
4.23. B. A table of the Zero-Modified Binomial Distribution:
Number of Claims    Probability      Distribution Function
0                   20.000000%       20.000000%
1                    9.966392%       29.966392%
2                   19.220898%       49.187290%
3                   21.966741%       71.154030%
4                   16.475055%       87.629086%
5                    8.472886%       96.101971%
6                    3.026031%       99.128002%
7                    0.741069%       99.869071%
8                    0.119100%       99.988171%
9                    0.011343%       99.999514%
10                   0.000486%       100.000000%

0.4768 is first exceeded when n = 2.


0.9967 is first exceeded when n = 7.
0.3820 is first exceeded when n = 2.
2 + 7 + 2 = 11.

4.24. C. There are 20 observations and the cumulative Distribution Function is:
x:      100    200    400    600    1000
F(x):   0.10   0.35   0.60   0.80   1.00
For each random number u from (0,1) one wants the smallest x such that F(x) > u.
For u = 0.3, F(200) = 0.35 > 0.3, so we simulate a loss of size 200. For u = 0.7, F(600) = 0.80 > 0.7,
so we simulate a loss of size 600. The sum of the simulated values is 200 + 600 = 800.
4.25. C. Calculate a table of values for the distribution function and then determine the first value at
which F(x) > 0.7. This first occurs at x = 2.
Number of Claims    Probability Density Function    Cumulative Distribution Function
0                   23.730%                         0.23730
1                   39.551%                         0.63281
2                   26.367%                         0.89648
3                    8.789%                         0.98438
4                    1.465%                         0.99902
5                    0.098%                         1.00000

4.26. B. The present values of the benefits are either 0, 1000/1.06 = 943, 1100/1.06^2 = 979,
1100/1.06 = 1038, or 1200/1.06^2 = 1068. Arrange these from smallest to largest:
P.V. of Benefits    Probability    Cumulative Distribution
0                   0.2            0.2
943                 0.1            0.3
979                 0.1            0.4
1038                0.1            0.5
1068                0.5            1.0

See where the cumulative distribution first exceeds the random number .35.
F(979) = .40 > .35, so the simulated value is 979.
4.27. C. q70 = 0.03318. The number of deaths in the first year is Binomial with q = 0.03318 and m = 100:
f(0) = (1 - 0.03318)^100 = 0.0342. F(0) = 0.0342.
f(1) = 100(0.03318)(1 - 0.03318)^99 = 0.1175. F(1) = F(0) + f(1) = 0.1517.
f(2) = {(100)(99)/2}(0.03318^2)(1 - 0.03318)^98 = 0.1996. F(2) = F(1) + f(2) = 0.3513.
See where the distribution function first exceeds the random number of 0.18:
Since F(1) ≤ 0.18 < F(2), there are 2 deaths. The payment is: (2)(10) = 20.
Comment: Some of the given information was used to answer the previous question on this exam.

4.28. C. For a Binomial with m = 10 and q = .2, F(x) first exceeds .65 when x = 2.
x     f(x)         F(x)
0     0.1073742    0.1073742
1     0.2684355    0.3758096
2     0.3019899    0.6777995
3     0.2013266    0.8791261
4     0.0880804    0.9672065
5     0.0264241    0.9936306
6     0.0055050    0.9991356
7     0.0007864    0.9999221
8     0.0000737    0.9999958
9     0.0000041    0.9999999
10    0.0000001    1.0000000

Comment: One can stop computing the densities and distribution function, when one gets to
x = 2 and notices that F(2) > .65, the given random number.
4.29. D. The most efficient way to simulate X, doing the fewest comparisons on average, is to test
for the largest probabilities first. In this case, f(5) > f(3) > f(4) > f(1) > f(2).
So we take the cumulative sums of f(5), f(3), f(4), f(1), and f(2):
.3, .3 + .25 = .55, .3 + .25 + .2 = .75, .3 + .25 + .2 + .15 = .90, .3 + .25 + .2 + .15 + .1 = 1.00.
Let U be a random number from (0, 1).
If U < 0.30 set X = 5 and stop. If U < 0.55 set X = 3 and stop. If U < 0.75 set X = 4 and stop.
If U < 0.90 set X = 1 and stop. Otherwise set X = 2 and stop.
Comment: All of these are valid algorithms.
Expected number of comparisons:
A. (1)(.15) + (2)(.1) + (3)(.25) + (4)(.5) = 3.1.
B. (1)(.3) + (2)(.2) + (3)(.25) + (4)(.25) = 2.45.
C. (1)(.1) + (2)(.15) + (3)(.2) + (4)(.55) = 3.2.
D. (1)(.3) + (2)(.25) + (3)(.2) + (4)(.25) = 2.4.
E. (1)(.2) + (2)(.15) + (3)(.1) + (4)(.55) = 3.0.
4.30. The third random number is .480.
For a Geometric Distribution with β = 7/3, F(x) first exceeds 0.480 when x = 1.
x    f(x)         F(x)
0    0.3000000    0.3000000
1    0.2100000    0.5100000
2    0.1470000    0.6570000
3    0.1029000    0.7599000

4.31. D. X is the number of trials until the first success.


Prob[X = 1] = Prob[success on first trial] = .6.
Prob[X = 2] = Prob[failure on first trial] Prob[success on second trial] = (.4)(.6) = .24.
Prob[X = 3] = Prob[failure on 1st trial] Prob[failure on 2nd trial]Prob[success on 3rd trial] = (.4)(.4)(.6)
= .096. Note that Prob[X = 3] = .4 Prob[X = 2]. f(x+1) = .4 f(x).
x    f(x)          F(x)
1    0.6           0.6
2    0.24          0.84
3    0.096         0.936
4    0.0384        0.9744
5    0.01536       0.98976
6    0.006144      0.99590
7    0.0024576     0.99836
8    0.00098304    0.99934

Since low numbers correspond to a high number of trials, we need to see where 1 - u is first exceeded
by F(x), rather than the more usual where u is first exceeded by F(x).
1 - 0.15 = 0.85, first exceeded by F(3) = 0.936. 1 - 0.62 = 0.38, first exceeded by F(1) = 0.6.
1 - 0.37 = 0.63, first exceeded by F(2) = 0.84.
Total number of trials until three successes: 3 + 1 + 2 = 6.
Alternately, X is 1 plus a Geometric with
β = probability of failure / probability of success = 0.4/0.6 = 2/3.
For a Geometric as per Loss Models, S(n) = Prob[# of failures > n] = {β/(1+β)}^(n+1).
S(x) = Prob[# of trials > x] = Prob[# of failures > x-1] = {β/(1+β)}^x = {(2/3)/(5/3)}^x = 0.4^x.
Given a random number u, we want the first integer x such that:
S(x) < u ⇔ 0.4^x < u ⇔ x > ln(u)/ln(0.4).
ln(0.15)/ln(0.4) = 2.07 ⇒ x = 3. ln(0.62)/ln(0.4) = 0.52 ⇒ x = 1. ln(0.37)/ln(0.4) = 1.09 ⇒ x = 2.
Total number of trials until three successes: 3 + 1 + 2 = 6.
Comment: The number of failures, X - 1, is a Geometric with β = 2/3.
4.32. B. f(0) = 0.7. f(1) = Prob[failure first trial] Prob[success second trial] = (0.3)(0.7) = 0.21.
f(2) = Prob[failure first trial] Prob[failure second trial] Prob[success third trial] = (0.3^2)(0.7) = 0.063.
F(0) = 0.7 ≤ 0.7885. F(1) = 0.91 > 0.7885. Thus we simulate 1 failure.
Comment: The number of failures before the first success is Geometric, with β = 0.3/0.7 = 3/7.

4.33. B. The number of deaths in a year is Binomial with m = 100 and q = 0.03.
f(0) = 0.97^100 = 0.04755. f(1) = 100(0.97^99)(0.03) = 0.14707.
f(2) = {(100)(99)/2}(0.97^98)(0.03^2) = 0.22515. F(0) = 0.04755. F(1) = 0.19462. F(2) = 0.41977.
u1 = 0.20 is first exceeded by F(2) = 0.41977, simulate two deaths.
u2 = 0.03 is first exceeded by F(0) = 0.04755, simulate zero deaths.
u3 = 0.09 is first exceeded by F(1) = 0.19462, simulate one death.
Average number of deaths simulated is: (2 + 0 + 1)/3 = 1.
Comment: Given larger random numbers, one would need to calculate more of the distribution
function of the Binomial.


Section 5, Simulating Normal and LogNormal Distributions


One can simulate a random draw from a Standard Unit Normal many different ways, including the
inversion method (Table Lookup), the Polar Normal Method,34 and the Rejection Method. 35
A random draw from a Standard Unit Normal can in turn be used to simulate a random Normal or a
random LogNormal.
Inversion Method (Table Lookup):
To use the inversion method (Table Lookup) to simulate a random draw from a
Standard Unit Normal,36 one uses a Table of the (Standard Unit) Normal Distribution and a random
number from (0, 1).
Exercise: Given a random number 0.95 from (0,1), use the inversion method in order to simulate a
random draw from a Standard Unit Normal.
[Solution: Φ(1.645) = 0.95, therefore, the random draw from a Standard Unit Normal is 1.645.]
In general, one looks up in a Normal Table, the place where the Distribution equals the random
number from (0,1). Given a random number u, find Z such that Φ(Z) = u.
One random number from (0,1), gives one random draw from a Standard Unit Normal.
Exercise: Given a random number 0.3974 from (0,1), use the inversion method in order to simulate
a random draw from a Standard Unit Normal.
[Solution: Φ(0.260) = 0.6026, so Φ(-0.260) = 1 - 0.6026 = 0.3974.
Therefore, the random draw from a Standard Unit Normal is -0.260.]
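For readers who like to check table lookups by computer, here is a minimal sketch of this inversion step in Python, assuming the scipy library is available; the helper name is just for illustration. On the exam, one uses the attached Normal table instead.

from scipy.stats import norm

def simulate_standard_normal(u):
    # Illustrative helper: return z such that Phi(z) = u, the inverse of the Standard Normal CDF.
    return norm.ppf(u)

print(simulate_standard_normal(0.95))    # about 1.645, as in the first exercise
print(simulate_standard_normal(0.3974))  # about -0.260, as in the second exercise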
Non-Unit Normals:
Assume we have simulated a random draw from a Standard Unit Normal, with a mean of zero and a
standard deviation of one. Then we can convert this to a random draw from a Normal with mean μ
and standard deviation σ by multiplying the standard normal variable by σ and then adding μ.
Simulate a Unit Normal Z, then X = σZ + μ.
For example, assume we are trying to simulate heights of human males, which are assumed to be
normally distributed with mean 69 inches and a standard deviation of 4 inches. Then a standard
normal draw of 1.2 would translate to a height of (1.2)(4) + 69 = 73.8.37
34 See for example Simulation by Ross, not on the syllabus.
35 See for example Simulation by Ross, not on the syllabus.
36 With a mean of zero and a standard deviation of one.
37 This is the inverse of the usual method of standardizing variables so that one can use the Standard Normal Table.
Exercise: Simulate a random draw from a Normal Distribution with parameters μ = 2, σ = 7.


Use the random number from zero to one: 0.6406.
[Solution: Φ(0.36) = 0.6406. Z = 0.36. X = (7)(0.36) + 2 = 4.52.
Comment: If we standardize 4.52: (4.52 - 2) / 7 = 0.36.]
Simulating a LogNormal Distribution:
Assume we want to simulate a random variable Y from a LogNormal Distribution with μ = 10 and
σ = 4. Then by the definition of the LogNormal Distribution, ln(Y) is Normally distributed with mean
10 and standard deviation 4.
Above we saw how to simulate such a variable; if Z is distributed as per a Standard Normal, then
4Z + 10 is distributed as per a Normal Distribution with standard deviation 4 and mean 10.
Set ln(Y) = 4Z + 10.
Therefore, Y = exp(4Z + 10).
For example, if a random draw from a Unit Normal is -0.211, then exp[(-0.211)(4) + 10] = 9471 is a
random draw from a LogNormal Distribution with μ = 10 and σ = 4.
In general, if Z is a random draw from a Unit Normal, then exp(σZ + μ) is a random draw
from a LogNormal Distribution with parameters μ and σ.
In order to simulate a LogNormal Distribution with parameters μ and σ (see the sketch after this list):
1. Simulate a Unit Normal, Z.
2. Get a random Normal variable with parameters μ and σ. X = σZ + μ.
3. Exponentiate to get a random LogNormal variable. Y = exp(σZ + μ).
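Here is a minimal sketch of these three steps in Python, assuming the scipy library is available; mu and sigma are the LogNormal parameters, u is a random number from (0, 1), and the helper name is illustrative.

import math
from scipy.stats import norm

def simulate_lognormal(u, mu, sigma):
    z = norm.ppf(u)        # step 1: random Standard Unit Normal
    x = sigma * z + mu     # step 2: random Normal with parameters mu and sigma
    return math.exp(x)     # step 3: exponentiate to get a random LogNormal

# For example, u = 0.4286, mu = 5, sigma = 2 gives about 103.5, matching the exercise below.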
Exercise: Simulate a random draw from a LogNormal Distribution with parameters μ = 5, σ = 2.
Use the random number from zero to one: 0.4286.
[Solution: Φ(0.18) = 0.5714 = 1 - 0.4286. Z = -0.18.
Exp[(2)(-0.18) + 5] = e^4.64 = 103.5.]

Normal Distribution Table

Entries represent the area under the standardized normal distribution from -∞ to z, Pr(Z < z).
The value of z to the first decimal place is given in the left column.
The second decimal is given in the top row.

z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1  0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2  0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3  0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4  0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5  0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6  0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7  0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8  0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9  0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0  0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1  0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2  0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3  0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4  0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5  0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6  0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7  0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8  0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9  0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0  0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1  0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2  0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3  0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4  0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5  0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6  0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7  0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8  0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9  0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0  0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1  0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2  0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3  0.9995 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4  0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
3.5  0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998
3.6  0.9998 0.9998 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.7  0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.8  0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.9  1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

Values of z for selected values of Pr(Z < z):

z          0.842   1.036   1.282   1.645   1.960   2.326   2.576
Pr(Z < z)  0.800   0.850   0.900   0.950   0.975   0.990   0.995

Problems:
5.1 (1 point) Assume -2.153 is a random draw from a Normal Distribution with a mean of zero and
standard deviation of one. Use this value to simulate a random draw from a Normal Distribution with
a mean of 10 and a standard deviation of 4.
A. less than 1
B. at least 1 but less than 2
C. at least 2 but less than 3
D. at least 3 but less than 4
E. at least 4
5.2 (1 point) Assume -2.153 is a random draw from a Normal Distribution with a mean of zero and
standard deviation of one. Use this value to simulate a random draw from a LogNormal Distribution
with μ = 10 and σ = 4.
A. less than 1
B. at least 1 but less than 2
C. at least 2 but less than 3
D. at least 3 but less than 4
E. at least 4
5.3 (1 point) A random number 0.0228 is generated from a uniform distribution on the interval (0, 1).
Using the Method of Inversion, determine the simulated value of a random draw from a Normal
Distribution with μ = 5 and σ = 17.
A. Less than -20
B. At least -20, but less than -10
C. At least -10, but less than 0
D. At least 0, but less than 10
E. At least 10
5.4 (2 points) A random number 0.9772 is generated from a uniform distribution on the interval
(0, 1). Using the Method of Inversion, determine the simulated value of a random draw from a
LogNormal Distribution with μ = 5 and σ = 3.
A. Less than 55,000
B. At least 55,000, but less than 60,000
C. At least 60,000, but less than 65,000
D. At least 65,000, but less than 70,000
E. At least 70,000

5.5 (3 points) Insurance for a city's snow removal costs covers four winter months.

There is a deductible of 5000 per month.


The city's monthly costs are independent.
The cost for each month is LogNormal with parameters μ = 9 and σ = 1.5.
To simulate four months of claim costs, the insurer uses the Method of Inversion.
The 4 numbers drawn from the uniform distribution on [0,1] are: 0.6879, 0.1515, 0.2743, 0.8078.
Calculate the insurer's simulated claim cost.
A. 29,000   B. 31,000   C. 33,000   D. 35,000   E. 37,000

Use the following information for the next 3 questions:


X(0) = 0, X(1) - X(0) is Normally distributed with μ = 0 and σ = 5, X(2) - X(1) is Normally distributed
with μ = 0 and σ = 5, X(3) - X(2) is Normally distributed with μ = 0 and σ = 5, etc.
X(1) - X(0) is independent of X(2) - X(1), X(2) - X(1) is independent of X(3) - X(2), etc.
5.6 (1 point) Simulate X(1). Use the following random number from [0, 1]: 0.3085.
A. less than -4
B. at least -4 but less than -3
C. at least -3 but less than -2
D. at least -2 but less than -1
E. at least -1
5.7 (1 point) Given the solution to the previous question, simulate X(2).
Use the following random number from [0, 1]: 0.8159.
A. less than -1
B. at least -1 but less than 0
C. at least 0 but less than 1
D. at least 1 but less than 2
E. at least 2
5.8 (1 point) Given the solution to the previous question, simulate X(3).
Use the following random number from [0, 1]: 0.1151.
A. less than -5
B. at least -5 but less than -3
C. at least -3 but less than -1
D. at least -1 but less than 1
E. at least 1

5.9 (2 points) In Munchkin Land, the heights of adult males are Normally Distributed with mean 40
and standard deviation 5. Dorothy Gale meets three adult male Munchkins.
Simulate their heights using the following random numbers from [0, 1]: 0.7486, 0.1210, 0.5319.
What is the sum of these three simulated heights?
A. 117
B. 118
C. 119
D. 120
E. 121
5.10 (3 points) A retrospectively rated workers compensation policy is written for the Phil Fish
Canning Company.
The insurance premium paid by the canning company depends on its annual aggregate losses, L:
P = (1.03) (1.15L + 80,000),
subject to a minimum premium of 300,000 and a maximum premium of 600,000.
Phil Fish Canning Company's annual aggregate losses are LogNormal with μ = 12.3 and σ = 0.8.
You simulate 5 separate years of losses, using the following random draws from a Standard Normal
Distribution with mean zero and standard deviation one: 0.1485, -1.5499, 0.3249, -0.1484, 1.8605.
What is the average of the five simulated premiums?
A. 400,000
B. 425,000
C. 450,000
D. 475,000
E. 500,000
5.11 (3 points) Mr. Kotters class takes a standardized statewide reading test.
A score of at least 60 passes the test.
The scores of Mr. Kotter's students on this reading test are Normally Distributed with
μ = 50 and σ = 10.
The following are four uniform (0, 1) random numbers:
0.5596
0.3821
0.8643
0.0495
Using these numbers and the inversion method, simulate the scores of four students:
Vinnie Barbarino, Arnold Horshack, Freddie Washington, and Juan Epstein.
Determine the difference between the average simulated score of those students who passed and
the average simulated score of those students who failed.
A. 14
B. 15
C. 16
D. 17
E. 18
5.12 (2 points) Assume the following model:

Annual aggregate losses follow a LogNormal Distribution with μ = 10 and σ = 2.

Each year is independent of the others.


Use the following random numbers from (0, 1): 0.9099, 0.3483, 0.5000, in order to simulate this
model. What is the simulated total aggregate loss for three years?
A. Less than 360,000
B. At least 360,000, but less than 370,000
C. At least 370,000, but less than 380,000
D. At least 380,000, but less than 390,000
E. At least 390,000

5.13 (3 points) Losses for Medical Malpractice Insurance are assumed to follow a LogNormal
Distribution with median of 59,874, and coefficient of variation of 4.
The amount of Allocated Loss Adjustment Expense (ALAE) has the following relationship to the
amount of loss:
ln[ALAE] = 4.6 + ln[Loss] / 2.
A random number 0.7224 is generated from a uniform distribution on the interval (0, 1).
Using the Method of Inversion, simulate a random Medical Malpractice Loss.
What is the resulting ratio of ALAE to Loss?
A. 13%
B. 16%
C. 19%
D. 22%
E. 25%
5.14 (2 points) Assume the following model:

Annual aggregate losses follow a Normal Distribution with μ = 900 and σ = 150.

Each year is independent of the others.


Use the following random numbers from (0, 1): 0.063, 0.834, 0.648, in order to simulate this model.
What is the simulated total aggregate loss for three years?
A. Less than 2,000
B. At least 2,000, but less than 2,500
C. At least 2,500, but less than 3,000
D. At least 3,000, but less than 3,500
E. At least 3,500
5.15 (4 points) Using the Antithetic Variate Method, if one has a random number u, then one uses
both u and 1 - u in your simulation. This produces two random outputs instead of one.
(a) (1 point) Applying the Antithetic Variate Method to a LogNormal Distribution, using a random
number of 0.90, what are the two outputs?
(b) (3 points) Applying the Antithetic Variate Method to a LogNormal Distribution, let X and Y be
the two outputs from a single random number u. What is the correlation of X and Y?
5.16 (4B, 5/98, Q.17) (2 points) You are given the following:

In 1997, claims follow a lognormal distribution with parameters μ = 7 and σ = 2.

Inflation of 5% affects all claims uniformly from 1997 to 1998.

A random number is generated from a uniform distribution on the interval (0, 1).
The resulting number is 0.6915
Using the inversion method, determine the simulated value of a claim in 1998.
A. Less than 2,000
B. At least 2,000, but less than 3,000
C. At least 3,000, but less than 4,000
D. At least 4,000, but less than 5,000
E. At least 5,000

5.17 (4B, 5/99, Q.9) (1 point) Two random numbers are generated from a uniform distribution on
the interval (0, 1). The resulting numbers are 0.1587 and 0.8413.
Using the inversion method, determine the sum of two simulated values from a normal distribution
with mean zero and variance one.
A. Less than -1.5
B. At least -1.5, but less than -0.5
C. At least -0.5, but less than 0.5
D. At least 0.5, but less than 1.5
E. At least 1.5
5.18 (3, 5/00, Q.32) (2.5 points) Insurance for a city's snow removal costs covers four winter
months.
(i) There is a deductible of 10,000 per month.
(ii) The insurer assumes that the city's monthly costs are independent and normally
distributed with mean 15,000 and standard deviation 2,000.
(iii) To simulate four months of claim costs, the insurer uses the Inverse Transform Method
(where small random numbers correspond to low costs).
(iv) The four numbers drawn from the uniform distribution on [0,1] are:
0.5398 0.1151 0.0013 0.7881
Calculate the insurer's simulated claim cost.
(A) 13,400 (B) 14,400 (C) 17,800 (D) 20,000 (E) 26,600
5.19 (4, 5/05, Q.34 & 2009 Sample Q.202) (2.9 points) Unlimited claim severities for a warranty
product follow the lognormal distribution with parameters μ = 5.6 and σ = 0.75.
You use simulation to generate severities.
The following are six uniform (0, 1) random numbers:
0.6179
0.4602
0.9452
0.0808
0.7881
0.4207
Using these numbers and the inversion method, calculate the average payment per claim for
a contract with a policy limit of 400.
(A) Less than 300
(B) At least 300, but less than 320
(C) At least 320, but less than 340
(D) At least 340, but less than 360
(E) At least 360

5.20 (4, 11/05, Q.27 & 2009 Sample Q.237) (2.9 points)
Losses for a warranty product follow the lognormal distribution with underlying normal mean and
standard deviation of 5.6 and 0.75 respectively.
You use simulation to estimate claim payments for a number of contracts with different
deductibles.
The following are four uniform (0,1) random numbers:
0.6217
0.9941
0.8686
0.0485
Using these numbers and the inversion method, calculate the average payment per loss for a
contract with a deductible of 100.
(A) Less than 630
(B) At least 630, but less than 680
(C) At least 680, but less than 730
(D) At least 730, but less than 780
(E) At least 780
5.21 (4, 11/06, Q.21 & 2009 Sample Q.265) (2.9 points) For a warranty product you are given:
(i) Paid losses follow the lognormal distribution with μ = 13.294 and σ = 0.494.
(ii) The ratio of estimated unpaid losses to paid losses, y, is modeled by
y = 0.801 x^0.851 e^(-0.747x)
where x = 2006 - contract purchase year
The inversion method is used to simulate four paid losses with the following four uniform (0,1)
random numbers: 0.2877
0.1210
0.8238
0.6179
Using the simulated values, calculate the empirical estimate of the average unpaid losses
for purchase year 2005.
(A) Less than 300,000
(B) At least 300,000, but less than 400,000
(C) At least 400,000, but less than 500,000
(D) At least 500,000, but less than 600,000
(E) At least 600,000

Solutions to Problems:
5.1. B. (-2.153)(4) + 10 = 1.388.
Comment: Reverse of the usual process of standardizing a variable to enable one to use the
standard Normal table.
5.2. E. e^((-2.153)(4) + 10) = e^1.388 = 4.007.
Comment: If x follows a LogNormal, then ln(x) follows a Normal. Thus if y follows a Normal, e^y
follows a LogNormal Distribution.
5.3. A. Set 0.0228 = F(x) = Φ[(x - 5)/17]. Using the Standard Normal Table, Φ(2) = .9772, and
thus Φ(-2) = 1 - .9772 = .0228. -2 = (x - 5)/17. x = -29.
Alternately, x = σZ + μ = (17)(-2) + 5 = -29.
5.4. B. Set 0.9772 = F(x) = Φ[(ln(x) - 5)/3]. Using the Standard Normal Table, Φ(2) = .9772.
Therefore 2 = (ln(x) - 5)/3. Therefore x = e^11 = 59,874.
5.5. E. Since Φ(.49) = .6879, the first random number of .6879 corresponds to a random Unit
Normal of 0.49. This in turn corresponds to a month with costs of:
exp[9 + (1.5)(0.49)] = 16,899. Similarly, the other months correspond to:
exp[9 + (1.5)Φ⁻¹(0.1515)] = exp[9 + (1.5)(-1.03)] = 1728,
exp[9 + (1.5)Φ⁻¹(0.2743)] = exp[9 + (1.5)(-0.60)] = 3294, and
exp[9 + (1.5)Φ⁻¹(0.8078)] = exp[9 + (1.5)(0.87)] = 29,882. After applying the 5000 per month
deductible, the insurer pays: 11,899 + 0 + 0 + 24,882 = 36,781.
Comment: Similar to 3, 5/00, Q.32.
5.6. C. Φ(-0.5) = .3085. Therefore X(1) = (-0.5)(5) = -2.5.
5.7. E. Φ(0.9) = .8159. Therefore X(2) = X(1) + (0.9)(5) = -2.5 + 4.5 = 2.0.
5.8. B. Φ(-1.2) = .1151. Therefore X(3) = X(2) + (-1.2)(5) = 2 - 6 = -4.0.
Comment: A Brownian Motion.

5.9. B. Φ(.67) = .7486. The first height is: 40 + (5)(.67) = 43.35.
Φ(-1.17) = .1210. The second height is: 40 + (5)(-1.17) = 34.15.
Φ(0.08) = .5319. The third height is: 40 + (5)(0.08) = 40.40.
43.35 + 34.15 + 40.40 = 117.90.
5.10. A. L = Exp[0.8Z + 12.3].
Thus the first simulated annual aggregate loss is: Exp[(0.8)(.1485) + 12.3] = 247,409.
P = (1.03){(1.15)(247409) + 80000} = 375,457.
Z         Aggregate Loss   Preliminary Premium   Paid Premium
0.1485    247,409          375,457               375,457
-1.5499   63,582           157,712               300,000
0.3249    284,908          419,873               419,873
-0.1484   195,102          313,499               313,499
1.8605    973,254          1,235,219             600,000

Average Paid Premium: 401,766

Comment: No premium paid is less than the minimum of 300,000 or more than the maximum of
600,000.
5.11. D. Φ[0.15] = 0.5596. Φ[-0.30] = 0.3821. Φ[1.10] = 0.8643. Φ[-1.65] = 0.0495.
Therefore, the four test scores are: 50 + (0.15)(10) = 51.5, 50 + (-0.30)(10) = 47.0,
50 + (1.10)(10) = 61.0, 50 + (-1.65)(10) = 33.5.
The average failing score is: (51.5 + 47.0 + 33.5)/3 = 44.
The average passing score is 61. The difference is: 61 - 44 = 17.
5.12. A. Φ[1.34] = 0.9099.
The simulated aggregate losses for the first year are: exp[10 + (1.34)(2)] = 321,258.
Φ[-0.39] = 0.3483.
The simulated aggregate losses for the second year are: exp[10 + (-0.39)(2)] = 10,097.
Φ[0] = 0.5000.
The simulated aggregate losses for the third year are: exp[10 + (0)(2)] = 22,026.
Three year total is: 321,258 + 10,097 + 22,026 = 353,381.

5.13. E. 1 + CV² = E[X²]/E[X]² = exp[2μ + 2σ²] / exp[μ + σ²/2]² = exp[σ²].
Thus, 1 + 4² = exp[σ²]. σ = 1.683.
For the LogNormal, let x be the median.
0.5 = Φ[(ln[x] - μ)/σ]. (ln[x] - μ)/σ = 0. x = exp[μ].
Therefore, 59,874 = exp[μ]. μ = 11.
Given u = 0.7224, we wish to find Z such that Φ[Z] = 0.7224. Z = 0.59.
The simulated loss is: exp[11 + (1.683)(0.59)] = 161,615.
ln[ALAE] = 4.6 + ln[Loss]/2 = 4.6 + ln[161,615]/2 = 10.5965. ALAE = 39,995.
ALAE / Loss = 39,995 / 161,615 = 24.7%.
Comment: Loosely based on Illinois Tort Reform and the Cost of Medical Liability Claims, by
Susan J. Forray and Chad C. Karls, in the July 2010 Contingencies.
5.14. C. Simulate a random Normal. Φ(-1.53) = 0.0630, so the corresponding random Normal is:
900 + (-1.53)(150) = 671.
Φ(0.97) = 0.8340, so the corresponding random Normal is: 900 + (.97)(150) = 1046.
Φ(0.38) = 0.6480, so the corresponding random Normal is: 900 + (.38)(150) = 957.
Total of the three years is: 671 + 1046 + 957 = 2674.

5.15. (a) u = 0.90 corresponds to a random Standard Normal of 1.282.
u = 1 - 0.90 = 0.10 corresponds to a random Standard Normal of -1.282.
Thus the two outputs are: exp[μ + 1.282σ] and exp[μ - 1.282σ].
(b) With Z a random draw from a Standard Normal Distribution, the two outputs are:
exp[μ + σZ] and exp[μ - σZ].
Each of the outputs is a random draw from a LogNormal Distribution.
Thus, E[X] = E[Y] = exp[μ + σ²/2].
E[X²] = E[Y²] = exp[2μ + 2σ²].
Thus, Var[X] = Var[Y] = exp[2μ + σ²] (exp[σ²] - 1).
E[XY] = ∫ exp[μ + zσ] exp[μ - zσ] φ[z] dz = e^(2μ) ∫ φ[z] dz = e^(2μ),
where the integrals run from -∞ to ∞.
Cov[X, Y] = E[XY] - E[X]E[Y] = e^(2μ) - exp[μ + σ²/2] exp[μ + σ²/2] = e^(2μ) (1 - exp[σ²]).
Thus, Corr[X, Y] = e^(2μ) (1 - exp[σ²]) / {exp[2μ + σ²] (exp[σ²] - 1)} = -exp[-σ²].
Comment: The two outputs from the Antithetic Variate Method are negatively correlated.
As σ approaches zero, the correlation approaches -1.
The Antithetic Variate Method is on the syllabus of joint Exam MFE/3F.
The Antithetic Variate Method is on the syllabus of joint Exam MFE/3F.
5.16. C. Working in 1997 dollars, set .6915 = F(x) = Φ[{ln(x) - 7}/2]. Using the Standard Normal
Table, Φ(.5) = .6915. Therefore .5 = {ln(x) - 7}/2. Therefore x = e^8 = 2981 in 1997 dollars. Inflating
by 5%, this corresponds to a claim of size: (1.05)(2981) = 3130 in 1998.
Alternately, working in 1998 dollars, after 5% inflation one has a LogNormal Distribution with
parameters μ = 7 + ln(1.05) = 7.049 and σ = 2.
Set .6915 = F(x) = Φ[{ln(x) - 7.049}/2]. Using the Standard Normal Table, Φ(.5) = .6915. Therefore
.5 = {ln(x) - 7.049}/2. Therefore x = e^8.049 = 3131 in 1998 dollars.
5.17. C. Looking in the Normal Distribution Table, Φ(-1) is .1587.
Thus the first random Normal is: -1. Φ(1) = .8413, so the second random Normal is: 1.
The sum of the two simulated values is: (-1) + 1 = 0.

5.18. B. Φ(.1) = .5398. Thus the first random number of .5398 corresponds to a random Unit
Normal of .1. This in turn corresponds to a month with costs of:
15,000 + (.1)(2000) = 15,200. Similarly, the other months correspond to:
15,000 + Φ⁻¹(.1151)(2000) = 15,000 + (-1.2)(2000) = 12,600,
15,000 + Φ⁻¹(.0013)(2000) = 15,000 + (-3)(2000) = 9000, and
15,000 + Φ⁻¹(.7881)(2000) = 15,000 + (.8)(2000) = 16,600. After applying the 10,000 per month
deductible, the insurer pays: 5200 + 2600 + 0 + 6600 = 14,400.
Comment: Note how the simulation would have been not much more difficult if there had been
additional coverage modifications, such as an aggregate policy limit. In the absence of the deductible
per month, the sum of the losses for the four months would be normally distributed with mean of
(4)(15000) = 60,000 and standard deviation of 2000√4 = 4,000. Therefore, in the absence of a
deductible, one could use a single random number to directly simulate the aggregate losses.
5.19. A. Let Z be such that Φ(Z) = u. Then the random LogNormal is exp[5.6 + 0.75Z].

Random Number   Standard Normal   Normal   LogNormal   Limited to 400
0.6179          0.3               5.825    339         339
0.4602          -0.1              5.525    251         251
0.9452          1.6               6.800    898         400
0.0808          -1.4              4.550    95          95
0.7881          0.8               6.200    493         400
0.4207          -0.2              5.450    233         233
With the policy limit of 400, the mean is: (339 + 251 + 400 + 95 + 400 + 233)/6 = 286.
5.20. A. Assuming that large random numbers correspond to large claims, set F(x) = u.
u = Φ[(ln(x) - 5.6)/.75]. x = exp[5.6 + .75 Φ⁻¹[u]].
Φ⁻¹[.6217] = 0.31. x = exp[5.6 + (.75)(0.31)] = 341. Payment is 241.
Φ⁻¹[.9941] = 2.52. x = exp[5.6 + (.75)(2.52)] = 1790. Payment is 1690.
Φ⁻¹[.8686] = 1.12. x = exp[5.6 + (.75)(1.12)] = 626. Payment is 526.
Φ⁻¹[.0485] = -1.66. x = exp[5.6 + (.75)(-1.66)] = 78. Payment is 0.
The average payment per loss is: (241 + 1690 + 526 + 0)/4 = 614.
Comment: The question should have specified whether large random numbers correspond to large
simulated values or small simulated values. However, when using a table, such as that of the Normal
Distribution, one commonly has large random numbers correspond to large simulated values. The
average payment per payment is: (241 + 1690 + 526)/3 = 819.

5.21. A. Φ[-0.56] = 0.2877. Thus the first simulated Standard Normal is -0.56.
The first simulated paid loss is: exp[13.294 + (0.494)(-0.56)] = 450,161.
The second simulated paid loss is: exp[13.294 + (0.494)(-1.17)] = 333,041.
The third simulated paid loss is: exp[13.294 + (0.494)(0.93)] = 939,798.
The fourth simulated paid loss is: exp[13.294 + (0.494)(0.30)] = 688,451.
Average paid loss: (450,161 + 333,041 + 939,798 + 688,451)/4 = 602,863.
x = 2006 - contract purchase year = 2006 - 2005 = 1.
y = 0.801 × 1^0.851 × e^(-0.747) = 0.3795.
Estimated average unpaid losses: (0.3795)(602,863) = 228,863.
Comment: The part of this exam question related to unpaid losses is an unnecessary complication.

Section 6, Simulating Brownian Motion38


In order to simulate the outcomes of Brownian Motion, one needs to simulate random draws from
Normal Distributions. Questions about simulating Brownian Motion can be rephrased in terms of
simulating Normal Distributions.
Arithmetic Brownian Motion:
An Arithmetic Brownian Motion, X(t), is Normal with mean X(0) + μt, and variance σ²t.
Arithmetic Brownian Motion is a stochastic process, with the following properties:
1. X(t + s) - X(t) is Normally Distributed with mean μs, and variance σ²s.
2. The increments for disjoint time intervals are independent.
3. X(t) is continuous.
One can simulate Arithmetic Brownian Motion, by simulating the increments as
independent Normals.
Exercise: Use the Method of Inversion in order to get random draws from a Standard Unit Normal.
Use the following random numbers from [0, 1]: 0.4207, 0.7881, 0.0107, 0.9332, 0.3085.
[Solution: Φ(-0.2) = 0.4207, and therefore the first random Standard Normal is: -0.2.
Similarly, the remaining random Standard Normals are: 0.8, -2.3, 1.5, -0.5.
Comment: The random numbers from [0, 1] were chosen for illustrative purposes so that it would
be easy to look up the corresponding Standard Normals on the Normal Distribution Table provided
with the exam.]
Let X(t) be the position of a particle at time t. Assume X(t) is an Arithmetic Brownian Motion with
μ = 0 and σ = 3. Assume X(0) = 0.
Then X(1) - X(0) is Normally distributed with μ = 0 and σ = 3, X(2) - X(1) is Normally distributed with
μ = 0 and σ = 3, X(3) - X(2) is Normally distributed with μ = 0 and σ = 3, etc. X(1) - X(0) is
independent of X(2) - X(1), X(2) - X(1) is independent of X(3) - X(2), etc.
Exercise: Use the Method of Inversion in order to get a random draw from a Normal Distribution with
μ = 0 and σ = 3. Use the following random number from [0, 1]: 0.4207.
[Solution: The Standard Unit Normal is -0.2. The corresponding Normal is: (-0.2)(3) = -0.6.]
38 Brownian Motion is covered in Derivatives Markets by McDonald, Introduction to Probability Models by Ross, or Loss Models.

Thus the simulated value of X(1) is -0.6. Now let us assume we want to also simulate X(2).
X(2) - X(1) is Normally distributed with = 0 and = 3.
Using the random number 0.7881, the simulated Normal is: (0.8)(3) = 2.4.
The simulated value of X(2) = X(1) + 2.4 = -0.6 + 2.4 = 1.8.
Using a sequence of random numbers, one could simulate X(t) in this sequential manner:
Time   Random Number   Unit Normal   Normal (μ = 0, σ = 3)   X(t)
0                                                            0
1      0.4207          -0.2          -0.6                    -0.6
2      0.7881          0.8           2.4                     1.8
3      0.0107          -2.3          -6.9                    -5.1
4      0.9332          1.5           4.5                     -0.6
5      0.3085          -0.5          -1.5                    -2.1
For example, X(3) = X(2) + (-2.3)(3) = 1.8 - 6.9 = -5.1.
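Here is a minimal sketch of this sequential simulation in Python, assuming the scipy library is available; each uniform random number produces one Normal increment and X(t) is the running sum, and the helper name is illustrative.

from scipy.stats import norm

def simulate_abm(uniforms, x0=0.0, mu=0.0, sigma=3.0):
    # Simulate an Arithmetic Brownian Motion at times 1, 2, ..., len(uniforms).
    path = [x0]
    for u in uniforms:
        z = norm.ppf(u)                          # random Standard Unit Normal
        path.append(path[-1] + mu + sigma * z)   # add the increment over one unit of time
    return path

print(simulate_abm([0.4207, 0.7881, 0.0107, 0.9332, 0.3085]))
# [0, -0.6, 1.8, -5.1, -0.6, -2.1], to rounding, matching the table above.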


In order to simulate X(t) at: 1, 2, 3, ..., 100, we would simulate 100 independent Normal Distributions
each with standard deviation 3. We would take a running sum of these simulated Normals, in order
to get the simulated Arithmetic Brownian Motion. For example X(44) would be the sum of the first
44 random Normals. Here is an example of such a simulation, for t = 1, 2, 3, ..., 100:

[Figure: one simulated path of this Arithmetic Brownian Motion, plotted for t = 1 to 100.]
Here is another simulation of the same process:

[Figure: another simulated path of the same process, plotted for t = 1 to 100.]

Each time one simulates such a stochastic process, one gets a different result.
Exercise: Let X(t) be the value of a currency at time t. Assume X(0) = 50.
X(t) follows an Arithmetic Brownian Motion with μ = 0.7 and σ = 2.
Sequentially simulate X(1) and X(5).
Use the following random numbers from (0, 1): 0.6808, 0.1788.
[Solution: X(1) - X(0) is Normally distributed with μ = 0.7 and σ = 2.
X(5) - X(1) is Normally distributed with μ = (0.7)(5 - 1) = 2.8, and σ = 2√(5 - 1) = 4.
X(1) - X(0) is independent of X(5) - X(1).
Time   Random Number   Standard Normal   Increment   X(t)
0                                                    50
1      0.6808          0.47              1.64        51.64
5      0.1788          -0.92             -0.88       50.76

The random number 0.6808 corresponds to a Standard Unit Normal of 0.47.


The corresponding Normal with mean .7 and standard deviation 2 is: (0.47)(2) + 0.7 = 1.64.
X(1) = 50 + 1.64 = 51.64.
The random number 0.1788 corresponds to a Standard Unit Normal of -0.92.
The corresponding Normal with mean 2.8 and standard deviation 4 is: (-0.92)(4) + 2.8 = -0.88.
X(5) = X(1) - 0.88 = 50.76.]

Standard Brownian Motion:


In Derivative Markets, not on the syllabus of this exam, McDonald refers to an Arithmetic Brownian
Motion with μ = 0 and σ = 1 as a Brownian Motion, Z(t). Many other textbooks refer to this as a
Standard Brownian Motion. It is assumed that Z(0) = 0.
Geometric Brownian Motion:
If ln(X(t)) is an Arithmetic Brownian Motion, then X(t) is a Geometric Brownian Motion.
For a Geometric Brownian Motion, X(t)/X(0) is LogNormal with parameters μt and σ√t.
Geometric Brownian Motion is a stochastic process, with the following properties:
1. X(t + s) / X(t) is LogNormally Distributed with parameters μs and σ√s.
2. The ratios for disjoint time intervals are independent.
3. X(t) is continuous.
Standard Normal  ↔  Standard Brownian Motion
Normal Distribution  ↔  Arithmetic Brownian Motion
LogNormal Distribution  ↔  Geometric Brownian Motion

Simulating Geometric Brownian Motion:39


One can simulate Geometric Brownian Motion, by simulating successive ratios as
independent LogNormals, or by simulating an Arithmetic Brownian Motion and then
exponentiating.
Let X(t) be the price of a stock at time t, where time is measured in months.
X(0) = $100. Assume X(t) follows a Geometric Brownian Motion with μ = 0.003 and σ = 0.06.
Then X(1)/X(0) is LogNormally distributed with μ = 0.003 and σ = 0.06, X(2)/X(1) is LogNormally
distributed with μ = 0.003 and σ = 0.06, X(3)/X(2) is LogNormally distributed with μ = 0.003 and
σ = 0.06, etc.
X(1)/X(0) is independent of X(2)/X(1), X(2)/X(1) is independent of X(3)/X(2), etc.
X(1)/X(0) is LogNormally distributed with μ = 0.003 and σ = 0.06.
ln(X(1)) - ln(X(0)) is Normally distributed with μ = 0.003 and σ = 0.06.
39 See Exercise 21.8 in Loss Models; 3, 5/01, Q.8 rewritten.
Simulating stock prices is discussed in Mahler's Guide to Two Topics in Financial Economics.
Thus one can simulate ln(X(t)) as previously, and then exponentiate to get X(t).
Using a sequence of random numbers, one could simulate ln[X(t)] in a sequential manner:
Time (months)   Random Number   Unit Normal   Normal (μ = 0.003, σ = 0.06)   ln(X(t))   X(t)
0                                                                            4.605      $100.00
1               0.3085          -0.5          -0.027                         4.578      $97.34
2               0.7881          0.8           0.051                          4.629      $102.43
3               0.5793          0.2           0.015                          4.644      $103.98
4               0.9192          1.4           0.087                          4.731      $113.43
5               0.2420          -0.7          -0.039                         4.692      $109.09

In a similar manner one could simulate many months of the stock price.
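Here is a minimal sketch of this sequential simulation in Python, assuming the scipy library is available; one simulates ln(X(t)) as an Arithmetic Brownian Motion and then exponentiates, and the helper name is illustrative.

import math
from scipy.stats import norm

def simulate_gbm(uniforms, s0=100.0, mu=0.003, sigma=0.06):
    log_price = math.log(s0)
    prices = [s0]
    for u in uniforms:
        log_price += mu + sigma * norm.ppf(u)   # Normal increment of ln(X(t))
        prices.append(math.exp(log_price))      # exponentiate to get the price
    return prices

print(simulate_gbm([0.3085, 0.7881, 0.5793, 0.9192, 0.2420]))
# [100.00, 97.34, 102.43, 103.98, 113.43, 109.09], to rounding, matching the table above.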
Here is an example of a simulation of the stock price over 10 years (120 months):

[Figure: a simulated stock price path over 120 months, starting at 100; vertical axis is Price, horizontal axis is Months.]

It is important to note that the movements of the simulated price from month to month are random
and independent.40 If one simulated the same process again, one would get a different result.
40 Nevertheless, the expected stock price at time t (in months) is: 100 exp(μt + σ²t/2) = 100 {exp(μ + σ²/2)}^t =
100 {exp(0.003 + 0.06²/2)}^t = 100 (1.00481^t). After 10 years, the expected price is: 100 (1.00481^120) = 178. In this
particular simulation run, the price at 10 years turned out to be less than expected. In different simulation runs the
price at 10 years will vary; Geometric Brownian Motion is a Stochastic Process.
Problems:
Use the following information for the next 2 questions:
An Arithmetic Brownian Motion is zero at time 0 and has μ = 0 and σ = 7.
6.1 (1 point) Simulate the Arithmetic Brownian Motion at time 5.
Use the following random number from [0, 1]: 0.2119.
A. less than -20
B. at least -20 but less than -15
C. at least -15 but less than -10
D. at least -10 but less than -5
E. at least -5
6.2 (1 point) Given the solution to the previous question, simulate the Arithmetic Brownian Motion at
time 15.
Use the following random number from [0, 1]: 0.6554.
A. less than -5
B. at least -5 but less than -2
C. at least -2 but less than 2
D. at least 2 but less than 5
E. at least 5

6.3 (3 points) You are to simulate an Arithmetic Brownian Motion that is zero at time zero and with
σ = 0.22 and μ = 0.06,
at times 1, 5, 10, and 25.
You are given the following independent random draws from a Standard Unit Normal Distribution:
-1.1160, 0.2761, 1.4980, 0.6966, -0.5783, -0.8267, 0.9787, 0.4923.
What is the simulated value of this Brownian Motion at time = 25?
A. 2.7
B. 2.8
C. 2.9
D. 3.0
E. 3.1

Use the following information for the next 3 questions:


For a simulation of the movement of a stocks price, X(t):

X(t+1)/X(t) is LogNormally distributed with μ = 0.08 and σ = 0.36.


X(t+1)/X(t) is independent of X(t+2)/X(t+1).
The simulation projects the stock price in steps of time 1.
Simulated price movements are determined using the inversion method.
The price at t = 0 is 80.
6.4 (2 points) Simulate the price of the stock at time 1.
Use the following random number from [0, 1]: 0.9641.
A. less than 160
B. at least 160 but less than 170
C. at least 170 but less than 180
D. at least 180 but less than 190
E. at least 190
6.5 (2 points) Given the solution to the previous question, simulate the stock price at time 2.
Use the following random number from [0, 1]: 0.0139.
A. less than 60
B. at least 60 but less than 70
C. at least 70 but less than 80
D. at least 80 but less than 90
E. at least 90
6.6 (2 points) Given the solution to the previous question, simulate the stock price at time 3.
Use the following random number from [0, 1]: 0.6179.
A. less than 100
B. at least 100 but less than 110
C. at least 110 but less than 120
D. at least 120 but less than 130
E. at least 130

6.7 (3 points) You are to simulate a Standard Brownian Motion


(starts at 0 at time 0, with σ = 1 and μ = 0)
successively at times: 1, 2, 3, 4, and 5.
You are given the following independent random numbers from (0 ,1):
0.0351, 0.3520, 0.8749, 0.1112, 0.6844.
What is the simulated value of this Brownian Motion at time 5?
(A) -1.8
(B) -1.6
(C) -1.4
(D) -1.2
(E) -1.0
6.8 (1 point) Using the results of the previous question, simulate a Geometric Brownian Motion with
μ = 0.2 and σ = 0.1. Assume the Geometric Brownian Motion is one at time zero.
What is the simulated value of this Geometric Brownian Motion at time 5?
A. less than 1
B. at least 1 but less than 2
C. at least 2 but less than 3
D. at least 3 but less than 4
E. at least 4

6.9 (3, 5/01, Q.8) (2.5 points) For a simulation of the movement of a stocks price:
(i) The price follows geometric Brownian motion, with drift coefficient μ = 0.01 and
variance parameter σ² = 0.0004.
(ii) The simulation projects the stock price in steps of time 1.
(iii) Simulated price movements are determined using the inverse transform method.
(iv) The price at t = 0 is 100.
(v) The random numbers, from the uniform distribution on [0,1], for the first 2 steps
are 0.1587 and 0.9332, respectively.
(vi) F is the price at t = 1; G is the price at t = 2.
Calculate G - F.
(A) 1
(B) 2
(C) 3
(D) 4
(E) 5
6.10 (3 points) Continuing the simulation in the previous question, the random numbers, from the
uniform distribution on [0,1], for the next 2 steps are 0.6554 and 0.0359, respectively.
Calculate the difference between the simulated price at t = 4 and the simulated price at t = 3.
A. less than -1
B. at least -1 but less than 0
C. at least 0 but less than 1
D. at least 1 but less than 2
E. at least 2

Solutions to Problems:
6.1. C. Φ(-0.8) = 0.2119. The Arithmetic Brownian Motion at time 5 is Normal with mean zero and
standard deviation: 7√5. Therefore X(5) = (-0.8)(7)√5 = -12.52.
6.2. B. Φ(0.4) = 0.6554. Therefore X(15) = X(5) + (0.4)(7)√(15 - 5) = -12.52 + 8.85 = -3.67.
6.3. A. One can simulate the corresponding Brownian Motion without drift, X(t), and then add μt at
the end in order to get the Brownian Motion with drift. X(1) = (-1.1160)(.22)√1 = -.2455.
X(5) = X(1) + (.2761)(.22)√(5 - 1) = -.1240. X(10) = X(5) + (1.498)(.22)√(10 - 5) = .6129.
X(25) = X(10) + (0.6966)(.22)√(25 - 10) = 1.2064.
Thus, the simulated value of the Brownian Motion with drift at time = 25 is:
1.2064 + (25)(.06) = 2.7064.
Comment: At t = 10, the simulated value of the Brownian Motion with drift is:
.6129 + (10)(.06) = 1.2129.
6.4. B. Φ(1.8) = .9641. A simulated Normal with μ = .08 and σ = 0.36 is: .08 + (1.8)(.36) = .728.
The corresponding LogNormal is: e^.728 = 2.0709.
Therefore the price of the stock at time 1 is: (80)(2.0709) = 165.67.
6.5. D. Φ(-2.2) = .0139. A simulated Normal with μ = .08 and σ = 0.36 is: .08 + (-2.2)(.36) =
-0.712. The corresponding LogNormal is: e^-.712 = .49066.
Therefore X(2) = X(1)(.49066) = (165.67)(.49066) = 81.29.
6.6. A. Φ(0.3) = .6179. A simulated Normal with μ = .08 and σ = 0.36 is: .08 + (.3)(.36) = .188.
The corresponding LogNormal is: e^.188 = 1.20683.
Therefore X(3) = X(2)(1.20683) = (81.29)(1.20683) = 98.10.
Comment: A Geometric Brownian Motion.

6.7. A. A Standard Brownian Motion has σ = 1 and μ = 0. X(1), X(2) - X(1),
X(3) - X(2), X(4) - X(3), and X(5) - X(4), are independent Normals each with mean 0 and standard
deviation 1. Φ(-1.81) = .0351. Φ(-.38) = .3520. Φ(1.15) = .8749. Φ(-1.22) = .1112.
Φ(0.48) = .6844. Therefore, X(1) = -1.81. X(2) = X(1) - .38 = -2.19.
X(3) = X(2) + 1.15 = -1.04. X(4) = X(3) - 1.22 = -2.26. X(5) = X(4) + 0.48 = -1.78.
6.8. C. From the previous solution, Standard Brownian Motion: -1.78 @5.
Brownian Motion without drift: -1.78σ = (-1.78)(.1) = -.178 @5.
Brownian Motion with drift: -.178 + 5μ = -.178 + (5)(.2) = .822 @5.
Geometric Brownian Motion with drift: e^.822 = 2.275 @5.
6.9. D. μ = 0.01 and σ = 0.02. First simulate the Brownian Motion without drift, by simulating a
Normal variable. Then add the drift. Then exponentiate to get the Geometric Brownian Motion.
Then multiply by 100 in order to get the stock prices.
The Brownian Motion without drift at time 1 is Normally Distributed with mean zero, and standard
deviation: σ√1 = 0.02√1 = 0.02.
Φ(-1) = .1587. Therefore, the corresponding simulated Normal is: (-1)(.02) = -.02.
Given the Brownian Motion without drift at time = 1 is -.02, the Brownian Motion without drift at
time = 2 is Normally Distributed with mean -.02 and standard deviation: .02√(2 - 1) = .02.
Φ(1.5) = .9332. The corresponding simulated Normal is: -.02 + (1.5)(.02) = .01.
Brownian Motion without drift: 0 @0, -.02 @1, .01 @2.
Brownian Motion with drift: 0 @0, -.02 + μ = -.01 @1, .01 + 2μ = .03 @2.
Geometric Brownian Motion with drift: e^0 = 1 @0, e^-.01 = .9900 @1, e^.03 = 1.0305 @2.
Simulated Stock Price: 100 @0, 99.00 @1, 103.05 @2.
The difference in the two stock prices is: 103.05 - 99.00 = 4.05.
Comment: Professor Klugman rewrote this exam question as Exercise 21.8 in Loss Models.
Note that the price at time 2 depends on the price at time 1. One must first simulate what happens
at time 1 and then what happens between times 1 and 2.
One could get the price @2 by noting that the difference of the log stock price at 2 and the log stock
price at 1 is the difference in the Brownian Motion with drift:
.03 - (-.01) = .04. The ratio of stock price @2 to stock price @1 is: e^.04 = 1.0408.
Stock Price @2: (99.00)(1.0408) = 103.04, the same answer subject to rounding.

6.10. A. μ = .01 and σ = .02. From the previous solution, the Brownian Motion without drift at
time = 2 is .01. Therefore, the Brownian Motion without drift at time = 3 is Normally Distributed with
mean .01 and standard deviation: .02√(3 - 2) = .02.
Φ(0.4) = .6554. The corresponding simulated Normal is: .01 + (0.4)(.02) = .018.
Given the Brownian Motion without drift at time = 3 is .018, the Brownian Motion without drift at
time = 4 is Normally Distributed with mean .018 and standard deviation: .02√(4 - 3) = .02.
Φ(-1.8) = .0359. The corresponding simulated Normal is: .018 + (-1.8)(.02) = -.018.
Brownian Motion without drift: .018 @3, -.018 @4.
Brownian Motion with drift: .018 + 3μ = .048 @3, -.018 + 4μ = .022 @4.
Geometric Brownian Motion with drift: e^.048 = 1.0492 @3, e^.022 = 1.0222 @4.
Simulated Stock Price: 104.92 @3, 102.22 @4.
The difference in the two stock prices is: 102.22 - 104.92 = -2.70.

Section 7, Simulating Lifetimes


One can also apply the inversion method to a life table, in order to simulate times of death and the
present values of benefits paid for life insurances and annuities.41
For life contingencies the following are all the same:
small random numbers ↔ early deaths.
large random numbers ↔ large lifetimes.
Setting u = F(x).
Setting 1 - u = S(x) = (number still alive) / (number originally alive).
number still alive = (1 - u)(number alive at starting age).


Exercise: One is simulating future lifetimes for a person age 70 using the illustrative Life Table from
Actuarial Mathematics,42 where small random numbers correspond to early deaths.
For a random number of 0.10, when does this simulated person die?
[Solution: l70 = 6,616,155. (1-u)l70 = (0.9)(6,616,155) = 5,954,540.
l72 = 6,164,663 > 5,954,540 > 5,920,394 = l73.
Therefore, the person dies between age 72 and age 73.
Comment: Note that a small random number did result in an early simulated death.]
Unless specifically stated otherwise, on your exam set F(x) = u.
Simulation of lifetimes can also be done the other way around, by setting u = S(x). Check with a
small random number, such as u = 0.01, whether your result matches the statement in the question.
Exercise: One is simulating future lifetimes for a person age 70 using the illustrative Life Table, where
small random numbers correspond to late deaths.
For a random number of 0.10, when does this simulated person die?
[Solution: l70 = 6,616,155. u l70 = (0.10)(6,616,155) = 661,616.
l92 = 682,707 > 661,616 > 530,959 = l93.
Therefore, the person dies between age 92 and age 93.
Comment: Note that a small random number did result in a late simulated death.]
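Here is a minimal sketch of this table lookup in Python; l_x is a dictionary of life table values, the helper name is illustrative, and only a fragment of the Illustrative Life Table is shown.

l_x = {70: 6616155, 71: 6396609, 72: 6164663, 73: 5920394}

def simulate_age_at_death(u, start_age, l_x, small_u_means_early_death=True):
    # Return the integer age y such that the simulated death occurs between y and y + 1.
    survivors = (1 - u) * l_x[start_age] if small_u_means_early_death else u * l_x[start_age]
    age = start_age
    while (age + 1) in l_x and l_x[age + 1] > survivors:
        age += 1
    return age

print(simulate_age_at_death(0.10, 70, l_x))  # 72: death between ages 72 and 73, as in the first exercise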

41 I expect a lower average frequency of questions on simulating lifetimes, than when simulation was on the same
exam as life contingencies on the old Exam 3.
42 This table is not attached to your exam. An excerpt from this table is given with the problems for this section.
Annuity Example: 43
Fred age (65) and his wife Ethel age (60), two independent lives, purchase a special annual whole
life annuity-due. The benefit is 30,000 while both are alive and 20,000 while only one is alive.
Mortality follows the Illustrative Life Table.44 i = 0.06.
Here is how we would simulate the actuarial present value of this annuity. First one needs two
random numbers from (0, 1), for example: 0.668 and 0.222.
Exercise: Simulate the death of Fred using the inversion method, where small random numbers
correspond to early deaths, using the random number 0.668.
[Solution: (1-u)l65 = (1 - 0.668) (7,533,964) = 2,501,276.
l84 = 2,660,734 > 2,501,276 > 2,358,246 = l85. Fred dies between age 84 and 85.]
Exercise: Simulate the death of Ethel using the inversion method, where small random numbers
correspond to early deaths, using the random number 0.222.
[Solution: (1-u)l60 = (1 - 0.222) (8,188,074) = 6,370,322.
l71 = 6,396,609 > 6,370,322 > 6,164,663 = l72. Ethel dies between age 71 and 72.]
Fred lives 84 - 65 = 19 complete years and Ethel lives 71 - 60 = 11 complete years.
time:       0   ...   11   12   ...   19   20
payment:   30   ...   30   20   ...   20    0
There are 12 payments of 30,000, followed by 8 payments of 20,000.
This is a sum of two certain annuity-dues, one with 20 payments of 20,000 and another with 12
payments of 10,000. The interest rate is 6%, so v = 1/1.06.
The present value is: (20000)(1 - v^20) / (1 - v) + (10000)(1 - v^12) / (1 - v) =
(20000)(12.1581) + (10000)(8.8869) = 332,031.
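As a check on the arithmetic, here is a minimal sketch of this present value calculation in Python, with i = 0.06; the helper name is illustrative.

v = 1 / 1.06

def annuity_due_pv(payment, n, v):
    # Present value of an annuity-due of `payment` per year for n years.
    return payment * (1 - v**n) / (1 - v)

# 12 payments of 30,000 then 8 of 20,000 = a 20-year annuity-due of 20,000 plus a 12-year annuity-due of 10,000.
print(round(annuity_due_pv(20000, 20, v) + annuity_due_pv(10000, 12, v)))  # about 332,031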
I ran a simulation of this situation 10,000 times. The minimum present value was 30,000 and the
maximum was 476,349. The present value had mean of 337,375 and sample standard deviation of
70,351.

43 I would be extremely surprised to see this type of thing on your exam.
44 An excerpt from this table is given with the problems for this section.
Here is a graph of the survival function of the present value:

[Figure: the survival function of the present value, plotted from 0 to 500,000.]

For example, Prob[PV > 300,000] = 73.75% and Prob[PV > 400,000] = 19.16%.
De Moivre's Law:45
For De Moivre's Law, the survival function is uniform: S(t) = 1 - t/ω, 0 < t < ω.
For a life aged x, the age of death is uniform from x to ω.
Therefore, one can simulate the age of death as: x + u(ω - x).46
Exercise: John is age 25. Mortality follows De Moivre's law with ω = 80.
Simulate Johns age at death, using the inversion method, where small random numbers correspond
to early deaths, and using the random number 0.615.
[Solution: Age of death = 25 + (0.615)(80-25) = 58.8.]

45 See page 78 of Actuarial Mathematics.
46 As discussed in a previous section, in order to simulate a random number uniformly distributed on [a, b], one
takes: a + (b-a)u, where u is a random number from [0, 1].
Gompertz's Law:47
For Gompertz's Law, the survival function is: S(t) = exp[-m(c^t - 1)]. For a life age x, one can simulate
the age at death, y, via the inversion method. If small random numbers correspond to early deaths,
then we set: 1 - u = l_y / l_x.
1 - u = exp[-m(c^y - 1)] / exp[-m(c^x - 1)] = exp[-m(c^y - c^x)].
ln(1 - u) = -m(c^y - c^x).
c^y = c^x - ln(1 - u)/m.
y = ln[c^x - ln(1 - u)/m] / ln(c).
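Here is a minimal sketch of this inversion in Python, where small random numbers correspond to early deaths; m and c are the Gompertz parameters and the helper name is illustrative.

import math

def gompertz_age_at_death(u, x, m, c):
    # Solve 1 - u = S(y)/S(x) for y, where S(t) = exp[-m(c**t - 1)].
    return math.log(c**x - math.log(1 - u) / m) / math.log(c)

print(gompertz_age_at_death(0.18, 50, 0.0008, 1.09))  # about 67.0, matching the exercise below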
Exercise: Mortality follows Gompertz's Law with m = 0.0008 and c = 1.09.
For a life aged 50, use the random number 0.18 to simulate the age of death using the inversion
method, where small random numbers correspond to early deaths.
[Solution: y = ln[c^x - ln(1 - u)/m] / ln(c) = ln[1.09^50 - ln(1 - 0.18)/0.0008] / ln(1.09) = 67.0.
Check: S(67.0)/S(50) = exp[-0.0008(1.09^67 - 1)] / exp[-0.0008(1.09^50 - 1)] = 0.77365/0.94300 =
0.820 = 1 - 0.18.
Comment: If u = 0.01, then y = ln(1.09^50 - ln(1 - 0.01)/0.0008)/ln(1.09) = 51.8.
A small random number does correspond to an early death.]
Makeham's Law, S(t) = exp[-At - m(c^t - 1)], cannot be algebraically inverted as Gompertz's Law
can.

47 See page 78 of Actuarial Mathematics.
Problems:
In answering the questions in this section, you may use the following values from the Illustrative Life
Table in Actuarial Mathematics:
x:   0           5          10         15         20         21         22         23
lx:  10,000,000  9,749,503  9,705,588  9,663,731  9,617,802  9,607,896  9,597,695  9,587,169

x:   24         25         26         27         28         29         30         31
lx:  9,576,288  9,565,017  9,553,319  9,541,153  9,528,475  9,515,235  9,501,381  9,486,854

x:   32         33         34         35         36         37         38         39
lx:  9,471,591  9,455,522  9,438,571  9,420,657  9,401,688  9,381,566  9,360,184  9,337,427

x:   40         41         42         43         44         45         46         47
lx:  9,313,166  9,287,264  9,259,571  9,229,925  9,198,149  9,164,051  9,127,426  9,088,049

x:   48         49         50         51         52         53         54         55
lx:  9,045,679  9,000,057  8,950,901  8,897,913  8,840,770  8,779,128  8,712,621  8,640,861

x:   56         57         58         59         60         61         62         63
lx:  8,563,435  8,479,908  8,389,826  8,292,713  8,188,074  8,075,403  7,954,179  7,823,879

x:   64         65         66         67         68         69         70         71
lx:  7,683,979  7,533,964  7,373,338  7,201,635  7,018,432  6,823,367  6,616,155  6,396,609

x:   72         73         74         75         76         77         78         79
lx:  6,164,663  5,920,394  5,664,051  5,396,081  5,117,152  4,828,182  4,530,360  4,225,163

x:   80         81         82         83         84         85         86         87
lx:  3,914,365  3,600,038  3,284,542  2,970,496  2,660,734  2,358,246  2,066,090  1,787,299

x:   88         89         90         91        92       93       94       95       96
lx:  1,524,758  1,281,083  1,058,491  858,676   682,707  530,959  403,072  297,981  213,977

x:   97       98      99      100     101     102     103    104    105    106  107  108  109  110
lx:  148,832  99,965  64,617  40,049  23,705  13,339  7,101  3,558  1,668  727  292  108  36   11

7.1 (1 point) Mortality follows the Illustrative Life Table in Actuarial Mathematics.
For a life aged 62, simulate the age of death using the inversion method, where small random
numbers correspond to early deaths.
Using the random number 0.86, what is the age at death?
(A) 75
(B) 80
(C) 85
(D) 90
(E) 95
7.2 (1 point) Mortality follows De Moivre's law with ω = 100.
Simulate the age at death of a life aged 40, with small random numbers corresponding to early
deaths and using the random number 0.316.
What is the simulated age at death?
A. Less than 60
B. At least 60, but less than 65
C. At least 65, but less than 70
D. At least 70, but less than 75
E. At least 75
7.3 (3 points) Mortality follows Gompertz's Law with m = 0.0007 and c = 1.1.
For a life aged 55, simulate the age of death using the inversion method, where small random
numbers correspond to early deaths.
Using the random number 0.23, what is the age at death?
Hint: For Gompertz's Law, S(x) = exp[-m(c^x - 1)].
A. Less than 68
B. At least 68, but less than 70
C. At least 70, but less than 72
D. At least 72, but less than 74
E. At least 74

7.4 (1 point) Lifetimes follow a Weibull Distribution with τ = 6 and θ = 80.


Using the inversion method (where small random numbers correspond to early deaths), you
simulate the future lifetime of Samson who is age 60.
A random number from the uniform distribution on [0,1] is 0.2368.
What is the simulated future lifetime for Samson?
A. 7
B. 8
C. 9
D. 10
E. 11
7.5 (2 points) Mortality follows De Moivre's law with ω = 90.
Rob is aged 65. Laura is aged 60.
Simulate their future lifetimes, with small random numbers corresponding to early deaths.
Use the random number 0.561 for Rob and the rand
