Chapter 1  1
UECM3413 SURVIVAL MODELS
Contents:
Chapter 1. Mathematics of Survival Analysis
Chapter 2. The Life Table
Chapter 3. Tabular Survival Models Estimated from Complete Data Samples
Chapter 4. Tabular Survival Models Estimated from Incomplete Data Samples: Study Design
Chapter 5. Survival Models Estimated from Incomplete Data Samples: Method of Moments Procedure
Chapter 6. Survival Models Estimated from Incomplete Data Samples: Method of Maximum Likelihood Procedure
Chapter 7. Estimation of Parametric Survival Models
Reference Books:
1. London, D. (1997). Survival Models and Their Estimation (3rd ed.). Winsted, Conn.: ACTEX Publications.
2. Elandt-Johnson, R. C., & Johnson, N. L. (1999). Survival Models and Data Analysis. New York: John Wiley.
Additional Text:
3. Klugman, S. A., Panjer, H. H., & Willmot, G. E. (2004). Loss Models: From Data to Decisions (2nd ed.). Hoboken, N.J.: Wiley-Interscience.
4. Lee, E. T., & Wang, J. W. (2003). Statistical Methods for Survival Data Analysis (3rd ed.). New York: John Wiley.
Lecturer: Dr Wong Wai Kuan
Email : wongwkuan@gmail.com
Chapter 1
Mathematics of Survival Analysis
1.1 Introduction to a Survival Model
A survival model is a probability distribution for a special kind of random variable. The random variable, $T$, is defined to be the time of failure of an entity known to exist at time $t = 0$, and is therefore frequently called the failure time random variable.
If $T$ is the time of failure of an entity which exists at $t = 0$, then $T$ is also the future lifetime of this entity measured from $t = 0$. Note that we are not concerned with the age of the entity at time $t = 0$.
Let the random variable $X$ be the age of the entity. For example, an entity could have $x = 25$ when $t = 0$, or $x = 35$ when $t = 0$. If we are concerned with the health of a human being, these two starting ages might show different probabilities of surviving.
1.1.1 Survival Distribution Function (SDF)
If $T$ is the time of failure, the probability of still functioning at time $t$ is the same as the probability that the failure time is later than time $t$, i.e.
\[ S(t) = P(T > t) = {}_t p_0 . \]
By the nature of $T$, $T \ge 0$, $S(0) = 1$ and $S(t)$ is a nonincreasing function. We will assume
\[ \lim_{t \to \infty} S(t) = 0 , \]
or in short $S(\infty) = 0$.
This function of random variable T is called the Survival Distribution Function (SDF).
If the initial event defining time $t = 0$ is the actual birth, i.e. $x = 0$ when $t = 0$, then the attained age and the elapsed time run exactly together. In this case, one could use either $x$ or $t$. Thus, the survival probabilities are given by $S(x)$, $x \ge 0$, with $S(0) = 1$ and $S(x) \to 0$ as $x \to \infty$. In this case, the random variable $X$ is the age at death (failure) or the future lifetime random variable, just as $T$ is the time of death (failure) or the future lifetime random variable.
1.1.2 Cumulative Distribution Function (CDF)
The Cumulative Distribution Function (CDF) of $T$ is $F(t)$,
\[ F(t) = P(T \le t) = {}_t q_0 . \]
Here, $F(t)$ gives the probability that failure (death) will occur not later than time $t$.
Note (i) $F(t) = 1 - S(t)$, $F(0) = 0$ and $F(\infty) = 1$.
(ii) $P(a < T \le b) = F(b) - F(a) = S(a) - S(b)$.
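The relations in Notes (i) and (ii) can be checked numerically. The following is only a minimal sketch: it assumes an exponential failure time, and the rate 0.1, the interval endpoints and the function names are hypothetical choices made for illustration, not values taken from these notes.

```python
import math

def survival_exponential(t, rate):
    """S(t) = P(T > t) for an exponential failure time with the given rate."""
    return math.exp(-rate * t)

def cdf_exponential(t, rate):
    """F(t) = P(T <= t) = 1 - S(t)."""
    return 1.0 - survival_exponential(t, rate)

rate = 0.1        # hypothetical rate parameter
a, b = 2.0, 5.0   # hypothetical interval endpoints

# Note (ii): P(a < T <= b) computed two ways.
prob_via_cdf = cdf_exponential(b, rate) - cdf_exponential(a, rate)
prob_via_sdf = survival_exponential(a, rate) - survival_exponential(b, rate)
print(prob_via_cdf, prob_via_sdf)
```

Both printed values should agree up to floating-point rounding, since $F(b) - F(a)$ and $S(a) - S(b)$ are the same quantity.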
Example 1.1
(a) Let time at failure $T$ be uniform on $[0, 100]$, find ${}_t p_0$ and ${}_t q_0$.
(b) Let time at failure $T$ be exponential with mean 20, find ${}_t p_0$, ${}_t q_0$ and $P(1 < T \le 2)$.
1.1.3 Probability Density Function (PDF)
For a continuous random variable, the Probability Density Function (PDF) is given as
\[ f(t) = \frac{d}{dt} F(t) = -\frac{d}{dt} S(t), \quad t \ge 0 . \]
It is the density of failure at time $t$ and is an instantaneous measure. It is the unconditional density of failure at time $t$, that is, the density of failure at time $t$ given only that the entity existed at time $t = 0$.
1.1.4 Hazard Rate Function (HRF)
On the other hand, the conditional density of failure at time $t$, given survival to time $t$, is called the hazard rate at time $t$ or the Hazard Rate Function (HRF). The hazard rate is denoted as $\lambda(t)$. Since it is also called the force of mortality in the actuarial context, it can also be denoted as $\mu(x)$.
Notes
(1) (Conditional density of failure at time $t$, given survival to time $t$) $\times$ (Probability of survival to time $t$) = (Unconditional density of failure at time $t$), i.e.
\[ \lambda(t)\, S(t) = f(t) \quad \text{or} \quad \lambda(t) = \frac{f(t)}{S(t)} . \]
(2) Since $f(t) = -\frac{d}{dt} S(t)$,
\[ \lambda(t) = \frac{-\frac{d}{dt} S(t)}{S(t)} = -\frac{d}{dt} \ln S(t), \quad \text{or} \quad \mu(x) = \frac{-\frac{d}{dx} S(x)}{S(x)} = -\frac{d}{dx} \ln S(x) . \]
Integrating this, we have
\[ \int_0^t \lambda(y)\, dy = -\ln S(t) \quad \text{or} \quad S(t) = e^{-\int_0^t \lambda(y)\, dy} . \]
(3) The Cumulative Hazard Function (CHF) is defined to be
\[ \Lambda(t) = \int_0^t \lambda(y)\, dy = -\ln S(t) , \]
so that $S(t) = e^{-\Lambda(t)}$.
(4) In order for $\lambda(t)$ to be a force of mortality,
(i) it must be nonnegative, $\lambda(t) \ge 0$, and
(ii) $\int_0^\infty \lambda(y)\, dy = \infty$, which complies with $S(\infty) = e^{-\int_0^\infty \lambda(y)\, dy} = 0$.
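The link between the hazard rate and the survival function in Notes (2) and (3) can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the original notes: the hazard $1/(20 - t)$ is a hypothetical choice, the helper names are made up, and the integral is approximated with a simple midpoint rule.

```python
import math

def cumulative_hazard(hazard, t, steps=10_000):
    """Approximate the integral of hazard(y) over [0, t] by the midpoint rule."""
    h = t / steps
    return sum(hazard((i + 0.5) * h) for i in range(steps)) * h

def survival_from_hazard(hazard, t):
    """S(t) = exp(-cumulative hazard up to t)."""
    return math.exp(-cumulative_hazard(hazard, t))

# Hypothetical hazard on 0 <= t < 20; by hand, ln S(t) = ln(1 - t/20).
hazard = lambda t: 1.0 / (20.0 - t)

s_numeric = survival_from_hazard(hazard, 5.0)
s_by_hand = 1.0 - 5.0 / 20.0
print(s_numeric, s_by_hand)
```

The numerical value should be very close to the hand-computed $S(5) = 0.75$. Note that this hazard also satisfies condition (4)(ii), because its integral diverges as $t$ approaches 20.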
Example 1.2
(a) Let time at failure $T$ have survival function $S(t) = \frac{(10 - t)^2}{100}$, $0 \le t < 10$, find $\lambda(t)$.
(b) Find $S(t)$ if $\lambda(t) = \frac{3}{10 - t}$ for $0 \le t < 10$.
(c) Find $f(t)$ if $\lambda(t) = \frac{2t}{4 - t^2}$ for $0 \le t < 2$.
Example 1.3
Which of the following formulas could serve as a force of mortality?
(a) $\lambda(t) = a(b + t)^{-1}$, $a > 0$, $b > 0$.
(b) $\lambda(t) = (1 + t)^{-3}$, $t \ge 0$.
Example 1.4
(a) If $\mu_x = \frac{1}{2(100 - x)}$, $0 \le x < 100$, calculate ${}_{40} p_0$.
(b) Given that
\[ S(x) = \begin{cases} 1, & 0 \le x < 1 \\ 1 - \frac{e^x}{100}, & 1 \le x < 4.5 \\ 0, & x \ge 4.5 \end{cases} \]
compute $\mu(4)$.
1.1.5 The Moments of the Random Variable
The first moment of a continuous random variable $T$ defined on $[0, \infty)$ is given by
\[ E(T) = \int_0^\infty t\, f(t)\, dt , \]
if the integral exists; otherwise the first moment is undefined. Integration by parts yields
\[ E(T) = \int_0^\infty S(t)\, dt . \]
The second moment of $T$ is given by
\[ E(T^2) = \int_0^\infty t^2 f(t)\, dt , \]
if the integral exists, so the variance of $T$ can be found from
\[ V(T) = E(T^2) - \{E(T)\}^2 . \]
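The two expressions for $E(T)$ and the variance formula can be compared numerically. The sketch below assumes, purely for illustration, a failure time that is uniform on $[0, 10]$; the value 10, the midpoint-rule helper and all names are hypothetical and not part of the notes.

```python
def integrate(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    h = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * h) for i in range(steps)) * h

w = 10.0                       # hypothetical upper limit of the lifetime
pdf = lambda t: 1.0 / w        # f(t) for a uniform lifetime on [0, w]
sdf = lambda t: 1.0 - t / w    # S(t) for the same lifetime

mean_from_pdf = integrate(lambda t: t * pdf(t), 0.0, w)    # integral of t f(t)
mean_from_sdf = integrate(sdf, 0.0, w)                      # integral of S(t)
second_moment = integrate(lambda t: t * t * pdf(t), 0.0, w)
variance = second_moment - mean_from_pdf ** 2
print(mean_from_pdf, mean_from_sdf, variance)
```

Under this assumption both means should be close to $w/2 = 5$ and the variance close to $w^2/12$.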
The first moment of a continuous random variable $X$ is denoted as $\mathring{e}_0$. Thus
\[ \mathring{e}_0 = E(X) = \int_0^\infty x\, f(x)\, dx . \]
Since $\mathring{e}_0 = 20$.
For a proposed new type, with the same $w$, the new survival function is
\[ S^*(x) = \begin{cases} 1, & 0 \le x < 5 \\ \frac{w - x}{w - 5}, & 5 \le x < w \end{cases} \]
Compute the increase in life expectancy at time $t = 0$.
1.2 Some Important Parametric Survival Models
1.2.1 The Uniform Distribution on $[0, w]$ or De Moivre's Law
The Uniform distribution on $[0, w]$ is often referred to as the De Moivre distribution in actuarial literature.
For $0 \le x \le w$,
\[ f(x) = \frac{1}{w}, \quad F(x) = \frac{x}{w}, \quad S(x) = 1 - \frac{x}{w}, \quad \mu(x) = \frac{1}{w - x}, \quad \mathring{e}_0 = E(X) = \frac{w}{2}, \quad V(X) = \frac{w^2}{12} . \]
Note that the uniform survival model is useful but not appropriate for human survival
analysis.
1.2.2 The Exponential Distribution
For a general rate parameter $\lambda$ and $x \ge 0$,
\[ f(x) = \lambda e^{-\lambda x}, \quad F(x) = 1 - e^{-\lambda x}, \quad S(x) = e^{-\lambda x}, \quad \mu(x) = \lambda, \quad \mathring{e}_0 = E(X) = \frac{1}{\lambda}, \quad V(X) = \frac{1}{\lambda^2} . \]
Note that in the actuarial context the hazard rate is generally called the force of mortality, so the exponential distribution is referred to as the constant force distribution. It is also not appropriate for human survival analysis over a broad age range, but it is used extensively over short intervals.
1.2.3 The Gompertz Distribution
This distribution was suggested as a model for human survival by Gompertz in 1825. The distribution is usually defined by its hazard rate as
\[ \mu(x) = B c^x, \quad x \ge 0,\ B > 0,\ c > 1 . \]
Then the SDF is given by
\[ S(x) = \exp\left[-\int_0^x \mu(y)\, dy\right] = \exp\left[\frac{B}{\ln c}\,(1 - c^x)\right] . \]
The PDF $f(x) = \mu(x)\, S(x)$ is complicated, and $E[X]$ is not easily found.
1.2.4 The Makeham Distribution
In 1860, Makeham modified the Gompertz distribution by taking the HRF to be
\[ \mu(x) = A + B c^x, \quad x \ge 0,\ B > 0,\ c > 1,\ A > -B . \]
Then the SDF is given by
\[ S(x) = \exp\left[\frac{B}{\ln c}\,(1 - c^x) - Ax\right] . \]
1.2.5 The Weibull Distribution
The Weibull distribution can be constructed from an increasing failure rate of the simple form
\[ \mu(x) = k x^n, \quad x \ge 0,\ k > 0,\ n > -1 . \]
Its SDF is given by
\[ S(x) = \exp\left[-\frac{k\, x^{n+1}}{n + 1}\right] . \]
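The closed-form survival functions of the Gompertz, Makeham and Weibull models can be cross-checked against direct numerical integration of their hazard rates. The following is a sketch only: the parameter values are hypothetical illustrations (not fitted to any data), and the function names and the midpoint rule are choices made here rather than anything prescribed by the notes.

```python
import math

def gompertz_survival(x, B, c):
    """S(x) = exp[(B / ln c) * (1 - c^x)]."""
    return math.exp(B / math.log(c) * (1.0 - c ** x))

def makeham_survival(x, A, B, c):
    """S(x) = exp[(B / ln c) * (1 - c^x) - A x]."""
    return math.exp(B / math.log(c) * (1.0 - c ** x) - A * x)

def weibull_survival(x, k, n):
    """S(x) = exp[-k x^(n+1) / (n+1)]."""
    return math.exp(-k * x ** (n + 1) / (n + 1))

def survival_by_quadrature(hazard, x, steps=20_000):
    """exp(-integral of the hazard over [0, x]) using the midpoint rule."""
    h = x / steps
    return math.exp(-sum(hazard((i + 0.5) * h) for i in range(steps)) * h)

# Hypothetical parameter values, chosen only for illustration.
A, B, c = 0.0005, 0.0001, 1.1
k, n = 0.00001, 1.5
x = 60.0

checks = {
    "gompertz": (gompertz_survival(x, B, c),
                 survival_by_quadrature(lambda y: B * c ** y, x)),
    "makeham": (makeham_survival(x, A, B, c),
                survival_by_quadrature(lambda y: A + B * c ** y, x)),
    "weibull": (weibull_survival(x, k, n),
                survival_by_quadrature(lambda y: k * y ** n, x)),
}
for name, (closed_form, numeric) in checks.items():
    print(name, closed_form, numeric)
```

For each model the two printed numbers should agree to several decimal places, which confirms that the closed forms are the exponentials of minus the integrated hazards.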
1.3 Conditional Measures and Truncated Distributions
1.3.1 Conditional Probabilities and Densities
Thus far, only probabilities measured from age $x = 0$ have been considered. Now consider the case of a person who is known to be alive at age $x > 0$. The random variable $T(x)$, the remaining time until death for a person who has already reached age $x$, is studied.
1.3.2 Lower Truncation of the Distribution of X
What is the probability that a person, known to be alive at age $x$, will still be alive $t$ years later (i.e. at age $x + t$)?
This probability deals with the distribution of a subset of the sample space of the random variable $X$, namely those values of $X$ which exceed $x$. This distribution is called the distribution of $X$ truncated below at $x$.
In words, this is the probability that the age at death will exceed $x + t$, given that it does exceed $x$, or the probability of survival to $x + t$, given survival to $x$.
The desired probability is denoted by ${}_t p_x$ or $S_T(t)$.
\[ \begin{aligned} {}_t p_x &= P(X > x + t \mid X > x) = S(x + t \mid X > x) \\ &= \frac{P(X > x + t,\ X > x)}{P(X > x)} = \frac{P(X > x + t)}{P(X > x)} = \frac{S(x + t)}{S(x)} = e^{-\int_x^{x+t} \mu(y)\, dy} \end{aligned} \]
Similarly,
\[ \begin{aligned} {}_t q_x &= P(X \le x + t \mid X > x) = F(x + t \mid X > x) \\ &= \frac{S(x) - S(x + t)}{S(x)} = \frac{F(x + t) - F(x)}{1 - F(x)} = F_T(t) = 1 - {}_t p_x \end{aligned} \]
When $t = 1$, drop the leading subscript, i.e. ${}_1 p_x = p_x$ and ${}_1 q_x = q_x$.
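Notes:
1. ${}_t p_x = \frac{S(x + t)}{S(x)} = \frac{{}_{x+t} p_0}{{}_x p_0}$.
2. For $X \sim U[0, w]$, ${}_t p_x = \frac{w - x - t}{w - x}$ and $T(x) \sim U[0, w - x]$.
3. A generalized version of De Moivre's Law has $\mu_x = \frac{k}{w - x}$ and ${}_t p_x = \left(\frac{w - x - t}{w - x}\right)^k$.
4. For $X \sim \exp(\lambda)$, ${}_t p_x = \frac{e^{-\lambda (x + t)}}{e^{-\lambda x}} = e^{-\lambda t}$ and $T(x) \sim \exp(\lambda)$.
5. For $X \sim \text{Pareto}(\alpha, \theta)$, ${}_t p_x = \left(\frac{\theta + x}{\theta + x + t}\right)^{\alpha}$ and $T(x) \sim \text{Pareto}(\alpha, \theta + x)$.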
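Notes 2 and 4 can be illustrated with a small numerical check based on ${}_t p_x = S(x + t)/S(x)$. This is only a sketch under assumed values: $w = 100$, a rate of 0.05, age 30 and horizon 10 are hypothetical, as are the function names.

```python
import math

def conditional_survival(sdf, x, t):
    """t_p_x = S(x + t) / S(x): survival to x + t given survival to x."""
    return sdf(x + t) / sdf(x)

w, rate = 100.0, 0.05   # hypothetical model parameters
uniform_sdf = lambda x: max(0.0, 1.0 - x / w)
exponential_sdf = lambda x: math.exp(-rate * x)

x, t = 30.0, 10.0       # hypothetical age and horizon
tp_uniform = conditional_survival(uniform_sdf, x, t)
tp_exponential = conditional_survival(exponential_sdf, x, t)

# Note 2: (w - x - t) / (w - x); Note 4: exp(-rate * t), free of x.
print(tp_uniform, (w - x - t) / (w - x))
print(tp_exponential, math.exp(-rate * t))
```

The exponential result does not depend on the attained age $x$, which is the memoryless property stated in Note 4.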
To find the conditional density function for death at age $x + t$, given alive at age $x$, differentiate ${}_t q_x = F_T(t)$ with respect to $t$, remembering that $x$ is a constant.
\[ f(x + t \mid X > x) = f_T(t) = \frac{d}{dt} F_T(t) = -\frac{d}{dt}\, \frac{S(x + t)}{S(x)} = \frac{f(x + t)}{S(x)} \]
Since $f(x) = \mu(x)\, S(x)$, $f_T(t)$ can be rewritten as
\[ f_T(t) = \frac{S(x + t)\, \mu(x + t)}{S(x)} = {}_t p_x\, \mu(x + t) \]
The conditional HRF at $x + t$, given alive at age $x$, is
\[ \mu(x + t \mid X > x) = \frac{f(x + t \mid X > x)}{S(x + t \mid X > x)} = \frac{f(x + t) / S(x)}{S(x + t) / S(x)} = \frac{f(x + t)}{S(x + t)} = \mu(x + t) \]
which means the HRF for this truncated distribution is identical to the untruncated HRF.
Example 1.6
(a) Write the symbol for the probability that $(52)$ lives to at least age 77.
(b) Write the symbol for the probability that a person aged 74 dies before age 91.
Example 1.7
Let $X$ be an age at death random variable with density $f(x) = \frac{x}{50}$, $0 \le x < 10$. Consider the random variable $T(2)$ for the remaining lifetime past $x = 2$. Find ${}_t p_2$ and $f_T(t)$.
Example 1.8
(a) For the survival function
\[ S(x) = \begin{cases} \frac{10000 - x^2}{10000}, & 0 \le x < 100 \\ 0, & x \ge 100 \end{cases} \]
compute $q_{32}$.
(b) The force of mortality is $\mu_x = \frac{1}{100 - x}$; compute ${}_{10} p_{50}$.
(c) For a population which contains equal numbers of males and females at birth:
For males: $\mu^m(x) = 0.1$, $x \ge 0$
For females: $\mu^f(x) = 0.08$, $x \ge 0$.
Compute $q_{60}$ for this population.
1.3.3 The Mean and Variance of ) (x T
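The expected value of $T(x)$, if it exists, is denoted by $\mathring{e}_x$:
\[ \mathring{e}_x = \int_0^\infty t\, {}_t p_x\, \mu_{x+t}\, dt = \int_0^\infty {}_t p_x\, dt = \int_0^\infty S_T(t)\, dt = \int_0^\infty \frac{S(x + t)}{S(x)}\, dt \]
\[ E\left[T(x)^2\right] = \int_0^\infty t^2\, {}_t p_x\, \mu_{x+t}\, dt = 2 \int_0^\infty t\, {}_t p_x\, dt \]
\[ V[T(x)] = 2 \int_0^\infty t\, {}_t p_x\, dt - \mathring{e}_x^{\,2} \]
The average number of years lived within the next $n$ years is denoted by $\mathring{e}_{x:\overline{n}|}$. This is called the $n$-year temporary complete life expectancy. The mathematical definition is
\[ \mathring{e}_{x:\overline{n}|} = \int_0^n t\, {}_t p_x\, \mu_{x+t}\, dt + n\, {}_n p_x \]
Those who die within $n$ years are averaged in the integral, and survivors are represented by the second summand $n\, {}_n p_x$.
Integrating by parts, we get a simpler formula:
\[ \mathring{e}_{x:\overline{n}|} = \int_0^n {}_t p_x\, dt . \]
Let $X \wedge n$ denote the random variable which is the minimum of $X$ and $n$. In the same way, the second moment of $T(x) \wedge n$ is given by
\[ E\left[(T(x) \wedge n)^2\right] = \int_0^n t^2\, {}_t p_x\, \mu_{x+t}\, dt + n^2\, {}_n p_x = 2 \int_0^n t\, {}_t p_x\, dt \]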
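The agreement between the defining and the simplified formulas for the temporary complete expectation and its second moment can be checked numerically. The sketch below assumes, only for illustration, a uniform lifetime on $[0, 100]$ with $x = 40$ and $n = 10$; these values and the helper names are hypothetical.

```python
def integrate(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    h = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * h) for i in range(steps)) * h

w, x, n = 100.0, 40.0, 10.0   # hypothetical limiting age, current age, horizon

tpx = lambda t: (w - x - t) / (w - x)   # t_p_x for a uniform lifetime
density = lambda t: 1.0 / (w - x)       # t_p_x * mu_{x+t}, the density of T(x)

# First moment: defining formula versus the integrated-by-parts formula.
e_definition = integrate(lambda t: t * density(t), 0.0, n) + n * tpx(n)
e_simplified = integrate(tpx, 0.0, n)

# Second moment of min(T(x), n): defining formula versus 2 * integral of t * t_p_x.
m2_definition = integrate(lambda t: t * t * density(t), 0.0, n) + n * n * tpx(n)
m2_simplified = 2.0 * integrate(lambda t: t * tpx(t), 0.0, n)

print(e_definition, e_simplified)
print(m2_definition, m2_simplified)
```

Under these assumptions each pair should agree up to the small error of the numerical integration.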
Example 1.9
Let $X$ be an age at death random variable with density $f(x) = \frac{10 - x}{50}$, $0 \le x < 10$. Consider the random variable $T(3)$ for the remaining lifetime past $x = 3$. Find $\mathring{e}_3$ and $V[T(3)]$.
Example 1.10
(a) Let $X$ be uniform on $[0, 100]$. Find $E[T(40)]$ and $V[T(40)]$.
(b) Let $X$ be exponential with rate parameter $\lambda = 0.05$. Find $E[T(40)]$ and $V[T(40)]$.
Example 1.11
(a) Mortality follows De Moivre's law and $\mathring{e}_{20} = 30$. Compute $q_{20}$.
(b) Let $X$ be an age at death random variable with density $f(x) = \frac{x}{50}$, $0 \le x < 10$. Consider the random variable $T(4)$ for the remaining lifetime past $x = 4$. Find ${}_t p_4$, $f_T(t)$ and $\mathring{e}_4$.
Example 1.12
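(a) The future lifetime of $(0)$ follows a two-parameter Pareto distribution with $\theta = 50$ and $\alpha = 3$. Compute $\mathring{e}_{20}$.
(b) Future lifetime of $(20)$ is subject to force of mortality $\mu_x = \frac{0.5}{100 - x}$, $x < 100$. Compute $\mathring{e}_{20}$ and $\mathring{e}_{20:\overline{50}|}$.
(c) Given
\[ \mu(x) = \begin{cases} 0.04, & 0 < x < 40 \\ 0.05, & x \ge 40 \end{cases} \]
Compute $\mathring{e}_{25:\overline{25}|}$.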
1.3.4 The Central Rate
A conditional measure over the interval from age $x$ to age $x + 1$ is called the central rate of failure or central rate of death and is denoted by $m_x$. It is defined as the weighted average value of the HRF $\mu(x)$ over the interval, using, as the weight for $\mu(y)$, the probability of survival to age $y$. Formally,
\[ m_x = \frac{\int_x^{x+1} S(y)\, \mu(y)\, dy}{\int_x^{x+1} S(y)\, dy} \]
where the denominator is the sum of the weights for a continuous-case weighted average.
More generally, ${}_t m_x$ is the average hazard, or central rate of death, over the interval from $x$ to $x + t$, and is given by
\[ {}_t m_x = \frac{\int_x^{x+t} S(y)\, \mu(y)\, dy}{\int_x^{x+t} S(y)\, dy} = \frac{\int_0^t S(x + s)\, \mu(x + s)\, ds}{\int_0^t S(x + s)\, ds} \]
The second expression results from the substitution $y = x + s$.
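The central rate can be approximated directly from its definition as a survival-weighted average of the hazard. The sketch below is illustrative only: the exponential rate 0.05, the uniform limit $w = 50$, the age $x = 10$ and the function name are hypothetical, and the integrals are approximated by midpoint sums.

```python
import math

def central_rate(sdf, hazard, x, t=1.0, steps=10_000):
    """Approximate t_m_x: the S-weighted average of the hazard over [x, x + t]."""
    h = t / steps
    numerator = denominator = 0.0
    for i in range(steps):
        y = x + (i + 0.5) * h          # midpoint of the i-th subinterval
        weight = sdf(y)                # probability of survival to age y
        numerator += weight * hazard(y)
        denominator += weight
    return numerator / denominator     # the common factor h cancels

rate, w, x = 0.05, 50.0, 10.0          # hypothetical values

# Constant hazard: the weighted average of a constant is that constant.
m_exponential = central_rate(lambda y: math.exp(-rate * y), lambda y: rate, x)

# Uniform lifetime on [0, w]: S(y) mu(y) = 1/w, so m_x = 1 / (w - x - 1/2).
m_uniform = central_rate(lambda y: 1.0 - y / w, lambda y: 1.0 / (w - y), x)

print(m_exponential, m_uniform, 1.0 / (w - x - 0.5))
```

With a constant hazard the central rate equals the hazard itself, while for the uniform case it equals the hazard evaluated at the midpoint of the year.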
Example 1.13
If $X$ has an exponential distribution, show that $m_x = -\ln p_x$.
Example 1.14
Let $X$ be uniform on $[0, 10]$, find $m_1$, $m_2$ and ${}_3 m_2$.
Example 1.15
Given that
(i) $T$ is the random variable for the future lifetime of $(x)$.
(ii) The PDF of $T$ is $f_T(t) = 2e^{-2t}$, $t \ge 0$.
Find $m_x$, the central death rate at age $x$.
1.3.5 Use of Conditional Probabilities in Estimation
As $p_x = \frac{S(x + 1)}{S(x)}$, then recursively,
\[ S(x) = p_0\, p_1 \cdots p_{x-1} . \]
\[ {}_n p_x = \frac{S(x + n)}{S(x)} = \frac{p_0\, p_1 \cdots p_{x+n-1}}{p_0\, p_1 \cdots p_{x-1}} = p_x\, p_{x+1} \cdots p_{x+n-1} = {}_a p_x \cdot {}_{n-a} p_{x+a} , \]
where $a$ is an integer which is less than $n$.
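The product form of ${}_n p_x$ translates directly into a calculation on a table of one-year mortality rates. The sketch below uses a hypothetical table (ages 50 to 52 with made-up rates) and a made-up function name; it only illustrates the recursion and the factorization ${}_n p_x = {}_a p_x \cdot {}_{n-a} p_{x+a}$.

```python
from math import prod

# Hypothetical one-year mortality rates q_x, for illustration only.
q = {50: 0.01, 51: 0.02, 52: 0.03}

def n_p_x(q, x, n):
    """n_p_x = p_x * p_{x+1} * ... * p_{x+n-1}, with p_y = 1 - q_y."""
    return prod(1.0 - q[x + j] for j in range(n))

three_p_50 = n_p_x(q, 50, 3)
# Factorization with a = 1: 3_p_50 = 1_p_50 * 2_p_51.
factored = n_p_x(q, 50, 1) * n_p_x(q, 51, 2)
print(three_p_50, factored)
```

Both values should equal $0.99 \times 0.98 \times 0.97$ up to rounding.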
Example 1.16
The following mortality table is given:
$x$      60     61     62     63     64
$q_x$    0.001  0.002  0.003  0.004  0.005
Compute ${}_2 p_{60}$ and ${}_5 p_{60}$.
Example 1.17
The following information is stated.
(1) The probability that two 70-year-olds are both alive in 20 years is 16%.
(2) The probability that two 80-year-olds are both alive in 20 years is 1%.
(3) There is an 8% chance of a 70-year-old living 30 years.
(4) All lives are independent and have the same expected mortality.
Determine the probability of an 80-year-old living 10 years.
1.3.6 Upper and Lower Truncation of the Distribution of X
Another conditional probability of interest is the probability that a person known to be alive at age $x$ will die between ages $x + n$ and $x + n + m$. It is denoted by ${}_{n|m} q_x$ and defined as
\[ {}_{n|m} q_x = P(x + n < X \le x + n + m \mid X > x) . \]
This can also be expressed as the probability that a person aged $x$ will survive $n$ years, but then die within the next $m$ years. This suggests that
\[ {}_{n|m} q_x = {}_n p_x \cdot {}_m q_{x+n} = {}_n p_x - {}_{n+m} p_x \]
where ${}_m q_{x+n}$ is the conditional probability of dying between ages $x + n$ and $x + n + m$, given alive at age $x + n$, while ${}_n p_x$ is the conditional probability of surviving to age $x + n$, given alive at age $x$.
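The two expressions for the deferred probability ${}_{n|m} q_x$ can be compared on a small mortality table. The table below is hypothetical (ages 50 to 52 with made-up rates), as are the function names; the sketch only shows that the product form and the difference form give the same number.

```python
from math import prod

# Hypothetical one-year mortality rates q_x, for illustration only.
q = {50: 0.01, 51: 0.02, 52: 0.03}

def n_p_x(x, n):
    """Probability that a life aged x survives n more years."""
    return prod(1.0 - q[x + j] for j in range(n))

def deferred_q(x, n, m):
    """n|m_q_x as a difference: n_p_x - (n+m)_p_x."""
    return n_p_x(x, n) - n_p_x(x, n + m)

def deferred_q_product(x, n, m):
    """n|m_q_x as a product: n_p_x * m_q_{x+n}."""
    return n_p_x(x, n) * (1.0 - n_p_x(x + n, m))

difference_form = deferred_q(50, 1, 2)
product_form = deferred_q_product(50, 1, 2)
print(difference_form, product_form)
```

Here both forms give the probability that a life aged 50 survives one year and then dies within the following two years.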
Example 1.18
The following mortality table is given:
$x$      60     61     62     63     64
$q_x$    0.001  0.002  0.003  0.004  0.005
Compute the probability that a person aged 60 will die sometime between 2 and 5 years from now.
Example 1.19
(a) Given that $S(x) = \left(\frac{100}{100 + x}\right)^2$, compute ${}_{5|} q_{40}$.
(b) Given that $\mu_x = \frac{2}{100 - x}$, $0 \le x \le 100$, compute ${}_{10|} q_{64}$.
(c) Given that $S(x) = \left(1 - \frac{x}{100}\right)^{1/2}$, $0 \le x < 100$, calculate the probability that a life aged 36 will die sometime between ages 51 and 64.
1.4 Curtate Future Lifetime
Let $K(x)$ be the random variable that measures the future lifetime without counting the last fraction of a year. For example, if the age at death is 80.2, $K(x)$ would assume the value $80 - x$, rather than $80.2 - x$. This is called the curtate lifetime. $K(x)$ is a discrete random variable. If $x$ is an integer, $K(x)$ assumes integer values only.
Equivalently, $K(x)$ is the greatest integer not exceeding $T(x)$. $K(x) = k$ if death occurs in the interval $[x + k, x + k + 1)$, that is, if the individual aged $x$ survives $k$ more years and dies in the following year.
\[ P[K(x) = k] = {}_{k|} q_x = {}_k p_x\, q_{x+k} = {}_k p_x - {}_{k+1} p_x \]
The curtate expectation of life at age $x$ is the expected number of whole years of future life for an individual aged $x$. It is denoted by $e_x$.
\[ e_x = E[K(x)] = \sum_{k=1}^{\infty} k\, P[K(x) = k] = \sum_{k=1}^{\infty} {}_k p_x \]
To find the variance of $K(x)$, we need
\[ E\left[K(x)^2\right] = \sum_{k=1}^{\infty} (2k - 1)\, {}_k p_x \]
$e_{x:\overline{n}|}$ is the $n$-year curtate temporary life expectancy, the average number of full years of future lifetime within the next $n$ years.
\[ e_{x:\overline{n}|} = \sum_{k=1}^{n-1} k\, {}_{k|} q_x + n\, {}_n p_x = \sum_{k=1}^{n} {}_k p_x \]
\[ E\left[(K(x) \wedge n)^2\right] = \sum_{k=1}^{n-1} k^2\, {}_{k|} q_x + n^2\, {}_n p_x = \sum_{k=1}^{n} (2k - 1)\, {}_k p_x . \]
The curtate future lifetime of $(x)$ refers to the number of future years completed by $(x)$ prior to death.
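The curtate formulas can be verified on a case small enough to check by hand. The sketch below assumes, purely for illustration, a uniform lifetime on $[0, 100]$ and a life aged 95, so that $K(95)$ takes the values 0 to 4 with probability $1/5$ each; the values and names are hypothetical.

```python
w, x = 100, 95          # hypothetical limiting age and current age
k_max = w - x           # no survival beyond k_max further years

def k_p_x(k):
    """k_p_x for a uniform lifetime on [0, w], floored at zero."""
    return max(0.0, (w - x - k) / (w - x))

def prob_k(k):
    """P[K(x) = k] = k_p_x - (k+1)_p_x."""
    return k_p_x(k) - k_p_x(k + 1)

# Curtate expectation: definition versus the sum of survival probabilities.
e_definition = sum(k * prob_k(k) for k in range(1, k_max + 1))
e_simplified = sum(k_p_x(k) for k in range(1, k_max + 1))

# Second moment: definition versus the (2k - 1) weighted sum.
m2_definition = sum(k * k * prob_k(k) for k in range(1, k_max + 1))
m2_simplified = sum((2 * k - 1) * k_p_x(k) for k in range(1, k_max + 1))

variance = m2_simplified - e_simplified ** 2
print(e_definition, e_simplified, m2_definition, m2_simplified, variance)
```

By hand, $e_{95} = (1 + 2 + 3 + 4)/5 = 2$ and $E[K(95)^2] = (1 + 4 + 9 + 16)/5 = 6$, so the variance is 2 under these assumptions.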
Example 1.20
Future lifetime of $(20)$ is subject to force of mortality $\mu_x = \frac{1}{100 - x}$, $x < 100$. Compute $e_{20}$, $e_{20:\overline{50}|}$ and $V[K(20)]$.