Professional Documents
Culture Documents
xii
Contents
properties may wish to review Appendix A before studying the specific estimators developed in the text.
Frequently the estimated survival model produced directly from a
study is not entirely suitable for practical use, and is, therefore, systematically
revised before such use. The process of revising the initial estimates into
revised estimates is called graduation. This step in the development of a
useable estimated survival model is the topic of a companion text to this one
entitled Graduation: The Revision of Estimates.
Survival Models and Their Estimation is said to be a general text in
that it treats survival model estimation from the viewpoint of several different
practitioners, including the actuary, the demographer, and the biostatistician,
without attempting to be an exhaustive treatment of any one of these
traditions.
A more thorough treatment of the actuarial tradition, from a different
perspective, can be found in texts by Gershenson [32], Benjamin and Pollard
[11], and Batten [8]; demographic approaches are the main theme of the
works by Keyfitz and Beekman [46], Spiegelman [71], and Chiang [19]; the
medical, or biostatistical, tradition is more deeply pursued by Elandt-Johnson
and Johnson [25]. Additional texts, which deal with the statistical analysis of
survival data at the graduate level, include those by Lawless [50], Lee [51],
Miller [56], and Kalbfleisch and Prentice [42].
How is an initial estimated survival model determined from sample
data? There are many approaches to this. A survival model estimation problem will generally have three basic components: (1) the form and nature of
the sample data (which might also be called the study design); (2) the chosen
estimation procedure; and (3) any simplifying assumptions made along the
way. All of these concepts will be further developed in this text. The traditional actuarial approach, for example, is characterized by a cross-sectional
study design using the transactional data of an insurance company or pension
fund operation, a method-of-moments estimation procedure, and the Balducci
distribution assumption. In this text we will consider, as well, other study
designs, primarily those encountered by the clinical statistician or the reliability engineer. In addition, we will consider other estimation procedures, especially the maximum likelihood and product-limit methods. Finally, we will
consider other simplifying assumptions, such as the uniform and exponential
distributions.
The text presumes a basic familiarity with probability and statistics,
including the topics of estimation and hypothesis testing. The application of
these ideas specifically to the estimation of survival models is then developed
throughout the text. An effort has been made to keep the mathematics and
the pedagogy at a level which does not require a prior familiarity with the
topic. Whenever a choice between mathematical rigor and pedagogic effec-
xiv
xvi
CHAPTER 6
TABULAR SURVIVAL MODELS ESTIMATED
FROM INCOMPLETE DATA SAMPLES
MOMENT PROCEDURES
6.1 INTRODUCTION
The foundation which underlies estimation by a moment procedure is the
statistical principle that the expected number of deaths to be observed in
(x, x1] from a certain sample should be equated to the number actually
observed. Assuming that the number of actual deaths in the age interval, to
be denoted by dx , is readily available, then the estimation problem consists of
two steps:
(1) Finding an expression for the expected number of deaths in
(x, x1];
(2) Solving the moment equation for the desired estimate.
Generally a simplifying assumption will be required to solve the moment
equation.
6.2 MOMENT ESTIMATION IN A SINGLE-DECREMENT
ENVIRONMENT
Assume that the basic data regarding an observation period, dates of birth
and membership into the group under observation, and dates of death have
been analyzed to produce the ordered pair (xri , xsi ) for person is scheduled contribution to (x, x1], as illustrated in Figure 5.6. Recall that si 1 is
possible but si 0 is not, whereas, ri 0 is possible but ri 1 is not.
For person i, the probability of death while under observation in the
single-decrement environment is the conditional probability of death before
age xsi , given alive at age xri , which is si ri qxr i , and this is also the expected number of deaths from this sample of size one. This is easily
seen, since the number of deaths is one (with probability si ri qxri ), or
zero (with probability si ri pxri ). Thus the expected number is
1 si ri q x ri 0 s i r i p x r i
s i r i q x r i .
(6.1)
114
si ri q x ri .
i1
(6.2)
i1
where Dx is the random variable for deaths in (x, x1], and dx is the observed
number.
To solve (6.2) for our estimate of qx , we will use the approximation
si ri q x ri
(si ri ) qx .
(6.3)
(6.4)
i1
dx
! (si ri )
n
(6.5)
i1
(6.6)
dx
q^ x
.
nx
(6.7)
115
estimator of the conditional mortality probability qx , where the model for the
likelihood is a simple binomial model. Thus the number of persons in the
sample, nx , can be thought of as a number of binomial trials. The standard
characteristics of a binomial model are assumed to apply. Thus each trial is
considered to be independent, and the probability of death on a single trial
(qx ) is assumed constant for all trials. In such a situation, the sample proportion of deaths, which is given by (6.7), is a natural estimator for this
parameter qx .
If si 1 for all, but ri 0 for some of the nx persons who contribute
to (x, x1], then we have si ri qxri 1ri qxri , and (6.2) becomes
E [Dx ] " 1ri qxri dx .
n
(6.8)
i1
(1ri ) qx ,
(6.9)
which is the same as (3.77). Substituting (6.9) into (6.8), we obtain the result
q^ x
dx
! (1ri )
n
(6.10)
i1
(6.11)
i1
si qx ,
(6.12)
which is the same as (3.54). When (6.12) is substituted into (6.11) we obtain
the result
q^ x ndx ,
(6.13)
! si
i1
116
EXAMPLE 6.1 From the sample of five persons whose basic data are
given below, estimate q30 , using an observation period of calendar year 1997.
Person
1
2
3
4
5
Date of Birth
July 1, 1967
April 1, 1967
January 1, 1967
July 1, 1966
April 1, 1966
Date of Death
October 1, 1997
|
---30.25
30.50
30
30.75
31
1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4~~~~~~~~~~~~~~~~
5 ~~~~~~~ - - -
FIGURE 6.1
The ordered pairs (ri , si ) are (0, .50), (0, .75), (0, 1), (.50, 1), and (.75, 1), for
i 1,2,3,4,5, respectively. Thus the denominator of Estimator (6.5) is
.50 .75 .1.00 .50 .25 3.00. The numerator of (6.5) is dx 1, because person 5 died after reaching age 31. Thus q^30 13 .
6.2.3 Exposure
The concept of exposure was defined in Section 3.3.5 in the context of a life
table model. At this time, and at several other places throughout the text, we
will make use of this concept in the context of a study sample.
The ordered pair (ri , si ) for person i represents the interval of age,
within (x, x1], over which person i is potentially under observation. We say
that person i is exposed to the risk of death from age xri to age xsi .
Numerically, si ri is the length of time (in years) that person i is exposed, so
it also represents the amount of exposure, measured in units of life-years,
117
i1
^x
m
1
1
2
^x
m
(6.14)
which uses the linear distribution assumption for jxs . Alternatively, if we assume an exponential jxs (the constant force of mortality assumption), then
^ x , and thus
^ m
this constant force, ., is the same as mx , so that .
q^x 1 e.^ 1 em^ x .
(6.15)
^ (and thus q^x ) are maximum likeliIn the next chapter we will see that this .
hood estimators.
To summarize, scheduled exposure is divided into observed deaths to
estimate qx in a single-decrement environment by the moment estimation
approach, coupled with the approximation sr qxr (sr) qx . Exact exposure is divided into observed deaths to estimate mx , or to estimate a constant
force of mortality over (x, x1]. This latter case will arise regularly in
Chapter 7.
118
6.2.4 Grouping
Consider Special Case B in which the nx persons who contribute to (x, x1]
enter this interval at various ages, but all are scheduled to exit the interval at
age x1. That is, all are scheduled to survive (x, x1]; none are scheduled to
be study enders at an age zi such that x zi x1. It may reasonably be
assumed that many of these nx persons enter the interval at exact age x (i.e.,
have ri 0); let us say that bx is the subset of nx which has ri 0. (bx is
chosen to stand for beginner, since these persons are exposed from the
beginning of (x, x1].)
The remainder of the sample, say kx nx bx , all have ri such that
0 ri 1. Let us assume that the average value of these ri s is r, 0 r 1.
(Alternatively, it might be possible that the study was designed such that kx
persons all had ri r, and no averaging is needed.) This situation is
illustrated in Figure 6.2.
bx ~~~~~~~~~~~~~~~~~~~~
kx ~~~~~~~~~
- - - qqqqqqqqqqqqqqq
|
|
| --x
xr
x1
FIGURE 6.2
As a result of this grouping, the expected number of deaths in the
interval (x, x1] is bx qx kx 1r qxr , so the moment equation becomes
E [Dx ] bx qx kx 1r qxr dx .
Under the hyperbolic assumption,
solved for the estimator
1r qxr
(6.16)
x
q^x b (1d
r) kx .
x
(6.17)
119
nx cx ~~~~~~~~~~~~~~~~~~~~~~~
cx ~~~~~~~~~~~~~
- - - qqqqqqqqqqqqqqqqqqqq
|
|
|
--x
xs
x1
FIGURE 6.3
The expected number of deaths in (x, x1] is now (nx cx ) qx cx s qx , so
the moment equation becomes
E [Dx ] (nx cx ) qx cx s qx dx .
(6.18)
Under the linear assumption, s qx s qx , (6.18) is easily solved for the estimator
(6.19)
q^x n (1dx s) c .
x
x
Again the exposure interpretation of the denominator is apparent.
Our Special Case C is one that frequently arises in clinical studies.
As seen in Chapter 4, these studies often are designed so that all members of
the study sample come under observation at time t 0. It follows that all
members who reach the estimation interval (t, t1] must enter it at exact time
t, so that ri 0 for all i. But if the study is terminated before all have died,
then there will be enders at various times, so that si 1 for some i. Because
of the frequent occurrence of Special Case C in practice, we will refer to it
several more times as we develop the theory of estimation in this text.
6.2.5 Properties of Moment Estimators
In this section we will explore the bias and the variance of the moment
estimators derived thus far in the chapter. It is assumed that the reader is familiar with these two properties of estimators. Formal definitions of them are
given in Appendix A.
A characteristic of the moment estimators for qx which we will
derive is that they are always unbiased under the assumption, or other approximation, used to derive them. We will demonstrate this for the general
form of the moment estimator given by (6.5).
To investigate bias, we need to compare the expected value of the estimator with the true value of the parameter, qx , which it seeks to estimate.
We are using q^x to represent the realized value of the estimator random
variable when the random variable for number of deaths, Dx , has realized
^ for the estimator random variable, so that
value dx . Thus we use Q
x
120
Dx
! (si ri )
n
(6.5a)
i1
D (si ri )
si ri xri
i1
i1
! si ri q x ri
qx ! (si ri )
^ ]
E [Q
x
! (si ri )
i1
n
! (si ri )
i1
qx .
(6.20)
i1
i1
This shows that Estimator (6.5) is unbiased under the approximation used to
derive it. It is biased to the extent that (6.3) deviates from the mortality distribution to which the sample is actually subject. This last point is illustrated in
the following example.
EXAMPLE 6.3 Analyze the degree of bias contained in Estimator (6.10)
for Special Case B.
n
SOLUTION For Special Case B, we have E[Dx ] ! 1ri qxri and the estii1
^
mator Q
x
Dx
D (1ri )
^ ] n E[ D x ]
. Then E[Q
x
i1
D (1ri )
i1
D 1ri qxri
i1
n
D (1ri )
. If we assume
i1
i1
^ ]
thus E[Q
x
D (1ri )qx
i1
n
D (1ri )
i1
121
(6.21)
i1
i1
i1
so that
! si ri qxri (1 si ri qxri )
(6.22)
^ ln , {r ,s })
Var (Q
i i
x x
i1
!(si ri )
n
(6.23)
i1
px qx
! (si ri )
n
(6.24)
i1
122
Note that Dx Dxw Dxww , that Dxw and Dxww are both binomial random
variables, and that they are independent. From these properties we find that
Var (Dx ) Var (Dxw ) Var (Dxww ) cx s qx (1s qx ) (nx cx ) qx (1qx ), so
^ ln , c ) cx s qx (1s qx ) (nx cx )qx (1qx ) .
that Var (Q
x x x
[n (1s)c ]2
x
(6.25)
(6.26)
Note that there are no observed enders in (x, x 1] from the nx cx group.
123
(1!) ex qx
dx ,
1 ! qx
(6.27)
b b2 4! nx dx
q^x
,
2! nx
(6.28)
4
4
we have (1000!) 185
(902118!) 185
20 0, which produces
2
124
(6.29a)
(6.29b)
i1
and
n
i1
where Wx is the random variable for withdrawals in (x, x1], and wx denotes
the observed number of withdrawals in the sample.
A simple way to solve Equations (6.29a) and (6.29b) for estimates of
the double-decrement probabilities q(xd) and q(xw) would be to use approximations analogous to (6.3), namely
( d)
( d)
(6.30a)
si ri qxri (si ri ) qx
and
(w)
si r i q x ri
producing
(si ri ) q(xw) ,
(6.30b)
dx
! (si ri )
n
125
(6.31a)
i1
and
(w)
q^x
wx
! (si ri )
n
(6.31b)
i1
However, our objective is usually to estimate the single-decrement probabilities qwx(d) and qwx(w) , in this case from data that have been collected in a
double-decrement environment. To accomplish this, we could calculate esti(d)
(w)
mates of qxw (d) and qxw (w) from the estimates q^x and q^x , produced by (6.31a)
and (6.31b), using one of the approximate relationships between singledecrement and double-decrement probabilities presented in Section 5.3.
Alternatively, we could calculate estimates of qwx (d) and qwx (w) directly
from our sample data by expressing si ri q(xd) ri and si ri q(xw)ri directly in terms of
qwx (d) and qwx (w) , under a chosen assumption, without using the preliminary
approximations (6.30a) and (6.30b). This approach normally results in equaw (d)
w (w)
tions which must be solved numerically for q^x and q^x .
6.3.2 Uniform Distribution Assumptions
If the random events, death and withdrawal, are each assumed to have uniform distributions over (x, x1) in the single-decrement context, we know
from Section 3.5 that
.(xd) u
qwx (d)
1 uqwx (d)
(5.14a) becomes
w (d)
ur qxr
( d) ( d)
, so that (1ur qwx
r ).xu
qwx (d)
.
1 rqwx (d)
Then Equation
126
w (w)
w ( d)
( d)
1ur qxr 1ur qxr .xu du
s
r
s
( d)
1 r
qxw
(w)
1 u qxw
qxw
( d)
(w) du
1 r qxw
1 r qxw
(d)
qxw
( d)
qxw
( d)
1 r
(w)
qxw
s
r
(w)
(sr) 12 (s 2 r 2 ) qwx
1 r qxw 1 r qwx
(d)
(w)
1 u qxw
(w)
du
(6.32a)
qxw
(w)
( d)
(sr) 12 (s 2 r 2 ) qwx
1 r qxw 1 r qxw
( d)
(w)
(6.32b)
Substitution of (6.32a) and (6.32b) into the basic moment equations (6.29a)
and (6.29b), respectively, results in
E [Dx ] "
qwx
E[Wx ] "
qxw
(d)
1 ri qwx 1 ri qwx
(d)
i1
and
n
i1
(w)
(si ri ) 12 (si2 ri2 ) qwx
(w)
(w)
(d)
(si ri ) 12 (s2i r2i ) qwx
1 ri qwx 1 ri qwx
(d)
(w)
dx
(6.33a)
wx .
(6.33b)
(6.31c)
(w)
x
q^x w
nx ,
(6.31d)
and
127
nx
(6.34a)
w (w)
q^x
.
nx
(6.34b)
and
2
In Chapter 7 we will see that Estimators (6.34a) and (6.34b) are maximum
likelihood estimators.
Alternatively, we could simplify Equations (6.33a) and (6.33b) under
the Special Case A values of ri 0 and si 1 and solve the resulting
equations directly for our estimates of qwx (d) and qwx (w) given by Equations
(6.34a) and (6.34b). This alternative approach is pursued in the exercises.
EXAMPLE 6.6 100 persons enter the estimation interval (20,21] at exact
age 20. None are scheduled to exit before exact age 21, but one death and
two random withdrawals are observed before age 21. Assuming a uniform
distribution of both random events, estimate qw20(d) .
SOLUTION Using Equation (6.34a), we have 100 12 (2) 12 (1) 99.5, so
w (d)
that q^20
.0101015.
128
s
r
(d)
w (w)
w (d)
1 ur qxr 1 ur qxr .xu du
.(d) (
(d)
e(ur)(.
.(w) )
du
r
( d)
.
(sr)(.(d) .(w) )
.
(w) 1 e
. .
( d)
(6.35a)
.(w)
(sr)(.(d) .(w) )
.
(d)
(w) 1 e
. .
(6.35b)
Substitution of (6.35a) and (6.35b) into the basic moment Equations (6.29a)
and (6.29b), respectively, results in
E [Dx ] "
.(d)
(si ri )(.(d) .(w) )
dx
( d)
(w) 1 e
. .
(6.36a)
E [Wx ] "
.(w)
(si ri )(.(d) .(w) )
wx .
(d)
(w) 1 e
. .
(6.36b)
i1
and
i1
^ (d) and
Again, the necessity of a numerical solution of these equations for .
^ (w) is obvious.
.
If Special Case A prevails, with ri 0 and si 1 for all i, then we
can easily obtain our estimates of qwx(d) and qwx(w) by substituting the estimates
of q(xd) and q(xw) , given by Equations (6.31c) and (6.31d) for q(xd) and q(xw)
themselves in Equations (5.28a) and (5.28b), producing,
w (d)
q^ x
1 nx dnxxwx
(6.37a)
wx (dx wx )
w (w)
q^ x
1 nx dnxxwx
.
(6.37b)
dx (dx wx )
and
In Chapter 7 we will see that Estimator (6.37a) and (6.37b) are maximum
likelihood estimators.
129
EXAMPLE 6.7 Use the data of Example 6.6 to estimate qw20(d) , assuming
an exponential distribution for both death and withdrawal.
SOLUTION Using Equation (6.37a), we have the values
and
dx
dx wx
13 . Then
w (d)
q^20
1 (.97)13 .0101017.
nx dx wx
nx
.97
130
Hoem deals with this dilemma by simply assuming that no one who
died in (x, x1] had a ti value that was less than si . In other words, all deaths
contribute (si ri ) to the exposure. Then the estimate of qwx (d) is produced as
the ratio of number of deaths to the exposure.
An example will make this procedure clear. The important point is
that all persons have a known value of si at entry into (x, x1], regardless of
the outcome for person i (survival, death or withdrawal). Each persons value
of ti becomes known only if that person actually withdraws. Since ti can
never become known for a person who dies, si values are used for all deaths.
EXAMPLE 6.8 The basic data on a sample of four persons has been
analyzed to produce the following ordered pairs (ri , si ) with respect to the
estimation interval (x, x1]. In addition, it is known that one person died and
one person withdrew while under observation, at the times shown. Estimate
qwx (d) .
Person
1
2
3
4
(ri , si )
(0, .40)
(0, 1)
(.30, 1)
(.50, .80)
Time of
Death
.70
Time of
Withdrawal
.75
SOLUTION Persons 1 and 3 contribute exposure of .40 and .70, respectively. Person 4 contributes .25, the theory being that he was scheduled to
withdraw at age x.75, since, in fact, he did! Person 2 contributes 1, under
the assumption that she was not scheduled to withdraw before age x1. The
w (d)
total exposure is .40.70.251 2.35, so q^x 1 .42553.
2.35
131
It will be easy to see the reflection of these two observations in the features
of the actuarial method. The method was developed primarily in an intuitive
manner, in the absence of a guiding statistical theory, and a key theoretical
concession was made in the interest of simpler record-keeping and calculation. The traditional actuarial approach remained in use well into the second
half of the twentieth century, even after the availability of both high-speed
data processing capabilities and modern statistical theory had made the process nearly obsolete. Over the past ten years, however, we have seen mounting evidence to suggest that this pioneering method of survival model
estimation should become, and is becoming, of historical significance only.
Therefore the presentation of the method here is in the nature of a historical
summary. The reader interested in a fuller history can pursue this through
the many references given here.
We will present the actuarial method assuming a double-decrement
environment. The simplification that results if the environment is single-decrement will be easily seen.
6.4.1 The Concept of Actuarial Exposure
The key difference between the traditional actuarial approach to estimation
and the moment approaches presented thus far in this chapter is that the
actuarial approach has not generally made use of the concept of scheduled
exposure, as defined in Section 6.2.3. Rather it made use of a type of
observed exposure, which we will call actuarial exposure, to distinguish it
from the two other types of exposure (scheduled and exact) described earlier.
Recall that person i is scheduled to be an ender to the study at age zi ,
and this is known at entry to the study. Recall also that if x zi x1, then
we say that person i is scheduled to exit (x, x1] at age zi xsi , where
0 si 1. On the other hand, if zi
x1, we say that person i is scheduled
to survive (x, x1], which is the same as to say scheduled to exit at xsi ,
where si 1.
Suppose person i enters (x, x1] at age xri , 0 ri 1. Under the
actuarial method, if person i is an observed ender at xsi , he contributes
exposure of (si ri ). If he is an observed withdrawal at x ti , where,
necessarily, ti si , he contributes exposure of (ti ri ). But what if person i is
an observed death in (x, x1]? We have seen that a proper moment
estimation procedure would have person i contribute exposure to age xsi ,
the scheduled exit age, and early in the historical development of the actuarial method this was intuitively recognized. However, the identification of
scheduled exit age, for a person who died before reaching it, required more
extensive data analysis than the early manual procedures allowed. To resolve
132
133
Cantelli [18] in 1914, and Balducci [5] a few years later. (Cantellis faulty
argument is not presented here; the interested reader will find it in [18].)
Cantellis argument, although faulty, appeared to have been widely
accepted, at least in North America. Subsequent papers by Wolfenden [75],
Beers [9] and Marshall [54], and textbooks by Gershenson [32] and Batten
[8], in effect repeated his argument. But the preponderance of modern
statistical thought indicates that Cantellis approach, although correct for
Special Cases A and B, is seriously flawed in the cases involving enders
within the estimation interval. (See, for example, Hoem [36] and Seal [68].)
In a single-decrement environment the possibility of withdrawals
does not exist. Then observed enders contribute exposure to their observed
ending ages, and observed deaths contribute exposure to age x1, regardless
of a possibly earlier scheduled ending age.
6.4.2 Properties of the Actuarial Estimator
In the single-decrement environment, the actuarial estimators for Special
Cases A and B are the same as the moment estimators, so the discussion in
Section 6.2.5 regarding bias and variance is applicable to the actuarial estimators as well.
In situations involving enders within (x, x1], that is, enders at xsi
where si 1, various investigations have shown under simulation, or otherwise, that the actuarial estimator is negatively biased. It has also been shown
to be inconsistent. (See, for example, Breslow and Crowley [13], and Broffitt
[14].) In large-scale actuarial studies of insurance data, there will always be
a considerable number of enders for the observation period, and it follows
that there could therefore be a significant number in any estimation interval
(unless the study design places them at the interval boundary, which, as we
will see in Chapter 9, is frequently the case). The moment estimator will
always produce a larger estimate q^x than will the actuarial estimator, so a
natural way to counteract the negative bias in the latter is thereby suggested.
Deaths
Finally it should be repeated that all estimators of the form Exposure
,
which includes most of the ones in this chapter, are approximately binomial
proportions, so all are approximately unbiased with variance given by
px qx
Exposure . This fact plays a crucial role in the next section.
6.5 ESTIMATION OF S (x)