
The EMpht-programme

Marita Olsson
Department of Mathematics,
Chalmers University of Technology, and Göteborg University
June 1998

Contents
1 Introduction
1.1 Phase-type distributions
1.2 What does EMpht do?
2 How to run EMpht
2.1 Setup and compilation
2.2 Input
2.2.1 Sample as input
2.2.2 Distribution as input
2.3 Starting the EM-algorithm
2.3.1 Specification of the order
2.3.2 Specification of the structure
2.3.3 Starting values for (π, T)
2.3.4 Number of iterations
2.3.5 Step-length in Runge-Kutta
2.4 Output
3 PHplot
4 Examples
References

1 Introduction
EMpht is a programme for fitting phase-type distributions. It can be used either to fit
a phase-type distribution to a sample (which may contain censored observations), or to
make a phase-type approximation of another continuous distribution. The fitting procedure
consists of an iterative estimation of the parameters of the phase-type distribution, using
an EM-algorithm. The programme is an implementation of the EM-algorithm presented
in Asmussen et al (1996), and in Olsson (1996).
The EMpht-programme is a C-programme (ansi-standard). It is complemented by a
Matlab programme, PHplot, for graphical display of the fitted phase-type distribution.
EMpht is an extension of the EMPHT-programme (Häggström et al 1992); the main
difference being that EMpht can handle samples which contain right-censored and/or
interval-censored observations. The three parts of EMPHT: EMPHTenter, EMPHTdensity,
and EMPHTmain, are all contained in EMpht. EMPHTgraphics is replaced by PHplot.

1.1 Phase-type distributions


Let {J_u}_{u≥0} be a continuous time Markov process on a finite state space E = {1, ..., p} ∪ {Δ},
where Δ is an absorbing state. If Y denotes the time to absorption, then Y has a phase-
type distribution. The simplest examples of phase-type distributions are finite mixtures
and convolutions of exponential distributions.
The parametrisation of the phase-type distribution stems from the underlying Markov
process: Let π_i = P(J_0 = i), and π = (π_1, ..., π_p). The probability of starting in the
absorbing state, Δ, is here assumed to be zero. The infinitesimal generator Q of J_u can
be block partitioned as

        ( T  t )
    Q = (      ),
        ( 0  0 )

where t_i (the i:th element of t, the exit-rate vector) is the conditional intensity of absorption
in Δ from state i. The (p × p)-dimensional matrix T is called the phase-type generator.
Since every row in Q sums to zero, it follows that t = -T e, where e = (1, ..., 1)', a
column-vector of p ones.
The pair (π, T) is referred to as a representation of the phase-type distribution, and p
is the order of the representation. There are usually several different representations of a
phase-type distribution; the parameters (π, T) are thus not identifiable.
Some basic distributional characteristics of phase-type distributions are:
- the distribution function F(y) = 1 - π exp{T y} e
- the density f(y) = π exp{T y} t
- the Laplace transform ∫_0^∞ exp{-sy} F(dy) = π (sI - T)^{-1} t
- the nth moment m_n = ∫_0^∞ y^n F(dy) = (-1)^n n! π T^{-n} e.
More on phase-type distributions and their applications can be found in e.g. Neuts
(1981), O'Cinneide (1990), or in Asmussen and Olsson (1997) and the references given
there.

1.2 What does EMpht do?


EMpht calculates estimates of the elements in (π, T), the parameters of the phase-type
distribution, for a fixed order p given by the user. Starting with initial values (π^(0), T^(0)),
which are either provided by the user or randomly generated in EMpht, the programme
produces a sequence of parameter estimates (π^(1), T^(1)), (π^(2), T^(2)), ..., (π^(N), T^(N)). Each
of these estimates corresponds to one iteration of an EM-algorithm, which implies that the
likelihood function increases in every iteration. That is, if we use EMpht to fit a phase-type
distribution to a sample y = (y_1, ..., y_n), we can be sure that each new estimate is better
than the previous one in the sense that

    L(π^(k), T^(k); y) ≤ L(π^(k+1), T^(k+1); y),

where L(π, T; y) = ∏_{i=1}^n π exp{T y_i} t is the likelihood function.
Approximation of another non-negative continuous distribution by a phase-type distri-
bution is done by minimising the information divergence (the Kullback-Leibler informa-
tion). That is, if the distribution to be approximated has density h(·), and the phase-type
density is denoted by f(·; π, T), then the information divergence of f with respect to h is

    ∫ log[ h(x) / f(x; π, T) ] h(x) dx.

For a fixed density h, the information divergence is minimised by maximising

    ∫ log[ f(x; π, T) ] h(x) dx,

which can be regarded as an infinitesimal analogue of maximising a log-likelihood function,
(see Asmussen et al 1996). In EMpht, the same EM-algorithm is used to perform the
minimisation of the information divergence, as is used to maximise a log-likelihood function
of a sample.
The sequence of estimates converges towards a stationary point (π̂, T̂) of the likelihood,
and if the number of iterations, N, is large enough we have that (π^(N), T^(N)) ≈ (π̂, T̂).
Hopefully, this stationary point is a parameter point that maximises the likelihood (min-
imises the information divergence), but no guarantee can be given. The algorithm could
get stuck in an insignificant local maximum or maybe even in a saddle point. When using
EMpht it is therefore recommended to do several fits, starting with different initial values
of the parameters.
The non-identifiability of the parametrisation of the phase-type distribution implies that
there can be different sets of estimates (π̂, T̂) corresponding to the same phase-type
distribution. Typically, when different fits to a sample or a distribution are performed using
EMpht, they will result in different parameter values. However, the value of the likelihood
function is most often almost the same for different fits, and the corresponding densities
are, when plotted, seldom possible to distinguish from each other.

2 How to run EMpht


2.1 Setup and compilation
First, if you have received the two programmes EMpht and PHplot by e-mail, you should
save them in files named EMpht.c and PHplot.m, respectively. Do not forget to remove
the mail-headings. The first line of EMpht.c should be
/* EMpht.c */
and of PHplot.m
% PHplot.m % .
Now, the EMpht-programme has to be compiled. How to do this depends on your
computer environment, but in a UNIX environment the necessary command should be
something like this:
gcc -ansi -O -o EMpht EMpht.c -lm
where 'gcc' is the name of the compiler, and '-ansi' indicates ansi-standard code.
When the programme has been successfully compiled it is started by writing EMpht.

2.2 Input
The first question which is displayed when EMpht has been started is about the input:
Type of input:
1. Sample
2. Density
Choose 1 if you want to fit a phase-type distribution to data, and 2 if you wish to approxi-
mate another continuous distribution with a phase-type distribution.
2.2.1 Sample as input
EMpht needs to know whether or not your sample contains any censored observations.
Type of observations:
1. no censored observations
2. some right- and/or interval-censored observations
If Y denotes the variable which is observed, a right-censored observation c of Y corresponds
to the event {Y > c}, and an interval-censored observation (a, b] of Y corresponds to the
event {a < Y ≤ b}. (In the programme observations are sometimes referred to as times,
or failure times.) The observations (censored and non-censored) can be entered from the
keyboard or from a file, with or without weights.
Way of entering the observations:
1. Unweighted, from keyboard.
2. Weighted, from keyboard.
3. Unweighted, from file 'unweighted'
4. Weighted, from file 'sample'
The weights are simply the numbers of observations of the same value. If observations
are entered without weights, EMpht assigns a weight equal to one to each observation. To
indicate end of data, the input is always ended by -1.
In whatever way you enter a set of observations, EMpht creates a file named sample
(or overwrites an existing file) and stores the observations and their weights in it. Thus,
the next time you run the programme to do another phase-type fit to the same data-set,
you can use sample as input (option 4 above).
Let us illustrate the different ways of entering data with two examples.

Example 1, no censored observations

Consider five observations (failure times); 4, 9, 6, 1, 6. If you have selected option 1 above,
you enter the observations one at a time.
Select (1-4): 1
Enter failure times, and quit with -1
Time 1:4
Time 2:9
Time 3:6
Time 4:1
Time 5:6
Time 6:-1

If you have selected option 2 above, you will be asked to enter the number of cases,
(the weight), after each entered observation:
Select (1-4): 2
Enter failure times and number of cases. Quit with time = -1.
Time 1:4
Number of cases:1
Time 2:9
Number of cases:1
Time 3:6
Number of cases:2
Time 4:1
Number of cases:1
Time 5:-1

To use option 3 above, you must first store the observations in a column (or in a row)
ending with -1, in a file named unweighted. In the file sample, (option 4), each observation
must be followed by its weight. This is easiest done by letting each row of the file consist
of an observation and its weight (in that order). The file must be ended with -1.

    unweighted:        sample:
    4                  4 1
    9                  9 1
    6                  6 2
    1                  1 1
    6                  -1
    -1

Figure 1: The contents of the files unweighted (to the left), and sample (to the right), for
the data considered in Example 1.

Example 2, censored observations

To distinguish between different types of observations (non-censored, right-censored, and
interval-censored) an indicator must be entered together with each observation. The indi-
cator equals 0 for a right-censored observation, 1 for a non-censored observation, and 2 for
an interval-censored observation.
Consider the following set of six observations: {Y = 7} (a non-censored observation),
{Y > 25}, {Y > 25}, {Y > 7}, (three right-censored observations), and {10 < Y ≤ 12},
{15 < Y ≤ 18}, (two interval-censored observations). The input for each observation is
always entered into EMpht in the following order: indicator, observed time-point(s), and
if choice 2 or 4 is used, the weight.

    unweighted:        sample:
    1 7                1 7 1
    0 25               0 25 2
    0 25               0 7 1
    0 7                2 10 12 1
    2 10 12            2 15 18 1
    2 15 18            -1
    -1

Figure 2: The contents of the files unweighted (to the left), and sample (to the right), for
the set of observations given in Example 2.

2.2.2 Distribution as input


EMpht provides six pre-specified distributions. It is also possible to define other distribu-
tions, but it requires that you add a couple of lines of C-code in EMpht.c, (an example
is given below). Note that the exponential and the Erlang distributions, which are not
explicitly specified, are special cases of the phase-type distribution (option 6).
Type of density:
1. Rectangle
2. Normal
3. Lognormal
4. Weibull
5. Inverse Gauss
6. Phase-type
7. User specified
Select(1-7):

The densities of the six pre-specified distributions are:

1: f(x) = 1/(b - a),  a < x < b

2: f(x) = (1/√(2πσ²)) exp{ -(x - μ)² / (2σ²) }

3: f(x) = (1/(xσ√(2π))) exp{ -(log x - μ)² / (2σ²) }

4: f(x) = cβ x^(c-1) exp{ -β x^c }

5: f(x) = (c/√(2π)) x^(-3/2) exp{ cζ - (c²/x + ζ²x)/2 }

6: f(x) = π exp{T x} t

The densities 2-6 are defined for arbitrarily large x, and therefore you are asked to specify
an upper truncation point. The normal distribution is automatically truncated to the left,
so that only positive failure times are allowed. EMpht will also ask you to specify the
parameters of the density you have selected, using the same notation of the parameters as
in the density formulas given above.
Beside an upper truncation point and the parameter values, you will be asked to spec-
ify the maximum acceptable probability in one point, and the maximum time interval
corresponding to one point. This is because the distribution given as input to EMpht is
discretised into a weighted sample, where the weights are the probability mass in a small
interval. The discretisation can be more or less refined; the smaller maximum time
interval and maximum probability you specify, the more refined the discretisation is, and
the larger the size of the weighted sample becomes. However, the size of the sample affects
the amount of computation involved in the EM-algorithm; the more observations, the more
computation is involved and the more time each iteration will require. There is usually not
much to gain in precision by letting the maximum acceptable probability be too small (no
less than maybe 0.01, but it depends on the time scale involved). Usually, by choosing the
time interval l such that (truncation point / l) ≤ 500, the computations involved in each
iteration of the EM-algorithm are manageable. (The time each iteration requires, however,
depends mostly on the order p of the phase-type distribution. See section 2.3.1.)
A phase-type distribution as input
When option 6 is selected, you are asked to specify the number of phases, p, the initial
distribution of the underlying Markov process π, and the phase-type generator T. This
can be done via the keyboard, but it is more convenient to use a file input-phases. In both
cases the parameters are entered in the following order:
p
π_1 T_11 ... T_1p
π_2 T_21 ... T_2p
...
π_p T_p1 ... T_pp
Let us consider an Erlang distribution (3,2) as an example of a phase-type distribution
as input. An Erlang (3,2) is the distribution of the sum of three independent exponential
random variables, each with expectation 1/2 (parameter=2). It can be represented as a
phase-type distribution of order p = 3, where the underlying Markov process starts in state
1 (implying π_1 = 1, π_2 = 0, π_3 = 0), and jumps successively to state 2, to state 3, and
finally to the absorbing state. In each state it spends an exponentially distributed time
with parameter equal to 2; T_11 = T_22 = T_33 = -2. Thus, the file input-phases contains the
following:
3
1 -2 2 0
0 0 -2 2
0 0 0 -2

User specified distribution

If you want a distribution not included among the pre-specified ones, you must do a bit of
programming. There are two places in the file EMpht.c where you must add some code;
in the function input-density, and in the function density. We illustrate with an example:
Let us consider the Rayleigh distribution with density

    f(t) = (a + b t) exp{ -a t - 0.5 b t² }.
Lines 611-614 in EMpht.c (in function input-density) read
case 7:
printf("You must do some programming, - see instructions");
/* This is where the parameters are entered */
break;

You should change these rows into the following:

case 7:
printf("A Rayleigh distribution with parameters a and b. ");
printf("a:");
scanf("%lf", &parameter[0]);
printf("b:");
scanf("%lf", &parameter[1]);
break;

Now, line number 355 (in function density), should contain an expression of the Rayleigh
density at t. Thus, you change this row to
return( (par[0]+par[1]*t )*exp(-par[0]*t-0.5*par[1]*t*t) );
and the specification of the Rayleigh density is completed. After compilation EMpht will
have the Rayleigh distribution as option 7 on the list of densities.
Note that in the function density, the parameter given as par[0] is always the pa-
rameter first entered in input-density, and par[1] is the second parameter entered. If
the distribution you want to approximate has more than two parameters, you must also
increase the number of elements that the vector parameter in function input-density is
allowed to contain. For instance, if you have three parameters you change parameter[2]
to parameter[3] on line 499.

2.3 Starting the EM-algorithm


Now it is time for the actual fitting of the phase-type distribution, which is conducted
iteratively with an EM-algorithm. Whether you have a sample or a density as input,
EMpht uses the same kind of algorithm, and needs the following specifications: the order
and the structure (sub-class) of the phase-type distribution to be fitted, starting values for
the parameters of the phase-type distribution, the number of iterations of the EM-algorithm
to be performed, and the step-length used in a Runge-Kutta solution of a system of linear
differential equations. It is not as complicated as it might appear at first sight. Let us take
it from the beginning.

2.3.1 Specification of the order

First you are asked to specify the order p of the phase-type distribution. Start with a small
number (< 5, say), because the time the fitting procedure requires is heavily dependent
on p, and increases with p like p². If the first fit is not satisfactory you can do a new one
with a larger p.

2.3.2 Specification of the structure

The next choice concerns the type (or sub-class) of phase-type distribution you wish to fit.
The distribution type is characterised by the structure of zeroes in the π-vector and the
T-matrix, which specifies what paths the underlying Markov process is allowed to follow.
(For instance, if element (i, j) of T is zero, the Markov process is not allowed to jump
directly from state i to state j.)
9
The first five types are pre-defined, and the last three are options where you can specify
other structures in different ways:
Select distribution type:
1. General phase-type
2. Hyperexponential
3. Sum of exponentials
4. Coxian
5. Coxian general
6. User specified structure (from file 'distrtype')
7. User specified starting values (from file 'phases')
8. User specified starting values (from keyboard)

Type 1, general phase-type, puts no restrictions on the underlying Markov process.
All elements in π and T are allowed to be non-zero.
Type 2, hyperexponential, is also known as a mixture of exponentials. The Markov
process can start in any state (all elements in π are allowed to be non-zero), but terminates
in the absorbing state without visiting any other state (all elements of T off the main
diagonal are zero).
Type 3, sum of exponentials, or generalised Erlang-distribution; the underlying
Markov process must start in state 1 (π = (1 0 ... 0)), and then visits each state 2, ..., p, in
that order, and terminates when leaving state p.
Type 4, the Coxian distribution, is constructed as a sum of exponentials with the
exception that the absorbing state can be reached from all of the other states. Thus, from
state i the Markov process can either jump to state i + 1 or to the absorbing state.
Type 5, the Coxian general structure, is the same as the Coxian, except that the
Markov process is allowed to start in any state.
To use the sixth option you must create a file distrtype, where the structure is specified.
The structure is coded as follows: The elements of π and T which are allowed to be non-
zero are represented by 1:s, and the elements which are zero are represented by 0:s. This is
with one exception; a 0 at a diagonal element of T (which can never be zero) means that
the corresponding element in the exit-rate vector t is zero. Figure 3 below shows how the
five pre-defined structures are represented. In the file distrtype the elements are written in
the following order:
π_1 T_11 ... T_1p
π_2 T_21 ... T_2p
...
π_p T_p1 ... T_pp
Type 1:     Type 2:     Type 3:     Type 4:     Type 5:
1 1111      1 1000      1 0100      1 1100      1 1100
1 1111      1 0100      0 0010      0 0110      1 0110
1 1111      1 0010      0 0001      0 0011      1 0011
1 1111      1 0001      0 0001      0 0001      1 0001

Figure 3: The pre-defined structures number 1 to 5, represented as each of them
should be in the file distrtype (when p=4): The first column represents the π-vector, and
the (p × p)-matrix to the right of it represents T.

2.3.3 Starting values for (π, T)

If you have chosen any of the structures 1-6 above, EMpht generates random starting
values for those elements in π and T not fixed to be zero. You only give a seed (an
integer) for the random generation of (π^(0), T^(0)). (It is thus possible to repeat a run of
EMpht by using the same seed.)
If you want to specify (π^(0), T^(0)) yourself, you can use option 7 or 8 when choosing
the phase-type structure. In option 7 the starting values are entered from a file named
phases, and in option 8 from the keyboard. The order in which the parameters should
appear in the file phases is the same as in distrtype, (see description of option 6 in the
previous section). Note that the EM-algorithm used in EMpht preserves zeroes in π and
T. Thus, any parameter that is given zero as a starting value stays equal to zero through
every iteration. (This is why it is so easy to fit special sub-structures of the phase-type
distribution via the EM-algorithm.)
EMpht automatically saves the estimates of π and T given by the last EM-iteration in
the file phases. Therefore, option 7 can be used to continue the fit done in the previous run
of EMpht.
The starting values of π and T, and also of t, are shown on the screen: The first column
is π^(0), thereafter comes the p × p-matrix T^(0), and the last column written on the screen
is t^(0).

2.3.4 Number of iterations

You can always start with a small number of iterations (like 10-50), to see how much time
each iteration takes. After the specified number of iterations has been performed, you are
asked if you want to do more iterations. Hence, you won't have to start EMpht all over
again.
It is not easy to say how many iterations are needed in order to get estimates
close to the maximum-likelihood estimates. The larger p is, the slower the convergence of the
EM-algorithm will be, and the more iterations are needed. If p is not too big (say p ≤ 5)
a few thousand iterations are usually enough for the parameter estimates to stabilise,
(but it depends on your input as well).
How much time each iteration requires is of course important when deciding on how
many iterations to do. As mentioned before, the time needed increases rapidly with p.
The structure of the phase-type distribution is also of importance; a structure where many
components of π and T are fixed to zero gives faster iterations than the general phase-
type structure. It also requires a smaller number of iterations for the estimates to stabilise.
Very often, the Coxian distribution provides just as good a fit as the general phase-type
distribution (of equal order) does.

2.3.5 Step-length in Runge-Kutta

Before the EM-iterations begin, you are asked for the step-length of the Runge-Kutta
procedure:
Choose step-length for the Runge-Kutta procedure:
1. Default value
2. Your own choice of value
In each iteration of the EM-algorithm, the new parameter estimates are calculated by
solving a system of homogeneous linear differential equations (of dimension p(p + 2)). This
is done numerically with the Runge-Kutta method of fourth order. The default value of the
step-length is 0.1 / max|T_ii|, where max|T_ii| is the largest (in absolute value) diagonal element of
the last estimate of T. Thus, the larger max|T_ii| is, the smaller the step size, and the longer
time each iteration will require.
Usually the default value works well, but if you want to speed up the fit you could try
a larger step-size.

2.4 Output
The estimates of π and T are written on the screen; the leftmost column is the π-vector,
the rightmost column is the t-vector, and in between is the T-matrix. The estimates of
the first 5 iterations are displayed, thereafter every 25th iteration, and the last one. Also,
the value of the log-likelihood function is given together with every displayed setup of
parameter estimates.
It is not hard to change in the programme code which iterations should be displayed.
On line 1573 in EMpht.c you will find
if ((k < 6) || ( (k % 25)==0 || k==NoOfEMsteps ))
The iteration number is k, and NoOfEMsteps is the specified number of iterations to be
performed. If you for instance want to see the first 3 estimates, every 10th estimate and
the last estimate of π and T on the screen, you can change line 1573 to
if ((k < 4) || ( (k % 10)==0 || k==NoOfEMsteps ))
and recompile the programme.
Two files, inputdistr and phases, are created when running EMpht. The file inputdistr
contains the input; either a sample or a discretised density, and is used as input to the Mat-
lab programme PHplot. (However, if your input is a sample containing interval-censored
observations, the file inputdistr will be empty. Calculation of the empirical distribution for
such a sample has not yet been implemented.)
The file phases contains the final estimates of π and T calculated in EMpht, (it is
actually updated every time new parameters are displayed). It is used as input to PHplot,
but can also serve as starting values to the EM-algorithm (choice number 7 when selecting
structure), if EMpht is restarted to continue the last performed fit.

3 PHplot
To run PHplot you must first start Matlab. In Matlab PHplot is started by writing PHplot,
(provided PHplot has been saved in a file named PHplot.m). The following menu will be
shown on the screen:
1. Mean and standard deviation
2. Display survival function
3. Display distribution function
4. Display density
5. Display failure rate
6. Print graph
7. Save graph on file "PHgraph"
8. Load new estimates of pi and T
9. Quit
Select (1-9):

The first option gives the mean and standard deviation of the fitted phase-type distri-
bution. Options 2-5 give a graph of the fitted phase-type distribution. If you have used
EMpht to approximate another continuous distribution, this distribution will be plotted
together with the fitted phase-type distribution.
If your input to EMpht is a sample containing no censored observations, the empirical
distribution function is plotted in option 3 (and one minus the empirical distribution in
option 2), together with the fitted phase-type distribution. If your input was a sample
containing some right-censored observations, the Kaplan-Meier estimate of the survival
function is plotted together with the survival function of the fitted phase-type distribution
in option 2, (and one minus the Kaplan-Meier estimate in option 3). Options 4 and 5 give
the density and failure rate, respectively, of the fitted phase-type distribution only.
Options 6 and 7 can be used to print and save on a file the graph currently displayed.
Option 8 is convenient if you run EMpht and PHplot simultaneously in different windows,
and want to plot the current fit. Using option 8, you do not have to quit and restart
PHplot every time you have updated a fit (performed more iterations) in EMpht.
4 Examples
In this section some examples of approximations of other distributions by phase-type dis-
tributions are presented. Hopefully these examples can provide some guidance on how
to choose the truncation point, maximal time interval, maximal probability, number of
phases, etc.

First example; an inverse Gaussian distribution

The first example is an approximation of an inverse Gaussian distribution with parameters
c = 0.5 and ζ = 0.5, (mean=1 and standard deviation=2). The probability density function
has a very pointed shape, which has the effect that more phases are required to get a
satisfactory approximation.
Figure 1 shows the density of the inverse Gaussian distribution and the densities of two
different phase-type approximations, with p=4 and with p=6, respectively. In EMpht, when
the discretisation of the inverse Gaussian distribution was specified, we set the truncation
point=8, maximum probability = 0.01, and maximum time interval = 0.05. The two fitted
phase-type distributions were both of the general structure. In the first fit, p=4, 5000
iterations of the EM-algorithm were performed, and in the second fit (p=6) we made 8000
iterations. The default value of the step-length in the Runge-Kutta solution was used
throughout.
For comparison, a Coxian fit with p=6 was performed using 4000 iterations. When
plotted against the other fits (of general phase-type structure) it turned out to be indis-
tinguishable from the fit with p=4.

Figure 1: The density of an inverse Gaussian distribution (solid curve) and two phase-type
approximations; p=4 (dashed curve), and p=6 (dotted curve).
Second example; a lognormal distribution
A lognormal distribution with parameters μ=1 and σ=1 is approximated in our second
example by phase-type distributions of order 2 and 4, (Figure 2). This lognormal distribu-
tion has expectation about 4.5 and standard deviation about 5.9. In EMpht we truncated
the distribution at 25, set the maximum probability to 0.01 and the maximum time in-
terval to 0.05. The phase-type approximation with p=2 is based on 2000 iterations of the
EM-algorithm, and for p=4 we used 4000 iterations.

Figure 2: The density of a lognormal distribution (solid curve), and two phase-type approxima-
tions; p=2 (dashed curve), and p=4 (dotted curve).

Third example; two different Weibull distributions

The Weibull distribution shown in Figure 3 is difficult to find a good phase-type fit of,
because a lot of phases (states) are needed to catch the delay in the beginning of this distri-
bution. The approximations we have done are all of the sub-structure sum of exponentials
(generalised Erlang). This sub-structure was chosen to get a phase-type approximation
with as small variance as possible (Aldous and Shepp 1987), and also because it has a lot
of zeroes in the T-matrix and thus fewer parameters to estimate.
The Weibull distribution has parameters c=5 and β=1, expectation ≈ 0.92 and standard
deviation ≈ 0.21. We set the truncation point=2, maximum probability=0.005, and maxi-
mum time interval=0.005. To fit the sum of exponentials structure we had to choose a small
step-length in the Runge-Kutta procedure; 0.001 for the phase-type fit of order p=10.
For the fits of order p=15 and p=20 we did not start the EM-algorithm randomly, but
defined a file phases so that the structure was a sum of exponentials (T_{i,i+1} = -T_{i,i} and all
other off-diagonal T_{ij} = 0), and the diagonal elements chosen to be -15/0.92 for p=15, and -20/0.92 for
p=20. With these initial values only 10-20 iterations of the EM-algorithm were needed to
reach a stationary point in the likelihood.
It is possible that for instance the Coxian structure with p > 20 would have given a
better approximation to this Weibull distribution, but the computational effort to fit a
distribution with that many parameters is substantial.
A Weibull distribution with c=1.8 and β=1 (mean ≈ 0.89, standard deviation ≈ 0.51)
is much easier to approximate, see Figure 4. The phase-type fit is a Coxian distribution
with p=5, obtained through 2000 iterations of the EM-algorithm. The truncation point of the
Weibull distribution was set to 3, maximal probability to 0.01, and maximal interval to
0.0075.

Figure 3: The density of a Weibull distribution; c=5 and β=1 (solid curve), and three phase-
type approximations of order 10, 15 and 20. The pointedness of the phase-type densities increases
with the order.

Figure 4: The density of a Weibull distribution; c=1.8 and β=1, (solid curve), and a phase-type
approximation of order 5 (dashed curve).

References
[1] D. Aldous and L. Shepp (1987) The least variable phase-type distribution is Erlang.
Commun. Statist. - Stochastic Models 3, 467-473.
[2] S. Asmussen, O. Nerman, and M. Olsson (1996) Fitting phase type distributions via
the EM algorithm. Scand. J. Statist. 23, 419-441.
[3] S. Asmussen and M. Olsson (1997) Phase-type distribution. Encyclopedia of Statistical
Sciences, Update volume 2, 525-530.
[4] O. Häggström, S. Asmussen and O. Nerman (1992) EMPHT - a program for fitting
phase type distributions. Technical report, Department of Mathematics, Chalmers Uni-
versity of Technology, Göteborg, Sweden.
[5] M. F. Neuts (1981) Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins,
Baltimore.
[6] C. A. O'Cinneide (1990) Characterizations of phase-type distributions. Commun.
Statist. - Stochastic Models 6, 1-57.
[7] M. Olsson (1996) Estimation of phase type distributions from censored data. Scand.
J. Statist. 23, 443-460.

Marita Olsson
Department of Mathematics
Chalmers University of Technology
S-412 96 Göteborg
SWEDEN
E-mail: marita@math.chalmers.se

