
Accident Analysis and Prevention 34 (2002) 417-427

Generalised linear accident models and goodness of fit testing

G.R. Wood
Institute of Information Sciences and Technology, College of Sciences, Massey University, Private Bag 11 222, Palmerston North 5331, New Zealand
Received 26 June 2000; received in revised form 19 January 2001; accepted 22 January 2001
Abstract
This paper has two aims. The primary aim is to provide a practical resolution of the low mean value problem, a barrier to goodness of fit testing for the commonly used generalised linear accident model. A secondary aim of the paper is to describe the underlying mechanism of these models, so making them fully accessible to the transport modeller. © 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Generalised linear model; Low mean value; Negative binomial; Pearson's X^2; Poisson; Scaled deviance
www.elsevier.com/locate/aap
1. Introduction
The prime purpose of this paper is to present a
method for overcoming the low mean value problem,
a barrier to valid goodness of fit testing for generalised
linear accident models. Attention was first drawn to
this problem by Maycock and Hall (1984) and more
recently it has been broached by Fridstrøm et al. (1995)
and Maher and Summersgill (1996). A secondary aim is
to provide a straightforward description of the model
fitting process, so making the methods more widely
available.
The paper can be read at two levels. An accident
modeller conversant with fitting the models, but interested
in a practical method for assessing goodness of
fit, should follow the main body of the paper. The
reader interested in building up a full technical knowledge
of the mechanism of the models can add details
from Appendix A as needed. A companion statistical
methodology paper, which delves more deeply into the
underlying goodness of fit theory (Wood, 2000) is also
available.
The format of this paper is as follows. In the next
section the deterministic and stochastic components of
the generalised linear accident model are described.
Section 3 briefly considers model fitting and points the
reader to Appendix A, where a first-principles description
of how the models are fitted is presented. In
Section 4 a practical resolution of the goodness of fit
problem for low mean values is given. Section 5 summarises
the paper.
2. The model
For the sake of completeness, we begin with a succinct
description of the model. It was first introduced in
Maycock and Hall (1984), extensively developed in
Hauer et al. (1988) and further discussed in Maher and
Summersgill (1996); it appears to be widely accepted
now as an appropriate accident model.
The aim is to model data of the type displayed in
Fig. 1. This shows the number of loss-of-control injury
accidents, over a 5-year period between 1980 and 1991,
for 392 approaches to intersections in New Zealand.
What type of curve should be used to track the mean
accident level? A suitable family are curves of the form
\nu = \beta_0 x^{\beta_1}, where \nu is the mean number of accidents
when the average flow is x vehicles in the set time
period. Such curves start at the origin and allow downward
or upward curvature according to whether \beta_1 is
smaller or larger than 1. For accident patterns involving
two flows x_1 and x_2 (e.g., right-angled collisions),
the model form extends to \nu = \beta_0 x_1^{\beta_1} x_2^{\beta_2}, and so on.
Note that for such models the logarithm of the mean,
\eta, is a linear function of the logarithm of the flow, since
E-mail address: g.r.wood@massey.ac.nz (G.R. Wood).
0001-4575/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0001-4575(01)00037-9
\eta = \log\nu = \log(\beta_0 x^{\beta_1}) = \log\beta_0 + \beta_1 \log x = \beta_0' + \beta_1 \log x.

For a given flow x in Fig. 1, what distribution is
reasonable for Y, the number of accidents? If the
variance in accident numbers is roughly equal to the
mean, for a given flow, then a Poisson distribution

f_Y(y) = e^{-\nu} \nu^y / y!   for y = 0, 1, ...

is reasonable, providing what we term here the Poisson
model. Fig. 2 provides a plot of the variance of the
number of accidents against the mean number of
accidents, for the data shown in Fig. 1. Evidently the
variance is very close to the mean, so a Poisson model
is supported. (In order to create Fig. 2, the range of
flows in Fig. 1 was partitioned into seven intervals and
the mean and variance for the number of accidents in
each interval calculated.)

It is often the case, however, that greater variability
is seen in the number of accidents occurring at arms
with a given flow than a Poisson model allows. A data set
exhibiting this behaviour is shown in Fig. 3; the plot
of variance against mean is now shown in Fig. 4,
revealing a more quadratic relationship. In order to
model such variation, it is necessary to acknowledge the
between-arm variation in addition to the within-arm
variation already acknowledged in the Poisson model.
The manner in which this is done is now described.

Fig. 1. A vertically jittered scatterplot of the number of loss-of-control injury accidents, over 5 years, against 24 h vehicle flow for 392 approaches to intersections throughout New Zealand. The mean number of accidents in flow classes from 0 to 1999, 2000 to 3999 and so on are shown via the dashed line; the Poisson regression curve is shown as the solid line. (The jittering used here randomly offsets points in a vertical direction so that overlapping points can be seen.)

Fig. 2. A plot of the variance against mean for the number of loss-of-control injury accidents, using seven flow intervals in Fig. 1. The identity relationship shown suggests a Poisson model.

Fig. 3. A vertically jittered scatterplot of the number of rear-end injury accidents, against vehicle flow (during business hours, from 7 a.m. to 6 p.m.) for the years 1989-1991, for 426 approaches to signalised intersections throughout New Zealand. Also shown are the grouped accident means (dashed line) and Poisson regression curve (solid line).

Fig. 4. A plot of variance against mean for the rear-end data of Fig. 3. Note that now the accident variance grows larger than the accident mean as the mean increases, indicating that a negative binomial distribution may be appropriate for the number of accidents at a given flow.
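The log-linear form of the mean curve is easy to check numerically. A minimal sketch; the coefficient values are illustrative only, chosen close to the loss-of-control fit reported in Section 3:

```python
import math

def mean_accidents(x, b0, b1):
    """Power-law mean curve nu = b0 * x**b1 from Section 2."""
    return b0 * x ** b1

# Illustrative values, close to the loss-of-control fit reported in Section 3
b0, b1 = 2.40626e-3, 0.38883
x = 5000.0
nu = mean_accidents(x, b0, b1)

# log nu = log b0 + b1 * log x: the model is linear on the log scale
assert math.isclose(math.log(nu), math.log(b0) + b1 * math.log(x))
```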
For a single arm of a single intersection, we continue
to assume that the distribution of the number of accidents
within the intersection is Poisson, with mean m,
termed the safety of the arm in Hauer et al. (1988).
For a given flow x in Fig. 3 and associated mean \nu, we
are concerned with a population of many intersection
arms throughout New Zealand. We model the variation
between these arms by acknowledging that their safety
m is itself a value of a random variable M. In Maycock
and Hall (1984) and Hauer et al. (1988), it is sensibly
assumed that M on [0, \infty) follows a gamma distribution,
with parameters k and a. Since this gamma distribution
must have mean \nu, we have that \nu = ka (the
gamma mean) or a = \nu/k.
The distribution of the number of accidents Y, for a
given flow (or set of flows), will then be a mixture of the
Poisson distributions by the gamma distribution, i.e., a
negative binomial distribution. This distribution is described
using the two parameters k and \nu, and has the form

f(y; k, \nu) = \binom{y+k-1}{k-1} \left(\frac{k}{\nu+k}\right)^k \left(\frac{\nu}{\nu+k}\right)^y   for y = 0, 1, ...

When k is a non-negative integer, the distribution can
be conveniently remembered as that of the number of
failures Y until we have k successes, with p = k/(\nu+k)
equal to the probability of success in each trial.
If Y has a negative binomial distribution, it can be
shown that

Var(Y) = \nu + \nu^2/k,

agreeing with the quadratic relationship between variance
and mean illustrated in Fig. 4 for the rear-end
data of Fig. 3. Full details of these distributions and
associated calculations are presented in Appendix A.
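The mean and variance relations of this parameterisation can be checked by direct summation of the series. A sketch in Python; lgamma generalises the binomial coefficient to non-integer k, and the values k = 0.8 and \nu = 2 are illustrative (k = 0.8 echoes the rear-end fit in Section 3):

```python
import math

def nb_pmf(y, k, v):
    """Negative binomial pmf with mean v and shape k, p = k/(v+k).
    lgamma replaces the binomial coefficient so non-integer k is handled."""
    p = k / (v + k)
    return math.exp(math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
                    + k * math.log(p) + y * math.log(1 - p))

k, v = 0.8, 2.0   # illustrative values
mean = sum(y * nb_pmf(y, k, v) for y in range(400))
var = sum((y - mean) ** 2 * nb_pmf(y, k, v) for y in range(400))
assert abs(mean - v) < 1e-9               # E(Y) = v
assert abs(var - (v + v ** 2 / k)) < 1e-9  # Var(Y) = v + v^2/k
```

The truncation at 400 terms is an assumption; the geometric tail of the pmf makes the truncation error negligible here.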
To summarise, for models involving a single flow:
1. The curve of true means is modelled as \nu = \beta_0 x^{\beta_1}.
2. The distribution for a given flow x is assumed to be negative binomial, with parameters k (not depending on x) and \nu = \beta_0 x^{\beta_1}.
The parameters \beta_0, \beta_1 and k must be estimated
from the data. We point to how this is carried out in
the next section. (Extensions to models with more
than one explanatory flow variable are straightforward.)
3. Model fitting

Given that centre stage has been captured by this
generalised linear model, it is reassuring to see that the
fitting can be summarised in a single iterative equation
for the Poisson model; for the negative binomial model
this iterated equation must be nested within a second
iterative scheme. In the discussion of Hauer et al.
(1988) by Carl Morris (p. 59, point 4), these details
were requested; to the knowledge of the author these
have not to date appeared in print, so are presented
here in Appendix A. We note that for the loss-of-control
data illustrated in Fig. 1 the Poisson iterative
routine yields b_0' = \log b_0 = -6.02968 (so b_0 = 2.40626 \times 10^{-3})
and b_1 = 0.38883 as estimates of \beta_0' (\beta_0) and \beta_1.
For the rear-end data, the negative binomial
routine yields b_0' = \log b_0 = -11.305 (so b_0 = 1.2311 \times 10^{-5}),
b_1 = 1.17176 and \hat{k} = 0.8 as estimates
of \beta_0' (\beta_0), \beta_1 and k, respectively. The curves of fitted
means, \hat{\nu} = b_0 x^{b_1}, are superimposed on the scatterplots
in Figs. 1 and 3.
4. Goodness of fit

Goodness of fit for a generalised linear model is
traditionally assessed using either the scaled deviance
G^2 (twice the logarithm of the ratio of the likelihood of
the data under the larger model to that under the
smaller model) or Pearson's X^2 statistic (the sum of squares
of standardised observations). Here the models are
assumed to be nested and the larger model is that with
the greater number of parameters. As long as the data
are approximately normally distributed, these statistics
follow \chi^2 distributions with degrees of freedom
equalling the difference between the number of parameters
in the larger model and the number in the smaller
model (Dobson, 1990, Section 5.7).
When the Poisson mean is low, it has been noted in
the accident literature that G^2 fails as an arbiter of
goodness of fit (Maycock and Hall, 1984; Maher and
Summersgill, 1996). In the second paper, two remedies
were suggested: use of X^2 or use of G^2/E(G^2). In this
section we show that neither of these follows an approximate
\chi^2 distribution when the mean is low. We do
offer, however, a practical grouping technique which is
capable of providing dependable test statistics, using
either G^2 or X^2 as a foundation.
This accident modelling problem has stimulated the
development of statistical methodology for the low
mean value goodness of fit problem. For full details,
the interested reader is referred to Wood (2000), where
theory for both Poisson and negative binomial models
is developed. In this section, the aim is to summarise
these results for the practising transport modeller.
4.1. The Poisson problem

When drawing a sample Y_1, ..., Y_n from a single Poisson
distribution with mean \nu, traditional goodness of fit
Fig. 5. Mean and variance of a component of G^2, for low Poisson \nu values. The mean falls well below 1 and the variance well below 2, as \nu falls to 0.
testing with G^2 or X^2 rests on the assumption that the
components follow approximate \chi^2_1 distributions, so
have mean of 1 and variance of 2. (Recall that the
mean and variance for a \chi^2 distribution are the degrees
of freedom and twice the degrees of freedom, respectively.)
In the Poisson case, we have (Maher and Summersgill, 1996, p. 283)

G^2(\nu; n) = \sum_{i=1}^{n} 2 \left( y_i \log(y_i/\nu) - y_i + \nu \right)

and

X^2(\nu; n) = \sum_{i=1}^{n} (y_i - \nu)^2 / \nu.
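Both statistics are straightforward to compute once fitted means are available. A sketch, written with per-observation means \nu_i as used later in the section; the counts and means below are hypothetical:

```python
import math

def poisson_g2(ys, nus):
    """Scaled deviance: G^2 = 2 * sum(y*log(y/nu) - y + nu), with the y = 0 term read as 0*log(0) = 0."""
    return 2.0 * sum((y * math.log(y / nu) if y > 0 else 0.0) - y + nu
                     for y, nu in zip(ys, nus))

def pearson_x2(ys, nus):
    """Pearson's X^2 = sum((y - nu)^2 / nu)."""
    return sum((y - nu) ** 2 / nu for y, nu in zip(ys, nus))

ys = [0, 1, 2, 0, 3]               # hypothetical accident counts
nus = [0.5, 1.0, 1.5, 0.8, 2.2]    # hypothetical fitted means
g2, x2 = poisson_g2(ys, nus), pearson_x2(ys, nus)
assert g2 >= 0.0   # each deviance component is non-negative
assert x2 >= 0.0
```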
In Figs. 5-7, we show the mean and variance of the
components (i.e., single terms in the summation) of G^2,
X^2 and G^2/E(G^2) respectively, for low values of the Poisson
mean \nu. It is clear that as \nu goes to 0, E(G^2), Var(G^2),
Var(X^2) and Var(G^2/E(G^2)) increasingly depart from
the appropriate \chi^2 values. The curves for G^2 were
found by summing series; for X^2 it can be shown that
components have expected value of 1 and variance of
2 + 1/\nu.
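The series sums behind Fig. 5 are easy to reproduce. A sketch that computes the mean and variance of a single G^2 component for a given Poisson mean by direct summation; the truncation point of 400 terms is an assumption, ample for the small means considered here:

```python
import math

def poisson_pmf(y, v):
    return math.exp(-v + y * math.log(v) - math.lgamma(y + 1))

def g2_component(y, v):
    """Single term of G^2: 2*(y*log(y/v) - y + v); the y = 0 term is 2v."""
    return 2.0 * ((y * math.log(y / v) if y > 0 else 0.0) - y + v)

def component_moments(v, ymax=400):
    e = sum(g2_component(y, v) * poisson_pmf(y, v) for y in range(ymax))
    var = sum((g2_component(y, v) - e) ** 2 * poisson_pmf(y, v) for y in range(ymax))
    return e, var

e_low, var_low = component_moments(0.2)
assert e_low < 1.0 and var_low < 2.0   # both sag below the chi-square(1) targets as v falls
```

Evaluating the same sums near \nu = 1.6 shows the component variance passing through 2, the fact used to build Table 1 below.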
To illustrate how this behaviour causes difficulties
in practice, we use the loss-of-control data shown in
Fig. 1. Here,

G^2 = 2 \sum_{i=1}^{392} \left( y_i \log(y_i/\nu_i) - y_i + \nu_i \right) = 145.67.
The degrees of freedom for the test will be 392 - 2 =
390 and we find that P[\chi^2_{390} \ge 145.67] = 1.000, indicating
a fit too good to be true, certainly far better than
the dashed lines of local means would suggest. We are
seeing here the effect of the low value of E(G^2). Also

X^2 = \sum_{i=1}^{392} (y_i - \nu_i)^2 / \nu_i = 407.04

and P[\chi^2_{390} \ge 407.04] = 0.2659, an apparently sensible
result since it suggests a reasonable fit. As our study of
Var(X^2) in Fig. 6 shows, however, components of X^2
have wide variation for means less than unity. This
leads to high variance, discreteness and right skewness
in the true X^2 distribution, so it may be poorly approximated
by the standard \chi^2 reference distribution.
In the negative binomial situation, the departures
from the appropriate \chi^2 values become more extreme;
the expectation and variance for a G^2 component, in
the negative binomial case with k = 1, are shown plotted
against the mean in Fig. 8. An important point to
note is that the expectation of a component of Pearson's
X^2 now becomes 1 + \nu/k, so is biased upward.
4.2. A solution

We need to improve the normality of our observations,
since upon this the success of G^2 and X^2 rests.
This is readily achieved by grouping the data and using
the group sums as the observations. Sums of independent
and identically distributed Poisson variables are
again Poisson, and sums of independent and identically
distributed negative binomial variables are again negative
binomial, so for either family summing leads to a
higher mean and so increased normality.
Grouping corresponds to testing the reduced model
against a model smaller than the usual maximal model;
instead of the individual observation, a mean of observations
is used as the Poisson or negative binomial
mean in the larger model. The more we group, the
smaller becomes the model against which we test the
fitted model. In Wood (2000), it is shown that G^2
demands less severe grouping than X^2. For this reason,
the grouped G^2 is recommended, since we are able to
compare the fitted model with a larger one using G^2
than we are when using X^2. A grouped Pearson's X^2
can, however, be used provided that it is modified for
negative binomial models. In the sequel, rules for producing
and testing a grouped G^2 are presented, together
with a numerical example, first for Poisson models and
then for negative binomial models.

Put tersely, the method proposed sacrifices the extent
of the comparison in order to improve the validity of
the test; we reduce the size of the task in order to be
able to accomplish something. In statistical language,
we sacrifice degrees of freedom in the test, but ensure
that the components are more closely \chi^2_1 distributed.
4.3. Grouping for Poisson models

Informally, here is how a valid goodness of fit test
can be carried out, for a Poisson model involving low
mean values. The data in Fig. 1, say, are grouped using
a table provided, by moving from right to left in the
scatterplot. For each group formed, the flow and accident
means must be calculated; then a grouped G^2 is
computed using a formula provided. This value can
then be tested against a critical \chi^2 value. The way in
which the grouping table is formed is now described.
Study of Fig. 5 shows that by the time \nu reaches 0.5
the G^2 component mean reaches 1, but \nu must be 1.6
before the variance reaches 2. Thus, the variance provides
the more demanding constraint. Grouping has
the effect of compressing the curves in Fig. 5
against the vertical axis, so grouped G^2 components
have acceptable values for lower Poisson means. Calculations
in Wood (2000) reveal that the value of the
variance-of-G^2 curve for group size r at Poisson mean
\nu/r is the value of the original ungrouped curve at \nu
itself. So for \nu = 1.6/r an r-grouped G^2 component will
have a variance of 2. Given that the variance curve in
Fig. 5 straddles the value of 2 around \nu = 1.6, it is
reasonable to use mid-points of the intervals between
these critical 1.6/r, r = 1, 2, ... values as boundaries for
group sizes. In this way, we set up Table 1, showing the
level of grouping needed as \nu grows smaller.
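The construction just described can be reproduced directly: each boundary of Table 1 is the midpoint of successive critical values 1.6/r. A sketch for the first five boundaries:

```python
def poisson_group_boundaries(rmax=5, c=1.6):
    """Interval boundaries of Table 1: midpoints of the critical values c/r and c/(r+1)."""
    return [round((c / r + c / (r + 1)) / 2, 2) for r in range(1, rmax + 1)]

# First five boundaries match Table 1: no grouping above 1.2, pairs down to 0.67, and so on
assert poisson_group_boundaries() == [1.2, 0.67, 0.47, 0.36, 0.29]
```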
Formally, here is how the test for a Poisson model is
carried out. We use Table 1 in the following way:
1. For each flow-accident pair (x, y) compute the fitted \nu and use Table 1 to determine an appropriate group size r.
2. Working from right to left in the regression plot (so from high to low \nu), group the raw data so that each observation is in a group of size at least as large as the associated r. This gives n groups, with the i-th group of size r_i having accident counts y_{i1}, ..., y_{i r_i} and associated flow counts x_{i1}, ..., x_{i r_i}.
Fig. 6. Mean and variance of a component of X^2, for low Poisson \nu values. The mean is always 1, but for low \nu values the variance rises well above 2.
Fig. 7. Mean and variance of a component of G^2/E(G^2), for low Poisson \nu values. The mean is always 1, but the variance, as in Fig. 6, again becomes large as \nu falls to 0.
3. Average the x and y values for each group, giving \bar{x}_i and \bar{y}_i, for i = 1, 2, ..., n.
4. Calculate the grouped G^2, which takes the form

G^2 = 2 \sum_{i=1}^{n} r_i \left( \bar{y}_i \log(\bar{y}_i/\bar{\nu}_i) - \bar{y}_i + \bar{\nu}_i \right),

where \bar{\nu}_i is the value fitted by the model at \bar{x}_i.
5. The grouped G^2 will now come from an approximate \chi^2 distribution, if the reduced, fitted model is similar in fit to the larger model, with degrees of freedom equal to n less the number of parameters in the fitted model. If the fitted model is poor, the grouped G^2 will run high.
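Step 4 above reduces to a few lines once the groups are formed. A sketch; the groups below are hypothetical triples (r_i, \bar{y}_i, \bar{\nu}_i), not the paper's data:

```python
import math

def grouped_g2(groups):
    """Grouped scaled deviance from step 4; groups is a list of (r_i, ybar_i, nubar_i)."""
    total = 0.0
    for r, ybar, nubar in groups:
        term = (ybar * math.log(ybar / nubar) if ybar > 0 else 0.0) - ybar + nubar
        total += 2.0 * r * term
    return total

# Hypothetical groups (r_i, ybar_i, nubar_i)
groups = [(1, 3.0, 2.5), (2, 1.5, 1.8), (5, 0.4, 0.6)]
g2 = grouped_g2(groups)
assert g2 > 0.0   # each component is non-negative, positive when ybar differs from nubar
```

The test then compares g2 with a chi-square critical value on (number of groups) minus (number of fitted parameters) degrees of freedom, as in step 5.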
To illustrate this method we apply it to the loss-of-control
data set. Using Table 1 for grouping, we find
n = 15 groups and that the grouped scaled deviance is

G^2 = 2 \sum_{i=1}^{15} r_i \left( \bar{y}_i \log(\bar{y}_i/\bar{\nu}_i) - \bar{y}_i + \bar{\nu}_i \right) = 26.84.

Since the probability that a \chi^2_{13} variable is greater than
26.84 is 0.0131, there is evidence that the model does not
fit well. The grouping process has exposed a weakness
in the fit; examination of the components of the
grouped G^2 reveals that five large ones have \bar{y}_i = 0. The
absence of accidents for flows between 11300 and
14800, e.g., does not support the validity of the model.
We now progress to the more general negative binomial situation.
4.4. Grouping for negative binomial models

As the negative binomial parameter k increases, the
distribution becomes Poisson, so that the Poisson results
just discussed are a limiting case of those in this
section. When the negative binomial parameters \nu and
k are known it can be shown (Appendix A), assuming
a common k for the maximal and the reduced models,
that the ungrouped scaled deviance is

G^2(k, \nu; n) = 2 \sum_{i=1}^{n} \left( k \log\frac{\nu + k}{y_i + k} + y_i \log\frac{y_i (\nu + k)}{(y_i + k) \nu} \right).
The interval for group size r in the Poisson case
relied on the critical value 1.6/r, now denoted c_\infty/r. In
a similar fashion, the r-grouping intervals for the negative
binomial case, with parameter k, are determined by
the value \nu = c_k for which the variance of a component
of the ungrouped G^2 is 2. For example, using
Fig. 8, where k = 1, we can check that c_1 is a little over
6; in fact it can be shown that c_1 = 6.5422. Fig. 9 shows
how c_k varies with k. As k grows larger the curve tends
to the limiting Poisson value of c_\infty = 1.6. Table 2 gives
a range of these critical c_k values numerically. For fixed
k and \nu, Var(G^2) is found by calculating

E(G^2) = \sum_{y=0}^{\infty} G^2(k, \nu; 1) f(y; k, \nu),

then
Fig. 8. Mean and variance of a component of G^2, for low negative binomial \nu values with k = 1. Deviations from target values are greater now than for the Poisson case, shown earlier in Fig. 5. Note the larger range for \nu, compared with Fig. 5, used to indicate that Var(G^2) is now slower to move to a value of 2.
Var(G^2) = \sum_{y=0}^{\infty} \left( G^2(k, \nu; 1) - E(G^2) \right)^2 f(y; k, \nu).

For fixed k, the critical mean value c_k, for which
Var(G^2) = 2, has been found by bracketing the value in
a \nu interval and using the method of bisection.
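The variance of an ungrouped G^2 component can be computed by summing these series directly, and the result checked against the reported critical value. A sketch; the truncation at 2000 terms is an assumption, and one could bracket and bisect over \nu exactly as the paper describes:

```python
import math

def nb_pmf(y, k, v):
    p = k / (v + k)
    return math.exp(math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
                    + k * math.log(p) + y * math.log(1 - p))

def nb_g2_component(y, k, v):
    """G^2(k, v; 1) = 2*(k*log((v+k)/(y+k)) + y*log(y*(v+k)/((y+k)*v)))."""
    term = k * math.log((v + k) / (y + k))
    if y > 0:
        term += y * math.log(y * (v + k) / ((y + k) * v))
    return 2.0 * term

def var_g2(k, v, ymax=2000):
    e = sum(nb_g2_component(y, k, v) * nb_pmf(y, k, v) for y in range(ymax))
    return sum((nb_g2_component(y, k, v) - e) ** 2 * nb_pmf(y, k, v) for y in range(ymax))

# The paper reports c_1 = 6.5422: the component variance there should be close to 2,
# and well below 2 for smaller means (Fig. 8)
assert var_g2(1.0, 2.0) < 2.0
assert abs(var_g2(1.0, 6.5422) - 2.0) < 0.1
```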
How do we use Table 2 to group negative binomial
data? It is shown in Wood (2000) that, given k, group
size r suffices for r\nu \ge c_{rk}, or \nu \ge c_{rk}/r. The argument is
similar to that used in the Poisson case. We use mid-points
of intervals determined by these values as extremes
for grouping regions. This leads to the rule: (i) no
grouping is needed for \nu in

\left( \frac{1}{2}\left( \frac{c_{2k}}{2} + c_k \right), \infty \right)

and (ii) group size r > 1 suffices for \nu in

\left( \frac{1}{2}\left( \frac{c_{(r+1)k}}{r+1} + \frac{c_{rk}}{r} \right), \frac{1}{2}\left( \frac{c_{rk}}{r} + \frac{c_{(r-1)k}}{r-1} \right) \right].
For each k, a grouping table analogous to Table 1
can be constructed, using the rule just presented. Interpolation
in Table 2 may be needed to give intermediate
critical c_k values. For convenience, Table 3 gives the
group sizes for k = 1.
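Applied with the Table 2 values, the rule above reproduces the interval boundaries of Table 3. A sketch for k = 1:

```python
# Critical values c_k taken from Table 2, for k = 1, ..., 6
c = {1: 6.5422, 2: 3.14, 3: 2.50, 4: 2.23, 5: 2.09, 6: 2.00}

def nb_boundary(r, k=1):
    """Boundary between group sizes r and r+1: midpoint of c_{(r+1)k}/(r+1) and c_{rk}/r."""
    return (c[(r + 1) * k] / (r + 1) + c[r * k] / r) / 2.0

# Reproduces the first interval boundaries of Table 3 (k = 1):
# no grouping above 4.06, pairs down to 1.20, and so on
assert [round(nb_boundary(r), 2) for r in range(1, 6)] == [4.06, 1.2, 0.7, 0.49, 0.38]
```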
The goodness of fit testing routine is now detailed;
we assume that the model has been fitted.
1. For each flow-accident pair (x, y) compute the fitted \nu and associate the pair with a group size r, using the rule above.
2. Working from right to left in the regression plot, group the raw data so that each flow-accident pair is in a group of size at least as large as the associated r. As in the Poisson case, this gives n groups, with the i-th group of size r_i having accident counts y_{i1}, ..., y_{i r_i} and associated flow counts x_{i1}, ..., x_{i r_i}.
3. Average the x and y values for each group, giving \bar{x}_i and \bar{y}_i, for i = 1, 2, ..., n.
4. Calculate the grouped G^2, now using

G^2 = 2 \sum_{i=1}^{n} r_i \left( \hat{k} \log\frac{\bar{\nu}_i + \hat{k}}{\bar{y}_i + \hat{k}} + \bar{y}_i \log\frac{\bar{y}_i (\bar{\nu}_i + \hat{k})}{(\bar{y}_i + \hat{k}) \bar{\nu}_i} \right),

where \bar{\nu}_i is the value fitted by the model at \bar{x}_i.
5. Test G^2 against a \chi^2 reference distribution, with degrees of freedom equal to n minus the number of parameters in the fitted model.

Table 1
For Poisson \nu in the interval shown, a group size of r will provide a satisfactory \chi^2_1 approximation (in its first two moments) for the grouped G^2 component

\nu interval       Group size, r
(1.2, \infty)      1
(0.67, 1.2]        2
(0.47, 0.67]       3
(0.36, 0.47]       4
(0.29, 0.36]       5
(0.21, 0.29]       6-7
(0.15, 0.21]       8-10
(0.10, 0.15]       11-15
(0.08, 0.10]       16-20
(0.05, 0.08]       21-30
(0.04, 0.05]       31-40
(0.03, 0.04]       41-50
(0.016, 0.03]      51-100

Fig. 9. For a given negative binomial parameter k, the graph shows the critical negative binomial mean c_k for which the variance of a component of the ungrouped G^2 reaches 2.
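Step 4 for the negative binomial case can be sketched as follows; the groups are hypothetical triples (r_i, \bar{y}_i, \bar{\nu}_i), and \hat{k} = 0.8 echoes the rear-end fit:

```python
import math

def grouped_nb_g2(groups, k_hat):
    """Grouped negative binomial deviance (step 4); groups is a list of (r_i, ybar_i, nubar_i)."""
    total = 0.0
    for r, y, v in groups:
        term = k_hat * math.log((v + k_hat) / (y + k_hat))
        if y > 0:
            term += y * math.log(y * (v + k_hat) / ((y + k_hat) * v))
        total += 2.0 * r * term
    return total

# Hypothetical groups, not the paper's data
groups = [(1, 6.0, 4.5), (2, 2.0, 2.4), (7, 1.2, 0.3)]
g2 = grouped_nb_g2(groups, 0.8)
assert g2 > 0.0   # components vanish only where ybar equals nubar
```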
We return to the rear-end injury accident context of
Section 2 to illustrate the method. Here \hat{k} = 0.8, so
grouping intervals must be constructed as described in
(i) and (ii) above, using c_k values found by iteration or
interpolation in Table 2. Iteration was used here; e.g.,
no grouping is needed for \nu > 6.19 and pairing is
needed for \nu in the interval (1.40, 6.19]. We find the
(x, y) data placed into n = 35 groups and that the
grouped G^2 is 57.04. Since the probability that a \chi^2_{33}
variable exceeds 57.04 is 0.0058, we conclude that the
model is a poor fit. Scrutiny of the G^2 components
reveals that one component, with r_i = 7, \bar{x}_i = 5223, \bar{y}_i = 1.14
and \bar{\nu}_i = 0.28, makes a contribution of 10.4 to G^2.
There is greater variation in this region than is explained
by this model. It is convincing to study the
scatterplot in Fig. 3 at this point and to note this high
variation.
Two reminders are in order at this point. First, what
we are doing in these analyses is comparing the fit of
the very restricted two-parameter model with the
largest model for which G^2 represents a valid comparison.
We are not able to test against the maximal model.
As long as the departure of the new large model from
the true maximal model is slight, however, we are
comparing the fitted model with near-perfection, not
perfection itself. (A plot of the smoothed \bar{y}_i values
against the \bar{x}_i values, overlaid on Fig. 1, e.g., would
reveal visually the difference between the full maximal
model and the smoothed maximal model.)
Table 2
Critical mean values c_k, for which Var(G^2(k, c_k; 1)) = 2

k      c_k
0.5    326
0.6    47.0
0.7    33.4
0.8    10.5
0.9    7.81
1      6.54
1.2    5.09
1.4    4.27
1.6    3.75
1.8    3.40
2      3.14
2.5    2.73
3      2.50
3.5    2.34
4      2.23
5      2.09
6      2.00
7      1.93
8      1.89
9      1.85
10     1.83
15     1.75
20     1.71
30     1.67
50     1.64
Table 3
A group size of r will provide a satisfactory \chi^2_1 approximation for the grouped G^2 component, for negative binomial mean \nu in the specified interval. Here k = 1.

\nu interval       Group size, r
(4.06, \infty)     1
(1.20, 4.06]       2
(0.70, 1.20]       3
(0.49, 0.70]       4
(0.38, 0.49]       5
(0.26, 0.38]       6-7
(0.17, 0.26]       8-10
(0.11, 0.17]       11-15
(0.08, 0.11]       16-20
(0.05, 0.08]       21-30
(0.04, 0.05]       31-40
(0.03, 0.04]       41-50
(0.016, 0.03]      51-100
Appendix A. Deriving the error distributions

A.1. Poisson model

Here only variation within an arm is modelled, so the
distribution of the number of accidents Y is

f_Y(y) = e^{-\nu} \nu^y / y!   for y = 0, 1, ...
A.2. Negative binomial model

For a given flow x, we assume that the true mean
accident rate is \nu. Intersection arms with this flow are
assumed to vary in their safety M according to a gamma
distribution with parameters k and a, as in Hauer et al.
(1988). Thus, M has density function

f_M(m) = \frac{m^{k-1} e^{-m/a}}{a^k \Gamma(k)}   for m \ge 0,

centred on \nu, so ka (the gamma mean) equals \nu, or
a = \nu/k. For a given arm, with safety m, it is assumed
that the number of accidents Y occurs according to a
Poisson distribution, with probability mass function

f_{Y|m}(y) = e^{-m} m^y / y!   for y = 0, 1, ...
Combining these two sources of variation (m arising
via the gamma distribution and then y arising according
to a Poisson distribution with parameter m) gives that
Y will have a negative binomial probability mass function,
since

f_Y(y) = \int_0^{\infty} \frac{e^{-m} m^y}{y!} \cdot \frac{m^{k-1} e^{-m/a}}{a^k \Gamma(k)} \, dm,

which can be manipulated to

f_Y(y) = \binom{y+k-1}{k-1} \left(\frac{1}{1+a}\right)^k \left(\frac{a}{1+a}\right)^y = \binom{y+k-1}{k-1} \left(\frac{k}{\nu+k}\right)^k \left(\frac{\nu}{\nu+k}\right)^y

as a = \nu/k (Section 2). With p = k/(\nu+k) (Section 2) it
can be shown that E(Y) = \nu = k(1-p)/p and Var(Y) =
k(1-p)/p^2 (Johnson et al., 1992, p. 207), whence

Var(Y) = \nu/p = \nu(\nu+k)/k = \nu + \nu^2/k.
A.3. Fitting the model

Fitting is described for the negative binomial model.
The considerable simplification possible for the Poisson
model is detailed at the end. Given data in the form (x_i, y_i)
for i = 1, ..., n, the aim is to fit the model described in
Section 2 by estimating \beta_0', \beta_1 and k. We do this in five
stages. When there is more than one prediction variable
the routine is readily extended.
Second, an old but important cautionary remark
about goodness of fit testing: given enough degrees of
freedom, even a reasonable model will appear as a poor
fit. Given the relatively low degrees of freedom (13 and
33) used in the examples here, it appears that we have not
reached that point.

In summary, if the reduced model does not fit well, then
the method proposed provides the required evidence and
points out where the poor fit occurs; if there is no
evidence of poor fit, then the method gives us valid
grounds for increased confidence in the model.
5. Summary

What has been achieved in this paper?
1. A self-contained description of the commonly used generalised linear accident model has been presented, together with a first-principles explanation of the iterative fitting process.
2. A resolution of the low mean value problem in the testing of goodness of fit for such models has been offered. At the computational level, this involves a practical grouping technique. At the theoretical level, it involves recognition that as the mean moves to 0, the largest model against which the fitted model can be validly tested necessarily grows smaller.
Acknowledgements
Dr Shane Turner is thanked for his considerable
encouragement and for providing the loss-of-control
data set. Two thorough referees are thanked for many
helpful comments, which led to improvements in the
paper. This research was partly funded, and fully stimulated, by Transfund New Zealand.
(1) Estimation of starting values for \beta_0', \beta_1 and k.

(a) Estimate initial values b_0'^{(0)} and b_1^{(0)} for the parameters
\beta_0' and \beta_1. Use of linear regression on a plot of the
log-transformed local mean accident count (sufficiently
grouped to avoid zeros) against the log-transformed
flows (so the log transform of points linked by dashed
lines in Figs. 1 and 3) provides a secure means for
estimating starting values (since if \nu = \beta_0 x^{\beta_1} then
\log\nu = \log\beta_0 + \beta_1 \log x). Experience with accident
modelling has shown that b_0'^{(0)} = -10 and b_1^{(0)} = 1
generally suffice. Set b^{(0)} = (b_0'^{(0)}, b_1^{(0)})^T, where T transposes
the vector.
(b) Estimate an initial value k^{(0)} for k. Parameter k
can be initialised by using the points in the plots of
local variance s^2 against local mean \bar{y} (as in Figs. 2 and
4). Rearranging Var(Y) = \nu + \nu^2/k gives k = \nu^2/(Var(Y) - \nu), so

k^{(0)} = \bar{y}^2 / (s^2 - \bar{y})

is a sensible choice. For example, in Fig. 4 the variance
of accident numbers for flows in the 2000-2999 range is
0.143, while the mean for accident numbers in this
range is 0.125. This yields a satisfactory estimate for k
of 0.87.
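The moment-based starting value is a one-line calculation, and can be checked against the paper's own example:

```python
def k_start(ybar, s2):
    """Moment start for k: rearranging Var(Y) = v + v^2/k gives k = v^2/(s^2 - v)."""
    return ybar ** 2 / (s2 - ybar)

# The paper's example: local variance 0.143 and local mean 0.125 in one flow band
assert round(k_start(0.125, 0.143), 2) == 0.87
```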
(2) Take a suitable grid of values around k^{(0)}, compromising
between fineness of the grid and number of
grid points.
(3) For each k in the grid, and using b_0'^{(0)} and b_1^{(0)} each
time, fit the model in the following steps:

(a) Set up the design matrix

X = ( 1  \log x_1
      ...
      1  \log x_n ).

(For a model involving two flows there will be a third
column of logged flows, and so on.)
(b) Iterate

b^{(m+1)} = (X^T W^{(m)} X)^{-1} X^T W^{(m)} z^{(m)},   (1)

until the parameters are unchanging to the required
number of decimal places. Here, W^{(m)} is a diagonal
n \times n weights matrix with w_{ii}^{(m)} = k \nu_i^{(m)} / (k + \nu_i^{(m)}), where
\nu_i^{(m)} = e^{\eta_i^{(m)}} and \eta^{(m)} = X b^{(m)}. Also z^{(m)} is a column
vector of length n with z_i^{(m)} = \eta_i^{(m)} + (y_i - \nu_i^{(m)}) / \nu_i^{(m)}.
This iteration can be readily carried out in Matlab,
S-Plus or Minitab, for example.
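Iteration (1) can also be sketched in a few lines of plain Python for the Poisson case, where the weights simplify to w_ii = \nu_i (see the end of this appendix). The 2x2 normal equations are solved directly rather than with a matrix library, and the data below are synthetic, constructed to lie exactly on a power-law curve:

```python
import math

def irls_poisson(xs, ys, iters=100):
    """Iteratively reweighted least squares for log nu = b0' + b1*log x,
    with Poisson weights w_ii = nu_i, solving Eq. (1) as a 2x2 system."""
    logx = [math.log(x) for x in xs]
    b0p, b1 = math.log(sum(ys) / len(ys)), 0.0   # safe constant-mean start
    for _ in range(iters):
        eta = [b0p + b1 * lx for lx in logx]
        nu = [math.exp(e) for e in eta]
        z = [e + (y - m) / m for e, y, m in zip(eta, ys, nu)]
        # Accumulate X'WX (symmetric 2x2) and X'Wz
        s00 = sum(nu)
        s01 = sum(m * lx for m, lx in zip(nu, logx))
        s11 = sum(m * lx * lx for m, lx in zip(nu, logx))
        t0 = sum(m * zi for m, zi in zip(nu, z))
        t1 = sum(m * lx * zi for m, lx, zi in zip(nu, logx, z))
        det = s00 * s11 - s01 * s01
        b0p, b1 = (s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det
    return b0p, b1

# Synthetic data lying exactly on nu = 0.05 * x**0.7 should be recovered exactly
xs = [100.0, 200.0, 400.0, 800.0]
ys = [0.05 * x ** 0.7 for x in xs]
b0p, b1 = irls_poisson(xs, ys)
assert abs(b1 - 0.7) < 1e-6 and abs(math.exp(b0p) - 0.05) < 1e-6
```

For the negative binomial model the same loop applies with weights k \nu_i / (k + \nu_i), nested inside the grid search over k described in stages (2)-(5).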
(4) Evaluate the log-likelihood of the data for each k
value, namely

\sum_{i=1}^{n} \left( \log\binom{y_i + k - 1}{k - 1} + k \log\frac{k}{\nu_i + k} + y_i \log\frac{\nu_i}{\nu_i + k} \right),

where \nu_i = b_0 x_i^{b_1} with b_0 = e^{b_0'}.
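Stage (4), together with the grid search of stage (5), can be sketched as follows; the counts, fitted means and grid are hypothetical, and lgamma replaces the binomial coefficient so non-integer k is handled:

```python
import math

def nb_loglik(ys, nus, k):
    """Stage (4) log-likelihood; log C(y+k-1, k-1) = lgamma(y+k) - lgamma(k) - lgamma(y+1)."""
    total = 0.0
    for y, v in zip(ys, nus):
        total += (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
                  + k * math.log(k / (v + k)) + y * math.log(v / (v + k)))
    return total

ys = [0, 2, 1, 5, 0, 3]                 # hypothetical counts
nus = [0.5, 1.8, 1.2, 4.0, 0.9, 2.5]    # hypothetical fitted means
grid = [0.2 * j for j in range(1, 51)]  # grid of candidate k values
k_hat = max(grid, key=lambda k: nb_loglik(ys, nus, k))
assert 0.2 <= k_hat <= 10.0
```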
(5) Select the k, calling it \hat{k} (and the associated b_0' and
b_1), for which this log-likelihood is largest.

This provides a method for finding the maximum
likelihood estimates for k, \beta_0' and \beta_1.
If we are true to the initial model formulation, a
different value of k would be used for each flow x_i, since
this parameter of the underlying gamma distribution is
permitted to vary with the vehicle flow. In practice, a
single k value has been used by authors, so implicitly
assuming that the shape parameter of the underlying
gamma distribution is the same across flows. In Maher
and Summersgill (1996, p. 289), this was allowed to
vary as k(\nu) = h\nu^N. These authors found empirically
that where it was possible to decide on a value for N (in
the interval from 0 to 1), then it was always the case
that the negative binomial model (N = 0) was the best
choice. This is precisely the single k value case described
above.
For the Poisson model, we need only estimate \beta_0' and
\beta_1. These are found by initialising, then iterating (1)
using w_{ii}^{(m)} = \nu_i^{(m)}, with \nu_i^{(m)} and z_i^{(m)} as already defined.
A.4. Calculating the scaled deviance

For the Poisson model, the scaled deviance is

G^2 = 2 \left( l(\nu_{max}, y) - l(\nu_{red}, y) \right),

twice the difference between the log-likelihood under
the maximal model and the log-likelihood under the
reduced model. Here, y = (y_1, ..., y_n) is the data vector,
\nu_{max} = (\nu_1, ..., \nu_n) is the n-tuple of means postulated
under the maximal model and \nu_{red} is that under the
reduced model. Since the probability mass function for
a Poisson variable with mean \nu is f(y; \nu) = e^{-\nu} \nu^y / y!,
we have

l(\nu, y) = \log \prod_{i=1}^{n} f(y_i; \nu_i) = \sum_{i=1}^{n} \log f(y_i; \nu_i) = \sum_{i=1}^{n} \left( y_i \log\nu_i - \nu_i - \log(y_i!) \right).
For the maximal model, \nu_i = y_i, so

l(\nu_{max}, y) = \sum_{i=1}^{n} \left( y_i \log y_i - y_i - \log(y_i!) \right).

On the other hand, for the reduced model, \nu_i = \hat{\nu}_i
(e.g., \hat{\nu}_i = b_0 x_i^{b_1} for a single flow model), so

l(\nu_{red}, y) = \sum_{i=1}^{n} \left( y_i \log\hat{\nu}_i - \hat{\nu}_i - \log(y_i!) \right).
Thus, the scaled deviance is

G^2 = 2 \sum_{i=1}^{n} \left( y_i \log(y_i/\hat{\nu}_i) - y_i + \hat{\nu}_i \right).
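The identity G^2 = 2(l(\nu_max, y) - l(\nu_red, y)) can be verified numerically for any data set; the counts and reduced-model means below are hypothetical:

```python
import math

def poisson_loglik(ys, nus):
    """Sum of y*log(nu) - nu - log(y!); the y = 0, nu = 0 term of the maximal model is 0."""
    return sum((y * math.log(m) if y > 0 else 0.0) - m - math.lgamma(y + 1)
               for y, m in zip(ys, nus))

def poisson_g2(ys, nus):
    return 2.0 * sum((y * math.log(y / m) if y > 0 else 0.0) - y + m
                     for y, m in zip(ys, nus))

ys = [0, 1, 3, 2]
nus = [0.4, 1.5, 2.0, 2.5]                           # hypothetical reduced-model means
l_max = poisson_loglik(ys, [float(y) for y in ys])   # maximal model sets nu_i = y_i
l_red = poisson_loglik(ys, nus)
assert abs(poisson_g2(ys, nus) - 2.0 * (l_max - l_red)) < 1e-12
```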
For the negative binomial model, the corresponding
expression for G^2 is calculated using the same routine, as

G^2 = 2 \sum_{i=1}^{n} \left( \log\left[ \binom{y_i + \hat{k} - 1}{\hat{k} - 1} \middle/ \binom{y_i + k' - 1}{k' - 1} \right] + \log\frac{(\hat{k}/(y_i + \hat{k}))^{\hat{k}}}{(k'/(\nu_i + k'))^{k'}} + y_i \log\frac{y_i (\nu_i + k')}{(y_i + \hat{k}) \nu_i} \right),

where k' is the negative binomial parameter fitted
using the reduced model (using the five-step routine
described earlier) and \hat{k} is the negative binomial
parameter fitted using the maximal model (again following
the routine earlier, eliminating Steps 1(a) and 3,
and noting that \nu_i = y_i now for all i). If we assume that
\hat{k} = k' and \nu_i = \nu for all i, then G^2 simplifies to the
expression given for G^2(k, \nu; n) in Section 4 of the
paper.
References

Dobson, A.J., 1990. An Introduction to Generalized Linear Models. Chapman and Hall, London.
Fridstrøm, L., Ifver, J., Ingebrigtsen, S., Kulmala, R., Thomsen, L.K., 1995. Measuring the contribution of randomness, exposure, weather, and daylight to the variation in road accident counts. Accident Analysis and Prevention 27, 1-20.
Hauer, E., Ng, J.C.N., Lovell, J., 1988. Estimation of safety at signalized intersections. Transportation Research Record 1185, 48-61.
Johnson, N.L., Kotz, S., Kemp, A.W., 1992. Univariate Discrete Distributions, second ed. Wiley, New York.
Maher, M.J., Summersgill, I., 1996. A comprehensive methodology for the fitting of predictive accident models. Accident Analysis and Prevention 28, 281-296.
Maycock, G., Hall, R.D., 1984. Accidents at 4-arm roundabouts. TRRL Laboratory Report 1120, UK Transport and Road Research Laboratory, Crowthorne, Berkshire, England.
Wood, G.R., 2000. Assessing goodness of fit for Poisson and negative binomial models with low mean. Massey University Technical Report, Institute of Information Sciences and Technology, Massey University, Palmerston North, New Zealand.