discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/24137076
Joseph Hilbe
Arizona State U and U of Hawaii
All content following this page was uploaded by Joseph Hilbe on 23 December 2014.
Joseph M. Hilbe
Arizona State University
Hilbe@asu.edu; jhilbe@aol.com
The book was first released for sale on July 29, 2007 at the Joint Statistical Meetings
(JSM) of the American Statistical Association, held at Salt Lake City, UT.
I have read through the text and identified errors that were overlooked during the editing
process. I apologize for these oversights. If readers find other errors, please contact me
and I shall post them to this site. I hope that these errors can be corrected in the second
printing of the text.
Below the list of Errata, I provide a few thoughts regarding the discussion found in the
text, perhaps giving you a better insight into the reason I wrote it, to whom it is directed,
and thoughts about the statistics involved. I finished writing the main part of the text in
2006; the book was completed in early 2007. The subject has advanced during the
interval, and I wish to provide a brief update.
I have rewritten large parts of Chapter 10 in light of recent advances in the area, in
particular for GEE models. You can download the revised Chapter 10 at:
http://www.statistics.com/other/hilbe/index.php
Both this Errata page and the various data sets and user-authored statistical commands for
examples used in the text are posted to the web site for the book at:
http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521857727
Find the place on the lower left side of the web site to access the 33 data sets identified in
Appendix E as well as Stata commands used to create examples in the text. Data files are
available in the following formats: Stata, SAS, SPSS, Excel, R, and Limdep. I am posting
Stata files in both version 10 format as well as in the older version 8-9 format. Users of
Stata 10 can read files saved in older versions, but users of versions under 10 cannot read
files saved using version 10. Details of these files, as well as of the other formats, are
explained on the web site.
Several of the examples used in the text were modeled using Stata commands not found
in commercial Stata or on the text's web site. These commands should mostly be found
at the following web site:
http://ideas.repec.org/s/boc/bocode.html
The majority of the relevant commands are from 2005, associated with my name. The site
allows for easy searching. Others not on the site are posted to this book's web site.
I should give a quick note as to my intended audience. The book is not directed to
professional statisticians, although many will likely find new and hopefully interesting
information in the text. Rather, the book was primarily written for those researchers who
have little background in count response models, but who find that they have a need to
learn about them for an upcoming project or study. I have written the text in as clear a
manner as possible, many times re-emphasizing important items that need to be
remembered in the modeling process. My tone is more like a classroom presentation
than a formal text. I have attempted to speak directly to the reader, giving advice as
to the comparative modeling process --- outlining the algorithmic basis of the respective
count models, detailing different methods of estimation, selecting the appropriate model,
assessing fit, interpreting parameter estimates and ancillary parameters, and so forth.
Examples are given for each model discussed.
The end result is a book that can prove useful to researchers, to graduate students who
need to have a workable understanding of count models, as well as anyone else who is
simply interested in this area of statistical modeling. The focus of discussion is negative
binomial regression, which the reader will find designates a broad range of models.
I have primarily used Stata throughout the text for examples, with Limdep being used for
examples where no Stata command yet exists. Stata is one of the most popular statistical
applications worldwide, and has commands that accommodate nearly every negative
binomial model discussed in this text. I am including user-authored commands that are
easily attainable as well. Stata is second only to Limdep in the range of count response
model offerings. SAS, SPSS, S-Plus, Statistica, Genstat, and other popular commercial
packages have only minimal count model capabilities. R has many user-written count models, but
not nearly as many as found in Limdep or Stata. Moreover, I have previously authored a
number of published statistical procedures using the Stata language, the majority of
which relate to count response modeling. Stata was therefore the reasonable choice for
displaying example output.
ERRATA
PREFACE, Page x: 2nd line of 3rd complete paragraph.
Web address:
www.cambridge.org/XXXXX should read www.cambridge.org/9780521857727.
This latter correct address is found in Appendix E (p. 240).
CH 3, Page 40, 1st line under equation 2.53: "caste" should be spelled "cast".
CH 4, Page 52, second to last line of program code on page. The line should read:
. gen xb = 1 + .5*x1 - .75*x2 + .25*x3
CH 5, Page 78. Table 5.1: The last two items should be numbered 24 and 25, and not 22
and 22.
CH 5, Page 82: Figures 5.19 and 5.20: Choose functions should not have a division sign
between top and bottom terms.
CH 5, Page 96: Section 5.5, Paragraph 3, line 2. Delete the comma between "version" and
"of".
CH 6, Page 112: The formula for the negative binomial variance is mistaken.
Should read: μ + αμ²
The formula as it currently reads is missing a symbol in the 2nd term.
CH 7, Page 139, line 3 under table of parameter estimates, 1st word of line:
The word "restricted" should be changed to "expected". The line should read, in part:
"expected for a geometric model. Recall that unless"
CH 7, Page 155, 7th-6th line from bottom: Yang, Hardin, and Addy (2006), not 2007.
CH 8, Page 171, bottom-page model output. Negative binomial age3 coefficient should
read: .023721. The decimal point was inadvertently dropped.
CH 9, Page 181, Header for bottom table: Should read: POISSON: DROPPED
VALUES 0-3
In fact, for clarification purposes, I would rather add an additional sentence after the first.
The first three sentences of the paragraph should then read:
Multilevel models are sometimes called hierarchical models, particularly in educational
and social science research. However, the majority of statisticians now tend to draw a
distinction between multilevel and hierarchical models, primarily because of the manner
in which the methods define order of levels, or nesting. Regardless, the idea behind
multilevel modeling is to model
COMMENTS ON DISCUSSION
Page 27: AIC and BIC statistics.
First note that text between /* and */ is a comment, and is not processed by the algorithm.
The problem here is the last term, η. It should read n, not η. n represents the number of
observations in the model; η is the link function, which is also the linear predictor.
The book also implies in various places that the BIC statistic can only be defined using
the deviance function, not the log-likelihood. I don't believe that I actually stated this, but
I think it is implied. Of course, this is not the case; the BIC can be defined in terms of the
log-likelihood: BIC = -2(LL - k*ln(k))/n, where k is the number of model predictors
and n the number of model observations. LL is the log-likelihood function.
The model having the lowest value for its BIC statistic is the preferred model, fitting
better than the others. The degree of model preference is based on the absolute difference
between the BIC statistics of two models. A table of preferences can be shown as:
|difference|    Degree of preference
-----------------------------------------------
0 - 2           Weak
2 - 6           Positive
6 - 10          Strong
> 10            Very Strong
Models A and B:
If BIC_A - BIC_B < 0, then A preferred
If BIC_A - BIC_B > 0, then B preferred
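The sign rule and the table of preferences can be combined into one small helper. The following is a minimal Python sketch, not code from the book (whose examples use Stata). The function name `bic_preference` is hypothetical, the cut points are assumed to be 2, 6, and 10, and assigning a boundary value to the weaker category is an assumption made only for illustration.

```python
def bic_preference(bic_a, bic_b):
    """Return (preferred_model, degree_of_preference) for two BIC values.

    Assumed cut points on |BIC_A - BIC_B|: 2, 6, and 10.
    A value exactly on a cut point is placed in the weaker category
    (an illustrative assumption, not a rule taken from the text).
    """
    diff = bic_a - bic_b
    if diff < 0:
        preferred = "A"      # lower BIC belongs to model A
    elif diff > 0:
        preferred = "B"      # lower BIC belongs to model B
    else:
        preferred = None     # identical BIC values: no preference

    size = abs(diff)
    if size <= 2:
        degree = "Weak"
    elif size <= 6:
        degree = "Positive"
    elif size <= 10:
        degree = "Strong"
    else:
        degree = "Very Strong"
    return preferred, degree


# Hypothetical BIC values, used only to show the call.
result = bic_preference(100.0, 107.5)
```

The helper only compares two numbers; both BIC values must have been computed with the same definition for the comparison to be meaningful.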
It is important to recognize that the AIC statistic appears with and without the
denominator, n. Both forms are common, so care must be taken when comparing models
to make certain that the same definition has been used in all cases you are comparing. As
with the BIC, the model having the lowest AIC is preferred over others.
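To make the two forms of the AIC concrete, here is a minimal Python sketch. It assumes the common definition AIC = -2*LL + 2*k, with k counted as the number of estimated parameters, and treats the per-observation form as that value divided by n. The function name `aic` and the numbers in the usage lines are hypothetical.

```python
def aic(log_likelihood, k, n=None):
    """AIC = -2*LL + 2*k, divided by n when n is supplied.

    log_likelihood: model log-likelihood (LL)
    k: number of estimated parameters (assumed counting convention)
    n: number of observations; pass it only for the per-observation form
    """
    value = -2.0 * log_likelihood + 2.0 * k
    return value if n is None else value / n


# Hypothetical model: LL = -1500, 4 parameters, 1000 observations.
aic_total = aic(-1500.0, 4)            # form without the denominator
aic_per_obs = aic(-1500.0, 4, 1000)    # form with the denominator n
```

The two results differ by a factor of n, so a value from one form should never be compared with a value from the other.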
Page 28: Section 2.2, opening paragraph. This paragraph may seem confusing, and I
offer a re-write below this paragraph. The paragraph as it exists in the book implies that
we have not yet addressed Fisher scoring, or the IRLS method of estimation, and that it will
follow the forthcoming discussion of Newton-Raphson type methods. However, the
reader will know that we just finished talking about IRLS methods, detailing the
theoretical justification as well as the algorithm. I wrote this section a year and a half
ago, and do not recall the exact rationale of the wording. However, I believe that I
intended to first discuss N-R methods, then IRLS. I later changed it for pedagogical
purposes, but failed to make the change to this paragraph. On the other hand, we will
discuss GLMs employing the observed information matrix in section 2.2.2, directly
following the derivation of the generalized Newton-Raphson type algorithm. Therefore,
there is some (only some) truth to what is implied in this paragraph with respect to order
of discussion, but it does need a revision as expressed below. My apologies to the reader.
NEW PARAGRAPH
In this section we discuss the derivation of the Newton-Raphson type algorithm. Until
recently, the only method used to estimate the standard negative binomial model was by
maximum likelihood estimation using a Newton-Raphson based algorithm. All other
varieties of negative binomial are still estimated using a Newton-Raphson based routine.
We shall observe in this section, though, that the Iteratively Re-weighted Least Squares
method we discussed in the last section, known as Fisher Scoring, is a subset of the
Newton-Raphson method. We conclude by showing how the parameterization of the
GLM mean, μ, can be converted to Xβ.
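The relationship between the two methods is easiest to see in the Poisson model with the canonical log link, where the observed and expected information matrices are identical, so a Newton-Raphson step and a Fisher scoring (IRLS) step are the same update. The following is a minimal Python sketch of that update for an intercept and a single predictor. It is not the book's algorithm. The function name, starting values, tolerance, and data are hypothetical.

```python
import math


def poisson_newton_raphson(x, y, tol=1e-10, max_iter=50):
    """Fit log(mu) = b0 + b1*x to counts y by Newton-Raphson.

    With the canonical log link the Hessian is -X'WX, W = diag(mu),
    which equals minus the expected information, so each step is
    also a Fisher scoring step. Assumes x is not constant.
    """
    b0 = math.log(sum(y) / len(y))   # start at the intercept-only fit
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood: X'(y - mu)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix X'WX (2 x 2, symmetric)
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Step = inverse(information) * gradient, solved explicitly
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical data, used only to show the call.
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1, 2, 4, 7, 12]
b0_hat, b1_hat = poisson_newton_raphson(x_demo, y_demo)
```

For non-canonical links the observed and expected information differ, and the two methods no longer take identical steps.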
Page 38, Question 7: The AIC and BIC statistics are defined on page 27, as part of Table
2.1. There are some types of models that do not produce a log-likelihood function, and
therefore not a deviance function. Quasi-likelihood models do not produce a viable
log-likelihood function that can be used in an AIC statistic. Some software uses a deviance
statistic as the basis of IRLS modeling, but does not use or define the log-likelihood. In
these cases an AIC statistic is not produced. The same holds for software that uses a
log-likelihood function but has no defined deviance: if it employs a BIC statistic requiring
a deviance, then it does not display a BIC statistic. The user can usually calculate it for
themselves.
For Figures 5.7 - 5.11, the values of alpha for a specific mean, from top to bottom at
count 0, are [3, 1.5, 1, .67, .33, 0]. The BOOK COVER is the same as Figure 5.10.
Page 126: line 3 of text (under display of table). The sentence beginning with "Age" is
missing the word "the". It should read, "Age and education are not contributory to the
model." This is not really an error, just better grammar.
Page 174: Bottom of page: Pursuant to the amendment I made to the text, as shown above
under Errata, the following addition to the correction could be made for clarification:
(2) 1 - [1/(1+exp(-xβ))], the prediction that y==0, and . Also recall that for the final
formula of the paragraph, exp(xβ) can be substituted for μ.
I suggest that the reader ignore the inherent unreliability of these specific model results,
reading the text and its interpretation as if appropriate convergence was achieved as it
appears in the output. The pedagogical value of the discussion is nevertheless valid. The
correlation values produced, together with parameter estimates and standard errors,
appear to be reasonable, and can be used with value in demonstrating how the models are
to be estimated and evaluated.
The stationary 4 correlation structure is feasible for this data, unlike other stationary
correlation structures. It can be developed using the command:
. xtgee seizures time timeXprog, fam(poi) i(id) t(t) corr(stat 4) force
It is important to use the force option when time intervals are not all equal. If in fact the
time intervals are equal, the force option will have no effect.
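A stationary correlation structure of order m assigns one common correlation to every pair of observations separated by the same lag, up to lag m, and sets the correlation to zero for longer lags. The following minimal Python sketch builds such a working correlation matrix for order 4. The function name `stationary_corr`, the number of time points, and the lag correlations are hypothetical illustration values, not estimates from the seizures data.

```python
def stationary_corr(n_times, rhos):
    """Build an n_times x n_times stationary working correlation matrix.

    rhos[k - 1] is the correlation at lag k; the order of the structure
    is len(rhos). Lags beyond that order get correlation 0.
    """
    order = len(rhos)
    matrix = []
    for i in range(n_times):
        row = []
        for j in range(n_times):
            lag = abs(i - j)
            if lag == 0:
                row.append(1.0)             # diagonal
            elif lag <= order:
                row.append(rhos[lag - 1])   # shared value within the band
            else:
                row.append(0.0)             # outside the band
        matrix.append(row)
    return matrix


# Hypothetical lag correlations for a stationary 4 structure, 6 time points.
R = stationary_corr(6, [0.5, 0.4, 0.3, 0.2])
```

With 6 time points only the lag-5 entries fall outside the band, which is why the order of a stationary structure must be smaller than the number of time points in a panel.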
Note on new Chapter 10, GEE models: Justine Shults at the University of Pennsylvania
School of Medicine (Biostatistics) and her colleagues have built on previous work done
by N.R. Chaganty to construct an iterative adjustment to the underlying GEE algorithm
which guarantees, for selected correlation structures, a consistent estimate of the
correlation parameter and a positive definite estimated correlation matrix. The method is
called Quasi-Least Squares (QLS) and has particular use when the corresponding GEE
model is misspecified. In such cases the model typically fails to converge. Currently the
only correlation structures developed for QLS are the exchangeable, first order
autoregressive (AR 1), first order stationary (which Shults calls tri-diagonal), and
Markov, which is not in any current commercial GEE application. I recommend reading
J. Shults, S.J. Ratcliffe, and M. Leonard (2007), "Improved generalized estimating
equation analysis via xtqls for quasi-least squares in Stata," The Stata Journal, Vol. 7(2),
pp. 147-166. In the same issue, James Cui has an article titled "QIC program and model
selection in GEE analysis," pp. 209-220. Both should be of interest to those
modeling longitudinal and otherwise correlated data using GEE methodology.