Panel Count Models in Stata

Recent Developments in Panel Models for Count Data
Pravin K. Trivedi
Indiana University, Bloomington
Prepared for the 2010 Mexican Stata Users Group meeting, April 29, 2010.
Based on:
A. Colin Cameron and Pravin K. Trivedi (2005), Microeconometrics: Methods and Applications (MMA), C.U.P., chapters 21-23,
and
A. Colin Cameron and Pravin K. Trivedi (2010), Microeconometrics Using Stata, Revised Edition (MUSR), Stata Press, chapters 8 and 18.
Introduction
0. Dedication
1. Introduction
Objective 1: To survey recent developments in count data panel models.
Objective 2: To evaluate the advances made against the background of the main features of count data.
Objective 3: To highlight the areas where significant gaps exist and review the most promising approaches.
Background (1)
Panel data are repeated measures on individuals (i) over time (t): data are $(y_{it}, x_{it})$ for i = 1, ..., N and t = 1, ..., T, and $y_{it}$ are nonnegative integer-valued outcomes.
Conditional on $x_{it}$, the $y_{it}$ are likely to be serially correlated for a given i, partly because of state dependence and partly because of serial correlation in shocks.
Hence each additional year of data is not independent of previous years.
Cross-sectional dependence between observations is also to be expected given the emphasis on stratified clustered sampling designs.
(1) Pervasive unobserved heterogeneity, (2) a typically high proportion of zeros, and (3) inherent discreteness and heteroskedasticity generate complications that are hard to handle simultaneously.
Finally, the researcher's interest often goes beyond the conditional mean.
How well does available software (Stata) handle these issues?
2. Basic linear panel models review

Pooled model (or population-averaged)
$y_{it} = \alpha + x_{it}'\beta + u_{it}$.  (1)
Two-way effects model allows intercept to vary over i and t
$y_{it} = \alpha_i + \gamma_t + x_{it}'\beta + \varepsilon_{it}$.  (2)
Individual-specific effects model
$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$,  (3)
where $\alpha_i$ may be a fixed effect or a random effect.
Mixed model or random coefficients model allows slopes to vary over i
$y_{it} = \alpha_i + x_{it}'\beta_i + \varepsilon_{it}$.  (4)
3. Fixed effects versus random effects model

Individual-specific effects model:
$y_{it} = x_{it}'\beta + (\alpha_i + \varepsilon_{it})$.
Fixed effects (FE):
  - $\alpha_i$ is a random variable possibly correlated with $x_{it}$
  - so regressor $x_{it}$ may be endogenous (with respect to $\alpha_i$ but not $\varepsilon_{it}$), e.g. education is correlated with time-invariant ability
  - pooled OLS, pooled GLS, RE are inconsistent for $\beta$
  - within (FE) and first difference estimators are consistent.
Random effects (RE) or population-averaged (PA):
  - $\alpha_i$ is purely random (usually iid $(0, \sigma_\alpha^2)$) and unrelated to $x_{it}$
  - so regressor $x_{it}$ is exogenous
  - appropriate FE and RE estimators are consistent for $\beta$.
Fundamental divide: microeconometricians FE versus others RE.
Nonlinear panel models
Overview
4. Some features of nonlinear panel models

In contrast to linear models, solutions for nonlinear models tend to lack generality and are model-specific.
Standard count models include: Poisson and negative binomial.
General approaches are similar to those for the linear case:
  - Pooled estimation or population-averaged
  - Random effects
  - Fixed effects
Complications:
  - Random effects often not tractable, so need numerical integration
  - Fixed effects models in short panels are generally not estimable due to the incidental parameters problem.
  - Count models involve discreteness, nonlinearity and intrinsic heteroskedasticity.
Some Standard Cross-section Count Models
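Model: f(y) = Pr[Y = y]; mean; variance

Poisson: $f(y) = e^{-\mu}\mu^{y}/y!$; mean $\mu(x) = \exp(x'\beta)$; variance $\mu(x)$.
NB1: as in NB2 below with $\alpha^{-1}$ replaced by $\alpha^{-1}\mu$; mean $\mu(x)$; variance $(1+\alpha)\mu(x)$.
NB2: $f(y) = \frac{\Gamma(\alpha^{-1}+y)}{\Gamma(\alpha^{-1})\Gamma(y+1)} \left(\frac{\alpha^{-1}}{\alpha^{-1}+\mu}\right)^{\alpha^{-1}} \left(\frac{\mu}{\mu+\alpha^{-1}}\right)^{y}$; mean $\mu(x)$; variance $(1+\alpha\mu(x))\mu(x)$.
Hurdle: $f(y) = f_1(0)$ if $y = 0$, and $f(y) = \frac{1-f_1(0)}{1-f_2(0)} f_2(y)$ if $y \geq 1$; mean $\Pr[y>0|x]\, E_{y>0}[y|y>0,x]$.
ZI: $f(y) = f_1(0) + (1-f_1(0))f_2(0)$ if $y = 0$, and $f(y) = (1-f_1(0))f_2(y)$ if $y \geq 1$; mean $(1-f_1(0))\mu(x)$; variance $(1-f_1(0))(\mu(x)+f_1(0)\mu^2(x))$.
FMM: $f(y) = \sum_{j=1}^{m}\pi_j f_j(y|\theta_j)$; mean (two components) $\sum_{i=1}^{2}\pi_i\mu_i(x)$; variance $\sum_{i=1}^{2}\pi_i[\mu_i(x)+\mu_i^2(x)] - [\sum_{i=1}^{2}\pi_i\mu_i(x)]^2$.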
A pooled or population-averaged (PA) model may be used.
  - This is the same model as in the cross-section case, with adjustment for correlation over time for a given individual.
A fully parametric model may be specified, with separable heterogeneity and conditional density
$f(y_{it}|\alpha_i, x_{it}) = f(y_{it}, \alpha_i + x_{it}'\beta, \gamma)$,  t = 1, ..., T_i, i = 1, ..., N,  (5)
or nonseparable heterogeneity
$f(y_{it}|\alpha_i, x_{it}) = f(y_{it}, \alpha_i + x_{it}'\beta_i, \gamma)$,  t = 1, ..., T_i, i = 1, ..., N,  (6)
where $\gamma$ denotes additional model parameters such as variance parameters and $\alpha_i$ is an individual effect.
A semiparametric conditional mean (usually exponential mean) model may be specified, with additive effects
$E[y_{it}|\alpha_i, x_{it}] = \alpha_i + g(x_{it}'\beta)$  (7)
or multiplicative effects
$E[y_{it}|\alpha_i, x_{it}] = \alpha_i \times g(x_{it}'\beta)$.  (8)
5. Evolution of Panel Models (1)
Focus on panel methods most commonly used by

microeconometricians. The underlying asymptotic theory assumes
short panels (T small, N large): data on many individual units and
few time periods.
The key paper in the modern treatment of panel analysis for counts is
Hausman et al. (1984).
The developments since 1984 can be summarized in generational
terms as follows.
Evolution of Panel Models (2)

                  G-1 models           G-2 models                  G-3 models
Period            1974-1990            1991-2000                   Post-2000
Function          Mainly parametric    Flexible parametric         Parametric / SP
CS models         Poisson, Negbin      Hurdles, finite mixtures,   Quantile reg;
                                       ZIP                         selection models
Panel models      Poisson, Negbin      Poisson, Negbin, EM         EM; QR
Unobs. hetero.    Multiplicative       Separable or nonsep.        Flexible; non-sep
Modeling          Mainly RE or PA      RE, PA and fixed effects    RE/PA/FE/correlated RE; DV
Variance est.     Robust wrt           Robust wrt OD               Robust or Cl-Rob
                  overdispersion                                   wrt OD/SC
Dynamics          Lagged x's           Exponential feedback        Linear or exponential
Endogeneity       Largely ignored      Allowed in RE models        Allowed in RE and FE
Estimators        Mainly MLE           MLE; GEE; NLIV              MLE; GEE; NLGMM; QR; QRIV
6. Remarks on the evolution of count panel models (2)
FE panel data counterparts of several popular cross-section models, such as hurdles, FMM, and ZIP, are undeveloped.
When several complications occur simultaneously (e.g. nonseparable individual-specific effects and endogenous regressors) they are most conveniently analyzed in RE, PA, or moment-based models.
Fully parametric methods for simultaneously handling endogeneity plus something else (e.g. nonseparable UH) are largely absent, and moment-based methods are a dominant alternative.
Overdispersion-robust and cluster-robust estimation of variances is now feasible and very common.
Nonlinear panel estimators
Pooled or population-averaged estimators
7. Nonlinear: Pooled or population-averaged estimators

Extend pooled OLS to the nonlinear case
  - Give the usual cross-section command for conditional mean models or conditional density models, but then get cluster-robust standard errors
  - Poisson example:
      poisson y x, vce(cluster id)
    or
      xtgee y x, fam(poisson) link(log) corr(ind) vce(robust)
    (for xtgee, vce(robust) gives s.e.s clustered on the panel identifier)
Extend pooled feasible GLS to the nonlinear case
  - Estimate with an assumed correlation structure over time
  - Equicorrelated Poisson example:
      xtpoisson y x, pa vce(boot)
    or
      xtgee y x, fam(poisson) link(log) corr(exch) vce(robust)
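As a rough sketch of how the pooled and population-averaged commands above might be compared in practice, the do-file fragment below stores three sets of estimates and tabulates coefficients with their standard errors. The names y, x, id and year are placeholders rather than variables from the source, and the exchangeable working correlation is just one possible choice.

```stata
* Placeholder names (assumptions): y = count outcome, x = regressor,
* id = individual identifier, year = time index
xtset id year

* Pooled Poisson with default s.e.s (treats all observations as independent)
poisson y x
estimates store pois_default

* Pooled Poisson with s.e.s clustered on the individual
poisson y x, vce(cluster id)
estimates store pois_cluster

* Population-averaged Poisson with an equicorrelated working matrix
xtgee y x, family(poisson) link(log) corr(exchangeable) vce(robust)
estimates store pa_exch

* Side-by-side coefficients and standard errors
estimates table pois_default pois_cluster pa_exch, b(%9.4f) se(%9.4f)
```

The point estimates in the first two columns are identical by construction; only the standard errors change, which is the comparison of interest in panel data.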
Nonlinear random effects estimators

Assume the individual-specific effect $\alpha_i$ has specified distribution $g(\alpha_i|\eta)$.
Then the unconditional density for the ith observation is
$f(y_{i1}, ..., y_{iT}|x_{i1}, ..., x_{iT}, \beta, \gamma, \eta) = \int \left[\prod_{t=1}^{T} f(y_{it}|x_{it}, \alpha_i, \beta, \gamma)\right] g(\alpha_i|\eta)\, d\alpha_i$.  (9)
Analytical solution:
  - For Poisson with gamma random effect
  - For negative binomial with gamma effect
  - Use xtpoisson, re and xtnbreg, re
No analytical solution:
  - For other models.
  - Instead use numerical integration (only univariate integration is required).
  - Assume normally distributed random effects.
  - Use re option for xtlogit, xtprobit
  - Use normal option for xtpoisson and xtnbreg
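To illustrate why the Poisson-gamma case has an analytical solution, the integral can be worked out explicitly. The sketch below assumes, for illustration only, that $\alpha_i$ is gamma distributed with shape and rate both equal to $\eta$ (so that $E[\alpha_i] = 1$ and $V[\alpha_i] = 1/\eta$), and writes $\lambda_{it} = \exp(x_{it}'\beta)$; this parameterization is an assumption and may differ from the one used by a given software implementation.

```latex
\begin{aligned}
f(y_{i1},\ldots,y_{iT}\mid x_{i1},\ldots,x_{iT})
 &= \int_0^\infty \Big[\prod_{t=1}^{T}
      \frac{e^{-\alpha_i\lambda_{it}}(\alpha_i\lambda_{it})^{y_{it}}}{y_{it}!}\Big]
    \frac{\eta^{\eta}}{\Gamma(\eta)}\,\alpha_i^{\eta-1}e^{-\eta\alpha_i}\,d\alpha_i \\
 &= \Big[\prod_{t=1}^{T}\frac{\lambda_{it}^{y_{it}}}{y_{it}!}\Big]
    \frac{\eta^{\eta}}{\Gamma(\eta)}
    \int_0^\infty \alpha_i^{\sum_t y_{it}+\eta-1}\,
      e^{-\alpha_i\left(\sum_t\lambda_{it}+\eta\right)}\,d\alpha_i \\
 &= \Big[\prod_{t=1}^{T}\frac{\lambda_{it}^{y_{it}}}{y_{it}!}\Big]
    \frac{\Gamma\!\left(\sum_t y_{it}+\eta\right)}{\Gamma(\eta)}
    \Big(\frac{\eta}{\sum_t\lambda_{it}+\eta}\Big)^{\eta}
    \Big(\sum_t\lambda_{it}+\eta\Big)^{-\sum_t y_{it}} .
\end{aligned}
```

The last step uses the gamma integral $\int_0^\infty a^{k-1}e^{-ba}\,da = \Gamma(k)/b^{k}$, so no numerical integration is needed in this case.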
9. Finite Mixture or Latent Class model

Suppose the sample is generated from the following dgp:
$f(y_{it}|x_{it}, \Theta) = \sum_{j=1}^{C-1} \pi_j f_j(y_{it}|x_{it}, \theta_j) + \pi_C f_C(y_{it}|x_{it}, \theta_C)$,  (10)
where $\sum_{j=1}^{C}\pi_j = 1$, $\pi_j > 0$ (j = 1, ..., C). For identifiability, use the labelling restriction $\pi_1 \geq \pi_2 \geq ... \geq \pi_C$, always satisfied by rearrangement, postestimation.
This specification accommodates discrete nonseparable heterogeneity between latent classes.
Long history in statistics; see McLachlan and Basford (1988). Earlier treatments emphasized univariate formulations; Lindsay (1995) emphasized identification and complexity. Special cases: Heckman and Singer (1984).
The probability distribution $f(y_i|\hat\Theta; \hat C)$ that maximizes $L(\pi, \Theta, C|y)$ is called the semiparametric maximum likelihood estimator.
$f_j(y_i|\theta_j)$ can itself be a flexible functional form that accommodates within-class heterogeneity.
C can be chosen using the hypothesis testing approach or the model comparison approach.
Determining the number of components is a nonstandard inference problem, as testing is at the boundary of the parameter space.
  - Simple approach is to use BIC or CAIC.
  - Or do an appropriate bootstrap for the likelihood ratio test.
Can be implemented using Stata's fmm command, such as

fmm y $xlist1, vce(robust) components(3) mixtureof(poisson)
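A minimal sketch of the BIC-based choice of C, assuming the user-written fmm command referred to above is installed and that the global macro xlist1 holds the regressor list (y is a placeholder outcome name). The vce(robust) option is dropped here because the information criteria are likelihood-based; that is a choice made for this illustration, not something taken from the source.

```stata
* Assumptions: user-written fmm is installed; $xlist1 is defined; y is a count
forvalues c = 2/3 {
    quietly fmm y $xlist1, components(`c') mixtureof(poisson)
    estimates store fmm`c'
}

* AIC and BIC for each stored model; the smaller BIC is preferred
estimates stats fmm2 fmm3
```

Recent official Stata releases also ship a built-in fmm prefix with a different syntax, so the call above may need adapting depending on the installation.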
10. Quantile regression

The qth quantile regression estimator $\hat\beta_q$ minimizes over $\beta_q$
$Q_N(\beta_q) = \sum_{i: y_{it} \geq x_{it}'\beta_q} q\,|y_{it} - x_{it}'\beta_q| + \sum_{i: y_{it} < x_{it}'\beta_q} (1-q)\,|y_{it} - x_{it}'\beta_q|$,  0 < q < 1.
Example: median regression with q = 0.5.
Continuation transform: For count y adapt standard methods for continuous y by:
  - Replace count y by continuous variable z = y + u where u ~ Uniform[0, 1): "jittering step"
  - Then reconvert predicted z-quantile to y-quantile using the ceiling function.
  - Machado and Santos Silva (JASA, 2005).
Adapting to the exponential mean

Conventional count models are based on the exponential conditional mean, $\exp(x'\beta)$, rather than $x'\beta$.
$Q_q(y|x)$ and $Q_q(z|x)$ denote the qth quantiles of the conditional distributions of y and z, respectively. To allow for exponentiation, $Q_q(z|x)$ is specified to be
$Q_q(z|x) = q + \exp(x'\beta_q)$.
The additional term q appears because $Q_q(z|x)$ is bounded from below by q, due to jittering.
Log transformation is applied so that $\ln(z - q)$ is modelled, with an adjustment if $z - q < 0$.
Transformation is justified by the property that quantiles are equivariant to monotonic transformation.
Implementation
Post-estimation transformation of the z-quantiles back to y-quantiles uses the ceiling function, with
$Q_q(y|x) = \lceil Q_q(z|x) - 1 \rceil$,
where the symbol $\lceil r \rceil$ on the right-hand side denotes the smallest integer greater than or equal to r.
To reduce the effect of jittering, the model is estimated multiple times using independent draws from the U(0, 1) distribution, and estimated coefficients and confidence interval endpoints are averaged. Hence the estimates of the quantiles of y counts are based on
$\hat Q_q(y|x) = \lceil \hat Q_q(z|x) - 1 \rceil = \lceil q + \exp(x'\bar\beta_q) - 1 \rceil$,
where $\bar\beta_q$ denotes the average over the jittered replications.
Variance estimation is usually based on the computationally intensive bootstrap.
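The jittering steps described above can be sketched by hand for a single jittered sample. This is only an illustration under stated assumptions: y and x are placeholder variable names, the seed is arbitrary, the small tolerance used when z - q is not positive is an arbitrary choice, and a real application would average over many jittered replications rather than use one draw.

```stata
* Placeholder names (assumptions): y = count outcome, x = regressor
set seed 10101
local q   = 0.5          // quantile of interest (median)
local tol = 1e-5         // arbitrary small value used when z - q <= 0

* Jittering step: z = y + u with u ~ Uniform[0,1)
generate double u = runiform()
generate double z = y + u

* Transformed dependent variable: ln(z - q), with adjustment when z - q is small
generate double lnzq = cond(z - `q' > `tol', ln(z - `q'), ln(`tol'))

* Linear quantile regression on the transformed scale
qreg lnzq x, quantile(`q')
predict double xb_q, xb

* Back-transform: Q_q(z|x) = q + exp(x'b), then Q_q(y|x) = ceil(Q_q(z|x) - 1)
generate yq = ceil(`q' + exp(xb_q) - 1)
```

Because only one draw of u is used, the resulting yq will vary with the seed; averaging coefficients over repeated draws is what removes that dependence.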
The QCR method of Machado and Santos Silva can be implemented using the Stata add-on command qcount, due to Miranda (2006). The command syntax is:

qcount depvar [indepvars] [if] [in], quantile(number) [repetition(#)]

where quantile(number) specifies the quantile to be estimated and repetition(#) specifies the number of jittered samples to be used to calculate the parameters of the model, the default value being 1000.
Panel models can be estimated treating the data as repeated cross sections, as in the PA approach.
Main attraction is the ability to study differences in marginal effects at different quantiles.
The post-estimation command qcount_mfx computes marginal effects for the model, evaluated at the means of the regressors.
QCR Example - Winkelmann, JHE 2006

Using an unbalanced sample (1995-1999) from GSOEP, Winkelmann analyzes the differential impact of healthcare reform on the distribution of doctor visits across quantiles.
QR and panel data: pros and cons
Excess zeros can make identification of lower quantiles difficult.
Can QR accommodate fixed and random effects?
Interpretation of fixed effects in the QR context is somewhat tenuous; see Koenker (2004).
QR has been extended to accommodate censoring and endogenous regressors; see Chernozhukov et al. (2009).
QR has also been extended to handle a lagged dependent variable.
11. Nonlinear random slopes estimators

Can extend to random slopes by adding an assumption about the distribution of slopes.
  - Nonlinear generalization of xtmixed
  - Then higher-dimensional numerical integral.
  - Use adaptive Gaussian quadrature
Stata commands are:
  - xtmelogit for binary data
  - xtmepoisson for counts
Stata add-on that is very rich:
  - gllamm (generalized linear and latent mixed models); can be quite slow!
  - Developed by Sophia Rabe-Hesketh and Anders Skrondal.
12. Nonlinear fixed effects estimators

In general not possible in short panels.
Incidental parameters problem:
  - N fixed effects $\alpha_i$ plus K regressors means (N + K) parameters
  - But (N + K) → ∞ as N → ∞
  - Need to eliminate $\alpha_i$ by some sort of differencing, or a concentrated likelihood argument
  - possible for Poisson, negative binomial
Stata commands
  - xtpoisson, fe (better to use xtpqml, as it gives robust s.e.s)
  - xtnbreg, fe
Fixed effects extensions to hurdle, finite mixture, zero-inflated models currently not available.
Incidental parameters in Poisson regression

Derivation of the fixed effects estimator for the Poisson panel
Poisson MLE simultaneously estimates $\beta$ and $\alpha_1, ..., \alpha_N$. The log-likelihood is
$\ln L(\beta, \alpha) = \ln\left[\prod_i \prod_t \{\exp(-\alpha_i\lambda_{it})(\alpha_i\lambda_{it})^{y_{it}}/y_{it}!\}\right]$
$= \sum_i \left[-\alpha_i \sum_t \lambda_{it} + \ln\alpha_i \sum_t y_{it} + \sum_t y_{it}\ln\lambda_{it} - \sum_t \ln y_{it}!\right]$,  (11)
where $\lambda_{it} = \exp(x_{it}'\beta)$.
FOC with respect to $\alpha_i$ yields $\hat\alpha_i = \sum_t y_{it} / \sum_t \lambda_{it}$ (a sufficient statistic for $\alpha_i$).
Substituting this yields the concentrated likelihood function. Dropping terms not involving $\beta$,
$\ln L_{conc}(\beta) \propto \sum_i \sum_t \left[y_{it}\ln\lambda_{it} - y_{it}\ln\left(\sum_s \lambda_{is}\right)\right]$.  (12)
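The substitution step can be spelled out term by term. The sketch below uses $\lambda_{it} = \exp(x_{it}'\beta)$ as in the text and simply plugs $\hat\alpha_i$ into the two terms of the log-likelihood that involve $\alpha_i$.

```latex
\begin{aligned}
\hat\alpha_i &= \frac{\sum_t y_{it}}{\sum_s \lambda_{is}}, \\[4pt]
-\hat\alpha_i \sum_t \lambda_{it} &= -\sum_t y_{it}, \\[4pt]
\ln\hat\alpha_i \sum_t y_{it}
  &= \sum_t y_{it}\,\ln\Big(\sum_t y_{it}\Big)
   - \sum_t y_{it}\,\ln\Big(\sum_s \lambda_{is}\Big).
\end{aligned}
```

The first result and the first part of the second do not depend on $\beta$, so the only terms that survive are $\sum_t y_{it}\ln\lambda_{it}$ from the original log-likelihood and $-\sum_t y_{it}\ln(\sum_s\lambda_{is})$ from the substitution, which is the concentrated log-likelihood up to a constant.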
Interpretation
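Here there is no incidental parameters problem.
Consistent estimates of $\beta$ for fixed T and N → ∞ can be obtained by maximization of $\ln L_{conc}(\beta)$.
FOC with respect to $\beta$ yields first-order conditions
$\sum_i \sum_t \left[y_{it}x_{it} - y_{it}\frac{\sum_s \lambda_{is}x_{is}}{\sum_s \lambda_{is}}\right] = 0$,
that can be re-expressed as
$\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}\left(y_{it} - \frac{\lambda_{it}}{\bar\lambda_i}\bar y_i\right) = 0$,  (13)
where $\bar y_i = T^{-1}\sum_t y_{it}$ and $\bar\lambda_i = T^{-1}\sum_t \lambda_{it}$.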
FE Poisson: pros and cons

Time-invariant regressors will also be eliminated by the transformation. Some marginal effects are not identified.
May substitute individual-specific dummy variables, though this raises some computational issues.
Poisson and linear panel models are special in that simultaneous estimation of $\beta$ and $\alpha$ provides consistent estimates of $\beta$ in short panels, so there is no incidental parameters problem.
The above assumes strict exogeneity of regressors.
We can handle endogenous regressors under a weak exogeneity assumption. A moment condition estimator can be defined using the FOC (13).
This FE approach does not extend to several empirically important models: hurdle, FMM, and ZIP.
Ad hoc methods for handling fixed effects
Are we making too much of the fixed effects and the associated incidental parameter problem?
The dummy variables solution; Allison (2009); Greene (2004).
Computation with Stata Commands
13. Stata Commands

Estimator        Count data commands
Pooled           poisson; nbreg
Quantile         qcount, q(#) rep(#)
FMM              fmm, components(#) mixtureof(poisson)
                 fmm, components(#) mixtureof(nbreg)
GEE (PA)         xtgee, family(poisson)
                 xtgee, family(nbinomial)
RE               xtpoisson, re
                 xtnbreg, re
Random slopes    xtmepoisson
FE               xtpoisson, fe
                 xtnbreg, fe

FMM is not part of official Stata but is in the public domain and can be added.
Data example
Panel counts: data example

Data from the RAND Health Insurance Experiment.
  - y is number of doctor visits.

. use mus18data.dta, clear
. describe mdu lcoins ndisease female age lfam child id year

              storage  display   value
variable name   type   format    label    variable label
-----------------------------------------------------------------------
mdu             float  %9.0g              number face-to-fact md visits
lcoins          float  %9.0g              log(coinsurance+1)
ndisease        float  %9.0g              count of chronic diseases -- ba
female          float  %9.0g              female
age             float  %9.0g              age that year
lfam            float  %9.0g              log of family size
child           float  %9.0g              child
id              float  %9.0g              person id, leading digit is sit
year            float  %9.0g              study year
Dependent variable mdu is very overdispersed: $\hat V[y] = 4.50^2 \simeq 7 \times \bar y$.

. summarize mdu lcoins ndisease female age lfam child id year

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
         mdu |     20186    2.860696    4.504765           0         77
      lcoins |     20186    2.383588    2.041713           0   4.564348
    ndisease |     20186     11.2445    6.741647           0       58.6
      female |     20186    .5169424    .4997252           0          1
         age |     20186    25.71844    16.76759           0   64.27515
-------------+---------------------------------------------------------
        lfam |     20186    1.248404    .5390681           0   2.639057
       child |     20186    .4014168    .4901972           0          1
          id |     20186    357971.2    180885.6      125024     632167
        year |     20186    2.420044    1.217237           1          5
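The overdispersion claim can be checked directly from the stored results of summarize. The fragment below is a small sketch that assumes the mus18data.dta file named in the slides is in the working directory.

```stata
use mus18data.dta, clear

* summarize leaves the mean and variance in r(mean) and r(Var)
quietly summarize mdu
display "mean                = " r(mean)
display "variance            = " r(Var)
display "variance/mean ratio = " r(Var)/r(mean)
```

Using the figures reported in the summary table, 4.504765 squared divided by 2.860696 is about 7.1, far above the ratio of 1 implied by a Poisson distribution.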
Panel is unbalanced. Most are in for 3 years or 5 years.

. xtdescribe

      id:  125024, 125025, ..., 632167                 n =       5908
    year:  1, 2, ..., 5                                T =          5
           Delta(year) = 1 unit
           Span(year)  = 5 periods
           (id*year uniquely identifies each observation)

Distribution of T_i:   min      5%     25%     50%     75%     95%     max
                         1       2       3       3       5       5       5
For mdu both within and between variation are important.

. * Panel summary of dependent variable
. xtsum mdu

Variable         |      Mean   Std. Dev.        Min        Max |    Observations
-----------------+---------------------------------------------+----------------
mdu      overall |  2.860696   4.504765           0         77 |     N =   20186
         between |             3.785971           0   63.33333 |     n =    5908
         within  |             2.575881   -34.47264    40.0607 | T-bar = 3.41672

Only time-varying regressors are age, lfam and child.
And these have mainly between variation.
This will make the within or fixed effects estimator very imprecise.
Likelihood-based Panel Count Estimators
14. Panel Poisson
Consider four panel Poisson estimators
  - Pooled Poisson with cluster-robust errors
  - Population-averaged Poisson (GEE)
  - Poisson random effects (gamma and normal)
  - Poisson fixed effects
Can additionally apply most of these to negative binomial.
And can extend FE to dynamic panel Poisson where $y_{i,t-1}$ is a regressor.
MLE Estimation
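Model: moment specification; estimating equations

Pooled Poisson: $E[y_{it}|x_{it}] = \exp(x_{it}'\beta)$; $\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}(y_{it} - \lambda_{it}) = 0$, where $\lambda_{it} = \exp(x_{it}'\beta)$.
PA: $\rho_{ts} = \mathrm{Cor}[(y_{it} - \exp(x_{it}'\beta))(y_{is} - \exp(x_{is}'\beta))]$.
RE Poisson: $E[y_{it}|\alpha_i, x_{it}] = \alpha_i\exp(x_{it}'\beta)$; $\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}\left(y_{it} - \lambda_{it}\frac{\bar y_i + \eta/T}{\bar\lambda_i + \eta/T}\right) = 0$, where $\bar\lambda_i = T^{-1}\sum_t \exp(x_{it}'\beta)$ and $\eta = 1/\mathrm{Var}(\alpha_i)$.
FE Poisson: $E[y_{it}|\alpha_i, x_{it}] = \alpha_i\exp(x_{it}'\beta)$; $\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}\left(y_{it} - \lambda_{it}\frac{\bar y_i}{\bar\lambda_i}\right) = 0$.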
15. Panel Poisson method 1: pooled Poisson
Specify
$y_{it}|x_{it}, \beta \sim \text{Poisson}[\exp(x_{it}'\beta)]$
Pooled Poisson of $y_{it}$ on intercept and $x_{it}$ gives consistent $\hat\beta$.
  - But get cluster-robust standard errors where cluster on the individual.
  - These control for both overdispersion and correlation over t for given i.
Pooled Poisson with cluster-robust standard errors

. * Pooled Poisson estimator with cluster-robust standard errors
. poisson mdu lcoins ndisease female age lfam child, vce(cluster id)

Poisson regression                              Number of obs   =      20186
                                                Wald chi2(6)    =     476.93
                                                Prob > chi2     =     0.0000
Log pseudolikelihood = -62579.401               Pseudo R2       =     0.0609

                                (Std. Err. adjusted for 5908 clusters in id)
-----------------------------------------------------------------------------
             |               Robust
         mdu |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
      lcoins |  -.0808023   .0080013   -10.10   0.000   -.0964846   -.0651199
    ndisease |   .0339334   .0026024    13.04   0.000    .0288328     .039034
      female |   .1717862   .0342551     5.01   0.000    .1046473    .2389251
         age |   .0040585   .0016891     2.40   0.016     .000748    .0073691
        lfam |  -.1481981   .0323434    -4.58   0.000     -.21159   -.0848062
       child |   .1030453   .0506901     2.03   0.042    .0036944    .2023961
       _cons |    .748789   .0785738     9.53   0.000    .5947872    .9027907
-----------------------------------------------------------------------------

By comparison, the default (non cluster-robust) s.e.s are 1/4 as large.
So the default (non cluster-robust) t-statistics are 4 times as large!
Population-averaged Poisson (PA or GEE)
16. Panel Poisson method 2: population-averaged

Assume that for the ith observation moments are like those for GLM Poisson
$E[y_{it}|x_{it}] = \exp(x_{it}'\beta)$
$V[y_{it}|x_{it}] = \phi \times \exp(x_{it}'\beta)$.
Stack the conditional means for the ith individual:
$E[y_i|X_i] = m_i(\beta) = [\exp(x_{i1}'\beta), ..., \exp(x_{iT}'\beta)]'$,
where $y_i = [y_{i1}, ..., y_{iT}]'$ and $X_i = [x_{i1}, ..., x_{iT}]'$.
Stack the conditional variances for the ith individual.
  - With no correlation
    $V[y_i|X_i] = \phi H_i(\beta) = \phi \times \mathrm{Diag}[\exp(x_{it}'\beta)]$.
Assume a pattern $R(\alpha)$ for autocorrelation over t for given i, so
$V[y_i|X_i] = \phi H_i(\beta)^{1/2} R(\alpha) H_i(\beta)^{1/2}$.
This is called a working matrix.
  - Example: $R(\alpha) = I$ if there is no correlation
  - Example: $R(\alpha) = R(\rho)$ has diagonal entries 1 and off-diagonal entries $\rho$ if there is equicorrelation.
  - Example: $R(\alpha) = R$ where diagonal entries are 1 and off-diagonals are unrestricted (< 1).
GLM Estimation
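17. Stata's GEE command

The GLM estimator solves
$\sum_{i=1}^{N} \frac{\partial m_i'(\beta)}{\partial\beta} H_i(\beta)^{-1}(y_i - m_i(\beta)) = 0$.
The generalized estimating equations (GEE) estimator, or population-averaged (PA) estimator, of Liang and Zeger (1986) solves
$\sum_{i=1}^{N} \frac{\partial m_i'(\beta)}{\partial\beta} \hat\Omega_i^{-1}(y_i - m_i(\beta)) = 0$,
where $\hat\Omega_i$ equals the working variance matrix $\Omega_i$ with $R(\alpha)$ replaced by $R(\hat\alpha)$, where plim $\hat\alpha = \alpha$.
The cluster-robust estimate of the variance matrix of the GEE estimator is
$\hat V[\hat\beta_{GEE}] = (\hat D'\hat\Omega^{-1}\hat D)^{-1}\left(\sum_{g=1}^{G} \hat D_g'\hat\Omega_g^{-1}\hat u_g\hat u_g'\hat\Omega_g^{-1}\hat D_g\right)(\hat D'\hat\Omega^{-1}\hat D)^{-1}$,
where $\hat D_g = \partial m_g'(\beta)/\partial\beta|_{\hat\beta}$, $\hat D = [\hat D_1, ..., \hat D_G]'$, $\hat u_g = y_g - m_g(\hat\beta)$, and now $\hat\Omega_g = H_g(\hat\beta)^{1/2} R(\hat\alpha) H_g(\hat\beta)^{1/2}$.
  - The asymptotic theory requires that G → ∞.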
Population-averaged Poisson with unstructured correlation

GEE population-averaged model                   Number of obs      =     20186
Group and time vars:          id year           Number of groups   =      5908
Link:                              log          Obs per group: min =         1
Family:                        Poisson                         avg =       3.4
Correlation:              unstructured                         max =         5
                                                Wald chi2(6)       =    508.61
Scale parameter:                                Prob > chi2        =    0.0000

                                      (Std. Err. adjusted for clustering on id)
-----------------------------------------------------------------------------
             |             Semi-robust
         mdu |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
      lcoins |  -.0804454   .0077782   -10.34   0.000   -.0956904   -.0652004
    ndisease |   .0346067   .0024238    14.28   0.000    .0298561    .0393573
      female |   .1585075   .0334407     4.74   0.000    .0929649    .2240502
         age |   .0030901   .0015356     2.01   0.044    .0000803    .0060999
        lfam |  -.1406549   .0293672    -4.79   0.000   -.1982135   -.0830962
       child |   .1013677     .04301     2.36   0.018    .0170696    .1856658
       _cons |   .7764626   .0717221    10.83   0.000    .6358897    .9170354
-----------------------------------------------------------------------------

Generally s.e.s are within 10% of pooled Poisson cluster-robust s.e.s.
The default (non cluster-robust) t-statistics are 3.5 to 4 times larger, as they do not control for overdispersion.
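The command that produced this output is not shown in the slides. A command of roughly the following form would fit a population-averaged Poisson model with an unstructured working correlation and cluster-robust standard errors; treat it as a sketch, since the exact options used by the author are an assumption here.

```stata
use mus18data.dta, clear
xtset id year

* Population-averaged (GEE) Poisson with unstructured working correlation
xtgee mdu lcoins ndisease female age lfam child, ///
    family(poisson) link(log) corr(unstructured) vce(robust)

* Inspect the estimated working correlation matrix stored by xtgee
matrix list e(R)
```

An equivalent way to request the same estimator is through xtpoisson with the pa option and the same corr() and vce() choices.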
The correlations $\mathrm{Cor}[y_{it}, y_{is}|x_i]$ for PA (unstructured) are not equal.
But they are not declining as fast as AR(1).

. matrix list e(R)

symmetric e(R)[5,5]
           c1         c2         c3         c4         c5
r1          1
r2  .53143297          1
r3  .40817495  .58547795          1
r4  .32357326  .35321716  .54321752          1
r5  .34152288  .29803555  .43767583  .61948751          1
18. Panel Poisson method 3: random effects

Poisson random effects model is
$y_{it}|x_{it}, \beta, \alpha_i \sim \text{Poiss}[\alpha_i\exp(x_{it}'\beta)] \sim \text{Poiss}[\exp(\ln\alpha_i + x_{it}'\beta)]$
where $\alpha_i$ is unobserved but is not correlated with $x_{it}$.
RE estimator 1: Assume $\alpha_i$ is Gamma[1, $\alpha$] distributed
  - closed-form solution exists (negative binomial)
  - $E[y_{it}|x_{it}, \beta] = \exp(x_{it}'\beta)$
RE estimator 2: Assume $\ln\alpha_i$ is $N[0, \sigma_u^2]$ distributed
  - closed-form solution does not exist (one-dimensional integral)
  - can extend to slope coefficients (higher-dimensional integral)
  - $E[y_{it}|x_{it}, \beta] = \exp(x_{it}'\beta)$ aside from translation of intercept.
Poisson random effects (gamma) with panel bootstrap s.e.s

Random-effects Poisson regression               Number of obs      =     20186
Group variable: id                              Number of groups   =      5908

Random effects u_i ~ Gamma                      Obs per group: min =         1
                                                               avg =       3.4
                                                               max =         5

                                                Wald chi2(6)       =    529.10
Log likelihood  = -43240.556                    Prob > chi2        =    0.0000

                                    (Replications based on 5908 clusters in id)
-----------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
         mdu |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
      lcoins |  -.0878258   .0086097   -10.20   0.000   -.1047004   -.0709511
    ndisease |   .0387629   .0026904    14.41   0.000    .0334899    .0440359
      female |   .1667192   .0379216     4.40   0.000    .0923942    .2410442
         age |   .0019159   .0016242     1.18   0.238   -.0012675    .0050994
        lfam |  -.1351786   .0308529    -4.38   0.000   -.1956492   -.0747079
       child |   .1082678   .0495487     2.19   0.029    .0111541    .2053816
       _cons |   .7574177   .0754536    10.04   0.000    .6095314     .905304
-------------+---------------------------------------------------------------
    /lnalpha |   .0251256   .0270297                    -.0278516    .0781029
-------------+---------------------------------------------------------------
       alpha |   1.025444   .0277175                     .9725326    1.081234
-----------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 3.9e+04 Prob>=chibar2 = 0.000

The default (non cluster-robust) t-statistics are 2.5 times larger.
19. Poisson fixed effects with panel bootstrap s.e.s

. xtpoisson mdu lcoins ndisease female age lfam child, fe vce(boot, reps(100) seed(10
(running xtpoisson on estimation sample)

Bootstrap replications (100)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100

Conditional fixed-effects Poisson regression    Number of obs      =     17791
Group variable: id                              Number of groups   =      4977

                                                Obs per group: min =         2
                                                               avg =       3.6
                                                               max =         5

                                                Wald chi2(3)       =      4.64
Log likelihood  = -24173.211                    Prob > chi2        =    0.2002

                                    (Replications based on 4977 clusters in id)
-----------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
         mdu |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
         age |  -.0112009   .0095077    -1.18   0.239   -.0298356    .0074339
        lfam |   .0877134   .1125783     0.78   0.436    -.132936    .3083627
       child |   .1059867   .0738452     1.44   0.151   -.0387472    .2507206
-----------------------------------------------------------------------------

The default (non cluster-robust) t-statistics are 2 times larger.
Strength of fixed effects versus random effects
  - Allows $\alpha_i$ to be correlated with $x_{it}$.
  - So consistent estimates if regressors are correlated with the error, provided regressors are correlated only with the time-invariant component of the error.
  - An alternative to IV to get causal estimates.
Limitations:
  - Coefficients of time-invariant regressors are not identified
  - For identified regressors standard errors can be much larger
  - Marginal effects in a nonlinear model depend on $\alpha_i$:
    $ME_j = \partial E[y_{it}]/\partial x_{it,j} = \alpha_i\exp(x_{it}'\beta)\beta_j$
    and $\alpha_i$ is unknown.
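Although the level of the marginal effect depends on the unknown $\alpha_i$, the proportional effect does not. Under the multiplicative-effects exponential mean $E[y_{it}|x_{it},\alpha_i] = \alpha_i\exp(x_{it}'\beta)$ stated above, dividing the marginal effect by the conditional mean gives:

```latex
\frac{\partial E[y_{it}\mid x_{it},\alpha_i]/\partial x_{it,j}}
     {E[y_{it}\mid x_{it},\alpha_i]}
 = \frac{\alpha_i \exp(x_{it}'\beta)\,\beta_j}{\alpha_i \exp(x_{it}'\beta)}
 = \beta_j .
```

So $\beta_j$ can still be read as a semi-elasticity: a one-unit change in $x_{it,j}$ changes the conditional mean by a proportion of approximately $\beta_j$, whatever the value of $\alpha_i$.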
Panel Poisson: estimator comparison

Compare the following estimators:
  - pooled Poisson with cluster-robust s.e.s
  - pooled population-averaged Poisson with unstructured correlations and cluster-robust s.e.s
  - random effects Poisson with gamma random effect and cluster-robust s.e.s
  - random effects Poisson with normal random effect and default s.e.s
  - fixed effects Poisson and cluster-robust s.e.s
Find that
  - similar results for all RE models
  - note that these data are not good to illustrate FE as regressors have little within variation.
20. Comparison of different Poisson estimators with cluster-robust s.e.s

(standard errors are shown below each coefficient)

    Variable |    POOLED     POPAVE    RE_GAMMA   RE_NOR~L     FIXED
-------------+-------------------------------------------------------
#1           |
      lcoins |   -0.0808    -0.0804    -0.0878    -0.1145
             |    0.0080     0.0078     0.0086     0.0073
    ndisease |    0.0339     0.0346     0.0388     0.0409
             |    0.0026     0.0024     0.0027     0.0023
      female |    0.1718     0.1585     0.1667     0.2084
             |    0.0343     0.0334     0.0379     0.0305
         age |    0.0041     0.0031     0.0019     0.0027    -0.0112
             |    0.0017     0.0015     0.0016     0.0012     0.0095
        lfam |   -0.1482    -0.1407    -0.1352    -0.1443     0.0877
             |    0.0323     0.0294     0.0309     0.0265     0.1126
       child |    0.1030     0.1014     0.1083     0.0737     0.1060
             |    0.0507     0.0430     0.0495     0.0345     0.0738
       _cons |    0.7488     0.7765     0.7574     0.2873
             |    0.0786     0.0717     0.0755     0.0642
-------------+-------------------------------------------------------
lnalpha      |
       _cons |                          0.0251
             |                          0.0270
-------------+-------------------------------------------------------
lnsig2u      |
       _cons |                                     0.0550
             |                                     0.0255
-------------+-------------------------------------------------------
Statistics   |
           N |     20186      20186      20186      20186      17791
Moment based estimation of FE count panels
Predetermined means the regressor may be correlated with past shocks but not with current or future shocks: $E[u_{it}x_{is}] = 0$ for $s \leq t$, but $\neq 0$ for $s > t$.
Two specifications are considered:
$y_{it} = \exp(x_{it}'\beta)\,\alpha_i\, w_{it}$
$y_{it} = \exp(x_{it}'\beta)\,\alpha_i + w_{it}$
A quasi-differencing transformation is used to eliminate the fixed effect.
Then a moment condition is constructed for estimation.
Depending upon which specification is used, different moment conditions obtain.
Chamberlain and Wooldridge derive quasi-differencing transformations that are shown in the table below.
21. Exponential Mean and Multiplicative Heterogeneity

Relies on a number of ways of eliminating the fixed effects.
Error may enter additively or multiplicatively.
Estimating equations are orthogonality conditions after quasi-differencing, which eliminates the fixed effect.

Model: moment specification; estimating equations

Strict exog.: $E[x_{it}u_{it+j}] = 0$, $j \geq 0$.
Predetermined regressors: $E[x_{it}u_{it-s}] \neq 0$, $s \geq 1$.
GMM (Chamberlain): $E\left[y_{it}\frac{\lambda_{it-1}}{\lambda_{it}} - y_{it-1} \,\Big|\, x_i^{t-1}\right] = 0$.
GMM (Wooldridge): $E\left[\frac{y_{it}}{\lambda_{it}} - \frac{y_{it-1}}{\lambda_{it-1}} \,\Big|\, x_i^{t-1}\right] = 0$.
GMM/endog (Wooldridge): $E\left[\frac{y_{it}}{\lambda_{it}} - \frac{y_{it-1}}{\lambda_{it-1}} \,\Big|\, x_i^{t-2}\right] = 0$.
Computational Strategies for GMM
Use an interactive version of an estimation command (e.g. gmm): enter the function directly on the command line or in the dialog box by using a substitutable expression.
Use a function evaluator program, which gives more flexibility in defining your objective function; it is usually more complicated to use but may be needed for more complicated problems.
Hint: In Stata a good place to start is the nl (nonlinear least squares) command. Then go on to gmm.
Most of the examples here involve substitutable expressions.
Examples of function evaluator programs are in MUS and especially in the Stata manuals.
Example:
\[ \sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}\left( y_{it} - \lambda_{it}\frac{\bar{y}_i}{\bar{\lambda}_i} \right) = 0 \]
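The interactive route can be sketched on simulated data. The example below is a minimal illustration, not taken from the slides: the variable names `x` and `y`, the sample size, and the parameter values are all hypothetical. It fits an exponential-mean model by just-identified GMM with a substitutable expression and then by Poisson ML, which solves the same first-order conditions.

```stata
* Minimal sketch on simulated data; x, y and all parameter values are hypothetical.
clear
set seed 12345
set obs 2000
gen x = rnormal()
gen y = rpoisson(exp(0.5 + 0.3*x))

* Substitutable expression: the residual is y - exp(x*b + b0).
* With instruments (x, constant) the model is just identified.
gmm (y - exp({xb: x} + {b0})), instruments(x) onestep

* Poisson ML solves the same moment conditions, so the point
* estimates should agree up to convergence tolerance.
poisson y x, vce(robust)
```

Starting from a cross-section example like this one keeps the syntax simple before moving to a function evaluator program.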
Applications
22. Applications using balanced panel MEPS data
For illustrating panel methods the RAND data set has limitations.
. sum officevis T0officevis educ age income totchr

    Variable |    Obs        Mean    Std. Dev.        Min        Max
-------------+--------------------------------------------------------
   officevis |  78888    1.387372    3.328148           0         94
 T0officevis |  78888    1.488084    3.334559           0         58
        educ |  78888    12.32776    3.264869           0         17
         age |  78888    4.562129    1.742034         1.8        8.5
      income |  78888    27.60833    28.94855     -63.631    264.674
      totchr |  78888    .7881047    1.081315

Applications
MEPS Data
Quarterly data for 2005-06
. xtdes

dupersid:  30002019, 30004010, ..., 38505016             n =       9861
timeindex:  1, 2, ..., 8                                 T =          8
           Delta(timeindex) = 1 unit
           Span(timeindex)  = 8 periods
           (dupersid*timeindex uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                         8       8       8         8         8       8       8

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+----------
     9861    100.00  100.00 |  11111111
 ---------------------------+----------
     9861    100.00         |  XXXXXXXX

Applications
Fixed Effects GMM in Stata 11

* Moment evaluator for gmm: fills `varlist' with the Poisson FE residual
* y_it - mu_it*ybar_i/mubar_i, evaluated at the parameter vector `at'.
program gmm_poi2
    version 11
    syntax varlist if, at(name) myrhs(varlist) ///
        mylhs(varlist) myidvar(varlist)
    quietly {
        tempvar mu mubar ybar
        * Build the linear index x_it'b one regressor at a time.
        gen double `mu' = 0 `if'
        local j = 1
        foreach var of varlist `myrhs' {
            replace `mu' = `mu' + `var'*`at'[1,`j'] `if'
            local j = `j' + 1
        }
        replace `mu' = exp(`mu')
        * Individual-level means of mu_it and y_it.
        egen double `mubar' = mean(`mu') `if', by(`myidvar')
        egen double `ybar' = mean(`mylhs') `if', by(`myidvar')
        * Residual that is free of the fixed effect.
        replace `varlist' = `mylhs' - `mu'*`ybar'/`mubar' `if'
    }
end
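Why the residual computed by `gmm_poi2` is free of the fixed effect can be checked in two lines. The sketch assumes strictly exogenous regressors and a multiplicative effect, $E[y_{it} \mid x_i, \alpha_i] = \alpha_i\lambda_{it}$ with $\lambda_{it} = \exp(x_{it}'\beta)$ and $x_i = (x_{i1},\dots,x_{iT})$; the bars denote individual time averages, matching `ybar` and `mubar` in the program.

```latex
\begin{align*}
E[\bar{y}_i \mid x_i, \alpha_i] &= \frac{1}{T}\sum_{s=1}^{T}\alpha_i\lambda_{is} = \alpha_i\bar{\lambda}_i,\\
E\left[ y_{it} - \lambda_{it}\frac{\bar{y}_i}{\bar{\lambda}_i} \,\middle|\, x_i, \alpha_i \right]
 &= \alpha_i\lambda_{it} - \frac{\lambda_{it}}{\bar{\lambda}_i}\,\alpha_i\bar{\lambda}_i = 0 .
\end{align*}
```

Because the residual has zero conditional mean for any value of $\alpha_i$, interacting it with $x_{it}$ gives moment conditions that identify $\beta$ without estimating the individual effects.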
Applications
Implementing fixed effects GMM in Stata 11

. gmm gmm_poi2, mylhs(officevis) myrhs(insprv age income totchr) ///
>     myidvar(dupersid) nequations(1) parameters(insprv age income totchr) ///
>     instruments(insprv age income totchr, noconstant) onestep

Step 1
Iteration 0:   GMM criterion Q(b) =  .00140916
Iteration 1:   GMM criterion Q(b) =  1.487e-07
Iteration 2:   GMM criterion Q(b) =  1.583e-14
Iteration 3:   GMM criterion Q(b) =  1.843e-28

GMM estimation
Number of parameters =   4
Number of moments    =   4
Initial weight matrix: Unadjusted                 Number of obs =  78888

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
     /insprv |  -.0080549   .5460749    -0.01   0.988    -1.078342    1.062232
        /age |  -.5125841    13.1682    -0.04   0.969    -26.32178    25.29662
     /income |    .001128   .0013911     0.81   0.417    -.0015984    .0038545
     /totchr |   .2211125   .3354182     0.66   0.510    -.4362951    .8785201

Instruments for equation 1: insprv age income totchr

. estimates store PFEGMM
Applications
Standard fixed effects panel Poisson

. * Usual panel Poisson FE
. xtpoisson officevis insprv age income totchr, fe
note: 1900 groups (15200 obs) dropped because of all zero outcomes

Iteration 0:   log likelihood = -84468.435
Iteration 1:   log likelihood =  -84154.68
Iteration 2:   log likelihood = -84154.647
Iteration 3:   log likelihood = -84154.647

                                                Number of obs      =     63688
Group variable: dupersid                        Number of groups   =      7961
                                                Obs per group: min =         8
                                                               avg =       8.0
                                                               max =         8
                                                Wald chi2(4)       =    618.20
Log likelihood  = -84154.647                    Prob > chi2        =    0.0000

   officevis |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
      insprv |  -.0080549    .027985    -0.29   0.773    -.0629046    .0467947
         age |  -.5125841   .0629145    -8.15   0.000    -.6358943   -.3892739
      income |    .001128    .000258     4.37   0.000     .0006224    .0016336
      totchr |   .2211125   .0091051    24.28   0.000     .2032669    .2389582

. estimates store PFE
No difference in point estimates because MLE and GMM solve the same estimating equations.
Applications
Standard FE Poisson with robust SE (with xtpqml add-on)

. * Add-on xtpqml gives panel robust se's
. xtpqml officevis insprv age income totchr, fe i(dupersid)
note: 1900 groups (15200 obs) dropped because of all zero outcomes

Iteration 0:   log likelihood = -84468.435
Iteration 1:   log likelihood =  -84154.68
Iteration 2:   log likelihood = -84154.647
Iteration 3:   log likelihood = -84154.647

                                                Number of obs      =     63688
Group variable: dupersid                        Number of groups   =      7961
                                                Obs per group: min =         8
                                                               avg =       8.0
                                                               max =         8
                                                Wald chi2(4)       =    618.20
Log likelihood  = -84154.647                    Prob > chi2        =    0.0000

   officevis |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
      insprv |  -.0080549    .027985    -0.29   0.773    -.0629046    .0467947
         age |  -.5125841   .0629145    -8.15   0.000    -.6358943   -.3892739
      income |    .001128    .000258     4.37   0.000     .0006224    .0016336
      totchr |   .2211125   .0091051    24.28   0.000     .2032669    .2389582

Calculating Robust Standard Errors...

   officevis |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
officevis    |
      insprv |  -.0080549   .0715881    -0.11   0.910    -.1483651    .1322552
         age |  -.5125841   .1804831    -2.84   0.005    -.8663245   -.1588438
      income |    .001128   .0007661     1.47   0.141    -.0003734    .0026295
      totchr |   .2211125   .0250814     8.82   0.000     .1719539    .2702712

Wald chi2(4) = 80.59                            Prob > chi2 = 0.0000
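The comparison between default and panel-robust standard errors can be reproduced without the add-on on simulated data. The sketch below is illustrative only: the data, the names `id`, `t`, `x`, `y`, and all parameter values are hypothetical, and it assumes a Stata release in which `xtpoisson, fe` accepts `vce(robust)` (later than the Stata 11 used in the slides), which should be checked against the installed version.

```stata
* Minimal sketch on simulated data; all names and values are hypothetical.
clear
set seed 12345
set obs 500
gen id = _n
gen alpha = rnormal(0, 0.5)          // individual effect
expand 6                             // six periods per individual
bysort id: gen t = _n
xtset id t
gen x = rnormal() + 0.5*alpha        // regressor correlated with the effect
gen y = rpoisson(exp(0.5*x + alpha))

* Same point estimates, different standard errors.
xtpoisson y x, fe
estimates store fe_default
xtpoisson y x, fe vce(robust)        // assumption: option available in your release
estimates store fe_robust
estimates table fe_default fe_robust, b se
```

In this simulated design the data are truly Poisson, so the two sets of standard errors should be close; with overdispersed data such as office visits the robust ones are typically larger, as in the xtpqml output.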
Dynamic panel count models
23. Panel dynamic

Individual effects model allows for time series persistence via unobserved heterogeneity ($\alpha_i$).
    e.g. High $\alpha_i$ means high doctor visits each period.
Alternative time series persistence is via true state dependence ($y_{t-1}$).
    e.g. Many doctor visits last period lead to many this period.
Linear model:
\[ y_{it} = \alpha_i + \rho y_{i,t-1} + x_{it}'\beta + u_{it}. \]
Poisson model with exponential feedback: one possibility (designed to confront the zero problem) is
\[ \mu_{it} = \alpha_i\lambda_{it-1} = \alpha_i\exp(\rho y^{*}_{i,t-1} + x_{it}'\beta), \qquad y^{*}_{i,t-1} = \min(c, y_{i,t-1}). \]

Panel dynamic: GMM estimation of FE model

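In the fixed effects case the Poisson FE estimator is now inconsistent.
Instead assume weak exogeneity:
\[ E[y_{it} \mid y_{i,t-1},\dots,y_{i1}, x_{it},\dots,x_{i1}] = \alpha_i\lambda_{it-1}. \]
And use an alternative quasi-difference:
\[ E\left[ y_{it} - \frac{\lambda_{it-1}}{\lambda_{it}}\,y_{i,t+1} \,\middle|\, y_{i,t-1},\dots,y_{i1}, x_{it},\dots,x_{i1} \right] = 0. \]
So MM or GMM is based on
\[ E\left[ z_{it}\left( y_{it} - \frac{\lambda_{it-1}}{\lambda_{it}}\,y_{i,t+1} \right) \right] = 0, \]
where e.g. $z_{it} = (y_{i,t-1}, x_{it})$ in the just-identified case.
Windmeijer (2008) has a recent discussion.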
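The zero-mean property of the quasi-difference can be verified directly. The sketch below reads the second term of the quasi-difference as involving $y_{i,t+1}$, which is what the weak exogeneity assumption implies; the information-set notation $\mathcal{F}_{it} = (\alpha_i, y_{i,t-1},\dots,y_{i1}, x_{it},\dots,x_{i1})$ is introduced here for convenience and is not in the slides.

```latex
\begin{align*}
E[y_{it} \mid \mathcal{F}_{it}] &= \alpha_i\lambda_{it-1}, \qquad
E[y_{i,t+1} \mid \mathcal{F}_{i,t+1}] = \alpha_i\lambda_{it},\\
E\left[ \frac{\lambda_{it-1}}{\lambda_{it}}\,y_{i,t+1} \,\middle|\, \mathcal{F}_{i,t+1} \right]
 &= \frac{\lambda_{it-1}}{\lambda_{it}}\,\alpha_i\lambda_{it} = \alpha_i\lambda_{it-1},\\
E\left[ y_{it} - \frac{\lambda_{it-1}}{\lambda_{it}}\,y_{i,t+1} \,\middle|\, \mathcal{F}_{it} \right]
 &= \alpha_i\lambda_{it-1} - \alpha_i\lambda_{it-1} = 0 .
\end{align*}
```

The last line uses iterated expectations, since $\mathcal{F}_{it} \subset \mathcal{F}_{i,t+1}$ and $\lambda_{it-1}$ depends only on variables in $\mathcal{F}_{it}$. Averaging over $\alpha_i$ then gives the moment condition with instruments $z_{it}$.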

Example of dynamic moment-based JI (just-identified) GMM

Ignore individual specific effects
. gmm (officevis - exp({xb:L.officevis insprv educ age income totchr}+{b0})), ///
>     instruments(L.officevis insprv educ age income totchr) onestep vce(cluster dupersid)

Step 1
Iteration 0:   GMM criterion Q(b) =  4.9539327
Iteration 1:   GMM criterion Q(b) =  4.7296297
Iteration 2:   GMM criterion Q(b) =  1.4832673
Iteration 3:   GMM criterion Q(b) =  .01045573
Iteration 4:   GMM criterion Q(b) =  6.508e-06
Iteration 5:   GMM criterion Q(b) =  3.032e-12
Iteration 6:   GMM criterion Q(b) =  7.264e-25

GMM estimation
Number of parameters =   7
Number of moments    =   7                        Number of obs =  69027
                        (Std. Err. adjusted for 9861 clusters in dupersid)

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
/xb_L_offi~s |    .064072   .0041069    15.60   0.000     .0560228    .0721213
  /xb_insprv |   .2152153   .0331676     6.49   0.000     .1502079    .2802227
    /xb_educ |   .0404143   .0065808     6.14   0.000     .0275162    .0533124
     /xb_age |   .1221278   .0134542     9.08   0.000     .0957581    .1484976
  /xb_income |  -.0003585   .0004981    -0.72   0.472    -.0013347    .0006178
  /xb_totchr |   .3027348   .0141805    21.35   0.000     .2749415     .330528
         /b0 |  -1.447292   .0952543   -15.19   0.000    -1.633987   -1.260597
Example of dynamic moment-based OI (over-identified) GMM

///
> instruments(L.officevis educ age income totchr female white hispanic married employed) ///
> onestep vce(cluster dupersid)

Step 1
Iteration 0:   GMM criterion Q(b) =  4.9696148
Iteration 1:   GMM criterion Q(b) =  3.7545442
Iteration 2:   GMM criterion Q(b) =  .86353039
Iteration 3:   GMM criterion Q(b) =  .25844389
Iteration 4:   GMM criterion Q(b) =  .07248002
Iteration 5:   GMM criterion Q(b) =  .07235453
Iteration 6:   GMM criterion Q(b) =  .07235443

GMM estimation
Number of parameters =   7
Number of moments    =  11                        Number of obs =  69027

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
/xb_L_offi~s |   .0631186   .0042901    14.71   0.000     .0547101     .071527
  /xb_insprv |   .0468067   .1154105     0.41   0.685    -.1793937     .273007
    /xb_educ |   .0422612   .0074362     5.68   0.000     .0276866    .0568359
     /xb_age |   .1208516   .0136986     8.82   0.000     .0940028    .1477003
  /xb_income |   .0004412   .0007107     0.62   0.535    -.0009518    .0018341
  /xb_totchr |   .2988192   .0144326    20.70   0.000     .2705318    .3271066
         /b0 |  -1.361726   .0972536   -14.00   0.000     -1.55234   -1.171113

Instruments for equation 1: L.officevis educ age income totchr female white hispanic marrie
Correlated RE and Initial Conditions
24. Poisson Extensions

A different ML approach to dynamic specification:
\[ y_{it} \sim P(\mu_{it}), \quad i = 1,\dots,N;\ t = 1,\dots,T \]
\[ f(y_{it} \mid \mu_{it}) = \frac{e^{-\mu_{it}}\mu_{it}^{y_{it}}}{y_{it}!} \]
\[ \mu_{it} = E[y_{it} \mid y_{i,t-1}, x_{it}, \alpha_i] = g(y_{i,t-1}, x_{it}, \alpha_i) \]
Initial conditions problem in dynamic model: in a short panel, bias is induced by neglecting dependence on the initial condition.
The lagged dependent variable on the right-hand side is a source of bias because the lagged dependent variable and the individual-specific effect are correlated.
Wooldridge's (2005) method integrates out the individual-specific random effect after conditioning on the initial value and covariates.
A random effects model is used to accommodate the initial conditions.
Alternative specifications
\[ E[y_{it} \mid x_{it}, y_{i,t-1}, \alpha_i] = h(y_{i,t-1}, x_{it}, \alpha_i) \]
where $\alpha_i$ is the individual-specific effect.

1st alternative: Autoregressive dependence through the exponential mean.
\[ E[y_{it} \mid x_{it}, y_{i,t-1}, \alpha_i] = \exp(\rho y_{i,t-1} + x_{it}'\beta + \alpha_i) \]
If the $\alpha_i$ are uncorrelated with the regressors, and if parametric assumptions are to be avoided, then this model can be estimated using either nonlinear least squares or pooled Poisson MLE. In either case it is desirable to use the robust variance formula.
Limitation: Potentially explosive if large values of $y_{it}$ are realized.
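The pooled Poisson route for the exponential feedback model can be sketched on simulated data. Everything below is a hypothetical illustration: the names `id`, `t`, `x`, `y`, the panel dimensions, and the parameter values are assumptions chosen so that the simulated process stays stable, and the effect is drawn independently of `x` so that pooled estimation is appropriate.

```stata
* Minimal sketch on simulated data; all names and values are hypothetical.
clear
set seed 12345
set obs 500
gen id = _n
gen alpha = rnormal(0, 0.3)          // effect uncorrelated with x
expand 6
bysort id: gen t = _n
gen x = rnormal()

* Exponential feedback: the mean depends on last period's count.
gen y = rpoisson(exp(0.3*x + alpha)) if t == 1
forvalues s = 2/6 {
    bysort id (t): replace y = rpoisson(exp(0.05*y[_n-1] + 0.3*x + alpha)) if t == `s'
}

xtset id t
* Pooled Poisson with the lagged count; cluster-robust variance
* allows for within-individual correlation.
poisson y L.y x, vce(cluster id)
```

The small feedback coefficient in the simulation is deliberate: with a larger value a few large counts can feed on themselves, which is the explosiveness noted above.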
Initial conditions
A dynamic panel model requires additional assumptions about the relationship between the initial observations ("initial conditions") $y_0$ and the $\alpha_i$.
The effect of the initial value on future events is important in a short panel. The initial-value effect might be a part of the individual-specific effect.
Wooldridge's method requires a specification of the conditional distribution of $\alpha_i$ given $y_0$ and $z_i$, with the latter entering separably.
Under the assumption that the initial conditions are nonrandom, the standard random effects conditional maximum likelihood approach identifies the parameters of interest.
For a class of nonlinear dynamic panel models, including the Poisson model, Wooldridge (2005) analyzes this model, which conditions the joint distribution on the initial conditions.
Conditionally correlated RE (1)

Where parametric FE models are not feasible, the conditionally correlated random (CCR) effects model (Mundlak (1978) and Chamberlain (1984)) provides a compromise between FE and RE models.
The standard RE panel model assumes that $\alpha_i$ and $x_{it}$ are uncorrelated.
Making $\alpha_i$ a function of $x_{i1},\dots,x_{iT}$ allows for possible correlation:
\[ \alpha_i = z_i'\delta + \varepsilon_i \]
Mundlak's (more parsimonious) method allows the individual-specific effect to be determined by time averages of covariates, denoted $z_i$; Chamberlain's method suggests a richer model with a weighted sum of the covariates for the random effect.
Conditionally correlated RE (2)

We can further allow for an initial condition effect by including $y_0$ thus:
\[ \alpha_i = y_0'\eta + z_i'\delta + \varepsilon_i \]
where $y_0$ is a vector of initial conditions, $z_i = \bar{x}_i$ denotes the time average of the exogenous variables, and $\varepsilon_i$ may be interpreted as unobserved heterogeneity.
The formulation essentially introduces no additional problems, though the averages change when new data are added. Estimation and inference in the pooled Poisson or NLS model can proceed as before.
The formulation can also be used when no dynamics are present in the model. In this case $\varepsilon_i$ can be integrated out using a distributional assumption about $f(\varepsilon)$.
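In practice the CCR formulation with an initial condition amounts to adding two constructed regressors to the pooled model. The sketch below uses simulated data; the names `id`, `t`, `x`, `y`, `y0`, `xbar` and all parameter values are hypothetical, and the pooled Poisson with cluster-robust errors is one of the estimators mentioned above, not the only option.

```stata
* Minimal sketch on simulated data; all names and values are hypothetical.
clear
set seed 12345
set obs 500
gen id = _n
gen alpha = rnormal(0, 0.3)
expand 6
bysort id: gen t = _n
gen x = rnormal() + 0.5*alpha        // regressor correlated with the effect

gen y = rpoisson(exp(0.3*x + alpha)) if t == 1
forvalues s = 2/6 {
    bysort id (t): replace y = rpoisson(exp(0.05*y[_n-1] + 0.3*x + alpha)) if t == `s'
}

* Constructed regressors for the correlated random effect.
bysort id (t): gen y0 = y[1]         // initial condition y_i0
bysort id: egen xbar = mean(x)       // Mundlak time average of x

xtset id t
poisson y L.y y0 x xbar, vce(cluster id)
```

The coefficients on `y0` and `xbar` absorb the part of the individual effect that is correlated with the initial value and the regressors; the same two variables could be added to a random effects specification instead.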
Dynamic GMM without initial condition

Here individual specific effect is captured by the initial condition
///
> instruments(L.officevis insprv educ age income totchr) onestep vce(cluster dupersid)

Step 1
Iteration 0:   GMM criterion Q(b) =  4.9539327
Iteration 1:   GMM criterion Q(b) =  4.7296297
Iteration 2:   GMM criterion Q(b) =  1.4832673
Iteration 3:   GMM criterion Q(b) =  .01045573
Iteration 4:   GMM criterion Q(b) =  6.508e-06
Iteration 5:   GMM criterion Q(b) =  3.032e-12
Iteration 6:   GMM criterion Q(b) =  7.264e-25

GMM estimation
Number of parameters =   7
Number of moments    =   7                        Number of obs =  69027

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
/xb_L_offi~s |    .064072   .0041069    15.60   0.000     .0560228    .0721213
  /xb_insprv |   .2152153   .0331676     6.49   0.000     .1502079    .2802227
    /xb_educ |   .0404143   .0065808     6.14   0.000     .0275162    .0533124
     /xb_age |   .1221278   .0134542     9.08   0.000     .0957581    .1484976
  /xb_income |  -.0003585   .0004981    -0.72   0.472    -.0013347    .0006178
  /xb_totchr |   .3027348   .0141805    21.35   0.000     .2749415     .330528
         /b0 |  -1.447292   .0952543   -15.19   0.000    -1.633987   -1.260597
Overidentified dynamic GMM with initial condition

///
> instruments(L.officevis educ age income totchr female white hispanic married empl
> onestep vce(cluster dupersid)

Step 1
Iteration 0:   GMM criterion Q(b) =  4.9696148
Iteration 1:   GMM criterion Q(b) =  3.7545442
Iteration 2:   GMM criterion Q(b) =  .86353039
Iteration 3:   GMM criterion Q(b) =  .25844389
Iteration 4:   GMM criterion Q(b) =  .07248002
Iteration 5:   GMM criterion Q(b) =  .07235453
Iteration 6:   GMM criterion Q(b) =  .07235443

GMM estimation
Number of parameters =   7
Number of moments    =  11                        Number of obs =  69027

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
/xb_L_offi~s |   .0631186   .0042901    14.71   0.000     .0547101     .071527
  /xb_insprv |   .0468067   .1154105     0.41   0.685    -.1793937     .273007
    /xb_educ |   .0422612   .0074362     5.68   0.000     .0276866    .0568359
     /xb_age |   .1208516   .0136986     8.82   0.000     .0940028    .1477003
  /xb_income |   .0004412   .0007107     0.62   0.535    -.0009518    .0018341
  /xb_totchr |   .2988192   .0144326    20.70   0.000     .2705318    .3271066
Dynamic Just Identified GMM with Initial Conditions
. gmm (officevis - exp({xb:L.officevis T0officevis insprv educ age income totchr}+{b
> instruments(L.officevis T0officevis insprv educ age income totchr) onestep vce(cl

Final GMM criterion Q(b) =  6.30e-26

GMM estimation
Number of parameters =   8
Number of moments    =   8                        Number of obs =  69027

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
/xb_L_offi~s |   .0495929   .0044248    11.21   0.000     .0409204    .0582654
/xb_T0offi~s |   .0311947   .0043446     7.18   0.000     .0226794    .0397099
  /xb_insprv |   .2153361   .0351702     6.12   0.000     .1464038    .2842684
    /xb_educ |   .0382539   .0056386     6.78   0.000     .0272024    .0493054
     /xb_age |   .1303702   .0095834    13.60   0.000      .111587    .1491534
  /xb_income |  -.0003019   .0004701    -0.64   0.521    -.0012232    .0006194
  /xb_totchr |   .2847798    .010334    27.56   0.000     .2645256    .3050341
         /b0 |  -1.484486   .0605323   -24.52   0.000    -1.603127   -1.365845

Instruments for equation 1: L.officevis T0officevis insprv educ age income totchr _c
Dynamic Over Identified GMM with Initial Condition
. gmm (officevis - exp({xb:L.officevis T0officevis insprv educ age income totchr}+{b
> instruments(L.officevis T0officevis educ age income totchr female white hispanic
> onestep vce(cluster dupersid) nolog

Final GMM criterion Q(b) =  .0685762

GMM estimation
Number of parameters =   8
Number of moments    =  12                        Number of obs =  69027

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
/xb_L_offi~s |   .0490201   .0046062    10.64   0.000      .039992    .0580481
/xb_T0offi~s |   .0305356   .0044538     6.86   0.000     .0218063    .0392648
  /xb_insprv |   .0565968   .1135886     0.50   0.618    -.1660328    .2792264
    /xb_educ |   .0402952   .0059253     6.80   0.000     .0286819    .0519085
     /xb_age |   .1299791   .0098075    13.25   0.000     .1107567    .1492014
  /xb_income |   .0004368    .000703     0.62   0.534    -.0009411    .0018148
  /xb_totchr |   .2805608   .0101571    27.62   0.000     .2606532    .3004684
         /b0 |  -1.408679   .0607941   -23.17   0.000    -1.527833   -1.289525

Instruments for equation 1: L.officevis T0officevis educ age income totchr female wh
    married employed _cons
Alternative to EFM: LFM
An alternative to the (potentially explosive) exponential feedback model (EFM) is the linear feedback model (LFM):
\[ E[y_{it} \mid x_{it}, y_{i,t-1}, \alpha_i] = \rho y_{i,t-1} + \exp(x_{it}'\beta + \alpha_i) \]
Limitation: Discontinuities are avoided, but the model falls outside the standard exponential class of models.
MLE not feasible, but QML/NLS/GMM is feasible.
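A moment-based fit of the linear feedback model can be sketched with a substitutable expression in which the lag enters additively. The data below are simulated; the names `id`, `t`, `x`, `y` and the parameter values (including the feedback coefficient 0.4) are hypothetical, and the effect is drawn independently of `x` so that no correction for correlated heterogeneity is needed in this illustration.

```stata
* Minimal sketch on simulated data; all names and values are hypothetical.
clear
set seed 12345
set obs 500
gen id = _n
gen alpha = rnormal(0, 0.3)
expand 6
bysort id: gen t = _n
gen x = rnormal()

* Linear feedback: mean = rho*y_{t-1} + exp(x*b + alpha), with rho = 0.4.
gen y = rpoisson(exp(0.3*x + alpha)) if t == 1
forvalues s = 2/6 {
    bysort id (t): replace y = rpoisson(0.4*y[_n-1] + exp(0.3*x + alpha)) if t == `s'
}

xtset id t
* Just-identified GMM: three parameters (rho, slope, b0), three instruments.
gmm (y - {rho}*L.y - exp({xb: x} + {b0})), ///
    instruments(L.y x) onestep vce(cluster id)
```

Because the lag enters linearly, the conditional mean cannot explode through the exponential, which is the main attraction of this specification.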
25. Linear feedback model

\[ \mu_{it} = \rho y_{i,t-1} + \exp(x_{1it}'\beta_1 + x_{2it}'\beta_2 + \gamma_1 y_{i0} + z_i'\gamma_2 + w_i) \]
\[ \phantom{\mu_{it}} = \rho y_{i,t-1} + \exp(w_i)\exp(x_{1it}'\beta_1 + x_{2it}'\beta_2 + \gamma_1 y_{i0} + z_i'\gamma_2) \]
MLE is not feasible because the functional form no longer belongs to the exponential family.
GMM that uses differencing transformations will eliminate initial values and correlated heterogeneity.
The NLS method of estimation can identify the conditional mean function under certain conditions:
\[ \min_{\{\beta,\rho,\gamma\}} \frac{1}{2NT}\sum_{i=1}^{N}\sum_{t=1}^{T}(y_{it} - \mu_{it})^2 \]
To allow for an RE-type extension, one should use a robust estimator of the covariance matrix.
Example: EFM vs LFM
. * Linear Feedback Model with Initial Condition Control
. gmm (officevis - {rho}*L.officevis - exp({xb: T0officevis insprv educ age income t
> instruments(L.officevis T0officevis insprv educ age income totchr) onestep vce(cl

Final GMM criterion Q(b) =  7.35e-23

GMM estimation
Number of parameters =   8
Number of moments    =   8                        Number of obs =  69027

             |               Robust
             |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+------------------------------------------------------------------
        /rho |   .5366234   .0248079    21.63   0.000     .4880008     .585246
/xb_T0offi~s |   .0672159   .0038061    17.66   0.000     .0597561    .0746758
  /xb_insprv |   .1509578   .0408185     3.70   0.000     .0709551    .2309606
    /xb_educ |   .0375916   .0062318     6.03   0.000     .0253774    .0498058
     /xb_age |   .1234875   .0119579    10.33   0.000     .1000504    .1469245
  /xb_income |  -.0002804   .0006164    -0.45   0.649    -.0014885    .0009277
  /xb_totchr |   .3270725   .0154421    21.18   0.000     .2968066    .3573383
         /b0 |  -2.187085    .096687   -22.62   0.000    -2.376588   -1.997582

Instruments for equation 1: L.officevis T0officevis insprv educ age income totchr _c
More on LFM vs. EFM
Sensitivity to omitted $y_0$ and $z$ varies between LFM and EFM.
Monte Carlo analysis suggests omission leads to biases, especially in the coefficient of the lagged variable.
EFM is preferred on predictive performance when the proportion of zeros is high.
LFM does better when the mean of $y$ is high and the proportion of zeros is small.
NLS turns out to be a robust estimator for the LFM. It should be considered a serious alternative for count panel models under certain conditions.
Concluding Remarks
Much progress in estimating panel count models, especially in dealing with endogeneity and nonseparable heterogeneity.
Great progress in variance estimation.
RE models pose fewer problems.
For FE models, moment-based/IV methods seem more tractable for handling endogeneity and dynamics. Stata's new suite of GMM commands is very helpful in this regard.
Because FE models do not currently handle important cases and have other limitations, the CCR panel model with initial conditions is an attractive alternative, at least for balanced panels.
References
Hausman, J.A., B.H. Hall and Z. Griliches (1984). Econometric Models for Count Data with an Application to the Patents-R&D Relationship. Econometrica, 52, 909-938.
Chamberlain, G. (1984). Panel Data. In Handbook of Econometrics, Volume II, ed. by Z. Griliches and M. Intriligator, 1247-1318. Amsterdam: North-Holland.
Wooldridge, J. (2005). Simple solutions to the initial conditions problem in dynamic, nonlinear panel data models with unobserved heterogeneity. Journal of Applied Econometrics, 20, 39-54.
Chernozhukov, V., I. Fernandez-Val and A.E. Kowalski (2009). Censored Quantile Instrumental Variable Estimation via Control Functions. Discussion paper.
Koenker, R. (2004). Quantile regression for longitudinal data. Journal of Multivariate Analysis, 91, 74-89.
Mundlak, Y. (1978). On the Pooling of Time Series and Cross Section Data. Econometrica, 46, 69-85.
Stata Release 11 Manuals.
Windmeijer, F.A.G. (2008). GMM for Panel Count Data Models. Ch. 18 in L. Matyas and P. Sevestre (eds.), The Econometrics of Panel Data, Springer.