January 2010
C. Hurlin
General introduction
Definitions
What are the main advantages of the panel data sets and the panel data models?
Issues involved in utilizing panel data
Outline of the lecture
Definition
A longitudinal, or panel, data set is one that follows a given sample
of individuals over time, and thus provides multiple observations on
each individual in the sample (Hsiao, 2003, page 2).
Definitions
A micro-panel data set is a panel for which the time dimension T
is much smaller than the individual dimension N (example:
the University of Michigan's Panel Study of Income Dynamics,
PSID, with 15,000 individuals observed since 1968):
$T \ll N$
A macro-panel data set is a panel for which the time dimension T
is similar to the individual dimension N (example: a panel of 100
countries with quarterly data since World War II):
$T \simeq N$
Definition
A panel is said to be balanced if we have the same time periods,
$t = 1, \ldots, T$, for each cross-section observation. For an unbalanced
panel, the time dimension, denoted $T_i$, is specific to each
individual.
Remark: While the mechanics of the unbalanced case are similar
to the balanced case, a careful treatment of the unbalanced case
requires a formal description of why the panel may be unbalanced,
and the sample selection issues can be somewhat subtle
=> issues of sample selection and attrition.
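The balanced/unbalanced distinction can be checked mechanically. The sketch below is only an illustration: the (individual, period) pairs and the helper names `panel_time_index` and `is_balanced` are hypothetical and do not come from the lecture.

```python
from collections import defaultdict

def panel_time_index(observations):
    """Map each individual i to the sorted list of periods t in which it is observed."""
    periods = defaultdict(set)
    for i, t in observations:
        periods[i].add(t)
    return {i: sorted(ts) for i, ts in periods.items()}

def is_balanced(observations):
    """True if every individual is observed over exactly the same set of periods."""
    index = panel_time_index(observations)
    period_sets = {tuple(ts) for ts in index.values()}
    return len(period_sets) == 1

# Hypothetical (individual, period) pairs, for illustration only.
balanced = [(i, t) for i in ("A", "B", "C") for t in (1, 2, 3)]
# Individual C leaves the sample in period 3 (attrition).
unbalanced = [obs for obs in balanced if obs != ("C", 3)]

# Individual-specific time dimension T_i of the unbalanced panel.
T_i = {i: len(ts) for i, ts in panel_time_index(unbalanced).items()}
print(is_balanced(balanced), is_balanced(unbalanced), T_i)
# prints: True False {'A': 3, 'B': 3, 'C': 2}
```

The check says nothing about why an individual is missing, which is precisely the sample selection question raised in the remark.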
What are the main advantages of the panel data sets and
the panel data models?
Definition
The oft-touted power of panel data derives from their theoretical
ability to isolate the effects of specific actions, treatments, or more
general policies.
Example (Ben-Porath (1973), in Hsiao (2003))
Suppose that a cross-sectional sample of married women is found
to have an average yearly labor-force participation rate of 50%.
1) It might be interpreted as implying that each woman in a
homogeneous population has a 50 percent chance of being in the
labor force in any given year.
2) It might imply that 50 percent of the women in a
heterogeneous population always work and 50 percent never work.
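A single cross-section cannot tell these two readings apart, whereas repeated observations on the same women can. The following sketch simulates both populations under purely hypothetical settings (2,000 women, 10 years, a fixed random seed) and compares the cross-sectional participation rate with the share of women who change status at least once.

```python
import random

rng = random.Random(0)   # fixed seed so the sketch is reproducible
N, T = 2000, 10          # hypothetical numbers of women and of years

# Case 1: homogeneous population, each woman works with probability 0.5 every year.
homogeneous = [[rng.random() < 0.5 for _ in range(T)] for _ in range(N)]

# Case 2: heterogeneous population, half always work and half never work.
heterogeneous = [[i < N // 2] * T for i in range(N)]

def participation_rate(panel, t):
    """Cross-sectional participation rate in year t."""
    return sum(row[t] for row in panel) / len(panel)

def share_changing_status(panel):
    """Share of women whose status changes at least once over the T years."""
    return sum(len(set(row)) > 1 for row in panel) / len(panel)

for name, panel in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    print(name,
          round(participation_rate(panel, 0), 2),
          round(share_changing_status(panel), 2))
```

Both populations show a participation rate close to 50% in any given year, but almost every woman changes status in the first case and none does in the second. Only the time dimension reveals the difference.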
3. Panel data provide a means of resolving or reducing the magnitude
of a key econometric problem that often arises in empirical studies,
namely the often-heard assertion that the real reason one
finds (or does not find) certain effects is the presence of
omitted (mismeasured or unobserved) variables that are
correlated with the explanatory variables.
Panel data allow us to control for omitted (unobserved or
mismeasured) variables.
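Example
Let us consider a simple regression model:
$$y_{it} = \alpha + \beta' x_{it} + \rho' z_{it} + \varepsilon_{it}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T$$
where
$x_{it}$ and $z_{it}$ are $k_1 \times 1$ and $k_2 \times 1$ vectors of exogenous
variables
$\alpha$ is a constant, $\beta$ and $\rho$ are $k_1 \times 1$ and $k_2 \times 1$ vectors of
parameters
$\varepsilon_{it}$ is i.i.d. over $i$ and $t$, with $V(\varepsilon_{it}) = \sigma_\varepsilon^2$
Let us assume that the $z_{it}$ variables are unobservable and
correlated with $x_{it}$:
$$\operatorname{cov}(x_{it}, z_{it}) \neq 0$$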
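Example (II): It is well known that the least-squares regression
coefficients of $y_{it}$ on $x_{it}$ are then biased.
Let us assume that $z_{it} = z_i$ (the $z$ values stay constant through time
for a given individual but vary across individuals):
$$y_{it} = \alpha + \beta' x_{it} + \rho' z_i + \varepsilon_{it}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T$$
Taking first differences over time eliminates $z_i$:
$$y_{it} - y_{i,t-1} = \beta' (x_{it} - x_{i,t-1}) + \varepsilon_{it} - \varepsilon_{i,t-1}$$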
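To see the effect of first differencing numerically, here is a small simulation sketch. It assumes a scalar regressor, Gaussian draws, and hypothetical parameter values (constant 0.5, slope 1 on x, slope 1 on the unobserved z_i). None of these numbers come from the lecture.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
N, T = 500, 5
alpha, beta, rho = 0.5, 1.0, 1.0   # hypothetical parameter values

x, y = [], []                      # x[i][t] and y[i][t]
for i in range(N):
    z_i = rng.gauss(0, 1)                              # unobserved, constant over time
    x_i = [z_i + rng.gauss(0, 1) for _ in range(T)]    # x is correlated with z_i
    y_i = [alpha + beta * x_it + rho * z_i + rng.gauss(0, 1) for x_it in x_i]
    x.append(x_i)
    y.append(y_i)

# Pooled least squares of y on x, ignoring z_i: the slope absorbs part of rho.
pooled = ols_slope([v for row in x for v in row], [v for row in y for v in row])

# Least squares on first differences: z_i has dropped out.
dx = [row[t] - row[t - 1] for row in x for t in range(1, T)]
dy = [row[t] - row[t - 1] for row in y for t in range(1, T)]
first_diff = ols_slope(dx, dy)

print(round(pooled, 2), round(first_diff, 2))
```

Under this design the pooled slope should settle well above the true value of 1 (around 1.5 in large samples), while the first-difference slope should stay close to 1.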
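Example (III): Let us assume that $z_{it} = z_t$ (the $z$ values are common
to all individuals but vary across time: common factors):
$$y_{it} = \alpha + \beta' x_{it} + \rho' z_t + \varepsilon_{it}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T$$
Taking deviations from the cross-sectional means at each date eliminates $z_t$:
$$y_{it} - \bar{y}_t = \beta' (x_{it} - \bar{x}_t) + \varepsilon_{it} - \bar{\varepsilon}_t$$
where
$$\bar{y}_t = \frac{1}{N} \sum_{i=1}^{N} y_{it}, \quad \bar{x}_t = \frac{1}{N} \sum_{i=1}^{N} x_{it}, \quad \bar{\varepsilon}_t = \frac{1}{N} \sum_{i=1}^{N} \varepsilon_{it}$$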
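The same logic can be illustrated for a common factor. The sketch below assumes a scalar regressor, Gaussian draws, and hypothetical parameter values. It compares pooled least squares with least squares on deviations from the cross-sectional mean of each period.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(1)
N, T = 100, 40
alpha, beta, rho = 0.5, 1.0, 1.0   # hypothetical parameter values

x = [[0.0] * T for _ in range(N)]
y = [[0.0] * T for _ in range(N)]
for t in range(T):
    z_t = rng.gauss(0, 2)                      # unobserved factor, common to all i
    for i in range(N):
        x[i][t] = z_t + rng.gauss(0, 1)        # x is correlated with z_t
        y[i][t] = alpha + beta * x[i][t] + rho * z_t + rng.gauss(0, 1)

# Pooled least squares of y on x, ignoring z_t.
pooled = ols_slope([v for row in x for v in row], [v for row in y for v in row])

# Deviations from the cross-sectional mean of each period remove z_t.
x_bar = [sum(x[i][t] for i in range(N)) / N for t in range(T)]
y_bar = [sum(y[i][t] for i in range(N)) / N for t in range(T)]
dx = [x[i][t] - x_bar[t] for i in range(N) for t in range(T)]
dy = [y[i][t] - y_bar[t] for i in range(N) for t in range(T)]
demeaned = ols_slope(dx, dy)

print(round(pooled, 2), round(demeaned, 2))
```

With these settings the pooled slope should be far from the true value of 1, and the slope on demeaned data close to it.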
4. Panel data involve two dimensions: a cross-sectional
dimension N and a time-series dimension T. We would
expect the computation of panel data estimators to be
more complicated than the analysis of cross-section data
alone (where T = 1) or time-series data alone (where
N = 1). However, in certain cases the availability of panel
data can actually simplify the computation and inference.
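Example (time-series analysis of nonstationary data)
Let us consider a simple AR(1) model:
$$x_t = \rho x_{t-1} + \varepsilon_t$$
Under the unit-root hypothesis $\rho = 1$, the least-squares estimator $\hat{\rho}$ has a non-standard limiting distribution:
$$T(\hat{\rho} - 1) \xrightarrow[T \to \infty]{d} \frac{\frac{1}{2}\left(W(1)^2 - 1\right)}{\int_0^1 W(r)^2 \, dr}$$
where $W(\cdot)$ denotes a standard Brownian motion.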
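With panel data, the AR(1) model becomes
$$x_{i,t} = \rho x_{i,t-1} + \varepsilon_{i,t}$$
and, under $\rho = 1$ with independent cross-sectional units, the pooled least-squares estimator $\hat{\rho}$ is asymptotically normal:
$$T \sqrt{N} (\hat{\rho} - 1) \xrightarrow[N, T \to \infty]{d} N(0, 2)$$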
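A rough Monte Carlo sketch can make the contrast between the time-series and panel cases visible. It assumes Gaussian innovations, a zero initial value, no intercept, and independent units. The sizes (T = 50, N = 25, 400 replications) are arbitrary choices for illustration, and finite-sample moments will only approximate the asymptotic ones.

```python
import random
import statistics

rng = random.Random(0)

def random_walk_moments(T):
    """Simulate x_t = x_{t-1} + eps_t with x_0 = 0.

    Returns sum(x_{t-1} * eps_t) and sum(x_{t-1}^2), the numerator and
    denominator of rho_hat - 1 for least squares without an intercept.
    """
    x_prev, num, den = 0.0, 0.0, 0.0
    for _ in range(T):
        eps = rng.gauss(0, 1)
        num += x_prev * eps
        den += x_prev ** 2
        x_prev += eps
    return num, den

def scaled_estimate(N, T):
    """T * sqrt(N) * (rho_hat - 1), pooling N independent random walks."""
    num = den = 0.0
    for _ in range(N):
        n_i, d_i = random_walk_moments(T)
        num += n_i
        den += d_i
    return T * N ** 0.5 * num / den

T, reps = 50, 400
time_series = [scaled_estimate(1, T) for _ in range(reps)]   # N = 1: single time series
panel = [scaled_estimate(25, T) for _ in range(reps)]        # N = 25: pooled panel

for name, draws in (("N = 1", time_series), ("N = 25", panel)):
    print(name, round(statistics.mean(draws), 2), round(statistics.variance(draws), 2))
```

The draws for N = 1 should be centered well below zero with a large variance, as expected from a skewed non-standard distribution. The pooled draws should sit much closer to zero, with a variance in the vicinity of 2.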
Example
Let us consider the example of a simple Cobb-Douglas production
function with two factors (labor and capital). We have N
countries and T periods. Let us denote:
- $y_{i,t}$ the log of GDP for country i at time t.
- $n_{i,t}$ the log of employment for country i at time t.
- $k_{i,t}$ the log of the capital stock for country i at time t.
$$y_{i,t} = \alpha_i + \beta_i k_{i,t} + \gamma_i n_{i,t} + \varepsilon_{i,t}$$
with $\varepsilon_{i,t}$ i.i.d. $(0, \sigma_\varepsilon^2)$, $\forall i$, $\forall t$.
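Example
In this specification, the elasticities $\beta_i$ and $\gamma_i$ are specific to each
country:
$$y_{i,t} = \alpha_i + \beta_i k_{i,t} + \gamma_i n_{i,t} + \varepsilon_{i,t}$$
with $\varepsilon_{i,t}$ i.i.d. $(0, \sigma_\varepsilon^2)$, $\forall i$, $\forall t$. However, several alternative
specifications can be considered. First, we can assume that the
production function is the same for all countries: in this case we
have a homogeneous specification:
$$y_{i,t} = \alpha + \beta k_{i,t} + \gamma n_{i,t} + \varepsilon_{i,t}$$
$$\alpha_i = \alpha, \quad \beta_i = \beta, \quad \gamma_i = \gamma$$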
Example
However, a homogeneous specification of the production function
for macro aggregated data is meaningless. We can introduce
heterogeneity in Total Factor Productivity (TFP): more precisely, we
can assume that the mean of TFP, given by $E(\alpha_i + \varepsilon_{i,t}) = \alpha_i$, is
different across countries (due to institutional or organisational
factors, etc.).
Then, we can use a specification with individual effects $\alpha_i$ and
common slope parameters (elasticities $\beta$ and $\gamma$):
$$y_{i,t} = \alpha_i + \beta k_{i,t} + \gamma n_{i,t} + \varepsilon_{i,t}$$
$$\beta_i = \beta, \quad \gamma_i = \gamma$$
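As an illustration of why the individual effects matter, the sketch below simulates a panel in which the country effect is correlated with the capital stock. It then compares pooled least squares with least squares on deviations from country means (the usual within transformation, used here as one possible way to handle individual effects). All numerical values, including the elasticities 0.3 and 0.7, are hypothetical.

```python
import random

def ols2(k, n, y):
    """Least-squares slopes of y on (k, n), intercept included (2x2 normal equations)."""
    m = len(y)
    mk, mn, my = sum(k) / m, sum(n) / m, sum(y) / m
    kc = [v - mk for v in k]
    nc = [v - mn for v in n]
    yc = [v - my for v in y]
    skk = sum(a * a for a in kc)
    snn = sum(a * a for a in nc)
    skn = sum(a * b for a, b in zip(kc, nc))
    sky = sum(a * b for a, b in zip(kc, yc))
    sny = sum(a * b for a, b in zip(nc, yc))
    det = skk * snn - skn ** 2
    return (snn * sky - skn * sny) / det, (skk * sny - skn * sky) / det

def flatten(rows):
    return [v for row in rows for v in row]

def within(rows):
    """Subtract from each observation the time mean of its own country."""
    out = []
    for row in rows:
        mean = sum(row) / len(row)
        out.append([v - mean for v in row])
    return out

rng = random.Random(0)
N, T = 50, 20
beta, gamma = 0.3, 0.7                  # hypothetical common elasticities

k, n, y = [], [], []
for i in range(N):
    alpha_i = rng.gauss(0, 1)                               # country effect (mean TFP)
    k_i = [alpha_i + rng.gauss(0, 1) for _ in range(T)]     # capital correlated with it
    n_i = [rng.gauss(0, 1) for _ in range(T)]
    y_i = [alpha_i + beta * a + gamma * b + rng.gauss(0, 0.5) for a, b in zip(k_i, n_i)]
    k.append(k_i)
    n.append(n_i)
    y.append(y_i)

pooled = ols2(flatten(k), flatten(n), flatten(y))
fixed_effects = ols2(flatten(within(k)), flatten(within(n)), flatten(within(y)))
print([round(v, 2) for v in pooled], [round(v, 2) for v in fixed_effects])
```

In this design the pooled capital elasticity should be pushed well above 0.3, because it picks up the country effects. The estimates based on deviations from country means should stay close to (0.3, 0.7).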
Example
Finally, we can assume that the labor and/or capital elasticities are
different across countries. In this case, we have a
heterogeneous specification of the panel data model
(heterogeneous panel):
$$y_{i,t} = \alpha_i + \beta_i k_{i,t} + \gamma_i n_{i,t} + \varepsilon_{i,t}$$
Example
In this case, there are two solutions:
1) The first solution consists in using N time-series models to
produce group-mean estimates of the elasticities.
2) The second solution consists in using a model with random
(slope) parameters => random coefficient model. In this
case, we assume that the parameters $\beta_i$ and $\gamma_i$ are randomly
distributed, each drawn from a common distribution:
$$\beta_i \ \text{i.i.d.} \ (\beta, \sigma_\beta^2), \qquad \gamma_i \ \text{i.i.d.} \ (\gamma, \sigma_\gamma^2)$$
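The first solution can be sketched as follows: one least-squares regression per country, then a simple average of the N country estimates. The data-generating process (Gaussian draws, mean elasticities 0.3 and 0.7, standard deviation 0.1 across countries) is hypothetical and serves only to illustrate the mechanics.

```python
import random
import statistics

def ols2(k, n, y):
    """Least-squares slopes of y on (k, n), intercept included (2x2 normal equations)."""
    m = len(y)
    mk, mn, my = sum(k) / m, sum(n) / m, sum(y) / m
    kc = [v - mk for v in k]
    nc = [v - mn for v in n]
    yc = [v - my for v in y]
    skk = sum(a * a for a in kc)
    snn = sum(a * a for a in nc)
    skn = sum(a * b for a, b in zip(kc, nc))
    sky = sum(a * b for a, b in zip(kc, yc))
    sny = sum(a * b for a, b in zip(nc, yc))
    det = skk * snn - skn ** 2
    return (snn * sky - skn * sny) / det, (skk * sny - skn * sky) / det

rng = random.Random(0)
N, T = 50, 30
beta, gamma = 0.3, 0.7           # hypothetical means of the elasticities
sd_beta, sd_gamma = 0.1, 0.1     # hypothetical dispersion across countries

estimates = []                   # one (beta_i_hat, gamma_i_hat) pair per country
for i in range(N):
    alpha_i = rng.gauss(0, 1)
    beta_i = rng.gauss(beta, sd_beta)
    gamma_i = rng.gauss(gamma, sd_gamma)
    k_i = [rng.gauss(0, 1) for _ in range(T)]
    n_i = [rng.gauss(0, 1) for _ in range(T)]
    y_i = [alpha_i + beta_i * a + gamma_i * b + rng.gauss(0, 0.5)
           for a, b in zip(k_i, n_i)]
    estimates.append(ols2(k_i, n_i, y_i))    # country-by-country time-series regression

beta_hats = [b for b, _ in estimates]
gamma_hats = [g for _, g in estimates]

# Group-mean estimates: averages of the N country-specific estimates.
beta_mg = statistics.mean(beta_hats)
gamma_mg = statistics.mean(gamma_hats)
# Dispersion of the country estimates gives a simple standard error for the mean.
se_beta_mg = statistics.stdev(beta_hats) / N ** 0.5

print(round(beta_mg, 2), round(gamma_mg, 2), round(se_beta_mg, 3))
```

This approach requires T to be large enough for each country regression to be informative, which is typically the case in macro-panels rather than micro-panels.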
In this case, it is straightforward that pooling all NT
observations, assuming identical parameters for all cross-sectional
units, would lead to nonsensical results because it would
represent an average of coefficients that differ greatly across
individuals (the illusion of NT observations).
In this case, pooling gives rise to the false inference that the
pooled relation is curvilinear.
Fact
In both cases, the classic paradigm of the representative agent
simply does not hold, and pooling the data under a homogeneity
assumption makes no sense.