
Regression with Panel Data (SW Ch. 8)
A panel dataset contains observations on multiple entities (individuals), where each entity is observed at two or more points in time.

Examples:
Data on 420 California school districts in 1999 and again in 2000, for 840 observations total.
Data on 50 U.S. states, each state observed in 3 years, for a total of 150 observations.
Data on 1000 individuals, in four different months, for 4000 observations total.

Notation for panel data
A double subscript distinguishes entities (states) and time periods (years):
i = entity (state), n = number of entities, so i = 1,...,n
t = time period (year), T = number of time periods, so t = 1,...,T
Data: Suppose we have 1 regressor. The data are:
$(X_{it}, Y_{it})$, i = 1,...,n, t = 1,...,T

Panel data notation, ctd.
Panel data with k regressors:
$(X_{1it}, X_{2it}, \ldots, X_{kit}, Y_{it})$, i = 1,...,n, t = 1,...,T
n = number of entities (states)
T = number of time periods (years)
Some jargon:
Another term for panel data is longitudinal data
balanced panel: no missing observations
unbalanced panel: some entities (states) are not observed for some time periods (years)
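
The balanced/unbalanced distinction can be checked mechanically. The sketch below is illustrative only: the function name `panel_shape` and the toy state/year pairs are assumptions, not part of the slides. It counts n and T from a list of (entity, period) observations and reports whether every entity is observed in every period.

```python
from collections import defaultdict


def panel_shape(rows):
    """Return (n, T, balanced) for an iterable of (entity, period) pairs."""
    periods_by_entity = defaultdict(set)
    all_periods = set()
    for entity, period in rows:
        periods_by_entity[entity].add(period)
        all_periods.add(period)
    n = len(periods_by_entity)          # number of entities
    T = len(all_periods)                # number of distinct time periods
    # Balanced means every entity is observed in every period.
    balanced = all(p == all_periods for p in periods_by_entity.values())
    return n, T, balanced


# Hypothetical toy panels (not the traffic fatality data).
balanced_rows = [(s, y) for s in ("CA", "TX", "MA") for y in (1982, 1983)]
unbalanced_rows = balanced_rows[:-1]    # drop the last row: MA in 1983

print(panel_shape(balanced_rows))
print(panel_shape(unbalanced_rows))
```

In a balanced panel the total number of observations is n times T, which is how the counts in the examples above (for instance 420 x 2 = 840) arise.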


Why are panel data useful?
With panel data we can control for factors that:
Vary across entities (states) but do not vary over time
Could cause omitted variable bias if they are omitted
Are unobserved or unmeasured, and therefore cannot be included in the regression using multiple regression
Here's the key idea: If an omitted variable does not change over time, then any changes in Y over time cannot be caused by the omitted variable.

Example of a panel data set: Traffic deaths and alcohol taxes
Observational unit: a year in a U.S. state
48 U.S. states, so n = # of entities = 48
7 years (1982,...,1988), so T = # of time periods = 7
Balanced panel, so total # observations = 7 x 48 = 336
Variables:
Traffic fatality rate (# traffic deaths in that state in that year, per 10,000 state residents)
Tax on a case of beer
Other (legal driving age, drunk driving laws, etc.)

Traffic death data for 1982
[Figure not reproduced]
Higher alcohol taxes, more traffic deaths?

Traffic death data for 1988
[Figure not reproduced]
Higher alcohol taxes, more traffic deaths?

Why might there be more traffic deaths in states that have higher alcohol taxes?
Other factors that determine the traffic fatality rate:
Quality (age) of automobiles
Quality of roads
"Culture" around drinking and driving
Density of cars on the road

These omitted factors could cause omitted variable bias.
Example #1: traffic density. Suppose:
(i) High traffic density means more traffic deaths
(ii) (Western) states with lower traffic density have lower alcohol taxes
Then the two conditions for omitted variable bias are satisfied. Specifically, "high taxes" could reflect "high traffic density" (so the OLS coefficient would be biased positively: high taxes, more deaths).
Panel data lets us eliminate omitted variable bias when the omitted variables are constant over time within a given state.

Example #2: cultural attitudes towards drinking and driving
(i) arguably are a determinant of traffic deaths; and
(ii) potentially are correlated with the beer tax, so beer taxes could be picking up cultural differences (omitted variable bias).
Then the two conditions for omitted variable bias are satisfied. Specifically, "high taxes" could reflect "cultural attitudes towards drinking" (so the OLS coefficient would be biased).
Panel data lets us eliminate omitted variable bias when the omitted variables are constant over time within a given state.

Panel Data with Two Time Periods (SW Section 8.2)
Consider the panel data model,
$FatalityRate_{it} = \beta_0 + \beta_1 BeerTax_{it} + \beta_2 Z_i + u_{it}$
$Z_i$ is a factor that does not change over time (density), at least during the years on which we have data.
Suppose $Z_i$ is not observed, so its omission could result in omitted variable bias.
The effect of $Z_i$ can be eliminated using T = 2 years.

The key idea: Any change in the fatality rate from 1982 to 1988 cannot be caused by $Z_i$, because $Z_i$ (by assumption) does not change between 1982 and 1988.
The math: consider fatality rates in 1988 and 1982:
$FatalityRate_{i1988} = \beta_0 + \beta_1 BeerTax_{i1988} + \beta_2 Z_i + u_{i1988}$
$FatalityRate_{i1982} = \beta_0 + \beta_1 BeerTax_{i1982} + \beta_2 Z_i + u_{i1982}$
Suppose $E(u_{it} | BeerTax_{it}, Z_i) = 0$.
Subtracting 1988 minus 1982 (that is, calculating the change) eliminates the effect of $Z_i$:
$FatalityRate_{i1988} - FatalityRate_{i1982} = \beta_1 (BeerTax_{i1988} - BeerTax_{i1982}) + (u_{i1988} - u_{i1982})$

The new error term, $(u_{i1988} - u_{i1982})$, is uncorrelated with either $BeerTax_{i1988}$ or $BeerTax_{i1982}$.
This "difference" equation can be estimated by OLS, even though $Z_i$ isn't observed.
The omitted variable $Z_i$ doesn't change, so it cannot be a determinant of the change in Y.
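
The differencing argument can be illustrated with a small numerical sketch. Everything here is hypothetical: the coefficient values, the four made-up "states", and the helper name `ols_slope` are assumptions chosen for illustration, and the error term is set to zero so the algebra is exact. Because Z_i is built to be correlated with the regressor, the levels regression is biased, while the regression of changes on changes recovers beta_1.

```python
def ols_slope(x, y):
    """Slope from an OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Assumed (hypothetical) population coefficients; errors are set to zero.
beta0, beta1, beta2 = 2.0, -1.0, 3.0

z = [0.0, 1.0, 2.0, 3.0]          # unobserved, constant over time
x_1982 = [0.5, 1.0, 1.5, 2.0]     # regressor is high where z is high
x_1988 = [0.7, 1.6, 1.8, 2.9]

y_1982 = [beta0 + beta1 * x + beta2 * zi for x, zi in zip(x_1982, z)]
y_1988 = [beta0 + beta1 * x + beta2 * zi for x, zi in zip(x_1988, z)]

# Levels regression for one year omits z and picks up its effect.
levels_slope = ols_slope(x_1982, y_1982)

# Changes regression: z drops out because it does not change over time.
dx = [b - a for a, b in zip(x_1982, x_1988)]
dy = [b - a for a, b in zip(y_1982, y_1988)]
diff_slope = ols_slope(dx, dy)

print("levels slope:", levels_slope)
print("difference slope:", diff_slope)
```

In this constructed example the levels slope is positive even though beta_1 is negative, which mirrors the sign pattern discussed for the beer tax.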

Example: Traffic deaths and beer taxes
1982 data:
$\widehat{FatalityRate} = 2.01 + 0.15\,BeerTax$   (n = 48)
(.15) (.13)
1988 data:
$\widehat{FatalityRate} = 1.86 + 0.44\,BeerTax$   (n = 48)
(.11) (.13)
Difference regression (n = 48):
$\widehat{FR_{1988} - FR_{1982}} = -.072 - 1.04\,(BeerTax_{1988} - BeerTax_{1982})$
(.065) (.36)

Fixed Effects Regression (SW Section 8.3)
What if you have more than 2 time periods (T > 2)?
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}$, i = 1,...,n, t = 1,...,T
We can rewrite this in two useful ways:
1. "n-1 binary regressor" regression model
2. "Fixed Effects" regression model
We first rewrite this in "fixed effects" form. Suppose we have n = 3 states: California, Texas, Massachusetts.

Population regression for California (that is, i = CA):
$Y_{CA,t} = \beta_0 + \beta_1 X_{CA,t} + \beta_2 Z_{CA} + u_{CA,t} = (\beta_0 + \beta_2 Z_{CA}) + \beta_1 X_{CA,t} + u_{CA,t}$
or
$Y_{CA,t} = \alpha_{CA} + \beta_1 X_{CA,t} + u_{CA,t}$
$\alpha_{CA} = \beta_0 + \beta_2 Z_{CA}$ doesn't change over time
$\alpha_{CA}$ is the intercept for CA, and $\beta_1$ is the slope
The intercept is unique to CA, but the slope is the same in all the states: parallel lines.

For TX:
$Y_{TX,t} = \beta_0 + \beta_1 X_{TX,t} + \beta_2 Z_{TX} + u_{TX,t} = (\beta_0 + \beta_2 Z_{TX}) + \beta_1 X_{TX,t} + u_{TX,t}$
or
$Y_{TX,t} = \alpha_{TX} + \beta_1 X_{TX,t} + u_{TX,t}$, where $\alpha_{TX} = \beta_0 + \beta_2 Z_{TX}$
Collecting the lines for all three states:
$Y_{CA,t} = \alpha_{CA} + \beta_1 X_{CA,t} + u_{CA,t}$
$Y_{TX,t} = \alpha_{TX} + \beta_1 X_{TX,t} + u_{TX,t}$
$Y_{MA,t} = \alpha_{MA} + \beta_1 X_{MA,t} + u_{MA,t}$
or
$Y_{it} = \alpha_i + \beta_1 X_{it} + u_{it}$, i = CA, TX, MA, t = 1,...,T

The regression lines for each state in a picture
[Figure not reproduced: three parallel lines, $Y = \alpha_{CA} + \beta_1 X$ (CA), $Y = \alpha_{TX} + \beta_1 X$ (TX), $Y = \alpha_{MA} + \beta_1 X$ (MA)]

Recall (Fig. 6.8a) that shifts in the intercept can be represented using binary regressors:
[Figure not reproduced: three parallel lines, $Y = \alpha_{CA} + \beta_1 X$ (CA), $Y = \alpha_{TX} + \beta_1 X$ (TX), $Y = \alpha_{MA} + \beta_1 X$ (MA)]
In binary regressor form:
$Y_{it} = \beta_0 + \gamma_{CA} DCA_i + \gamma_{TX} DTX_i + \beta_1 X_{it} + u_{it}$
$DCA_i$ = 1 if state is CA, = 0 otherwise
$DTX_i$ = 1 if state is TX, = 0 otherwise
leave out $DMA_i$ (why?)

Summary: Two ways to write the fixed effects model
"n-1 binary regressor" form:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \cdots + \gamma_n Dn_i + u_{it}$
where $D2_i$ = 1 for i = 2 (state #2), = 0 otherwise, etc.
"Fixed effects" form:
$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$
$\alpha_i$ is called a "state fixed effect" or "state effect": it is the constant (fixed) effect of being in state i

Fixed Effects Regression: Estimation
Three estimation methods:
1. "n-1 binary regressors" OLS regression
2. "Entity-demeaned" OLS regression
3. "Changes" specification (only works for T = 2)
These three methods produce identical estimates of the regression coefficients, and identical standard errors.
We already did the "changes" specification (1988 minus 1982), but this only works for T = 2 years
Methods #1 and #2 work for general T
Method #1 is only practical when n isn't too big

1. "n-1 binary regressors" OLS regression
$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \cdots + \gamma_n Dn_i + u_{it}$   (1)
where $D2_i$ = 1 for i = 2 (state #2), = 0 otherwise, etc.
First create the binary variables $D2_i, \ldots, Dn_i$
Then estimate (1) by OLS
Inference (hypothesis tests, confidence intervals) is as usual (using heteroskedasticity-robust standard errors)
This is impractical when n is very large (for example if n = 1000 workers)

2. "Entity-demeaned" OLS regression
The fixed effects regression model:
$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$
The state averages satisfy:
$\frac{1}{T}\sum_{t=1}^{T} Y_{it} = \alpha_i + \beta_1 \frac{1}{T}\sum_{t=1}^{T} X_{it} + \frac{1}{T}\sum_{t=1}^{T} u_{it}$
Deviation from state averages:
$Y_{it} - \frac{1}{T}\sum_{t=1}^{T} Y_{it} = \beta_1 \left( X_{it} - \frac{1}{T}\sum_{t=1}^{T} X_{it} \right) + \left( u_{it} - \frac{1}{T}\sum_{t=1}^{T} u_{it} \right)$

Entity-demeaned OLS regression, ctd.
The deviation-from-averages equation can be written as
$\tilde{Y}_{it} = \beta_1 \tilde{X}_{it} + \tilde{u}_{it}$   (2)
where $\tilde{Y}_{it} = Y_{it} - \frac{1}{T}\sum_{t=1}^{T} Y_{it}$ and $\tilde{X}_{it} = X_{it} - \frac{1}{T}\sum_{t=1}^{T} X_{it}$
For i = 1 and t = 1982, $\tilde{Y}_{it}$ is the difference between the fatality rate in Alabama in 1982 and its average value in Alabama over all 7 years.

Entity-demeaned OLS regression, ctd.
First construct the demeaned variables $\tilde{Y}_{it}$ and $\tilde{X}_{it}$
Then estimate (2) by regressing $\tilde{Y}_{it}$ on $\tilde{X}_{it}$ using OLS
Inference (hypothesis tests, confidence intervals) is as usual (using heteroskedasticity-robust standard errors)
This is like the "changes" approach, but $Y_{it}$ is expressed as a deviation from the state average rather than from $Y_{i1}$.
This can be done in a single command in STATA

Example: Traffic deaths and beer taxes in STATA

. areg vfrall beertax, absorb(state) r;

Regression with robust standard errors                 Number of obs =     336
                                                       F(  1,   287) =   10.41
                                                       Prob > F      =  0.0014
                                                       R-squared     =  0.9050
                                                       Adj R-squared =  0.8891
                                                       Root MSE      =  .18986
------------------------------------------------------------------------------
             |               Robust
      vfrall |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.6558736   .2032797    -3.23   0.001     -1.055982   -.2557655
       _cons |   2.377075   .1051515    22.61   0.000      2.170109    2.584041
-------------+----------------------------------------------------------------
       state |   absorbed                                      (48 categories)

"areg" automatically de-means the data
this is especially useful when n is large
the reported intercept is arbitrary

Example, ctd.
For n = 48, T = 7:
$\widehat{FatalityRate} = -.66\,BeerTax$ + State fixed effects
(.20)
Should you report the intercept?
How many binary regressors would you include to estimate this using the "binary regressor" method?
Compare slope, standard error to the estimate for the 1988 v. 1982 "changes" specification (T = 2, n = 48):
$\widehat{FR_{1988} - FR_{1982}} = -.072 - 1.04\,(BeerTax_{1988} - BeerTax_{1982})$
(.065) (.36)
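
The link between the "entity-demeaned" and "changes" methods can be checked numerically for T = 2. The sketch below uses made-up numbers for three hypothetical entities, and the helper names `entity_demean` and `slope_through_origin` are assumptions for illustration. It also assumes the changes regression is run without an intercept, unlike the example above, which includes one; under that assumption the two slopes coincide exactly, because with two periods each demeaned value is plus or minus half the change.

```python
def slope_through_origin(x, y):
    """OLS slope from regressing y on x with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def entity_demean(panel):
    """Subtract each entity's own time average from its observations."""
    demeaned = {}
    for entity, values in panel.items():
        mean = sum(values) / len(values)
        demeaned[entity] = [v - mean for v in values]
    return demeaned


def flatten(panel):
    """Stack all entity-period observations in a fixed entity order."""
    return [v for entity in sorted(panel) for v in panel[entity]]


# Hypothetical data: three entities, two periods each.
x = {"A": [1.0, 1.5], "B": [2.0, 1.0], "C": [0.5, 2.5]}
y = {"A": [3.0, 2.4], "B": [5.0, 5.9], "C": [1.0, -0.7]}

# Method 2: entity-demeaned regression.
fe_slope = slope_through_origin(flatten(entity_demean(x)),
                                flatten(entity_demean(y)))

# Method 3: regression of changes on changes (no intercept here).
dx = [x[e][1] - x[e][0] for e in sorted(x)]
dy = [y[e][1] - y[e][0] for e in sorted(y)]
changes_slope = slope_through_origin(dx, dy)

print(fe_slope, changes_slope)
```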

Regression with Time Fixed Effects (SW Section 8.4)
An omitted variable might vary over time but not across states:
Safer cars (air bags, etc.); changes in national laws
These produce intercepts that change over time
Let these changes ("safer cars") be denoted by the variable $S_t$, which changes over time but not across states.
The resulting population regression model is:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + \beta_3 S_t + u_{it}$

Time fixed effects only
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_3 S_t + u_{it}$
In effect, the intercept varies from one year to the next:
$Y_{i,1982} = \beta_0 + \beta_1 X_{i,1982} + \beta_3 S_{1982} + u_{i,1982} = (\beta_0 + \beta_3 S_{1982}) + \beta_1 X_{i,1982} + u_{i,1982}$
or
$Y_{i,1982} = \lambda_{1982} + \beta_1 X_{i,1982} + u_{i,1982}$, where $\lambda_{1982} = \beta_0 + \beta_3 S_{1982}$
Similarly,
$Y_{i,1983} = \lambda_{1983} + \beta_1 X_{i,1983} + u_{i,1983}$, where $\lambda_{1983} = \beta_0 + \beta_3 S_{1983}$
etc.

Two formulations for time fixed effects
1. "Binary regressor" formulation:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \delta_2 B2_t + \cdots + \delta_T BT_t + u_{it}$
where $B2_t$ = 1 when t = 2 (year #2), = 0 otherwise, etc.
2. "Time effects" formulation:
$Y_{it} = \beta_1 X_{it} + \lambda_t + u_{it}$

Time fixed effects: estimation methods
1. "T-1 binary regressors" OLS regression
$Y_{it} = \beta_0 + \beta_1 X_{it} + \delta_2 B2_t + \cdots + \delta_T BT_t + u_{it}$
Create binary variables B2,...,BT
B2 = 1 if t = year #2, = 0 otherwise
Regress Y on X, B2,...,BT using OLS
Where's B1?
2. "Year-demeaned" OLS regression
Deviate $Y_{it}$, $X_{it}$ from year (not state) averages
Estimate by OLS using "year-demeaned" data

State and Time Fixed Effects
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + \beta_3 S_t + u_{it}$
1. "Binary regressor" formulation:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \cdots + \gamma_n Dn_i + \delta_2 B2_t + \cdots + \delta_T BT_t + u_{it}$
2. "State and time effects" formulation:
$Y_{it} = \beta_1 X_{it} + \alpha_i + \lambda_t + u_{it}$

State and time effects: estimation methods
1. "n-1 and T-1 binary regressors" OLS regression
Create binary variables D2,...,Dn
Create binary variables B2,...,BT
Regress Y on X, D2,...,Dn, B2,...,BT using OLS
What about D1 and B1?
2. "State- and year-demeaned" OLS regression
Deviate $Y_{it}$, $X_{it}$ from year and state averages
Estimate by OLS using "year- and state-demeaned" data
These two methods can be combined too.
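
As a rough illustration of the "state- and year-demeaned" method, the sketch below applies one common way of deviating from both sets of averages in a balanced panel: subtract the state average and the year average, then add back the grand average. The data, the coefficient values, and the function names are hypothetical, and the error term is set to zero so the result is exact. This is a sketch under those assumptions, not a description of what any particular software does internally.

```python
def two_way_demean(v):
    """For a balanced panel v[i][t], return v_it - mean_i - mean_t + grand mean."""
    n, T = len(v), len(v[0])
    state_mean = [sum(row) / T for row in v]
    year_mean = [sum(v[i][t] for i in range(n)) / n for t in range(T)]
    grand_mean = sum(state_mean) / n
    return [[v[i][t] - state_mean[i] - year_mean[t] + grand_mean
             for t in range(T)] for i in range(n)]


def slope_through_origin(x, y):
    """OLS slope from regressing y on x with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical balanced panel: 3 states, 3 years, errors set to zero.
beta1 = -0.66
alpha = [1.0, -2.0, 0.5]        # state effects (assumed values)
lam = [0.0, 0.3, -0.4]          # time effects (assumed values)
x = [[1.0, 2.0, 4.0],
     [2.0, 2.0, 1.0],
     [0.0, 3.0, 5.0]]
y = [[beta1 * x[i][t] + alpha[i] + lam[t] for t in range(3)]
     for i in range(3)]

# Both sets of effects are removed by the transformation.
x_tilde = [v for row in two_way_demean(x) for v in row]
y_tilde = [v for row in two_way_demean(y) for v in row]
two_way_slope = slope_through_origin(x_tilde, y_tilde)

print(two_way_slope)
```

The simple formula above relies on the panel being balanced; with missing state-year observations the state and year averages no longer separate so cleanly.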

STATA example: Traffic deaths

. gen y83=(year==1983);
. gen y84=(year==1984);
. gen y85=(year==1985);
. gen y86=(year==1986);
. gen y87=(year==1987);
. gen y88=(year==1988);
. areg vfrall beertax y83 y84 y85 y86 y87 y88, absorb(state) r;

Regression with robust standard errors                 Number of obs =     336
                                                       F(  7,   281) =    3.70
                                                       Prob > F      =  0.0008
                                                       R-squared     =  0.9089
                                                       Adj R-squared =  0.8914
                                                       Root MSE      =  .18788
------------------------------------------------------------------------------
             |               Robust
      vfrall |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     beertax |  -.6399799   .2547149    -2.51   0.013     -1.141371   -.1385884
         y83 |  -.0799029   .0502708    -1.59   0.113     -.1788579    .0190522
         y84 |  -.0724206   .0452466    -1.60   0.111      -.161486    .0166448
         y85 |  -.1239763   .0460017    -2.70   0.007      -.214528   -.0334246
         y86 |  -.0378645   .0486527    -0.78   0.437     -.1336344    .0579055
         y87 |  -.0509021   .0516113    -0.99   0.325     -.1524958    .0506917
         y88 |  -.0518038     .05387    -0.96   0.337     -.1578438    .0542361
       _cons |    2.42847   .1468565    16.54   0.000      2.139392    2.717549
-------------+----------------------------------------------------------------
       state |   absorbed                                      (48 categories)

Go to section for other ways to do this in STATA!

Some Theory: The Fixed Effects Regression Assumptions (SW App. 8.2)
For a single X:
$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$, i = 1,...,n, t = 1,...,T
1. $E(u_{it} | X_{i1}, \ldots, X_{iT}, \alpha_i) = 0$.
2. $(X_{i1}, \ldots, X_{iT}, Y_{i1}, \ldots, Y_{iT})$, i = 1,...,n, are i.i.d. draws from their joint distribution.
3. $(X_{it}, u_{it})$ have finite fourth moments.
4. There is no perfect multicollinearity (multiple X's)
5. $corr(u_{it}, u_{is} | X_{it}, X_{is}, \alpha_i) = 0$ for $t \neq s$.
Assumptions 3 & 4 are identical to the multiple regression assumptions; 1 and 2 differ; 5 is new

Assumption #1: $E(u_{it} | X_{i1}, \ldots, X_{iT}, \alpha_i) = 0$
$u_{it}$ has mean zero, given the state fixed effect and the entire history of the X's for that state
This is an extension of the previous multiple regression Assumption #1
This means there are no omitted lagged effects (any lagged effects of X must enter explicitly)
Also, there is no feedback from u to future X:
o Whether a state has a particularly high fatality rate this year doesn't subsequently affect whether it increases the beer tax.
o We'll return to this when we take up time series data.

Assumption #2: $(X_{i1}, \ldots, X_{iT}, Y_{i1}, \ldots, Y_{iT})$, i = 1,...,n, are i.i.d. draws from their joint distribution.
This is an extension of Assumption #2 for multiple regression with cross-section data
This is satisfied if entities (states, individuals) are randomly sampled from their population by simple random sampling, then data for those entities are collected over time.
This does not require observations to be i.i.d. over time for the same entity; that would be unrealistic (whether a state has a mandatory DWI sentencing law this year is strongly related to whether it will have that law next year).

Assumption #5: $corr(u_{it}, u_{is} | X_{it}, X_{is}, \alpha_i) = 0$ for $t \neq s$
This is new.
This says that (given X), the error terms are uncorrelated over time within a state.
For example, $u_{CA,1982}$ and $u_{CA,1983}$ are uncorrelated
Is this plausible? What enters the error term?
o Especially snowy winter
o Opening major new divided highway
o Fluctuations in traffic density from local economic conditions
Assumption #5 requires these omitted factors entering $u_{it}$ to be uncorrelated over time, within a state.

What if Assumption #5 fails: $corr(u_{it}, u_{is} | X_{it}, X_{is}, \alpha_i) \neq 0$?
A useful analogy is heteroskedasticity.
OLS panel data estimators of $\beta_1$ are unbiased, consistent
The OLS standard errors will be wrong; usually the OLS standard errors understate the true uncertainty
Intuition: if $u_{it}$ is correlated over time, you don't have as much information (as much random variation) as you would were $u_{it}$ uncorrelated.
This problem is solved by using "heteroskedasticity and autocorrelation-consistent standard errors"; we return to this when we focus on time series regression

Application: Drunk Driving Laws and Traffic Deaths (SW Section 8.5)
Some facts
Approx. 40,000 traffic fatalities annually in the U.S.
1/3 of traffic fatalities involve a drinking driver
25% of drivers on the road between 1am and 3am have been drinking (estimate)
A drunk driver is 13 times as likely to cause a fatal crash as a non-drinking driver (estimate)

Drunk driving laws and traffic deaths, ctd.
Public policy issues
Drunk driving causes massive externalities (sober drivers are killed, etc.), so there is ample justification for governmental intervention
Are there any effective ways to reduce drunk driving? If so, what?
What are the effects of specific laws:
o mandatory punishment
o minimum legal drinking age
o economic interventions (alcohol taxes)

The drunk driving panel data set
n = 48 U.S. states, T = 7 years (1982,...,1988) (balanced)
Variables
Traffic fatality rate (deaths per 10,000 residents)
Tax on a case of beer (Beertax)
Minimum legal drinking age
Minimum sentencing laws for first DWI violation:
o Mandatory Jail
o Mandatory Community Service
o otherwise, sentence will just be a monetary fine
Vehicle miles per driver (US DOT)
State economic data (real per capita income, etc.)

Why might panel data help?
Potential OV bias from variables that vary across states but are constant over time:
o culture of drinking and driving
o quality of roads
o vintage of autos on the road
use state fixed effects
Potential OV bias from variables that vary over time but are constant across states:
o improvements in auto safety over time
o changing national attitudes towards drunk driving
use time fixed effects

Empirical Analysis: Main Results
Sign of beer tax coefficient changes when fixed state effects are included
Fixed time effects are statistically significant but do not have a big impact on the estimated coefficients
Estimated effect of beer tax drops when other laws are included as regressors
The only policy variable that seems to have an impact is the tax on beer: not minimum drinking age, not mandatory sentencing, etc.
The other economic variables have plausibly large coefficients: more income, more driving, more deaths

Extensions of the "n-1 binary regressor" approach
The idea of using many binary indicators to eliminate omitted variable bias can be extended to non-panel data. The key is that the omitted variable is constant for a group of observations, so that in effect each group has its own intercept.
Example: Class size problem.
Suppose funding and curricular issues are determined at the county level, and each county has several districts.
Resulting omitted variable bias could be addressed by including binary indicators, one for each county (omit one to avoid perfect multicollinearity).

Summary: Regression with Panel Data (SW Section 8.6)
Advantages and limitations of fixed effects regression
Advantages
You can control for unobserved variables that:
o vary across states but not over time, and/or
o vary over time but not across states
More observations give you more information
Estimation involves relatively straightforward extensions of multiple regression
Fixed effects estimation can be done three ways:
1. "Changes" method when T = 2
2. "n-1 binary regressors" method when n is small
3. "Entity-demeaned" regression
Similar methods apply to regression with time fixed effects and to both time and state fixed effects
Statistical inference: like multiple regression.
Limitations/challenges
Need variation in X over time within states
Time lag effects can be important
Standard errors might be too low (errors might be correlated over time)
