
6th Seminar Jean Paelinck

Madrid, 18-19th October, 2013


Universidad Autónoma de Madrid

Geographically Weighted
Panel Regression
(GWPR)

Fernando Bruna, University of A Coruña (Spain)


Danlin Yu, Montclair State University, NJ (USA)
Motivations
1) Geographically weighted regression (GWR): a technique to
study spatially nonstationary relationships in cross-sections.
But panel data estimates have a different interpretation
because they pool (quasi-) time-demeaned data.
How do we study nonstationary relationships in
models with unobserved individual effects?

2) The wage equation of the New Economic Geography (NEG)
captures spatial effects without spatial econometrics.
Can the Market Potential variable in this equation be
interpreted in terms of local effects? What does its GW
estimate mean?
Background
• GTWR (geographically and temporally weighted regression):
extension of GWR considering ‘closeness’ in both
time and space, but keeping the cross-sectional framework.
• Yu (2010): Geographically weighted panel regression
(GWPR) fills the gap between the GWR and panel data
methods. LOCAL ESTIMATION WITH ECONOMETRIC
TECHNIQUES BASED ON REPEATED OBSERVATIONS
FOR EACH LOCATION. Lin (2011): maximum likelihood
estimators of spatial panel data GWR models (SPDGWR).
• Paredes and Iturra (2012): the only (cross-sectional) GWR
estimation of an NEG wage equation.
• Bruna (2013): Market Potential in this equation partially
captures local effects ⇒ GW approaches to the wage equation
estimate local variations of neighboring effects.
The interpretation of panel models
• Cross-sectional/pooled data with variables in levels (comparing relative levels):
$y_{it} = \alpha + \beta' x_{it} + u_t + u_{it}$

• Panel data with unobserved individual fixed effects:
$y_{it} = \alpha + \beta' x_{it} + u_i + u_t + u_{it}$
Removing the unobserved effects. ‘Within’ transformation, using the regional means:
$\bar{y}_i = \alpha + \beta' \bar{x}_i + u_i + \bar{u}_t + \bar{u}_i$
Deviations from the regional means (changes of levels):
$y_{it} - \bar{y}_i = \beta' (x_{it} - \bar{x}_i) + (u_t - \bar{u}_t) + (u_{it} - \bar{u}_i)$
or first differencing (changes with respect to the previous period):
$y_{it} - y_{it-1} = \alpha + \beta' (x_{it} - x_{it-1}) + (u_t - u_{t-1}) + (u_{it} - u_{it-1})$
If variables are in logs, this equation pools instantaneous growth rates.

• Pooled data in one-period growth rates:
$g_{Y,it} = \alpha + \beta' g_{X,it} + u_t + \varepsilon_{it}$
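The difference between pooling levels and pooling deviations from the regional means can be illustrated with a minimal Python sketch. The toy panel, the helper names `within_transform` and `ols_slope`, and the single regressor without time effects are all assumptions made for illustration; they are not the data or code behind these slides.

```python
from collections import defaultdict

def within_transform(values, regions):
    """Subtract each region's time mean from its own observations."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, r in zip(values, regions):
        sums[r] += v
        counts[r] += 1
    return [v - sums[r] / counts[r] for v, r in zip(values, regions)]

def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical panel: two regions, three periods, true slope 0.5.
# The regional effects (10 and 0) are correlated with the level of x.
regions = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
effects = {"A": 10.0, "B": 0.0}
y = [effects[r] + 0.5 * xi for r, xi in zip(regions, x)]

pooled_slope = ols_slope(x, y)  # compares relative levels across regions
within_slope = ols_slope(within_transform(x, regions),
                         within_transform(y, regions))  # changes of levels
print(pooled_slope, within_slope)
```

In this toy case the pooled slope is negative because it mixes the regional effects with the levels of x, while the within slope recovers 0.5 from the changes inside each region.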
The NEG and Harris’s (1954) Market Potential
• New Economic Geography (NEG) wage-type equation
derived by Bruna (2013) to include capital stock.
• Regional wage levels are a function of the size of the markets
available to each region. Cross-sectional form:
$\ln w_i = \alpha + \beta_1 \ln k_i + \beta_2 \ln h_i + \beta_3 \ln RMA_i$
• $RMA_i$ (Market Access) proxied by Harris’s Market Potential.
External Market Potential ($EMP_i$) of region $i = 1, \dots, n$: inverse
distance ($d_{ij}$) weighted sum of the income of all the other regions:
$EMP_i = \sum_{j \neq i}^{n-1} \frac{GVA_j}{d_{ij}}$, where $GVA_j$ = gross value added
• $EMP_i$: nonstandardized inverse distance weighted spatial lag of
income using all the observations in the sample.
• Any measure of Market Access with distance exponents close
to -1 captures neighboring effects.
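A minimal Python sketch of Harris's External Market Potential as an inverse-distance-weighted sum of the income of the other regions. The function name, the three toy regions and the use of Euclidean distance between planar coordinates are assumptions for illustration; an applied study would use its own distance measure.

```python
import math

def external_market_potential(gva, coords):
    """EMP_i = sum over j != i of GVA_j / d_ij.

    gva    : list of regional gross value added
    coords : list of (x, y) points; Euclidean distance is an assumption
    """
    emp = []
    for i, ci in enumerate(coords):
        total = 0.0
        for j, cj in enumerate(coords):
            if j == i:
                continue  # external potential excludes the region itself
            total += gva[j] / math.dist(ci, cj)
        emp.append(total)
    return emp

# Hypothetical regions on a line, 5 distance units apart.
gva = [100.0, 50.0, 20.0]
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
emp = external_market_potential(gva, coords)
print(emp)
```

The middle region gets the highest value because it is close to both of the others, which is the sense in which the variable behaves like a spatial lag of income.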
Levels versus deviations from the means
1) The significance of the variables can change dramatically
depending on whether the pooled data are in levels or time-demeaned.
2) Demeaned data, like growth rates, can be highly volatile.
3) Estimates obtained from pooled demeaned data may be
very sensitive to the inclusion of time effects.
4) Cross-sectional data in deviations from the regional means of the
whole period have higher dispersion and lower spatial
autocorrelation than data in levels.

Dispersion and spatial autocorrelation of the variables for the cross-section of the year 2008

                           Data in levels                     Data in deviations from the 1995-2008 means
Variables                  QCD      Moran's I   p-value      QCD      Moran's I   p-value
Per capita GVA             0.018    0.618       0.000        0.351    0.463       0.000
Per capita capital stock   0.015    0.518       0.000        0.249    0.323       0.000
Human capital             -0.091    0.528       0.000        0.440    0.476       0.000
External Market Potential  0.037    0.919       0.000        0.050    0.773       0.000

QCD: quartile coefficient of dispersion. Moran's I: statistic of Moran's test, with its p-value.
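For readers less familiar with the two statistics in the table, here is a hedged Python sketch of their textbook definitions: the quartile coefficient of dispersion, (Q3 - Q1) / (Q3 + Q1), and Moran's I for a given spatial weights matrix. The quartile convention (the default of Python's `statistics.quantiles`), the binary contiguity weights and the toy values are assumptions; the slides do not state which conventions were used.

```python
import statistics

def quartile_coefficient_of_dispersion(values):
    """(Q3 - Q1) / (Q3 + Q1), with Python's default quartile method."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / (q3 + q1)

def morans_i(values, weights):
    """Moran's I for a square spatial weights matrix (list of lists)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

# Hypothetical example: four regions along a line, neighbors share a border.
values = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
qcd = quartile_coefficient_of_dispersion(values)
moran = morans_i(values, weights)
print(qcd, moran)
```

A positive Moran's I, as in this smooth gradient, indicates that neighboring regions tend to have similar values.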
Pooling levels and deviations from the means
The fixed effects panel estimates capture the short-run effects of
changes of $x_{it}$ on changes of $y_{it}$ (wages proxied by per capita GVA)

                           Pooled estimation (levels)       Panel with regional fixed effects
                           (1)        (2)        (3)        (4)        (5)        (6)
(Intercept)                1.734***   1.475***   1.489***
                           (0.118)    (0.116)    (0.116)
Per capita capital stock   0.646***   0.679***   0.678***   0.171***   0.188***   0.178***
                           (0.010)    (0.010)    (0.010)    (0.017)    (0.017)    (0.017)
Human capital              0.149***   0.165***   0.166***
                           (0.008)    (0.008)    (0.008)
External Market Potential  0.139***   0.139***   0.139***   0.610***   0.984***   0.854***
                           (0.007)    (0.006)    (0.006)    (0.025)    (0.083)    (0.058)
Trend                                           -0.010***                        -0.006***
                                                 (0.001)                          (0.001)
Year dummies?              No         Yes        No         No         Yes        No
R-squared                  0.792      0.806      0.805      0.793      0.796      0.795
Adj. R-squared             0.791      0.801      0.804      0.736      0.735      0.737
F                          3662       743        2978       5134       692        3457
Sum sq. errors             68.43      64.00      64.13      4.37       4.32       4.34
GWR in the nonparametric family
$y = f(x, z)$
1) Locally Weighted Regression (LWR): approximate $f(\cdot)$
weighting $x_j - x_i$ and $z_j - z_i$: $y = f(x, z) + u$
2) Kernel Regression: special case of LWR. Weight $y_j$ with
$x_j - x_i$ and $z_j - z_i$: $y = f(y) + u$
3) Conditional Parametric Regression (CPAR): special case
of LWR. Assume that $(x, z)$ can be divided into a
nonparametric $x$ and a conditionally parametric $z$:
$y = \alpha(z) + \beta(z)\,x + u$
Spatial CPAR: the conditionally parametric variables are the
geographic coordinates of each point, latitude and longitude:
$y = \alpha(la, lo) + \beta(la, lo)\,x + u$
4) Geographically Weighted Regression (GWR): special case
of spatial CPAR (LWR): $y = \alpha(d) + \beta(d)\,x + u$
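The relation between kernel regression and locally weighted regression in the list above can be sketched in a few lines of Python with a single regressor. The Gaussian kernel, the bandwidth of 1 and the toy data are assumptions for illustration: kernel regression is the local constant case (a weighted average of $y_j$), while the local linear fit also estimates a slope around the target point.

```python
import math

def gaussian_weights(xs, x0, h):
    """Kernel weights that shrink with the distance x_j - x0 (bandwidth h)."""
    return [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]

def kernel_regression(xs, ys, x0, h):
    """Local constant fit: a weighted average of y_j around x0."""
    w = gaussian_weights(xs, x0, h)
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

def local_linear_regression(xs, ys, x0, h):
    """Local linear fit: weighted least squares line evaluated at x0."""
    w = gaussian_weights(xs, x0, h)
    sw = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, xs)) / sw  # weighted mean of x
    yw = sum(wi * yi for wi, yi in zip(w, ys)) / sw  # weighted mean of y
    sxy = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, xs, ys))
    sxx = sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, xs))
    return yw + (sxy / sxx) * (x0 - xw)

# Hypothetical data lying exactly on the line y = 1 + x.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 2.0, 3.0]
center_fit = kernel_regression(xs, ys, 1.0, 1.0)
edge_kernel = kernel_regression(xs, ys, 0.0, 1.0)
edge_linear = local_linear_regression(xs, ys, 0.0, 1.0)
print(center_fit, edge_kernel, edge_linear)
```

At the edge of the sample the weighted average is pulled toward the interior observations, whereas the local linear fit reproduces the line. GWR applies the same local weighted least squares idea with weights based on geographic distance.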
Geographically Weighted Regression
• Cross-sectional data: GWR. Repeated estimation of a
local regression at each point in space with a subsample of
the data properly weighted according to the distance of
each location to the target point. Brunsdon, Fotheringham,
and Charlton (1996) and McMillen (1996).
• Geographically and temporally weighted regression
(GTWR) extends the concept of ‘closeness’: data points
close in both the space and time dimensions can have a
greater influence on the estimation of local parameters for
an observation. Crespo et al. (2007), Huang et al. (2010),
Wrenn and Sam (2012), Yu (2013) and Wu et al. (2013).
• Software: R packages spgwr (Bivand and Yu, 2013) and
GWmodel (Lu, Harris, Gollini, Charlton and Brunsdon, 2013).
GWPR: Estimation steps
$h_i$ = Bandwidth for region $i$
$d_{ij}$ = Distance from region $i$ to region $j$

1) Select $h_i$
2) Subsample the data in levels for $i$’s local estimation
3) Weight all the time observations of $j$’s variables in
levels for $i$’s local estimation
4) Apply panel data estimation to the weighted
subsampled data: all the panel data models available
in R’s plm package (Croissant and Millo, 2008)
5) Map significant coefficients (Mennis, 2006)
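Steps 3 and 4 can be sketched in Python for the simplest case: one regressor, regional fixed effects and no time effects. This is an illustration under stated assumptions, not the plm-based procedure of the slides. The sketch assumes the levels are scaled by the square root of the kernel weight; since that weight is constant over time for each region, this is equivalent to weighted least squares on the time-demeaned data. The function names and the toy panel are hypothetical.

```python
from collections import defaultdict

def demean_by_region(values, regions):
    """Within transformation: deviations from each region's time mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, r in zip(values, regions):
        sums[r] += v
        counts[r] += 1
    return [v - sums[r] / counts[r] for v, r in zip(values, regions)]

def local_within_slope(x, y, regions, weights):
    """Weighted fixed effects slope for one target region.

    weights maps each region j to its kernel weight w_ij; regions that are
    missing or have zero weight drop out of the local subsample.
    """
    xd = demean_by_region(x, regions)
    yd = demean_by_region(y, regions)
    num = sum(weights.get(r, 0.0) * a * b for a, b, r in zip(xd, yd, regions))
    den = sum(weights.get(r, 0.0) * a * a for a, r in zip(xd, regions))
    return num / den

# Hypothetical panel: three regions, three periods; the slope is 1 in A and B
# and 3 in C, and the regional effects differ.
regions = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
x = [1.0, 2.0, 3.0] * 3
y = [5.0 + 1.0 * v for v in x[:3]] + [9.0 + 1.0 * v for v in x[3:6]] \
    + [2.0 + 3.0 * v for v in x[6:]]

# Hypothetical kernel weights for two target regions.
slope_at_a = local_within_slope(x, y, regions, {"A": 1.0, "B": 0.5, "C": 0.0})
slope_at_c = local_within_slope(x, y, regions, {"A": 0.0, "B": 0.5, "C": 1.0})
print(slope_at_a, slope_at_c)
```

Repeating the local estimation for every target region gives one coefficient per location, which is what step 5 maps.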
Spatially adaptive weighting function
1) Select $h_i$ such that all regions have the same number of nearest
neighbors ($d_{ij} < h_i$): 15, 70 and 140 nearest neighbors
3) Weight all the time observations of $j$’s variables in levels for
$i$’s local estimation: adaptive bisquare weighting (kernel)
function:
$w_{ij} = [1 - (d_{ij} / h_i)^2]^2$ if $d_{ij} < h_i$, and $w_{ij} = 0$ otherwise,
where $h_i$ = bandwidth for region $i$ and $d_{ij}$ = distance from $i$ to $j$

Source: Fotheringham, Charlton & Brunsdon (2001)
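A short Python sketch of an adaptive bisquare kernel. It assumes the standard bisquare form and one possible convention for the adaptive bandwidth: $h_i$ is set to the distance of the (k+1)-th nearest other region, so that exactly k neighbors satisfy $d_{ij} < h_i$ when there are no ties. Other conventions exist, and the function names and toy distances are hypothetical.

```python
def bisquare(d, h):
    """Bisquare kernel: positive inside the bandwidth, zero outside."""
    return (1.0 - (d / h) ** 2) ** 2 if d < h else 0.0

def adaptive_bisquare_weights(distances, k):
    """Weights for one target region with k nearest neighbors.

    distances : distances d_ij from the target region to every other region
    k         : number of neighbors that should receive a positive weight
    """
    if not 0 < k < len(distances):
        raise ValueError("k must be between 1 and len(distances) - 1")
    # Assumed convention: bandwidth = distance to the (k+1)-th nearest region.
    h = sorted(distances)[k]
    return h, [bisquare(d, h) for d in distances]

# Hypothetical distances from one region to four others.
distances = [1.0, 2.0, 4.0, 8.0]
h, w = adaptive_bisquare_weights(distances, 2)
print(h, w)
```

Because the bandwidth follows the density of regions, every local estimation uses the same number of neighbors, whether the target region lies in a dense or a sparse part of the map.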


GWPR estimates for 15 nearest neighbors
[Maps of local coefficient estimates]
Per capita capital stock: global model 0.178; median of local models 0.161
External Market Potential: global model 0.854; median of local models 0.963
GWPR estimates for 70 nearest neighbors
[Maps of local coefficient estimates]
Per capita capital stock: global model 0.178; median of local models 0.158
External Market Potential: global model 0.854; median of local models 0.927
GWPR estimates for 140 nearest neighbors
[Maps of local coefficient estimates]
Per capita capital stock: global model 0.178; median of local models 0.172
External Market Potential: global model 0.854; median of local models 0.744
Fixed effect of two regions in all local estimations for 70 nearest neighbors
The estimated levels of the individual regional effects
are very sensitive to bandwidth selection

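[Maps of the estimated fixed effect of each region across all local estimations]
Galicia: global model -0.083; local model for Galicia -2.980
Luxembourg: global model 0.183; local model for Luxembourg 0.819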
Conclusions
• The marginal effects estimated from demeaned data must be
interpreted in terms of variations of the variables.
• GWPR fills the gap between GWR and panel data methods. The
method is designed for any econometric technique based on
repeated observations for each location.
• GWPR estimates of Harris’s Market Potential variable can
be considered to capture local differences in regional
spillovers from the income variations of the nearest neighbors.
• GW fixed effects panel estimates of Market Potential (70
nearest neighbors) for Portugal, Spain, the South of France and
the North of Italy are about double the estimate of the
global model.
Further research

• Current developments: next steps
– Improve GWPR diagnostics
– Optimization procedures: bandwidth selection (CV, AIC)
– Random effects models. Test fixed and/or random effects
– Analysis and potential correction of correlations among
local regression coefficients (Wheeler and Tiefelsdorf,
2005; Páez et al., 2011): extending the techniques in the R
packages gwrr and GWmodel to panel data

• Future research
– Local spatial panel data models
– A possible R package for GWPR
COMMENTS VERY WELCOME
THANK YOU

Fernando Bruna (f.bruna@udc.es)


University of A Coruña, Spain
