Andreas Papritz
ETH Zurich
Institute of Terrestrial Ecosystems
Institute of Biogeochemistry and Pollutant Dynamics
Chairs of Soil Physics and Soil Protection

Contents
1 geostatistics: data sets and objectives of analyses
1.1 spatial statistics
1.2 objectives of geostatistical analyses
1.2.1 spatial prediction: Wolfcamp aquifer data set
1.2.2 spatial prediction: Dornach data set
1.2.3 parameter estimation (and prediction): forest soil SOC stock data
1.2.4 parameter estimation and inference: wheat yield experiment

1 geostatistics: data sets and objectives of analyses
• spatial statistics: statistical analysis of data for which the spatial (or
spatio-temporal) position at which each attribute was recorded is known
[Figure: map with legend pollution levels A to F]
[Figure: rate of sudden infant deaths, 1974, North Carolina; axis: longitude]
1. computing spatial predictions
       x                 y               pressure
 Min.   :-233.72   Min.   :-145.79   Min.   : 312.1
 1st Qu.: -34.26   1st Qu.:-106.73   1st Qu.: 471.8
 Median :  19.57   Median : -65.74   Median : 547.7
 Mean   :  27.63   Mean   : -33.23   Mean   : 610.3
 3rd Qu.: 114.10   3rd Qu.:  51.21   3rd Qu.: 774.2
 Max.   : 181.53   Max.   : 136.41   Max.   :1088.4
[Figure: plot with axes x and y]
[Figure: four perspective plots: pressure head data, trend surface prediction, kriging prediction, kriging standard error; axes: x-coordinate, y-coordinate, pressure]
• set of observations $(y_i, x_i)$ where $y_i$ is a datum of a response variable and $x_i$ is a spatial location in a study domain $D$
• optional: spatial covariates, say $d_k(x_i)$, used to “explain” the spatial pattern of the response variable
• geostatistical data often show (gradual) large-scale spatial variation (trend) and small-scale local fluctuations

• model for data: $Y_i = S(x_i) + Z_i$ where
  $Y_i$: $i$th datum
  $S(x_i)$: “signal” (= true quantity) at location $x_i$
  $Z_i$: iid random measurement error
• decomposition of signal into trend $\mu(x_i)$ and stochastic fluctuation $E(x_i)$: $S(x_i) = \mu(x_i) + E(x_i)$
[Figure: two maps of the Dornach study area (Arlesheim, Reinach, Dornach, Aesch) with sampling locations classified by pollution levels A to F; legend entries include guide < Cu ≤ trigger value, trigger < Cu ≤ clean-up value and Cu > clean-up value; investigations as of summer 2003; axes: easting [m], northing [m]]
[Figure: two maps of the Dornach study area (Arlesheim, Reinach, Dornach, Aesch). Left: Cu, 95%-percentiles of predictive distributions of parcel means. Right: Cu, 5%-percentiles of predictive distributions of parcel means. Legend: pollution levels B to F; axes: easting [m], northing [m]]
support of data and predictions
1.2.3 SOC stocks of Swiss forest soils
[Figure: map of soil profile locations (calibration/validation profiles and validation profiles), scale bar 0 to 50 km; further panel labels only partly legible (wetness index, topographic position)]
Data source: soil profiles calibration/validation © 2011 WSL, Soil Science Group; soil profiles validation NABO / KABO © 2012, Agroscope / ALN Zurich / AFU St. Gallen; Forest: SilvaProtect-CH © 2008 BAFU, Giamboni; Lakes: Vector 200 © 2007 swisstopo (DV033492.2); Relief 1:1'000'000: K606-01 © 2004 swisstopo; Swiss Boundary: BFS GEOSTAT, swisstopo
> summary(non.spatial.model)
...
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)               4.025e+00  1.834e-01  21.951  < 2e-16
sqrt(precipitation)       1.065e-02  1.118e-03   9.525  < 2e-16
nir                      -9.666e-03  2.841e-03  -3.402 0.000695
position_index_clay_poor  2.082e-03  8.547e-04   2.436 0.015002
position_index_clay_rich -1.688e-03  4.547e-04  -3.713 0.000216
stock_soil_mass          -2.506e-04  5.601e-05  -4.474 8.53e-06
soil_map_units_B         -3.740e-01  4.938e-02  -7.573 8.18e-14
soil_map_units_C         -4.679e-01  4.580e-02 -10.216  < 2e-16
soil_map_units_D         -2.512e-01  4.373e-02  -5.744 1.22e-08
soil_map_units_E         -1.179e+00  1.258e-01  -9.375  < 2e-16

Residual standard error: 0.4283 on 1012 degrees of freedom
...

> summary(spatial.model)
...
Variogram: exponential
          Estimate
variance     0.096
snugget      0.000
nugget       0.096
scale      329.205

Fixed effects coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)               3.980e+00  1.979e-01  20.113  < 2e-16
sqrt(precipitation)       1.098e-02  1.245e-03   8.824  < 2e-16
nir                      -8.389e-03  2.988e-03  -2.807 0.005090
position_index_clay_poor  2.303e-03  8.739e-04   2.636 0.008525
position_index_clay_rich -1.658e-03  4.806e-04  -3.450 0.000584
stock_soil_mass          -2.667e-04  5.975e-05  -4.463 8.97e-06
soil_map_units_B         -3.712e-01  5.398e-02  -6.877 1.07e-11
soil_map_units_C         -4.715e-01  4.965e-02  -9.495  < 2e-16
soil_map_units_D         -2.531e-01  4.728e-02  -5.353 1.07e-07
soil_map_units_E         -1.142e+00  1.452e-01  -7.866 9.34e-15
[Figure: two panels with one entry per regression coefficient, from (Intercept) to soil_map_units_E]
[Figure: map of SOC stock 0 - 30 cm [t ha^-1], colour scale 0 to 270, scale bar 0 to 50 km. Data source: Prediction of soil organic carbon in forests: own data; Lakes: Vector 200 © 2007 swisstopo (DV033492.2); Relief 1:1'000'000: K606-01 © 2004 swisstopo; Swiss Boundary: BFS GEOSTAT, swisstopo]
[Figure: C stock [t/ha] by region (Jura, Plateau, Pre-Alps, Alps, Southern Alps, CH) and altitude class (<600 m, 600−1200 m, >1200 m)]
spatial regression analyses ↑↓ 41 parameter estimation vs. (spatial) prediction ↑↓ 42
• regression analyses with spatially referenced data • parameter estimation: infering values of parameters of a
stochastic model from data (e.g. coefficients and variance of error
• auto-correlation of errors frequently due to effects of unaccounted
term of a regression model)
covariates (imperfect model)
• prediction: inference about a unobserved values of random vari-
⇒ assumption of independent errors often untenable
ables (e.g. prediction of SOC stocks for a location without meas-
⇒ falsely ignoring auto-correlation biases statistical inference (p- urement);
values usually too small!)
• prediction target sometimes large (national SOC stock estimate:
spatial mean over whole study domain!)
#
S(D) = 1/|D| S(x) dx
D
[Figure: field layout of the wheat yield experiment with 4 blocks; plots labelled by variety number; axes: latitude, longitude]
• more details in Pinheiro and Bates (2000, pp. 260)
> library(nlme)
> # analysis as a randomized complete block experiment
> r.lme.means <- lme(yield~variety-1, Wheat2,
+   random=~1|Block)
> summary(r.lme.means)
Linear mixed-effects model fit by REML
 Data: Wheat2
       AIC      BIC    logLik
  1333.702 1514.891 -608.8508
Random effects:
 Formula: ~1 | Block
        (Intercept) Residual
StdDev:     3.14371 7.041475
Fixed effects: yield ~ variety - 1
                  Value Std.Error  DF  t-value p-value
varietyARAPAHOE 29.4375  3.855687 165 7.634827       0
varietyBRULE    26.0750  3.855687 165 6.762738       0
varietyBUCKSKIN 25.5625  3.855687 165 6.629818       0
...

> # global F-Test: testing for any significant
> # treatment effects
> anova(update(r.lme.means, .~. + 1))
            numDF denDF   F-value p-value
(Intercept)     1   165 242.05402  <.0001
variety        55   165   0.87549  0.7119

> # testing particular treatment contrast
> # (BUCKSKIN vs. ARAPAHOE)
> anova(r.lme.means, L=c(1, 0, -1))
F-test for linear combination(s)
varietyARAPAHOE varietyBUCKSKIN
              1              -1
  numDF denDF   F-value p-value
1     1   165 0.6056841  0.4375
[Figure: map of residuals; axes: latitude, longitude]
[Figure: lagged scatterplots of residuals (res). Along direction S −> N (latitude): lag classes (12.9,17.2], (17.2,21.5], (21.5,25.8] with r = 0.0281, r = 0.192, r = 0.235. Along direction W −> E (longitude): lag classes (3.6,4.8], (4.8,6], (6,7.2] with r = 0.315, r = 0.252, r = 0.269 (values in the order listed in the source)]
> # analysis using a geostatistical spatial model
> r.gls.means <- gls(yield~variety-1, Wheat2,
+   corr=corRatio(form=~latitude+longitude,
+   nugget=TRUE))
> summary(r.gls.means)
Generalized least squares fit by REML
  Model: yield ~ variety
  Data: Wheat2
       AIC      BIC    logLik
  1183.278 1367.592 -532.6389
Correlation Structure: Rational quadratic spatial correlation
 Formula: ~latitude + longitude
 Parameter estimate(s):
     range     nugget
13.4613358  0.1935803
Coefficients:
                   Value Std.Error  t-value p-value
varietyARAPAHOE 26.54597  4.970942 5.340229   0e+00
varietyBRULE    26.28374  4.984883 5.272690   0e+00
varietyBUCKSKIN 35.03727  5.007094 6.997526   0e+00
...

> # global F-Test: testing for any significant
> # treatment effects
> anova(update(r.gls.means, .~.+1))
Denom. DF: 168
            numDF  F-value p-value
(Intercept)     1 30.39940  <.0001
variety        55  1.85094  0.0015

> # testing particular treatment contrast
> # (BUCKSKIN vs. ARAPAHOE)
> anova(r.gls.means, L=c(1, 0, -1))
Denom. DF: 168
 F-test for linear combination(s)
varietyARAPAHOE varietyBUCKSKIN
              1              -1
  numDF F-value p-value
1     1 7.69673  0.0062
> # block design: testing BUCKSKIN vs. ARAPAHOE
> # treatment means
> round(fixef(r.lme.means)[c(1, 3)], 3)
> # block design: (co-)variances treatment means
> round(vcov(r.lme.means)[c(1, 3), c(1, 3)], 3)
> # block design: t-value treatment contrast
> (29.438-25.562) / sqrt(2*(14.866-2.471))

> # spatial model: testing BUCKSKIN vs. ARAPAHOE
> # treatment means
> round(coef(r.gls.means)[c(1, 3)], 3)
> # spatial model: (co-)variances treatment means
> round(vcov(r.gls.means)[c(1, 3), c(1, 3)], 3)
> # spatial model: t-value treatment contrast
> (26.546-35.037) / sqrt(24.710+25.071-2*20.207)
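The two t-values can be recomputed from the rounded treatment means and (co-)variances quoted in the transcript; their squares should reproduce, up to rounding, the F-values of the contrast tests (0.6057 for the block design, 7.6967 for the spatial model). A minimal sketch in base R; the helper name `contrast.t` is an assumption introduced here for illustration only:

```r
# t-value of a contrast between two treatment means a and b:
# t = (a - b) / sqrt(Var(a) + Var(b) - 2 * Cov(a, b))
contrast.t <- function(mean.a, mean.b, var.a, var.b, cov.ab) {
  (mean.a - mean.b) / sqrt(var.a + var.b - 2 * cov.ab)
}

# block design: both variances 14.866, covariance 2.471
t.block <- contrast.t(29.438, 25.562, 14.866, 14.866, 2.471)

# spatial model: variances 24.710 and 25.071, covariance 20.207
t.spatial <- contrast.t(26.546, 35.037, 24.710, 25.071, 20.207)

# squared t-values, to be compared with the F-values of the contrast tests
round(c(block = t.block^2, spatial = t.spatial^2), 3)
```

Under the spatial model the large covariance between the two treatment means reduces the variance of their difference (9.367 instead of 24.79), which, together with the larger difference of the means, is why the same contrast becomes significant.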
• package sp provides more advanced S4 formal classes and methods for analysing spatial data
⇒ suite of classes and methods for SpatialPoints, SpatialLines, SpatialPolygons, SpatialGrid and SpatialPixels objects
⇒ focus here on SpatialPointsDataFrame class and associated methods
• more information in Bivand et al. (2013, chap. 2–3)

> library(sp)
> d.dornach <- read.table("dornach.txt", header=TRUE)
> d.dornach$dist <- with(d.dornach, sqrt(x^2+y^2))
> str(d.dornach)
'data.frame': 181 obs. of 10 variables:
 $ x       : num 562 361.3 1003.5 341.2 40.1 ...
 $ y       : num 241 140 442 -381 -201 ...
 $ survey  : Factor w/ 2 levels "a","b": 2 2 2 2 2 2 2 2 2..
 $ forest  : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 ..
 $ built.up: Factor w/ 3 levels "after.1969","before.1960"..
 $ geology : Factor w/ 4 levels "limestone.a",..: 3 3 3 3 ..
 $ cu      : num 39 51 267 278 1401 ...
 $ cd      : num 0.6 0.48 1.18 1.58 4.04 1.08 2.68 0.57 0..
 $ zn      : num NA NA NA NA NA NA NA NA NA NA ...
 $ dist    : num 611 388 1096 512 205 ...
cu cd zn
Min. : 5.0 Min. : 0.110 Min. : 37.0
1st Qu.: 56.0 1st Qu.: 0.810 1st Qu.: 158.2
Median : 117.0 Median : 1.320 Median : 251.0
Mean : 381.5 Mean : 1.768 Mean : 508.9
3rd Qu.: 398.0 3rd Qu.: 2.100 3rd Qu.: 557.0
Max. :3881.0 Max. :23.300 Max. :4955.0
Coefficients:
(Intercept) dist
6.277739 -0.001738
Call:
lm(formula = log(cu) ~ dist, data = spdf.dornach)
Coefficients:
(Intercept) dist
6.277739 -0.001738
obj="SpatialPointsDataFrame"
from="SpatialGridDataFrame", to="SpatialPointsDataFrame"
from="SpatialLines", to="SpatialPointsDataFrame"
from="SpatialLinesDataFrame", to="SpatialPointsDataFrame"
from="SpatialMultiPointsDataFrame", to="SpatialPointsDataFrame"
from="SpatialPixelsDataFrame", to="SpatialPointsDataFrame"
[Figure: spatial plots of cu and cd]
> library(constrainedKriging)
> str(meuse.blocks, max=2)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] wit..
  ..@ data       :'data.frame': 259 obs. of 2 variables:
  ..@ polygons   :List of 259
  .. .. [list output truncated]
  ..@ plotOrder  : int [1:259] 177 179 180 178 188 182 181..
  ..@ bbox       : num [1:2, 1:2] 178438 329598 181562 333..
  .. ..- attr(*, "dimnames")=List of 2
  ..@ proj4string:Formal class 'CRS' [package "sp"] with 1..
> str(meuse.blocks@data)
> str(meuse.blocks@polygons[[1]], max=2)
⇒ see section 2.6 of Bivand et al. (2013) for details
exploratory analysis
⇓
trend modelling
⇓
modelling residual auto-correlation
⇓
statistical inference
⇓
model assessment by cross-validation
⇓
computing spatial predictions

> library(sp)
> library(geoR)
> library(gstat)
> d.w <- as.data.frame(wolfcamp)
> class(d.w) <- "data.frame"
> colnames(d.w) <- c("x", "y", "pressure")
> coordinates(d.w) <- ~x+y
> summary(d.w)
Object of class SpatialPointsDataFrame
Coordinates:
        min      max
x -233.7217 181.5314
y -145.7884 136.4061
Is projected: NA
proj4string : [NA]
Number of points: 85
Data attributes:
    pressure
 Min.   : 312.1
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 607.77066 7.52219 80.80 <2e-16
x -1.27844 0.06552 -19.51 <2e-16
y -1.13874 0.07739 -14.71 <2e-16
> op <- par(mfrow=c(2, 2)); plot(r.lm.1); par(op)
[Figure: diagnostic plots of r.lm.1: Residuals vs Fitted, Normal Q−Q, Scale−Location, Residuals vs Leverage]

> op <- par(mfrow=c(1, 2))
> scatter.smooth(d.w$x, residuals(r.lm.1), xlab="x",
+   main="residuals vs x")
> scatter.smooth(d.w$y, residuals(r.lm.1), xlab="y",
+   main="residuals vs y")
> par(op)
[Figure: residuals(r.lm.1) plotted against x (residuals vs x) and against y (residuals vs y)]
> r.lm.2 <- update(r.lm.1, .~.+I(x^2)+I(y^2)+x:y)
> summary(r.lm.2)
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  6.203e+02  1.295e+01  47.902  < 2e-16
x           -1.075e+00  8.191e-02 -13.128  < 2e-16
y           -1.330e+00  8.861e-02 -15.008  < 2e-16
I(x^2)       8.994e-05  5.908e-04   0.152 0.879388
I(y^2)      -2.929e-03  1.101e-03  -2.659 0.009486
x:y          3.184e-03  8.790e-04   3.622 0.000515
> anova(r.lm.1, r.lm.2)
[Figure: perspective plot of pressure over x-coordinate and y-coordinate and map with symbols for values <0 and >0; axes: x, y]
3.3 estimating and modelling auto-correlation
[Figure: time series d.ar (axis: Time) and scatter plot of y(x+1) against y(x)]

auto-correlation: lag-scatter plots
[Figure: lag-scatter plots of y(x+1), y(x+2), y(x+3) and y(x+5) against y(x); correlations 0.82 (lag 1) and 0.75 (lag 2)]
• sample auto-correlation at lag $h$
$$\hat\rho(h) = \frac{\sum_{i=1}^{n-h} (y_{i+h} - \bar y)(y_i - \bar y)}{\sum_{i=1}^{n} (y_i - \bar y)^2}$$

pro memoria: sample covariance and correlation
• data: measurements $(y_{1,i}, y_{2,i})$, $i = 1, 2, \ldots, n$ about 2 response variables
• sample covariance
$$s_{1,2} = \frac{1}{n-1} \sum_{i=1}^{n} (y_{1,i} - \bar y_1)(y_{2,i} - \bar y_2)$$
where $\bar y_1$ and $\bar y_2$ are the (arithmetic) sample means
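The sample auto-correlation $\hat\rho(h)$ (sum of lagged cross-products of the centred series over its total sum of squares) can be coded directly and compared with `acf()` from base R, which uses the same estimator. A minimal sketch; the function name `acf.manual` and the data values are made up for illustration:

```r
# sample auto-correlation at lag h: sum over i = 1, ..., n - h of
# (y[i + h] - ybar) * (y[i] - ybar), divided by the total sum of squares
acf.manual <- function(y, h) {
  n <- length(y)
  ybar <- mean(y)
  num <- sum((y[(1 + h):n] - ybar) * (y[1:(n - h)] - ybar))
  den <- sum((y - ybar)^2)
  num / den
}

# short illustrative series (made-up numbers)
y <- c(2.1, 2.8, 3.0, 2.2, 1.5, 1.1, 1.9, 2.7, 3.3, 2.9)

rho.manual <- sapply(0:3, function(h) acf.manual(y, h))

# stats::acf() returns the same estimates (first element is lag 0)
rho.acf <- as.numeric(acf(y, lag.max = 3, plot = FALSE)$acf)

cbind(lag = 0:3, rho.manual, rho.acf)
```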
pro memoria: sample covariance and correlation
[Figure: scatter plot of y(x+1) against y(x)]

auto-correlation: correlogram of time series
[Figure: auto-correlogram of an AR(1) process; axes: Lag, ACF]
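defining lags for irregular sampling grids
[Figure: sketch of a lag class $h_{kl}$ between locations $x_i$ and $x_j$, with distance tolerance $dh$ and angular tolerance $d\varphi_l$ around direction $\varphi_l$]

lag-scatter plots of trend surface residuals
[Figure: lagged scatterplots of residuals res.1 for lag classes (0,20], (20,40], (40,60], (60,80]; correlations listed in the source: r = 0.272, r = 0.0436, r = 0.333, r = 0.406]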
auto-correlation: (co-)variogram of spatial data
• $(k, l)$th lag class, $h_{kl}$, characterized by distance, $(h_k - dh, h_k + dh]$, and direction, $(\varphi_l - d\varphi_l, \varphi_l + d\varphi_l]$; $N_{kl}$ is the number of pairs $(i, j)$ in the lag class
$$\hat\gamma(h_{kl}) = \frac{1}{N_{kl}} \sum_{(i,j) \in h_{kl}} [y(x_i) - \bar y]\,[y(x_j) - \bar y]$$
⇒ covariogram
$$\hat V(h_{kl}) = \frac{1}{2 N_{kl}} \sum_{(i,j) \in h_{kl}} [y(x_i) - y(x_j)]^2$$
⇒ (semi-)variogram

auto-correlation: semi-variance
[Figure: lag-scatter plot of y(x+1) against y(x) with the difference y(x_{i+1}) − y(x_i) marked]
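The (semi-)variogram estimator can be sketched in a few lines of base R. The sketch below is isotropic, i.e. it forms lag classes from distance only and ignores direction, which is a simplifying assumption; the function name `sample.variogram` and the data are made up for illustration and are not the routines used in the course:

```r
# isotropic sample (semi-)variogram: for each distance class, half the mean
# squared difference over all pairs of observations whose separation
# distance falls into that class
sample.variogram <- function(coords, z, breaks) {
  d <- as.matrix(dist(coords))                   # pairwise distances
  sq <- outer(z, z, function(a, b) (a - b)^2)    # squared differences
  sel <- upper.tri(d)                            # use each pair once
  lag.class <- cut(d[sel], breaks = breaks)      # classes (lower, upper]
  data.frame(
    lag.class = levels(lag.class),
    npairs = as.numeric(table(lag.class)),
    semivar = as.numeric(tapply(sq[sel], lag.class, mean)) / 2
  )
}

# made-up example: five observations on a transect with unit spacing
coords <- cbind(x = c(0, 1, 2, 3, 4), y = 0)
z <- c(1, 2, 4, 3, 5)
sv <- sample.variogram(coords, z, breaks = c(0, 1, 2))
sv
```

Pairs separated by more than the largest break (here distances 3 and 4) are dropped, which mirrors the usual practice of restricting the sample variogram to shorter lags.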
[Figure: sample variograms (semivariance against lag distance); points annotated with numbers of pairs; nugget and range (scale) marked]
> summary(r.georob.1)
...
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 607.77066    7.52219   80.80   <2e-16
x            -1.27844    0.06552  -19.51   <2e-16
y            -1.13874    0.07739  -14.71   <2e-16

Residual standard error: 62.29 on 82 degrees of freedom
Multiple R-squared: 0.8909, Adjusted R-squared: 0.8882
F-statistic: 334.8 on 2 and 82 DF, p-value: < 2.2e-16
[Figure: sample variogram with fitted variogram models (legend: ML estimate, fit of sample variogram); axes: lag distance, semivariance]
> op <- par(mfrow=c(1,2))
> plot(fitted(r.georob.1), residuals(r.georob.1),
+   main="Tukey-Anscombe plot")
> qqnorm(rstandard(r.georob.1, level=0),
+   main="QQnorm regression residuals")
> par(op)
[Figure: Tukey-Anscombe plot of residuals(r.georob.1) and normal QQ plot of regression residuals (Sample Quantiles)]

3.5 inference, model building and assessment
• data analysis often leads to a set of equally plausible candidate models that use different sets of covariates and different variograms
[...] “best” model, again taking auto-correlation into account
⇒ use cross-validation to compare the power of candidate models to [...]
Df AIC Converged
- I(x^2) 1 922.05 1
- I(y^2) 1 922.13 1
<none> 922.16
- x:y 1 922.49 1
Step: AIC=922.05
pressure ~ x + y + I(y^2) + x:y
Df AIC Converged
<none> 922.05
- I(y^2) 1 922.54 1
- x:y 1 924.61 1
> summary(r.cv.1)
Statistics of cross-validation prediction errors
      me     mede    rmse    made     qne   msse medsse    crps
-11.5675 -14.5255 59.9980 56.5949 60.3714 0.8451 0.3500 33.9145
[Figure: cross-validation subsets 1 to 10 shown at the data locations; axis: y]

> summary(r.cv.2)
Statistics of cross-validation prediction errors
[Figure: cross-validation diagnostics: data against predictions; standardized prediction errors against predictions; normal-QQ-plot of standardized prediction errors; histogram PIT-values]
• kriging prediction at a location $x_0$ is a weighted sum of the data, $\sum_{i=1}^{n} \kappa_i(x_0)\, y_i$, where weights $\kappa_i(x_0)$ depend on trend model and variogram
• kriging provides in addition an estimate of the variance of the prediction error
[Figure: two maps with colour scales 200 to 1000 and 20 to 70]
• weakly stationary process that is invariant to rotations
$$\mathrm{Cov}[S(x), S(x + \mathbf{h})] = \gamma(h) \quad \text{with} \quad h = \lVert \mathbf{h} \rVert = \sqrt{\sum_{i=1}^{d} h_i^2}$$
⇒ unless stated otherwise, only isotropic and weakly stationary processes are considered

• all finite-dimensional joint distributions are multivariate normal
$$F(s_1, \ldots, s_n; x_1, \ldots, x_n) \sim N(\mu, \Sigma)$$
with mean vector $\mu^T = (\mu_1, \ldots, \mu_n)$; $\mu_i = E[S(x_i)]$
and covariance matrix with elements $[\Sigma]_{ij} = \mathrm{Cov}[S(x_i), S(x_j)]$
• joint multivariate normal density ($s^T = (s_1, \ldots, s_n)$)
$$f(s) = (2\pi)^{-n/2} \det(\Sigma)^{-1/2} \exp\left(-\tfrac{1}{2} (s - \mu)^T \Sigma^{-1} (s - \mu)\right)$$
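The joint normal density can be evaluated directly from its definition, $f(s) = (2\pi)^{-n/2} \det(\Sigma)^{-1/2} \exp(-\tfrac{1}{2}(s-\mu)^T \Sigma^{-1} (s-\mu))$. A minimal base R sketch; the function name `dmvn`, the locations and the exponential covariance function with range 2 are assumptions chosen for illustration:

```r
# density of a multivariate normal distribution N(mu, Sigma) at s
dmvn <- function(s, mu, Sigma) {
  n <- length(s)
  r <- s - mu
  q <- sum(r * solve(Sigma, r))      # (s - mu)^T Sigma^{-1} (s - mu)
  (2 * pi)^(-n / 2) * det(Sigma)^(-1 / 2) * exp(-q / 2)
}

# covariance matrix built from an exponential covariance function
x <- c(0, 1, 3)                          # locations on a line
Sigma <- exp(-as.matrix(dist(x)) / 2)    # exp(-h / 2) for all pairs
dens <- dmvn(s = c(0.3, -0.1, 0.8), mu = rep(0, 3), Sigma = Sigma)
dens

# check: for a diagonal Sigma the density is a product of univariate normals
d.joint <- dmvn(c(0.3, -0.1), mu = c(0, 0), Sigma = diag(c(1, 4)))
d.prod <- prod(dnorm(c(0.3, -0.1), mean = 0, sd = c(1, 2)))
c(d.joint, d.prod)
```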
• definition of variogram $V(h)$ and covariance function $\gamma(h)$
$$V(h) = \tfrac{1}{2} \mathrm{Var}[S(x + h) - S(x)] \qquad \gamma(h) = \mathrm{Cov}[S(x + h), S(x)]$$
• relation between variogram and covariance function
$$V(h) = \gamma(0) - \gamma(h) \quad \text{with} \quad \gamma(0) = \mathrm{Var}[S(x)]$$
• relation between correlogram and covariance function
$$\rho(h) = \frac{\gamma(h)}{\gamma(0)}$$
• relation between variogram and correlogram
$$V(h) = \gamma(0)\,(1 - \rho(h))$$
• symmetry
[Figure: covariance function, correlogram and variogram (co- or semivariance against distance), with sill = γ(0) and range marked]
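The relation between variogram and covariance function follows in two steps from the definitions, assuming weak stationarity so that $\mathrm{Var}[S(x+h)] = \mathrm{Var}[S(x)] = \gamma(0)$:

```latex
\begin{align*}
V(h) &= \tfrac{1}{2}\,\mathrm{Var}[S(x+h) - S(x)] \\
     &= \tfrac{1}{2}\bigl(\mathrm{Var}[S(x+h)] + \mathrm{Var}[S(x)]
        - 2\,\mathrm{Cov}[S(x+h), S(x)]\bigr) \\
     &= \tfrac{1}{2}\bigl(2\,\gamma(0) - 2\,\gamma(h)\bigr)
      = \gamma(0) - \gamma(h)
      = \gamma(0)\,\bigl(1 - \rho(h)\bigr)
\end{align*}
```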
covariance functions must be positive definite
• consider weighted sum $\sum_{i=1}^{n} a_i S(x_i) = a^T S$ of arbitrary set of random variables $S^T = (S(x_1), \ldots, S(x_n))$ with $a^T = (a_1, \ldots, a_n)$ arbitrary real weights
• the variance of this sum cannot be negative: $\mathrm{Var}[a^T S] = a^T \Sigma a \ge 0$ with $[\Sigma]_{ij} = \mathrm{Cov}[S(x_i), S(x_j)] = \gamma(x_i - x_j)$
⇒ $\gamma(h)$ must be a positive definite function (more exactly: non-negative definite)

4.5 smoothness of stochastic processes
• $\{S(x)\}$ is mean square continuous if
$$E\left[(S(x + h) - S(x))^2\right] \to 0 \quad \text{as} \quad h \to 0$$
• $\{S(x)\}$ is mean square differentiable if $\gamma(h)$ (and $V(h)$) is twice differentiable at $h = 0$
• higher order mean square derivatives analogously defined
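Positive definiteness can be checked numerically for a given set of locations: all eigenvalues of the covariance matrix must be non-negative, otherwise some weighted sum would have a negative "variance". The sketch below contrasts the exponential covariance function with a "boxcar" function that is not a valid covariance function; both functions, the locations and the weights are made-up illustrations:

```r
# matrix with elements cov.fun(|x_i - x_j|) for locations x on a line
cov.matrix <- function(x, cov.fun) cov.fun(as.matrix(dist(x)))

x <- c(0, 1, 2)

# exponential covariance function: a valid (positive definite) model
cov.exp <- function(h) exp(-h)
# "boxcar" function: 1 up to distance 1.5, 0 beyond; NOT positive definite
cov.box <- function(h) ifelse(h < 1.5, 1, 0)

ev.exp <- min(eigen(cov.matrix(x, cov.exp), symmetric = TRUE)$values)
ev.box <- min(eigen(cov.matrix(x, cov.box), symmetric = TRUE)$values)
c(exponential = ev.exp, boxcar = ev.box)  # smallest eigenvalues

# weights a for which a^T Sigma a is negative for the boxcar function
a <- c(1, -sqrt(2), 1)
q.box <- drop(t(a) %*% cov.matrix(x, cov.box) %*% a)
q.box
```

A single configuration with a negative eigenvalue is enough to rule a function out, whereas validity of a model such as the exponential one has to be established for all configurations by theory, not by such checks.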
smoothness of stochastic processes
• mean square continuity and differentiability in general not sufficient for continuity and differentiability of single realizations $s(x)$ of $\{S(x)\}$
• only for Gaussian processes are the conditions for mean square continuity/differentiability and for continuity and differentiability of realizations equivalent
• if $V(h)$ is $2m$ times differentiable at $h = 0$ then the realizations of the associated Gaussian process have derivatives up to order $m$
• example: Taylor series of exponential and Gaussian variograms
  exponential: $1 - \exp(-h) = h - h^2/2 + h^3/6 - h^4/24 + \ldots$
  Gaussian: $1 - \exp(-h^2) = h^2 - h^4/2 + h^6/6 - h^8/24 + \ldots$
⇒ $V(h)$ not differentiable at $h = 0$ for exponential but infinitely many times for Gaussian variogram

4.6 examples of isotropic covariance functions
preliminary remark: all models can be used for variograms as well by the relation $V(h) = \gamma(0) - \gamma(h)$; ⇒ in the sequel $\gamma(0) = 1$

nugget effect covariance: models absence of auto-correlation
$$\gamma(h) = \begin{cases} 1 & \text{if } h = 0 \\ 0 & \text{otherwise} \end{cases} \qquad V(h) = \begin{cases} 0 & \text{if } h = 0 \\ 1 & \text{otherwise} \end{cases}$$
⇒ $\{S(x)\}$ spatial white noise process ($p(u)$ = constant)
• mechanism: measurement error and small-scale spatial variation
• valid for all dimensions $d$ of study region
• Gaussian $\{S(x)\}$ with nugget effect covariance function has non-continuous realizations
• see ?RMnugget (package RandomFields)
[Figure: covariance function and variogram of the nugget effect model (against lag h) and a simulated realization s(x_1) against x_1]
> x2 <- seq(0, 7.5, length=151)
> RFoptions(spConform=TRUE)
> plot(RFsimulate(RMnugget(var=1), x=x1, y=x2, grid=TRUE))
[Figure: simulated realization of the nugget effect model on the grid]

examples of isotropic covariance functions
Whittle-Matérn covariance
$$\gamma(h) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\sqrt{2\nu}\,\frac{h}{\alpha}\right)^{\nu} K_\nu\!\left(\sqrt{2\nu}\,\frac{h}{\alpha}\right)$$
where $\alpha > 0$ is the range (scale) and $\nu > 0$ the smoothness parameter, $K_\nu$ is the modified Bessel function of order $\nu$
• special cases
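One well-known special case can be verified numerically: for $\nu = 1/2$ the Whittle-Matérn model reduces to the exponential model $\exp(-h/\alpha)$. The sketch below assumes the parametrization $\gamma(h) = 2^{1-\nu}/\Gamma(\nu)\, u^\nu K_\nu(u)$ with $u = \sqrt{2\nu}\, h/\alpha$ and uses only base R (`besselK`); the function name `matern` is introduced here for illustration and is not part of RandomFields:

```r
# Whittle-Matern correlation function with range alpha and smoothness nu:
# 2^(1 - nu) / Gamma(nu) * u^nu * K_nu(u), where u = sqrt(2 * nu) * h / alpha
matern <- function(h, alpha, nu) {
  u <- sqrt(2 * nu) * h / alpha
  res <- 2^(1 - nu) / gamma(nu) * u^nu * besselK(u, nu)
  res[h == 0] <- 1     # limit for h -> 0 (besselK is infinite at 0)
  res
}

h <- seq(0, 10, by = 0.5)

# nu = 0.5 reproduces the exponential model exp(-h / alpha)
all.equal(matern(h, alpha = 3, nu = 0.5), exp(-h / 3))

# larger nu gives flatter behaviour close to the origin
round(cbind(h, nu0.5 = matern(h, 3, 0.5), nu1.5 = matern(h, 3, 1.5),
            nu2.5 = matern(h, 3, 2.5))[1:5, ], 3)
```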
> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMmatern(var=1, scale=3, nu=0.5),
+   x=x1), type="l")
[Figure: Whittle-Matérn covariance functions and variograms for smoothness ν = 1/2, ν = 3/2, ν = 5/2 (against lag h) and simulated realizations s(x_1) against x_1]
> RFoptions(spConform=TRUE)
> plot(RFsimulate(RMmatern(var=1, scale=3,
+   nu=0.5), x=x1, y=x2, grid=TRUE))

> RFoptions(spConform=TRUE)
> plot(RFsimulate(RMmatern(var=1, scale=2.7,
+   nu=1.5), x=x1, y=x2, grid=TRUE))
[Figure: simulated realizations of the two Whittle-Matérn models on the grid; axes: x_1, x_2]

stable covariance
$$\gamma(h) = \exp(-(h/\alpha)^{\nu})$$
where $\alpha > 0$ is the range (scale) and $0 < \nu \le 2$ the smoothness parameter
[Figure: simulated realization on the grid; axes: x_1, x_2]
> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMstable(var=1, scale=1.6, alpha=0.7),
+   x=x1), type="l")
> lines(x1, RFsimulate(RMstable(var=1, scale=2.5, alpha=1),
+   x=x1), col=2)
> lines(x1, RFsimulate(RMstable(var=1, scale=4.3, alpha=2),
+   x=x1), col=3)
[Figure: stable covariance functions and variograms for smoothness ν = 0.7, ν = 1, ν = 2 (argument alpha of RMstable; co- or semivariance against lag h) and simulated realizations s(x_1) against x_1]
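• generated by computing average of spatial white noise within a moving ball with radius $\alpha/2$ in $R^d$
• models for $d \le 3$
  $d = 1$ (triangle, tent):
  $$\gamma(h) = \begin{cases} 1 - \dfrac{h}{\alpha} & \text{if } h \le \alpha \\ 0 & \text{otherwise} \end{cases}$$
  $d = 2$ (circular):
  $$\gamma(h) = \begin{cases} \dfrac{2}{\pi} \left( \arccos\left(\dfrac{h}{\alpha}\right) - \dfrac{h}{\alpha} \sqrt{1 - \dfrac{h^2}{\alpha^2}} \right) & \text{if } h \le \alpha \\ 0 & \text{otherwise} \end{cases}$$
  $d = 3$ (spherical):
  $$\gamma(h) = \begin{cases} 1 - \dfrac{3h}{2\alpha} + \dfrac{h^3}{2\alpha^3} & \text{if } h \le \alpha \\ 0 & \text{otherwise} \end{cases}$$
• “moving average” covariance functions exist also for $d > 3$
• see ?RMtent, ?RMcircular, ?RMspheric (package RandomFields)
[Figure: triangle (tent), circular and spherical models; co- or semivariance against lag h]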
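The three "moving average" models (tent for d = 1, circular for d = 2, spherical for d = 3) are easy to code from their piecewise definitions. A minimal base R sketch with $\gamma(0) = 1$; the function names `tent`, `circular` and `spherical` are introduced here for illustration and are not the RandomFields implementations:

```r
# "moving average" correlation functions with range alpha (value 1 at h = 0,
# value 0 for h >= alpha)
tent <- function(h, alpha) ifelse(h <= alpha, 1 - h / alpha, 0)

circular <- function(h, alpha) {
  u <- pmin(h / alpha, 1)          # avoids NaN from acos() beyond the range
  ifelse(h <= alpha, 2 / pi * (acos(u) - u * sqrt(1 - u^2)), 0)
}

spherical <- function(h, alpha) {
  ifelse(h <= alpha, 1 - 3 * h / (2 * alpha) + h^3 / (2 * alpha^3), 0)
}

# evaluate at the origin, at half the range, at the range and beyond
h <- c(0, 2.5, 5, 7.5)
rbind(tent = tent(h, 5), circular = circular(h, 5),
      spherical = spherical(h, 5))
```

All three functions reach exactly zero at $h = \alpha$, so observations further apart than the range are uncorrelated.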
> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMtent(var=1, scale=5), x=x1),
+   type="l")
> lines(x1, RFsimulate(RMcircular(var=1, scale=5), x=x1),
+   col=2)
> lines(x1, RFsimulate(RMspheric(var=1, scale=5), x=x1),
+   col=3)
[Figure: simulated realizations against x_1]

examples of isotropic covariance functions
compact support covariance functions with differentiable Gaussian realizations
• cubic covariance: see ?RMcubic (package RandomFields); valid for $d \le 3$; twice differentiable
> RFoptions(spConform=FALSE)
> plot(x1, RFsimulate(RMcubic(var=1, scale=5),
+   x=x1), type="l")
[...]
+   x=x1), col=3)
[Figure: cubic, penta and Gneiting covariance functions (co- or semivariance against lag h) and simulated realizations s(x_1) against x_1]
4.7 anisotropic covariance functions
• for an anisotropic covariance function in general
$$\gamma(\mathbf{h}) \ne \gamma(\lVert \mathbf{h} \rVert)$$
• particular application: covariance functions for weakly stationary space-time data $s(x_i, t_j)$ (e.g. Gneiting and Guttorp, 2010b)
$$\mathrm{Cov}[S(x + h, t + u), S(x, t)] = \gamma(h, u)$$
• valid weakly stationary covariance functions for such zonally anisotropic stochastic processes difficult to construct in a general manner
• approaches
  1. geometrically anisotropic covariance function
  2. product-sum covariance function

geometrically anisotropic covariance function
• idea: rotate and stretch/shrink components of $x$ such that the stochastic process is isotropic in the transformed coordinate system
$$\gamma(h^*) = \gamma\left(\sqrt{(A\mathbf{h})^T A\mathbf{h}}\right)$$
⇒ iso-covariance contours are ellipsoids in space of untransformed coordinates and are mapped to unit sphere in $R^d$ by transformation
• example geometrically anisotropic covariance in $R^2$
$$A = \begin{pmatrix} 1/\alpha & 0 \\ 0 & 1/(f_1 \alpha) \end{pmatrix} \begin{pmatrix} \cos(\omega) & \sin(\omega) \\ -\sin(\omega) & \cos(\omega) \end{pmatrix}, \qquad 0 < f_1 \le 1$$
[Figure: elliptical iso-covariance contour of $\gamma(h_1, h_2)$ with semi-axes $\alpha$ and $f_1 \alpha$, rotated by angle $\omega$; axes: $h_1$, $h_2$]
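The effect of the transformation matrix $A$ (rotation by $\omega$ followed by scaling with $1/\alpha$ and $1/(f_1\alpha)$) can be checked numerically: lags of length $\alpha$ along the major axis and of length $f_1 \alpha$ along the minor axis of the ellipse should both be mapped to transformed distance 1. A minimal base R sketch; the function names and parameter values are made up for illustration:

```r
# transformation matrix A for geometric anisotropy in R^2:
# rotation by omega followed by scaling with 1/alpha and 1/(f1 * alpha)
aniso.matrix <- function(alpha, f1, omega) {
  scaling <- diag(c(1 / alpha, 1 / (f1 * alpha)))
  # matrix() fills column-wise: rows are (cos, sin) and (-sin, cos)
  rotation <- matrix(c(cos(omega), -sin(omega),
                       sin(omega),  cos(omega)), nrow = 2)
  scaling %*% rotation
}

# transformed lag distance h* = sqrt((A h)^T (A h))
trans.dist <- function(h, A) sqrt(sum((A %*% h)^2))

alpha <- 4; f1 <- 0.5; omega <- pi / 6    # made-up parameter values
A <- aniso.matrix(alpha, f1, omega)

# lag of length alpha along the major axis of the ellipse
h.major <- alpha * c(cos(omega), sin(omega))
# lag of length f1 * alpha along the minor axis
h.minor <- f1 * alpha * c(-sin(omega), cos(omega))

d.star <- c(major = trans.dist(h.major, A), minor = trans.dist(h.minor, A))
d.star

# any isotropic model can then be evaluated at the transformed distance,
# e.g. the exponential correlation for the lag (1, 2)
exp(-trans.dist(c(1, 2), A))
```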
> (r.exp <- fit.variogram.model(r.sv,
+   variogram.model="RMexp",
+   param=c(variance=1500, nugget=10, scale=4 ),
+   aniso=c(f1=0.4, f2=1, omega=0, phi=90, zeta=0),
+   fit.aniso=c(f1=TRUE, f2=FALSE, omega=FALSE, phi=FALSE,
+   zeta=FALSE), min.npairs=1))
Variogram: RMexp
  variance snugget(fixed)    nugget     scale
 1.861e+03      0.000e+00 7.240e-05 3.176e+00
     f1 f2(fixed) omega(fixed) phi(fixed) zeta(fixed)
 0.3151    1.0000       0.0000    90.0000      0.0000
> lines(r.exp, xy.angle=c(0, 90))
[Figure: directional sample variograms for xy.angle (−45,45] and (45,135], without and with the fitted anisotropic model; axes: lag distance, semivariance]

temporal change of soil water storage in a forest (Jost et al., 2005)
zonal space-time sample variogram
[Figure: space-time sample variogram; axes: space lag [m], time lag [days], semivariance [mm2]]

product-sum covariance model fitted to space-time sample variogram
$$\gamma(h, u) = a_0\, \gamma_x(h)\, \gamma_t(u) + a_1\, \gamma_x(h) + a_2\, \gamma_t(u)$$
[Figure: fitted product-sum model; semivariance against space lag [m] for several time lags]
• Gaussian stochastic process: all joint and conditional distributions are normal
3. variogram grows at least quadratically at origin: realizations everywhere at least once differentiable

summary section 4
• geometrically anisotropic auto-correlation:
  1. iso-semivariance surfaces are ellipsoids
  2. variogram has same sill and nugget for all directions but direction-dependent range
  3. modelled by linear transformation of spatial coordinates (stretching and rotation)
• zonal anisotropy:
  1. also nugget and sill depend on direction
  2. modelling more demanding

References
Chilès, J.-P. and Delfiner, P. (1999). Geostatistics: Modeling Spatial Uncertainty. John Wiley & Sons, New York.
Diggle, P. J. and Ribeiro, Jr., P. J. (2007). Model-based Geostatistics. Springer, New York.
Gneiting, T. and Guttorp, P. (2010a). Continuous parameter stochastic process theory. In A. E. Gelfand, P. J. Diggle, M. Fuentes, and P. Guttorp, editors, Handbook of Spatial Statistics, pages 17–28. CRC Press.
Gneiting, T. and Guttorp, P. (2010b). Continuous parameter spatio-temporal processes. In A. E. Gelfand, P. J. Diggle, M. Fuentes, and P. Guttorp, editors, Handbook of Spatial Statistics, pages 427–436. CRC Press.
Jost et al. (2005). [...] space-time distribution of soil water storage of a forest ecosystem using spatio-temporal kriging. Geoderma, 128(3–4), 258–273.

5 ad-hoc estimation of parameters of model for spatial data
pro memoria: model for Gaussian spatial data
• model for data: $Y_i = S(x_i) + Z_i = \mu(x_i) + E(x_i) + Z_i$ where
$Y_i$: $i$th datum
$S(x_i)$: “signal” (= true quantity) at location $x_i$
$\mu(x_i)$: trend
$\{E(x_i)\}$: a zero mean Gaussian process, parametrized by covariance function $\gamma(h; \theta)$ or variogram $V(h; \theta)$
$Z_i$: iid Gaussian measurement error with variance $\tau^2$
• trend $\mu(x_i)$ modelled by linear regression model
$$\mu(x_i) = \sum_k d_k(x_i)\,\beta_k = d(x_i)^T \beta$$
with $d_k(x_i)$ denoting (spatial) covariates
• unknown elements of model:
1. structure and parameters $\beta$ of trend model
2. covariance (or variogram) parameters $\theta$
3. nugget variance $\tau^2$
pro memoria: steps of a geostatistical analysis
exploratory analysis
⇓
trend estimation by linear regression analysis
⇒ estimate of trend parameters $\beta$
⇓
modelling residual auto-correlation
computing sample variogram of residuals; fitting model function to it
⇒ estimate of variogram parameters $\theta$ and of nugget variance $\tau^2$
⇓
statistical inference
⇓
model assessment by cross-validation
⇓
computing spatial predictions
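5.1 ordinary least squares trend estimation
$$\hat{\beta}_{\mathrm{OLS}} = (X^T X)^{-1} X^T Y$$
generalized least squares trend estimation
• GLS = OLS with “orthogonalized” data $\tilde{Y} = L^{-1} Y$ and design matrix $\tilde{X} = L^{-1} X$, where $L$ is the lower Cholesky factor of the covariance matrix of $Y$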
example: trend estimation Wolfcamp data
customary OLS fit (ignoring auto-correlation)
standard errors se.ols.1:
(Intercept) x y
7.52218651 0.06552440 0.07738785
> # t-values
> coef(r.ols) / se.ols.1
standard errors se.ols.2:
(Intercept) x y
22.5365079 0.1950825 0.2411611
> # t-values
> coef(r.ols) / se.ols.2
(Intercept) x y
26.968271 -6.553340 -4.721909
example: trend estimation Wolfcamp data
generalized least squares estimate and standard errors
> L.inv <- solve(t(chol(Gamma)))
> y.tilde <- L.inv %*% d.w$pressure
> X.tilde <- L.inv %*% X
> r.gls <- lm(y.tilde~X.tilde-1)
> rbind(ols=coef(r.ols), gls=coef(r.gls))
(Intercept) x y
ols 607.7707 -1.278442 -1.138741
gls 624.3471 -1.329099 -1.180178
> rbind(se.ols.1=se.ols.1, se.ols.2=se.ols.2,
+ se.gls=sqrt(diag(vcov(r.gls))))
(Intercept) x y
se.ols.1 7.522187 0.0655244 0.07738785
se.ols.2 22.536508 0.1950825 0.24116113
se.gls 20.812750 0.1612431 0.21161369
5.2 computing sample variogram of residuals
• extract residuals $\hat{R} = Y - X\hat{\beta}$ of fitted linear model (or use data $Y$ if model has constant $\mu(x)$)
• choose bin width $dh$ (and width of angular classes $d\varphi$) to define $(k, l)$th lag class, $h_{kl}$, characterized by distance, $(h_k - dh, h_k + dh]$ (and angular class, $(\varphi_l - d\varphi, \varphi_l + d\varphi]$)
[figure: sketch of lag class $h_{kl}$ around location $x_i$ with distance tolerance $dh$ and angular tolerance $d\varphi_l$; points $x_j$, $x_k$]
computing sample variogram of residuals
• form all $N_{kl}$ pairs $(i, j)$ with $x_i - x_j \approx h_{kl}$ and compute for each lag class $h_{kl}$ the semivariance
$$\hat{V}(h_{kl}) = \frac{1}{2 N_{kl}} \sum_{(i,j) \in h_{kl}} [R(x_i) - R(x_j)]^2$$
• sample variogram plot of $\hat{V}(h_{kl})$ vs. $h_{kl}$
example: sample variogram Wolfcamp data
> library(georob)
> r.sv.5 <- sample.variogram(residuals(r.ols),
+ locations=coordinates(d.w), lag.dist.def=5,
+ max.lag=200, estimator="matheron")
> plot(r.sv.5, main="lag class width=5 km")
> text(gamma~lag.dist, r.sv.5, labels=npairs, pos=3)
> ...
• rules of thumb:
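The method-of-moments (Matheron) estimator defined above can be written out in base R. This is a minimal isotropic sketch, not the georob implementation: `matheron_variogram()` is a hypothetical helper, lag classes are taken as consecutive intervals of width `lag_width` (so `lag_width` corresponds to 2 dh), and angular classes are ignored.

```r
# Hypothetical helper: isotropic Matheron estimator with bins ((k-1)w, kw]
matheron_variogram <- function(res, coords, lag_width, max_lag) {
  d    <- as.matrix(dist(coords))                      # pairwise distances
  sq   <- outer(res, res, function(a, b) (a - b)^2)    # squared differences
  keep <- upper.tri(d) & d <= max_lag                  # use each pair once
  bin  <- ceiling(d[keep] / lag_width)                 # lag class index k
  data.frame(
    lag.dist = as.numeric(tapply(d[keep], bin, mean)), # mean distance per class
    gamma    = as.numeric(tapply(sq[keep], bin, mean)) / 2,
    npairs   = as.numeric(tapply(sq[keep], bin, length))
  )
}

# toy data: four locations on a line with alternating residuals
coords <- cbind(1:4, 0)
res    <- c(0, 1, 0, 1)
sv     <- matheron_variogram(res, coords, lag_width = 1, max_lag = 3)
```

In the toy example the three pairs at distance 1 and the single pair at distance 3 differ by 1 (semivariance 0.5), while the two pairs at distance 2 are identical (semivariance 0).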
[figure: sample variograms of residuals for several lag class widths (panel titles include “lag class width=5 km” and “lag class width=10 km”); points labelled with number of pairs; axes: lag distance vs. semivariance]
fitting variogram model to sample variogram
• semivariance required for arbitrary lag distances when computing predictions
⇒ smoothing sample variogram by fitting a parametric variogram function $V(h, \theta)$
• options for weighting
1. equal weights: $w(h_{kl}) = 1$
2. by number of pairs: $w(h_{kl}) = N_{kl}$
> r.sph.e <- fit.variogram.model(r.sv.20,
+ variogram.model="RMspheric",
+ param=c(variance=3000, nugget=100, scale=100),
+ weighting="equal")
> plot(r.sv.20)
> lines(r.sph.e)
> ...
[figure: sample variogram with fitted spherical models; legend “weighting”: equal, npairs, Cressie; axes: lag distance vs. semivariance]
+ max.lag=200, estimator="matheron",
[figure: sample variograms; axes: lag distance vs. semivariance]
computing sample variogram of pressure data in N-S, NE-SW, E-W and
SE-NW direction
[figure: directional sample variograms; axes: lag distance vs. semivariance]
problematic
• auto-correlation of residuals
$$\hat{R} = Y - X\hat{\beta}_{\mathrm{OLS}} = Y - \underbrace{X (X^T X)^{-1} X^T}_{H}\, Y = (I - H)\,Y$$
differs from auto-correlation of underlying stochastic process
$$\mathrm{Cov}\big[\hat{R}, \hat{R}^T\big] = (I - H)\,\mathrm{Cov}\big[Y, Y^T\big]\,(I - H) = (I - H)\,\Gamma_\theta\,(I - H) \ne \Gamma_\theta$$
pro memoria: maximum likelihood estimation
• principle of ML estimation: find parameters that maximize joint probability for observed data
• properties of ML estimates: asymptotically unbiased and fully efficient; asymptotically normally distributed
6.1 ML estimation for Gaussian spatial model
• consider now a Gaussian stochastic process $\{Y(x)\}$ with a linear trend function
• any arbitrary set of random variables $Y = (Y(x_1), \ldots, Y(x_n))$ has a multivariate Gaussian distribution with expectation
$$\hat{\beta}_{\mathrm{GLS}} = (X^T \Gamma_\theta^{-1} X)^{-1} X^T \Gamma_\theta^{-1} Y$$
⇒ numerical optimization requires initial values of $\theta$
• plugging $\hat{\beta}_{\mathrm{GLS}}$ for $\beta$ into $L(\beta, \theta; y)$ gives profile likelihood function for $\theta$
$$L_p(\theta; y) = -\frac{1}{2} \log(|\Gamma_\theta|) - \frac{1}{2} \{y - X\hat{\beta}_{\mathrm{GLS}}\}^T \Gamma_\theta^{-1} \{y - X\hat{\beta}_{\mathrm{GLS}}\}$$
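The profile log-likelihood can be evaluated with a Cholesky factor of the covariance matrix, which gives both the GLS estimate and log|Γθ| in one pass. The following is a sketch under assumptions, not the georob code: `profile_loglik()` is a hypothetical helper, an exponential covariance plus nugget is assumed for Γθ, and the data are simulated placeholders.

```r
# Hypothetical helper; theta = c(variance, nugget, scale)
profile_loglik <- function(theta, y, X, coords) {
  D      <- as.matrix(dist(coords))
  Gamma  <- theta[1] * exp(-D / theta[3]) + theta[2] * diag(nrow(D))
  L      <- t(chol(Gamma))                 # lower factor, Gamma = L L^T
  y.t    <- forwardsolve(L, y)             # orthogonalized response
  X.t    <- forwardsolve(L, X)             # orthogonalized design matrix
  beta   <- qr.coef(qr(X.t), y.t)          # OLS on orthogonalized items = GLS
  r      <- y.t - drop(X.t %*% beta)
  logdet <- 2 * sum(log(diag(L)))          # log|Gamma|
  list(loglik = -0.5 * logdet - 0.5 * sum(r^2), beta = beta)
}

set.seed(1)
coords <- cbind(runif(30), runif(30))      # placeholder locations
X      <- cbind(1, coords)                 # linear trend in the coordinates
y      <- rnorm(30)                        # placeholder data
pl     <- profile_loglik(c(1, 0.2, 0.3), y, X, coords)
```

A numerical optimizer would then be run over `theta` only, starting from initial values such as those obtained from a fitted sample variogram.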
example: ML estimates Wolfcamp data
Convergence in 12 function and 7 Jacobian/gradient evaluations
(Intercept) 620.3550 17.0641 36.354 < 2e-16
x -1.3256 0.1360 -9.750 2.33e-15
y -1.2061 0.1793 -6.727 2.16e-09
Residual standard error (sqrt(nugget)): 35.16
Robustness weights:
All 85 weights are ~= 1.
[figure: sample variogram with fitted model; axes: lag distance vs. semivariance]
> r.proflik.ml <- profilelogLik(r.georob.ml,
+ values=data.frame(scale=seq(50, 500, by=5)))
> str(r.proflik.ml)
'data.frame': 91 obs. of 9 variables:
$ scale : num 50 55 60 65 70 75 80 85 90 95 ...
$ loglik : num -461 -461 -461 -460 -460 ...
$ variance : num 3641 3508 3393 3315 3240 ...
$ nugget : num 535 668 780 860 933 ...
$ (Intercept) : num 617 618 618 618 618 ...
$ x : num -1.29 -1.29 -1.3 -1.3 -1.31 ...
$ y : num -1.24 -1.25 -1.25 -1.25 -1.26 ...
$ gradient.nugget: num 0.018148 -0.006713 -0.00423 -0.00..
$ converged : num 1 1 1 1 1 1 1 1 1 1 ...
> plot(loglik~scale, r.proflik.ml, type="l",
+ main="ML profile likelihood for scale")
> abline(v=r.georob.ml$param["scale"])
> abline(h=r.georob.ml$loglik - qchisq(0.95, 1)/2)
[figure: ML profile likelihood for scale; axes: scale vs. loglik]
equivalent number of independent observations
• for small sample size MLEs of variogram parameters often negatively biased when trend is simultaneously estimated
• for auto-correlated data this problem is more severe because effective sample size usually much smaller than nominal sample size n
example: neq Wolfcamp data
> library(RandomFields)
> Gamma <- RFcovmatrix(
+ RMspheric(var=3329, scale=123)+RMnugget(var=1236),
+ x=coordinates(d.w))
> var.y <- sum(c(variance=3329, nugget=1236))
> (n <- nrow(d.w))
6.2 restricted maximum likelihood estimation
• bias of MLEs of variogram parameters $\theta$ can be reduced by restricted maximum likelihood estimation
• principle of restricted maximum likelihood estimation (REML)
1. form linear combinations $Z = AY$ of data $Y$ that have zero expectation (and no longer depend on $\beta$)
$$E[Z] = AX\beta = 0$$
⇒ matrix $A$ must satisfy condition $AX = 0$
⇒ $A$ non-unique; many possibilities, e.g.
$$A = I - H_{\mathrm{OLS}} = I - X(X^T X)^{-1} X^T$$
⇒ $Z$ is an error contrast or a generalized increment
restricted maximum likelihood estimation (REML)
• principle of REML (continued)
2. estimate $\theta$ by maximizing likelihood function for $n - p$ elements of $Z$
⇒ this is equivalent to maximizing the restricted log-likelihood function
$$L_r(\theta; y) = -\frac{1}{2} \log(|\Gamma_\theta|) - \frac{1}{2} \log(|X^T \Gamma_\theta^{-1} X|) - \frac{1}{2} \{y - X\hat{\beta}_{\mathrm{GLS}}\}^T \Gamma_\theta^{-1} \{y - X\hat{\beta}_{\mathrm{GLS}}\}$$
⇒ REML estimate $\hat{\theta}_{\mathrm{REML}}$ has same properties (asymptotic normal distribution, likelihood ratio statistic) as ML estimate
3. given $\hat{\theta}_{\mathrm{REML}}$ compute $\hat{\beta}_{\mathrm{GLS}}$ and $\mathrm{Cov}\big[\hat{\beta}_{\mathrm{GLS}}, \hat{\beta}_{\mathrm{GLS}}^T\big] = (X^T \Gamma_\theta^{-1} X)^{-1}$
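Two properties of REML stated above can be checked numerically: the matrix A = I − H_OLS annihilates the design matrix, and the restricted log-likelihood does not change when an arbitrary trend Xb is added to the data. The sketch below uses simulated placeholder data and an assumed exponential-plus-nugget covariance matrix; `restricted_loglik()` is a hypothetical helper, not a georob function.

```r
set.seed(3)
n      <- 40
coords <- cbind(runif(n), runif(n))                    # placeholder locations
X      <- cbind(1, coords)                             # linear trend design
Gamma  <- exp(-as.matrix(dist(coords)) / 0.3) + 0.2 * diag(n)  # assumed Gamma_theta
y      <- rnorm(n)                                     # placeholder data

# error contrasts: A = I - H_OLS satisfies A X = 0
A      <- diag(n) - X %*% solve(crossprod(X), t(X))
max.AX <- max(abs(A %*% X))

restricted_loglik <- function(Gamma, y, X) {
  Gi   <- solve(Gamma)
  XtGi <- crossprod(X, Gi)                             # X^T Gamma^{-1}
  info <- XtGi %*% X                                   # X^T Gamma^{-1} X
  beta <- solve(info, XtGi %*% y)                      # GLS estimate
  r    <- y - X %*% beta
  ld   <- function(M) as.numeric(determinant(M, logarithm = TRUE)$modulus)
  drop(-0.5 * ld(Gamma) - 0.5 * ld(info) - 0.5 * crossprod(r, Gi %*% r))
}

lr1 <- restricted_loglik(Gamma, y, X)
lr2 <- restricted_loglik(Gamma, y + X %*% c(5, -2, 3), X)  # shifted trend
```

`lr1` and `lr2` agree up to rounding, which illustrates that the restricted log-likelihood depends on the data only through error contrasts.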
example: REML estimates Wolfcamp data
Convergence in 6 function and 5 Jacobian/gradient evaluations
snugget(fixed) 0.00 NA NA
nugget 1151.28 540.41 2452.7
scale 138.91 82.62 233.6
Fixed effects coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 624.3287 20.7961 30.021 < 2e-16
x -1.3291 0.1611 -8.248 2.25e-12
Residual standard error (sqrt(nugget)): 33.93
Robustness weights:
All 85 weights are ~= 1.
[figure: sample variogram with fitted models; legend: fitted sample variogram, ML estimate, REML estimate; axes: lag distance vs. semivariance]
6.3 testing hypotheses about trend coefficients
• likelihood ratio test can only be used to test hypotheses and build confidence regions for $\theta$
• LRT for regression coefficients $\beta$ in general biased (too small p-values; Pinheiro and Bates, 2000, pp. 87)
⇒ use conditional F-tests for testing hypotheses about $\beta$:
1. fit covariance parameters of “largest” regression model ⇒ $\hat{\theta}$
2. compute covariance matrix ⇒ $\Gamma_{\hat{\theta}}$
3. compute ⇒ $L_{\hat{\theta}}$ by Cholesky decomposition of $\Gamma_{\hat{\theta}}$
4. orthogonalize response vector and design matrix ⇒ $\tilde{Y} = L_{\hat{\theta}}^{-1} Y$, $\tilde{X} = L_{\hat{\theta}}^{-1} X$
5. conventional F-test with orthogonalized items $\tilde{Y}$ and $\tilde{X}$
example: hypothesis tests trend Wolfcamp data
> d.w$xs <- d.w$x - mean(d.w$x)
> d.w$ys <- d.w$y - mean(d.w$y)
> r.georob.full <- georob(
+ pressure~xs+ys+I(xs^2)+I(ys^2)+xs:ys, d.w,
+ locations=~x+y, variogram.model="RMspheric",
+ param=c(variance=3000, nugget=1000, scale=100),
+ tuning.psi=1000)
> summary(r.georob.full)
Call:georob(formula = pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys,
data = d.w, locations = ~x + y, variogram.model = "RMspheric",
param = c(variance = 3000, nugget = 1000, scale = 100), tuning.psi = 1000)
Tuning constant: 1000
Convergence in 10 function and 8 Jacobian/gradient evaluations
Estimating equations (gradient)
eta scale
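The five steps of the conditional F-test can be traced with base R on simulated data. This is a sketch only: the covariance matrix is treated as known (standing in for the fitted Γ of the largest model), the exponential-plus-nugget form and all parameter values are assumptions, and the tested term is a single quadratic covariate.

```r
set.seed(42)
n      <- 60
coords <- cbind(x = runif(n), y = runif(n))
# steps 1-2: covariance matrix, here assumed known instead of fitted
Gamma  <- exp(-as.matrix(dist(coords)) / 0.3) + 0.2 * diag(n)
# step 3: Cholesky decomposition, Gamma = L L^T
L      <- t(chol(Gamma))
X.full <- cbind(1, coords, coords[, "x"]^2)          # "largest" trend model
yy     <- drop(X.full[, 1:3] %*% c(1, 2, -1) + L %*% rnorm(n))  # simulated data

# step 4: orthogonalize response vector and design matrix
y.tilde <- forwardsolve(L, yy)
X.tilde <- forwardsolve(L, X.full)

# step 5: conventional F-test, reduced model (without x^2) vs. full model
fit.full <- lm(y.tilde ~ X.tilde - 1)
fit.red  <- lm(y.tilde ~ X.tilde[, 1:3] - 1)
ftest    <- anova(fit.red, fit.full)
```

`ftest` holds the usual F statistic and p-value for dropping the quadratic term, conditional on the covariance parameters used to build `Gamma`.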
Start: AIC=922.16
pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys
Df AIC Converged
- I(xs^2) 1 922.05 1
- I(ys^2) 1 922.13 1
<none> 922.16
- xs:ys 1 922.49 1
Step: AIC=922.05
pressure ~ xs + ys + I(ys^2) + xs:ys
Df AIC Converged

Start: AIC=936.81
pressure ~ xs + ys + I(xs^2) + I(ys^2) + xs:ys

<none> 931.79
- ys 1 1006.98 1
- xs 1 1085.42 1
summary section 6
Tuning constant: 1000
• observations $y^T = (y_1, \ldots, y_n)$ available for a set of $n$ locations $x_i$
• $y$ considered as realization of the multivariate random variable $Y^T = (Y_1, \ldots, Y_n)$
• consider for simplicity case $m = 1$, i.e. $S = S(x'_1)$ and $\hat{S} = \hat{S}(x'_1; Y)$
mean square prediction
• MSEP can alternatively be written as (e.g. ?, p. 135)
$$\mathrm{MSEP}[\hat{S}] = E_Y\big[E_{S|Y}[\{\hat{S} - S\}^2]\big] = \ldots = E_Y\big[\mathrm{Var}_{S|Y}[S]\big] + E_Y\big[\{E_{S|Y}[S] - \hat{S}\}^2\big]$$
⇒ conditional expectation $\hat{S}_{\mathrm{opt}} = E_{S|Y}[S]$ minimizes MSEP
• MSEP of $\hat{S}_{\mathrm{opt}}$ equal to expectation of conditional variance
$$\mathrm{MSEP}[\hat{S}_{\mathrm{opt}}] = E_Y\big[\mathrm{Var}_{S|Y}[S]\big]$$
• evaluation of $\hat{S}_{\mathrm{opt}}$ and $\mathrm{MSEP}[\hat{S}_{\mathrm{opt}}]$ requires fully specified parametric model for joint distribution of $S$ and $Y$
7.2 mean square prediction for Gaussian process
• standard results from theory about multivariate normal distributions apply
• joint distribution of $(S^T, Y^T)$ multivariate normal with mean vector
$$\mu = \begin{pmatrix} \mu_S \\ \mu_Y \end{pmatrix} = \begin{pmatrix} X_S \beta \\ X_Y \beta \end{pmatrix}$$
and covariance matrix
$$\Sigma = \begin{pmatrix} \mathrm{Cov}[S, S^T] & \mathrm{Cov}[S, Y^T] \\ \mathrm{Cov}[Y, S^T] & \mathrm{Cov}[Y, Y^T] \end{pmatrix} = \begin{pmatrix} \Sigma_{SS} & \Sigma_{SY} \\ \Sigma_{SY}^T & \Gamma_{YY} \end{pmatrix}$$
• note that
1. $\Sigma$ depends on covariance parameters $\theta$, $\tau^2$ and
2. again $\Gamma_{YY} = \Sigma_{SS} + \tau^2 I$
mean square prediction for Gaussian processes
(conditional covariance matrix independent of $y$)
• simple kriging predictor (= optimal predictor) equal to conditional expectation
$$\hat{S}_{\mathrm{opt}} = E_{S|Y}[S]$$
• MSEP of simple kriging predictor equal to
$$\mathrm{MSEP}[\hat{S}_{\mathrm{opt}}] = \mathrm{Cov}_{S|Y}[S, S^T]$$
properties of simple kriging predictor
• one may show that $\Lambda$ and $\lambda$ minimize for any heterogeneous linear predictor
$$\mathrm{trace}\big(E[\{\hat{S} - S\}\{\hat{S} - S\}^T]\big)$$
subject to constraint $E[\hat{S} - S] = 0$
⇒ simple kriging: BLUP (Best Linear Unbiased Predictor)
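For a Gaussian process the conditional expectation and conditional covariance follow from the standard formulas for the multivariate normal distribution, so the simple kriging predictor can be sketched in a few lines. The sketch assumes one-dimensional locations, a known constant mean `mu`, and an exponential covariance function; `cov_exp()` and `simple_kriging()` are hypothetical helpers and the data are made up.

```r
cov_exp <- function(d, variance, scale) variance * exp(-d / scale)

# Hypothetical helper: simple kriging of the signal at locations x.new
simple_kriging <- function(x.new, x.obs, y, mu, variance, scale, nugget) {
  Sigma.SS <- cov_exp(abs(outer(x.new, x.new, "-")), variance, scale)
  Sigma.SY <- cov_exp(abs(outer(x.new, x.obs, "-")), variance, scale)
  Gamma.YY <- cov_exp(abs(outer(x.obs, x.obs, "-")), variance, scale) +
    nugget * diag(length(x.obs))
  W    <- Sigma.SY %*% solve(Gamma.YY)          # kriging weights
  pred <- mu + drop(W %*% (y - mu))             # conditional expectation
  msep <- Sigma.SS - W %*% t(Sigma.SY)          # conditional covariance
  list(pred = pred, se = sqrt(pmax(diag(msep), 0)))
}

x.obs <- c(0.1, 0.4, 0.8)
y     <- c(1, -0.5, 0.7)
sk    <- simple_kriging(c(0.1, 0.5), x.obs, y, mu = 0,
                        variance = 1, scale = 0.2, nugget = 0)
```

Without a nugget the prediction at the data location 0.1 reproduces the datum and its standard error is zero up to rounding; the conditional covariance does not depend on the observed values `y`.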
depend on the prediction locations $x'_j$; then for each $x'_j$
$$\hat{S}_{\mathrm{opt}}(x'_j) = \mu_j + \sum_{i=1}^{n} M_i\, \gamma(x'_j - x_i)$$
⇒ simple kriging predictor for $x'_j$: weighted sum of covariance terms “pinned down” at data locations $x_i$ (dual form of kriging)
⇒ shape of covariance function (or variogram) close to origin determines shape of prediction surface near data locations
⇒ continuity and differentiability of variogram at origin control geometrical properties of simple kriging prediction surface
[figure: simple kriging prediction curves s.opt vs. x.prime for the models RMexp, RMexp+RMnugget and RMmatern]
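The dual form can be illustrated by computing the coefficients M_i once and then evaluating the weighted sum of covariance terms at arbitrary locations. The sketch assumes M = Γ_YY⁻¹(y − μ), which is consistent with the weights Σ_SY Γ_YY⁻¹ of the simple kriging predictor, a constant known mean, and an exponential covariance with made-up parameters.

```r
cov_exp <- function(d, variance = 1, scale = 0.2) variance * exp(-d / scale)

x.obs <- c(0.1, 0.4, 0.8)     # made-up data locations
y     <- c(1, -0.5, 0.7)      # made-up data
mu    <- 0                    # known constant mean (an assumption)

# Hypothetical helper: dual-form simple kriging prediction
dual_predict <- function(x.new, nugget) {
  Gamma <- cov_exp(abs(outer(x.obs, x.obs, "-"))) + nugget * diag(length(x.obs))
  M     <- solve(Gamma, y - mu)                        # coefficients M_i
  mu + drop(cov_exp(abs(outer(x.new, x.obs, "-"))) %*% M)
}

s.exact  <- dual_predict(x.obs, nugget = 0)      # passes through the data
s.smooth <- dual_predict(x.obs, nugget = 0.3)    # with nugget: no exact interpolation
curve.exp <- dual_predict(seq(0, 1, by = 0.01), nugget = 0)
```

Plotting `curve.exp` against the grid shows the kinks at the data locations that the exponential covariance, which is not differentiable at the origin, imposes on the prediction curve.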
7.3 universal/external drift kriging
properties of universal kriging predictor
example: UK predictions Wolfcamp data
> # plot UK prediction standard errors and data locations
> spplot(r.uk, zcol="se", main="UK standard error")
> trellis.focus("panel", row=1, column=1)
> panel.points(x=d.w$x, y=d.w$y)
NULL
> trellis.unfocus()
[figure: map “UK standard error” with data locations]
> # plot upper limits of 95% prediction intervals
> spplot(r.uk, zcol="upper", at=breaks,
+ main="upper limit 95% prediction interval")
[figure: map “upper limit 95% prediction interval”]
computing predictions of signal and observations
1. data locations $x_i$ or
2. prediction locations $x'_i$ (without data)
example: UK predictions Wolfcamp data
> head(tmp <- predict(r.georob, type="response")[, 1:4])
x y pred se
1 68.851186 44.45399 446.2190 0
2 -44.090428 -14.82616 778.1401 0
3 -1.871464 -24.30719 657.7464 0
4 -29.962712 -37.89631 748.2703 0
5 155.243957 -57.00122 535.2190 0
6 174.711819 -27.48198 518.7601 0
> summary(d.w@data[, "pressure"] - tmp$pred)
example: UK predictions Wolfcamp data
x y pred se
1 -240.0 -150 1117.333 67.99812
2 -237.5 -150 1114.903 67.46577
3 -235.0 -150 1112.538 66.98365
4 -232.5 -150 1110.235 66.55098
5 -230.0 -150 1107.991 66.16483
6 -227.5 -150 1105.801 65.82024
Variogram: RMexp
variance snugget(fixed) nugget
summary section 7
• mean squared error (MSE) captures bias and random variation
• mean squared prediction error (MSEP) usual criterion for optimality of predictions
• optimal predictor (which minimizes MSEP): conditional expectation of prediction target, given observations
• Gaussian random processes: optimal predictor ≡ simple kriging predictor
• simple kriging predictor: weighted sum of observations with weights equal to $\Sigma_{SY} \Gamma_{YY}^{-1}$; $\Sigma_{SY}$ accounts for auto-correlation between target and observations and $\Gamma_{YY}^{-1}$ for auto-correlation between observations
• simple kriging predictor: weighted sum of covariance (variogram) terms “pinned-down” at observation locations (dual form)
• universal kriging predictor: approximation of simple kriging predictor where $\beta$ is estimated by $\hat{\beta}_{\mathrm{GLS}}$
• MSEP of universal kriging predictor equal to MSEP of simple kriging predictor plus a term that accounts for the estimation of $\beta$
• computing universal kriging predictor requires:
1. known structure of trend function
2. known structure and parameters $\theta$, $\tau^2$ of covariance function or variogram
⇒ “plug-in” predictor: uncertainty of variogram is ignored when computing predictions
• general strategy to assess precision of predictions of new data by a statistical model (Hastie et al., 2009, chap. 7)
• recipe:
1. split data set (randomly) into $K$ subsets (typically $K = 5$ or $K = 10$)
2. for each $k = 1, \ldots, K$
(a) exclude observations of $k$th subset and fit model to remaining data
(b) predict with this model (and excluding again the data of the $k$th subset) all observations $Y(x_i)$ of the $k$th subset and compute prediction errors $\hat{Y}_k(x_i) - y_i$
3. pool prediction errors for all subsets and compute statistics of $\hat{Y}_k(x_i) - y_i$ for evaluating prediction precision and the accuracy of modelling prediction uncertainty (e.g. $\mathrm{MSEP}[\hat{Y}_k(x_i)]$)
• root mean square error RMSE
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \big(\hat{Y}_k(x_i) - y_i\big)^2}$$
⇒ overall measure of precision (bias and random variation)
• bias
$$\mathrm{BIAS} = \frac{1}{n} \sum_{i=1}^{n} \big(\hat{Y}_k(x_i) - y_i\big)$$
• robust variants:
$$\mathrm{robBIAS} = \mathrm{median}_i\big(\hat{Y}_k(x_i) - y_i\big)$$
$$\mathrm{robRMSE} = \mathrm{MAD}_i\big(\hat{Y}_k(x_i) - y_i\big) = 1.4826\ \mathrm{median}_i\big(|\hat{Y}_k(x_i) - y_i|\big)$$
• $R^2$ measures strength of linear dependence between $y_i$ and $\hat{Y}_k(x_i)$ and is not a measure of precision
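The K-fold recipe can be sketched with any model that has a `predict()` method. To keep the example self-contained, a plain linear model fitted by `lm()` to simulated data stands in for the geostatistical model; the variable names and the data are assumptions made for illustration.

```r
set.seed(7)
n <- 50
K <- 5
d   <- data.frame(x = runif(n), y = runif(n))
d$z <- 1 + 2 * d$x - d$y + rnorm(n, sd = 0.1)         # simulated response

fold   <- sample(rep(seq_len(K), length.out = n))     # step 1: random split
cv.err <- numeric(n)
for (k in seq_len(K)) {                               # step 2
  test <- fold == k
  fit  <- lm(z ~ x + y, data = d[!test, ])            # (a) fit without subset k
  cv.err[test] <- predict(fit, newdata = d[test, ]) - d$z[test]  # (b) errors
}
rmse <- sqrt(mean(cv.err^2))                          # step 3: pooled statistics
bias <- mean(cv.err)
```

Every observation is predicted exactly once, by a model that was fitted without it, so `cv.err` contains n pooled prediction errors.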
8.2 criteria to assess $\mathrm{MSEP}[\hat{Y}_k(x_i)]$
• mean of squared standardized prediction errors
$$\mathrm{MSSE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\{\hat{Y}_k(x_i) - y_i\}^2}{\mathrm{MSEP}[\hat{Y}_k(x_i)]} \quad \text{should match } 1$$
• robust variant of MSSE for normally distributed prediction errors
$$\mathrm{MEDSSE} = \mathrm{median}_i \left( \frac{\{\hat{Y}_k(x_i) - y_i\}^2}{\mathrm{MSEP}[\hat{Y}_k(x_i)]} \right) \quad \text{should match } 0.455$$
8.3 criteria to assess probabilistic predictions
• for Gaussian stochastic processes kriging provides estimates of mean and variance of conditional distribution of target $Y(x'_j)$ given the data $Y$
$$Y(x'_j) \mid Y \sim N\big(\hat{Y}_k(x'_j), \mathrm{MSEP}[\hat{Y}_k(x'_j)]\big)$$
• denote cdf of predictive distribution by $\hat{F}_{Y(x'_j)|Y}(y)$
• probability integral transform PIT (Gneiting et al., 2007)
$$\mathrm{PIT}_j = \hat{F}_{Y(x'_j)|Y}(y_j)$$
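Given pooled cross-validation predictions, their standard errors (square roots of MSEP) and the observations, the criteria listed above reduce to a few vector operations. The function name `cv_criteria()` and the toy numbers are assumptions; Gaussian predictive distributions are assumed for the PIT values.

```r
# Hypothetical helper: validation criteria from pooled CV results
cv_criteria <- function(pred, se, obs) {
  err <- pred - obs                      # prediction errors
  z2  <- (err / se)^2                    # squared standardized errors
  list(
    rmse    = sqrt(mean(err^2)),
    bias    = mean(err),
    robbias = median(err),
    robrmse = mad(err, center = 0),      # 1.4826 * median(|err|)
    msse    = mean(z2),                  # should match 1
    medsse  = median(z2),                # should match qchisq(0.5, 1) = 0.455
    pit     = pnorm(obs, mean = pred, sd = se)
  )
}

crit <- cv_criteria(pred = c(1, 2, 3, 4), se = rep(1, 4), obs = c(0, 2, 4, 4))
```

A histogram of `crit$pit` that is close to uniform on [0, 1] indicates that the predictive distributions are well calibrated.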
• overall criterion to assess quality of probabilistic predictions
• predictive distribution is “sharp” if it is narrow (small variance) and is centred on true value (no bias)
[figure: histograms of PIT values (one panel titled “prediction intervals too wide”) and empirical cdf Fn(x) of PIT values (“CDF: prediction intervals ok”); sketch of predictive pdf with observations yi, yj]
criteria to assess “sharpness” of $\hat{F}_{Y(x')|Y}(y)$
• measure for sharpness of predictive distribution for single prediction site $x'_j$
$$\int \{\hat{F}_{Y(x'_j)|Y}(y) - I(y_j \le y)\}^2 \, dy$$
where $I(A)$ is indicator function with value equal to 1 if $A$ is true and 0 otherwise
continuous ranked probability score
• continuous ranked probability score (CRPS) measures average sharpness of predictive distributions for all sites of a data set
$$\mathrm{CRPS} = \frac{1}{n} \sum_{j=1}^{n} \int_{-\infty}^{\infty} \{\hat{F}_{Y(x'_j)|Y}(y) - I(y_j \le y)\}^2 \, dy$$
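The integral in the CRPS definition can be evaluated numerically for Gaussian predictive distributions. The sketch below integrates the squared difference between the predictive cdf and the indicator function with `integrate()`, splitting the range at the observation where the integrand jumps; `crps_gauss()` is a hypothetical helper and the truncation of the infinite range at ten standard deviations is an assumption.

```r
# Hypothetical helper: CRPS by numerical integration, Gaussian predictions
crps_gauss <- function(obs, pred, se) {
  one <- function(y, m, s) {
    f  <- function(t) (pnorm(t, m, s) - (y <= t))^2
    lo <- min(m, y) - 10 * s              # effectively -Inf
    hi <- max(m, y) + 10 * s              # effectively +Inf
    integrate(f, lo, y)$value + integrate(f, y, hi)$value
  }
  mean(mapply(one, obs, pred, se))
}

crps1 <- crps_gauss(obs = 0.3, pred = 0, se = 1)
# known closed form for a Gaussian predictive distribution, z = (y - m) / s
z <- 0.3
crps.closed <- z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi)
```

The numerical value `crps1` agrees with the closed-form expression `crps.closed` up to the integration tolerance; smaller CRPS means sharper and better centred predictive distributions.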
> par(op)
[figure: cross-validation diagnostics for the models “full” and “bic”: standardized prediction errors vs. predictions, normal quantile plot of standardized prediction errors (quantile N(0,1)), and histograms of PIT values (“histogram PIT−values”; axes: PIT vs. density)]
Society Series B, 69(2), 243–268.