9.11.2017
Jaakko Madetoja
Slides also by Rangsima Sunila & Kirsi Virrantaus
Materials
• O’Sullivan, D. & Unwin, D. 2010. Geographic Information Analysis.
E-book available from the library; chapter 10 (and 9.3 if you’re not
familiar with IDW)
• Burrough, P. & McDonnell, R. 1999. Principles of Geographical
Information Systems.
• (Sunila, R. 2009. Spatial Data Modelling Using Fuzzy and
Geostatistical Applications. Doctoral thesis.)
Today
• Introduction to interpolation
• Background in Geostatistics
• Concepts of kriging
• Types of kriging
• Further notes
Interpolation
• How to estimate unknown values at specific
locations.
Spatial Interpolation
• Methods
– Nearest neighbours: Thiessen polygons (i.e.
Voronoi diagram)
– Triangulated Irregular Network (TIN)
– Trend surfaces
– Inverse Distance Weighting (IDW)
– Kriging
• The methods vary in the way they regard
spatial autocorrelation
Thiessen Polygons
Triangulated Irregular Network (TIN)
Example:
Site   x   y   z   D to (5,5)
1      2   2   3   4.2426
2      3   7   4   2.8284
3      9   9   2   5.6569
4      6   5   4   1.0000
5      5   3   6   2.0000
[Figure: the five sites plotted on axes running from 0 to 10]
Inverse Distance Weighting (IDW)
• Value of z(x) is estimated from all known
values of z at all n points. (Weighted Moving
Average technique)
$$z(x) = \sum_{i=1}^{n} w_i z_i$$
Weights usually add to 1:
$$\sum_{i=1}^{n} w_i = 1$$
IDW
• In IDW, the weights are based on the distance
from each of the known points (i) to the point
we are trying to estimate (k): $d_{ik}$. In IDW, we
consider the inverse distance, $1/d_{ik}$
$$w_i = \frac{1/d_{ik}}{\sum_{j=1}^{n} 1/d_{jk}}$$
Example: IDW
Location (x,y)   z   D to (5,5)   1/D      Weight
(2,2)            3   4.2426       0.2357   0.1040
(3,7)            4   2.8284       0.3536   0.1560
(9,9)            2   5.6569       0.1768   0.0780
(6,5)            4   1.0000       1.0000   0.4413
(5,3)            6   2.0000       0.5000   0.2207
Sum                               2.2661   1.0000
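The table can be checked with a short script. This is a minimal sketch in plain Python, not a reference implementation: the function name `idw_estimate` and the `power` parameter are assumptions added for illustration (power 1 gives the plain 1/d weights used above). It should reproduce the weights in the table up to rounding and then apply them to the z values.

```python
import math

# Known sample points (x, y, z) from the example table.
samples = [(2, 2, 3), (3, 7, 4), (9, 9, 2), (6, 5, 4), (5, 3, 6)]


def idw_estimate(samples, target, power=1.0):
    """Return (estimate, weights) for an inverse distance weighted average.

    `power` is an illustrative extra parameter; power=1 matches the table.
    """
    inverse = []
    for i, (x, y, z) in enumerate(samples):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0:
            # Target coincides with a sample point: use its value directly.
            one_hot = [1.0 if j == i else 0.0 for j in range(len(samples))]
            return float(z), one_hot
        inverse.append(1.0 / d ** power)
    total = sum(inverse)                      # 2.2661 in the table
    weights = [v / total for v in inverse]    # normalised so they add to 1
    estimate = sum(w * z for w, (_, _, z) in zip(weights, samples))
    return estimate, weights


estimate, weights = idw_estimate(samples, (5, 5))
print([round(w, 4) for w in weights])
print(round(estimate, 2))
```

The estimate at (5,5) comes out at roughly 4.18, pulled mostly towards the two nearest sites, (6,5) and (5,3).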
Introduction to statistical analysis of
fields
• The previously introduced methods are based
on mathematical, deterministic models
• Disadvantages:
– The control points are assumed to be error free
– The methods assume that nothing is known about
the phenomenon being interpolated
• Two statistical interpolation methods: trend
surface analysis and kriging
Trend surface analysis
• Approximates trends in the control points by
fitting a (polynomial) function to the data
• Idea the same as in regression modeling
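As a rough illustration of the regression idea, the sketch below fits a first-order (planar) trend surface z = b0 + b1·x + b2·y by least squares with NumPy. The choice of a first-order polynomial and the reuse of the five control points from the earlier example are assumptions made only to keep the example small; the names `trend` and `coeffs` are hypothetical.

```python
import numpy as np

# Control points (x, y, z), reused from the earlier example for illustration.
pts = np.array([(2, 2, 3), (3, 7, 4), (9, 9, 2), (6, 5, 4), (5, 3, 6)],
               dtype=float)
x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]

# Design matrix for a first-order trend: z = b0 + b1*x + b2*y.
A = np.column_stack([np.ones_like(x), x, y])

# Ordinary least squares fit, as in regression modeling.
coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)


def trend(xq, yq):
    """Evaluate the fitted plane at the query location (xq, yq)."""
    return coeffs[0] + coeffs[1] * xq + coeffs[2] * yq


# Residuals: what the trend does not explain at the control points.
residuals = z - A @ coeffs
print("coefficients:", coeffs)
print("trend at (5,5):", trend(5, 5))
print("residuals:", residuals)
```

Unlike IDW, the fitted surface does not generally pass through the control points; the residuals show how far each point lies from the trend.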
Introduction to kriging
• IDW takes spatial autocorrelation into account,
but the choice of weighting is arbitrary
• In trend surface analysis a function is fitted to the
data, i.e. the method lets the “data speak for
themselves”
What is Geostatistics
• Techniques which are used for mapping of surfaces
from limited sample data and the estimation of values
at unsampled locations
• Geostatistics is used for:
– spatial data modelling
– characterizing the spatial variation
– spatial interpolation
– simulation
– optimization of sampling
– characterizing the uncertainty
• The core idea of geostatistics is that points which are close to
each other in space are likely to have similar values.
• Mining
• Geography
• Geology
• Geophysics
• Oceanography
• Hydrography
• Meteorology
• Biotechnology
• Environmental studies
• Agriculture
Geostatistical methods provide
• A way to deal with the limitations of
deterministic interpolation
• Optimal prediction of attribute values at
unvisited points
• BLUE (Best Linear Unbiased Estimate)
Geostatistical method for interpolation
• Recognition that the spatial variation of any
continuous attribute is often too irregular to
be modelled by a simple mathematical
function.
• The variation can be described better by a
stochastic surface.
• The interpolation with geostatistics is known
as kriging.
The steps in kriging
1. Describe the spatial variation with a variogram
2. Summarize the variation with a mathematical
function
3. Use the function to determine interpolation
weights
Step 1: Variogram cloud
[Figure: variogram cloud; semivariogram γ(h) plotted against lag h, with variability increasing as the lag grows]
Variogram
• Experimental variogram (sample or observed
variogram):
– Computed from the sampled data.
– The first step towards a quantitative description of
the regionalized variation.
• Theoretical variogram or variogram model:
– A function fitted to the experimental
variogram.
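The experimental variogram is commonly estimated as half the mean squared difference between the values of point pairs, grouped by their separation distance: $\gamma(h) = \frac{1}{2N(h)} \sum (z_i - z_j)^2$, where N(h) is the number of pairs in the lag bin around h. The sketch below computes the variogram cloud and a binned experimental variogram for the five points of the earlier example. The function names and the lag width of 3 are assumptions chosen for illustration, and five points are far too few for a meaningful variogram in practice.

```python
import math
from itertools import combinations

# Sample points (x, y, z) from the earlier example.
samples = [(2, 2, 3), (3, 7, 4), (9, 9, 2), (6, 5, 4), (5, 3, 6)]


def variogram_cloud(samples):
    """Return (distance, semivariance) for every pair of sample points."""
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        h = math.hypot(x1 - x2, y1 - y2)
        cloud.append((h, 0.5 * (z1 - z2) ** 2))
    return cloud


def experimental_variogram(samples, lag_width):
    """Average the cloud within lag bins of width `lag_width`.

    Returns a list of (bin centre, mean semivariance, number of pairs),
    sorted by increasing lag.
    """
    bins = {}
    for h, g in variogram_cloud(samples):
        bins.setdefault(int(h // lag_width), []).append(g)
    return [((k + 0.5) * lag_width, sum(v) / len(v), len(v))
            for k, v in sorted(bins.items())]


for centre, gamma, n_pairs in experimental_variogram(samples, lag_width=3.0):
    print(f"lag ~{centre:.1f}: gamma = {gamma:.3f} ({n_pairs} pairs)")
```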
Experimental and theoretical
variogram
From Longley, P.A., Goodchild, M.F., Maguire, D.J. and Rhind, D.W., 2001, Geographic Information Systems and Science
Describing the variogram
• Lag – The distance between the two points of a sample pair.
• Sill – The value at which the semivariogram first
flattens off; the maximum level of semivariance.
• Range – The lag at which the semivariogram
reaches the sill. Sample points
that are farther apart than the range are not spatially
autocorrelated.
• Nugget – The value of the variogram at zero lag;
reflects measurement error and variation at scales
smaller than the sampling interval
Different variogram models
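[Figure: four variogram model shapes plotted against lag h: spherical, exponential, linear and Gaussian. The spherical, exponential and Gaussian panels are annotated with nugget, sill and range; the linear panel only with nugget.]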
The significance of the terms
describing the variogram model
• Sill, range and nugget define the model
• For example, spherical model:
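In its usual textbook form the spherical model is $\gamma(h) = c_0 + c_1\left(\frac{3h}{2a} - \frac{h^3}{2a^3}\right)$ for $0 < h \le a$ and $\gamma(h) = c_0 + c_1$ for $h > a$, with nugget $c_0$, sill $c_0 + c_1$ and range $a$. The symbols and the function below are assumptions for illustration, since parameterisations differ between books and software; here `sill` means the total sill (nugget plus partial sill).

```python
def spherical(h, nugget, sill, range_):
    """Spherical semivariogram model.

    h      : lag distance
    nugget : semivariance just above zero lag (c0)
    sill   : total sill (c0 + c1), reached at the range
    range_ : lag at which the model reaches the sill (a)
    """
    if h <= 0:
        return 0.0            # semivariance of a point with itself
    if h >= range_:
        return sill           # flat beyond the range
    ratio = h / range_
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


# Halfway to the range, with no nugget and a sill of 2:
print(spherical(5.0, nugget=0.0, sill=2.0, range_=10.0))  # 1.375
```

The three parameters fully determine the curve, which is why fitting a variogram model amounts to choosing a nugget, a sill and a range.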
Step 3: Determine interpolation
weights
• The variogram model is used to determine the
weights for unknown points.
• The calculation is rather complex; see O’Sullivan
and Unwin (2010) page 302 for details.
• Once the weights are calculated, the estimate is a
weighted sum of the known values, just as in IDW
• Kriging also produces a kriging variance, which can
be used for estimating the uncertainty of the
interpolation
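To give a feel for the calculation, the sketch below sets up one common formulation of the ordinary kriging system: the semivariances between sample points, bordered by a row and column of ones for the constraint that the weights sum to 1, are solved against the semivariances between the sample points and the target. This is not necessarily the notation used in O’Sullivan and Unwin. The spherical model parameters (nugget 0, sill 2, range 10) are illustrative assumptions, not values fitted to the data, and the function names are hypothetical.

```python
import numpy as np


def spherical(h, nugget=0.0, sill=2.0, range_=10.0):
    """Spherical semivariogram; parameter values are illustrative only."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / range_, 1.0)           # flat beyond the range
    gamma = nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
    return np.where(h > 0, gamma, 0.0)        # gamma(0) = 0


def ordinary_kriging(xy, z, target, variogram=spherical):
    """Return (estimate, kriging variance, weights) at `target`."""
    n = len(z)
    # Semivariances between all pairs of sample points.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    # Semivariances between each sample point and the target.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - target, axis=1))
    solution = np.linalg.solve(A, b)
    weights, mu = solution[:n], solution[n]   # mu: Lagrange multiplier
    estimate = weights @ z                    # weighted sum, as in IDW
    variance = weights @ b[:n] + mu           # kriging variance
    return estimate, variance, weights


xy = np.array([(2, 2), (3, 7), (9, 9), (6, 5), (5, 3)], dtype=float)
z = np.array([3, 4, 2, 4, 6], dtype=float)
estimate, variance, weights = ordinary_kriging(xy, z, np.array([5.0, 5.0]))
print("estimate:", estimate)
print("kriging variance:", variance)
print("weights sum to:", weights.sum())
```

With a zero nugget the method is an exact interpolator: asking for an estimate at a sample location returns the measured value with zero kriging variance.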
Example
• Semivariogram modeling in ArcMap
Further notes in kriging
• Anisotropy
• Different types of kriging
– Ordinary
– Simple
– Universal
– Block
– Indicator
– Co-kriging
• How to evaluate the quality of the interpolation
– Cross-validation (CV)
– Using training and testing data sets
Anisotropy
• Spatial variation is not the same in all
directions
• The variogram is computed for specific
directions
• If the process is anisotropic (as opposed to
isotropic), then so is the variogram
Example: Anisotropy
Ordinary, simple and universal kriging
• The difference between the methods lies in what they
assume about the mean z-value
• In ordinary kriging, the mean is an unknown constant
that is estimated locally
• In simple kriging, the mean is a known constant, i.e.
the average of the entire data set
• In universal kriging, drift in the data is modeled using
trend surface analysis and the semivariogram is
calculated from the residuals of that surface
Example of trend removal
Block kriging
• Estimates the average value over a block
• Used for example in mining, where blocks of
rock are extracted
Indicator kriging
• Used when the interpolated value is binary
Co-kriging
• It is an extension of ordinary kriging where
two or more variables are interdependent
• The information contained in the associated
variable is used to enable better estimations
of the other variable
• Useful when the associated variable is easier
to acquire
Evaluating the quality of the
interpolation
• Leave-one-out cross-validation:
1. Drop one input point out of the model
2. Interpolate the surface with kriging
3. Compare the measured (i.e. real) value and the
predicted (i.e. from kriging) value at the dropped point
4. Repeat for every input point and summarise the errors
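The leave-one-out procedure can be sketched as a loop over the input points. To keep the example short, plain IDW stands in for the kriging predictor; that substitution, the function names and the use of RMSE as the summary statistic are all assumptions for illustration. Any function with the same signature, including a kriging predictor, could be passed in instead.

```python
import math


def idw(samples, target):
    """Stand-in interpolator (plain IDW); swap in a kriging predictor."""
    num = den = 0.0
    for x, y, z in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0:
            return float(z)
        num += z / d
        den += 1.0 / d
    return num / den


def leave_one_out(samples, interpolate):
    """Return the (measured, predicted) pairs and their RMSE."""
    pairs = []
    for i, (x, y, z) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]     # 1. drop one point
        predicted = interpolate(rest, (x, y))    # 2. interpolate without it
        pairs.append((z, predicted))             # 3. measured vs predicted
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in pairs) / len(pairs))
    return pairs, rmse


samples = [(2, 2, 3), (3, 7, 4), (9, 9, 2), (6, 5, 4), (5, 3, 6)]
pairs, rmse = leave_one_out(samples, idw)
for measured, predicted in pairs:
    print(f"measured {measured}, predicted {predicted:.3f}")
print(f"RMSE: {rmse:.3f}")
```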
Cross-validation in ArcMap
Evaluating the quality of the
interpolation
• Training and testing data sets:
– Use two different data sets (or divide one data set
in two)
– The interpolation is done with the training data set
– The testing data set is used for checking how well the
interpolated surface predicts the values at the testing points
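A hold-out split of a single data set might look like the sketch below. The interpolator is a nearest-neighbour stand-in (the Thiessen polygon idea) rather than kriging, and the 40 % test fraction, the fixed random seed and the function names are assumptions made only so the example is small and repeatable.

```python
import math
import random


def nearest_neighbour(train, target):
    """Stand-in interpolator: value of the closest training point."""
    closest = min(train, key=lambda p: math.hypot(p[0] - target[0],
                                                  p[1] - target[1]))
    return closest[2]


def holdout_rmse(samples, interpolate, test_fraction=0.4, seed=0):
    """Split the samples, interpolate from the training part only,
    and return (train, test, RMSE on the test points)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    errors = [z - interpolate(train, (x, y)) for x, y, z in test]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return train, test, rmse


samples = [(2, 2, 3), (3, 7, 4), (9, 9, 2), (6, 5, 4), (5, 3, 6)]
train, test, rmse = holdout_rmse(samples, nearest_neighbour)
print(len(train), "training points,", len(test), "testing points")
print("RMSE on testing points:", rmse)
```

Compared with leave-one-out, this needs only one interpolation run, but the result depends on which points happen to land in the testing set.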