
An overview of Geostatistical

Concepts & Examples


Lecture 9
Geostatistics
Geostatistics combines practical concepts for modeling spatial
variability with mathematical and statistical methods.
It is rigorous and has the ability to:
analyze and integrate different types of spatial data
measure spatial autocorrelation by incorporating the statistical
distribution of the data
measure spatial relationships between the sample data
perform spatial prediction
assess uncertainty.
Geostatistics predicts values at unsampled locations from
nearby observed samples, using the spatial relationships defined among them.


Geostatistics vs. Classical Statistics
Geostatistics assumes
there is spatial autocorrelation of a random function consisting of
random variables spatially distributed in a 2-dimensional space
data values of a random function at different locations are spatially
auto-correlated with each other.
Classical statistics assumes there is no spatial autocorrelation
of a random variable, that is, data values of a random variable
at different locations are independent.
Regionalized variables: in geostatistics, the random variables
are called regionalized variables.
the closer the locations of the data, the more similar the data values.
the similarity becomes weaker as the separation distance of data locations
increases and
disappears when the distance reaches a certain value called the range.
Geostatistics (Example)
Let's suppose we want to measure variables such as rainfall and
temperature.
This is possible through meteorological stations located
at specified locations.
But it is impossible to put monitoring stations everywhere.
Therefore we establish spatial relationships between the
known values at our observed locations and use these
relationships to make predictions at unobserved locations.
****Geostatistics will play a role here****
Regionalized variables
A variable that takes on values according to its spatial
location is known as a regionalized variable.
Considering a variable z measured at location i, we can
partition the total variability in z into three components:

z(i) = f(i) + s(i) + ε

where f(i) is some coarse-scale forcing or trend in the data,
s(i) is local spatial dependency, and
ε is error variance (presumed normal).
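
The three-part decomposition can be illustrated with synthetic data. The sketch below is not from the lecture: the function name `simulate_regionalized`, the linear trend, the moving-average window and the noise scale are all assumptions chosen only to make the three components visible.

```python
import numpy as np

def simulate_regionalized(n=200, seed=0):
    """Simulate z(i) = f(i) + s(i) + eps on a 1-D transect (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.arange(n, dtype=float)            # locations i along the transect
    f = 0.05 * x                             # coarse-scale trend (assumed linear)
    # Spatially dependent part: a moving average of white noise, so that
    # nearby locations share most of their inputs (assumed window of 15).
    window = 15
    white = rng.normal(size=n + window - 1)
    s = np.convolve(white, np.ones(window) / np.sqrt(window), mode="valid")
    eps = rng.normal(scale=0.3, size=n)      # uncorrelated, normally distributed error
    return x, f, s, eps, f + s + eps

x, f, s, eps, z = simulate_regionalized()
# Neighbouring values of s are strongly correlated; those of eps are not.
lag1_s = np.corrcoef(s[:-1], s[1:])[0, 1]
lag1_eps = np.corrcoef(eps[:-1], eps[1:])[0, 1]
```

Comparing `lag1_s` with `lag1_eps` shows why only s(i) carries information that can be used for spatial prediction.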



Regionalized variables



[Figure: decomposition of a regionalized variable. Blue dots represent the data; the plot distinguishes the structural component (e.g., a linear trend), the spatially correlated component, and the random noise component (non-fitted).]
Regionalized variables
Regionalized variables are variables that fall between random
variables and completely deterministic variables.
Typical regionalized variables are functions describing
variables that have geographic distributions
(for example, the elevation of the ground surface).
Unlike random variables, regionalized variables exhibit spatial
continuity.
However, the change in the variable is so complex that it cannot be
described by any deterministic function.
The variogram is used to describe regionalized variables.



Variograms (Basic Concepts)
Variogram: A visual exploratory tool for characterizing the
spatial continuity of the variable.
Sill: the plateau that the variogram reaches;
for the variogram it is the average squared difference between
paired data values at large separations, approximately equal to twice the variance of
the data (for the semivariogram, which halves this quantity, the sill is approximately the variance).
Range: The distance at which the variogram reaches the sill.
Nugget Effect: The vertical height of the discontinuity at the
origin. It is the combination of:
(1) short-scale variations that occur at a scale smaller than the closest
sample spacing; and
(2) sampling error due to the way the samples were collected, prepared,
and analyzed.

Variograms (Basic Concepts)
Kriging: The process of computing the best linear unbiased
estimate of a value at a point or of an average over a volume.
Isotropic (semi)variogram: This is when the spatial pattern is
identical in all directions.
In this case, the fitted semivariogram model depends only
on the (Euclidean) distance between locations.
Anisotropic (semi)variogram: This is when the spatial pattern
is strongly biased towards a specific direction.
Variograms computed for this case are at times referred to as directional variograms
because the weighting scheme depends on both distance and direction.

Variograms
[Figure: example variogram. x-axis: distance between data locations h (m), 0 to 200; y-axis: variance, 0 to 1.2. Labeled features: nugget, structure, range. Maximum distance for spatial autocorrelation (the range) = 150 m.]
Sill = nugget + structure
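
The relationship sill = nugget + structure can be made concrete with a model curve. The sketch below uses the spherical model, a common choice that the slides do not name, so treat it as an assumption. The range of 150 m comes from the figure; the nugget of 0.2 and the structure of 0.8 are hypothetical values.

```python
import numpy as np

def spherical_variogram(h, nugget, structure, range_a):
    """Spherical model: rises from the nugget and levels off at the sill."""
    h = np.asarray(h, dtype=float)
    ratio = np.clip(h / range_a, 0.0, 1.0)   # beyond the range the curve stays flat
    gamma = nugget + structure * (1.5 * ratio - 0.5 * ratio**3)
    return np.where(h == 0.0, 0.0, gamma)    # by definition the value at h = 0 is 0

# Hypothetical parameters: nugget 0.2, structure 0.8, range 150 m.
h = np.array([0.0, 75.0, 150.0, 200.0])
gamma = spherical_variogram(h, nugget=0.2, structure=0.8, range_a=150.0)
sill = 0.2 + 0.8
```

At and beyond the range the model equals the sill, which is why pairs farther apart than 150 m are treated as spatially uncorrelated.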
Variograms (Basic Concepts)
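In mathematical terms, the semi-variogram is:

\[
\gamma(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h}) \right]^2
\]

where h represents a distance vector, N(h) is the number of data pairs separated by h, and z(u_α) is the data value at location u_α.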
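
A minimal sketch of how the semi-variogram can be computed from scattered data. It assumes an omnidirectional calculation, so only the length of h is used, with pairs grouped into distance classes of equal width. The function name and the parameters `lag_width` and `n_lags` are illustrative, not taken from the lecture.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Return mean lag distance, semivariance and pair count per distance class."""
    coords = np.asarray(coords, dtype=float)   # shape (n, d)
    values = np.asarray(values, dtype=float)   # shape (n,)
    i, j = np.triu_indices(len(values), k=1)   # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    bins = np.floor(dist / lag_width).astype(int)

    lags = np.full(n_lags, np.nan)
    gamma = np.full(n_lags, np.nan)
    counts = np.zeros(n_lags, dtype=int)
    for k in range(n_lags):
        mask = bins == k
        counts[k] = mask.sum()                 # N(h) for this class
        if counts[k] > 0:
            lags[k] = dist[mask].mean()
            gamma[k] = 0.5 * sqdiff[mask].mean()   # 1 / (2 N(h)) * sum of squares
    return lags, gamma, counts

# Three points on a line with values 1, 2, 4.
lags, gamma, counts = empirical_semivariogram(
    [[0, 0], [1, 0], [2, 0]], [1.0, 2.0, 4.0], lag_width=1.0, n_lags=3
)
```

Classes with no pairs are returned as NaN rather than zero, so they are not mistaken for perfect similarity.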
Variograms





The semi-variogram is based on modelling the (squared) differences in
the z-values as a function of the distances between all of the known points.
Variograms (ArcGIS Geostatistical Analyst)





This is an example of
a variogram produced
using ArcGIS's
Geostatistical Analyst.
Variograms
Statistical assumptions:
Stationarity: mean and variance are not a function of location. Second-order
stationarity is required: the covariance between two values depends only on their separation
distance.
Isotropy: no directional trends occur in the data (as contrasted with
anisotropy).
However, you can compute directional variograms in order to assess directional
trends in the data.
Use of trend surface analysis to remove global trends in the data (to
transform a non-stationary variable [mean varies across space] to a
stationary one).
Lag distances: typically we group the distance intervals into classes so that
we have enough pairs of sample points within any one distance class (typically
30 is suggested as the minimum number).
Small-scale (high resolution) variation (at the resolution implied by the original sampling
scheme) may not be detectable as a result.
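
Trend surface removal can be sketched as a least-squares fit. The example assumes a first-order (planar) trend in the coordinates, which is only one possible choice. The function name `remove_linear_trend` is hypothetical.

```python
import numpy as np

def remove_linear_trend(coords, values):
    """Fit z = b0 + b1*x + b2*y by least squares and return the residuals."""
    coords = np.asarray(coords, dtype=float)   # shape (n, 2): x and y
    values = np.asarray(values, dtype=float)
    design = np.column_stack([np.ones(len(values)), coords])
    beta, *_ = np.linalg.lstsq(design, values, rcond=None)
    residuals = values - design @ beta         # detrended, closer to stationary
    return residuals, beta

# Synthetic check: a pure planar surface leaves no residual.
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
coords = np.column_stack([gx.ravel(), gy.ravel()])
values = 2.0 + 3.0 * coords[:, 0] - 1.0 * coords[:, 1]
residuals, beta = remove_linear_trend(coords, values)
```

The variogram would then be computed on `residuals` rather than on the raw values, and the fitted trend added back after prediction.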

Variograms
The technique can provide:
a quantification of the scale of variability exhibited by natural patterns
of resource distributions and
an identification of the spatial scale at which the sampled variable
exhibits maximum variance.
At larger lag distances harmonic effects can be noted, in which
the variogram peaks or dips at lag distances that are multiples
of the natural scale.
Given the noise present in natural environmental data sets, it is
unlikely that you will be able to clearly identify multiple
scales.
One approach might be to fit a semivariogram model to the data, and
to examine the residuals for the presence of multiple patterns of scale.

Variograms





Variograms





Variogram models





Kriging
Kriging is a spatial interpolation technique based on semi-variograms.
Unlike deterministic interpolation techniques such as inverse distance
weighting, kriging also provides a map that shows you the uncertainty
associated with the prediction.

Kriging
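[Figure: a grid cell u to be estimated (marked "?"), the sample data z(u_α) at locations u_α, and the neighborhood used to estimate cell u.]

Ordinary kriging variance:

\[
\sigma^{2}_{ok}(\mathbf{u}) = C(0) - \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{ok}_{\alpha}(\mathbf{u})\, C(\mathbf{u}_\alpha - \mathbf{u}) - \mu_{ok}(\mathbf{u})
\]

Ordinary kriging estimate:

\[
z^{*}_{ok}(\mathbf{u}) = \sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{ok}_{\alpha}(\mathbf{u})\, z(\mathbf{u}_\alpha)
\]

Unbiasedness constraint on the weights:

\[
\sum_{\alpha=1}^{n(\mathbf{u})} \lambda^{ok}_{\alpha}(\mathbf{u}) = 1
\]

where n(u) is the number of samples in the neighborhood of u, λ are the kriging weights, C is the covariance function, and μ is the Lagrange parameter that enforces the constraint.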
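
The ordinary kriging estimate, variance and weight constraint can be sketched for a single target location as follows. The exponential covariance model and its parameters (`sill`, `a`) are assumptions made for illustration; in practice the covariance comes from the fitted variogram. All samples passed in are treated as the neighborhood.

```python
import numpy as np

def exp_cov(h, sill=1.0, a=50.0):
    """Assumed exponential covariance model C(h) = sill * exp(-h / a)."""
    return sill * np.exp(-np.asarray(h, dtype=float) / a)

def ordinary_kriging(coords, values, target, cov=exp_cov):
    """Solve the ordinary kriging system for one target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    # Left-hand side: data-to-data covariances, bordered by ones for the constraint.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = cov(d)
    lhs[n, n] = 0.0
    # Right-hand side: data-to-target covariances, then 1 for the constraint.
    rhs = np.ones(n + 1)
    rhs[:n] = cov(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(lhs, rhs)
    weights, mu = solution[:n], solution[n]
    estimate = weights @ values
    variance = cov(0.0) - weights @ rhs[:n] - mu
    return estimate, variance, weights

coords = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]]
values = [1.0, 2.0, 3.0, 4.0]
est_mid, var_mid, w_mid = ordinary_kriging(coords, values, [5.0, 5.0])
est_at, var_at, w_at = ordinary_kriging(coords, values, [0.0, 0.0])
```

With no nugget in the assumed model, predicting at a sampled location returns the sample value with zero kriging variance, while the variance grows away from the data.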

Kriging
Kriging produces the best linear unbiased estimate of an attribute at an
unmeasured site, once the variogram has been modeled.
Ordinary kriging: used when there is no drift in the data.
Universal kriging accounts for drift (in ArcGIS drift is modeled by a
constant, linear, second or third order equation).
Punctual kriging: produces values for non-sampled points.
Block kriging: produces values for areas instead of points. Estimates for
blocks have lower variance because several point values are averaged to
get the estimated value for one block. This averaging smooths the
small-scale fluctuations of the function [Z(x)] over the area of the block.
Co-kriging: uses two or more mutually correlated variables
in the estimation of values for one of them (e.g., soil bulk
density and soil water content).

Geostatistics
Geostatistical analysis is highly useful for addressing the
small population problem and for spatial prediction and analysis
(it can yield more accurate local estimates).
The main basis of geostatistical analysis is the regionalized
variable theory.
A geostatistical analysis must be implemented properly,
which requires a solid knowledge of mathematical and statistical
methods.

References & Examples of application
Goovaerts, P. 1997. Geostatistics for Natural Resources Evaluation. Oxford University
Press.
Wang, G., T. Oyana, M. Zhang, S. Adu-Prah, S. Zeng, H. Lin, and J. Se. 2009. Mapping and
spatial uncertainty analysis of forest vegetation carbon by combining national forest
inventory data and satellite images. Forest Ecology and Management 258(7):1275-1283.
Wang, G., G.Z. Gertner, H. Howard, and A.B. Anderson. 2008. Optimal spatial resolution
for collection of ground data and multi-sensor image mapping of a soil erosion cover factor.
Journal of Environmental Management 88:1088-1098.
Wang, G., G.Z. Gertner, and A.B. Anderson. 2007. Sampling and mapping a soil erosion
relevant cover factor by integrating stratification, model updating and cokriging with
images. Environmental Management 39(1):84-97.
Oyana, T.J. 2004. Statistical comparisons of positional accuracies of geocoded databases
for use in medical research. In Egenhofer, M., C. Freksa, and H. Miller (eds.), Proceedings of
the Third International Geographic Information Science, GIScience 2004, October 20-23,
2004. Regents of the University of California: pp. 309-313.
Robertson, G.P. 1987. Geostatistics in ecology: interpolating with known variance. Ecology
68(3):744-748.
Yarus, J.M. and R.L. Chambers. 2006. Practical geostatistics: An armchair overview for
petroleum reservoir engineers. Distinguished Author Series, JPT, Society of Petroleum
Engineers.
