
GEOS 5311 Lecture Notes: Interpolation and Model Parameterization

Dr. T. Brikowski

Spring 2013

0
interpolation.tex,v, Vers. 1.22
Parameterization: Data Needs

I general categories include (see Fig. 1, or Table 1 of Mercer and
Faust (1980))
I physical framework: physical extent of the hydrostratigraphic
units to be modeled
I hydrologic framework
I water table and potentiometric surface maps
I hydrographs of groundwater well and surface water body water
levels
I aquifer hydraulic properties: storativity, hydraulic conductivity,
leakance, etc.
I source/sink properties: well pumping rates, evapotranspiration,
recharge estimates, etc.
Modeling Data Needs

Figure 1: Flow modeling data needs. After Anderson and Woessner
(Table 3.1, 1992).
Obtaining Regionalized Data

I a basic problem in numerical modeling is reconciling point
(well) measurements with the averaging implicit in discretized
models. Two problem areas:
I obtaining a valid areal average from point measurements of
highly heterogeneous aquifer properties (e.g. hydraulic
conductivity)
I estimating aquifer properties at points away from well or other
data (interpolation/extrapolation)
I Areal average example:
I Snake River Plain, where lava tubes and buried sedimentary
channels provide much faster flow & transport pathways than
areal averages would indicate (Poeter and Gaylord, 1990)
Hanford Nuclear Reservation Problem

Purpose: Big mess! Must evaluate transport of tritium ($^3$H,
radioactive) to the Columbia River and the accessible
environment
Cause: careless disposal of radionuclides (Fig. 3), leaky
“temporary” storage tanks
Hurdles:
  Dangerous: tanks contain mixed radionuclide-hydrocarbon waste
  and can be explosive (e.g. USSR Mayak explosion, 1957)
  Heterogeneity: contaminated aquifer is a highly heterogeneous
  mix of Columbia River deposits and a bit of basalt flow
  (Figs. 4–5)
  Complex Paths: observed transport is complex (Fig. 6)
Hanford Site Surface Features

Figure 2: Hanford Nuclear Reservation site features, after (Fig. 1,
Poeter and Gaylord, 1990). See state location map.
Hanford Kriged Water Table

Figure 3: Water table elevations (kriged) for Hanford site, 1951 and
1981. Water level rise resulted from disposal of liquid wastes. After (Fig.
3, Poeter and Gaylord, 1990).
Hanford Lithofacies

Figure 4: Lithofacies maps for upper 80 ft. of saturated zone at Hanford
site. (a) gravel-dominated facies, (b) mud-dominated facies (Poeter and
Gaylord, 1990).
Hanford Cross-Sections

Figure 5: Correlated borehole cross-sections, Hanford Site. After (Fig. 7,
Poeter and Gaylord, 1990).
Hanford Tritium Migration

Figure 6: Observed tritium migration vs. time, Hanford Site. After (Fig.
4, Poeter and Gaylord, 1990). N.B. the EPA MCL for $^3$H is 20,000 pCi l$^{-1}$. See
other maps.
Regionalization Approaches

I evaluate likely lithologies present in model area, model each as
separate hydrostratigraphic unit
I ensure that areas of differing hydraulic gradient in the field are
associated with differing conductivity zones or
source/sink/boundary effects in model
I perform a formal inverse model, which estimates the
distribution of rock properties based on hydrologic
observations (usually transmissivity estimates based on head
gradients) (Menke, 1984; Townley and Wilson, 1985; Yeh,
1986)
Anisotropy in Averaging

I Recall the effect of layered heterogeneity on directional
(anisotropic) hydraulic conductivity, as discussed in the
Hydrogeology lecture notes
I see also Anderson and Woessner (1992, p. 69-70) or (Freeze
and Cherry, 1979)
I When approximating a layered series of units, specification of
anisotropy can be used to incorporate the relative limits of
vertical to horizontal flow through the series as a whole.
I Recall that anisotropy “skews” the flow field in the dominant
principal component direction (Fig. 8).
I Streamlines are no longer perpendicular to isopotentials,
giving anisotropic flow model results an “odd” appearance.
Anisotropy and Layering

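Figure 7: Relationship between layered heterogeneity and anisotropy. In
this case the homogeneous equivalent
$K_z = \frac{d}{\sum_{i=1}^{n} d_i / K_i}$ and
$K_x = \frac{\sum_{i=1}^{n} K_i d_i}{d}$,
where $d = \sum_{i=1}^{n} d_i$ is the total thickness of the layered
series. From Freeze and Cherry (Fig. 2-9, 1979).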
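The two averages in the Figure 7 caption are easy to check numerically. The sketch below is only an illustration: the function name and the three-layer sand/clay/sand values are hypothetical and do not come from Freeze and Cherry.

```python
def equivalent_conductivity(thicknesses, conductivities):
    """Return (Kx, Kz) for a stack of horizontal, homogeneous layers."""
    d = sum(thicknesses)  # total thickness of the layered series
    # Flow parallel to layering: thickness-weighted arithmetic mean
    kx = sum(k * di for k, di in zip(conductivities, thicknesses)) / d
    # Flow across layering: thickness-weighted harmonic mean
    kz = d / sum(di / k for k, di in zip(conductivities, thicknesses))
    return kx, kz


# Hypothetical example: a thin clay between two sands
thick = [2.0, 1.0, 2.0]    # layer thicknesses d_i (m)
cond = [10.0, 0.01, 10.0]  # layer conductivities K_i (m/d)
kx, kz = equivalent_conductivity(thick, cond)
print(f"Kx = {kx:.3f} m/d, Kz = {kz:.4f} m/d, Kx/Kz = {kx / kz:.0f}")
```

Even a thin low-conductivity layer controls $K_z$, so the equivalent medium is strongly anisotropic although each layer is isotropic.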
Anisotropic Flow

Figure 8: Flow field given various hydraulic conductivity ellipsoid
geometries. After Freeze and Cherry (Fig. 5-9, 1979).
Introduction to Interpolation (“contouring”)

I in modeling, a continuous 2- or 3-D field of values is needed
for many parameters, and must be obtained from scattered
point (borehole) data. Usually this takes the form of
contouring the point data.
I Interpolation estimates values between known points, i.e.
inside of the convex hull (the smallest convex polygon
enclosing all the points, Fig. 9)
I Extrapolation estimates values outside the convex hull
I often the choice of interpolation/extrapolation method can
greatly influence model results
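Whether an estimate is an interpolation or an extrapolation can be tested by building the convex hull of the data locations. The following is a minimal sketch using the standard monotone-chain hull algorithm; the well coordinates and function names are hypothetical, and this is not how GMS itself performs the test.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone-chain convex hull; returns vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p):
    """True if p is inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


# Hypothetical well locations (x, y)
wells = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
hull = convex_hull(wells)
for target in [(1, 1), (5, 1)]:
    kind = "interpolation" if inside_hull(hull, target) else "extrapolation"
    print(target, kind)
```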
Convex Hull

Figure 9: Convex hull and TIN of a scattered dataset


Interpolation methods in GMS

Four basic types of interpolation are available in GMS


I TIN-Based: Linear interpolation and Clough-Tocher
I Inverse-Distance Weighted (IDW, only method available in
many packages like ArcView)
I Natural Neighbor (area and distance weighted, good for
clustered data)
I Kriging (correlation-length weighted, good for geologic media)
Linear Interpolation

I simplest method, best suited for huge, smooth datasets


I generates temporary TIN (triangular irregular network), fits a
plane to each triangle (e.g. Fig. 9), calculates surface value
for points lying in the triangle using the plane (i.e. a 4-term
equation)
I local method, so fast, but no extrapolation is allowed
I produces “rough” surface (first derivative discontinuous or
“C0”)
I doesn’t allow unsampled local minima and maxima to be
inferred (all interpolated values lie between extremes of
sampled values)
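As a sketch of the plane-fitting step, the function below interpolates inside a single triangle using barycentric weights, which is equivalent to evaluating the plane through the three vertices. The triangle, head values, and function name are hypothetical; a real implementation would first locate the TIN triangle containing the point.

```python
def interpolate_in_triangle(tri, values, p, tol=1e-12):
    """Linear interpolation at p inside triangle tri.

    Returns None when p lies outside the triangle (no extrapolation).
    """
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    # Barycentric weights of p with respect to the three vertices
    w1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    w2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < -tol:
        return None
    return w1 * values[0] + w2 * values[1] + w3 * values[2]


# Hypothetical triangle of wells with measured heads (m)
tri = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
heads = [100.0, 90.0, 95.0]
h_in = interpolate_in_triangle(tri, heads, (2.0, 4.0))
h_out = interpolate_in_triangle(tri, heads, (8.0, 8.0))
print(h_in, h_out)
```

Because the weights are non-negative and sum to one inside the triangle, the estimate can never fall outside the range of the three vertex values, which is why unsampled extrema cannot be inferred.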
Clough-Tocher

I AKA “finite-element” method, also based on temporary TIN


I fits a cubic surface to each triangle (i.e. a 10-term equation,
Fig. 10)
I local method, so fast, but no extrapolation is allowed
I makes smooth surface (first derivative is continuous or
“C1-continuity”)
I good at quickly emphasizing local trends in data (mostly used
to generate realistic artificial topography in animations)
Clough-Tocher Element

Figure 10: Clough-Tocher discretization, showing subdivisions and inputs
used to approximate function surface. After EMRL (2003).
IDW Methods in GMS

I Inverse distance weighted (IDW): nearby points influence the
estimate most strongly.
I expressed mathematically as
\[ F(x,y) \approx \sum_{i=1}^{N} w_i(x,y) \cdot f_i \tag{1} \]
where $w_i$ is the weight associated with the $i$th point (which
has function value $f_i$)
I a wide variety of methods can be used to determine the
weights, and to select the points that are used to make the
estimate
I see also GMS Online Help
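A minimal sketch of eqn. 1 follows. It assumes the classic inverse-distance-power weight, $w_i \propto 1/d_i^p$ normalized so the weights sum to one; the weight function actually used by GMS may differ, and the data values are hypothetical.

```python
import math


def idw(points, values, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) using all points."""
    num = 0.0
    den = 0.0
    for (xi, yi), fi in zip(points, values):
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return fi  # honor the data value exactly at a sample point
        w = d ** -power  # unnormalized weight (assumed form)
        num += w * fi
        den += w
    return num / den  # dividing by den normalizes the weights


# Hypothetical data: two wells on a line
pts = [(0.0, 0.0), (2.0, 0.0)]
vals = [10.0, 20.0]
mid = idw(pts, vals, 1.0, 0.0)   # equidistant, so a simple average
near = idw(pts, vals, 0.5, 0.0)  # pulled toward the closer well
print(mid, near)
```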
IDW Weight Functions
I Shepard’s Method:
I weight function varies linearly over range 0:1 and is radially
symmetric about each point
I fails to infer local extrema, produces rough (C0) surface
I tends to show “bullseyes” around isolated points (because of
radial symmetry)
I Modified Shepard’s Method:
I weight functions are nodal or basis functions defined at each
data point (node).
I various function types are used:
I gradient plane: sloping planes reflecting local estimated
gradient in data. Allows inference of local extrema. Produces
rough surface (first derivative discontinuous across
connections between nodes).
I quadratic function: higher-order nodal functions that
approximate (don’t necessarily pass through) neighboring
nodes. Can be slow with large datasets, produces nicely
smoothed surface, requires at least 5 data points.
IDW Point Subsetting

All, or a subset of points may be used in IDW estimates (see eqn.
1). Subsets use the nearest N points (in any direction, or by
quadrant, which can help with clustered data), which are found by
one of three methods:
I global: use or examine all points in dataset. Can be slow for
large datasets
I local: points are crudely sorted into distance “bins” using a
temporary TIN, only examines points in the same triangle as
the point of interest, then neighboring triangles, etc. Saves
much time for large datasets.
I “enclosing triangle”: only the TIN triangles that have the
point of interest as a vertex are generated (Fig. 11). This makes
the method local (i.e. fast), and the default basis function makes
it C1 continuous (smooth surface)
Enclosing Triangle Subset

Figure 11: Enclosing triangle subset method for IDW, and “S”-shaped
basis function for surface approximation. “A” is the point of interest,
bold line shows perimeter of enclosing triangles, which are the only areas
used to generate the interpolated value at A. After EMRL (2003).
Natural Neighbor Method

I essentially a combination of Clough-Tocher and IDW


I data points whose “areas of influence” (in a finite-element
sense) are adjacent to the interpolation point are used (using
Delaunay Triangulation, Fig. 12)
I weights based on size of boundary between areas of influence
of “neighboring” data points and interpolation point
I extrapolates beyond convex hull easily
Delaunay Triangulation

Figure 12: Delaunay triangulation for natural neighbor method.
IDW weights are based on Thiessen polygon area for each node.
After EMRL (2003).
Introduction to Kriging

I uses spatial correlation structure in data to compute weights
in IDW-style interpolation (Journel, 1989)
I GMS implementation based on GSLIB (Deutsch and Journel,
1992)
I two basic types: ordinary, and universal. The latter attempts
to account for regional trends in the data.
I computes a spatially-varying estimate of interpolation error
(estimation variance), so error distribution can be mapped
GMS Interpolation/Extrapolation Summary

Table 1: Major characteristics of interpolation/extrapolation methods
implemented in GMS. Continuity: “C0” implies only the “zeroth” derivative
(i.e. the function itself) is continuous, producing a rough surface.
Continuity “C1” indicates the function and its first derivative are
continuous, making a smooth surface at the data points. All methods
except Kriging produce a surface that passes through the data points.
Local methods are faster for large datasets, but may yield rougher
surfaces when regional trends are present.

Method                                Local  Continuity  Comment

Linear                                       C0          Faceted surface
Shepard’s (IDW)                              C0          Smoother, prone to overshoot
Modified Shepard’s (Gradient Plane)          C1
Modified Shepard’s (Quadratic)               C1          Smoother than gradient plane
Clough-Tocher                                C1
Natural Neighbor                             C1          Weights computed using inverse
                                                         distance and relative density of
                                                         points in any direction
Kriging                                      C1          Uses spatial correlations to improve
                                                         interpolation estimate. Inexact
                                                         interpolation, allowing calculation
                                                         of estimation error.
Jackknifing

Methods are needed to test the accuracy of an interpolation
method with a given data set.
I input data can be subsetted,
I withhold some points (e.g. 1/3)
I compare interpolated to measured values at those points
I withhold individual points successively (Jackknifing)
I re-compute interpolation without the point
I compare interpolated to actual value at the point
I accumulate statistics on the error
I Jackknifing is offered as a menu choice in GMS/[2-3]D
Scatter Module/Interpolation. Use it!!
I it is a special form of the more general approach toward
determination of variance termed resampling (Tukey and
Mosteller, 1977)
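The jackknifing loop described above can be sketched in a few lines. This is an illustration, not the GMS implementation: the simple IDW interpolator stands in for whichever method is being tested, and the data and function names are hypothetical.

```python
import math


def idw(points, values, x, y, power=2.0):
    """Simple IDW interpolator used here only as the method under test."""
    num = den = 0.0
    for (xi, yi), fi in zip(points, values):
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return fi
        w = d ** -power
        num += w * fi
        den += w
    return num / den


def jackknife_errors(points, values, interpolate):
    """Withhold each point in turn; return measured minus interpolated."""
    errors = []
    for i in range(len(points)):
        pts = points[:i] + points[i + 1:]   # dataset without point i
        vals = values[:i] + values[i + 1:]
        estimate = interpolate(pts, vals, *points[i])
        errors.append(values[i] - estimate)
    return errors


# Hypothetical data: three wells on a line
pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
vals = [10.0, 20.0, 15.0]
errs = jackknife_errors(pts, vals, idw)
mean_abs = sum(abs(e) for e in errs) / len(errs)
print(errs, mean_abs)
```

The large errors at the two end points reflect that withholding them turns the estimate into an extrapolation.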
Quantifying Error

A variety of standard formulas are used to quantify error, usually
for model calibration. The following three are discussed in the
textbook (Sec. 8.4, Anderson and Woessner, 1992):
ME Mean Error: mean difference between measured head ($h_m$)
and simulated head ($h_s$):
\[ \mathrm{ME} = \frac{1}{n} \sum_{i=1}^{n} (h_m - h_s)_i \tag{2} \]
MAE Mean Absolute Error: mean of absolute value of differences
(avoids cancellation of errors of opposite sign)
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |h_m - h_s|_i \tag{3} \]
Quantifying Error (cont.)

RMS Root Mean Squared Error: standard deviation
\[ \mathrm{RMS} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (h_m - h_s)_i^2} \tag{4} \]
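Equations 2 to 4 translate directly into code. The sketch below uses hypothetical head values chosen so that positive and negative residuals cancel in ME but not in MAE or RMS; the function name is an assumption.

```python
import math


def calibration_errors(measured, simulated):
    """Return (ME, MAE, RMS) for paired measured and simulated heads."""
    residuals = [hm - hs for hm, hs in zip(measured, simulated)]
    n = len(residuals)
    me = sum(residuals) / n                           # eqn. 2
    mae = sum(abs(r) for r in residuals) / n          # eqn. 3
    rms = math.sqrt(sum(r * r for r in residuals) / n)  # eqn. 4
    return me, mae, rms


# Hypothetical heads (m): residuals are -1, +1, -1, +1
h_m = [10.0, 12.0, 14.0, 16.0]
h_s = [11.0, 11.0, 15.0, 15.0]
me, mae, rms = calibration_errors(h_m, h_s)
print(me, mae, rms)
```

A mean error near zero therefore says little on its own; it only indicates that the misfit is unbiased.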
Types of Kriging
I Simple kriging
I assumes the mean (expected) value of the interpolant is known
and does not vary with position (i.e. is stationary)
I interpolates deviations from that expected value, i.e. residuals
whose expected value is zero everywhere in the problem domain
I therefore generally produces smoother but less accurate
interpolation
I Ordinary kriging
I most commonly applied method
I assumes an unknown but spatially invariant mean value
I Universal kriging
I allows for spatially varying mean (drift, e.g. a sloping surface)
I in GMS this can be fit by a linear or quadratic function
I Indicator kriging
I interpolates an indicator function rather than the data
I e.g. for defining lithology, where function equals 1 if in sand,
zero if in clay
Variograms

I a variogram describes the degree of correlation vs. distance. The
variance between pairs of samples is plotted on the Y-axis, their
separation distance on the X-axis. This is called the
experimental variogram in GMS (Fig. 13).
I typically these show points close to one another with lower
variance than points farther apart. Beyond some distance
(called the range) variance remains constant provided there is
no regional trend in the data
I the experimental variogram is usually rough, and is fit by a
model for use in interpolation (Fig. 14)
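As a rough sketch of how an experimental variogram is built, the code below bins all point pairs by separation distance and applies the standard semivariance estimator, $\gamma(h) = \frac{1}{2N(h)} \sum (f_i - f_j)^2$. That estimator is assumed here rather than taken from the GMS manual, and the data and the `lag_width` parameter are hypothetical.

```python
import math
from collections import defaultdict


def experimental_variogram(points, values, lag_width):
    """Return [(mean lag distance, semivariance, pair count), ...]."""
    sq_sum = defaultdict(float)
    h_sum = defaultdict(float)
    count = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            b = int(h // lag_width)  # index of the distance bin
            sq_sum[b] += (values[i] - values[j]) ** 2
            h_sum[b] += h
            count[b] += 1
    return [(h_sum[b] / count[b], sq_sum[b] / (2 * count[b]), count[b])
            for b in sorted(count)]


# Hypothetical transect: four samples 1 unit apart
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
vals = [1.0, 2.0, 4.0, 7.0]
gamma = experimental_variogram(pts, vals, lag_width=1.0)
for lag, g, npairs in gamma:
    print(f"h = {lag:.1f}  gamma = {g:.3f}  pairs = {npairs}")
```

Note how the number of pairs drops at large separations, which is one reason experimental variograms are rough at long lags.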
The Mathematics of Kriging
I the model variogram is used to compute weights in an IDW
scheme to give minimum estimation error at observation
points.
I note kriging may not interpolate known points exactly
I this can be an advantage, since estimation error can then be
calculated without jackknifing
I for datasets with constant variance (no spatial correlation),
kriging should interpolate as well as any IDW scheme (i.e. you
can’t do worse than most non-kriging approaches, and often
will do better)
I Ordinary kriging can be expressed mathematically as:
\[ F(x,y) \approx \sum_{i=1}^{N} w_i(x,y) \cdot f_i \tag{5} \]
\[ \sum_{i=1}^{N} w_i = 1 \tag{6} \]
The Mathematics of Kriging (cont.)

where $w_i$ is the weight associated with the $i$th point, which
depends on spatial correlation and other information.
I technically inclusion of (6) is what distinguishes ordinary from
simple kriging
I This form is quite similar to the IDW formulation (1), and
produces the same result when no spatial correlation is used.
I the weight function $w_i(x,y)$ is determined by constructing a
variogram, which expresses how the variance between
observations $f_i$ depends on their separation
I the calculation of that variance is the topic of Geostatistics
classes. GMS provides a number of variogram forms (e.g.
“Gaussian”, which has an “S” shape) that can be selected to
best fit the data
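To make eqns. 5 and 6 concrete, here is a small ordinary kriging sketch. It is written under stated assumptions and is not the GSLIB/GMS implementation: a spherical model variogram with hypothetical sill and range, the standard kriging system with a Lagrange multiplier enforcing eqn. 6, and a plain Gaussian-elimination solver.

```python
import math


def spherical(h, nugget=0.0, sill=1.0, rng=10.0):
    """Spherical model variogram (parameter values are hypothetical)."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ordinary_kriging(points, values, x0, variogram=spherical):
    """Return (estimate, estimation variance, weights) at location x0."""
    n = len(points)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Variogram between data points, bordered by the sum-to-one constraint
    a = [[variogram(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(p, x0)) for p in points] + [1.0]
    sol = solve(a, b)
    w, mu = sol[:n], sol[n]  # weights and Lagrange multiplier
    estimate = sum(wi * fi for wi, fi in zip(w, values))        # eqn. 5
    variance = sum(wi * gi for wi, gi in zip(w, b[:n])) + mu
    return estimate, variance, w


# Hypothetical data: four wells at the corners of a square
pts = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
vals = [10.0, 20.0, 30.0, 40.0]
est, var, w = ordinary_kriging(pts, vals, (2.0, 2.0))
print(est, var, w)
```

With a zero nugget, as assumed here, this sketch reproduces the data values at sample locations and gives zero estimation variance there; a non-zero nugget is what allows kriging to be inexact.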
Model Variogram

Figure 13: Features of a model variogram. These are used to determine
the model variogram (i.e. inputs to GMS model variogram editor). After
EMRL (2003).
Model vs Experimental Variogram

Figure 14: Relation between model and experimental variograms. γ is the
variance between samples a distance h apart. After EMRL (2003).
Anisotropy

I variogram range may depend on direction (Fig. 16)


I software like GMS/GSLIB that allows directional searches may
be used to detect anisotropy through trial-and-error
I the result is to define a variogram with maximum
contribution, and another one perpendicular to it.
Interpolation is carried out using both variograms to estimate
spatial correlation in the selected directions.
I directional searches are specified using an azimuth and
bandwidth (i.e. for a semi-rectangular search area, Fig. 17)
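One way to picture a directional search is as a test applied to each pair of points before it is added to a directional variogram. The sketch below is an assumption-laden illustration: the parameter names (`azimuth_deg`, `tolerance_deg`, `bandwidth`) are hypothetical, azimuth is taken as degrees clockwise from north (+y), and the exact geometry used by GMS or GSLIB may differ.

```python
import math


def in_directional_search(p, q, azimuth_deg, tolerance_deg, bandwidth):
    """True if the pair (p, q) is aligned with the search direction."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    h = math.hypot(dx, dy)
    if h == 0.0:
        return False
    az = math.radians(azimuth_deg)
    ux, uy = math.sin(az), math.cos(az)  # unit vector along the azimuth
    along = dx * ux + dy * uy            # separation along the direction
    across = abs(dx * uy - dy * ux)      # offset perpendicular to it
    # The direction is axial, so (p, q) and (q, p) are treated alike
    angle = math.degrees(math.acos(min(1.0, abs(along) / h)))
    return angle <= tolerance_deg and across <= bandwidth


# Hypothetical east-west search (azimuth 90 degrees)
origin = (0.0, 0.0)
aligned = in_directional_search(origin, (10.0, 1.0), 90.0, 22.5, 2.0)
crosswise = in_directional_search(origin, (1.0, 10.0), 90.0, 22.5, 2.0)
too_wide = in_directional_search(origin, (100.0, 5.0), 90.0, 22.5, 2.0)
print(aligned, crosswise, too_wide)
```

The bandwidth limit is what turns the angular wedge into a roughly rectangular search area at large separations.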
Geologic Origin of Anisotropy

Figure 15: Geological example of anisotropic variograms. After Journel
and Huijbregts (Fig. 1.1, 1989).
Anisotropic Variograms

Figure 16: Model and experimental variograms in anisotropic case.
Experimental variogram designated by filled squares compares data in a
direction perpendicular to that designated by open squares. After EMRL
(2003).
Directional Variogram Specification

Figure 17: Geometric parameters for specifying directional variograms in
GMS (or GSLIB). After EMRL (2003).
References
Anderson, M.P., Woessner, W.W.: Applied Groundwater Modeling.
Academic Press, San Diego (1992)
Deutsch, C.V., Journel, A.G.: GSLIB: Geostatistical Software Library and
User’s Guide. Oxford University Press, New York (1992)
EMRL: GMS Reference Manual. Environmental Modeling Research
Laboratory, Brigham Young University, Provo, UT, 4.0 edn. (2003),
http://www.bossintl.com/online_help/gms/
Freeze, R.A., Cherry, J.A.: Groundwater. Prentice-Hall, Englewood Cliffs,
NJ (1979)
Journel, A.G.: Fundamentals of geostatistics in five lessons. Short Course
in Geology 8, 40 (1989)
Journel, A.G., Huijbregts, C.J.: Mining Geostatistics. Academic Press,
San Diego (1989), fourth Printing
Menke, W.: Geophysical Data Analysis: Discrete Inverse Theory.
Academic Press, Inc., Orlando, FL (1984)
Mercer, J.W., Faust, C.R.: Ground-water modeling: An overview.
Ground Water 18, 108–115 (1980)
References (cont.)

Poeter, E., Gaylord, D.R.: Influence of aquifer heterogeneity on
contaminant transport at the Hanford Site. Ground Water 28, 900–909
(1990)
Townley, L.R., Wilson, J.L.: Computationally efficient algorithms for
parameter estimation and uncertainty propagation in numerical models
of groundwater flow. Water Resour. Res. 21, 1851–60 (1985)
Tukey, J., Mosteller, F.: Data Analysis and Regression, A Second Course
in Statistics. Addison-Wesley, Reading, MA (1977), out of print?
Yeh, W.W.G.: Review of parameter identification procedures in
groundwater hydrology: The inverse problem. Water Resour. Res. 22,
95–108 (1986)
