
Multiple-point geostatistics: a powerful tool to improve groundwater flow and
transport predictions in multi-modal formations

Luc Feyen (1,3) and Jef Caers (2)
(1) Stanford University, Department of Geological and Environmental Sciences,
    Stanford, USA
(2) Stanford University, Department of Petroleum Engineering, Stanford, USA
(3) Katholieke Universiteit Leuven, Hydrogeology & Engineering Geology,
    Leuven, Belgium
luc.feyen@geo.kuleuven.ac.be, jef@pangea.stanford.edu

Abstract
In this paper we introduce the use of multiple-point geostatistics in hydrogeology.
Multiple-point geostatistical algorithms generate stochastic models that reproduce
geological shapes and connectivity more realistically than traditional
variogram-based techniques. Multiple-point geostatistics relies on the concept of
3-dimensional training images from which higher-order statistics can be borrowed.
The training images are merely conceptual, i.e., they need not be conditioned to
any local data, and depict the expected pattern of geological heterogeneity. We use
the snesim (single normal equation simulation, see Strebelle 2002) code, a
pixel-based algorithm that employs a fast and robust sequential simulation
methodology. As an example problem, we use a reference field representing a typical
fluvial deposit that consists of an interconnected network of permeable sand
channels embedded in less permeable fine-grained floodplain material. The width and
orientation of the sand channels in the area are non-stationary. We show how
strongly heterogeneous models can be built from simple training images, whose
patterns are modified based on local anisotropy. We investigate the impact of the
facies geometry and intrafacies heterogeneity on groundwater flow and transport
predictions. Results indicate that it is of the utmost importance to properly
represent and locate the sand channels, because transport occurs mainly through
these highly permeable zones. We show how different sources of information, such as
local conditioning data and angle and channel width information, improve the
representation of the spatial heterogeneity, and hence flow and transport
predictions. We also apply a sequential indicator approach and compare the results
with those of the multiple-point approach. Results show that this two-point
correlation based technique is not able to represent the highly permeable
interconnected channel network. As such, these methods tend to underestimate
contaminant movement within the channel network.

1 Introduction
Groundwater flow and transport models rely on a detailed description of the
hydraulic properties of the subsurface. Because of financial and physical
limitations to data collection, the subsurface heterogeneity cannot be described in
detail deterministically. In recent decades numerous stochastic approaches have
been developed to overcome this problem. These methods interpolate between
hard data and use geologic, hydrogeologic and geophysical information to create
images of the property of interest. An excellent review (up to 1995) of
structure-imitating, process-imitating and descriptive methods is presented by Koltermann
and Gorelick (1996). To date, most applications of geostatistics in hydrogeology
have employed variogram-based techniques. The use of two-point correlation
methods can be justified to describe the heterogeneity within a single statistically
homogeneous stratigraphic unit. However, they are too limited to adequately
characterize the spatial continuity for multimodal distributions, such as sand-shale
formations, fractured rock masses or dolomite rocks with dissolution channels.
Also, variogram-based methods cannot take full advantage of existing prior
geological knowledge or depositional information.
Multiple-point (mp) geostatistics aims to overcome the limitations of the
variogram-based techniques in representing realistic geological continuity.
Strongly connected, curvilinear structures often constitute preferential flow paths
that largely affect groundwater flow and transport. Conductivity barriers of
various sizes and shapes may be present and need to be adequately represented.
Mp-geostatistics is an active area of research that recently emerged in the field of
petroleum engineering (see e.g., Caers and Zhang 2003; Strebelle 2002; Strebelle
et al. 2002). In this paper we show that some of the techniques developed could
prove to be powerful tools for a wide range of hydrogeological applications.
To this end, we employ a synthetic non-stationary bimodal reference field
representing a typical fluvial deposition that consists of permeable sand channels
embedded in less permeable fine-grained floodplain material. We show results of
a numerical analysis to evaluate groundwater flow and transport behavior in these
types of settings and compare the mp-geostatistical approach with a more
traditional 2-point variogram-based method.

2 Multiple-point geostatistics
The premise of mp-geostatistics is to generate models/images of the subsurface by
borrowing patterns of geological heterogeneity from training images. Training
images are merely conceptual and depict the expected patterns of geological
heterogeneity. They need not be conditioned to any local data nor carry other
locally accurate information. Several training images may be used to reflect
different scales and styles of heterogeneities, or alternative conflicting geological
interpretations to account for uncertainty about the subsurface architecture. 3-D
training images can be obtained from unconditional object-based or pixel-based
techniques, 3-D interpretation of outcrop data or high resolution geophysical data
from analog fields of study. An example training image is presented in Fig. 1. It
represents a fluvial setting of W-E oriented sand channels with an average channel
width of 25-30 m. The training image is generated with the object-based algorithm
fluvsim (Deutsch and Tran 2002).

[Figure 1: facies map of the training image; x (m) from 0 to 1200, y (m) from 0 to 300]
Fig. 1. Training image representing a fluvial deposit (generated with fluvsim, Deutsch and
Tran 2002): Sa = sand, Fl = floodplain

As in variogram-based geostatistics, the training images are bound by the
same principles of stationarity and ergodicity. They are essentially databases of
geological architectures, and if patterns are to be extracted from them, enough
repetitivity and consistency of patterns is required. Ergodic considerations dictate
the minimum size of the training image. Reproduction of large-scale patterns like
sand channels requires training images of at least 2 times the size of the area in the
direction of the channel continuity. Small training images will result in large
ergodic fluctuations and will deteriorate pattern reproduction (Caers and Zhang
2003).

3 Single Normal Equation Simulation algorithm


The single normal equation simulation algorithm (snesim) developed by Strebelle
(2000, 2002) is an efficient pixel-based sequential simulation algorithm that
obtains multiple-point statistics from the training image(s), exports them to the
geostatistical model and anchors them to the actual subsurface data, both hard and
soft. For each location u = (x, y) along a random path, the set of local data values
and their spatial configuration, termed the data event, is recorded. The training
image is scanned for replicates that match this event. The central node values
corresponding to the replicates are used to calculate the conditional probability of
the central value, given the data event. Current implementations of snesim
gain significant CPU efficiency by performing this scanning prior to simulation
and storing the conditional probabilities in a dynamic data structure, called the
search tree. In summary, the snesim algorithm works as follows:
- construct a 2-D (or 3-D) grid for the area and assign hard data to the closest
  grid cells;
- scan the training image for data events and store them in a search tree;
- define a random path;
- until each non-datum cell with coordinates u = (x, y) on the random path
  is visited:
  1. search for the closest nearby well data and previously simulated
     cells (this set is the data event);
  2. obtain the probability distribution for the property to be
     simulated from the search tree; and
  3. draw an outcome from the probability model in step 2 and
     assign that value to the current grid cell.
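
The loop above can be illustrated with a minimal Python sketch. It is not the snesim code: it assumes a binary facies variable (0 = floodplain, 1 = sand), scans the training image directly at every node instead of using a search tree, and all names and parameters (`simulate_mps`, `count_replicates`, `radius`, `max_neighbors`, `min_replicates`) are hypothetical choices made for illustration. When too few replicates are found, the farthest datum is dropped from the data event and the scan is repeated.

```python
import random

def count_replicates(ti, event, radius):
    """Count training-image replicates of a data event, split by central facies."""
    counts = [0, 0]
    for i in range(radius, len(ti) - radius):
        for j in range(radius, len(ti[0]) - radius):
            if all(ti[i + di][j + dj] == value for di, dj, value in event):
                counts[ti[i][j]] += 1
    return counts

def simulate_mps(ti, n_rows, n_cols, hard_data=None, radius=2,
                 max_neighbors=8, min_replicates=5, seed=0):
    """Sequential simulation of a binary facies grid from a training image."""
    rng = random.Random(seed)
    sim = [[None] * n_cols for _ in range(n_rows)]   # None = not yet simulated
    for (i, j), facies in (hard_data or {}).items():
        sim[i][j] = facies                           # hard data stay fixed
    # Template offsets, sorted from closest to farthest from the central node.
    offsets = sorted(((di, dj) for di in range(-radius, radius + 1)
                      for dj in range(-radius, radius + 1) if (di, dj) != (0, 0)),
                     key=lambda o: o[0] ** 2 + o[1] ** 2)
    marginal = sum(map(sum, ti)) / (len(ti) * len(ti[0]))  # global sand proportion
    path = [(i, j) for i in range(n_rows) for j in range(n_cols)
            if sim[i][j] is None]
    rng.shuffle(path)                                # random visiting order
    for i, j in path:
        # Step 1: data event = closest informed cells (data or simulated).
        event = []
        for di, dj in offsets:
            ii, jj = i + di, j + dj
            if 0 <= ii < n_rows and 0 <= jj < n_cols and sim[ii][jj] is not None:
                event.append((di, dj, sim[ii][jj]))
                if len(event) == max_neighbors:
                    break
        # Step 2: conditional probability of sand from matching replicates.
        p_sand = marginal
        while event:
            counts = count_replicates(ti, event, radius)
            if sum(counts) >= min_replicates:
                p_sand = counts[1] / sum(counts)
                break
            event.pop()                              # drop the farthest datum
        # Step 3: draw the facies and assign it to the current cell.
        sim[i][j] = 1 if rng.random() < p_sand else 0
    return sim
```

For example, `simulate_mps(ti, 12, 12, hard_data={(3, 3): 1})` returns a 12 by 12 realization that honors the sand datum at row 3, column 3. The direct scan is far slower than a search tree, which is why it is only practical for small illustrative grids.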
In two-point geostatistical methods, the probability distribution in step 2 is
obtained through some form of kriging based on a variogram model. In the snesim
approach no kriging or variogram is involved and the probability distribution is
obtained directly from the training image. For details of this procedure the reader
is referred to the works of Strebelle (2000, 2002). Soft data can be included
through an extension of Bayes theorem, as discussed in Strebelle et al. (2002).
Caers (2002) describes how production data can be incorporated using history
matching.
The stationarity requirement for the training image does not imply that only
stationary fields can be generated. Just as complex variogram models are built
from basic variograms, the well-known principles of nesting models, rotation and
affinity transformation can be used to build complex, strongly non-stationary
fields, such as sand channels with locally varying channel widths or changing
channel directions. Nesting of models is obtained by using different training
images for different scales of observations (see Strebelle and Journel 2001). For
the rotation and affinity transforms, each single datum with original coordinates
$\mathbf{u}_{orig} = (x_{orig}, y_{orig})$ in the entire data event is rotated and
affinely transformed about the center node to the new coordinates
$\mathbf{u}_{new} = (x_{new}, y_{new})$ according to

$$ \mathbf{u}_{new} = \mathbf{A}(\mathbf{u}) \, \mathbf{R}(\mathbf{u}) \, \mathbf{u}_{orig} \qquad (1) $$

where

$$ \mathbf{A}(\mathbf{u}) = \begin{pmatrix} a_x(\mathbf{u}) & 0 \\ 0 & a_y(\mathbf{u}) \end{pmatrix} $$

contains the major and minor range of continuity, $a_x(\mathbf{u})$ and
$a_y(\mathbf{u})$, respectively; and

$$ \mathbf{R}(\mathbf{u}) = \begin{pmatrix} \cos\theta(\mathbf{u}) & \sin\theta(\mathbf{u}) \\ -\sin\theta(\mathbf{u}) & \cos\theta(\mathbf{u}) \end{pmatrix} $$

contains the rotation angle (azimuth) $\theta(\mathbf{u})$. Example maps for the
affinity factors and rotation angles are presented in Fig. 2. Location-dependent
rotation and affinity information can be obtained from well data, seismic,
geological, or depositional information.
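
As a small illustration of Eq. (1), the sketch below applies the rotation and then the affinity scaling to one offset of a data event, measured relative to the center node. The function name `transform_offset` is hypothetical, and the sign convention of the rotation matrix (minus sign on the lower-left sine term, angle in degrees) is an assumption made here; the convention used in snesim may differ.

```python
import math

def transform_offset(x, y, a_x, a_y, theta_deg):
    """Apply u_new = A R u_orig to one data-event offset (x, y)."""
    t = math.radians(theta_deg)
    # Rotation R, assumed convention: [[cos, sin], [-sin, cos]].
    x_rot = math.cos(t) * x + math.sin(t) * y
    y_rot = -math.sin(t) * x + math.cos(t) * y
    # Affinity A = diag(a_x, a_y) rescales the rotated offset.
    return a_x * x_rot, a_y * y_rot

# A unit offset along x, no rotation, half scaling in the y-direction only:
print(transform_offset(1.0, 0.0, 1.0, 0.5, 0.0))
```

Applying the transform node by node, with `a_x`, `a_y` and `theta_deg` read from maps such as those in Fig. 2, is what lets a single stationary training image produce locally varying channel widths and directions.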

[Figure 2: two maps, (a) and (b), over the area of interest; x (m) from 0 to 600, y (m) from 0 to 450]
Fig. 2. (a) affinity factors (ratio) $a_x(\mathbf{u})/a_y(\mathbf{u})$, with $a_x(\mathbf{u}) = 1$; and (b) channel rotation angles

4 Synthetic fluvial case study


A typical fluvial fan depositional system is presented in plan view in Fig. 3 (a).
The area of interest is 600 m by 450 m and is discretized into 3 m x 3 m blocks. The
system is characterized by highly permeable sand channels embedded in less
permeable fine-grained floodplain material. Sand channels compose 30% of the
system and form an interconnected network. The channels are oriented W-E and
diverge north- and southwards when moving along the x-axis. The channel width
in the area decreases from 25-30 m to 10-15 m moving from west to east. The
synthetic field is generated with snesim using the training image from Fig. 1 and
the angle rotation and affinity data presented in Fig. 2. Within each facies the
natural log of the hydraulic conductivity (Y = ln K) is modeled as a realization of a
second-order stationary Gaussian random field using the sequential Gaussian
simulation algorithm sgsim (Deutsch and Journel 1998). The statistics of both
random fields are presented in Table 1. The ln K histogram and experimental
facies variogram for the reference field are plotted in Fig. 3 (b) and (c),
respectively. Despite the strong connectivity of the sand channels, the facies
variogram is characterized by short ranges. The histogram clearly shows that a
unimodal approach would be inappropriate and that the two facies composing the
system should be modeled as distinct units.
Table 1. Statistical and hydraulic parameters for the sand and floodplain facies

Parameter                                  Sand           Floodplain
Variogram type                             exponential    exponential
Mean ln K                                  2              -3
Geometric mean K                           7.389 m/day    0.05 m/day
Sill (variance of Y)                       0.25 [single value recovered, column unknown]
Correlation length (x-direction)           75 m           30 m
Anisotropy (x/y correlation length ratio)  [...]          [...]
Effective porosity                         [...]          [...]
Dispersivity (longitudinal - transverse)   0.3 - 0.03 m and 0.15 - 0.015 m [facies assignment unknown]
[Unplaced values from the table region: 0.3, 0.2, 2.5]

[Figure 3: (a) ln K map, x (m) from 0 to 600, y (m) from 0 to 450, ln K scale from -5 to 5;
(b) ln K histogram (frequency versus ln K) with summary statistics: number of data 30,000,
mean -1.32, std. dev. 2.51, maximum 5.44, upper quartile 1.66, median -2.68,
lower quartile -3.19, minimum -4.91; (c) facies variogram versus distance, 0 to 300]
Fig. 3. (a) ln K distribution for the reference field; (b) ln K histogram; and (c) experimental
facies variogram
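
A quick consistency check of the numbers above can be sketched in Python. The geometric mean of K is the exponential of the mean of ln K, so the tabulated geometric means of 7.389 and 0.05 m/day correspond to mean ln K values of about 2 and -3. The value 2 for sand is inferred here from the geometric mean, and the variable names are illustrative only.

```python
import math

# Geometric mean K per facies (m/day), as listed in Table 1.
geometric_mean_k = {"sand": 7.389, "floodplain": 0.05}

# Mean ln K is the natural log of the geometric mean.
mean_ln_k = {facies: math.log(k) for facies, k in geometric_mean_k.items()}

# Grid size: 600 m x 450 m area discretized into 3 m x 3 m blocks.
n_cells = (600 // 3) * (450 // 3)

# Expected mean of ln K over the whole field for a 30% sand proportion.
sand_fraction = 0.30
mixture_mean = (sand_fraction * mean_ln_k["sand"]
                + (1.0 - sand_fraction) * mean_ln_k["floodplain"])

print(mean_ln_k, n_cells, round(mixture_mean, 2))
```

The cell count matches the 30,000 data reported with the histogram. The mixture mean of about -1.5 is close to the reported mean of -1.32; the difference would be consistent with a realized sand proportion slightly above 30%.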

The reference field was randomly sampled at 100 locations, with 30 samples
located in the sand channels. A sample consists of the facies type and the ln K
value at that location. To compare the mp-statistics approach with a more
traditional 2p-correlation approach, a random realization, conditioned on the
extensive sample data set, was generated using the sequential indicator simulation
program sisim (Deutsch and Journel 1998). The variogram used to generate the
sisim realization is that of the training image shown in Fig. 1. The resulting ln K
image, ln K histogram and experimental facies variogram are presented in Fig. 4
(a), (b) and (c), respectively. The corresponding results for a random conditional
realization generated with the snesim algorithm are given in Fig. 4 (d), (e) and (f).
It is important to note that only the facies geometries differ, and that the
conditional ln K realizations of the sand and floodplain formations are the same in
the conditional facies realizations generated with sisim and snesim. Both methods
very closely reproduce the ln K histogram and experimental facies variogram of
the reference field. However, results clearly indicate that the 2p-approach fails to
reproduce the channel network, in contrast to the mp-approach. Hence, using a
variogram model accounting only for 2p-correlation fails to mimic the
interconnected channel network, even for extensive conditioning data sets. Also
presented in Fig. 4, in plates (g), (h) and (i), are the results of an unconditional
realization generated with snesim and sgsim. Again, the statistics and the channel
structures are very well reproduced. However, the exact locations of the channels
are not reproduced without conditioning data.
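
To make concrete what a two-point statistic measures, the following sketch computes an experimental indicator variogram along the x-direction for a binary facies grid stored as a list of rows. The function name `indicator_variogram_x` and the lag unit (grid cells) are illustrative assumptions. Each lag only compares pairs of cells, so two fields with very different channel connectivity can share nearly the same curve.

```python
def indicator_variogram_x(grid, max_lag):
    """Experimental indicator variogram along x for a binary facies grid."""
    gammas = []
    for lag in range(1, max_lag + 1):
        # Squared differences over all cell pairs separated by `lag` columns.
        sq_diffs = [(row[j] - row[j + lag]) ** 2
                    for row in grid for j in range(len(row) - lag)]
        gammas.append(0.5 * sum(sq_diffs) / len(sq_diffs))
    return gammas

# Columns alternating between floodplain (0) and sand (1):
striped = [[j % 2 for j in range(8)] for _ in range(4)]
print(indicator_variogram_x(striped, 3))
```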

[Figure 4: nine panels, (a) to (i). Maps: x (m) from 0 to 600, y (m) from 0 to 450, ln K scale
from -5 to 5. Histograms: frequency versus ln K. Variograms: facies variogram versus distance,
0 to 300. Summary statistics shown in the three histogram panels (30,000 values each; panel
order not preserved): mean -1.22, -1.18, -1.29; std. dev. 2.57, 2.47, 2.52]
Fig. 4. (a), (d), (g) ln K distribution; (b), (e), (h) ln K histogram; and (c), (f), (i)
experimental facies variogram: (a), (b), (c) = sisim, conditional realization; (d), (e), (f) =
snesim, conditional realization; and (g), (h), (i) = snesim, unconditional realization.

5 Some observations on flow and transport


To investigate the impact of the interconnected channel structure on groundwater
flow and transport we performed a numerical analysis for which the results are
presented in this section. We consider the case of a confined aquifer. The
governing equations for steady-state confined groundwater flow and non-reactive
single species transport are
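
$$ \nabla \cdot (\mathbf{T} \nabla h) + q = 0 \qquad (2) $$

$$ \mathbf{T} = \mathbf{K} b \qquad (3) $$

$$ \mathbf{v} = -\frac{\mathbf{T}}{b \theta} \nabla h \qquad (4) $$

$$ b \theta \frac{\partial c}{\partial t} = \nabla \cdot (b \theta \mathbf{D} \nabla c) - b \theta \, \mathbf{v} \cdot \nabla c + q (c_s - c) \qquad (5) $$

where $\mathbf{v}$ is the groundwater flow velocity vector (L/T); $\mathbf{T}$ is the
transmissivity tensor (L^2/T); $\mathbf{K}$ is the hydraulic conductivity tensor (L/T);
$b$ is the aquifer thickness (L); $\theta$ is the effective porosity (dimensionless);
$h$ is the hydraulic head (L); $q$ are fluid sources/sinks (L/T); $c$ is the solute
concentration (M/L^3); $c_s$ is the solute concentration in the fluid sources/sinks
(M/L^3); and $\mathbf{D}$ is the hydrodynamic dispersion tensor (L^2/T).
Neglecting molecular diffusion, the principal terms of the dispersion tensor are

$$ D_{xx} = \alpha_L \frac{v_x^2}{|\mathbf{v}|} + \alpha_T \frac{v_y^2}{|\mathbf{v}|} \quad \text{and} \quad D_{yy} = \alpha_L \frac{v_y^2}{|\mathbf{v}|} + \alpha_T \frac{v_x^2}{|\mathbf{v}|} \qquad (6) $$

where $\alpha_L$ and $\alpha_T$ are the longitudinal and transverse dispersivity (L),
respectively.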
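
The principal dispersion terms can be evaluated with a short Python sketch. The function name `dispersion_terms` is hypothetical, molecular diffusion is neglected as in the text, and the velocity in the example (0.1 m/day along x) is an arbitrary illustrative value rather than a result from the paper.

```python
import math

def dispersion_terms(v_x, v_y, alpha_l, alpha_t):
    """Return (D_xx, D_yy) for velocity (v_x, v_y) and dispersivities in metres."""
    speed = math.hypot(v_x, v_y)
    if speed == 0.0:
        return 0.0, 0.0          # no flow, no mechanical dispersion
    d_xx = (alpha_l * v_x ** 2 + alpha_t * v_y ** 2) / speed
    d_yy = (alpha_l * v_y ** 2 + alpha_t * v_x ** 2) / speed
    return d_xx, d_yy

# Flow along x at 0.1 m/day with dispersivities of 0.3 m and 0.03 m:
d_xx, d_yy = dispersion_terms(0.1, 0.0, 0.3, 0.03)
print(d_xx, d_yy)   # in m^2/day
```

For flow aligned with the x-axis the terms reduce to the longitudinal dispersivity times the speed in x and the transverse dispersivity times the speed in y.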
The simulation model used to predict groundwater flow behavior is
MODFLOW-2000 (Harbaugh et al. 2000). Transport is simulated with MT3DMS
(Zheng and Wang 1999), using the Third-Order TVD solution scheme. The
confined aquifer has a uniform thickness b = 25 m. At the north and south
boundaries of the area no-flow boundary conditions are specified. Constant head
values are set along the west (h = 22 m) and east (h = 20 m) boundaries. At
location ( x = 75, y = 225) a spill of an inert contaminant occurred during a 10-day
period with a low constant flow rate of 100 l/day and a source concentration cs =
20 mg/l. It is assumed that at the location of the spill the facies type is sand, and
that this information is known in all cases evaluated in the numerical analysis.
Numerically solving the groundwater flow and transport model requires
specification of values for the unknown parameters $\theta$, $K$, $\alpha_L$ and
$\alpha_T$ in each cell throughout the model area. The spatial distribution of ln K is generated as
described above using a combination of snesim and sgsim. The other hydraulic
parameters are assumed homogeneous within the facies. The values used for the
hydraulic parameters are given in Table 1. The small dispersivity values imply
that transport is dominated by advection.
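
The boundary conditions and source term translate into a few simple quantities, worked out below as a hedged back-of-the-envelope check. The variable names are illustrative, and the last line treats the sand as if it were uniform at its geometric mean K, which is a simplification made only for this example.

```python
# Contaminant mass released by the spill: rate x duration x concentration.
spill_rate_l_per_day = 100.0
spill_duration_days = 10.0
source_conc_mg_per_l = 20.0
released_mass_g = (spill_rate_l_per_day * spill_duration_days
                   * source_conc_mg_per_l / 1000.0)

# Mean hydraulic gradient between the west and east constant-head boundaries.
head_west_m, head_east_m, domain_length_m = 22.0, 20.0, 600.0
mean_gradient = (head_west_m - head_east_m) / domain_length_m

# Illustrative Darcy flux in sand at its geometric mean K (7.389 m/day).
darcy_flux_m_per_day = 7.389 * mean_gradient

print(released_mass_g, mean_gradient, darcy_flux_m_per_day)
```

The spill thus releases 20 g of solute in total, and the regional gradient is 1/300. Dividing the Darcy flux by the effective porosity would give the corresponding mean advective velocity.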

[Figure 5: four head maps, (a) to (d); x (m) from 0 to 600, y (m) from 0 to 450; head h (m)
from 20 to 22]
Fig. 5. Simulated head distributions: (a) reference field; (b) conditional sisim realization;
(c) conditional snesim realization; and (d) unconditional snesim realization.

The true head distribution is obtained by running the groundwater flow model
for the reference field and is presented in Fig. 5(a). The head contours clearly
show the effect of the permeable sand channels, which dominate flow through the
system. The head distributions for the conditional sisim and snesim, and the
unconditional snesim realizations, are given in Fig. 5 (b), (c) and (d), respectively.
The conditional snesim realization yields a good prediction of the reference head
distribution. For the unconditional snesim realization, the effect of the sand
channels on heads can also clearly be seen. However, the different positioning of
the sand channels results in a less accurate prediction of the heads. Despite the
large number of conditioning data, the conditional sisim realization fails to
reproduce the reference head field. This can be attributed largely to the inability of
the method to represent the channel structure.
With the transport model we simulated the behavior of the released
contaminant under natural steady-state flow conditions for a period of 1500 days.
Plates (a), (b), and (c) in Fig. 6 display the distribution of the contaminant plume
for t = 300, 900 and 1500 days, respectively. The bulk of the released contaminant
moves through the permeable sand channels. Fig. 6 also shows the transport
predictions for the conditional sisim (plates (d), (e) and (f)) and snesim (plates (g),
(h) and (i)) realizations, and the unconditional snesim (plates (j), (k) and (l))
realization. The variogram-based method underestimates solute movement in the
direction of flow, as it is not able to reproduce the interconnected preferential flow
paths. Once the solute mass enters into the floodplain material it moves
downstream very slowly, until perhaps, a new permeable sand body is
encountered. The conditional snesim realization yields a fairly good prediction of
the location of the contaminant plume through time. Results for the unconditional
snesim realization indicate that for transport predictions it is very important to
accurately determine the location of the sand channels. The training image, angle
and affinity information allow characterizing the structural features of the system,
but conditioning is needed to precisely locate the sand channels.

[Figure 6: twelve concentration maps, (a) to (l), at time = 300, 900 and 1500 days; x (m) from
0 to 600, y (m) from 0 to 450; concentration (mg/l) scale from 0 to 2.5]
Fig. 6. Simulated contaminant concentrations for t = 300, 900 and 1500 days: (a), (b), (c)
reference field; (d), (e), (f) conditional sisim realization; (g), (h), (i) conditional snesim
realization; and (j), (k), (l) unconditional snesim realization.

6 Conclusions
Results shown in this paper indicate that multiple-point geostatistics is potentially
a very powerful tool to characterize subsurface heterogeneity for hydrogeological
applications in a wide variety of complex geological settings. Geological
structures or features such as sand channels or clay lenses often constitute
preferential flow paths or obstacles to flow. Accurately representing and locating
these structures is of high importance when predicting groundwater flow and
transport, as was shown in this work. Because data are scarce, the mp-statistics are
borrowed from training images that depict the expected patterns of geological
heterogeneity. The mp-statistics are exported to the geostatistical model and
anchored to hard and/or soft data. Strongly non-stationary fields can be generated
using several training images, angle rotation and affinity information.
Mp-geostatistics should bring geological interpretation closer to hydrogeological
modeling.

Acknowledgements
The first author wishes to acknowledge the Fund for Scientific Research
Flanders (Belgium) for providing a Postdoctoral Fellowship and a mobility grant.

References
Caers J (2002) Geostatistical history matching under a training image-based geological
model constraints. SPE Journal: SPE 77429
Caers J, Zhang T (2003) Multiple-point geostatistics: a quantitative vehicle for integrating
geologic analogs into multiple reservoir models. In: "Integration of outcrop and
modern analog data in reservoir models" AAPG memoir, in press
Deutsch CV, Journel AG (1998) GSLIB, Geostatistical Software Library and Users Guide.
Oxford University Press, New York
Deutsch CV, Tran TT (2002) FLUVSIM: a program for object-based stochastic modeling
of fluvial depositional systems. Computers and Geosciences 28: 525-535
